WO2021026705A1 - Matching relationship determination method, reprojection error calculation method, and related apparatus - Google Patents

Matching relationship determination method, reprojection error calculation method, and related apparatus

Info

Publication number
WO2021026705A1
Authority
WO
WIPO (PCT)
Prior art keywords: image, feature point, feature, target, point
Application number: PCT/CN2019/100093
Other languages: English (en), French (fr)
Inventors: 袁维平, 张欢, 王筱治, 苏斌, 吴祖光
Original Assignee: 华为技术有限公司
Application filed by 华为技术有限公司
Priority to CN201980051525.9A (CN112640417B)
Priority to PCT/CN2019/100093
Publication of WO2021026705A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules

Definitions

  • This application relates to the field of automatic driving in the field of artificial intelligence, and in particular to a method for determining a matching relationship, a method for calculating a reprojection error, and related devices.
  • Autonomous driving is a mainstream application in the field of artificial intelligence.
  • Autonomous driving technology relies on the collaboration of computer vision, radar, monitoring devices, and global positioning systems to allow motor vehicles to achieve autonomous driving without the need for human active operations.
  • Self-driving vehicles use various computing systems to help transport passengers from one location to another. Since autonomous driving technology does not require humans to drive motor vehicles, it can theoretically effectively avoid human driving errors, reduce traffic accidents, and improve highway transportation efficiency. Therefore, autonomous driving technology has received more and more attention.
  • When an automatic driving device uses a positioning method such as simultaneous localization and mapping (SLAM), the positioning accuracy is usually measured by the reprojection error of each frame of image it collects.
  • The reprojection error of a frame of image refers to the error between the projected points and the measurement points on that frame of image.
  • A projected point is the result of projecting the three-dimensional space coordinates corresponding to a feature point onto the frame of image, and a measurement point is the coordinate point of that feature point in the frame of image.
  • A commonly used method for calculating the reprojection error is as follows: determine the three-dimensional space coordinates corresponding to each feature point in the target frame image to obtain the first three-dimensional space coordinates; calculate the translation matrix and rotation matrix between the target frame image and the reference frame image; use the translation matrix and rotation matrix to transform each coordinate in the first three-dimensional space coordinates into the reference coordinate system to obtain the second three-dimensional space coordinates; project each coordinate in the second three-dimensional space coordinates onto the target frame image to obtain the projected points; and calculate the error between the projected points and the coordinate points (i.e., the measurement points) of the corresponding feature points in the target frame image to obtain the reprojection error of the target frame image.
  • The reference coordinate system may be the world coordinate system established by the automatic driving device at the starting point of the current drive, the reference frame image may be the first frame of image collected by the automatic driving device at the starting point, and the target frame image may be any frame of image collected during the drive other than the reference frame image.
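  • The following Python sketch illustrates the general reprojection-error computation described above, under stated assumptions (NumPy, a pinhole camera model, and an intrinsic matrix K known from calibration); it is a generic illustration of the standard calculation, not the specific procedure claimed in this application.

```python
import numpy as np

def reprojection_error(points_3d, measured_px, R, t, K):
    """points_3d: (N, 3) 3D coordinates corresponding to the feature points.
    measured_px: (N, 2) measured pixel coordinates in the target frame image.
    R, t: rotation matrix (3x3) and translation vector (3,) between the frames.
    K: 3x3 camera intrinsic matrix (assumed known from calibration)."""
    # Transform the 3D points into the target camera's coordinate system.
    cam_pts = (R @ points_3d.T).T + t
    # Project with the pinhole model: u = fx*X/Z + cx, v = fy*Y/Z + cy.
    proj = (K @ cam_pts.T).T
    proj_px = proj[:, :2] / proj[:, 2:3]
    # Reprojection error: mean distance between projected and measured points.
    return np.mean(np.linalg.norm(proj_px - measured_px, axis=1))
```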
  • Therefore, the automatic driving device needs to calculate the matching relationship between any two adjacent frames of images it collects, and from these pairwise relationships derive the matching relationship between each collected frame of image and the reference frame image.
  • feature matching is generally used to determine the matching relationship between two frames of images.
  • The random sample consensus (RANSAC) algorithm is commonly used in feature matching.
  • The flow of the RANSAC algorithm is as follows: assume the sample (multiple sets of feature point pairs obtained by matching two frames of images) contains inliers and outliers, corresponding to correct and incorrect matching point pairs; randomly extract 4 sets of point pairs from the sample and calculate a candidate matching relationship between the two images; then, according to that matching relationship, classify the remaining feature point pairs into inliers and outliers; repeat these steps and select the matching relationship with the largest number of inliers as the final matching relationship between the two frames of images.
  • the two frames of images are the images collected by the automatic driving device at the first time and the second time respectively, and the matching relationship is the translation matrix and the rotation matrix between the two frames of images.
  • In essence, the RANSAC algorithm is a majority-vote algorithm in which the minority obeys the majority.
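  • A minimal Python sketch of this majority-vote idea follows; `estimate_model` and `residual` are hypothetical helpers that fit a candidate matching relationship from a few point pairs and score the remaining pairs against it, and the parameter values are placeholders.

```python
import numpy as np

def ransac(pairs, estimate_model, residual, n_min=4, iters=100, thresh=1.0):
    """pairs: (N, 4) array, each row = (u1, v1, u2, v2) for one matched point pair.
    estimate_model: fits a candidate matching relationship from n_min pairs (hypothetical).
    residual: per-pair error of all pairs under a candidate model (hypothetical)."""
    best_model, best_inliers = None, 0
    for _ in range(iters):
        # Randomly sample the minimal set of point pairs.
        sample = pairs[np.random.choice(len(pairs), n_min, replace=False)]
        model = estimate_model(sample)
        # Classify all pairs into inliers/outliers by their residual.
        inliers = np.sum(residual(model, pairs) < thresh)
        # Keep the candidate supported by the most inliers ("minority obeys majority").
        if inliers > best_inliers:
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```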
  • When a dynamic obstacle occupies a large part of the field of view, for example when the automatic driving device is driving behind a large vehicle, the outliers (feature points corresponding to dynamic obstacles such as other vehicles) may outnumber the inliers (feature points corresponding to static obstacles).
  • In that case, the RANSAC algorithm sometimes cannot accurately determine the matching relationship between two frames of images. Therefore, it is necessary to study a scheme that can accurately determine the matching relationship between two frames of images in an autonomous driving scene with dynamic obstacles.
  • the embodiments of the present application provide a method for determining a matching relationship, a method for calculating a reprojection error, and related devices, which can accurately determine the matching relationship between two frames of images in an automatic driving scene with dynamic obstacles.
  • an embodiment of the present application provides a method for determining a matching relationship.
  • The method may include: acquiring N sets of feature point pairs, where each set includes two matching feature points, one extracted from a first image and the other from a second image, the first image and the second image are images collected by the automatic driving device at a first time and a second time respectively, and N is an integer greater than 1; using motion state information of a dynamic obstacle to adjust the pixel coordinates of target feature points in the N sets of feature point pairs, where a target feature point is a feature point corresponding to the dynamic obstacle in the first image and/or the second image, and the pixel coordinates of the feature points other than the target feature points in the N sets remain unchanged; and determining the target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N sets of feature point pairs.
  • The matching relationship between the first image and the second image may be a translation matrix and a rotation matrix between the two images. Since the motion state of a dynamic obstacle differs from that of a static obstacle, the translation matrix and rotation matrix between the feature points corresponding to the dynamic obstacle in the first and second images differ from the translation matrix and rotation matrix between the feature points corresponding to static obstacles. It can be understood that only when the feature points in the N groups of feature point pairs all correspond to static obstacles can the matching relationship between the first image and the second image be determined accurately from the pixel coordinates corresponding to each feature point in the N groups.
  • After the adjustment, the translation between the feature points corresponding to the dynamic obstacle in the N groups of feature point pairs is basically the same as that for static obstacles; therefore, the matching relationship between the first image and the second image can be determined more accurately from the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
  • The motion state information includes the displacement of the dynamic obstacle from the first moment to the second moment; using the motion state information of the dynamic obstacle to adjust the pixel coordinates of the target feature points in the N groups of feature point pairs includes: using the displacement to adjust the pixel coordinates of a reference feature point, where the reference feature point is included in the target feature points and belongs to the feature points corresponding to the dynamic obstacle in the second image.
  • The displacement of the dynamic obstacle from the first moment to the second moment is used to adjust the pixel coordinates of the reference feature point (i.e., motion compensation), so that after adjustment the pixel coordinates of the reference feature point are basically equivalent to the pixel coordinates of a feature point on a static obstacle, which allows the matching relationship between the first image and the second image to be determined more accurately.
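  • A minimal sketch of this motion-compensation step, assuming the obstacle displacement has already been expressed as a pixel-plane offset (du, dv) and that the reference feature points have been marked with a boolean mask; the names are illustrative, not from the application.

```python
import numpy as np

def compensate_pixels(points_px, target_mask, du, dv):
    """points_px: (N, 2) pixel coordinates of the feature points in the second image.
    target_mask: boolean mask marking the reference feature points (on the dynamic obstacle).
    du, dv: assumed image-plane displacement of the obstacle from the first to the second moment."""
    adjusted = points_px.astype(float).copy()
    # Shift only the dynamic-obstacle points back by the obstacle's displacement;
    # all other pixel coordinates remain unchanged.
    adjusted[target_mask, 0] -= du
    adjusted[target_mask, 1] -= dv
    return adjusted
```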
  • Before using the motion state information of the dynamic obstacle to adjust the pixel coordinates of the target feature points in the N groups of feature point pairs, the method further includes: determining that the feature points in the N groups of feature point pairs located in the first projection area and/or the second projection area are the target feature points, where the first projection area is the area in the first image where the image of the dynamic obstacle is located and the second projection area is the area in the second image where the image of the dynamic obstacle is located; and obtaining the pixel coordinates corresponding to the target feature points.
  • the target feature points in the N groups of feature point pairs can be determined quickly and accurately.
  • Before determining that the feature points in the N groups of feature point pairs located in the first projection area and/or the second projection area are the target feature points, the method further includes: obtaining a target point cloud, which is a point cloud characterizing the dynamic obstacle at the first moment, and projecting the target point cloud onto the first image to obtain the first projection area.
  • By projecting the point cloud characterizing the dynamic obstacle at the first moment onto the first image, the area where the dynamic obstacle is located in the first image can be determined accurately.
  • Before determining that the feature points in the N groups of feature point pairs located in the first projection area and/or the second projection area are the target feature points, the method further includes: performing interpolation calculation on a first point cloud and a second point cloud to obtain the target point cloud, where the first point cloud and the second point cloud are point clouds collected by the automatic driving device at a third moment and a fourth moment respectively, the target point cloud is a point cloud characterizing the dynamic obstacle at the first moment, the third moment is before the first moment, and the fourth moment is after the first moment; and projecting the target point cloud onto the first image to obtain the first projection area.
  • By obtaining the target point cloud through interpolation calculation, the point cloud at any moment can be determined more accurately.
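  • A minimal sketch of one way such an interpolation could be done, assuming the two point clouds contain the same obstacle points in the same order so that per-point linear interpolation over the timestamps is meaningful; the application does not specify the interpolation scheme.

```python
import numpy as np

def interpolate_point_cloud(cloud_t3, cloud_t4, t3, t4, t1):
    """cloud_t3, cloud_t4: (M, 3) point clouds of the obstacle at the third and fourth moments.
    t3 < t1 < t4: timestamps; returns an estimated point cloud at the first moment t1."""
    alpha = (t1 - t3) / (t4 - t3)              # interpolation weight in [0, 1]
    return (1.0 - alpha) * cloud_t3 + alpha * cloud_t4
```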
  • the target matching relationship is a better matching relationship among two or more matching relationships between the first image and the second image determined by using a random sampling consensus RANSAC algorithm.
  • the N groups of feature point pairs may be N groups of feature point pairs randomly obtained from multiple groups of feature point pairs matching the first image and the second image.
  • the matching relationship determined by using the N groups of feature points to the adjusted pixel coordinates may not be the optimal matching relationship between the first image and the second image.
  • the RANSAC algorithm can be used to determine a better matching relationship from multiple matching relationships between the first image and the second image.
  • The better matching relationship may be one such that, when the multiple sets of matching feature point pairs of the first image and the second image are substituted into the target matching relationship, the number of resulting interior points is the largest and is greater than a number threshold.
  • The number threshold may be, for example, 80% or 90% of the number of the multiple sets of feature point pairs.
  • the RANSAC algorithm can be used to more accurately determine the matching relationship between the first image and the second image.
  • Determining the target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs includes: determining the translation matrix and the rotation matrix between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
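  • One common way to recover such a relative rotation and translation from matched pixel coordinates is via the essential matrix; the following sketch uses OpenCV's five-point method and is only an illustration of the kind of computation involved, not the application's specific procedure. The intrinsic matrix K is assumed known from calibration.

```python
import cv2
import numpy as np

def relative_pose(px_first, px_second, K):
    """px_first, px_second: (N, 2) adjusted pixel coordinates of the matched feature
    points in the first and second images; K: 3x3 camera intrinsic matrix."""
    p1 = np.asarray(px_first, dtype=np.float64)
    p2 = np.asarray(px_second, dtype=np.float64)
    # Estimate the essential matrix from the (adjusted) pixel coordinates.
    E, mask = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
    # Decompose it into the rotation matrix R and translation vector t.
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=mask)
    return R, t
```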
  • the embodiments of the present application provide a method for calculating reprojection error.
  • The method may include: using the motion state information of the dynamic obstacle to adjust the spatial coordinates corresponding to the first feature points in the first spatial coordinates to obtain second spatial coordinates, where the first spatial coordinates include the spatial coordinates corresponding to each feature point in the first image, the first feature points are the feature points corresponding to the dynamic obstacle in the first image, the first image is an image collected by the automatic driving device at the second moment, and the motion state information includes the displacement and posture change of the automatic driving device from the first moment to the second moment; projecting the second spatial coordinates onto the first image to obtain first pixel coordinates; and calculating the reprojection error of the first image according to the first pixel coordinates and second pixel coordinates, where the second pixel coordinates include the pixel coordinates of each feature point in the first image.
  • In this way, the motion state information of the dynamic obstacle is used to adjust the spatial coordinates corresponding to the first feature points in the first spatial coordinates, so that the adjusted spatial coordinates of the first feature points are basically equivalent to the spatial coordinates of feature points corresponding to static obstacles, which makes the calculated reprojection error of the first image more accurate.
  • Before calculating the reprojection error of the first image according to the first pixel coordinates and the second pixel coordinates, the method further includes: using the displacement to adjust the pixel coordinates of the first feature points in the first image to obtain the second pixel coordinates, while the pixel coordinates of the feature points other than the first feature points in the first image remain unchanged.
  • The displacement of the dynamic obstacle from the first moment to the second moment is used to adjust the pixel coordinates of the first feature points (that is, motion compensation), so that after adjustment these pixel coordinates are basically equivalent to the pixel coordinates of feature points on a static obstacle, which makes the calculated reprojection error of the first image more accurate.
  • Before using the motion state information of the dynamic obstacle to adjust the spatial coordinates corresponding to the first feature point in the first spatial coordinates to obtain the second spatial coordinates, the method further includes: obtaining a second feature point in a second image that matches the first feature point, where the first image and the second image are images collected at the second time by a first camera and a second camera on the automatic driving device respectively, and the first camera and the second camera are located at different spatial positions; and determining the spatial coordinates corresponding to the first feature point according to the first feature point and the second feature point.
  • the spatial coordinates corresponding to the first feature point can be determined quickly and accurately.
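  • A minimal triangulation sketch for this step, assuming the 3x4 projection matrices P1 and P2 of the two cameras (intrinsics combined with extrinsics) are known from calibration; it uses OpenCV's linear triangulation as one illustrative choice.

```python
import cv2
import numpy as np

def triangulate(px_first, px_second, P1, P2):
    """px_first, px_second: (N, 2) matched pixel coordinates from the first and second
    cameras at the same moment; P1, P2: 3x4 projection matrices of the two cameras."""
    p1 = np.asarray(px_first, dtype=np.float64).T   # shape (2, N), as OpenCV expects
    p2 = np.asarray(px_second, dtype=np.float64).T
    pts_h = cv2.triangulatePoints(P1, P2, p1, p2)   # homogeneous coordinates, shape (4, N)
    return (pts_h[:3] / pts_h[3]).T                 # Euclidean spatial coordinates, (N, 3)
```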
  • Before using the motion state information of the dynamic obstacle to adjust the spatial coordinates corresponding to the first feature points in the first spatial coordinates to obtain the second spatial coordinates, the method further includes: obtaining a target point cloud, which is a point cloud characterizing the dynamic obstacle at the second moment; projecting the target point cloud onto the first image to obtain a target projection area; and determining that the feature points of the first feature point set located in the target projection area are the first feature points. The feature points in the first feature point set are extracted from the first image and each matches a feature point in the second feature point set; the feature points in the second feature point set are extracted from the second image.
  • the feature points located in the target projection area are used as the feature points corresponding to the dynamic obstacle, and the feature points corresponding to the dynamic obstacle in the first feature point set can be accurately determined.
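  • A minimal sketch of projecting a point cloud into the image and selecting the feature points that fall inside the resulting region. The lidar-to-camera extrinsics (R_lc, t_lc) and the intrinsic matrix K are assumed known, and the projection area is approximated here by the bounding box of the projected cloud; the application does not prescribe this particular region representation.

```python
import numpy as np

def points_in_projection_area(cloud, feature_px, R_lc, t_lc, K):
    """cloud: (M, 3) target point cloud of the obstacle (lidar frame).
    feature_px: (N, 2) pixel coordinates of the first feature point set.
    Returns a boolean mask of the feature points lying in the projection area."""
    # Transform the point cloud into the camera frame and project it onto the image.
    cam = (R_lc @ cloud.T).T + t_lc
    cam = cam[cam[:, 2] > 0]                     # keep points in front of the camera
    proj = (K @ cam.T).T
    uv = proj[:, :2] / proj[:, 2:3]
    # Approximate the projection area by the bounding box of the projected cloud.
    u_min, v_min = uv.min(axis=0)
    u_max, v_max = uv.max(axis=0)
    return ((feature_px[:, 0] >= u_min) & (feature_px[:, 0] <= u_max) &
            (feature_px[:, 1] >= v_min) & (feature_px[:, 1] <= v_max))
```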
  • An embodiment of the present application provides an apparatus for determining a matching relationship, including: an acquiring unit configured to acquire N sets of feature point pairs, where each set includes two matching feature points, one extracted from a first image and the other from a second image, the first image and the second image are images collected by the automatic driving device at a first time and a second time respectively, and N is an integer greater than 1; an adjustment unit configured to use motion state information of a dynamic obstacle to adjust the pixel coordinates of target feature points in the N sets of feature point pairs, where a target feature point is a feature point corresponding to the dynamic obstacle in the first image and/or the second image, and the pixel coordinates of the feature points other than the target feature points in the N sets remain unchanged; and a determining unit configured to determine the target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N sets of feature point pairs.
  • After the adjustment, the translation matrix between the feature points corresponding to the dynamic obstacle in the N groups of feature point pairs is basically the same as that for static obstacles; therefore, the matching relationship between the first image and the second image can be determined more accurately from the adjusted pixel coordinates of each feature point in the N groups of feature point pairs.
  • The motion state information includes the displacement of the dynamic obstacle from the first moment to the second moment; the adjustment unit is specifically configured to use the displacement to adjust the pixel coordinates of a reference feature point, where the reference feature point is included in the target feature points and belongs to the feature points corresponding to the dynamic obstacle in the second image.
  • The determining unit is further configured to determine that the feature points in the N groups of feature point pairs located in the first projection area and/or the second projection area are the target feature points, where the first projection area is the area in the first image where the image of the dynamic obstacle is located and the second projection area is the area in the second image where the image of the dynamic obstacle is located; the acquiring unit is further configured to obtain the pixel coordinates corresponding to the target feature points.
  • The determining unit is further configured to perform interpolation calculation on a first point cloud and a second point cloud to obtain the target point cloud, where the first point cloud and the second point cloud are point clouds collected by the automatic driving device at a third time and a fourth time respectively, the target point cloud is a point cloud characterizing the dynamic obstacle at the first time, the third time is before the first time, and the fourth time is after the first time.
  • The apparatus further includes: a projection unit configured to project the target point cloud onto the first image to obtain the first projection area.
  • the target matching relationship is a better matching relationship among two or more matching relationships between the first image and the second image determined by using a random sampling consensus RANSAC algorithm.
  • The determining unit is specifically configured to determine the translation matrix and rotation matrix between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
  • an embodiment of the present application provides a reprojection error calculation device.
  • the device includes: an adjustment unit configured to use the motion state information of a dynamic obstacle to perform a calculation on the space coordinates corresponding to the first feature points in the first space coordinates.
  • the first spatial coordinate includes the spatial coordinate corresponding to each feature point in the first image, the first feature point is the feature point corresponding to the dynamic obstacle in the first image, and the first image It is the image collected by the automatic driving device at the second moment, the motion state information includes the displacement and posture change of the automatic driving device from the first moment to the second moment; the projection unit is used to project the second spatial coordinates to the The first image is used to obtain the first pixel coordinates; the determining unit is configured to calculate the reprojection error of the first image according to the first pixel coordinates and the second pixel coordinates; the second pixel coordinates include the features in the first image The pixel coordinates of the point.
  • The motion state information includes the displacement of the dynamic obstacle from the first moment to the second moment; the adjustment unit is specifically configured to use the displacement to adjust the pixel coordinates of the first feature points in the first image to obtain the second pixel coordinates, while the pixel coordinates of the feature points other than the first feature points in the first image remain unchanged.
  • The determining unit is further configured to determine that the feature points in the N groups of feature point pairs located in the first projection area and/or the second projection area are the target feature points, where the first projection area is the area in the first image where the image of the dynamic obstacle is located and the second projection area is the area in the second image where the image of the dynamic obstacle is located; the acquiring unit is further configured to obtain the pixel coordinates corresponding to the target feature points.
  • The determining unit is further configured to perform interpolation calculation on a first point cloud and a second point cloud to obtain the target point cloud, where the first point cloud and the second point cloud are point clouds collected by the automatic driving device at a third time and a fourth time respectively, the target point cloud is a point cloud characterizing the dynamic obstacle at the first time, the third time is before the first time, and the fourth time is after the first time.
  • The apparatus further includes: a projection unit configured to project the target point cloud onto the first image to obtain the first projection area.
  • Embodiments of the present application provide a computer-readable storage medium storing a computer program. The computer program includes program instructions that, when executed by a processor, cause the processor to execute the method of any one of the first aspect to the second aspect and any optional implementation thereof.
  • Embodiments of the present application provide a computer program product. The computer program product includes program instructions that, when executed by a processor, cause the processor to execute the method of any one of the first aspect to the second aspect and any optional implementation thereof.
  • An embodiment of the present application provides a computer device, including a memory, a communication interface, and a processor. The communication interface is used to receive data sent by an automatic driving device, the memory is used to store program instructions, and the processor is used to execute the program instructions to perform the method of any one of the first aspect to the second aspect and any optional implementation thereof.
  • FIG. 1 is a functional block diagram of an automatic driving device 100 provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of an automatic driving system provided by an embodiment of the application.
  • FIG. 3 is a flowchart of a method for determining a matching relationship between image frames according to an embodiment of the application
  • FIG. 5 is a flowchart of a method for calculating reprojection errors according to an embodiment of the application
  • Figure 6 is a schematic diagram of a triangulation process
  • FIG. 7 is a schematic flowchart of a positioning method provided by an embodiment of this application.
  • FIG. 8 is a schematic structural diagram of an apparatus for determining a matching relationship provided by an embodiment of the application.
  • FIG. 9 is a schematic structural diagram of a reprojection error calculation device provided by an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of a computer device provided by an embodiment of this application.
  • FIG. 11 is a schematic structural diagram of a computer program product provided by an embodiment of the application.
  • the method for determining the matching relationship provided in the embodiment of the present application can be applied to an automatic driving scenario.
  • the following is a brief introduction to the autonomous driving scenario.
  • In an autonomous driving scenario, an automatic driving device (such as an autonomous vehicle) uses lidar to collect point clouds of the surrounding environment in real time or near real time and uses a camera to collect images; it then uses SLAM to locate its own position based on the collected point clouds and images, and plans the driving route according to the positioning result.
  • In the following, the own vehicle refers to the automatic driving device itself.
  • FIG. 1 is a functional block diagram of an automatic driving device 100 provided by an embodiment of the present application.
  • the automatic driving device 100 is configured in a fully or partially automatic driving mode.
  • the automatic driving device 100 can control itself while in the automatic driving mode, and can determine the current state of the automatic driving device 100 and its surrounding environment through human operation, and determine the possible behavior of at least one other vehicle in the surrounding environment, And determine the confidence level corresponding to the possibility of the other vehicle performing the possible behavior, and control the automatic driving device 100 based on the determined information.
  • the automatic driving device 100 may be set to operate without human interaction.
  • the automatic driving apparatus 100 may include various subsystems, such as a traveling system 102, a sensor system 104, a control system 106, one or more peripheral devices 108, and a power source 110, a computer system 112, and a user interface 116.
  • the automatic driving device 100 may include more or fewer subsystems, and each subsystem may include multiple elements.
  • each subsystem and element of the automatic driving device 100 may be interconnected by wire or wireless.
  • the traveling system 102 may include components that provide power movement for the autonomous driving device 100.
  • the propulsion system 102 may include an engine 118, an energy source 119, a transmission 120, and wheels/tires 121.
  • the engine 118 may be an internal combustion engine, an electric motor, an air compression engine, or other types of engine combinations, such as a hybrid engine composed of a gasoline engine and an electric motor, or a hybrid engine composed of an internal combustion engine and an air compression engine.
  • the engine 118 converts the energy source 119 into mechanical energy.
  • Examples of energy sources 119 include gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and other sources of electricity.
  • the energy source 119 may also provide energy for other systems of the automatic driving device 100.
  • the transmission device 120 can transmit mechanical power from the engine 118 to the wheels 121.
  • the transmission device 120 may include a gearbox, a differential, and a drive shaft.
  • the transmission device 120 may also include other devices, such as a clutch.
  • the drive shaft may include one or more shafts that can be coupled to one or more wheels 121.
  • the sensor system 104 may include several sensors that sense information about the environment around the automatic driving device 100.
  • the sensor system 104 may include a positioning system 122 (the positioning system may be a global positioning system (GPS) system, a Beidou system or other positioning systems), an inertial measurement unit (IMU) 124, a radar 126, a laser rangefinder 128, and a camera 130.
  • the sensor system 104 may also include sensors of the internal system of the automatic driving device 100 to be monitored (for example, an in-vehicle air quality monitor, a fuel gauge, an oil temperature gauge, etc.). Sensor data from one or more of these sensors can be used to detect objects and their corresponding characteristics (position, shape, direction, speed, etc.). Such detection and recognition are key functions for the safe operation of the autonomous automatic driving device 100.
  • the positioning system 122 may be used to estimate the geographic location of the automatic driving device 100.
  • the IMU 124 is used to sense the position and orientation changes of the automatic driving device 100 based on inertial acceleration and angular velocity.
  • the IMU 124 may be a combination of an accelerometer and a gyroscope.
  • the radar 126 may use radio signals to sense objects in the surrounding environment of the automatic driving device 100.
  • the laser rangefinder 128 can use laser light to sense objects in the environment where the automatic driving device 100 is located.
  • the laser rangefinder 128 may include one or more laser sources, laser scanners, and one or more detectors, as well as other system components.
  • The laser rangefinder 128 may be a lidar (light detection and ranging, LiDAR), for example an Ibeo laser sensor. A lidar can send a detection signal (a laser beam) to a target (i.e., an obstacle) and obtain a point cloud of the target from the reflected signal.
  • the point cloud is a massive collection of points that express the spatial distribution and surface characteristics of the target under the same spatial reference system.
  • the point cloud in this application may be a point cloud obtained according to the principle of laser measurement, including the three-dimensional coordinates of each point.
  • the camera 130 may be used to capture multiple images of the surrounding environment of the automatic driving device 100.
  • the camera 130 may be a still camera or a video camera.
  • the camera 130 may capture multiple images of the surrounding environment of the automatic driving device 100 in real time or periodically.
  • the camera 130 may be a binocular camera, including a left-eye camera and a right-eye camera, and the positions of the two cameras are different.
  • the control system 106 controls the operation of the automatic driving device 100 and its components.
  • the control system 106 may include various components, including a steering system 132, a throttle 134, a braking unit 136, a computer vision system 140, a route control system 142, and an obstacle avoidance system 144.
  • the steering system 132 is operable to adjust the forward direction of the automatic driving device 100.
  • it may be a steering wheel system in one embodiment.
  • the throttle 134 is used to control the operating speed of the engine 118 and thereby control the speed of the automatic driving device 100.
  • the braking unit 136 is used to control the automatic driving device 100 to decelerate.
  • the braking unit 136 may use friction to slow down the wheels 121.
  • the braking unit 136 may convert the kinetic energy of the wheels 121 into electric current.
  • the braking unit 136 may also take other forms to slow down the rotation speed of the wheels 121 to control the speed of the automatic driving device 100.
  • the computer vision system 140 may be operable to process and analyze the images captured by the camera 130 in order to recognize objects and/or features in the surrounding environment of the autonomous driving device 100.
  • the objects and/or features may include traffic signals, road boundaries, and obstacles.
  • the computer vision system 140 may use object recognition algorithms, automatic driving methods, Structure from Motion (SFM) algorithms, video tracking, and other computer vision technologies.
  • SFM Structure from Motion
  • the computer vision system 140 may be used to map the environment, track objects, estimate the speed of objects, and so on.
  • the computer vision system 140 may use the point cloud obtained by the lidar and the image of the surrounding environment obtained by the camera.
  • the route control system 142 is used to determine the driving route of the automatic driving device 100.
  • the route control system 142 may combine data from the sensor 138, the GPS 122, and one or more predetermined maps to determine the driving route for the automatic driving device 100.
  • the obstacle avoidance system 144 is used to identify, evaluate, and avoid or otherwise cross over potential obstacles in the environment of the automatic driving device 100.
  • Of course, the control system 106 may additionally or alternatively include components other than those shown and described, or some of the components shown above may be omitted.
  • the automatic driving device 100 interacts with external sensors, other vehicles, other computer systems, or users through peripheral devices 108.
  • the peripheral device 108 may include a wireless communication system 146, an onboard computer 148, a microphone 150 and/or a speaker 152.
  • the peripheral device 108 provides a means for the user of the autonomous driving apparatus 100 to interact with the user interface 116.
  • the onboard computer 148 may provide information to the user of the automatic driving device 100.
  • the user interface 116 can also operate the onboard computer 148 to receive user input.
  • the on-board computer 148 can be operated through a touch screen.
  • the peripheral device 108 may provide a means for the autonomous driving device 100 to communicate with other devices located in the vehicle.
  • the microphone 150 may receive audio (eg, voice commands or other audio input) from the user of the autonomous driving device 100.
  • the speaker 152 may output audio to the user of the automatic driving device 100.
  • the wireless communication system 146 may wirelessly communicate with one or more devices directly or via a communication network.
  • the wireless communication system 146 may use 3G cellular communication, or 4G cellular communication, such as LTE, or 5G cellular communication.
  • the wireless communication system 146 may use WiFi to communicate with a wireless local area network (WLAN).
  • The wireless communication system 146 may communicate directly with a device using an infrared link, Bluetooth, ZigBee, or other wireless protocols, such as various vehicle communication systems.
  • The wireless communication system 146 may include one or more dedicated short-range communications (DSRC) devices, which may include public and/or private data communications between vehicles and/or roadside stations.
  • the power supply 110 may provide power to various components of the automatic driving device 100.
  • the power source 110 may be a rechargeable lithium ion or lead-acid battery.
  • One or more battery packs of such batteries may be configured as a power source to provide power to various components of the automatic driving device 100.
  • the power source 110 and the energy source 119 may be implemented together, such as in some all-electric vehicles.
  • the computer system 112 may include at least one processor 113 that executes instructions 115 stored in a non-transitory computer readable medium such as a data storage device 114.
  • the computer system 112 may also be multiple computing devices that control individual components or subsystems of the automatic driving apparatus 100 in a distributed manner.
  • the processor 113 may be any conventional processor, such as a commercially available central processing unit (CPU). Alternatively, the processor may be a dedicated device such as an ASIC or other hardware-based processor.
  • Although FIG. 1 functionally illustrates the processor, memory, and other elements of the computer system 112 in the same block, those of ordinary skill in the art should understand that the processor, computer, or memory may actually include multiple processors, computers, or memories that may or may not be stored within the same physical housing.
  • the memory may be a hard disk drive or other storage medium located in a housing other than the computer system 112. Therefore, a reference to a processor or computer will be understood to include a reference to a collection of processors or computers or memories that may or may not operate in parallel.
  • some components such as the steering component and the deceleration component may each have its own processor that only performs calculations related to component-specific functions.
  • the processor may be located far away from the automatic driving device and wirelessly communicate with the automatic driving device. In other aspects, some operations in the process described herein are performed on a processor arranged in the automatic driving device and others are performed by a remote processor, including taking the necessary steps to perform a single manipulation.
  • the data storage device 114 may include instructions 115 (e.g., program logic), which may be executed by the processor 113 to perform various functions of the automatic driving device 100, including those described above.
  • the data storage device 114 may also contain additional instructions, including sending data to, receiving data from, interacting with, and/or performing data on one or more of the propulsion system 102, the sensor system 104, the control system 106, and the peripheral device 108. Control instructions.
  • the data storage device 114 may also store data, such as road maps, route information, the location, direction, speed, and other information of the vehicle. This information may be used by the automatic driving device 100 and the computer system 112 during the operation of the automatic driving device 100 in autonomous, semi-autonomous, and/or manual modes.
  • the user interface 116 is used to provide information to or receive information from the user of the automatic driving device 100.
  • the user interface 116 may include one or more input/output devices in the set of peripheral devices 108, such as a wireless communication system 146, an in-vehicle computer 148, a microphone 150, and a speaker 152.
  • the computer system 112 may control the functions of the automatic driving device 100 based on inputs received from various subsystems (for example, the traveling system 102, the sensor system 104, and the control system 106) and from the user interface 116. For example, the computer system 112 may utilize input from the control system 106 in order to control the steering unit 132 to avoid obstacles detected by the sensor system 104 and the obstacle avoidance system 144. In some embodiments, the computer system 112 is operable to provide control of many aspects of the autonomous driving device 100 and its subsystems.
  • one or more of the aforementioned components may be installed or associated with the automatic driving device 100 separately.
  • the data storage device 114 may exist partially or completely separately from the automatic driving device 100.
  • the aforementioned components may be communicatively coupled together in a wired and/or wireless manner.
  • FIG. 1 should not be construed as a limitation to the embodiments of the present application.
  • An autonomous vehicle traveling on a road can recognize objects in its surrounding environment to determine the adjustment to the current speed.
  • the object may be other vehicles, traffic control equipment, or other types of objects.
  • each recognized object can be considered independently, and based on the respective characteristics of the object, such as its current speed, acceleration, distance from the vehicle, etc., can be used to determine the speed to be adjusted by the autonomous vehicle.
  • The automatic driving device 100 or the computing equipment associated with the automatic driving device 100 may predict the behavior of the identified object based on the characteristics of the identified object and the state of the surrounding environment (for example, traffic, rain, ice on the road, etc.).
  • each recognized object depends on each other's behavior, so all recognized objects can also be considered together to predict the behavior of a single recognized object.
  • the automatic driving device 100 can adjust its speed based on the predicted behavior of the recognized object.
  • an autonomous vehicle can determine what stable state the vehicle will need to adjust to (for example, accelerate, decelerate, or stop) based on the predicted behavior of the object.
  • other factors can also be considered to determine the speed of the automatic driving device 100, such as the lateral position of the automatic driving device 100 on the traveling road, the curvature of the road, the proximity of static obstacles and dynamic obstacles, etc. .
  • The computing device can also provide instructions for modifying the steering angle of the automatic driving device 100, so that the self-driving car follows a given trajectory and/or maintains a safe lateral and longitudinal distance from objects near the self-driving car (for example, a car in an adjacent lane on the road).
  • The above-mentioned automatic driving device 100 may be a car, a truck, a motorcycle, a bus, a boat, an airplane, a helicopter, a lawn mower, a recreational vehicle, an amusement park vehicle, construction equipment, a tram, a golf cart, a train, a trolley, etc.; this is not specifically limited in the embodiments of the present invention.
  • FIG. 1 above shows a functional block diagram of the automatic driving device 100; the automatic driving system 101 is introduced below.
  • Fig. 2 is a schematic structural diagram of an automatic driving system provided by an embodiment of the application.
  • Fig. 1 and Fig. 2 describe the automatic driving device 100 from different perspectives.
  • the computer system 101 includes a processor 103, and the processor 103 is coupled to a system bus 105.
  • the processor 103 may be one or more processors, where each processor may include one or more processor cores.
  • a display adapter (video adapter) 107 can drive the display 109, and the display 109 is coupled to the system bus 105.
  • the system bus 105 is coupled with an input output (I/O) bus 113 through a bus bridge 111.
  • the I/O interface 115 is coupled to the I/O bus.
  • The I/O interface 115 communicates with a variety of I/O devices, such as an input device 117 (e.g., a keyboard, a mouse, a touch screen), a media tray 121 (e.g., a CD-ROM or a multimedia interface), a transceiver 123 (which can send and/or receive radio communication signals), a camera 155 (which can capture static scene and dynamic digital video images), and an external USB interface 125.
  • the interface connected to the I/O interface 115 may be a USB interface.
  • the processor 103 may be any conventional processor, including a reduced instruction set computing ("RISC”) processor, a complex instruction set computing (“CISC”) processor, or a combination of the foregoing.
  • the processor may be a dedicated device such as an application specific integrated circuit (“ASIC").
  • the processor 103 may be a neural network processor (Neural-network Processing Unit, NPU) or a combination of a neural network processor and the foregoing traditional processors.
  • the processor 103 is mounted with a neural network processor.
  • the computer system 101 can communicate with the software deployment server 149 through the network interface 129.
  • the network interface 129 is a hardware network interface, such as a network card.
  • the network 127 may be an external network, such as the Internet, or an internal network, such as an Ethernet or a virtual private network.
  • the network 127 may also be a wireless network, such as a WiFi network, a cellular network, and so on.
  • A hard disk drive interface is coupled to the system bus 105, and the hard disk drive interface is connected to a hard disk drive.
  • the system memory 135 is coupled to the system bus 105.
  • the data running in the system memory 135 may include the operating system 137 and application programs 143 of the computer system 101.
  • the operating system includes a shell (Shell) 139 and a kernel (kernel) 141.
  • the shell 139 is an interface between the user and the kernel of the operating system.
  • the shell 139 is the outermost layer of the operating system.
  • the shell 139 manages the interaction between the user and the operating system: waiting for the user's input, interpreting the user's input to the operating system, and processing the output results of various operating systems.
  • The kernel 141 consists of those parts of the operating system that manage memory, files, peripherals, and system resources. Interacting directly with the hardware, the operating system kernel usually runs processes, provides inter-process communication, and provides CPU time-slice management, interrupt handling, memory management, I/O management, and so on.
  • the application program 141 includes programs related to automatic driving, such as programs that manage the interaction between the automatic driving device and obstacles on the road, programs that control the driving route or speed of the automatic driving device, and programs that control the interaction between the automatic driving device 100 and other automatic driving devices on the road.
  • the application program 141 also exists on the system of a software deployment server (deploying server) 149. In one embodiment, when the application program 141 needs to be executed, the computer system 101 may download the application program 141 from the software deployment server 149.
  • the sensor 153 is associated with the computer system 101.
  • the sensor 153 is used to detect the environment around the computer system 101.
  • the sensor 153 can detect animals, cars, obstacles, and crosswalks.
  • The sensor can also detect the environment around objects such as animals, cars, obstacles, and crosswalks; for example, the environment around an animal, such as other animals appearing around it, the weather conditions, and the brightness of the surrounding environment.
  • the sensor may be a camera (ie camera), lidar, infrared sensor, chemical detector, microphone, etc.
  • the sensor 153 senses information at preset intervals when activated and provides the sensed information to the computer system 101 in real time or near real time.
  • The sensor 153 may include a lidar, which can provide acquired point clouds to the computer system 101 in real time or near real time, where each acquired point cloud corresponds to a timestamp.
  • the camera provides the acquired images to the computer system 101 in real time or near real time, and each frame of image corresponds to a time stamp. It should be understood that the computer system 101 can obtain an image sequence from a camera.
  • the computer system 101 may be located far away from the automatic driving device, and may perform wireless communication with the automatic driving device.
  • the transceiver 123 can send automatic driving tasks, sensor data collected by the sensor 153, and other data to the computer system 101; and can also receive control instructions sent by the computer system 101.
  • the automatic driving device can execute the control instructions from the computer system 101 received by the transceiver, and perform corresponding driving operations.
  • some of the processes described herein are executed on a processor installed in an autonomous vehicle, and others are executed by a remote processor, including taking actions required to perform a single manipulation.
  • FIG. 3 is a flowchart of a method for determining a matching relationship between image frames according to an embodiment of the application. As shown in FIG. 3, the method may include:
  • the device for determining a matching relationship acquires N sets of feature point pairs.
  • the device for determining the matching relationship may be an automatic driving device or a server.
  • the automatic driving device collects the first image and the second image, and executes the method flow of FIG. 3 to determine the matching relationship between the first image and the second image.
  • Alternatively, the automatic driving device may send the collected image data and point cloud data to a matching relationship determination device (such as a server), and the matching relationship determination device executes the method flow in FIG. 3 and determines the matching relationship between the first image and the second image according to the data.
  • Each feature point pair includes two matching feature points: one feature point is extracted from the first image, and the other is extracted from the second image. The first image and the second image are the images collected by the automatic driving device at the first time and the second time respectively, and N is an integer greater than 1.
  • the first image and the second image are respectively images collected by the same camera on the automatic driving device at different times.
  • For example, the automatic driving device collects the first image at the first moment and the second image at the second moment; it performs feature extraction on the first image to obtain a first feature point set and on the second image to obtain a second feature point set; the feature points in the first feature point set are then matched against the feature points in the second feature point set to obtain a feature matching point set, which includes the N groups of feature point pairs.
  • the N sets of feature point pairs may be N sets of feature point pairs selected by the automatic driving device from the set of feature matching points.
  • N can be an integer such as 5, 6, or 8.
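  • A minimal sketch of obtaining matched feature point pairs from the two frames, using ORB features and brute-force matching as one possible, purely illustrative choice of detector and matcher; the application does not specify which feature extractor is used.

```python
import cv2
import numpy as np

def match_features(img1, img2, n_pairs=8):
    """Returns up to n_pairs matched pixel coordinate rows (u1, v1, u2, v2)."""
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(img1, None)   # first feature point set
    kp2, des2 = orb.detectAndCompute(img2, None)   # second feature point set
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    pairs = [(*kp1[m.queryIdx].pt, *kp2[m.trainIdx].pt) for m in matches[:n_pairs]]
    return np.array(pairs)
```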
  • The device for determining the matching relationship uses the motion state information of the dynamic obstacle to adjust the pixel coordinates of the target feature points in the N groups of feature point pairs.
  • the target feature point belongs to the feature point corresponding to the dynamic obstacle in the first image and/or the second image, and the pixel coordinates of the feature points other than the target feature point in the N groups of feature point pairs remain unchanged.
  • There may be one dynamic obstacle or multiple dynamic obstacles, which is not limited in this application. In some embodiments, the dynamic obstacles may be all dynamic obstacles in the first image and/or the second image. The implementation of step 302 will be detailed later.
  • the device for determining the matching relationship determines the target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
  • the target matching relationship between the first image and the second image may be a translation matrix and a rotation matrix between the first image and the second image.
  • the automatic driving device determines the target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N sets of feature point pairs.
  • For example, the automatic driving device may determine the translation matrix and the rotation matrix between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N sets of feature point pairs. The method of calculating the translation matrix and rotation matrix between two frames of images will be detailed later.
  • The purpose of using the motion state information to adjust the pixel coordinates of the target feature points is to adjust the pixel coordinates of the feature points corresponding to the dynamic obstacle in the N groups of feature point pairs so that the translation matrix and rotation matrix between those feature points become basically the same as the translation matrix and rotation matrix between the feature points corresponding to static obstacles in the N groups of feature point pairs. In this way, the matching relationship between the first image and the second image, i.e. the translation matrix and rotation matrix between the first image and the second image, can be determined more accurately.
  • For example, the first to fifth feature points in the first image match the sixth to tenth feature points in the second image in turn; if the first to fifth feature points all correspond to static obstacles, the matching relationship between the first image and the second image can be accurately determined based on the pixel coordinates of the first to fifth feature points and the pixel coordinates of the sixth to tenth feature points. After the adjustment, the translation matrix between the feature points corresponding to the dynamic obstacle in the N groups of feature point pairs is basically the same as that for static obstacles, so the matching relationship between the first image and the second image can be determined more accurately from the adjusted pixel coordinates of each feature point in the N groups of feature point pairs.
• the foregoing does not describe the implementation of step 302 in detail; an optional implementation of step 302 is described below.
• in some embodiments, the motion state information includes the displacement of the dynamic obstacle from the first moment to the second moment; using the motion state information of the dynamic obstacle to adjust the pixel coordinates of the target feature point in the N groups of feature point pairs may include: using the displacement to adjust the pixel coordinates of a reference feature point, where the reference feature point is included in the target feature point and belongs to the feature points corresponding to the dynamic obstacle in the second image.
• the displacement may be the displacement of the dynamic obstacle in the camera coordinate system from the first moment to the second moment. Since the displacement of the dynamic obstacle in the camera coordinate system is approximately equal to its displacement in the image coordinate system, the displacement of the dynamic obstacle in the camera coordinate system can be regarded as the displacement of the feature point corresponding to the dynamic obstacle in the image coordinate system. The following describes how the device for determining the matching relationship obtains the displacement of the dynamic obstacle in the camera coordinate system from the first moment to the second moment.
  • the device for determining the matching relationship may determine the first speed of the dynamic obstacle at the first moment and the second speed at the second moment according to the point cloud data collected by the lidar on the automatic driving device; calculate the first speed Sum the average of the second speed to get the average speed.
• the first speed is (V x1 , V y1 , V z1 )
• the second speed is (V x2 , V y2 , V z2 )
• the average speed is V 1 = ((V x1 +V x2 )/2, (V y1 +V y2 )/2, (V z1 +V z2 )/2), where the three components are the speed of the dynamic obstacle in the X direction, Y direction and Z direction, respectively. It can be understood that the average speed is the speed of the dynamic obstacle in the lidar coordinate system.
  • the device for determining the matching relationship may first convert the average speed from the lidar coordinate system to the vehicle coordinate system, and then convert the average speed from the vehicle coordinate system to the camera coordinate system.
• the self-car coordinate system (also called the vehicle coordinate system) is a special dynamic coordinate system used to describe the movement of a car; its origin coincides with the center of mass of the vehicle.
  • the X axis is parallel to the ground and points to the front of the vehicle.
• the Z axis points upward through the center of mass of the vehicle, and the Y axis points to the left of the driver.
  • the device for determining the matching relationship may directly convert the average speed from the lidar coordinate system to the camera coordinate system.
  • the automatic driving device can use the following formula to convert the average speed from the lidar coordinate system to the own vehicle coordinate system:
• V 1 ′ = R 1 ⋅ V 1 + T 1 (1);
  • V 1 ′ is the average speed in the self-car coordinate system
  • V 1 is the average speed in the lidar coordinate system
  • R 1 is the rotation matrix (external parameter) calibrated by the lidar
• T 1 is the translation matrix calibrated by the lidar.
  • the automatic driving device can use the following formula to convert the average speed from the vehicle coordinate system to the camera coordinate system:
• V 1 ″ = R 2 ⋅ V 1 ′ + T 2 (2);
  • V 1 ′′ is the average speed in the camera coordinate system
  • V 1 ′ is the average speed in the self-car coordinate system
  • R 2 is the rotation matrix between the automatic driving device and the camera
• T 2 is the translation matrix between the automatic driving device and the camera.
  • the automatic driving device can use the following formula to convert the average speed from the lidar coordinate system to the camera coordinate system:
• V 1 ″ = R 3 ⋅ V 1 + T 3 (3);
  • V 1 ′′ is the average speed in the camera coordinate system
  • V 1 is the average speed in the lidar coordinate system
  • R 3 is the rotation matrix between the lidar and the camera
• T 3 is the translation matrix between the lidar and the camera. An illustrative sketch of these transformations is given below.
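• the following is a minimal sketch, not part of the application text, of formulas (1) to (3): averaging the two lidar speed measurements and transforming the result from the lidar coordinate system to the camera coordinate system. The calibration matrices used here are placeholder assumptions.

```python
import numpy as np

def average_speed(v1, v2):
    """Average of the speeds measured at the first and second moments."""
    return (np.asarray(v1, float) + np.asarray(v2, float)) / 2.0

def transform_speed(v, R, T):
    """Apply a calibrated rotation R and translation T, as in V' = R.V + T."""
    return R @ v + T

# Placeholder extrinsics (assumptions, not values from the application).
R1, T1 = np.eye(3), np.zeros(3)   # lidar -> self-car coordinate system
R2, T2 = np.eye(3), np.zeros(3)   # self-car -> camera coordinate system

v_avg_lidar = average_speed((1.0, 0.2, 0.0), (1.2, 0.1, 0.0))
v_avg_car   = transform_speed(v_avg_lidar, R1, T1)   # formula (1)
v_avg_cam   = transform_speed(v_avg_car, R2, T2)     # formula (2)
# Formula (3) composes both steps, with R3 = R2.R1 and T3 = R2.T1 + T2.
```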
  • (x', y') is the pixel coordinate after the reference feature point is adjusted
  • (x, y) is the pixel coordinate before the reference feature point is adjusted
  • ⁇ t is the time length from the first moment to the second moment
• V x ″ is the component of V 1 ″ in the X direction
• V y ″ is the component of V 1 ″ in the Y direction, that is, V 1 ″ is (V x ″, V y ″, V z ″). An illustrative sketch of this pixel-coordinate compensation is given below.
  • the displacement of the dynamic obstacle from the first moment to the second moment is used to adjust the pixel coordinates of the reference feature point (ie motion compensation), so that the pixel coordinates of the reference feature point are adjusted to be basically equivalent to the static The pixel coordinates of the obstacle, so as to more accurately determine the matching relationship between the first image and the second image.
  • the first projection area is the area where the image of the dynamic obstacle in the first image is located
  • the second projection area is the area where the image of the dynamic obstacle in the second image is located.
• the automatic driving device obtains a target point cloud characterizing the characteristics of the dynamic obstacle at the first moment, and projects the target point cloud onto the first image to obtain the first projection area; it obtains an intermediate point cloud characterizing the characteristics of the dynamic obstacle at the second moment, and projects the intermediate point cloud onto the second image to obtain the second projection area.
• the obtained target point cloud is projected to the camera coordinate system by using the external parameters between the lidar and the camera (the first camera or the second camera); the external parameters here mainly refer to the rotation matrix R ibeoTocam and the translation vector T ibeoTocam between the lidar and the camera. In the projection formula (an illustrative sketch is given after the following definitions):
  • P ibeo represents the position of a certain point of the dynamic obstacle perceived by the lidar in the lidar coordinate system
  • P cam represents the position of this point in the camera coordinate system
  • K is the internal parameter matrix of the camera
  • U is the coordinates of the point in the image coordinate system.
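• the following is a hedged sketch, assuming a standard pinhole projection, of projecting a lidar point onto the image: first the extrinsic transform between lidar and camera, then the camera intrinsic matrix K. The matrix values are placeholder assumptions.

```python
import numpy as np

def project_lidar_point(P_ibeo, R_ibeoTocam, T_ibeoTocam, K):
    P_cam = R_ibeoTocam @ P_ibeo + T_ibeoTocam   # position in camera coordinates
    u_hom = K @ P_cam                            # homogeneous image coordinates
    return u_hom[:2] / u_hom[2]                  # U: pixel coordinates

# Placeholder intrinsics and extrinsics (assumptions).
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])
U = project_lidar_point(np.array([2.0, 0.5, 10.0]), np.eye(3), np.zeros(3), K)
```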
• the automatic driving device can scan the surrounding environment at a certain scanning frequency through the lidar (ibeo) to obtain the point cloud of obstacles at different times, and use neural network (NN) algorithms or non-NN algorithms on the point cloud at each moment to determine the motion information of the obstacle (such as position, speed, bounding box, and posture).
  • the lidar can provide the acquired point cloud to the computer system 101 in real time or near real time, and each acquired point cloud corresponds to a time stamp.
  • the camera provides the acquired images to the computer system 101 in real time or near real time, and each frame of image corresponds to a time stamp. It should be understood that the computer system 101 can obtain the image sequence from the camera and the point cloud sequence from the lidar.
  • the time stamps of the two sensors are usually not synchronized.
• therefore, interpolation or extrapolation is performed on the motion information of obstacles detected by the lidar. If the scanning frequency of the lidar is higher than the shooting frequency of the camera, interpolation is performed.
• the specific calculation process is: for example, the camera time of the latest shot is t cam ; find the two closest times t k and t k+1 in the output of the lidar, where t k ≤ t cam ≤ t k+1 . Taking the position interpolation calculation as an example, if the position of the obstacle detected by the lidar at t k is P k and the position detected at t k+1 is P k+1 , then the position of the obstacle at t cam is obtained by linear interpolation: P(t cam ) = P k + (P k+1 − P k )·(t cam − t k )/(t k+1 − t k ) (7).
  • the automatic driving device can use the same method to interpolate other movement information of the obstacle, such as speed, posture, point cloud, etc., to obtain the movement information of the obstacle when the camera takes an image.
  • the camera captures the first image at the first moment
  • the lidar scans the first point cloud at the third moment
• the lidar scans the second point cloud at the fourth moment; the third moment and the fourth moment are, among the scanning moments of the lidar, the two scanning moments closest to the first moment. The corresponding points in the first point cloud and the second point cloud are interpolated by using a formula similar to formula (7) to obtain the target point cloud of the obstacle at the first moment.
• if the scanning frequency of the lidar is lower than the shooting frequency of the camera, extrapolation is performed. Interpolation and extrapolation are commonly used mathematical calculations and will not be detailed here; an illustrative sketch of the interpolation is given below.
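• the following is a minimal sketch, under the assumption that formula (7) is ordinary linear interpolation, of interpolating a lidar obstacle position to the camera timestamp t cam with t k ≤ t cam ≤ t k+1.

```python
def interpolate_position(p_k, p_k1, t_k, t_k1, t_cam):
    """Linearly interpolate an obstacle position to the camera timestamp."""
    alpha = (t_cam - t_k) / (t_k1 - t_k)
    return [a + alpha * (b - a) for a, b in zip(p_k, p_k1)]

# Example with assumed positions and timestamps.
p_cam = interpolate_position([1.0, 2.0, 0.0], [1.5, 2.2, 0.0], 0.00, 0.10, 0.04)
```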
  • the automatic driving device determines the matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
• the automatic driving device can arbitrarily select N sets of feature point pairs from the multiple sets of feature point pairs matching the first image and the second image, and determine the matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N sets of feature point pairs. Since there may be noise points in the N groups of feature point pairs that cannot accurately reflect the matching relationship between the first image and the second image, it is necessary to select N groups of feature point pairs that can accurately reflect the matching relationship between the first image and the second image in order to determine that matching relationship accurately.
  • an improved RANSAC algorithm is used in this embodiment of the present application to determine the matching relationship between the two frames of images before and after.
  • FIG. 4 is a flowchart of a method for determining the matching relationship between two frames of images before and after according to an embodiment of the application.
  • Figure 4 is a further refinement and improvement of the method flow in Figure 3.
  • the method flow in FIG. 3 is a part of the method flow in FIG. 4.
  • the method may include:
  • the device for determining a matching relationship determines a first projection area where a dynamic obstacle is located in a first image and a second projection area where a dynamic obstacle is located in the second image.
• the foregoing embodiments describe projecting the target point cloud corresponding to the dynamic obstacle at the first moment onto the first image to obtain the first projection area, and projecting the intermediate point cloud corresponding to the dynamic obstacle at the second moment onto the second image to obtain the second projection area, which will not be repeated here.
  • the device for determining the matching relationship randomly selects N groups of feature point pairs from a set of matching feature points.
  • step 402 may be executed before step 401 is executed, or may be executed after step 401 is executed.
• the matching feature point set is a set of feature point pairs obtained by performing feature matching between the feature points extracted from the first image and the feature points extracted from the second image.
  • the automatic driving device may perform feature extraction on the first image to obtain a first feature point set, and perform feature extraction on the second image to obtain a second feature point set; the feature points in the first feature point set Perform feature matching with the feature points in the second feature point set to obtain a matching feature point set.
  • Step 402 is an implementation manner of step 301.
  • the device for determining the matching relationship determines whether the N groups of feature point pairs include special feature points.
  • the special feature point refers to the feature point in the first projection area and/or the second projection area among the N groups of feature point pairs. If not, execute 404; if yes, execute 405.
• the device for determining the matching relationship calculates the matching relationship between the first image and the second image according to the pixel coordinates of each feature point in the N groups of feature point pairs.
  • the matching relationship between the first image and the second image may be a translation matrix and a rotation matrix between the first image and the second image.
  • the matching relationship determination device uses the motion state information of the dynamic obstacle to adjust the pixel coordinates of the target feature point in the N groups of feature point pairs, and determines the first pixel coordinates according to the adjusted pixel coordinates of each feature point in the N groups of feature point pairs. The matching relationship between the image and the second image.
  • the target feature point belongs to the feature point corresponding to the dynamic obstacle in the first image and/or the second image.
  • the pixel coordinates corresponding to the feature points other than the target feature point in the N groups of feature point pairs remain unchanged.
  • Step 405 corresponds to step 302 and step 303 in FIG. 3.
  • the matching relationship determination device divides each feature point pair in the matching feature point set except for the N groups of feature point pairs into an interior point and an exterior point according to the matching relationship to obtain an interior point set and an exterior point set.
• dividing each feature point pair in the matching feature point set except the N groups of feature point pairs into interior points and exterior points to obtain the interior point set and the exterior point set may be implemented by sequentially detecting whether each feature point pair in the matching feature point set, except for the N groups of feature point pairs, satisfies the matching relationship; if it does, the feature point pair is determined to be an interior point; if not, the feature point pair is determined to be an exterior point.
  • the device for determining the matching relationship determines whether the number of interior points in the currently obtained interior point set is the largest.
• the method process in Figure 4 is a process of multiple iterations. Determining whether the number of interior points in the currently obtained interior point set is the largest means determining whether, compared with the previously obtained interior point sets, the currently obtained interior point set contains the largest number of interior points.
  • the device for determining the matching relationship determines whether the current iteration number meets the termination condition.
• the target matching relationship is the better matching relationship among the two or more matching relationships determined between the first image and the second image. It can be understood that, according to the better matching relationship, dividing each feature point pair in the matching feature point set except for the N groups of feature point pairs into interior points and exterior points yields more interior points.
• after the adjustment, the relationship between the two feature points included in a feature point pair corresponding to the dynamic obstacle is basically consistent with the relationship between the two feature points included in a feature point pair corresponding to a static obstacle. That is to say, after step 405 is executed, the N groups of feature point pairs can be regarded as feature point pairs corresponding to static obstacles, so that the influence of the feature point pairs corresponding to dynamic obstacles is reduced and a better matching relationship can be determined quickly.
  • using the RANSAC algorithm can select a better matching relationship from the multiple matching relationships between the first image and the second image that have been determined, so as to ensure the quality of the determined matching relationship.
• the RANSAC algorithm can be used to accurately and quickly determine the matching relationship between the first image and the second image; an illustrative sketch of this iterative procedure is given below.
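• the following is a hedged sketch of the RANSAC-style loop of Figure 4 (steps 402 to 408). The callables passed in (compensate, in_dynamic_area, estimate_relation, count_inliers) are assumed placeholders for the operations described in the text, not functions defined by the application.

```python
import random

def improved_ransac(matches, compensate, in_dynamic_area,
                    estimate_relation, count_inliers, N=8, iters=100):
    best_relation, best_count = None, -1
    for _ in range(iters):
        sample = random.sample(matches, N)               # step 402
        # Step 405: adjust only feature points inside a dynamic-obstacle
        # projection area; all other pixel coordinates stay unchanged.
        sample = [compensate(pair) if in_dynamic_area(pair) else pair
                  for pair in sample]
        relation = estimate_relation(sample)             # rotation + translation
        count = count_inliers(matches, relation)         # interior points (406)
        if count > best_count:                           # step 407
            best_relation, best_count = relation, count
    return best_relation
```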
  • the foregoing embodiment did not describe in detail how to determine the matching relationship between the first image and the second image.
  • the following describes how to use multiple feature point pairs corresponding to the first image and the second image to calculate the rotation matrix R and the translation matrix T between the two images.
  • the above-mentioned matching feature point set includes multiple sets of feature point pairs obtained by performing feature matching between the feature points extracted from the first image and the feature points extracted from the second image.
  • Each set of feature point pairs in the matching feature point set includes two matching feature points.
  • One feature point is a feature point extracted from the first image
  • the other feature point is a feature point extracted from the second image.
• the first image and the second image are the images collected by the automatic driving device at the first time and the second time, respectively.
  • multiple sets of feature point pairs include point set A and point set B.
  • the feature points in point set A are feature points extracted from the first image
  • the feature points in point set B are feature points extracted from the second image.
• the two point sets contain the same number of elements, and their elements correspond to each other one-to-one.
  • the point set A can be the feature points extracted from the first image in the N groups of feature point pairs
  • the point set B can be the feature points extracted from the second image in the N groups of feature point pairs.
• the rotation matrix and the translation matrix between the point set A and the point set B are the rotation matrix and translation matrix between the first image and the second image.
  • B represents the pixel coordinates of the feature points in the point set B
  • A represents the pixel coordinates of the feature points in the point set A.
  • A′ i is the pixel coordinate of the i-th feature point in the point set A′
  • B′ i is the pixel coordinate of the i-th feature point in the point set B′.
  • R is the rotation matrix between the point set A and the point set B, that is, the rotation matrix between the first image and the second image.
• t is the translation matrix between the point set A and the point set B, that is, the translation matrix between the first image and the second image. An illustrative sketch of recovering R and t from the two point sets is given below.
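• the following is a hedged sketch, not the application's own derivation, of one common way to recover R and t from the matched point sets A and B (with B ≈ R·A + t): center both sets on their centroids (giving A′ and B′) and solve the least-squares alignment with an SVD.

```python
import numpy as np

def estimate_rigid_transform(A, B):
    """Estimate R, t such that B is approximately R @ A + t (row-wise points)."""
    A, B = np.asarray(A, float), np.asarray(B, float)     # shape (n, d)
    cA, cB = A.mean(axis=0), B.mean(axis=0)               # centroids
    A_c, B_c = A - cA, B - cB                             # A'_i, B'_i
    H = A_c.T @ B_c                                       # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                              # avoid a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = cB - R @ cA
    return R, t
```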
  • the foregoing embodiment describes the implementation manner of determining the matching relationship between the two frames of images before and after.
  • the matching relationship between adjacent image frames collected by the automatic driving device can be determined sequentially, and then the matching relationship between each frame image and the reference frame image can be determined.
  • the reference frame image may be the first frame image collected by the automatic driving device during a driving process.
• for example, the automatic driving device sequentially collects the first frame image to the 1000th frame image in chronological order within a certain period of time, and the automatic driving device can determine the translation matrix and rotation matrix between every two adjacent frames of images, for example the translation matrix and rotation matrix between the first frame image and the second frame image. According to these translation matrices and rotation matrices, the matching relationship between any frame of the 1000 frames of images except the first frame image and the first frame image is determined, so as to calculate the reprojection error of each frame image.
  • the rotation matrix between the first image and the second image is R 4
  • the translation matrix is T 4
  • the rotation matrix between the second image and the fifth image is R 5
  • the translation matrix is T 5
  • the rotation matrix between the first image and the fifth image is (R 4 ⁇ R 5 )
• the translation matrix between the first image and the fifth image is (R 4 ⋅ T 5 + T 4 ). A minimal sketch of this composition is given below.
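• the following is a minimal sketch of chaining pairwise matching relationships, as in the example above: the rotation between the first and fifth images is R 4 ⋅ R 5 and the translation is R 4 ⋅ T 5 + T 4. The matrix values used here are placeholders.

```python
import numpy as np

def compose(R_ab, T_ab, R_bc, T_bc):
    """Compose the relation a->b with the relation b->c."""
    return R_ab @ R_bc, R_ab @ T_bc + T_ab

R4, T4 = np.eye(3), np.array([1.0, 0.0, 0.0])   # placeholder values
R5, T5 = np.eye(3), np.array([0.5, 0.2, 0.0])
R15, T15 = compose(R4, T4, R5, T5)              # (R4.R5, R4.T5 + T4)
```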
• each time the automatic driving device collects a frame of image, it determines the matching relationship between that frame of image and the previous frame, so that the matching relationship between any two adjacent frames, and in turn between any two frames of images, can be obtained.
• the matching relationship determination device can use the translation matrix and rotation matrix to convert the three-dimensional space coordinates corresponding to the feature points in the current frame from the self-car coordinate system to the reference coordinate system in order to calculate the reprojection error of the current frame.
  • the foregoing embodiments describe how to more accurately determine the matching relationship between image frames.
  • An important application of calculating the matching relationship between image frames is to calculate the matching relationship between the current frame and the reference frame, and then to calculate the reprojection error of the current frame.
  • the embodiment of the present application also provides a reprojection error calculation method, which is described in detail below.
  • FIG. 5 is a flowchart of a method for calculating reprojection errors according to an embodiment of the application. As shown in Figure 5, the method may include:
  • the reprojection error calculation device uses the motion state information of the dynamic obstacle to adjust the space coordinates corresponding to the first feature point in the first space coordinates to obtain the second space coordinates.
  • the re-projection error calculation device may be an automatic driving device, or a computer device such as a server or a computer.
  • the automatic driving device collects the first image and executes the method flow of FIG. 5 to calculate the reprojection error of the first image.
  • the automatic driving device may send the collected image data and point cloud data to a re-projection error calculation device (such as a server); the re-projection error calculation device executes the method in FIG. 5, based on the data Calculate the reprojection error of the first image.
  • the first space coordinates include space coordinates corresponding to each feature point in the first image, and the first feature point is a feature point corresponding to the dynamic obstacle in the first image.
  • the first image may be an image collected by the automatic driving device at the second moment.
• the space coordinates corresponding to the feature points other than the first feature point in the first space coordinates remain unchanged.
• the motion state information may include the displacement (corresponding to a translation matrix) and the attitude change (corresponding to a rotation matrix) of the dynamic obstacle from the first moment to the second moment.
  • the reprojection error calculation device may determine the corresponding three-dimensional space coordinates of each feature point in the first image in the reference coordinate system to obtain the first space coordinates, and determine the The space coordinates corresponding to the first feature point.
  • the reference coordinate system may be a world coordinate system established by the automatic driving device at the starting point of this driving. The implementation of determining the first space coordinates and the space coordinates corresponding to the first feature points will be detailed later.
  • the reprojection error calculation device projects the second space coordinates to the first image to obtain the first pixel coordinates.
  • the second space coordinate may be a space coordinate in a reference coordinate system.
• the reprojection error calculation device projecting the second space coordinates to the first image to obtain the first pixel coordinates may be projecting the second space coordinates in the reference coordinate system to the first image to obtain the first pixel coordinates. Since the reprojection error of each frame of image collected by the automatic driving device needs to be calculated in a fixed coordinate system, it is necessary to determine the three-dimensional space coordinates corresponding to each feature point in the first image in the reference coordinate system to obtain the first space coordinates.
  • the reference coordinate system is a fixed coordinate system, unlike the self-car coordinate system that will change. Using the motion state information of the dynamic obstacle to adjust the space coordinates corresponding to the first feature point in the first space coordinates is to use the motion state information of the dynamic obstacle to adjust the space coordinates of the first feature point in the reference coordinate system .
  • the reprojection error calculation device calculates the reprojection error of the first image according to the first pixel coordinate and the second pixel coordinate.
  • the second pixel coordinate includes the pixel coordinate of each feature point in the first image, and each pixel coordinate included in the first pixel coordinate corresponds to each pixel coordinate included in the second pixel coordinate in a one-to-one correspondence.
• each pixel coordinate included in the first pixel coordinates corresponds to a descriptor, and each descriptor is used to describe its corresponding feature point; each pixel coordinate included in the second pixel coordinates also corresponds to a descriptor. It can be understood that a pixel coordinate in the first pixel coordinates and a pixel coordinate in the second pixel coordinates correspond to each other when they correspond to the same descriptor.
  • the reprojection error calculation device may use the displacement to adjust the pixel coordinates of the first feature point in the first image to obtain the second pixel coordinates, and the first image is divided by The pixel coordinates of the feature points other than the first feature point remain unchanged.
  • the implementation manner of using the displacement to adjust the pixel coordinates of the first feature point in the first image may be the same as the implementation manner of using the displacement to adjust the pixel coordinates of the reference feature point described above, and will not be described in detail here.
• the reprojection error is the error between the projected points and the measurement points on a frame of image.
• the projected points can be the coordinate points obtained by projecting the three-dimensional space coordinates corresponding to each feature point in the frame of image onto the frame of image (that is, the first pixel coordinates), and the measurement points may be the coordinate points of these feature points in the frame of image (that is, the second pixel coordinates).
• the reprojection error calculation device calculating the reprojection error of the first image according to the first pixel coordinates and the second pixel coordinates may be implemented by calculating the difference between each pair of corresponding pixel coordinates in the first pixel coordinates and the second pixel coordinates.
• the reprojection error of the first image includes the reprojection error of each feature point in the first image, as in the sketch below.
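• the following is a minimal sketch, under the assumption that the per-feature error is taken as the Euclidean distance between the projected pixel coordinate (first pixel coordinates) and the measured pixel coordinate (second pixel coordinates).

```python
import numpy as np

def reprojection_errors(projected_px, measured_px):
    """Per-feature reprojection error between projected and measured pixels."""
    projected_px = np.asarray(projected_px, float)   # shape (n, 2)
    measured_px = np.asarray(measured_px, float)     # shape (n, 2)
    return np.linalg.norm(projected_px - measured_px, axis=1)

# Example with assumed pixel coordinates for a single feature point.
errors = reprojection_errors([[100.0, 50.0]], [[101.5, 49.0]])
```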
• the motion state information of the dynamic obstacle is used to adjust the space coordinates corresponding to the first feature point in the first space coordinates, so that the space coordinates corresponding to the first feature point are basically equivalent to the space coordinates of a feature point corresponding to a static obstacle.
• before performing step 501, the reprojection error calculation device needs to determine the first space coordinates and the first feature point. The following describes how to obtain the first space coordinates and the first feature point.
• the reprojection error calculation device may determine the first feature point in the following manner: before performing step 501, the reprojection error calculation device obtains the first image collected by the first camera at the second moment and the second image collected by the second camera at the second moment; performs feature extraction on the first image to obtain a first original feature point set, and performs feature extraction on the second image to obtain a second original feature point set; and performs feature matching on the feature points in the first original feature point set and the feature points in the second original feature point set to obtain a first feature point set. The feature points included in the first feature point set are the feature points in the first original feature point set that match feature points in the second original feature point set. The feature points corresponding to the dynamic obstacle in the first feature point set are then determined to obtain the first feature point.
• the reprojection error calculation device can determine the feature points corresponding to the dynamic obstacle in the first feature point set to obtain the first feature point in the following manner: obtain a target point cloud, which is a point cloud representing the characteristics of the dynamic obstacle at the second moment; project the target point cloud onto the first image to obtain the target projection area; and determine the feature points in the first feature point set that are located in the target projection area as the first feature point.
• the reprojection error calculation device may determine the first space coordinates in the following manner: perform feature matching on the feature points in the first original feature point set and the feature points in the second original feature point set to obtain a first feature point set, where the first feature point set includes multiple sets of feature point pairs, each set of feature point pairs includes two matching feature points, one feature point comes from the first original feature point set and the other feature point comes from the second original feature point set; and use a triangulation formula to determine a three-dimensional space coordinate according to each group of feature point pairs in the first feature point set to obtain the first space coordinates.
  • a three-dimensional space coordinate calculated from a set of feature point pairs is the space coordinate corresponding to the two feature points included in the set of feature point pairs.
  • the first feature point is included in the first feature point set.
• triangulation was first proposed by Gauss and used in surveying. To put it simply: observe the same three-dimensional point P(x, y, z) at different locations, and, knowing its two-dimensional projection points X1(x1, y1) and X2(x2, y2) observed at the different locations, use the triangle relationship to recover the depth information of the three-dimensional point, that is, its three-dimensional space coordinates.
  • Triangulation is mainly to calculate the three-dimensional coordinates of the feature points in the camera coordinate system through the matched feature points (ie, pixel points).
• Figure 6 is a schematic diagram of a triangulation process. As shown in Figure 6, P1 represents the coordinates of the three-dimensional point P in O1 (the left-eye coordinate system), that is, a two-dimensional projection point, P2 represents the coordinates of the three-dimensional point P in O2 (the right-eye coordinate system), also a two-dimensional projection point, and P1 and P2 are matched feature points.
  • the triangulation formula is as follows:
  • s1 represents the scale of the feature point in O1 (left eye coordinate system)
  • s2 represents the scale of the feature point in O2 (right eye coordinate system)
  • R and t respectively represent the rotation from the left eye camera to the right eye camera Matrix and translation matrix.
• T (uppercase) represents the transpose of the matrix. An illustrative sketch of a linear triangulation is given below.
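• the application's own triangulation formula is not reproduced above; the following hedged sketch uses the common linear (DLT) formulation to recover the 3D point from matched pixel observations in the left and right cameras, with K, R and t treated as known calibration values.

```python
import numpy as np

def triangulate(p1, p2, K, R, t):
    """Recover the 3D point seen at pixels p1 (left) and p2 (right)."""
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # left projection matrix
    P2 = K @ np.hstack([R, t.reshape(3, 1)])            # right projection matrix
    A = np.vstack([
        p1[0] * P1[2] - P1[0],
        p1[1] * P1[2] - P1[1],
        p2[0] * P2[2] - P2[0],
        p2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)                         # least-squares solution
    X = Vt[-1]
    return X[:3] / X[3]                                 # 3D point in O1
```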
• the foregoing did not describe the implementation of step 501 in detail.
  • the following describes how to use the motion state information of the dynamic obstacle to adjust the spatial coordinates corresponding to the first feature point in the first spatial coordinates to obtain the second spatial coordinates.
• the motion state information may include the displacement (corresponding to a translation matrix T 6 ) and the attitude change (corresponding to a rotation matrix R 6 ) of the dynamic obstacle from the first moment to the second moment.
• the rotation matrix R 6 represents the attitude change of the dynamic obstacle from the first moment to the second moment
• the translation matrix T 6 represents the displacement of the dynamic obstacle from the first moment to the second moment.
• the reprojection error calculation device may use the following formula to adjust the spatial coordinate P corresponding to the first feature point (that is, motion compensation):
  • P′ is the adjusted spatial coordinates corresponding to the first feature point, that is, the compensated feature point coordinates
  • P′ is a three-dimensional vector
  • R 6 is a matrix with 3 rows and 3 columns
• T 6 is a three-dimensional vector. An illustrative sketch of this compensation is given below.
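• the exact formula is not reproduced in the text above; given that R 6 is a 3×3 matrix and T 6 a three-dimensional vector, the following sketch assumes the usual affine form P′ = R 6 ⋅ P + T 6.

```python
import numpy as np

def compensate_space_coordinate(P, R6, T6):
    """Motion compensation of the first feature point's space coordinate."""
    return R6 @ np.asarray(P, float) + np.asarray(T6, float)

# Example with placeholder values for R6 and T6.
P_adj = compensate_space_coordinate([2.0, 1.0, 10.0], np.eye(3), [0.3, 0.0, 0.0])
```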
• for example, R 1 is [5 1.2 1.5], T 1 is [10 20 0], and the angle involved is the rotation angle between the two frames of images around the z axis.
• the reprojection error calculation device calculates the rotation matrix R 6 in the following manner: obtain the first angular velocity of the dynamic obstacle at the first moment and the second angular velocity at the second moment through the lidar; calculate the average value of the first angular velocity and the second angular velocity; calculate the product of the average value and the first duration to obtain the rotation angle θ, where the first duration is the duration between the first moment and the second moment; obtain the first rotation matrix according to the rotation angle, where the first rotation matrix is the rotation matrix in the lidar coordinate system; use the external parameters of the lidar (the orientation and position of the lidar) to convert the first rotation matrix from the lidar coordinate system to the self-car coordinate system to obtain the second rotation matrix; and transform the second rotation matrix from the self-car coordinate system to the reference coordinate system to obtain the rotation matrix R 6 .
  • the rotation matrix R 6 is a rotation matrix corresponding to the posture change of the dynamic obstacle from the first moment to the second moment in the reference coordinate system.
  • the automatic driving device can detect the angular velocity of dynamic obstacles at different moments through lidar.
  • the reprojection error calculation device may use the following formula to convert the first rotation matrix from the lidar coordinate system to the vehicle coordinate system to obtain the second rotation matrix:
• R 6 ′ = R 1 ⋅ R 6 ″ (19);
  • R 6 ′ is the second rotation matrix
  • R 6 ′′ is the first rotation matrix
  • R 1 is the rotation matrix calibrated by the lidar.
  • the reprojection error calculation device may use the following formula to convert the second rotation matrix from the own vehicle coordinate system to the reference coordinate system to obtain the rotation matrix R 6 :
• R 6 = R 7 ⋅ R 6 ′ (20);
  • R 6 is the rotation matrix corresponding to the posture change of the dynamic obstacle from the first moment to the second moment in the reference coordinate system
  • R 6 ′ is the second rotation matrix
• R 7 is the rotation matrix between the first image and the reference frame image. An illustrative sketch of this calculation of R 6 is given below.
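• the following is a hedged sketch of building R 6 from the lidar angular velocities: average the two angular velocities, integrate over the duration to get the rotation angle, build the first rotation matrix (assumed here to be a rotation about the z axis, which the text does not state explicitly), then chain formulas (19) and (20). R 1 and R 7 are placeholder values, and the angular velocity is treated as a scalar yaw rate for illustration.

```python
import numpy as np

def rotation_R6(w1, w2, dt, R1, R7):
    theta = 0.5 * (w1 + w2) * dt                 # rotation angle
    c, s = np.cos(theta), np.sin(theta)
    R6_pp = np.array([[c, -s, 0.0],              # first rotation matrix (lidar frame)
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]])
    R6_p = R1 @ R6_pp                            # (19): lidar -> self-car frame
    return R7 @ R6_p                             # (20): self-car -> reference frame

R6 = rotation_R6(w1=0.10, w2=0.12, dt=0.1, R1=np.eye(3), R7=np.eye(3))
```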
  • the reprojection error calculation device and the matching relationship determination device may be the same device.
  • the foregoing embodiment describes the implementation of determining the translation matrix and the rotation matrix between any frame of image and the reference frame, which will not be detailed here.
• the reprojection error calculation device calculates the translation matrix T 6 in the following manner: obtain the first speed of the dynamic obstacle at the first moment and the second speed at the second moment through the lidar; calculate the average value of the first speed and the second speed; calculate the product of the average value and the second duration to obtain the first translation matrix, where the second duration is the duration between the first moment and the second moment and the first translation matrix is the translation matrix in the lidar coordinate system; use the external parameters of the lidar (the orientation and position of the lidar) to convert the first translation matrix from the lidar coordinate system to the vehicle coordinate system to obtain the second translation matrix; and transform the second translation matrix from the self-car coordinate system to the reference coordinate system to obtain the translation matrix T 6 .
  • the translation matrix T 6 can be understood as the translation matrix corresponding to the position change of the dynamic obstacle from the first moment to the second moment in the reference coordinate system.
  • the automatic driving device can detect the speed of dynamic obstacles at different times through lidar.
  • the reprojection error calculation device may use the following formula to convert the first translation matrix from the lidar coordinate system to the vehicle coordinate system to obtain the second translation matrix:
• T 6 ′ = R 1 ⋅ T 6 ″ + T 1 (21);
  • T 6 ′ is the second translation matrix
• T 6 ″ is the first translation matrix
  • R 1 is the rotation matrix calibrated by the lidar
  • T 1 is the translation matrix calibrated by the lidar.
• the reprojection error calculation device may use the following formula to transform the second translation matrix from the self-car coordinate system to the reference coordinate system to obtain the translation matrix T 6 :
• T 6 = R 7 ⋅ T 6 ′ + T 7 (22);
  • T 6 is the translation matrix corresponding to the position change of the dynamic obstacle from the first moment to the second moment in the reference coordinate system
  • T 6 ′ is the second translation matrix
• R 7 is the rotation matrix between the first image and the reference frame image
• T 7 is the translation matrix between the first image and the reference frame image. An illustrative sketch of this calculation of T 6 is given below.
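• the following is a hedged sketch of building T 6 from the lidar speeds: average the two speeds, integrate over the duration to obtain the first translation matrix, then chain formulas (21) and (22). R 1, T 1, R 7 and T 7 are placeholder calibration values.

```python
import numpy as np

def translation_T6(v1, v2, dt, R1, T1, R7, T7):
    T6_pp = 0.5 * (np.asarray(v1, float) + np.asarray(v2, float)) * dt  # lidar frame
    T6_p = R1 @ T6_pp + T1                                              # (21): to self-car frame
    return R7 @ T6_p + T7                                               # (22): to reference frame

T6 = translation_T6([1.0, 0.2, 0.0], [1.2, 0.1, 0.0], 0.1,
                    np.eye(3), np.zeros(3), np.eye(3), np.zeros(3))
```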
• the displacement of the dynamic obstacle from the first moment to the second moment is used to adjust the pixel coordinates of the first feature point (that is, motion compensation), so that after adjustment the pixel coordinates of the first feature point are basically equivalent to the pixel coordinates of a feature point corresponding to a static obstacle, which makes the calculated reprojection error of the first image more accurate.
  • FIG. 7 is a schematic flowchart of a positioning method provided by an embodiment of the application, and the positioning method is applied to an automatic driving device including a lidar, an IMU, and a binocular camera. As shown in Figure 7, the method may include:
  • the automatic driving device collects images through a binocular camera.
  • the binocular camera collects images at time (t-1) (corresponding to the first time) to obtain the first image and the third image.
  • the first image may be an image collected by a left-eye camera
  • the third image may be an image collected by a right-eye camera.
  • the binocular camera can collect images in real time or near real time.
  • the binocular camera also collects the second image and the fourth image at time t (corresponding to the second time).
  • the second image may be an image collected by a left-eye camera
  • the fourth image may be an image collected by a right-eye camera.
  • the automatic driving device performs feature extraction on the image collected by the left-eye camera and the image collected by the right-eye camera, and performs feature matching.
  • the automatic driving device performs feature extraction on the first image to obtain a first feature point set, and performs feature extraction on the third image to obtain a second feature point set; the feature points in the first feature point set and the The feature points in the second feature point set are feature matched to obtain the first matching feature point set.
  • the automatic driving device performs feature extraction on the second image to obtain a third feature point set, and performs feature extraction on the fourth image to obtain a fourth feature point set; the feature points in the third feature point set and the The feature points in the fourth feature point set are subjected to feature matching to obtain the second matching feature point set.
  • the automatic driving device performs feature extraction and feature matching on two images collected by the binocular camera at the same time.
  • the automatic driving device performs feature tracking on images collected at different times.
  • the feature tracking of the images collected at different times by the automatic driving device may be to determine the matching relationship between the first image and the second image, and/or the matching relationship between the third image and the fourth image. That is to say, the feature tracking of the images collected by the automatic driving device at different times may be to determine the matching relationship between the two frames of images collected by the automatic driving device at different times.
  • the feature tracking in Figure 7 refers to determining the matching relationship between the two images before and after.
  • the matching relationship between the two frames of images may be a rotation matrix and a translation matrix between the two frames of images.
  • the implementation manner of the automatic driving device determining the matching relationship between the two frames of images can be referred to FIG. 3 and FIG. 4, which will not be repeated here.
  • the automatic driving device can respectively determine the matching relationship between all two adjacent frames in the multiple frames of images that it has successively collected.
  • the automatic driving device collects a frame of image to determine the matching relationship between the frame of image and the previous frame of the frame of image, so that the matching relationship between any two frames of adjacent images can be obtained. Then the matching relationship between any two frames of images is obtained. For example, the rotation matrix and the translation matrix between the current frame and the reference frame.
  • the automatic driving device performs motion estimation according to the angular rate and speed of the dynamic obstacle.
  • the motion estimation performed by the automatic driving device may be to estimate the motion state of the dynamic obstacle to obtain the motion state information of the dynamic obstacle, for example, the displacement of the dynamic obstacle from time (t-1) to time t in the camera coordinate system, the dynamic The attitude change of the obstacle from time (t-1) to time t in the reference coordinate system (for example, the rotation matrix R 6 ) and the position change of the dynamic obstacle from time (t-1) to time t in the reference coordinate system (For example, the translation matrix T 6 ).
  • the foregoing embodiments describe the implementation of motion estimation based on the angular rate and velocity of the dynamic obstacle to obtain the motion state information of the dynamic obstacle, which will not be repeated here.
  • the automatic driving device performs three-dimensional reconstruction on the space coordinates corresponding to the feature points in the image.
  • the automatic driving device's three-dimensional reconstruction of the space coordinates corresponding to the feature points in the image may include: using a triangulation formula to determine a three-dimensional space coordinate according to each group of matching feature point pairs in the first matching feature point set to obtain the first reference space Coordinates; convert the first reference space coordinates from the lidar coordinate system to the reference coordinate system to obtain the first intermediate space coordinates; adjust the space coordinates corresponding to the feature points corresponding to the dynamic obstacles in the first intermediate space coordinates according to the motion state information , Get the first target space coordinates.
  • the first target space coordinates are adjusted (reconstructed) three-dimensional space coordinates corresponding to the feature points in the first image and the third image.
  • the image may be any one of the first image, the second image, the third image, and the fourth image.
  • the motion state information is obtained by the automatic driving device in step 704.
• the automatic driving device's three-dimensional reconstruction of the space coordinates corresponding to the feature points corresponding to the dynamic obstacles in the image may also include: using a triangulation formula to determine a three-dimensional space coordinate according to each set of matching feature point pairs in the second matching feature point set, and adjusting the resulting space coordinates in a similar manner to obtain the second target space coordinates.
  • the second target space coordinates are adjusted (reconstructed) three-dimensional space coordinates corresponding to the feature points in the second image and the fourth image. It can be understood that the automatic driving device performs three-dimensional reconstruction of the spatial coordinates corresponding to the feature points in the image, that is, adjusts the three-dimensional spatial coordinates corresponding to the feature points of the dynamic obstacle in the image.
  • the implementation of step 705 may be the same as the implementation of step 501.
  • the automatic driving device calculates the reprojection error.
  • the automatic driving device can calculate the reprojection error as follows: project the three-dimensional space coordinates in the second target space coordinates to the second image to obtain the target projection point; calculate the error between the target projection point and the target measurement point to obtain The reprojection error of the second image.
  • the target measurement point includes the pixel coordinates of each feature point in the second image, and the pixel coordinates included in the target projection point correspond to the pixel coordinates included in the target measurement point in a one-to-one correspondence. It should be understood that the automatic driving device can calculate the reprojection error of any frame of image in a similar manner. Refer to FIG. 5 for the implementation of step 706.
  • An electronic control unit (ECU) on the automatic driving device determines the position and speed of the obstacle according to the point cloud data collected by the lidar.
  • Obstacles may include dynamic obstacles and static obstacles.
  • the ECU can determine the location and speed of dynamic obstacles and the location of static obstacles based on the point cloud data collected by the lidar.
  • the ECU on the automatic driving device determines the bounding box of the obstacle according to the point cloud data collected by the lidar, and outputs external parameters.
  • the external parameters may be calibration parameters that characterize the position and orientation of the lidar, that is, a rotation matrix (corresponding to the orientation) and a translation matrix (corresponding to the position). This external parameter is used when the automatic driving device projects the bounding box onto the image to obtain the projection area.
  • the automatic driving device determines that the dynamic obstacle is in the projection area of the image.
  • the automatic driving device determines the projection area of the dynamic obstacle in the first image, so as to determine the feature points corresponding to the dynamic obstacle among the feature points extracted from the first image.
  • the automatic driving device determines the projection area of the dynamic obstacle in the image according to the bounding box of the dynamic obstacle.
  • the automatic driving device can determine the projection area of the dynamic obstacle in each frame of the image according to the bounding box of the dynamic obstacle.
  • the automatic driving device needs to determine the feature point corresponding to the dynamic obstacle according to the projection area corresponding to the dynamic obstacle.
  • the automatic driving device determines the speed and angular velocity of the dynamic obstacle.
  • the automatic driving device determines the speed and angular velocity of the dynamic obstacle through the point cloud data collected by the lidar, so as to perform motion estimation according to the speed and angular velocity of the dynamic obstacle to obtain the motion state information of the dynamic obstacle.
  • the automatic driving device uses an extended Kalman filter (EKF) to determine the attitude error, the speed error, the position error, and the second output.
  • the second output may include the position, attitude, and speed of the dynamic obstacle.
  • the measurement in Figure 7 includes the reprojection error of the current frame image and the location of dynamic obstacles.
  • the IMU outputs the linear acceleration and angular velocity to the state model
  • the lidar outputs the position and velocity of the dynamic obstacle to the state model.
• the state model can construct the state equation based on this information; the measurement model can construct the measurement equation based on the measurement (the reprojection error of the current frame image and the location of the dynamic obstacles); the EKF can then calculate the attitude error, the velocity error, the position error and the second output according to the measurement equation and the state equation. The method of constructing the measurement equation and the state equation will be detailed later.
  • the functions of the measurement model, state model, and extended Kalman filter in the dashed frame can be implemented by the computer system 112.
• a Kalman filter is an algorithm that uses the linear system state equation to optimally estimate the system state from system input and output observation data. Since the observation data includes the influence of noise and interference in the system, the optimal estimation can also be regarded as a filtering process.
• the extended Kalman filter (Extended Kalman Filter, EKF) is an extended form of the standard Kalman filter for non-linear situations, and it is a highly efficient recursive filter (autoregressive filter).
  • the basic idea of EKF is to use Taylor series expansion to linearize the nonlinear system, and then use the Kalman filter framework to filter the signal, so it is a sub-optimal filter.
  • the measurement data can be used to adjust the positioning result.
• the SLAM process contains many steps, and the whole process uses the environment to update the position of the autonomous driving device. Because the positioning results of automatic driving devices are often not accurate enough, laser scanning of the environment and/or collected images can be used to correct the position of the autonomous driving device. This is achieved by extracting the characteristics of the environment and then making new observations as the autonomous driving device moves around.
• the extended Kalman filter (EKF) is the core of the SLAM process. It is responsible for updating the original state position of the autonomous driving device based on these environmental features, which are often called landmarks. The EKF is used to track uncertain estimates of the position of the autonomous driving device and uncertain landmarks in the environment. The implementation of the EKF in the embodiment of the present application is described below, with an illustrative skeleton of an EKF step sketched first.
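• the following is a hedged skeleton of one extended Kalman filter step for fusing the IMU-driven state model with the measurements (reprojection errors and obstacle positions). The functions f and h stand for the state and measurement models, F_jac and H_jac for their Jacobians; all of them, and the noise matrices Q and R, are assumed placeholders rather than the application's exact equations.

```python
import numpy as np

def ekf_step(x, P, u, z, f, F_jac, h, H_jac, Q, R):
    # Predict with the linearized state model.
    F = F_jac(x, u)
    x_pred = f(x, u)
    P_pred = F @ P @ F.T + Q
    # Update with the linearized measurement model.
    H = H_jac(x_pred)
    y = z - h(x_pred)                          # innovation
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```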
  • the automatic driving device determines its own attitude, speed, and position through an inertial navigation system (Inertial Navigation System, INS).
• the speed error and the position error are output to the INS, and the INS can correct the speed and position of the own vehicle based on the speed error and the position error; the attitude error is output to the multiplier, and the multiplier corrects the rotation matrix (characterizing the attitude) output by the INS. This process is the process of correcting the constant drift of the IMU.
  • the constant drift of IMU is an inherent property of IMU, which will cause its navigation error to accumulate over time.
• the multiplier correcting the rotation matrix output by the INS may be implemented by calculating the product of the rotation matrix output by the INS and the attitude error (a rotation matrix) to obtain the corrected rotation matrix.
  • the first output in FIG. 7 is the attitude, speed, and position of the automatic driving device.
  • the linear acceleration and angular velocity in Figure 7 are the output of the IMU.
• the INS performs a first-order integration of the linear acceleration to get the speed of the own vehicle, and a second-order integration of the linear acceleration gives the position of the own vehicle; a first-order integration of the angular velocity gives the attitude of the own vehicle. An illustrative sketch of this integration is given below.
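• the following is a minimal sketch, under simplifying assumptions (constant acceleration over the step, attitude kept as a single yaw angle), of the INS integration just described: one integration of the acceleration gives the speed, a second gives the position, and integrating the angular rate updates the attitude.

```python
import numpy as np

def ins_step(pos, vel, yaw, accel, yaw_rate, dt):
    vel_new = vel + accel * dt                       # first-order integration
    pos_new = pos + vel * dt + 0.5 * accel * dt**2   # second-order integration
    yaw_new = yaw + yaw_rate * dt                    # attitude integration
    return pos_new, vel_new, yaw_new

# Example with assumed IMU readings over one 0.1 s step.
pos, vel, yaw = ins_step(np.zeros(3), np.zeros(3), 0.0,
                         np.array([0.1, 0.0, 0.0]), 0.01, 0.1)
```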
  • the reprojection error can be calculated more accurately, so that the positioning is more accurate.
• the extended Kalman filter is a commonly used technical means in this field. The following briefly describes the application of the EKF in the embodiments of this application.
  • the automatic driving device can perform system modeling: the position, speed, posture, and IMU constant deviation of obstacles and the self-vehicle are modeled into the system equations.
  • the position, speed and angle of obstacles are also further optimized.
  • lidar can detect the position, speed, and attitude of dynamic obstacles.
  • the IMU can estimate the position, speed, and attitude of the vehicle.
• in the state equation of the system, the state quantity of the system is composed as follows:
• the first 15 dimensions are the position error, velocity error, attitude error, gyroscope constant deviation error, and accelerometer constant deviation error of the IMU.
• the last 9n dimensions are the position, speed and attitude (angle) information of the n obstacles.
• q is the attitude error of the self-vehicle (that is, the automatic driving device), b g is the constant deviation error of the gyroscope, and b a is the constant deviation error of the accelerometer; the remaining IMU state quantities are the velocity error and the position error of the own vehicle.
• for the first obstacle, the corresponding state quantities are its position, its speed and its attitude; the state quantities of the other obstacles follow by recursion.
  • Each parameter in X corresponds to a three-dimensional vector.
  • F I is the state translation matrix of the IMU
  • G I is the noise driving matrix of the IMU
  • n I is the noise matrix of the IMU
  • F O is the state transition matrix of the obstacle
  • G O is the noise driving matrix of the obstacle
  • n O is the obstacle The noise matrix of the object.
• the measurement equation of the system is mainly composed of two parts: the reprojection error of the feature points and the position of the dynamic obstacles, together with the measurement noise.
  • the self-car coordinate system is a coordinate system with the center point of the rear wheel of the automatic driving device as the origin, and it changes with the position of the car.
• the global coordinate system specifies an origin and a direction; it is constant, and its position and direction do not change with the movement of the car.
  • FIG. 8 is a schematic structural diagram of an apparatus for determining a matching relationship provided by an embodiment of the application. As shown in FIG. 8, the device for determining the matching relationship includes:
  • the acquiring unit 801 is configured to acquire N groups of feature point pairs; each group of feature point pairs includes two matching feature points, one of which is a feature point extracted from the first image and the other a feature point extracted from the second image; the first image and the second image are images collected by the automatic driving device at the first time and the second time, respectively, and N is an integer greater than 1;
  • the adjustment unit 802 is configured to use the motion state information of the dynamic obstacle to adjust the pixel coordinates of the target feature point in the N groups of feature point pairs; the target feature point belongs to the feature points corresponding to the dynamic obstacle in the first image and/or the second image, and the pixel coordinates of the feature points other than the target feature point in the N groups of feature point pairs remain unchanged;
  • the determining unit 803 is configured to determine the target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
  • the acquiring unit 801 is specifically configured to execute the method mentioned in step 301 or any method that can equivalently replace it;
  • the adjusting unit 802 is specifically configured to execute the method mentioned in step 302 or any method that can equivalently replace it;
  • the determining unit 803 is specifically configured to execute the method mentioned in step 303 or any method that can equivalently replace it.
  • the functions of the acquiring unit 801, the adjusting unit 802, and the determining unit 803 can all be implemented by the processor 113.
  • the motion state information includes the displacement of the dynamic obstacle from the first moment to the second moment
  • the adjustment unit 802 is specifically configured to use the displacement to adjust the pixel coordinates of the reference feature point, the reference feature point being included in the target feature point and belonging to the feature point corresponding to the dynamic obstacle in the second image.
  • the determining unit 803 is further configured to determine the feature points located in the first projection area and/or the second projection area in the N groups of feature point pairs as the target feature point; the first projection area is the area where the image of the dynamic obstacle in the first image is located, and the second projection area is the area where the image of the dynamic obstacle in the second image is located;
  • the obtaining unit 801 is also used to obtain the pixel coordinates corresponding to the target feature point.
  • the determining unit 803 is further configured to perform interpolation calculation on the first point cloud and the second point cloud to obtain the target point cloud; the first point cloud and the second point cloud are the point clouds collected by the automatic driving device at the third time and the fourth time, respectively; the target point cloud is a point cloud that characterizes the features of the dynamic obstacle at the first time, the third time is before the first time, and the fourth time is after the first time; the device further includes:
  • the projection unit 804 is configured to project the target point cloud onto the first image to obtain the first projection area.
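A hedged sketch of this interpolation and projection step is given below; it assumes the two lidar scans contain the same obstacle points in the same order (point-wise correspondence) and that the lidar-to-camera rotation R_lc, translation t_lc, and the camera intrinsic matrix K are available. All names are illustrative rather than the patent's notation.

```python
import numpy as np

def interpolate_cloud(cloud_t3, cloud_t4, t3, t4, t1):
    """Linearly interpolate corresponding points of two lidar scans (taken at
    t3 and t4, with t3 < t1 < t4) to approximate the obstacle's point cloud
    at the image time t1."""
    w = (t1 - t3) / (t4 - t3)
    return (1.0 - w) * cloud_t3 + w * cloud_t4

def projection_area(cloud_lidar, R_lc, t_lc, K):
    """Project the point cloud into the image and return the bounding box of
    the projected points, used as the dynamic obstacle's projection area."""
    pts_cam = (R_lc @ cloud_lidar.T).T + t_lc   # lidar frame -> camera frame
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                 # perspective division
    u_min, v_min = uv.min(axis=0)
    u_max, v_max = uv.max(axis=0)
    return u_min, v_min, u_max, v_max
```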
  • FIG. 9 is a schematic structural diagram of a reprojection error calculation device provided by an embodiment of the application. As shown in Figure 9, the reprojection error calculation device includes:
  • the adjustment unit 901 is configured to use the motion state information of the dynamic obstacle to adjust the spatial coordinates corresponding to the first feature point among the first spatial coordinates to obtain second spatial coordinates, where the first spatial coordinates include the spatial coordinates corresponding to each feature point in the first image, the first feature point is the feature point corresponding to the dynamic obstacle in the first image, the first image is the image collected by the automatic driving device at the second moment, and the motion state information includes the displacement and attitude change of the automatic driving device from the first moment to the second moment;
  • the projection unit 902 is configured to project the second spatial coordinates to the first image to obtain the first pixel coordinates
  • the determining unit 903 is configured to calculate the reprojection error of the first image according to the first pixel coordinate and the second pixel coordinate; the second pixel coordinate includes the pixel coordinate of each feature point in the first image.
  • the adjustment unit 901 is further configured to use the displacement to adjust the pixel coordinates of the first feature point in the first image to obtain the second pixel coordinates.
  • the pixel coordinates of the feature points other than the first feature point remain unchanged.
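Taken together, the adjustment, projection, and determination steps can be sketched as follows; this is a simplified illustration under the assumption that the feature points' spatial coordinates are already expressed in the reference frame, and the names (reprojection_error, dyn_mask, R_obs, t_obs) are illustrative rather than the patent's notation.

```python
import numpy as np

def reprojection_error(pts_3d, pix_meas, dyn_mask, R_obs, t_obs, R_cam, t_cam, K):
    """Per-feature reprojection error of the first image.

    pts_3d   : Nx3 spatial coordinates of the image's feature points
    pix_meas : Nx2 measured (and, for dynamic points, displacement-compensated)
               pixel coordinates of the same feature points
    dyn_mask : boolean mask marking feature points of the dynamic obstacle
    R_obs, t_obs : attitude/position change used to motion-compensate those points
    R_cam, t_cam : pose mapping reference-frame points into the camera frame
    """
    pts = pts_3d.copy()
    # Adjust only the spatial coordinates of the dynamic-obstacle feature points.
    pts[dyn_mask] = (R_obs @ pts[dyn_mask].T).T + t_obs
    # Project the adjusted coordinates into the image.
    cam = (R_cam @ pts.T).T + t_cam
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    # Error between the projected points and the measured points.
    return np.linalg.norm(uv - pix_meas, axis=1)
```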
  • the device further includes:
  • the first acquiring unit 904 is configured to acquire a second feature point in a second image that matches the first feature point; the first image and the second image are images collected at the second moment by the first camera and the second camera on the automatic driving device, respectively, and the spatial positions of the first camera and the second camera are different;
  • the determining unit 903 is further configured to determine the spatial coordinates corresponding to the first feature point according to the first feature point and the second feature point.
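Since the two cameras occupy different spatial positions, the spatial coordinates of the matched point can be recovered by triangulation. Below is a minimal linear (DLT) triangulation sketch, assuming the two 3x4 camera projection matrices are known; it is one possible way to implement this step, not necessarily the exact one used in the embodiments.

```python
import numpy as np

def triangulate(p1, p2, P1, P2):
    """Triangulate one 3D point from a matched pixel pair.

    p1, p2 : pixel coordinates of the matched feature point in the first
             and second images
    P1, P2 : 3x4 projection matrices (intrinsics @ [R|t]) of the two cameras
    """
    A = np.stack([
        p1[0] * P1[2] - P1[0],
        p1[1] * P1[2] - P1[1],
        p2[0] * P2[2] - P2[0],
        p2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # solution is the right singular vector
    X = Vt[-1]                    # of the smallest singular value
    return X[:3] / X[3]           # dehomogenize to spatial coordinates
```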
  • the device further includes:
  • the second obtaining unit 905 is configured to obtain a target point cloud, where the target point cloud is a point cloud that characterizes the characteristics of the dynamic obstacle at the second moment;
  • the projection unit 902 is further configured to project the target point cloud onto the first image to obtain a target projection area
  • the determining unit 903 is further configured to determine that the feature points located in the target projection area in the first feature point set are the first feature point; the feature points included in the first feature point set are feature points extracted from the first image and each of them matches a feature point in the second feature point set, and the feature points included in the second feature point set are feature points extracted from the second image.
  • the first obtaining unit 904 and the second obtaining unit 905 may be the same unit or different units.
  • the functions of each unit in FIG. 9 can be implemented by the processor 113.
  • each unit in the above matching relationship determination device and the reprojection error calculation device is only a logical function division, and may be fully or partially integrated into a physical entity in actual implementation, or may be physically separated.
  • each of the above units may be a separately established processing element, or may be integrated in a certain chip of the terminal for implementation;
  • they may also be stored in a storage element of the controller in the form of program code, and a certain processing element of the processor calls and executes the functions of the above units.
  • the various units can be integrated together or implemented independently.
  • the processing element here can be an integrated circuit chip with signal processing capabilities.
  • each step of the above method or each of the above units can be completed by an integrated logic circuit of hardware in the processor element or instructions in the form of software.
  • the processing element can be a general-purpose processor, such as a central processing unit (CPU), or one or more integrated circuits configured to implement the above methods, such as one or more application-specific integrated circuits (ASIC), or one or more digital signal processors (DSP), or one or more field-programmable gate arrays (FPGA), etc.
  • ASIC application-specific integrated circuit
  • DSP digital signal processor
  • FPGA field-programmable gate array
  • FIG. 10 is a schematic structural diagram of a computer device provided by an embodiment of the application.
  • the computer device includes: a memory 1001, a processor 1002, a communication interface 1003, and a bus 1004; the memory 1001, the processor 1002, and the communication interface 1003 are communicatively connected to each other through the bus 1004.
  • the communication interface 1003 is used for data interaction with the automatic driving device.
  • the processor 1002 reads the code stored in the memory to perform the following operations: acquire N groups of feature point pairs, where each group of feature point pairs includes two matching feature points, one of which is a feature point extracted from the first image and the other a feature point extracted from the second image;
  • the first image and the second image are the images collected by the automatic driving device at the first time and the second time, respectively, and N is an integer greater than 1;
  • use the motion state information of the dynamic obstacle to adjust the pixel coordinates of the target feature point in the N groups of feature point pairs, where the target feature point belongs to the feature points corresponding to the dynamic obstacle in the first image and/or the second image, and the pixel coordinates of the feature points other than the target feature point in the N groups of feature point pairs remain unchanged; and determine, according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs, the target matching relationship between the first image and the second image.
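A hedged sketch of these operations is shown below: the pixel coordinates of the dynamic-obstacle feature points are shifted by the obstacle's displacement between the two moments, and the translation and rotation between the two images are then estimated from the adjusted matched coordinates with an SVD-based fit along the lines of the procedure given later in the description; the names and the sign convention of the compensation are illustrative assumptions, not the patent's exact implementation.

```python
import numpy as np

def compensate_pixels(pix, dyn_mask, displacement_uv):
    """Shift the pixel coordinates of dynamic-obstacle feature points by the
    obstacle's displacement (expressed in image coordinates) between the two
    moments; all other feature points are left unchanged."""
    out = pix.copy()
    out[dyn_mask] = out[dyn_mask] - displacement_uv
    return out

def estimate_rt(A, B):
    """SVD-based estimate of R, t with B ~= R @ A_i + t, where A and B hold
    the adjusted coordinates of the matched feature points of the two images."""
    uA, uB = A.mean(axis=0), B.mean(axis=0)
    H = (A - uA).T @ (B - uB)       # covariance of the centered point sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T                  # a reflection check on det(R) may be needed in practice
    t = -R @ uA + uB
    return R, t
```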
  • the processor 1002 reads the code stored in the memory to perform the following operations: use the motion state information of the dynamic obstacle to adjust the spatial coordinates corresponding to the first feature point among the first spatial coordinates to obtain second spatial coordinates;
  • the first spatial coordinates include the spatial coordinates corresponding to each feature point in the first image, the first feature point is the feature point corresponding to the dynamic obstacle in the first image, the first image is the image collected by the automatic driving device at the second moment, and the motion state information includes the displacement and attitude change of the automatic driving device from the first moment to the second moment; the second spatial coordinates are projected onto the first image to obtain the first pixel coordinates;
  • the first pixel coordinate and the second pixel coordinate are used to calculate the reprojection error of the first image;
  • the second pixel coordinate includes the pixel coordinate of each feature point in the first image.
  • the disclosed methods may be implemented as computer program instructions encoded on a computer-readable storage medium in a machine-readable format or encoded on other non-transitory media or articles.
  • Figure 11 schematically illustrates a conceptual partial view of an example computer program product arranged in accordance with at least some of the embodiments presented herein, the example computer program product including a computer program for executing a computer process on a computing device.
  • the example computer program product 1100 is provided using a signal bearing medium 1101.
  • the signal bearing medium 1101 may include one or more program instructions 1102 which, when executed by one or more processors, can provide the functions or part of the functions described above with respect to FIGS. 8-9. Thus, for example, referring to the embodiment shown in FIG. 8, the realization of one or more functions of the blocks 801-804 may be undertaken by one or more instructions associated with the signal bearing medium 1101.
  • the program instructions 1102 in FIG. 11 also describe example instructions.
  • when executed by the processor, the above program instructions 1102 implement the following: acquiring N groups of feature point pairs, where each group includes two matching feature points, one of which is a feature point extracted from the first image and the other a feature point extracted from the second image; the first image and the second image are the images collected by the automatic driving device at the first time and the second time, respectively, and N is an integer greater than 1; adjusting the pixel coordinates of the target feature point in the N groups of feature point pairs by using the motion state information of the dynamic obstacle, where the target feature point belongs to the feature points corresponding to the dynamic obstacle in the first image and/or the second image, and the pixel coordinates of the feature points other than the target feature point in the N groups of feature point pairs remain unchanged; and determining, according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs, the target matching relationship between the first image and the second image.
  • alternatively, when executed by the processor, the above program instructions 1102 implement the following: adjusting the spatial coordinates corresponding to the first feature point among the first spatial coordinates by using the motion state information of the dynamic obstacle to obtain second spatial coordinates, where the first spatial coordinates include the spatial coordinates corresponding to each feature point in the first image, the first feature point is the feature point corresponding to the dynamic obstacle in the first image, the first image is the image collected by the automatic driving device at the second moment, and the motion state information includes the displacement and attitude change of the automatic driving device from the first moment to the second moment; projecting the second spatial coordinates onto the first image to obtain first pixel coordinates; and calculating the reprojection error of the first image according to the first pixel coordinates and second pixel coordinates, where the second pixel coordinates include the pixel coordinates of each feature point in the first image.
  • the signal-bearing medium 1101 may include a computer-readable medium 1103, such as, but not limited to, a hard disk drive, a compact disk (CD), a digital video disk (DVD), a digital tape, memory, read-only memory (ROM), or random access memory (RAM), etc.
  • the signal bearing medium 1101 may include a computer recordable medium 1104, such as, but not limited to, memory, read/write (R/W) CD, R/W DVD, and so on.
  • the signal-bearing medium 1101 may include a communication medium 1105, such as, but not limited to, digital and/or analog communication media (e.g., fiber optic cables, waveguides, wired communication links, wireless communication links, etc.). Thus, for example, the signal bearing medium 1101 may be conveyed by a wireless form of the communication medium 1105 (for example, a wireless communication medium that complies with the IEEE 802.11 standard or other transmission protocols).
  • the one or more program instructions 1102 may be, for example, computer-executable instructions or logic-implemented instructions. In some examples, a processor such as the one described with respect to FIG. 1 may be configured to provide various operations, functions, or actions in response to the program instructions 1102 conveyed to the processor through one or more of the computer-readable medium 1103, the computer-recordable medium 1104, and/or the communication medium 1105. It should be understood that the arrangement described here is for illustrative purposes only. Thus, those skilled in the art will understand that other arrangements and other elements (for example, machines, interfaces, functions, sequences, functional groups, etc.) can be used instead, and some elements can be omitted altogether depending on the desired result. In addition, many of the described elements are functional entities that can be implemented as discrete or distributed components, or combined with other components in any appropriate combination and position.
  • the embodiments of the present invention may be provided as methods, systems, or computer program products. Therefore, the present invention may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • the present invention is described with reference to the flowcharts and/or block diagrams of the method, device (system), and computer program product of the embodiments of the present invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device that realizes the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • these computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • these computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows in the flowchart and/or one or more blocks in the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Traffic Control Systems (AREA)

Abstract

A matching relationship determination method, a reprojection error calculation method, and related devices, relating to the field of artificial intelligence and specifically to the field of automatic driving. The matching relationship determination method comprises: acquiring N groups of feature point pairs (301), each group of feature point pairs comprising two matching feature points, one of which is a feature point extracted from a first image and the other a feature point extracted from a second image; using motion state information of a dynamic obstacle to adjust the pixel coordinates of a target feature point in the N groups of feature point pairs (302), the target feature point belonging to the feature points corresponding to the dynamic obstacle in the first image and/or the second image; and determining a target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs (303). The matching relationship between two frames of images can be accurately determined in automatic driving scenarios in which dynamic obstacles are present.

Description

匹配关系确定方法、重投影误差计算方法及相关装置 技术领域
本申请涉及人工智能领域的自动驾驶领域,尤其涉及一种匹配关系确定方法、重投影误差计算方法及相关装置。
背景技术
人工智能(Artificial Intelligence,AI)是利用数字计算机或者数字计算机控制的机器模拟、延伸和扩展人的智能,感知环境、获取知识并使用知识获得最佳结果的理论、方法、技术及应用系统。自动驾驶是人工智能领域的一种主流应用,自动驾驶技术依靠计算机视觉、雷达、监控装置和全球定位系统等协同合作,让机动车辆可以在不需要人类主动操作下,实现自动驾驶。自动驾驶的车辆使用各种计算系统来帮助将乘客从一个位置运输到另一位置。由于自动驾驶技术无需人类来驾驶机动车辆,所以理论上能够有效避免人类的驾驶失误,减少交通事故的发生,且能够提高公路的运输效率。因此,自动驾驶技术越来越受到重视。
目前,自动驾驶装置采用即时定位与地图构建(Simultaneous Localization And Mapping,SLAM)等定位方法进行定位时,通常以其采集到的各帧图像的重投影误差为量测量。也就是说,自动驾驶装置在采用SLAM进行定位时,需要计算其采集到的各帧图像的重投影误差。一帧图像的重投影误差是指投影的点与该帧图像上的测量点之间的误差,投影的点可以是该帧图像中的各特征点对应的三维空间坐标投影至该帧图像的坐标点,测量点可以是这些特征点在该帧图像中的坐标点。当前普遍采用的一种计算重投影误差的方式如下:确定目标帧图像中各特征点对应的三维空间坐标,以得到第一三维空间坐标;计算该目标帧图像与参考帧图像之间的平移矩阵和旋转矩阵;利用该平移矩阵和旋转矩阵将该第一三维空间坐标中的各三维空间坐标转换至参考坐标系,以得到第二三维空间坐标;将该第二三维空间坐标中的各三维空间坐标投影至该目标帧图像以得到投影的点;计算投影的点与该目标帧图像中各特征点的坐标点(即测量点)之间的误差,以得到该目标帧图像的重投影误差。其中,该参考坐标系可以是自动驾驶装置在本次行驶的起始地点建立的世界坐标系,该参考帧图像可以是自动驾驶装置在该起始地点采集的第一帧图像,该目标帧图像可以是该自动驾驶装置在本次行驶过程中采集的除该参考帧图像之外的任一帧图像。自动驾驶装置为计算其采集的各帧图像与参考帧图像之间的关系需要计算其采集到的任意两帧相邻图像之间的匹配关系,进而计算得到其采集到的各帧图像与参考帧图像之间的匹配关系。目前,一般采用特征匹配的方式来确定两帧图像之间的匹配关系。
在特征匹配中,为了消除特征匹配中的误匹配,随机抽样一致性(RANdom SAmple Consensus,RANSAC)被使用到特征匹配中。RANSAC算法的流程如下:假设样本(匹配两帧图像得到的多组特征点对)中包含内点(inliers)和外点(outliers),分别对应正确匹配点对和错误匹配点对,随机从样本中抽取4组点对,计算出两帧图像之间的匹配关系;然后根据该匹配关系,把剩余特征点对分成内点和外点,重复上述步骤,选取数量最多的 内点所对应的匹配关系为最终的两帧图像之间的匹配关系。其中,两帧图像分别为自动驾驶装置在第一时刻和第二时刻采集的图像,该匹配关系是这两帧图像之间的平移矩阵和旋转矩阵。RANSAC算法的本质是一种少数服从多数的算法。当动态障碍物占据视野很大一部分的时候,比如,自动驾驶装置跟在一个很大的车后面行驶,外点(其他车辆等动态障碍物)会被算法当成内点,而内点(静态障碍物)被错误当成了外点剔除,这样就不能准确地确定两帧图像之间的匹配关系。可见,在一些存在动态障碍物的自动驾驶场景中,采用RANSAC算法有时候不能准确地确定两帧图像之间的匹配关系。因此,需要研究在存在动态障碍物的自动驾驶场景能准确地确定两帧图像之间的匹配关系的方案。
发明内容
本申请实施例提供了一种匹配关系确定方法、重投影误差计算方法及相关装置,在存在动态障碍物的自动驾驶场景能准确地确定两帧图像之间的匹配关系。
第一方面,本申请实施例提供了一种匹配关系确定方法,该方法可包括:获取N组特征点对,每组特征点对包括两个相匹配的特征点,其中一个特征点为从第一图像提取的特征点,另一个特征点为从第二图像提取的特征点,该第一图像和该第二图像分别为自动驾驶装置在第一时刻和第二时刻采集的图像,N为大于1的整数;利用动态障碍物的运动状态信息对该N组特征点对中目标特征点的像素坐标进行调整,该目标特征点属于该第一图像和/或该第二图像中该动态障碍物对应的特征点,该N组特征点对中除该目标特征点之外的特征点的像素坐标保持不变;根据该N组特征点对中各特征点对应的调整后的像素坐标,确定该第一图像和该第二图像之间的目标匹配关系。
该第一图像和该第二图像之间的匹配关系可以是该第一图像和该第二图像之间的平移矩阵和旋转矩阵。由于动态障碍物的运动状态与静态障碍物的运动状态不同,第一图像和第二图像中动态障碍物对应的特征点之间的平移矩阵和旋转矩阵,与该第一图像和该第二图像中静态障碍物对应的特征点之间的平移矩阵和旋转矩阵不同。可以理解,只有N组特征点对中的特征点均为静态障碍物对应的特征点时,根据该N组特征点对中各特征点对应的像素坐标才能较准确地确定第一图像和第二图像之间的匹配关系。本申请实施例中,利用动态障碍物的运动状态信息对N组特征点对中目标特征点对应的像素坐标进行调整之后,该N组特征点对中动态障碍物对应的特征点之间的平移矩阵和旋转矩阵与该N组特征点对中静态障碍物对应的特征点之间的平移矩阵和旋转矩阵基本相同,因此根据该N组特征点对中各特征点对应的像素坐标能够较准确地确定第一图像和第二图像之间的匹配关系。
在一个可选的实现方式中,运动状态信息包括该动态障碍物从该第一时刻至该第二时刻的位移;利用动态障碍物的运动状态信息对该N组特征点对中目标特征点的像素坐标进行调整包括:利用该位移对参考特征点的像素坐标进行调整,该参考特征点包含于该目标特征点,且属于该第二图像中该动态障碍物对应的特征点。
在该实现方式中,利用动态障碍物从第一时刻至第二时刻的位移对参考特征点的像素坐标进行调整(即运动补偿),使得该参考特征点的像素坐标被调整后基本等同于静态障碍物的像素坐标,以便于更准确地确定第一图像和第二图像之间的匹配关系。
在一个可选的实现方式中,在利用动态障碍物的运动状态信息对该N组特征点对中目 标特征点的像素坐标进行调整之前,该方法还包括:确定该N组特征点对中位于第一投影区域和/或第二投影区域的特征点为该目标特征点;该第一投影区域为该第一图像中该动态障碍物的图像所处的区域,该第二投影区域为该第二图像中该动态障碍物的图像所处的区域;获得该目标特征点对应的像素坐标。
在该实现方式中,根据第一图像和第二图像中动态障碍物的图像所处的区域,可以快速、准确地确定N组特征点对中的目标特征点。
在一个可选的实现方式中,在确定该N组特征点对中位于第一投影区域和/或第二投影区域的特征点为该目标特征点之前,该方法还包括:获得目标点云,该目标点云为表征该动态障碍物在该第一时刻的特性的点云;将该目标点云投影到该第一图像以得到该第一投影区域。
在该实现方式中,将动态障碍物在第一时刻的特性的点云投影至第一图像,可以准确地确定该第一图像中动态障碍物所处的区域。
在一个可选的实现方式中,在确定该N组特征点对中位于第一投影区域和/或第二投影区域的特征点为该目标特征点之前,该方法还包括:对第一点云和第二点云进行插值计算以得到目标点云,该第一点云和该第二点云分别为该自动驾驶装置在第三时刻和第四时刻采集的点云,该目标点云为表征该动态障碍物在该第一时刻的特性的点云,该第三时刻在该第一时刻之前,该第四时刻在该第一时刻之后;将该目标点云投影到该第一图像以得到该第一投影区域。
在该实现方式中,通过插值计算的方式得到目标点云,可以较准确地确定任一时刻的点云。
在一个可选的实现方式中,该目标匹配关系为采用随机抽样一致性RANSAC算法确定的该第一图像和该第二图像之间的两个或两个以上匹配关系中较优的匹配关系。
该N组特征点对可以为从第一图像和第二图像相匹配的多组特征点对中随机获取的N组特征点对。使用该N组特征点对调整后的像素坐标确定的匹配关系,可能不是该第一图像和该第二图像之间最优的匹配关系。为更准确地确定第一图像和第二图像之间的匹配关系,可以采用RANSAC算法来从第一图像和第二图像之间的多个匹配关系中确定一个较优的匹配关系。可选的,在该第一图像和该第二图像之间的匹配关系之后,则重新随机从第一图像和第二图像相匹配的多组特征点对中获取N组特征点对,根据新获取的获取N组特征点对,再次确定该第一图像和该第二图像之间的匹配关系,直到得到较优的匹配关系。采用随机抽样一致性RANSAC算法确定该目标匹配关系为该第一图像和该第二图像之间的两个或两个以上匹配关系中较优的匹配关系可以是:将第一图像和第二图像相匹配的多组特征点对代入至该目标匹配关系可得到最多的内点,且内点的个数大于数量阈值。该数量阈值可以是该多组特征点对的个数的百分之八十、百分之九十等。在该实现方式中,采用RANSAC算法可以更准确地确定该第一图像和该第二图像之间的匹配关系。
在一个可选的实现方式中,根据该N组特征点对中各特征点对应的调整后的像素坐标,确定该第一图像和该第二图像之间的目标匹配关系包括:根据该N组特征点对中各特征点对应的调整后的像素坐标,确定该第一图像和该第二图像之间的平移矩阵和旋转矩阵。
第二方面,本申请实施例提供了一种重投影误差计算方法,该方法可包括:利用动态 障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整以得到第二空间坐标,该第一空间坐标包括第一图像中各特征点对应的空间坐标,该第一特征点为该第一图像中该动态障碍物对应的特征点,该第一图像为自动驾驶装置在第二时刻采集的图像,该运动状态信息包括该自动驾驶装置从第一时刻至该第二时刻的位移和姿态变化;将该第二空间坐标投影至该第一图像以得到第一像素坐标;根据该第一像素坐标和第二像素坐标,计算该第一图像的重投影误差;该第二像素坐标包括该第一图像中各特征点的像素坐标。
本申请实施例中,利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整,使得该第一特征点对应的空间坐标基本等同于静态障碍物对应的特征点所对应的空间坐标;在计算重投影误差时可以有效减少动态障碍物对应的特征点的影响,得到的重投影误差更准确。
在一个可选的实现方式中,在根据该第一像素坐标和第二像素坐标,计算该第一图像的重投影误差之前,该方法还包括;利用该位移对该第一图像中该第一特征点的像素坐标进行调整以得到该第二像素坐标,该第一图像中除该第一特征点之外的特征点的像素坐标均保持不变。
在该实现方式中,利用动态障碍物从第一时刻至第二时刻的位移对第一特征点的像素坐标进行调整(即运动补偿),使得该第一特征点的像素坐标被调整后基本等同于静态障碍物的像素坐标,以便于更准确地该第一图像的重投影误差。
在一个可选的实现方式中,在利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整以得到第二空间坐标之前,该方法还包括:获得第二图像中与该第一特征点相匹配的第二特征点;该第一图像和该第二图像分别为该自动驾驶装置上的第一摄像头和第二摄像头在该第二时刻采集的图像,该第一摄像头和该第二摄像头所处的空间位置不同;根据该第一特征点和该第二特征点,确定第一特征点对应的空间坐标。
在该实现方式中,可以快速、准确地确定第一特征点对应的空间坐标。
在一个可选的实现方式中,在利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整以得到第二空间坐标之前,该方法还包括:获得目标点云,该目标点云为表征该动态障碍物在该第二时刻的特性的点云;将该目标点云投影到该第一图像以得到目标投影区域;确定第一特征点集中位于该目标投影区域的特征点为该第一特征点;该第一特征点集包括的特征点为从该第一图像提取的特征点,且均与第二特征点集中的特征点相匹配,该第二特征点集包括的特征点为从第二图像提取的特征点。
在该实现方式中,将位于目标投影区域中的特征点作为动态障碍物对应的特征点,可以准确地确定第一特征点集中动态障碍物对应的特征点。
第三方面,本申请实施例提供了一种匹配关系确定装置,包括:获取单元,用于获取N组特征点对,每组特征点对包括两个相匹配的特征点,其中一个特征点为从第一图像提取的特征点,另一个特征点为从第二图像提取的特征点,该第一图像和该第二图像分别为自动驾驶装置在第一时刻和第二时刻采集的图像,N为大于1的整数;调整单元,用于利用动态障碍物的运动状态信息对该N组特征点对中目标特征点的像素坐标进行调整,该目标特征点属于该第一图像和/或该第二图像中该动态障碍物对应的特征点,该N组特征点对 中除该目标特征点之外的特征点的像素坐标保持不变;确定单元,用于根据该N组特征点对中各特征点对应的调整后的像素坐标,确定该第一图像和该第二图像之间的目标匹配关系。
本申请实施例中,利用动态障碍物的运动状态信息对N组特征点对中目标特征点的像素坐标进行调整之后,该N组特征点对中动态障碍物对应的特征点之间的平移矩阵和旋转矩阵与该N组特征点对中静态障碍物对应的特征点之间的平移矩阵和旋转矩阵基本相同,因此根据该N组特征点对中各特征点的像素坐标能够较准确地确定第一图像和第二图像之间的匹配关系。
在一个可选的实现方式中,运动状态信息包括该动态障碍物从该第一时刻至该第二时刻的位移;该调整单元,具体用于利用该位移对参考特征点的像素坐标进行调整,该参考特征点包含于该目标特征点,且属于该第二图像中该动态障碍物对应的特征点。
在一个可选的实现方式中,确定单元,还用于确定该N组特征点对中位于第一投影区域和/或第二投影区域的特征点为该目标特征点;该第一投影区域为该第一图像中该动态障碍物的图像所处的区域,该第二投影区域为该第二图像中该动态障碍物的图像所处的区域;该获取单元,还用于获得该目标特征点对应的像素坐标。
在一个可选的实现方式中,确定单元,还用于对第一点云和第二点云进行插值计算以得到目标点云,该第一点云和该第二点云分别为该自动驾驶装置在第三时刻和第四时刻采集的点云,该目标点云为表征该动态障碍物在该第一时刻的特性的点云,该第三时刻在该第一时刻之前,该第四时刻在该第一时刻之后;该装置还包括:投影单元,用于将该目标点云投影到该第一图像以得到该第一投影区域。
在一个可选的实现方式中,该目标匹配关系为采用随机抽样一致性RANSAC算法确定的该第一图像和该第二图像之间的两个或两个以上匹配关系中较优的匹配关系。
在一个可选的实现方式中,确定单元,具体用于根据该N组特征点对中各特征点对应的调整后的像素坐标,确定该第一图像和该第二图像之间的平移矩阵和旋转矩阵。
第四方面,本申请实施例提供了一种重投影误差计算装置,该装置包括:调整单元,用于利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整以得到第二空间坐标,该第一空间坐标包括第一图像中各特征点对应的空间坐标,该第一特征点为该第一图像中该动态障碍物对应的特征点,该第一图像为自动驾驶装置在第二时刻采集的图像,该运动状态信息包括该自动驾驶装置从第一时刻至该第二时刻的位移和姿态变化;投影单元,用于将该第二空间坐标投影至该第一图像以得到第一像素坐标;确定单元,用于根据该第一像素坐标和第二像素坐标,计算该第一图像的重投影误差;该第二像素坐标包括该第一图像中各特征点的像素坐标。
在一个可选的实现方式中,运动状态信息包括该动态障碍物从该第一时刻至该第二时刻的位移;调整单元,具体用于利用该位移对该第一图像中该第一特征点的像素坐标进行调整以得到该第二像素坐标,该第一图像中除该第一特征点之外的特征点的像素坐标均保持不变。
在一个可选的实现方式中,确定单元,还用于确定该N组特征点对中位于第一投影区域和/或第二投影区域的特征点为该目标特征点;该第一投影区域为该第一图像中该动态障 碍物的图像所处的区域,该第二投影区域为该第二图像中该动态障碍物的图像所处的区域;获取单元,还用于获得该目标特征点对应的像素坐标。
在一个可选的实现方式中,确定单元,还用于对第一点云和第二点云进行插值计算以得到目标点云,该第一点云和该第二点云分别为该自动驾驶装置在第三时刻和第四时刻采集的点云,该目标点云为表征该动态障碍物在该第一时刻的特性的点云,该第三时刻在该第一时刻之前,该第四时刻在该第一时刻之后;该装置还包括:投影单元,用于将该目标点云投影到该第一图像以得到该第一投影区域。
第五方面本申请实施例提供了一种计算机可读存储介质,该计算机存储介质存储有计算机程序,该计算机程序包括程序指令,该程序指令当被处理器执行时使该处理器执行上述第一方面至第二方面以及任一种可选的实现方式的方法。
第六方面,本申请实施例提供了一种计算机程序产品,该计算机程序产品包括程序指令,该程序指令当被处理器执行时使该信处理器执行上述第一方面至第二方面以及任一种可选的实现方式的方法。
第七方面,本申请实施例提供了一种计算机设备,包括存储器、通信接口以及处理器;该通信接口用于接收自动驾驶装置发送的数据,存储器用于保存程序指令,处理器用于执行该程序指令以执行上述第一方面至第二方面以及任一种可选的实现方式的方法。
附图说明
图1是本申请实施例提供的自动驾驶装置100的功能框图;
图2为本申请实施例提供的一种自动驾驶系统的结构示意图;
图3为本申请实施例提供的一种图像帧之间的匹配关系确定方法流程图;
图4为本申请实施例提供的另一种图像帧之间的匹配关系确定方法流程图;
图5为本申请实施例提供的一种重投影误差计算方法流程图;
图6为一种三角化过程示意图;
图7为本申请实施例提供的一种定位方法流程示意图;
图8为本申请实施例提供的一种匹配关系确定装置的结构示意图;
图9为本申请实施例提供的一种重投影误差计算装置的结构示意图;
图10为本申请实施例提供的一种计算机设备的结构示意图;
图11为本申请实施例提供的一种计算机程序产品的结构示意图。
具体实施方式
为了使本技术领域的人员更好地理解本申请实施例方案,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚地描述,显然,所描述的实施例仅仅是本申请一部分的实施例,而不是全部的实施例。
本申请的说明书实施例和权利要求书及上述附图中的术语“第一”、“第二”、和“第三”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元。 方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
本申请实施例提供的匹配关系确定方法可以应用到自动驾驶场景。下面对自动驾驶场景进行简单的介绍。
自动驾驶场景:自动驾驶装置(例如自动驾驶汽车)使用激光雷达实时或接近实时的采集周围环境的点云以及使用相机采集图像;采用SLAM根据采集到的点云以及图像来定位自车的位置,并根据定位结果来规划行车路线。自车是指自动驾驶装置。
图1是本申请实施例提供的自动驾驶装置100的功能框图。在一个实施例中,将自动驾驶装置100配置为完全或部分地自动驾驶模式。例如,自动驾驶装置100可以在处于自动驾驶模式中的同时控制自身,并且可通过人为操作来确定自动驾驶装置100及其周边环境的当前状态,确定周边环境中的至少一个其他车辆的可能行为,并确定该其他车辆执行可能行为的可能性相对应的置信水平,基于所确定的信息来控制自动驾驶装置100。在自动驾驶装置100处于自动驾驶模式中时,可以将自动驾驶装置100置为在没有和人交互的情况下操作。
自动驾驶装置100可包括各种子系统,例如行进系统102、传感器系统104、控制系统106、一个或多个外围设备108以及电源110、计算机系统112和用户接口116。可选地,自动驾驶装置100可包括更多或更少的子系统,并且每个子系统可包括多个元件。另外,自动驾驶装置100的每个子系统和元件可以通过有线或者无线互连。
行进系统102可包括为自动驾驶装置100提供动力运动的组件。在一个实施例中,推进系统102可包括引擎118、能量源119、传动装置120和车轮/轮胎121。引擎118可以是内燃引擎、电动机、空气压缩引擎或其他类型的引擎组合,例如汽油发动机和电动机组成的混动引擎,内燃引擎和空气压缩引擎组成的混动引擎。引擎118将能量源119转换成机械能量。
能量源119的示例包括汽油、柴油、其他基于石油的燃料、丙烷、其他基于压缩气体的燃料、乙醇、太阳能电池板、电池和其他电力来源。能量源119也可以为自动驾驶装置100的其他系统提供能量。
传动装置120可以将来自引擎118的机械动力传送到车轮121。传动装置120可包括变速箱、差速器和驱动轴。在一个实施例中,传动装置120还可以包括其他器件,比如离合器。其中,驱动轴可包括可耦合到一个或多个车轮121的一个或多个轴。
传感器系统104可包括感测关于自动驾驶装置100周边的环境的信息的若干个传感器。例如,传感器系统104可包括定位系统122(定位系统可以是全球定位(global positioning system,GPS)系统,也可以是北斗系统或者其他定位系统)、惯性测量单元(inertial measurement unit,IMU)124、雷达126、激光测距仪128以及相机130。传感器系统104还可包括被监视自动驾驶装置100的内部系统的传感器(例如,车内空气质量监测器、燃油量表、机油温度表等)。来自这些传感器中的一个或多个的传感器数据可用于检测对象及其相应特性(位置、形状、方向、速度等)。这种检测和识别是自主自动驾驶装置100的安全操作的关键功能。
定位系统122可用于估计自动驾驶装置100的地理位置。IMU 124用于基于惯性加速 度和角速度来感测自动驾驶装置100的位置和朝向变化。在一个实施例中,IMU 124可以是加速度计和陀螺仪的组合。
雷达126可利用无线电信号来感测自动驾驶装置100的周边环境内的物体。
激光测距仪128可利用激光来感测自动驾驶装置100所位于的环境中的物体。在一些实施例中,激光测距仪128可包括一个或多个激光源、激光扫描器以及一个或多个检测器,以及其他系统组件。在一些实施例中,除了感测物体以外,激光测距仪128可以是激光雷达(light detection and ranging,LiDAR)。激光雷达(ibeo),是以发射激光束探测目标的位置、速度等特征量的雷达系统。激光雷达可以是Ibeo激光传感器。激光雷达可向目标(即障碍物)或某个方向发射探测信号(激光束),然后将接收到的从目标反射回来的信号(目标回波)与发射信号进行比较,作适当处理后,就可获得目标的有关信息,例如表示目标的表面特性的点云。点云是在同一空间参考系下表达目标空间分布和目标表面特性的海量点集合。本申请中的点云可以是根据激光测量原理得到的点云,包括每个点的三维坐标。
相机130可用于捕捉自动驾驶装置100的周边环境的多个图像。相机130可以是静态相机或视频相机。相机130可以实时或周期性的捕捉自动驾驶装置100的周边环境的多个图像。相机130可以是双目摄像机,包括左目摄像头和右目摄像头,这两个摄像头所处的位置不同。
控制系统106为控制自动驾驶装置100及其组件的操作。控制系统106可包括各种元件,其中包括转向系统132、油门134、制动单元136、计算机视觉系统140、路线控制系统142以及障碍物避免系统144。
转向系统132可操作来调整自动驾驶装置100的前进方向。例如在一个实施例中可以为方向盘系统。
油门134用于控制引擎118的操作速度并进而控制自动驾驶装置100的速度。
制动单元136用于控制自动驾驶装置100减速。制动单元136可使用摩擦力来减慢车轮121。在其他实施例中,制动单元136可将车轮121的动能转换为电流。制动单元136也可采取其他形式来减慢车轮121转速从而控制自动驾驶装置100的速度。
计算机视觉系统140可以操作来处理和分析由相机130捕捉的图像以便识别自动驾驶装置100周边环境中的物体和/或特征。该物体和/或特征可包括交通信号、道路边界和障碍物。计算机视觉系统140可使用物体识别算法、自动驾驶方法、运动中恢复结构(Structure from Motion,SFM)算法、视频跟踪和其他计算机视觉技术。在一些实施例中,计算机视觉系统140可以用于为环境绘制地图、跟踪物体、估计物体的速度等等。计算机视觉系统140可使用激光雷达获取的点云以及相机获取的周围环境的图像。
路线控制系统142用于确定自动驾驶装置100的行驶路线。在一些实施例中,路线控制系统142可结合来自传感器138、GPS 122和一个或多个预定地图的数据以为自动驾驶装置100确定行驶路线。
障碍物避免系统144用于识别、评估和避免或者以其他方式越过自动驾驶装置100的环境中的潜在障碍物。
当然,在一个实例中,控制系统106可以增加或替换地包括除了所示出和描述的那些以外的组件。或者也可以减少一部分上述示出的组件。
自动驾驶装置100通过外围设备108与外部传感器、其他车辆、其他计算机系统或用户之间进行交互。外围设备108可包括无线通信系统146、车载电脑148、麦克风150和/或扬声器152。
在一些实施例中,外围设备108提供自动驾驶装置100的用户与用户接口116交互的手段。例如,车载电脑148可向自动驾驶装置100的用户提供信息。用户接口116还可操作车载电脑148来接收用户的输入。车载电脑148可以通过触摸屏进行操作。在其他情况中,外围设备108可提供用于自动驾驶装置100与位于车内的其它设备通信的手段。例如,麦克风150可从自动驾驶装置100的用户接收音频(例如,语音命令或其他音频输入)。类似地,扬声器152可向自动驾驶装置100的用户输出音频。
无线通信系统146可以直接地或者经由通信网络来与一个或多个设备无线通信。例如,无线通信系统146可使用3G蜂窝通信,或者4G蜂窝通信,例如LTE,或者5G蜂窝通信。无线通信系统146可利用WiFi与无线局域网(wireless local area network,WLAN)通信。在一些实施例中,无线通信系统146可利用红外链路、蓝牙或ZigBee与设备直接通信。其他无线协议,例如各种车辆通信系统,例如,无线通信系统146可包括一个或多个专用短程通信(dedicated short range communications,DSRC)设备,这些设备可包括车辆和/或路边台站之间的公共和/或私有数据通信。
电源110可向自动驾驶装置100的各种组件提供电力。在一个实施例中,电源110可以为可再充电锂离子或铅酸电池。这种电池的一个或多个电池组可被配置为电源为自动驾驶装置100的各种组件提供电力。在一些实施例中,电源110和能量源119可一起实现,例如一些全电动车中那样。
自动驾驶装置100的部分或所有功能受计算机系统112控制。计算机系统112可包括至少一个处理器113,处理器113执行存储在例如数据存储装置114这样的非暂态计算机可读介质中的指令115。计算机系统112还可以是采用分布式方式控制自动驾驶装置100的个体组件或子系统的多个计算设备。
处理器113可以是任何常规的处理器,诸如商业可获得的中央处理器(central processing unit,CPU)。替选地,该处理器可以是诸如ASIC或其它基于硬件的处理器的专用设备。尽管图1功能性地图示了处理器、存储器和在相同块中的计算机系统112的其它元件,但是本领域的普通技术人员应该理解该处理器、计算机、或存储器实际上可以包括可以或者可以不存储在相同的物理外壳内的多个处理器、计算机、或存储器。例如,存储器可以是硬盘驱动器或位于不同于计算机系统112的外壳内的其它存储介质。因此,对处理器或计算机的引用将被理解为包括对可以或者可以不并行操作的处理器或计算机或存储器的集合的引用。不同于使用单一的处理器来执行此处所描述的步骤,诸如转向组件和减速组件的一些组件每个都可以具有其自己的处理器,该处理器只执行与特定于组件的功能相关的计算。
在此处所描述的各个方面中,处理器可以位于远离该自动驾驶装置并且与该自动驾驶装置进行无线通信。在其它方面中,此处所描述的过程中的一些操作在布置于自动驾驶装置内的处理器上执行而其它则由远程处理器执行,包括采取执行单一操纵的必要步骤。
在一些实施例中,数据存储装置114可包含指令115(例如,程序逻辑),指令115可 被处理器113执行来执行自动驾驶装置100的各种功能,包括以上描述的那些功能。数据存储装置114也可包含额外的指令,包括向推进系统102、传感器系统104、控制系统106和外围设备108中的一个或多个发送数据、从其接收数据、与其交互和/或对其进行控制的指令。
除了指令115以外,数据存储装置114还可存储数据,例如道路地图、路线信息,车辆的位置、方向、速度以及其他信息。这些信息可在自动驾驶装置100在自主、半自主和/或手动模式中操作期间被自动驾驶装置100和计算机系统112使用。
用户接口116,用于向自动驾驶装置100的用户提供信息或从其接收信息。可选地,用户接口116可包括在外围设备108的集合内的一个或多个输入/输出设备,例如无线通信系统146、车车在电脑148、麦克风150和扬声器152。
计算机系统112可基于从各种子系统(例如,行进系统102、传感器系统104和控制系统106)以及从用户接口116接收的输入来控制自动驾驶装置100的功能。例如,计算机系统112可利用来自控制系统106的输入以便控制转向单元132来避免由传感器系统104和障碍物避免系统144检测到的障碍物。在一些实施例中,计算机系统112可操作来对自动驾驶装置100及其子系统的许多方面提供控制。
可选地,上述这些组件中的一个或多个可与自动驾驶装置100分开安装或关联。例如,数据存储装置114可以部分或完全地与自动驾驶装置100分开存在。上述组件可以按有线和/或无线方式来通信地耦合在一起。
可选地,上述组件只是一个示例,实际应用中,上述各个模块中的组件有可能根据实际需要增添或者删除,图1不应理解为对本申请实施例的限制。
在道路行进的自动驾驶汽车,如上面的自动驾驶装置100,可以识别其周围环境内的物体以确定对当前速度的调整。该物体可以是其它车辆、交通控制设备、或者其它类型的物体。在一些示例中,可以独立地考虑每个识别的物体,并且基于物体的各自的特性,诸如它的当前速度、加速度、与车辆的间距等,可以用来确定自动驾驶汽车所要调整的速度。
可选地,自动驾驶装置100或者与自动驾驶装置100相关联的计算设备(如图1的计算机系统112、计算机视觉系统140、数据存储装置114)可以基于所识别的物体的特性和周围环境的状态(例如,交通、雨、道路上的冰等等)来预测该识别的物体的行为。可选地,每一个所识别的物体都依赖于彼此的行为,因此还可以将所识别的所有物体全部一起考虑来预测单个识别的物体的行为。自动驾驶装置100能够基于预测的该识别的物体的行为来调整它的速度。换句话说,自动驾驶汽车能够基于所预测的物体的行为来确定车辆将需要调整到(例如,加速、减速、或者停止)什么稳定状态。在这个过程中,也可以考虑其它因素来确定自动驾驶装置100的速度,诸如,自动驾驶装置100在行驶的道路中的横向位置、道路的曲率、静态障碍物和动态障碍物的接近度等等。
除了提供调整自动驾驶汽车的速度的指令之外,计算设备还可以提供修改自动驾驶装置100的转向角的指令,以使得自动驾驶汽车遵循给定的轨迹和/或维持与自动驾驶汽车附近的物体(例如,道路上的相邻车道中的轿车)的安全横向和纵向距离。
上述自动驾驶装置100可以为轿车、卡车、摩托车、公共汽车、船、飞机、直升飞机、割草机、娱乐车、游乐场车辆、施工设备、电车、高尔夫球车、火车、和手推车等,本发 明实施例不做特别的限定。
图2介绍了自动驾驶装置100的功能框图,下面介绍一种自动驾驶系统101。图2为本申请实施例提供的一种自动驾驶系统的结构示意图。图1和图2是从不同的角度来描述自动驾驶装置100。如图2所示,计算机系统101包括处理器103,处理器103和系统总线105耦合。处理器103可以是一个或者多个处理器,其中,每个处理器都可以包括一个或多个处理器核。显示适配器(video adapter)107,显示适配器可以驱动显示器109,显示器109和系统总线105耦合。系统总线105通过总线桥111和输入输出(I/O)总线113耦合。I/O接口115和I/O总线耦合。I/O接口115和多种I/O设备进行通信,比如输入设备117(如:键盘,鼠标,触摸屏等),多媒体盘(media tray)121,例如CD-ROM,多媒体接口等。收发器123(可以发送和/或接受无线电通信信号),摄像头155(可以捕捉景田和动态数字视频图像)和外部USB接口125。可选的,和I/O接口115相连接的接口可以是USB接口。
其中,处理器103可以是任何传统处理器,包括精简指令集计算(“RISC”)处理器、复杂指令集计算(“CISC”)处理器或上述的组合。可选的,处理器可以是诸如专用集成电路(“ASIC”)的专用装置。可选的,处理器103可以是神经网络处理器(Neural-network Processing Unit,NPU)或者是神经网络处理器和上述传统处理器的组合。可选的,处理器103挂载有一个神经网络处理器。
计算机系统101可以通过网络接口129和软件部署服务器149通信。网络接口129是硬件网络接口,比如,网卡。网络127可以是外部网络,比如因特网,也可以是内部网络,比如以太网或者虚拟私人网络。可选的,网络127还可以是无线网络,比如WiFi网络,蜂窝网络等。
硬盘驱动接口和系统总线105耦合。硬件驱动接口和硬盘驱动器相连接。系统内存135和系统总线105耦合。运行在系统内存135的数据可以包括计算机系统101的操作系统137和应用程序143。
操作系统包括壳(Shell)139和内核(kernel)141。壳139是介于使用者和操作系统之内核(kernel)间的一个接口。壳139是操作系统最外面的一层。壳139管理使用者与操作系统之间的交互:等待使用者的输入,向操作系统解释使用者的输入,并且处理各种各样的操作系统的输出结果。
内核141由操作系统中用于管理存储器、文件、外设和系统资源的那些部分组成。直接与硬件交互,操作系统内核通常运行进程,并提供进程间的通信,提供CPU时间片管理、中断、内存管理、IO管理等等。
应用程序141包括自动驾驶相关程序,比如,管理自动驾驶装置和路上障碍物交互的程序,控制自动驾驶装置的行车路线或者速度的程序,控制自动驾驶装置100和路上其他自动驾驶装置交互的程序。应用程序141也存在于软件部署服务器(deploying server)149的系统上。在一个实施例中,在需要执行应用程序141时,计算机系统101可以从软件部署服务器149下载应用程序141。
传感器153和计算机系统101关联。传感器153用于探测计算机系统101周围的环境。举例来说,传感器153可以探测动物,汽车,障碍物和人行横道等,进一步传感器还可以探测上述动物,汽车,障碍物和人行横道等物体周围的环境,比如:动物周围的环境,例 如,动物周围出现的其他动物,天气条件,周围环境的光亮度等。可选的,如果计算机系统101位于自动驾驶装置上,传感器可以是摄像头(即相机),激光雷达,红外线感应器,化学检测器,麦克风等。传感器153在激活时按照预设间隔感测信息并实时或接近实时地将所感测的信息提供给计算机系统101。可选的,传感器可以包括激光雷达,该激光雷达可以实时或接近实时地将获取的点云提供给计算机系统101,即将获取到的一系列点云提供给计算机系统101,每次获取的点云对应一个时间戳。可选的,摄像头实时或接近实时地将获取的图像提供给计算机系统101,每帧图像对应一个时间戳。应理解,计算机系统101可得到来自摄像头的图像序列。
可选的,在本文该的各种实施例中,计算机系统101可位于远离自动驾驶装置的地方,并且可与自动驾驶装置进行无线通信。收发器123可将自动驾驶任务、传感器153采集的传感器数据和其他数据发送给计算机系统101;还可以接收计算机系统101发送的控制指令。自动驾驶装置可执行收发器接收的来自计算机系统101的控制指令,并执行相应的驾驶操作。在其它方面,本文该的一些过程在设置在自动驾驶车辆内的处理器上执行,其它由远程处理器执行,包括采取执行单个操纵所需的动作。
在自动驾驶过程中,如背景技术该自动驾驶装置在采用SLAM进行定位时,需要确定图像帧之间的匹配关系。下面介绍如何确定两帧图像之间的匹配关系。图3为本申请实施例提供的一种图像帧之间的匹配关系确定方法流程图,如图3所示,该方法可包括:
301、匹配关系确定装置获取N组特征点对。
该匹配关系确定装置可以是自动驾驶装置,也可以是服务器。在一些实施例中,自动驾驶装置采集第一图像和第二图像,并执行图3的方法流程来确定该第一图像和该第二图像之间的匹配关系。在一些实施例中,自动驾驶装置可以将其采集的图像数据以及点云数据等发送至匹配关系确定装置(例如服务器),该匹配关系确定装置执行图3中的方法流程,根据这些数据来确定第一图像和第二图像之间的匹配关系。每组特征点对包括两个相匹配的特征点,其中一个特征点为从第一图像提取的特征点,另一个特征点为从第二图像提取的特征点,该第一图像和该第二图像分别为自动驾驶装置在第一时刻和第二时刻采集的图像,N为大于1的整数。可选的,该第一图像和该第二图像分别为自动驾驶装置上的同一摄像头在不同时刻采集的图像。可选的,自动驾驶装置在执行步骤301之前,在该第一时刻采集得到该第一图像且在该第二时刻采集到该第二图像;对该第一图像进行特征提取得到第一特征点集,对该第二图像进行特征提取得到第二特征点集;将该第一特征点集中的特征点与该第二特征点集中的特征点进行特征匹配以得到特征匹配点集;其中,该特征匹配点集包括该N组特征点对。该N组特征点对可以是该自动驾驶装置从该特征匹配点集中选取的N组特征点对。N可以是5、6、8等整数。
302、匹配关系确定装置利用动态障碍物的运动状态信息对N组特征点对中目标特征点的像素坐标进行调整。
该目标特征点属于该第一图像和/或该第二图像中该动态障碍物对应的特征点,该N组特征点对中除该目标特征点之外的特征点的像素坐标保持不变。该动态障碍物可以是一个,也可以是多个,本申请不作限定。在一些实施例中,动态障碍物可以是该第一图像和/或该第二图像中所有的动态障碍物。后续再详述步骤302的实现方式。
303、匹配关系确定装置根据N组特征点对中各特征点对应的调整后的像素坐标,确定第一图像和第二图像之间的目标匹配关系。
该第一图像和该第二图像之间的目标匹配关系可以该第一图像和该第二图像之间的平移矩阵和旋转矩阵。自动驾驶装置根据N组特征点对中各特征点对应的调整后的像素坐标,确定第一图像和第二图像之间的目标匹配关系可以是该自动驾驶装置根据N组特征点对中各特征点对应的调整后的像素坐标,确定第一图像和第二图像之间的平移矩阵和旋转矩阵。后续再详述计算两帧图像的之间的平移矩阵和旋转矩阵的方式。
利用运动状态信息对目标特征点的像素坐标进行调整的目的是对N组特征点对中动态障碍物对应的特征点的像素坐标进行调整,使得该N组特征点对中动态障碍物对应的特征点之间的平移矩阵和旋转矩阵与该N组特征点对中静态障碍物对应的特征点之间的平移矩阵和旋转矩阵基本相同,这样可以更准确地确定第一图像和第二图像之间的匹配关系,即第一图像和第二图像之间的平移矩阵和旋转矩阵。举例来说,第一图像中的第1特征点至第5特征点依次与第二图像中的第6特征点至第10特征点相匹配;若该第1特征点至该第5特征点均为静态障碍物对应的特征点,则根据该第1特征点至该第5特征点的像素坐标和该第6特征点至第10特征点的像素坐标,可以准确地确定该第一图像和该第二图像之间的匹配关系;若该第1特征点至该第5特征点中至少一个为动态障碍物对应的特征点,则根据该第1特征点至该第5特征点的像素坐标和该第6特征点至第10特征点的像素坐标,不能准确地确定该第一图像和该第二图像之间的匹配关系。
本申请实施例中,利用动态障碍物的运动状态信息对N组特征点对中目标特征点的像素坐标进行调整之后,该N组特征点对中动态障碍物对应的特征点之间的平移矩阵和旋转矩阵与该N组特征点对中静态障碍物对应的特征点之间的平移矩阵和旋转矩阵基本相同,因此根据该N组特征点对中各特征点的像素坐标能够较准确地确定第一图像和第二图像之间的匹配关系。
前述实施例未详述步骤302的实现方式,下面来描述步骤302的一种可选的实现方式。
在一个可选的实现方式中,该运动状态信息包括该动态障碍物从该第一时刻至该第二时刻的位移;利用动态障碍物的运动状态信息对该N组特征点对中目标特征点的像素坐标进行调整可以包括:利用该位移对参考特征点的像素坐标进行调整,该参考特征点包含于该目标特征点,且属于该第二图像中该动态障碍物对应的特征点。
该位移可以是动态障碍物从该第一时刻至该第二时刻在相机坐标系中的位移。由于动态障碍物在相机坐标系(也称摄像机坐标系)下的位移近似等同于该动态障碍物在图像坐标系中的位移,因此该动态障碍物在相机坐标系下的位移可以作为该动态障碍物对应的特征点在图像坐标系中的位移。下面介绍匹配关系确定装置如何获得动态障碍物从该第一时刻至该第二时刻在相机坐标系中的位移的方式。
可选的,匹配关系确定装置可根据自动驾驶装置上的激光雷达采集的点云数据,确定动态障碍物在第一时刻的第一速度以及在第二时刻的第二速度;计算该第一速度和该第二速度的平均值以得到平均速度。假定第一速度为(V x1,V y2,V z3),第二速度为(V x2,V y2,V z2),则该平均速度为
Figure PCTCN2019100093-appb-000001
其中,
Figure PCTCN2019100093-appb-000002
分别为动态障碍 物在X方向、Y方向以及Z方向的速度。可以理解,该平均速度为动态障碍物在激光雷达坐标系下的速度。可选的,该匹配关系确定装置可先将该平均速度从激光雷达坐标系转换至自车坐标系,再将该平均速度从自车坐标系转换至相机坐标系。自车坐标系(也称车辆坐标系)是用来描述汽车运动的特殊动坐标系;其原点与质心重合,当自车在水平路面上处于静止状态,X轴平行于地面指向车辆前方,Z轴通过自车质心指向上方,Y轴指向驾驶员的左侧。可选的,该匹配关系确定装置可将该平均速度从激光雷达坐标系直接转换至相机坐标系。
自动驾驶装置将该平均速度从激光雷达坐标系转换至自车坐标系可采用如下公式:
V 1′=R 1×V 1+T 1 (1);
其中,V 1′为自车坐标系下的平均速度,V 1为激光雷达坐标系下的平均速度,R 1为激光雷达标定的旋转矩阵(外参),T 1为该激光雷达标定的平移矩阵。
自动驾驶装置将该平均速度从自车坐标系转换至相机坐标系可采用如下公式:
V 1″=R 2×V 1′+T 2  (2);
其中,V 1″为相机坐标系下的平均速度,V 1′为自车坐标系下的平均速度,R 2为自动驾驶装置与相机之间的旋转矩阵,T 2为该自动驾驶装置与相机之间的平移矩阵。
自动驾驶装置将该平均速度从激光雷达坐标系转换至相机坐标系可采用如下公式:
V 1″=R 3×V 1+T 3  (3);
其中,V 1″为相机坐标系下的平均速度,V 1为激光雷达坐标系下的平均速度,R 3为激光雷达与相机之间的旋转矩阵,T 3为该激光雷达与相机之间的平移矩阵。
匹配关系确定装置利用位移对参考特征点的像素坐标进行调整的公式如下:
Figure PCTCN2019100093-appb-000003
其中,(x′,y′)为参考特征点调整后的像素坐标,(x,y)为该参考特征点调整前的像素坐标,Δt为第一时刻至第二时刻的时长,V x″为V 1″在X方向的分量,V x″为V 1″在Y方向的分量,即V 1″为(V x″,V y″,V z″)。应理解,参考特征点中包括的每个特征点的像素点均可采用公式(4)进行调整。
在该实现方式中,利用动态障碍物从第一时刻至第二时刻的位移对参考特征点的像素坐标进行调整(即运动补偿),使得该参考特征点的像素坐标被调整后基本等同于静态障碍物的像素坐标,以便于更准确地确定第一图像和第二图像之间的匹配关系。
自动驾驶装置在执行步骤302之前,需要确定N组特征点对中该动态障碍物对应的特征点以得到该目标特征点,以便于对该目标特征点的像素坐标进行调整。确定N组特征点对中动态障碍物对应的特征点以得到目标特征点可以是:确定该N组特征点对中位于第一投影区域和/或第二投影区域的特征点为该目标特征点;该第一投影区域为该第一图像中该动态障碍物的图像所处的区域,该第二投影区域为该第二图像中该动态障碍物的图像所处的区域。可选的,自动驾驶装置获得表征该动态障碍物在该第一时刻的特性的目标点云,将该目标点云投影到该第一图像以得到该第一投影区域;获得表征该动态障碍物在该第二时刻的特性的中间点云,将该中间点云投影到该第二图像以得到该第二投影区域。下面介绍点云投影到图像坐标系的方式,具体方式如下:
(1)激光雷达与相机(第一摄像头或第二摄像头)之间的外参(这里的外参主要是指激光雷达和相机之间的旋转矩阵R ibeoTocam和平移向量T ibeoTocam),把激光雷达得到的目标点云投影到相机坐标系,其投影公式为:
P cam=P ibeoTocam*P ibeo+T ibeoTocam  (5);
其中,P ibeo表示激光雷达感知到的动态障碍物的某个点在激光雷达坐标系中的位置,P cam表示这个点在相机坐标系中的位置。
(2)通过相机的内参,将相机坐标系中的点转换到图像坐标系,其公式如下:
U=KP cam  (6);
其中,K为相机的内参矩阵,U为该点在图像坐标系下的坐标。
在实际应用中,自动驾驶装置可以通过激光雷达(ibeo)按照一定扫描频率扫描周围环境以得到障碍物在不同时刻的点云,通过神经网络(Neural Networks,NN)算法或则非NN算法利用不同时刻的点云确定障碍物的运动信息(例如位置、速度、包围盒和姿态等)。激光雷达可以实时或接近实时地将获取的点云提供给计算机系统101,每次获取的点云对应一个时间戳。可选的,相机(摄像头)实时或接近实时地将获取的图像提供给计算机系统101,每帧图像对应一个时间戳。应理解,计算机系统101可得到来自相机的图像序列以及来自激光雷达的点云序列。由于激光雷达和相机(camera)的频率不一致,因此两种传感器的时间戳通常不同步。在以相机的时间戳为基准的情况下,对通过激光雷达检测到的障碍物的运动信息进行插值运算。若激光雷达的扫描频率比相机的拍摄频率高的话,就进行内插,具体运算过程为:例如最新拍摄的相机时间为t cam,找到距离激光雷达输出中最近的两个时间t k和t k+1,其中,t k<t cam<t k+1;以位置插值计算为例,如t k时刻激光雷达检测到障碍物的位置为
Figure PCTCN2019100093-appb-000004
t k+1检测到障碍物的位置为
Figure PCTCN2019100093-appb-000005
则t cam时刻障碍物的位置为:
Figure PCTCN2019100093-appb-000006
其中,
Figure PCTCN2019100093-appb-000007
为障碍物在t cam时刻的位置。应理解,自动驾驶装置可采用相同的方式对障碍物的其他运动信息,例如速度、姿态、点云等,进行插值,进而得到相机拍摄图像时,障碍物的运动信息。举例来说,相机在第一时刻拍摄得到第一图像,激光雷达在第三时刻扫描得到第一点云,在第四时刻扫描得到第二点云,且该第三时刻和该第四时刻为激光雷达的扫描时刻中与该第一时刻最接近的两个扫描时刻,采用与公式(7)类似的公式对该第一点云和该第二点云中相对应的点进行插值运算以得到障碍物在该第一时刻的目标点云。若激光雷达的扫描频率比相机的拍摄频率高的话,就进行外插。内插和外插是常用的数学计算公式,这里不再详述。
前述实施例中,自动驾驶装置根据N组特征点对中各特征点对应的调整后的像素坐标,来确定第一图像和第二图像之间的匹配关系。在实际应用中,自动驾驶装置可以根据从第一图像和第二图像相匹配的多组特征点对中任意选取N组特征点对,并根据该N组特征点对中各特征点对应的调整后的像素坐标来确定第一图像和第二图像之间的匹配关系。由于N组特征点对中可能存在噪声点等不能准确反映该第一图像和该第二图像的匹配关系的特征点对,因此需要选择N组能够准确反映该第一图像和该第二图像的匹配关系的特征点对, 进而准确地确定第一图像和第二图像之间的匹配关系。为更准确地确定两帧图像之间的匹配关系,本申请实施例采用一种改进的RANSAC算法来确定前后两帧图像的匹配关系。
图4为本申请实施例提供的一种确定前后两帧图像的匹配关系的方法流程图。图4是对图3中的方法流程的进一步细化和完善。也就是说,图3的方法流程为图4中的方法流程的一部分。如图4所示,该方法可包括:
401、匹配关系确定装置确定第一图像中动态障碍物所处的第一投影区域以及第二图像中动态障碍物所处的第二投影区域。
前述实施例描述了将动态障碍物在第一时刻对应的目标点云投影至第一图像以得到第一投影区域,以及将动态障碍物在第二时刻对应的中间点云投影至第二图像以得到第二投影区域的方式,这里不再赘述。
402、匹配关系确定装置从匹配特征点集中随机选择N组特征点对。
本申请实施例中,步骤402可在执行步骤401之前执行,也可以在执行步骤401之后执行。该匹配特征点集为对从该第一图像提取的特征点与从该第二图像提取的特征点做特征匹配得到的特征点对。自动驾驶装置在执行步骤402之前,可对第一图像进行特征提取以得到第一特征点集,对第二图像进行特征提取以得到第二特征点集;对该第一特征点集中的特征点与该第二特征点集中的特征点进行特征匹配以得到匹配特征点集。步骤402为步骤301的一种实现方式。
403、匹配关系确定装置判断N组特征点对是否包括特殊特征点。
特殊特征点是指该N组特征点对中处于第一投影区域和/或第二投影区域的特征点。若否,执行404;若是,执行405。
404、匹配关系确定装置根据N组特征点中各特征点的像素坐标,计算第一图像与第二图像之间的匹配关系。
本申请中,第一图像与第二图像之间的匹配关系可以是该第一图像和该第二图像之间的平移矩阵和旋转矩阵。
405、匹配关系确定装置利用动态障碍物的运动状态信息对N组特征点对中目标特征点的像素坐标进行调整,并根据该N组特征点对中各特征点调整后的像素坐标确定第一图像和第二图像之间的匹配关系。
该目标特征点属于该第一图像和/或该第二图像中该动态障碍物对应的特征点。该N组特征点对中除该目标特征点之外的特征点对应的像素坐标均保持不变。步骤405对应于图3中的步骤302和步骤303。
406、匹配关系确定装置根据匹配关系,将匹配特征点集中除N组特征点对之外的各特征点对分为内点和外点以得到内点集和外点集。
根据匹配关系,将匹配特征点集中除N组特征点对之外的各特征点对分为内点和外点以得到内点集和外点集可以是依次检测该匹配特征点集中除N组特征点对之外的各特征点是否满足该匹配关系;若是,则确定该特征点对为内点,若否,则确定该特征点对为外点。
407、匹配关系确定装置判断当前得到的内点集中的内点的个数是否最多。
若是,执行408;若否,执行402。图4中的方法流程是一个多次迭代的流程,判断当前得到的内点集中的内点的个数是否最多可以是判断当前得到的内点集与之前得到的各内 点集相比是否包括的内点的个数最多。
408、匹配关系确定装置判断当前迭代次数是否满足终止条件。
若是,执行409;若否,执行402。当前迭代次数可以是当前已执行的步骤402的次数。判断当前迭代次数是否满足终止条件可以是判断当前迭代次数是否大于或等于M,M为大于1的整数。M可以是5、10、20、50、100等。
409、结束本流程,且将目标匹配关系作为第一图像与第二图像之间的匹配关系。
该目标匹配关系为已确定的第一图像和该第二图像之间的两个或两个以上匹配关系中较优的匹配关系。可以理解,根据越优的匹配关系,将匹配特征点集中除N组特征点对之外的各特征点对分为内点和外点,可以得到越多的内点。
可以理解,通过执行步骤405可以使得动态障碍物对应的特征点对包括的两个特征点之间的关系与静态障碍物对应的特征点对包括的两个特征点之间的关系基本一致。也就是说,执行步骤405之后,N组特征点对均可视为静态障碍物对应的特征点对,这样就可以减少动态障碍物对应的特征点对的影响,因此能够较快的确定一组较优的匹配关系。另外,采用RANSAC算法可以从已确定的第一图像和第二图像之间的多个匹配关系中,选择一个较优的匹配关系,从而保证确定的匹配关系的质量。
本申请实施例中,采用RANSAC算法可准确、快速地确定第一图像与第二图像之间的匹配关系。
前述实施例未详细描述如何确定第一图像和第二图像的匹配关系的方式。下面介绍如何利用第一图像和第二图像对应的多个特征点对计算这两个图像之间的旋转矩阵R和平移矩阵T。上述匹配特征点集包括对从该第一图像提取的特征点与从该第二图像提取的特征点做特征匹配得到的多组特征点对。该匹配特征点集中每组特征点对包括两个相匹配的特征点,其中一个特征点为从第一图像提取的特征点,另一个特征点为从第二图像提取的特征点,该第一图像和该第二图像分别为自动驾驶装置在第一时刻和第二时刻采集的图像。可以理解,多组特征点对包括点集A和点集B,点集A中的特征点为从第一图像提取的特征点,点集B中的特征点为从第二图像提取的特征点,这两个点集合的元素数目相同且一一对应。点集A可以是N组特征点对中从第一图像提取的特征点,点集B可以是N组特征点对中从第二图像提取的特征点,这两个点集之间的旋转矩阵和平移矩阵就是第一图像和第二图像之间的旋转矩阵和平移矩阵。
为了确定两个点集之间的旋转矩阵和平移矩阵,可以将这个问题建模成如下的公式:
B=R*A+t  (8);
其中,B表示点集B中的特征点的像素坐标,A表示点集A中特征点的像素坐标。为了寻找这两个点集之间的旋转矩阵和平移矩阵,通常需要以下三个步骤:
(1)、计算点集合的中心点,计算公式如下:
Figure PCTCN2019100093-appb-000008
Figure PCTCN2019100093-appb-000009
其中,
Figure PCTCN2019100093-appb-000010
表示点集A中的第i特征点的像素坐标,
Figure PCTCN2019100093-appb-000011
表示点集B中的第i特征点的像 素坐标,u A为点集A对应的中心点,u B为点集B对应的中心点。
Figure PCTCN2019100093-appb-000012
u A以及u B均为向量。例如
Figure PCTCN2019100093-appb-000013
(2)、将点集合移动到原点,计算最优旋转矩阵R。
为了计算旋转矩阵R,需要消除平移矩阵t的影响,所以首先需要将点集重新中心化,生成新的点集A′和点集B′,然后计算点集A′和点集B′之间的协方差矩阵。
采用如下公式将点集重新中心化:
Figure PCTCN2019100093-appb-000014
Figure PCTCN2019100093-appb-000015
其中,A′ i为点集A′中的第i特征点的像素坐标,B′ i为点集B′中的第i特征点的像素坐标。
计算点集之间的协方差矩阵H,计算公式如下;
Figure PCTCN2019100093-appb-000016
通过奇异值分解(Singular Value Decomposition,SVD)方法获得矩阵的U、S和V,计算点集之间的旋转矩阵,公式如下:
[U V D]=SVD(H) (14);
R=VU T   (15);
其中,R为点集A和点集B之间的旋转矩阵,即第一图像和第二图像之间的旋转矩阵。
(3)、计算平移矩阵
采用如下公式计算平移矩阵:
t=-R×u A+u B  (16);
其中,t为点集A和点集B之间的平移矩阵,即第一图像和第二图像之间的平移矩阵。
应理解,上述仅是本申请实施例提供的一种确定图像帧之间的匹配关系的一种实现方式,还可以采用其他方式来确定图像帧之间的匹配关系。
前述实施例描述了确定前后两帧图像之间的匹配关系的实现方式。在实际应用中,可以依次确定自动驾驶装置采集的各相邻图像帧之间的匹配关系,进而确定各帧图像与参考帧图像之间的匹配关系。该参考帧图像可以是自动驾驶装置在一次行驶过程中采集的第一帧图像。举例来说,自动驾驶装置在某段时间内按照时间先后顺序依次采集到第1帧图像至第1000帧图像,该自动驾驶装置可以分别确定相邻两帧图像之间的平移矩阵和旋转矩阵,例如第1帧图像和第2帧图像之间的平移矩阵和旋转矩阵,并根据这些平移矩阵和旋转矩阵确定这1000帧图像中除该第1帧图像之外的任一帧图像与该第一帧图像之间的匹配关系,进而计算各帧图像的重投影误差。又举例来说,第一图像和第二图像之间的旋转矩阵为R 4,平移矩阵为T 4;第二图像和第五图像之间的旋转矩阵为R 5,平移矩阵为T 5;则该第一图像和该第五图像之间的旋转矩阵为(R 4×R 5),该第一图像和该第五图像之间的平移矩阵为(R 4×T 5+T 4)。在一些实施例中,自动驾驶装置采集到一帧图像就确定该帧图像与该帧图像的前一帧图像之间的匹配关系,这样就可以得到任意两帧相邻图像之间的匹配关系,进而得到任意两帧图像之间的匹配关系。在实际应用中,匹配关系确定装置在确定当前帧(即当前时刻采集的图像帧)与参考帧之间的平移矩阵和旋转矩阵之后,可以利用该平移 矩阵和旋转矩阵将当前帧中的特征点对应的三维空间坐标从自车坐标系转换至参考坐标系,以便于计算该当前帧的重投影误差。
前述实施例描述了如何更准确地确定图像帧之间的匹配关系。计算图像帧之间的匹配关系的一个重要应用是计算当前帧与参考帧之间的匹配关系,进而计算该当前帧的重投影误差。本申请实施例还提供了一种重投影误差计算方法,下面具体描述该重投影误差计算方法。
图5为本申请实施例提供的一种重投影误差计算方法流程图。如图5所示,该方法可包括:
501、重投影误差计算装置利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整以得到第二空间坐标。
重投影误差计算装置可以是自动驾驶装置,也可以是服务器、电脑等计算机设备。在一些实施例中,自动驾驶装置采集第一图像,并执行图5的方法流程来计算该第一图像的重投影误差。在一些实施例中,自动驾驶装置可以将其采集的图像数据以及点云数据等发送至重投影误差计算装置(例如服务器);该重投影误差计算装置执行图5中的方法,根据这些数据来计算第一图像的重投影误差。该第一空间坐标包括第一图像中各特征点对应的空间坐标,该第一特征点为该第一图像中该动态障碍物对应的特征点。该第一图像可以为自动驾驶装置在第二时刻采集的图像。该第一图像中除该第一特征点之外的特征点的像素坐标均保持不变。该运动状态信息可以包括该自动驾驶装置从第一时刻至该第二时刻的位移(对应一个平移矩阵)和姿态变化(对应一个选择矩阵)。
可选的,重投影误差计算装置在执行步骤501之前,可以确定该第一图像中各特征点在参考坐标系中对应的三维空间坐标以得到第一空间坐标,以及确定该第一空间坐标中第一特征点对应的空间坐标。该参考坐标系可以是自动驾驶装置在本次行驶的起始地点建立的世界坐标系。后续再详述确定第一空间坐标以及第一特征点对应的空间坐标的实现方式。
502、重投影误差计算装置将第二空间坐标投影至第一图像以得到第一像素坐标。
该第二空间坐标可以是在参考坐标系下的空间坐标。重投影误差计算装置将第二空间坐标投影至第一图像以得到第一像素坐标可以是将该参考坐标系下的该第二空间坐标投影至第一图像以得到第一像素坐标。由于需要在一个固定不变的坐标系下计算自动驾驶装置采集到的每帧图像的重投影误差,因此需要确定第二图像中各特征点在参考坐标系下对应的三维空间坐标以得到第一空间坐标。该参考坐标系是一个固定的坐标系,不像自车坐标系会发生改变。利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整是利用动态障碍物的运动状态信息对该第一特征点在参考坐标系下对应的空间坐标进行调整。
503、重投影误差计算装置根据第一像素坐标和第二像素坐标,计算第一图像的重投影误差。
该第二像素坐标包括该第一图像中各特征点的像素坐标,第一像素坐标包括的各像素坐标与该第二像素坐标包括的各像素坐标一一对应。该第一像素坐标包括的每个像素坐标对应一个描述子,每个描述子用于描述其对应的特征点;该第二像素坐标包括的每个像素坐标也对应一个描述子。可以理解,第一像素坐标和第二像素坐标包括的像素坐标中对应 的描述子相同的像素坐标相对应。
可选的,重投影误差计算装置在执行步骤503之前,可利用该位移对该第一图像中该第一特征点的像素坐标进行调整以得到该第二像素坐标,该第一图像中除该第一特征点之外的特征点的像素坐标均保持不变。利用该位移对该第一图像中该第一特征点的像素坐标进行调整的实现方式可以与前文描述的利用位移对参考特征点的像素坐标进行调整的实现方式相同,这里不再详述。
重投影误差:投影的点与该帧图像上的测量点之间的误差,投影的点可以是该帧图像中的各特征点对应的三维空间坐标投影至该帧图像的坐标点(即第一像素坐标),测量点可以是这些特征点在该帧图像中的坐标点(即第二像素坐标)。重投影误差计算装置根据第一像素坐标和第二像素坐标,计算第一图像的重投影误差可以是计算第一像素坐标和第二像素坐标中一一对应的像素坐标之差。举例来说,某个特征点在第一像素坐标中对应的像素坐标为(U1,V1),该特征点在第二像素坐标中对应的像素坐标为(U2,V2),则该特征点的重投影误差为
Figure PCTCN2019100093-appb-000017
ΔU=U1-U2,
Figure PCTCN2019100093-appb-000018
第一图像的重投影误差包括该第一图像中各个特征点的重投影误差。
本申请实施例中,利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整,使得该第一特征点对应的空间坐标基本等同于静态障碍物对应的特征点所对应的空间坐标;在计算重投影误差时可以有效减少动态障碍物对应的特征点的影响,得到的重投影误差更准确。
重投影误差计算装置在执行步骤501之前,需要确定第一空间坐标以及第一特征点。下面描述如何得到第一空间坐标以及第一特征点的方式。
重投影误差计算装置可采用如下方式确定第一特征点:重投影误差计算装置在执行步骤501之前,获得第一摄像头在第二时刻采集的第一图像以及第二摄像头在该第二时刻采集的第二图像;对该第一图像进行特征提取以得到第一原始特征点集,对第二图像进行特征提取以得到第二原始特征点集;对该第一原始特征点集中的特征点与该第二原始特征点集中的特征点进行特征匹配以得到第一特征点集,该第一特征点集包括的特征点为该第一原始特征点集中与该第二原始特征点集中的特征点相匹配的特征点;确定该第一特征点集中动态障碍物对应的特征点以得到该第一特征点。
重投影误差计算装置可采用如下方式确定该第一特征点集中动态障碍物对应的特征点以得到该第一特征点:获得目标点云,该目标点云为表征该动态障碍物在该第二时刻的特性的点云;将该目标点云投影到该第一图像以得到目标投影区域;确定第一特征点集中位于该目标投影区域的特征点为该第一特征点。
重投影误差计算装置在获得第一原始特征点集和第二原始特征点集之后,可采用如下方式确定第一空间坐标;对该第一原始特征点集中的特征点与该第二原始特征点集中的特征点进行特征匹配以得到第一特征点集,其中,该第一特征点集包括多组特征点对,每组特征点对包括两个相匹配的特征点,一个特征点来自于该第一原始特征点集,另一个特征点来自于该第二原始特征点集;采用三角化公式根据第一特征点集中的每组特征点对确定一个三维空间坐标,得到该第一空间坐标。每组特征点对中一个特征点为从该第一图像提取的,另一个为从该第二图像提取的。由一组特征点对计算得到的一个三维空间坐标即为 该组特征点对包括的两个特征点对应的空间坐标。该第一特征点包含于该第一特征点集。三角化最早由高斯提出,并应用于测量学中。简单来讲就是:在不同的位置观测同一个三维点P(x,y,z),已知在不同位置处观察到的三维点的二维投影点X1(x1,y1),X2(x2,y2),利用三角关系,恢复出该三维点的深度信息,即三维空间坐标。三角化主要是通过匹配的特征点(即像素点)来计算特征点在相机坐标系下的三维坐标。图6为一种三角化过程示意图。如图6所示,P1表示三维点P在O1(左目坐标系)中的坐标(即二维投影点),P2表示三维点P在O2(右目坐标系)中的坐标(即二维投影点),P1和P2为匹配的特征点。三角化公式如下:
Figure PCTCN2019100093-appb-000019
公式(17)中s1表示特征点在O1(左目坐标系)中的尺度,s2表示特征点在O2(右目坐标系)中的尺度,R和t分别表示从左目摄像头到右目摄像头之间的旋转矩阵和平移矩阵。T(大写)表示矩阵的转置。应理解,利用三角关系确定特征点的三维空间坐标仅是一种可选的确定特征点的三维空间坐标的方式,还可以采用其他方式确定特征点的三维空间坐标,本申请不作限定。
前述实施例未详述步骤501的实现方式,下面描述如何利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整以得到第二空间坐标的实现方式。
该运动状态信息可以包括该自动驾驶装置从第一时刻至该第二时刻的位移(对应一个平移矩阵T 6)和姿态变化(对应一个旋转矩阵R 6)。
举例来说,旋转矩阵R 6表征自动驾驶装置从第一时刻至第二时刻的姿态变化,平移矩阵T 6表征该自动驾驶装置从第一时刻至第二时刻的位移,重投影误差计算装置可采用如下公式对该第一特征点对应的空间坐标P进行调整(即运动补偿):
P′=R 6P+T 6  (18);
其中,P′为该第一特征点对应的调整后的空间坐标,即补偿后的特征点坐标,P′为一个三维向量;R 6为一个3行3列的矩阵,T 6为一个三维向量。例如,R 1
Figure PCTCN2019100093-appb-000020
为[5 1.2 1.5],T 1为[10 20 0],其中,α为两帧图像绕z轴之间的旋转角度。
可选的,重投影误差计算装置计算旋转矩阵R 6的方式如下:通过激光雷达获取动态障碍物在第一时刻的第一角速度以及在第二时刻的第二角速度;计算该第一角速度和该第二角速度的平均值;计算该平均值和第一时长的乘积得到旋转角度α,该第一时长为该第一时刻与该第二时刻之间的时长;根据该旋转角度得到第一旋转矩阵,该第一旋转矩阵为激光雷达坐标系下的旋转矩阵;使用激光雷达的外参(激光雷达的朝向和位置)将该第一旋转矩阵从激光雷达坐标系转换至自车坐标系以得到第二旋转矩阵;将该第二旋转矩阵从自车坐标系转换至参考坐标系,得到旋转矩阵R 6。可以理解,旋转矩阵R 6为该动态障碍物在参考坐标系下从第一时刻至第二时刻的姿态变化对应的旋转矩阵。在实际应用中,自动驾驶装置可通过激光雷达检测得到动态障碍物在不同时刻的角速度。重投影误差计算装置可以采用如下公式将该第一旋转矩阵从激光雷达坐标系转换至自车坐标系以得到第二旋转矩阵:
R 6′=R 1×R 6″  (19);
R 6′为第二旋转矩阵,R 6″为第一旋转矩阵,R 1为激光雷达标定的旋转矩阵。
重投影误差计算装置可以采用如下公式将该第二旋转矩阵从自车坐标系转换至参考坐标系以得到旋转矩阵R 6
R 6=R 7×R 6′  (20);
R 6为该动态障碍物在参考坐标系下从第一时刻至第二时刻的姿态变化对应的旋转矩阵,R 6′为第二旋转矩阵,R 7为第一图像与参考帧图像之间的旋转矩阵。重投影误差计算装置与匹配关系确定装置可以是同一装置。前述实施例描述了确定任一帧图像与参考帧之间的平移矩阵和旋转矩阵的实现方式,这里不再详述。
可选的,重投影误差计算装置计算平移矩阵T 6的方式如下:通过激光雷达获取动态障碍物在第一时刻的第一速度以及在第二时刻的第二速度;计算该第一速度和该第二速度的平均值;计算该平均值和第二时长的乘积得到第一平移矩阵,该第二时长为该第一时刻与该第二时刻之间的时长,该第一平移矩阵为在激光雷达坐标系下的平移矩阵;使用激光雷达的外参(激光雷达的朝向和位置)将该第一平移矩阵从激光雷达坐标系转换至自车坐标系以得到第二平移矩阵;将该第二平移矩阵从该自车坐标系转换至参考坐标系,得到平移矩阵T 6。平移矩阵T 6可以理解为该动态障碍物在参考坐标系下从第一时刻至第二时刻的位置变化对应的平移矩阵。在实际应用中,自动驾驶装置可通过激光雷达检测得到动态障碍物在不同时刻的速度。重投影误差计算装置可以采用如下公式将该第一平移矩阵从激光雷达坐标系转换至自车坐标系以得到第二平移矩阵:
T 6′=R 1×T 6″+T 1  (21);
T 6′为第二平移矩阵,R 6″为第一平移矩阵,R 1为激光雷达标定的旋转矩阵,T 1为激光雷达标定的平移矩阵。重投影误差计算装置可以采用如下公式将该第二平移矩阵从自车坐标系转换至参考坐标系以得到第二平移矩阵:
T 6=R 7×T 6′+T 7  (22);
T 6为该动态障碍物在参考坐标系下从第一时刻至第二时刻的位置变化对应的平移矩阵,T 6′为第二平移矩阵,R 7为第一图像与参考帧图像之间的旋转矩阵,T 7为第一图像与参考帧图像之间的平移矩阵。
在该实现方式中,利用动态障碍物从第一时刻至第二时刻的位移对第一特征点的像素坐标进行调整(即运动补偿),使得该第一特征点的像素坐标被调整后基本等同于静态障碍物的像素坐标,以便于更准确地该第一图像的重投影误差。
下面介绍前述实施例提供的图像帧之间的匹配关系确定方法以及重投影计算方法在定位过程中的应用。图7为本申请实施例提供的一种定位方法流程示意图,该定位方法应用于包括激光雷达、IMU、双目相机的自动驾驶装置。如图7所示,该方法可包括:
701、自动驾驶装置通过双目相机采集图像。
通过双目相机在(t-1)时刻(对应于第一时刻)采集图像,得到第一图像和第三图像。该第一图像可以是左目摄像头采集的图像,该第三图像可以是右目摄像头采集的图像。在实际应用中,该双目相机可以实时或接近实时的采集图像。如图7所示,该双目摄像机在t时刻(对应于第二时刻)也采集得到第二图像和第四图像。该第二图像可以是左目摄像头 采集的图像,该第四图像可以是右目摄像头采集的图像。
702、自动驾驶装置对左目摄像头采集的图像和右目摄像头采集的图像进行特征提取,并进行特征匹配。
可选的,自动驾驶装置对第一图像进行特征提取以得到第一特征点集,对该第三图像进行特征提取以得到第二特征点集;对该第一特征点集中的特征点与该第二特征点集中的特征点做特征匹配,得到第一匹配特征点集。可选的,自动驾驶装置对第二图像进行特征提取以得到第三特征点集,对该第四图像进行特征提取以得到第四特征点集;对该第三特征点集中的特征点与该第四特征点集中的特征点做特征匹配,得到第二匹配特征点集。在实际应用中,自动驾驶装置对双目摄像机在同一时刻采集的两张图像做特征提取以及特征匹配。
703、自动驾驶装置对不同时刻采集的图像进行特征追踪。
自动驾驶装置对不同时刻采集的图像进行特征追踪可以是确定第一图像和第二图像的匹配关系,和/或,第三图像和第四图像的匹配关系。也就是说,自动驾驶装置对不同时刻采集的图像进行特征追踪可以是确定该自动驾驶装置在不同时刻采集的两帧图像之间的匹配关系。图7中特征追踪是指确定前后两帧图像的匹配关系。两帧图像之间的匹配关系可以是两帧图像之间的旋转矩阵和平移矩阵。自动驾驶装置确定两帧图像之间的匹配关系的实现方式可参阅图3和图4,这里不再赘述。在实际应用中,自动驾驶装置可分别确定其先后采集的多帧图像中所有前后相邻的两帧图像之间的匹配关系。在一些实施例中,自动驾驶装置采集到一帧图像就确定该帧图像与该帧图像的前一帧图像之间的匹配关系,这样就可以得到任意两帧相邻图像之间的匹配关系,进而得到任意两帧图像之间的匹配关系。例如,当前帧与参考帧之间的旋转矩阵和平移矩阵。
704、自动驾驶装置根据动态障碍物的角速率和速度进行运动估计。
自动驾驶装置进行运动估计可以是估计动态障碍物的运动状态以得到该动态障碍物的运动状态信息,例如动态障碍物在相机坐标系下从(t-1)时刻至t时刻的位移、该动态障碍物在参考坐标系下从(t-1)时刻至t时刻的姿态变化(例如旋转矩阵R 6)以及该动态障碍物在参考坐标系下从(t-1)时刻至t时刻的位置变化(例如平移矩阵T 6)。前述实施例描述了根据动态障碍物的角速率和速度进行运动估计以得到该动态障碍物的运动状态信息的实现方式,这里不再赘述。
705、自动驾驶装置对图像中的特征点所对应的空间坐标进行三维重建。
自动驾驶装置对图像中的特征点所对应的空间坐标进行三维重建可以包括:利用三角化公式根据第一匹配特征点集中每组相匹配的特征点对确定一个三维空间坐标以得到第一参考空间坐标;将该第一参考空间坐标从激光雷达坐标系转换至参考坐标系以得到第一中间空间坐标;根据运动状态信息调整该第一中间空间坐标中动态障碍物对应的特征点对应的空间坐标,得到第一目标空间坐标。该第一目标空间坐标为第一图像和第三图像中的特征点对应的调整后(重建)的三维空间坐标。该图像可以是该第一图像、该第二图像、第三图像以及第四图像中的任一个。该运动状态信息为自动驾驶装置在步骤704得到的。自动驾驶装置对图像中动态障碍物对应的特征点所对应的空间坐标进行三维重建还可以包括:利用三角化公式根据第二匹配特征点集中每组相匹配的特征点对确定一个三维空间坐标以 得到第二参考空间坐标;将该第二参考空间坐标从激光雷达坐标系转换至参考坐标系以得到第二中间空间坐标;根据运动状态信息调整该第二中间空间坐标中动态障碍物对应的特征点对应的空间坐标,得到第二目标空间坐标。该第二目标空间坐标为第二图像和第四图像中的特征点对应的调整后(重建)的三维空间坐标。可以理解,自动驾驶装置对图像中的特征点所对应的空间坐标进行三维重建也就是对图像中动态障碍物对应的特征点所对应的三维空间坐标进行调整。步骤705的实现方式可以与步骤501的实现方式相同。
706、自动驾驶装置计算重投影误差。
自动驾驶装置计算重投影误差的方式可以如下:将上述第二目标空间坐标中的三维空间坐标投影至第二图像以得到目标投影点;计算该目标投影点和目标测量点之间的误差,得到该第二图像的重投影误差。该目标测量点包括该第二图像中各特征点的像素坐标,该目标投影点包括的像素坐标与该目标测量点包括的像素坐标一一对应,。应理解,自动驾驶装置可采用类似的方式计算任一帧图像的重投影误差。步骤706的实现方式可参阅图5。
707、自动驾驶装置上的电子控制单元(Electronic Control Unit,ECU)根据激光雷达采集的点云数据确定障碍物的位置和速度。
障碍物可以包括动态障碍物和静态障碍物。具体的,ECU可根据激光雷达采集的点云数据,确定动态障碍物的位置和速度,以及静态障碍物的位置。
708、自动驾驶装置上的ECU根据激光雷达采集的点云数据确定障碍物的包围盒(Bounding Box),以及输出外参。
该外参可以是表征该激光雷达的位置和朝向的标定参数,即旋转矩阵(对应朝向)和平移矩阵(对应位置)。该外参在自动驾驶装置将该包围盒投影至图像以得到投影区域时会用到。
709、自动驾驶装置确定动态障碍物在图像的投影区域。
可选的,自动驾驶装置确定动态障碍物在第一图像的投影区域,以便于确定从该第一图像提取的特征点中属于该动态障碍物对应的特征点。可选的,自动驾驶装置根据动态障碍物的包围盒确定该动态障碍物在图像中的投影区域,具体实现方式可参阅公式(5)和公式(6)。应理解,自动驾驶装置可根据动态障碍物的包围盒,确定动态障碍物在每一帧图像中的投影区域。自动驾驶装置在执行步骤705时需要根据动态障碍物对应的投影区域来确定动态障碍物对应的特征点。
710、自动驾驶装置确定动态障碍物的速度和角速度等。
可选的,自动驾驶装置通过激光雷达采集的点云数据来确定动态障碍物的速度和角速度,以便于根据该动态障碍物的速度和角速度进行运动估计以得到该动态障碍物的运动状态信息。
711、自动驾驶装置采用扩展卡尔曼滤波器(Extended kalman filter,EKF)确定姿态误差、速度误差、位置误差以及第二输出。
该第二输出可以包括动态障碍物的位置、姿态以及速度。图7中量测量包括当前帧图像的重投影误差以及动态障碍物的位置。如图7所示,IMU将线加速度和角速度输出至状态模型,激光雷达将动态障碍物的位置和速度输出至该状态模型,该状态模型可根据这些信息来构建状态方程;量测模型可根据量测量来构建量测方程;EKF可以根据该量测方程 以及该状态方程计算得到姿态误差、速度误差、位置误差以及第二输出。后续再详述构建量测方程以及状态方程的实现方式。图7中,虚线框中的量测模型、状态模型以及扩展卡尔曼滤波器的功能可由计算机系统112实现。卡尔曼滤波的定义:一种利用线性系统状态方程,通过系统输入输出观测数据,对系统状态进行最优估计的算法。由于观测数据中包括系统中的噪声和干扰的影响,所以最优估计也可看作是滤波过程。扩展卡尔曼滤波(Extended Kalman Filter,EKF)是标准卡尔曼滤波在非线性情形下的一种扩展形式,它是一种高效率的递归滤波器(自回归滤波器)。EKF的基本思想是利用泰勒级数展开将非线性系统线性化,然后采用卡尔曼滤波框架对信号进行滤波,因此它是一种次优滤波。自动驾驶装置在定位过程中,由于IMU存在常值漂移,往往不能准确地定位,这时可以利用测量数据对定位结果进行调整。
SLAM过程包含许多步骤,整个过程是为了利用环境来更新自动驾驶装置的位置。由于自动驾驶装置的定位结果往往不够准确。我们可以利用对环境的激光扫描和/或采集图像来纠正自动驾驶装置的位置,这能通过提取环境的特征来实现,然后当自动驾驶装置向四周运动时再进行新的观察。扩展卡尔曼滤波EKF是SLAM过程的核心,其基于这些环境特征来负责更新自动驾驶装置原始的状态位置,这些特征常称为地标。EKF用于跟踪自动驾驶装置位置的不确定估计以及环境中的不确定地标。下文下再介绍本申请实施例中EKR的实现。
712、自动驾驶装置通过惯性导航系统(Inertial Navigation System,INS)确定其自身的姿态、速度以及位置。
图7中,速度误差以及位置误差输出至INS,INS可根据速度误差以及位置误差对其计算得到的自车的速度以及位置进行修正;姿态误差输出至乘法器,该乘法器对INS输出的旋转矩阵(表征姿态)进行修正,这个过程就是对IMU的常值漂移进行修正的过程。IMU的常值漂移是IMU的一种固有属性,会导致其导航误差随时间累积。该乘法器对INS输出的旋转矩阵进行修正可以是计算INS输出的旋转矩阵与姿态误差(一个旋转矩阵)的乘积以得到修正后的旋转矩阵。图7中的第一输出是自动驾驶装置的姿态、速度以及位置。图7中的线加速度和角速度是IMU的输出,INS对该线加速度进行一阶积分可得到自车的速度,对该线加速度进行二阶积分可得到自车的位置,对该角速度进行一阶积分可得到自车的姿态。
本申请实施例中,可以更准确地计算重投影误差,使得定位更准确。
扩展卡尔曼滤波器是本领域常用的技术手段。下面简单描述一下本申请实施例中EKR的应用。
在实际应用中,自动驾驶装置可进行系统建模:将障碍物和自车的位置、速度、姿态以及IMU的常值偏差等建模到系统的方程中,在对自车进行定位时,同时也对障碍物的位置、速度和角度等做进一步的优化。其中,激光雷达可检测动态障碍物的位置、速度、姿态。IMU可估计自车的位置、速度、姿态。
系统的状态方程:系统的状态量
Figure PCTCN2019100093-appb-000021
其中前15维状态量为IMU的位置误差、速度误差、姿态误差等。后9n维为障碍物的位置、速度和角度信息。具体的,q为自车(即自动驾驶装置)的姿态误差,b g为陀螺仪的常值偏差误 差,
Figure PCTCN2019100093-appb-000022
为速度误差,b a为加速度计的常值偏差误差,
Figure PCTCN2019100093-appb-000023
为自车的位置误差,
Figure PCTCN2019100093-appb-000024
为第一个障碍物的位置,
Figure PCTCN2019100093-appb-000025
为第一个障碍物的速度,
Figure PCTCN2019100093-appb-000026
为第一个障碍物的姿态,同理递推。X中每个参数均对应一个三维的向量。
根据捷联惯导的误差方程以及障碍物的运动模型可得:
Figure PCTCN2019100093-appb-000027
其中,
Figure PCTCN2019100093-appb-000028
F I为IMU的状态平移矩阵,G I为IMU的噪声驱动阵,n I为IMU的噪声矩阵,F O为障碍物的状态转换矩阵,G O为障碍物的噪声驱动阵,n O为障碍物的噪声矩阵。
系统的量测方程,系统的量测方程主要由两部分组成:
(1)以三维特征点的重投影误差为量测量,其量测方程可以表示为:
Figure PCTCN2019100093-appb-000029
其中,
Figure PCTCN2019100093-appb-000030
为量测矩阵,
Figure PCTCN2019100093-appb-000031
为量测噪声,
Figure PCTCN2019100093-appb-000032
为特征点的重投影误差。
(2)以激光雷达观测障碍物到自车的位置为量测量,其量测方程可以表示为:
Figure PCTCN2019100093-appb-000033
其中,
Figure PCTCN2019100093-appb-000034
为在全局坐标系下自车的位置,
Figure PCTCN2019100093-appb-000035
为全局坐标系下障碍物的位置,
Figure PCTCN2019100093-appb-000036
为从全局坐标系到自车坐标系下的转换矩阵,
Figure PCTCN2019100093-appb-000037
为自车坐标系下障碍物的位置。自车坐标系是以自动驾驶装置的后轮中心点为原点的坐标系,它随着车的位置的变化而变化。全局坐标系指定一个原点和方向,它是不变的,其位置和指向不随车的变换而变化。
由于扩展卡尔曼滤波器是本领域常用的技术手段,这里不再详述通过扩展卡尔曼滤波器确定姿态误差、速度误差以及位置误差的实现过程。
下面结合匹配关系确定装置的结构来描述如何确定图像帧之间的匹配关系。图8为本申请实施例提供的一种匹配关系确定装置的结构示意图。如图8所示,该匹配关系确定装置包括:
获取单元801,用于获取N组特征点对,每组特征点对包括两个相匹配的特征点,其中一个特征点为从第一图像提取的特征点,另一个特征点为从第二图像提取的特征点,该第一图像和该第二图像分别为自动驾驶装置在第一时刻和第二时刻采集的图像,N为大于1的整数;
调整单元802,用于利用动态障碍物的运动状态信息对该N组特征点对中目标特征点的像素坐标进行调整,该目标特征点属于该第一图像和/或该第二图像中该动态障碍物对应的特征点,该N组特征点对中除该目标特征点之外的特征点的像素坐标保持不变;
确定单元803,用于根据该N组特征点对中各特征点对应的调整后的像素坐标,确定该第一图像和该第二图像之间的目标匹配关系。
在具体实现过程中,获取单元801具体用于执行步骤301中所提到的方法以及可以等同替换的方法;调整单元802具体用于执行步骤302中所提到的方法以及可以等同替换的方法;确定单元803,具体用于执行步骤303中所提到的方法以及可以等同替换的方法。获取单元801、调整单元802以及确定单元803的功能均可由处理器113实现。
在一个可选的实现方式中,该运动状态信息包括该动态障碍物从该第一时刻至该第二时刻的位移;
调整单元802,具体用于利用该位移对参考特征点的像素坐标进行调整,该参考特征点包含于该目标特征点,且属于该第二图像中该动态障碍物对应的特征点。
在一个可选的实现方式中,确定单元803,还用于确定该N组特征点对中位于第一投影区域和/或第二投影区域的特征点为该目标特征点;该第一投影区域为该第一图像中该动态障碍物的图像所处的区域,该第二投影区域为该第二图像中该动态障碍物的图像所处的区域;
获取单元801,还用于获得该目标特征点对应的像素坐标。
在一个可选的实现方式中,确定单元803,还用于对第一点云和第二点云进行插值计算以得到目标点云,该第一点云和该第二点云分别为该自动驾驶装置在第三时刻和第四时刻采集的点云,该目标点云为表征该动态障碍物在该第一时刻的特性的点云,该第三时刻在该第一时刻之前,该第四时刻在该第一时刻之后;该装置还包括:
投影单元804,用于将该目标点云投影到该第一图像以得到该第一投影区域。
图9为本申请实施例提供的一种重投影误差计算装置的结构示意图。如图9所示,该重投影误差计算装置包括:
调整单元901,用于利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整以得到第二空间坐标,该第一空间坐标包括第一图像中各特征点对应的空间坐标,该第一特征点为该第一图像中该动态障碍物对应的特征点,该第一图像为自动驾驶装置在第二时刻采集的图像,该运动状态信息包括该自动驾驶装置从第一时刻至该第二时刻的位移和姿态变化;
投影单元902,用于将该第二空间坐标投影至该第一图像以得到第一像素坐标;
确定单元903,用于根据该第一像素坐标和第二像素坐标,计算该第一图像的重投影误差;该第二像素坐标包括该第一图像中各特征点的像素坐标。
在一个可选的实现方式中,调整单元901,还用于利用该位移对该第一图像中该第一特征点的像素坐标进行调整以得到该第二像素坐标,该第一图像中除该第一特征点之外的特征点的像素坐标均保持不变。
在一个可选的实现方式中,该装置还包括:
第一获取单元904,用于获得第二图像中与该第一特征点相匹配的第二特征点;该第一图像和该第二图像分别为该自动驾驶装置上的第一摄像头和第二摄像头在该第二时刻采集的图像,该第一摄像头和该第二摄像头所处的空间位置不同;
确定单元903,还用于根据该第一特征点和该第二特征点,确定第一特征点对应的空间坐标。
在一个可选的实现方式中,该装置还包括:
第二获取单元905,用于获得目标点云,该目标点云为表征该动态障碍物在该第二时刻的特性的点云;
投影单元902,还用于将该目标点云投影到该第一图像以得到目标投影区域;
确定单元903,还用于确定第一特征点集中位于该目标投影区域的特征点为该第一特 征点;该第一特征点集包括的特征点为从该第一图像提取的特征点,且均与第二特征点集中的特征点相匹配,该第二特征点集包括的特征点为从第二图像提取的特征点。
第一获取单元904和第二获取单元905可以是同一单元,也可以是不同的单元。图9中各单元的功能均可由处理器113实现。
应理解以上匹配关系确定装置以及重投影误差计算装置中的各个单元的划分仅仅是一种逻辑功能的划分,实际实现时可以全部或部分集成到一个物理实体上,也可以物理上分开。例如,以上各个单元可以为单独设立的处理元件,也可以集成在终端的某一个芯片中实现,此外,也可以以程序代码的形式存储于控制器的存储元件中,由处理器的某一个处理元件调用并执行以上各个单元的功能。此外各个单元可以集成在一起,也可以独立实现。这里的处理元件可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤或以上各个单元可以通过处理器元件中的硬件的集成逻辑电路或者软件形式的指令完成。该处理元件可以是通用处理器,例如中央处理器(英文:central processing unit,简称:CPU),还可以是被配置成实施以上方法的一个或多个集成电路,例如:一个或多个特定集成电路(英文:application-specific integrated circuit,简称:ASIC),或,一个或多个微处理器(英文:digital signal processor,简称:DSP),或,一个或者多个现场可编程门阵列(英文:field-programmable gate array,简称:FPGA)等。
图10为本申请实施例提供的一种计算机设备的结构示意图,如图10所示,该计算机设备包括:存储器1001、处理器1002、通信接口1003以及总线1004;其中,存储器1001、处理器1002、通信接口1003通过总线1004实现彼此之间的通信连接。通信接口1003用于与自动驾驶装置进行数据交互。
处理器1003通过读取该存储器中存储的该代码以用于执行如下操作:获取N组特征点对,每组特征点对包括两个相匹配的特征点,其中一个特征点为从第一图像提取的特征点,另一个特征点为从第二图像提取的特征点,该第一图像和该第二图像分别为自动驾驶装置在第一时刻和第二时刻采集的图像,N为大于1的整数;利用动态障碍物的运动状态信息对该N组特征点对中目标特征点的像素坐标进行调整,该目标特征点属于该第一图像和/或该第二图像中该动态障碍物对应的特征点,该N组特征点对中除该目标特征点之外的特征点的像素坐标保持不变;根据该N组特征点对中各特征点对应的调整后的像素坐标,确定该第一图像和该第二图像之间的目标匹配关系。
处理器1003通过读取该存储器中存储的该代码以用于执行如下操作:利用动态障碍物的运动状态信息对第一空间坐标中第一特征点对应的空间坐标进行调整以得到第二空间坐标,该第一空间坐标包括第一图像中各特征点对应的空间坐标,该第一特征点为该第一图像中该动态障碍物对应的特征点,该第一图像为自动驾驶装置在第二时刻采集的图像,该运动状态信息包括该自动驾驶装置从第一时刻至该第二时刻的位移和姿态变化;将该第二空间坐标投影至该第一图像以得到第一像素坐标;根据该第一像素坐标和第二像素坐标,计算该第一图像的重投影误差;该第二像素坐标包括该第一图像中各特征点的像素坐标。
In some embodiments, the disclosed methods may be implemented as computer program instructions encoded in a machine-readable format on a computer-readable storage medium or encoded on another non-transitory medium or article of manufacture. FIG. 11 schematically shows a conceptual partial view of an example computer program product arranged according to at least some of the embodiments presented herein, and the example computer program product includes a computer program for executing a computer process on a computing device. In one embodiment, the example computer program product 1100 is provided by using a signal bearing medium 1101. The signal bearing medium 1101 may include one or more program instructions 1102 that, when run by one or more processors, may provide the functions or part of the functions described above with respect to FIG. 8 and FIG. 9. Therefore, for example, with reference to the embodiment shown in FIG. 8, the implementation of the functions of one or more of blocks 801 to 804 may be undertaken by one or more instructions associated with the signal bearing medium 1101. In addition, the program instructions 1102 in FIG. 11 also describe example instructions. When executed by a processor, the program instructions 1102 implement the following: obtaining N groups of feature point pairs, where each group of feature point pairs includes two matched feature points, one of the two feature points is a feature point extracted from a first image and the other is a feature point extracted from a second image, the first image and the second image are respectively images captured by an automatic driving apparatus at a first moment and at a second moment, and N is an integer greater than 1; adjusting pixel coordinates of a target feature point in the N groups of feature point pairs by using motion state information of a dynamic obstacle, where the target feature point belongs to the feature points corresponding to the dynamic obstacle in the first image and/or the second image, and the pixel coordinates of the feature points in the N groups of feature point pairs other than the target feature point remain unchanged; and determining a target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
Alternatively, when executed by a processor, the program instructions 1102 implement the following: adjusting, by using motion state information of a dynamic obstacle, the spatial coordinates corresponding to a first feature point in first spatial coordinates to obtain second spatial coordinates, where the first spatial coordinates include the spatial coordinates corresponding to each feature point in a first image, the first feature point is a feature point corresponding to the dynamic obstacle in the first image, the first image is an image captured by an automatic driving apparatus at a second moment, and the motion state information includes the displacement and the attitude change of the automatic driving apparatus from a first moment to the second moment; projecting the second spatial coordinates onto the first image to obtain first pixel coordinates; and calculating a reprojection error of the first image according to the first pixel coordinates and second pixel coordinates, where the second pixel coordinates include the pixel coordinates of each feature point in the first image.
In some examples, the signal bearing medium 1101 may include a computer-readable medium 1103, such as but not limited to a hard disk drive, a compact disc (CD), a digital video disc (DVD), a digital tape, a memory, a read-only memory (ROM), or a random access memory (RAM). In some implementations, the signal bearing medium 1101 may include a computer-recordable medium 1104, such as but not limited to a memory, a read/write (R/W) CD, or an R/W DVD. In some implementations, the signal bearing medium 1101 may include a communication medium 1105, such as but not limited to a digital and/or analog communication medium (for example, an optical fiber cable, a waveguide, a wired communication link, or a wireless communication link). Therefore, for example, the signal bearing medium 1101 may be conveyed by a wireless-form communication medium 1105 (for example, a wireless communication medium conforming to the IEEE 802.11 standard or another transmission protocol). The one or more program instructions 1102 may be, for example, computer-executable instructions or logic-implementing instructions. In some examples, a processor such as the one described with respect to FIG. 1 may be configured to provide various operations, functions, or actions in response to the program instructions 1102 conveyed to the processor through one or more of the computer-readable medium 1103, the computer-recordable medium 1104, and/or the communication medium 1105. It should be understood that the arrangements described here are merely examples. Therefore, a person skilled in the art will understand that other arrangements and other elements (for example, machines, interfaces, functions, orders, and groups of functions) can be used instead, and some elements may be omitted altogether depending on the desired result. In addition, many of the described elements are functional entities that can be implemented as discrete or distributed components, or combined with other components in any suitable combination and location.
A person skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of a hardware-only embodiment, a software-only embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to the embodiments of the present invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and a combination of processes and/or blocks in the flowcharts and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or the other programmable data processing device produce an apparatus for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or the other programmable device to produce computer-implemented processing, and the instructions executed on the computer or the other programmable device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
The foregoing descriptions are merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present invention, and such modifications or replacements shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (22)

  1. A matching relationship determining method, comprising:
    obtaining N groups of feature point pairs, wherein each group of feature point pairs comprises two matched feature points, one of the two feature points is a feature point extracted from a first image and the other is a feature point extracted from a second image, the first image and the second image are respectively images captured by an automatic driving apparatus at a first moment and at a second moment, and N is an integer greater than 1;
    adjusting pixel coordinates of a target feature point in the N groups of feature point pairs by using motion state information of a dynamic obstacle, wherein the target feature point belongs to the feature points corresponding to the dynamic obstacle in the first image and/or the second image, and the pixel coordinates of the feature points in the N groups of feature point pairs other than the target feature point remain unchanged; and
    determining a target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
  2. The method according to claim 1, wherein the motion state information comprises a displacement of the dynamic obstacle from the first moment to the second moment, and the adjusting pixel coordinates of a target feature point in the N groups of feature point pairs by using motion state information of a dynamic obstacle comprises:
    adjusting pixel coordinates of a reference feature point by using the displacement, wherein the reference feature point is comprised in the target feature point and belongs to the feature points corresponding to the dynamic obstacle in the second image.
  3. The method according to claim 1 or 2, wherein before the adjusting pixel coordinates of a target feature point in the N groups of feature point pairs by using motion state information of a dynamic obstacle, the method further comprises:
    determining that the feature points in the N groups of feature point pairs that are located in a first projection region and/or a second projection region are the target feature point, wherein the first projection region is the region in which the image of the dynamic obstacle is located in the first image, and the second projection region is the region in which the image of the dynamic obstacle is located in the second image; and
    obtaining the pixel coordinates corresponding to the target feature point.
  4. The method according to claim 3, wherein before the determining that the feature points in the N groups of feature point pairs that are located in a first projection region and/or a second projection region are the target feature point, the method further comprises:
    performing interpolation on a first point cloud and a second point cloud to obtain a target point cloud, wherein the first point cloud and the second point cloud are respectively point clouds collected by the automatic driving apparatus at a third moment and at a fourth moment, the target point cloud is a point cloud that characterizes the dynamic obstacle at the first moment, the third moment is before the first moment, and the fourth moment is after the first moment; and
    projecting the target point cloud onto the first image to obtain the first projection region.
  5. The method according to any one of claims 1 to 4, wherein the target matching relationship is the better matching relationship among two or more matching relationships between the first image and the second image that are determined by using a random sample consensus (RANSAC) algorithm.
  6. The method according to claim 5, wherein the determining a target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs comprises:
    determining a translation matrix and a rotation matrix between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
  7. A reprojection error calculation method, comprising:
    adjusting, by using motion state information of a dynamic obstacle, the spatial coordinates corresponding to a first feature point in first spatial coordinates to obtain second spatial coordinates, wherein the first spatial coordinates comprise the spatial coordinates corresponding to each feature point in a first image, the first feature point is a feature point corresponding to the dynamic obstacle in the first image, the first image is an image captured by an automatic driving apparatus at a second moment, and the motion state information comprises the displacement and the attitude change of the automatic driving apparatus from a first moment to the second moment;
    projecting the second spatial coordinates onto the first image to obtain first pixel coordinates; and
    calculating a reprojection error of the first image according to the first pixel coordinates and second pixel coordinates, wherein the second pixel coordinates comprise the pixel coordinates of each feature point in the first image.
  8. The method according to claim 7, wherein before the calculating a reprojection error of the first image according to the first pixel coordinates and second pixel coordinates, the method further comprises:
    adjusting the pixel coordinates of the first feature point in the first image by using the displacement to obtain the second pixel coordinates, wherein the pixel coordinates of the feature points in the first image other than the first feature point all remain unchanged.
  9. The method according to claim 7 or 8, wherein before the adjusting, by using motion state information of a dynamic obstacle, the spatial coordinates corresponding to a first feature point in first spatial coordinates to obtain second spatial coordinates, the method further comprises:
    obtaining a second feature point that matches the first feature point in a second image, wherein the first image and the second image are respectively images captured at the second moment by a first camera and a second camera on the automatic driving apparatus, and the first camera and the second camera are located at different spatial positions; and
    determining the spatial coordinates corresponding to the first feature point according to the first feature point and the second feature point.
  10. The method according to any one of claims 7 to 9, wherein before the adjusting, by using motion state information of a dynamic obstacle, the spatial coordinates corresponding to a first feature point in first spatial coordinates to obtain second spatial coordinates, the method further comprises:
    obtaining a target point cloud, wherein the target point cloud is a point cloud that characterizes the dynamic obstacle at the second moment;
    projecting the target point cloud onto the first image to obtain a target projection region; and
    determining that the feature points in a first feature point set that are located in the target projection region are the first feature point, wherein the feature points comprised in the first feature point set are feature points extracted from the first image and each of them matches a feature point in a second feature point set, and the feature points comprised in the second feature point set are feature points extracted from a second image.
  11. A matching relationship determining apparatus, comprising:
    an obtaining unit, configured to obtain N groups of feature point pairs, wherein each group of feature point pairs comprises two matched feature points, one of the two feature points is a feature point extracted from a first image and the other is a feature point extracted from a second image, the first image and the second image are respectively images captured by an automatic driving apparatus at a first moment and at a second moment, and N is an integer greater than 1;
    an adjustment unit, configured to adjust pixel coordinates of a target feature point in the N groups of feature point pairs by using motion state information of a dynamic obstacle, wherein the target feature point belongs to the feature points corresponding to the dynamic obstacle in the first image and/or the second image, and the pixel coordinates of the feature points in the N groups of feature point pairs other than the target feature point remain unchanged; and
    a determining unit, configured to determine a target matching relationship between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
  12. The apparatus according to claim 11, wherein the motion state information comprises a displacement of the dynamic obstacle from the first moment to the second moment; and
    the adjustment unit is specifically configured to adjust pixel coordinates of a reference feature point by using the displacement, wherein the reference feature point is comprised in the target feature point and belongs to the feature points corresponding to the dynamic obstacle in the second image.
  13. The apparatus according to claim 11 or 12, wherein
    the determining unit is further configured to determine that the feature points in the N groups of feature point pairs that are located in a first projection region and/or a second projection region are the target feature point, wherein the first projection region is the region in which the image of the dynamic obstacle is located in the first image, and the second projection region is the region in which the image of the dynamic obstacle is located in the second image; and
    the obtaining unit is further configured to obtain the pixel coordinates corresponding to the target feature point.
  14. The apparatus according to claim 13, wherein
    the determining unit is further configured to perform interpolation on a first point cloud and a second point cloud to obtain a target point cloud, wherein the first point cloud and the second point cloud are respectively point clouds collected by the automatic driving apparatus at a third moment and at a fourth moment, the target point cloud is a point cloud that characterizes the dynamic obstacle at the first moment, the third moment is before the first moment, and the fourth moment is after the first moment; and the apparatus further comprises:
    a projection unit, configured to project the target point cloud onto the first image to obtain the first projection region.
  15. The apparatus according to any one of claims 11 to 14, wherein the target matching relationship is the better matching relationship among two or more matching relationships between the first image and the second image that are determined by using a random sample consensus (RANSAC) algorithm.
  16. The apparatus according to claim 15, wherein
    the determining unit is specifically configured to determine a translation matrix and a rotation matrix between the first image and the second image according to the adjusted pixel coordinates corresponding to each feature point in the N groups of feature point pairs.
  17. A reprojection error calculation apparatus, comprising:
    an adjustment unit, configured to adjust, by using motion state information of a dynamic obstacle, the spatial coordinates corresponding to a first feature point in first spatial coordinates to obtain second spatial coordinates, wherein the first spatial coordinates comprise the spatial coordinates corresponding to each feature point in a first image, the first feature point is a feature point corresponding to the dynamic obstacle in the first image, the first image is an image captured by an automatic driving apparatus at a second moment, and the motion state information comprises the displacement and the attitude change of the automatic driving apparatus from a first moment to the second moment;
    a projection unit, configured to project the second spatial coordinates onto the first image to obtain first pixel coordinates; and
    a determining unit, configured to calculate a reprojection error of the first image according to the first pixel coordinates and second pixel coordinates, wherein the second pixel coordinates comprise the pixel coordinates of each feature point in the first image.
  18. The apparatus according to claim 17, wherein
    the adjustment unit is further configured to adjust the pixel coordinates of the first feature point in the first image by using the displacement to obtain the second pixel coordinates, wherein the pixel coordinates of the feature points in the first image other than the first feature point all remain unchanged.
  19. The apparatus according to claim 17 or 18, wherein the apparatus further comprises:
    a first obtaining unit, configured to obtain a second feature point that matches the first feature point in a second image, wherein the first image and the second image are respectively images captured at the second moment by a first camera and a second camera on the automatic driving apparatus, and the first camera and the second camera are located at different spatial positions; and
    the determining unit is further configured to determine the spatial coordinates corresponding to the first feature point according to the first feature point and the second feature point.
  20. The apparatus according to any one of claims 17 to 19, wherein the apparatus further comprises:
    a second obtaining unit, configured to obtain a target point cloud, wherein the target point cloud is a point cloud that characterizes the dynamic obstacle at the second moment;
    the projection unit is further configured to project the target point cloud onto the first image to obtain a target projection region; and
    the determining unit is further configured to determine that the feature points in a first feature point set that are located in the target projection region are the first feature point, wherein the feature points comprised in the first feature point set are feature points extracted from the first image and each of them matches a feature point in a second feature point set, and the feature points comprised in the second feature point set are feature points extracted from a second image.
  21. An electronic device, comprising:
    a memory, configured to store a program; and
    a processor, configured to execute the program stored in the memory, wherein when the program is executed, the processor is configured to perform the steps according to any one of claims 1 to 10.
  22. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 10.
PCT/CN2019/100093 2019-08-09 2019-08-09 Matching relationship determination method, reprojection error calculation method, and related apparatus WO2021026705A1 (zh)


Also Published As

Publication number Publication date
CN112640417A (zh) 2021-04-09
CN112640417B (zh) 2021-12-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19941234

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19941234

Country of ref document: EP

Kind code of ref document: A1