WO2022089577A1 - Pose determination method and related device thereof - Google Patents

Pose determination method and related device thereof

Info

Publication number
WO2022089577A1
Authority
WO
WIPO (PCT)
Prior art keywords
image frame
corner points
corner
pose
camera device
Prior art date
Application number
PCT/CN2021/127380
Other languages
French (fr)
Chinese (zh)
Inventor
宋佳蓉
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022089577A1 publication Critical patent/WO2022089577A1/en

Classifications

    • G - PHYSICS
      • G01 - MEASURING; TESTING
        • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
          • G01C11/00 - Photogrammetry or videogrammetry, e.g. stereogrammetry; photographic surveying
          • G01C21/00 - Navigation; navigational instruments not provided for in groups G01C1/00-G01C19/00
            • G01C21/26 - specially adapted for navigation in a road network
              • G01C21/28 - with correlation of data from several navigational instruments
      • G06 - COMPUTING; CALCULATING OR COUNTING
        • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N3/00 - Computing arrangements based on biological models
            • G06N3/02 - Neural networks
              • G06N3/04 - Architecture, e.g. interconnection topology
                • G06N3/045 - Combinations of networks
              • G06N3/08 - Learning methods
        • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T7/00 - Image analysis
            • G06T7/20 - Analysis of motion
              • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
            • G06T7/70 - Determining position or orientation of objects or cameras
              • G06T7/73 - using feature-based methods
            • G06T7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
          • G06T2207/00 - Indexing scheme for image analysis or image enhancement
            • G06T2207/10 - Image acquisition modality
              • G06T2207/10016 - Video; image sequence
            • G06T2207/20 - Special algorithmic details
              • G06T2207/20081 - Training; learning
              • G06T2207/20084 - Artificial neural networks [ANN]
            • G06T2207/30 - Subject of image; context of image processing
              • G06T2207/30244 - Camera pose

Definitions

  • the present application relates to the technical field of image processing, and more particularly, to a pose determination method and related devices.
  • Common positioning schemes include pure vision-based positioning technology.
  • the main idea of this scheme is to solve the pose of the moving body based on visual feature point matching and global optimization.
  • Specifically, corner points are extracted from each image, and the matching corner points between two frames are used to calculate the pose change of the camera device between those two frames.
  • Pose changes can include position changes as well as rotational angle changes.
  • the present application provides a method for determining a pose, the method comprising:
  • acquiring a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device and each includes a dynamic target, a dynamic target being an object that is displaced relative to the ground when the camera device captures the image frames.
  • the so-called displacement relative to the ground when the imaging device captures the image frame means that the dynamic target moves relative to the ground in space, for example, from position A to position B (position A and position B are two different positions).
  • the dynamic target may be a vehicle, such as a car, a truck, a passenger car, a trailer, an incomplete vehicle, a motorcycle, etc.
  • the first image frame and the second image frame may include the same dynamic target or different dynamic targets; corner detection is performed on the first image frame and the second image frame respectively to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame.
  • In some cases this can be understood as follows: the human eye recognizes an image through a small local area or window, and if the grayscale of the image within the window changes significantly when the window is shifted slightly in every direction, the window can be considered to contain a corner point.
  • the plurality of first corner points and the plurality of second corner points may be Features from Accelerated Segment Test (FAST) corner points, Harris corner points, Binary Robust Invariant Scalable Keypoints (BRISK) corner points, or the like; the corner type is not limited in this embodiment of the present application. An illustrative extraction sketch follows.
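  • As an illustrative sketch (not part of the publication), corner points of these types can be extracted with off-the-shelf OpenCV detectors; the file name and parameter values are assumptions:

```python
import cv2

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # assumed input image

# FAST corner detection (one of the corner types named above).
fast = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
fast_keypoints = fast.detect(img, None)

# Harris-style corners via goodFeaturesToTrack.
harris_corners = cv2.goodFeaturesToTrack(
    img, maxCorners=200, qualityLevel=0.01, minDistance=10,
    useHarrisDetector=True, k=0.04)
```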
  • the second corner points located in the region of the second image frame where the dynamic target is located are likewise culled, to obtain a plurality of culled second corner points.
  • When there are dynamic corner points (that is, corner points in the region where a dynamic target is located), the observation of the same corner point across different camera states includes not only the parallax caused by the camera's own motion but also the parallax caused by the motion of the feature point itself. Culling the corner points in that region keeps the visual reprojection error term reliable.
  • According to the plurality of culled first corner points and the plurality of culled second corner points, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame is determined.
  • the pose change may refer to the change of the position of the photographing device and the change of the rotation angle.
  • the change in position may represent the distance between the position at which the capturing device captured the second image frame and the position at which the first image frame was captured.
  • the change in the rotation angle may represent an angular difference between the rotation angle when the photographing device photographed the second image frame and the rotation angle when the first image frame was photographed.
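  • In SE(3) notation (an illustration consistent with the definitions above, not a formula reproduced from the publication), if $T_{wc_1}$ and $T_{wc_2}$ denote the camera poses in a world frame at the two capture times, the pose change is

$$
T_{c_1 c_2} = T_{wc_1}^{-1}\, T_{wc_2} = \begin{bmatrix} R & t \\ \mathbf{0}^{\top} & 1 \end{bmatrix},
$$

  • where the position change is $\lVert t \rVert$ and the rotation-angle change is $\theta = \arccos\bigl((\operatorname{tr}(R) - 1)/2\bigr)$.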
  • Taking the first image frame as the image frame preceding the second image frame, the multiple corner points extracted from the first image frame can be tracked by optical flow to find the matching corner points between the first image frame and the second image frame, and the optical flow information of the matching corner points is obtained.
  • the optical flow information represents the motion of the matching corner points across the two adjacent images (viewed from the image frames, this can be called parallax); further, the pose change of the imaging device when capturing the second image frame relative to when capturing the first image frame can be determined based on the optical flow information.
  • the algorithm used for the optical flow can be the Lucas-Kanade optical flow algorithm or another algorithm; besides optical flow, descriptor matching or a direct method can also be used to match the corner points. A matching sketch follows.
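  • A minimal sketch of the matching step using OpenCV's pyramidal Lucas-Kanade implementation (file names and window parameters are illustrative assumptions):

```python
import cv2

prev_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # first image frame
next_gray = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)  # second image frame

# Corners extracted from the first image frame.
p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                             qualityLevel=0.01, minDistance=10)

# Pyramidal Lucas-Kanade: track the first-frame corners into the second frame.
p1, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None,
                                           winSize=(21, 21), maxLevel=3)

ok = status.flatten() == 1
matched_prev = p0[ok]                     # matching corners in the first frame
matched_next = p1[ok]                     # their positions in the second frame
disparity = matched_next - matched_prev   # per-corner parallax (optical flow)
```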
  • the observed amount of the same corner point in different camera states includes not only the parallax caused by the movement of the camera itself, but also the parallax caused by the movement of the corner point itself.
  • the above approach targets dynamic environments.
  • the corner points located in the region where the dynamic target is located are culled so that the visual reprojection error is more reliable, overcoming the error that dynamic features would otherwise introduce into the determined pose change.
  • the dynamic target is a vehicle.
  • In one possible implementation, the method is applied to a target vehicle on which the camera device, an inertial measurement unit (IMU), and a wheel speedometer are fixedly mounted. The method further includes: acquiring motion state data of the target vehicle measured by the IMU during the period from capturing the first image frame to capturing the second image frame; and acquiring wheel speed data of the target vehicle measured by the wheel speedometer during the same period.
  • determining the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame then includes: determining, according to the disparity of the plurality of culled first corner points and the plurality of culled second corner points in their respective image frames, a first pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the first pose change does not include scale information; determining, according to the motion state data, the wheel speed data, and the first pose change, a second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the second pose change includes scale information; and performing nonlinear optimization on the second pose change to obtain the pose change.
  • the motion state data may be vehicle acceleration data and angular velocity data measured by the inertial measurement unit (IMU), and the wheel speed data may be vehicle wheel speed, steering wheel angle data, and the like.
  • the culled first corner points and the culled second corner points may be aligned by timestamp with the motion state data and the wheel speed data; because the data rates differ, a single frame of image data (with its culled corner points) corresponds to multiple motion-state and wheel-speed samples.
  • the motion state data and wheel speed data associated with each image frame can be pre-integrated to provide initial pose values for the image.
  • using a sliding-window method, purely visual data is first used for initialization to obtain the poses, without scale information (scale is also called depth), of the camera device for all image frames in the sliding window.
  • the scale information is then restored with the aid of the wheel speed data, the positions of all corner points are recalculated, and nonlinear optimization is performed on the restored scale information and the inverse depths of the corner points to obtain the pose change.
  • performing nonlinear optimization on the second pose change includes: performing nonlinear optimization on the second pose change according to a preset optimization function, where the preset optimization function includes a wheel speedometer residual term.
  • Nonlinear optimization refers to finding the value of x that minimizes a given objective function, that is, $\min_x f(x)$, where f(x) is a nonlinear function. A toy sketch follows.
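  • A toy sketch of such an optimization using scipy.optimize.least_squares; the residual model here is synthetic and merely stands in for the stacked visual, IMU, and wheel-speedometer terms described in this application:

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(x):
    # Synthetic measurements of a line with small nonlinear noise; in the
    # real system this would stack reprojection, IMU, and wheel residuals.
    t = np.linspace(0.0, 1.0, 20)
    observed = 3.0 * t + 1.0 + 0.01 * np.sin(20.0 * t)
    predicted = x[0] * t + x[1]
    return predicted - observed

# min_x f(x) with f nonlinear, solved by iterative least squares.
result = least_squares(residuals, x0=np.array([0.0, 0.0]))
print(result.x)   # roughly [3.0, 1.0]
```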
  • the corner points in the area where the dynamic target is located among the multiple corner points are eliminated to ensure that the visual reprojection error is more reliable, thereby overcoming the positioning error introduced by dynamic features.
  • in addition, the wheel speedometer residual term is added: the pose is initialized and computed from the joint information of vision, the IMU, and the wheel speedometer, so that when the IMU converges poorly the wheel speedometer can compensate, improving the result.
  • the method further includes:
  • a pre-trained neural network is used to detect the dynamic target in the first image frame and in the second image frame, so as to obtain the region where the dynamic target is located in the first image frame and the region where the dynamic target is located in the second image frame.
  • closed-loop (loop closure) detection can also be performed between the current key frame and the map; a detected frame is a closed-loop frame, and the co-visible feature points between the closed-loop frame and the frames in the sliding window can be found. A reprojection error term is established and added to the nonlinear optimization; four-degree-of-freedom optimization is performed on the optimized key frames that slide out of the window and on their associated frames; and the optimized key frames are inserted into the map. In actual use, the map, that is, loop closure detection, can be turned on or off. When positioning and mapping run together, the final position is expressed relative to the first fixed camera position; of course, it can be fixed to a specific reference frame by a subsequent coordinate transformation.
  • the present application provides a pose determination device, the device comprising:
  • an acquisition module configured to acquire a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device and each includes a dynamic target;
  • a corner extraction module configured to perform corner detection on the first image frame and the second image frame to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame;
  • a culling module configured to cull the first corner points of the region where the dynamic target included in the first image frame is located among the plurality of first corner points, so as to obtain a plurality of culled first corner points;
  • a positioning module configured to determine, according to the plurality of culled first corner points and the plurality of culled second corner points, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
  • the plurality of first corner points and the plurality of second corner points include one of the following: Features from Accelerated Segment Test (FAST) corner points, Harris corner points, and Binary Robust Invariant Scalable Keypoints (BRISK) corner points.
  • the apparatus is applied to a target vehicle on which the camera device, an inertial measurement unit (IMU), and a wheel speedometer are fixedly mounted; the acquisition module is configured to acquire the motion state data of the target vehicle measured by the IMU during the period from capturing the first image frame to capturing the second image frame, and to acquire the wheel speed data of the target vehicle measured by the wheel speedometer during the same period;
  • the positioning module is configured to determine, according to the disparity of the plurality of culled first corner points and the plurality of culled second corner points in their respective image frames, a first pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the first pose change does not include scale information;
  • determine, according to the motion state data, the wheel speed data, and the first pose change, a second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the second pose change includes scale information;
  • and perform nonlinear optimization on the second pose change to obtain the pose change.
  • the positioning module is configured to perform nonlinear optimization on the second pose change according to a preset optimization function, where the preset optimization function includes a wheel speedometer residual term.
  • the apparatus further includes:
  • a dynamic target detection module configured to detect the dynamic target in the first image frame and in the second image frame through a pre-trained neural network, so as to obtain the region where the dynamic target is located in the first image frame and the region where the dynamic target is located in the second image frame.
  • the dynamic target is a vehicle.
  • a computer-readable storage medium stores program codes, wherein the program codes include instructions for performing part or all of the operations in the method described in the first aspect.
  • an embodiment of the present application provides a computer program product that, when the computer program product runs on a communication device, causes the communication device to perform some or all of the operations in the method described in the first aspect.
  • In a fifth aspect, a chip is provided; the chip includes a processor configured to perform some or all of the operations in the method described in the first aspect.
  • An embodiment of the present application provides a method for determining a pose. The method includes: acquiring a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device and each includes a dynamic target; performing corner detection on the first image frame and the second image frame respectively to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame; culling, from the plurality of first corner points, the first corner points located in the region of the first image frame where the dynamic target is located, to obtain a plurality of culled first corner points; culling, from the plurality of second corner points, the second corner points located in the region of the second image frame where the dynamic target is located, to obtain a plurality of culled second corner points; and determining, according to the plurality of culled first corner points and the plurality of culled second corner points, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
  • the observation of the same corner point across different camera states includes not only the parallax caused by the camera's own motion but also the parallax caused by the motion of the corner point itself.
  • the above approach targets dynamic environments.
  • the corner points located in the region where the dynamic target is located are culled so that the visual reprojection error is more reliable, overcoming the pose-change determination error introduced by dynamic features.
  • FIG. 1 is a functional block diagram of a vehicle provided by an embodiment of the present invention.
  • FIG. 2 is a functional block diagram of an automatic driving system provided by an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a method for determining a pose provided by an embodiment of the present application
  • FIG. 4 is a schematic flowchart of a pose determination method provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a pose determination device provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a pose determination apparatus provided by an embodiment of the present application.
  • a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer.
  • an application running on a computing device and the computing device may be components.
  • One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between 2 or more computers.
  • these components can execute from various computer readable media having various data structures stored thereon.
  • these components may communicate by way of local and/or remote processes, for example in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet that interacts with other systems by way of the signal).
  • FIG. 1 is a functional block diagram of a vehicle 100 according to an embodiment of the present invention.
  • the vehicle 100 is configured in a fully or partially autonomous driving mode.
  • while in an autonomous driving mode, the vehicle 100 can control itself: it can determine the current state of the vehicle and its surroundings, determine the possible behavior of at least one other vehicle in the surrounding environment, determine a confidence level corresponding to the likelihood that the other vehicle performs that behavior, and control the vehicle 100 based on the determined information.
  • the vehicle 100 may be configured to operate without human interaction.
  • Vehicle 100 may include various subsystems, such as travel system 102 , sensor system 104 , control system 106 , one or more peripherals 108 and power supply 110 , computer system 112 , and user interface 116 .
  • vehicle 100 may include more or fewer subsystems, and each subsystem may include multiple elements. Additionally, each of the subsystems and elements of the vehicle 100 may be interconnected by wire or wirelessly.
  • the travel system 102 may include components that provide powered motion for the vehicle 100 .
  • propulsion system 102 may include engine 118 , energy source 119 , transmission 120 , and wheels/tires 121 .
  • the engine 118 may be an internal combustion engine, an electric motor, an air compression engine, or a combination of engine types, such as a hybrid engine consisting of a gasoline engine and an electric motor, or a hybrid engine consisting of an internal combustion engine and an air compression engine.
  • Engine 118 converts energy source 119 into mechanical energy.
  • Examples of energy sources 119 include gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and other sources of electricity.
  • the energy source 119 may also provide energy to other systems of the vehicle 100 .
  • Transmission 120 may transmit mechanical power from engine 118 to wheels 121 .
  • Transmission 120 may include a gearbox, a differential, and a driveshaft.
  • transmission 120 may also include other devices, such as clutches.
  • the drive shaft may include one or more axles that may be coupled to one or more wheels 121 .
  • the sensor system 104 may include several sensors that sense information about the environment surrounding the vehicle 100 .
  • the sensor system 104 may include a positioning system 122 (the positioning system may be a GPS system, a Beidou system or other positioning systems), an inertial measurement unit (IMU) 124, a radar system 126, a laser rangefinder 128 and camera 130 .
  • the sensor system 104 may also include sensors of the internal systems of the vehicle 100 being monitored (eg, an in-vehicle air quality monitor, a fuel gauge, an oil temperature gauge, etc.). Sensor data from one or more of these sensors can be used to detect objects and their corresponding characteristics (position, shape, orientation, velocity, etc.). This detection and identification is a critical function for the safe operation of the autonomous vehicle 100 .
  • with the development of advanced driver assistance systems (ADAS) and unmanned driving technologies, higher requirements are placed on the performance of the radar system 126, such as range and angular resolution.
  • improving the range and angular resolution of the vehicle-mounted radar system 126 enables it to detect multiple measurement points on a target object when imaging the target, forming high-resolution point cloud data.
  • for this reason, the radar system 126 in this application may also be referred to as a point cloud imaging radar.
  • the positioning system 122 may be used to estimate the geographic location of the vehicle 100 .
  • the IMU 124 is used to sense position and orientation changes of the vehicle 100 based on inertial acceleration.
  • IMU 124 may be a combination of an accelerometer and a gyroscope.
  • Radar system 126 may utilize radio signals to sense objects within the surrounding environment of vehicle 100 . In some embodiments, in addition to sensing objects, radar system 126 may be used to sense the speed and/or heading of objects.
  • the laser rangefinder 128 may utilize laser light to sense objects in the environment in which the vehicle 100 is located.
  • the laser rangefinder 128 may include one or more laser sources, laser scanners, and one or more detectors, among other system components.
  • Camera 130 may be used to capture multiple images of the surrounding environment of vehicle 100 .
  • Camera 130 may be a still camera or a video camera.
  • the camera 130 may also be referred to as an imaging device.
  • Control system 106 controls the operation of the vehicle 100 and its components.
  • Control system 106 may include various elements including steering system 132 , throttle 134 , braking unit 136 , computer vision system 140 , route control system 142 , and obstacle avoidance system 144 .
  • the steering system 132 is operable to adjust the heading of the vehicle 100 .
  • it may be a steering wheel system.
  • the throttle 134 is used to control the operating speed of the engine 118 and thus the speed of the vehicle 100 .
  • the braking unit 136 is used to control the deceleration of the vehicle 100 .
  • the braking unit 136 may use friction to slow the wheels 121 .
  • the braking unit 136 may convert the kinetic energy of the wheels 121 into electrical current.
  • the braking unit 136 may also take other forms to slow the wheels 121 to control the speed of the vehicle 100 .
  • Computer vision system 140 may process and analyze images captured by camera 130 in order to identify objects and/or features in the environment surrounding vehicle 100 .
  • the objects and/or features may include traffic signals, road boundaries and obstacles.
  • Computer vision system 140 may use object recognition algorithms, Structure from Motion (SFM) algorithms, video tracking, and other computer vision techniques.
  • the computer vision system 140 may be used to map the environment, track objects, estimate the speed of objects, and the like.
  • the route control system 142 is used to determine the travel route of the vehicle 100 .
  • the route control system 142 may combine data from the sensors 138 , the GPS 122 , and one or more predetermined maps to determine a driving route for the vehicle 100 .
  • the obstacle avoidance system 144 is used to identify, evaluate, and avoid or otherwise traverse potential obstacles in the environment of the vehicle 100 .
  • control system 106 may additionally or alternatively include components other than those shown and described, or some of the components shown above may be omitted.
  • Peripherals 108 may include a wireless communication system 146 , an onboard computer 148 , a microphone 150 and/or a speaker 152 .
  • peripherals 108 provide a means for a user of vehicle 100 to interact with user interface 116 .
  • the onboard computer 148 may provide information to the user of the vehicle 100 .
  • User interface 116 may also operate on-board computer 148 to receive user input.
  • the onboard computer 148 can be operated via a touch screen.
  • peripheral devices 108 may provide a means for vehicle 100 to communicate with other devices located within the vehicle.
  • microphone 150 may receive audio (eg, voice commands or other audio input) from a user of vehicle 100 .
  • speakers 152 may output audio to a user of vehicle 100 .
  • Wireless communication system 146 may wirelessly communicate with one or more devices, either directly or via a communication network.
  • the wireless communication system 146 may use 3G cellular communication, such as code division multiple access (CDMA), EV-DO, or global system for mobile communications (GSM)/general packet radio service (GPRS); 4G cellular communication, such as LTE; or 5G cellular communication.
  • the wireless communication system 146 may communicate with a wireless local area network (WLAN) using WiFi.
  • the wireless communication system 146 may communicate directly with the device using an infrared link, Bluetooth, or ZigBee.
  • other wireless protocols may also be used, such as various vehicle communication systems; for example, the wireless communication system 146 may include one or more dedicated short range communications (DSRC) devices, which may include public and/or private data communications between vehicles and/or roadside stations.
  • the power supply 110 may provide power to various components of the vehicle 100 .
  • the power source 110 may be a rechargeable lithium-ion or lead-acid battery.
  • One or more battery packs of such a battery may be configured as a power source to provide power to various components of the vehicle 100 .
  • power source 110 and energy source 119 may be implemented together, such as in some all-electric vehicles.
  • Computer system 112 may include at least one processor 113 that executes instructions 115 stored in a non-transitory computer-readable medium such as data storage device 114 .
  • Computer system 112 may also be multiple computing devices that control individual components or subsystems of vehicle 100 in a distributed fashion.
  • the processor 113 may be any conventional processor, such as a commercially available central processing unit (CPU). Alternatively, the processor may be a dedicated device such as an application specific integrated circuit (ASIC) or other hardware-based processor.
  • although FIG. 1 functionally illustrates the processor, memory, and other elements of the computer 110 in the same block, one of ordinary skill in the art will understand that the processor, computer, or memory may actually comprise multiple processors, computers, or memories that may or may not be stored within the same physical enclosure.
  • the memory may be a hard drive or other storage medium located within an enclosure other than computer 110 .
  • reference to a processor or computer will be understood to include reference to a collection of processors or computers or memories that may or may not operate in parallel.
  • some components such as the steering and deceleration components, may each have their own processors that only perform computations related to component-specific functions.
  • the processor 113 may acquire data from the camera 130 and other sensor devices, and perform vehicle positioning based on the acquired data.
  • a processor may be located remotely from the vehicle and in wireless communication with the vehicle. In other aspects, some of the processes described herein are performed on a processor disposed within the vehicle while others are performed by a remote processor, including taking steps necessary to perform a single maneuver.
  • data storage 114 may include instructions 115 (eg, program logic) executable by processor 113 to perform various functions of vehicle 100 , including those described above.
  • Data storage 114 may also contain additional instructions, including sending data to, receiving data from, interacting with, and/or performing data processing on one or more of propulsion system 102 , sensor system 104 , control system 106 , and peripherals 108 . control commands.
  • data storage 114 may store data such as road maps, route information, vehicle location, direction, speed, and other vehicle data, among other information. Such information may be used by the vehicle 100 and the computer system 112 while the vehicle 100 is in autonomous, semi-autonomous, and/or manual modes.
  • a user interface 116 for providing information to or receiving information from a user of the vehicle 100 .
  • the user interface 116 may include one or more input/output devices within the set of peripheral devices 108 , such as a wireless communication system 146 , an onboard computer 148 , a microphone 150 and a speaker 152 .
  • Computer system 112 may control functions of vehicle 100 based on input received from various subsystems (eg, travel system 102 , sensor system 104 , and control system 106 ) and from user interface 116 .
  • computer system 112 may utilize input from control system 106 in order to control steering unit 132 to avoid obstacles detected by sensor system 104 and obstacle avoidance system 144 .
  • computer system 112 is operable to provide control of various aspects of vehicle 100 and its subsystems.
  • one or more of these components described above may be installed or associated with the vehicle 100 separately.
  • data storage device 114 may exist partially or completely separate from vehicle 100 .
  • the above-described components may be communicatively coupled together in a wired and/or wireless manner.
  • FIG. 1 should not be construed as a limitation on the embodiment of the present invention.
  • a self-driving car traveling on a road can recognize objects within its surroundings to determine adjustments to the current speed.
  • the objects may be other vehicles, traffic control devices, or other types of objects.
  • each identified object may be considered independently, and the object's respective characteristics, such as its current speed, acceleration, and distance from the vehicle, may be used to determine the speed to which the autonomous vehicle is to adjust.
  • the autonomous vehicle 100, or a computing device associated with the autonomous vehicle 100, may predict the behavior of the identified objects based on the characteristics of the identified objects and the state of the surrounding environment (e.g., traffic, rain, ice on the road).
  • optionally, since the behavior of each identified object depends on the behavior of the others, all identified objects can also be considered together to predict the behavior of a single identified object.
  • the vehicle 100 can adjust its speed based on the predicted behavior of the identified object.
  • the self-driving car can determine what steady state the vehicle will need to adjust to (eg, accelerate, decelerate, or stop) based on the predicted behavior of the object.
  • other factors may also be considered to determine the speed of the vehicle 100, such as the lateral position of the vehicle 100 in the road being traveled, the curvature of the road, the proximity of static and dynamic objects, and the like.
  • in addition to providing instructions to adjust the speed, the computing device may provide instructions to modify the steering angle of the vehicle 100 so that the self-driving car follows a given trajectory and/or maintains safe lateral and longitudinal distances from objects in its vicinity (e.g., cars in adjacent lanes on the road).
  • the above-mentioned vehicle 100 can be a car, a truck, a motorcycle, a bus, a boat, an airplane, a helicopter, a lawn mower, a recreational vehicle, a playground vehicle, construction equipment, a tram, a golf cart, a train, a cart, etc.
  • this is not particularly limited in the embodiments of the present invention.
  • as shown in FIG. 2, computer system 101 includes a processor 103 coupled to a system bus 105.
  • the processor 103 may be one or more processors, each of which may include one or more processor cores.
  • a video adapter 107 which can drive a display 109, is coupled to the system bus 105.
  • the system bus 105 is coupled to an input-output (I/O) bus through a bus bridge 111 .
  • I/O interface 115 is coupled to the I/O bus.
  • I/O interface 115 communicates with a variety of I/O devices, such as an input device 117 (e.g., keyboard, mouse, touch screen), a media tray 121 (e.g., CD-ROM, multimedia interface), a transceiver 123 (which can send and/or receive radio communication signals), a camera 155 (which can capture still and moving digital video images), and an external USB port 125.
  • the interface connected to the I/O interface 115 may be a USB interface.
  • the processor 103 may be any conventional processor, including a reduced instruction set computing (reduced instruction set computing, RISC) processor, a complex instruction set computing (complex instruction set computing, CISC) processor or a combination of the above.
  • the processor may be a special purpose device such as an ASIC.
  • the processor 103 may be a neural network processor or a combination of a neural network processor and the above-mentioned conventional processors.
  • computer system 101 may be located remotely from the autonomous vehicle and may communicate wirelessly with the autonomous vehicle.
  • some of the processes herein are performed on a processor disposed within an autonomous vehicle, others are performed by a remote processor, including taking actions required to perform a single maneuver.
  • Network interface 129 is a hardware network interface, such as a network card.
  • the network 127 may be an external network, such as the Internet, or an internal network, such as an Ethernet network or a virtual private network (VPN).
  • the network 127 may also be a wireless network, such as a WiFi network, a cellular network, and the like.
  • the hard disk drive interface is coupled to the system bus 105 .
  • the hard drive interface is connected to the hard drive.
  • System memory 135 is coupled to system bus 105 . Data running in system memory 135 may include operating system 137 and application programs 143 of computer 101 .
  • the operating system includes a Shell 139 and a kernel 141 .
  • Shell 139 is an interface between the user and the operating system's kernel.
  • the shell is the outermost layer of the operating system. The shell manages the interaction between the user and the operating system: waiting for user input, interpreting user input to the operating system, and processing various operating system output.
  • Kernel 141 consists of those parts of the operating system that manage memory, files, peripherals, and system resources.
  • the kernel 141 directly interacts with the hardware, and the operating system kernel usually runs processes, provides inter-process communication, provides CPU time slice management, interrupts, memory management, IO management, and the like.
  • Application 143 includes programs that control the autonomous driving of the car, for example, programs that manage the interaction of the autonomous vehicle with road obstacles, programs that control the route or speed of the autonomous vehicle, and programs that control the interaction between the autonomous vehicle and other autonomous vehicles on the road.
  • Application 143 also exists on the system of the software deployment server 149.
  • computer system 101 may download application 143 from software deployment server 149 when application 143 needs to be executed.
  • Sensor 153 is associated with computer system 101 .
  • Sensor 153 is used to detect the environment around computer system 101 .
  • the sensor 153 can detect animals, cars, obstacles and pedestrian crossings, etc.
  • Optionally, the sensor 153 can also detect the environment around the above-mentioned animals, cars, obstacles, and pedestrian crossings, for example: other animals present around an animal, weather conditions, and ambient light levels.
  • When the computer 101 is located in a self-driving car, the sensor may be a radar system or the like.
  • a common positioning scheme is based on pure vision positioning technology, namely visual SLAM.
  • the main idea of this scheme is to solve the pose of the moving body based on visual feature point matching and global optimization: features are first extracted from the images, the feature points matched between two frames are used to calculate the relative pose transformation between those frames, and finally this information is used to compute the odometry.
  • FIG. 3 is a schematic flowchart of a method for determining a pose provided by an embodiment of the present application. As shown in FIG. 3 , the method for determining a pose provided by an embodiment of the present application includes:
  • 301. Acquire a first image frame and a second image frame captured by an imaging device, where the first image frame and the second image frame are adjacent image frames captured by the imaging device and each includes a dynamic target.
  • in this embodiment of the present application, the first image frame and the second image frame captured by the camera device can be acquired, where the first image frame and the second image frame may be two consecutive image frames captured by the camera device.
  • the first image frame and the second image frame each include a dynamic target, where the dynamic target is a target that is displaced relative to the ground when the imaging device captures the image frames.
  • the so-called displacement relative to the ground when the imaging device captures the image frame means that the dynamic target moves relative to the ground in space, for example, from position A to position B (position A and position B are two different positions).
  • the dynamic target may be a vehicle, such as a car, a truck, a passenger car, a trailer, an incomplete vehicle, a motorcycle, and the like.
  • the first image frame and the second image frame may include the same dynamic target, with the dynamic target located in different regions of the two frames; the first image frame and the second image frame may also include different dynamic targets.
  • for example, the first image frame may include vehicle 1 and vehicle 2, and the second image frame may also include vehicle 1 and vehicle 2, where the position of vehicle 1 in the first image frame differs from its position in the second image frame, and the position of vehicle 2 in the first image frame differs from its position in the second image frame.
  • as another example, the first image frame may include vehicle 1 and vehicle 2, while the second image frame includes vehicle 1 and vehicle 3; that is, the first image frame does not include vehicle 3, the second image frame does not include vehicle 2, and the position of vehicle 1 in the first image frame differs from its position in the second image frame.
  • 302. Perform corner detection on the first image frame and the second image frame to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame.
  • step 302 may be performed by the processor of the vehicle itself, that is, the processor of the vehicle itself may perform corner detection on the first image frame and the second image frame to obtain the first image a plurality of first corner points of the frame and a plurality of second corner points of the second image frame.
  • in another implementation, step 302 may be performed by a server on the cloud side; that is, the vehicle may send the first image frame and the second image frame captured by the camera device to the server, and the server performs corner detection on the first image frame and the second image frame to obtain the plurality of first corner points of the first image frame and the plurality of second corner points of the second image frame.
  • the plurality of first corner points and the plurality of second corner points may be Features from Accelerated Segment Test (FAST) corner points, Harris corner points, Binary Robust Invariant Scalable Keypoints (BRISK) corner points, or the like; the corner type is not limited in this embodiment of the present application.
  • the recognition of the image by the human eye is usually completed in a small local area or small window. If the grayscale of the image in the area of the window changes greatly when the small window is moved in a small range in all directions, then it can be considered that there are corner points in the window. If the grayscale of the image in the area of the window does not change when the small window is moved in a small range in all directions, then it can be considered that there are no corners in the window.
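  • For reference, the Harris criterion formalizes this window intuition (a standard textbook formula, not reproduced from the publication): a structure matrix is built from the image gradients $I_x, I_y$ over a window $W$ with weighting $w$, and each pixel is scored as

$$
M = \sum_{(x,y)\in W} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix},
\qquad
R = \det(M) - k\,\bigl(\operatorname{tr}(M)\bigr)^2,
$$

  • with $k$ typically 0.04 to 0.06; a window contains a corner when $R$ is large, since the grayscale then changes strongly for shifts in every direction.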
  • the manner of extracting the plurality of first corner points and the plurality of second corner points may be based on existing implementations, and details are not described herein again.
  • the first corner points located in the region of the first image frame where the dynamic target is located may be culled from the plurality of first corner points, and the second corner points located in the region of the second image frame where the dynamic target is located may be culled from the plurality of second corner points; there is no restriction on the order in which the operation of culling the first corner points and the operation of culling the second corner points are performed.
  • specifically, a pre-trained neural network may be used to detect the dynamic target in the first image frame and in the second image frame, so as to obtain the region where the dynamic target is located in each frame; the first corner points located in the detected dynamic-target region of the first image frame are culled from the plurality of first corner points to obtain the culled first corner points, and the second corner points located in the detected dynamic-target region of the second image frame are culled from the plurality of second corner points to obtain the culled second corner points. A sketch of this culling step follows.
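  • A minimal sketch of the culling step, assuming the detector returns axis-aligned bounding boxes for the dynamic targets (the function and variable names are illustrative, not from the publication):

```python
import numpy as np

def cull_dynamic_corners(corners, boxes):
    """Drop corners that fall inside any detected dynamic-target bounding box.

    corners: (N, 2) array of (x, y) pixel coordinates.
    boxes:   iterable of (x_min, y_min, x_max, y_max) detector outputs.
    """
    keep = np.ones(len(corners), dtype=bool)
    for (x0, y0, x1, y1) in boxes:
        inside = ((corners[:, 0] >= x0) & (corners[:, 0] <= x1) &
                  (corners[:, 1] >= y0) & (corners[:, 1] <= y1))
        keep &= ~inside          # discard corners inside this dynamic region
    return corners[keep]
```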
  • the optical flow method can also be used to track the corner points and to add new ones; each corner point is then given a unique id, and the normalized coordinates of the corner points in the camera coordinate system, the pixel coordinates, the pixel velocities, and so on are calculated.
  • when dynamic corner points are present, the observation of the same corner point across different camera states includes not only the parallax caused by the camera's own motion but also the parallax caused by the motion of the feature point itself.
  • in the embodiment of the present application, culling the corner points located in the region where the dynamic target is located keeps the visual reprojection error term reliable.
  • the visual reprojection optimization term can take a form such as

$$
r_{l,j} = \hat{p}^{\,c_j}_{l} - \pi_c\!\left(T_{c_j w}\, P^{w}_{l}\right),
$$

  • where $\hat{p}^{\,c_j}_{l}$ is the observed coordinate of the l-th landmark point in the j-th camera's normalized camera coordinate system, $P^{w}_{l}$ is the position of the l-th landmark point in the world frame, $T_{c_j w}$ is the pose of the j-th camera, and $\pi_c$ is the camera internal parameter (projection) function.
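  • The same term as minimal Python, under the assumption that the residual is expressed in normalized image-plane coordinates:

```python
import numpy as np

def reprojection_residual(p_obs_norm, P_w, R_cw, t_cw):
    """Observed normalized-plane corner minus the projected landmark.

    p_obs_norm: observed (x, y) in the camera's normalized image plane.
    P_w:        landmark position in the world frame, shape (3,).
    R_cw, t_cw: world-to-camera rotation (3, 3) and translation (3,).
    """
    P_c = R_cw @ P_w + t_cw                 # landmark in the camera frame
    return p_obs_norm - P_c[:2] / P_c[2]    # pi_c for normalized coordinates
```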
  • the observation of the same corner in different camera states includes not only the parallax caused by the motion of the camera itself, but also the parallax caused by the movement of the corner itself.
  • the dynamic corner points are removed during image processing to improve the accuracy of the visual residual term.
  • for example, when the imaging device is a forward-looking monocular camera, feature extraction, tracking, and dynamic-target culling can be performed on the image frames captured by the camera, and monocular structure-from-motion (SFM) based on the essential matrix can then be used to remove the influence of dynamic feature points (also called dynamic corner points) on the visual residuals, as sketched below.
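  • A sketch of this scale-free monocular step using OpenCV's essential-matrix routines, reusing matched_prev and matched_next from the optical-flow sketch above (the intrinsic matrix K is an assumed placeholder):

```python
import cv2
import numpy as np

K = np.array([[700.0,   0.0, 320.0],     # assumed camera intrinsics
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Essential matrix from the culled (static) matching corners, with RANSAC
# rejecting remaining outliers.
E, inliers = cv2.findEssentialMat(matched_prev, matched_next, K,
                                  method=cv2.RANSAC, prob=0.999, threshold=1.0)

# Decompose into rotation R and unit-norm translation t: a pose change that
# carries no scale information, matching the "first pose change" above.
_, R, t, _ = cv2.recoverPose(E, matched_prev, matched_next, K, mask=inliers)
```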
  • the first corner points located in the region of the first image frame where the dynamic target is located are culled from the plurality of first corner points to obtain a plurality of culled first corner points;
  • after the second corner points located in the region of the second image frame where the dynamic target is located are culled from the plurality of second corner points to obtain a plurality of culled second corner points, the culled first corner points and the culled second corner points can be used to determine the pose change of the imaging device when capturing the second image frame relative to when capturing the first image frame.
  • the embodiments of the present application can be applied to a target vehicle in which the camera device, the inertial measurement unit (IMU), and the wheel speedometer are fixedly installed. The motion state data of the target vehicle measured by the IMU during the period from capturing the first image frame to capturing the second image frame can be obtained, as can the wheel speed data of the target vehicle measured by the wheel speedometer during the same period.
  • further, the pose change of the imaging device when capturing the second image frame relative to when capturing the first image frame is determined according to the motion state data, the wheel speed data, the plurality of culled first corner points, and the plurality of culled second corner points.
  • the motion state data may be vehicle acceleration data and angular velocity data measured by an inertial sensor (IMU), and the wheel speed data may be vehicle wheel speed, steering wheel angle data, and the like.
  • specifically, a first pose change that does not include scale information is determined from the disparity of the culled first corner points and the culled second corner points in their respective image frames;
  • according to the motion state data, the wheel speed data, and the first pose change, a second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame is determined, where the second pose change includes scale information;
  • nonlinear optimization is then performed on the second pose change to obtain the pose change.
  • nonlinear optimization may be performed on the second pose change according to a preset optimization function, wherein the preset optimization function includes a wheel speed meter residual term.
  • Nonlinear optimization refers to finding the value of x that minimizes a given objective function, that is, $\min_x f(x)$; when f(x) is a nonlinear function, the optimization is nonlinear optimization.
  • the plurality of culled first corner points and the plurality of culled second corner points may be aligned by timestamp with the motion state data and the wheel speed data; because the data rates differ, a single frame of image data (with its culled corner points) corresponds to multiple motion-state and wheel-speed samples.
  • the motion state data and wheel speed data of each image frame can be pre-integrated to provide initial pose values for the image.
  • using a sliding-window method, purely visual data is used for initialization to obtain the poses, without scale information (also called depth), of the camera device for all image frames in the sliding window.
  • the scale information is then restored with the aid of the wheel speed data, the positions of all corner points are recalculated, and nonlinear optimization is performed on the restored scale information and the inverse depths of the corner points to obtain the pose change. A simplified sketch of the alignment and pre-integration follows.
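  • A simplified sketch of the timestamp alignment and pre-integration described above, ignoring gravity, biases, and rotation integration (all names are illustrative assumptions; a real implementation would also integrate the gyroscope and account for gravity and bias):

```python
import numpy as np

def preintegrate_imu(imu_samples, t0, t1):
    """Integrate IMU samples whose timestamps fall between two image frames.

    imu_samples: list of (timestamp, accel (3,), gyro (3,)) sorted by time.
    Returns velocity and position deltas as rough initial pose values.
    """
    dv = np.zeros(3)   # velocity change
    dp = np.zeros(3)   # position change
    prev_t = t0
    for t, accel, gyro in imu_samples:   # gyro unused in this simplified sketch
        if t < t0 or t > t1:
            continue                     # timestamp alignment to the two frames
        dt = t - prev_t
        dp += dv * dt + 0.5 * accel * dt * dt
        dv += accel * dt
        prev_t = t
    return dv, dp
```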
  • the estimated values can include the accelerometer bias b_a and the gyroscope bias b_ω. Under two-dimensional motion the excitation is insufficient, so the acceleration bias is difficult to estimate; this causes the bias value to converge too slowly and makes the pose estimation inaccurate. Therefore, a wheel speedometer residual term is introduced in the embodiment of the present application.
  • specifically, pre-integration can be performed on the wheel speed data (including vehicle speed data, steering wheel data, and the like) measured by the wheel speedometer to obtain a pre-integration result, and the pre-integration residual is then established based on that result.
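  • Since the residual formula itself appears as an image in the source, the following is a representative wheel-odometer pre-integration residual written in common visual-inertial-odometry notation (an assumption, not the patent's exact expression):

$$
r_{\text{wheel}} = R_{b_k w}\left(p^{w}_{b_{k+1}} - p^{w}_{b_k}\right) - \hat{\alpha}^{\,b_k}_{b_{k+1}},
$$

  • where $R_{b_k w}$ rotates world coordinates into the body frame at frame $k$, $p^{w}_{b_k}$ and $p^{w}_{b_{k+1}}$ are the body positions when the two image frames are captured, and $\hat{\alpha}^{\,b_k}_{b_{k+1}}$ is the position increment pre-integrated from the wheel-speed measurements between the two frames.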
  • in view of the disadvantage that the two-dimensional-motion initialization error of a visual-inertial navigation system is relatively large, a wheel speedometer residual term is added.
  • the pose is initialized and computed from the joint information of vision, the IMU, and the wheel speedometer, so that when the IMU converges poorly the wheel speedometer can compensate, improving the result.
  • closed-loop (loop closure) detection can also be performed between the current key frame and the map; a detected frame is a closed-loop frame, and the co-visible feature points between the closed-loop frame and the frames in the sliding window can be found. A reprojection error term is established and added to the nonlinear optimization; four-degree-of-freedom optimization is performed on the optimized key frames that slide out of the window and on their associated frames; and the optimized key frames are inserted into the map. In actual use, the map, that is, loop closure detection, can be turned on or off. When positioning and mapping run together, the final position is expressed relative to the first fixed camera position; of course, it can be fixed to a specific reference frame by a subsequent coordinate transformation.
  • An embodiment of the present application provides a method for determining a pose. The method includes: acquiring a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device and each includes a dynamic target; performing corner detection on the first image frame and the second image frame respectively to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame; culling, from the plurality of first corner points, the first corner points located in the region of the first image frame where the dynamic target is located, to obtain a plurality of culled first corner points; culling, from the plurality of second corner points, the second corner points located in the region of the second image frame where the dynamic target is located, to obtain a plurality of culled second corner points; and determining, according to the plurality of culled first corner points and the plurality of culled second corner points, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
• when dynamic corner points exist, the observations of the same corner point under different camera states include not only the parallax caused by the camera's own motion but also the parallax caused by the motion of the corner point itself.
• the above method is therefore aimed at dynamic environments.
• the corner points in the region where the dynamic target is located are removed from the plurality of corner points to ensure that the visual reprojection error is more reliable, thereby overcoming the pose-change determination errors introduced by dynamic features.
  • FIG. 5 is a schematic structural diagram of a pose determination apparatus provided by an embodiment of the present application.
  • a pose determination apparatus 500 provided by an embodiment of the present application includes:
• the acquiring module 501 is configured to acquire a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device, and the first image frame and the second image frame each include a dynamic target;
• a corner extraction module 502 is configured to perform corner detection on the first image frame and the second image frame, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame;
• a removal module 503 is configured to remove, from the plurality of first corner points, the first corner points in the region where the dynamic target included in the first image frame is located, to obtain a plurality of first corner points after removal, and to remove, from the plurality of second corner points, the second corner points in the region where the dynamic target included in the second image frame is located, to obtain a plurality of second corner points after removal;
• a positioning module 504 is configured to determine, according to the plurality of first corner points after removal and the plurality of second corner points after removal, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
• the plurality of first corner points and the plurality of second corner points include one of the following: Features from Accelerated Segment Test (FAST) corner points, Harris corner points, and Binary Robust Invariant Scalable Keypoints (BRISK) corner points.
• the apparatus is applied to a target vehicle on which the camera device, an inertial measurement unit (IMU) and a wheel speedometer are fixedly installed; the acquiring module is configured to acquire motion state data of the target vehicle, measured by the IMU, during the period from when the camera device captures the first image frame to when it captures the second image frame, and to acquire wheel speed data of the target vehicle, measured by the wheel speedometer, during the same period;
• the positioning module is configured to determine, according to the parallax of the plurality of first corner points after removal and the plurality of second corner points after removal in their respective image frames, a first pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the first pose change does not include scale information;
• to determine, according to the motion state data, the wheel speed data and the first pose change, a second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the second pose change includes scale information;
• and to perform nonlinear optimization on the second pose change to obtain the pose change.
• the positioning module is configured to perform nonlinear optimization on the second pose change according to a preset optimization function, where the preset optimization function includes a wheel speedometer residual term. A sketch of such an optimization follows below.
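• A hedged sketch of how such an optimization function might be assembled and minimized (SciPy stands in for the solver here; the planar state layout, residual model and names are assumptions, not the application's implementation):

```python
import numpy as np
from scipy.optimize import least_squares

def make_residual_fn(wheel_pre):
    """Build a stacked residual over a window of planar poses.

    `x` packs poses (x, y, theta) per frame; `wheel_pre` holds the
    pre-integrated (dx, dy, dtheta) between consecutive frames. Each
    block mirrors one term of the optimization function; the IMU and
    visual reprojection blocks are omitted for brevity but would be
    stacked in the same way.
    """
    def residuals(x):
        poses = x.reshape(-1, 3)
        res = []
        for k, (dx, dy, dth) in enumerate(wheel_pre):
            # Wheel speedometer residual: relative pose predicted by
            # wheel pre-integration vs. the current pose estimates.
            p0, p1 = poses[k], poses[k + 1]
            c, s = np.cos(p0[2]), np.sin(p0[2])
            rel = np.array([ c * (p1[0] - p0[0]) + s * (p1[1] - p0[1]),
                            -s * (p1[0] - p0[0]) + c * (p1[1] - p0[1]),
                             p1[2] - p0[2]])
            res.extend(rel - np.array([dx, dy, dth]))
        return np.asarray(res)
    return residuals

# poses0: initial pose guesses from pre-integration, shape (N, 3)
# sol = least_squares(make_residual_fn(wheel_pre), poses0.ravel())
```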
  • the apparatus further includes:
• a dynamic target detection module is configured to detect the dynamic target in the first image frame and the dynamic target in the second image frame through a pre-trained neural network, to obtain the region where the dynamic target included in the first image frame is located and the region where the dynamic target included in the second image frame is located. An illustrative detector sketch follows below.
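• For instance (a sketch assuming a generic pre-trained detector from torchvision; the application does not specify the network, and the COCO label ids below are assumptions for illustration):

```python
import torch
import torchvision

# A generic pre-trained detector stands in for the application's network.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def dynamic_target_boxes(image_bgr, score_thresh=0.6, vehicle_labels=(3, 6, 8)):
    """Return bounding boxes of likely dynamic targets (COCO car/bus/truck).

    `image_bgr` is an HxWx3 uint8 numpy frame.
    """
    # Convert BGR uint8 to the RGB float CHW tensor the model expects.
    img = torch.from_numpy(image_bgr[..., ::-1].copy())
    img = img.permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        out = model([img])[0]
    boxes = []
    for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
        if score >= score_thresh and int(label) in vehicle_labels:
            boxes.append(tuple(box.tolist()))
    return boxes
```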
• an embodiment of the present application provides a pose determination apparatus 600, including a transceiver 610, a processor 620 and a memory 630; the memory 630 is used to store programs, instructions or code, and the processor 620 is used to execute the programs, instructions or code in the memory 630;
• the transceiver 610 is configured to receive the first image frame and the second image frame input by the camera device;
• the processor 620 is configured to perform corner detection on the first image frame and the second image frame respectively, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame; to remove, from the plurality of first corner points, the first corner points in the region where the dynamic target included in the first image frame is located, to obtain a plurality of first corner points after removal; to remove, from the plurality of second corner points, the second corner points in the region where the dynamic target included in the second image frame is located, to obtain a plurality of second corner points after removal; and to determine, according to the plurality of first corner points after removal and the plurality of second corner points after removal, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
  • the processor 620 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method may be completed by an integrated logic circuit of hardware in the processor 620 or an instruction in the form of software.
• the above-mentioned processor 620 may be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory 630, and the processor 620 reads the information in the memory 630, and performs the above method steps in combination with its hardware.
  • embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
• These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Abstract

Provided in an embodiment of the present invention is a pose determination method. The method comprises: acquiring a first image frame and a second image frame captured by a camera device, each of the first image frame and the second image frame comprising a dynamic target; respectively performing corner detection on the first image frame and the second image frame, and obtaining multiple first corners of the first image frame and multiple second corners of the second image frame; removing, from the multiple first corners, a first corner at a region where the dynamic target of the first image frame is located, and obtaining multiple first corners without said removed first corner; removing, from the multiple second corners, a second corner at a region where the dynamic target of the second image frame is located, and obtaining multiple second corners without said removed second corner; and determining a pose change of the camera device according to the multiple first corners without said removed first corner and the multiple second corners without said removed second corner. In this embodiment, removal of a corner at a region where a dynamic target is located from multiple corners ensures that a visual reprojection error is more reliable, thereby preventing occurrence of an error in determination of a pose change due to introduction of a dynamic feature.

Description

A pose determination method and related device
This application claims priority to Chinese Patent Application No. 202011199017.6, filed with the Chinese Patent Office on October 31, 2020 and entitled "A pose determination method and related device", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of image processing technologies, and more specifically, to a pose determination method and related devices.
Background
With the development of mobile robot technology, indoor real-time positioning technology has received extensive attention: a robot that knows its own location can provide real-time information to planning, control and other modules so that the desired task can be completed. However, in an indoor environment, the global positioning system (GPS) cannot be used for positioning because its signal is unstable. Some indoor positioning methods based on signal-generating devices, such as ultra wide band (UWB) and wireless fidelity (WiFi), require signal-generating equipment to be installed in the usage scenario, and tend to bring motion-area restrictions and cost problems. In addition, because laser sensors are expensive, laser-based indoor positioning technology suffers from excessive cost.
Because camera devices are low-cost and can capture rich information, vision-based active real-time positioning technology has been proposed. A common positioning scheme is pure-vision-based positioning. Its main idea is to solve the pose of the moving body based on visual feature-point matching and global optimization: corner points are first extracted from the images, and the corner points matched between two frames are used to calculate the pose change of the capture device between the two frames; the pose change can include a position change and a rotation-angle change. However, in some scenes there are dynamically moving objects; in this case the visual constraints cannot provide reliable observations, causing large errors and thereby affecting the calculation accuracy of the relative pose.
Summary of the Invention
In a first aspect, the present application provides a pose determination method, the method including:
acquiring a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device, and the first image frame and the second image frame each include a dynamic target, a dynamic target being a target that is displaced relative to the ground while the camera device captures the image frames. Displacement relative to the ground while the camera device captures the image frames means that the dynamic target moves in space relative to the ground, for example from position A to position B (A and B being two different positions). In one implementation, the dynamic target may be a vehicle, for example a car, a truck, a passenger car, a trailer, an incomplete vehicle, a motorcycle, and so on. It should be understood that the first image frame and the second image frame may include the same dynamic target or different dynamic targets; performing corner detection on the first image frame and the second image frame respectively, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame. In some cases, it can be understood that the human eye usually recognizes an image within a small local region or window. If, when this small window is moved slightly in all directions, the gray level of the image region inside the window changes greatly, it can be considered that a corner point exists inside the window; if the gray level does not change, it can be considered that no corner point exists inside the window. In some embodiments, the plurality of first corner points and the plurality of second corner points may be Features from Accelerated Segment Test (FAST) corner points, Harris corner points, Binary Robust Invariant Scalable Keypoints (BRISK) corner points, or the like, which is not limited in this embodiment of the present invention.
Removing, from the plurality of first corner points, the first corner points in the region where the dynamic target included in the first image frame is located, to obtain a plurality of first corner points after removal; and removing, from the plurality of second corner points, the second corner points in the region where the dynamic target included in the second image frame is located, to obtain a plurality of second corner points after removal. When dynamic corner points (that is, corner points in the region where a dynamic target is located) exist, the observations of the same corner point under different camera states include not only the parallax introduced by the camera's own motion but also the parallax brought by the motion of the feature point itself; in this embodiment of the present application, removing the corner points in the region where the dynamic target is located can ensure that the visual reprojection error term is reliable. According to the plurality of first corner points after removal and the plurality of second corner points after removal, determining the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
The pose change may refer to a change in the position of the capture device and a change in its rotation angle. The change in position may represent the distance between the position at which the capture device captured the second image frame and the position at which it captured the first image frame. The change in rotation angle may represent the angular difference between the rotation angle at which the capture device captured the second image frame and the rotation angle at which it captured the first image frame. When the capture device is fixed on a vehicle, the pose change of the capture device can be regarded as the pose change of the vehicle.
Taking the case where the first image frame precedes the second image frame as an example, optical flow can be run on the plurality of corner points extracted from the first image frame to find the matching corner points between the first image frame and the second image frame and to obtain the optical-flow information of the matching corner points. The optical-flow information represents the motion of the matching corner points across the two adjacent images (from the perspective of the image frames, this can be called parallax). Further, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame can be determined based on the optical-flow information. The algorithm used for optical flow may be the Lucas-Kanade optical flow algorithm or another algorithm; besides optical flow, descriptors or direct methods may also be used to match the corner points.
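As a compact, hedged sketch of this matching-and-pose step (assuming OpenCV's pyramidal Lucas-Kanade, a known camera intrinsic matrix K, and corner points from which dynamic regions have already been removed; not the application's exact implementation):

```python
import cv2
import numpy as np

def relative_pose(prev_gray, curr_gray, prev_pts, K):
    """Track corners from the first frame into the second and recover
    the relative pose (rotation R, unit-scale translation t)."""
    p0 = prev_pts.astype(np.float32).reshape(-1, 1, 2)
    curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None)
    good = status.ravel() == 1
    p0, p1 = p0.reshape(-1, 2)[good], curr_pts.reshape(-1, 2)[good]
    # Epipolar geometry on the matched (and already culled) corners.
    E, inliers = cv2.findEssentialMat(p0, p1, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, p0, p1, K, mask=inliers)
    return R, t  # t has unit norm: scale is unobservable from two views
```

Note that the translation recovered this way has no scale, which is exactly why the scale-recovery steps described below are needed.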
When dynamic corner points exist, the observations of the same corner point under different camera states include not only the parallax introduced by the camera's own motion but also the parallax brought by the motion of the corner point itself. Through the above approach, this embodiment addresses the problem of large positioning errors in dynamic environments: the corner points in the region where the dynamic target is located are removed, so that the visual reprojection error is more reliable, thereby overcoming the errors introduced by dynamic features into the pose change.
In a possible implementation, the dynamic target is a vehicle.
In a possible implementation, the method is applied to a target vehicle on which the camera device, an inertial measurement unit (IMU) and a wheel speedometer are fixedly installed; the method further includes: acquiring motion state data of the target vehicle, measured by the IMU, during the period from when the camera device captures the first image frame to when it captures the second image frame; and acquiring wheel speed data of the target vehicle, measured by the wheel speedometer, during the same period. Correspondingly, determining the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame according to the plurality of first corner points after removal and the plurality of second corner points after removal includes: determining, according to the parallax of the plurality of first corner points after removal and the plurality of second corner points after removal in their respective image frames, a first pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the first pose change does not include scale information; determining, according to the motion state data, the wheel speed data and the first pose change, a second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the second pose change includes scale information; and performing nonlinear optimization on the second pose change to obtain the pose change. The motion state data may be vehicle acceleration data and angular velocity data measured by the inertial sensor (IMU), and the wheel speed data may be vehicle wheel rotation speed, steering wheel angle data, and so on.
In one implementation, the plurality of first corner points after removal and the plurality of second corner points after removal can be time-stamp aligned with the motion state data and the wheel speed data; because the data rates differ, a single frame of image data (the corner points after removal) corresponds to multiple motion-state and wheel-speed samples. The motion state data and wheel speed data of each image frame can then be pre-integrated to provide an initial pose value for the image. After that, a sliding window method is used, and pure visual data is used for initialization, obtaining the poses of the camera device when capturing all image frames in the sliding window together with positions lacking scale information (scale information may also be called depth). The motion state data and wheel speed data are combined to recover the missing scale information, the positions of all corner points are recalculated, and nonlinear optimization is then performed on the recovered scale information and the inverse depths of the corner points to obtain the pose change.
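A minimal sketch of the timestamp-alignment step (the data layout is an assumption for illustration):

```python
import bisect

def align_measurements(frame_times, meas):
    """Group high-rate IMU/wheel samples by image-frame interval.

    `frame_times` is a sorted list of image timestamps; `meas` is a
    sorted list of (t, data) samples. Each interval [t_k, t_{k+1})
    collects several samples, since the sensors run faster than the
    camera, matching the many-to-one correspondence described above.
    """
    buckets = [[] for _ in range(len(frame_times) - 1)]
    for t, data in meas:
        k = bisect.bisect_right(frame_times, t) - 1
        if 0 <= k < len(buckets):
            buckets[k].append((t, data))
    return buckets
```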
In a possible implementation, performing nonlinear optimization on the second pose change includes: performing nonlinear optimization on the second pose change according to a preset optimization function, where the preset optimization function includes a wheel speedometer residual term.
In order to smooth the pose change or optimize it toward the true value, nonlinear optimization is usually performed. Nonlinear optimization means finding, for a given objective function f(x), the optimal set of values, i.e. x = argmin f(x). According to derivative theory, valid values of x can be obtained by solving the derivative equation ∇f(x) = 0; when f(x) is a nonlinear function, the optimization is nonlinear optimization. In the embodiment of the present application, the optimization function can consist of five terms, which can be expressed as: optimization function = prior residual + IMU residual term + wheel speedometer residual term + visual reprojection error term + loop-closure detection reprojection error term.
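Written out in symbols (the residual notation and covariance weights below are chosen for illustration, not taken from the application), this five-term function has the form:

```latex
\min_{\mathcal{X}}\;
\lVert r_p \rVert^2
+ \sum_{k}\lVert r_{\mathrm{IMU},k}(\mathcal{X}) \rVert^2_{\Sigma_I}
+ \sum_{k}\lVert r_{\mathrm{wheel},k}(\mathcal{X}) \rVert^2_{\Sigma_O}
+ \sum_{j}\lVert r_{\mathrm{vis},j}(\mathcal{X}) \rVert^2_{\Sigma_C}
+ \sum_{l}\lVert r_{\mathrm{loop},l}(\mathcal{X}) \rVert^2_{\Sigma_L}
```

where the state vector stacks the sliding-window quantities (poses, velocities, biases, inverse depths) and each Σ is the corresponding measurement covariance.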
In this embodiment of the present application, to address the problem of large positioning errors in dynamic environments, the corner points in the region where the dynamic target is located are removed from the plurality of corner points, so that the visual reprojection error is more reliable, thereby overcoming the positioning errors introduced by dynamic features. Moreover, to address the disadvantage that the two-dimensional-motion initialization error of a visual-inertial navigation system is relatively large, a wheel speedometer residual term is added. The pose is initialized and computed jointly from the vision, IMU and wheel speedometer information, so that when the IMU converges poorly, the wheel speedometer can be used to compensate, thereby improving the result.
In a possible implementation, the method further includes:
detecting the dynamic target in the first image frame and the dynamic target in the second image frame through a pre-trained neural network, to obtain the region where the dynamic target included in the first image frame is located and the region where the dynamic target included in the second image frame is located.
In one implementation, loop-closure detection can also be performed between the current keyframe and the map; a detected frame is a loop-closure frame, and the co-visible feature points between the loop-closure frame and the frames in the sliding window are found; a reprojection error term is established and added to the nonlinear optimization; four-degree-of-freedom optimization is performed on keyframes that have been optimized and have slid out of the window, together with their associated frames; and the optimized keyframes are inserted into the map. In actual use, mapping, that is, loop-closure detection, can be enabled or disabled; when enabled, the map is built while positioning, and the final positions obtained are all relative to the first fixed camera position, which can of course be fixed to a specific reference frame by a subsequent coordinate transformation.
In a second aspect, the present application provides a pose determination apparatus, the apparatus including:
an acquiring module, configured to acquire a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device, and the first image frame and the second image frame each include a dynamic target;
a corner extraction module, configured to perform corner detection on the first image frame and the second image frame respectively, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame;
a removal module, configured to remove, from the plurality of first corner points, the first corner points in the region where the dynamic target included in the first image frame is located, to obtain a plurality of first corner points after removal;
and to remove, from the plurality of second corner points, the second corner points in the region where the dynamic target included in the second image frame is located, to obtain a plurality of second corner points after removal;
a positioning module, configured to determine, according to the plurality of first corner points after removal and the plurality of second corner points after removal, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
In a possible implementation, the plurality of first corner points and the plurality of second corner points include one of the following: Features from Accelerated Segment Test (FAST) corner points, Harris corner points, and Binary Robust Invariant Scalable Keypoints (BRISK) corner points.
In a possible implementation, the apparatus is applied to a target vehicle on which the camera device, an inertial measurement unit (IMU) and a wheel speedometer are fixedly installed; the acquiring module is configured to acquire motion state data of the target vehicle, measured by the IMU, during the period from when the camera device captures the first image frame to when it captures the second image frame;
and to acquire wheel speed data of the target vehicle, measured by the wheel speedometer, during the period from when the camera device captures the first image frame to when it captures the second image frame.
Correspondingly, the positioning module is configured to determine, according to the parallax of the plurality of first corner points after removal and the plurality of second corner points after removal in their respective image frames, a first pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the first pose change does not include scale information;
to determine, according to the motion state data, the wheel speed data and the first pose change, a second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the second pose change includes scale information;
and to perform nonlinear optimization on the second pose change to obtain the pose change.
In a possible implementation, the positioning module is configured to perform nonlinear optimization on the second pose change according to a preset optimization function, where the preset optimization function includes a wheel speedometer residual term.
In a possible implementation, the apparatus further includes:
a dynamic target detection module, configured to detect the dynamic target in the first image frame and the dynamic target in the second image frame through a pre-trained neural network, to obtain the region where the dynamic target included in the first image frame is located and the region where the dynamic target included in the second image frame is located.
In a possible implementation, the dynamic target is a vehicle.
In a third aspect, a computer-readable storage medium is provided, where the computer-readable storage medium stores program code, and the program code includes instructions for performing some or all of the operations of the method described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer program product that, when run on a communication apparatus, causes the communication apparatus to perform some or all of the operations of the method described in the first aspect.
In a fifth aspect, a chip is provided, where the chip includes a processor configured to perform some or all of the operations of the method described in the first aspect.
An embodiment of the present application provides a pose determination method, the method including: acquiring a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device and each include a dynamic target; performing corner detection on the first image frame and the second image frame respectively, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame; removing, from the plurality of first corner points, the first corner points in the region where the dynamic target included in the first image frame is located, to obtain a plurality of first corner points after removal; removing, from the plurality of second corner points, the second corner points in the region where the dynamic target included in the second image frame is located, to obtain a plurality of second corner points after removal; and determining, according to the plurality of first corner points after removal and the plurality of second corner points after removal, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame. When dynamic corner points exist, the observations of the same corner point under different camera states include not only the parallax introduced by the camera's own motion but also the parallax brought by the motion of the corner point itself. Through the above approach, this embodiment addresses the problem of large positioning errors in dynamic environments: the corner points in the region where the dynamic target is located are removed from the plurality of corner points, so that the visual reprojection error is more reliable, thereby overcoming the pose-change determination errors introduced by dynamic features.
Brief Description of Drawings
FIG. 1 is a functional block diagram of a vehicle provided by an embodiment of the present invention;
FIG. 2 is a functional block diagram of an automatic driving system provided by an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a pose determination method provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of a pose determination method provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a pose determination apparatus provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a pose determination apparatus provided by an embodiment of the present application.
Detailed Description of Embodiments
The embodiments of the present invention are described below with reference to the accompanying drawings of the embodiments of the present invention.
The terms "first", "second", "third", "fourth" and the like in the description, the claims and the drawings of the present application are used to distinguish different objects, rather than to describe a specific order. Furthermore, the terms "include" and "have", and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product or device.
Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearance of the phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment that is mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
The terms "component", "module", "system" and the like used in this specification denote a computer-related entity, hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program and/or a computer. By way of illustration, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer-readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes, for example according to a signal having one or more data packets (such as data from two components interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet interacting with other systems by way of the signal).
FIG. 1 is a functional block diagram of a vehicle 100 provided by an embodiment of the present invention. In one embodiment, the vehicle 100 is configured in a fully or partially autonomous driving mode. For example, the vehicle 100 can control itself while in the autonomous driving mode; the current state of the vehicle and its surroundings can be determined, the possible behavior of at least one other vehicle in the surroundings can be determined, a confidence level corresponding to the likelihood that the other vehicle performs the possible behavior can be determined, and the vehicle 100 can be controlled based on the determined information. When the vehicle 100 is in the autonomous driving mode, it may be set to operate without human interaction.
The vehicle 100 may include various subsystems, such as a travel system 102, a sensor system 104, a control system 106, one or more peripheral devices 108, a power supply 110, a computer system 112 and a user interface 116. Optionally, the vehicle 100 may include more or fewer subsystems, and each subsystem may include multiple elements. In addition, the subsystems and elements of the vehicle 100 may be interconnected by wire or wirelessly.
The travel system 102 may include components that provide powered motion for the vehicle 100. In one embodiment, the travel system 102 may include an engine 118, an energy source 119, a transmission 120 and wheels/tires 121. The engine 118 may be an internal combustion engine, an electric motor, an air compression engine, or a combination of engine types, such as a hybrid engine consisting of a gasoline engine and an electric motor, or a hybrid engine consisting of an internal combustion engine and an air compression engine. The engine 118 converts the energy source 119 into mechanical energy.
Examples of the energy source 119 include gasoline, diesel, other petroleum-based fuels, propane, other compressed-gas-based fuels, ethanol, solar panels, batteries and other sources of electricity. The energy source 119 may also provide energy for other systems of the vehicle 100.
The transmission 120 may transmit mechanical power from the engine 118 to the wheels 121. The transmission 120 may include a gearbox, a differential and a drive shaft. In one embodiment, the transmission 120 may also include other devices, such as a clutch. The drive shaft may include one or more axles that may be coupled to one or more wheels 121.
The sensor system 104 may include several sensors that sense information about the environment surrounding the vehicle 100. For example, the sensor system 104 may include a positioning system 122 (which may be a GPS system, a BeiDou system or another positioning system), an inertial measurement unit (IMU) 124, a radar system 126, a laser rangefinder 128 and a camera 130. The sensor system 104 may also include sensors that monitor the internal systems of the vehicle 100 (for example, an in-vehicle air quality monitor, a fuel gauge, an oil temperature gauge, etc.). Sensor data from one or more of these sensors can be used to detect objects and their corresponding characteristics (position, shape, orientation, speed, etc.). Such detection and identification is a key function for the safe operation of the autonomous vehicle 100.
With the development of advanced driver assistance systems (ADAS) and unmanned driving technologies, higher requirements are placed on the performance of the radar system 126, such as its range and angular resolution. The improved range and angular resolution of the vehicle-mounted radar system 126 mean that, when imaging a target, multiple measurement points are detected for one target object, forming high-resolution point cloud data; the radar system 126 in this application may therefore also be referred to as a point cloud imaging radar.
The positioning system 122 may be used to estimate the geographic location of the vehicle 100. The IMU 124 is used to sense position and orientation changes of the vehicle 100 based on inertial acceleration. In one embodiment, the IMU 124 may be a combination of an accelerometer and a gyroscope.
The radar system 126 may use radio signals to sense objects within the surrounding environment of the vehicle 100. In some embodiments, in addition to sensing objects, the radar system 126 may also be used to sense the speed and/or heading of the objects.
The laser rangefinder 128 may use laser light to sense objects in the environment in which the vehicle 100 is located. In some embodiments, the laser rangefinder 128 may include one or more laser sources, a laser scanner and one or more detectors, among other system components.
The camera 130 may be used to capture multiple images of the surrounding environment of the vehicle 100. The camera 130 may be a still camera or a video camera. In this embodiment of the present application, the camera 130 may also be referred to as a camera device.
The control system 106 controls the operation of the vehicle 100 and its components. The control system 106 may include various elements, including a steering system 132, a throttle 134, a braking unit 136, a computer vision system 140, a route control system 142 and an obstacle avoidance system 144.
The steering system 132 is operable to adjust the heading of the vehicle 100. For example, in one embodiment it may be a steering wheel system.
The throttle 134 is used to control the operating speed of the engine 118 and thereby the speed of the vehicle 100.
The braking unit 136 is used to control the deceleration of the vehicle 100. The braking unit 136 may use friction to slow the wheels 121. In other embodiments, the braking unit 136 may convert the kinetic energy of the wheels 121 into electric current. The braking unit 136 may also take other forms to slow the rotation speed of the wheels 121 and thus control the speed of the vehicle 100.
The computer vision system 140 may process and analyze images captured by the camera 130 in order to identify objects and/or features in the environment surrounding the vehicle 100. The objects and/or features may include traffic signals, road boundaries and obstacles. The computer vision system 140 may use object recognition algorithms, Structure from Motion (SFM) algorithms, video tracking and other computer vision techniques. In some embodiments, the computer vision system 140 may be used to map the environment, track objects, estimate the speed of objects, and so on.
The route control system 142 is used to determine the travel route of the vehicle 100. In some embodiments, the route control system 142 may combine data from the sensors 138, the GPS 122 and one or more predetermined maps to determine a travel route for the vehicle 100.
The obstacle avoidance system 144 is used to identify, evaluate and avoid or otherwise negotiate potential obstacles in the environment of the vehicle 100.
Of course, in one example, the control system 106 may additionally or alternatively include components other than those shown and described, or some of the components shown above may be omitted.
The vehicle 100 interacts with external sensors, other vehicles, other computer systems or users through the peripheral devices 108. The peripheral devices 108 may include a wireless communication system 146, an onboard computer 148, a microphone 150 and/or a speaker 152.
In some embodiments, the peripheral devices 108 provide a means for a user of the vehicle 100 to interact with the user interface 116. For example, the onboard computer 148 may provide information to the user of the vehicle 100. The user interface 116 may also operate the onboard computer 148 to receive user input. The onboard computer 148 can be operated via a touch screen. In other cases, the peripheral devices 108 may provide a means for the vehicle 100 to communicate with other devices located within the vehicle. For example, the microphone 150 may receive audio (for example, voice commands or other audio input) from a user of the vehicle 100. Similarly, the speaker 152 may output audio to a user of the vehicle 100.
无线通信系统146可以直接地或者经由通信网络来与一个或多个设备无线通信。例如, 无线通信系统146可使用3G蜂窝通信,例如码分多址(code division multiple access,CDMA)、增强型多媒体盘片系统(enhanced versatile disk,EVD)、全球移动通讯系统(global system for mobile communications,GSM)/通用分组无线业务(general packet radio service,GPRS),或者4G蜂窝通信,例如LTE。或者5G蜂窝通信。无线通信系统146可利用WiFi与无线局域网(wireless local area network,WLAN)通信。在一些实施例中,无线通信系统146可利用红外链路、蓝牙或ZigBee与设备直接通信。其他无线协议,例如各种车辆通信系统,例如,无线通信系统146可包括一个或多个专用短程通信(dedicated short range communications,DSRC)设备,这些设备可包括车辆和/或路边台站之间的公共和/或私有数据通信。Wireless communication system 146 may wirelessly communicate with one or more devices, either directly or via a communication network. For example, the wireless communication system 146 may use 3G cellular communications, such as code division multiple access (CDMA), enhanced versatile disk (EVD), global system for mobile communications , GSM)/general packet radio service (GPRS), or 4G cellular communications such as LTE. Or 5G cellular communications. The wireless communication system 146 may communicate with a wireless local area network (WLAN) using WiFi. In some embodiments, the wireless communication system 146 may communicate directly with the device using an infrared link, Bluetooth, or ZigBee. Other wireless protocols, such as various vehicle communication systems, for example, wireless communication system 146 may include one or more dedicated short range communications (DSRC) devices, which may include communication between vehicles and/or roadside stations public and/or private data communications.
电源110可向车辆100的各种组件提供电力。在一个实施例中,电源110可以为可再充电锂离子或铅酸电池。这种电池的一个或多个电池组可被配置为电源为车辆100的各种组件提供电力。在一些实施例中,电源110和能量源119可一起实现,例如一些全电动车中那样。The power supply 110 may provide power to various components of the vehicle 100 . In one embodiment, the power source 110 may be a rechargeable lithium-ion or lead-acid battery. One or more battery packs of such a battery may be configured as a power source to provide power to various components of the vehicle 100 . In some embodiments, power source 110 and energy source 119 may be implemented together, such as in some all-electric vehicles.
车辆100的部分或所有功能受计算机系统112控制。计算机系统112可包括至少一个处理器113,处理器113执行存储在例如数据存储装置114这样的非暂态计算机可读介质中的指令115。计算机系统112还可以是采用分布式方式控制车辆100的个体组件或子系统的多个计算设备。Some or all of the functions of the vehicle 100 are controlled by the computer system 112 . Computer system 112 may include at least one processor 113 that executes instructions 115 stored in a non-transitory computer-readable medium such as data storage device 114 . Computer system 112 may also be multiple computing devices that control individual components or subsystems of vehicle 100 in a distributed fashion.
处理器113可以是任何常规的处理器,诸如商业可获得的中央处理器(central processing unit,CPU)。替选地,该处理器可以是诸如专用集成电路(application specific integrated circuit,ASIC)或其它基于硬件的处理器的专用设备。尽管图1功能性地图示了处理器、存储器、和在相同块中的计算机110的其它元件,但是本领域的普通技术人员应该理解该处理器、计算机、或存储器实际上可以包括可以或者可以不存储在相同的物理外壳内的多个处理器、计算机、或存储器。例如,存储器可以是硬盘驱动器或位于不同于计算机110的外壳内的其它存储介质。因此,对处理器或计算机的引用将被理解为包括对可以或者可以不并行操作的处理器或计算机或存储器的集合的引用。不同于使用单一的处理器来执行此处所描述的步骤,诸如转向组件和减速组件的一些组件每个都可以具有其自己的处理器,处理器只执行与特定于组件的功能相关的计算。The processor 113 may be any conventional processor, such as a commercially available central processing unit (CPU). Alternatively, the processor may be a dedicated device such as an application specific integrated circuit (ASIC) or other hardware-based processor. Although FIG. 1 functionally illustrates the processor, memory, and other elements of the computer 110 in the same block, one of ordinary skill in the art will understand that the processor, computer, or memory may actually include a processor, a computer, or a memory that may or may not Multiple processors, computers, or memories stored within the same physical enclosure. For example, the memory may be a hard drive or other storage medium located within an enclosure other than computer 110 . Thus, reference to a processor or computer will be understood to include reference to a collection of processors or computers or memories that may or may not operate in parallel. Rather than using a single processor to perform the steps described herein, some components, such as the steering and deceleration components, may each have their own processors that only perform computations related to component-specific functions.
本申请实施例中,处理器113可以获取到相机130以及其他传感器设备的数据,并基于获取的数据进行车辆的定位。In this embodiment of the present application, the processor 113 may acquire data from the camera 130 and other sensor devices, and perform vehicle positioning based on the acquired data.
在此处所描述的各个方面中,处理器可以位于远离该车辆并且与该车辆进行无线通信。在其它方面中,此处所描述的过程中的一些在布置于车辆内的处理器上执行而其它则由远程处理器执行,包括采取执行单一操纵的必要步骤。In various aspects described herein, a processor may be located remotely from the vehicle and in wireless communication with the vehicle. In other aspects, some of the processes described herein are performed on a processor disposed within the vehicle while others are performed by a remote processor, including taking steps necessary to perform a single maneuver.
在一些实施例中,数据存储装置114可包含指令115(例如,程序逻辑),指令115可被处理器113执行来执行车辆100的各种功能,包括以上描述的那些功能。数据存储装置114也可包含额外的指令,包括向推进系统102、传感器系统104、控制系统106和外围设备108中的一个或多个发送数据、从其接收数据、与其交互和/或对其进行控制的指令。In some embodiments, data storage 114 may include instructions 115 (eg, program logic) executable by processor 113 to perform various functions of vehicle 100 , including those described above. Data storage 114 may also contain additional instructions, including sending data to, receiving data from, interacting with, and/or performing data processing on one or more of propulsion system 102 , sensor system 104 , control system 106 , and peripherals 108 . control commands.
In addition to the instructions 115, the data storage device 114 may also store data, such as road maps, route information, the position, direction, and speed of the vehicle and other vehicle data, as well as other information. Such information may be used by the vehicle 100 and the computer system 112 while the vehicle 100 operates in autonomous, semi-autonomous, and/or manual modes.
The user interface 116 is configured to provide information to or receive information from a user of the vehicle 100. Optionally, the user interface 116 may include one or more input/output devices within the set of peripheral devices 108, such as the wireless communication system 146, the onboard computer 148, the microphone 150, and the speaker 152.
The computer system 112 may control the functions of the vehicle 100 based on input received from various subsystems (for example, the travel system 102, the sensor system 104, and the control system 106) and from the user interface 116. For example, the computer system 112 may use input from the control system 106 to control the steering unit 132 to avoid obstacles detected by the sensor system 104 and the obstacle avoidance system 144. In some embodiments, the computer system 112 is operable to provide control over many aspects of the vehicle 100 and its subsystems.
Optionally, one or more of the components described above may be installed separately from, or merely associated with, the vehicle 100. For example, the data storage device 114 may exist partially or completely separate from the vehicle 100. The components described above may be communicatively coupled together in a wired and/or wireless manner.
Optionally, the components described above are merely an example. In practical applications, components in each of the foregoing modules may be added or removed according to actual needs, and FIG. 1 should not be construed as a limitation on the embodiments of the present invention.
A self-driving car traveling on a road, such as the vehicle 100 above, can identify objects in its surrounding environment to determine an adjustment to its current speed. The objects may be other vehicles, traffic control devices, or other types of objects. In some examples, each identified object may be considered independently, and the object's respective characteristics, such as its current speed, acceleration, and distance from the vehicle, may be used to determine the speed to which the self-driving car is to adjust.
Optionally, the self-driving vehicle 100, or a computing device associated with the self-driving vehicle 100 (such as the computer system 112, the computer vision system 140, or the data storage device 114 in FIG. 1), may predict the behavior of the identified objects based on the characteristics of those objects and the state of the surrounding environment (for example, traffic, rain, or ice on the road). Optionally, since the identified objects depend on each other's behavior, all of the identified objects may also be considered together to predict the behavior of a single identified object. The vehicle 100 can adjust its speed based on the predicted behavior of the identified objects. In other words, the self-driving car can determine, based on the predicted behavior of the objects, what steady state the vehicle will need to adjust to (for example, accelerate, decelerate, or stop). Other factors may also be considered in this process to determine the speed of the vehicle 100, such as the lateral position of the vehicle 100 on the road being traveled, the curvature of the road, and the proximity of static and dynamic objects.
In addition to providing instructions to adjust the speed of the self-driving car, the computing device may also provide instructions to modify the steering angle of the vehicle 100, so that the self-driving car follows a given trajectory and/or maintains safe lateral and longitudinal distances from objects in its vicinity (for example, cars in adjacent lanes on the road).
The vehicle 100 may be a car, a truck, a motorcycle, a bus, a boat, an airplane, a helicopter, a lawn mower, a recreational vehicle, an amusement park vehicle, construction equipment, a tram, a golf cart, a train, a cart, or the like, which is not specifically limited in the embodiments of the present invention.
Scenario Example 1: Autonomous Driving System
Referring to FIG. 2, the computer system 101 includes a processor 103 coupled to a system bus 105. The processor 103 may be one or more processors, each of which may include one or more processor cores. A video adapter 107 may drive a display 109, and the display 109 is coupled to the system bus 105. The system bus 105 is coupled to an input/output (I/O) bus through a bus bridge 111. An I/O interface 115 is coupled to the I/O bus. The I/O interface 115 communicates with various I/O devices, such as an input device 117 (for example, a keyboard, a mouse, or a touch screen), a media tray 121 (for example, a compact disc read-only memory (CD-ROM) or a multimedia interface), a transceiver 123 (which can send and/or receive radio communication signals), a camera 155 (which can capture still and moving digital video images), and an external USB port 125. Optionally, the interface connected to the I/O interface 115 may be a USB interface.
The processor 103 may be any conventional processor, including a reduced instruction set computing (RISC) processor, a complex instruction set computing (CISC) processor, or a combination thereof. Alternatively, the processor may be a dedicated device such as an ASIC. Optionally, the processor 103 may be a neural network processor or a combination of a neural network processor and the conventional processors described above.
Optionally, in the various embodiments herein, the computer system 101 may be located remotely from the self-driving vehicle and may communicate wirelessly with it. In other aspects, some of the processes herein are executed on a processor disposed within the self-driving vehicle, and others are executed by a remote processor, including taking the actions required to perform a single maneuver.
The computer 101 may communicate with a software deployment server 149 through a network interface 129. For example, the network interface 129 is a hardware network interface, such as a network interface card. The network 127 may be an external network, such as the Internet, or an internal network, such as an Ethernet network or a virtual private network (VPN). Optionally, the network 127 may also be a wireless network, such as a Wi-Fi network or a cellular network.
A hard disk drive interface is coupled to the system bus 105 and connected to a hard disk drive. A system memory 135 is coupled to the system bus 105. Data running in the system memory 135 may include an operating system 137 and application programs 143 of the computer 101.
The operating system includes a shell 139 and a kernel 141. The shell 139 is an interface between the user and the kernel of the operating system. The shell is the outermost layer of the operating system; it manages the interaction between the user and the operating system by waiting for the user's input, interpreting that input to the operating system, and handling the wide variety of operating system output.
The kernel 141 consists of those parts of the operating system that manage memory, files, peripherals, and system resources. Interacting directly with the hardware, the operating system kernel typically runs processes, provides inter-process communication, and provides CPU time-slice management, interrupt handling, memory management, I/O management, and the like.
The application programs 143 include programs related to controlling the autonomous driving of a car, for example, a program that manages the interaction between the self-driving car and obstacles on the road, a program that controls the route or speed of the self-driving car, and a program that controls the interaction between the self-driving car and other self-driving cars on the road. The application programs 143 also exist on the system of the deployment server 149. In one embodiment, the computer system 101 may download the application programs 143 from the software deployment server 149 when they need to be executed.
A sensor 153 is associated with the computer system 101. The sensor 153 is configured to detect the environment around the computer system 101. For example, the sensor 153 can detect animals, cars, obstacles, pedestrian crossings, and the like. Further, the sensor 153 can also detect the environment around such objects, for example, the environment around an animal, such as other animals appearing nearby, weather conditions, and the brightness of the surroundings. Optionally, if the computer 101 is located on a self-driving car, the sensor may be a radar system or the like.
A common positioning solution is a purely vision-based positioning technology, namely visual SLAM. The main idea of this solution is to solve for the pose of a moving body based on visual feature point matching and global optimization: features are first extracted from the images, the feature points matched between two frames are used to compute the relative pose transformation between those frames, and this information is finally used to compute odometry.
In addition to visual SLAM, positioning methods that fuse vision with inertial measurement unit (IMU) data are also widely used. The main idea of this solution is to fuse IMU data with visual data to position the moving body. However, in a dynamic environment, the visual constraints in such vision-IMU fusion methods cannot provide reliable observations, which results in large errors and thus degrades positioning accuracy.
To resolve the foregoing problems, refer to FIG. 3, which is a schematic flowchart of a pose determination method provided by an embodiment of the present application. As shown in FIG. 3, the pose determination method provided by this embodiment of the present application includes the following steps.
301. Acquire a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device, and each of the first image frame and the second image frame includes a dynamic target.
In this embodiment of the present application, when the vehicle is in an environment with no GPS or with unstable GPS, such as a tunnel, an underground parking lot, among high-rise buildings, or in a heavily occluded place, the first image frame and the second image frame captured by the camera device may be acquired, where the first image frame and the second image frame may be two consecutive image frames captured by the camera device.
In this embodiment of the present application, each of the first image frame and the second image frame includes a dynamic target, where the dynamic target is a target that is displaced relative to the ground while the camera device captures the image frames. Displacement relative to the ground while the camera device captures the image frames means that the dynamic target moves relative to the ground in space, for example, from position A to position B (where position A and position B are two different positions). In one implementation, the dynamic target may be a vehicle, for example, a car, a truck, a passenger car, a trailer, a non-holonomic vehicle, a motorcycle, or the like.
It should be understood that the first image frame and the second image frame may include the same dynamic target, with the dynamic target located in different regions of the two frames. The first image frame and the second image frame may also include different dynamic targets. For example, the first image frame may include a vehicle 1 and a vehicle 2, the second image frame may also include the vehicle 1 and the vehicle 2, the position of the vehicle 1 in the first image frame differs from its position in the second image frame, and the position of the vehicle 2 in the first image frame differs from its position in the second image frame. For another example, the first image frame may include a vehicle 1 and a vehicle 2, and the second image frame may include the vehicle 1 and a vehicle 3, so that the first image frame does not include the vehicle 3, the second image frame does not include the vehicle 2, and the position of the vehicle 1 in the first image frame differs from its position in the second image frame.
302. Perform corner detection on each of the first image frame and the second image frame, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame.
In this embodiment of the present application, after the first image frame and the second image frame captured by the camera device are acquired, corner detection may further be performed on the first image frame and the second image frame, to obtain the plurality of first corner points of the first image frame and the plurality of second corner points of the second image frame.
In one implementation, step 302 may be executed by a processor of the vehicle itself; that is, the processor of the vehicle may perform corner detection on the first image frame and the second image frame to obtain the plurality of first corner points of the first image frame and the plurality of second corner points of the second image frame. In another implementation, step 302 may be executed by a cloud-side server; that is, the vehicle may send the first image frame and the second image frame captured by the camera device to the cloud-side server, and the server may perform corner detection on the two frames to obtain the plurality of first corner points of the first image frame and the plurality of second corner points of the second image frame.
In some embodiments, the plurality of first corner points and the plurality of second corner points may be features from accelerated segment test (FAST) corner points, Harris corner points, binary robust invariant scalable keypoints (BRISK) corner points, or the like, which is not limited in the embodiments of the present invention.
In some cases, it can be understood that the human eye usually recognizes an image within a small local region or a small window. If the grayscale of the image region inside this small window changes significantly when the window is moved slightly in all directions, a corner point can be considered to exist inside the window. If the grayscale of the image region inside the window does not change when the window is moved slightly in all directions, no corner point can be considered to exist inside the window.
How to perform corner detection on the first image frame and the second image frame to obtain the plurality of first corner points of the first image frame and the plurality of second corner points of the second image frame may be based on existing implementations, and details are not described herein again.
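For illustration only (this is not part of the claimed method), a minimal per-frame corner detection sketch using OpenCV is shown below; the detector choice and the thresholds (`max_corners`, `quality_level`, `min_distance`) are assumptions for the example rather than values from this application:

```python
import cv2
import numpy as np

def detect_corners(frame_bgr, max_corners=500, quality_level=0.01, min_distance=10):
    """Detect Harris-style corners in one image frame; returns (N, 2) pixel coords."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    corners = cv2.goodFeaturesToTrack(
        gray,
        maxCorners=max_corners,
        qualityLevel=quality_level,
        minDistance=min_distance,
        useHarrisDetector=True,  # Harris response; set False for Shi-Tomasi
    )
    return corners.reshape(-1, 2) if corners is not None else np.empty((0, 2))

# FAST corners, also named in this application, are an alternative detector:
# fast = cv2.FastFeatureDetector_create(threshold=20)
# keypoints = fast.detect(gray, None)
```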
303. Remove, from the plurality of first corner points, the first corner points in the region of the dynamic target included in the first image frame, to obtain a plurality of culled first corner points; and remove, from the plurality of second corner points, the second corner points in the region of the dynamic target included in the second image frame, to obtain a plurality of culled second corner points.
It should be understood that the first corner points in the region of the dynamic target included in the first image frame may be removed from the plurality of first corner points first, and then the second corner points in the region of the dynamic target included in the second image frame may be removed from the plurality of second corner points; alternatively, the second corner points may be removed first and then the first corner points; alternatively, the operation of removing the first corner points and the operation of removing the second corner points may be performed at the same time.
In one implementation, a pre-trained neural network may be used to detect the dynamic target in the first image frame and the dynamic target in the second image frame, so as to obtain the region of the dynamic target included in the first image frame and the region of the dynamic target included in the second image frame. The first corner points that fall within the detected dynamic-target region of the first image frame are then removed from the plurality of first corner points to obtain the culled first corner points, and the second corner points that fall within the detected dynamic-target region of the second image frame are removed from the plurality of second corner points to obtain the culled second corner points.
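A minimal sketch of this culling step is shown below, assuming the pre-trained detector returns axis-aligned bounding boxes `(x1, y1, x2, y2)` for the dynamic targets; a segmentation mask could be used in the same way:

```python
import numpy as np

def cull_dynamic_corners(corners, dynamic_boxes):
    """Remove corners that fall inside any detected dynamic-target region.

    corners: (N, 2) array of (u, v) pixel coordinates.
    dynamic_boxes: list of (x1, y1, x2, y2) boxes from a pre-trained detector.
    Returns the (M, 2) array of remaining, presumably static, corners.
    """
    keep = np.ones(len(corners), dtype=bool)
    for x1, y1, x2, y2 in dynamic_boxes:
        inside = (
            (corners[:, 0] >= x1) & (corners[:, 0] <= x2)
            & (corners[:, 1] >= y1) & (corners[:, 1] <= y2)
        )
        keep &= ~inside
    return corners[keep]
```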
In one implementation, an optical flow method may also be used to track the corner points and to supplement new corner points; each corner point is then assigned a unique identifier (id), and the normalized coordinates of the corner point in the camera coordinate system, its pixel coordinates, its pixel velocity, and the like are computed.
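One way such tracking could be organized is sketched below with pyramidal Lucas-Kanade optical flow; the id bookkeeping and the intrinsic matrix `K` are illustrative assumptions:

```python
import cv2
import numpy as np

def track_corners(prev_gray, cur_gray, prev_pts, prev_ids):
    """Track corners into the current frame, keeping each corner's unique id."""
    p0 = prev_pts.astype(np.float32).reshape(-1, 1, 2)
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, p0, None)
    ok = status.ravel() == 1
    return p1.reshape(-1, 2)[ok], prev_ids[ok]

def normalize(pts, K):
    """Convert pixel coordinates to normalized camera coordinates."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    return np.stack([(pts[:, 0] - cx) / fx, (pts[:, 1] - cy) / fy], axis=1)

# Pixel velocity of a tracked corner: (cur_pt - prev_pt) / frame_interval
```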
而当存在动态角点(也就是动态目标所在区域的角点)时,同一角点在不同相机状态下的观测量不仅包括由于相机自身运动引入的视差,还包括由于特征点本身运动带来的视差,本申请实施例中,通过剔除所述多个角点中所述动态目标所在区域的角点,可以保证视觉重投影误差项可靠,具体的,视觉重投影优化项可以参照如下公式:When there are dynamic corner points (that is, the corner points of the area where the dynamic target is located), the observation amount of the same corner point in different camera states includes not only the parallax caused by the motion of the camera itself, but also the parallax caused by the motion of the feature point itself. Parallax, in the embodiment of the present application, by eliminating the corner points in the area where the dynamic target is located among the plurality of corner points, the visual reprojection error term can be guaranteed to be reliable. Specifically, the visual reprojection optimization term can refer to the following formula:
$$r_{\mathcal{C}}\left(\hat{z}_{l}^{c_j},\,\mathcal{X}\right)=\hat{z}_{l}^{c_j}-\pi_c\!\left(\mathbf{P}_{l}^{c_j}\right)$$

where $\hat{z}_{l}^{c_j}$ is the observed coordinate of the $l$-th landmark point in the normalized camera coordinate system of the $j$-th camera frame, $\pi_c$ denotes the camera intrinsic projection, and $\mathbf{P}_{l}^{c_j}$ is the position of the $l$-th landmark point in the $j$-th camera frame as predicted from the estimated states (the published formula is rendered as an image; it is reconstructed here in the standard reprojection-residual form implied by these symbol definitions). Since the goal is to obtain the motion information of the camera itself, when dynamic corner points exist, the observations of the same corner point under different camera states include not only the parallax introduced by the camera's own motion but also the parallax introduced by the motion of the corner point itself. To avoid this situation, in this embodiment the dynamic corner points are removed during image processing to improve the accuracy of the visual residual term.
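For concreteness, one way such a residual could be evaluated is sketched below; the pose parameterization (camera-to-world rotation `R_wc` and camera center `t_wc`) is an assumption chosen for the example:

```python
import numpy as np

def reprojection_residual(z_obs_norm, p_world, R_wc, t_wc):
    """Residual between an observed normalized coordinate and the projection
    of the landmark's estimated world position into this camera frame."""
    p_cam = R_wc.T @ (p_world - t_wc)  # world point expressed in the camera frame
    proj = p_cam[:2] / p_cam[2]        # pinhole projection onto the normalized plane
    return z_obs_norm - proj
```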
For example, referring to FIG. 4, when the camera device is a forward-facing monocular camera, feature extraction and tracking and dynamic target removal may be performed on the image frames captured by the camera, and a monocular structure-from-motion (SFM) procedure is then used to obtain visual residuals free of the influence of dynamic feature points (which may also be referred to as dynamic corner points).
304. Determine, based on the culled first corner points and the culled second corner points, a pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
In this embodiment of the present application, after the first corner points in the region of the dynamic target included in the first image frame are removed from the plurality of first corner points to obtain the culled first corner points, and the second corner points in the region of the dynamic target included in the second image frame are removed from the plurality of second corner points to obtain the culled second corner points, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame may be determined based on the culled first corner points and the culled second corner points.
This embodiment of the present application may be applied to a target vehicle on which the camera device, an inertial measurement unit (IMU), and a wheel speedometer are fixedly installed. The motion state data of the target vehicle, measured by the IMU during the period from when the camera device captures the first image frame to when it captures the second image frame, may also be acquired, as well as the wheel speed data of the target vehicle measured by the wheel speedometer during the same period. The pose change of the camera device when capturing the second image relative to when capturing the first image may then be determined based on the motion state data, the wheel speed data, the culled first corner points, and the culled second corner points. The motion state data may be vehicle acceleration data and angular velocity data measured by the inertial sensor (IMU), and the wheel speed data may be vehicle wheel rotation speed, steering wheel angle data, and the like.
Specifically, in one implementation, a first pose change of the camera device when capturing the second image frame relative to when capturing the first image frame may be determined based on the disparity of the culled first corner points and the culled second corner points in their respective image frames, where the first pose change does not include scale information. A second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame is then determined based on the motion state data, the wheel speed data, and the first pose change, where the second pose change includes scale information. Nonlinear optimization is then performed on the second pose change to obtain the pose change. In the nonlinear optimization process, the nonlinear optimization may be performed on the second pose change according to a preset optimization function, where the preset optimization function includes a wheel speedometer residual term.
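The up-to-scale step could, for example, use the essential matrix between the two culled corner sets, as in the sketch below; restoring metric scale from an integrated wheel-odometry distance is an illustrative assumption rather than the exact scheme of this application:

```python
import cv2

def relative_pose_up_to_scale(pts1, pts2, K):
    """Rotation and unit-norm translation between two frames from matched
    (culled) corners; the translation carries no scale information."""
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t_unit, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t_unit

def apply_wheel_scale(t_unit, wheel_distance):
    """Restore metric scale using the distance integrated from wheel speeds."""
    return t_unit * wheel_distance
```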
Nonlinear optimization refers to finding, for a given objective function, an optimal set of values $x^{*}=\arg\min f(x)$. According to derivative theory, valid values of $x$ can be obtained by solving the derivative equation $\nabla f(x)=0$; when $f(x)$ is a nonlinear function, this is a nonlinear optimization. In this embodiment of the present application, the optimization function may consist of five terms, which may be expressed as: optimization function = prior residual + IMU residual term + wheel speedometer residual term + visual reprojection error term + loop closure detection reprojection error term.
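Schematically, the five residual terms could be stacked into a single nonlinear least-squares objective as follows; the residual callables are placeholders standing in for the terms named above:

```python
import numpy as np
from scipy.optimize import least_squares

def total_residual(x, prior, imu_terms, wheel_terms, visual_terms, loop_terms):
    """Stack all residual blocks into one vector for nonlinear least squares."""
    blocks = [prior(x)]
    for r in (*imu_terms, *wheel_terms, *visual_terms, *loop_terms):
        blocks.append(r(x))  # each r(x) returns one residual vector
    return np.concatenate(blocks)

# result = least_squares(total_residual, x0,
#                        args=(prior, imu_terms, wheel_terms,
#                              visual_terms, loop_terms))
```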
In this embodiment of the present application, the culled first corner points and the culled second corner points may be timestamp-aligned with the motion state data and the wheel speed data. Because the data rates differ, a single frame of image data (the culled corner points) corresponds to multiple motion state data samples and wheel speed data samples. Pre-integration may then be performed on the motion state data and wheel speed data of each image frame to provide an initial pose value for that image. A sliding-window method is then used: purely visual data is used for initialization, yielding the poses of the camera device when capturing all image frames in the sliding window and their positions without scale information (the missing scale may also be referred to as depth). The motion state data and wheel speed data are combined to recover the missing scale information, the positions of all corner points are recomputed, and nonlinear optimization is then performed on the recovered scale information and the inverse depths of the corner points to obtain the pose change.
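The timestamp alignment could be organized as in the following sketch, which groups the higher-rate IMU and wheel samples between consecutive image timestamps; the `(timestamp, measurement)` sample format is an assumption for illustration:

```python
def bucket_by_frame(frame_ts, samples):
    """Group high-rate sensor samples between consecutive frame timestamps.

    frame_ts: sorted list of image timestamps.
    samples: time-sorted list of (timestamp, measurement) tuples (IMU or wheel).
    Returns one list of samples per inter-frame interval.
    """
    buckets = [[] for _ in range(len(frame_ts) - 1)]
    i = 0
    for t, meas in samples:
        while i < len(frame_ts) - 2 and t >= frame_ts[i + 1]:
            i += 1
        if frame_ts[i] <= t < frame_ts[i + 1]:
            buckets[i].append((t, meas))
    return buckets
```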
Specifically, refer to the following formula:
$$r_{\mathcal{B}}\left(\hat{z}_{b_{k+1}}^{b_k},\,\mathcal{X}\right)=\begin{bmatrix}\mathbf{R}_{w}^{b_k}\left(\mathbf{p}_{b_{k+1}}^{w}-\mathbf{p}_{b_k}^{w}-\mathbf{v}_{b_k}^{w}\Delta t_k+\tfrac{1}{2}\mathbf{g}^{w}\Delta t_k^{2}\right)-\hat{\boldsymbol{\alpha}}_{b_{k+1}}^{b_k}\\[2pt]\mathbf{R}_{w}^{b_k}\left(\mathbf{v}_{b_{k+1}}^{w}-\mathbf{v}_{b_k}^{w}+\mathbf{g}^{w}\Delta t_k\right)-\hat{\boldsymbol{\beta}}_{b_{k+1}}^{b_k}\\[2pt]2\left[\left(\hat{\boldsymbol{\gamma}}_{b_{k+1}}^{b_k}\right)^{-1}\otimes\left(\mathbf{q}_{b_k}^{w}\right)^{-1}\otimes\mathbf{q}_{b_{k+1}}^{w}\right]_{xyz}\\[2pt]\mathbf{b}_{a,k+1}-\mathbf{b}_{a,k}\\[2pt]\mathbf{b}_{\omega,k+1}-\mathbf{b}_{\omega,k}\end{bmatrix}$$

where $\mathbf{p}_{b_k}^{w}$ is the position of the $k$-th frame in the world coordinate system, $\mathbf{q}_{b_k}^{w}$ is the quaternion representation of the attitude of the $k$-th frame in the world coordinate system (with $\mathbf{R}_{w}^{b_k}$ the corresponding rotation from the world frame into the $k$-th body frame), $\mathbf{v}_{b_k}^{w}$ is the velocity of the $k$-th frame in the world coordinate system, $\hat{\boldsymbol{\alpha}}_{b_{k+1}}^{b_k}$, $\hat{\boldsymbol{\gamma}}_{b_{k+1}}^{b_k}$, and $\hat{\boldsymbol{\beta}}_{b_{k+1}}^{b_k}$ are the pre-integrated position, angle, and velocity between the two frames, $\mathbf{b}_{a,k}$ is the accelerometer bias of the $k$-th frame, and $\mathbf{b}_{\omega,k}$ is the gyroscope bias of the $k$-th frame; $\Delta t_k$ denotes the time between the two frames and $\mathbf{g}^{w}$ denotes gravity in the world coordinate system. (The published formula is rendered as an image; it is reconstructed here in the standard IMU pre-integration residual form consistent with these symbol definitions.)
It can be seen from the above formula that the estimated quantities include the accelerometer bias $\mathbf{b}_a$ and the gyroscope bias $\mathbf{b}_\omega$. Under two-dimensional motion the excitation is insufficient and the accelerometer bias is hard to estimate, which causes the bias values to converge too slowly and makes the pose estimation inaccurate. Therefore, a wheel speedometer residual term is introduced in this embodiment of the present application, which may be specifically shown in the following formula:
$$r_{\mathcal{O}}\left(\hat{z}_{o_{k+1}}^{o_k},\,\mathcal{X}\right)=\begin{bmatrix}\mathbf{R}_{w}^{b_k}\left(\mathbf{p}_{b_{k+1}}^{w}-\mathbf{p}_{b_k}^{w}\right)-\hat{\boldsymbol{\alpha}}_{o_{k+1}}^{o_k}\\[2pt]2\left[\left(\hat{\boldsymbol{\gamma}}_{o_{k+1}}^{o_k}\right)^{-1}\otimes\left(\mathbf{q}_{b_k}^{w}\right)^{-1}\otimes\mathbf{q}_{b_{k+1}}^{w}\right]_{xyz}\end{bmatrix}$$

where $\mathbf{q}_{b_k}^{w}$ is the quaternion representation of the attitude (angle) of the $k$-th frame in the world coordinate system, $\mathbf{p}_{b_k}^{w}$ is the position of the $k$-th frame in the world coordinate system, and $\hat{\boldsymbol{\alpha}}_{o_{k+1}}^{o_k}$ and $\hat{\boldsymbol{\gamma}}_{o_{k+1}}^{o_k}$ are the position and angle pre-integrated between the two frames from the wheel speedometer. (The published formula is rendered as an image; it is reconstructed here in a standard wheel-odometry residual form consistent with these symbol definitions.)
For example, referring to FIG. 4, the wheel speed data measured by the wheel speedometer (including wheel rotation speed data, steering wheel data, and the like) may be pre-integrated to obtain a pre-integration result, and the pre-integration residual is then established based on that result.
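A sketch of planar wheel-odometry pre-integration between two frames is shown below; the unicycle model with one forward speed and one yaw rate per sample is an illustrative assumption about how the wheel rotation speed and steering wheel angle are converted:

```python
import math

def preintegrate_wheel(samples):
    """Integrate the relative planar motion (dx, dy, dyaw) between two frames.

    samples: list of (dt, v, omega) tuples, where dt is the sample interval,
    v the forward speed from wheel rotation, and omega the yaw rate derived
    from the steering model.
    """
    x = y = yaw = 0.0
    for dt, v, omega in samples:
        x += v * math.cos(yaw) * dt
        y += v * math.sin(yaw) * dt
        yaw += omega * dt
    return x, y, yaw
```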
In this embodiment of the present application, a wheel speedometer residual term is added to address the disadvantage that visual-inertial navigation systems have a large initialization error under two-dimensional motion. The pose is initialized and computed jointly from the vision, IMU, and wheel speedometer information, so that when the IMU converges poorly, the wheel speedometer can compensate, thereby improving the result.
In one implementation, loop closure detection may also be performed between the current keyframe and the map: a detected frame is a loop closure frame, and the co-visible feature points between the loop closure frame and the frames in the sliding window are found; a reprojection error term is established and added to the nonlinear optimization; four-degree-of-freedom optimization is performed on the keyframes that have been optimized and have slid out of the window, together with their associated frames; and the optimized keyframes are inserted into the map. In actual use, the map, that is, loop closure detection, can be enabled or disabled. When enabled, mapping is performed while positioning, and the final positions obtained are all relative to the first fixed camera position; this can subsequently be fixed to a specific reference frame through a coordinate transformation.
An embodiment of the present application provides a pose determination method, where the method includes: acquiring a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device, and each of the first image frame and the second image frame includes a dynamic target; performing corner detection on each of the first image frame and the second image frame, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame; removing, from the plurality of first corner points, the first corner points in the region of the dynamic target included in the first image frame, to obtain a plurality of culled first corner points; removing, from the plurality of second corner points, the second corner points in the region of the dynamic target included in the second image frame, to obtain a plurality of culled second corner points; and determining, based on the culled first corner points and the culled second corner points, a pose change of the camera device when capturing the second image frame relative to when capturing the first image frame. When dynamic corner points exist, the observations of the same corner point under different camera states include not only the parallax introduced by the camera's own motion but also the parallax introduced by the motion of the corner point itself. In this embodiment, to address the problem of large positioning errors in a dynamic environment, the corner points in the region of the dynamic target are removed from the plurality of corner points, making the visual reprojection error more reliable and thereby overcoming the pose-change determination error introduced by dynamic features.
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of a pose determination apparatus provided by an embodiment of the present application. As shown in FIG. 5, the pose determination apparatus 500 provided by this embodiment of the present application includes:
an acquisition module 501, configured to acquire a first image frame and a second image frame captured by a camera device, where the first image frame and the second image frame are adjacent image frames captured by the camera device, and each of the first image frame and the second image frame includes a dynamic target;
a corner extraction module 502, configured to perform corner detection on each of the first image frame and the second image frame, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame;
a culling module 503, configured to remove, from the plurality of first corner points, the first corner points in the region of the dynamic target included in the first image frame, to obtain a plurality of culled first corner points,
and to remove, from the plurality of second corner points, the second corner points in the region of the dynamic target included in the second image frame, to obtain a plurality of culled second corner points; and
a positioning module 504, configured to determine, based on the culled first corner points and the culled second corner points, a pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
In a possible implementation, the plurality of first corner points and the plurality of second corner points include one of the following: features from accelerated segment test (FAST) corner points, Harris corner points, and binary robust invariant scalable keypoints (BRISK) corner points.
In a possible implementation, the apparatus is applied to a target vehicle on which the camera device, an inertial measurement unit (IMU), and a wheel speedometer are fixedly installed; the acquisition module is configured to acquire motion state data of the target vehicle, measured by the IMU, during the period from when the camera device captures the first image frame to when it captures the second image frame;
and to acquire wheel speed data of the target vehicle, measured by the wheel speedometer, during the period from when the camera device captures the first image frame to when it captures the second image frame.
Correspondingly, the positioning module is configured to determine, based on the disparity of the culled first corner points and the culled second corner points in their respective image frames, a first pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the first pose change does not include scale information;
determine, based on the motion state data, the wheel speed data, and the first pose change, a second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, where the second pose change includes scale information; and
perform nonlinear optimization on the second pose change to obtain the pose change.
In a possible implementation, the positioning module is configured to perform the nonlinear optimization on the second pose change according to a preset optimization function, where the preset optimization function includes a wheel speedometer residual term.
In a possible implementation, the apparatus further includes:
a dynamic target detection module, configured to detect the dynamic target in the first image frame and the dynamic target in the second image frame through a pre-trained neural network, to obtain the region of the dynamic target included in the first image frame and the region of the dynamic target included in the second image frame.
Based on the same concept, referring to FIG. 6, an embodiment of the present application provides a pose determination apparatus 600, including a transceiver 610, a processor 620, and a memory 630. The memory 630 is configured to store programs, instructions, or code; the processor 620 is configured to execute the programs, instructions, or code in the memory 630.
The transceiver 610 is configured to receive a first image frame and a second image frame input by a camera device.
The processor 620 is configured to: perform corner detection on each of the first image frame and the second image frame, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame; remove, from the plurality of first corner points, the first corner points in the region of the dynamic target included in the first image frame, to obtain a plurality of culled first corner points; remove, from the plurality of second corner points, the second corner points in the region of the dynamic target included in the second image frame, to obtain a plurality of culled second corner points; and determine, based on the culled first corner points and the culled second corner points, a pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
The processor 620 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 620 or by instructions in the form of software. The processor 620 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed with reference to the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 630, and the processor 620 reads the information in the memory 630 and performs the foregoing method steps in combination with its hardware.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The embodiments of the present application are described with reference to the flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or the other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, where the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps are performed on the computer or the other programmable device to produce computer-implemented processing, and the instructions executed on the computer or the other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the embodiments of the present application without departing from the spirit and scope of the present application. Thus, if these modifications and variations of the embodiments of the present application fall within the scope of the claims of the present application and their equivalent technologies, the present application is also intended to include these modifications and variations.

Claims (14)

1. A pose determination method, wherein the method comprises:
    acquiring a first image frame and a second image frame captured by a camera device, wherein the first image frame and the second image frame are adjacent image frames captured by the camera device, and each of the first image frame and the second image frame comprises a dynamic target;
    performing corner detection on each of the first image frame and the second image frame, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame;
    removing, from the plurality of first corner points, the first corner points in a region of the dynamic target comprised in the first image frame, to obtain a plurality of culled first corner points;
    removing, from the plurality of second corner points, the second corner points in a region of the dynamic target comprised in the second image frame, to obtain a plurality of culled second corner points; and
    determining, based on the culled first corner points and the culled second corner points, a pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
2. The method according to claim 1, wherein the plurality of first corner points and the plurality of second corner points comprise one of the following: features from accelerated segment test (FAST) corner points, Harris corner points, and binary robust invariant scalable keypoints (BRISK) corner points.
3. The method according to claim 1 or 2, wherein the method is applied to a target vehicle, the camera device, an inertial measurement unit (IMU), and a wheel speedometer are fixedly installed on the target vehicle, and the method further comprises:
    acquiring motion state data of the target vehicle, measured by the IMU, during a period from when the camera device captures the first image frame to when the camera device captures the second image frame; and
    acquiring wheel speed data of the target vehicle, measured by the wheel speedometer, during the period from when the camera device captures the first image frame to when the camera device captures the second image frame;
    wherein the determining, based on the culled first corner points and the culled second corner points, a pose change of the camera device when capturing the second image frame relative to when capturing the first image frame comprises:
    determining, based on a disparity of the culled first corner points and the culled second corner points in their respective image frames, a first pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, wherein the first pose change does not comprise scale information;
    determining, based on the motion state data, the wheel speed data, and the first pose change, a second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, wherein the second pose change comprises scale information; and
    performing nonlinear optimization on the second pose change to obtain the pose change.
4. The method according to claim 3, wherein the performing nonlinear optimization on the second pose change comprises:
    performing the nonlinear optimization on the second pose change according to a preset optimization function, wherein the preset optimization function comprises a wheel speedometer residual term.
5. The method according to any one of claims 1 to 4, wherein the method further comprises:
    detecting the dynamic target in the first image frame and the dynamic target in the second image frame through a pre-trained neural network, to obtain the region of the dynamic target comprised in the first image frame and the region of the dynamic target comprised in the second image frame.
  6. The method according to any one of claims 1 to 5, wherein the dynamic target is a vehicle.
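[Illustrative note – not part of the claims. Claims 5 and 6 detect vehicles with a pre-trained neural network, and claim 1 culls corners inside the detected regions. The sketch below stands in a torchvision Faster R-CNN for the claim's network; the model choice, the "DEFAULT" weights argument (torchvision >= 0.13), and the COCO label ids for vehicles are all assumptions.]

    import torch
    import torchvision

    # Pre-trained detector standing in for the claim's neural network; any
    # detector that outputs vehicle bounding boxes would serve the same role.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    VEHICLE_LABELS = {3, 6, 8}  # COCO ids for car, bus, truck (an assumption)

    def vehicle_boxes(image_rgb, score_thresh=0.5):
        """Return [x1, y1, x2, y2] boxes of detected vehicles in an HxWx3 uint8 image."""
        tensor = torch.from_numpy(image_rgb).permute(2, 0, 1).float() / 255.0
        with torch.no_grad():
            out = model([tensor])[0]
        return [b.tolist()
                for b, l, s in zip(out["boxes"], out["labels"], out["scores"])
                if int(l) in VEHICLE_LABELS and float(s) >= score_thresh]

    def cull_corners(corners, boxes):
        """Drop corners that fall inside any detected vehicle region (the culling step)."""
        def inside(pt, box):
            x, y = pt
            x1, y1, x2, y2 = box
            return x1 <= x <= x2 and y1 <= y <= y2
        return [pt for pt in corners if not any(inside(pt, b) for b in boxes)]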
  7. A pose determination apparatus, wherein the apparatus comprises:
    an acquisition module, configured to acquire a first image frame and a second image frame captured by a camera device, wherein the first image frame and the second image frame are adjacent image frames captured by the camera device, and each of the first image frame and the second image frame includes a dynamic target;
    a corner extraction module, configured to perform corner detection on the first image frame and on the second image frame respectively, to obtain a plurality of first corner points of the first image frame and a plurality of second corner points of the second image frame;
    a culling module, configured to cull, from the plurality of first corner points, the first corner points located in the region of the first image frame in which the dynamic target is located, to obtain a plurality of culled first corner points,
    and to cull, from the plurality of second corner points, the second corner points located in the region of the second image frame in which the dynamic target is located, to obtain a plurality of culled second corner points; and
    a positioning module, configured to determine, according to the plurality of culled first corner points and the plurality of culled second corner points, the pose change of the camera device when capturing the second image frame relative to when capturing the first image frame.
  8. The apparatus according to claim 7, wherein the plurality of first corner points and the plurality of second corner points comprise one of the following: features-from-accelerated-segment-test (FAST) corner points, Harris corner points, and binary robust invariant scalable keypoints (BRISK) corner points.
  9. The apparatus according to claim 7 or 8, wherein the apparatus is applied to a target vehicle on which the camera device, an inertial measurement unit (IMU), and a wheel speedometer are fixedly mounted; the acquisition module is configured to acquire motion state data of the target vehicle, measured by the IMU, during the period from when the camera device captures the first image frame to when it captures the second image frame,
    and to acquire wheel speed data of the target vehicle, measured by the wheel speedometer, during the period from when the camera device captures the first image frame to when it captures the second image frame;
    correspondingly, the positioning module is configured to determine, according to the parallax of the plurality of culled first corner points and the plurality of culled second corner points in their respective image frames, a first pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, wherein the first pose change does not include scale information;
    to determine, according to the motion state data, the wheel speed data, and the first pose change, a second pose change of the camera device when capturing the second image frame relative to when capturing the first image frame, wherein the second pose change includes scale information; and
    to perform nonlinear optimization on the second pose change to obtain the pose change.
  10. The apparatus according to claim 9, wherein the positioning module is configured to perform nonlinear optimization on the second pose change according to a preset optimization function, wherein the preset optimization function includes a wheel speedometer residual term.
  11. The apparatus according to any one of claims 7 to 10, wherein the apparatus further comprises:
    a dynamic target detection module, configured to detect a dynamic target in the first image frame and a dynamic target in the second image frame by means of a pre-trained neural network, to obtain the region where the dynamic target included in the first image frame is located and the region where the dynamic target included in the second image frame is located.
  12. The apparatus according to any one of claims 7 to 11, wherein the dynamic target is a vehicle.
  13. A non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium contains computer instructions for executing the pose determination method according to any one of claims 1 to 6.
  14. A computing device, wherein the computing device comprises a memory and a processor, the memory stores code, and the processor is configured to obtain the code so as to execute the pose determination method according to any one of claims 1 to 6.
PCT/CN2021/127380 2020-10-31 2021-10-29 Pose determination method and related device thereof WO2022089577A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011199017.6 2020-10-31
CN202011199017.6A CN114445490A (en) 2020-10-31 2020-10-31 Pose determination method and related equipment thereof

Publications (1)

Publication Number Publication Date
WO2022089577A1 true WO2022089577A1 (en) 2022-05-05

Family

ID=81357589

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/127380 WO2022089577A1 (en) 2020-10-31 2021-10-29 Pose determination method and related device thereof

Country Status (2)

Country Link
CN (1) CN114445490A (en)
WO (1) WO2022089577A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108731667A (en) * 2017-04-14 2018-11-02 百度在线网络技术(北京)有限公司 The method and apparatus of speed and pose for determining automatic driving vehicle
CN108364319A (en) * 2018-02-12 2018-08-03 腾讯科技(深圳)有限公司 Scale determines method, apparatus, storage medium and equipment
CN109387204A (en) * 2018-09-26 2019-02-26 东北大学 The synchronous positioning of the mobile robot of dynamic environment and patterning process in faced chamber
CN111738085A (en) * 2020-05-22 2020-10-02 华南理工大学 System construction method and device for realizing automatic driving and simultaneously positioning and mapping

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114913235A (en) * 2022-07-18 2022-08-16 合肥工业大学 Pose estimation method and device and intelligent robot
CN116753907A (en) * 2023-08-18 2023-09-15 中国电建集团昆明勘测设计研究院有限公司 Method, device, equipment and storage medium for detecting underground deep cavity
CN116753907B (en) * 2023-08-18 2023-11-10 中国电建集团昆明勘测设计研究院有限公司 Method, device, equipment and storage medium for detecting underground deep cavity

Also Published As

Publication number Publication date
CN114445490A (en) 2022-05-06

Similar Documents

Publication Publication Date Title
CN110543814B (en) Traffic light identification method and device
CN112639883B (en) Relative attitude calibration method and related device
CN112640417B (en) Matching relation determining method and related device
WO2022001773A1 (en) Trajectory prediction method and apparatus
WO2021102955A1 (en) Path planning method for vehicle and path planning apparatus for vehicle
WO2021217420A1 (en) Lane tracking method and apparatus
CN110930323B (en) Method and device for removing reflection of image
CN112534483B (en) Method and device for predicting vehicle exit
WO2021057344A1 (en) Data presentation method and terminal device
CN112512887B (en) Driving decision selection method and device
CN113498529B (en) Target tracking method and device
WO2022089577A1 (en) Pose determination method and related device thereof
US20230227052A1 (en) Fault diagnosis method and fault diagnosis device for vehicle speed measurement device
WO2022204855A1 (en) Image processing method and related terminal device
CN112543877B (en) Positioning method and positioning device
CN112810603B (en) Positioning method and related product
WO2022051951A1 (en) Lane line detection method, related device, and computer readable storage medium
US20230048680A1 (en) Method and apparatus for passing through barrier gate crossbar by vehicle
EP4307251A1 (en) Mapping method, vehicle, computer readable storage medium, and chip
WO2021163846A1 (en) Target tracking method and target tracking apparatus
CN115398272A (en) Method and device for detecting passable area of vehicle
WO2022022284A1 (en) Target object sensing method and apparatus
WO2021159397A1 (en) Vehicle travelable region detection method and detection device
CN113022573B (en) Road structure detection method and device
CN113128497A (en) Target shape estimation method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21885303

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21885303

Country of ref document: EP

Kind code of ref document: A1