WO2023184869A1 - Semantic map construction and positioning method and device for indoor parking lots - Google Patents

Semantic map construction and positioning method and device for indoor parking lots

Info

Publication number
WO2023184869A1
WO2023184869A1 (PCT/CN2022/117351)
Authority
WO
WIPO (PCT)
Prior art keywords
semantic
features
vehicle
semantic map
bird's-eye view
Prior art date
Application number
PCT/CN2022/117351
Other languages
English (en)
French (fr)
Inventor
曹旭东
赵天坤
陈泽
Original Assignee
合众新能源汽车股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 合众新能源汽车股份有限公司
Publication of WO2023184869A1 publication Critical patent/WO2023184869A1/zh

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/50 - Depth or shape recovery
    • G06T7/55 - Depth or shape recovery from multiple images
    • G06T7/579 - Depth or shape recovery from multiple images from motion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/70 - Determining position or orientation of objects or cameras
    • G06T7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/10 - Image acquisition
    • G06V10/16 - Image acquisition using multiple overlapping images; Image stitching
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 - Detecting or categorising vehicles
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Definitions

  • The present invention relates to the technical field of indoor positioning, and in particular to a semantic map construction and positioning method and device for indoor parking lots.
  • Simultaneous Localization And Mapping (SLAM) can be divided into laser SLAM and visual SLAM according to the sensors used; the resulting maps are correspondingly laser point cloud maps collected directly by lidar and visual point cloud maps converted from images collected by cameras.
  • Compared with laser SLAM, visual SLAM has a huge cost advantage, but traditional visual SLAM is limited in accuracy and is less robust to environmental changes, so it has not been applied at the scale of laser SLAM. How to improve the mapping accuracy and robustness of visual SLAM while reducing the redundancy and storage consumption of visual information has therefore become an urgent problem to be solved.
  • To solve the above technical problems, or at least partially solve them, embodiments of the present invention provide a semantic map construction and positioning method and device for indoor parking lots, an electronic device and a computer-readable medium.
  • A semantic map construction and positioning method for indoor parking lots is provided, including: acquiring original images collected during vehicle operation, the original images at least including a front-view original image; stitching the original images into a bird's-eye view; performing semantic segmentation on the bird's-eye view to obtain a segmented image with semantic features; extracting column features from the front-view original image and parking space corner features from the bird's-eye view; generating a semantic map from the semantic features, the column features and the parking space corner features, and calculating the pose of the vehicle in the semantic map; performing nonlinear optimization on the semantic map according to that pose; and constraining and optimizing the pose of the vehicle in the semantic map based on the optimized semantic map, multiple adjacent frames of bird's-eye views, and the odometer information corresponding to those frames.
  • A semantic map construction and positioning device for indoor parking lots is provided, including:
  • An image acquisition module used to acquire original images collected during vehicle operation, where the original images at least include front-view original images
  • An image splicing module used to splice the original images into a bird's-eye view
  • a semantic segmentation module used to perform semantic segmentation processing on the bird's-eye view to obtain segmented images with semantic features
  • An image detection module used to perform feature extraction on the front-view original image to obtain column features, and to perform feature extraction on the bird's-eye view to obtain parking space corner features;
  • a map reconstruction module configured to generate a semantic map based on the semantic features, the column features and the parking space corner features, and to calculate the pose of the vehicle in the semantic map;
  • An optimization module configured to perform nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map, and to constrain and optimize the pose of the vehicle in the semantic map based on the optimized semantic map, multiple adjacent frames of bird's-eye views, and the odometer information corresponding to those frames.
  • An electronic device is provided, including one or more processors and a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the semantic map construction and positioning method for indoor parking lots.
  • A computer-readable medium is provided, on which a computer program is stored; when the program is executed by a processor, the semantic map construction and positioning method for indoor parking lots is implemented.
  • A computer program product is provided, comprising computer-readable code which, when run on an electronic device, causes the electronic device to execute the semantic map construction and positioning method for indoor parking lots.
  • In the embodiments of the present invention, the original images are first stitched into a bird's-eye view and semantic segmentation is performed on it to obtain semantic features; feature detection is then performed on the front-view original image and the bird's-eye view respectively, to obtain the column features in the front-view original image and the parking space corner features in the bird's-eye view; mapping and vehicle positioning are then carried out using the semantic features, column features and parking space corner features.
  • Finally, the semantic features and odometer information are used for nonlinear constrained optimization of the semantic map and the vehicle pose, which achieves low-cost, high-precision and highly robust real-time positioning. Only visual features are used, and sensors such as GPS and lidar are not required, which effectively reduces cost and allows application in a wider range of scenarios, including scenarios without GPS signal. Two kinds of feature information, semantic features and detection features, are used, making fuller use of the visual sensors and improving positioning accuracy.
  • Figure 1 schematically shows multiple coordinate systems in the semantic map construction and positioning method for indoor parking lots according to the embodiment of the present invention
  • Figure 2 schematically shows a flow chart of the semantic map construction and positioning method for indoor parking lots according to an embodiment of the present invention
  • Figure 3 schematically shows a semantic segmentation image in the semantic map construction and positioning method for indoor parking lots according to the embodiment of the present invention
  • Figure 4 schematically shows the column features in the semantic map construction and positioning method for indoor parking lots according to the embodiment of the present invention
  • Figure 5 schematically shows a schematic diagram of the sub-process of the semantic map construction and positioning method for indoor parking lots according to the embodiment of the present invention
  • Figure 6 schematically shows a structural diagram of a semantic map construction and positioning device for an indoor parking lot according to an embodiment of the present invention
  • Figure 7 schematically shows a structural diagram of an electronic device according to an embodiment of the present invention.
  • The terms "first", "second", etc. in the description and claims of this application are used to distinguish similar objects and do not describe a specific order or sequence. It is to be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in orders other than those illustrated or described herein. Objects distinguished by "first", "second", etc. are usually of one type, and the number of objects is not limited; for example, the first object may be one or more than one.
  • "And/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
  • The embodiments of the present invention involve a world coordinate system, a bird's-eye view coordinate system (which can also be called the virtual top-view camera coordinate system), a front-view camera coordinate system, a vehicle coordinate system (which can also be called the odometer coordinate system), and a pixel coordinate system.
  • Figure 1 schematically shows each of the above coordinate systems. As shown in Figure 1, the bird's-eye view coordinate system is described by x_t, y_t, z_t.
  • The origin of the bird's-eye view coordinate system is located at the intersection of the center line of the left and right fisheye cameras and the center line of the front and rear fisheye cameras; horizontally to the right (pointing toward the right fisheye camera) is the positive x-axis direction, horizontally backward is the positive y-axis direction, and vertically downward is the positive z-axis direction.
  • The front-view camera coordinate system is described by x_c, y_c, z_c. Its origin is located at the center of the front-view camera, with horizontally forward as the positive z-axis, horizontally to the right as the positive x-axis, and vertically downward as the positive y-axis.
  • The vehicle body coordinate system is described by x_b, y_b, z_b. Its origin is the vertical projection of the center of the rear axle onto the ground plane, with horizontally forward as the positive x-axis, horizontally to the left as the positive y-axis, and vertically upward as the positive z-axis.
  • the world coordinate system is described by x w , y w , z w .
  • the positive direction of the x-axis is horizontally forward, the positive direction of the y-axis is horizontally to the left, and the positive direction of the z-axis is vertically upward.
  • the world coordinate system is the vehicle coordinate system of the first frame, that is, the world coordinate system is the vehicle coordinate system when the vehicle just started.
  • the pixel coordinate system is described by u, v.
  • The mapping relationship between pixels in the bird's-eye view and pixel coordinates in the original fisheye image is: p_cuv = k_c · T_tc · k_t^-1 · p_tuv, where p_tuv is the coordinate of the pixel in the bird's-eye view, p_cuv is the corresponding pixel coordinate in the original fisheye image, k_t is the intrinsic matrix of the virtual top-view camera, k_c is the intrinsic matrix of the fisheye camera, and T_tc is the transformation matrix from the virtual top-view camera to the fisheye camera.
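A minimal Python sketch of this mapping, under simplifying assumptions: k_t and k_c are treated as 3x3 pinhole-style intrinsic matrices and T_tc as a 3x3 planar transform, so the chain k_c · T_tc · k_t^-1 can be applied directly to homogeneous pixel coordinates. A real fisheye camera additionally needs a distortion model, which the patent does not detail; all numbers below are made up.

```python
import numpy as np

def birdseye_to_fisheye(p_tuv, k_t, k_c, T_tc):
    """Map a bird's-eye-view pixel (u, v) to a fisheye-image pixel via
    p_cuv = k_c * T_tc * k_t^-1 * p_tuv (homogeneous coordinates)."""
    p = np.array([p_tuv[0], p_tuv[1], 1.0])  # homogeneous BEV pixel
    ray = np.linalg.inv(k_t) @ p             # back-project with k_t^-1
    cam = T_tc @ ray                         # virtual top-view -> fisheye frame
    uv = k_c @ cam                           # project with fisheye intrinsics
    return uv[:2] / uv[2]                    # dehomogenize

# Illustrative intrinsics; T_tc = identity keeps the pixel unchanged.
k = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
print(birdseye_to_fisheye((320.0, 240.0), k, k, np.eye(3)))  # -> [320. 240.]
```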
  • Figure 2 schematically shows a flow chart of a semantic map construction and positioning method for an indoor parking lot according to an embodiment of the present invention. As shown in Figure 2, the method includes:
  • Step 201 Obtain original images collected during vehicle operation, where the original images at least include front-view original images.
  • the environmental image during vehicle operation is collected through a vehicle-mounted camera, and the environmental image is the original image.
  • a forward-looking fisheye camera is installed on the vehicle, and the forward-looking fisheye camera is installed on the front of the vehicle body.
  • The front-view fisheye camera can be installed at a centered position on the upper edge of the windshield or at a centered position above the front license plate.
  • the original image collected by the forward-looking fisheye camera is the forward-looking original image.
  • a front-view pinhole camera and at least one fisheye camera are installed on the vehicle.
  • the environmental image collected by the front-view pinhole camera is the front-view original image.
  • The above at least one fisheye camera is installed around the vehicle; for example, it can be installed at a centered position above the front license plate, a centered position above the rear license plate, below the left rearview mirror, or below the right rearview mirror.
  • Preferably, four fisheye cameras are installed on the vehicle, mounted respectively at a centered position above the front license plate, a centered position above the rear license plate, below the left rearview mirror, and below the right rearview mirror.
  • the four fish-eye cameras can also be called surround-view fish-eye cameras.
  • Step 202 Stitch the original images into a bird's-eye view.
  • The IPM algorithm (Inverse Perspective Mapping) can be used to stitch the original images into a bird's-eye view.
  • When only a front-view fisheye camera is installed, the IPM algorithm is used to stitch the original images collected by that camera into a bird's-eye view.
  • When at least one fisheye camera and a front-view pinhole camera are installed, the IPM algorithm is used to stitch the original images collected by the at least one fisheye camera into a bird's-eye view.
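The patent names only the IPM algorithm; the sketch below shows one common realization, assuming ground-plane point correspondences are known from calibration (the point pairs and file names are placeholders, and fisheye undistortion is omitted). Each camera image is warped into its own top-down tile, and the tiles are then pasted into one bird's-eye mosaic.

```python
import cv2
import numpy as np

def ipm_tile(img, src_pts, dst_pts, out_size):
    """Warp a ground-facing camera image into a top-down (BEV) tile."""
    H, _ = cv2.findHomography(np.float32(src_pts), np.float32(dst_pts))
    return cv2.warpPerspective(img, H, out_size)

img = cv2.imread("front.png")  # hypothetical undistorted front image
# Image corners of a known ground patch and the same patch in BEV pixels:
src = [(420, 600), (860, 600), (1100, 720), (180, 720)]
dst = [(300, 0), (500, 0), (500, 200), (300, 200)]
bev = ipm_tile(img, src, dst, (800, 800))
cv2.imwrite("bev_front.png", bev)
```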
  • Step 203 Perform semantic segmentation processing on the bird's-eye view to obtain segmented images with semantic features.
  • the semantic features include parking space line features and lane line features.
  • Semantic segmentation processing of a bird's-eye view refers to classifying each pixel in the bird's-eye view and associating each pixel with a preset semantic label, which includes parking space line labels and lane line labels.
  • A pre-built convolutional neural network model can be used to perform semantic segmentation on the bird's-eye view; for example, an FCN (Fully Convolutional Networks for Semantic Segmentation), a U-Net, or a SegNet network can be used.
  • As an example, the segmented image obtained after semantic segmentation of a bird's-eye view is shown in Figure 3; the white lines in Figure 3 represent parking space lines and lane lines.
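A sketch of this segmentation step using a generic FCN from torchvision as a stand-in; the patent does not publish its network or training data, so the three labels here (background, parking space line, lane line) and the input size are assumptions.

```python
import torch
from torchvision.models.segmentation import fcn_resnet50

NUM_CLASSES = 3  # assumed: background, parking space line, lane line
model = fcn_resnet50(weights=None, num_classes=NUM_CLASSES).eval()

bev = torch.rand(1, 3, 512, 512)   # stand-in for a normalized BEV image
with torch.no_grad():
    logits = model(bev)["out"]     # (1, NUM_CLASSES, H, W)
labels = logits.argmax(dim=1)      # per-pixel semantic label map
print(labels.shape)                # torch.Size([1, 512, 512])
```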
  • Step 204 Perform feature extraction on the original front-view image to obtain column features, and perform feature extraction on the bird's-eye view to obtain parking space corner features.
  • columns refer to the structural columns and load-bearing columns in indoor parking lots.
  • A pre-built convolutional neural network can be used to extract features from the front-view original image, to obtain the column features in the front-view original image.
  • the front-view original image is collected by a front-view fisheye camera.
  • the column features in the front-view original image are shown in Figure 4.
  • A corner point is usually defined as the intersection of two edges; in this embodiment, a parking space corner point is the intersection of parking space lines.
  • Parking space corner features can be extracted from the bird's-eye view with a pre-built convolutional neural network, or with a corner detection algorithm such as the Harris corner detector.
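As a sketch of the corner-detector alternative, the following applies Harris corner detection to a bird's-eye-view parking-line mask (the input file is hypothetical); running it on the segmentation mask rather than the raw image keeps mainly the intersections of painted lines, which is exactly what a parking space corner is.

```python
import cv2
import numpy as np

mask = cv2.imread("bev_parking_lines.png", cv2.IMREAD_GRAYSCALE)  # hypothetical
resp = cv2.cornerHarris(np.float32(mask), blockSize=5, ksize=3, k=0.04)
corners = np.argwhere(resp > 0.01 * resp.max())  # (row, col) candidates
print(f"{len(corners)} parking-corner candidates")
```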
  • Step 205: Generate a semantic map based on the semantic features, the column features and the parking space corner features, and calculate the pose of the vehicle in the semantic map.
  • the semantic map is a map in the world coordinate system.
  • The process of generating a semantic map includes: projecting the coordinates of the semantic features in the bird's-eye view coordinate system into the world coordinate system; projecting the coordinates of the column features in the camera coordinate system into the world coordinate system; projecting the coordinates of the parking space corner features in the bird's-eye view coordinate system into the world coordinate system; and generating the semantic map based on the coordinates of the semantic features, the column features and the parking space corner features in the world coordinate system.
  • When projecting the semantic features, the column features and the parking space corner features, they can be projected into the world coordinate system based on the transformation relationships between the relevant coordinate systems and the camera parameters.
  • Taking the semantic features as an example, their coordinates in the bird's-eye view coordinate system are projected into the world coordinate system according to: p_w = T_wb · T_tb · k_t^-1 · p_tuv, where p_w is the coordinate of the semantic feature in the world coordinate system, T_wb is the pose of the vehicle in the world coordinate system at the current moment, T_tb is the transformation from the bird's-eye view coordinate system to the vehicle coordinate system, k_t is the intrinsic matrix of the virtual top-view camera, and p_tuv is the pixel coordinate of the semantic feature in the bird's-eye view.
  • When projecting the column features, their coordinates in the world coordinate system are calculated based on the current vehicle pose, the transformation from the front-view camera coordinate system to the vehicle coordinate system, the intrinsics of the front-view camera, and the pixel coordinates of the column features in the front-view fisheye image.
  • When projecting the parking space corner features, their coordinates in the world coordinate system are calculated based on the current vehicle pose, the transformation from the bird's-eye view coordinate system to the vehicle coordinate system, the intrinsics of the virtual top-view camera, and the pixel coordinates of the corner features in the bird's-eye view; that is, the formula for the world coordinates of a parking space corner is the same as the formula for the world coordinates of a semantic feature.
  • The process of calculating the vehicle's pose in the semantic map includes: obtaining odometer information; calculating, from the odometer information, the displacement of the vehicle from the previous moment to the current moment; and determining the vehicle's current pose in the semantic map from its pose in the semantic map at the previous moment and the displacement.
  • The odometer is a device installed on the vehicle to measure travel. Its working principle is to use photoelectric encoders mounted on the motors of the left and right drive wheels to detect the arc through which the wheels turn within a certain period of time, and from this to deduce the change in the relative pose of the vehicle.
  • In this embodiment, the odometer information includes the number of rotations of the vehicle's drive wheels at the current moment.
  • A differential-speed model can be used to calculate the displacement of the vehicle from the previous moment to the current moment. The vehicle's current pose in the semantic map can then be calculated as T_wb^j = T_wb^i · v_ij, where T_wb^j and T_wb^i are the poses of the vehicle in the semantic map at times j and i, and v_ij is the displacement of the vehicle from time i to time j.
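The patent names only a differential(-speed) model; the sketch below shows one standard formulation as an assumption, converting encoder ticks of the left and right drive wheels into an incremental displacement and composing it onto the previous planar pose. All parameters are illustrative.

```python
import math

def integrate_odometry(pose, ticks_l, ticks_r, ticks_per_rev,
                       wheel_radius, wheel_base):
    """Compose one odometry increment onto a planar pose (x, y, theta)."""
    x, y, theta = pose
    sl = 2 * math.pi * wheel_radius * ticks_l / ticks_per_rev  # left arc
    sr = 2 * math.pi * wheel_radius * ticks_r / ticks_per_rev  # right arc
    ds = (sl + sr) / 2.0             # forward displacement of the body
    dtheta = (sr - sl) / wheel_base  # heading change
    x += ds * math.cos(theta + dtheta / 2.0)
    y += ds * math.sin(theta + dtheta / 2.0)
    return (x, y, theta + dtheta)

pose = (0.0, 0.0, 0.0)
pose = integrate_odometry(pose, 980, 1020, 1024, 0.30, 1.6)
print(pose)
```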
  • Step 206: Perform nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map.
  • Specifically, this step includes:
  • Step 501: Search the semantic map for semantic features near the vehicle, according to the pose of the vehicle in the semantic map;
  • Step 502: Project the found semantic features into the segmented image and determine the projection positions of the semantic features in the segmented image; that is, transform the coordinates of the found semantic features in the world coordinate system into coordinates in the bird's-eye view coordinate system;
  • Step 503: Determine the observation positions of the found semantic features in the segmented image; the observation position of a semantic feature is its true position in the segmented image;
  • Step 504: Use the error between the projection positions and the observation positions of the semantic features as a first constraint to constrain and optimize the semantic map.
  • The range covered by "near the vehicle" can be set flexibly as required; the present invention imposes no limitation here. For example, it may be the area covered by a circle centered on the vehicle with a radius of 1 meter.
  • For step 502, the projection position p_tuv of a semantic feature in the segmented image can be determined as follows: the feature is first transformed into the vehicle frame, p_b = T_bw · p_w; it is then re-expressed in the virtual top-view camera frame as [-p_b.y, m - p_b.x, h] and projected with the intrinsics of the virtual top-view camera, where T_wb is the pose of the vehicle in the semantic map at the current moment, p_w is the coordinate of the semantic feature in the world coordinate system, m is the distance in the x direction from the vehicle coordinate system to the center of the top-view virtual camera, and h is the height of the virtual camera above the ground.
  • For step 504, the error between the projection position and the observation position of a semantic feature is err_1 = I(p_uv) - I(k · T_cb · T_bw · p_w), where err_1 is the error between the projection position and the observation position, p_uv is the observation position of the semantic feature, p_w is the position of the semantic feature in the semantic map (i.e., its coordinate in the world coordinate system), k is the intrinsic matrix of the virtual top-view camera, T_cb is the transformation from the front-view camera coordinate system to the vehicle coordinate system, T_bw is the vehicle's current pose in the semantic map, and I(·) denotes taking the pixel value at a pixel position.
  • the problem of optimizing the semantic map can be transformed into minimizing the error between the projected position and the observed position of the semantic feature.
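A minimal sketch of this first constraint as nonlinear least squares: the residual is the photometric difference I(observed) - I(projected), and the optimizer shifts the projection until the two image samples agree. The projection stand-in, the toy image and the feature list are assumptions, not the patent's implementation.

```python
import numpy as np
from scipy.optimize import least_squares

W = 64
xs = np.arange(W, dtype=float)
seg = np.tile(np.exp(-((xs - 30.0) ** 2) / 50.0), (W, 1))  # smooth "line" mask

def bilinear(img, u, v):
    """Sample img at subpixel (u, v) with bilinear interpolation."""
    u = float(np.clip(u, 0, W - 2)); v = float(np.clip(v, 0, W - 2))
    u0, v0 = int(u), int(v); du, dv = u - u0, v - v0
    return ((1 - du) * (1 - dv) * img[v0, u0] + du * (1 - dv) * img[v0, u0 + 1]
            + (1 - du) * dv * img[v0 + 1, u0] + du * dv * img[v0 + 1, u0 + 1])

obs = [(30.0, 10.0), (30.0, 40.0)]     # observed feature pixels
proj = [(27.5, 10.0), (27.5, 40.0)]    # projected map features (off by 2.5 px)

def residuals(offset):                 # offset stands in for the map update
    return [bilinear(seg, uo, vo) - bilinear(seg, u + offset[0], v + offset[1])
            for (uo, vo), (u, v) in zip(obs, proj)]

sol = least_squares(residuals, x0=np.zeros(2))
print(sol.x)  # close to [2.5, 0]: projections now match the observations
```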
  • Step 207: Constrain and optimize the pose of the vehicle in the semantic map based on the optimized semantic map, multiple adjacent frames of bird's-eye views, and the odometer information corresponding to those frames, where the multiple adjacent frames of bird's-eye views include the current frame and frames adjacent to the current frame.
  • Specifically, this step includes: projecting the semantic features of the adjacent frames into the current frame through a transformation relationship, determining a projection error, and using the projection error as a second constraint; determining an odometer error from the odometer information corresponding to the adjacent frames and the odometer information corresponding to the current frame, and using the odometer error as a third constraint; and constraining and optimizing the pose of the vehicle in the semantic map according to the first, second and third constraints.
  • The projection error is determined by err_2 = I(p'_uv) - I(k · T_bc^-1 · T_bibj · T_bc · k^-1 · p_uv), where err_2 is the projection error, p'_uv is the coordinate of the semantic feature in the current frame, p_uv is the coordinate of the semantic feature in the adjacent frame, k is the intrinsic matrix of the virtual top-view camera, T_bc is the transformation matrix from the vehicle coordinate system to the front-view camera coordinate system, T_bibj is the pose transformation matrix from the adjacent frame to the current frame, and I(·) denotes taking the pixel value at a pixel position.
  • The odometer error is determined by err_3 = T_bibj · T_last^-1 · T_current, where err_3 is the odometer error, T_bibj is the pose transformation matrix from the adjacent frame to the current frame, T_last is the accumulated odometer value from the start to the previous moment, and T_current is the accumulated odometer value from the start to the current moment; both accumulated values are expressed in the world coordinate system, i.e., relative to the odometer coordinate system at the starting moment.
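A sketch of the odometer constraint on planar poses: the patent writes err_3 = T_bibj · T_last^-1 · T_current; below, the comparison between the estimated frame-to-frame transform and the odometry-derived relative motion is expressed with an inverse and a log-map so that the residual vanishes when the two motions agree; this is a common formulation, used here as an assumption.

```python
import numpy as np

def se2(x, y, th):
    """Homogeneous 3x3 matrix for a planar rigid transform."""
    c, s = np.cos(th), np.sin(th)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

def log_se2(T):
    """Map a small SE(2) error transform to a residual (dx, dy, dtheta)."""
    return np.array([T[0, 2], T[1, 2], np.arctan2(T[1, 0], T[0, 0])])

T_last, T_current = se2(1.0, 0.0, 0.02), se2(1.5, 0.1, 0.05)
T_bibj = se2(0.48, 0.09, 0.03)                   # current estimate
odo_rel = np.linalg.inv(T_last) @ T_current      # relative motion from odometry
err3 = log_se2(np.linalg.inv(T_bibj) @ odo_rel)  # near zero when consistent
print(err3)
```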
  • In the method of the embodiment of the present invention, the original images are first stitched into a bird's-eye view and semantic segmentation is performed on it to obtain semantic features; feature detection is then performed on the front-view original image and the bird's-eye view respectively, to obtain the column features in the front-view original image and the parking space corner features in the bird's-eye view; mapping and vehicle positioning are then carried out using the semantic features, column features and parking space corner features; finally, the semantic features and odometer information are used for nonlinear constrained optimization of the semantic map and the vehicle pose.
  • Figure 6 schematically shows the structural diagram of a semantic map construction and positioning device 600 for indoor parking lots according to an embodiment of the present invention.
  • the device 600 includes:
  • the image acquisition module 601 is used to acquire original images collected during vehicle operation, where the original images at least include forward-looking original images;
  • Image splicing module 602 used to splice the original images into a bird's-eye view
  • the semantic segmentation module 603 is used to perform semantic segmentation processing on the bird's-eye view to obtain segmented images with semantic features;
  • the image detection module 604 is used to perform feature extraction on the original front-view image to obtain column features, and perform feature extraction on the bird's-eye view to obtain parking space corner features;
  • the map reconstruction module 605 is used to generate a semantic map based on the semantic features, the column features and the parking space corner features, and to calculate the pose of the vehicle in the semantic map;
  • the optimization module 606 is used to perform nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map, and to constrain and optimize the pose of the vehicle in the semantic map based on the optimized semantic map, multiple adjacent frames of bird's-eye views, and the odometer information corresponding to those frames.
  • The semantic map construction and positioning device for indoor parking lots first stitches the original images into a bird's-eye view and performs semantic segmentation on it to obtain semantic features; it then performs feature detection on the front-view original image and the bird's-eye view respectively to obtain column features and parking space corner features, and uses these features together for mapping and vehicle positioning.
  • Nonlinear constrained optimization of the semantic map and the vehicle pose achieves low-cost, high-precision and highly robust real-time positioning. Only visual features are used, and sensors such as GPS and lidar are not required, which effectively reduces cost and allows application in a wider range of scenarios, including scenarios without GPS signal. Two kinds of feature information, semantic features and detection features, are used, making fuller use of the visual sensors and improving positioning accuracy.
  • the image acquisition module is also used to: acquire an original image collected by a front-view fisheye camera installed on the vehicle, the original image collected by the front-view fisheye camera being the front-view original image; or acquire original images collected by a front-view pinhole camera installed on the vehicle and at least one fisheye camera installed around the vehicle, the original image collected by the front-view pinhole camera being the front-view original image.
  • the image stitching module is also used to: stitch the original forward-looking images collected by the forward-looking fisheye camera into a bird's-eye view; or stitch the original images collected by the at least one fisheye camera into a bird's-eye view.
  • the map reconstruction module is also used to: project the coordinates of the semantic features in the bird's-eye view coordinate system into the world coordinate system, project the coordinates of the column features in the camera coordinate system into the world coordinate system, and project the coordinates of the parking space corner features in the bird's-eye view coordinate system into the world coordinate system; generate the semantic map based on the coordinates of the semantic features, the column features and the parking space corner features in the world coordinate system; obtain odometer information; calculate from the odometer information the displacement of the vehicle from the previous moment to the current moment; and determine the vehicle's current pose in the semantic map from its pose in the semantic map at the previous moment and the displacement.
  • the optimization module is further configured to: search the semantic map for semantic features near the vehicle according to the pose of the vehicle in the semantic map; project the found semantic features into the segmented image and determine the projection positions of the semantic features in the segmented image; determine the observation positions of the found semantic features in the segmented image; and use the error between the projection positions and the observation positions of the semantic features as the first constraint relationship to constrain and optimize the semantic map.
  • the multiple adjacent frames of bird's-eye views include the current frame and frames adjacent to the current frame;
  • the optimization module is also configured to: project the semantic features of the adjacent frames into the current frame through a transformation relationship, determine a projection error, and use the projection error as a second constraint relationship; determine an odometer error according to the odometer information corresponding to the adjacent frames and the odometer information corresponding to the current frame, and use the odometer error as a third constraint relationship; and constrain and optimize the pose of the vehicle in the semantic map according to the first constraint relationship, the second constraint relationship and the third constraint relationship.
  • the above-mentioned device can execute the method provided by the embodiment of the present invention, and has corresponding functional modules and beneficial effects for executing the method.
  • For technical details not described exhaustively in this embodiment, refer to the method provided by the embodiments of the present invention.
  • the device embodiments described above are only illustrative.
  • The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Persons of ordinary skill in the art can understand and implement it without creative effort.
  • An embodiment of the present invention also provides an electronic device, as shown in Figure 7, including a processor 701, a communication interface 702, a memory 703, and a communication bus 704.
  • The processor 701, the communication interface 702, and the memory 703 communicate with one another through the communication bus 704;
  • the memory 703 is used to store the computer program;
  • the processor 701 is used to implement the following steps when executing the program stored in the memory 703: acquiring original images collected during vehicle operation, the original images at least including a front-view original image; stitching the original images into a bird's-eye view; performing semantic segmentation on the bird's-eye view to obtain a segmented image with semantic features; extracting column features from the front-view original image and parking space corner features from the bird's-eye view; generating a semantic map from these features and calculating the pose of the vehicle in the semantic map; performing nonlinear optimization on the semantic map according to that pose; and constraining and optimizing the pose of the vehicle in the semantic map based on the optimized semantic map, multiple adjacent frames of bird's-eye views, and the corresponding odometer information.
  • The communication bus mentioned for the above terminal may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc.
  • the communication bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.
  • the communication interface is used for communication between the above terminal and other devices.
  • The memory may include Random Access Memory (RAM) or non-volatile memory, such as at least one disk memory.
  • the memory may also be at least one storage device located far away from the aforementioned processor.
  • The above processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • A computer-readable storage medium is provided, which stores instructions that, when run on a computer, cause the computer to execute the method described in any one of the above embodiments.
  • a computer program product containing instructions is also provided, which when run on a computer causes the computer to execute the method described in any of the above embodiments.
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented using software, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • When the computer program instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the present invention are generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center integrating one or more available media.
  • The available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., Solid State Disk (SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention discloses a semantic map construction and positioning method and device for indoor parking lots, relating to the technical field of indoor positioning. The method includes: acquiring original images collected during vehicle operation, the original images at least including a front-view original image; stitching the original images into a bird's-eye view; performing semantic segmentation on the bird's-eye view to obtain a segmented image with semantic features; performing feature extraction on the front-view original image to obtain column features, and performing feature extraction on the bird's-eye view to obtain parking space corner features; generating a semantic map according to the semantic features, the column features and the parking space corner features, and calculating the pose of the vehicle in the semantic map; performing nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map; and optimizing the pose of the vehicle in the semantic map according to the optimized semantic map, multiple adjacent frames of bird's-eye views and the corresponding odometer information. The method builds the map from purely visual features, with low cost, high robustness and a wide range of application scenarios.

Description

Semantic map construction and positioning method and device for indoor parking lots
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on April 2, 2022, with application number 202210343503.3 and the invention title "Semantic map construction and positioning method and device for indoor parking lots", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the technical field of indoor positioning, and in particular to a semantic map construction and positioning method and device for indoor parking lots.
Background Art
Simultaneous Localization And Mapping (SLAM) technology is widely used in autonomous driving scenarios and plays an important role in building a map of the surrounding environment in real time and subsequently outputting vehicle positioning information. According to the sensors used, SLAM can be divided into laser SLAM and visual SLAM, and the resulting maps can be divided mainly into laser point cloud maps collected directly by lidar and visual point cloud maps converted from images collected by cameras. Compared with laser SLAM, visual SLAM has a huge cost advantage, but traditional visual SLAM is limited in accuracy and less robust to environmental changes, so it has not been applied on a large scale the way laser SLAM has. How to improve the mapping accuracy and robustness of visual SLAM while reducing the redundancy and storage consumption of visual information has therefore become an urgent problem to be solved.
Summary of the Invention
To solve the above technical problems, or at least partially solve them, embodiments of the present invention provide a semantic map construction and positioning method and device for indoor parking lots, an electronic device, and a computer-readable medium.
In a first aspect of the embodiments of the present invention, a semantic map construction and positioning method for indoor parking lots is provided, including:
acquiring original images collected during vehicle operation, the original images at least including a front-view original image;
stitching the original images into a bird's-eye view;
performing semantic segmentation processing on the bird's-eye view to obtain a segmented image with semantic features;
performing feature extraction on the front-view original image to obtain column features, and performing feature extraction on the bird's-eye view to obtain parking space corner features;
generating a semantic map according to the semantic features, the column features and the parking space corner features, and calculating the pose of the vehicle in the semantic map;
performing nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map;
constraining and optimizing the pose of the vehicle in the semantic map according to the optimized semantic map, multiple adjacent frames of bird's-eye views, and the odometer information corresponding to the multiple adjacent frames of bird's-eye views.
In a second aspect of the embodiments of the present invention, a semantic map construction and positioning device for indoor parking lots is provided, including:
an image acquisition module, used to acquire original images collected during vehicle operation, the original images at least including a front-view original image;
an image stitching module, used to stitch the original images into a bird's-eye view;
a semantic segmentation module, used to perform semantic segmentation processing on the bird's-eye view to obtain a segmented image with semantic features;
an image detection module, used to perform feature extraction on the front-view original image to obtain column features, and to perform feature extraction on the bird's-eye view to obtain parking space corner features;
a map reconstruction module, used to generate a semantic map according to the semantic features, the column features and the parking space corner features, and to calculate the pose of the vehicle in the semantic map;
an optimization module, used to perform nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map, and to constrain and optimize the pose of the vehicle in the semantic map according to the optimized semantic map, multiple adjacent frames of bird's-eye views, and the odometer information corresponding to the multiple adjacent frames of bird's-eye views.
In a third aspect of the embodiments of the present invention, an electronic device is provided, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the semantic map construction and positioning method for indoor parking lots.
In a fourth aspect of the embodiments of the present invention, a computer-readable medium is provided, on which a computer program is stored; when the program is executed by a processor, the semantic map construction and positioning method for indoor parking lots is implemented.
In a fifth aspect of the embodiments of the present invention, a computer program product is provided, including computer-readable code which, when run on an electronic device, causes the electronic device to execute the semantic map construction and positioning method for indoor parking lots.
One of the above embodiments has the following advantages or beneficial effects:
In the embodiments of the present invention, the original images are first stitched into a bird's-eye view and semantic segmentation is performed on it to obtain semantic features; feature detection is then performed on the front-view original image and the bird's-eye view respectively, to obtain the column features in the front-view original image and the parking space corner features in the bird's-eye view; mapping and vehicle positioning are then carried out using the semantic features, column features and parking space corner features; finally, the semantic features and odometer information are used for nonlinear constrained optimization of the semantic map and the vehicle pose, which achieves low-cost, high-precision and highly robust real-time positioning. The embodiments use only visual features and require no sensors such as GPS or lidar, which effectively reduces cost and allows application in a wider range of scenarios, including scenarios without GPS signal. Two kinds of feature information, semantic features and detection features, are used, making fuller use of the visual sensors and improving positioning accuracy.
Further effects of the above non-conventional optional implementations will be described below in conjunction with the specific embodiments.
The above description is only an overview of the technical solutions of the present invention. In order to understand the technical means of the present invention more clearly, so that it can be implemented according to the contents of the specification, and to make the above and other objects, features and advantages of the present invention more obvious and understandable, specific embodiments of the present invention are given below.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
The drawings are used for a better understanding of the present invention and do not constitute an improper limitation of the present invention. In the drawings:
Figure 1 schematically shows the coordinate systems in the semantic map construction and positioning method for indoor parking lots according to an embodiment of the present invention;
Figure 2 schematically shows a flow chart of the semantic map construction and positioning method for indoor parking lots according to an embodiment of the present invention;
Figure 3 schematically shows a semantic segmentation image in the semantic map construction and positioning method for indoor parking lots according to an embodiment of the present invention;
Figure 4 schematically shows column features in the semantic map construction and positioning method for indoor parking lots according to an embodiment of the present invention;
Figure 5 schematically shows a sub-flow of the semantic map construction and positioning method for indoor parking lots according to an embodiment of the present invention;
Figure 6 schematically shows a structural diagram of the semantic map construction and positioning device for indoor parking lots according to an embodiment of the present invention;
Figure 7 schematically shows a structural diagram of an electronic device according to an embodiment of the present invention.
Specific Embodiments
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, including various details of the embodiments of the present invention to aid understanding; they should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present invention. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted in the following description.
The terms "first", "second", etc. in the specification and claims of this application are used to distinguish similar objects and are not used to describe a specific order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments of this application can be implemented in orders other than those illustrated or described here; objects distinguished by "first", "second", etc. are usually of one type, and the number of objects is not limited; for example, there may be one or more first objects. In addition, "and/or" in the specification and claims indicates at least one of the connected objects, and the character "/" generally indicates that the associated objects have an "or" relationship.
For ease of understanding, the coordinate systems involved in the embodiments of the present invention are described below. The embodiments of the present invention involve a world coordinate system, a bird's-eye view coordinate system (which may also be called the virtual top-view camera coordinate system), a front-view camera coordinate system, a vehicle coordinate system (which may also be called the odometer coordinate system), and a pixel coordinate system. Figure 1 schematically shows these coordinate systems. As shown in Figure 1, the bird's-eye view coordinate system is described by x_t, y_t, z_t; its origin is located at the intersection of the center line of the left and right fisheye cameras and the center line of the front and rear fisheye cameras, with horizontally to the right (toward the right fisheye camera) as the positive x-axis, horizontally backward as the positive y-axis, and vertically downward as the positive z-axis. The front-view camera coordinate system is described by x_c, y_c, z_c; its origin is at the center of the front-view camera, with horizontally forward as the positive z-axis, horizontally to the right as the positive x-axis, and vertically downward as the positive y-axis. The vehicle body coordinate system is described by x_b, y_b, z_b; it lies on the ground plane under the vehicle, with its origin at the vertical projection of the center of the rear axle onto the ground, horizontally forward as the positive x-axis, horizontally to the left as the positive y-axis, and vertically upward as the positive z-axis. The world coordinate system is described by x_w, y_w, z_w, with horizontally forward as the positive x-axis, horizontally to the left as the positive y-axis, and vertically upward as the positive z-axis. The world coordinate system is the vehicle coordinate system of the first frame, i.e., the vehicle coordinate system at the moment the vehicle has just started. The pixel coordinate system is described by u, v.
The mapping relationship from a pixel in the bird's-eye view to the pixel coordinates of the original fisheye image is as follows:
p_cuv = k_c · T_tc · k_t^-1 · p_tuv
where p_tuv is the coordinate of the pixel in the bird's-eye view, p_cuv is the corresponding pixel coordinate in the original fisheye image, k_t is the intrinsic matrix of the virtual top-view camera, k_c is the intrinsic matrix of the fisheye camera, and T_tc is the transformation matrix from the virtual top-view camera to the fisheye camera.
Figure 2 schematically shows a flow chart of the semantic map construction and positioning method for indoor parking lots according to an embodiment of the present invention. As shown in Figure 2, the method includes:
Step 201: Acquire original images collected during vehicle operation, the original images at least including a front-view original image.
In the embodiment of the present invention, environmental images during vehicle operation are collected by vehicle-mounted cameras; these environmental images are the original images.
In an optional embodiment, a front-view fisheye camera is installed on the vehicle, mounted at the front of the vehicle body; for example, it can be mounted at a centered position on the upper edge of the windshield or at a centered position above the front license plate. The original image collected by the front-view fisheye camera is the front-view original image.
In another optional embodiment, a front-view pinhole camera and at least one fisheye camera are installed on the vehicle. The environmental image collected by the front-view pinhole camera is the front-view original image. The at least one fisheye camera is installed around the vehicle; for example, it can be installed at a centered position above the front license plate, a centered position above the rear license plate, below the left rearview mirror, or below the right rearview mirror. Preferably, four fisheye cameras are installed on the vehicle, mounted respectively at a centered position above the front license plate, a centered position above the rear license plate, below the left rearview mirror, and below the right rearview mirror. These four fisheye cameras may also be called surround-view fisheye cameras.
Step 202: Stitch the original images into a bird's-eye view.
In this step, the IPM algorithm (Inverse Perspective Mapping) can be used to stitch the original images into a bird's-eye view. When only a front-view fisheye camera is installed on the vehicle, the IPM algorithm is used to stitch the original images collected by that camera into a bird's-eye view. When at least one fisheye camera and a front-view pinhole camera are installed on the vehicle, the IPM algorithm is used to stitch the original images collected by the at least one fisheye camera into a bird's-eye view.
Step 203: Perform semantic segmentation processing on the bird's-eye view to obtain a segmented image with semantic features. The semantic features include parking space line features and lane line features.
Performing semantic segmentation on the bird's-eye view means classifying every pixel in the bird's-eye view and associating each pixel with a preset semantic label; the semantic labels include a parking space line label and a lane line label. In this embodiment, a pre-built convolutional neural network model can be used for the semantic segmentation, for example an FCN (Fully Convolutional Networks for Semantic Segmentation), U-Net, or SegNet network. As an example, the segmented image obtained after semantic segmentation of a bird's-eye view is shown in Figure 3, where the white lines represent parking space lines and lane lines.
Step 204: Perform feature extraction on the front-view original image to obtain column features, and perform feature extraction on the bird's-eye view to obtain parking space corner features.
Columns refer to the structural and load-bearing columns in the indoor parking lot. In this step, a pre-built convolutional neural network can be used to extract features from the front-view original image to obtain the column features in it. As an example, for a front-view original image collected by a front-view fisheye camera, the column features are shown in Figure 4.
A corner point is usually defined as the intersection of two edges; in this embodiment, a parking space corner point is the intersection of parking space lines. In this step, a pre-built convolutional neural network can be used to extract parking space corner features from the bird's-eye view, or a corner detection algorithm such as the Harris corner detector can be used to extract the parking space corner features from the bird's-eye view.
Step 205: Generate a semantic map according to the semantic features, the column features and the parking space corner features, and calculate the pose of the vehicle in the semantic map.
The semantic map is a map in the world coordinate system.
The process of generating the semantic map according to the semantic features, the column features and the parking space corner features includes:
projecting the coordinates of the semantic features in the bird's-eye view coordinate system into the world coordinate system, projecting the coordinates of the column features in the camera coordinate system into the world coordinate system, and projecting the coordinates of the parking space corner features in the bird's-eye view coordinate system into the world coordinate system;
generating the semantic map according to the coordinates of the semantic features, the column features and the parking space corner features in the world coordinate system.
When projecting the semantic features, column features and parking space corner features, they can be projected into the world coordinate system according to the transformation relationships between the relevant coordinate systems and the camera parameters. Taking the semantic features as an example, their coordinates in the bird's-eye view coordinate system are projected into the world coordinate system according to:
p_w = T_wb · T_tb · k_t^-1 · p_tuv
where p_w is the coordinate of the semantic feature in the world coordinate system, T_wb is the pose of the vehicle in the world coordinate system at the current moment, T_tb is the transformation from the bird's-eye view coordinate system to the vehicle coordinate system, k_t is the intrinsic matrix of the virtual top-view camera, and p_tuv is the pixel coordinate of the semantic feature in the bird's-eye view.
When projecting the column features, their coordinates in the world coordinate system are calculated from the current vehicle pose, the transformation from the front-view camera coordinate system to the vehicle coordinate system, the intrinsics of the front-view camera, and the pixel coordinates of the column features in the front-view fisheye image.
When projecting the parking space corner features, their coordinates in the world coordinate system are calculated from the current vehicle pose, the transformation from the bird's-eye view coordinate system to the vehicle coordinate system, the intrinsics of the virtual top-view camera, and the pixel coordinates of the corner features in the bird's-eye view; that is, the formula for the world coordinates of a parking space corner is the same as the formula for the world coordinates of a semantic feature.
After the semantic map is generated, the pose of the vehicle in the semantic map needs to be calculated. The process of calculating the pose of the vehicle in the semantic map includes:
obtaining odometer information;
calculating, according to the odometer information, the displacement of the vehicle from the previous moment to the current moment;
determining the current pose of the vehicle in the semantic map according to the pose of the vehicle in the semantic map at the previous moment and the displacement.
The odometer is a device installed on the vehicle to measure travel. Its working principle is to use photoelectric encoders mounted on the motors of the left and right drive wheels to detect the arc through which the wheels turn within a certain period of time, and from this to deduce the change in the relative pose of the vehicle. In this embodiment, the odometer information includes the number of rotations of the vehicle's drive wheels at the current moment. After the odometer information is obtained, a differential-speed model can be used to calculate the displacement of the vehicle from the previous moment to the current moment. The current pose of the vehicle in the semantic map can then be calculated according to the following formula:
T_wb^j = T_wb^i · v_ij
where T_wb^j is the pose of the vehicle in the semantic map at time j, T_wb^i is the pose of the vehicle in the semantic map at time i, and v_ij is the displacement of the vehicle from time i to time j.
Step 206: Perform nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map.
Specifically, this step includes:
Step 501: Search the semantic map for semantic features near the vehicle according to the pose of the vehicle in the semantic map;
Step 502: Project the found semantic features into the segmented image and determine the projection positions of the semantic features in the segmented image; that is, transform the coordinates of the found semantic features in the world coordinate system into coordinates in the bird's-eye view coordinate system;
Step 503: Determine the observation positions of the found semantic features in the segmented image; the observation position of a semantic feature is its true position in the segmented image;
Step 504: Use the error between the projection positions and the observation positions of the semantic features as a first constraint to constrain and optimize the semantic map.
The range covered by "near the vehicle" can be set flexibly as required; the present invention imposes no limitation here. For example, it may be the area covered by a circle centered on the vehicle with a radius of 1 meter.
For step 502, the projection position p_tuv of a semantic feature in the segmented image can be determined by the following equations. The feature is first transformed into the vehicle frame:
p_b = T_bw · p_w
It is then re-expressed in the virtual top-view camera frame:
p_b' = Function(p_b) = [-p_b.y, m - p_b.x, h]
and finally projected with the intrinsics of the virtual top-view camera to obtain p_tuv. Here T_wb is the pose of the vehicle in the semantic map at the current moment, p_w is the coordinate of the semantic feature in the world coordinate system, m is the distance in the x direction from the vehicle coordinate system to the center of the top-view virtual camera, and h is the height of the virtual camera above the ground.
For step 504, the error between the projection position and the observation position of a semantic feature is:
err_1 = I(p_uv) - I(k · T_cb · T_bw · p_w)
where err_1 is the error between the projection position and the observation position of the semantic feature, p_uv is the observation position of the semantic feature, p_w is the position of the semantic feature in the semantic map (i.e., its coordinate in the world coordinate system), k is the intrinsic matrix of the virtual top-view camera, T_cb is the transformation from the front-view camera coordinate system to the vehicle coordinate system, T_bw is the current pose of the vehicle in the semantic map, and I(·) denotes taking the pixel value at a pixel position.
Using the error between the projection and observation positions of the semantic features as the first constraint, the problem of constraining and optimizing the semantic map can be transformed into minimizing that error.
Step 207: Constrain and optimize the pose of the vehicle in the semantic map according to the optimized semantic map, multiple adjacent frames of bird's-eye views, and the odometer information corresponding to the multiple adjacent frames of bird's-eye views, where the multiple adjacent frames of bird's-eye views include the current frame and frames adjacent to the current frame.
Specifically, this step includes:
projecting the semantic features of the adjacent frames into the current frame through a transformation relationship, determining a projection error, and using the projection error as a second constraint;
determining an odometer error according to the odometer information corresponding to the adjacent frames and the odometer information corresponding to the current frame, and using the odometer error as a third constraint;
constraining and optimizing the pose of the vehicle in the semantic map according to the first constraint, the second constraint and the third constraint.
The projection error is determined by:
err_2 = I(p'_uv) - I(k · T_bc^-1 · T_bibj · T_bc · k^-1 · p_uv)
where err_2 is the projection error, p'_uv is the coordinate of the semantic feature in the current frame, p_uv is the coordinate of the semantic feature in the adjacent frame, k is the intrinsic matrix of the virtual top-view camera, T_bc is the transformation matrix from the vehicle coordinate system to the front-view camera coordinate system, T_bibj is the pose transformation matrix from the adjacent frame to the current frame, and I(·) denotes taking the pixel value at a pixel position.
The odometer error is determined by:
err_3 = T_bibj · T_last^-1 · T_current
where err_3 is the odometer error, T_bibj is the pose transformation matrix from the adjacent frame to the current frame, T_last is the accumulated odometer value from the start to the previous moment, and T_current is the accumulated odometer value from the start to the current moment; both accumulated values are expressed in the world coordinate system, i.e., relative to the odometer coordinate system at the starting moment.
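As a sketch of how the three constraints can be combined, the following stacks the first-, second- and third-constraint residual terms into a single nonlinear least-squares problem over the vehicle pose; the residual placeholders, the weights and the 3-DoF pose parameterization are assumptions rather than the patent's implementation.

```python
import numpy as np
from scipy.optimize import least_squares

def solve_pose(x0, r1, r2, r3, w=(1.0, 1.0, 1.0)):
    """Jointly minimize three stacked residual terms over the pose x."""
    def stacked(x):
        return np.concatenate([w[0] * r1(x), w[1] * r2(x), w[2] * r3(x)])
    return least_squares(stacked, x0).x

# Toy residuals that all pull the pose toward (1, 2, 0.1):
target = np.array([1.0, 2.0, 0.1])
r = lambda x: x - target
print(solve_pose(np.zeros(3), r, r, r))  # -> approximately [1. 2. 0.1]
```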
In the method of the embodiment of the present invention, the original images are first stitched into a bird's-eye view and semantic segmentation is performed on it to obtain semantic features; feature detection is then performed on the front-view original image and the bird's-eye view respectively, to obtain the column features in the front-view original image and the parking space corner features in the bird's-eye view; mapping and vehicle positioning are then carried out using the semantic features, column features and parking space corner features; finally, the semantic features and odometer information are used for nonlinear constrained optimization of the semantic map and the vehicle pose, achieving low-cost, high-precision and highly robust real-time positioning. The embodiment of the present invention uses only visual features and requires no sensors such as GPS or lidar, which effectively reduces cost and allows application in a wider range of scenarios, including scenarios without GPS signal. The embodiment uses two kinds of feature information, semantic features and detection features, making fuller use of the visual sensors and improving positioning accuracy.
Figure 6 schematically shows a structural diagram of a semantic map construction and positioning device 600 for indoor parking lots according to an embodiment of the present invention. As shown in Figure 6, the device 600 includes:
an image acquisition module 601, used to acquire original images collected during vehicle operation, the original images at least including a front-view original image;
an image stitching module 602, used to stitch the original images into a bird's-eye view;
a semantic segmentation module 603, used to perform semantic segmentation processing on the bird's-eye view to obtain a segmented image with semantic features;
an image detection module 604, used to perform feature extraction on the front-view original image to obtain column features, and to perform feature extraction on the bird's-eye view to obtain parking space corner features;
a map reconstruction module 605, used to generate a semantic map according to the semantic features, the column features and the parking space corner features, and to calculate the pose of the vehicle in the semantic map;
an optimization module 606, used to perform nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map, and to constrain and optimize the pose of the vehicle in the semantic map according to the optimized semantic map, multiple adjacent frames of bird's-eye views, and the odometer information corresponding to the multiple adjacent frames of bird's-eye views.
The semantic map construction and positioning device for indoor parking lots of the embodiment of the present invention first stitches the original images into a bird's-eye view and performs semantic segmentation on it to obtain semantic features; it then performs feature detection on the front-view original image and the bird's-eye view respectively, to obtain the column features in the front-view original image and the parking space corner features in the bird's-eye view; it then carries out mapping and vehicle positioning using the semantic features, column features and parking space corner features; finally, it uses the semantic features and odometer information for nonlinear constrained optimization of the semantic map and the vehicle pose, achieving low-cost, high-precision and highly robust real-time positioning. Only visual features are used, without sensors such as GPS or lidar, which effectively reduces cost and allows application in a wider range of scenarios, including scenarios without GPS signal; two kinds of feature information, semantic features and detection features, are used, making fuller use of the visual sensors and improving positioning accuracy.
Optionally, the image acquisition module is also used to: acquire an original image collected by a front-view fisheye camera installed on the vehicle, the original image collected by the front-view fisheye camera being a front-view original image; or acquire original images collected by a front-view pinhole camera installed on the vehicle and at least one fisheye camera installed around the vehicle, the original image collected by the front-view pinhole camera being a front-view original image;
the image stitching module is also used to: stitch the front-view original images collected by the front-view fisheye camera into a bird's-eye view; or stitch the original images collected by the at least one fisheye camera into a bird's-eye view.
Optionally, the map reconstruction module is also used to: project the coordinates of the semantic features in the bird's-eye view coordinate system into the world coordinate system, project the coordinates of the column features in the camera coordinate system into the world coordinate system, and project the coordinates of the parking space corner features in the bird's-eye view coordinate system into the world coordinate system; generate the semantic map according to the coordinates of the semantic features, the column features and the parking space corner features in the world coordinate system; obtain odometer information; calculate, according to the odometer information, the displacement of the vehicle from the previous moment to the current moment; and determine the current pose of the vehicle in the semantic map according to the pose of the vehicle in the semantic map at the previous moment and the displacement.
Optionally, the optimization module is also used to: search the semantic map for semantic features near the vehicle according to the pose of the vehicle in the semantic map; project the found semantic features into the segmented image and determine the projection positions of the semantic features in the segmented image; determine the observation positions of the found semantic features in the segmented image; and use the error between the projection positions and the observation positions of the semantic features as the first constraint to constrain and optimize the semantic map.
Optionally, the multiple adjacent frames of bird's-eye views include the current frame and frames adjacent to the current frame;
the optimization module is also used to: project the semantic features of the adjacent frames into the current frame through a transformation relationship, determine a projection error, and use the projection error as a second constraint; determine an odometer error according to the odometer information corresponding to the adjacent frames and the odometer information corresponding to the current frame, and use the odometer error as a third constraint; and constrain and optimize the pose of the vehicle in the semantic map according to the first constraint, the second constraint and the third constraint.
The above device can execute the method provided by the embodiments of the present invention and has the corresponding functional modules and beneficial effects for executing the method. For technical details not described exhaustively in this embodiment, refer to the method provided by the embodiments of the present invention.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Persons of ordinary skill in the art can understand and implement it without creative effort.
An embodiment of the present invention also provides an electronic device, as shown in Figure 7, including a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702 and the memory 703 communicate with one another through the communication bus 704;
the memory 703 is used to store a computer program;
the processor 701 is used to implement the following steps when executing the program stored in the memory 703:
acquiring original images collected during vehicle operation, the original images at least including a front-view original image;
stitching the original images into a bird's-eye view;
performing semantic segmentation processing on the bird's-eye view to obtain a segmented image with semantic features;
performing feature extraction on the front-view original image to obtain column features, and performing feature extraction on the bird's-eye view to obtain parking space corner features;
generating a semantic map according to the semantic features, the column features and the parking space corner features, and calculating the pose of the vehicle in the semantic map;
performing nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map;
constraining and optimizing the pose of the vehicle in the semantic map according to the optimized semantic map, multiple adjacent frames of bird's-eye views, and the odometer information corresponding to the multiple adjacent frames of bird's-eye views.
The communication bus mentioned for the above terminal may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in the figure, but this does not mean there is only one bus or only one type of bus.
The communication interface is used for communication between the above terminal and other devices.
The memory may include Random Access Memory (RAM) or non-volatile memory, for example at least one disk memory. Optionally, the memory may also be at least one storage device located away from the aforementioned processor.
The above processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment provided by the present invention, a computer-readable storage medium is also provided, in which instructions are stored; when run on a computer, the instructions cause the computer to execute the method described in any one of the above embodiments.
In yet another embodiment provided by the present invention, a computer program product containing instructions is also provided; when run on a computer, it causes the computer to execute the method described in any one of the above embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the present invention are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center integrating one or more available media. The available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., Solid State Disk (SSD)), etc.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a list of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
References herein to "one embodiment", "an embodiment" or "one or more embodiments" mean that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Note that instances of the phrase "in one embodiment" herein do not necessarily all refer to the same embodiment. In the specification provided here, numerous specific details are described. However, it is understood that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this specification. The embodiments in this specification are described in a related manner; for identical or similar parts between the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiments are basically similar to the method embodiments, their description is relatively simple; for relevant points, refer to the partial description of the method embodiments.
The above are only preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention are included in the scope of protection of the present invention.

Claims (11)

  1. A semantic map construction and positioning method for indoor parking lots, comprising:
    acquiring original images collected during vehicle operation, the original images at least comprising a front-view original image;
    stitching the original images into a bird's-eye view;
    performing semantic segmentation processing on the bird's-eye view to obtain a segmented image with semantic features;
    performing feature extraction on the front-view original image to obtain column features, and performing feature extraction on the bird's-eye view to obtain parking space corner features;
    generating a semantic map according to the semantic features, the column features and the parking space corner features, and calculating a pose of the vehicle in the semantic map;
    performing nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map;
    constraining and optimizing the pose of the vehicle in the semantic map according to the optimized semantic map, multiple adjacent frames of bird's-eye views, and odometer information corresponding to the multiple adjacent frames of bird's-eye views.
  2. The method according to claim 1, wherein acquiring the original images collected during vehicle operation comprises:
    acquiring an original image collected by a front-view fisheye camera installed on the vehicle, the original image collected by the front-view fisheye camera being a front-view original image; or
    acquiring original images collected by a front-view pinhole camera installed on the vehicle and at least one fisheye camera installed around the vehicle, the original image collected by the front-view pinhole camera being a front-view original image;
    and stitching the original images into a bird's-eye view comprises:
    stitching the front-view original images collected by the front-view fisheye camera into a bird's-eye view; or
    stitching the original images collected by the at least one fisheye camera into a bird's-eye view.
  3. The method according to claim 1, wherein generating a semantic map according to the semantic features, the column features and the parking space corner features, and calculating the pose of the vehicle in the semantic map comprises:
    projecting coordinates of the semantic features in a bird's-eye view coordinate system into a world coordinate system, projecting coordinates of the column features in a camera coordinate system into the world coordinate system, and projecting coordinates of the parking space corner features in the bird's-eye view coordinate system into the world coordinate system;
    generating the semantic map according to the coordinates of the semantic features, the column features and the parking space corner features in the world coordinate system;
    obtaining odometer information;
    calculating, according to the odometer information, a displacement of the vehicle from a previous moment to a current moment;
    determining a current pose of the vehicle in the semantic map according to the pose of the vehicle in the semantic map at the previous moment and the displacement.
  4. The method according to claim 1, wherein performing nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map comprises:
    searching the semantic map for semantic features near the vehicle according to the pose of the vehicle in the semantic map;
    projecting the found semantic features into the segmented image, and determining projection positions of the semantic features in the segmented image;
    determining observation positions of the found semantic features in the segmented image;
    using an error between the projection positions and the observation positions of the semantic features as a first constraint relationship to constrain and optimize the semantic map.
  5. The method according to claim 4, wherein the multiple adjacent frames of bird's-eye views comprise a current frame and frames adjacent to the current frame;
    and constraining and optimizing the pose of the vehicle in the semantic map according to the optimized semantic map, the multiple adjacent frames of bird's-eye views, and the odometer information corresponding to the multiple adjacent frames of bird's-eye views comprises:
    projecting the semantic features of the adjacent frames into the current frame through a transformation relationship, determining a projection error, and using the projection error as a second constraint relationship;
    determining an odometer error according to the odometer information corresponding to the adjacent frames and the odometer information corresponding to the current frame, and using the odometer error as a third constraint relationship;
    constraining and optimizing the pose of the vehicle in the semantic map according to the first constraint relationship, the second constraint relationship and the third constraint relationship.
  6. A semantic map construction and positioning device for indoor parking lots, comprising:
    an image acquisition module, used to acquire original images collected during vehicle operation, the original images at least comprising a front-view original image;
    an image stitching module, used to stitch the original images into a bird's-eye view;
    a semantic segmentation module, used to perform semantic segmentation processing on the bird's-eye view to obtain a segmented image with semantic features;
    an image detection module, used to perform feature extraction on the front-view original image to obtain column features, and to perform feature extraction on the bird's-eye view to obtain parking space corner features;
    a map reconstruction module, used to generate a semantic map according to the semantic features, the column features and the parking space corner features, and to calculate a pose of the vehicle in the semantic map;
    an optimization module, used to perform nonlinear optimization on the semantic map according to the pose of the vehicle in the semantic map, and to constrain and optimize the pose of the vehicle in the semantic map according to the optimized semantic map, multiple adjacent frames of bird's-eye views, and odometer information corresponding to the multiple adjacent frames of bird's-eye views.
  7. The device according to claim 6, wherein the optimization module is further used to:
    search the semantic map for semantic features near the vehicle according to the pose of the vehicle in the semantic map;
    project the found semantic features into the segmented image, and determine projection positions of the semantic features in the segmented image;
    determine observation positions of the found semantic features in the segmented image;
    use an error between the projection positions and the observation positions of the semantic features as a first constraint relationship to constrain and optimize the semantic map.
  8. The device according to claim 7, wherein the multiple adjacent frames of bird's-eye views comprise a current frame and frames adjacent to the current frame;
    the optimization module is further used to:
    project the semantic features of the adjacent frames into the current frame through a transformation relationship, determine a projection error, and use the projection error as a second constraint relationship;
    determine an odometer error according to the odometer information corresponding to the adjacent frames and the odometer information corresponding to the current frame, and use the odometer error as a third constraint relationship;
    constrain and optimize the pose of the vehicle in the semantic map according to the first constraint relationship, the second constraint relationship and the third constraint relationship.
  9. An electronic device, comprising:
    one or more processors;
    a storage device, used to store one or more programs,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-5.
  10. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-5.
  11. A computer program product comprising computer-readable code which, when run on an electronic device, causes the electronic device to execute the semantic map construction and positioning method for indoor parking lots according to any one of claims 1-5.
PCT/CN2022/117351 2022-04-02 2022-09-06 Semantic map construction and positioning method and device for indoor parking lots WO2023184869A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210343503.3A CN114863096B (zh) 2022-04-02 2022-04-02 Semantic map construction and positioning method and device for indoor parking lots
CN202210343503.3 2022-04-02

Publications (1)

Publication Number Publication Date
WO2023184869A1 (zh)

Family

ID=82629187

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/117351 WO2023184869A1 (zh) 2022-04-02 2022-09-06 Semantic map construction and positioning method and device for indoor parking lots

Country Status (2)

Country Link
CN (1) CN114863096B (zh)
WO (1) WO2023184869A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114863096B (zh) 2022-04-02 2024-04-16 合众新能源汽车股份有限公司 Semantic map construction and positioning method and device for indoor parking lots
CN117274036A (zh) 2023-08-22 2023-12-22 合肥辉羲智能科技有限公司 Parking scene detection method based on multi-view and temporal fusion

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100321211A1 (en) * 2009-06-23 2010-12-23 Ming-Kuan Ko Composite-image parking-assistant system
CN110243370A (zh) * 2019-05-16 2019-09-17 西安理工大学 Deep-learning-based three-dimensional semantic map construction method for indoor environments
CN112116654A (zh) * 2019-06-20 2020-12-22 杭州海康威视数字技术股份有限公司 Vehicle pose determination method and apparatus, and electronic device
CN113781645A (zh) * 2021-08-31 2021-12-10 同济大学 Localization and mapping method for indoor parking environments
CN113903011A (zh) * 2021-10-26 2022-01-07 江苏大学 Semantic map construction and positioning method suitable for indoor parking lots
CN114863096A (zh) * 2022-04-02 2022-08-05 合众新能源汽车有限公司 Semantic map construction and positioning method and device for indoor parking lots

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5861871B2 (ja) * 2011-12-28 2016-02-16 スズキ株式会社 Bird's-eye view image presentation device
US10268201B2 (en) * 2017-02-28 2019-04-23 Mitsubishi Electric Research Laboratories, Inc. Vehicle automated parking system and method
US11482015B2 (en) * 2019-08-09 2022-10-25 Otobrite Electronics Inc. Method for recognizing parking space for vehicle and parking assistance system using the method
US11288522B2 (en) * 2019-12-31 2022-03-29 Woven Planet North America, Inc. Generating training data from overhead view images
CN111862672B (zh) * 2020-06-24 2021-11-23 北京易航远智科技有限公司 Top-view-based parking lot vehicle self-positioning and map construction method
CN113781300B (zh) * 2021-08-17 2023-10-13 东风汽车集团股份有限公司 Vehicle visual positioning method for long-distance autonomous parking


Also Published As

Publication number Publication date
CN114863096B (zh) 2024-04-16
CN114863096A (zh) 2022-08-05

Similar Documents

Publication Publication Date Title
US11210534B2 (en) Method for position detection, device, and storage medium
US11042762B2 (en) Sensor calibration method and device, computer device, medium, and vehicle
WO2023184869A1 (zh) Semantic map construction and positioning method and device for indoor parking lots
CN110163930B Lane line generation method, apparatus, device, system, and readable storage medium
WO2020098316A1 (zh) Visual-point-cloud-based semantic vector map construction method and apparatus, and electronic device
US20230215187A1 Target detection method based on monocular image
WO2020043081A1 (zh) Positioning technique
CN111986214B Method for constructing pedestrian crosswalks in a map, and electronic device
WO2022206414A1 (zh) Three-dimensional object detection method and apparatus
CN115164918B Semantic point cloud map construction method and apparatus, and electronic device
WO2023028880A1 (zh) Method for calibrating extrinsic parameters of a vehicle-mounted camera, and related apparatus
EP4386676A1 (en) Method and apparatus for calibrating cameras and inertial measurement unit, and computer device
WO2023155581A1 (zh) Image detection method and apparatus
CN112116655A (zh) Method and apparatus for determining position information of an image of a target object
CN113029128A (zh) Visual navigation method and related apparatus, mobile terminal, and storage medium
WO2021184616A1 (zh) Parking space detection method, apparatus, device, and storage medium
CN110197104B (zh) Vehicle-based distance measurement method and apparatus
CN117745845A (zh) Extrinsic parameter information determination method, apparatus, device, and storage medium
CN114386481A (zh) Vehicle perception information fusion method, apparatus, device, and storage medium
CN113297958A (zh) Automated annotation method, apparatus, electronic device, and storage medium
CN117830397A (zh) Relocalization method, apparatus, electronic device, medium, and vehicle
CN116823966A (zh) Camera intrinsic parameter calibration method, apparatus, computer device, and storage medium
CN117152265A (zh) Traffic image calibration method and apparatus based on region extraction
CN114648639B (zh) Target vehicle detection method, system, and apparatus
WO2023168747A1 (zh) Parking space annotation method and apparatus for automatic parking based on a domain controller platform

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22934689

Country of ref document: EP

Kind code of ref document: A1