Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, an object of the present invention is to provide a positioning method, an apparatus, a storage medium and an electronic device, which are used to solve the problems of insufficient stability and high cost of the prior positioning technology.
To achieve the above and other related objects, the present invention provides a positioning method, including: acquiring environmental image information of a location to be positioned; extracting grid semantic information of a specific marker in the environmental image information; acquiring a grid semantic map; and matching the grid semantic information with the grid semantic map, and determining a positioning result of the location to be positioned according to the matching result.
In an embodiment of the present invention, acquiring the environmental image information of the location to be positioned includes preprocessing the acquired environmental image information, where the preprocessing includes Gaussian denoising and image stitching, the image stitching includes 3D virtual projection and viewpoint conversion technologies, and the stitched image can be displayed from an arbitrarily converted viewing angle.
In an embodiment of the present invention, extracting the grid semantic information of the specific marker in the environmental image information includes: obtaining semantic information of all pixel points of one or more specific markers in the image based on a deep learning target detection algorithm or a semantic segmentation algorithm; and extracting the pixel point semantic information of the specific marker as the grid semantic information of the specific marker.
In an embodiment of the present invention, the specific marker includes a road surface marker and a non-road surface space marker, wherein: when the specific marker is a road surface marker, the grid semantic information of the specific marker is transformed onto the road plane in a top-down view for matching; and when the specific marker is a non-road surface space marker, the non-road surface space marker in the local grid semantic map of the object to be positioned is perspective-transformed onto the image plane for matching.
In an embodiment of the present invention, the grid semantic map may be a global grid semantic map or a local grid semantic map, where the local grid semantic map is a grid semantic map of a specific range determined based on the predicted position of the object to be positioned in the global grid semantic map.
In an embodiment of the present invention, the global grid semantic map building process includes: establishing a point cloud map based on perception point cloud data and positioning data; screening specific marker point clouds based on the point cloud map; extracting semantic information of all pixel points of the specific marker based on the specific marker point cloud; extracting the pixel point semantic information of the specific marker as the grid semantic information of the specific marker; and forming the global grid semantic map based on the grid semantic information of each specific marker.
In an embodiment of the present invention, the local grid semantic map building process includes: acquiring the global grid semantic map; determining an initial positioning estimation value of an object to be positioned based on an external positioning source or a motion trend; and acquiring a grid semantic map within a specific range of the initial positioning estimation value based on the initial positioning estimation value.
In an embodiment of the present invention, matching the grid semantic information with the grid semantic map, and determining a positioning result of the location to be positioned according to the matching result, includes: based on a prediction of the position of the object to be positioned, superimposing the current grid semantic information on the grid semantic map, and obtaining the optimal estimate of the current position of the object to be positioned by using a probabilistic model algorithm, wherein the probabilistic model algorithm includes a Kalman filter algorithm, a particle filter algorithm and an adaptive particle filter algorithm.
In an embodiment of the present invention, matching the grid semantic information with the grid semantic map based on a particle filtering algorithm, and determining a positioning result of the to-be-positioned location according to the matching result, includes the following steps:
Step 1), initializing a particle set: at the initial time $k = 0$, generating $N$ particle poses, indexed by $i = 1, 2, \ldots, N$, from the approximate distribution $p(x_0)$ of the approximate position $x_0$ of the location to be positioned;
Step 2), predicting the particle poses at the next time $k = 1$ according to the motion trend of the object to be positioned;
Step 3), according to the predicted pose of each particle, matching the grid semantic information with the global grid semantic map or the local grid semantic map, and taking the distance value between the position information of the two as the weight of the particle;
Step 4), resampling the particles according to the magnitude of the particle weights to obtain the poses of the resampled particles;
Step 5), obtaining the optimal position of the location to be positioned at time $k$ from the weights and poses of all the particles.
To achieve the above and other related objects, the present invention provides a positioning device, comprising: an image information acquisition module, configured to acquire environmental image information of a location to be positioned; a semantic information extraction module, configured to extract grid semantic information of a specific marker in the environmental image information; a semantic map acquisition module, configured to acquire a grid semantic map; and a map matching and positioning module, configured to match the grid semantic information with the grid semantic map and determine a positioning result of the location to be positioned according to the matching result.
To achieve the above and other related objects, the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program is loaded and executed by a processor to implement the positioning method.
To achieve the above and other related objects, the present invention provides an electronic device, comprising: a processor and a memory; wherein the memory is for storing a computer program; the processor is used for loading and executing the computer program to enable the electronic equipment to execute the positioning method.
As described above, the positioning method, the positioning apparatus, the storage medium, and the electronic device of the present invention acquire environmental image information of a location to be positioned; extract grid semantic information of a specific marker in the environmental image information; acquire a grid semantic map; match the grid semantic information with the grid semantic map; and determine a positioning result of the location to be positioned according to the matching result. The invention offers high stability and low cost, and can effectively avoid the adverse effects of illumination and environmental changes on positioning.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the drawings only show the components related to the present invention rather than the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated. In addition, the present application does not limit the execution sequence of each step in the following embodiments, and the sequence between each step in the practical application is not limited to the embodiments provided in the present application.
In the prior art, positioning by means of vision and lidar is a hot research topic in the fields of autonomous robots and autonomous driving. However, each of these two positioning approaches has its own disadvantages.
Vision-based positioning technology mainly obtains the correspondence of feature points between frames through feature point matching, calculates the poses at the times corresponding to the frames and the three-dimensional coordinates of the matched feature points through multi-view geometric equations to form a feature point cloud, then re-projects the feature point cloud onto the current image, and obtains the optimal position estimate and feature point coordinates by minimizing the sum of squared projection errors of all the feature points. In feature point-based positioning algorithms, commonly used feature point detection algorithms include SIFT, SURF, FAST, ORB, and the like. The SIFT and SURF algorithms are invariant to rotation, scaling, brightness changes and the like, but they are slow and cannot achieve real-time extraction and matching. The FAST algorithm only examines pixel gray values and is fast, but provides no orientation or scale information. The ORB algorithm uses BRIEF descriptors on top of the FAST algorithm and handles scale and rotation by using an image pyramid and the gray centroid method, but it is still sensitive to illumination and environmental changes. In addition, optical flow-based positioning algorithms calculate the camera motion using the pixel gray-level information of the image, but the optical flow method assumes that the gray value of the same spatial point is constant across images. Since this gray-level constancy assumption is difficult to satisfy in practice, vision-based positioning algorithms have significant disadvantages in practical applications.
Lidar-based positioning algorithms mainly extract line and surface features from each frame of the laser point cloud, obtain the relative pose between two frames of the laser point cloud through geometric feature matching to form a laser point cloud map, and match the laser point cloud features of the current frame against the laser point cloud map to obtain the optimal position; that is, the generated point cloud map can only be used for positioning with a lidar. However, the data volume of a laser point cloud map is large, lidar is expensive, and every device that uses the laser point cloud map for positioning must be equipped with a lidar. This is difficult to achieve in practice and greatly limits the application of lidar-based positioning.
In view of the defects of the prior art, the application provides a positioning method based on a semantic map, which has high stability compared with the visual positioning technology, has lower cost compared with the laser radar positioning technology, and can effectively avoid the adverse effects of illumination and environmental changes on positioning.
The semantic map-based positioning method can be executed independently by a single electronic device such as a robot, an in-vehicle head unit, a smart phone, a tablet computer or a server, or can be executed jointly by a plurality of electronic devices, which is not limited in the present application.
As shown in fig. 1, the positioning method of the present application includes the following steps:
S11: Acquiring the environmental image information of the location to be positioned.
In an embodiment, the environmental image information of the location to be positioned is preferably acquired by an image information acquisition device of the object to be positioned. The environmental image information can be acquired by a single image information acquisition device, or can be acquired by a plurality of image information acquisition devices facing different directions and then stitched together. The image stitching includes 3D virtual projection, viewpoint conversion technology and the like, and the stitched image can be displayed from an arbitrarily converted viewing angle.
In an embodiment, acquiring the environmental image information of the location to be positioned preferably includes a step of preprocessing the acquired environmental image information, where the preprocessing includes Gaussian denoising, image stitching and the like.
For example, in step S11, the object to be positioned obtains an environmental picture at the location to be positioned through a single image information acquisition device, and performs Gaussian denoising preprocessing on the environmental picture.
For another example, in step S11, the object to be positioned acquires multiple environmental pictures at the location to be positioned through multiple image information acquisition devices facing different directions; the multiple environmental pictures can be stitched into one picture through technologies such as 3D virtual projection and viewpoint conversion, and Gaussian denoising preprocessing can likewise be performed on the environmental pictures.
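The Gaussian denoising step mentioned above can be illustrated with a minimal sketch. The application does not specify a kernel size, a standard deviation, or a library, so the function names (`gaussian_kernel`, `convolve_rows`, `gaussian_denoise`), the default parameters, and the border-clamping policy below are all assumptions made for illustration; the image is represented as a plain list of rows of gray values.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(image, kernel):
    """Filter every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    """Swap rows and columns of a list-of-rows image."""
    return [list(col) for col in zip(*image)]


def gaussian_denoise(image, sigma=1.0, radius=2):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma, radius)
    blurred_rows = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(blurred_rows), kernel))
```

A real implementation would more likely call an image processing library; the sketch only shows that the operation is a weighted local average whose weights follow a Gaussian profile.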
S12: Extracting the grid semantic information of the specific marker in the environmental image information.
Specifically, the specific marker includes road surface markers and non-road surface space markers. Road surface markers mainly include relatively standardized and stable man-made ground markings such as lane lines, arrows, speed bumps, vehicle lines and the like; non-road surface space markers include markers other than road surface markers, such as utility poles, traffic signs, buildings, landscapes and the like, and can be defined by those skilled in the art.
In one embodiment, preferably, extracting the grid semantic information of the specific identifier in the environment image information includes: firstly, acquiring semantic information of all pixel points of one or more specific markers in an image based on a deep learning target detection algorithm or a semantic segmentation algorithm; and then, extracting pixel point semantic information of the specific marker as the grid semantic information of the specific marker.
It should be noted that, a person skilled in the art may use the existing deep learning target detection algorithm and semantic segmentation algorithm to perform semantic extraction on an image, and may also use other algorithms to perform semantic extraction on an image, which is not limited in this application.
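As a rough illustration of the second half of step S12, the sketch below assumes that a segmentation network (not shown) has already produced a per-pixel label mask, and keeps only the pixels that belong to the specific markers. The class ids, the `MARKER_CLASSES` set, and the function name `extract_grid_semantics` are hypothetical and are not taken from the application.

```python
# Hypothetical class ids produced by some semantic segmentation model.
BACKGROUND, LANE_LINE, ARROW, POLE = 0, 1, 2, 3
MARKER_CLASSES = {LANE_LINE, ARROW, POLE}


def extract_grid_semantics(label_mask, marker_classes=MARKER_CLASSES):
    """Keep only the pixels labeled as specific markers.

    label_mask is a list of rows of integer class ids. The result maps
    the pixel coordinate (u, v) = (column, row) to its class id.
    """
    cells = {}
    for v, row in enumerate(label_mask):
        for u, label in enumerate(row):
            if label in marker_classes:
                cells[(u, v)] = label
    return cells
```

For a 3 x 3 mask such as `[[0, 1, 0], [0, 1, 3], [2, 0, 0]]`, the function keeps the four marker pixels and discards the background.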
S13: Acquiring a grid semantic map.
The grid semantic map may be a global grid semantic map or a local grid semantic map.
Specifically, the global grid semantic map is a grid semantic map covering the global range. The global grid semantic map may be built by the device itself, or may be obtained in other ways, such as by downloading from the cloud or by being built in beforehand.
As shown in fig. 2, the global grid semantic map building process includes the following steps:
S21: Establishing a point cloud map based on the perception point cloud data and the positioning data;
The perception point cloud data can be generated by a lidar, and the positioning data can be generated by a Beidou device and a GPS device. Since how to build the point cloud map is not a key inventive point of the present application, it is not described in detail here; those skilled in the art can build the point cloud map from the perception point cloud data and the positioning data by using existing algorithms or software.
S22: Screening specific marker point clouds based on the point cloud map;
In order to reduce the data volume of the global grid semantic map, the global grid semantic map is constructed from the specific marker point clouds in the point cloud map. The specific marker point clouds include point clouds of road surface markers and point clouds of non-road surface space markers, which can be selected by a person skilled in the art according to actual needs.
S23: Extracting semantic information of all pixel points of the specific marker based on the specific marker point cloud;
For example, if the specific marker is a lane line, the semantic information of all the pixel points of the lane line is extracted from the point cloud of the lane line marker.
S24: Extracting pixel point semantic information of the specific marker as grid semantic information of the specific marker;
S25: Extracting the grid semantic information of each specific marker to form the global grid semantic map.
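Steps S22 to S25 can be sketched as a rasterization of a labeled point cloud. The sketch assumes that each map point already carries a semantic label and is expressed in a common map frame; the tuple layout `(x, y, z, label)`, the grid resolution, the majority-vote rule for cells hit by several labels, and the function name `build_global_grid_map` are assumptions for illustration only.

```python
import math
from collections import Counter, defaultdict


def build_global_grid_map(labeled_points, marker_labels, resolution=0.1):
    """Rasterize labeled map points into a sparse grid semantic map.

    labeled_points: iterable of (x, y, z, label) in metres, map frame.
    marker_labels:  labels of the specific markers to keep (S22).
    Returns a dict mapping the grid cell (ix, iy) to a semantic label.
    """
    votes = defaultdict(Counter)
    for x, y, _z, label in labeled_points:
        if label not in marker_labels:
            continue  # S22: discard points that are not specific markers
        cell = (math.floor(x / resolution), math.floor(y / resolution))
        votes[cell][label] += 1  # S23/S24: per-cell semantic information
    # S25: keep the most frequent label of each occupied cell
    return {cell: counter.most_common(1)[0][0]
            for cell, counter in votes.items()}
```

Storing only occupied cells of the specific markers, rather than the full point cloud, reflects the data volume reduction described in S22.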
Specifically, the local grid semantic map is a grid semantic map of a specific range determined based on the predicted position of the object to be positioned in the global grid semantic map.
As shown in fig. 3, the local grid semantic map building process includes the following steps:
S31: Acquiring the global grid semantic map;
S32: Determining an initial positioning estimation value of the object to be positioned based on an external positioning source or a motion trend;
The external positioning source is, for example, GPS, Beidou, visual SLAM, or lidar SLAM. The motion trend can be determined comprehensively from the driving distance and driving direction reported by the vehicle odometer. For example, after the global grid semantic map is obtained, the initial position of the vehicle is obtained from GPS; alternatively, the acquired environmental image information and the vehicle body data are clock-synchronized by hardware, and when a frame of deep learning perception results (the extracted grid semantic information of the specific marker in the environmental image information) arrives, the corresponding vehicle body information is looked up by timestamp. The current vehicle odometer value is obtained according to the timestamp, and the current vehicle body position is predicted from the vehicle body position at time k-1 and the vehicle odometer value.
S33: Acquiring a grid semantic map within a specific range of the initial positioning estimation value based on the initial positioning estimation value.
In connection with the above example, in this step, the corresponding portion of the specific range of the initial positioning estimation value is extracted from the global grid semantic map, so as to form a local grid semantic map of a certain range including the position of the initial positioning estimation value. The size of the specific range value can be preset according to actual needs. For example, when the specific range value is set to 10 meters, the grid semantic map within a range of 10 meters around the current localization estimation value is selected from the global grid semantic map as the local grid semantic map.
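Steps S32 and S33 can be sketched as follows. The prediction formula of the application is not reproduced in this text, so `predict_pose` uses a simple dead-reckoning model (travelled distance plus heading change) purely as an assumption. `crop_local_map` assumes the sparse `{(ix, iy): label}` map representation and a circular range; both function names and all parameters are hypothetical.

```python
import math


def predict_pose(prev_pose, distance, heading_change):
    """Dead-reckoning prediction from the pose at time k-1 (assumed model).

    prev_pose is (x, y, theta) in metres and radians; distance and
    heading_change are the odometer increments since time k-1.
    """
    x, y, theta = prev_pose
    theta_new = theta + heading_change
    return (x + distance * math.cos(theta_new),
            y + distance * math.sin(theta_new),
            theta_new)


def crop_local_map(global_map, center_xy, radius_m, resolution):
    """Select the cells whose centers lie within radius_m of center_xy."""
    cx, cy = center_xy
    local = {}
    for (ix, iy), label in global_map.items():
        px = (ix + 0.5) * resolution  # cell center in metres
        py = (iy + 0.5) * resolution
        if math.hypot(px - cx, py - cy) <= radius_m:
            local[(ix, iy)] = label
    return local
```

With `radius_m=10.0`, the second function corresponds to the 10-meter example above: only cells within 10 meters of the initial positioning estimation value are kept for matching.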
S14: Matching the grid semantic information with the grid semantic map, and determining a positioning result of the location to be positioned according to the matching result.
Specifically, matching the grid semantic information with the grid semantic map, and determining a positioning result of the location to be positioned according to the matching result, includes: based on a prediction of the position of the object to be positioned, superimposing the current grid semantic information on the grid semantic map, and obtaining the optimal estimate of the current position of the object to be positioned by using a probabilistic model algorithm. The probabilistic model algorithm includes, but is not limited to, a Kalman filter algorithm, a particle filter algorithm, and an adaptive Monte Carlo particle filter algorithm.
Matching the grid semantic information with the grid semantic map based on a particle filter algorithm, and determining a positioning result of the to-be-positioned location according to the matching result, wherein the specific process comprises the following steps:
Step 1), initializing a particle set: at the initial time $k = 0$, generating $N$ particle poses, indexed by $i = 1, 2, \ldots, N$, from the approximate distribution $p(x_0)$ of the approximate position $x_0$ of the location to be positioned;
Step 2), predicting the particle poses at the next time $k = 1$ according to the motion trend of the object to be positioned;
Step 3), according to the predicted pose of each particle, matching the grid semantic information with the global grid semantic map or the local grid semantic map, and taking the distance value between the position information of the two as the weight of the particle;
Step 4), resampling the particles according to the magnitude of the particle weights to obtain the poses of the resampled particles;
Step 5), obtaining the optimal position of the location to be positioned at time $k$ from the weights and poses of all the particles.
For example, referring to fig. 4A, first, at the initial time, 4 particles distributed near the initial position are obtained according to information provided by GPS; next, the poses of the particles at the current time are obtained according to the motion trend of the object to be positioned. Referring to fig. 4B, if a marker is detected 3 m ahead of the current position, as shown by the dashed circular frame, a black square is drawn 3 m ahead of each particle in the figure; the smaller the distance between a particle's black square and the square enclosed by the dashed circular frame, the closer that particle is to the real position and the larger its weight. Referring to fig. 4C, the particles are resampled according to the weights, with larger-weight particles sampled multiple times, giving the particle distribution shown in fig. 4C. Referring to fig. 4D, the optimal result at time k is obtained from the weights and poses of all the particles, as shown by the arrow marked by the dashed oval in fig. 4D.
Once the optimal position is obtained through matching, the location to be positioned has been located; by repeating steps S11 to S14, the object to be positioned can be positioned continuously in real time.
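The five particle filter steps can be sketched in a simplified 2D form. Several choices below are assumptions rather than details of the application: particles carry only a position (no heading), a single marker with a known map position is observed at a known offset from the vehicle, and the weight is a Gaussian function of the matching distance, so that a smaller distance gives a larger weight as in the fig. 4B example. All function and parameter names are hypothetical.

```python
import math
import random


def init_particles(x0, sigma, n, rng):
    """Step 1: draw n particle positions around the approximate position x0."""
    return [(rng.gauss(x0[0], sigma), rng.gauss(x0[1], sigma)) for _ in range(n)]


def predict(particles, motion, noise, rng):
    """Step 2: move every particle by the motion estimate plus random noise."""
    dx, dy = motion
    return [(x + dx + rng.gauss(0.0, noise), y + dy + rng.gauss(0.0, noise))
            for x, y in particles]


def weigh(particles, observed_offset, marker_xy, sigma):
    """Step 3: a smaller distance between the observed marker placed at each
    particle and the marker in the map yields a larger normalized weight."""
    ox, oy = observed_offset
    mx, my = marker_xy
    weights = []
    for x, y in particles:
        d = math.hypot(x + ox - mx, y + oy - my)
        weights.append(math.exp(-0.5 * (d / sigma) ** 2))
    total = sum(weights)
    if total == 0.0:
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in weights]


def resample(particles, weights, rng):
    """Step 4: draw particles with probability proportional to their weight."""
    return rng.choices(particles, weights=weights, k=len(particles))


def estimate(particles, weights):
    """Step 5: weighted mean of the particle positions."""
    x = sum(w * p[0] for p, w in zip(particles, weights))
    y = sum(w * p[1] for p, w in zip(particles, weights))
    return (x, y)


def particle_filter_step(particles, motion, observed_offset, marker_xy, rng,
                         motion_noise=0.1, match_sigma=0.5):
    """Run steps 2 to 5 once and return (resampled particles, estimate)."""
    predicted = predict(particles, motion, motion_noise, rng)
    weights = weigh(predicted, observed_offset, marker_xy, match_sigma)
    resampled = resample(predicted, weights, rng)
    # After resampling, every surviving particle carries the same weight.
    uniform = [1.0 / len(resampled)] * len(resampled)
    return resampled, estimate(resampled, uniform)
```

A typical use is to call `init_particles` once with the GPS-based initial position and then call `particle_filter_step` for every new frame, feeding the returned particles back in.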
In one embodiment, when the specific marker is a road surface marker, the grid semantic information of the specific marker is transformed onto the road plane in a top-down view for matching; and when the specific marker is a non-road surface space marker, the non-road surface space marker in the local grid semantic map of the object to be positioned is perspective-transformed onto the image plane for matching.
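The two transformations can be sketched with a pinhole camera model. The application does not give the camera model, so the following is an assumption: the camera frame has x to the right, y downward and z forward, the optical axis is parallel to a flat road, and the camera is mounted `cam_height` metres above the road. The intrinsic parameters `fx`, `fy`, `cx`, `cy` and both function names are hypothetical.

```python
def project_to_image(point_cam, fx, fy, cx, cy):
    """Perspective-project a 3D point (camera frame) onto the image plane.

    This is the direction used for non-road surface space markers taken
    from the map. Returns (u, v) in pixels, or None if the point is not
    in front of the camera.
    """
    X, Y, Z = point_cam
    if Z <= 0.0:
        return None
    return (fx * X / Z + cx, fy * Y / Z + cy)


def pixel_to_road_plane(u, v, fx, fy, cx, cy, cam_height):
    """Map a road pixel to top-down road-plane coordinates.

    This is the direction used for road surface markers. Returns
    (lateral, forward) in metres, or None for pixels at or above the
    horizon row cy, which cannot lie on the road plane.
    """
    if v <= cy:
        return None
    Z = fy * cam_height / (v - cy)
    X = (u - cx) * Z / fx
    return (X, Z)
```

Under these assumptions the two functions are inverses of each other for points on the road: a road point 1 m to the right and 10 m ahead of a camera mounted 1.5 m high projects to a pixel that maps back to the same road-plane coordinates.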
It is worth noting that, compared with the matching of the grid semantic information and the global grid semantic map, the matching of the grid semantic information and the local grid semantic map is faster and more accurate.
In an embodiment, the grid semantic information may be matched with the global grid semantic map or the local grid semantic map based on probabilistic model algorithms including, but not limited to, a Kalman filter algorithm, a particle filter algorithm, an adaptive particle filter algorithm, and the like.
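For the Kalman filter option, a minimal scalar sketch is given below. It is a textbook one-dimensional Kalman filter, not a formulation taken from the application: the state `x` is a position along one axis with variance `p`, the control `u` is an odometer increment with process noise `q`, and the measurement `z` is assumed to be a position obtained by matching the grid semantic information against the map, with measurement noise `r`.

```python
def kalman_predict(x, p, u, q):
    """Prediction: shift the state by the motion u and inflate the variance."""
    return x + u, p + q


def kalman_update(x, p, z, r):
    """Update: blend the prediction with the measurement z by the gain k."""
    k = p / (p + r)  # Kalman gain in the scalar case
    return x + k * (z - x), (1.0 - k) * p
```

Starting from `x = 0.0`, `p = 1.0`, a motion of 1.0 with no process noise predicts `(1.0, 1.0)`; a measurement `z = 2.0` with `r = 1.0` then gives the updated state `(1.5, 0.5)`, halfway between prediction and measurement because both are equally uncertain.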
All or part of the steps for implementing the above method embodiments may be performed by hardware associated with a computer program. Based on this understanding, the present invention also provides a computer program product comprising one or more computer instructions. The computer instructions may be stored in a computer-readable storage medium. The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.
As shown in fig. 5, the present application further provides a positioning device 50, and since the specific implementation of the present device is the same as that of the foregoing method embodiment, the same contents are not repeated herein. The positioning device 50 mainly comprises the following modules:
an image information obtaining module 51, configured to obtain environment image information of a location to be positioned;
a semantic information extracting module 52, configured to extract grid semantic information of a specific marker in the environmental image information;
a semantic map obtaining module 53, configured to obtain a grid semantic map;
and the map matching and positioning module 54 is configured to match the grid semantic information with the grid semantic map, and determine a positioning result of the to-be-positioned location according to a matching result.
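The module structure of the positioning device 50 can be sketched as a simple composition of four callables. This is only an illustrative skeleton: the class name `PositioningDevice`, the field names, and the `locate` method are hypothetical, and the concrete behavior of each module is left to whatever implementation is plugged in.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PositioningDevice:
    acquire_image: Callable[[], Any]              # role of module 51
    extract_semantics: Callable[[Any], Any]       # role of module 52
    get_map: Callable[[], Any]                    # role of module 53
    match_and_locate: Callable[[Any, Any], Any]   # role of module 54

    def locate(self):
        """Run the four modules once, in the order of steps S11 to S14."""
        image = self.acquire_image()
        semantics = self.extract_semantics(image)
        grid_map = self.get_map()
        return self.match_and_locate(semantics, grid_map)
```

Keeping the modules as interchangeable callables mirrors the statement that the device embodiment follows the method embodiment step by step.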
Referring to fig. 6, this embodiment provides an electronic device, which may be a desktop computer, a tablet computer, a smart phone, an in-vehicle head unit, or the like. In detail, the electronic device includes at least a memory and a processor connected by a bus, wherein the memory is used for storing a computer program, and the processor is used for executing the computer program stored in the memory so as to perform all or part of the steps in the method embodiments.
The above-mentioned system bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus. The communication interface is used for realizing communication between the database access device and other equipment (such as a client, a read-write library and a read-only library). The Memory may include a Random Access Memory (RAM), and may further include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In summary, according to the positioning method, the positioning device, the storage medium and the electronic device of the present invention, the point cloud map is used to extract the grid semantic map, so as to ensure the accuracy of the grid semantic map, reduce the data volume of the map, and perform positioning by using the visual perception grid semantic information during real-time positioning, so that the cost is low, the map reusability is high, and the perception grid semantic information is not affected by illumination and environmental changes, thereby being beneficial to improving the positioning stability, effectively overcoming various defects in the prior art, and having high industrial utilization value.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.