CN108481327B - Positioning device, positioning method and robot for enhancing vision - Google Patents

Positioning device, positioning method and robot for enhancing vision

Info

Publication number
CN108481327B
CN108481327B (application CN201810543865.0A)
Authority
CN
China
Prior art keywords
image
landmark
camera
positioning device
positioning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810543865.0A
Other languages
Chinese (zh)
Other versions
CN108481327A (en)
Inventor
赖钦伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Amicro Semiconductor Co Ltd
Original Assignee
Zhuhai Amicro Semiconductor Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Amicro Semiconductor Co Ltd filed Critical Zhuhai Amicro Semiconductor Co Ltd
Priority to CN201810543865.0A priority Critical patent/CN108481327B/en
Publication of CN108481327A publication Critical patent/CN108481327A/en
Application granted granted Critical
Publication of CN108481327B publication Critical patent/CN108481327B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J19/00Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
    • B25J19/02Sensing devices
    • B25J19/04Viewing devices
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/0009Constructional details, e.g. manipulator supports, bases
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Analysis (AREA)
  • Navigation (AREA)

Abstract

The application discloses a vision-enhanced positioning device, a positioning method and a robot. The positioning device is a mobile visual positioning device comprising: an image acquisition module, comprising a forward-facing tilt camera and a backward-facing tilt binocular camera mounted at different positions of the positioning device to enhance its visual sensing; an image processing module, comprising an image preprocessing sub-module and a feature matching sub-module, for processing the acquired image data; an inertial data acquisition and processing module, for sensing the rotation angle, acceleration and translation speed of the inertial sensor in real time; and a fusion positioning module, for fusing the environmental information acquired by the sensor modules to achieve reliable and robust autonomous positioning. Compared with the prior art, the feature-matched image data is fused with the inertial data and landmarks are updated in combination with the relative position relationship, so that feature matching is more accurate and the positioning algorithm is more robust.

Description

Positioning device, positioning method and robot for enhancing vision
Technical Field
The application relates to positioning methods and positioning devices, and in particular to a vision-enhanced positioning device, positioning method and robot.
Background
A basic technology for making robots intelligent is the ability to position themselves and move autonomously, and indoor navigation is a key part of this. Current indoor navigation technologies include inertial sensor navigation, laser navigation, visual navigation and radio navigation, each with its own advantages and disadvantages. Inertial sensor navigation uses a gyroscope, an odometer and the like for navigation and positioning; it is low-cost but drifts over long periods. Laser navigation is precise, but expensive, and sensor lifetime is a concern. Visual navigation in the traditional sense is computationally complex, demanding on processor performance, and high in power consumption and price. Radio navigation requires multiple fixed transmitters, which is inconvenient to deploy and relatively expensive. Integrating multiple technologies at low cost and high precision is one development direction of robot navigation technology.
In existing visual sweeper products the camera is mounted at the front of the machine and generally needs to protrude slightly to obtain a good viewing angle; however, the lens is then easily touched by objects that are difficult to detect and easily scratched. Moreover, the front of the machine usually carries many other sensors (for example, many machines have bumper bars and cylindrical 360-degree infrared receivers) that easily occlude the camera, so the camera's mounting angle has to be raised.
Disclosure of Invention
The vision-enhanced positioning device is a mobile visual positioning device comprising an image acquisition module, an image processing module, an inertial data acquisition module and a fusion positioning module;
the image acquisition module comprises a forward-facing tilt camera, for detecting and identifying objects in the forward driving direction of the positioning device, and a backward-facing tilt camera, for capturing environment images for positioning;
the image processing module comprises an image preprocessing sub-module and a feature matching sub-module and processes the image data acquired by the image acquisition module; the image preprocessing sub-module converts the data acquired by the backward-facing tilt camera into a gray image, and the feature matching sub-module extracts feature data from the preprocessed image and matches it against the associated features of landmark images in a landmark database; the landmark database is built into the image processing module and contains the image feature points of the region associated with a given landmark;
the inertial data acquisition and processing module consists of a set of inertial measurement units and senses the rotation angle, acceleration and translation speed of the inertial sensor in real time;
and the fusion positioning module fuses the inertial data acquired by the inertial data acquisition and processing module with the image data acquired by the forward-facing tilt camera according to the feature matching result of the image processing module, and corrects the current position information with the fusion result.
Further, the forward-facing tilt camera is positioned at a forward-opening concave and/or convex structure in the front half of the top surface of the positioning device.
Further, the backward-facing tilt camera is a binocular camera whose two cameras have identical imaging parameters and are positioned side by side at a concave and/or convex structure of the backward-opening tail of the top surface of the positioning device.
Further, the optical axes of the forward-facing and backward-facing tilt cameras are each inclined to the top surface of the positioning device at an angle within 0 to 80 degrees, and the two angles remain equal.
Further, in the fusion positioning module, when feature matching in the image processing module succeeds, the coordinates of the landmark in the map are obtained; combined with the coordinates of the positioning device relative to the landmark, the coordinates of the positioning device in the map can be calculated and then updated and corrected using the inertial data;
when feature matching in the image processing module fails, the rigid connection relation between the inertial sensor and the forward-facing tilt camera is obtained from the accumulated inertial data; a new landmark is then calculated from the relative pose between the target image obtained by the backward-facing tilt camera from left-right binocular parallax and the associated features of the landmark images in the landmark database, and is stored in the landmark database, completing the creation of the new landmark;
the inertial sensor is connected to the forward-facing tilt camera, a mapping exists between the forward-facing tilt camera and the gray-image features or landmark-image associated features, and the features can be obtained by gray-image extraction; the rigid connection relation is a positional relation established from the pose change corresponding to the inertial data between two adjacent frames acquired by the forward-facing tilt camera.
A vision-enhanced positioning method, applied to the above positioning device, comprises the following steps:
the two cameras of the backward-facing tilt camera each acquire the same landmark in the actual scene to obtain a left image and a right image; an image region around a feature point is determined in the left image and used as a template, image regions of the same size as the template are correspondingly extracted from the right image, and parallax-gradient-constrained matching is completed along the epipolar line; the minimum matching value is selected from the obtained matching values, the image region corresponding to it is taken as the target region, and the descriptors generated from the target region are feature-matched against the descriptors of the associated features of landmark images stored in a landmark database; the coordinates of the positioning device relative to the landmark are calculated from the imaging-feature geometric relationship between the target region and the landmark;
whether the target region matches the associated features of the landmark images in the landmark database is judged; if so, the coordinates of the landmark in the map are obtained and, combined with the coordinates of the positioning device relative to the landmark, the coordinates of the positioning device in the map are calculated, then updated and corrected using the inertial data, completing the real-time positioning of the positioning device;
otherwise, the inertial data is fused according to the rigid connection relation between the inertial sensor and the forward-facing tilt camera, a new landmark is calculated from the relative pose between the feature points of the target region and the feature points associated with the landmark images in the landmark database, and the new landmark is stored in the landmark database, completing its creation;
wherein the inertial data has undergone calibration and filtering, and the rigid connection relation is a positional relation established from the pose change corresponding to the inertial data between two adjacent frames acquired by the forward-facing tilt camera.
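As a concrete illustration of the coordinate calculation in the matching branch above, the following sketch (the 2-D simplification, the function name and the use of a single heading angle are assumptions for illustration, not patent text) derives the device's map coordinates from a matched landmark's map coordinates, the device-to-landmark offset given by the stereo geometry, and a heading taken from the inertial data:

```python
import numpy as np

def device_map_position(landmark_xy, offset_in_device_frame, heading_rad):
    """Hypothetical helper: device map position from a matched landmark.

    landmark_xy: landmark coordinates in the map frame.
    offset_in_device_frame: landmark position relative to the device,
        expressed in the device frame (from the binocular geometry).
    heading_rad: device heading in the map frame (from the inertial data).
    """
    c, s = np.cos(heading_rad), np.sin(heading_rad)
    rot = np.array([[c, -s], [s, c]])        # device frame -> map frame
    # device position = landmark position minus the rotated offset
    return np.asarray(landmark_xy) - rot @ np.asarray(offset_in_device_frame)

print(device_map_position([2.0, 3.0], [1.0, 0.5], np.pi / 2))  # -> [2.5, 2.0]
```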
Further, the feature matching process includes: under the current frame image, calculating the Hamming distance between the descriptors of the target-region features and the corresponding descriptors among the associated features of the landmark images in the landmark database;
if the Hamming distance is smaller than a preset threshold, the image collected by the backward-facing tilt camera is highly similar to the associated features of the corresponding landmark image in the landmark database, and the match is considered successful;
wherein the preset threshold corresponds to a determined numerical relationship of the relative pose.
Further, fusing the inertial data includes: when feature matching fails, the rigid connection relation between the inertial sensor and the forward-facing tilt camera is obtained from the pose change corresponding to the inertial data between two adjacent frames acquired by the forward-facing tilt camera; with the internal parameters of the forward-facing tilt camera known, the feature-point coordinates of the current frame predicted by the inertial sensor are calculated from this rigid connection relation, compared with the feature-point coordinates of the current frame acquired by the forward-facing tilt camera, and the feature-point coordinates of the current image acquired by the forward-facing tilt camera are updated and corrected.
Further, the imaging-feature geometric relationship is a similar-triangle relationship established from the parallax between the target regions of the right and left images acquired by the backward-facing tilt camera and the position of the landmark in the acquired actual scene.
A robot, which is a mobile robot provided with the positioning device.
The application provides a forward-facing tilt camera for detecting and identifying objects in the forward driving direction of the positioning device, establishing the rigid connection relation between the inertial sensor and the forward-facing tilt camera and thereby fusing the inertial data; it also provides a backward-facing tilt binocular camera for capturing environment images and realizing binocular positioning from geometric relationships. Compared with the prior art, the application sets cameras facing different directions to acquire image data separately for obstacle detection and recognition and for positioning and navigation, which improves the precision of landmark detection; the cameras cooperate to realize positioning, reducing the load on memory and computing resources, shortening the feature search time and improving navigation efficiency.
Drawings
FIG. 1 is a block diagram of a positioning device for enhancing vision according to an embodiment of the present application;
FIG. 2 is a flow chart of a positioning method for enhancing vision according to the embodiment of the present application;
fig. 3 is a diagram of a robot system structure for enhancing vision (a binocular camera is positioned at a protruding structure on the surface of the positioning device) according to the embodiment of the present application.
Description of the embodiments
The following is a further description of embodiments of the application, taken in conjunction with the accompanying drawings:
the positioning device for enhancing the vision in the embodiment of the application is implemented in a robot mode and comprises a sweeping robot, an AGV and the like. The following assumes that the obstacle avoidance device is mounted on a robot for sweeping floor. However, it will be appreciated by those skilled in the art that the configuration according to the embodiment of the present application can be extended to be applied to a mobile terminal, except for being particularly used for a mobile robot.
The application provides a vision-enhanced positioning device, a mobile visual positioning device which, as shown in fig. 1, comprises an image acquisition module, an image processing module, an inertial data acquisition module and a fusion positioning module. The image acquisition module comprises a forward-facing tilt camera for detecting and identifying objects in the forward driving direction of the positioning device, and a backward-facing tilt camera for capturing environment images for positioning. As the positioning device advances, the forward-facing tilt camera, placed at the front, generally needs to protrude slightly and is kept at a preset angle to obtain a better viewing angle. The backward-facing tilt camera is placed at the tail of the positioning device and performs positioning navigation by collecting identifiable landmarks. Whenever the positioning device turns, the features and/or landmarks observed by the forward-facing tilt camera of the image acquisition module can be used for navigation on the return path through the image data captured by the backward-facing tilt camera.
Preferably, as shown in fig. 3, the forward-facing tilt camera 108 is positioned at a forward-opening concave structure in the front half of the top surface of the positioning device for detecting and identifying objects in the forward driving direction; specifically, the front camera is used only for object recognition and obstacle detection, not for positioning navigation. The backward-facing tilt camera is positioned at a backward protruding structure of the rear opening in the top surface of the positioning device. The rear position protects the camera from collision and occlusion and makes it better suited to capturing environment images for accurate positioning. In particular, the forward-facing tilt camera 108 and the backward-facing tilt camera both form an angle of about 45 degrees with the top surface of the positioning device, which increases the effective field of view and prevents unwanted imaging problems, such as light reflection and/or refraction that may defeat effective imaging, making the device better suited to positioning and mapping in indoor environments.
Further, the forward-facing tilt camera 108 is positioned at the front half opening of the top surface of the positioning device, a placement usually used for navigation positioning in the prior art; in this application it serves only for object recognition and obstacle detection, because its lens can be blocked while the positioning device drives forward, which hinders real-time positioning navigation, and when blocked it can still recognize objects from the features of the partially occluded image. The open, forward-facing concave and/or convex structure, though easily occluded, provides the camera with a specific viewing angle, improving the accuracy of the feature angles in the captured images.
Further, the backward-facing tilt camera is a binocular camera with identical imaging parameters; as shown in fig. 3, it consists of a left camera 106L and a right camera 106R positioned side by side at a protruding structure of the rear opening of the top surface of the positioning device. This placement reduces the chance that the lenses of the left camera 106L and the right camera 106R are blocked while the positioning device drives forward and prevents unwanted imaging problems, such as light reflection and/or refraction that may defeat effective imaging, making the pair better suited to real-time positioning navigation in indoor environments.
Specifically, as shown in fig. 3, the optical axes of the forward-facing tilt camera 108 and of the left camera 106L and right camera 106R of the backward-facing tilt camera are inclined to the top surface of the positioning device at angles within 0 to 80 degrees; both tilt angles are the acute angle α, generally set to 45 degrees, which ensures a good approximation of the real imaging characteristics and improves the precision of landmark feature detection.
As shown in fig. 1, the image processing module includes an image preprocessing sub-module and a feature matching sub-module and processes the image data acquired by the image acquisition module. The image preprocessing sub-module receives the image data acquired by the image acquisition module in order to establish uniquely re-identifiable landmarks in the surrounding environment; it binarizes the color image data acquired by the backward-facing tilt camera, converting it into a gray image, which completes the preprocessing. The feature matching sub-module then extracts feature data from the preprocessed images and matches it against the associated features of the landmark images in the landmark database.
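For illustration, the color-to-gray conversion in this preprocessing step can be as simple as a standard luminance weighting; the weights below are the common ITU-R BT.601 values, an assumption rather than patent text:

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 RGB image to an 8-bit gray image."""
    weights = np.array([0.299, 0.587, 0.114])   # BT.601 luma coefficients
    return (rgb.astype(float) @ weights).astype(np.uint8)
```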
The landmark database is built into the image processing module and contains the image feature points of the region associated with a given landmark. It holds information about a number of previously observed landmarks with which the positioning device can perform navigation and positioning actions. A landmark may be regarded as a collection of features with a specific two-dimensional structure. Any of a variety of features may be used to identify a landmark; for example, when the positioning device is a house-cleaning robot, a landmark may be, but is not limited to, a set of features identified from the two-dimensional structure of the corners of a picture frame. Such features rest on static geometry within the room; although they vary somewhat with illumination and scale, they are generally easier to discern and identify as landmarks than objects in lower regions of the environment that are frequently displaced (e.g., chairs, garbage cans, pets).
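By way of illustration only, a plausible in-memory layout for such a landmark database is sketched below; the field names and types are assumptions, not taken from the patent:

```python
from dataclasses import dataclass
from typing import Dict, Tuple
import numpy as np

@dataclass
class Landmark:
    landmark_id: int
    map_xy: Tuple[float, float]   # landmark coordinates in the map frame
    keypoints: np.ndarray         # N x 2 pixel coordinates in the stored landmark image
    descriptors: np.ndarray       # N x 32 bytes, one packed 256-bit descriptor per feature

# the database maps a landmark identifier to its stored features
database: Dict[int, Landmark] = {}

def add_landmark(lm: Landmark) -> None:
    database[lm.landmark_id] = lm
```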
As shown in fig. 1, the inertial data acquisition and processing module consists of a set of inertial measurement units and senses the rotation angle, acceleration and translation speed of the inertial sensor in real time. The module acquires inertial data through the inertial sensor, then calibrates and filters it and passes it to the fusion positioning module. Raw processing of the inertial data comprises masking of maximum and minimum values, static drift elimination, and Kalman filtering. The inertial sensors include an odometer, a gyroscope, an accelerometer and the like for inertial navigation. In subsequent processing, the data acquired by the inertial sensors is combined with the optical flow observed between consecutive images of tracked landmarks to determine the distance travelled, giving an optical-flow odometry suited to the sensor combination required for the image matching at hand.
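A compact sketch of the raw-data processing named above (maximum/minimum masking, static drift elimination, Kalman filtering) is given below for a single scalar channel; the constants and the assumption that the first 50 samples are taken at rest are illustrative only:

```python
import numpy as np

def preprocess(samples, lo=-8.0, hi=8.0, q=1e-3, r=1e-1):
    x = np.clip(np.asarray(samples, dtype=float), lo, hi)  # mask extreme values
    x = x - x[:50].mean()        # static drift: subtract the at-rest bias
    est, p, out = 0.0, 1.0, []
    for z in x:                  # scalar Kalman filter
        p += q                   # predict: grow the state uncertainty
        k = p / (p + r)          # Kalman gain
        est += k * (z - est)     # update with the new measurement
        p *= (1.0 - k)
        out.append(est)
    return np.array(out)

print(preprocess(np.random.default_rng(0).normal(0.02, 0.3, 200))[-5:])
```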
As shown in fig. 1, the fusion positioning module performs data fusion on the inertial data acquired by the inertial data acquisition and processing module according to the feature matching result of the image processing module, combined with the image data acquired by the forward-facing tilt camera, and corrects the current position information with the fusion result. Based on the images collected by the forward-facing and backward-facing tilt cameras, combined with the travel distance obtained by the inertial sensor, the module matches newly obtained image information against the corresponding landmark images stored in the landmark database, and then performs data fusion to realize positioning.
Specifically, in the fusion positioning module, when feature matching in the image processing module succeeds, the coordinates of the landmark in the map are obtained; combined with the coordinates of the positioning device relative to the landmark, the coordinates of the positioning device in the map can be calculated and then updated and corrected using the inertial data. When feature matching in the image processing module fails, the rigid connection relation between the inertial sensor and the forward-facing tilt camera is obtained from the accumulated inertial data; a new landmark is then calculated from the relative pose between the target image obtained by the backward-facing tilt camera from left-right binocular parallax and the associated features of the landmark images in the landmark database, and is stored in the landmark database, completing the creation of the new landmark. The rigid connection relation is a positional relation established from the pose change corresponding to the inertial data between two adjacent frames acquired by the forward-facing tilt camera. The inertial sensor is connected to the forward-facing tilt camera, a mapping exists between the forward-facing tilt camera and the gray-image features or landmark-image associated features, and the features can be obtained by gray-image extraction; iterating on the inertial sensor data between two consecutive frames according to the rigid connection relation yields a predicted value of the current position of the positioning device, so that the search region during feature matching is smaller and matching is faster.
In an embodiment of the application, the forward-facing tilt camera 108 and the left camera 106L and right camera 106R of the backward-facing tilt camera capture images of the surrounding environment and provide them to the image processing module. While moving forward toward a landmark, the positioning device may use the forward-facing tilt camera 108 to detect a set of features associated with the landmark without using them to match landmark images in the landmark database for positioning navigation; when the direction of motion switches, the backward-facing tilt camera tracks the same set of features, in which case it can be used for positioning navigation while the positioning device is controlled to move away from the landmark. The positioning device can also use the forward-facing tilt camera 108 and the backward-facing tilt camera simultaneously, capturing a larger portion of the surrounding environment in less time than a single camera could; memory and computing resources are also saved because the two cameras complement each other and jointly assist feature matching and recognition.
Based on the same inventive concept, the embodiment of the application also provides a vision-enhanced positioning method. Since the hardware on which the method runs is the vision-enhanced positioning device described above, the implementation of the method can refer to the application of that device. As shown in fig. 2, the vision-enhanced positioning method provided by the embodiment of the application specifically includes:
Step one: the two cameras of the backward-facing tilt camera each acquire the same landmark in the actual scene, and the images collected by the binocular camera are analyzed to extract image pairs. When the left and right images both contain the corresponding landmark, an image region centred on the feature point is extracted from the left image as a template, and the coordinates of its centre pixel are determined. Another image region centred on the feature point is extracted from the right image as a search window; image regions of the same pixel size as the template are extracted within the search window and matched against the template under the parallax gradient constraint, which serves as a purification algorithm that rejects mismatched corner features. The minimum value among the obtained corner-feature matching values is then selected, and the image region corresponding to it is taken as the target region.
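A minimal sketch of this left/right template matching follows; it assumes rectified images so that the search runs along the same row, uses the sum of absolute differences as the matching value (the patent does not name a specific metric), and omits the parallax-gradient purification step:

```python
import numpy as np

def match_along_row(left, right, x, y, half=4, max_disp=64):
    """Find the right-image region best matching the template centred at (x, y)."""
    tpl = left[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best_disp, best_cost = 0, np.inf
    for d in range(max_disp):            # candidate disparities along the row
        xr = x - d
        if xr - half < 0:
            break
        cand = right[y - half:y + half + 1, xr - half:xr + half + 1].astype(float)
        cost = np.abs(tpl - cand).sum()  # sum of absolute differences
        if cost < best_cost:             # keep the minimum matching value
            best_cost, best_disp = cost, d
    return best_disp, best_cost          # disparity and score of the target region
```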
Further, the coordinates of the positioning device relative to the landmark are calculated from the imaging-feature geometric relationship between the target region and the landmark.
Meanwhile, the descriptors generated from the target region are feature-matched against the descriptors of the associated features of the landmark images stored in the landmark database. Specifically, the image of the target region shot and identified by the binocular camera is Gaussian-filtered to remove noise and then converted to gray scale; feature points are extracted from the gray image to generate image features. The gray level of each feature point is compared with the gray levels of 256 pixel positions in its neighborhood, and each comparison is recorded in binary: 0 when the feature point's gray level is smaller than that of the sampled pixel, and 1 when it is larger. The results are stored in a 256-dimensional vector as the descriptor of the feature point. The neighborhood of a feature point is a circular region centred on the feature point with radius r, where the value of r is determined from the gray level of the target-region image.
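The descriptor construction just described can be sketched as follows; the random-but-fixed sampling pattern inside the disc of radius r is an implementation choice for illustration, since the patent does not specify how the 256 neighborhood positions are chosen:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_offsets(r, n=256):
    """Fixed set of n sample offsets inside a disc of radius r."""
    pts = []
    while len(pts) < n:
        dx, dy = rng.integers(-r, r + 1, size=2)
        if dx * dx + dy * dy <= r * r:      # keep samples inside the disc
            pts.append((int(dx), int(dy)))
    return np.array(pts)

def descriptor(gray, x, y, offsets):
    """256-bit descriptor: 1 where the feature point is brighter than the sample."""
    center = int(gray[y, x])
    bits = [1 if center > int(gray[y + dy, x + dx]) else 0 for dx, dy in offsets]
    return np.packbits(bits)                # 256 bits packed into 32 bytes
```

The feature point is assumed to lie at least r pixels from the image border.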
Step two: the descriptors of the gray-image features of the target region are matched, the matching objects being the associated features of the landmark images stored in the landmark database; the landmark images are grayed in the same way to obtain the descriptors of their feature points, and the descriptors generated from the target region are feature-matched against the descriptors of the associated features of the landmark images stored in the landmark database.
Whether the gray-image features of the target region match the landmark-image associated features in the landmark database is then judged. If they match, the coordinates of the landmark in the map are obtained, and the coordinates of the positioning device relative to the landmark are calculated from the imaging-feature geometric relationship between the target region and the landmark; with the internal parameters of the backward-facing tilt camera known, the predicted coordinates of the current position of the positioning device in the map obtained through the inertial sensor are corrected and updated to give accurate current position coordinates, completing the real-time positioning of the positioning device.
Otherwise, according to the rigid connection relation between the inertial sensor and the forward-facing tilt camera, the image feature-point coordinates predicted by the inertial data are fused in the landmark image of the current frame; a new landmark is then calculated by combining the relative pose between the feature points of the target region and the feature points associated with the landmark images in the landmark database, and is stored in the landmark database, completing the creation of the new landmark. The inertial data has undergone calibration and filtering and includes angular velocity, acceleration and distance information; the rigid connection relation is a positional relation established from the pose change corresponding to the inertial data between two adjacent frames acquired by the forward-facing tilt camera.
As one implementation of the present application, in the second step, the feature matching process includes: under the current frame image, calculating the Hamming distance between the descriptors of the target-region features and the corresponding descriptors among the associated features of the landmark images in the landmark database; if the Hamming distance is smaller than a preset threshold, the image processed by the binocular camera is highly similar to the associated features of the corresponding landmark image in the landmark database, and the match is considered successful. Specifically, feature matching is performed by computing the Hamming distance between feature-point descriptors. Extensive experiments show that the Hamming distance between descriptors of unmatched feature points is around 128, while that of successfully matched feature points is far below 128; in other words, when the number of identical bits between a descriptor and the feature code of a descriptor in a database image template is small, the two are unlikely to be a pair, and a feature point in one image is paired with the feature point in the other image whose feature code shares the largest number of identical bits. The preset threshold corresponds to a numerical relationship of the relative pose between the feature points of the target region and the feature points associated with the landmark images in the landmark database, and is set to 128 in the embodiment of the application. The relative pose depends on one or more of the two-dimensional spatial features associated with identifiable landmarks within the image acquired by the backward-facing tilt camera, and its estimate also varies with the various sensors configured on the positioning device.
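The Hamming-distance test itself is short; below is a sketch over descriptors packed into bytes (as produced by np.packbits above) with the threshold of 128 quoted in the embodiment:

```python
import numpy as np

def hamming(d1, d2):
    """Hamming distance between two packed uint8 descriptors."""
    return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())

def is_match(d1, d2, threshold=128):
    return hamming(d1, d2) < threshold

a = np.packbits(np.zeros(256, dtype=np.uint8))
b = np.packbits(np.r_[np.ones(100, dtype=np.uint8), np.zeros(156, dtype=np.uint8)])
print(hamming(a, b), is_match(a, b))   # -> 100 True
```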
As one implementation of the present application, in the third step, the method for fusing the image feature-point coordinates predicted by the inertial data into the landmark image of the current frame according to the rigid connection relation between the inertial sensor and the forward-facing tilt camera includes: when feature matching fails, the rigid connection relation between the inertial sensor and the forward-facing tilt camera is obtained from the pose change corresponding to the inertial data between two adjacent frames acquired by the forward-facing tilt camera; with the internal parameters of the forward-facing tilt camera known, the feature-point coordinates of the current frame predicted by the inertial sensor are calculated from this rigid connection relation, compared with the feature-point coordinates of the current frame acquired by the forward-facing tilt camera, and the feature-point coordinates of the current image acquired by the forward-facing tilt camera are updated and corrected.
Specifically, the inertial sensor provides the predictive motion model and the forward-facing tilt camera provides the observation model, with the rigid connection between them as the parameter to be estimated. The accumulated inertial data between two consecutive frames is computed, including the translation caused by velocity and acceleration and the rotation caused by angular velocity. Since the inertial sensor is attached to the forward-facing tilt camera, the camera maps to the image and the image maps to the features, and the feature points can be obtained by image extraction; an optimization equation is constructed from the principle that the same feature point images uniquely, and is solved iteratively with the pose provided by the inertial sensor as the initial value. The predicted and observed covariance information is then fused to obtain the optimal estimate in the least-squares sense, and the accurate coordinates of the positioning device at the current position are obtained by updating and correction.
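A scalar sketch of the covariance-weighted fusion mentioned above: the predicted and observed values are combined with weights inversely proportional to their variances, which is the least-squares optimal estimate under the usual Gaussian assumptions; the numbers are illustrative:

```python
def fuse(pred, var_pred, obs, var_obs):
    """Variance-weighted average of a prediction and an observation."""
    w_pred, w_obs = 1.0 / var_pred, 1.0 / var_obs
    est = (w_pred * pred + w_obs * obs) / (w_pred + w_obs)
    var = 1.0 / (w_pred + w_obs)            # fused uncertainty shrinks
    return est, var

print(fuse(1.00, 0.04, 1.10, 0.01))   # observation trusted more -> ~1.08
```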
The inertial data is recorded and accumulated between two consecutive frames shot by the forward-facing tilt camera, giving the pose transformation recorded by the inertial sensor between the two frames; using the fixed rotation and translation between the inertial sensor and the forward-facing tilt camera, this is converted into the pose transformation of the camera, and the coordinates of the previous frame's features in the current frame follow from the camera's intrinsic matrix. When feature matching fails, the current image coordinates are predicted by this conversion from the current position acquired by the inertial sensor, compared with the feature-point coordinates of the current image features, updated and corrected, and stored back into the landmark database as a new landmark created at the current position. When feature matching succeeds, the current position coordinates computed by the image processing module from the imaging-feature geometric relationship of the gray-image feature coordinates are compared with the current position coordinates acquired by the inertial sensor; that is, the observation model corrects and updates the current position coordinates obtained from the feature points. The creation of a new landmark is optionally attempted when the positioning device matches no known landmark within the input image.
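The prediction step can be sketched as follows: the pose change accumulated from the inertial data is mapped through the fixed IMU-to-camera extrinsics, and a previous-frame feature with known depth is reprojected into the current frame with the camera intrinsic matrix. All matrices below are illustrative placeholders, not calibration values from the patent:

```python
import numpy as np

def predict_feature(uv_prev, depth, K, T_imu_delta, T_cam_imu):
    """Predict where a previous-frame pixel lands in the current frame.

    uv_prev: pixel (u, v) in the previous frame; depth: its depth in metres;
    K: 3x3 intrinsic matrix; T_imu_delta: 4x4 IMU pose change between frames;
    T_cam_imu: 4x4 fixed IMU-to-camera transform (the rigid connection).
    """
    # back-project the pixel to a 3-D point in the previous camera frame
    xyz = depth * (np.linalg.inv(K) @ np.array([uv_prev[0], uv_prev[1], 1.0]))
    # express the IMU motion in the camera frame via the fixed extrinsics
    T_cam_delta = T_cam_imu @ T_imu_delta @ np.linalg.inv(T_cam_imu)
    xyz_new = (T_cam_delta @ np.append(xyz, 1.0))[:3]
    uvw = K @ xyz_new                       # project into the current frame
    return uvw[:2] / uvw[2]

K = np.array([[600.0, 0, 320], [0, 600.0, 240], [0, 0, 1.0]])
print(predict_feature((320, 240), 2.0, K, np.eye(4), np.eye(4)))  # -> [320. 240.]
```

The predicted coordinates would then be compared with the feature coordinates extracted from the current image, as described above.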
As one implementation of the present application, in the third step, the imaging-feature geometric relationship is established from the positional relationship of the image acquired at the lens orientation angle α (as shown in fig. 3) of the backward-facing tilt camera at the preset position, together with the inertial data acquired as the inertial sensor senses the landmark. The backward-facing tilt camera uses the conventional pinhole model with known intrinsic parameters; combined with triangulation of the distance and position of the landmark features shot while the positioning device advances, a similar-triangle geometry is constructed, from which the two-dimensional coordinates of the corresponding feature corners of the landmark in the backward-facing tilt camera coordinate system can be calculated.
Specifically, the parallax is obtained as the difference between the centre-pixel coordinates of the left image and those of the right image; substituting the parallax into the binocular ranging formula gives the distance from the binocular camera to the actual scene:

Z = f · T / x

wherein T is the baseline distance between the two cameras of the binocular camera, f is the focal length of the binocular camera, x is the parallax, and Z is the distance from the landmark to the binocular camera.
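The formula translates directly into code; the numbers below are illustrative:

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Z = f * T / x for a rectified binocular pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

print(stereo_depth(600.0, 0.06, 12.5))   # -> 2.88 metres
```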
As an embodiment of the robot in the implementation of the application, fig. 3 provides the structure of a sweeping robot, which may serve as a product-level structural diagram of the vision-enhanced positioning device provided by the application; for ease of explanation, only the parts relevant to the embodiment are shown. The image processing module and the fusion positioning module of the positioning device are arranged on the signal processing board 102. The image acquisition module comprises the camera 106; the backward-facing tilt camera is a binocular camera whose left and right cameras are camera 106L and camera 106R, arranged side by side at the backward protruding structure at the tail of the machine body 101, which keeps the backward-facing tilt camera away from the collision detection sensor 105 and prevents it from being touched by objects that are difficult to detect. The optical axes of camera 106R and camera 106L form a tilt angle α with the top surface of the positioning device, giving the binocular camera a better observation direction.
The image acquisition module also includes the forward-facing tilt camera 108, mounted at a concave structure in the front half of the machine body 101; the optical axis of the forward-facing tilt camera 108 forms a tilt angle α with the top surface of the positioning device, keeping it away from the collision detection sensor 105 and giving it a better viewing range. The inertial data acquisition module comprises the collision detection sensor 105 and senses the motion of the machine body 101 driven by the drive wheel 104 and the universal wheel 107; the data it acquires, together with that of camera 106R and camera 106L, is used to fuse and correct position coordinates by applying the relative pose and the rigid connection relation, so that positioning navigation actions are executed and the landmark database can be updated as a basis for building the navigation map. Finally, the human-computer interface 103 outputs the accurate coordinates of the sweeping robot's current position calculated by the signal processing board.
The above embodiments merely serve to fully disclose the application, not to limit it; substitutions of equivalent technical features of the inventive subject matter that require no creative work shall be considered within the scope of this disclosure.

Claims (10)

1. A vision-enhanced positioning device, being a mobile visual positioning device, characterized by comprising an image acquisition module, an image processing module, an inertial data acquisition module and a fusion positioning module;
the image acquisition module comprises a forward-facing tilt camera, for detecting and identifying objects in the forward driving direction of the positioning device, and a backward-facing tilt camera, for capturing environment images for positioning;
the image processing module comprises an image preprocessing sub-module and a feature matching sub-module and processes the image data acquired by the image acquisition module; the image preprocessing sub-module converts the data acquired by the backward-facing tilt camera into a gray image, and the feature matching sub-module extracts feature data from the preprocessed image and matches it against the associated features of landmark images in a landmark database; the landmark database is built into the image processing module and contains the image feature points of the region associated with a given landmark;
the inertial data acquisition and processing module consists of a set of inertial measurement units and senses the rotation angle, acceleration and translation speed of the inertial sensor in real time;
and the fusion positioning module fuses the inertial data acquired by the inertial data acquisition and processing module with the image data acquired by the forward-facing tilt camera according to the feature matching result of the image processing module, and corrects the current position information with the fusion result.
2. The positioning device of claim 1, wherein the forward-facing tilt camera is positioned at a forward-opening concave and/or convex structure in the front half of the top surface of the positioning device.
3. The positioning device of claim 1, wherein the backward-facing tilt camera is a binocular camera with identical imaging parameters, its two cameras positioned side by side at a concave and/or convex structure of the backward-opening tail of the top surface of the positioning device.
4. A positioning device as claimed in any one of claims 1 to 3, wherein the optical axes of the forward-facing and backward-facing tilt cameras are each inclined to the top surface of the positioning device at an angle within 0 to 80 degrees, the two angles remaining equal.
5. The positioning device of claim 1, wherein in the fusion positioning module, when feature matching in the image processing module succeeds, the coordinates of the landmark in the map are obtained; combined with the coordinates of the positioning device relative to the landmark, the coordinates of the positioning device in the map can be calculated and then updated and corrected using the inertial data;
when feature matching in the image processing module fails, the rigid connection relation between the inertial sensor and the forward-facing tilt camera is obtained from the accumulated inertial data; a new landmark is then calculated from the relative pose between the target image obtained by the backward-facing tilt camera from left-right binocular parallax and the associated features of the landmark images in the landmark database, and is stored in the landmark database, completing the creation of the new landmark;
the inertial sensor is connected to the forward-facing tilt camera, a mapping exists between the forward-facing tilt camera and the gray-image features or landmark-image associated features, and the features can be obtained by gray-image extraction; the rigid connection relation is a positional relation established from the pose change corresponding to the inertial data between two adjacent frames acquired by the forward-facing tilt camera.
6. A vision-enhanced positioning method, wherein the positioning method is applied to the positioning device of any one of claims 1 to 5, comprising the following steps:
the two cameras of the backward-facing tilt camera each acquire the same landmark in the actual scene to obtain a left image and a right image; an image region around a feature point is determined in the left image and used as a template, image regions of the same size as the template are correspondingly extracted from the right image, and parallax-gradient-constrained matching is completed along the epipolar line; the minimum matching value is selected from the obtained matching values, the image region corresponding to it is taken as the target region, and the descriptors generated from the target region are feature-matched against the descriptors of the associated features of landmark images stored in a landmark database; the coordinates of the positioning device relative to the landmark are calculated from the imaging-feature geometric relationship between the target region and the landmark;
whether the target region matches the associated features of the landmark images in the landmark database is judged; if so, the coordinates of the landmark in the map are obtained and, combined with the coordinates of the positioning device relative to the landmark, the coordinates of the positioning device in the map are calculated, then updated and corrected using the inertial data, completing the real-time positioning of the positioning device;
otherwise, the inertial data is fused according to the rigid connection relation between the inertial sensor and the forward-facing tilt camera, a new landmark is calculated from the relative pose between the feature points of the target region and the feature points associated with the landmark images in the landmark database, and the new landmark is stored in the landmark database, completing its creation;
wherein the inertial data has undergone calibration and filtering, and the rigid connection relation is a positional relation established from the pose change corresponding to the inertial data between two adjacent frames acquired by the forward-facing tilt camera.
7. The positioning method of claim 6, wherein the feature matching process comprises: under the current frame image, calculating the Hamming distance between the descriptors of the target-region features and the corresponding descriptors among the associated features of the landmark images in the landmark database;
if the Hamming distance is smaller than a preset threshold, the image collected by the backward-facing tilt camera is highly similar to the associated features of the corresponding landmark image in the landmark database, and the match is considered successful;
wherein the preset threshold corresponds to a determined numerical relationship of the relative pose.
8. The positioning method of claim 6, wherein the method for fusing the inertial data according to the rigid connection relation between the inertial sensor and the forward-facing tilt camera comprises: when feature matching fails, the rigid connection relation between the inertial sensor and the forward-facing tilt camera is obtained from the pose change corresponding to the inertial data between two adjacent frames acquired by the forward-facing tilt camera; with the internal parameters of the forward-facing tilt camera known, the feature-point coordinates of the current frame predicted by the inertial sensor are calculated from this rigid connection relation, compared with the feature-point coordinates of the current frame acquired by the forward-facing tilt camera, and the feature-point coordinates of the current image acquired by the forward-facing tilt camera are updated and corrected.
9. The positioning method of claim 6, wherein the imaging-feature geometric relationship is a similar-triangle relationship established from the parallax between the target regions of the right and left images acquired by the backward-facing tilt camera and the position of the landmark in the acquired actual scene.
10. A robot, characterized in that the robot is a mobile robot provided with a positioning device according to any one of claims 1 to 5.
CN201810543865.0A 2018-05-31 2018-05-31 Positioning device, positioning method and robot for enhancing vision Active CN108481327B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810543865.0A CN108481327B (en) 2018-05-31 2018-05-31 Positioning device, positioning method and robot for enhancing vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810543865.0A CN108481327B (en) 2018-05-31 2018-05-31 Positioning device, positioning method and robot for enhancing vision

Publications (2)

Publication Number Publication Date
CN108481327A CN108481327A (en) 2018-09-04
CN108481327B true CN108481327B (en) 2023-11-07

Family

ID=63351767

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810543865.0A Active CN108481327B (en) 2018-05-31 2018-05-31 Positioning device, positioning method and robot for enhancing vision

Country Status (1)

Country Link
CN (1) CN108481327B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109631887B (en) * 2018-12-29 2022-10-18 重庆邮电大学 Inertial navigation high-precision positioning method based on binocular, acceleration and gyroscope
CN112052707A (en) * 2019-06-06 2020-12-08 杭州海康威视数字技术股份有限公司 Article detection system, method and device
CN112650207A (en) * 2019-10-11 2021-04-13 杭州萤石软件有限公司 Robot positioning correction method, apparatus, and storage medium
CN111309942B (en) * 2020-01-22 2020-11-24 清华大学 Data acquisition method, device and system for construction site
CN111904334A (en) * 2020-07-27 2020-11-10 轻客小觅机器人科技(成都)有限公司 Fisheye binocular stereoscopic vision navigation system and sweeping robot
CN112405526A (en) * 2020-10-26 2021-02-26 北京市商汤科技开发有限公司 Robot positioning method and device, equipment and storage medium
CN113535877B (en) * 2021-07-16 2023-05-30 上海高仙自动化科技发展有限公司 Update method, device, equipment, medium and chip of intelligent robot map
CN114842173B (en) * 2022-04-15 2023-08-29 北华航天工业学院 Augmented reality system and control method thereof
CN115965673B (en) * 2022-11-23 2023-09-12 中国建筑一局(集团)有限公司 Centralized multi-robot positioning method based on binocular vision

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102656532A (en) * 2009-10-30 2012-09-05 悠进机器人股份公司 Map generating and updating method for mobile robot position recognition
CN103345751A (en) * 2013-07-02 2013-10-09 北京邮电大学 Visual positioning method based on robust feature tracking
CN105716568A (en) * 2016-01-28 2016-06-29 武汉光庭信息技术股份有限公司 Binocular camera ranging method in automatic pilot system
CN106197429A (en) * 2016-07-21 2016-12-07 触景无限科技(北京)有限公司 A kind of Multi-information acquisition location equipment and system
CN106537186A (en) * 2014-11-26 2017-03-22 艾罗伯特公司 Systems and methods for performing simultaneous localization and mapping using machine vision systems
CN106679648A (en) * 2016-12-08 2017-05-17 东南大学 Vision-inertia integrated SLAM (Simultaneous Localization and Mapping) method based on genetic algorithm
CN107255476A (en) * 2017-07-06 2017-10-17 青岛海通胜行智能科技有限公司 A kind of indoor orientation method and device based on inertial data and visual signature
CN107443385A (en) * 2017-09-26 2017-12-08 珠海市微半导体有限公司 The detection method and chip and robot of the robot line navigation of view-based access control model
CN107747941A (en) * 2017-09-29 2018-03-02 歌尔股份有限公司 A kind of binocular visual positioning method, apparatus and system
CN208289901U (en) * 2018-05-31 2018-12-28 珠海市一微半导体有限公司 A kind of positioning device and robot enhancing vision

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2267568B1 (en) * 2005-12-02 2014-09-24 iRobot Corporation Autonomous coverage robot navigation system

Also Published As

Publication number Publication date
CN108481327A (en) 2018-09-04

Similar Documents

Publication Publication Date Title
CN108481327B (en) Positioning device, positioning method and robot for enhancing vision
CN108406731B (en) Positioning device, method and robot based on depth vision
CN112785702B (en) SLAM method based on tight coupling of 2D laser radar and binocular camera
JP7150773B2 (en) Mobile robot and its control method
KR101776622B1 (en) Apparatus for recognizing location mobile robot using edge based refinement and method thereof
US10133279B2 (en) Apparatus of updating key frame of mobile robot and method thereof
KR101784183B1 (en) APPARATUS FOR RECOGNIZING LOCATION MOBILE ROBOT USING KEY POINT BASED ON ADoG AND METHOD THEREOF
CN108544494B (en) Positioning device, method and robot based on inertia and visual characteristics
EP1394761B1 (en) Obstacle detection device and method therefor
Zhou et al. Ground-plane-based absolute scale estimation for monocular visual odometry
Saeedi et al. Vision-based 3-D trajectory tracking for unknown environments
US11703334B2 (en) Mobile robots to generate reference maps for localization
Fiala et al. Visual odometry using 3-dimensional video input
Maier et al. Vision-based humanoid navigation using self-supervised obstacle detection
CN208289901U (en) A kind of positioning device and robot enhancing vision
CN112101160A (en) Binocular semantic SLAM method oriented to automatic driving scene
JP6410231B2 (en) Alignment apparatus, alignment method, and computer program for alignment
KR101965739B1 (en) Mobile robot and method for controlling the same
CN212044739U (en) Positioning device and robot based on inertial data and visual characteristics
Birk et al. Simultaneous localization and mapping (SLAM)
JP2007280387A (en) Method and device for detecting object movement
Lu Vision-enhanced lidar odometry and mapping
JP6886136B2 (en) Alignment device, alignment method and computer program for alignment
Aladem et al. Evaluation of a Stereo Visual Odometry Algorithm for Passenger Vehicle Navigation
CN115700507B (en) Map updating method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 519000 2706, No. 3000, Huandao East Road, Hengqin new area, Zhuhai, Guangdong

Applicant after: Zhuhai Yiwei Semiconductor Co.,Ltd.

Address before: Room 105-514, No.6 Baohua Road, Hengqin New District, Zhuhai City, Guangdong Province

Applicant before: AMICRO SEMICONDUCTOR Co.,Ltd.

GR01 Patent grant