CN113223007A - Visual odometer implementation method and device and electronic equipment - Google Patents


Info

Publication number
CN113223007A
Authority
CN
China
Prior art keywords
image
camera
determining
change information
pose
Prior art date
Legal status
Pending
Application number
CN202110715562.4A
Other languages
Chinese (zh)
Inventor
Hu Kun (胡鲲)
Lu Wei (卢维)
Wang Zheng (王政)
Li Ming (李铭)
Current Assignee
Zhejiang Huaray Technology Co Ltd
Original Assignee
Zhejiang Huaray Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Huaray Technology Co Ltd
Priority claimed from CN202110715562.4A
Publication of CN113223007A
Legal status: Pending


Classifications

    • G06T 7/10 Image analysis; Segmentation; Edge detection
    • G01C 22/00 Measuring distance traversed on the ground by vehicles, persons, animals or other moving solid bodies, e.g. using odometers, using pedometers
    • G06T 7/66 Image analysis; Analysis of geometric attributes of image moments or centre of gravity
    • G06T 7/73 Image analysis; Determining position or orientation of objects or cameras using feature-based methods
    • G06V 10/751 Image or video pattern matching; Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G06T 2207/10016 Image acquisition modality; Video; Image sequence

Abstract

The application provides a method and an apparatus for implementing a visual odometer, an electronic device, and a computer-readable storage medium. The method includes: dividing a first image acquired by a downward-looking camera to obtain at least two image blocks; acquiring first feature points of each image block and principal direction information of the first feature points; matching each first feature point with a second feature point included in the previous frame image of the first image based on a locality-sensitive hashing algorithm to obtain corresponding feature point pairs; determining, based on the feature point pairs, first pose change information of the camera between the first image and the previous frame image of the first image, the first pose change information being related to a scale ratio parameter of the camera; and determining the pose of the camera when acquiring the first image based on the first pose change information. The method and apparatus are not affected by external illumination or reflective environments, and can reduce the implementation time of the visual odometer and improve its accuracy.

Description

Visual odometer implementation method and device and electronic equipment
Technical Field
The present disclosure relates to image processing technologies, and in particular, to a method and an apparatus for implementing a visual odometer, and an electronic device.
Background
A visual odometer calculates the pose change between adjacent frame images acquired by a camera by using vision-related algorithms, and then obtains the pose change of the robot by using the calibrated relation between the camera and the robot. The visual odometer can also perform visual incremental mapping of the scene, or autonomously establish a visual map, so as to realize Visual Simultaneous Localization and Mapping (VSLAM).
In the related art, the first scheme is to capture images of the environment through a camera carried on the robot body, and then solve the motion between adjacent frame images captured by the camera through the pyramid optical flow method, so as to obtain the pose change of the camera within a certain time. The second scheme is to extract feature points from two adjacent frames of images shot by the camera, calculate descriptors of the feature points, find corresponding matching point pairs by matching the feature points in the two adjacent frames, and then solve the pose change of the camera between frames; the continuous inter-frame pose relations are accumulated to form the data of the visual odometer. The third scheme is to install an obliquely downward-facing camera at the front end of the robot, extract ground feature points in the image by using the ground point cloud in the laser odometer, realize absolute-scale camera motion estimation based on homography transformation, and then use the camera motion estimation to correct ego-motion point cloud distortion and optimize the pose in the laser odometer, forming the data of the visual odometer.
However, the first scheme is based on the assumption that the image gray scale does not change, so the computed pose change of the camera is easily affected by external illumination. In the second scheme, the descriptor usually needs 512 dimensions to achieve the best effect, but a 512-dimensional descriptor requires more matching time and more storage space. In the third scheme, the camera is mounted obliquely downward on the robot, so the external parameters of the obliquely mounted camera need to be calibrated and are easily disturbed by structural changes, which alters the calibration relation between the camera and the laser radar and affects the accuracy of the visual odometer; moreover, the obliquely downward mounting structure is susceptible to light reflected from the front side in scenes where reflection is severe.
Disclosure of Invention
The embodiments of the present application provide a method and an apparatus for implementing a visual odometer and an electronic device, which are not affected by external illumination or reflective environments, reduce the implementation time of the visual odometer, and improve its accuracy.
The technical scheme of the embodiment of the application is realized as follows:
in a first aspect, an embodiment of the present application provides an implementation method of a visual odometer, including:
dividing a first image acquired by a downward-looking camera to obtain at least two image blocks;
acquiring first characteristic points of each image block and main direction information of the first characteristic points;
matching each first feature point with a second feature point included in the previous frame image of the first image based on a locality-sensitive hashing algorithm to obtain corresponding feature point pairs;
determining, based on the feature point pairs, first pose change information of the camera between the first image and the previous frame image of the first image; the first pose change information is related to a scale ratio parameter of the camera;
determining the pose of the camera when acquiring the first image based on the first pose change information.
In some embodiments, the obtaining the first feature points of each image block and the principal direction information of each first feature point includes:
respectively executing the following operations for each image block:
determining a centroid of the image block based on image moments of the image block;
determining a main direction of the image block based on a centroid of the image block and a geometric center of the image block;
and determining main direction information of each first characteristic point in the image block based on the main direction of the image block.
In some embodiments, the obtaining the first feature points of each image block and the principal direction information of each first feature point includes:
respectively executing the following operations aiming at each pixel point in each image block:
determining first gray information of the pixel points;
determining second gray information of pixel points whose distance from the pixel point is equal to a distance threshold;
judging whether the pixel points are angular points or not based on the first gray information and the second gray information;
and if the pixel point is the angular point, determining the pixel point as the first characteristic point.
In some embodiments, the matching, based on a locality-sensitive hashing algorithm, each of the first feature points with a second feature point included in a previous frame of the first image to obtain a corresponding feature point pair includes:
respectively carrying out Hash transformation on the descriptor of the first characteristic point and the descriptor of the second characteristic point;
performing dimension reduction processing on the first feature points after the hash transformation and the second feature points after the hash transformation;
calculating the distance between the first feature point after the dimension reduction processing and the second feature point after the dimension reduction processing;
determining the feature point pairs based on the calculation result.
In some embodiments, the determining, based on the feature point pairs, first pose change information of the camera between the first image and the previous frame image of the first image includes:
mapping a first characteristic point corresponding to the characteristic point pair to a second characteristic point corresponding to the characteristic point pair to obtain a homography matrix;
determining a reference rotational offset and a reference translational offset based on the homography matrix;
and multiplying the reference rotation offset and the reference translation offset by the scale ratio parameter respectively, to obtain the rotation offset and the translation offset between the first image acquired by the camera and the previous frame image of the first image.
In some embodiments, before the determining, based on the feature point pairs, the first pose change information of the camera between the first image and the previous frame image of the first image, the method further includes:
determining second pose change information of the camera between acquiring the first frame image and acquiring the second frame image;
determining mileage change information occurring between the time the camera acquires the first frame image and the time the camera acquires the second frame image;
determining the scale ratio parameter based on the mileage change information and the second pose change information.
In some embodiments, the determining the scale ratio parameter based on the mileage change information and the second pose change information includes:
calculating a first ratio of the mileage change on a first coordinate axis included in the mileage change information to the translation amount on the first coordinate axis included in the second pose change information, and a second ratio of the mileage change on a second coordinate axis included in the mileage change information to the translation amount on the second coordinate axis included in the second pose change information;
determining half of the sum of the first ratio and the second ratio as the scale ratio parameter.
In some embodiments, the determining the pose of the camera at the time of acquiring the first image based on the first pose change information comprises:
and determining the pose of the camera when the first image is acquired based on the first pose change information and the pose of the last frame of image of the first image.
In a second aspect, an embodiment of the present application provides an apparatus for implementing a visual odometer, including:
the image segmentation module is used for segmenting a first image acquired by the downward-looking camera to obtain at least two image blocks;
the information acquisition module is used for acquiring first characteristic points of each image block and main direction information of each first characteristic point;
a characteristic point pair determining module, configured to match each first characteristic point with a second characteristic point included in a previous frame of the first image based on a locality sensitive hashing algorithm, to obtain a corresponding characteristic point pair;
a pose determination module for determining first pose change information between the first image acquired by the camera and a last frame image of the first image based on the feature point pairs; the first pose change information is related to a scale ratio parameter of the camera; determining the pose of the camera when acquiring the first image based on the first pose change information.
In some embodiments, the information obtaining module is configured to perform the following operations for the image blocks respectively:
determining a centroid of the image block based on image moments of the image block;
determining a main direction of the image block based on a centroid of the image block and a geometric center of the image block;
and determining main direction information of each first characteristic point in the image block based on the main direction of the image block.
In some embodiments, the information obtaining module is configured to perform the following operations for each pixel point in each of the image blocks respectively:
determining first gray information of the pixel points;
determining second gray information of pixel points whose distance from the pixel point is equal to a distance threshold;
judging whether the pixel points are angular points or not based on the first gray information and the second gray information;
and if the pixel point is the angular point, determining the pixel point as the first characteristic point.
In some embodiments, the feature point pair determining module is configured to perform hash transformation on the descriptor of the first feature point and the descriptor of the second feature point respectively;
performing dimension reduction processing on the first feature points after the hash transformation and the second feature points after the hash transformation;
calculating the distance between the first feature point after the dimension reduction processing and the second feature point after the dimension reduction processing;
determining the feature point pairs based on the calculation result.
In some embodiments, the pose determination module is configured to map a first feature point corresponding to the feature point pair to a second feature point corresponding to the feature point pair, so as to obtain a homography matrix;
determining a reference rotational offset and a reference translational offset based on the homography matrix;
and multiplying the reference rotation offset and the reference translation offset by the scale ratio parameter respectively, to obtain the rotation offset and the translation offset between the first image acquired by the camera and the previous frame image of the first image.
In some embodiments, the pose determination module is further configured to determine second pose change information of the camera between acquiring the first frame image and acquiring the second frame image;
determine mileage change information occurring between the time the camera acquires the first frame image and the time the camera acquires the second frame image;
and determine the scale ratio parameter based on the mileage change information and the second pose change information.
In some embodiments, the pose determination module is configured to calculate a first ratio of the mileage change on a first coordinate axis included in the mileage change information to the translation amount on the first coordinate axis included in the second pose change information, and a second ratio of the mileage change on a second coordinate axis included in the mileage change information to the translation amount on the second coordinate axis included in the second pose change information;
and determine half of the sum of the first ratio and the second ratio as the scale ratio parameter.
In some embodiments, the pose determination module is configured to determine the pose of the camera at the time the first image was acquired based on the first pose change information and the pose of the last frame of the first image.
In a third aspect, an embodiment of the present application provides an electronic device, including:
a memory for storing executable instructions;
and a processor, configured to implement the implementation method of the visual odometer provided by the embodiments of the present application when executing the executable instructions stored in the memory.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the implementation method of the visual odometer provided by the embodiments of the present application.
According to the implementation method of the visual odometer provided by the embodiment of the present application, a first image collected by a downward-looking camera is segmented to obtain at least two image blocks; first feature points of each image block and principal direction information of the first feature points are acquired; each first feature point is matched with a second feature point included in the previous frame image of the first image based on a locality-sensitive hashing algorithm to obtain corresponding feature point pairs; based on the feature point pairs, first pose change information of the camera between the first image and the previous frame image of the first image is determined, the first pose change information being related to a scale ratio parameter of the camera; and the pose of the camera when acquiring the first image is determined based on the first pose change information. Therefore, the implementation method of the visual odometer provided by the embodiment of the present application determines the feature point pairs based on the locality-sensitive hashing algorithm, and only 256-dimensional descriptors are needed, which reduces the dimensionality of the search space, shortens the matching time of the feature point pairs, and thus reduces the implementation time of the visual odometer. In addition, the implementation method determines the first pose change information in combination with the scale ratio parameter of the camera, which can improve the accuracy of the visual odometer. By adopting the downward-looking camera structure, the implementation method of the visual odometer provided by the embodiment of the present application is suitable for scenes with severe light reflection.
Drawings
FIG. 1 is a schematic diagram of an architecture of a system for implementing a visual odometer according to an embodiment of the present application;
fig. 2 is a schematic architecture diagram of a terminal device provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart diagram illustrating an alternative method for implementing a visual odometer according to an embodiment of the present disclosure;
FIG. 4 is a schematic view of an alternative processing flow for segmenting a first image according to an embodiment of the present application;
fig. 5 is a schematic processing flow diagram for acquiring a first feature point of an image block according to an embodiment of the present application;
fig. 6 is a schematic view of an alternative processing flow for determining the principal direction information of the first feature point according to the embodiment of the present application;
FIG. 7 is a schematic diagram of an alternative process for determining first posture change information according to an embodiment of the present disclosure;
fig. 8 is a schematic processing flow diagram for determining first pose change information between the first image acquired by the camera and a last frame image of the first image, according to the feature point pair.
Detailed Description
In order to make the objectives, technical solutions and advantages of the present application clearer, the present application will be described in further detail with reference to the attached drawings, the described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first \ second \ third" are only used to distinguish similar objects and do not denote a particular order; it is understood that "first \ second \ third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of the application described herein can be practiced in an order other than that shown or described herein. In the following description, the term "plurality" means at least two.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
1) Descriptors, information used to describe feature points, typically describe geometric features around a point based on point coordinates, normal vectors, and curvatures.
2) The centroid, which may also be referred to as the center of gravity of the image.
3) Image moments, which are commonly used to describe the segmented image blocks. Partial properties of the image block, including area (or overall brightness), and information about the geometric center and orientation can be obtained from the image moments.
4) And the visual odometer is used for estimating the motion of the camera according to the image shot by the camera.
5) Database (Database): similar to an electronic file cabinet, namely a place for storing electronic files, a user can perform operations of adding, inquiring, updating, deleting and the like on data in the files. A database is also to be understood as a collection of data that are stored together in a manner that can be shared with a plurality of users, with as little redundancy as possible, independent of the application. In embodiments of the present application, the database may store data for model training.
The embodiment of the application provides a method and a device for realizing a visual odometer, electronic equipment and a computer readable storage medium, which can reduce the realization time of the visual odometer and improve the realization precision of the visual odometer. An exemplary application of the electronic device provided in the embodiment of the present application is described below, and the electronic device provided in the embodiment of the present application may be implemented as various types of terminal devices, and may also be implemented as a server.
Referring to fig. 1, fig. 1 is an architectural diagram of a system 100 for implementing a visual odometer according to an embodiment of the present application, in which a terminal device 400 is connected to a server 200 through a network 300, and the server 200 is connected to a database 500, where the network 300 may be a wide area network or a local area network, or a combination of the two.
In some embodiments, taking the case where the electronic device implementing the method of the visual odometer is a terminal device as an example, the implementation method of the visual odometer provided in the embodiments of the present application may be implemented by the terminal device alone. For example, the terminal device 400 runs a client 410, and the client 410 may be a client for executing the implementation method of the visual odometer.
The client 410 acquires a first image acquired by the downward-looking camera, and then divides the first image to obtain at least two image blocks; acquires first feature points of each image block and principal direction information of the first feature points; matches each first feature point with a second feature point included in the previous frame image of the first image based on a locality-sensitive hashing algorithm to obtain corresponding feature point pairs; determines, based on the feature point pairs, first pose change information of the camera between the first image and the previous frame image of the first image, the first pose change information being related to a scale ratio parameter of the camera; and determines the pose of the camera when acquiring the first image based on the first pose change information.
In some embodiments, taking the case where the electronic device implementing the method of the visual odometer is a server as an example, the implementation method of the visual odometer provided in the embodiments of the present application may be implemented cooperatively by the server and the terminal device.
The server 200 acquires the first image captured by the downward-looking camera from the client 410. Then, the server 200 divides the first image to obtain at least two image blocks; acquires first feature points of each image block and principal direction information of the first feature points; matches each first feature point with a second feature point included in the previous frame image of the first image based on a locality-sensitive hashing algorithm to obtain corresponding feature point pairs; determines, based on the feature point pairs, first pose change information of the camera between the first image and the previous frame image of the first image, the first pose change information being related to a scale ratio parameter of the camera; and determines the pose of the camera when acquiring the first image based on the first pose change information. The server 200 then sends the pose of the camera when acquiring the first image to the client 410. The process in which the server 200 divides the first image into image blocks can be implemented by using a pre-trained image block segmentation model; the server 200 acquires sample images and the image blocks included in the sample images from the database 500, and trains the model with the features of the sample images at different dimensions as granularity, so that the model has the capability of dividing an input image into image blocks.
In some embodiments, the terminal device 400 or the server 200 may implement the implementation method of the visual odometer provided by the embodiments of the present application by running a computer program, for example, the computer program may be a native program or a software module in an operating system; can be a local (Native) Application program (APP), i.e. a program that needs to be installed in an operating system to run; or may be an applet, i.e. a program that can be run only by downloading it to the browser environment; but also an applet that can be embedded into any APP. In general, the computer programs described above may be any form of application, module or plug-in.
In some embodiments, the server 200 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a Cloud server providing basic Cloud computing services such as a Cloud service, a Cloud database, Cloud computing, a Cloud function, Cloud storage, a web service, Cloud communication, a middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform, where Cloud Technology (Cloud Technology) refers to a hosting Technology for unifying resources of hardware, software, a network, and the like in a wide area network or a local area network to implement computing, storage, processing, and sharing of data. The terminal device 400 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, and the like. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the embodiment of the present application is not limited.
The electronic device provided in the embodiment of the present application is illustrated below by taking a terminal device as an example; it can be understood that, for the case where the electronic device is a server, parts of the structure shown in fig. 2 (such as the user interface, the presentation module, and the input processing module) may be omitted by default. Referring to fig. 2, fig. 2 is a schematic structural diagram of a terminal device 400 provided in an embodiment of the present application. The terminal device 400 shown in fig. 2 includes: at least one processor 460, a memory 450, at least one network interface 420, and a user interface 430. The various components in the terminal device 400 are coupled together by a bus system 440. It is understood that the bus system 440 is used to enable connection and communication among these components. In addition to a data bus, the bus system 440 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, the various buses are all labeled as bus system 440 in fig. 2.
The Processor 460 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor, or the like.
The user interface 430 includes one or more output devices 431, including one or more speakers and/or one or more visual displays, that enable the presentation of media content. The user interface 430 also includes one or more input devices 432, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 450 optionally includes one or more storage devices physically located remote from processor 460.
The memory 450 includes either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a Random Access Memory (RAM). The memory 450 described in embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, memory 450 is capable of storing data, examples of which include programs, modules, and data structures, or a subset or superset thereof, to support various operations, as exemplified below.
An operating system 451, including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
a network communication module 452 for communicating to other computing devices via one or more (wired or wireless) network interfaces 420, exemplary network interfaces 420 including: bluetooth, wireless compatibility authentication (WiFi), and Universal Serial Bus (USB), etc.;
a presentation module 453 for enabling presentation of information (e.g., user interfaces for operating peripherals and displaying content and information) via one or more output devices 431 (e.g., display screens, speakers, etc.) associated with user interface 430;
an input processing module 454 for detecting one or more user inputs or interactions from one of the one or more input devices 432 and translating the detected inputs or interactions.
In some embodiments, the apparatus provided by the embodiments of the present application may be implemented in software, and fig. 2 shows an implementation apparatus 455 of the visual odometer, which may be software in the form of programs and plug-ins, and the like, stored in the memory 450, and may include the following software modules: an image segmentation module 4551, an information acquisition module 4552, a feature point pair determination module 4553 and a pose determination module 4554, which are logical and thus may be arbitrarily combined or further separated according to the functions implemented. The functions of the respective modules will be explained below.
The embodiment of the present application provides an implementation method of a visual odometer, which can at least solve the above problems in the related art.
The implementation method of the visual odometer provided by the embodiment of the present application will be described below in conjunction with an exemplary application and implementation of the electronic device provided by the embodiment of the present application.
Referring to fig. 3, fig. 3 is a schematic flow chart of an alternative implementation method of the visual odometer according to the embodiment of the present application, which will be described with reference to the steps shown in fig. 3.
Step S101, a first image collected by a downward-looking camera is divided to obtain at least two image blocks.
In some embodiments, the manner in which the downward-looking camera captures the first image may be vertical capture or vertical photography; as an example, the downward-looking camera is mounted below the chassis of an electronic device (e.g., a robot), and the downward-looking camera is perpendicular to the photographed surface when capturing the first image. By adopting the downward-looking camera structure, the implementation method of the visual odometer provided by the embodiment of the present application is suitable for scenes with severe light reflection, and can track the camera better when the camera moves or rotates on a large scale, preventing the tracking from falling into a local optimum. The external parameters of the downward-looking camera do not need to be calibrated, which can improve the accuracy of the visual odometer. Because the downward-looking camera can be mounted below the chassis of the robot, its working environment is more stable and uniform than that of an exposed forward-looking camera, which facilitates the image processing of the visual algorithm and reduces the interference of illumination and dynamic environments on the camera.
In some embodiments, feature points characterize some salient aspect of the image. ORB (Oriented FAST and Rotated BRIEF) feature points have local invariance and strong noise immunity, and can be used in visual SLAM systems of various scales. The acquired image at the current position is identified based on a quadtree structure algorithm and the rBRIEF algorithm, so as to extract the corresponding texture information in the image, where the texture information includes the first feature points.
In some embodiments, an alternative processing flow diagram for segmenting the acquired first image may be as shown in fig. 4, and at least includes the following steps:
s101a, selecting the number of initial root nodes according to the aspect ratio of the first image.
In some embodiments, images with different aspect ratios correspond to different numbers of initial root nodes, and the number of initial root nodes may be set to 1 or 2.
And S101b, performing 'splitting' operation on the determined initial root node according to the width and the height of the first image to obtain an image block.
In some embodiments, a point of interest in the acquired first image is first selected, and after determining the point of interest in the first image, it may be determined whether a response value of the point of interest is greater than a response threshold, and if the response value of the point of interest is greater than the response threshold, the point of interest is retained.
In some embodiments, the number of interest points with larger response values retained in each image block obtained by splitting may be set to 4. The splitting operation continues within the split image blocks, and stops when the total number of interest points in the split image blocks reaches the preset number. In another alternative embodiment, the splitting operation is stopped when the interest points in the split image blocks no longer meet the response threshold.
In particular implementation, the splitting operation may be performed on the acquired first image through a quadtree structure. That is, the acquired first image is split into 4 image blocks, and then the 4 image blocks are subjected to splitting operation to obtain 16 image blocks, and as long as the splitting condition is satisfied, the image splitting operation can be continuously performed.
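The following is a minimal sketch (in Python, not part of the patent text) of the quadtree-style splitting described above: a block keeps splitting into four children while it still contains more interest points than a threshold and is larger than a minimum size. The interest-point list, the thresholds and the block sizes are illustrative assumptions.

```python
def quadtree_split(x, y, w, h, points, max_points=4, min_size=32):
    # Keep only the interest points that fall inside this block.
    inside = [(px, py) for (px, py) in points if x <= px < x + w and y <= py < y + h]
    # Stop splitting: few enough points, or the block is already small.
    if len(inside) <= max_points or w <= min_size or h <= min_size:
        return [(x, y, w, h, inside)]
    hw, hh = w // 2, h // 2
    children = [(x, y, hw, hh), (x + hw, y, w - hw, hh),
                (x, y + hh, hw, h - hh), (x + hw, y + hh, w - hw, h - hh)]
    blocks = []
    for cx, cy, cw, ch in children:
        blocks += quadtree_split(cx, cy, cw, ch, inside, max_points, min_size)
    return blocks

# Usage: split a 640x480 image whose interest points were detected beforehand.
points = [(10, 20), (300, 200), (310, 210), (600, 400), (50, 400)]
for bx, by, bw, bh, pts in quadtree_split(0, 0, 640, 480, points):
    print(f"block ({bx},{by},{bw},{bh}) holds {len(pts)} points")
```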
Step S102, acquiring first characteristic points of each image block and main direction information of each first characteristic point.
In some embodiments, the first feature point of each image block may be extracted by the oFAST method.
In specific implementation, the collected first image may be subjected to downsampling processing of different levels by using a pyramid principle to obtain an image pyramid of the first image, and then first feature point detection is performed on each layer of the image pyramid, so that multi-size features are obtained, and the obtained first feature points have scale invariance.
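A minimal sketch of the pyramid idea, assuming OpenCV and numpy are available: the frame is repeatedly downsampled, and feature detection would then run on every level so the extracted points gain scale invariance. The level count and scale factor are illustrative, not values taken from the patent.

```python
import cv2
import numpy as np

def build_pyramid(image, n_levels=4, scale=0.8):
    # Level 0 is the original frame; each further level is a scaled-down copy.
    levels = [image]
    for _ in range(1, n_levels):
        prev = levels[-1]
        h, w = prev.shape[:2]
        levels.append(cv2.resize(prev, (max(1, int(w * scale)), max(1, int(h * scale)))))
    return levels

frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # stand-in image
for i, lvl in enumerate(build_pyramid(frame)):
    print("level", i, "size", lvl.shape)
```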
As an example, a first feature point may be extracted by FAST corner feature; as shown in fig. 5, the process flow of acquiring the first feature point of the image block may at least include:
step S102a, determining first gray information of the pixel point.
In some embodiments, the first gray information may refer to the gray value of a pixel point; it is assumed that the gray value at a pixel point P in the image block is I_p.
Step S102b, determining second gray scale information of the pixel point whose distance from the pixel point is equal to the distance threshold.
In some embodiments, in each layer of the image pyramid, the gray information of the 12 pixels lying on a circle of radius 3 centered on the pixel point P is determined as the second gray information.
Step S102c, determining whether the pixel point is an angular point based on the first gray information and the second gray information.
In some embodiments, a threshold T is set; as an example, T = I_p × 20%.
Step S102d, if the pixel point is an angular point, determining that the pixel point is the first feature point.
In some embodiments, if the gray values of N consecutive pixels on the circle are all greater than or equal to I_p + T, or all less than I_p - T, the pixel point P is judged to be a corner point, and the pixel point P is a first feature point.
After the first image is divided for one time, feature point extraction is carried out on the image blocks obtained by division to obtain first feature points. And judging whether the response value of the first characteristic point in each image block is greater than a response threshold value or not, and further determining whether the segmentation operation needs to be continuously executed or not. The feature points extracted after the first image is segmented through the quadtree structure are distributed in the image uniformly, so that the global information of the segmented image can be fully utilized, and the specific position information of the current position can be determined more accurately. According to the method and the device, the first characteristic point is determined according to the relation between the gray values of different pixel points, so that the pose of the camera is not easily influenced by external illumination when the pose is determined.
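A minimal sketch of the FAST-style corner test described above: the center pixel's gray value I_p is compared against pixels on a radius-3 circle, and the pixel is called a corner when enough consecutive circle pixels are all brighter than I_p + T or all darker than I_p - T. The circle offsets, run length and demo image are illustrative assumptions.

```python
import numpy as np

# 16 offsets of the standard radius-3 Bresenham circle.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_corner(img, x, y, n_required=12, t_ratio=0.2):
    ip = int(img[y, x])
    t = ip * t_ratio                      # threshold T = I_p * 20% as in the text
    ring = [int(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    ring = ring + ring                    # wrap around so runs can cross index 0
    brighter = darker = 0
    for v in ring:
        brighter = brighter + 1 if v >= ip + t else 0
        darker = darker + 1 if v <= ip - t else 0
        if brighter >= n_required or darker >= n_required:
            return True
    return False

img = np.full((20, 20), 50, dtype=np.uint8)
img[8:11, 8:11] = 200                     # small bright blob on a dark background
print(is_corner(img, 9, 9))               # the blob centre triggers the test
```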
In some embodiments, the optional process flow of determining the principal direction information of the first feature point, as shown in fig. 6, may include at least:
step S102e, determining the centroid of the image block based on the image moments of the image block.
In some embodiments, the image moments of an image block may be represented by the following formula:
m_pq = Σ_x Σ_y x^p · y^q · I(x, y)   (1)
wherein m denotes the image moment; x and y denote the coordinates of a pixel point; I(x, y) is the gray value at (x, y); p and q represent the magnitude relation between the pixel value of a pixel point and the pixel value of the central pixel point: if the pixel value of the pixel point is larger than that of the central pixel point, the value of p or q is 1; if the pixel value of the pixel point is smaller than that of the central pixel point, the value of p or q is 0.
The centroid of the image block can be represented by the following formula:
(formula (2); the equation image is not reproduced here)
wherein C represents the centroid of the image block, and the four matrices appearing in formula (2) represent, respectively, the lower-left, upper-left, upper-right and lower-right corners in the quadtree structure.
Step S102f, determining a main direction of the image block based on the centroid of the image block and the geometric center of the image block.
In some embodiments, the centroid C of an image block is connected to the geometric center O of the image block, resulting in a principal direction vector OC of the image block.
Step S102g, determining principal direction information of each first feature point in the image block based on the principal direction of the image block.
In some embodiments, the principal direction information of the first feature points extracted in the image block may be represented as the orientation of the vector OC:
θ = atan2(C_y - O_y, C_x - O_x)   (3)
wherein (O_x, O_y) and (C_x, C_y) are the coordinates of the geometric center O and the centroid C of the image block, respectively.
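A minimal sketch of steps S102e to S102g, using the standard image moments m00, m10, m01 of the whole block to locate the gray centroid C and taking the direction from the geometric center O to C as the block's main direction. The patent derives the centroid from quadtree sub-blocks; this simplified version and its test block are assumptions for illustration.

```python
import numpy as np

def block_orientation(block):
    ys, xs = np.mgrid[0:block.shape[0], 0:block.shape[1]]
    i = block.astype(np.float64)
    m00 = i.sum()
    m10 = (xs * i).sum()                              # first-order moment in x
    m01 = (ys * i).sum()                              # first-order moment in y
    cx, cy = m10 / m00, m01 / m00                     # centroid C
    ox, oy = (block.shape[1] - 1) / 2, (block.shape[0] - 1) / 2  # geometric centre O
    theta = np.arctan2(cy - oy, cx - ox)              # direction of vector OC
    return (cx, cy), theta

block = np.zeros((31, 31), dtype=np.uint8)
block[:, 16:] = 255                                   # brighter right half pulls C right
print(block_orientation(block))
```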
step S103, matching each first feature point with a second feature point included in a previous frame of image of the first image based on a locality-sensitive hashing algorithm, to obtain a corresponding feature point pair.
In some embodiments, it may be determined whether the first image is a first frame image captured by a camera, and if the first image is the first frame image captured by the camera, the information acquired in step S102 is stored; then, re-execution of step S101 is performed. If the first image is not the first frame image acquired by the camera, respectively carrying out Hash transformation on the descriptor of the first characteristic point and the descriptor of the second characteristic point; performing dimension reduction processing on the first feature points after the hash transformation and the second feature points after the hash transformation; calculating the distance between the first feature point after the dimension reduction processing and the second feature point after the dimension reduction processing; determining the feature point pairs based on the calculation result.
The locality-sensitive hashing algorithm guarantees that if two variables are very close to each other in a high-dimensional space, their distance remains approximately small after both are transformed by the same designed hash function. Therefore, in the embodiment of the present application, after hash transformation is performed on the descriptor of the first feature point and the descriptor of the second feature point, dimension reduction processing is performed on them; a proximity search is then performed between the first feature points after dimension reduction and the second feature points after dimension reduction, the distance between a first feature point and a second feature point is calculated, and the second feature point closest to the first feature point is found to obtain a feature point pair. In this way, the dimensionality and time required for the feature point pair matching search can be reduced.
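A minimal sketch of locality-sensitive hashing for binary descriptors, under assumptions not taken from the patent (bucket key length, distance threshold, random data): each 256-bit descriptor is reduced to a short key by sampling a fixed subset of its bits, candidate matches are looked up only in the same bucket, and the full Hamming distance is computed only for those candidates.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
HASH_BITS = rng.choice(256, size=16, replace=False)   # the "dimension reduction"

def lsh_key(desc):
    return tuple(desc[HASH_BITS])                     # 16-bit key instead of 256 bits

def match(descs_prev, descs_cur, max_dist=40):
    buckets = defaultdict(list)
    for j, d in enumerate(descs_prev):
        buckets[lsh_key(d)].append(j)
    pairs = []
    for i, d in enumerate(descs_cur):
        best, best_dist = None, max_dist + 1
        for j in buckets.get(lsh_key(d), []):         # search only the same bucket
            dist = int(np.count_nonzero(d != descs_prev[j]))   # Hamming distance
            if dist < best_dist:
                best, best_dist = j, dist
        if best is not None:
            pairs.append((i, best))
    return pairs

prev = rng.integers(0, 2, (100, 256), dtype=np.uint8)
cur = prev.copy()
cur[:, :5] ^= 1                                       # slightly perturbed copies
print(len(match(prev, cur)), "candidate feature point pairs")
```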
Step S104, determining, based on the feature point pairs, first pose change information of the camera between the first image and the previous frame image of the first image.
In some embodiments, the optional process flow of determining the first posture change information may include, as shown in fig. 7, at least:
step S104a, selecting N pixel points and the first feature point to form N feature point pairs within a set range centered on the first feature point by using the rBRIEF algorithm, and performing binary assignment by comparing the gray values to generate a code combination of 0 or 1.
Specifically, a region with a size of 31 × 31 may be selected with a first feature point as a center, and N pixel points are selected in this region. The mode of selecting the N pixel points is selected according to the positions obtained by training, namely the N pixel points are located at the N positions in the region obtained by training. Where N may be 256, then the rBRIEF descriptor dimension is 256. And pairing the selected N pixel points with the first characteristic point serving as the center to obtain N characteristic point pairs. In a specific embodiment, in the feature point pair, by comparing the gray values of the first feature point as the center point and the selected 256 pixel points, the pixel point whose gray value is smaller than the gray value of the first feature point in the image block is defaulted to 0, and the pixel point whose gray value is greater than the gray value of the first feature point in the image block is defaulted to 1, that is, 256 descriptors whose gray values are not 0, that is, 1 are generated.
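A minimal sketch of the binary descriptor just described: 256 pixels are sampled inside a 31x31 region centered on the first feature point, and each bit records whether that pixel is brighter than the center pixel. A random sampling pattern stands in here for the trained rBRIEF positions mentioned above; the test image is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
PATTERN = rng.integers(-15, 16, size=(256, 2))        # (dx, dy) offsets within 31x31

def binary_descriptor(img, x, y):
    centre = img[y, x]
    bits = np.zeros(256, dtype=np.uint8)
    for k, (dx, dy) in enumerate(PATTERN):
        # Bit k is 1 when the sampled pixel is brighter than the central feature point.
        bits[k] = 1 if img[y + dy, x + dx] > centre else 0
    return bits

img = np.random.default_rng(2).integers(0, 256, (64, 64), dtype=np.uint8)
print(binary_descriptor(img, 32, 32)[:16])            # first 16 bits of the descriptor
```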
In step S104b, the weighted sum of the coded combinations of 0 or 1 determines the centroid of the N feature point pairs.
Specifically, based on the obtained coding combination of 0 or 1, 0 or 1 at pixel points at different positions is subjected to weighted summation to obtain gray centroid points of N feature point pairs.
Step S104c, connecting the first feature point and the centroid and determining the direction angle of the first feature point.
Specifically, the first feature point is connected with the centroid, so that a connecting line of the first feature point and the centroid has a direction. In a specific embodiment, the direction angle θ of the connecting line of the feature point and the centroid is determined by the position coordinates of the N feature point pairs. The specific direction angle θ can be obtained by the following formula.
θ = arctan( Σ_{i=1..N} (y_Ni - y_A) / Σ_{i=1..N} (x_Ni - x_A) )   (4)
wherein N is the number of feature point pairs; y_Ni is the ordinate of the pixel point in the i-th feature point pair; y_A is the ordinate of the first feature point in the feature point pairs; x_Ni is the abscissa of the pixel point in the i-th feature point pair; and x_A is the abscissa of the first feature point in the feature point pairs.
Step S104d, the pixel points are rotated and sampled according to the direction angle to obtain the feature point pairs in the rotating state, and whether the feature point pairs are matched with the pre-stored texture information in the texture information base is determined.
Specifically, the obtained 256 pixel points are rotated through 360° with the obtained direction angle θ as the angular step for sampling, so as to obtain a lookup table of rotated descriptors. That is to say, feature point pairs at a plurality of angles are obtained through rotation, the rotated feature point pairs are compared with the feature point pairs in the pre-stored texture information in the texture information base, the feature point pairs matching the pre-stored texture information are determined, and the rotation angle of the feature point pairs is determined according to the rotation direction of the mutually matched feature point pairs. The pre-stored texture information in the texture information base may include feature points corresponding to images historically collected by the camera.
In some embodiments, the pose change information between two adjacent frames of images acquired by the camera can be represented as (Δx, Δy, Δθ), wherein Δx is the displacement, along the x axis, of the camera pose when the camera acquires the next frame image relative to the camera pose when the camera acquires the previous frame image; Δy is the displacement, along the y axis, of the camera pose when the camera acquires the next frame image relative to the camera pose when the camera acquires the previous frame image; and Δθ is the angular rotation of the camera pose when the camera acquires the next frame image relative to the camera pose when the camera acquires the previous frame image.
In some embodiments, when the pose is calculated based on the first frame image and the second frame image acquired by the camera, the pose change between the first frame image and the second frame image is only a transformation at the pixel level, and is not the actual change amount in the physical coordinate system. Therefore, the scale ratio parameter of the camera can be obtained by combining the data acquired by the wheel odometer in the same time period (the time interval between the camera acquiring the first frame image and the second frame image). The scale ratio parameter may be expressed as S, in units of millimeters per pixel. As an example, if the camera pose change corresponding to the first frame image and the second frame image is (Δx, Δy, Δθ), and the data of the wheel odometer changes by (x_w, y_w, θ_w) in the same period, then the scale ratio parameter S of the camera can be expressed as:
S = (x_w / Δx + y_w / Δy) / 2   (5)
For the other frame images except the first frame image acquired by the camera, the obtained camera pose change information is combined with the scale ratio parameter S by default to convert the pixel-level change into the actual physical distance, where the unit of the actual physical distance may be millimeters.
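A minimal sketch of the scale-ratio calibration, assuming the per-axis ratio form of formula (5) above: during the interval between the first two frames the wheel odometer reports a metric displacement while the camera reports a pixel-level displacement, and S (millimeters per pixel) is the mean of the two per-axis ratios. All numbers below are made up for illustration.

```python
def scale_ratio(wheel_dx_mm, wheel_dy_mm, cam_dx_px, cam_dy_px):
    # S = 0.5 * (x_w / dx + y_w / dy), following the reconstructed formula (5)
    return 0.5 * (wheel_dx_mm / cam_dx_px + wheel_dy_mm / cam_dy_px)

S = scale_ratio(wheel_dx_mm=120.0, wheel_dy_mm=45.0, cam_dx_px=240.0, cam_dy_px=92.0)
print("scale S =", S, "mm per pixel")

# Later pixel-level pose changes are converted to millimetres with the same S.
dx_px, dy_px = 30.0, -12.5
print("metric motion:", dx_px * S, dy_px * S, "mm")
```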
Step S105, determining the pose of the camera when the camera collects the first image based on the first pose change information.
In some embodiments, after the rotation angles of the feature point pairs are determined in step S104, the specific position information of the current position of the electronic device corresponding to the camera and the current pose of the electronic device are determined according to the pre-stored texture information matched with the feature points.
In some embodiments, the pose accumulator accumulates the pose variation between adjacent frames of images acquired by the camera to obtain the pose of the current frame image relative to the first frame image acquired by the camera, so as to realize the effect of the visual odometer. As an example, if the pose of the robot when the camera acquires the first frame image is (x_1, y_1, θ_1), and the relative pose change amount of the robot when the camera acquires the second frame image is (Δx, Δy, Δθ), then the absolute pose of the robot when the camera acquires the second frame image can be calculated as follows:
x_2 = x_1 + Δx·cos θ_1 - Δy·sin θ_1   (6)
y_2 = y_1 + Δx·sin θ_1 + Δy·cos θ_1   (7)
θ_2 = θ_1 + Δθ   (8)
Similarly, for any frame k, the absolute pose of the camera in the world coordinate system can be obtained by accumulating the pose when the camera acquires the previous frame image and the pose variation between the two adjacent frames acquired by the camera:
x_k = x_{k-1} + Δx_k·cos θ_{k-1} - Δy_k·sin θ_{k-1}   (9)
y_k = y_{k-1} + Δx_k·sin θ_{k-1} + Δy_k·cos θ_{k-1}   (10)
θ_k = θ_{k-1} + Δθ_k   (11)
wherein (x_k, y_k, θ_k) is the output result of the visual odometer.
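A minimal sketch of the accumulation step, assuming the 2D pose composition written in formulas (6) to (11) above: each relative motion (dx, dy, dtheta) is expressed in the previous frame and rotated into the world frame before being added. The motion values are made up for illustration.

```python
import math

def accumulate(pose, delta):
    x, y, th = pose
    dx, dy, dth = delta
    # Rotate the frame-local translation into the world frame, then add.
    x_new = x + dx * math.cos(th) - dy * math.sin(th)
    y_new = y + dx * math.sin(th) + dy * math.cos(th)
    return (x_new, y_new, th + dth)

pose = (0.0, 0.0, 0.0)                    # pose at the first frame
for delta in [(10.0, 0.0, 0.1), (10.0, 1.0, 0.05), (9.5, -0.5, 0.0)]:
    pose = accumulate(pose, delta)
print("visual odometer output:", pose)
```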
In some embodiments, the process of determining, based on the feature point pairs, the first pose change information of the camera between the first image and the previous frame image of the first image may be as shown in fig. 8, and includes:
step S1, mapping the first feature point corresponding to the feature point pair to the second feature point corresponding to the feature point pair, so as to obtain a homography matrix.
In some embodiments, the second feature point may be a feature point corresponding to pre-stored texture information.
Let the ground plane equation be n^T·P = d; then:
n^T·P / d = 1   (12)
wherein P is the spatial coordinate of the first feature point in the world coordinate system, n is the direction vector of the line connecting the first feature point and the optical center of the camera, d is the vertical distance between the optical center of the camera and the ground, and T is the transposition symbol.
Meanwhile, according to the imaging model of the image collector, the feature points of the obtained current frame and the feature points in the pre-stored texture information satisfy the following relations:

s_2·p_2 = K·P        (13)

s_1·p_1 = K·(R·P + t) = K·(R − t·n^T/d)·P = s_2·K·(R − t·n^T/d)·K^(−1)·p_2        (14)
wherein s_1 and s_2 are respectively the scale factors; in the application scenario of this embodiment, s_1 and s_2 are parameters that need to be calibrated. K is the intrinsic parameter matrix of the camera and can be obtained by the common Zhang Zhengyou calibration method. p_1 and p_2 respectively represent the pixel points of the second feature point and the first feature point in their respective images; R|t is the rotation and translation conversion relation matrix|vector between the two. Thus, it is possible to obtain:
s_1·p_1 = s_2·H·p_2 = s_2·[h_1  h_2  h_3]·p_2        (15)

wherein p_1 and p_2 are respectively the pixel coordinates of the second feature point and the first feature point in their respective images, and h_1, h_2 and h_3 are respectively the three column vectors of the homography matrix H.
The first feature points corresponding to the feature point pairs matched with the pre-stored texture information are mapped to the feature point set in the pre-stored texture information to obtain a matrix. Specifically, H is the homography matrix that maps the first feature points of the current frame to the feature point set of the pre-stored texture information. Ideally, according to the orthogonality of the rotation matrix, the first two column vectors of H can be directly normalized and cross-multiplied to obtain the rotation matrix:
R = [r_1  r_2  r_3],  r_1 = h_1/‖h_1‖,  r_2 = h_2/‖h_2‖,  r_3 = r_1 × r_2        (16)

wherein r_1, r_2 and r_3 are respectively the three column vectors of the rotation matrix R.
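The ideal case of equation (16) can be sketched as follows: the helper below normalizes the first two columns of the homography and completes the third column with a cross product. It is a simplified illustration of the noise-free case only; the function and variable names are assumptions.

import numpy as np

def rotation_from_homography(H):
    """Recover a rotation matrix from the first two columns of H (ideal case)."""
    h1, h2 = H[:, 0], H[:, 1]
    r1 = h1 / np.linalg.norm(h1)
    r2 = h2 / np.linalg.norm(h2)
    r3 = np.cross(r1, r2)           # third column from orthogonality
    return np.column_stack((r1, r2, r3))

print(rotation_from_homography(np.eye(3)))   # identity homography gives identity rotation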
Step S2, a reference rotational offset and a reference translational offset are determined based on the homography matrix.
In some embodiments, the rotation matrix and the translation vector are obtained by SVD decomposition of the matrix.

Specifically, because the output of the visual algorithm does not directly match the input parameters of the navigation and positioning algorithm and errors exist in the actual calculation, the rotation matrix and the translation vector can be obtained by decomposing the matrix through SVD, with the orthogonality of the rotation matrix as the theoretical basis that ensures the correctness and interpretability of the result.
H = U·S·V^T        (17)

R = U·V^T        (18)

t = h_3 / ‖h_1‖        (19)
U, S and V can be obtained by performing SVD on the homography matrix H, wherein R is the rotation matrix of the detected first feature point relative to the second feature point, and t is the translation vector of the first feature point relative to the second feature point. Then, the rotation offset and the translation offset of the electronic device equipped with the downward-looking camera during operation are determined according to the rotation matrix and the translation vector.
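One common way to carry out this SVD step, consistent with the description above though not necessarily identical to the patent's exact formulas, is to project the noisy matrix onto the nearest proper rotation; a hedged NumPy sketch:

import numpy as np

def nearest_rotation(M):
    """Return the rotation matrix closest to M (in Frobenius norm) via SVD."""
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt
    if np.linalg.det(R) < 0:        # enforce det(R) = +1 for a proper rotation
        U[:, -1] *= -1.0
        R = U @ Vt
    return R

noisy = np.array([[0.99, -0.12, 0.0],
                  [0.11,  1.02, 0.0],
                  [0.00,  0.00, 1.0]])
print(nearest_rotation(noisy))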
Step S3, multiplying the reference rotational offset and the reference translational offset by the scale ratio parameter, respectively, to obtain a rotational translation amount and a translational offset between the first image acquired by the camera and a previous frame image of the first image.
In some embodiments, the pixel-level pose transformation relationship R|t from the feature points of the current frame to the feature points corresponding to the preset texture information is obtained according to the above steps. In an optional embodiment, because the scale information in the optical axis direction of the camera is normalized during the simplification, only 8 equations, i.e. 4 feature point pairs, are needed to solve R|t. If more than 4 matched feature point pairs are available, the algorithm uses the RANSAC method for optimization and, taking the reprojection error of the homography matrix as the criterion, finds the optimal 4 matched pairs for the solution. Multiplying the translation vector t by the calibrated scale coefficient s yields the translation vector at the actual physical scale, which reflects the translation offset of the electronic device equipped with the downward-looking camera near the target position corresponding to the preset texture information; the corresponding angle reflects the rotation offset of the electronic device near that target position. The two quantities together guide the electronic device to adjust its operation parameters so that it can navigate the path to the next target position, which further improves the accuracy of the visual odometer.
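The RANSAC-based selection of matched pairs and the scale conversion described above can be illustrated with OpenCV, which is only one possible implementation and is not named in the original text; the point coordinates, the reprojection threshold and the scale coefficient below are example values.

import cv2
import numpy as np

# Matched pixel coordinates: current frame -> pre-stored texture (example values).
src = np.array([[10, 12], [40, 15], [38, 60], [12, 58], [25, 30]], dtype=np.float32)
dst = np.array([[11, 13], [41, 16], [39, 61], [13, 59], [26, 31]], dtype=np.float32)

H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=3.0)
t_px = H[:2, 2] / H[2, 2]      # pixel-level translation part (assumed convention)
s = 0.85                       # calibrated scale coefficient, mm per pixel (example)
t_mm = t_px * s
print(int(inliers.sum()), "inliers; translation in mm:", t_mm)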
In some embodiments, when the texture information in the running environment is affected by special situations such as stains or gradual changes, the texture information base needs to be updated.
In specific implementation, the confidence of the acquired environmental image information is determined according to the number of the screened feature point pairs and the translation vector. Wherein the confidence is calculated by the following formula:
c = f(a, b, m, ‖t‖)        (20)
wherein a is an adjustable parameter and depends on the richness degree of the environment texture; b is a fixed parameter related to the size of the field of view of the image acquisition device; m is the number of the characteristic point pairs after screening;
‖t‖ indicates the actual translation distance between the detected feature points and the feature points in the pre-stored texture information.
The more matching point pairs there are between the environment image of the current frame and the feature points in the preset texture information, and the smaller the offset distance in the calculation result, the higher the image matching similarity and the better the matching result.
Setting a preset confidence coefficient, and judging whether the confidence coefficient corresponding to the environment image of the current frame exceeds the preset confidence coefficient or not; if the confidence corresponding to the environment image of the current frame exceeds the preset confidence, fusing the acquired environment image with the pre-stored texture information to update the texture information base; and if the confidence corresponding to the environment image of the current frame does not exceed the preset confidence, not updating the texture information base.
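A hedged sketch of this confidence-gated update follows. The exact form of equation (20) is not reproduced; conf() below is only an assumed expression that increases with the number m of screened feature point pairs and decreases with the translation distance, as the surrounding text describes, and the texture-library interface is hypothetical.

def conf(m, t_norm, a=1.0, b=50.0):
    """Assumed confidence form: grows with m, shrinks with the offset distance."""
    return a * m / (1.0 + t_norm / b)

def maybe_update_texture_library(library, frame, m, t_norm, preset_confidence=20.0):
    """Fuse the current environment image into the library only if trusted enough."""
    if conf(m, t_norm) > preset_confidence:
        library.fuse(frame)        # hypothetical texture-library API
        return True
    return False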
In the embodiment of the present application, the electronic device mounted with the downward-looking camera may be a robot having a mobile function.
Continuing with the exemplary structure of the implementation device 455 of the visual odometer provided by the embodiment of the present application as a software module, in some embodiments, as shown in fig. 2, the software module stored in the implementation device 455 of the visual odometer in the memory 450 may include: the image segmentation module 4551 is configured to segment a first image acquired by the downward-view camera to obtain at least two image blocks; an information obtaining module 4552, configured to obtain first feature points of each image block and main direction information of each first feature point; a feature point pair determining module 4553, configured to match each first feature point with a second feature point included in a previous frame of the first image based on a locality sensitive hashing algorithm, to obtain a corresponding feature point pair; a pose determination module 4554 configured to determine, based on the feature point pairs, first pose change information between the first image acquired by the camera and a last frame image of the first image; the first pose change information is related to a scale ratio parameter of the camera; determining the pose of the camera when acquiring the first image based on the first pose change information.
In some embodiments, the information obtaining module 4552 is configured to perform the following operations for the image blocks respectively: determining a centroid of the image block based on image moments of the image block; determining a main direction of the image block based on a centroid of the image block and a geometric center of the image block; and determining main direction information of each first characteristic point in the image block based on the main direction of the image block.
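The centroid and principal-direction computation for an image block can be sketched with grey-level image moments as below; the moment definitions follow the standard intensity-centroid formulation, and the patch size and names are illustrative rather than quoted from the patent.

import numpy as np

def block_principal_direction(block):
    """block: 2-D grey image patch; returns the principal direction in radians."""
    ys, xs = np.mgrid[0:block.shape[0], 0:block.shape[1]]
    m00 = block.sum()
    cx = (xs * block).sum() / m00                      # centroid from image moments
    cy = (ys * block).sum() / m00
    gx = (block.shape[1] - 1) / 2.0                    # geometric center of the block
    gy = (block.shape[0] - 1) / 2.0
    return float(np.arctan2(cy - gy, cx - gx))         # direction: center -> centroid

patch = np.random.default_rng(0).integers(1, 255, (31, 31)).astype(float)
print(block_principal_direction(patch))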
In some embodiments, the information obtaining module 4552 is configured to perform the following operations for each pixel point in each image block respectively: determining first gray information of the pixel point; determining second gray information of the pixel points whose distance from the pixel point is equal to the distance threshold;
judging whether the pixel points are angular points or not based on the first gray information and the second gray information; and if the pixel point is the angular point, determining the pixel point as the first characteristic point.
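A much-reduced corner test in the spirit of this gray-comparison step is sketched below: the grey value of a candidate pixel is compared with pixels lying at a fixed distance from it, and the pixel is treated as a corner if enough of them differ strongly. The ring offsets and both thresholds are illustrative values, not the patent's.

import numpy as np

RING = [(-3, 0), (-2, 2), (0, 3), (2, 2), (3, 0), (2, -2), (0, -3), (-2, -2)]

def is_corner(img, y, x, diff_thresh=25, count_thresh=6):
    """img: 2-D grey image; (y, x) must be at least 3 pixels away from the border."""
    center = int(img[y, x])                                   # first gray information
    ring = [int(img[y + dy, x + dx]) for dy, dx in RING]      # second gray information
    strong = sum(abs(g - center) > diff_thresh for g in ring)
    return strong >= count_thresh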
In some embodiments, the feature point pair determining module 4553 is configured to perform hash transform on the descriptor of the first feature point and the descriptor of the second feature point respectively; performing dimension reduction processing on the first feature points after the hash transformation and the second feature points after the hash transformation; calculating the distance between the first feature point after the dimension reduction processing and the second feature point after the dimension reduction processing; determining the feature point pairs based on the calculation result.
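A generic locality-sensitive hashing sketch for this matching step is shown below: descriptors are projected onto random hyperplanes to obtain short binary codes (the dimension reduction), and pairs are formed by the smallest Hamming distance. This is a standard LSH scheme given for illustration; the specific hash transform, code length and distance threshold used in the patent may differ.

import numpy as np

def make_planes(descriptor_dim, n_bits=32, seed=42):
    """Random hyperplanes shared by both descriptor sets."""
    return np.random.default_rng(seed).normal(size=(descriptor_dim, n_bits))

def lsh_codes(descriptors, planes):
    """Hash float descriptors into binary codes (one row per descriptor)."""
    return (descriptors @ planes > 0).astype(np.uint8)

def match_by_hamming(codes_a, codes_b, max_dist=8):
    """Pair each code in A with its nearest code in B if the distance is small enough."""
    pairs = []
    for i, code in enumerate(codes_a):
        dists = np.count_nonzero(codes_b != code, axis=1)   # Hamming distances
        j = int(dists.argmin())
        if dists[j] <= max_dist:
            pairs.append((i, j))
    return pairs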
In some embodiments, the pose determining module 4554 is configured to map the first feature points corresponding to the feature point pairs to the second feature points corresponding to the feature point pairs, so as to obtain a homography matrix; determining a reference rotational offset and a reference translational offset based on the homography matrix; and multiplying the reference rotation offset and the reference translation offset by the scale proportion parameter respectively to obtain the rotation translation amount and the translation offset between the first image acquired by the camera and the last frame image of the first image.
In some embodiments, the pose determination module 4554 is further configured to determine second pose change information between the second frame image acquired by the camera and the first frame image acquired by the camera; determine mileage change information occurring between the acquisition of the second frame image and the acquisition of the first frame image by the camera; and determine the scale ratio parameter based on the mileage change information and the second pose change information.
In some embodiments, the pose determination module 4554 is configured to calculate a first ratio of the mileage change information of the first coordinate axis included in the mileage change information to the translation amount of the first coordinate axis included in the second pose change information, and a second ratio of the mileage change information of the second coordinate axis included in the mileage change information to the translation amount of the second coordinate axis included in the second pose change information; and determine half of the sum of the first ratio and the second ratio as the scale ratio parameter.
In some embodiments, the pose determination module 4554 is configured to determine the pose of the camera at the time of acquiring the first image based on the first pose change information and the pose of the last frame of image of the first image.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the implementation method of the visual odometer, which is described in the embodiment of the present application.
Embodiments of the present application provide a computer-readable storage medium storing executable instructions, which when executed by a processor, will cause the processor to perform a method provided by embodiments of the present application, for example, a method for implementing a visual odometer as shown in fig. 3 to 8.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (11)

1. A method of implementing a visual odometer, the method comprising:
dividing a first image acquired by a downward-looking camera to obtain at least two image blocks;
acquiring first characteristic points of each image block and main direction information of the first characteristic points;
matching each first characteristic point with a second characteristic point included in a last frame of image of the first image based on a locality sensitive hashing algorithm to obtain a corresponding characteristic point pair;
determining, based on the feature point pairs, first pose change information between the first image acquired by the camera and a last frame image of the first image; the first pose change information is related to a scale ratio parameter of the camera;
determining the pose of the camera when acquiring the first image based on the first pose change information.
2. The method according to claim 1, wherein the obtaining the first feature points of each image block and the principal direction information of each first feature point comprises:
respectively executing the following operations for each image block:
determining a centroid of the image block based on image moments of the image block;
determining a main direction of the image block based on a centroid of the image block and a geometric center of the image block;
and determining main direction information of each first characteristic point in the image block based on the main direction of the image block.
3. The method according to claim 1, wherein the obtaining the first feature points of each image block and the principal direction information of each first feature point comprises:
respectively executing the following operations aiming at each pixel point in each image block:
determining first gray information of the pixel points;
determining second gray information of the pixel points whose distance from the pixel point is equal to the distance threshold;
judging whether the pixel points are angular points or not based on the first gray information and the second gray information;
and if the pixel point is the angular point, determining the pixel point as the first characteristic point.
4. The method according to claim 1, wherein the matching each of the first feature points with a second feature point included in a previous frame of the first image based on a locality-sensitive hashing algorithm to obtain a corresponding feature point pair comprises:
respectively carrying out Hash transformation on the descriptor of the first characteristic point and the descriptor of the second characteristic point;
performing dimension reduction processing on the first feature points after the hash transformation and the second feature points after the hash transformation;
calculating the distance between the first feature point after the dimension reduction processing and the second feature point after the dimension reduction processing;
determining the feature point pairs based on the calculation result.
5. The method of claim 1, wherein the determining, based on the feature point pairs, first pose change information between the first image acquired by the camera and a last frame image of the first image comprises:
mapping a first characteristic point corresponding to the characteristic point pair to a second characteristic point corresponding to the characteristic point pair to obtain a homography matrix;
determining a reference rotational offset and a reference translational offset based on the homography matrix;
and multiplying the reference rotation offset and the reference translation offset by the scale proportion parameter respectively to obtain the rotation translation amount and the translation offset between the first image acquired by the camera and the last frame image of the first image.
6. The method of claim 5, wherein prior to the determining, based on the feature point pairs, of the first pose change information between the first image acquired by the camera and a last frame image of the first image, the method further comprises:
determining second pose change information between the second frame image acquired by the camera and the first frame image acquired by the camera;
determining mileage change information occurring when the camera acquires the second frame image and when the camera acquires the first frame image;
determining the scale ratio parameter based on the mileage change information and the second pose change information.
7. The method of claim 6, wherein the determining the scale ratio parameter based on the mileage change information and the second pose change information comprises:
calculating a first ratio of the mileage change information of a first coordinate axis included in the mileage change information to the translation amount of the first coordinate axis included in the second pose change information, and a second ratio of the mileage change information of a second coordinate axis included in the mileage change information to the translation amount of the second coordinate axis included in the second pose change information;
determining half of the sum of the first ratio and the second ratio as the scale ratio parameter.
8. The method of claim 1, wherein the determining the pose of the camera at the time the first image was acquired based on the first pose change information comprises:
and determining the pose of the camera when the first image is acquired based on the first pose change information and the pose of the last frame of image of the first image.
9. An apparatus for implementing a visual odometer, the apparatus comprising:
the image segmentation module is used for segmenting a first image acquired by the downward-looking camera to obtain at least two image blocks;
the information acquisition module is used for acquiring first characteristic points of each image block and main direction information of each first characteristic point;
a characteristic point pair determining module, configured to match each first characteristic point with a second characteristic point included in a previous frame of the first image based on a locality sensitive hashing algorithm, to obtain a corresponding characteristic point pair;
a pose determination module for determining first pose change information between the first image acquired by the camera and a last frame image of the first image based on the feature point pairs; the first pose change information is related to a scale ratio parameter of the camera; determining the pose of the camera when acquiring the first image based on the first pose change information.
10. An electronic device, comprising:
a memory for storing executable instructions;
a processor for implementing the method of any one of claims 1 to 8 when executing the executable instructions stored in the memory.
11. A computer-readable storage medium storing executable instructions for implementing the method of any one of claims 1 to 8 when executed by a processor.
CN202110715562.4A 2021-06-28 2021-06-28 Visual odometer implementation method and device and electronic equipment Pending CN113223007A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110715562.4A CN113223007A (en) 2021-06-28 2021-06-28 Visual odometer implementation method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN113223007A true CN113223007A (en) 2021-08-06

Family

ID=77081282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110715562.4A Pending CN113223007A (en) 2021-06-28 2021-06-28 Visual odometer implementation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113223007A (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107967691A (en) * 2016-10-20 2018-04-27 株式会社理光 A kind of visual odometry calculates method and apparatus
CN110044374A (en) * 2018-01-17 2019-07-23 南京火眼猴信息科技有限公司 A kind of method and odometer of the monocular vision measurement mileage based on characteristics of image
CN108955718A (en) * 2018-04-10 2018-12-07 中国科学院深圳先进技术研究院 A kind of visual odometry and its localization method, robot and storage medium
CN108615248A (en) * 2018-04-27 2018-10-02 腾讯科技(深圳)有限公司 Method for relocating, device, equipment and the storage medium of camera posture tracing process
CN109029417A (en) * 2018-05-21 2018-12-18 南京航空航天大学 Unmanned plane SLAM method based on mixing visual odometry and multiple dimensioned map
CN108734736A (en) * 2018-05-22 2018-11-02 腾讯科技(深圳)有限公司 Camera posture method for tracing, device, equipment and storage medium
CN109579844A (en) * 2018-12-04 2019-04-05 电子科技大学 Localization method and system
CN109887029A (en) * 2019-01-17 2019-06-14 江苏大学 A kind of monocular vision mileage measurement method based on color of image feature
CN111754579A (en) * 2019-03-28 2020-10-09 杭州海康威视数字技术股份有限公司 Method and device for determining external parameters of multi-view camera
CN110335337A (en) * 2019-04-28 2019-10-15 厦门大学 A method of based on the end-to-end semi-supervised visual odometry for generating confrontation network
CN110555435A (en) * 2019-09-10 2019-12-10 深圳一块互动网络技术有限公司 Point-reading interaction realization method
CN110992487A (en) * 2019-12-10 2020-04-10 南京航空航天大学 Rapid three-dimensional map reconstruction device and reconstruction method for hand-held airplane fuel tank
CN111797906A (en) * 2020-06-15 2020-10-20 北京三快在线科技有限公司 Method and device for positioning based on vision and inertial mileage

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUANG Xinhan: "微装机器人", Beijing: National Defense Industry Press, pages 91-92 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113706620A (en) * 2021-10-22 2021-11-26 杭州迦智科技有限公司 Positioning method, positioning device and movable platform based on reference object
CN113706620B (en) * 2021-10-22 2022-03-22 杭州迦智科技有限公司 Positioning method, positioning device and movable platform based on reference object

Similar Documents

Publication Publication Date Title
CN105637530B (en) A kind of method and system of the 3D model modification using crowdsourcing video
US20200175700A1 (en) Joint Training Technique for Depth Map Generation
WO2022241874A1 (en) Infrared thermal imaging monocular vision ranging method and related assembly
CN112102411A (en) Visual positioning method and device based on semantic error image
WO2022179581A1 (en) Image processing method and related device
US20210312650A1 (en) Method and apparatus of training depth estimation network, and method and apparatus of estimating depth of image
CN113808253A (en) Dynamic object processing method, system, device and medium for scene three-dimensional reconstruction
CN113139626B (en) Template matching method and device, electronic equipment and computer-readable storage medium
CN112967341A (en) Indoor visual positioning method, system, equipment and storage medium based on live-action image
WO2023083030A1 (en) Posture recognition method and related device
CN114565728A (en) Map construction method, pose determination method, related device and equipment
CN111428805B (en) Method for detecting salient object, model, storage medium and electronic device
CN116977674A (en) Image matching method, related device, storage medium and program product
CN110111364B (en) Motion detection method and device, electronic equipment and storage medium
WO2022179603A1 (en) Augmented reality method and related device thereof
CN113516697B (en) Image registration method, device, electronic equipment and computer readable storage medium
CN113223007A (en) Visual odometer implementation method and device and electronic equipment
CN113639782A (en) External parameter calibration method and device for vehicle-mounted sensor, equipment and medium
CN113673288B (en) Idle parking space detection method and device, computer equipment and storage medium
CN111680564B (en) All-weather pedestrian re-identification method, system, equipment and storage medium
CN112258647A (en) Map reconstruction method and device, computer readable medium and electronic device
CN116580151A (en) Human body three-dimensional model construction method, electronic equipment and storage medium
CN110956131A (en) Single-target tracking method, device and system
CN117693768A (en) Semantic segmentation model optimization method and device
CN113205530A (en) Shadow area processing method and device, computer readable medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination