CN113011231B - Classification sliding window method, SLAM positioning method, system and electronic equipment


Info

Publication number
CN113011231B
CN113011231B
Authority
CN
China
Prior art keywords
window
frame
pose
observation
frames
Prior art date
Legal status
Active
Application number
CN201911326341.7A
Other languages
Chinese (zh)
Other versions
CN113011231A (en)
Inventor
兰国清
周俊
黄菊
胡增新
Current Assignee
Sunny Optical Zhejiang Research Institute Co Ltd
Original Assignee
Sunny Optical Zhejiang Research Institute Co Ltd
Priority date
Filing date
Publication date
Application filed by Sunny Optical Zhejiang Research Institute Co Ltd
Priority to CN201911326341.7A
Publication of CN113011231A
Application granted
Publication of CN113011231B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/10: Terrestrial scenes
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 11/00: Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G01C 11/04: Interpretation of pictures
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C 21/10: Navigation by using measurements of speed or acceleration
    • G01C 21/12: Navigation by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C 21/16: Navigation by integrating acceleration or speed, i.e. inertial navigation
    • G01C 21/165: Inertial navigation combined with non-inertial navigation instruments
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C 21/20: Instruments for performing navigational calculations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462: Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Automation & Control Theory (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A classification sliding window method, a SLAM positioning method, and a system and electronic device thereof. The classification sliding window method comprises the following steps. S110: determining whether the number of all observation frames in a window reaches the maximum window number of the window. S120: when the number of the observation frames reaches the maximum window number, culling a predetermined number of observation frames from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window. S130: when the number of the observation frames is smaller than the maximum window number, further determining whether the number of the observation frames is greater than a preset frame number threshold; if so, selectively culling observation frames from the window according to the feature point tracking rate of the current observation frame; if not, retaining all observation frames in the window.

Description

Classification sliding window method, SLAM positioning method, system and electronic equipment
Technical Field
The invention relates to the technical field of SLAM (Simultaneous Localization and Mapping), and in particular to a classification sliding window method, a SLAM positioning method, and a system and electronic device thereof.
Background
With the continuous progress of computing, smart-device and sensor technology, intelligent applications such as AR/VR, unmanned aerial vehicles and intelligent robots are entering the market, so Simultaneous Localization And Mapping (SLAM) technology is attracting increasing attention. In general, the SLAM problem can be divided into a front-end part and a back-end part: the front end mainly processes the data acquired by the sensors and converts them into relative poses or other forms the machine can understand; the back end mainly handles optimal posterior estimation, i.e., optimal estimation of the pose, the map and the like. Current SLAM positioning technology generally adopts visual-inertial odometry (VIO) for pose estimation; this scheme offers high positioning accuracy and a stable effect, so visual-inertial odometry is widely applied.
At present there are various open-source visual-inertial odometry algorithms, which can be divided, according to the back-end optimization method, into two major categories: filtering-based optimization and nonlinear optimization. In a filtering-based method, the state vector dimension and the covariance matrix are relatively small, so the computation is light and fast, and positioning can be achieved even in fast-motion scenes; a typical open-source algorithm in this category is S-MSCKF, but its positioning accuracy is low and its robustness is poor. A nonlinear optimization method needs to maintain a global map and global keyframes, so its computation is heavy and its real-time performance is poor; a classical representative is VINS-Mono, which, although it performs well in most scenarios, has high CPU resource requirements and poor real-time performance.
In addition, a filtering-based method such as S-MSCKF typically stores each observation frame input to the back end in a window, and deletes observation frames from the window when their number reaches the maximum storable number (i.e., the window is full). Specifically, when the window is full, the relative pose between the latest third-to-last frame and the latest fourth-to-last frame is calculated first; if it meets a certain threshold, the latest third-to-last frame is culled, and if not, the oldest first frame is culled. Then the relative pose between the latest second-to-last frame and the latest fourth-to-last frame is calculated; if it meets a certain threshold, the latest second-to-last frame is culled, and if not, the oldest second frame is culled. Finally, the feature points on the two culled observation frames are input into a filter for filtering optimization, thereby obtaining the positioning information.
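For concreteness, the prior-art rule just described can be sketched as follows. This is a loose reading of the text, not actual S-MSCKF source code: `pose_meets` stands for the unspecified threshold test and is supplied by the caller, and `window` is a plain list ordered oldest-first.

```python
# A rough sketch of the prior-art culling rule described above (not actual
# S-MSCKF source code). `window` is a list ordered oldest-first, and
# `pose_meets(frame_a, frame_b)` is a caller-supplied threshold predicate.
def prior_art_cull(window, pose_meets):
    anchor = window[-4]                        # latest fourth-to-last frame
    pairs = [(window[-3], window[0]),          # third-to-last vs. oldest first
             (window[-2], window[1])]          # second-to-last vs. oldest second
    culled = []
    for candidate, fallback in pairs:
        # Cull the recent frame when its relative pose to the anchor meets
        # the threshold; otherwise cull the corresponding oldest frame.
        culled.append(candidate if pose_meets(candidate, anchor) else fallback)
    for frame in culled:
        window.remove(frame)
    return culled    # feature points of these two frames go to the filter
```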
However, although filtering-based methods such as S-MSCKF improve real-time performance and positioning accuracy compared with EKF-SLAM, their positioning accuracy still fails to meet the requirements of applications such as AR/VR, and unrecoverable drift may occur under complex motion conditions.
Disclosure of Invention
An advantage of the present invention is to provide a classification sliding window method and a SLAM positioning method, and a system and electronic device thereof, which can improve positioning accuracy so as to meet the positioning-accuracy requirements of applications such as AR/VR.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, a system thereof and an electronic device, wherein, in an embodiment of the present invention, the classification sliding window method can optimize the constraints of multiple observation frames on the feature points, improving positioning accuracy.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, a system thereof and an electronic device, wherein, in an embodiment of the present invention, the classification sliding window method can apply different sliding window strategies to different observation information, thereby further improving positioning accuracy.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, a system thereof and an electronic device, wherein, in an embodiment of the present invention, the classification sliding window method can retain an appropriate number of observation frames, reducing the amount of calculation at the back end and helping to improve overall real-time performance.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, a system thereof and an electronic device, wherein, in an embodiment of the present invention, the SLAM positioning method can accelerate the front-end processing speed, which helps to further improve overall real-time performance and to meet the real-time requirements of applications such as AR/VR.
Another advantage of the present invention is to provide a classification sliding window method, a SLAM positioning method, a system thereof and an electronic device, wherein, in an embodiment of the present invention, the SLAM positioning method can combine an optical flow tracking method with an epipolar search and block matching method in the front end, so as to reduce the errors of left- and right-eye feature tracking and increase the front-end processing speed, thereby further improving overall real-time performance.
Another advantage of the present invention is to provide a classification sliding window method and a SLAM positioning method, and a system and electronic device thereof, wherein no complex structure or huge amount of calculation is required in the present invention to achieve the above advantages. The present invention thus successfully and effectively provides a solution that not only provides a classification sliding window method, a SLAM positioning method, a system thereof and an electronic device, but also increases their practicality and reliability.
To achieve at least one of the above or other advantages and objects, the present invention provides a classification sliding window method, comprising the steps of:
S110: determining whether the number of all observation frames in a window reaches the maximum window number of the window;
S120: when the number of the observation frames reaches the maximum window number, culling a predetermined number of observation frames from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and
S130: when the number of the observation frames is smaller than the maximum window number, further determining whether the number of the observation frames is greater than a preset frame number threshold; if so, selectively culling observation frames from the window according to the feature point tracking rate of the current observation frame; if not, retaining all observation frames in the window.
In an embodiment of the invention, the classification sliding window method further includes the steps of:
S140: adding the current observation frame to the window as the latest frame in the window.
In an embodiment of the present invention, the step S120 includes the steps of:
calculating the relative pose between the oldest first frame and the oldest second frame in the window to determine whether the relative pose is greater than a first pose threshold;
when the relative pose is greater than the first pose threshold, culling the oldest first frame in the window and, starting from the oldest second frame in the window, culling a first predetermined number of observation frames in batches at intervals; and
when the relative pose is not greater than the first pose threshold, the oldest first frame in the window is retained and a second predetermined number of observed frames are batch-dropped at intervals starting from the oldest second frame in the window.
In an embodiment of the present invention, in the step S120, the observation frames in the window are batch-dropped at equal intervals, starting from the oldest second frame in the window.
In an embodiment of the present invention, the step S130 includes the steps of:
detecting the characteristic point tracking rate of the current observation frame to determine whether the characteristic point tracking rate of the current observation frame is 100%;
when the characteristic point tracking rate of the current observation frame is 100%, starting from the oldest second frame in the window, sequentially calculating the relative pose between the observation frame to be removed and the oldest first frame in the window to judge whether the relative pose is smaller than a second pose threshold, and removing the observation frame to be removed if the relative pose is smaller than the second pose threshold; if not, reserving the observation frame to be removed; and
when the feature point tracking rate of the current observation frame is less than 100%, retaining all the observation frames to be removed in the window.
In an embodiment of the present invention, the step S130 further includes the steps of:
the number of the observation frames to be removed from the window is monitored, so that the removal operation is stopped when the removal number of the observation frames to be removed reaches 1/3 of the maximum window number.
According to another aspect of the present invention, the present invention further provides a SLAM positioning method, including the steps of:
front-end processing is carried out on an original image acquired by a binocular camera so as to obtain characteristic point information of a current observation frame;
performing filter prediction processing on IMU information acquired by an inertial measurement unit to obtain a predicted pose and a predicted speed of the binocular camera;
carrying out map construction according to the characteristic point information of the current observation frame to determine whether the characteristic point information of tracking loss exists or not, and further carrying out filter estimation processing to obtain the estimated pose and the estimated speed of the binocular camera;
based on the characteristic point information of the current observation frame, sliding window processing is carried out by a classified sliding window method so as to determine whether the rejected observation frame exists or not; and
when the eliminated observation frame exists, carrying out filter estimation processing on the feature point information in the eliminated observation frame according to the estimated pose and the estimated speed of the binocular camera so as to obtain the optimized pose and the optimized speed of the binocular camera; and when the eliminated observation frame does not exist, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
In one embodiment of the present invention, the method for classifying sliding windows includes the steps of:
S110: determining whether the number of all observation frames in a window reaches the maximum window number of the window;
S120: when the number of the observation frames reaches the maximum window number, culling a predetermined number of observation frames from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and
S130: when the number of the observation frames is smaller than the maximum window number, further determining whether the number of the observation frames is greater than a preset frame number threshold; if so, selectively culling observation frames from the window according to the feature point tracking rate of the current observation frame; if not, retaining all observation frames in the window.
In an embodiment of the present invention, the step of performing front-end processing on an original image acquired by a binocular camera to obtain feature point information of a current observation frame includes the steps of:
tracking the characteristic points of the left-eye image in the original image by an optical flow tracking method to obtain left-eye characteristic point information in the current observation frame; and
and tracking the feature points of the right-eye image in the original image by an epipolar search and block matching method, according to the relative pose between the left-eye camera and the right-eye camera in the binocular camera, so as to obtain right-eye feature point information in the current observation frame.
In an embodiment of the present invention, the step of performing front-end processing on the original image acquired by the binocular camera to obtain feature point information of the current observation frame further includes the steps of:
judging whether the number of the characteristic points of the left eye image tracked by the optical flow tracking method is smaller than a threshold value of the number of the characteristic points, and if so, extracting new characteristic point information from the left eye image by a characteristic point extraction method so as to supplement the left eye characteristic point information in the current observation frame.
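As an illustration of this front end, the sketch below pairs OpenCV's pyramidal Lucas-Kanade tracker (left eye) with a simple SAD block match along the epipolar line (right eye) and supplements new corners when the tracked count drops below a threshold. It is a minimal sketch, not the patented implementation: it assumes grayscale, rectified stereo images, so that the epipolar line of a left-eye point is simply the same row in the right-eye image, and the function name, `min_pts` threshold, patch size and disparity range are all illustrative.

```python
import cv2
import numpy as np

def track_front_end(prev_left, cur_left, cur_right, prev_pts,
                    min_pts=150, patch=5, max_disp=64):
    # 1) Left eye: pyramidal LK optical flow from the previous left image.
    pts = prev_pts.astype(np.float32).reshape(-1, 1, 2)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_left, cur_left, pts, None)
    left_pts = nxt[status.ravel() == 1].reshape(-1, 2)

    # 2) Supplement new features when too few survive the tracking
    #    (the "feature point number threshold" step described above).
    if len(left_pts) < min_pts:
        extra = cv2.goodFeaturesToTrack(cur_left, min_pts - len(left_pts), 0.01, 10)
        if extra is not None:
            left_pts = np.vstack([left_pts, extra.reshape(-1, 2)])

    # 3) Right eye: SAD block matching along the epipolar line (here, the
    #    same image row, since the pair is assumed rectified).
    right_pts = []
    h, w = cur_left.shape[:2]
    for x, y in left_pts:
        xi, yi = int(round(x)), int(round(y))
        if not (patch <= yi < h - patch and patch <= xi < w - patch):
            right_pts.append(None)              # too close to the border
            continue
        ref = cur_left[yi - patch:yi + patch + 1,
                       xi - patch:xi + patch + 1].astype(np.int32)
        best_sad, best_d = None, 0
        for d in range(min(max_disp, xi - patch) + 1):
            cand = cur_right[yi - patch:yi + patch + 1,
                             xi - d - patch:xi - d + patch + 1].astype(np.int32)
            sad = int(np.abs(ref - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_d = sad, d
        right_pts.append((x - best_d, y))
    return left_pts, right_pts
```

On rectified images the epipolar search degenerates to a one-dimensional scan, which is what makes this cheaper and less error-prone than running unconstrained left-to-right optical flow.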
In an embodiment of the present invention, the step of performing map construction according to the feature point information of the current observation frame to determine whether there is a feature point with tracking lost, and further obtaining the estimated pose and the estimated speed of the binocular camera through filter estimation processing includes the steps of:
when feature points lost in tracking exist, carrying out filter estimation processing on the information of the lost feature points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
when no feature points lost in tracking exist, directly taking the predicted pose and the predicted speed of the binocular camera as the estimated pose and the estimated speed of the binocular camera.
In an embodiment of the present invention, the step of performing map construction according to the feature point information of the current observation frame to determine whether there is a feature point with tracking lost, and further obtaining the estimated pose and the estimated speed of the binocular camera through filter estimation processing includes the steps of:
when the tracking lost feature points exist, carrying out filter estimation processing on the information of the tracking lost feature points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
when the characteristic points lost in tracking do not exist, a preset number of characteristic points are screened from the characteristic points of the current observation frame, and then filtering estimation processing is carried out on the information of the screened characteristic points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera.
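A schematic of this estimation dispatch is sketched below; `filter_update` and `screen_points` are assumed callables standing for the filter estimation processing and the screening step described above, and `n_screen` is an illustrative "preset number", not a value from the patent.

```python
# A schematic of the estimation dispatch in this embodiment. Both callables
# are assumed placeholders supplied by the caller.
def estimate(state, lost_tracks, current_tracks, filter_update, screen_points,
             n_screen=20):
    if lost_tracks:
        # Feature points whose tracking is lost are complete observations:
        # use them to correct the predicted pose and speed.
        return filter_update(state, lost_tracks)
    # No lost tracks: screen a preset number of current feature points and
    # run the filter estimation on those instead.
    return filter_update(state, screen_points(current_tracks, n_screen))
```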
According to another aspect of the present invention, there is also provided a classification sliding window system for performing classification sliding window processing, wherein the classification sliding window system comprises:
a determining module, configured to determine whether the number of all observation frames in the window reaches a maximum window number of the window;
a first eliminating module, wherein the first eliminating module is communicably connected to the determining module and is used for eliminating a preset number of observation frames from the window in batches according to the relative pose between the oldest first frame and the oldest second frame in the window when the number of the observation frames reaches the maximum window number; and
a second eliminating module, wherein the second eliminating module is communicably connected to the determining module, and is configured to further determine whether the number of the observed frames is greater than a preset frame number threshold when the number of the observed frames is less than the maximum window number, and if so, selectively eliminate the observed frames from the window according to a feature point tracking rate of the current observed frame; if not, reserving all observation frames in the window.
In an embodiment of the invention, the classification sliding window system further includes:
an adding module, which is communicably connected to the first eliminating module and the second eliminating module respectively, and is used for adding the current observation frame to the window as the latest frame in the window.
In an embodiment of the present invention, the first culling module includes a pose calculation module and a batch culling module that are communicatively connected to each other, where the pose calculation module is configured to calculate a relative pose between the oldest first frame and the oldest second frame in the window, so as to determine whether the relative pose is greater than a first pose threshold; the batch eliminating module is used for eliminating the oldest first frame in the window when the relative pose is larger than the first pose threshold value, and eliminating a first preset number of observation frames in batches at intervals from the oldest second frame in the window; and the batch eliminating module is further configured to reserve the oldest first frame in the window and to intermittently and batch eliminate a second predetermined number of observation frames from the oldest second frame in the window when the relative pose is not greater than the first pose threshold.
In an embodiment of the present invention, the second culling module includes a detection module, a selecting culling module, and a preserving module, where the detection module is configured to detect a feature point tracking rate of the current observation frame to determine whether the feature point tracking rate of the current observation frame is 100%; the selection eliminating module is communicably connected to the detection module and is used for sequentially calculating the relative pose between the observation frame to be eliminated and the oldest first frame in the window from the oldest second frame in the window when the characteristic point tracking rate of the current observation frame is 100%, so as to judge whether the relative pose is smaller than a second pose threshold value, and eliminating the observation frame to be eliminated if the relative pose is smaller than the second pose threshold value; if not, reserving the observation frame to be removed; the retaining module is communicatively connected to the detecting module, and is configured to retain all the observation frames to be removed in the window when the feature point tracking rate of the current observation frame is less than 100%.
In an embodiment of the present invention, the second culling module further includes a monitoring module, configured to monitor a number of the observation frames to be culled, which are culled from the window, so as to stop the culling operation when the culling number of the observation frames to be culled reaches 1/3 of the maximum window number.
According to another aspect of the present invention, there is also provided a SLAM positioning system for positioning based on an original image acquired by a binocular camera and IMU information acquired by an inertial measurement unit, wherein the SLAM positioning system includes:
the front-end system is used for performing front-end processing on the original image to obtain the characteristic point information of the current observation frame;
the filter prediction system is used for carrying out filter prediction processing on the IMU information so as to obtain the predicted pose and the predicted speed of the binocular camera;
the map construction system comprises a map construction module and a feature point determination module which are communicably connected to each other, wherein the map construction module is communicably connected to the front-end system and the filter prediction system respectively and is used for carrying out map construction according to the feature point information of the current observation frame, and the feature point determination module is used for determining whether feature point information lost in tracking exists, so that the estimated pose and the estimated speed of the binocular camera are then obtained through filter estimation processing;
the classification sliding window system is used for carrying out sliding window processing by a classification sliding window method based on the feature point information of the current observation frame so as to determine whether an eliminated observation frame exists; and
the filter estimation system is communicatively connected with the classification sliding window system and is used for carrying out filter estimation processing on characteristic point information in the eliminated observation frame according to the estimated pose and the estimated speed of the binocular camera when the eliminated observation frame exists, so as to obtain the optimized pose and the optimized speed of the binocular camera; and when the eliminated observation frame does not exist, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
In an embodiment of the present invention, the front-end system includes an optical flow tracking module, an epipolar searching and block matching module and a judgment extraction module that are communicatively connected to each other, where the optical flow tracking module is configured to track, by using an optical flow tracking method, feature points of a left-eye image in the original image, so as to obtain left-eye feature point information in the current observation frame; the epipolar searching and block matching module is used for tracking the characteristic points of the right-eye image in the original image through an epipolar searching and block matching method according to the relative pose between the left-eye camera and the right-eye camera in the binocular camera so as to obtain right-eye characteristic point information in the current observation frame; the judging and extracting module is used for judging whether the number of the characteristic points of the left-eye image tracked by the optical flow tracking method is smaller than a threshold value of the number of the characteristic points, and if so, extracting new characteristic point information from the left-eye image by the characteristic point extracting method so as to supplement the left-eye characteristic point information in the current observation frame.
In an embodiment of the present invention, when the feature point of tracking loss exists, the filter estimation system is further configured to perform a filter estimation process on information of the feature point of tracking loss according to the predicted pose and the predicted speed of the binocular camera, so as to obtain the estimated pose and the estimated speed of the binocular camera; and when the tracking lost feature point does not exist, directly taking the predicted pose and the predicted speed of the binocular camera as the estimated pose and the estimated speed of the binocular camera.
In an embodiment of the present invention, the map building system further includes a feature point screening module, configured to screen a predetermined number of feature points from the feature points of the current observation frame when the feature points lost in tracking do not exist; the filter estimation system is further used for carrying out filter estimation processing on the information of the screened characteristic points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera.
According to another aspect of the present invention, there is also provided an electronic apparatus including:
at least one processor for executing instructions; and
a memory communicatively coupled to the at least one processor, wherein the memory stores at least one instruction, and the instruction is executed by the at least one processor to cause the at least one processor to perform some or all of the steps of a SLAM positioning method, wherein the SLAM positioning method comprises the steps of:
front-end processing is carried out on an original image acquired by a binocular camera so as to obtain characteristic point information of a current observation frame;
performing filter prediction processing on IMU information acquired by an inertial measurement unit to obtain a predicted pose and a predicted speed of the binocular camera;
carrying out map construction according to the characteristic point information of the current observation frame to determine whether the characteristic point information of tracking loss exists or not, and further obtaining the estimated pose and the estimated speed of the binocular camera through filter estimation processing;
based on the characteristic point information of the current observation frame, sliding window processing is carried out by a classified sliding window method so as to determine whether the rejected observation frame exists or not; and
when the eliminated observation frame exists, carrying out filter estimation processing on characteristic point information in the eliminated observation frame according to the estimated pose and the estimated speed of the binocular camera so as to obtain the optimized pose and the optimized speed of the binocular camera; and when the eliminated observation frame does not exist, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
Further objects and advantages of the present invention will become fully apparent from the following description and the accompanying drawings.
These and other objects, features and advantages of the present invention will become more fully apparent from the following detailed description, the accompanying drawings and the appended claims.
Drawings
FIG. 1 is a schematic flow chart of a classification sliding window method according to an embodiment of the invention.
FIG. 2 is a schematic flow chart of one of the steps of the classification sliding window method according to the above embodiment of the invention.
FIG. 3 is a schematic flow chart of a second step of the classification sliding window method according to the above embodiment of the invention.
FIG. 4 shows an example of the classification sliding window method according to the above embodiment of the invention.
FIG. 5 is a schematic flow chart of a SLAM positioning method according to an embodiment of the invention.
FIG. 6 is a flowchart of one of the steps of the SLAM positioning method according to the above embodiment of the invention.
FIG. 7 is a flowchart of a second step of the SLAM positioning method according to the above embodiment of the invention.
FIG. 8 shows an example of the SLAM positioning method according to the above embodiment of the invention.
FIG. 9 shows an example of a front-end processing step of the SLAM positioning method according to the above embodiment of the invention.
FIG. 10 is a schematic block diagram of a classification sliding window system according to an embodiment of the invention.
FIG. 11 is a schematic block diagram of a SLAM positioning system according to an embodiment of the invention.
FIG. 12 is a schematic block diagram of an electronic device according to an embodiment of the invention.
Detailed Description
The following description is presented to enable one of ordinary skill in the art to make and use the invention. The preferred embodiments in the following description are by way of example only and other obvious variations will occur to those skilled in the art. The basic principles of the invention defined in the following description may be applied to other embodiments, variations, modifications, equivalents, and other technical solutions without departing from the spirit and scope of the invention.
In the present invention, the terms "a" and "an" in the claims and specification should be understood as "one or more"; that is, in one embodiment the number of an element may be one, while in another embodiment the number of that element may be plural. The terms "a" and "an" are not to be construed as limiting the element to being unique or singular, nor is the term "the" to be construed as limiting the number of the element, unless the disclosure of the present invention specifically indicates that only one such element is present.
In the description of the present invention, it should be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In the description of the present invention, unless explicitly stated or limited otherwise, the terms "mounted," "connected," and "coupled" should be interpreted broadly: a connection may, for example, be fixed, detachable or integral; it may be mechanical or electrical; and it may be direct, or indirect through an intermediate medium. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Currently, existing filtering-based SLAM positioning methods such as S-MSCKF generally fuse a vision sensor (e.g., a binocular camera) with an Inertial Measurement Unit (IMU) for positioning, where each observation frame input to the back end is stored in a window; when the number of observation frames in the window reaches the maximum storable number (i.e., the maximum window number), several observation frames in the window need to be deleted to complete the sliding window operation. The sliding window strategy adopted by the existing S-MSCKF positioning method is as follows: when the window is full (i.e., the number of observation frames in the window equals the maximum window number), the relative pose between the latest third-to-last frame and the latest fourth-to-last frame is calculated first; if it meets a certain threshold, the latest third-to-last frame is culled, and if not, the oldest first frame is culled. Then the relative pose between the latest second-to-last frame and the latest fourth-to-last frame is calculated; if it meets a certain threshold, the latest second-to-last frame is culled, and if not, the oldest second frame is culled. Finally, the feature points on the two culled observation frames are input into a filter for filtering optimization, thereby obtaining the positioning information.
In other words, the sliding window strategy adopted by the existing S-MSCKF positioning method does not fully consider the influence of the current observation frame and of the oldest first two frames, and its culling strategy is too coarse. As a result, although the real-time performance and positioning accuracy of the existing S-MSCKF positioning method are improved over the EKF-SLAM positioning method, the positioning accuracy still cannot meet the requirements of applications such as AR/VR. In addition, under complex motion conditions, existing filtering-based SLAM positioning methods such as S-MSCKF may also suffer from unrecoverable drift, resulting in irreversible errors in the positioning results. Accordingly, in order to solve the above problems, the present invention proposes a classification sliding window method and a SLAM positioning method, and a system and electronic device thereof.
Exemplary Method
Referring to fig. 1 to fig. 4 of the drawings of the specification, a classification sliding window method according to an embodiment of the present invention is illustrated. Specifically, as shown in fig. 1, the classification sliding window method includes the steps of:
S110: determining whether the number of all observation frames in a window reaches the maximum window number of the window;
S120: when the number of the observation frames reaches the maximum window number, culling a predetermined number of observation frames from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and
S130: when the number of the observation frames is smaller than the maximum window number, further determining whether the number of the observation frames is greater than a preset frame number threshold; if so, selectively culling observation frames from the window according to the feature point tracking rate of the current observation frame; if not, retaining all observation frames in the window.
It is noted that the classification sliding window method of the present invention selects different sliding window strategies (i.e., strategies for deleting the observation frames stored in the window) for different observation information (e.g., the number of observation frames in the window), in order to ensure that useless information is deleted while useful information is deleted as little as possible.
Further, in the classification sliding window method of the above embodiment of the present invention, step S110 may be performed when the current observation frame (i.e., the observation frame newly input to the back end) is received; that is, when the back end receives the current observation frame, the number of all observation frames already saved in the window is checked to determine whether it reaches the maximum window number of the window. It will be appreciated that, for ease of understanding and to avoid confusion, the present invention defines the first observation frame to enter the window (i.e., the first saved in the window) as the oldest first frame, defines the second observation frame to enter the window (i.e., the second saved in the window) as the oldest second frame, and so on, so that all observation frames in the window are defined in sequence according to the time order in which they entered the window.
It should be noted that, since the number of the observation frames stored in the window will necessarily be smaller than the maximum window number of the window after step S120 or step S130 is performed by the classification sliding window method of the present invention, the window can continue to store new observation frames; thus, as shown in fig. 1, the classification sliding window method of the present invention further includes the steps of:
S140: adding the current observation frame to the window as the latest frame in the window.
In other words, the classification sliding window method of the present invention adds the current observation frame to the window after classifying and culling part of the observation frames in the window, thereby achieving the overall sliding window effect: the observation frames stored in the window are kept up to date so that the constraint relationships are updated, while the original constraint relationships with strong relevance are retained as comprehensively as possible.
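As a concrete picture of the window discussed here, the following is a minimal data-structure sketch, with names that are illustrative rather than taken from the patent: an ordered buffer whose index 0 is the oldest first frame, to which step S140 appends the current observation frame as the latest frame.

```python
# A minimal sketch of the window: an ordered buffer, oldest frame first.
from collections import deque

class SlidingWindow:
    def __init__(self, max_frames=30):
        self.max_frames = max_frames      # the maximum window number
        self.frames = deque()             # frames[0] is the oldest first frame

    def is_full(self):
        return len(self.frames) >= self.max_frames

    def add_latest(self, frame):
        # Step S140: the current observation frame becomes the latest frame.
        assert not self.is_full(), "cull before adding when the window is full"
        self.frames.append(frame)
```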
It should be noted that, according to the above embodiment of the present invention, when the number of all the observation frames reaches the maximum window number of the window, part of the observation frames must first be culled from the window before the current observation frame can be added. When the relative pose between the oldest first frame and the oldest second frame in the window is greater than a certain threshold, the contribution of the oldest first frame to the constraint relationships is smaller than that of the oldest second frame, so the oldest first frame can be culled to reduce the adverse effect of culling an observation frame; when the relative pose between the oldest first frame and the oldest second frame is not greater than the threshold, the contribution of the oldest first frame to the constraint relationships will be greater than that of the oldest second frame, so the oldest second frame can be culled instead, likewise reducing the adverse effect of culling an observation frame. Thus, as shown in fig. 2, step S120 of the classification sliding window method may include the steps of:
S121: calculating the relative pose between the oldest first frame and the oldest second frame in the window to determine whether the relative pose is greater than a first pose threshold;
S122: when the relative pose is greater than the first pose threshold, culling the oldest first frame and, starting from the oldest second frame in the window, culling a first predetermined number of observation frames in batches at intervals; and
S123: when the relative pose is not greater than the first pose threshold, retaining the oldest first frame and, starting from the oldest second frame in the window, culling a second predetermined number of observation frames in batches at intervals.
Notably, since the relative pose between the oldest first frame and the oldest second frame in the window may include an angular difference and a distance difference between the two frames, the first pose threshold of the present invention may be implemented as a preset relative pose (i.e., a preset angular difference and a preset distance difference). In other words, in the classification sliding window method of the present invention, if the relative pose calculated in step S121 is greater than the preset relative pose, that is, when the calculated angular difference and distance difference are greater than the preset angular difference and the preset distance difference, the classification sliding window method performs step S122; otherwise, step S123 is performed. It can be understood that the first pose threshold of the present invention may be obtained by tuning the SLAM positioning method, so as to determine the range or specific value of the first pose threshold according to the accuracy of the positioning information obtained by the SLAM positioning method, which is not described in detail herein.
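For illustration, the relative-pose test of step S121 can be sketched as below, assuming each frame carries a rotation matrix R and a position t; the threshold values themselves are tuning parameters, as noted above, and are not specified by the patent.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class FramePose:
    R: np.ndarray   # 3x3 rotation of the frame
    t: np.ndarray   # 3-vector position of the frame

def exceeds_first_pose_threshold(a: FramePose, b: FramePose,
                                 angle_thresh, dist_thresh):
    # Rotation angle of the relative rotation, recovered from its trace.
    R_rel = a.R.T @ b.R
    angle = np.arccos(np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0))
    dist = np.linalg.norm(a.t - b.t)
    # Per the text, step S122 runs only when BOTH the angular difference and
    # the distance difference exceed their preset values; otherwise S123 runs.
    return angle > angle_thresh and dist > dist_thresh
```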
Preferably, in the step S122 of the sliding window classifying method according to the present invention, when the relative pose between the oldest first frame and the oldest second frame in the window is greater than the first pose threshold, that is, the angle difference is greater than the preset angle difference, and the distance difference is greater than the preset distance difference, the oldest first frame is rejected first, and then a first predetermined number of the observation frames are rejected in batches at equal intervals starting from the oldest second frame in the window.
It is understood that the first predetermined number may be, but is not limited to being, set according to the maximum window number of the window. For example, when the maximum window number is thirty frames, the first predetermined number may be implemented as ten frames, such that, starting from the oldest second frame, one frame is removed every other frame: the oldest second frame, the oldest fourth frame, the oldest sixth frame, and so on. Of course, in other examples of the invention, the first predetermined number may also be implemented as nine frames or eleven frames, etc.
Similarly, in the step S123 of the classification sliding window method of the present invention, when the relative pose between the oldest first frame and the oldest second frame in the window is not greater than the first pose threshold, that is, the angle difference is not greater than the preset angle difference and/or the distance difference is not greater than the preset distance difference, the oldest first frame is retained, and the observation frames are rejected from the oldest second frame in the window in batches at equal intervals, wherein the number of the observation frames rejected from the window is equal to the second predetermined number.
It is understood that the second predetermined number may likewise be, but is not limited to being, set according to the maximum window number of the window. For example, when the maximum window number is thirty frames, the second predetermined number may be implemented as ten frames, such that, starting from the oldest second frame, one frame is removed every two frames, removing ten frames in total from the window. Of course, in other examples of the invention, the second predetermined number may also be implemented as nine frames or eleven frames, among others.
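The equal-interval culling of steps S122 and S123 then amounts to picking window indices, as the hypothetical helper below illustrates for the thirty-frame window and ten-frame batch of this example; the stride values encode one reading of "every other frame" versus "one frame every two frames" and are assumptions, not values from the patent.

```python
# A sketch of the equal-interval batch culling of steps S122/S123.
def interval_cull_indices(window_len, start=1, count=10, stride=2):
    """Window indices to cull (index 0 is the oldest first frame)."""
    out = []
    idx = start                       # the oldest second frame
    while len(out) < count and idx < window_len:
        out.append(idx)
        idx += stride
    return out

# S122 (pose above threshold): index 0 is culled first, then every other frame:
print([0] + interval_cull_indices(30, stride=2))   # [0, 1, 3, 5, ..., 19]
# S123 (pose not above threshold): index 0 is kept; cull with one extra gap:
print(interval_cull_indices(30, stride=3))         # [1, 4, 7, ..., 28]
```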
It should be noted that, according to the above embodiment of the present invention, when the number of all the observation frames has not reached the maximum window number of the window, the current observation frame may be added to the window directly, without any observation frame necessarily being culled. Therefore, step S130 of the classification sliding window method of the present invention may compare the number of the observation frames in the window with the preset frame number threshold to determine whether the number of the observation frames is greater than the preset frame number threshold, and then decide according to the comparison result whether to cull observation frames from the window, so as to ensure that a sufficient number of observation frames are stored in the window. It will be appreciated that the preset frame number threshold of the present invention may be set, but is not limited to being set, according to the maximum window number of the window, so that the number of observation frames in the window remains as high as possible. For example, the preset frame number threshold may be between 1/3 and 2/3 of the maximum window number (i.e., when the maximum window number is thirty, the preset frame number threshold may be between ten and twenty).
Furthermore, when the number of the observation frames in the window is greater than the preset frame number threshold, the classification sliding window method of the present invention may further decide whether to cull observation frames according to whether the feature point tracking rate of the current observation frame reaches 100%, so as to take the characteristics of the current observation frame into account.
Illustratively, as shown in fig. 3, step S130 of the classification sliding window method of the present invention may include the steps of:
S131: detecting the feature point tracking rate of the current observation frame to determine whether the feature point tracking rate of the current observation frame is 100%;
S132: when the feature point tracking rate of the current observation frame is 100%, starting from the oldest second frame in the window, sequentially calculating the relative pose between each observation frame to be culled and the oldest first frame in the window to judge whether the relative pose is smaller than a second pose threshold; if so, culling that observation frame to be culled; if not, retaining it; and
S133: when the feature point tracking rate of the current observation frame is less than 100%, retaining all observation frames to be culled in the window.
It is noted that, in this example of the invention, the observation frames in the window other than the oldest first frame may be defined as the observation frames to be culled. In this way, a relative pose between an observation frame to be culled and the oldest first frame in the window that is smaller than the second pose threshold means that the pose has changed little between the two frames, so even if that observation frame is culled and only the oldest first frame is retained, the original constraint relationship is preserved as much as possible. It can be understood that the second pose threshold of the present invention may also be obtained by tuning the SLAM positioning method, so as to determine the range or specific value of the second pose threshold according to the accuracy of the positioning information obtained by the SLAM positioning method, which is not described in detail herein.
Preferably, in the step S132, the number of the observation frames to be removed from the window is not more than 1/3 of the maximum window number, so as to ensure that a sufficient number of observation frames remain in the window, which helps to ensure that the positioning accuracy is always kept at a high level. For example, when the maximum window number of the windows is thirty, in the step S132, the number of the observation frames to be removed, which are removed from the windows, is ten frames at the maximum.
In other words, as shown in fig. 3, step S130 of the classification sliding window method of the present invention may further include the steps of:
S134: monitoring the number of the observation frames to be culled that are removed from the window, and stopping the culling operation when the number removed reaches 1/3 of the maximum window number.
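Steps S131 to S134 can be sketched as below, reusing the SlidingWindow sketch above; `pose_below` stands for the second-pose-threshold test and the tracking rate is assumed to be reported by the front end. This is an illustrative reading, not the patented code.

```python
# A sketch of steps S131-S134. `window` is the SlidingWindow sketched earlier
# and `pose_below(frame_a, frame_b)` is a caller-supplied threshold predicate.
def selective_cull(window, cur_tracking_rate, pose_below):
    # S131/S133: if any feature track was lost (< 100%), keep every frame.
    if cur_tracking_rate < 1.0:
        return []
    limit = window.max_frames // 3        # S134: stop at 1/3 of the maximum
    oldest_first = window.frames[0]
    culled = []
    # S132: candidates are all frames except the oldest first frame.
    for frame in list(window.frames)[1:]:
        if len(culled) >= limit:
            break
        if pose_below(frame, oldest_first):
            window.frames.remove(frame)   # pose barely changed: redundant
            culled.append(frame)
    return culled
```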
Illustratively, as shown in fig. 4, according to the classification sliding window method of the above-described embodiment of the present invention, when a new observation frame (i.e., a current observation frame) is input to the back end, first, the number of observation frames stored in the window is detected to determine whether the number of observation frames reaches the maximum window number of the window.
Secondly, if so, calculating the relative pose between the oldest first frame and the oldest second frame in the window to judge whether the relative pose is greater than a first pose threshold; if not, it is further determined whether the number of observed frames is greater than a predetermined frame number threshold.
Then, when the relative pose is greater than the first pose threshold, rejecting the oldest first frame, and starting from the oldest second frame in the window, intermittently rejecting a first predetermined number of observation frames in batches; and when the relative pose is not greater than the first pose threshold, reserving the oldest first frame, and intermittently removing a second predetermined number of observation frames in batches starting from the oldest second frame in the window. Correspondingly, when the number of the observed frames is larger than the preset frame number threshold, selectively eliminating the observed frames from the window according to the characteristic point tracking rate of the current observed frames; and when the number of the observation frames is not greater than the preset frame number threshold, reserving all the observation frames in the window.
Then, detecting the characteristic point tracking rate of the current observation frame to determine whether the characteristic point tracking rate of the current observation frame is 100%; if so, starting from the oldest second frame in the window, sequentially calculating the relative pose between the observation frame to be removed and the oldest first frame in the window to judge whether the relative pose is smaller than a second pose threshold, and removing the observation frame to be removed when the relative pose is smaller than the second pose threshold, wherein the number of the observation frames to be removed from the window is at most 1/3 of the maximum window number; when the relative pose is not smaller than the second pose threshold, reserving the observation frame to be removed; if not, reserving all the observation frames to be rejected in the window.
Finally, the current observation frame is added to the window as the latest frame in the window (i.e., the newest frame in the window).
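The whole flow of fig. 4 can be condensed into a single routine. The sketch below is a hedged reading of the preceding paragraphs, not the patent's own code: relative_pose and tracking_rate stand in for the pose-difference and feature-tracking computations, every threshold is a placeholder, and the equal-interval culling pattern is one plausible choice among several.

def drop_equal_interval(window, start, keep_every=2):
    """Illustrative equal-interval batch cull: from index start on,
    keep every keep_every-th frame and drop the rest (the text culls
    a predetermined number of frames, e.g. ten, at equal intervals)."""
    window[start:] = window[start::keep_every]

def slide_window(window, current, max_window_num, frame_num_threshold,
                 pose_thr_1, pose_thr_2, relative_pose, tracking_rate):
    if len(window) >= max_window_num:
        start = 1                                 # oldest second frame
        if relative_pose(window[0], window[1]) > pose_thr_1:
            del window[0]                         # cull the oldest first frame
            start = 0                             # old oldest second frame is now first
        drop_equal_interval(window, start)        # batch cull at equal intervals
    elif len(window) > frame_num_threshold:
        if tracking_rate(current) == 1.0:         # no feature point lost
            cap, removed, i = max_window_num // 3, 0, 1
            while i < len(window) and removed < cap:
                if relative_pose(window[i], window[0]) < pose_thr_2:
                    del window[i]                 # pose barely changed: cull
                    removed += 1
                else:
                    i += 1                        # enough pose change: keep
        # otherwise (tracking rate < 100%) every frame is retained
    window.append(current)                        # current frame becomes newest
    return window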
It is noted that the classification sliding window method of the present invention can effectively eliminate redundant observation frames in the window under different motion conditions and reduce the constraints on feature points, so that the amount of calculation is greatly reduced the next time the same feature points are processed. Therefore, compared with a positioning method based on nonlinear optimization, the SLAM positioning method can greatly increase the calculation speed and reduce memory consumption. In addition, in the classification sliding window method, the feature points contained in the removed observation frames can still provide observation information for filter optimization. Therefore, compared with the S-MSCKF positioning method, the classification sliding window method of the present invention also provides more observation information and constraints for filter optimization, improving the positioning accuracy.
According to another aspect of the present invention, as shown in fig. 5, the present invention further provides a SLAM positioning method, including the steps of:
S210: front-end processing is carried out on an original image acquired by a binocular camera so as to obtain characteristic point information of a current observation frame;
S220: performing filter prediction processing on IMU information acquired by an inertial measurement unit to obtain a predicted pose and a predicted speed of the binocular camera;
S230: carrying out map construction according to the characteristic point information of the current observation frame to determine whether the characteristic point of tracking loss exists or not, and further carrying out filter estimation processing to obtain the estimated pose and the estimated speed of the binocular camera;
S240: based on the characteristic point information of the current observation frame, carrying out sliding window processing by the classification sliding window method to determine whether the rejected observation frame exists or not; and
S250: when the eliminated observation frame exists, carrying out filter estimation processing on characteristic point information in the eliminated observation frame according to the estimated pose and the estimated speed of the binocular camera so as to obtain the optimized pose and the optimized speed of the binocular camera; and when the eliminated observation frames do not exist, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
Notably, the SLAM positioning method can effectively remove redundant observation frames in the window under different motion conditions through the classification sliding window method, so that the positioning accuracy and real-time performance of the SLAM positioning method are significantly improved. It can be understood that, for the sliding window processing by the classification sliding window method in the SLAM positioning method of the present invention, reference may be made to the classification sliding window method in the above embodiment of the present invention, which is not described in detail herein.
In addition, when performing front-end processing on an original image acquired by a binocular camera (comprising a left-eye image and a right-eye image) to obtain the feature point information of a current observation frame (comprising left-eye and right-eye feature point information), it is generally necessary to perform feature point tracking on the original image information. The existing S-MSCKF positioning method generally adopts an optical flow tracking method to track the feature points of the left-eye image, and a stereo matching method to track the feature points of the right-eye image, thereby obtaining the feature point information of the current observation frame. However, this approach is computationally expensive and slow at the front end, and in particular the error of tracking the left and right eye feature points is large, so that the positioning accuracy and real-time performance of the existing S-MSCKF positioning method are difficult to meet the requirements of applications such as AR/VR.
Therefore, in order to reduce the error of tracking the left and right eye feature points and increase the front end processing speed, as shown in fig. 6, the step S210 of the SLAM positioning method of the present invention may include the steps of:
S211: tracking the characteristic points of the left-eye image in the original image by an optical flow tracking method to obtain left-eye characteristic point information in the current observation frame; and
S212: and tracking the characteristic points of the right-eye image in the original image by using a polar line searching and block matching method according to the relative pose between the left-eye camera and the right-eye camera in the binocular camera so as to obtain right-eye characteristic point information in the current observation frame.
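As a concrete illustration of step S212, the sketch below searches along the epipolar line in the right-eye image for the patch that best matches the patch around a left-eye feature. It is a minimal Python/NumPy sketch under stated assumptions (grayscale images, a shared camera matrix K, non-negative disparity, SAD block matching); it is not the patent's implementation, and all window and search sizes are illustrative.

import numpy as np

def skew(t):
    """Skew-symmetric matrix [t]x such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

def track_right_feature(left_img, right_img, p_left, K, R, t, half=4, search=60):
    """Search the epipolar line of p_left in the right image and return
    the best-matching (minimum-SAD) patch center, or None."""
    h, w = right_img.shape
    x0, y0 = int(p_left[0]), int(p_left[1])
    if not (half <= x0 < w - half and half <= y0 < h - half):
        return None                                  # too close to the border
    F = np.linalg.inv(K).T @ skew(t) @ R @ np.linalg.inv(K)  # fundamental matrix
    a, b, c = F @ np.array([p_left[0], p_left[1], 1.0])      # line ax + by + c = 0
    ref = left_img[y0-half:y0+half+1, x0-half:x0+half+1].astype(np.float32)
    best, best_xy = np.inf, None
    for x in range(max(half, x0 - search), min(w - half, x0 + 1)):
        y = int(round(-(a * x + c) / b)) if abs(b) > 1e-9 else y0
        if y < half or y >= h - half:
            continue
        cand = right_img[y-half:y+half+1, x-half:x+half+1].astype(np.float32)
        sad = np.abs(ref - cand).sum()               # block-matching cost
        if sad < best:
            best, best_xy = sad, (x, y)
    return best_xy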
It should be noted that, in the present invention, for a newly received frame of left-eye image (i.e., the left-eye image in the current original image), the number of feature points of the left-eye image tracked by the optical flow tracking method may be reduced, that is, there may be feature points that are lost in tracking. At this time, the feature points need to be supplemented so that the number of feature points reaches the maximum number. Therefore, according to the above embodiment of the present invention, as shown in fig. 6, the step S210 of the SLAM positioning method of the present invention may further include the steps of:
S213: judging whether the number of the characteristic points of the left-eye image tracked by an optical flow tracking method is smaller than a characteristic point number threshold, and if so, extracting new characteristic point information from the left-eye image by a characteristic point extraction method so as to supplement the left-eye characteristic point information in the current observation frame.
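A possible reading of step S213 in code, using standard OpenCV calls. This is a sketch, not the patent's implementation: the threshold, maximum feature count and minimum distance are placeholder values, and a mask keeps new detections away from points that are already tracked.

import cv2
import numpy as np

def replenish(left_img, tracked_pts, max_feats=150, min_dist=20):
    """Top up the tracked left-eye features when too few survive optical
    flow. left_img is grayscale uint8; tracked_pts is a list of (x, y)."""
    if len(tracked_pts) >= max_feats:              # threshold met: nothing to do
        return tracked_pts
    mask = np.full(left_img.shape[:2], 255, np.uint8)
    for x, y in tracked_pts:                       # suppress areas already covered
        cv2.circle(mask, (int(x), int(y)), min_dist, 0, -1)
    new = cv2.goodFeaturesToTrack(left_img, max_feats - len(tracked_pts),
                                  qualityLevel=0.01, minDistance=min_dist,
                                  mask=mask, useHarrisDetector=True)
    if new is not None:
        tracked_pts = tracked_pts + [tuple(p.ravel()) for p in new]
    return tracked_pts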
Still further, in an example of the present invention, as shown in fig. 7, the step S230 of the SLAM positioning method may include the steps of:
S231: when the tracking lost feature points exist, carrying out filter estimation processing on the information of the tracking lost feature points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
S232: and when the characteristic points with the tracking loss do not exist, the predicted pose and the predicted speed of the binocular camera are directly used as the estimated pose and the estimated speed of the binocular camera.
It is noted that, in the above example of the present invention, the filter estimation process is performed only when tracking-lost feature point information exists, yielding an estimated pose and estimated speed of higher accuracy; when no tracking-lost feature point information exists, no filter estimation is performed, so the accuracy of the estimated pose and estimated speed of the binocular camera suffers. Therefore, in order to solve this problem, in another example of the present invention, as shown in fig. 7, the step S230 of the SLAM positioning method may further include the steps of:
S232': when the tracking lost feature points do not exist, a preset number of feature points are screened from the feature points of the current observation frame, and then filter estimation processing is carried out on the information of the screened feature points according to the predicted pose and the predicted speed of the binocular camera, so that the estimated pose and the estimated speed of the binocular camera are obtained.
It can be appreciated that in this example of the present invention, when there is no feature point information of the tracking loss, that is, the feature point tracking rate of the current observation frame is 100%, a predetermined number of feature points are still selected for the filter estimation process, so that the accuracy of the estimated pose and the estimated speed of the binocular camera is improved. Further, the predetermined number of the present invention may be designed according to a maximum tracking number of feature points, for example, the predetermined number of the present invention may be implemented as 1/10 of the maximum tracking number, but is not limited thereto.
More specifically, in order to further improve the accuracy of the estimated pose and estimated speed of the binocular camera, and thus the positioning accuracy of the SLAM positioning method, in the step S232' of the SLAM positioning method of the present invention, it is preferable to screen out, by a feature point screener, feature points whose left-right matching error is smaller than a predetermined threshold from the feature point information of the current observation frame.
Illustratively, the feature point screener of the present invention may be, but is not limited to being, implemented as the epipolar residual

e = p_2^T [T]_x R p_1

wherein p_1 and p_2 are the coordinates of the left and right matching feature points, respectively; T is the translation amount and R the rotation amount (of the relative pose between the left-eye and right-eye cameras); and [T]_x denotes the skew-symmetric matrix of T. (The formulas appear only as images in the source; this epipolar-constraint form is reconstructed from the surrounding definitions.)

Notably, for the left-right matching error e of the feature points of the present invention: theoretically, if p_1 and p_2 match exactly, then e = 0. Due to noise, tracking error and the like, e is in practice not equal to 0, but the closer e is to 0, the higher the left-right matching degree between p_1 and p_2.

Preferably, the predetermined threshold of the present invention may be implemented as, but is not limited to, an expression in a coefficient s and the internal parameters c_x and c_y (the principal point coordinates) of the binocular camera; the exact expression is given only as a formula image in the source.
In this way, the step S232' of the present invention can screen out, with the feature point screener, the feature points with the best left-right matching degree, i.e., the feature points with the best tracking effect, from the feature point information of the current observation frame, thereby maximally improving the accuracy of the estimated pose and estimated speed of the binocular camera and, in turn, the positioning accuracy of the SLAM positioning method.
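A compact sketch of such a screener follows. It assumes the epipolar-residual form reconstructed above and homogeneous pixel coordinates; the cap of max_track // 10 reflects the "1/10 of the maximum tracking number" example given earlier, and all names are illustrative.

import numpy as np

def skew(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

def screen_features(pts_left, pts_right, R, T, err_thr, max_track=200):
    """Keep the feature pairs whose left-right matching error
    e = p2^T [T]x R p1 is below err_thr, best matches first, and at
    most max_track // 10 of them. Points are homogeneous 3-vectors."""
    E = skew(T) @ R                                # essential matrix [T]x R
    errs = [abs(p2 @ E @ p1) for p1, p2 in zip(pts_left, pts_right)]
    order = np.argsort(errs)                       # smallest error first
    keep = [int(i) for i in order if errs[i] < err_thr]
    return keep[:max_track // 10]                  # the "predetermined number"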
Illustratively, in an example of the present invention, as shown in fig. 8, the SLAM positioning method of the present invention may include the steps of:
Step 1: system initialization and feature extraction
The whole system is initialized to obtain the camera intrinsic and extrinsic parameters and the initial IMU parameters required by the system. Information from the vision sensor is received, the original image information is filtered, and a two-layer pyramid of the image is established. Feature points are extracted from the top layer to the bottom layer of the pyramid, which accelerates feature extraction when the maximum number of feature points is fixed; the candidates are sorted by their Harris response values, the feature points with high response values are selected, and the features are output.
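One possible reading of this step in code, using standard OpenCV calls; the two-layer pyramid, the Harris ranking and the refinement on the bottom layer follow the text, while every numeric parameter is an illustrative placeholder.

import cv2
import numpy as np

def extract_features(img, max_feats=150):
    """Detect on a half-resolution top layer first (fast), map the
    corners back to the full-resolution bottom layer, and refine them
    to sub-pixel accuracy. img is a grayscale uint8 image."""
    top = cv2.pyrDown(img)                          # layer 1: half resolution
    corners = cv2.goodFeaturesToTrack(top, max_feats, 0.01, 10,
                                      useHarrisDetector=True)  # Harris-ranked
    if corners is None:
        return []
    pts = np.float32([(2 * x, 2 * y) for x, y in corners.reshape(-1, 2)])
    pts = pts.reshape(-1, 1, 2)                     # map to layer 0 coordinates
    crit = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 0.03)
    cv2.cornerSubPix(img, pts, (5, 5), (-1, -1), crit)  # bottom-layer refinement
    return [tuple(p.ravel()) for p in pts]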
Step 2: feature tracking and matching
As shown in fig. 9, first, the features of the left-eye image are extracted and, using the relative pose of the left and right cameras, tracked into the right-eye image by the epipolar line search and block matching method; the tracked result is input into the back end. Then, optical flow tracking is applied to the new left-eye image to obtain the feature points of the new image; if the feature points are relatively few, enough features are extracted for supplementation by the feature extraction method so that the maximum tracking number of feature points is met. The tracked feature points are input into the back end, completing the front-end processing.
Step 3: IMU initialization and pre-integration, filter initialization
The initialization of the IMU adopts static initialization, which determines the direction of gravitational acceleration and provides that direction for initializing the camera. The IMU data requires a pre-integration process that serves as the prediction step of the EKF, where the pre-integration method may be, but is not limited to, the 4th-order Runge-Kutta algorithm.
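The sketch below shows one 4th-order Runge-Kutta propagation step for position and velocity, with the IMU acceleration linearly interpolated across the sample interval. It is a minimal illustration of the integration scheme only: rotation, bias and gravity handling, which a full pre-integration needs, are omitted.

import numpy as np

def rk4_step(p, v, a0, a1, dt):
    """One RK4 step of pdot = v, vdot = a(t), where a(t) is linear
    between the IMU samples a0 (start) and a1 (end of the interval)."""
    am = 0.5 * (a0 + a1)                    # midpoint acceleration
    k1p, k1v = v,                a0
    k2p, k2v = v + 0.5*dt*k1v,   am
    k3p, k3v = v + 0.5*dt*k2v,   am
    k4p, k4v = v + dt*k3v,       a1
    p_new = p + dt/6.0 * (k1p + 2*k2p + 2*k3p + k4p)
    v_new = v + dt/6.0 * (k1v + 2*k2v + 2*k3v + k4v)
    return p_new, v_new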
The initialization of the filter sets the initial values of the filter parameters, in particular the initial covariance matrix and the system noise, which play an important role in filtering accuracy. The specific process is as follows: first, a continuous IMU error model is established; second, the matrices F and G are discretized; then, the IMU covariance at the current moment is predicted from the covariance of the previous moment; finally, an observability-consistency correction is applied to the covariance prediction equation of the system.
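The covariance prediction just described might look as follows. This is a hedged sketch using a common first-order discretization, not the patent's exact equations; P is the error-state covariance and Q the continuous-time IMU noise density.

import numpy as np

def propagate_covariance(P, F, G, Q, dt):
    """Discretize the continuous error model (F, G) over dt and
    propagate the covariance: P <- Phi P Phi^T + Qd."""
    n = F.shape[0]
    Phi = np.eye(n) + F * dt                 # first-order transition matrix
    Qd = Phi @ G @ Q @ G.T @ Phi.T * dt      # discretized process noise
    P_new = Phi @ P @ Phi.T + Qd
    return 0.5 * (P_new + P_new.T)           # keep the result symmetric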
Step 4: camera state synchronization with IMU and augmentation of covariance
When new camera information (i.e., the feature point information of a current observation frame) is input into the back end, the current pose of the IMU is predicted through pre-integration, the pose of the camera is calculated from the relative pose of the IMU and the camera, and the poses of the two sensors are synchronized. When a camera state is added to the system, the covariance needs to be augmented accordingly.
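The covariance augmentation is sketched below in MSCKF style, which is an assumption on our part: the new camera pose is appended to the state, and the covariance grows by the Jacobian J of that pose with respect to the existing state.

import numpy as np

def augment_covariance(P, J):
    """Append a new camera pose to the state: P is the current n x n
    covariance, J the 6 x n Jacobian of the new camera pose with
    respect to the existing state (from the IMU pose and the
    IMU-camera extrinsics)."""
    n = P.shape[0]
    A = np.vstack([np.eye(n), J])            # [I; J]
    return A @ P @ A.T                       # (n+6) x (n+6) augmented covariance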
Step 5: building map and processing feature points
The visual information is received and, after front-end processing, input into the back end; the feature points in it are built into a local map used to constrain the feature points. Feature points are easily lost during tracking, so whenever feature points are lost, an EKF update is performed with the lost feature points and the optimized pose is output.
Step 6: sliding window
Because the feature points are constrained by multiple frames, a sliding window must be adopted to reject some frames so that the constraint relationship is continuously updated, the number of constraints is reduced, and the stability and real-time performance of the algorithm are ensured. Compared with the sliding window strategies in algorithms such as VINS, ORB-SLAM, ICE-BA and S-MSCKF, the invention provides a new sliding window strategy in which different sliding window methods are invoked according to the different observed information. The specific operation is as follows:
(1) When the number of observation frames in the window meets the frame number threshold but is smaller than the maximum window number, and the feature point tracking rate of the current frame is 100% (i.e., no feature point of the current frame has been lost), the angle difference and the distance difference between each candidate frame and the oldest first frame are calculated, starting from the oldest second frame in the window; if the threshold is met, that frame is rejected, and so on, until at most 1/3 of the maximum window number has been removed.
(2) When the number of observation frames in the window reaches the maximum window number, the distance difference and the angle difference between the oldest first frame and the oldest second frame are calculated; if the threshold is met, the oldest first frame is eliminated, and if not, it is retained. Ten frames are then dropped at equal intervals, starting from the oldest second frame in the window.
Step 7: system update
The system update first takes the current-time state and covariance predictions produced by the prediction module, then constructs a measurement model from the selected feature points, and fuses the two kinds of information with an extended Kalman filtering algorithm to obtain the estimate at the current moment. Notably, the SLAM positioning method obtains the current pose estimate after the EKF update, and test results on the EuRoC dataset show that the positioning accuracy is greatly improved.
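For completeness, a textbook EKF update of the kind this step performs is sketched below; r is the stacked measurement residual from the selected feature points, H its Jacobian, and Rn the measurement noise. The Joseph form keeps the updated covariance symmetric and positive semi-definite.

import numpy as np

def ekf_update(x, P, r, H, Rn):
    """Standard extended Kalman filter update step."""
    S = H @ P @ H.T + Rn                      # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
    x_new = x + K @ r                         # state correction
    I_KH = np.eye(P.shape[0]) - K @ H
    P_new = I_KH @ P @ I_KH.T + K @ Rn @ K.T  # Joseph-form covariance update
    return x_new, P_new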
Schematic System
Referring to fig. 10 of the drawings, a classification sliding window system according to an embodiment of the present invention is illustrated. Specifically, as shown in fig. 10, the classification sliding window system 10 may include a determining module 11, a first culling module 12, and a second culling module 13 that are communicatively connected to each other, where the determining module 11 is configured to determine whether the number of all observation frames in the window reaches the maximum window number of the window; wherein the first culling module 12 is communicatively connected to the determining module 11, and is configured to reject a predetermined number of observation frames from the window in batches at intervals according to a relative pose between an oldest first frame and an oldest second frame in the window when the number of observation frames reaches the maximum window number; wherein the second culling module 13 is communicatively connected to the determining module 11, and is configured to further determine whether the number of the observed frames is greater than a preset frame number threshold when the number of the observed frames is less than the maximum window number, and if so, selectively reject the observed frames from the window according to a feature point tracking rate of the current observed frame; if not, reserve all the observation frames in the window.
It should be noted that, in the above embodiment of the present invention, as shown in fig. 10, the classification sliding window system 10 may further include an adding module 14, where the adding module 14 is communicatively connected to the first culling module 12 and the second culling module 13, respectively, for adding the current observation frame to the window as the latest frame in the window.
Still further, in an example of the present invention, as shown in fig. 10, the first culling module 12 may include a pose calculation module 121 and a batch culling module 122 that are communicatively connected to each other. The pose calculating module 121 is configured to calculate a relative pose between the oldest first frame and the oldest second frame in the window, so as to determine whether the relative pose is greater than a first pose threshold. The batch eliminating module 122 is configured to eliminate the oldest first frame when the relative pose is greater than the first pose threshold, and to intermittently eliminate a first predetermined number of observation frames from the oldest second frame in the window; and when the relative pose is not greater than the first pose threshold, retaining the oldest first frame and intermittently removing a second predetermined number of observation frames in batches starting from the oldest second frame in the window.
In an example of the present invention, as shown in fig. 10, the second culling module 13 may include a detecting module 131, a selecting culling module 132, and a preserving module 133, where the detecting module 131 is configured to detect the feature point tracking rate of the current observed frame to determine whether the feature point tracking rate of the current observed frame is 100%; the selecting culling module 132 is communicatively connected to the detecting module 131, and is configured to, when the feature point tracking rate of the current observation frame is 100%, sequentially calculate, starting from the oldest second frame in the window, the relative pose between the observation frame to be eliminated and the oldest first frame in the window, so as to determine whether the relative pose is less than a second pose threshold, and if so, eliminate the observation frame to be eliminated; if not, reserve the observation frame to be eliminated; wherein the preserving module 133 is communicatively connected to the detecting module 131, and is configured to retain all the observation frames to be eliminated in the window when the feature point tracking rate of the current observation frame is less than 100%.
Preferably, as shown in fig. 10, the second culling module 13 further includes a monitoring module 134, where the monitoring module 134 is configured to monitor the number of the observation frames to be culled that are culled from the window, and stop the culling operation when the number of the observation frames to be culled that are culled from the window reaches 1/3 of the maximum window number.
According to another aspect of the present invention, as shown in fig. 11, an embodiment of the present invention further provides a SLAM positioning system 1. Specifically, as shown in fig. 11, the SLAM positioning system 1 includes the classification sliding window system 10, a front-end system 20, a filter prediction system 30, a map construction system 40, and a filter estimation system 50. The front-end system 20 is configured to perform front-end processing on an original image acquired by the binocular camera, so as to obtain feature point information of a current observation frame. The filter prediction system 30 is configured to perform filter prediction processing on IMU information acquired by the inertial measurement unit, so as to obtain a predicted pose and a predicted speed of the binocular camera. The map construction system 40 may include a map construction module 41 and a feature point determination module 42 communicatively connected, wherein the map construction module 41 is communicatively connected with the front-end system 20 and the filter prediction system 30, respectively, for performing map construction based on feature point information of the current observation frame; wherein the feature point determining module 42 is configured to determine whether there is a feature point of tracking loss, and further perform estimation processing through the filter to obtain an estimated pose and an estimated speed of the binocular camera. The classification sliding window system 10 is configured to perform a classification sliding window processing by using the classification sliding window method based on the feature point information of the current observation frame, so as to determine whether the rejected observation frame exists. The filter estimation system 50 is communicatively connected to the classification sliding window system 10, and is further configured to perform filter estimation processing on feature point information in the eliminated observation frame according to the estimated pose and the estimated speed of the binocular camera when the eliminated observation frame exists, so as to obtain an optimized pose and an optimized speed of the binocular camera; and when the eliminated observation frames do not exist, directly taking the estimated pose and the estimated speed of the binocular camera as the optimized pose and the optimized speed of the binocular camera.
It should be noted that, in an embodiment of the present invention, as shown in fig. 11, the front-end system 20 may include an optical flow tracking module 21 and an epipolar search and block matching module 22 that are communicatively connected to each other, where the optical flow tracking module 21 is configured to track, by an optical flow tracking method, the feature points of the left-eye image in the current original image, so as to obtain left-eye feature point information in the current observation frame; the epipolar searching and block matching module 22 is configured to track, according to the relative pose between the left-eye camera and the right-eye camera in the binocular camera, the feature points of the right-eye image in the current original image by the epipolar searching and block matching method, so as to obtain right-eye feature point information in the current observation frame.
Preferably, as shown in fig. 11, the front-end system 20 may further include a judgment extraction module 23, where the judgment extraction module 23 is configured to judge whether the number of feature points of the left-eye image tracked by the optical flow tracking method is smaller than a threshold value of the number of feature points, and if so, extract new feature point information from the left-eye image in the current original image by the feature point extraction method to supplement the left-eye feature point information in the current observation frame.
In an example of the present invention, as shown in fig. 11, the filter estimation system 50 may be further configured to perform, when there is the feature point of tracking loss, a filter estimation process on the feature point information of tracking loss according to a predicted pose and a predicted speed of the binocular camera, so as to obtain an estimated pose and an estimated speed of the binocular camera; and when the tracking lost feature points do not exist, the predicted pose and the predicted speed of the binocular camera are directly used as the estimated pose and the estimated speed of the binocular camera.
In another example of the present invention, as shown in fig. 11, the map construction system 40 may further include a feature point screening module 43 for screening a predetermined number of feature points from the feature point information of the current observation frame when the tracking loss feature points do not exist; the filter estimation system 50 is further configured to perform a filter estimation process on the filtered feature points according to the predicted pose and the predicted speed of the binocular camera, so as to obtain an estimated pose and an estimated speed of the binocular camera.
Schematic electronic device
Next, an electronic device according to an embodiment of the present invention is described with reference to fig. 12. As shown in fig. 12, the electronic device 90 includes one or more processors 91 and memory 92.
The processor 91 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 90 to perform desired functions. In other words, the processor 91 comprises one or more physical devices configured to execute instructions. For example, the processor 91 may be configured to execute instructions that are part of: one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more components, implement a technical effect, or otherwise achieve a desired result.
The processor 91 may include one or more processors configured to execute software instructions. Additionally or alternatively, the processor 91 may include one or more hardware or firmware logic machines configured to execute hardware or firmware instructions. The processors of the processor 91 may be single-core or multi-core, and the instructions executed thereon may be configured for serial, parallel, and/or distributed processing. The various components of the processor 91 may optionally be distributed across two or more separate devices, which may be remotely located and/or configured for coordinated processing. Aspects of the processor 91 may be virtualized and executed by remotely accessible networked computing devices configured in a cloud computing configuration.
The memory 92 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache), and the like. The non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 91 to perform some or all of the steps in the above-described exemplary methods of the present invention, and/or other desired functions.
In other words, the memory 92 includes one or more physical devices configured to hold machine readable instructions executable by the processor 91 to implement the methods and processes described herein. In implementing these methods and processes, the state of the memory 92 may be transformed (e.g., different data is saved). The memory 92 may include removable and/or built-in devices. The memory 92 may include optical memory (e.g., CD, DVD, HD-DVD, blu-ray disc, etc.), semiconductor memory (e.g., RAM, EPROM, EEPROM, etc.), and/or magnetic memory (e.g., hard disk drive, floppy disk drive, tape drive, MRAM, etc.), among others. The memory 92 may include volatile, nonvolatile, dynamic, static, read/write, read-only, random access, sequential access, location-addressable, file-addressable, and/or content-addressable devices.
It is to be appreciated that the memory 92 includes one or more physical devices. However, aspects of the instructions described herein may alternatively be propagated through a communication medium (e.g., an electromagnetic signal, an optical signal, etc.) that is not held by a physical device for a limited period of time. Aspects of the processor 91 and the memory 92 may be integrated together into one or more hardware logic components. These hardware logic components may include, for example, field Programmable Gate Arrays (FPGAs), program and application specific integrated circuits (PASICs/ASICs), program and application specific standard products (PSSPs/ASSPs), system on a chip (SOCs), and Complex Programmable Logic Devices (CPLDs).
In one example, as shown in FIG. 12, the electronic device 90 may further include an input device 93 and an output device 94, which are interconnected by a bus system and/or other form of connection mechanism (not shown). For example, the input device 93 may be, for example, a camera module or the like for capturing image data or video data. As another example, the input device 93 may include or interface with one or more user input devices such as a keyboard, mouse, touch screen, or game controller. In some embodiments, the input device 93 may include or interface with selected Natural User Input (NUI) components. Such component parts may be integrated or peripheral and the transduction and/or processing of the input actions may be processed on-board or off-board. Example NUI components may include microphones for speech and/or speech recognition; infrared, color, stereoscopic display, and/or depth cameras for machine vision and/or gesture recognition; head trackers, eye trackers, accelerometers and/or gyroscopes for motion detection and/or intent recognition; and an electric field sensing component for assessing brain activity and/or body movement; and/or any other suitable sensor.
The output device 94 may output various information including the classification result and the like to the outside. The output device 94 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, etc.
Of course, the electronic device 90 may further comprise the communication means, wherein the communication means may be configured to communicatively couple the electronic device 90 with one or more other computer devices. The communication means may comprise wired and/or wireless communication devices compatible with one or more different communication protocols. As non-limiting examples, the communication subsystem may be configured for communication via a wireless telephone network or a wired or wireless local area network or wide area network. In some embodiments, the communications apparatus may allow the electronic device 90 to send and/or receive messages to and/or from other devices via a network such as the Internet.
It will be appreciated that the configurations and/or approaches described herein are exemplary in nature, and that these specific embodiments or examples are not to be considered in a limiting sense, because numerous variations are possible. The specific routines or methods described herein may represent one or more of any number of processing strategies. As such, various acts illustrated and/or described may be performed in the sequence illustrated and/or described, in other sequences, in parallel, or omitted. Also, the order of the above-described processes may be changed.
Of course, only some of the components of the electronic device 90 that are relevant to the present invention are shown in fig. 12 for simplicity, components such as buses, input/output interfaces, etc. are omitted. In addition, the electronic device 90 may include any other suitable components depending on the particular application.
Illustrative computer program product
In addition to the methods and apparatus described above, embodiments of the invention may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps in a method according to various embodiments of the invention described in the "exemplary methods" section of this specification.
The computer program product may write program code for performing the operations of embodiments of the present invention in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the C programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present invention may also be a computer readable storage medium, having stored thereon computer program instructions, which when executed by a processor, cause the processor to perform the steps of the method described above in the present specification.
The computer readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present invention have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present invention are merely examples and not intended to be limiting, and these advantages, benefits, effects, etc. are not to be considered as essential to the various embodiments of the present invention. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the invention is not necessarily limited to practice with the above described specific details.
The block diagrams of the devices, apparatuses, and systems referred to in the present invention are only illustrative examples and are not intended to require or imply that the connections, arrangements, configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, the devices, apparatuses, and systems may be connected, arranged, configured in any manner. Words such as "including," "comprising," "having," and the like are words of openness and mean "including but not limited to," and are used interchangeably therewith. The term "or" as used herein refers to, and is used interchangeably with, the term "and/or," unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to."
It is also noted that in the apparatus, devices and methods of the present invention, the components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent aspects of the present invention.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
It will be appreciated by persons skilled in the art that the embodiments of the invention described above and shown in the drawings are by way of example only and are not limiting. The objects of the present invention have been fully and effectively achieved. The functional and structural principles of the present invention have been shown and described in the examples and embodiments of the invention may be modified or practiced without departing from the principles described.

Claims (21)

1. A classification sliding window method, characterized by comprising the following steps:
S110: determining whether the number of all observation frames in a window reaches the maximum window number of the window;
S120: when the number of the observation frames reaches the maximum window number, a preset number of observation frames are removed from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and
S130: when the number of the observation frames is smaller than the maximum window number, further determining whether the number of the observation frames is larger than a preset frame number threshold, and if so, selectively eliminating the observation frames from the window according to the characteristic point tracking rate of the current observation frames; if not, reserving all observation frames in the window.
2. The classification sliding window method according to claim 1, further comprising the step of:
S140: the current observation frame is added to the window as the latest frame in the window.
3. The classification sliding window method according to claim 2, wherein the step S120 includes the steps of:
calculating the relative pose between the oldest first frame and the oldest second frame in the window to determine whether the relative pose is greater than a first pose threshold;
when the relative pose is greater than the first pose threshold, rejecting the oldest first frame in the window, and starting from the oldest second frame in the window, intermittently rejecting a first predetermined number of observation frames in batches; and
when the relative pose is not greater than the first pose threshold, the oldest first frame in the window is retained and a second predetermined number of observed frames are batch-dropped at intervals starting from the oldest second frame in the window.
4. The classification sliding window method according to claim 3, wherein in said step S120, the observed frames in the window are culled in batches at equal intervals, starting from the oldest second frame in the window.
5. The classification sliding window method according to any one of claims 1 to 4, wherein the step S130 includes the steps of:
detecting the characteristic point tracking rate of the current observation frame to determine whether the characteristic point tracking rate of the current observation frame is 100%;
When the characteristic point tracking rate of the current observation frame is 100%, starting from the oldest second frame in the window, sequentially calculating the relative pose between the observation frame to be removed and the oldest first frame in the window to judge whether the relative pose is smaller than a second pose threshold, and removing the observation frame to be removed if the relative pose is smaller than the second pose threshold; if not, reserving the observation frame to be removed; and
and when the characteristic point tracking rate of the current observation frame is less than 100%, reserving all the observation frames to be removed in the window.
6. The classification sliding window method according to claim 5, wherein the step S130 further comprises the steps of:
the number of the observation frames to be removed from the window is monitored, so that the removal operation is stopped when the removal number of the observation frames to be removed reaches 1/3 of the maximum window number.
7. A SLAM locating method, comprising the steps of:
front-end processing is carried out on an original image acquired by a binocular camera so as to obtain characteristic point information of a current observation frame;
performing filter prediction processing on IMU information acquired by an inertial measurement unit to obtain a predicted pose and a predicted speed of the binocular camera;
Carrying out map construction according to the characteristic point information of the current observation frame to determine whether the characteristic point information of tracking loss exists or not, and further carrying out filter estimation processing to obtain the estimated pose and the estimated speed of the binocular camera;
based on the characteristic point information of the current observation frame, sliding window processing is carried out by a classified sliding window method so as to determine whether the rejected observation frame exists or not; and
when the eliminated observation frame exists, carrying out filter estimation processing on characteristic point information in the eliminated observation frame according to the estimated pose and the estimated speed of the binocular camera so as to obtain the optimized pose and the optimized speed of the binocular camera; when the eliminated observation frame does not exist, the estimated pose and the estimated speed of the binocular camera are directly used as the optimized pose and the optimized speed of the binocular camera;
wherein the classification sliding window method comprises the following steps:
S110: determining whether the number of all observation frames in a window reaches the maximum window number of the window;
S120: when the number of the observation frames reaches the maximum window number, a preset number of observation frames are removed from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and
S130: when the number of the observation frames is smaller than the maximum window number, further determining whether the number of the observation frames is larger than a preset frame number threshold, and if so, selectively eliminating the observation frames from the window according to the characteristic point tracking rate of the current observation frames; if not, reserving all observation frames in the window.
8. The SLAM locating method of claim 7, wherein the step of performing front-end processing on the original image acquired by the binocular camera to obtain feature point information of the current observation frame comprises the steps of:
tracking the characteristic points of the left-eye image in the original image by an optical flow tracking method to obtain left-eye characteristic point information in the current observation frame; and
and tracking the characteristic points of the right-eye image in the original image by using a polar line searching and block matching method according to the relative pose between the left-eye camera and the right-eye camera in the binocular camera so as to obtain right-eye characteristic point information in the current observation frame.
9. The SLAM locating method of claim 8, wherein the step of performing front-end processing on the original image acquired by the binocular camera to obtain feature point information of the current observation frame further comprises the steps of:
Judging whether the number of the characteristic points of the left eye image tracked by the optical flow tracking method is smaller than a threshold value of the number of the characteristic points, and if so, extracting new characteristic point information from the left eye image by a characteristic point extraction method so as to supplement the left eye characteristic point information in the current observation frame.
10. The SLAM locating method of claim 9, wherein the step of mapping according to the feature point information of the current observation frame to determine whether there is a feature point of tracking loss, and further obtaining the estimated pose and the estimated speed of the binocular camera through a filter estimation process, comprises the steps of:
when the tracking lost feature points exist, carrying out filter estimation processing on the information of the tracking lost feature points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
and when the characteristic points of tracking loss do not exist, the predicted pose and the predicted speed of the binocular camera are directly used as the estimated pose and the estimated speed of the binocular camera.
11. The SLAM locating method of claim 9, wherein the step of mapping according to the feature point information of the current observation frame to determine whether there is a feature point of tracking loss, and further obtaining the estimated pose and the estimated speed of the binocular camera through a filter estimation process, comprises the steps of:
When the tracking lost feature points exist, carrying out filter estimation processing on the information of the tracking lost feature points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera; and
when the characteristic points lost in tracking do not exist, a preset number of characteristic points are screened from the characteristic points of the current observation frame, and then filtering estimation processing is carried out on the information of the screened characteristic points according to the predicted pose and the predicted speed of the binocular camera so as to obtain the estimated pose and the estimated speed of the binocular camera.
12. A classification sliding window system for classifying sliding windows, wherein the classification sliding window system comprises:
a determining module, configured to determine whether the number of all observation frames in the window reaches a maximum window number of the window;
a first eliminating module, wherein the first eliminating module is communicably connected to the determining module and is used for eliminating a preset number of observation frames from the window in batches according to the relative pose between the oldest first frame and the oldest second frame in the window when the number of the observation frames reaches the maximum window number; and
A second eliminating module, wherein the second eliminating module is communicably connected to the determining module, and is configured to further determine whether the number of the observed frames is greater than a preset frame number threshold when the number of the observed frames is less than the maximum window number, and if so, selectively eliminate the observed frames from the window according to a feature point tracking rate of the current observed frame; if not, reserving all observation frames in the window.
13. The classification sliding window system of claim 12, further comprising:
and the adding module is respectively connected with the first rejecting module and the second rejecting module in a communication way and is used for adding the current observation frame to the window to be used as the latest frame in the window.
14. The classification sliding window system of claim 13, wherein the first culling module comprises a pose calculation module and a batch culling module communicatively coupled to each other, wherein the pose calculation module is configured to calculate a relative pose between the oldest first frame and the oldest second frame in the window to determine whether the relative pose is greater than a first pose threshold; the batch culling module is configured to cull the oldest first frame in the window when the relative pose is greater than the first pose threshold, and to cull a first predetermined number of observation frames in batches at intervals starting from the oldest second frame in the window; and the batch culling module is further configured, when the relative pose is not greater than the first pose threshold, to retain the oldest first frame in the window and to cull a second predetermined number of observation frames in batches at intervals starting from the oldest second frame in the window.
15. The classification sliding window system according to any one of claims 12 to 14, wherein the second culling module comprises a detection module, a selection culling module, and a retention module, wherein the detection module is configured to detect a feature point tracking rate of the current observation frame to determine whether the feature point tracking rate of the current observation frame is 100%; the selection eliminating module is communicably connected to the detection module and is used for sequentially calculating the relative pose between the observation frame to be eliminated and the oldest first frame in the window from the oldest second frame in the window when the characteristic point tracking rate of the current observation frame is 100%, so as to judge whether the relative pose is smaller than a second pose threshold value, and eliminating the observation frame to be eliminated if the relative pose is smaller than the second pose threshold value; if not, reserving the observation frame to be removed; the retaining module is communicatively connected to the detecting module, and is configured to retain all the observation frames to be removed in the window when the feature point tracking rate of the current observation frame is less than 100%.
16. The classification sliding window system of claim 15, wherein the second culling module further comprises a monitoring module, wherein the monitoring module is configured to monitor the number of observation frames to be culled from the window, so as to stop the culling operation when the culling number of observation frames to be culled reaches 1/3 of the maximum window number.
17. A SLAM locating system for locating based on an original image acquired by a binocular camera and IMU information acquired by an inertial measurement unit, wherein the SLAM locating system comprises:
the front-end system is used for performing front-end processing on the original image to obtain the characteristic point information of the current observation frame;
the filter prediction system is used for carrying out filter prediction processing on the IMU information so as to obtain the predicted pose and the predicted speed of the binocular camera;
the map construction system comprises a map construction module and a characteristic point determination module which are mutually connected in a communication way, wherein the map construction module is respectively connected with the front-end system and the filter prediction system in a communication way and is used for carrying out map construction according to the characteristic point information of the current observation frame, the characteristic point determination module is used for determining whether the characteristic point information of tracking loss exists or not, and then the estimated pose and the estimated speed of the binocular camera are obtained through filter estimation processing;
the classification sliding window system is used for carrying out sliding window processing by a classification sliding window method based on the characteristic point information of the current observation frame so as to determine whether the rejected observation frame exists or not; and
The filter estimation system is communicatively connected with the classification sliding window system and is used for carrying out filter estimation processing on characteristic point information in the eliminated observation frame according to the estimated pose and the estimated speed of the binocular camera when the eliminated observation frame exists, so as to obtain the optimized pose and the optimized speed of the binocular camera; when the eliminated observation frame does not exist, the estimated pose and the estimated speed of the binocular camera are directly used as the optimized pose and the optimized speed of the binocular camera;
wherein the classification sliding window system comprises:
a determining module, configured to determine whether the number of all observation frames in a window reaches a maximum window number of the window;
a first eliminating module, wherein the first eliminating module is communicably connected to the determining module and is used for eliminating a preset number of observation frames from the window in batches according to the relative pose between the oldest first frame and the oldest second frame in the window when the number of the observation frames reaches the maximum window number; and
a second eliminating module, wherein the second eliminating module is communicably connected to the determining module, and is configured to further determine whether the number of the observed frames is greater than a preset frame number threshold when the number of the observed frames is less than the maximum window number, and if so, selectively eliminate the observed frames from the window according to a feature point tracking rate of the current observed frame; if not, reserving all observation frames in the window.
18. The SLAM positioning system of claim 17, wherein the front-end system comprises an optical flow tracking module, an epipolar search and block matching module and a judgment extraction module that are communicatively connected to each other, wherein the optical flow tracking module is configured to track feature points of a left eye image in the original image by an optical flow tracking method to obtain left eye feature point information in the current observation frame; the epipolar searching and block matching module is used for tracking the characteristic points of the right-eye image in the original image through an epipolar searching and block matching method according to the relative pose between the left-eye camera and the right-eye camera in the binocular camera so as to obtain right-eye characteristic point information in the current observation frame; the judging and extracting module is used for judging whether the number of the characteristic points of the left-eye image tracked by the optical flow tracking method is smaller than a threshold value of the number of the characteristic points, and if so, extracting new characteristic point information from the left-eye image by the characteristic point extracting method so as to supplement the left-eye characteristic point information in the current observation frame.
19. The SLAM positioning system of claim 18, wherein the filter estimation system is further configured to, when a tracking-lost feature point exists, perform filter estimation processing on the information of the tracking-lost feature point according to the predicted pose and predicted speed of the binocular camera to obtain the estimated pose and estimated speed of the binocular camera; and, when no tracking-lost feature point exists, to directly take the predicted pose and predicted speed of the binocular camera as the estimated pose and estimated speed of the binocular camera.
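Stripped of filter internals, the conditional update in this claim is a guarded correction. A minimal sketch, assuming a generic ekf_update(state, measurements) helper (hypothetical; the patent does not specify the filter equations):

    def estimate(state, lost_features, ekf_update):
        """Refine the predicted (pose, speed) state with lost-track features."""
        if lost_features:
            # Feature points that just lost tracking contribute one final
            # measurement before being dropped.
            return ekf_update(state, lost_features)   # estimated pose/speed
        return state   # no lost features: the prediction passes through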
20. The SLAM positioning system of claim 18, wherein the map construction system further comprises a feature point screening module configured to screen a predetermined number of feature points from the feature points of the current observation frame when no tracking-lost feature point exists; and the filter estimation system is further configured to perform filter estimation processing on the information of the screened feature points according to the predicted pose and predicted speed of the binocular camera to obtain the estimated pose and estimated speed of the binocular camera.
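A minimal sketch of this screening step, assuming feature points are ranked by how many consecutive frames they have been tracked; both the ranking criterion and the quota are assumptions, not specified by the patent:

    def screen_features(features, quota=30):
        """Keep the `quota` longest-tracked feature points for the update."""
        ranked = sorted(features, key=lambda f: f.track_count, reverse=True)
        return ranked[:quota]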
21. An electronic device, comprising:
at least one processor for executing instructions; and
a memory communicatively coupled to the at least one processor, wherein the memory stores at least one instruction that, when executed by the at least one processor, causes the at least one processor to perform all of the steps of a SLAM positioning method, wherein the SLAM positioning method comprises the steps of:
performing front-end processing on an original image acquired by a binocular camera to obtain feature point information of a current observation frame;
performing filter prediction processing on IMU information acquired by an inertial measurement unit to obtain a predicted pose and a predicted speed of the binocular camera (a sketch of this prediction step is given after the claim);
performing map construction according to the feature point information of the current observation frame to determine whether feature point information with lost tracking exists, and then performing filter estimation processing to obtain an estimated pose and an estimated speed of the binocular camera;
performing sliding window processing by a classification sliding window method based on the feature point information of the current observation frame, so as to determine whether an eliminated observation frame exists; and
when an eliminated observation frame exists, performing filter estimation processing on the feature point information in the eliminated observation frame according to the estimated pose and estimated speed of the binocular camera to obtain an optimized pose and an optimized speed of the binocular camera; and when no eliminated observation frame exists, directly taking the estimated pose and estimated speed of the binocular camera as the optimized pose and optimized speed of the binocular camera;
wherein the classification sliding window method comprises the following steps:
S110: determining whether the number of all observation frames in a window reaches the maximum window number of the window;
S120: when the number of the observation frames reaches the maximum window number, eliminating a preset number of observation frames from the window in batches at intervals according to the relative pose between the oldest first frame and the oldest second frame in the window; and
S130: when the number of the observation frames is less than the maximum window number, further determining whether the number of the observation frames is greater than a preset frame number threshold; if so, selectively eliminating observation frames from the window according to the feature point tracking rate of the current observation frame; if not, retaining all observation frames in the window.
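To make the filter prediction step of the method concrete, the following is a minimal sketch of first-order IMU propagation of pose and velocity. The discretization, the constant world-frame gravity, and the neglect of sensor biases are simplifying assumptions; the patent publishes no filter equations.

    import numpy as np

    GRAVITY = np.array([0.0, 0.0, -9.81])   # assumed world-frame gravity

    def predict(p, v, R, acc, gyro, dt):
        """Propagate position p, velocity v, rotation R by one IMU sample."""
        a_world = R @ acc + GRAVITY           # body-frame accel -> world
        p = p + v * dt + 0.5 * a_world * dt * dt
        v = v + a_world * dt
        # Rotation update via the Rodrigues (exponential map) formula.
        theta = gyro * dt
        angle = np.linalg.norm(theta)
        if angle > 1e-9:
            k = theta / angle
            K = np.array([[0.0, -k[2], k[1]],
                          [k[2], 0.0, -k[0]],
                          [-k[1], k[0], 0.0]])
            R = R @ (np.eye(3) + np.sin(angle) * K
                     + (1.0 - np.cos(angle)) * (K @ K))
        return p, v, R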
CN201911326341.7A 2019-12-20 2019-12-20 Classification sliding window method, SLAM positioning method, system and electronic equipment Active CN113011231B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911326341.7A CN113011231B (en) 2019-12-20 2019-12-20 Classification sliding window method, SLAM positioning method, system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911326341.7A CN113011231B (en) 2019-12-20 2019-12-20 Classification sliding window method, SLAM positioning method, system and electronic equipment

Publications (2)

Publication Number Publication Date
CN113011231A CN113011231A (en) 2021-06-22
CN113011231B true CN113011231B (en) 2023-07-07

Family

ID=76382731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911326341.7A Active CN113011231B (en) 2019-12-20 2019-12-20 Classification sliding window method, SLAM positioning method, system and electronic equipment

Country Status (1)

Country Link
CN (1) CN113011231B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539982B (en) * 2020-04-17 2023-09-15 北京维盛泰科科技有限公司 Visual inertial navigation initialization method based on nonlinear optimization in mobile platform
CN116778532B (en) * 2023-08-24 2023-11-07 汶上义桥煤矿有限责任公司 Underground coal mine personnel target tracking method

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7352386B1 (en) * 1999-06-22 2008-04-01 Microsoft Corporation Method and apparatus for recovering a three-dimensional scene from two-dimensional images
CN102289507B (en) * 2011-08-30 2015-05-27 王洁 Method for mining data flow weighted frequent mode based on sliding window
KR101220463B1 (en) * 2012-09-21 2013-01-10 김병수 Insulated sliding window and door frame with composite rail structure
US20140341465A1 (en) * 2013-05-16 2014-11-20 The Regents Of The University Of California Real-time pose estimation system using inertial and feature measurements
US20160026898A1 (en) * 2014-07-24 2016-01-28 Agt International Gmbh Method and system for object detection with multi-scale single pass sliding window hog linear svm classifiers
CN105338236A (en) * 2014-07-25 2016-02-17 诺基亚技术有限公司 Method and apparatus for detecting object in image and electronic device
US10643101B2 (en) * 2015-07-09 2020-05-05 Texas Instruments Incorporated Window grouping and tracking for fast object detection
CN108257176A (en) * 2016-12-29 2018-07-06 Intel Corporation Techniques for feature detection and tracking
US10267924B2 (en) * 2017-01-04 2019-04-23 Qualcomm Incorporated Systems and methods for using a sliding window of global positioning epochs in visual-inertial odometry
CN108681439B (en) * 2018-05-29 2021-04-27 北京维盛泰科科技有限公司 Uniform display method based on frame rate control
CN110125928B (en) * 2019-03-27 2021-04-06 浙江工业大学 Binocular inertial navigation SLAM system for performing feature matching based on front and rear frames
CN110044354B (en) * 2019-03-28 2022-05-20 东南大学 Binocular vision indoor positioning and mapping method and device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106226750A (en) * 2016-07-01 2016-12-14 University of Electronic Science and Technology of China Plot-sequence smoothing filter method for multi-frame joint detection
CN110070582A (en) * 2018-01-23 2019-07-30 Sunny Optical (Zhejiang) Research Institute Co., Ltd. Multi-camera module parameter self-calibration system, calibration method, and electronic device thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on positioning and navigation methods based on visual SLAM and artificial marker codes; Wang Yongli, et al.; Group Technology & Production Modernization; Vol. 35, No. 4; pp. 22-26 *

Also Published As

Publication number Publication date
CN113011231A (en) 2021-06-22

Similar Documents

Publication Publication Date Title
US11816585B2 (en) Machine learning models operating at different frequencies for autonomous vehicles
US10067157B2 (en) Methods and systems for sensor-based vehicle acceleration determination
US9811732B2 (en) Systems and methods for object tracking
EP2572319B1 (en) Method and system for fusing data arising from image sensors and from motion or position sensors
CN111222395A (en) Target detection method and device and electronic equipment
CN113011231B (en) Classification sliding window method, SLAM positioning method, system and electronic equipment
JP2006350645A (en) Object detection device and learning device for the same
KR101840167B1 (en) System and method for interworking with target interest through handover between multiple cameras in cloud
US20190197425A1 (en) Deep convolutional factor analyzer
US20220188646A1 (en) Classifier with outlier detection algorithm
WO2021059695A1 (en) Information processing device, information processing method, and information processing program
CN110992404A (en) Target tracking method, device and system and storage medium
CN113094545B (en) Redundant key frame eliminating method, SLAM method, system and electronic equipment
CN113012216B (en) Feature classification optimization method, SLAM positioning method, system and electronic equipment
JP7441848B2 (en) How to automatically determine optimal transportation service locations for points of interest from noisy multimodal data
CN115861352A (en) Monocular vision, IMU and laser radar data fusion and edge extraction method
CN113096024B (en) Flying spot removing method for depth data, system and electronic equipment thereof
CN114037967A (en) Fusion method and device of multi-source lane lines, vehicle and storage medium
JP2014203133A (en) Image processing device and image processing method
CN114399532A (en) Camera position and posture determining method and device
KR101847113B1 (en) Estimation method and apparatus for information corresponding camera orientation by using image
Radke et al. “Small data” anomaly detection for unmanned systems
Vesom Pixel-wise motion detection in persistent aerial video surveillance
CN113129333B (en) Multi-target real-time tracking method and system and electronic equipment thereof
WO2020125965A1 (en) Device and method for detecting user activity by parallelized classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20210622

Assignee: Zhejiang Shunwei Technology Co.,Ltd.

Assignor: SUNNY OPTICAL (ZHEJIANG) RESEARCH INSTITUTE Co.,Ltd.

Contract record no.: X2024330000055

Denomination of invention: Classification sliding window method and SLAM positioning method, as well as their systems and electronic devices

Granted publication date: 20230707

License type: Common License

Record date: 20240515