CN114708482A - Topological graph scene recognition method and device based on density filtering and landmark saliency - Google Patents


Info

Publication number
CN114708482A
CN114708482A (application CN202210174254.XA)
Authority
CN
China
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210174254.XA
Other languages
Chinese (zh)
Inventor
张云洲
刘英达
秦操
杨非
杜承垚
Current Assignee
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202210174254.XA
Publication of CN114708482A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses a topological graph scene recognition method and device based on density filtering and landmark saliency, which effectively address scene recognition under viewpoint change. On the basis of a target detection algorithm, SIFT key points are extracted from the detected landmarks, exploiting the robustness of SIFT key points to viewpoint change, and a density filtering algorithm is used to obtain the landmarks with viewpoint invariance. The deep global descriptors of the viewpoint-invariant landmarks in the query frame and the reference frame are then cross-checked and the shape scores of the landmarks are compared, yielding mutually matched landmarks in the two frames. Because the extracted landmarks represent only a small part of an image, and landmarks with low distinctiveness can cause confusion and degrade the matching result, a landmark localization network is used to obtain the landmark saliency. Finally, the matching score of the query and reference frames is calculated from a topological structure encoding the spatial relationships between landmarks, their appearance, and image saliency.

Description

Topological graph scene recognition method and device based on density filtering and landmark saliency
Technical Field
The invention relates to the field of image processing, in particular to a topological graph scene recognition method and device based on density filtering and landmark saliency.
Background
Simultaneous localization and mapping (SLAM) is the process of placing a robot in an unknown environment and gradually building a map of its surroundings while it moves. The robot needs to be able to match images taken from the same location while repeatedly traversing the environment, in order to correct the drift errors that accumulate in the map over time. This is called closed-loop detection (also called visual scene recognition). When a mobile robot is in an environment with severe viewpoint change, closed-loop detection performance degrades because of serious drift and large state-estimation errors. Visual scene recognition therefore becomes a very challenging problem under viewpoint change.
Recent research shows that some approaches use landmarks and topological methods to address such challenges in environments with severe viewpoint change, and they report clear performance improvements. However, the landmarks are extracted by object detection and are not robust enough to drastic viewpoint change, so scene recognition may fail. Furthermore, these methods do not make full use of the landmark information contained in the frames when constructing the topological graph. If the extracted landmarks have low distinctiveness, scenes can be confused and the matching of the scene recognition method finally fails.
Disclosure of Invention
In order to overcome the defects of the prior art and improve the scene recognition capability under the change of visual angles, the invention adopts the following technical scheme:
the topological graph scene recognition method and device based on density filtering and landmark saliency overcome adverse effects brought to a scene recognition result by visual angle change, and accuracy and robustness of a scene recognition system are guaranteed. On the basis of a target detection algorithm, SIFT key points in the landmarks are quantified and modeled through a density filtering algorithm, the number and distribution of the key points in each landmark are calculated to obtain the landmarks with view angle invariance, the problem that the landmarks depend on a detector is solved, and the performance and the accuracy of scene matching are improved. And then extracting depth global descriptors of the landmarks in the query frame and the reference frame, performing primary matching of the landmarks through cross authentication, and eliminating mismatching by using the shape similarity of the matched landmarks. The significance of the landmarks is obtained by applying a Landmark Localization Network (Landmark Localization Network), and in the process of constructing a topological graph, the matching scores of a query frame and a reference frame are calculated by encoding whether the landmarks have the attribute with high identification degree, the spatial relationship among the landmarks and the appearance of the landmarks, so that the problems of matching scene confusion caused by low-identification-degree landmarks, poor robustness of a scene recognition system under the change of a visual angle and low accuracy are solved. The robustness of the scene recognition system under the change of the visual angle can be ensured, and the accuracy of the scene matching result can be ensured.
The topological graph scene identification method based on density filtering and landmark saliency comprises the following steps:
step one: extracting landmarks from the input frame using a target detection algorithm;
step two: utilizing a density filtering algorithm to quantize and model SIFT key points in the landmarks, calculating the number and distribution of the key points in each landmark, and utilizing the attribute that the SIFT key points have better robustness to the change of a visual angle to obtain the landmarks with visual angle invariance;
step three: adopting a convolutional neural network to extract a depth global descriptor for the visual angle invariance landmarks in the query frame and the reference frame, and completing the primary matching of the landmarks through cross authentication;
step four: eliminating mismatching by utilizing the shape similarity of the matched landmarks to obtain mutually matched landmarks in the query frame and the reference frame;
step five: the significance of the landmark is obtained by adopting a landmark positioning network;
step six: and coding the spatial relationship, the landmark significance and the landmark appearance of the landmarks in the query frame and the reference frame into a topological graph structure, calculating the similarity scores of the query frame and the reference frame, and determining the frame with the highest score as a matching frame.
Further, in the second step, the landmark having invariance to the angle of view is obtained by calculating the frequency and the coefficient of variation of each landmark, which specifically includes the following steps:
step 2.1: quantifying and modeling keypoint frequencies in a current landmark:
CR = f · K_h / (H_h × W_h)  (1)
where CR represents the quantified and modeled frequency of key points in the landmark, K_h represents the number of key points, H_h represents the height of the landmark, W_h represents the width of the landmark, and f represents a scaling factor;
step 2.2: dividing the landmark into a group of grids, and quantifying and modeling the distribution of key points in the current landmark:
std_sum = std(G) / avg(G)  (2)
where std_sum represents the coefficient of variation, a statistical measure of how the per-grid key-point counts vary, reflecting the dispersion of the key points; std(G) represents the standard deviation of the key-point counts of the m grids, avg(G) represents the mean of the key-point counts of the m grids, and m represents the number of grids;
step 2.3: when CR is greater than a first threshold T_1 and std_sum is smaller than a second threshold T_2, the corresponding landmark has viewpoint invariance.
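As an illustration of steps 2.1 to 2.3, the following is a minimal Python sketch of the density filter, assuming the key points are already available as (x, y) coordinates inside the landmark box (in practice they would come from a SIFT detector); the grid size, thresholds T_1 and T_2, and scaling factor f used here are illustrative values, not values fixed by the invention:

```python
import numpy as np

def keypoint_frequency(num_kp, width, height, f=1000.0):
    """CR (step 2.1): key-point count per unit landmark area, scaled by f."""
    return f * num_kp / (width * height)

def keypoint_dispersion(points, width, height, grid=(3, 3)):
    """std_sum (step 2.2): coefficient of variation of per-grid key-point counts."""
    counts, _, _ = np.histogram2d(
        [p[0] for p in points], [p[1] for p in points],
        bins=grid, range=[[0, width], [0, height]])
    cells = counts.ravel()
    return cells.std() / cells.mean() if cells.mean() > 0 else np.inf

def is_view_invariant(points, width, height, t1=0.5, t2=1.0, f=1000.0):
    """Step 2.3: keep landmarks with many, evenly spread key points."""
    cr = keypoint_frequency(len(points), width, height, f)
    return cr > t1 and keypoint_dispersion(points, width, height) < t2
```

A landmark with dense, evenly distributed key points passes the filter, while one with only a few clustered key points is rejected.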
Further, in the fourth step, in order to ensure the accuracy of landmark matching, the shape similarity of the matching landmarks is introduced:
the shape similarity score Shape_ab of a pair of preliminarily matched landmarks, computed from their widths and heights (3), wherein w_a, h_a, w_b, h_b are respectively the width and height of the preliminarily matched landmarks, a represents a landmark of the query frame A, b represents a landmark of the reference frame B, and a and b are landmarks of frames A and B matched by the preliminary features.
Further, in the sixth step, the calculation of the similarity score between the query frame and the reference frame includes the following steps:
step S6.1: calculating the similarity score of the angles in the query frame and the reference frame:
the similarity score w_θ of the angles in the query frame and the reference frame is computed from the spatial angular similarity φ_ii′,jj′,kk′ of the landmarks in the two frames, the appearance similarity scores z_ii′, z_jj′, z_kk′ of the three pairs of matched landmarks forming a triangle, and the normalized landmark saliencies S_i, S_i′, S_j, S_j′, S_k, S_k′ of the three landmark pairs in the query and reference frames, where avg denotes taking the average and max denotes taking the maximum;
step S6.2: calculating the similarity score of the distance between the query frame and the reference frame:
the similarity score w_d of the distance between the query frame and the reference frame is computed from the spatial distance similarity d_ii′,jj′ of the landmarks in the two frames, the appearance similarity scores z_ii′, z_jj′ of the two pairs of matched landmarks forming an edge, and the normalized landmark saliencies S_i, S_i′, S_j, S_j′ of the two landmark pairs in the query and reference frames, where avg denotes taking the average and max denotes taking the maximum;
step S6.3: to limit the range of similarity scores to [0,1] due to the different number of landmarks matched between different frames, a final similarity score is calculated from frame to frame:
Score, the frame-to-frame similarity score, aggregates the angle and distance similarity scores and normalizes by n, where n represents the number of landmarks matched between the two frames.
Further, the spatial relationship in the sixth step includes a distance spatial relationship, and the distance spatial relationship is configured as follows:
d_ii′,jj′ = exp(−|e_i,j − e_i′,j′|)  (4)
wherein d_ii′,jj′ represents the spatial distance similarity of the landmarks in the query frame and the reference frame; e_i,j and e_i′,j′ respectively represent edges of the topological graphs constructed from the landmarks in the query frame and the reference frame; i, i′ and j, j′ respectively denote two pairs of landmarks of the query and reference frames, i.e. the i-th, i′-th, j-th and j′-th nodes encoded in the topological graph.
Further, the spatial relationship in the sixth step includes an angular spatial relationship, and the angular spatial relationship is configured as follows:
the spatial angular similarity φ_ii′,jj′,kk′ of the landmarks in the query frame and the reference frame is computed from the corner angles of the corresponding triangles (5), where k and k′ represent a pair of landmarks of the query and reference frames, i.e. the k-th and k′-th nodes encoded in the topological graph; θ_u, u ∈ {i, j, k}, denote the corner angles of a triangle in the topological graph constructed from the landmarks in the query frame, and θ_v, v ∈ {i′, j′, k′}, denote the corner angles of the corresponding triangle in the topological graph constructed from the landmarks in the reference frame.
Further, in the sixth step, normalization processing is performed on the landmark saliency and the landmark saliency is encoded into a topological graph structure:
S_l = (s_l − min(s_l)) / (max(s_l) − min(s_l))  (6)
wherein S_l denotes the normalized landmark saliency, s_l the landmark saliency before normalization, min(s_l) the minimum of all landmark saliencies, and max(s_l) the maximum of all landmark saliencies.
Further, the key points are SIFT key points.
Further, the grids in step 2.2 are a group of grids of the same size.
The device for identifying the topological graph scene based on the density filtering and the landmark saliency comprises a memory and one or more processors, wherein the memory stores executable codes, and the one or more processors are used for realizing the topological graph scene identification method based on the density filtering and the landmark saliency when executing the executable codes.
The invention has the advantages and beneficial effects that:
the density filtering-based landmark and the landmark saliency-based map matching method well solve the problems that under severe visual angle change, extracted landmarks are not robust enough, and landmark information in frames is not fully utilized in scene recognition. The method and the device have the advantage that a better experimental result is obtained on the aspect of improving the scene identification matching precision and robustness under the change of the visual angle.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a diagram of the effect of obtaining viewpoint-invariant landmarks by the density filtering method of the invention.
FIGS. 3a-3c are diagrams of the visualization effect of landmark saliency in the invention.
Fig. 4 is a diagram of the demonstration effect of constructing a topological graph for frame matching in the present invention.
FIG. 5 is a table comparing AUC values for the methods of the invention and other methods.
FIG. 6 is a table comparing R_{P=100} values for the method of the invention and other methods.
Fig. 7 shows scenes under the different severe viewpoint changes designed for the invention.
FIG. 8 is a table comparing AUC values under drastic visual angle changes for the methods of the present invention and other methods.
FIG. 9 is a table comparing R_{P=100} values under severe viewpoint changes for the method of the invention and other methods.
FIG. 10 is a schematic diagram of the apparatus of the present invention.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
For scene recognition under viewpoint change, robustness must be increased while scene-matching accuracy is improved for the technique to be of practical use. The invention provides a method combining landmarks, landmark saliency and topological graphs. The algorithm flow is shown in fig. 1.
On the basis of a target detection algorithm, SIFT key points are extracted from the detected landmarks, exploiting their robustness to viewpoint change, and the designed density filtering algorithm converts the problem of obtaining viewpoint-invariant landmarks into a problem of the number and distribution of SIFT key points. Deep global descriptors are then extracted for the viewpoint-invariant landmarks in the query and reference frames, preliminary landmark matching is completed through cross authentication, and mismatches are eliminated using the shape similarity of the matched landmarks, yielding mutually matched landmarks in the two frames. Considering that landmarks with low distinctiveness can cause confusion between matched scenes, and that topological-graph structures perform well on scene recognition under viewpoint change, the invention combines landmark saliency with topological-graph construction. A Landmark Localization Network is applied to obtain the landmark saliency, and the matching score of the query and reference frames is calculated from a topological-graph structure built from the distinctiveness of the landmarks, the spatial relationships between them, and their appearance. The method achieves high accuracy and robustness in scene recognition tasks with viewpoint change.
A topological map scene identification method based on density filtering and landmark saliency comprises the following specific implementation steps:
the method comprises the following steps: as shown in fig. 2, landmarks are extracted using an existing object detection algorithm for an input query frame, and SIFT key points are extracted.
Step two: by utilizing the attribute that the SIFT key points have better robustness to the change of the view angle, the obtained landmark with view angle invariance is converted into the number and distribution problems of the SIFT key points in the current landmark. And quantifying and modeling key points in the landmark through a density filtering algorithm, and calculating the frequency and the coefficient of variation of each landmark to obtain the landmark with invariance to the view angle.
Specifically, the problem of obtaining landmarks with view invariance is converted into the problem of the number and distribution of SIFT key points in the current landmarks. The method comprises the following steps:
step 2.1: SIFT keypoint frequencies in the current landmark are first quantified and modeled.
CR = f · K_h / (H_h × W_h)  (1)
Wherein CR quantifies and models the frequency of SIFT key points in the landmark, K_h is the number of SIFT key points, H_h represents the height of the landmark, W_h represents the width of the landmark, and f is a scaling factor.
Step 2.2: and dividing the landmark into 9 grids with the same size, and quantifying and modeling SIFT key point distribution in the current landmark.
std_sum = std(G) / avg(G)  (2)
Wherein std_sum is the coefficient of variation, a statistical measure of how the per-grid key-point counts vary, reflecting the dispersion of the key points; std(G) is the standard deviation of the SIFT key-point counts of the 9 grids, and avg(G) is the mean of the SIFT key-point counts of the 9 grids.
Step 2.3: if CR is greater than threshold T1Std _ sum is smaller than threshold T2Then the landmark has view angle invariance.
Step three: the method utilizes a convolutional neural network to extract a depth global descriptor for each frame of landmark, and completes the initial matching of the landmark through cross authentication;
specifically, for the visual angle invariance landmarks in the query frame and the reference frame, a depth global descriptor is extracted, and the preliminary matching of the landmarks is completed through cross authentication.
And step four, eliminating mismatching by utilizing the shape similarity of the matched landmarks to obtain mutually matched landmarks in the two frames.
Specifically, in order to ensure the accuracy of landmark matching, the shape similarity of the matched landmarks is introduced
the shape similarity score Shape_ab of a pair of preliminarily matched landmarks (3), computed from their widths and heights, wherein w_a, h_a, w_b, h_b are respectively the width and height of the preliminarily matched landmarks.
Step five: aiming at the problem that the low-identification-degree landmarks can cause confusion of matched scenes, the invention uses a Landmark Localization Network (Landmark Localization Network) to acquire the significance of the landmarks, as shown in figures 3a-3 c.
Step six, the invention codes the landmark significance, the spatial relationship among the landmarks and the landmark appearance into a topological graph structure, and the effect of the construction process is shown in fig. 4. And calculating the similarity scores of the query frame and the reference frames, wherein the reference frame with the highest score is the final matching frame.
Specifically, the spatial relationship of the landmarks in the query frame and the reference frame is encoded into the topological graph, and the spatial relationship is divided into a distance spatial relationship and an angle spatial relationship.
First, a distance spatial relationship is constructed:
d_ii′,jj′ = exp(−|e_i,j − e_i′,j′|)  (4)
Wherein d_ii′,jj′ represents the spatial distance similarity of the landmarks in the query frame and the reference frame, and e_i,j, e_i′,j′ are respectively edges of the topological graphs constructed from the landmarks in the query frame and the reference frame.
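A small sketch of the distance spatial relationship of formula (4); treating an edge e_{i,j} as the Euclidean distance between two landmark centres is an assumed interpretation of the topological-graph edges, not stated explicitly in the text:

```python
import numpy as np

def edge_length(c1, c2):
    """Edge weight e_{i,j}: Euclidean distance between two landmark
    centres (assumed interpretation of a topological-graph edge)."""
    return float(np.linalg.norm(np.asarray(c1, float) - np.asarray(c2, float)))

def distance_similarity(e_query, e_ref):
    """Formula (4): d = exp(-|e_ij - e_i'j'|); equals 1 for identical edges
    and decays toward 0 as the edge lengths diverge."""
    return float(np.exp(-abs(e_query - e_ref)))
```

Because the similarity depends only on the difference of edge lengths, it is invariant to where the landmark pair sits in the image, which is what makes the relation useful under viewpoint change.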
Then, an angular spatial relationship is constructed:
the spatial angular similarity φ_ii′,jj′,kk′ of the landmarks in the query frame and the reference frame is computed from the triangle corner angles (5), where u ∈ {i, j, k}, v ∈ {i′, j′, k′}, and θ_u, θ_v are respectively the corner angles of the triangles in the topological graphs constructed from the landmarks in the query frame and the reference frame.
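The corner angles θ used by the angular spatial relationship can be computed from the centres of three matched landmarks with the law of cosines; in this sketch, using the landmark centre as the triangle vertex is an assumption:

```python
import numpy as np

def triangle_angles(p1, p2, p3):
    """Interior angles (radians) of the triangle formed by three landmark
    centres, one angle per vertex, via the law of cosines."""
    pts = [np.asarray(p, float) for p in (p1, p2, p3)]
    angles = []
    for k in range(3):
        a, b, c = pts[k], pts[(k + 1) % 3], pts[(k + 2) % 3]
        u, v = b - a, c - a
        cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
        angles.append(float(np.arccos(np.clip(cos, -1.0, 1.0))))
    return angles
```

The three angles of the query-frame triangle can then be compared with those of the corresponding reference-frame triangle to score the angular relationship.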
The landmark saliency is introduced into the topological graph construction, and the landmark saliency is firstly subjected to normalization processing:
S_l = (s_l − min(s_l)) / (max(s_l) − min(s_l))  (6)
Wherein S_l represents the normalized landmark saliency, s_l the landmark saliency before normalization, min(s_l) the minimum of all landmark saliencies, and max(s_l) the maximum of all landmark saliencies.
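The min-max normalization of the landmark saliencies in formula (6), in plain Python (the handling of the degenerate all-equal case is an added safeguard, not specified in the text):

```python
def normalize_saliency(scores):
    """Formula (6): min-max normalization of landmark saliencies to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:                      # degenerate case: all saliencies equal
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]
```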
In order to obtain a correct matching frame, a topological graph structure is constructed based on the spatial relationship among the landmarks, the appearance similarity score and the image saliency to calculate the matching scores of the query frame and the reference frame. A final similarity score is calculated from frame to frame.
the angle similarity score w_θ of the query frame and the reference frame (7) is computed from the spatial angular similarity of formula (5), the appearance similarity scores z_ii′, z_jj′, z_kk′ of the three pairs of matched landmarks forming the triangle, and the normalized saliencies S_i, S_i′, S_j, S_j′, S_k, S_k′ obtained by formula (6); avg denotes taking the average and max denotes taking the maximum.
the distance similarity score w_d of the query frame and the reference frame (8) is computed from the spatial distance similarity d_ii′,jj′ of formula (4), the appearance similarity scores z_ii′, z_jj′ of the two pairs of matched landmarks forming the edge, and the normalized saliencies S_i, S_i′, S_j, S_j′ obtained by formula (6); avg denotes taking the average and max denotes taking the maximum.
Since the number of matched landmarks varies from frame to frame, to limit the range of similarity scores to [0,1], the final similarity score from frame to frame is:
the final frame-to-frame similarity (9) aggregates the angle and distance similarity scores and normalizes by the number of matches, where Score represents the frame-to-frame similarity score and n represents the number of landmarks matched between the two frames.
The invention targets the problem that scene recognition systems are not robust under viewpoint change. The scene-matching results are shown in figs. 5 and 6; they show that the proposed algorithm reduces the influence of viewpoint change on scene recognition. To verify performance under severe viewpoint change, the severity of the viewpoint change was increased by applying translation, rotation and combined transformations to the scenes, as shown in fig. 7. The scene recognition results under severe viewpoint change are shown in figs. 8 and 9 and show that the proposed algorithm performs better.
In conclusion, the topological graph scene recognition algorithm based on density filtering and landmark saliency overcomes the adverse effect of viewpoint change on the scene recognition result, solves the confusion of matched scenes and the poor performance of the scene recognition system caused by landmarks that depend on the detector or have low distinctiveness, and ensures the accuracy and robustness of scene recognition.
The invention fully considers the limitations of scene recognition under viewpoint change and proposes an algorithm based on density filtering, landmark saliency and topological graphs. Through the designed density filtering algorithm, built on an existing target detection algorithm, the problem of obtaining viewpoint-invariant landmarks is converted into a problem of the number and distribution of SIFT key points in the current landmark, which resolves the extracted landmarks' over-dependence on the detector and their lack of robustness to viewpoint change. To prevent low-distinctiveness landmarks from causing confusion between matched scenes, a Landmark Localization Network is applied to obtain the landmark saliency, and the landmark saliency, the spatial relationships between landmarks and the landmark appearance are encoded into a topological-graph structure to match the query frame with the reference frame. The designed density filtering, landmark saliency and topological graph together ensure the accuracy and strong robustness of the scene recognition system under viewpoint change and provide a reliable guarantee for the operation of the method.
Corresponding to the embodiment of the topological map scene recognition method based on the density filtering and the landmark saliency, the invention also provides an embodiment of a topological map scene recognition device based on the density filtering and the landmark saliency.
Referring to fig. 10, the topology scene recognition apparatus based on density filtering and landmark saliency according to the embodiment of the present invention includes a memory and one or more processors, where the memory stores executable codes, and the one or more processors execute the executable codes to implement the topology scene recognition method based on density filtering and landmark saliency according to the embodiment.
The topology scene recognition device based on density filtering and landmark saliency according to the present invention may be applied to any device with data processing capability, such as a computer. The device embodiments may be implemented by software, by hardware, or by a combination of the two. Taking a software implementation as an example, the device is formed, as a logical device, by the processor of the host device reading the corresponding computer program instructions from non-volatile storage into memory and running them. In terms of hardware, fig. 10 shows the hardware structure of a device with data processing capability on which the topology scene recognition device is located; in addition to the processor, memory, network interface and non-volatile storage shown in fig. 10, the host device may also include other hardware according to its actual function, which is not described again here.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.
The embodiment of the present invention further provides a computer-readable storage medium, on which a program is stored, and when the program is executed by a processor, the method for identifying a topological scene based on density filtering and landmark saliency in the foregoing embodiments is implemented.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any data processing capability device described in any of the foregoing embodiments. The computer readable storage medium may also be any external storage device of a device with data processing capabilities, such as a plug-in hard disk, a Smart Media Card (SMC), an SD Card, a Flash memory Card (Flash Card), etc. provided on the device. Further, the computer readable storage medium may include both internal storage units and external storage devices of any data processing capable device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the arbitrary data processing-capable device, and may also be used for temporarily storing data that has been output or is to be output.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A topological graph scene recognition method based on density filtering and landmark saliency, characterized by comprising the following steps:
step one: extracting landmarks from the input frame using an object detection algorithm;
step two: quantifying and modeling the keypoints in the landmarks with a density filtering algorithm, and computing the number and distribution of keypoints in each landmark to obtain landmarks with view-angle invariance;
step three: extracting deep global descriptors for the view-angle-invariant landmarks in the query frame and the reference frame with a convolutional neural network, and completing a preliminary matching of the landmarks through cross-checking;
step four: eliminating mismatches using the shape similarity of the matched landmarks to obtain mutually matched landmarks in the query frame and the reference frame;
step five: obtaining the saliency of the landmarks with a landmark localization network;
step six: encoding the spatial relationship, saliency, and appearance of the landmarks in the query frame and the reference frame into a topological graph structure, computing the similarity score of the query frame against each reference frame, and taking the highest-scoring frame as the matching frame.
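As an illustrative sketch only, the six steps of claim 1 reduce to a retrieval loop: score the query frame against every reference frame and keep the best-scoring one. The helper names and the toy label-overlap score below are hypothetical stand-ins, not the patent's actual topological-graph similarity:

```python
# Hypothetical pipeline skeleton for claim 1 (steps 1-5 are assumed to
# have produced per-frame landmark lists; the score is a toy stand-in).

def score_pair(query_landmarks, ref_landmarks):
    """Toy frame-to-frame similarity: number of shared landmark labels."""
    return len({l["label"] for l in query_landmarks} &
               {l["label"] for l in ref_landmarks})

def recognize(query_landmarks, reference_frames):
    """Step 6: score the query against every reference frame and
    return the index of the highest-scoring (matching) frame."""
    scores = [score_pair(query_landmarks, ref) for ref in reference_frames]
    return max(range(len(scores)), key=scores.__getitem__)

query = [{"label": "door"}, {"label": "tree"}]
refs = [[{"label": "car"}],
        [{"label": "door"}, {"label": "tree"}],
        [{"label": "tree"}]]
best = recognize(query, refs)
```

In the patent the pairwise score is the topological-graph similarity of claim 4; only the argmax-over-references structure is carried over here.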
2. The topological graph scene recognition method based on density filtering and landmark saliency according to claim 1, wherein in step two, the landmarks with view-angle invariance are obtained by calculating the keypoint frequency and the coefficient of variation of each landmark, specifically comprising the following steps:
step 2.1: quantifying and modeling the keypoint frequency in the current landmark:
[equation image FDA0003518433290000011]
where CR represents the keypoint frequency of the quantified and modeled landmark, K_h represents the number of keypoints, H_h represents the height of the landmark, W_h represents the width of the landmark, and f represents a scaling factor;
step 2.2: dividing the landmark into a group of grid cells, and quantifying and modeling the distribution of keypoints in the current landmark:
std_sum = std(G) / avg(G)
where std_sum represents the coefficient of variation, a statistical measure of the variation of the grid-cell values within the landmark that reflects the dispersion of the keypoints; std(G) represents the standard deviation of the keypoint counts of the m grid cells, avg(G) represents the mean of the keypoint counts of the m grid cells, and m represents the number of grid cells;
step 2.3: when CR is greater than a first threshold T_1 and std_sum is smaller than a second threshold T_2, the corresponding landmark has view-angle invariance.
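A minimal sketch of the density-filtering test in claim 2. The coefficient of variation std(G)/avg(G) follows directly from the definitions in step 2.2; the CR formula is published only as an image, so the keypoints-per-area form used here is an assumption, as are all function names:

```python
import numpy as np

def keypoint_frequency(n_keypoints, h, w, f=1.0):
    # Assumed form of CR: keypoints per unit landmark area, scaled by f.
    # (The patent's exact CR formula is published only as an image.)
    return f * n_keypoints / (h * w)

def coefficient_of_variation(grid_counts):
    # std_sum = std(G) / avg(G): dispersion of the keypoints across the
    # m grid cells the landmark is divided into (claim 2, step 2.2).
    g = np.asarray(grid_counts, dtype=float)
    return g.std() / g.mean()

def has_view_invariance(n_kp, h, w, grid_counts, t1, t2, f=1.0):
    # Step 2.3: CR above threshold T1 AND std_sum below threshold T2.
    return (keypoint_frequency(n_kp, h, w, f) > t1 and
            coefficient_of_variation(grid_counts) < t2)
```

A landmark with many, evenly spread keypoints passes; one whose keypoints pile up in a single cell fails on the std_sum test.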
3. The topological graph scene recognition method based on density filtering and landmark saliency according to claim 1, wherein in step four, the shape similarity of the matched landmarks is introduced:
[equation image FDA0003518433290000013]
where Shape_ab represents the shape similarity score of a pair of preliminarily matched landmarks, and w_a, h_a, w_b, h_b are the widths and heights of the preliminarily matched landmarks, with a denoting the landmark in the query frame and b denoting the landmark in the reference frame.
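The Shape_ab formula in claim 3 is published only as an image; a plausible symmetric instantiation from the listed widths and heights is sketched below. The min/max ratio form is an assumption, chosen so that identical boxes score 1.0 and the score decays as the shapes diverge:

```python
def shape_similarity(wa, ha, wb, hb):
    # Assumed symmetric ratio form of Shape_ab (the patent's formula is
    # published only as an image): 1.0 for identical bounding boxes,
    # shrinking toward 0 as widths and heights diverge.
    return (min(wa, wb) / max(wa, wb)) * (min(ha, hb) / max(ha, hb))
```

In step four, preliminary matches whose score falls below a chosen threshold would be rejected as mismatches.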
4. The topological graph scene recognition method based on density filtering and landmark saliency according to claim 1, wherein in step six, calculating the similarity scores of the query frame and the reference frame comprises the following steps:
step S6.1: calculating the similarity score of the angles in the query frame and the reference frame:
[equation image FDA0003518433290000021]
where w_θ represents the similarity score of the angles in the query frame and the reference frame, the term shown as image FDA0003518433290000022 represents the spatial angular similarity in the spatial relationship of the landmarks in the query and reference frames, z_ii′, z_jj′, z_kk′ represent the appearance similarity scores of the three pairs of matched landmarks that form the triangle, S_i, S_i′, S_j, S_j′, S_k, S_k′ represent the normalized landmark saliencies of the three pairs of landmarks in the query frame and the reference frame, avg denotes taking the average, and max denotes taking the maximum;
step S6.2: calculating the similarity score of the distance between the query frame and the reference frame:
[equation image FDA0003518433290000023]
where w_d represents the similarity score of the distances between the query frame and the reference frame, d_ii′,jj′ represents the spatial distance similarity in the spatial relationship of the landmarks in the query and reference frames, z_ii′, z_jj′ represent the appearance similarity scores of the two pairs of matched landmarks that form the edge, S_i, S_i′, S_j, S_j′ represent the normalized landmark saliencies of the two pairs of landmarks in the query frame and the reference frame, avg denotes taking the average, and max denotes taking the maximum;
step S6.3: calculating a final similarity score from frame to frame:
[equation image FDA0003518433290000024]
where Score represents the frame-to-frame similarity score and n represents the number of landmarks matched between the two frames.
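The final aggregation of claim 4 can be sketched as follows. Since the Score formula is published only as an image, the sum-and-normalize-by-n combination below is purely an assumed placeholder illustrating how the per-triangle angle scores w_θ and per-edge distance scores w_d might be fused:

```python
def frame_score(angle_scores, distance_scores, n_matched):
    # ASSUMED aggregation (the patent's Score formula is published only
    # as an image): sum the angle similarity scores w_theta and the
    # distance similarity scores w_d, then normalize by the number n of
    # matched landmarks between the two frames.
    return (sum(angle_scores) + sum(distance_scores)) / max(n_matched, 1)
```

Whatever the exact formula, the frame with the highest Score is declared the matching frame, as in step six of claim 1.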
5. The topological graph scene recognition method based on density filtering and landmark saliency according to claim 1 or 4, wherein the spatial relationship in step six comprises a distance spatial relationship, constructed as follows:
d_ii′,jj′ = exp(-|e_i,j - e_i′,j′|) (4)
where d_ii′,jj′ represents the spatial distance similarity of the landmarks in the query frame and the reference frame, e_i,j and e_i′,j′ represent the edges of the topological graphs constructed from the landmarks in the query frame and the reference frame respectively, and i, i′ and j, j′ denote the two pairs of landmarks in the query and reference frames, i.e. the i-th, i′-th and j-th, j′-th nodes encoded in the topological graph.
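Equation (4) of claim 5 is directly computable. The sketch below assumes graph edges are measured between landmark centroids; `edge_length` and `distance_similarity` are illustrative names, not the patent's API:

```python
import math

def edge_length(p, q):
    # e_{i,j}: Euclidean length of the topological-graph edge between
    # two landmark centroids p and q (centroid representation assumed).
    return math.dist(p, q)

def distance_similarity(e_ij, e_ij_prime):
    # Claim 5, eq. (4): d_{ii',jj'} = exp(-|e_{i,j} - e_{i',j'}|).
    # Equal edge lengths give 1.0; the score decays exponentially
    # with the difference in edge length.
    return math.exp(-abs(e_ij - e_ij_prime))
```

Because only the difference of edge lengths enters eq. (4), the score is invariant to a common translation of both frames' landmark layouts.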
6. The topological graph scene recognition method based on density filtering and landmark saliency according to claim 1 or 4, wherein the spatial relationship in step six comprises an angular spatial relationship, constructed as follows:
[equation image FDA0003518433290000031]
where the term shown as image FDA0003518433290000032 represents the spatial angular similarity of the landmarks in the query frame and the reference frame, k and k′ denote a pair of landmarks in the query and reference frames, i.e. the k-th and k′-th nodes encoded in the topological graph, θ_u, u ∈ {i, j, k}, denotes the corners of the triangle in the topological graph constructed from the landmarks in the query frame, and θ_v, v ∈ {i′, j′, k′}, denotes the corners of the triangle in the topological graph constructed from the landmarks in the reference frame.
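A sketch of the angular spatial relationship in claim 6, computing the triangle corners θ_u from three landmark centroids. The similarity formula itself is published only as an image; the exponential of summed angle differences used here is an assumed form, chosen by analogy with the distance term of equation (4):

```python
import math

def triangle_angles(p, q, r):
    # Interior angles (radians) of the triangle formed by three
    # landmark centroids in a frame's topological graph.
    def ang(a, b, c):          # angle at vertex a
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        return math.acos(dot / (math.hypot(*v1) * math.hypot(*v2)))
    return ang(p, q, r), ang(q, p, r), ang(r, p, q)

def angle_similarity(angles_query, angles_ref):
    # ASSUMED form (the patent's angular-similarity formula is published
    # only as an image): exp of the summed per-corner angle differences,
    # so congruent triangles score 1.0.
    return math.exp(-sum(abs(u - v) for u, v in zip(angles_query, angles_ref)))
```

Since interior angles are invariant to translation, rotation, and uniform scale, this term tolerates viewpoint changes that preserve the triangle's shape.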
7. The topological graph scene recognition method based on density filtering and landmark saliency according to claim 1 or 4, wherein in step six, the landmark saliency is normalized and encoded into the topological graph structure:
S_l = (s_l - min(s_l)) / (max(s_l) - min(s_l))
where S_l denotes the normalized landmark saliency, s_l denotes the landmark saliency before normalization, min(s_l) denotes the minimum of all landmark saliencies, and max(s_l) denotes the maximum of all landmark saliencies.
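Claim 7's normalization is standard min-max scaling and can be sketched directly; the guard for the degenerate case of all-equal saliencies is an added assumption, since the formula is undefined there:

```python
def normalize_saliency(saliencies):
    # Claim 7: min-max normalization of landmark saliency,
    # S_l = (s_l - min(s)) / (max(s) - min(s)), mapping scores to [0, 1].
    lo, hi = min(saliencies), max(saliencies)
    if hi == lo:               # all landmarks equally salient (assumed fallback)
        return [0.0 for _ in saliencies]
    return [(s - lo) / (hi - lo) for s in saliencies]
```

The normalized values S_i, S_j, ... are the saliency weights that enter the angle and distance similarity scores of claim 4.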
8. The topological graph scene recognition method based on density filtering and landmark saliency according to claim 1, wherein the keypoints are SIFT keypoints.
9. The topological graph scene recognition method based on density filtering and landmark saliency according to claim 2, wherein the grid in step 2.2 is a set of grid cells of the same size.
10. A device for topological graph scene recognition based on density filtering and landmark saliency, characterized by comprising a memory and one or more processors, wherein the memory stores executable code, and the one or more processors execute the executable code to implement the topological graph scene recognition method based on density filtering and landmark saliency according to any one of claims 1-9.
CN202210174254.XA 2022-02-24 2022-02-24 Topological graph scene recognition method and device based on density filtering and landmark saliency Pending CN114708482A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210174254.XA CN114708482A (en) 2022-02-24 2022-02-24 Topological graph scene recognition method and device based on density filtering and landmark saliency

Publications (1)

Publication Number Publication Date
CN114708482A true CN114708482A (en) 2022-07-05

Family

ID=82167502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210174254.XA Pending CN114708482A (en) 2022-02-24 2022-02-24 Topological graph scene recognition method and device based on density filtering and landmark saliency

Country Status (1)

Country Link
CN (1) CN114708482A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination