TW202217662A - Visual positioning method, training method of related models, electronic device and computer-readable storage medium - Google Patents

Info

Publication number
TW202217662A
TW202217662A
Authority
TW
Taiwan
Prior art keywords
matching
point
image
value
map
Prior art date
Application number
TW110132126A
Other languages
Chinese (zh)
Inventor
鮑虎軍
章國鋒
余海林
馮友計
Original Assignee
大陸商浙江商湯科技開發有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 大陸商浙江商湯科技開發有限公司 filed Critical 大陸商浙江商湯科技開發有限公司
Publication of TW202217662A publication Critical patent/TW202217662A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses a visual positioning method, a training method for related models, an electronic device and a computer-readable storage medium. The training method for the matching prediction model includes: constructing sample matching data from a sample image and map data, where the sample matching data include several groups of point pairs and the actual matching value of each group of point pairs, and the two points of each pair come from the sample image and the map data respectively; predicting the several groups of point pairs with the matching prediction model to obtain predicted matching values for the point pairs; determining the loss value of the matching prediction model from the actual matching values and the predicted matching values; and adjusting the parameters of the matching prediction model using the loss value. The above scheme can improve the accuracy and real-time performance of visual positioning.

Description

Visual positioning method, training method of related models, electronic device and computer-readable storage medium

The present invention relates to the technical field of computer vision, and in particular to a visual positioning method, a training method for related models, an electronic device and a computer-readable storage medium.

Visual positioning can be carried out in several ways, depending on how the map data are represented. Among them, the structure-based approach, also known as the feature-based approach, has received wide attention for its high accuracy and strong generalization performance.

At present, feature-based visual positioning requires matching multiple point pairs between the image data and the map data. However, matching relationships established from local similarity alone are unreliable: in large-scale scenes, or in scenes with repeated structures or textures, false matches arise easily and degrade positioning accuracy. Random Sample Consensus (RANSAC) can remove false matches, but because it samples every point with equal probability, it becomes slow and inaccurate when the initial matches contain too many outliers, harming both the real-time performance and the accuracy of visual positioning. In view of this, improving the accuracy and real-time performance of visual positioning has become an urgent problem.

The present invention provides a visual positioning method, a training method for related models, an electronic device and a computer-readable storage medium.

A first aspect of the present invention provides a training method for a matching prediction model, including: constructing sample matching data from a sample image and map data, where the sample matching data include several groups of point pairs and the actual matching value of each group of point pairs, and the two points of each pair come from the sample image and the map data respectively; predicting the several groups of point pairs with the matching prediction model to obtain predicted matching values for the point pairs; determining the loss value of the matching prediction model from the actual matching values and the predicted matching values; and adjusting the parameters of the matching prediction model using the loss value.

Therefore, sample matching data are constructed from the sample image and the map data, the sample matching data including several groups of point pairs and the actual matching value of each group, with the two points of each pair drawn from the sample image and the map data respectively. The matching prediction model predicts matching values for these point pairs, the loss value is determined from the actual and predicted matching values, and the model parameters are adjusted with the loss value. The trained model can thus establish matching relationships: during visual positioning it predicts the matching value between point pairs, so pairs with high predicted matching values can be sampled preferentially when determining the pose parameters of the device that captured the image to be positioned, which helps improve the accuracy and real-time performance of visual positioning.
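The predict-loss-adjust loop described above can be sketched as follows. This is a toy illustration, not the patent's implementation: `MatchPredictor` is a hypothetical one-parameter stand-in for the matching prediction model, and squared error with plain gradient descent is our assumption.

```python
class MatchPredictor:
    """Toy stand-in for the matching prediction model (a single weight)."""
    def __init__(self, w=0.0):
        self.w = w

    def predict(self, feature):
        # Map a point-pair feature to a predicted matching value.
        return self.w * feature

    def loss_and_grad(self, features, targets):
        # Mean squared error between predicted and actual matching values,
        # plus its gradient with respect to the single parameter w.
        preds = [self.predict(f) for f in features]
        n = len(features)
        loss = sum((p - t) ** 2 for p, t in zip(preds, targets)) / n
        grad = sum(2 * (p - t) * f for p, t, f in zip(preds, targets, features)) / n
        return loss, grad


def train(model, features, targets, lr=0.1, steps=50):
    """Repeatedly predict, compute the loss value, and adjust parameters."""
    loss = None
    for _ in range(steps):
        loss, grad = model.loss_and_grad(features, targets)
        model.w -= lr * grad  # parameter adjustment using the loss gradient
    return loss
```

Running `train(MatchPredictor(), [1.0, 2.0], [2.0, 4.0])` drives the weight toward 2, mirroring how the real model's parameters are adjusted until predicted matching values track the actual ones.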

Constructing the sample matching data from the sample image and the map data includes: obtaining several image points from the sample image and several map points from the map data to form several groups of point pairs, where the groups include at least one matching point pair whose image point and map point match each other; and, for each matching point pair, projecting the map point into the dimension of the sample image using the pose parameters of the sample image to obtain a projected point, and determining the actual matching value of the matching point pair from the difference between the image point and the projected point.

Therefore, several image points are obtained from the sample image and several map points from the map data to form several groups of point pairs, at least one of which is a matching point pair whose image point and map point match, so samples for training the matching prediction model can be generated. For each matching point pair, the map point is projected into the dimension of the sample image using the pose parameters of the sample image, and the actual matching value is determined from the difference between the image point and the projected point, so the matching prediction model can learn the geometric characteristics of matching point pairs during training, which helps improve its accuracy.
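The projection step can be illustrated as below. The patent does not fix a camera model; a standard pinhole projection with intrinsics `K` and pose `(R, t)` is assumed here, and the function names are ours.

```python
import numpy as np

def project_map_point(X, R, t, K):
    """Project a 3-D map point X into the image using pose (R, t) and
    intrinsics K (pinhole model, an assumption on our part)."""
    x_cam = R @ X + t            # world -> camera coordinates
    x_hom = K @ x_cam            # camera -> pixel (homogeneous)
    return x_hom[:2] / x_hom[2]  # perspective division

def reprojection_difference(image_point, X, R, t, K):
    """Difference between an image point and the projected map point,
    from which the actual matching value is derived."""
    return float(np.linalg.norm(project_map_point(X, R, t, K) - image_point))
```

With an identity pose and a map point on the optical axis, the projection lands at the principal point, and the difference to any image point is just the pixel distance.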

The several groups of point pairs include at least one non-matching point pair whose image point and map point do not match, and constructing the sample matching data from the sample image and the map data further includes: setting the actual matching value of each non-matching point pair to a preset value.

Therefore, the several groups of point pairs include at least one non-matching point pair whose image point and map point do not match, and, unlike matching point pairs, the actual matching value of a non-matching point pair is set to a preset value, which helps improve the robustness of the matching prediction model.

Obtaining several image points from the sample image and several map points from the map data to form several groups of point pairs includes: dividing the image points in the sample image into first image points and second image points, where a first image point has a matching map point in the map data and a second image point does not; for each first image point, assigning several first map points from the map data and forming a first point pair from the first image point and each first map point, where the first map points include the map point that matches the first image point; for each second image point, assigning several second map points from the map data and forming a second point pair from the second image point and each second map point; and extracting several groups of point pairs from the first point pairs and the second point pairs.

Therefore, the image points in the sample image are divided into first image points, which have a matching map point in the map data, and second image points, which do not. Each first image point is assigned several first map points, one of which is its true match, and forms a first point pair with each of them; each second image point is assigned several second map points and forms a second point pair with each of them. Several groups of point pairs are then extracted from the first and second point pairs, yielding an abundant set of point pairs containing both matching and non-matching pairs for training the matching prediction model, which helps improve its accuracy.
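A minimal sketch of this pair-construction scheme, with illustrative names and a simple random assignment of distractor map points (the patent leaves the assignment strategy open):

```python
import random

def build_point_pairs(matched, unmatched_image_points, map_points,
                      per_point=4, seed=0):
    """Build (image_point, map_point, is_match) triples.
    `matched` maps each 'first' image point to its true map point; it is
    paired with that match plus random distractors.  Each 'second'
    (unmatched) image point is paired with random map points only."""
    rng = random.Random(seed)
    pairs = []
    for img_pt, true_map_pt in matched.items():
        distractors = rng.sample(
            [m for m in map_points if m != true_map_pt], per_point - 1)
        for m in [true_map_pt] + distractors:
            pairs.append((img_pt, m, m == true_map_pt))
    for img_pt in unmatched_image_points:
        for m in rng.sample(map_points, per_point):
            pairs.append((img_pt, m, False))  # no true match exists
    rng.shuffle(pairs)
    return pairs
```

"Extracting several groups of point pairs" would then amount to drawing a subset of the returned list for each training batch.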

Projecting the map point into the dimension of the sample image using the pose parameters of the sample image to obtain its projected point includes: calculating the pose parameters of the sample image based on the matching point pairs, and then projecting the map point into the dimension of the sample image using those pose parameters.

Therefore, the pose parameters of the sample image are calculated from the matching point pairs and used to project the map points into the dimension of the sample image, which helps improve the accuracy of the difference between projected points and image points, and in turn the accuracy of the matching prediction model.

Determining the actual matching value of a matching point pair from the difference between the image point and the projected point includes: converting the difference into a probability density value with a preset probability distribution function and using that value as the actual matching value of the matching point pair.

Therefore, converting the difference into a probability density value with a preset probability distribution function, and using it as the actual matching value, describes the difference between the projected point and the image point accurately, which helps improve the accuracy of the matching prediction model.
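As an example of such a conversion, a zero-mean Gaussian density over the reprojection difference could be used; the Gaussian choice and the `sigma` parameter are assumptions on our part, since the patent only specifies "a preset probability distribution function":

```python
import math

def match_value(reproj_diff, sigma=1.0):
    """Convert a reprojection difference into a matching value via a
    zero-mean Gaussian probability density (an assumed choice of the
    'preset probability distribution function')."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-reproj_diff ** 2 / (2.0 * sigma ** 2))
```

A perfect reprojection (difference 0) receives the highest matching value, and the value decays monotonically as the difference grows, which is exactly the behavior the training target needs.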

The sample matching data form a bipartite graph that includes the several groups of point pairs and the connecting edges joining each pair, each connecting edge being labeled with the actual matching value of its point pair. The matching prediction model includes a first point-feature extraction sub-model corresponding to the dimension of the sample image, a second point-feature extraction sub-model corresponding to the dimension of the map data, and an edge-feature extraction sub-model. Predicting the several groups of point pairs with the matching prediction model includes: extracting features from the bipartite graph with the first and second point-feature extraction sub-models to obtain a first feature and a second feature; extracting features from the first and second features with the edge-feature extraction sub-model to obtain a third feature; and obtaining the predicted matching value of the point pair corresponding to each connecting edge from the third feature.

Therefore, extracting point features and edge features from the bipartite graph separately allows the matching prediction model to perceive the spatial geometric structure of the matches more effectively, which helps improve its accuracy.

The structure of the first and second point-feature extraction sub-models is either of the following: at least one residual block, or at least one residual block and at least one spatial transformer network; and/or the edge-feature extraction sub-model includes at least one residual block.

Therefore, building the first and second point-feature extraction sub-models from at least one residual block, optionally combined with at least one spatial transformer network, and building the edge-feature extraction sub-model from at least one residual block, facilitates optimization of the matching prediction model and improves its accuracy.
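A minimal numerical sketch of a residual block and of edge-feature extraction over concatenated point features, in plain NumPy. The dimensions, initialization, and the concatenation scheme are all illustrative assumptions rather than the patent's architecture:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class ResidualBlock:
    """Minimal residual block: out = x + W2 @ relu(W1 @ x).
    The skip connection is what makes the model easier to optimize."""
    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.standard_normal((dim, dim)) * 0.1
        self.W2 = rng.standard_normal((dim, dim)) * 0.1

    def __call__(self, x):
        return x + self.W2 @ relu(self.W1 @ x)

def edge_feature(point_feat_a, point_feat_b, blocks):
    """Concatenate the two point features of a connecting edge and pass
    the result through a stack of residual blocks (the 'third feature')."""
    h = np.concatenate([point_feat_a, point_feat_b])
    for blk in blocks:
        h = blk(h)
    return h
```

In the real model, a final layer would map this edge feature to a scalar predicted matching value; here we only show the feature path.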

The several groups of point pairs include at least one matching point pair whose image point and map point match and at least one non-matching point pair whose image point and map point do not match. Determining the loss value of the matching prediction model from the actual and predicted matching values includes: determining a first loss value of the matching prediction model from the predicted and actual matching values of the matching point pairs; determining a second loss value from the predicted and actual matching values of the non-matching point pairs; and weighting the first and second loss values to obtain the loss value of the matching prediction model.

Therefore, a first loss value is determined from the predicted and actual matching values of the matching point pairs, a second loss value from the predicted and actual matching values of the non-matching point pairs, and the two are weighted to obtain the loss value of the matching prediction model, which helps the model effectively perceive the spatial geometric structure of the matches and improves its accuracy.

Before determining the first loss value from the predicted and actual matching values of the matching point pairs, the method further includes: counting a first number of matching point pairs and a second number of non-matching point pairs. Determining the first loss value then includes using the differences between the predicted and actual matching values of the matching point pairs together with the first number; determining the second loss value includes using the differences between the predicted and actual matching values of the non-matching point pairs together with the second number.

Therefore, counting the first number of matching point pairs and the second number of non-matching point pairs, and combining each count with the corresponding differences between predicted and actual matching values to determine the first and second loss values, helps improve the accuracy of the loss value and hence the accuracy of the matching prediction model.
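One plausible reading of this count-normalized, weighted loss is sketched below; the squared-error form and the equal 0.5/0.5 weights are our assumptions, not values fixed by the patent:

```python
def weighted_matching_loss(preds, targets, is_match, w_match=0.5, w_nonmatch=0.5):
    """First loss over matching pairs and second loss over non-matching
    pairs, each normalized by its own count, then combined with weights."""
    n1 = sum(is_match)                 # first number: matching pairs
    n2 = len(is_match) - n1            # second number: non-matching pairs
    l1 = sum((p - t) ** 2
             for p, t, m in zip(preds, targets, is_match) if m) / max(n1, 1)
    l2 = sum((p - t) ** 2
             for p, t, m in zip(preds, targets, is_match) if not m) / max(n2, 1)
    return w_match * l1 + w_nonmatch * l2
```

Normalizing each term by its own count keeps the usually far more numerous non-matching pairs from dominating the total loss.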

The dimension of the sample image is 2-D or 3-D, and the dimension of the map data is 2-D or 3-D.

Therefore, by setting the dimensions of the sample image and the map data, a matching prediction model can be trained for 2-D to 2-D, 2-D to 3-D, or 3-D to 3-D matching, which broadens the applicable scope of the matching prediction model.

A second aspect of the present invention provides a visual positioning method, including: constructing matching data to be identified from the image to be positioned and map data, where the matching data to be identified include several groups of point pairs whose two points come from the image to be positioned and the map data respectively; predicting the several groups of point pairs with the matching prediction model to obtain predicted matching values for the point pairs; and determining the pose parameters of the photographic device of the image to be positioned based on the predicted matching values of the point pairs.

Therefore, matching data to be identified are constructed from the image to be positioned and the map data, containing several groups of point pairs whose two points come from the image and the map data respectively. The matching prediction model predicts matching values for these point pairs, and the pose parameters of the photographic device of the image to be positioned are determined from the predicted matching values, improving the accuracy and real-time performance of visual positioning.

Determining the pose parameters of the photographic device of the image to be positioned based on the predicted matching values includes: sorting the several groups of point pairs by predicted matching value from high to low, and determining the pose parameters using a preset number of the top-ranked point pairs.

Therefore, sorting the point pairs by predicted matching value from high to low and determining the pose parameters from a preset number of the top-ranked pairs enables incremental sampling over the sorted pairs, with high-matching-value pairs sampled first. The geometric prior thus guides the solution of the pose parameters, improving the accuracy and real-time performance of visual positioning.
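The ranking step can be sketched as follows; the top-ranked pairs would then feed a pose solver such as PnP inside an incremental sampling loop (the solver itself is outside this sketch, and the function name is ours):

```python
def prioritized_pairs(pairs, predicted_values, top_k):
    """Sort point pairs by predicted matching value, high to low, and keep
    the top_k for pose estimation, so that high-confidence pairs are
    sampled before low-confidence ones."""
    order = sorted(range(len(pairs)),
                   key=lambda i: predicted_values[i], reverse=True)
    return [pairs[i] for i in order[:top_k]]
```

Compared with RANSAC's uniform sampling, drawing candidates from this ordered prefix concentrates the hypotheses on likely inliers, which is what yields the claimed speed and accuracy gains.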

The matching prediction model is obtained with the training method for a matching prediction model of the first aspect above.

Therefore, performing visual positioning with a matching prediction model obtained by the above training method improves the accuracy and real-time performance of visual positioning.

A third aspect of the present invention provides a training device for a matching prediction model, including a sample construction module, a prediction processing module, a loss determination module and a parameter adjustment module. The sample construction module constructs sample matching data from a sample image and map data, where the sample matching data include several groups of point pairs and the actual matching value of each group, with the two points of each pair drawn from the sample image and the map data respectively; the prediction processing module predicts the several groups of point pairs with the matching prediction model to obtain predicted matching values; the loss determination module determines the loss value of the matching prediction model from the actual and predicted matching values; and the parameter adjustment module adjusts the parameters of the matching prediction model using the loss value.

A fourth aspect of the present invention provides a visual positioning device, including a data construction module, a prediction processing module and a parameter determination module. The data construction module constructs matching data to be identified from the image to be positioned and map data, where the matching data include several groups of point pairs whose two points come from the image to be positioned and the map data respectively; the prediction processing module predicts the several groups of point pairs with the matching prediction model to obtain predicted matching values; and the parameter determination module determines the pose parameters of the photographic device of the image to be positioned based on the predicted matching values of the point pairs.

A fifth aspect of the present invention provides an electronic device, including a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory to implement the training method for a matching prediction model of the first aspect or the visual positioning method of the second aspect.

A sixth aspect of the present invention provides a computer-readable storage medium storing program instructions that, when executed by a processor, implement the training method for a matching prediction model of the first aspect or the visual positioning method of the second aspect.

A seventh aspect of the present invention provides a computer program including computer-readable code that, when run in an electronic device and executed by a processor of the electronic device, implements the training method for a matching prediction model of the first aspect or the visual positioning method of the second aspect.

上述方案，能夠利用匹配預測模型建立匹配關係，從而能夠在視覺定位中利用匹配預測模型預測點對之間的匹配值，因而能夠基於預測得到的匹配值優先採樣高匹配值的點對，而建立匹配關係，進而能夠有利於提高視覺定位的準確性和即時性。With the above solution, a matching prediction model can be used to establish the matching relationship, so that during visual positioning the matching prediction model can predict the matching values between point pairs; point pairs with high predicted matching values can therefore be sampled preferentially to establish the matching relationship, which helps to improve the accuracy and real-time performance of visual positioning.
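As a minimal illustration of the preferential sampling described above, the sketch below ranks candidate point pairs by their predicted matching value and draws the highest-scoring pairs first when forming a pose hypothesis. The data layout and function names are illustrative assumptions, not part of the patent:

```python
def priority_sample(point_pairs, predicted_scores, num_samples):
    """Draw point pairs for one pose hypothesis, preferring pairs whose
    predicted matching value is high. `point_pairs` is a list of
    (image_point, map_point) tuples; `predicted_scores` holds one value
    in [0, 1] per pair (both layouts are assumptions)."""
    # Rank pair indices by predicted matching value, highest first.
    order = sorted(range(len(point_pairs)),
                   key=lambda i: predicted_scores[i], reverse=True)
    # Take pairs from the top of the ranking; a real system might mix in
    # randomness (PROSAC-style progressive sampling) to avoid degeneracy.
    return [point_pairs[i] for i in order[:num_samples]]

pairs = [("img_a", "map_1"), ("img_b", "map_2"), ("img_c", "map_3")]
scores = [0.2, 0.95, 0.6]
print(priority_sample(pairs, scores, 2))
# [('img_b', 'map_2'), ('img_c', 'map_3')]
```

In practice such a sampler would feed a RANSAC-style pose solver, so that fewer hypotheses are needed before a good pose is found.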

下面結合說明書附圖,對本發明實施例的方案進行詳細說明。The solutions of the embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

以下描述中,為了說明而不是為了限定,提出了諸如特定系統結構、介面、技術之類的具體細節,以便透徹理解本發明。In the following description, for purposes of illustration and not limitation, specific details such as specific system structures, interfaces, techniques, etc. are set forth in order to provide a thorough understanding of the present invention.

本文中術語“系統”和“網路”在本文中常被可互換使用。本文中術語“和/或”，僅僅是一種描述關聯物件的關聯關係，表示可以存在三種關係，例如，A和/或B，可以表示：單獨存在A，同時存在A和B，單獨存在B這三種情況。另外，本文中字元“/”，一般表示前後關聯物件是一種“或”的關係。此外，本文中的“多”表示兩個或者多於兩個。The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between related objects and indicates that three relationships may exist; for example, A and/or B may mean three situations: A exists alone, both A and B exist, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it. Furthermore, "multiple" herein means two or more than two.

請參閱圖1,圖1是本發明匹配預測模型的訓練方法一實施例的流程示意圖。匹配預測模型的訓練方法可以包括如下步驟。Please refer to FIG. 1. FIG. 1 is a schematic flowchart of an embodiment of a training method for a matching prediction model of the present invention. The training method of the matching prediction model may include the following steps.

步驟S11:利用樣本圖像和地圖資料,構建樣本匹配資料。Step S11: Using the sample image and map data, construct sample matching data.

在本發明實施例中,樣本匹配資料包括若干組點對以及每組點對的實際匹配值,每組點對的兩個點分別來自樣本圖像和地圖資料。In the embodiment of the present invention, the sample matching data includes several groups of point pairs and actual matching values of each group of point pairs, and the two points of each group of point pairs come from sample images and map data respectively.

在一個實施場景中，地圖資料可以是通過樣本圖像而構建得到。其中，樣本圖像所屬的維度可以為2維或3維，地圖資料所屬的維度可以為2維或3維，在此不做限定。例如，樣本圖像為二維圖像，則可以通過諸如SFM(Structure From Motion)等三維重建方式對二維圖像進行處理，得到諸如稀疏點雲模型的地圖資料，此外，樣本圖像還可以包括三維資訊，例如，樣本圖像還可以為RGB-D圖像(即色彩圖像與深度圖像)，在此不做限定。地圖資料可以是由單純的二維圖像組成，也可以是由三維點雲地圖組成，也可以是二維圖像和三維點雲的結合，在此不做限定。In one implementation scenario, the map data may be constructed from the sample images. The dimension of the sample image may be two or three, and the dimension of the map data may be two or three, which is not limited here. For example, if the sample image is a two-dimensional image, it can be processed by a three-dimensional reconstruction method such as SFM (Structure From Motion) to obtain map data such as a sparse point cloud model. In addition, the sample image may also include three-dimensional information; for example, the sample image may be an RGB-D image (i.e., a color image plus a depth image), which is not limited here. The map data may consist of two-dimensional images only, of a three-dimensional point cloud map, or of a combination of two-dimensional images and a three-dimensional point cloud, which is not limited here.

在本發明實施例中，匹配預測模型的訓練方法的執行主體可以是匹配預測模型的訓練裝置，下文中描述為訓練裝置；例如，匹配預測模型的訓練方法可以由終端設備或伺服器或其它處理設備執行，其中，終端設備可以為使用者設備(User Equipment,UE)、移動設備、使用者終端、終端、蜂窩電話、無線電話、個人數位助理(Personal Digital Assistant,PDA)、手持設備、計算設備、車載設備、可穿戴設備等。在一些可能的實現方式中，該匹配預測模型的訓練方法可以通過處理器調用記憶體中儲存的電腦可讀指令的方式來實現。In this embodiment of the present invention, the training method of the matching prediction model may be executed by a training device for the matching prediction model, referred to hereinafter as the training device; for example, the training method of the matching prediction model may be executed by a terminal device, a server, or another processing device, where the terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, etc. In some possible implementations, the training method of the matching prediction model may be implemented by a processor calling computer-readable instructions stored in a memory.

在一個實施場景中，樣本匹配資料可以為二分圖，二分圖又稱為二部圖，是由點集和邊集所構成的無向圖，且點集可以分為兩個互不相交的子集，邊集中的每條邊所關聯的兩個點分別屬於這兩個互不相交的子集。其中，樣本匹配資料為二分圖時，其包括若干組點對和連接每組點對的連接邊，且連接邊標注有對應點對的實際匹配值，用於描述對應點對的匹配程度，例如，實際匹配值可以為0~1之間的數值；這裡，實際匹配值為0.1時，可以表明對應點對之間匹配程度較低，點對中來自樣本圖像的點與來自地圖資料中的點對應於空間中同一點的概率較低；實際匹配值為0.98時，可以表明對應點對之間的匹配程度較高，點對中來自樣本圖像的點與來自地圖資料中的點對應於空間中同一點的概率較高。請結合參閱圖2，圖2是本發明匹配預測模型的訓練方法一實施例的狀態示意圖，如圖2所示，左側為由二分圖表示的樣本匹配資料，二分圖的上側和下側兩個為互不相交的點集，連接兩個點集中的點為連接邊，連接邊標注有實際匹配值(未圖示)。In one implementation scenario, the sample matching data may be a bipartite graph (also called a bigraph): an undirected graph composed of a vertex set and an edge set, where the vertex set can be divided into two mutually disjoint subsets and the two vertices associated with each edge belong to these two disjoint subsets respectively. When the sample matching data is a bipartite graph, it includes several groups of point pairs and the connecting edges linking each group of point pairs, and each connecting edge is labeled with the actual matching value of the corresponding point pair, describing its degree of matching; for example, the actual matching value may be a number between 0 and 1. Here, an actual matching value of 0.1 indicates a low degree of matching, i.e., a low probability that the point from the sample image and the point from the map data correspond to the same point in space; an actual matching value of 0.98 indicates a high degree of matching, i.e., a high probability that the two points correspond to the same point in space. Please refer to FIG. 2, which is a schematic state diagram of an embodiment of the training method of the matching prediction model of the present invention. As shown in FIG. 2, the left side is the sample matching data represented by a bipartite graph; the upper and lower sides of the bipartite graph are two mutually disjoint point sets, the edges connecting points of the two sets are the connecting edges, and the connecting edges are labeled with actual matching values (not shown).
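A sample matching datum in this bipartite-graph form can be held in a structure like the following sketch; the class name, field names and coordinate formats are illustrative assumptions:

```python
class SampleMatchingData:
    """Bipartite graph: image points and map points form two disjoint
    vertex sets; each edge links one point from each set and carries
    the actual matching value of that point pair."""

    def __init__(self, image_points, map_points):
        self.image_points = image_points  # e.g. 2D keypoint coordinates
        self.map_points = map_points      # e.g. 3D point coordinates
        self.edges = []                   # (image_idx, map_idx, value)

    def add_pair(self, image_idx, map_idx, actual_match_value):
        # actual matching value in [0, 1]: e.g. 0.98 for a pair that is
        # very likely the same physical point, 0.1 for an unlikely one
        self.edges.append((image_idx, map_idx, actual_match_value))

data = SampleMatchingData([(10.0, 20.0), (35.0, 8.0)], [(1.0, 2.0, 3.0)])
data.add_pair(0, 0, 0.98)  # high matching value: likely the same point
data.add_pair(1, 0, 0.10)  # low matching value: likely different points
print(len(data.edges))
# 2
```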

在一個實施場景中，為了提高樣本匹配資料多樣化，訓練裝置還可以對樣本匹配資料進行資料增強。例如，訓練裝置可以將樣本匹配資料中的三維點的座標分別對三個軸進行隨機旋轉；或者，還可以對樣本匹配資料中的三維點進行歸一化處理，在此不做限定。In one implementation scenario, in order to diversify the sample matching data, the training device may further perform data augmentation on the sample matching data. For example, the training device may randomly rotate the coordinates of the three-dimensional points in the sample matching data about each of the three axes; or it may normalize the three-dimensional points in the sample matching data, which is not limited here.
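The augmentation just described — random rotation of the 3D points about the three axes, optionally followed by normalization — might look like this sketch; the zero-mean, unit-radius normalization is one assumed choice, since the text does not fix a specific scheme:

```python
import numpy as np

def rotation_matrix(rx, ry, rz):
    """Rotation about the x, y and z axes by the given angles (radians)."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def augment_points(points_3d, rng):
    """Randomly rotate the 3D points about the three axes, then
    normalize to zero mean and unit maximum radius (an assumed
    normalization scheme)."""
    angles = rng.uniform(0.0, 2.0 * np.pi, size=3)
    rotated = points_3d @ rotation_matrix(*angles).T
    centered = rotated - rotated.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / (scale + 1e-12)

rng = np.random.default_rng(0)
pts = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 2.0], [3.0, 0.0, 4.0]])
aug = augment_points(pts, rng)
print(np.linalg.norm(aug, axis=1).max())
# ≈ 1.0: the farthest point lies on the unit sphere after normalization
```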

步驟S12:利用匹配預測模型對若干組點對進行預測處理,得到點對的預測匹配值。Step S12: Use the matching prediction model to perform prediction processing on several groups of point pairs to obtain the predicted matching values of the point pairs.

請繼續結合參閱圖2，仍以樣本匹配資料以二分圖表示為例，匹配預測模型可以包括與樣本圖像所屬的維度對應的第一點特徵提取子模型、與地圖資料所屬的維度對應的第二點特徵提取子模型，以及邊特徵提取子模型。例如，樣本圖像為二維圖像、地圖資料包括二維圖像時，第一點特徵提取子模型和第二點特徵提取子模型為二維點特徵提取子模型，則訓練得到的匹配預測模型可以用於二維-二維的匹配預測；或者，樣本圖像為三維圖像、地圖資料包括三維點雲時，第一點特徵提取子模型和第二點特徵提取子模型為三維點特徵提取子模型，則訓練得到的匹配預測模型可以用於三維-三維的匹配預測；或者，樣本圖像為二維圖像、地圖資料包括三維點雲時，第一點特徵提取子模型為二維點特徵提取子模型、第二點特徵提取子模型為三維點特徵提取子模型，則訓練得到的匹配預測模型可以用於二維-三維的匹配預測；這裡，對於匹配預測模型可以根據實際應用進行設置，在此不做限定。Please continue to refer to FIG. 2. Still taking sample matching data represented by a bipartite graph as an example, the matching prediction model may include a first point feature extraction sub-model corresponding to the dimension of the sample image, a second point feature extraction sub-model corresponding to the dimension of the map data, and an edge feature extraction sub-model. For example, when the sample image is a two-dimensional image and the map data includes two-dimensional images, the first and second point feature extraction sub-models are both two-dimensional point feature extraction sub-models, and the trained matching prediction model can be used for 2D-2D matching prediction; or, when the sample image is a three-dimensional image and the map data includes a three-dimensional point cloud, the first and second point feature extraction sub-models are both three-dimensional point feature extraction sub-models, and the trained matching prediction model can be used for 3D-3D matching prediction; or, when the sample image is a two-dimensional image and the map data includes a three-dimensional point cloud, the first point feature extraction sub-model is a two-dimensional point feature extraction sub-model and the second point feature extraction sub-model is a three-dimensional point feature extraction sub-model, and the trained matching prediction model can be used for 2D-3D matching prediction. Here, the matching prediction model can be configured according to the actual application, which is not limited here.

在一個實施場景中，訓練裝置可以利用第一點特徵提取子模型和第二點特徵提取子模型對二分圖進行特徵提取，得到第一特徵和第二特徵；再利用邊特徵提取子模型對第一特徵和第二特徵進行特徵提取，得到第三特徵；利用第三特徵，得到連接邊對應點對的預測匹配值，如圖2中二分圖各連接邊對應的預測匹配值所示。In one implementation scenario, the training device may use the first point feature extraction sub-model and the second point feature extraction sub-model to perform feature extraction on the bipartite graph to obtain a first feature and a second feature; then use the edge feature extraction sub-model to perform feature extraction on the first feature and the second feature to obtain a third feature; and use the third feature to obtain the predicted matching value of the point pair corresponding to each connecting edge, as represented in FIG. 2 by the predicted matching values of the connecting edges of the bipartite graph.
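The two-stage extraction above (point features per vertex set, then an edge feature per connecting edge, then a matching value per edge) can be sketched as follows; the toy "sub-models" and the sigmoid readout are stand-ins, not the patent's actual networks:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_matching_values(image_points, map_points, edges,
                            f_img, f_map, f_edge):
    """Pipeline from the description above: (1) the first/second point
    feature extraction sub-models embed the two vertex sets, (2) the
    edge feature extraction sub-model embeds each paired feature,
    (3) each connecting edge gets a matching value in (0, 1)."""
    first_feat = f_img(image_points)   # first features (image side)
    second_feat = f_map(map_points)    # second features (map side)
    scores = []
    for i, j in edges:
        third_feat = f_edge(np.concatenate([first_feat[i], second_feat[j]]))
        scores.append(float(sigmoid(third_feat.sum())))  # toy scalar readout
    return scores

# Stand-in "sub-models": identity feature maps, purely for illustration;
# the real sub-models would be stacks of residual blocks.
f_img = lambda pts: np.asarray(pts, dtype=float)
f_map = lambda pts: np.asarray(pts, dtype=float)
f_edge = lambda feat: feat

scores = predict_matching_values([[0.1, 0.2]], [[0.0, 0.1, 0.2]],
                                 [(0, 0)], f_img, f_map, f_edge)
print(scores)
# one value in (0, 1) per connecting edge
```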

在一個實施場景中，當第一點特徵提取子模型和第二點特徵提取子模型為二維點特徵提取子模型時，可以包括至少一個殘差塊(resblock)，例如，1個殘差塊、2個殘差塊、3個殘差塊等等，每個殘差塊(resblock)由多個基本塊(base block)組成，而每個基本塊(base block)由一層1*1的卷積層、批標準化層(batch normalization)、上下文標準化層(context normalization)組成。當第一點特徵提取子模型和第二點特徵提取子模型為三維點特徵提取子模型時，可以包括至少一個殘差塊(resblock)和至少一個空間變換網路(如，t-net)，例如，1個殘差塊、2個殘差塊、3個殘差塊等等，在此不做限定。空間變換網路可以為1個、2個，這裡，空間變換網路可以位於模型的首尾位置，在此不做限定。殘差塊(resblock)的結構可以參考前述實施場景中的結構，在此不再贅述。邊特徵提取子模型可以包括至少一個殘差塊，例如，1個殘差塊、2個殘差塊、3個殘差塊等等，在此不做限定，殘差塊(resblock)的結構可以參考前述實施場景中的結構，在此不再贅述。In one implementation scenario, when the first point feature extraction sub-model and the second point feature extraction sub-model are two-dimensional point feature extraction sub-models, each may include at least one residual block (resblock), for example, one residual block, two residual blocks, three residual blocks, and so on; each residual block consists of multiple basic blocks (base blocks), and each basic block consists of a 1*1 convolutional layer, a batch normalization layer, and a context normalization layer. When the first point feature extraction sub-model and the second point feature extraction sub-model are three-dimensional point feature extraction sub-models, each may include at least one residual block and at least one spatial transformation network (e.g., a t-net), for example, one residual block, two residual blocks, three residual blocks, and so on, which is not limited here. There may be one or two spatial transformation networks, and they may be located at the beginning and the end of the model, which is not limited here. For the structure of the residual block, reference may be made to the structure in the foregoing implementation scenario, and details are not repeated here. The edge feature extraction sub-model may include at least one residual block, for example, one residual block, two residual blocks, three residual blocks, and so on, which is not limited here; for the structure of the residual block, reference may likewise be made to the structure in the foregoing implementation scenario, and details are not repeated here.
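A basic block of the kind described — a 1*1 convolution, which for a set of point pairs is simply a linear map shared by every pair, followed by normalization — might be sketched like this. Batch normalization is folded into the same per-channel statistic for brevity, and the ReLU activation is an assumption not stated in the text:

```python
import numpy as np

def context_normalization(x, eps=1e-6):
    """Normalize each feature channel across the set of point pairs
    (zero mean, unit variance over the correspondence dimension).
    x has shape (num_pairs, num_channels)."""
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True)
    return (x - mean) / (std + eps)

def base_block(x, weight):
    """One basic block: a 1x1 convolution (a per-pair shared linear
    map) followed by normalization across the correspondences."""
    y = x @ weight                # 1x1 conv == shared per-pair linear map
    y = context_normalization(y)  # normalize across the set of pairs
    return np.maximum(y, 0.0)     # ReLU (assumed activation)

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 4))   # 8 point pairs, 4 feature channels
w = rng.standard_normal((4, 4))
out = base_block(x, w)
print(out.shape)
# (8, 4)
```

Normalizing across the correspondence dimension lets each pair's feature reflect the whole set of candidate matches, which is what distinguishes context normalization from per-sample normalizations.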

步驟S13:利用實際匹配值和預測匹配值,確定匹配預測模型的損失值。Step S13: Determine the loss value of the matching prediction model by using the actual matching value and the predicted matching value.

在一個實施場景中，訓練裝置可以統計實際匹配值和預測匹配值之間的差異，從而確定匹配預測模型的損失值。這裡，訓練裝置可以統計所有點對的預測匹配值和其實際匹配值之間差值的總和，再利用該總和和所有點對的數量，求取所有點對的差值的均值，作為匹配預測模型的損失值。In one implementation scenario, the training device may measure the differences between the actual matching values and the predicted matching values to determine the loss value of the matching prediction model. Here, the training device may sum the differences between the predicted matching value and the actual matching value over all point pairs, and then use this sum together with the number of point pairs to obtain the mean difference over all point pairs as the loss value of the matching prediction model.

在另一個實施場景中，若干組點對可以包括至少一組所包含的圖像點和地圖點之間匹配的匹配點對，即匹配點對所包含的圖像點與地圖點為空間中的同一點；還可以包括至少一組所包含的圖像點和地圖點之間不匹配的非匹配點對，即非匹配點對所包含的圖像點與地圖點為空間中的不同點。則訓練裝置可以利用匹配點對的預測匹配值ŷ和實際匹配值y，確定匹配預測模型的第一損失值L1，並利用非匹配點對的預測匹配值ŷ和實際匹配值y，確定匹配預測模型的第二損失值L2，從而通過對第一損失值L1和第二損失值L2進行加權處理，得到匹配預測模型的損失值L，參見公式(1)：

L = α·L1 + β·L2 ……(1)

上述公式(1)中，L表示匹配預測模型的損失值，L1表示匹配點對所對應的第一損失值，L2表示非匹配點對所對應的第二損失值，α和β分別表示第一損失值L1的權重和第二損失值L2的權重。In another implementation scenario, the several groups of point pairs may include at least one matching point pair whose image point and map point match, i.e., are the same point in space, and may further include at least one non-matching point pair whose image point and map point do not match, i.e., are different points in space. The training device may then use the predicted matching value ŷ and the actual matching value y of the matching point pairs to determine a first loss value L1 of the matching prediction model, and use the predicted matching value ŷ and the actual matching value y of the non-matching point pairs to determine a second loss value L2, so that the loss value L of the matching prediction model is obtained by weighting the first loss value L1 and the second loss value L2, see formula (1): L = α·L1 + β·L2 ……(1), where L denotes the loss value of the matching prediction model, L1 the first loss value corresponding to the matching point pairs, L2 the second loss value corresponding to the non-matching point pairs, and α and β the weights of L1 and L2 respectively.

在一個實施場景中，訓練裝置還可以分別統計匹配點對的第一數量N1和非匹配點對的第二數量N2，從而可以利用匹配點對的預測匹配值和實際匹配值之間的差值，以及第一數量，確定第一損失值，參見公式(2)：

L1 = (1/N1) Σ_i |y_i − ŷ_i| ……(2)

上述公式(2)中，L1表示第一損失值，N1表示第一數量，y_i和ŷ_i分別表示第i組匹配點對的實際匹配值和預測匹配值。In one implementation scenario, the training device may also count the first number N1 of matching point pairs and the second number N2 of non-matching point pairs, so that the first loss value can be determined by using the differences between the predicted matching values and the actual matching values of the matching point pairs together with the first number, see formula (2): L1 = (1/N1) Σ_i |y_i − ŷ_i| ……(2), where L1 denotes the first loss value, N1 the first number, and y_i and ŷ_i the actual matching value and the predicted matching value of the i-th matching point pair respectively.

訓練裝置還可以利用非匹配點對的預測匹配值和實際匹配值之間的差值，以及第二數量，確定第二損失值，參見公式(3)：

L2 = (1/N2) Σ_j |y_j − ŷ_j| ……(3)

上述公式(3)中，L2表示第二損失值，N2表示第二數量，y_j和ŷ_j分別表示第j組非匹配點對的實際匹配值和預測匹配值；此外，非匹配點對的實際匹配值y_j還可以統一設置為一預設數值(例如，0)。The training device may also determine the second loss value by using the differences between the predicted matching values and the actual matching values of the non-matching point pairs together with the second number, see formula (3): L2 = (1/N2) Σ_j |y_j − ŷ_j| ……(3), where L2 denotes the second loss value, N2 the second number, and y_j and ŷ_j the actual matching value and the predicted matching value of the j-th non-matching point pair respectively; in addition, the actual matching values y_j of the non-matching point pairs may be uniformly set to a preset value (for example, 0).
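Formulas (1)-(3) can be combined into a single routine. The sketch below uses the mean absolute difference and equal default weights; both are assumptions, since the text only specifies "difference" and "weighting":

```python
import numpy as np

def matching_loss(pred, actual, is_match, alpha=1.0, beta=1.0):
    """Loss from formulas (1)-(3): L = alpha*L1 + beta*L2, where L1 is
    the mean difference over the N1 matching pairs and L2 the mean
    difference over the N2 non-matching pairs."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    is_match = np.asarray(is_match, dtype=bool)
    # Guard against an empty subset so the mean is well defined.
    l1 = np.abs(pred[is_match] - actual[is_match]).mean() if is_match.any() else 0.0
    l2 = np.abs(pred[~is_match] - actual[~is_match]).mean() if (~is_match).any() else 0.0
    return alpha * l1 + beta * l2

pred = [0.9, 0.2, 0.4]
actual = [1.0, 0.0, 0.0]   # non-matching pairs set to the preset value 0
is_match = [True, False, False]
print(matching_loss(pred, actual, is_match))
# ≈ 0.4 (L1 = 0.1 over one matching pair, L2 = 0.3 over two non-matching)
```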

步驟S14:利用損失值,調整匹配預測模型的參數。Step S14: Using the loss value, adjust the parameters of the matching prediction model.

在本發明實施例中，訓練裝置可以採用隨機梯度下降(Stochastic Gradient Descent,SGD)、批量梯度下降(Batch Gradient Descent,BGD)、小批量梯度下降(Mini-Batch Gradient Descent,MBGD)等方式，利用損失值對匹配預測模型的參數進行調整；其中，批量梯度下降是指在每一次反覆運算時，使用所有樣本來進行參數更新；隨機梯度下降是指在每一次反覆運算時，使用一個樣本來進行參數更新；小批量梯度下降是指在每一次反覆運算時，使用一批樣本來進行參數更新，在此不再贅述。In the embodiment of the present invention, the training device may adjust the parameters of the matching prediction model with the loss value by means such as stochastic gradient descent (SGD), batch gradient descent (BGD), or mini-batch gradient descent (MBGD). Batch gradient descent means that all samples are used for the parameter update in each iteration; stochastic gradient descent means that a single sample is used in each iteration; mini-batch gradient descent means that a batch of samples is used in each iteration, which will not be detailed here.

在一個實施場景中,還可以設置一訓練結束條件,當滿足訓練結束條件時,訓練裝置可以結束對匹配預測模型的訓練。其中,訓練結束條件可以包括:損失值小於一預設損失閾值,且損失值不再減小;當前訓練次數達到預設次數閾值(例如,500次、1000次等),在此不做限定。In an implementation scenario, a training end condition may also be set, and when the training end condition is satisfied, the training device may end the training of the matching prediction model. The training end conditions may include: the loss value is less than a preset loss threshold, and the loss value is no longer reduced; the current training times reaches a preset times threshold (for example, 500 times, 1000 times, etc.), which is not limited here.
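A schematic mini-batch gradient descent loop with both stopping rules just described (loss below a preset threshold, or a preset number of iterations reached) could look like this; the one-parameter linear model merely stands in for the matching prediction model, and all hyperparameter values are assumptions:

```python
import numpy as np

def train(samples, targets, lr=0.01, batch_size=2,
          max_iters=1000, loss_threshold=1e-6):
    """Mini-batch gradient descent with the training end conditions
    described above: stop once the mini-batch loss drops below a preset
    threshold, or once a preset number of iterations is reached."""
    rng = np.random.default_rng(0)
    w = 0.0
    for it in range(max_iters):
        idx = rng.choice(len(samples), size=batch_size, replace=False)
        x_b, y_b = samples[idx], targets[idx]
        pred = w * x_b
        loss = np.mean((pred - y_b) ** 2)
        if loss < loss_threshold:                 # training end condition
            return w, it
        grad = np.mean(2.0 * (pred - y_b) * x_b)  # d(loss)/dw
        w -= lr * grad                            # mini-batch update
    return w, max_iters

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x   # the ground-truth parameter is 3
w, iters = train(x, y)
print(round(w, 2))
# 3.0
```

Swapping `batch_size` for 1 or for `len(samples)` turns the same loop into SGD or BGD respectively.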

上述方案，通過利用樣本圖像和地圖資料構建得到樣本匹配資料，且樣本匹配資料包括若干組點對以及每組點對的實際匹配值，每組點對的兩個點分別來自樣本圖像和地圖資料，從而利用匹配預測模型對若干組點對進行預測處理，得到點對的預測匹配值，進而利用實際匹配值和預測匹配值，確定匹配預測模型的損失值，利用損失值來對匹配預測模型的參數進行調整，故能夠利用匹配預測模型建立匹配關係，從而能夠在視覺定位中利用匹配預測模型預測點對之間的匹配值，因而能夠基於預測得到的匹配值優先採樣高匹配值的點對，進而能夠有利於提高視覺定位的準確性和即時性。In the above solution, sample matching data is constructed by using sample images and map data; the sample matching data includes several groups of point pairs and the actual matching value of each group, and the two points of each group of point pairs come from the sample image and the map data respectively. The matching prediction model is then used to perform prediction processing on the several groups of point pairs to obtain the predicted matching values of the point pairs, the actual matching values and the predicted matching values are used to determine the loss value of the matching prediction model, and the loss value is used to adjust the parameters of the matching prediction model. The matching prediction model can therefore be used to establish the matching relationship, so that in visual positioning it can predict the matching values between point pairs; point pairs with high predicted matching values can thus be sampled preferentially, which helps to improve the accuracy and real-time performance of visual positioning.

請參閱圖3,圖3是圖1中步驟S11一實施例的流程示意圖。其中,訓練裝置可以通過如下步驟構建得到樣本匹配資料。Please refer to FIG. 3 , which is a schematic flowchart of an embodiment of step S11 in FIG. 1 . The training device can construct the sample matching data through the following steps.

步驟S111:從樣本圖像中獲取若干圖像點,以及從地圖資料中獲取若干地圖點,以組成若干組點對。Step S111: Acquire several image points from the sample image, and acquire several map points from the map data to form several groups of point pairs.

若干組點對包括至少一組所包含的圖像點和地圖點之間匹配的匹配點對；也就是說，若干組點對中至少包含一組所包含的圖像點和地圖點對應於空間中同一點的匹配點對。以樣本圖像為二維圖像，地圖資料為通過SFM重建方式得到的稀疏點雲模型為例，若干組點對中至少包含1個三角化的點以及該三角化的點對應於稀疏點雲模型中的三維點。The several groups of point pairs include at least one matching point pair whose image point and map point match; that is, the several groups of point pairs contain at least one pair whose image point and map point correspond to the same point in space. Taking the sample image being a two-dimensional image and the map data being a sparse point cloud model obtained by SFM reconstruction as an example, the several groups of point pairs contain at least one triangulated point together with the three-dimensional point in the sparse point cloud model to which that triangulated point corresponds.

在一個實施場景中，若干組點對中還可以包括至少一組所包含的圖像點與地圖點之間不匹配的非匹配點對，也就是說，若干組點對中還可以包括至少一組所包含的圖像點和地圖點對應於空間中不同點的非匹配點對。仍以樣本圖像為二維圖像，地圖資料為通過SFM重建方式得到的稀疏點雲模型為例，若干組點對中還可以包括未三角化的點以及稀疏點雲模型中的任一點，以構成一組非匹配點對，從而能夠在樣本匹配資料中加入雜訊，進而能夠提高匹配預測模型的魯棒性。In one implementation scenario, the several groups of point pairs may further include at least one non-matching point pair whose image point and map point do not match, that is, at least one pair whose image point and map point correspond to different points in space. Still taking the sample image being a two-dimensional image and the map data being a sparse point cloud model obtained by SFM reconstruction as an example, the several groups of point pairs may also include an untriangulated point paired with any point in the sparse point cloud model to form a non-matching point pair, so that noise can be added to the sample matching data, which in turn improves the robustness of the matching prediction model.

在一個實施場景中,請結合參閱圖4,圖4是圖3中步驟S111一實施例的流程示意圖。其中,訓練裝置可以通過如下步驟,得到若干組點對。In an implementation scenario, please refer to FIG. 4 , which is a schematic flowchart of an embodiment of step S111 in FIG. 3 . The training device can obtain several sets of point pairs through the following steps.

步驟S41:將樣本圖像中的圖像點劃分為第一圖像點和第二圖像點。Step S41: Divide the image points in the sample image into a first image point and a second image point.

在本發明實施例中，第一圖像點在地圖資料中存在與其匹配的地圖點，第二圖像點在地圖資料中不存在與其匹配的地圖點。仍以樣本圖像為二維圖像，地圖資料為通過SFM重建方式得到的稀疏點雲模型為例，第一圖像點可以為樣本圖像中三角化的特徵點，第二圖像點可以為樣本圖像中未三角化的特徵點；在其他應用場景中，可以以此類推，在此不做限定。In this embodiment of the present invention, a first image point has a matching map point in the map data, while a second image point has no matching map point in the map data. Still taking the sample image being a two-dimensional image and the map data being a sparse point cloud model obtained by SFM reconstruction as an example, the first image points may be the triangulated feature points in the sample image and the second image points may be the untriangulated feature points in the sample image; other application scenarios can be deduced by analogy, which is not limited here.

在一個實施場景中,樣本圖像中的圖像點為樣本圖像的特徵點。在另一個實施場景中,還可以將特徵點的座標轉換到歸一化平面上。In one implementation scenario, the image points in the sample image are feature points of the sample image. In another implementation scenario, the coordinates of the feature points may also be converted to a normalized plane.

步驟S42：對於每一第一圖像點，從地圖資料中分配若干第一地圖點，並分別將第一圖像點與每一第一地圖點作為一第一點對，其中，第一地圖點中包含與第一圖像點匹配的地圖點。Step S42: For each first image point, assign several first map points from the map data, and take the first image point together with each first map point as a first point pair, wherein the first map points include the map point that matches the first image point.

對於每一第一圖像點，從地圖資料中分配若干第一地圖點，並分別將第一圖像點與每一第一地圖點作為一第一點對，且第一地圖點中包含與第一圖像點匹配的地圖點。在一個實施場景中，對於每一第一圖像點分配的第一地圖點的數量可以相同，也可以不同。在另一個實施場景中，在分配第一地圖點之前，還可以從劃分得到的第一圖像點中隨機抽取若干第一圖像點，並對抽取得到的第一圖像點，執行從地圖資料中分配若干第一地圖點，並分別將第一圖像點與每一第一地圖點作為一第一點對的步驟，在此不做限定。在一個實施場景中，可以從劃分得到的第一圖像點中隨機抽取N個點，並對抽取得到的N個第一圖像點中的每一個第一圖像點，從地圖資料中隨機分配K個第一地圖點，且隨機分配的K個第一地圖點中包含與第一圖像點匹配的地圖點。For each first image point, several first map points are allocated from the map data, and the first image point together with each first map point is taken as a first point pair, where the first map points include the map point that matches the first image point. In one implementation scenario, the number of first map points allocated to each first image point may be the same or different. In another implementation scenario, before the first map points are allocated, several first image points may be randomly sampled from the first image points obtained by the division, and the step of allocating several first map points from the map data and taking the first image point together with each first map point as a first point pair is performed for the sampled first image points, which is not limited here. In one implementation scenario, N points may be randomly sampled from the first image points obtained by the division, and for each of the N sampled first image points, K first map points are randomly allocated from the map data, where the K randomly allocated first map points include the map point that matches the first image point.

步驟S43:對於每一第二圖像點,從地圖資料中分配若干第二地圖點,並分別將第二圖像點與每一第二地圖點作為一第二點對。Step S43: For each second image point, assign a number of second map points from the map data, and use the second image point and each second map point as a second point pair respectively.

對於每一第二圖像點，從地圖資料中分配若干第二地圖點，並分別將第二圖像點與每一第二地圖點作為一第二點對。在一個實施場景中，對於每一第二圖像點分配的第二地圖點的數量可以相同，也可以不同。在另一個實施場景中，在分配第二地圖點之前，還可以從劃分得到的第二圖像點中隨機抽取若干第二圖像點，並對抽取得到的第二圖像點，執行從地圖資料中分配若干第二地圖點，並分別將第二圖像點與每一第二地圖點作為一第二點對的步驟，在此不做限定。在一個實施場景中，可以從劃分得到的第二圖像點中隨機抽取M個點，並對抽取得到的M個第二圖像點中的每一個第二圖像點，從地圖資料中隨機分配K個第二地圖點。For each second image point, several second map points are allocated from the map data, and the second image point together with each second map point is taken as a second point pair. In one implementation scenario, the number of second map points allocated to each second image point may be the same or different. In another implementation scenario, before the second map points are allocated, several second image points may be randomly sampled from the second image points obtained by the division, and the step of allocating several second map points from the map data and taking the second image point together with each second map point as a second point pair is performed for the sampled second image points, which is not limited here. In one implementation scenario, M points may be randomly sampled from the second image points obtained by the division, and for each of the M sampled second image points, K second map points are randomly allocated from the map data.

在一個實施場景中，為了便於明確每一第一點對和每一第二點對是否為匹配點對，還可以遍歷每一第一點對和每一第二點對，並利用第一識別字(例如，1)對匹配點對進行標記，利用第二識別字(例如，0)對非匹配點對進行標記。In one implementation scenario, in order to make clear whether each first point pair and each second point pair is a matching point pair, each first point pair and each second point pair may also be traversed, with a first identifier (for example, 1) used to mark matching point pairs and a second identifier (for example, 0) used to mark non-matching point pairs.

上述步驟S42和步驟S43可以按照先後循序執行，例如，先執行步驟S42，後執行步驟S43；或者，先執行步驟S43，後執行步驟S42；此外，上述步驟S42和步驟S43也可以同步執行，在此不做限定。The above steps S42 and S43 may be performed sequentially, for example, step S42 first and then step S43, or step S43 first and then step S42; in addition, steps S42 and S43 may also be performed synchronously, which is not limited here.

步驟S44:從第一點對和第二點對中抽取得到若干組點對。Step S44: Extracting several groups of point pairs from the first point pair and the second point pair.

在本發明實施例中，可以從第一點對和第二點對中隨機抽取，得到若干組點對，作為一樣本匹配資料。在一個實施場景中，還可以對第一點對和第二點對隨機抽取若干次，從而得到若干個樣本匹配資料。在另一個實施場景中，還可以獲取多個樣本圖像和地圖資料，並對每一樣本圖像和地圖資料，重複執行上述步驟，得到多個樣本匹配資料，從而能夠提高樣本數量，有利於提高匹配預測模型的準確性。In the embodiment of the present invention, several groups of point pairs may be randomly sampled from the first point pairs and the second point pairs to serve as one piece of sample matching data. In one implementation scenario, the first point pairs and the second point pairs may also be randomly sampled several times to obtain several pieces of sample matching data. In another implementation scenario, multiple sample images and map data may also be obtained, and the above steps are repeated for each sample image and its map data to obtain multiple pieces of sample matching data, thereby increasing the number of samples, which helps to improve the accuracy of the matching prediction model.
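Steps S41-S44 can be sketched end-to-end as below; the index-based bookkeeping, the use of labels 1/0 as the two identifiers, and the helper names are all illustrative assumptions:

```python
import random

def build_point_pairs(first_points, second_points, num_map_points,
                      matches, k, rng):
    """Steps S41-S43 sketched with index bookkeeping: each first image
    point (one that has a true match in the map, per `matches`) gets k
    map points guaranteed to include its match; each second image point
    (no match in the map) gets k arbitrary map points.
    Labels: 1 marks a matching pair, 0 a non-matching pair."""
    first_pairs, second_pairs = [], []
    for img_idx in first_points:
        true_map_idx = matches[img_idx]
        others = [m for m in range(num_map_points) if m != true_map_idx]
        chosen = rng.sample(others, k - 1) + [true_map_idx]
        first_pairs += [(img_idx, m, int(m == true_map_idx)) for m in chosen]
    for img_idx in second_points:
        for m in rng.sample(range(num_map_points), k):
            second_pairs.append((img_idx, m, 0))
    return first_pairs, second_pairs

rng = random.Random(0)
first_pairs, second_pairs = build_point_pairs(
    first_points=[0, 1], second_points=[2], num_map_points=10,
    matches={0: 4, 1: 7}, k=3, rng=rng)
# Step S44: randomly draw point pairs to form one sample matching datum.
sample_matching_data = rng.sample(first_pairs + second_pairs, 5)
print(len(first_pairs), len(second_pairs), len(sample_matching_data))
# 6 3 5
```

Repeating the final draw yields several sample matching data from the same image, as described above.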

步驟S112：對於每組匹配點對：利用樣本圖像的位姿參數將地圖點投影至樣本圖像所屬的維度中，得到地圖點的投影點；並基於圖像點和投影點之間的差異，確定匹配點對的實際匹配值。Step S112: For each group of matching point pairs, project the map point into the dimension to which the sample image belongs by using the pose parameters of the sample image to obtain the projected point of the map point, and determine the actual matching value of the matching point pair based on the difference between the image point and the projected point.

對於每組匹配點對,可以利用與其對應的樣本圖像的位姿參數將地圖點投影至樣本圖像所屬的維度中,得到地圖點的投影點。仍以樣本圖像為二維圖像,地圖資料為通過SFM重建方式得到的稀疏點雲模型為例,訓練裝置可以利用位姿參數將三維點重投影,從而得到其投影點。For each set of matched point pairs, the map point can be projected into the dimension to which the sample image belongs by using the pose parameters of the corresponding sample image to obtain the projected point of the map point. Taking the sample image as a two-dimensional image and the map data as a sparse point cloud model obtained by SFM reconstruction as an example, the training device can use the pose parameters to reproject the three-dimensional points to obtain their projected points.

在一個實施場景中，可以利用預設概率分佈函數將圖像點和其投影點之間的差異轉換為概率密度值，作為匹配點對的實際匹配值。在一個實施場景中，預設概率分佈函數可以是標準高斯分佈函數，從而能夠將取值範圍在負無窮至正無窮之間的差異轉換為對應的概率密度值，且差異的絕對值越大，對應的概率密度值越小，相應表示點對的匹配程度越低；差異的絕對值越小，對應的概率密度值越大，相應表示點對的匹配程度越高；當差異的絕對值為0時，其對應的概率密度值最大。In one implementation scenario, a preset probability distribution function may be used to convert the difference between an image point and its projected point into a probability density value, which serves as the actual matching value of the matching point pair. The preset probability distribution function may be a standard Gaussian distribution function, so that a difference anywhere in the range from negative infinity to positive infinity can be converted into a corresponding probability density value: the larger the absolute value of the difference, the smaller the corresponding probability density value, indicating a lower degree of matching; the smaller the absolute value of the difference, the larger the corresponding probability density value, indicating a higher degree of matching; when the absolute value of the difference is 0, the corresponding probability density value is the largest.
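The conversion from reprojection difference to actual matching value can be sketched with a Gaussian density, so that a zero difference yields the maximum value and large differences decay toward 0. Treating sigma=1 as the "standard" spread and leaving the density unnormalized so the peak is exactly 1 are both assumptions:

```python
import math

def actual_matching_value(image_point, projected_point, sigma=1.0):
    """Map the difference between an image point and the projection of
    its map point to a matching value via an (unnormalized) Gaussian
    density: zero difference gives the maximum value 1.0, and the value
    decays toward 0 as the difference grows."""
    dx = image_point[0] - projected_point[0]
    dy = image_point[1] - projected_point[1]
    dist2 = dx * dx + dy * dy
    return math.exp(-dist2 / (2.0 * sigma ** 2))

print(actual_matching_value((100.0, 50.0), (100.0, 50.0)))
# 1.0  (zero reprojection difference: maximum matching value)
print(actual_matching_value((100.0, 50.0), (103.0, 54.0)))
# a value close to 0 for a large reprojection difference
```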

在一個實施場景中，在利用位姿參數將地圖點投影至樣本圖像所屬的維度之前，訓練裝置還可以基於匹配點對，計算樣本圖像的位姿參數；這裡，可以採用BA(Bundle Adjustment)計算位姿參數，從而利用位姿參數將地圖點投影至樣本圖像所屬的維度中，得到地圖點的投影點。In one implementation scenario, before the map points are projected into the dimension to which the sample image belongs by using the pose parameters, the training device may also calculate the pose parameters of the sample image based on the matching point pairs; here, BA (Bundle Adjustment) may be used to calculate the pose parameters, so that the map points are projected into the dimension to which the sample image belongs by using the pose parameters to obtain the projected points of the map points.

在一個實施場景中,還可以將非匹配點對的實際匹配值設置為預設數值,例如,將非匹配點對的實際匹配值設置為0。In an implementation scenario, the actual matching value of the non-matching point pair may also be set to a preset value, for example, the actual matching value of the non-matching point pair is set to 0.

區別於前述實施例，通過從樣本圖像中獲取若干圖像點，以及從地圖資料中獲取若干地圖點，以組成若干組點對，且若干組點對中包括至少一組所包含的圖像點和地圖點之間匹配的匹配點對，故能夠生成用於訓練匹配預測模型的樣本，並對於每組匹配點對，利用樣本圖像的位姿參數將地圖點投影至樣本圖像所屬的維度中，得到地圖點的投影點，從而基於圖像點和投影點之間的差異，確定匹配點對的實際匹配值，故能夠使匹配預測模型在訓練過程中學習到匹配點對幾何特徵，有利於提高匹配預測模型的準確性。Different from the foregoing embodiments, several image points are acquired from the sample image and several map points are acquired from the map data to form several groups of point pairs, and the groups of point pairs include at least one matching point pair whose image point and map point match each other, so that samples for training the matching prediction model can be generated; for each matching point pair, the pose parameters of the sample image are used to project the map point into the dimension to which the sample image belongs to obtain the projected point of the map point, and the actual matching value of the matching point pair is determined based on the difference between the image point and the projected point. The matching prediction model can thus learn the geometric features of matching point pairs during training, which helps to improve its accuracy.

請參閱圖5,圖5是本發明視覺定位方法一實施例的流程示意圖。視覺定位方法可以包括如下步驟。Please refer to FIG. 5 , which is a schematic flowchart of an embodiment of a visual positioning method of the present invention. The visual positioning method may include the following steps.

步驟S51:利用待定位圖像和地圖資料,構建待識別匹配資料。Step S51: Using the image to be located and the map data to construct matching data to be identified.

在本發明實施例中，待識別匹配資料包括若干組點對，每組點對的兩個點分別來自待定位圖像和地圖資料。其中，待定位圖像和地圖資料所屬的維度可以為2維或3維，在此不做限定。例如，待定位圖像可以為二維圖像，或者，待定位圖像還可以是RGB-D圖像，在此不做限定；地圖資料可以由單純的二維圖像組成，也可以由三維點雲地圖組成，還可以是二維圖像和三維點雲的結合，在此不做限定。In the embodiment of the present invention, the matching data to be identified includes several groups of point pairs, and the two points of each point pair come respectively from the image to be located and the map data. The dimension to which the image to be located and the map data belong may be 2-dimensional or 3-dimensional, which is not limited here. For example, the image to be located may be a two-dimensional image, or it may be an RGB-D image, which is not limited here; the map data may consist solely of two-dimensional images, of a three-dimensional point cloud map, or of a combination of two-dimensional images and a three-dimensional point cloud, which is not limited here.
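The construction of the matching data can be illustrated with a trivial exhaustive pairing (the actual embodiments may restrict the candidate set, e.g. by descriptor retrieval; this sketch only shows the pair structure):

```python
def build_candidate_pairs(image_points, map_points):
    # Each candidate pair couples one point from the image to be located
    # with one point from the map data; the matching prediction model
    # later assigns each pair a predicted matching value.
    return [(img, mp) for img in image_points for mp in map_points]

# Two 2-D image points paired with three 3-D map points -> six candidate pairs.
pairs = build_candidate_pairs(
    [(10.0, 20.0), (30.0, 40.0)],
    [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)],
)
```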

步驟S52:利用匹配預測模型對若干組點對進行預測處理,得到點對的預測匹配值。Step S52: Use the matching prediction model to perform prediction processing on several groups of point pairs to obtain the predicted matching values of the point pairs.

匹配預測模型為預先通過樣本匹配資料訓練得到的神經網路模型。在一個實施場景中,匹配預測模型可以是通過前述任一匹配預測模型的訓練方法實施例中的步驟訓練得到的,其中,訓練步驟可以參考前述實施例中的步驟,在此不再贅述。The matching prediction model is a neural network model trained in advance through sample matching data. In an implementation scenario, the matching prediction model may be obtained by training through the steps in any of the foregoing embodiments of the matching prediction model training method, wherein the training steps may refer to the steps in the foregoing embodiments, which will not be repeated here.

通過利用匹配預測模型對若干組點對進行預測處理,可以得到待識別匹配資料中點對的預測匹配值。在一個實施場景中,待識別匹配資料為二分圖,二分圖中包括若干組點對和連接每組點對的連接邊,匹配預測模型包括與待定位圖像所屬的維度對應的第一點特徵提取子模型、與地圖資料所屬的維度對應的第二點特徵提取子模型,以及邊特徵提取子模型,從而可以利用第一點特徵提取子模型和第二點特徵提取子模型對二分圖進行特徵提取,得到第一特徵和第二特徵,並利用邊特徵提取子模型對第一特徵和第二特徵進行特徵提取,得到第三特徵,進而利用第三特徵,得到連接邊對應的點對的預測匹配值。這裡,可以參閱前述實施例中的步驟,在此不再贅述。By using the matching prediction model to perform prediction processing on several groups of point pairs, the predicted matching values of the point pairs in the matching data to be identified can be obtained. In one implementation scenario, the matching data to be identified is a bipartite graph, the bipartite graph includes several groups of point pairs and connecting edges connecting each group of point pairs, and the matching prediction model includes a first point feature corresponding to the dimension to which the image to be located belongs. The extraction sub-model, the second point feature extraction sub-model corresponding to the dimension to which the map data belongs, and the edge feature extraction sub-model, so that the bipartite graph can be characterized by using the first point feature extraction sub-model and the second point feature extraction sub-model Extraction to obtain the first feature and the second feature, and use the edge feature extraction sub-model to perform feature extraction on the first feature and the second feature to obtain the third feature, and then use the third feature to obtain the prediction of the point pair corresponding to the connecting edge match value. Here, reference may be made to the steps in the foregoing embodiments, which will not be repeated here.
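A toy numpy sketch of the described pipeline — two point feature extraction sub-models and one edge sub-model over concatenated point features. Layer sizes and random weights are hypothetical, each edge here pairs the i-th image point with the i-th map point for brevity, and the residual blocks / spatial transformer networks of the real sub-models are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, weights):
    # Tiny fully connected stack with ReLU between layers.
    for W in weights[:-1]:
        x = np.maximum(x @ W, 0.0)
    return x @ weights[-1]

# Hypothetical sizes: 2-D image points, 3-D map points, 16-D features.
W_img = [rng.normal(size=(2, 16)), rng.normal(size=(16, 16))]    # first point sub-model
W_map = [rng.normal(size=(3, 16)), rng.normal(size=(16, 16))]    # second point sub-model
W_edge = [rng.normal(size=(32, 16)), rng.normal(size=(16, 1))]   # edge sub-model

image_pts = rng.normal(size=(5, 2))   # nodes from the image to be located
map_pts = rng.normal(size=(5, 3))     # nodes from the map data

f_img = mlp(image_pts, W_img)                         # first feature
f_map = mlp(map_pts, W_map)                           # second feature
edge_in = np.concatenate([f_img, f_map], axis=1)      # one row per connecting edge
scores = 1.0 / (1.0 + np.exp(-mlp(edge_in, W_edge)))  # predicted matching values in [0, 1]
```

The sigmoid at the end keeps each edge's predicted matching value in a bounded range, mirroring how one score is produced per connecting edge of the bipartite graph.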

步驟S53:基於點對的預測匹配值,確定待定位圖像的攝影器件的位姿參數。Step S53: Determine the pose parameters of the photographing device of the image to be positioned based on the predicted matching value of the point pair.

通過待識別匹配資料中點對的預測匹配值，可以優先利用預測匹配值較高的點對，確定待定位圖像的攝影器件的位姿參數。在一個實施場景中，可以利用預測匹配值較高的n個點對，構建PnP（Perspective-n-Point）問題，從而採用諸如EPnP（Efficient PnP）等方式對PnP問題進行求解，進而得到待定位圖像的攝影器件的位姿參數。在另一個實施場景中，還可以將若干組點對按照預測匹配值從高到低的順序進行排序，並利用前預設數量組點對，確定待定位圖像的攝影器件的位姿參數。其中，前預設數量可以根據實際情況進行設置，例如，將排序後的若干組點對中預測匹配值不為0的點對作為前預設數量組點對；或者，還可以將排序後的若干組點對中預測匹配值大於一下限值的點對作為前預設數量組點對；對於前預設數量，可以根據實際應用而設置，在此不做限定。這裡，還可以採用諸如PROSAC（PROgressive SAmple Consensus，漸進一致採樣）的方式，對排序後的點對進行處理，得到待定位圖像的攝影器件的位姿參數。在一個實施場景中，待定位圖像的攝影器件的位姿參數可以包括攝影器件在地圖資料所屬的地圖坐標系中的6個自由度（Degree of Freedom, DoF），包括：位置（position），即座標，以及環繞x軸的偏轉yaw（偏航角）、繞y軸的偏轉pitch（俯仰角）、繞z軸的偏轉roll（翻滾角）。Based on the predicted matching values of the point pairs in the matching data to be identified, point pairs with higher predicted matching values can be used preferentially to determine the pose parameters of the photographic device of the image to be positioned. In one implementation scenario, the n point pairs with the highest predicted matching values can be used to construct a PnP (Perspective-n-Point) problem, which is then solved by a method such as EPnP (Efficient PnP) to obtain the pose parameters of the photographic device of the image to be positioned. In another implementation scenario, the groups of point pairs may be sorted in descending order of predicted matching value, and the first preset number of point pairs may be used to determine the pose parameters of the photographic device of the image to be positioned. The first preset number can be set according to the actual situation; for example, the sorted point pairs whose predicted matching value is not 0 may serve as the first preset number of point pairs, or the sorted point pairs whose predicted matching value exceeds a lower bound may serve as the first preset number of point pairs; the first preset number can be set according to the actual application and is not limited here. A method such as PROSAC (PROgressive SAmple Consensus) may also be used to process the sorted point pairs to obtain the pose parameters of the photographic device of the image to be positioned. In one implementation scenario, the pose parameters of the photographic device of the image to be positioned may include the 6 degrees of freedom (DoF) of the photographic device in the map coordinate system to which the map data belongs: the position, i.e., the coordinates, together with yaw (the deflection around the x-axis), pitch (the deflection around the y-axis) and roll (the deflection around the z-axis).
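The ranking-and-truncation step can be sketched as follows in pure Python; the lower bound and the cap are illustrative parameters, and the selected pairs would then feed a PnP or PROSAC solver:

```python
def select_pairs(pairs, scores, lower_bound=0.0, max_pairs=None):
    # Sort point pairs by predicted matching value, highest first, keep only
    # those above the lower bound, and optionally cap at a preset number.
    ranked = sorted(zip(scores, pairs), key=lambda sp: sp[0], reverse=True)
    kept = [pair for score, pair in ranked if score > lower_bound]
    return kept if max_pairs is None else kept[:max_pairs]

# Pairs with predicted value 0 are dropped; the rest come out best-first.
kept = select_pairs(["a", "b", "c", "d"], [0.2, 0.0, 0.9, 0.5], lower_bound=0.0)
```

Feeding a solver the pairs in this order realises the incremental, high-confidence-first sampling described above.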

上述方案，通過利用待定位圖像和地圖資料，構建待識別匹配資料，且待識別匹配資料包括若干組點對，每組點對的兩個點分別來自待定位圖像和地圖資料，從而利用匹配預測模型對若干組點對進行預測處理，得到點對的預測匹配值，進而基於點對的預測匹配值，確定待定位圖像的攝影器件的位姿參數，故能夠在視覺定位中利用匹配預測模型預測點對之間的匹配值而建立匹配關係，能夠有利於提高視覺定位的準確性和即時性。In the above solution, the matching data to be identified is constructed by using the image to be located and the map data, where the matching data to be identified includes several groups of point pairs whose two points come respectively from the image to be located and the map data; the matching prediction model is then used to perform prediction processing on the groups of point pairs to obtain the predicted matching values of the point pairs, and the pose parameters of the photographic device of the image to be positioned are determined based on the predicted matching values. The matching prediction model can thus be used in visual positioning to predict the matching values between point pairs and establish matching relationships, which helps to improve the accuracy and immediacy of visual positioning.

請參閱圖6,圖6是本發明匹配預測模型的訓練裝置60一實施例的方塊示意圖。匹配預測模型的訓練裝置60包括樣本構建部分61、預測處理部分62、損失確定部分63和參數調整部分64,樣本構建部分61配置為利用樣本圖像和地圖資料,構建樣本匹配資料,其中,樣本匹配資料包括若干組點對以及每組點對的實際匹配值,每組點對的兩個點分別來自樣本圖像和地圖資料;預測處理部分62配置為利用匹配預測模型對若干組點對進行預測處理,得到點對的預測匹配值;損失確定部分63配置為利用實際匹配值和預測匹配值,確定匹配預測模型的損失值;參數調整部分64配置為利用損失值,調整匹配預測模型的參數。Please refer to FIG. 6 . FIG. 6 is a schematic block diagram of an embodiment of a training apparatus 60 for matching prediction models of the present invention. The training device 60 for matching prediction models includes a sample construction part 61, a prediction processing part 62, a loss determination part 63 and a parameter adjustment part 64. The sample construction part 61 is configured to use sample images and map data to construct sample matching data, wherein the sample The matching data includes several groups of point pairs and the actual matching value of each group of point pairs, and the two points of each group of point pairs are respectively from the sample image and the map data; the prediction processing part 62 is configured to use the matching prediction model to perform the matching prediction model on the several groups of point pairs. Prediction processing to obtain the predicted matching value of the point pair; the loss determination part 63 is configured to use the actual matching value and the predicted matching value to determine the loss value of the matching prediction model; the parameter adjustment part 64 is configured to use the loss value to adjust the parameters of the matching prediction model .

上述方案，能夠利用匹配預測模型建立匹配關係，從而能夠在視覺定位中利用匹配預測模型預測點對之間的匹配值，因而能夠基於預測得到的匹配值優先採樣高匹配值的點對，進而能夠有利於提高視覺定位的準確性和即時性。The above solution can use the matching prediction model to establish matching relationships, so that in visual positioning the matching prediction model can be used to predict the matching values between point pairs; point pairs with high matching values can then be sampled preferentially based on the predicted matching values, which helps to improve the accuracy and immediacy of visual positioning.

在一些實施例中，樣本構建部分61包括點對獲取子部分，配置為從樣本圖像中獲取若干圖像點，以及從地圖資料中獲取若干地圖點，以組成若干組點對；其中，若干組點對包括至少一組所包含的圖像點和地圖點之間匹配的匹配點對，樣本構建部分61包括第一匹配值確定子部分，配置為對於每組匹配點對：利用樣本圖像的位姿參數將地圖點投影至樣本圖像所屬的維度中，得到地圖點的投影點；並基於圖像點和投影點之間的差異，確定匹配點對的實際匹配值。In some embodiments, the sample construction part 61 includes a point pair acquisition sub-part configured to acquire several image points from the sample image and several map points from the map data to form several groups of point pairs, wherein the groups of point pairs include at least one matching point pair whose image point and map point match; the sample construction part 61 includes a first matching value determination sub-part configured to, for each matching point pair, project the map point into the dimension to which the sample image belongs by using the pose parameters of the sample image to obtain the projected point of the map point, and determine the actual matching value of the matching point pair based on the difference between the image point and the projected point.

區別於前述實施例，通過從樣本圖像中獲取若干圖像點，以及從地圖資料中獲取若干地圖點，以組成若干組點對，且若干組點對中包括至少一組所包含的圖像點和地圖點之間匹配的匹配點對，故能夠生成用於訓練匹配預測模型的樣本，並對於每組匹配點對，利用樣本圖像的位姿參數將地圖點投影至樣本圖像所屬的維度中，得到地圖點的投影點，從而基於圖像點和投影點之間的差異，確定匹配點對的實際匹配值，故能夠使匹配預測模型在訓練過程中學習到匹配點對幾何特徵，有利於提高匹配預測模型的準確性。Different from the foregoing embodiments, several image points are acquired from the sample image and several map points are acquired from the map data to form several groups of point pairs, and the groups of point pairs include at least one matching point pair whose image point and map point match each other, so that samples for training the matching prediction model can be generated; for each matching point pair, the pose parameters of the sample image are used to project the map point into the dimension to which the sample image belongs to obtain the projected point of the map point, and the actual matching value of the matching point pair is determined based on the difference between the image point and the projected point. The matching prediction model can thus learn the geometric features of matching point pairs during training, which helps to improve its accuracy.

在一些實施例中，若干組點對包括至少一組所包含的圖像點和地圖點之間不匹配的非匹配點對，樣本構建部分61包括第二匹配值確定子部分，配置為將非匹配點對的實際匹配值設置為預設數值。In some embodiments, the several groups of point pairs include at least one non-matching point pair whose image point and map point do not match, and the sample construction part 61 includes a second matching value determination sub-part configured to set the actual matching value of the non-matching point pairs to a preset value.

區別於前述實施例，若干組點對包括至少一組所包含的圖像點和地圖點之間不匹配的非匹配點對，並區別於匹配點對，將非匹配點對的實際匹配值設置為預設數值，從而能夠有利於提高匹配預測模型的魯棒性。Different from the foregoing embodiments, the several groups of point pairs include at least one non-matching point pair whose image point and map point do not match, and, unlike the matching point pairs, the actual matching value of the non-matching point pairs is set to a preset value, which helps to improve the robustness of the matching prediction model.

在一些實施例中，點對獲取子部分包括圖像點劃分部分，配置為將樣本圖像中的圖像點劃分為第一圖像點和第二圖像點，其中，第一圖像點在地圖資料中存在與其匹配的地圖點，第二圖像點在地圖資料中不存在與其匹配的地圖點，點對獲取子部分包括第一點對獲取部分，配置為對於每一第一圖像點，從地圖資料中分配若干第一地圖點，並分別將第一圖像點與每一第一地圖點作為一第一點對，其中，第一地圖點中包含與第一圖像點匹配的地圖點，點對獲取子部分包括第二點對獲取部分，配置為對於每一第二圖像點，從地圖資料中分配若干第二地圖點，並分別將第二圖像點與每一第二地圖點作為一第二點對，點對獲取子部分包括點對抽取部分，配置為從第一點對和第二點對中抽取得到若干組點對。In some embodiments, the point pair acquisition sub-part includes an image point division part configured to divide the image points in the sample image into first image points and second image points, wherein each first image point has a matching map point in the map data and each second image point has no matching map point in the map data; the point pair acquisition sub-part includes a first point pair acquisition part configured to, for each first image point, allocate several first map points from the map data and take the first image point together with each first map point as a first point pair, wherein the first map points include the map point matching the first image point; the point pair acquisition sub-part includes a second point pair acquisition part configured to, for each second image point, allocate several second map points from the map data and take the second image point together with each second map point as a second point pair; and the point pair acquisition sub-part includes a point pair extraction part configured to extract several groups of point pairs from the first point pairs and the second point pairs.

區別於前述實施例，通過將樣本圖像中的圖像點劃分為第一圖像點和第二圖像點，且第一圖像點在地圖資料中存在與其匹配的地圖點，第二圖像點在地圖資料中不存在與其匹配的地圖點，並對於每一第一圖像點，從地圖資料中分配若干第一地圖點，分別將第一圖像點與每一第一地圖點作為一第一點對，且第一地圖點中包含與第一圖像點匹配的地圖點，而對於每一第二圖像點，從地圖資料中分配若干第二地圖點，分別將第二圖像點與每一第二地圖點作為一第二點對，並從第一點對和第二點對中抽取得到若干組點對，從而能夠構建得到數量豐富且包含非匹配點對和匹配點對的若干組點對，以用於訓練匹配預測模型，故能夠有利於提高匹配預測模型的準確性。Different from the foregoing embodiments, the image points in the sample image are divided into first image points and second image points, where each first image point has a matching map point in the map data and each second image point has no matching map point in the map data; for each first image point, several first map points are allocated from the map data and the first image point is paired with each first map point as a first point pair, the first map points including the map point that matches the first image point; for each second image point, several second map points are allocated from the map data and the second image point is paired with each second map point as a second point pair; and several groups of point pairs are extracted from the first point pairs and the second point pairs. A rich set of point pairs containing both matching and non-matching point pairs can thus be constructed for training the matching prediction model, which helps to improve its accuracy.

在一些實施例中，第一匹配值確定子部分包括位姿計算部分，配置為基於匹配點對，計算樣本圖像的位姿參數，第一匹配值確定子部分包括投影部分，配置為利用位姿參數將地圖點投影至樣本圖像所屬的維度中，得到地圖點的投影點。In some embodiments, the first matching value determination sub-part includes a pose calculation part configured to calculate the pose parameters of the sample image based on the matching point pairs, and the first matching value determination sub-part includes a projection part configured to project the map points into the dimension to which the sample image belongs by using the pose parameters, to obtain the projected points of the map points.

區別於前述實施例，通過利用匹配點對，計算樣本圖像的位姿參數，並利用位姿參數將地圖點投影至樣本圖像所屬的維度中，得到地圖點的投影點，從而能夠有利於提高投影點與圖像點之間差異的準確性，進而能夠有利於提高匹配預測模型的準確性。Different from the foregoing embodiments, the pose parameters of the sample image are calculated by using the matching point pairs, and the map points are projected into the dimension to which the sample image belongs by using the pose parameters to obtain the projected points of the map points, which helps to improve the accuracy of the differences between the projected points and the image points, and in turn the accuracy of the matching prediction model.

在一些實施例中，第一匹配值確定子部分包括概率密度轉換部分，配置為利用預設概率分佈函數將差異轉換為概率密度值，作為匹配點對的實際匹配值。In some embodiments, the first matching value determination sub-part includes a probability density conversion part configured to convert the difference into a probability density value by using a preset probability distribution function, as the actual matching value of the matching point pair.

區別於前述實施例，通過利用預設概率分佈函數將差異轉換為概率密度值，作為匹配點對的實際匹配值，故能夠有利於準確地描述投影點與圖像點之間的差異，從而能夠有利於提高匹配預測模型的準確性。Different from the foregoing embodiments, the difference is converted into a probability density value by using a preset probability distribution function and used as the actual matching value of the matching point pair, which helps to accurately describe the difference between the projected point and the image point, and thus helps to improve the accuracy of the matching prediction model.

在一些實施例中，樣本匹配資料為二分圖，二分圖包括若干組點對和連接每組點對的連接邊，且連接邊標注有對應點對的實際匹配值；匹配預測模型包括與樣本圖像所屬的維度對應的第一點特徵提取子模型、與地圖資料所屬的維度對應的第二點特徵提取子模型以及邊特徵提取子模型，預測處理部分62包括點特徵提取子部分，配置為分別利用第一點特徵提取子模型和第二點特徵提取子模型對二分圖進行特徵提取，得到第一特徵和第二特徵，預測處理部分62包括邊特徵提取子部分，配置為利用邊特徵提取子模型對第一特徵和第二特徵進行特徵提取，得到第三特徵，預測處理部分62包括預測子部分，配置為利用第三特徵，得到連接邊對應的點對的預測匹配值。In some embodiments, the sample matching data is a bipartite graph, the bipartite graph includes several groups of point pairs and connecting edges connecting each group of point pairs, and each connecting edge is labeled with the actual matching value of the corresponding point pair; the matching prediction model includes a first point feature extraction sub-model corresponding to the dimension to which the sample image belongs, a second point feature extraction sub-model corresponding to the dimension to which the map data belongs, and an edge feature extraction sub-model. The prediction processing part 62 includes a point feature extraction sub-part configured to perform feature extraction on the bipartite graph by using the first point feature extraction sub-model and the second point feature extraction sub-model respectively, to obtain a first feature and a second feature; the prediction processing part 62 includes an edge feature extraction sub-part configured to perform feature extraction on the first feature and the second feature by using the edge feature extraction sub-model, to obtain a third feature; and the prediction processing part 62 includes a prediction sub-part configured to use the third feature to obtain the predicted matching values of the point pairs corresponding to the connecting edges.

區別於前述實施例,通過對二分圖分別進行點特徵抽取以及邊特徵抽取,從而能夠使匹配預測模型更加有效地感知匹配的空間幾何結構,進而能夠有利於提高匹配預測模型的準確性。Different from the foregoing embodiments, by performing point feature extraction and edge feature extraction on the bipartite graph respectively, the matching prediction model can more effectively perceive the spatial geometric structure of the matching, thereby improving the accuracy of the matching prediction model.

在一些實施例中,第一點特徵提取子模型和第二點特徵提取子模型的結構為以下任一種:包括至少一個殘差塊,包括至少一個殘差塊和至少一個空間變換網路;和/或,邊特徵提取子模型包括至少一個殘差塊。In some embodiments, the structure of the first point feature extraction sub-model and the second point feature extraction sub-model is any of the following: including at least one residual block, including at least one residual block and at least one spatial transformation network; and /or, the edge feature extraction sub-model includes at least one residual block.

區別於前述實施例,通過將第一點特徵提取子模型和第二點特徵提取子模型的結構設置為以下任一者:包括至少一個殘差塊,包括至少一個殘差塊和至少一個空間變換網路,且將邊特徵提取子模型設置為包括至少一個殘差塊,故能夠有利於匹配預測模型的優化,並提高匹配預測模型的準確性。Different from the foregoing embodiments, the structures of the first point feature extraction sub-model and the second point feature extraction sub-model are set to any one of the following: including at least one residual block, including at least one residual block and at least one spatial transformation network, and the edge feature extraction sub-model is set to include at least one residual block, so it can facilitate the optimization of the matching prediction model and improve the accuracy of the matching prediction model.

在一些實施例中，若干組點對包括至少一組所包含的圖像點和地圖點之間匹配的匹配點對和至少一組所包含的圖像點和地圖點之間不匹配的非匹配點對，損失確定部分63包括第一損失確定子部分，配置為利用匹配點對的預測匹配值和實際匹配值，確定匹配預測模型的第一損失值，損失確定部分63包括第二損失確定子部分，配置為利用非匹配點對的預測匹配值和實際匹配值，確定匹配預測模型的第二損失值，損失確定部分63包括損失加權子部分，配置為對第一損失值和第二損失值進行加權處理，得到匹配預測模型的損失值。In some embodiments, the several groups of point pairs include at least one matching point pair whose image point and map point match and at least one non-matching point pair whose image point and map point do not match. The loss determination part 63 includes a first loss determination sub-part configured to determine a first loss value of the matching prediction model by using the predicted matching values and the actual matching values of the matching point pairs; the loss determination part 63 includes a second loss determination sub-part configured to determine a second loss value of the matching prediction model by using the predicted matching values and the actual matching values of the non-matching point pairs; and the loss determination part 63 includes a loss weighting sub-part configured to perform weighting processing on the first loss value and the second loss value to obtain the loss value of the matching prediction model.

區別於前述實施例，通過利用匹配點對的預測匹配值和實際匹配值，確定匹配預測模型的第一損失值，並利用非匹配點對的預測匹配值和實際匹配值，確定匹配預測模型的第二損失值，從而對第一損失值和第二損失值進行加權處理，得到匹配預測模型的損失值，故能夠有利於使匹配預測模型有效感知匹配的空間幾何結構，從而提高匹配預測模型的準確性。Different from the foregoing embodiments, the first loss value of the matching prediction model is determined by using the predicted matching values and the actual matching values of the matching point pairs, and the second loss value of the matching prediction model is determined by using the predicted matching values and the actual matching values of the non-matching point pairs; the first loss value and the second loss value are then weighted to obtain the loss value of the matching prediction model. This helps the matching prediction model effectively perceive the spatial geometric structure of matching, thereby improving its accuracy.

在一些實施例中，損失確定部分63還包括數量統計子部分，配置為分別統計匹配點對的第一數量，以及非匹配點對的第二數量；第一損失確定子部分，配置為利用匹配點對的預測匹配值和實際匹配值之間的差值，以及第一數量，確定第一損失值；第二損失確定子部分，配置為利用非匹配點對的預測匹配值和實際匹配值之間的差值，以及第二數量，確定第二損失值。In some embodiments, the loss determination part 63 further includes a quantity statistics sub-part configured to count a first number of matching point pairs and a second number of non-matching point pairs respectively; the first loss determination sub-part is configured to determine the first loss value by using the differences between the predicted matching values and the actual matching values of the matching point pairs, together with the first number; and the second loss determination sub-part is configured to determine the second loss value by using the differences between the predicted matching values and the actual matching values of the non-matching point pairs, together with the second number.
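A count-normalised version of the described loss can be sketched as follows; the squared-error distance and the weight values are illustrative assumptions, since the embodiments do not fix a particular distance measure:

```python
def matching_loss(preds, actuals, is_match, w_match=1.0, w_non=1.0):
    # First loss: mean squared difference over the matching pairs, normalised
    # by the first number (their count); second loss: the same over the
    # non-matching pairs, normalised by the second number; the total loss is
    # the weighted combination of the two.
    n_match = sum(is_match)
    n_non = len(is_match) - n_match
    first = sum((p - a) ** 2 for p, a, m in zip(preds, actuals, is_match) if m) / n_match
    second = sum((p - a) ** 2 for p, a, m in zip(preds, actuals, is_match) if not m) / n_non
    return w_match * first + w_non * second

# One matching pair (actual value 1.0) and two non-matching pairs (actual 0.0).
loss = matching_loss([0.9, 0.2, 0.1], [1.0, 0.0, 0.0], [True, False, False])
```

Normalising each term by its own count keeps the abundant non-matching pairs from dominating the loss.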

區別於前述實施例,通過統計匹配點對的第一數量,以及非匹配點對的第二數量,從而利用匹配點對的預測匹配值和實際匹配值之間的差值,以及第一數量,確定第一損失值,並利用非匹配點對的預測匹配值和實際匹配值之間的差異,以及第二數量,確定第二損失值,能夠有利於提高匹配預測模型的損失值的準確性,從而能夠有利於提高匹配預測模型的準確性。Different from the foregoing embodiments, by counting the first number of matching point pairs and the second number of non-matching point pairs, the difference between the predicted matching value and the actual matching value of the matching point pair, and the first number, are used, Determining the first loss value, and using the difference between the predicted matching value and the actual matching value of the unmatched point pair, and the second quantity, determining the second loss value can help to improve the accuracy of the loss value of the matching prediction model, Thus, the accuracy of the matching prediction model can be improved.

在一些實施例中,樣本圖像所屬的維度為2維或3維,地圖資料所屬的維度為2維或3維。In some embodiments, the dimension to which the sample image belongs is 2D or 3D, and the dimension to which the map data belongs is 2D or 3D.

區別於前述實施例,通過設置樣本圖像和地圖資料所屬的維度,能夠訓練得到用於2維-2維的匹配預測模型,或者能夠訓練得到用於2維-3維的匹配預測模型,或者能夠訓練得到用於3維-3維的匹配預測模型,從而能夠提高匹配預測模型的適用範圍。Different from the foregoing embodiments, by setting the dimensions to which the sample images and map data belong, a matching prediction model for 2D-2D can be trained, or a matching prediction model for 2D-3D can be trained, or The matching prediction model for 3D-3D can be obtained by training, so that the applicable scope of the matching prediction model can be improved.

請參閱圖7,圖7是本發明視覺定位裝置70一實施例的方塊示意圖。視覺定位裝置70包括資料構建部分71、預測處理部分72和參數確定部分73,資料構建部分71配置為利用待定位圖像和地圖資料,構建待識別匹配資料,其中,待識別匹配資料包括若干組點對,每組點對的兩個點分別來自待定位圖像和地圖資料;預測處理部分72配置為利用匹配預測模型對若干組點對進行預測處理,得到點對的預測匹配值;參數確定部分73配置為基於點對的預測匹配值,確定待定位圖像的攝影器件的位姿參數。Please refer to FIG. 7 . FIG. 7 is a schematic block diagram of an embodiment of a visual positioning device 70 of the present invention. The visual positioning device 70 includes a data construction part 71, a prediction processing part 72 and a parameter determination part 73. The data construction part 71 is configured to use the image and map data to be located to construct the matching data to be identified, wherein the matching data to be identified includes several groups. point pairs, the two points of each group of point pairs are respectively from the image to be located and the map data; the prediction processing part 72 is configured to use the matching prediction model to perform prediction processing on several groups of point pairs to obtain the predicted matching values of the point pairs; parameter determination Section 73 is configured to determine pose parameters of the photographic device of the image to be positioned based on the predicted matching values of the point pairs.

上述方案,能夠利用匹配預測模型建立匹配關係,從而能夠在視覺定位中利用匹配預測模型預測點對之間的匹配值而建立匹配關係,能夠有利於提高視覺定位的準確性和即時性。In the above solution, the matching prediction model can be used to establish the matching relationship, so that the matching prediction model can be used to predict the matching value between the point pairs in the visual positioning to establish the matching relationship, which can help to improve the accuracy and immediacy of the visual positioning.

在一些實施例中,參數確定部分73包括點對排序子部分,配置為將若干組點對按照預測匹配值從高到低的順序進行排序,參數確定部分73還包括參數確定子部分,配置為利用前預設數量組點對,確定待定位圖像的攝影器件的位姿參數。In some embodiments, the parameter determination section 73 includes a point pair sorting subsection configured to sort several groups of point pairs in descending order of predicted matching values, and the parameter determination section 73 further includes a parameter determination subsection configured as Using the previously preset number of point pairs, the pose parameters of the photographic device of the image to be positioned are determined.

在本發明實施例以及其他的實施例中,“部分”可以是部分電路、部分處理器、部分程式或軟體等等,當然也可以是單元,還可以是模組也可以是非模組化的。In the embodiments of the present invention and other embodiments, a "part" may be a part of a circuit, a part of a processor, a part of a program or software, etc., of course, a unit, a module or a non-modular form.

區別於前述實施例,通過將若干組點對按照預測匹配值從高到低的順序進行排序,並利用前預設數量組點對,確定待定位圖像的攝影器件的位姿參數,從而能夠有利於利用排序後的點對進行增量式採樣,優先採樣匹配值高的點對,故能夠通過幾何先驗引導位姿參數的求解,從而能夠提高視覺定位的準確性和即時性。Different from the foregoing embodiments, by sorting several groups of point pairs in descending order of predicted matching values, and using the previously preset number of groups of point pairs to determine the pose parameters of the photographic device of the image to be positioned, it is possible to It is beneficial to use the sorted point pairs for incremental sampling, and preferentially sample the point pairs with high matching values. Therefore, the geometric prior can guide the solution of the pose parameters, thereby improving the accuracy and immediacy of visual positioning.

在一些實施例中,匹配預測模型是利用上述任一匹配預測模型的訓練裝置實施例中的匹配預測模型的訓練裝置訓練得到的。In some embodiments, the matching prediction model is obtained by training the matching prediction model training device in any of the above-mentioned embodiments of the matching prediction model training device.

區別於前述實施例,通過上述任一匹配預測模型的訓練裝置實施例中的匹配預測模型的訓練裝置得到的匹配預測模型進行視覺定位,能夠提高視覺定位的準確性和即時性。Different from the foregoing embodiments, performing visual positioning through the matching prediction model obtained by the training device for matching prediction models in any of the above-mentioned embodiments of the training device for matching prediction models can improve the accuracy and immediacy of visual positioning.

請參閱圖8，圖8是本發明電子設備80一實施例的方塊示意圖。電子設備80包括相互耦接的記憶體81和處理器82，處理器82用於執行記憶體81中儲存的程式指令，以實現上述任一匹配預測模型的訓練方法實施例中的步驟，或實現上述任一視覺定位方法實施例中的步驟。在一個實施場景中，電子設備80可以包括但不限於：手機、平板電腦等移動設備，在此不做限定。Please refer to FIG. 8, which is a block diagram of an embodiment of an electronic device 80 of the present invention. The electronic device 80 includes a memory 81 and a processor 82 coupled to each other, and the processor 82 is configured to execute program instructions stored in the memory 81 to implement the steps in any of the above embodiments of the training method for a matching prediction model, or to implement the steps in any of the above embodiments of the visual positioning method. In an implementation scenario, the electronic device 80 may include, but is not limited to, mobile devices such as a mobile phone and a tablet computer, which are not limited here.

其中,處理器82用於控制其自身以及記憶體81以實現上述任一匹配預測模型的訓練方法實施例中的步驟,或實現上述任一視覺定位方法實施例中的步驟。處理器82還可以稱為CPU(Central Processing Unit,中央處理單元)。處理器82可能是一種積體電路晶片,具有信號的處理能力。處理器82還可以是通用處理器、數位訊號處理器(Digital Signal Processor, DSP)、專用積體電路(Application Specific Integrated Circuit, ASIC)、現場可程式設計閘陣列(Field-Programmable Gate Array, FPGA)或者其他可程式設計邏輯器件、分立門或者電晶體邏輯器件、分立硬體元件。通用處理器可以是微處理器或者該處理器也可以是任何常規的處理器等。另外,處理器82可以由積體電路晶片共同實現。The processor 82 is configured to control itself and the memory 81 to implement the steps in any of the above-mentioned embodiments of the training method for matching prediction models, or to implement the steps in any of the above-mentioned embodiments of the visual positioning method. The processor 82 may also be referred to as a CPU (Central Processing Unit, central processing unit). The processor 82 may be an integrated circuit chip with signal processing capabilities. The processor 82 may also be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), or a field-programmable gate array (FPGA) Or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. Additionally, the processor 82 may be commonly implemented by an integrated circuit die.

上述方案,能夠利用匹配預測模型建立匹配關係,從而能夠在視覺定位中利用匹配預測模型預測點對之間的匹配值,因而能夠基於預測得到的匹配值優先採樣高匹配值的點對,進而能夠有利於提高視覺定位的準確性和即時性。With the above solution, a matching prediction model can be used to establish a matching relationship, so that during visual positioning the matching prediction model can predict the matching values between point pairs. Point pairs with high matching values can then be preferentially sampled based on the predicted matching values, which helps to improve the accuracy and real-time performance of visual positioning.
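As a concrete illustration of this sampling strategy, the Python sketch below ranks candidate point pairs by a model's predicted matching value and keeps only the most promising ones for pose estimation. The pair names and score values are illustrative placeholders, not data from the patent, and the downstream pose solver (e.g. PnP with RANSAC) is assumed rather than shown.

```python
# Hypothetical sketch: prioritise 2D-3D point pairs by a model's predicted
# matching value before feeding them to a pose solver. The names `pairs`
# and `scores` are illustrative, not from the patent.

def select_top_pairs(pairs, scores, top_n):
    """Return the top_n point pairs with the highest predicted matching value."""
    order = sorted(range(len(pairs)), key=lambda i: scores[i], reverse=True)
    return [pairs[i] for i in order[:top_n]]

pairs = [("img_pt_a", "map_pt_1"), ("img_pt_b", "map_pt_2"), ("img_pt_c", "map_pt_3")]
scores = [0.2, 0.9, 0.6]  # predicted matching values from the model (made up)
best = select_top_pairs(pairs, scores, 2)
# `best` now holds the two most promising correspondences for pose estimation
```

In a full pipeline the selected correspondences would be handed to a robust pose estimator; sampling high-scoring pairs first reduces the number of RANSAC iterations wasted on outliers.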

請參閱圖9,圖9為本發明電腦可讀儲存介質90一實施例的方塊示意圖。電腦可讀儲存介質90儲存有能夠被處理器運行的程式指令901,程式指令901用於實現上述任一匹配預測模型的訓練方法實施例中的步驟,或實現上述任一視覺定位方法實施例中的步驟。Please refer to FIG. 9, which is a block diagram of an embodiment of a computer-readable storage medium 90 of the present invention. The computer-readable storage medium 90 stores program instructions 901 executable by a processor. The program instructions 901 are used to implement the steps in any of the above embodiments of the training method for a matching prediction model, or to implement the steps in any of the above embodiments of the visual positioning method.

上述方案,能夠利用匹配預測模型建立匹配關係,從而能夠在視覺定位中利用匹配預測模型預測點對之間的匹配值,因而能夠基於預測得到的匹配值優先採樣高匹配值的點對,進而能夠有利於提高視覺定位的準確性和即時性。With the above solution, a matching prediction model can be used to establish a matching relationship, so that during visual positioning the matching prediction model can predict the matching values between point pairs. Point pairs with high matching values can then be preferentially sampled based on the predicted matching values, which helps to improve the accuracy and real-time performance of visual positioning.

在本發明所提供的幾個實施例中,應該理解到,所揭露的方法和裝置,可以通過其它的方式實現。例如,以上所描述的裝置實施方式僅僅是示意性的,例如,模組或單元的劃分,僅僅為一種邏輯功能劃分,實際實現時可以有另外的劃分方式,例如單元或元件可以結合或者可以集成到另一個系統,或一些特徵可以忽略,或不執行。另一點,所顯示或討論的相互之間的耦合或直接耦合或通信連接可以是通過一些介面,裝置或單元的間接耦合或通信連接,可以是電性、機械或其它的形式。In the several embodiments provided by the present invention, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the device implementations described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation; for example, units or elements may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

作為分離部件說明的單元可以是或者也可以不是物理上分開的,作為單元顯示的部件可以是或者也可以不是物理單元,即可以位於一個地方,或者也可以分佈到網路單元上。可以根據實際的需要選擇其中的部分或者全部單元來實現本實施方式方案的目的。Units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units, that is, may be located in one place, or may be distributed over network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this implementation manner.

另外,在本發明各個實施例中的各功能單元可以集成在一個處理單元中,也可以是各個單元單獨物理存在,也可以兩個或兩個以上單元集成在一個單元中。上述集成的單元既可以採用硬體的形式實現,也可以採用軟體功能單元的形式實現。In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of software functional units.

集成的單元如果以軟體功能單元的形式實現並作為獨立的產品銷售或使用時,可以儲存在一個電腦可讀取儲存介質中。基於這樣的理解,本發明的技術方案本質上或者說對現有技術做出貢獻的部分或者該技術方案的全部或部分可以以軟體產品的形式體現出來,該電腦軟體產品儲存在一個儲存介質中,包括若干指令用以使得一台電腦設備(可以是個人電腦,伺服器,或者網路設備等)或處理器(processor)執行本發明各個實施方式方法的全部或部分步驟。而前述的儲存介質包括:U盤、移動硬碟、唯讀記憶體(ROM,Read-Only Memory)、隨機存取記憶體(RAM,Random Access Memory)、磁碟或者光碟等各種可以儲存程式碼的介質。If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

工業實用性 本發明實施例中,能夠利用匹配預測模型建立匹配關係,從而能夠在視覺定位中利用匹配預測模型預測點對之間的匹配值,因而能夠基於預測得到的匹配值優先採樣高匹配值的點對,而建立匹配關係,進而能夠有利於提高視覺定位的準確性和即時性。 Industrial Applicability In the embodiments of the present invention, a matching prediction model can be used to establish a matching relationship, so that during visual positioning the matching prediction model can predict the matching values between point pairs. Point pairs with high matching values can then be preferentially sampled based on the predicted matching values to establish the matching relationship, which helps to improve the accuracy and real-time performance of visual positioning.

60:匹配預測模型的訓練裝置
61:樣本構建部分
62:預測處理部分
63:損失確定部分
64:參數調整部分
70:視覺定位裝置
71:資料構建部分
72:預測處理部分
73:參數確定部分
80:電子設備
81:記憶體
82:處理器
90:電腦可讀儲存介質
901:程式指令
S11~S14,S111~S112,S41~S44,S51~S53:步驟
60: Training device for a matching prediction model
61: Sample construction part
62: Prediction processing part
63: Loss determination part
64: Parameter adjustment part
70: Visual positioning device
71: Data construction part
72: Prediction processing part
73: Parameter determination part
80: Electronic device
81: Memory
82: Processor
90: Computer-readable storage medium
901: Program instructions
S11~S14, S111~S112, S41~S44, S51~S53: Steps

圖1是本發明匹配預測模型的訓練方法一實施例的流程示意圖;
圖2是本發明匹配預測模型的訓練方法一實施例的狀態示意圖;
圖3是圖1中步驟S11一實施例的流程示意圖;
圖4是圖3中步驟S111一實施例的流程示意圖;
圖5是本發明視覺定位方法一實施例的流程示意圖;
圖6是本發明匹配預測模型的訓練裝置一實施例的方塊示意圖;
圖7是本發明視覺定位裝置一實施例的方塊示意圖;
圖8是本發明電子設備一實施例的方塊示意圖;
圖9是本發明電腦可讀儲存介質一實施例的方塊示意圖。
FIG. 1 is a schematic flowchart of an embodiment of the training method for a matching prediction model of the present invention;
FIG. 2 is a schematic state diagram of an embodiment of the training method for a matching prediction model of the present invention;
FIG. 3 is a schematic flowchart of an embodiment of step S11 in FIG. 1;
FIG. 4 is a schematic flowchart of an embodiment of step S111 in FIG. 3;
FIG. 5 is a schematic flowchart of an embodiment of the visual positioning method of the present invention;
FIG. 6 is a schematic block diagram of an embodiment of the training device for a matching prediction model of the present invention;
FIG. 7 is a schematic block diagram of an embodiment of the visual positioning device of the present invention;
FIG. 8 is a schematic block diagram of an embodiment of the electronic device of the present invention;
FIG. 9 is a schematic block diagram of an embodiment of the computer-readable storage medium of the present invention.

S11~S14:步驟 S11~S14: Steps

Claims (15)

一種匹配預測模型的訓練方法,包括: 利用樣本圖像和地圖資料,構建樣本匹配資料,其中,所述樣本匹配資料包括若干組點對以及每組點對的實際匹配值,每組點對的兩個點分別來自所述樣本圖像和所述地圖資料; 利用匹配預測模型對所述若干組點對進行預測處理,得到所述點對的預測匹配值; 利用所述實際匹配值和所述預測匹配值,確定所述匹配預測模型的損失值; 利用所述損失值,調整所述匹配預測模型的參數。 A training method for a matching prediction model, comprising: constructing sample matching data by using a sample image and map data, wherein the sample matching data includes several groups of point pairs and an actual matching value of each group of point pairs, and the two points of each group of point pairs come respectively from the sample image and the map data; performing prediction processing on the several groups of point pairs by using the matching prediction model to obtain predicted matching values of the point pairs; determining a loss value of the matching prediction model by using the actual matching values and the predicted matching values; and adjusting parameters of the matching prediction model by using the loss value. 根據請求項1所述的訓練方法,其中,所述利用樣本圖像和地圖資料,構建樣本匹配資料包括: 從所述樣本圖像中獲取若干圖像點,以及從所述地圖資料中獲取若干地圖點,以組成若干組點對;其中,所述若干組點對包括至少一組所包含的圖像點和地圖點之間匹配的匹配點對; 對於每組所述匹配點對:利用所述樣本圖像的位姿參數將所述地圖點投影至所述樣本圖像所屬的維度中,得到所述地圖點的投影點;並基於所述圖像點和所述投影點之間的差異,確定所述匹配點對的實際匹配值。 The training method according to claim 1, wherein the constructing sample matching data by using the sample image and the map data includes: acquiring several image points from the sample image and acquiring several map points from the map data to form several groups of point pairs, wherein the several groups of point pairs include at least one matching point pair in which the included image point and map point match each other; and, for each matching point pair: projecting the map point into the dimension to which the sample image belongs by using a pose parameter of the sample image to obtain a projected point of the map point, and determining an actual matching value of the matching point pair based on a difference between the image point and the projected point. 
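The four claimed steps (construct sample matching data, predict matching values, compute a loss, adjust parameters) can be sketched with a deliberately tiny stand-in model. The one-parameter sigmoid "model", the scalar features, and the learning rate below are all assumptions for illustration only; the patent's actual model is a neural network operating on point pairs.

```python
import math

# Minimal runnable sketch of the four claimed steps. The 1-parameter "model"
# and the scalar features are toy stand-ins, not the patent's network.

def predict(w, feature):
    """Predicted matching value in (0, 1) via a sigmoid."""
    return 1.0 / (1.0 + math.exp(-w * feature))

def train_step(w, samples, lr=0.1):
    """One parameter update from (feature, actual_matching_value) samples."""
    grad = 0.0
    for feature, actual in samples:
        pred = predict(w, feature)
        # d/dw of 0.5*(pred - actual)^2, using the sigmoid derivative pred*(1-pred)
        grad += (pred - actual) * pred * (1.0 - pred) * feature
    return w - lr * grad / len(samples)

# matching pair -> actual value 1, non-matching pair -> actual value 0
samples = [(2.0, 1.0), (-2.0, 0.0)]
w = 0.0
for _ in range(200):
    w = train_step(w, samples)
```

After training, the toy model assigns high predicted matching values to the matching sample and low ones to the non-matching sample, mirroring the loss-driven parameter adjustment the claim describes.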
根據請求項2所述的訓練方法,其中,所述若干組點對包括至少一組所包含的圖像點和地圖點之間不匹配的非匹配點對,所述利用樣本圖像和地圖資料,構建樣本匹配資料還包括: 將所述非匹配點對的實際匹配值設置為預設數值。 The training method according to claim 2, wherein the several groups of point pairs include at least one non-matching point pair in which the included image point and map point do not match each other, and the constructing sample matching data by using the sample image and the map data further includes: setting the actual matching value of the non-matching point pair to a preset value. 
根據請求項2或3所述的訓練方法,其中,所述利用所述樣本圖像的位姿參數將所述地圖點投影至所述樣本圖像所屬的維度中,得到所述地圖點的投影點包括: 基於所述匹配點對,計算所述樣本圖像的位姿參數; 利用所述位姿參數將所述地圖點投影至所述樣本圖像所屬的維度中,得到所述地圖點的投影點;和/或, 所述基於所述圖像點和所述投影點之間的差異,確定所述匹配點對的實際匹配值包括: 利用預設概率分佈函數將所述差異轉換為概率密度值,作為所述匹配點對的實際匹配值。 The training method according to claim 2 or 3, wherein the map point is projected into the dimension to which the sample image belongs by using the pose parameter of the sample image to obtain the projection of the map point Points include: Based on the matched point pair, calculate the pose parameter of the sample image; Using the pose parameter to project the map point into the dimension to which the sample image belongs, to obtain the projected point of the map point; and/or, The determining the actual matching value of the matching point pair based on the difference between the image point and the projection point includes: The difference is converted into a probability density value using a preset probability distribution function as the actual matching value of the matching point pair. 
根據請求項1至3任一項所述的訓練方法,其中,所述樣本匹配資料為二分圖,所述二分圖包括若干組點對和連接每組點對的連接邊,且所述連接邊標注有對應所述點對的實際匹配值;所述匹配預測模型包括與所述樣本圖像所屬的維度對應的第一點特徵提取子模型、與所述地圖資料所屬的維度對應的第二點特徵提取子模型以及邊特徵提取子模型; 所述利用匹配預測模型對所述若干組點對進行預測處理,得到所述點對的預測匹配值包括: 分別利用所述第一點特徵提取子模型和所述第二點特徵提取子模型對所述二分圖進行特徵提取,得到第一特徵和第二特徵; 利用所述邊特徵提取子模型對所述第一特徵和所述第二特徵進行特徵提取,得到第三特徵; 利用所述第三特徵,得到所述連接邊對應的點對的預測匹配值。 The training method according to any one of claim 1 to 3, wherein the sample matching data is a bipartite graph, and the bipartite graph includes several groups of point pairs and connecting edges connecting each group of point pairs, and the connecting edges The actual matching value corresponding to the point pair is marked; the matching prediction model includes a first point feature extraction sub-model corresponding to the dimension to which the sample image belongs, and a second point corresponding to the dimension to which the map data belongs Feature extraction sub-model and edge feature extraction sub-model; The performing prediction processing on the several groups of point pairs by using the matching prediction model, and obtaining the predicted matching values of the point pairs includes: Using the first point feature extraction sub-model and the second point feature extraction sub-model to perform feature extraction on the bipartite graph, respectively, to obtain the first feature and the second feature; Using the edge feature extraction sub-model to perform feature extraction on the first feature and the second feature to obtain a third feature; Using the third feature, the predicted matching value of the point pair corresponding to the connecting edge is obtained. 
根據請求項6所述的訓練方法,其中,所述第一點特徵提取子模型和所述第二點特徵提取子模型的結構為以下任一種:包括至少一個殘差塊,包括至少一個殘差塊和至少一個空間變換網路;和/或, 所述邊特徵提取子模型包括至少一個殘差塊。 The training method according to claim 6, wherein the structures of the first point feature extraction sub-model and the second point feature extraction sub-model are any of the following: including at least one residual block, including at least one residual blocks and at least one spatial transformation network; and/or, The edge feature extraction submodel includes at least one residual block. 根據請求項1至3任一項所述的訓練方法,其中,所述若干組點對包括至少一組所包含的圖像點和地圖點之間匹配的匹配點對和至少一組所包含的圖像點和地圖點之間不匹配的非匹配點對; 所述利用所述實際匹配值和所述預測匹配值,確定所述匹配預測模型的損失值包括: 利用所述匹配點對的所述預測匹配值和所述實際匹配值,確定所述匹配預測模型的第一損失值; 並利用所述非匹配點對的所述預測匹配值和所述實際匹配值,確定所述匹配預測模型的第二損失值; 對所述第一損失值和所述第二損失值進行加權處理,得到所述匹配預測模型的損失值。 The training method according to any one of claim 1 to 3, wherein the several sets of point pairs include at least one set of matching point pairs that match between image points and map points contained in at least one set and at least one set of contained point pairs. non-matching point pairs that do not match between image points and map points; The determining the loss value of the matching prediction model by using the actual matching value and the predicted matching value includes: Using the predicted matching value and the actual matching value of the matching point pair to determine a first loss value of the matching prediction model; and using the predicted matching value and the actual matching value of the non-matching point pair to determine the second loss value of the matching prediction model; The first loss value and the second loss value are weighted to obtain the loss value of the matching prediction model. 
根據請求項8所述的訓練方法,其中,所述利用所述匹配點對的所述預測匹配值和所述實際匹配值,確定所述匹配預測模型的第一損失值之前,所述方法還包括: 分別統計所述匹配點對的第一數量,以及所述非匹配點對的第二數量; 所述利用所述匹配點對的所述預測匹配值和所述實際匹配值,確定所述匹配預測模型的第一損失值包括: 利用所述匹配點對的所述預測匹配值和所述實際匹配值之間的差值,以及所述第一數量,確定所述第一損失值; 所述利用所述非匹配點對的所述預測匹配值和所述實際匹配值,確定所述匹配預測模型的第二損失值包括: 利用所述非匹配點對的所述預測匹配值和所述實際匹配值之間的差值,以及所述第二數量,確定所述第二損失值。 The training method according to claim 8, wherein before the determining the first loss value of the matching prediction model by using the predicted matching value and the actual matching value of the matching point pair, the method further includes: counting a first number of the matching point pairs and a second number of the non-matching point pairs respectively; the determining the first loss value of the matching prediction model by using the predicted matching value and the actual matching value of the matching point pair includes: determining the first loss value by using a difference between the predicted matching value and the actual matching value of the matching point pair, and the first number; and the determining the second loss value of the matching prediction model by using the predicted matching value and the actual matching value of the non-matching point pair includes: determining the second loss value by using a difference between the predicted matching value and the actual matching value of the non-matching point pair, and the second number. 根據請求項1至3任一項所述的訓練方法,其中,所述樣本圖像所屬的維度為2維或3維,所述地圖資料所屬的維度為2維或3維。 The training method according to any one of claims 1 to 3, wherein the dimension to which the sample image belongs is 2-dimensional or 3-dimensional, and the dimension to which the map data belongs is 2-dimensional or 3-dimensional. 
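A sketch of the count-normalised weighted loss described in the claims above. Squared error and equal 0.5/0.5 weights are assumptions made here for illustration; the claims specify only that the first and second loss values are computed from prediction/actual differences and the respective pair counts, then combined by weighting.

```python
# Sketch of the claimed weighted loss: average the matched-pair loss and the
# non-matched-pair loss separately (normalising each by its pair count), then
# combine. Squared error and the 0.5/0.5 weights are assumptions.

def matching_loss(preds, actuals, is_match, w_match=0.5, w_non=0.5):
    n_match = sum(is_match)                 # first number (matching pairs)
    n_non = len(is_match) - n_match         # second number (non-matching pairs)
    loss_match = sum((p - a) ** 2 for p, a, m in zip(preds, actuals, is_match) if m)
    loss_non = sum((p - a) ** 2 for p, a, m in zip(preds, actuals, is_match) if not m)
    first = loss_match / max(n_match, 1)    # first loss value, count-normalised
    second = loss_non / max(n_non, 1)       # second loss value, count-normalised
    return w_match * first + w_non * second
```

Normalising by the separate counts keeps the usually much larger non-matching set from drowning out the matching pairs during training.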
一種視覺定位方法,包括: 利用待定位圖像和地圖資料,構建待識別匹配資料,其中,所述待識別匹配資料包括若干組點對,每組點對的兩個點分別來自所述待定位圖像和所述地圖資料; 利用匹配預測模型對所述若干組點對進行預測處理,得到所述點對的預測匹配值; 基於所述點對的預測匹配值,確定所述待定位圖像的攝影器件的位姿參數。 A visual positioning method, comprising: constructing to-be-identified matching data by using a to-be-positioned image and map data, wherein the to-be-identified matching data includes several groups of point pairs, and the two points of each group of point pairs come respectively from the to-be-positioned image and the map data; performing prediction processing on the several groups of point pairs by using a matching prediction model to obtain predicted matching values of the point pairs; and determining a pose parameter of a photographing device of the to-be-positioned image based on the predicted matching values of the point pairs. 根據請求項11所述的視覺定位方法,其中,所述基於所述點對的預測匹配值,確定所述待定位圖像的攝影器件的位姿參數,包括: 將所述若干組點對按照所述預測匹配值從高到低的順序進行排序; 利用前預設數量組所述點對,確定所述待定位圖像的攝影器件的位姿參數。 The visual positioning method according to claim 11, wherein the determining the pose parameter of the photographing device of the to-be-positioned image based on the predicted matching values of the point pairs includes: sorting the several groups of point pairs in descending order of the predicted matching values; and determining the pose parameter of the photographing device of the to-be-positioned image by using the first preset number of groups of point pairs. 根據請求項11或12所述的視覺定位方法,其中,所述匹配預測模型是利用請求項1至10任一項所述的匹配預測模型的訓練方法得到的。 The visual positioning method according to claim 11 or 12, wherein the matching prediction model is obtained by the training method for a matching prediction model according to any one of claims 1 to 10. 
一種電子設備,包括相互耦接的記憶體和處理器,所述處理器用於執行所述記憶體中儲存的程式指令,以實現請求項1至10任一項所述的匹配預測模型的訓練方法,或請求項11至13任一項所述的視覺定位方法。 An electronic device, comprising a memory and a processor coupled to each other, wherein the processor is configured to execute program instructions stored in the memory to implement the training method for a matching prediction model according to any one of claims 1 to 10, or the visual positioning method according to any one of claims 11 to 13. 一種電腦可讀儲存介質,其上儲存有程式指令,所述程式指令被處理器執行時實現請求項1至10任一項所述的匹配預測模型的訓練方法,或請求項11至13任一項所述的視覺定位方法。 A computer-readable storage medium, storing program instructions which, when executed by a processor, implement the training method for a matching prediction model according to any one of claims 1 to 10, or the visual positioning method according to any one of claims 11 to 13.
TW110132126A 2020-10-16 2021-08-30 Visual positioning method, training method of related models, electronic device and computer-readable storage medium TW202217662A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011110569.5 2020-10-16
CN202011110569.5A CN112328715B (en) 2020-10-16 2020-10-16 Visual positioning method, training method of related model, related device and equipment

Publications (1)

Publication Number Publication Date
TW202217662A true TW202217662A (en) 2022-05-01

Family

ID=74313967

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110132126A TW202217662A (en) 2020-10-16 2021-08-30 Visual positioning method, training method of related models, electronic device and computer-readable storage medium

Country Status (5)

Country Link
JP (1) JP7280393B2 (en)
KR (1) KR20220051162A (en)
CN (1) CN112328715B (en)
TW (1) TW202217662A (en)
WO (1) WO2022077863A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112328715B (en) * 2020-10-16 2022-06-03 浙江商汤科技开发有限公司 Visual positioning method, training method of related model, related device and equipment
CN113240656B (en) * 2021-05-24 2023-04-07 浙江商汤科技开发有限公司 Visual positioning method and related device and equipment
CN113822916B (en) * 2021-08-17 2023-09-15 北京大学 Image matching method, device, equipment and readable storage medium
KR20240018627A (en) * 2021-11-30 2024-02-13 컨템포러리 엠퍼렉스 테크놀로지 씨오., 리미티드 Machine vision inspection method, inspection device, and inspection system
CN114998600B (en) * 2022-06-17 2023-07-25 北京百度网讯科技有限公司 Image processing method, training method, device, equipment and medium for model
CN117351306B (en) * 2023-12-04 2024-03-22 齐鲁空天信息研究院 Training method, determining method and device for three-dimensional point cloud projection pose solver

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI356355B (en) * 2007-12-03 2012-01-11 Inst Information Industry Motion transition method and system for dynamic im
CN102236798B (en) * 2011-08-01 2012-12-05 清华大学 Image matching method and device
WO2016118499A1 (en) * 2015-01-19 2016-07-28 The Regents Of The University Of Michigan Visual localization within lidar maps
US9934587B2 (en) * 2016-06-30 2018-04-03 Daqri, Llc Deep image localization
CN107967457B (en) * 2017-11-27 2024-03-19 全球能源互联网研究院有限公司 Site identification and relative positioning method and system adapting to visual characteristic change
CN109658445A (en) * 2018-12-14 2019-04-19 北京旷视科技有限公司 Network training method, increment build drawing method, localization method, device and equipment
CN110009722A (en) * 2019-04-16 2019-07-12 成都四方伟业软件股份有限公司 Three-dimensional rebuilding method and device
CN110095752B (en) * 2019-05-07 2021-08-10 百度在线网络技术(北京)有限公司 Positioning method, apparatus, device and medium
CN110274598B (en) * 2019-06-24 2023-03-24 西安工业大学 Robot monocular vision robust positioning estimation method
CN110473259A (en) * 2019-07-31 2019-11-19 深圳市商汤科技有限公司 Pose determines method and device, electronic equipment and storage medium
CN111260726A (en) * 2020-02-07 2020-06-09 北京三快在线科技有限公司 Visual positioning method and device
CN111508019A (en) * 2020-03-11 2020-08-07 上海商汤智能科技有限公司 Target detection method, training method of model thereof, and related device and equipment
CN111414968B (en) * 2020-03-26 2022-05-03 西南交通大学 Multi-mode remote sensing image matching method based on convolutional neural network characteristic diagram
CN111476251A (en) * 2020-03-26 2020-07-31 中国人民解放军战略支援部队信息工程大学 Remote sensing image matching method and device
CN111538855B (en) * 2020-04-29 2024-03-08 浙江商汤科技开发有限公司 Visual positioning method and device, electronic equipment and storage medium
CN111627065B (en) * 2020-05-15 2023-06-20 Oppo广东移动通信有限公司 Visual positioning method and device and storage medium
CN111652929A (en) * 2020-06-03 2020-09-11 全球能源互联网研究院有限公司 Visual feature identification and positioning method and system
CN111627050B (en) * 2020-07-27 2020-12-01 杭州雄迈集成电路技术股份有限公司 Training method and device for target tracking model
CN112328715B (en) * 2020-10-16 2022-06-03 浙江商汤科技开发有限公司 Visual positioning method, training method of related model, related device and equipment

Also Published As

Publication number Publication date
JP7280393B2 (en) 2023-05-23
CN112328715B (en) 2022-06-03
CN112328715A (en) 2021-02-05
JP2023502819A (en) 2023-01-26
KR20220051162A (en) 2022-04-26
WO2022077863A1 (en) 2022-04-21
