WO2021237443A1 - Visual positioning method and apparatus, device and readable storage medium - Google Patents

Visual positioning method and apparatus, device and readable storage medium Download PDF

Info

Publication number
WO2021237443A1
WO2021237443A1 (PCT/CN2020/092284)
Authority
WO
WIPO (PCT)
Prior art keywords
photos
positioning
visual positioning
panoramic
neural network
Prior art date
Application number
PCT/CN2020/092284
Other languages
French (fr)
Chinese (zh)
Inventor
陈尊裕
吴珏其
胡斯洋
陈欣
吴沛谦
张仲文
Original Assignee
蜂图志科技控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 蜂图志科技控股有限公司 filed Critical 蜂图志科技控股有限公司
Priority to JP2022566049A priority Critical patent/JP7446643B2/en
Priority to CN202080001067.0A priority patent/CN111758118B/en
Priority to PCT/CN2020/092284 priority patent/WO2021237443A1/en
Publication of WO2021237443A1 publication Critical patent/WO2021237443A1/en

Classifications

    • G06T7/97 Determining parameters from multiple pictures (G06T7/00 Image analysis)
    • G06N3/044 Recurrent networks, e.g. Hopfield networks (G06N3/04 Architecture, e.g. interconnection topology)
    • G06N3/045 Combinations of networks (G06N3/04 Architecture, e.g. interconnection topology)
    • G06N3/08 Learning methods (G06N3/02 Neural networks)
    • G06T2207/20081 Training; Learning (G06T2207/20 Special algorithmic details)
    • G06T2207/20084 Artificial neural networks [ANN] (G06T2207/20 Special algorithmic details)

Definitions

  • This application relates to the field of positioning technology, and in particular to a visual positioning method, device, equipment, and readable storage medium.
  • The principle of machine-learning-based visual positioning: use a large number of real-scene photos with location labels for training to obtain a neural network model whose input is a photo (an RGB value matrix) and whose output is a specific location. Once the trained neural network model is available, the user only needs to take a photo of the environment to obtain the specific shooting location.
  • This method requires collecting a large number of photo samples of the use environment as a training data set. For example, some documents record that visually positioning a 35-meter-wide street-corner store required collecting 330 photos; visually positioning a 140-meter street (one side only) required collecting more than 1,500 photos; and positioning a certain factory required dividing it into 18 areas, with 200 images taken in each area. Clearly, to guarantee the visual positioning effect, a large number of on-site photos must be collected as training data, and these photos must reach every corner of the scene, which is very time-consuming and labor-intensive.
  • The purpose of this application is to provide a visual positioning method, device, equipment, and readable storage medium that use panoramic photos in a real-scene map to train a neural network model, which can solve the problem of difficult sample collection in visual positioning.
  • a visual positioning method including:
  • the positioning model is a neural network model trained by using panoramic photos in a real map
  • the final position is determined.
  • said determining the final position using a plurality of said candidate positions includes:
  • the geometric center of the geometric figure is taken as the final positioning.
  • it also includes:
  • the standard deviation is used as the positioning error of the final positioning.
  • the process of training the neural network model includes:
  • the geographic mark includes a geographic location and a specific orientation
  • the neural network model is trained using the training samples, and the trained neural network model is determined as the positioning model.
  • said performing the anti-distortion transformation on several of said panoramic photos to obtain several groups of plane projection photos with the same aspect ratio includes:
  • each of the panoramic photos is divided according to different focal length parameters, and several groups of plane projection photos with different viewing angles are obtained.
  • dividing each of the panoramic photos according to different focal length parameters to obtain several groups of plane projection photos with different viewing angles includes:
  • Each of the panoramic photos is segmented according to the number of segments for which the corresponding original image coverage is greater than a specified percentage, obtaining several groups of plane projection photos in which adjacent images have overlapping viewing angles.
  • the process of training the neural network model further includes:
  • the training samples are supplemented by using scene photos obtained from the Internet or environment photos collected from the positioning environment.
  • performing random segmentation on the wide-angle photos to obtain the atlas to be tested includes:
  • random division is performed on the wide-angle photo according to a number of divisions for which the original image coverage is greater than a specified percentage, and an atlas to be tested matching the number of divisions is obtained.
  • a visual positioning device includes:
  • the atlas to be tested acquisition module is used to acquire wide-angle photos, and randomly segment the wide-angle photos to obtain the atlas to be tested;
  • the candidate positioning acquisition module is used to input the atlas to be tested into a positioning model for positioning recognition to obtain multiple candidate positionings;
  • the positioning model is a neural network model trained by using panoramic photos in a real-world map;
  • the positioning output module is used to determine the final positioning by using a plurality of the candidate positionings.
  • a visual positioning device including:
  • the memory is used to store a computer program;
  • the processor is used to implement the above-mentioned visual positioning method when the computer program is executed.
  • The real-scene map is a map in which the actual street scene can be viewed, and it contains 360-degree real-scene imagery.
  • The panoramic photos in the real-scene map are real street-view imagery, whose coverage overlaps with the application environment of visual positioning.
  • The neural network model is trained with the panoramic photos in the real-scene map to obtain a positioning model for visual positioning. After a wide-angle photo is obtained, it is randomly segmented to obtain the atlas to be tested. The atlas to be tested is input into the positioning model for positioning recognition, yielding multiple candidate positionings, from which the final positioning is determined.
  • a positioning model can be obtained by training the neural network model based on the panoramic photos in the real scene map, and the visual positioning can be completed based on the positioning model, which solves the problem of difficulty in the collection of visual positioning training samples.
  • the embodiments of the present application also provide devices, equipment, and readable storage media corresponding to the above-mentioned visual positioning method, which have the above-mentioned technical effects, and will not be repeated here.
  • Fig. 1 is an implementation flowchart of a visual positioning method in an embodiment of the application
  • FIG. 2 is a schematic diagram of a perspective segmentation in an embodiment of this application.
  • FIG. 3 is a schematic structural diagram of a visual positioning device in an embodiment of the application.
  • FIG. 4 is a schematic structural diagram of a visual positioning device in an embodiment of this application.
  • Fig. 5 is a schematic diagram of a specific structure of a visual positioning device in an embodiment of the application.
  • The visual positioning method provided in the embodiment of the present invention can be applied directly on a cloud server or on a local device.
  • Any device that needs to be positioned can be positioned through a wide-angle photo, provided it has photo-taking and networking functions.
  • FIG. 1 is a flowchart of a visual positioning method in an embodiment of the application. The method includes the following steps:
  • Wide-angle, that is, pictures taken with a wide-angle lens or in panoramic mode. Simply put, the shorter the focal length, the wider the field of view, and the wider the range of the scene that can be accommodated in the photo.
  • the panoramic photos in the real map are used to train the neural network model. Therefore, in order to better perform visual positioning, when using the positioning model for visual positioning, the required photos are also wide-angle photos.
  • the user can use the wide-angle mode (or ultra-wide-angle mode) or the panoramic mode to take a picture of the surrounding environment at a location that needs to be positioned.
  • Generally, a photo whose angle of view exceeds 120 degrees (of course, other values such as 140 degrees or 180 degrees are also possible) is regarded as a wide-angle photo.
  • the wide-angle photos are obtained, they are randomly divided to obtain the atlas to be tested composed of several divided photos.
  • The specific number of photos into which the wide-angle photo is divided can be set according to the training effect of the positioning model and the actual positioning accuracy requirements.
  • In general, the larger the number of divisions, the higher the positioning accuracy.
  • The more training iterations of the model, the longer the training time.
  • When segmenting the wide-angle photo, it can also be randomly divided according to a number of divisions for which the original image coverage is greater than a specified percentage, to obtain an atlas to be tested matching the number of divisions.
  • The wide-angle photo can be randomly divided into N pieces with an aspect ratio of 1:1 (it should be noted that other aspect ratios are also possible, as long as the aspect ratio matches that of the training samples used to train the positioning model). That is, images whose height is 1/3 to 1/2 of the height of the wide-angle photo are used as the atlas to be tested.
  • the number of N is set according to the training effect and positioning accuracy needs.
  • When the training effect is relatively poor and the positioning accuracy requirement is high, select a larger value of N.
  • For example, N can be set to 100 (of course, other values such as 50 or 80 can also be selected; they are not enumerated here one by one).
  • In addition, the random segmentation result requires the coverage of the original image (that is, the wide-angle photo) to be greater than 95% (of course, other percentages can also be set, which are not enumerated here).
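As a concrete illustration, the random-segmentation step above can be sketched as follows. This is a minimal sketch, not the patented implementation: the function name, the NumPy coverage mask, and the retry loop are illustrative assumptions; only the 1:1 aspect ratio, the 1/3 to 1/2 height range, N = 100, and the >95% coverage requirement come from the text.

```python
import random
import numpy as np

def random_segment(wide_angle, n=100, min_coverage=0.95, seed=None):
    """Randomly crop a wide-angle photo (H x W x 3 array) into n square
    patches (aspect ratio 1:1) whose side is 1/3 to 1/2 of the photo
    height, retrying until the union of patches covers more than
    min_coverage of the original image."""
    rng = random.Random(seed)
    h, w, _ = wide_angle.shape
    while True:
        crops, mask = [], np.zeros((h, w), dtype=bool)
        for _ in range(n):
            side = rng.randint(h // 3, h // 2)      # crop height = 1/3..1/2 of H
            y = rng.randint(0, h - side)
            x = rng.randint(0, w - side)
            crops.append(wide_angle[y:y + side, x:x + side])
            mask[y:y + side, x:x + side] = True     # record covered pixels
        if mask.mean() > min_coverage:              # original-image coverage > 95%
            return crops
```

With N = 100 crops of this size, the expected union coverage of a typical wide-angle frame comfortably exceeds 95%, so the retry loop rarely runs more than once.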
  • S102: Input the atlas to be tested into the positioning model for positioning recognition, and obtain multiple candidate positionings.
  • the positioning model is a neural network model trained by using panoramic photos in the real map.
  • each segmented photo in the atlas to be measured is input into the positioning model for positioning recognition, and an output about the positioning result is obtained for each photo.
  • the positioning result corresponding to each divided photo is used as a candidate positioning.
  • the process of training a neural network model includes:
  • Step 1 Obtain a number of panoramic photos from the real-world map, and determine the geographic location of each real-world photo;
  • Step 2 Perform anti-distortion transformation on several panoramic photos to obtain several groups of plane projection photos with the same aspect ratio;
  • Step 3 Mark a geographic mark for each group of plane projection photos according to the correspondence with the panoramic photos; the geographic mark includes the geographic location and the specific orientation;
  • Step 4 Use the geo-tagged plane projection photos as training samples
  • Step 5 Use the training samples to train the neural network model, and determine the trained neural network model as the positioning model.
  • The panoramic photo can be subjected to anti-distortion transformation to obtain several groups of plane projection photos with the same aspect ratio. Since the panoramic photos in the real-scene map correspond to geographic locations, in this embodiment the geographic location of a group of plane projection photos divided from the same panoramic photo is that of the panoramic photo. In addition, because a panoramic photo is divided based on viewing angle, the orientation of each divided photo is also known. In this embodiment, the geographic location and the specific orientation together form the geographic tag added to each photo; in other words, every plane projection photo has a corresponding geographic location and specific orientation.
  • The trained neural network model is the positioning model. Specifically, the collection of photos with specific locations and specific orientations can be used as the data pool. Randomly select 80% of the data pool as the training set and the remaining 20% as the test set; the ratio can be adjusted according to the actual training situation. Input the training set into a neural network model, either newly initialized or pre-trained on a large-scale image set, for training, and use the test set to verify the training results.
  • CNN (Convolutional Neural Network): a feedforward neural network comprising alternating convolutional and pooling layers and their derivative structures;
  • LSTM (Long Short-Term Memory) networks;
  • RNN (recurrent neural networks);
  • hybrid structures, etc.
  • the specific neural network used is not limited.
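The data-pool split described above (80% training, 20% test, with an adjustable ratio) can be sketched in plain Python. The function name and the sample representation are illustrative assumptions; the ratio and the random selection come from the text.

```python
import random

def split_data_pool(samples, train_frac=0.8, seed=42):
    """Randomly split geo-tagged plane projection photos into a training
    set and a test set (80/20 by default; the ratio can be adjusted to
    the actual training situation). `samples` is any list of
    (photo, geo_tag) pairs."""
    rng = random.Random(seed)
    pool = list(samples)
    rng.shuffle(pool)                       # random selection from the pool
    cut = int(len(pool) * train_frac)       # 80% / 20% boundary
    return pool[:cut], pool[cut:]
```

The split is done once over the whole pool so that every photo lands in exactly one of the two sets.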
  • the panoramic photos can be segmented according to different focal length parameters, so as to obtain plane projection photos with different viewing angles as training samples.
  • each panoramic photo can be divided according to different focal length parameters in the anti-distortion transformation to obtain several groups of plane projection photos with different viewing angles. That is, the number of divisions n is determined according to the focal length parameter F.
  • Fig. 2 is a schematic diagram of a viewing angle segmentation in an embodiment of this application.
  • the focal length parameter F can also be changed to other values, such as 1.0 and 1.3, to obtain plane projection photos with other perspectives.
  • When segmenting the panoramic photo, it can also be segmented according to a number of segments for which the original image coverage is greater than a specified percentage; that is, at the same viewing angle, adjacent pictures have overlapping viewing angles. Specifically, each panoramic photo is segmented according to the number of segments for which the corresponding original image coverage is greater than a specified percentage, obtaining several groups of plane projection photos in which adjacent images have overlapping viewing angles. In other words, to enrich the shooting angles of the photos, it is recommended that the number of divisions be greater than the number of equal divisions at a fixed focal length.
  • Taking the axis of the panoramic projection sphere perpendicular to the ground as the rotation axis, the line-of-sight center (the arrow in Fig. 2) is rotated every 45 degrees, splitting off a plane projection photo with a 90-degree viewing angle at each step.
  • Adjacent pictures therefore share a 45-degree overlapping viewing angle.
  • Orientation data is then marked on each resulting plane projection photo. Because the value of F can also be 1.0 or 1.3, giving viewing angles of about 60 degrees and 30 degrees respectively, the value of n can correspondingly be 12 or 24. More values of F and larger n can be set to further improve the coverage of the training set; generally, the coverage rate should be greater than 95%.
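The anti-distortion transformation itself is not spelled out in the text; a standard way to realize it is to re-project the equirectangular panorama through a pinhole-camera model, one yaw angle at a time. The sketch below makes that assumption: the function and parameter names are illustrative, pitch is fixed at zero (horizontal line of sight), and nearest-neighbor sampling is used for brevity.

```python
import math
import numpy as np

def equirect_to_perspective(pano, fov_deg=90.0, yaw_deg=0.0, out_size=256):
    """Project one perspective view out of an equirectangular panorama
    (H x W x 3 array). The rotation axis is the vertical axis of the
    projection sphere, matching the per-45-degree splitting in Fig. 2."""
    h, w, _ = pano.shape
    # Pinhole focal length (in pixels) for the requested field of view.
    f = (out_size / 2) / math.tan(math.radians(fov_deg) / 2)
    u, v = np.meshgrid(np.arange(out_size) - out_size / 2,
                       np.arange(out_size) - out_size / 2)
    # Ray directions in camera coordinates, rotated by yaw about the
    # vertical axis (axis perpendicular to the ground).
    x, y, z = u, v, np.full_like(u, f, dtype=float)
    yaw = math.radians(yaw_deg)
    xr = x * math.cos(yaw) + z * math.sin(yaw)
    zr = -x * math.sin(yaw) + z * math.cos(yaw)
    lon = np.arctan2(xr, zr)                      # longitude: -pi .. pi
    lat = np.arctan2(y, np.hypot(xr, zr))         # latitude: -pi/2 .. pi/2
    px = ((lon / math.pi + 1) / 2 * (w - 1)).astype(int)
    py = ((lat / (math.pi / 2) + 1) / 2 * (h - 1)).astype(int)
    return pano[py.clip(0, h - 1), px.clip(0, w - 1)]

# Eight 90-degree views stepped every 45 degrees, as in Fig. 2:
# views = [equirect_to_perspective(pano, 90.0, yaw) for yaw in range(0, 360, 45)]
```

Stepping the yaw by half the field of view gives the 45-degree overlap between adjacent views described above.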
  • In the process of training the neural network model, the training samples can also be supplemented with scene photos obtained from the Internet or environment photos collected from the positioning environment.
  • the final location can be determined based on these candidate locations. After obtaining the final positioning, it can be output for the user to view.
  • one location can be randomly selected from the candidate locations as the final location, or several candidate locations can be randomly selected from the candidate locations, and the geometric centers of geometric figures corresponding to these candidate locations can be taken as the final location.
  • candidate locations with a high degree of overlap can also be used as the final location.
  • The candidate positionings can be clustered and filtered to remove those that stray from the majority of positions, and the final positioning is then determined from the remaining candidates.
  • the implementation process includes:
  • Step 1 Perform clustering processing on multiple candidate locations, and use the clustering results to screen multiple candidate locations;
  • Step 2 Use the selected candidate locations to construct geometric figures
  • Step three take the geometric center of the geometric figure as the final positioning.
  • a clustering algorithm such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise) can be used to classify candidate locations, and adjacent location data can be classified into one category.
  • the positioning error can also be determined.
  • The final positioning is used to calculate the standard deviation of the multiple candidate positionings, and the standard deviation is taken as the positioning error of the final positioning. That is, the squared deviation between each candidate positioning and the final positioning is calculated and accumulated to obtain the final positioning error.
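Putting the last few steps together, the candidate-fusion stage can be sketched as below. This is a simplified stand-in for a full DBSCAN: a density filter drops outlier candidates, the centroid (geometric center) of the survivors becomes the final positioning, and the standard deviation of survivor-to-center distances is reported as the positioning error. The `eps` and `min_neighbors` parameters and the fallback when everything is filtered out are illustrative assumptions.

```python
import math

def final_position(candidates, eps=5.0, min_neighbors=3):
    """Fuse candidate positionings (list of (x, y) tuples) into one final
    positioning and a positioning error, in the spirit of the
    cluster-filter-centroid procedure described in the text."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    # Density filter: keep candidates with enough close neighbours,
    # discarding positionings that stray from the majority.
    kept = [p for p in candidates
            if sum(dist(p, q) <= eps for q in candidates) - 1 >= min_neighbors]
    if not kept:                      # degenerate case: nothing survived
        kept = list(candidates)
    # Geometric center of the surviving candidates.
    cx = sum(p[0] for p in kept) / len(kept)
    cy = sum(p[1] for p in kept) / len(kept)
    # Standard deviation of distances to the center as the error.
    err = math.sqrt(sum(dist(p, (cx, cy)) ** 2 for p in kept) / len(kept))
    return (cx, cy), err
```

For example, five candidates clustered near one point plus one far outlier yield the cluster's centroid as the final positioning, with the outlier excluded from both the center and the error estimate.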
  • The real-scene map is a map in which the actual street scene can be viewed, and it contains 360-degree real-scene imagery.
  • The panoramic photos in the real-scene map are real street-view imagery, whose coverage overlaps with the application environment of visual positioning.
  • The neural network model is trained with the panoramic photos in the real-scene map to obtain a positioning model for visual positioning. After a wide-angle photo is obtained, it is randomly segmented to obtain the atlas to be tested. The atlas to be tested is input into the positioning model for positioning recognition, yielding multiple candidate positionings, from which the final positioning is determined.
  • a positioning model can be obtained by training the neural network model based on the panoramic photos in the real scene map, and the visual positioning can be completed based on the positioning model, which solves the problem of difficulty in the collection of visual positioning training samples.
  • the embodiments of the present application also provide corresponding improvement solutions.
  • the same steps as in the above-mentioned embodiments or the corresponding steps can be referred to each other, and the corresponding beneficial effects can also be referred to each other, which will not be repeated in the preferred/improved embodiments herein.
  • the embodiment of the present application also provides a visual positioning device.
  • the visual positioning device described below and the visual positioning method described above can be referred to each other.
  • the visual positioning device includes:
  • the atlas to be tested acquisition module 101 is used to acquire wide-angle photos, and randomly segment the wide-angle photos to obtain the atlas to be tested;
  • the candidate location acquisition module 102 is used to input the atlas to be tested into a location model for location recognition to obtain multiple candidate locations;
  • the location model is a neural network model trained by using panoramic photos in the real map;
  • the positioning output module 103 is used for determining the final positioning using multiple candidate positionings.
  • the wide-angle photos are obtained, and the wide-angle photos are randomly divided to obtain the atlas to be tested; the atlas to be tested is input to the positioning model for positioning recognition, and multiple candidate positions are obtained; the positioning model is Use the neural network model trained on the panoramic photos in the real map; use multiple candidate locations to determine the final location.
  • The real-scene map is a map in which the actual street scene can be viewed, and it contains 360-degree real-scene imagery.
  • The panoramic photos in the real-scene map are real street-view imagery, whose coverage overlaps with the application environment of visual positioning.
  • The neural network model is trained with the panoramic photos in the real-scene map to obtain a positioning model for visual positioning. After a wide-angle photo is obtained, it is randomly segmented to obtain the atlas to be tested. The atlas to be tested is input into the positioning model for positioning recognition, yielding multiple candidate positionings, from which the final positioning is determined.
  • a positioning model can be obtained by training the neural network model based on the panoramic photos in the real map, and based on the positioning model, the visual positioning can be completed, which solves the problem of difficulty in the collection of visual positioning training samples.
  • the positioning output module 103 specifically includes:
  • the positioning screening unit is used to perform clustering processing on multiple candidate locations, and use the clustering results to screen multiple candidate locations;
  • the geometric figure construction unit is used to construct a geometric figure by using several candidate positions obtained by screening;
  • the final positioning determining unit is used to take the geometric center of the geometric figure as the final positioning.
  • the positioning output module 103 further includes:
  • the positioning error determining unit is used to calculate the standard deviation of multiple candidate positioning by using the final positioning; the standard deviation is used as the positioning error of the final positioning.
  • the model training module includes:
  • the panoramic photo obtaining unit is used to obtain several panoramic photos from the real-world map and determine the geographic location of each real-world photo;
  • the anti-distortion transformation unit is used to perform anti-distortion transformation on several panoramic photos to obtain several groups of plane projection photos with the same aspect ratio;
  • the geotagging unit is used to tag each group of plane projection photos with geotags according to the corresponding relationship with the panoramic photos; the geotags include geographic location and specific orientation;
  • the training sample determination unit is used to use the geographic-tagged plane projection photos as the training sample
  • the model training unit is used to train the neural network model using training samples, and determine the trained neural network model as a positioning model.
  • the anti-warping transformation unit is specifically used to segment each panoramic photo according to different focal length parameters in the anti-warping transformation to obtain several groups of plane projection photos with different viewing angles.
  • the anti-distortion transformation unit is specifically used to segment each panoramic photo according to the number of segments for which the corresponding original image coverage is greater than a specified percentage, to obtain several groups of plane projection photos in which adjacent pictures have overlapping viewing angles.
  • model training module further includes:
  • the sample supplement unit is used to supplement the training samples by using the scene photos obtained from the Internet or the environment photos collected from the positioning environment.
  • the atlas to be tested acquisition module 101 is specifically configured to randomly segment the wide-angle photo according to a number of segments for which the original image coverage is greater than a specified percentage, to obtain an atlas to be tested matching the number of segments.
  • the embodiment of the present application also provides a visual positioning device.
  • the visual positioning device described below and the visual positioning method described above can be referenced correspondingly.
  • the visual positioning device includes:
  • the memory 410 is used to store computer programs
  • the processor 420 is configured to implement the steps of the visual positioning method provided in the foregoing method embodiment when executing a computer program.
  • FIG. 5 is a schematic diagram of a specific structure of a visual positioning device provided by this embodiment.
  • the visual positioning device may differ considerably in configuration or performance, and may include one or more processors (central processing units, CPU) 420 (for example, one or more processors) and memory 410, one or more of which store computer application programs 413 or data 412.
  • the memory 410 may be short-term storage or persistent storage.
  • the computer application program may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the data processing device.
  • the central processing unit 420 may be configured to communicate with the memory 410 and execute a series of instruction operations in the memory 410 on the visual positioning device 301.
  • the visual positioning device 400 may also include one or more power supplies 430, one or more wired or wireless network interfaces 440, one or more input and output interfaces 450, and/or one or more operating systems 411.
  • the steps in the visual positioning method described above can be implemented by the structure of the visual positioning device.
  • the embodiment of the present application also provides a readable storage medium, and a readable storage medium described below and a visual positioning method described above can be referenced correspondingly.
  • a readable storage medium in which a computer program is stored, and when the computer program is executed by a processor, the steps of the visual positioning method provided by the foregoing method embodiment are implemented.
  • the readable storage medium can specifically be a U disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other readable storage medium that can store program code.

Abstract

A visual positioning method and apparatus, a device, and a readable storage medium, the method comprising: acquiring a wide-angle photo, and randomly segmenting the wide-angle photo to obtain an atlas to be measured (S101); and inputting the atlas into a positioning model for position identification to obtain a plurality of candidate positions, wherein the positioning model is a neural network model trained by using panoramic photos in a real map (S102); and determining the final position by using the plurality of candidate positions (S103). The positioning model may be obtained by training the neural network model on the basis of the panoramic photos in the real map, and visual positioning may be completed on the basis of the positioning model, which solves the problem of the collection difficulty of training samples for visual positioning.

Description

一种视觉定位方法、装置、设备及可读存储介质Visual positioning method, device, equipment and readable storage medium 技术领域Technical field
本申请涉及定位技术领域,特别是涉及一种视觉定位方法、装置、设备及可读存储介质。This application relates to the field of positioning technology, and in particular to a visual positioning method, device, equipment, and readable storage medium.
背景技术Background technique
基于机器学习的视觉定位原理:利用大量的带有位置标记的真实场景照片进行训练,得到一个输入为照片(RGB数值矩阵),输出为具体的位置的神经网络模型。获得训练好的神经网络模型后,只需要用户对环境拍摄一张照片就可以得到具体的拍摄位置。The principle of visual positioning based on machine learning: Use a large number of real scene photos with location markers for training, and get a neural network model whose input is a photo (RGB numerical matrix) and output is a specific location. After obtaining the trained neural network model, the user only needs to take a picture of the environment to get the specific shooting location.
这种方法需要对使用环境采集大量的照片样本作为训练数据集。例如,在一些文献中记载,为了实现对35米宽的街角店铺进行视觉定位,需要采集330张照片,而为了实现对140米的街道(只针对一侧进行定位)进行视觉定位,需采集1500多张照片;为了实现某工厂定位,需将工厂划分为18个区域,每个区域需要拍摄200幅图像。可见,为了保证视觉定位效果,需要采集大量的现场照片作为训练数据,而且这些照片必须保证拍摄到场景中的每个角落,非常耗费时间和人力。This method needs to collect a large number of photo samples of the use environment as a training data set. For example, some documents record that in order to realize the visual positioning of a 35-meter-wide street corner store, 330 photos need to be collected, and in order to realize the visual positioning of a 140-meter street (positioning only on one side), 1,500 Multiple photos; in order to realize the positioning of a certain factory, the factory needs to be divided into 18 areas, and each area needs to take 200 images. It can be seen that in order to ensure the visual positioning effect, it is necessary to collect a large number of on-site photos as training data, and these photos must be taken to every corner of the scene, which is very time-consuming and labor-intensive.
综上所述,如何解决视觉定位中样本采集困难等问题,是目前本领域技术人员急需解决的技术问题。In summary, how to solve problems such as difficulty in sample collection in visual positioning is a technical problem urgently needed to be solved by those skilled in the art.
发明内容Summary of the invention
The purpose of this application is to provide a visual positioning method, apparatus, device, and readable storage medium that train a neural network model with panoramic photos from a real-scene map, thereby solving the problem of difficult sample collection in visual positioning.
To solve the above technical problem, this application provides the following technical solutions:
A visual positioning method, comprising:
obtaining a wide-angle photo, and randomly segmenting the wide-angle photo to obtain a set of images to be tested;
inputting the set of images to be tested into a positioning model for positioning recognition to obtain a plurality of candidate positions, wherein the positioning model is a neural network model trained with panoramic photos from a real-scene map; and
determining a final position from the plurality of candidate positions.
Preferably, determining the final position from the plurality of candidate positions comprises:
clustering the plurality of candidate positions, and screening the plurality of candidate positions according to the clustering result;
constructing a geometric figure from the candidate positions retained by the screening; and
taking the geometric center of the geometric figure as the final position.
Preferably, the method further comprises:
calculating the standard deviation of the plurality of candidate positions with respect to the final position; and
taking the standard deviation as the positioning error of the final position.
Preferably, the process of training the neural network model comprises:
obtaining a number of panoramic photos from the real-scene map, and determining the geographic location of each panoramic photo;
applying a de-warping transformation to the panoramic photos to obtain groups of planar projection photos with the same aspect ratio;
labeling each group of planar projection photos with a geotag according to its correspondence with the panoramic photo, the geotag comprising a geographic location and a specific orientation;
using the geotagged planar projection photos as training samples; and
training the neural network model with the training samples, and determining the trained neural network model as the positioning model.
Preferably, applying the de-warping transformation to the panoramic photos to obtain groups of planar projection photos with the same aspect ratio comprises:
segmenting each panoramic photo in the de-warping transformation according to different focal length parameters, to obtain groups of planar projection photos with different viewing angles.
Preferably, segmenting each panoramic photo in the de-warping transformation according to different focal length parameters to obtain groups of planar projection photos with different viewing angles comprises:
segmenting each panoramic photo with a number of segments whose coverage of the original image exceeds a specified percentage, to obtain groups of planar projection photos in which adjacent photos share overlapping viewing angles.
Preferably, the process of training the neural network model further comprises:
supplementing the training samples with scene photos obtained from the Internet, or with environment photos collected from the positioning environment.
Preferably, randomly segmenting the wide-angle photo to obtain the set of images to be tested comprises:
randomly segmenting the wide-angle photo, according to a segmentation count, such that the coverage of the original image exceeds a specified percentage, to obtain a set of images to be tested matching the segmentation count.
A visual positioning apparatus, comprising:
a test-image acquisition module, configured to obtain a wide-angle photo and randomly segment the wide-angle photo to obtain a set of images to be tested;
a candidate position acquisition module, configured to input the set of images to be tested into a positioning model for positioning recognition to obtain a plurality of candidate positions, wherein the positioning model is a neural network model trained with panoramic photos from a real-scene map; and
a position output module, configured to determine a final position from the plurality of candidate positions.
A visual positioning device, comprising:
a memory, configured to store a computer program; and
a processor, configured to implement the above visual positioning method when executing the computer program.
A readable storage medium storing a computer program which, when executed by a processor, implements the above visual positioning method.
With the method provided by the embodiments of this application, a wide-angle photo is obtained and randomly segmented into a set of images to be tested; the set of images to be tested is input into a positioning model for positioning recognition to obtain a plurality of candidate positions, the positioning model being a neural network model trained with panoramic photos from a real-scene map; and a final position is determined from the plurality of candidate positions.
A real-scene map is a map in which the actual street scene can be viewed, including 360-degree real-scene imagery. The panoramic photos in a real-scene map depict the real street environment, which overlaps with the application environment of visual positioning. On this basis, in this method, the panoramic photos from the real-scene map are used to train a neural network model, yielding a positioning model for visual positioning. After a wide-angle photo is obtained, it is randomly segmented into a set of images to be tested. Inputting the set of images to be tested into the positioning model for positioning recognition yields a plurality of candidate positions, from which the final position is determined. Thus, in this method, training the neural network model on panoramic photos from a real-scene map yields a positioning model with which visual positioning can be performed, solving the problem of difficult collection of visual positioning training samples.
Correspondingly, the embodiments of this application also provide an apparatus, a device, and a readable storage medium corresponding to the above visual positioning method, which have the above technical effects and are not repeated here.
Description of the Drawings
To explain the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. For those of ordinary skill in the art, other drawings can be derived from these drawings without creative effort.
Fig. 1 is a flowchart of a visual positioning method in an embodiment of this application;
Fig. 2 is a schematic diagram of viewing-angle segmentation in an embodiment of this application;
Fig. 3 is a schematic structural diagram of a visual positioning apparatus in an embodiment of this application;
Fig. 4 is a schematic structural diagram of a visual positioning device in an embodiment of this application;
Fig. 5 is a schematic diagram of a specific structure of a visual positioning device in an embodiment of this application.
Detailed Description
To enable those skilled in the art to better understand the solution of this application, the application is further described in detail below with reference to the drawings and specific embodiments. Based on the embodiments in this application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of this application.
It should be noted that, since the neural network model can be stored in the cloud or on a local device, the visual positioning method provided by the embodiments of this application can be applied directly on a cloud server or on a local device. A device that needs positioning can be positioned from a single wide-angle photo as long as it can take photos and connect to a network.
Please refer to Fig. 1, which is a flowchart of a visual positioning method in an embodiment of this application. The method includes the following steps:
S101. Obtain a wide-angle photo, and randomly segment the wide-angle photo to obtain a set of images to be tested.
A wide-angle photo is a picture taken with a wide-angle lens or in panoramic mode. Simply put, the smaller the focal length, the wider the field of view, and the wider the range of scenery the photo can contain.
Since the visual positioning method provided by this application trains the neural network model with panoramic photos from a real-scene map, the photos used when performing visual positioning with the positioning model should, for better results, also be wide-angle photos. For example, at the location to be determined, the user can use wide-angle mode (or ultra-wide-angle mode) or panoramic mode to take a photo of the surroundings with a viewing angle exceeding 120 degrees (other angles, such as 140 or 180 degrees, are of course also possible).
After the wide-angle photo is obtained, it is randomly segmented to obtain a set of images to be tested consisting of the resulting crops.
In particular, the number of crops into which the wide-angle photo is segmented can be set according to the training quality of the positioning model and the required positioning accuracy. Generally speaking, within the recognizable range (if a crop is too small, it may contain no positioning-relevant features and cannot be recognized effectively), the larger the number of crops, the higher the positioning accuracy; of course, the model then requires more training iterations and a longer training time.
Preferably, to improve positioning accuracy, the wide-angle photo can be randomly segmented, according to the segmentation count, such that the coverage of the original image exceeds a specified percentage, yielding a set of images to be tested matching the segmentation count. Specifically, the wide-angle photo can be randomly segmented into N crops with an aspect ratio of 1:1 (other aspect ratios are possible, as long as the ratio matches that of the training samples used to train the positioning model) and a height of 1/3 to 1/2 of the wide-angle photo's height. The value of N is set according to the training quality and the required positioning accuracy; when the training quality is poorer or higher accuracy is required, a larger N is chosen. N can typically be set to 100 (other values, such as 50 or 80, are possible and are not enumerated here). Typically, the random segmentation result is required to cover more than 95% of the original image (other percentages can of course also be set, and are not enumerated here).
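As an illustration only, the random segmentation described above can be sketched as follows. The helper names, the example image size, and the coarse-grid coverage check are assumptions made for this sketch, not part of the claimed method:

```python
import random

def sample_crops(width, height, n, seed=None):
    """Randomly sample n square crop boxes (left, top, size) whose side
    length is between 1/3 and 1/2 of the photo height, as described above."""
    rng = random.Random(seed)
    boxes = []
    for _ in range(n):
        size = rng.randint(height // 3, height // 2)
        left = rng.randint(0, width - size)
        top = rng.randint(0, height - size)
        boxes.append((left, top, size))
    return boxes

def coverage(boxes, width, height, step=16):
    """Approximate fraction of the original image covered by the union of
    the crop boxes, evaluated on a coarse pixel grid."""
    covered = total = 0
    for y in range(0, height, step):
        for x in range(0, width, step):
            total += 1
            if any(l <= x < l + s and t <= y < t + s for l, t, s in boxes):
                covered += 1
    return covered / total

def sample_until_covered(width, height, n, min_coverage=0.95, max_tries=20):
    """Re-sample until the union of crops covers more than min_coverage of
    the original photo, matching the >95% requirement described above."""
    for _ in range(max_tries):
        boxes = sample_crops(width, height, n)
        if coverage(boxes, width, height) > min_coverage:
            return boxes
    return boxes  # fall back to the last attempt
```

With N = 100 crops of side 360–540 pixels on a 1920x1080 photo, the union of crops normally exceeds the 95% coverage threshold after very few attempts.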
S102. Input the set of images to be tested into the positioning model for positioning recognition to obtain a plurality of candidate positions.
Here, the positioning model is a neural network model trained with panoramic photos from a real-scene map.
To achieve a more precise positioning result, in this embodiment each crop in the set of images to be tested is input into the positioning model separately for positioning recognition, so that one positioning output is obtained per crop. In this embodiment, the positioning result corresponding to each crop is taken as a candidate position.
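The per-crop inference step can be sketched as below; the model interface (a callable mapping a crop to an (x, y) location) is a hypothetical assumption for illustration, since the application does not fix a specific network API:

```python
def predict_candidates(model, crops):
    """Run the positioning model on every crop of the test set and collect
    one candidate position per crop."""
    return [model(crop) for crop in crops]

# Usage with a stand-in model that always answers the same location:
stub_model = lambda crop: (12.5, 34.0)
candidates = predict_candidates(stub_model, [object()] * 3)
```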
It should be noted that, before practical application, the positioning model must be obtained by training. The process of training the neural network model includes:
Step 1: Obtain a number of panoramic photos from the real-scene map, and determine the geographic location of each panoramic photo.
Step 2: Apply a de-warping transformation to the panoramic photos to obtain groups of planar projection photos with the same aspect ratio.
Step 3: Label each group of planar projection photos with a geotag according to its correspondence with the panoramic photo; the geotag includes a geographic location and a specific orientation.
Step 4: Use the geotagged planar projection photos as training samples.
Step 5: Train the neural network model with the training samples, and determine the trained neural network model as the positioning model.
For ease of description, the above five steps are explained together.
Since a panoramic photo has a viewing angle of nearly 360 degrees, in this embodiment the panoramic photo can be de-warped and then split into groups of planar projection photos with the same aspect ratio. Because the panoramic photos in the real-scene map correspond to geographic locations, in this embodiment the geographic location of a group of planar projection photos split from the same panoramic photo corresponds to the geographic location of that panoramic photo. In addition, since a panoramic photo is split by viewing angle, the orientation of each resulting photo is also well defined; in this embodiment, the geographic location and the specific orientation are added together as the geotag. In other words, every planar projection photo has a corresponding geographic location and specific orientation.
The geotagged planar projection photos are used as training samples, with which the neural network model is trained; the trained neural network model is the positioning model. Specifically, the collection of photos with specific locations and orientations can be used as a data pool, from which 80% is randomly drawn as the training set and the remaining 20% as the test set; this ratio can also be adjusted according to the actual training situation. The training set is fed into a neural network model that is either freshly initialized or pre-trained on a large-scale image collection, and the training result is validated with the test set. Commonly used neural network structures include CNNs (Convolutional Neural Networks, feedforward networks comprising alternating convolutional layers and pooling layers) and their derivatives, LSTM (Long Short-Term Memory, a recurrent neural network (RNN) architecture), and hybrid structures. The embodiments of this application do not restrict which neural network is used. After training, a neural network model suited to the venue covered by the real-scene map data source is obtained, namely the positioning model.
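The 80/20 split of the data pool described above can be sketched in a few lines; the function name is an assumption for this illustration:

```python
import random

def split_pool(samples, train_fraction=0.8, seed=0):
    """Randomly split the geotagged data pool into a training set and a
    test set (80/20 by default, as suggested above; the ratio is tunable)."""
    pool = list(samples)
    random.Random(seed).shuffle(pool)
    cut = int(len(pool) * train_fraction)
    return pool[:cut], pool[cut:]
```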
Preferably, in order to accommodate the focal lengths (i.e. viewing angles) of different image capture devices in practical applications, a panoramic photo can be split according to different focal length parameters, so that planar projection photos with different viewing angles are obtained as training samples. Specifically, each panoramic photo can be split in the de-warping transformation according to different focal length parameters, yielding groups of planar projection photos with different viewing angles. That is, the number of splits n is determined by the focal length parameter F: the smaller the focal length parameter, the larger the viewing angle, and the smaller n can be. As shown in Fig. 2, a schematic diagram of viewing-angle segmentation in an embodiment of this application, with the most common focal length parameter F = 0.5 the viewing angle is 90 degrees, and n = 4 splits suffice to cover the full 360 degrees. When planar projection photos with several different viewing angles are needed, the focal length parameter F can also be changed to other values, such as 1.0 and 1.3, to obtain planar projection photos with other viewing angles.
Preferably, in order to improve the accuracy of orientation estimation, a panoramic photo can also be split with a number of segments whose coverage of the original image exceeds a specified percentage, so that, at the same viewing angle, adjacent photos share an overlapping angle. Specifically, each panoramic photo is split with a number of segments whose coverage of the original image exceeds the specified percentage, yielding groups of planar projection photos in which adjacent photos share overlapping viewing angles. In other words, to enrich the shooting angles of the photos, at a fixed focal length the recommended number of splits is larger than that of an equal, non-overlapping partition. For example, taking the vertical axis of the panoramic projection sphere as the rotation axis, a planar projection photo with a 90-degree viewing angle can be split off every time the line-of-sight center (the arrow in Fig. 2) rotates by 45 degrees; adjacent photos then share a 45-degree overlapping viewing angle. The resulting planar projection photos are labeled with orientation data according to the angle of the line-of-sight center. Since F can also take the values 1.0 and 1.3, for which the viewing angles are roughly 60 and 30 degrees respectively, n can correspondingly be set to 12 and 24. More F values can also be added and n increased further to improve the coverage of the training set; a coverage above 95% can usually be guaranteed.
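The relationship between the focal length parameter F, the viewing angle, and the overlapping splits can be sketched as follows. The pinhole model with unit image width used here is an assumption: it reproduces the 90-degree angle stated for F = 0.5 exactly, but only approximates the roughly 60- and 30-degree figures quoted for F = 1.0 and 1.3, which suggests the application may use a slightly different convention:

```python
import math

def fov_degrees(F):
    """Horizontal field of view implied by focal length parameter F under
    a pinhole model with unit image width (assumption for this sketch)."""
    return math.degrees(2 * math.atan(0.5 / F))

def yaw_centers(step_degrees):
    """Line-of-sight center angles obtained by rotating the view around
    the vertical axis of the panorama sphere in fixed steps."""
    return list(range(0, 360, step_degrees))

# With F = 0.5 (90-degree views) and 45-degree rotation steps, n = 8 views
# are produced and adjacent views overlap by 45 degrees.
views = yaw_centers(45)
overlap = fov_degrees(0.5) - 45
```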
Preferably, considering that in practical applications relying on panoramic photos alone for training may yield poor visual positioning performance, for instance because the real-scene map is updated infrequently, the training samples can also be supplemented during training with scene photos obtained from the Internet or with environment photos collected from the positioning environment.
S103. Determine a final position from the plurality of candidate positions.
After the plurality of candidate positions are obtained, the final position can be determined from them and then output for the user to view.
Specifically, one candidate position can be selected at random as the final position; alternatively, several candidate positions can be selected at random and the geometric center of the geometric figure they form taken as the final position. Of course, several highly coincident candidate positions can also be taken as the final position.
Preferably, considering that a few relatively anomalous positions may appear among the candidates, the candidate positions can be clustered and screened to improve the accuracy of the final position: candidates that lie far from the majority of positions are removed, and the final position is then determined from the remaining candidates. Specifically, the implementation process includes:
Step 1: Cluster the plurality of candidate positions, and screen the plurality of candidate positions according to the clustering result.
Step 2: Construct a geometric figure from the candidate positions retained by the screening.
Step 3: Take the geometric center of the geometric figure as the final position.
Specifically, a clustering algorithm such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise) can be used to classify the candidate positions, grouping neighboring position data into the same cluster. The clustering parameters can be set to an ε-neighborhood of 1 and a minimum point count minPts = 5. The cluster containing the most position results is regarded as reliable, and the geometric center of the geometric figure formed by all candidate positions in that cluster is computed as the final positioning result.
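The cluster-then-centroid step can be sketched as below. This is a minimal, simplified DBSCAN written for illustration (a production system would typically use an existing library implementation); the noise fallback is also an assumption of this sketch:

```python
import math

def dbscan(points, eps=1.0, min_pts=5):
    """Minimal DBSCAN over 2-D points: returns one cluster label per point
    (-1 marks noise)."""
    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:     # noise point reachable from a core point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbors(j)
            if len(more) >= min_pts:  # only core points expand the cluster
                queue.extend(more)
    return labels

def final_position(candidates, eps=1.0, min_pts=5):
    """Keep the largest cluster of candidate positions and return its
    geometric center as the final position."""
    labels = dbscan(candidates, eps, min_pts)
    best = max((l for l in labels if l >= 0),
               key=lambda l: labels.count(l), default=None)
    kept = candidates if best is None else \
        [p for p, l in zip(candidates, labels) if l == best]
    xs, ys = zip(*kept)
    return (sum(xs) / len(kept), sum(ys) / len(kept))
```

Six tightly grouped candidates plus one distant outlier would, with ε = 1 and minPts = 5, form a single cluster whose centroid becomes the final position, while the outlier is discarded as noise.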
Preferably, in order to better present the positioning result, a positioning error can also be determined. Specifically, the standard deviation of the plurality of candidate positions with respect to the final position is calculated and taken as the positioning error of the final position. That is, the deviation of each candidate position from the final position is squared, the squares are accumulated and averaged, and the square root of the mean is taken as the final positioning error.
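This error computation amounts to the root of the mean squared distance of the candidates from the final position, for example (the function name is illustrative):

```python
import math

def positioning_error(candidates, final):
    """Standard deviation of the candidate positions around the final
    position: the root of the mean squared distance, reported as the error."""
    mean_sq = sum(math.dist(p, final) ** 2 for p in candidates) / len(candidates)
    return math.sqrt(mean_sq)
```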
With the method provided by the embodiments of this application, a wide-angle photo is obtained and randomly segmented into a set of images to be tested; the set of images to be tested is input into a positioning model for positioning recognition to obtain a plurality of candidate positions, the positioning model being a neural network model trained with panoramic photos from a real-scene map; and a final position is determined from the plurality of candidate positions.
A real-scene map is a map in which the actual street scene can be viewed, including 360-degree real-scene imagery. The panoramic photos in a real-scene map depict the real street environment, which overlaps with the application environment of visual positioning. On this basis, in this method, the panoramic photos from the real-scene map are used to train a neural network model, yielding a positioning model for visual positioning. After a wide-angle photo is obtained, it is randomly segmented into a set of images to be tested. Inputting the set of images to be tested into the positioning model for positioning recognition yields a plurality of candidate positions, from which the final position is determined. Thus, in this method, training the neural network model on panoramic photos from a real-scene map yields a positioning model with which visual positioning can be performed, solving the problem of difficult collection of visual positioning training samples.
It should be noted that, on the basis of the above embodiments, the embodiments of this application also provide corresponding improvement schemes. Steps in the preferred/improved embodiments that are identical or correspond to steps in the above embodiments can be cross-referenced, as can the corresponding beneficial effects; they are not repeated one by one in the preferred/improved embodiments herein.
Corresponding to the above method embodiments, the embodiments of this application further provide a visual positioning apparatus. The visual positioning apparatus described below and the visual positioning method described above can be referred to in correspondence with each other.
As shown in Fig. 3, the visual positioning apparatus includes:
a test-image acquisition module 101, configured to obtain a wide-angle photo and randomly segment the wide-angle photo to obtain a set of images to be tested;
a candidate position acquisition module 102, configured to input the set of images to be tested into a positioning model for positioning recognition to obtain a plurality of candidate positions, wherein the positioning model is a neural network model trained with panoramic photos from a real-scene map; and
a position output module 103, configured to determine a final position from the plurality of candidate positions.
With the apparatus provided by the embodiments of this application, a wide-angle photo is obtained and randomly segmented into a set of images to be tested; the set of images to be tested is input into a positioning model for positioning recognition to obtain a plurality of candidate positions, the positioning model being a neural network model trained with panoramic photos from a real-scene map; and a final position is determined from the plurality of candidate positions.
A real-scene map is a map in which the actual street scene can be viewed, including 360-degree real-scene imagery. The panoramic photos in a real-scene map depict the real street environment, which overlaps with the application environment of visual positioning. On this basis, in this apparatus, the panoramic photos from the real-scene map are used to train a neural network model, yielding a positioning model for visual positioning. After a wide-angle photo is obtained, it is randomly segmented into a set of images to be tested. Inputting the set of images to be tested into the positioning model for positioning recognition yields a plurality of candidate positions, from which the final position is determined. Thus, in this apparatus, training the neural network model on panoramic photos from a real-scene map yields a positioning model with which visual positioning can be performed, solving the problem of difficult collection of visual positioning training samples.
In a specific embodiment of the present application, the positioning output module 103 specifically includes:
a positioning screening unit, configured to cluster the multiple candidate positions and screen the multiple candidate positions according to the clustering result;
a geometric figure construction unit, configured to construct a geometric figure from the screened candidate positions;
a final position determination unit, configured to take the geometric center of the geometric figure as the final position.
In a specific embodiment of the present application, the positioning output module 103 further includes:
a positioning error determination unit, configured to compute, using the final position, the standard deviation of the multiple candidate positions, and to take the standard deviation as the positioning error of the final position.
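A minimal sketch of these units follows, assuming planar (x, y) candidates. The grid-snapping clustering is an illustrative stand-in, since no particular clustering algorithm is prescribed; the cell size is an assumed parameter.

```python
# Cluster candidates, take the geometric center of the dominant cluster
# as the final position, and report the standard deviation of all
# candidates about that center as the positioning error.
import math
from collections import defaultdict

def final_position(candidates, cell=10.0):
    # Clustering: snap each candidate to a coarse grid cell and keep
    # only the most populated cell (screens out outlier candidates).
    cells = defaultdict(list)
    for x, y in candidates:
        cells[(round(x / cell), round(y / cell))].append((x, y))
    kept = max(cells.values(), key=len)

    # Geometric center of the kept candidates is the final position.
    cx = sum(x for x, _ in kept) / len(kept)
    cy = sum(y for _, y in kept) / len(kept)

    # Standard deviation of all candidates about the final position
    # serves as the positioning error.
    err = math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2
                        for x, y in candidates) / len(candidates))
    return (cx, cy), err
```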
In a specific embodiment of the present application, the model training module includes:
a panoramic photo acquisition unit, configured to obtain several panoramic photos from the real-scene map and determine the geographic location of each panoramic photo;
an anti-warping transformation unit, configured to apply an anti-warping transformation to the panoramic photos to obtain several groups of planar projection photos with the same aspect ratio;
a geotagging unit, configured to label each group of planar projection photos with a geotag according to its correspondence with the panoramic photos, the geotag including a geographic location and a specific orientation;
a training sample determination unit, configured to use the geotagged planar projection photos as training samples;
a model training unit, configured to train the neural network model with the training samples and determine the trained neural network model as the positioning model.
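The anti-warping step can be illustrated with the standard mapping from a planar (perspective) projection back into an equirectangular panorama. This is a sketch under assumptions: the focal length `f` and heading `yaw` are the hypothetical projection parameters, since the patent does not fix a specific projection model.

```python
# Map pixel (u, v) of a perspective view (origin at the image center,
# focal length f in pixels, camera heading yaw in radians) to (x, y)
# pixel coordinates in an equirectangular panorama of size pano_w x pano_h.
import math

def plane_to_panorama(u, v, f, yaw, pano_w, pano_h):
    # Direction of the ray through the pixel of a pinhole camera.
    lon = yaw + math.atan2(u, f)               # longitude of the ray
    lat = math.atan2(-v, math.hypot(u, f))     # latitude of the ray
    # Equirectangular panoramas map longitude/latitude linearly to pixels.
    x = (lon / (2 * math.pi) + 0.5) * pano_w
    y = (0.5 - lat / math.pi) * pano_h
    return x, y
```

Sampling the panorama at these coordinates for every pixel of the planar view produces one undistorted projection photo; varying `f` and `yaw` produces the groups of views with different viewing angles described below.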
In a specific embodiment of the present application, the anti-warping transformation unit is specifically configured to segment each panoramic photo according to different focal length parameters during the anti-warping transformation, obtaining several groups of planar projection photos with different viewing angles.
In a specific embodiment of the present application, the anti-warping transformation unit is specifically configured to segment each panoramic photo with a segment count whose coverage of the original image exceeds a specified percentage, obtaining several groups of planar projection photos in which adjacent photos have overlapping viewing angles.
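One way to read the coverage condition: with `n` views of horizontal field of view `fov` spread around the 360-degree panorama, total angular coverage exceeds 100 percent (so adjacent views overlap) as soon as `n * fov > 360`. The helper below is a minimal sketch of choosing such a segment count; the coverage percentage is the assumed "specified percentage" parameter.

```python
# Smallest number of views of the given field of view whose total angular
# coverage reaches coverage_pct percent of the full 360-degree panorama.
import math

def min_segments(fov_deg, coverage_pct):
    return math.ceil(360.0 * coverage_pct / 100.0 / fov_deg)
```

For example, 90-degree views need at least 6 segments to reach 150 percent coverage, guaranteeing overlap between neighbors.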
In a specific embodiment of the present application, the model training module further includes:
a sample supplement unit, configured to supplement the training samples with scene photos obtained from the Internet or with environment photos collected in the positioning environment.
In a specific embodiment of the present application, the atlas acquisition module 101 is specifically configured to randomly segment the wide-angle photo, according to a segment count, such that the coverage of the original image exceeds a specified percentage, obtaining an atlas to be tested that matches the segment count.
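A sketch of such coverage-constrained random segmentation: keep drawing random crops until their union covers at least the required fraction of the wide-angle photo. The crop size and the grid granularity of the coverage estimate are illustrative choices, not values taken from the patent.

```python
# Randomly segment a wide-angle photo into n_crops crops whose union
# covers at least min_coverage of the original image area (coverage is
# estimated on a coarse grid of sample points).
import random

def random_split(width, height, n_crops, crop_w, crop_h, min_coverage=0.8):
    while True:
        crops = [(random.randint(0, width - crop_w),
                  random.randint(0, height - crop_h))
                 for _ in range(n_crops)]
        step = 16  # grid spacing for the coverage estimate
        covered = sum(
            any(x <= px < x + crop_w and y <= py < y + crop_h
                for x, y in crops)
            for px in range(0, width, step)
            for py in range(0, height, step))
        total = len(range(0, width, step)) * len(range(0, height, step))
        if covered / total >= min_coverage:
            return crops  # crop origins; each crop is crop_w x crop_h
```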
Corresponding to the above method embodiments, an embodiment of the present application further provides a visual positioning device; the visual positioning device described below and the visual positioning method described above may be cross-referenced.
As shown in FIG. 4, the visual positioning device includes:
a memory 410, configured to store a computer program;
a processor 420, configured to implement, when executing the computer program, the steps of the visual positioning method provided by the above method embodiments.
Specifically, referring to FIG. 5, which is a schematic diagram of a specific structure of the visual positioning device provided by this embodiment: the visual positioning device may vary considerably with configuration or performance, and may include one or more central processing units (CPUs) 420 (for example, one or more processors) and a memory 410 storing one or more computer application programs 413 or data 412. The memory 410 may be transient or persistent storage. The computer application program may comprise one or more modules (not shown in the figure), each of which may include a series of instruction operations on the data processing device. Further, the central processing unit 420 may be configured to communicate with the memory 410 and to execute, on the visual positioning device 400, the series of instruction operations stored in the memory 410.
The visual positioning device 400 may further include one or more power supplies 430, one or more wired or wireless network interfaces 440, one or more input/output interfaces 450, and/or one or more operating systems 411.
The steps of the visual positioning method described above may be implemented by this structure of the visual positioning device.
Corresponding to the above method embodiments, an embodiment of the present application further provides a readable storage medium; the readable storage medium described below and the visual positioning method described above may be cross-referenced.
A readable storage medium stores a computer program which, when executed by a processor, implements the steps of the visual positioning method provided by the above method embodiments.
The readable storage medium may specifically be any readable storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of their function. Whether these functions are executed in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be regarded as going beyond the scope of the present application.

Claims (11)

  1. A visual positioning method, comprising:
    obtaining a wide-angle photo, and randomly segmenting the wide-angle photo to obtain an atlas to be tested;
    inputting the atlas to be tested into a positioning model for positioning recognition to obtain multiple candidate positions, wherein the positioning model is a neural network model trained with panoramic photos from a real-scene map; and
    determining a final position from the multiple candidate positions.
  2. The visual positioning method according to claim 1, wherein determining the final position from the multiple candidate positions comprises:
    clustering the multiple candidate positions, and screening the multiple candidate positions according to the clustering result;
    constructing a geometric figure from the screened candidate positions; and
    taking the geometric center of the geometric figure as the final position.
  3. The visual positioning method according to claim 2, further comprising:
    computing, using the final position, the standard deviation of the multiple candidate positions; and
    taking the standard deviation as the positioning error of the final position.
  4. The visual positioning method according to claim 1, wherein training the neural network model comprises:
    obtaining several panoramic photos from the real-scene map, and determining the geographic location of each panoramic photo;
    applying an anti-warping transformation to the panoramic photos to obtain several groups of planar projection photos with the same aspect ratio;
    labeling each group of planar projection photos with a geotag according to its correspondence with the panoramic photos, the geotag including a geographic location and a specific orientation;
    using the geotagged planar projection photos as training samples; and
    training the neural network model with the training samples, and determining the trained neural network model as the positioning model.
  5. The visual positioning method according to claim 4, wherein applying the anti-warping transformation to the panoramic photos to obtain several groups of planar projection photos with the same aspect ratio comprises:
    segmenting each panoramic photo according to different focal length parameters during the anti-warping transformation to obtain several groups of planar projection photos with different viewing angles.
  6. The visual positioning method according to claim 5, wherein segmenting each panoramic photo according to different focal length parameters during the anti-warping transformation to obtain several groups of planar projection photos with different viewing angles comprises:
    segmenting each panoramic photo with a segment count whose coverage of the original image exceeds a specified percentage, to obtain several groups of planar projection photos in which adjacent photos have overlapping viewing angles.
  7. The visual positioning method according to claim 4, wherein training the neural network model further comprises:
    supplementing the training samples with scene photos obtained from the Internet or with environment photos collected in the positioning environment.
  8. The visual positioning method according to claim 1, wherein randomly segmenting the wide-angle photo to obtain the atlas to be tested comprises:
    randomly segmenting the wide-angle photo, according to a segment count, such that the coverage of the original image exceeds a specified percentage, to obtain an atlas to be tested matching the segment count.
  9. A visual positioning apparatus, comprising:
    an atlas acquisition module, configured to obtain a wide-angle photo and randomly segment the wide-angle photo to obtain an atlas to be tested;
    a candidate position acquisition module, configured to input the atlas to be tested into a positioning model for positioning recognition to obtain multiple candidate positions, wherein the positioning model is a neural network model trained with panoramic photos from a real-scene map; and
    a positioning output module, configured to determine a final position from the multiple candidate positions.
  10. A visual positioning device, comprising:
    a memory, configured to store a computer program; and
    a processor, configured to implement, when executing the computer program, the visual positioning method according to any one of claims 1 to 8.
  11. A readable storage medium, wherein a computer program is stored on the readable storage medium, and the computer program, when executed by a processor, implements the visual positioning method according to any one of claims 1 to 8.
PCT/CN2020/092284 2020-05-26 2020-05-26 Visual positioning method and apparatus, device and readable storage medium WO2021237443A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2022566049A JP7446643B2 (en) 2020-05-26 2020-05-26 Visual positioning methods, devices, equipment and readable storage media
CN202080001067.0A CN111758118B (en) 2020-05-26 2020-05-26 Visual positioning method, device, equipment and readable storage medium
PCT/CN2020/092284 WO2021237443A1 (en) 2020-05-26 2020-05-26 Visual positioning method and apparatus, device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/092284 WO2021237443A1 (en) 2020-05-26 2020-05-26 Visual positioning method and apparatus, device and readable storage medium

Publications (1)

Publication Number Publication Date
WO2021237443A1 true WO2021237443A1 (en) 2021-12-02

Family

ID=72713357

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092284 WO2021237443A1 (en) 2020-05-26 2020-05-26 Visual positioning method and apparatus, device and readable storage medium

Country Status (3)

Country Link
JP (1) JP7446643B2 (en)
CN (1) CN111758118B (en)
WO (1) WO2021237443A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117289626A (en) * 2023-11-27 2023-12-26 杭州维讯机器人科技有限公司 Virtual simulation method and system for industrialization

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
CN113724284A (en) * 2021-09-03 2021-11-30 四川智胜慧旅科技有限公司 Position locking device, mountain type scenic spot search and rescue system and search and rescue method

Citations (5)

Publication number Priority date Publication date Assignee Title
CN109308678A (en) * 2017-07-28 2019-02-05 株式会社理光 The method, device and equipment relocated using panoramic picture
CN109829406A (en) * 2019-01-22 2019-05-31 上海城诗信息科技有限公司 A kind of interior space recognition methods
CN110298370A (en) * 2018-03-21 2019-10-01 北京猎户星空科技有限公司 Network model training method, device and object pose determine method, apparatus
CN110298320A (en) * 2019-07-01 2019-10-01 北京百度网讯科技有限公司 A kind of vision positioning method, device and storage medium
CN110636274A (en) * 2019-11-11 2019-12-31 成都极米科技股份有限公司 Ultrashort-focus picture screen alignment method and device, ultrashort-focus projector and storage medium

Family Cites Families (11)

Publication number Priority date Publication date Assignee Title
JP3650578B2 (en) * 2000-09-28 2005-05-18 株式会社立山アールアンドディ Panoramic image navigation system using neural network to correct image distortion
JP4264380B2 (en) * 2004-04-28 2009-05-13 三菱重工業株式会社 Self-position identification method and apparatus
CN202818503U (en) * 2012-09-24 2013-03-20 天津市亚安科技股份有限公司 Multidirectional monitoring area early warning positioning automatic tracking and monitoring device
CN104200188B (en) * 2014-08-25 2017-02-15 北京慧眼智行科技有限公司 Method and system for rapidly positioning position detection patterns of QR code
CN108009588A (en) 2017-12-01 2018-05-08 深圳市智能现实科技有限公司 Localization method and device, mobile terminal
JP6676082B2 (en) 2018-01-18 2020-04-08 光禾感知科技股▲ふん▼有限公司 Indoor positioning method and system, and device for creating the indoor map
US11195010B2 (en) * 2018-05-23 2021-12-07 Smoked Sp. Z O. O. Smoke detection system and method
KR102227583B1 (en) * 2018-08-03 2021-03-15 한국과학기술원 Method and apparatus for camera calibration based on deep learning
CN109285178A (en) * 2018-10-25 2019-01-29 北京达佳互联信息技术有限公司 Image partition method, device and storage medium
CN110136136B (en) * 2019-05-27 2022-02-08 北京达佳互联信息技术有限公司 Scene segmentation method and device, computer equipment and storage medium
CN110503037A (en) * 2019-08-22 2019-11-26 三星电子(中国)研发中心 A kind of method and system of the positioning object in region


Cited By (2)

Publication number Priority date Publication date Assignee Title
CN117289626A (en) * 2023-11-27 2023-12-26 杭州维讯机器人科技有限公司 Virtual simulation method and system for industrialization
CN117289626B (en) * 2023-11-27 2024-02-02 杭州维讯机器人科技有限公司 Virtual simulation method and system for industrialization

Also Published As

Publication number Publication date
JP7446643B2 (en) 2024-03-11
CN111758118A (en) 2020-10-09
CN111758118B (en) 2024-04-16
JP2023523364A (en) 2023-06-02

Similar Documents

Publication Publication Date Title
US10922844B2 (en) Image positioning method and system thereof
CN111046125A (en) Visual positioning method, system and computer readable storage medium
US20150363971A1 (en) Systems and Methods for Generating Three-Dimensional Models Using Sensed Position Data
WO2021237443A1 (en) Visual positioning method and apparatus, device and readable storage medium
US9551579B1 (en) Automatic connection of images using visual features
WO2021027692A1 (en) Visual feature library construction method and apparatus, visual positioning method and apparatus, and storage medium
CN111626295B (en) Training method and device for license plate detection model
US20170039450A1 (en) Identifying Entities to be Investigated Using Storefront Recognition
KR20100124748A (en) Platform for the production of seamless orthographic imagery
CN113808267A (en) GIS map-based three-dimensional community display method and system
CN115049878B (en) Target detection optimization method, device, equipment and medium based on artificial intelligence
CN112637519A (en) Panoramic stitching algorithm for multi-path 4K quasi-real-time stitched video
CN112613107A (en) Method and device for determining construction progress of tower project, storage medium and equipment
WO2022247126A1 (en) Visual localization method and apparatus, and device, medium and program
WO2024088071A1 (en) Three-dimensional scene reconstruction method and apparatus, device and storage medium
Abrams et al. Webcams in context: Web interfaces to create live 3D environments
CN113379748A (en) Point cloud panorama segmentation method and device
US9852542B1 (en) Methods and apparatus related to georeferenced pose of 3D models
CN116433822A (en) Neural radiation field training method, device, equipment and medium
CN111107307A (en) Video fusion method, system, terminal and medium based on homography transformation
Wang et al. Identifying people wearing masks in a 3D-scene
Porzi et al. An automatic image-to-DEM alignment approach for annotating mountains pictures on a smartphone
Li et al. Fisheye image rectification using spherical and digital distortion models
CN113920144B (en) Real-scene photo ground vision field analysis method and system
TWI755250B (en) Method for determining plant growth curve, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20937750

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022566049

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20937750

Country of ref document: EP

Kind code of ref document: A1