CN111758118B - Visual positioning method, device, equipment and readable storage medium - Google Patents

Visual positioning method, device, equipment and readable storage medium

Info

Publication number
CN111758118B
CN111758118B (granted publication of application CN202080001067.0A)
Authority
CN
China
Prior art keywords: positioning, photos, panoramic, photo, model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202080001067.0A
Other languages
Chinese (zh)
Other versions
CN111758118A (en)
Inventor
陈尊裕
吴珏其
胡斯洋
陈欣
吴沛谦
张仲文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fengtuzhi Technology Holding Co., Ltd.
Original Assignee
Fengtuzhi Technology Holding Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fengtuzhi Technology Holding Co., Ltd.
Publication of CN111758118A
Application granted
Publication of CN111758118B
Legal status: Active

Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 7/00: Image analysis
                    • G06T 7/97: Determining parameters from multiple pictures
                • G06T 2207/00: Indexing scheme for image analysis or image enhancement
                    • G06T 2207/20: Special algorithmic details
                        • G06T 2207/20081: Training; Learning
                        • G06T 2207/20084: Artificial neural networks [ANN]
            • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00: Computing arrangements based on biological models
                    • G06N 3/02: Neural networks
                        • G06N 3/04: Architecture, e.g. interconnection topology
                            • G06N 3/044: Recurrent networks, e.g. Hopfield networks
                            • G06N 3/045: Combinations of networks
                        • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A visual positioning method, device, equipment, and readable storage medium, the method comprising the following steps: obtaining a wide-angle photograph and randomly segmenting it to obtain an image set to be tested; inputting the image set to be tested into a positioning model for position recognition to obtain a plurality of candidate positions, the positioning model being a neural network model trained with panoramic photographs from a live-action map; and determining a final position from the plurality of candidate positions. In the present application, the neural network model can be trained on the panoramic photographs in the live-action map to obtain a positioning model, and visual positioning can be completed with that model, which solves the problem that training samples for visual positioning are difficult to collect.

Description

Visual positioning method, device, equipment and readable storage medium
Technical Field
The present application relates to the field of positioning technologies, and in particular to a visual positioning method, device, equipment, and readable storage medium.
Background
The principle of machine-learning-based visual positioning is as follows: a neural network model is trained on a large number of real-scene photographs labeled with positions, yielding a model whose input is a photograph (an RGB value matrix) and whose output is a specific position. Once the trained neural network model is available, a user only needs to photograph the environment to obtain the specific shooting position.
This approach requires taking a large number of photographs of the usage environment as the training data set. For example, according to some documents, 330 photographs had to be taken to achieve visual positioning of a 35-meter-wide street-corner store, and 1500 or more photographs to achieve visual positioning of a 140-meter street (positioning one side only); to achieve positioning within a certain factory, the factory had to be divided into 18 areas with 200 images taken in each. Evidently, to guarantee the visual positioning effect, a large number of real-scene photographs must be acquired as training data, and they must be taken at every corner of the scene, which is very time-consuming and labor-intensive.
In summary, reducing the difficulty of sample collection for visual positioning is a technical problem that those skilled in the art need to solve.
Disclosure of Invention
The purpose of the present application is to provide a visual positioning method, device, equipment, and readable storage medium that solve the difficulty of sample collection in visual positioning by training the neural network model with panoramic photographs from a live-action map.
To solve the above technical problem, the present application provides the following technical solutions:
A visual positioning method, comprising:
obtaining a wide-angle photograph, and randomly segmenting the wide-angle photograph to obtain an image set to be tested;
inputting the image set to be tested into a positioning model for position recognition to obtain a plurality of candidate positions, wherein the positioning model is a neural network model trained with panoramic photographs from a live-action map;
and determining a final position from the plurality of candidate positions.
Preferably, determining a final position from the plurality of candidate positions includes:
clustering the candidate positions, and screening the candidate positions with the clustering result;
constructing a geometric figure from the screened candidate positions;
and taking the geometric center of the geometric figure as the final position.
Preferably, the method further comprises:
calculating the standard deviation of the candidate positions relative to the final position;
and taking the standard deviation as the positioning error of the final position.
Preferably, the process of training the neural network model includes:
acquiring a plurality of panoramic photographs from the live-action map, and determining the geographic position of each panoramic photograph;
performing an anti-distortion transformation on the panoramic photographs to obtain a plurality of groups of planar projection photographs with the same aspect ratio;
labeling each group of planar projection photographs with a geographic label according to its correspondence with the panoramic photograph; the geographic label comprises a geographic position and a specific orientation;
taking the planar projection photographs labeled with geographic labels as training samples;
and training the neural network model with the training samples, and determining the trained neural network model as the positioning model.
Preferably, performing the anti-distortion transformation on the panoramic photographs to obtain a plurality of groups of planar projection photographs with the same aspect ratio includes:
segmenting each panoramic photograph according to different focal-length parameters during the anti-distortion transformation to obtain a plurality of groups of planar projection photographs with different viewing angles.
Preferably, segmenting each panoramic photograph according to different focal-length parameters during the anti-distortion transformation to obtain a plurality of groups of planar projection photographs with different viewing angles includes:
segmenting each panoramic photograph with a number of segments chosen so that coverage of the corresponding original image exceeds a specified percentage, to obtain a plurality of groups of planar projection photographs in which adjacent pictures have overlapping viewing angles.
Preferably, the process of training the neural network model further comprises:
supplementing the training samples with scene photographs obtained from the Internet or environment photographs collected in the positioning environment.
Preferably, randomly segmenting the wide-angle photograph to obtain the image set to be tested includes:
randomly segmenting the wide-angle photograph into a given number of segments such that coverage of the original image exceeds a specified percentage, to obtain an image set to be tested that matches the number of segments.
A visual positioning device, comprising:
a test-image-set acquisition module for obtaining a wide-angle photograph and randomly segmenting it to obtain an image set to be tested;
a candidate-position acquisition module for inputting the image set to be tested into a positioning model for position recognition to obtain a plurality of candidate positions, wherein the positioning model is a neural network model trained with panoramic photographs from a live-action map;
and a position output module for determining a final position from the plurality of candidate positions.
A visual positioning apparatus, comprising:
a memory for storing a computer program;
and a processor for implementing the above visual positioning method when executing the computer program.
A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above visual positioning method.
By applying the method provided by the embodiments of the present application, a wide-angle photograph is obtained and randomly segmented to obtain an image set to be tested; the image set to be tested is input into a positioning model for position recognition to obtain a plurality of candidate positions, the positioning model being a neural network model trained with panoramic photographs from the live-action map; and a final position is determined from the plurality of candidate positions.
A live-action map is a map in which real street scenes can be viewed, containing 360-degree real scenes. The panoramic photographs in the live-action map are real street-view images, and they overlap with the environments in which visual positioning is applied. On this basis, the method trains the neural network model with the panoramic photographs from the live-action map to obtain a positioning model for visual positioning. After a wide-angle photograph is obtained, it is randomly segmented to obtain an image set to be tested; inputting this set into the positioning model for position recognition yields a plurality of candidate positions, from which a final position can be determined. Thus, in this method, the positioning model is obtained by training the neural network model on the panoramic photographs in the live-action map, and visual positioning is completed with that model, which solves the problem that training samples for visual positioning are difficult to collect.
Correspondingly, the embodiments of the present application also provide a device, an apparatus, and a readable storage medium corresponding to the above visual positioning method, with the same technical effects, which are not repeated here.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. A person skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a flowchart of an implementation of a visual positioning method according to an embodiment of the present application;
FIG. 2 is a view angle segmentation schematic diagram according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a visual positioning device according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a visual positioning apparatus according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a specific structure of a visual positioning apparatus according to an embodiment of the present application.
Detailed Description
To give those skilled in the art a better understanding of the solutions of the present application, the application is described in further detail below with reference to the drawings and specific embodiments. All other embodiments obtained by a person of ordinary skill in the art from the present disclosure without inventive effort fall within the scope of protection of the present application.
It should be noted that, because the neural network model can be stored in the cloud or on a local device, the visual positioning method provided by the embodiments of the present application can run directly on a cloud server or on the local device. Any device with photographing and networking capability can thus be positioned from a single wide-angle photograph.
Referring to fig. 1, fig. 1 is a flowchart of a visual positioning method according to an embodiment of the application, and the method includes the following steps:
s101, obtaining a wide-angle photo, and randomly dividing the wide-angle photo to obtain a to-be-detected atlas.
And wide angle, namely using a wide angle lens or a picture shot in a panoramic mode. In short, the smaller the focal length, the wider the field of view and the wider the range of scenes that can be accommodated within the photograph.
Because, in the visual positioning method provided by the invention, the panoramic photo in the live-action map is adopted to train the neural network model. Thus, in order to perform better visual positioning, the required photograph is also a wide-angle photograph in performing visual positioning using a positioning model. For example, a user may take a wide-angle photograph of the surrounding environment with a view angle exceeding 120 degrees (although other degrees, such as 140 degrees, 180 degrees, etc.) using a wide-angle mode (or super-wide angle mode) or a panoramic mode at a location where positioning is desired.
After the wide-angle photo is obtained, the wide-angle photo is randomly segmented, and a to-be-detected atlas formed by a plurality of segmented photos is obtained.
Particularly, how many photos are segmented from the wide-angle photo can be set according to the training effect of the world positioning model and the actual positioning accuracy requirement. Generally, in the identifiable range (the size of the photo is too small, there is a problem that no relevant positioning features exist and effective recognition cannot be performed), the larger the segmentation number is, the higher the positioning accuracy is, and of course, the more training iterations of the model are, the longer the training time is.
Preferably, in order to improve positioning accuracy, when the wide-angle photo is segmented, random segmentation with original image coverage larger than a specified percentage can be performed on the wide-angle photo according to the segmentation number, so as to obtain a to-be-detected image set matched with the segmentation number. Specifically, the wide-angle photograph can be randomly segmented into N images with an aspect ratio of 1:1 (it should be noted that the aspect ratio can be other ratios, the aspect ratio is the same as the aspect ratio of a training sample used for training the positioning model), and the height is 1/3-1/2 of the height of the wide-angle photograph, which is used as the to-be-measured atlas. The number of N is set according to the training effect and the positioning accuracy, when the training effect is slightly worse and the positioning accuracy is required to be high, a higher N value is selected, and usually the number of N can be set to 100 (of course, other values, such as 50, 80, etc., are also selected, and are not enumerated here). Typically, the random segmentation results require >95% coverage of the original (i.e., the wide-angle photograph) (although other percentages may be set and are not enumerated here).
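The random segmentation described above can be sketched as follows. This is a minimal illustration assuming a NumPy image array; the function name random_segment, the resample-until-covered loop, and the defaults (N = 100, 95% coverage) are illustrative choices consistent with the text, not a prescribed implementation.

```python
import numpy as np

def random_segment(photo, n_crops=100, min_coverage=0.95, seed=None):
    """Randomly crop 1:1 sub-images whose side is 1/3-1/2 of the photo height,
    resampling until the crops jointly cover more than 95% of the original."""
    rng = np.random.default_rng(seed)
    h, w = photo.shape[:2]
    while True:
        covered = np.zeros((h, w), dtype=bool)
        crops = []
        for _ in range(n_crops):
            side = int(rng.uniform(h / 3, h / 2))        # 1:1 aspect ratio
            top = int(rng.integers(0, h - side + 1))
            left = int(rng.integers(0, w - side + 1))
            crops.append(photo[top:top + side, left:left + side])
            covered[top:top + side, left:left + side] = True
        if covered.mean() > min_coverage:                # coverage of the original
            return crops                                 # the image set to be tested
```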
S102, inputting the image set to be tested into the positioning model for position recognition to obtain a plurality of candidate positions.
The positioning model is a neural network model trained with panoramic photographs from the live-action map.
To obtain a more accurate positioning result, in this embodiment each segmented sub-image in the image set to be tested is input into the positioning model separately, and a positioning output is obtained for each sub-image. In this embodiment, the positioning result corresponding to each segmented sub-image is taken as one candidate position.
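As a sketch of this per-sub-image recognition, assuming a PyTorch model whose forward pass returns one position estimate per image; model and preprocess are placeholders for the trained positioning model and its input pipeline, not names from the application:

```python
import torch

def predict_candidates(model, crops, preprocess):
    """One candidate position per segmented sub-image."""
    model.eval()
    with torch.no_grad():
        batch = torch.stack([preprocess(c) for c in crops])  # (N, C, H, W)
        return model(batch).cpu().numpy()                    # (N, 2) candidate positions
```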
It should be noted that the positioning model needs to be obtained through training before practical use. The process of training the neural network model comprises:
step one, acquiring a plurality of panoramic photographs from the live-action map, and determining the geographic position of each panoramic photograph;
step two, performing an anti-distortion transformation on the panoramic photographs to obtain a plurality of groups of planar projection photographs with the same aspect ratio;
step three, labeling each group of planar projection photographs with a geographic label according to its correspondence with the panoramic photograph; the geographic label comprises a geographic position and a specific orientation;
step four, taking the planar projection photographs labeled with geographic labels as training samples;
step five, training the neural network model with the training samples, and determining the trained neural network model as the positioning model.
For ease of description, the above five steps are described together.
Because the viewing angle of a panoramic photograph is nearly 360 degrees, in this embodiment the panoramic photograph can be subjected to an anti-distortion transformation to obtain several groups of planar projection photographs with the same aspect ratio. Because each panoramic photograph in the live-action map corresponds to a geographic position, the geographic position of a group of planar projection photographs derived from the same panoramic photograph is, in this embodiment, that of the panoramic photograph. In addition, because the panoramic photograph is segmented by viewing angle, the orientation of each segmented photograph is also known; in this embodiment, the geographic position and the specific orientation together form the geographic label. That is, each planar projection photograph has a corresponding geographic position and specific orientation.
The planar projection photographs with geographic labels are taken as training samples, and the neural network model trained on these samples is the positioning model. Specifically, the photo set with specific positions and specific orientations can be used as a data pool: 80% of the data pool is randomly extracted as the training set, and the remaining 20% serves as the test set; the ratio can be adjusted according to the actual training situation. The training set is input into an initialized neural network model, or into one pre-trained on a large-scale picture set, for training, and the training result is verified with the test set. Common choices of network structure include a CNN (convolutional neural network: a feedforward network with alternating convolutional and pooling layers) and its derivative structures, an LSTM (long short-term memory network, a kind of recurrent neural network, RNN), and hybrid structures; the embodiments of the present application do not limit the specific type of neural network used. After training is completed, a neural network model suited to the live-action-map data source, i.e., the positioning model, is obtained.
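The training procedure above can be sketched in PyTorch as follows. The 80/20 split follows the text; the pretrained ResNet-50 backbone, the two-value regression head, the MSE loss, and all hyperparameters are assumptions, since the application leaves the network type open (CNN, LSTM, or hybrid structures).

```python
import torch
import torch.nn as nn
import torchvision
from torch.utils.data import DataLoader, random_split

def train_positioning_model(data_pool, epochs=10, lr=1e-4):
    """data_pool yields (image_tensor, position_tensor) pairs."""
    n_train = int(0.8 * len(data_pool))                   # 80% train / 20% test
    train_set, test_set = random_split(data_pool, [n_train, len(data_pool) - n_train])
    model = torchvision.models.resnet50(weights="IMAGENET1K_V2")  # pre-trained on a large picture set
    model.fc = nn.Linear(model.fc.in_features, 2)         # regress a position
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for images, positions in DataLoader(train_set, batch_size=32, shuffle=True):
            optimizer.zero_grad()
            loss_fn(model(images), positions).backward()
            optimizer.step()
    return model, test_set                                # verify on the test set
```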
Preferably, to adapt to the focal lengths (i.e., viewing angles) of different image-capturing devices in practical applications, the panoramic photograph can be segmented according to different focal-length parameters, producing planar projection photographs with different viewing angles as training samples. Specifically, each panoramic photograph can be segmented according to different focal-length parameters during the anti-distortion transformation to obtain a plurality of groups of planar projection photographs with different viewing angles. That is, the number of segments n is determined by the focal-length parameter F: when the focal-length parameter is small, the viewing angle is large and n can be smaller. As shown in fig. 2, a viewing-angle segmentation schematic of this embodiment, the most commonly used focal-length parameter F = 0.5 gives a viewing angle of 90 degrees, so a number of segments n = 4 covers the full 360 degrees. When planar projection photographs with other viewing angles are required, F can be changed to other values, such as 1.0 or 1.3, as sketched below.
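The anti-distortion transformation can be sketched as an equirectangular-to-rectilinear re-projection. The convention assumed here, an image plane of half-width 0.5 at distance F from the projection center, makes F = 0.5 correspond to a 90-degree viewing angle, matching fig. 2; the function name and the nearest-neighbor sampling are illustrative choices, not part of the application.

```python
import numpy as np

def pano_to_perspective(pano, yaw_deg, F=0.5, size=512):
    """Render a planar (rectilinear) view of an equirectangular panorama.
    F is the focal-length parameter; the line of sight is rotated yaw_deg
    degrees about the axis perpendicular to the ground."""
    H, W = pano.shape[:2]
    u = (np.arange(size) + 0.5) / size - 0.5         # image-plane coordinates in [-0.5, 0.5)
    x, y = np.meshgrid(u, u)
    yaw = np.deg2rad(yaw_deg)
    dz = np.full_like(x, F)                          # rays through a plane at distance F
    rx = x * np.cos(yaw) + dz * np.sin(yaw)          # rotate about the vertical axis
    rz = -x * np.sin(yaw) + dz * np.cos(yaw)
    lon = np.arctan2(rx, rz)                         # longitude on the panorama sphere
    lat = np.arctan2(y, np.hypot(rx, rz))            # latitude (down is positive)
    px = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    py = np.clip(((lat / np.pi + 0.5) * H).astype(int), 0, H - 1)
    return pano[py, px]                              # nearest-neighbor sampling
```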
Preferably, to improve the accuracy of the recognized orientation, the panoramic photograph can be segmented with a number of segments chosen so that coverage of the corresponding original image exceeds a specified percentage; that is, planar projection photographs are obtained in which adjacent pictures at the same viewing angle have overlapping fields of view. In other words, to enrich the shooting angles, it is recommended, with the focal length fixed, to use a number of segments larger than the number of non-overlapping divisions. Taking the axis of the panoramic projection sphere perpendicular to the ground as the rotation axis, a planar projection photograph with a 90-degree viewing angle is cut every 45 degrees of the line-of-sight direction (as shown by the arrows in fig. 2); adjacent pictures then have a 45-degree overlapping viewing angle. Each resulting planar projection photograph is labeled with orientation data according to the orientation angle of its line-of-sight center. F can also take the values 1.0 and 1.3, giving viewing angles of roughly 60 and 30 degrees with n set to 12 and 24 respectively; more values of F can be set, and n can be increased, to further improve the coverage of the training set. Coverage greater than 95% is generally ensured. A usage sketch of this overlapping split follows.
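Continuing the sketch above (with an equirectangular image pano already loaded), the overlapping split described here, a 90-degree view every 45 degrees of yaw, reduces to:

```python
# Eight overlapping 90-degree views (n = 8 instead of the minimal n = 4);
# adjacent views share a 45-degree overlap, and each is labeled with the
# orientation angle of its line-of-sight center.
views = [(yaw, pano_to_perspective(pano, yaw, F=0.5))
         for yaw in range(0, 360, 45)]
```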
Preferably, considering that in practice relying solely on panoramic photographs may degrade recognition when the live-action map is updated infrequently, the training samples can be supplemented during training with scene photographs obtained from the Internet or environment photographs collected in the positioning environment.
S103, determining a final position from the plurality of candidate positions.
After the plurality of candidate positions are obtained, a final position can be determined based on them; once determined, it can be output for the user to view.
Specifically, one candidate may be selected at random as the final position; or several candidates may be selected at random and the geometric center of the geometric figure they form taken as the final position. Alternatively, several candidate positions with a high degree of coincidence may be used as the final position.
Preferably, considering that a few relatively isolated positions may occur among the candidates, the candidate positions can be clustered to remove those that stray from the majority, and the final position then determined based on the remaining candidates, improving its accuracy. Specifically, the implementation comprises:
step one, clustering the plurality of candidate positions, and screening the plurality of candidate positions with the clustering result;
step two, constructing a geometric figure from the screened candidate positions;
step three, taking the geometric center of the geometric figure as the final position.
Specifically, the candidate positions can be classified with a clustering algorithm such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise), which groups neighboring position data into one class. The clustering parameters can be set to an epsilon neighborhood of 1 and a minimum point count MinPts = 5. The class containing the most position results is taken as the reliable result, and the geometric center of the geometric figure formed by all candidate positions of that class is computed as the final positioning result.
Preferably, to present the positioning better, a positioning error is also determined. Specifically, the standard deviation of the candidate positions relative to the final position is calculated and taken as the positioning error of the final position. That is, the squared deviation of each candidate position from the final position is calculated and accumulated to obtain the final positioning error.
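Both the clustering step and the error estimate can be sketched together as follows. eps = 1 and min_samples = 5 follow the parameters above; the coordinate units of eps and the fallback used when no cluster forms are assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def final_position(candidates):
    """candidates: (N, 2) array of candidate positions from S102."""
    labels = DBSCAN(eps=1, min_samples=5).fit_predict(candidates)
    clustered = labels[labels != -1]                 # drop noise points
    if clustered.size == 0:                          # no cluster formed: keep all
        members = candidates
    else:                                            # most-populated class is reliable
        members = candidates[labels == np.bincount(clustered).argmax()]
    fix = members.mean(axis=0)                       # geometric center = final position
    error = np.sqrt(np.mean(np.sum((members - fix) ** 2, axis=1)))  # standard deviation
    return fix, error
```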
By applying the method provided by the embodiments of the present application, a wide-angle photograph is obtained and randomly segmented to obtain an image set to be tested; the image set to be tested is input into the positioning model for position recognition to obtain a plurality of candidate positions, the positioning model being a neural network model trained with panoramic photographs from the live-action map; and a final position is determined from the plurality of candidate positions.
A live-action map is a map in which real street scenes can be viewed, containing 360-degree real scenes. The panoramic photographs in the live-action map are real street-view images, and they overlap with the environments in which visual positioning is applied. On this basis, the method trains the neural network model with the panoramic photographs from the live-action map to obtain a positioning model for visual positioning. After a wide-angle photograph is obtained, it is randomly segmented to obtain an image set to be tested; inputting this set into the positioning model for position recognition yields a plurality of candidate positions, from which a final position can be determined. Thus, in this method, the positioning model is obtained by training the neural network model on the panoramic photographs in the live-action map, and visual positioning is completed with that model, which solves the problem that training samples for visual positioning are difficult to collect.
It should be noted that the embodiments of the present application also admit corresponding improvements based on the embodiments above. The steps of the preferred/improved embodiments that are the same as, or correspond to, steps of the embodiments above can be referred to mutually, as can the corresponding advantages, so a detailed description of the preferred/improved embodiments is omitted here.
Corresponding to the method embodiments above, the embodiments of the present application further provide a visual positioning device; the visual positioning device described below and the visual positioning method described above may be referred to in correspondence with each other.
Referring to fig. 3, the visual positioning device includes:
a test-image-set acquisition module 101, configured to obtain a wide-angle photograph and randomly segment it to obtain an image set to be tested;
a candidate-position acquisition module 102, configured to input the image set to be tested into the positioning model for position recognition to obtain a plurality of candidate positions, the positioning model being a neural network model trained with panoramic photographs from the live-action map;
and a position output module 103, configured to determine a final position from the plurality of candidate positions.
By applying the device provided by the embodiments of the present application, a wide-angle photograph is obtained and randomly segmented to obtain an image set to be tested; the image set to be tested is input into the positioning model for position recognition to obtain a plurality of candidate positions, the positioning model being a neural network model trained with panoramic photographs from the live-action map; and a final position is determined from the plurality of candidate positions.
A live-action map is a map in which real street scenes can be viewed, containing 360-degree real scenes. The panoramic photographs in the live-action map are real street-view images, and they overlap with the environments in which visual positioning is applied. On this basis, the device trains the neural network model with the panoramic photographs from the live-action map to obtain a positioning model for visual positioning. After a wide-angle photograph is obtained, it is randomly segmented to obtain an image set to be tested; inputting this set into the positioning model for position recognition yields a plurality of candidate positions, from which a final position can be determined. Thus, in the device, the positioning model can be obtained by training the neural network model on the panoramic photographs in the live-action map, and visual positioning can be completed with that model, which solves the problem that training samples for visual positioning are difficult to collect.
In a specific embodiment of the present application, the position output module 103 specifically includes:
a position screening unit, configured to cluster the plurality of candidate positions and screen them with the clustering result;
a geometric figure construction unit, configured to construct a geometric figure from the screened candidate positions;
and a final position determination unit, configured to take the geometric center of the geometric figure as the final position.
In a specific embodiment of the present application, the position output module 103 further includes:
a positioning error determination unit, configured to calculate the standard deviation of the candidate positions relative to the final position and take that standard deviation as the positioning error of the final position.
In a specific embodiment of the present application, a model training module includes:
a panoramic photograph acquisition unit, configured to acquire a plurality of panoramic photographs from the live-action map and determine the geographic position of each panoramic photograph;
an anti-distortion transformation unit, configured to perform an anti-distortion transformation on the panoramic photographs to obtain a plurality of groups of planar projection photographs with the same aspect ratio;
a geographic labeling unit, configured to label each group of planar projection photographs with a geographic label according to its correspondence with the panoramic photograph, the geographic label comprising a geographic position and a specific orientation;
a training sample determination unit, configured to take the planar projection photographs labeled with geographic labels as training samples;
and a model training unit, configured to train the neural network model with the training samples and determine the trained neural network model as the positioning model.
In a specific embodiment of the present application, the anti-distortion transformation unit is specifically configured to segment each panoramic photograph according to different focal-length parameters during the anti-distortion transformation to obtain a plurality of groups of planar projection photographs with different viewing angles.
In a specific embodiment of the present application, the anti-distortion transformation unit is specifically configured to segment each panoramic photograph with a number of segments chosen so that coverage of the corresponding original image exceeds a specified percentage, obtaining a plurality of groups of planar projection photographs in which adjacent pictures have overlapping viewing angles.
In a specific embodiment of the present application, the model training module further includes:
a sample supplement unit, configured to supplement the training samples with scene photographs obtained from the Internet or environment photographs collected in the positioning environment.
In a specific embodiment of the present application, the test-image-set acquisition module 101 is specifically configured to randomly segment the wide-angle photograph into a given number of segments such that coverage of the original image exceeds a specified percentage, obtaining an image set to be tested that matches the number of segments.
Corresponding to the method embodiments above, the embodiments of the present application further provide a visual positioning apparatus; the visual positioning apparatus described below and the visual positioning method described above may be referred to in correspondence with each other.
Referring to fig. 4, the visual positioning apparatus includes:
a memory 410, configured to store a computer program;
and a processor 420, configured to implement the steps of the visual positioning method provided by the method embodiments above when executing the computer program.
Specifically, referring to fig. 5, a schematic diagram of a specific structure of the visual positioning apparatus of this embodiment: the apparatus may differ considerably in configuration or performance, and may include one or more central processing units (CPUs) 420 (e.g., one or more processors), a memory 410, and one or more storage media storing computer applications 413 or data 412. The memory 410 may be transient or persistent storage. A computer application may comprise one or more modules (not shown), each of which may include a series of instruction operations for the apparatus. Further, the central processor 420 may be configured to communicate with the memory 410 and execute, on the visual positioning apparatus 400, the series of instruction operations stored in the memory 410.
The visual positioning apparatus 400 may also include one or more power supplies 430, one or more wired or wireless network interfaces 440, one or more input/output interfaces 450, and one or more operating systems 411.
The steps of the visual positioning method described above can be implemented by this structure of the visual positioning apparatus.
Corresponding to the method embodiments above, the embodiments of the present application further provide a readable storage medium; the readable storage medium described below and the visual positioning method described above may be referred to in correspondence with each other.
A readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the visual positioning method provided by the method embodiments above.
The readable storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium that can store program code.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of their functions. Whether these functions are executed in hardware or software depends on the particular application and the design constraints of the technical solution. Those skilled in the art may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of the present application.

Claims (8)

1. A visual positioning method, comprising:
obtaining a wide-angle photograph, and randomly segmenting the wide-angle photograph to obtain an image set to be tested;
inputting the image set to be tested into a positioning model for position recognition to obtain a plurality of candidate positions, wherein the positioning model is a neural network model trained with panoramic photographs from a live-action map;
and determining a final position from the plurality of candidate positions;
wherein the process of training the neural network model comprises:
acquiring a plurality of panoramic photographs from the live-action map, and determining the geographic position of each panoramic photograph;
performing an anti-distortion transformation on the panoramic photographs to obtain a plurality of groups of planar projection photographs with the same aspect ratio;
labeling each group of planar projection photographs with a geographic label according to its correspondence with the panoramic photograph, the geographic label comprising a geographic position and a specific orientation;
taking the planar projection photographs labeled with geographic labels as training samples;
and training the neural network model with the training samples, and determining the trained neural network model as the positioning model;
wherein performing the anti-distortion transformation on the panoramic photographs to obtain a plurality of groups of planar projection photographs with the same aspect ratio comprises:
segmenting each panoramic photograph according to different focal-length parameters during the anti-distortion transformation to obtain a plurality of groups of planar projection photographs with different viewing angles;
and wherein segmenting each panoramic photograph according to different focal-length parameters during the anti-distortion transformation to obtain a plurality of groups of planar projection photographs with different viewing angles comprises:
segmenting each panoramic photograph with a number of segments chosen so that coverage of the corresponding original image exceeds a specified percentage, to obtain a plurality of groups of planar projection photographs in which adjacent pictures have overlapping viewing angles.
2. The visual positioning method of claim 1, wherein determining a final position from the plurality of candidate positions comprises:
clustering the candidate positions, and screening the candidate positions with the clustering result;
constructing a geometric figure from the screened candidate positions;
and taking the geometric center of the geometric figure as the final position.
3. The visual positioning method of claim 2, further comprising:
calculating the standard deviation of the candidate positions relative to the final position;
and taking the standard deviation as the positioning error of the final position.
4. The visual positioning method of claim 1, wherein the process of training the neural network model further comprises:
supplementing the training samples with scene photographs obtained from the Internet or environment photographs collected in the positioning environment.
5. The visual positioning method of claim 1, wherein randomly segmenting the wide-angle photograph to obtain the image set to be tested comprises:
randomly segmenting the wide-angle photograph into a given number of segments such that coverage of the original image exceeds a specified percentage, to obtain an image set to be tested that matches the number of segments.
6. A visual positioning device, comprising:
a test-image-set acquisition module for obtaining a wide-angle photograph and randomly segmenting it to obtain an image set to be tested;
a candidate-position acquisition module for inputting the image set to be tested into a positioning model for position recognition to obtain a plurality of candidate positions, wherein the positioning model is a neural network model trained with panoramic photographs from a live-action map;
a position output module for determining a final position from the plurality of candidate positions;
and a model training module that performs the process of training the neural network model and comprises:
a panoramic photograph acquisition unit for acquiring a plurality of panoramic photographs from the live-action map and determining the geographic position of each panoramic photograph;
an anti-distortion transformation unit for performing an anti-distortion transformation on the panoramic photographs to obtain a plurality of groups of planar projection photographs with the same aspect ratio;
a geographic labeling unit for labeling each group of planar projection photographs with a geographic label according to its correspondence with the panoramic photograph, the geographic label comprising a geographic position and a specific orientation;
a training sample determination unit for taking the planar projection photographs labeled with geographic labels as training samples;
and a model training unit for training the neural network model with the training samples and determining the trained neural network model as the positioning model;
wherein, in the anti-distortion transformation unit, performing the anti-distortion transformation on the panoramic photographs to obtain a plurality of groups of planar projection photographs with the same aspect ratio comprises:
segmenting each panoramic photograph according to different focal-length parameters during the anti-distortion transformation to obtain a plurality of groups of planar projection photographs with different viewing angles;
and segmenting each panoramic photograph according to different focal-length parameters during the anti-distortion transformation to obtain a plurality of groups of planar projection photographs with different viewing angles comprises:
segmenting each panoramic photograph with a number of segments chosen so that coverage of the corresponding original image exceeds a specified percentage, to obtain a plurality of groups of planar projection photographs in which adjacent pictures have overlapping viewing angles.
7. A visual positioning apparatus, comprising:
a memory for storing a computer program;
and a processor for implementing the visual positioning method of any one of claims 1 to 5 when executing the computer program.
8. A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the visual positioning method of any one of claims 1 to 5.
CN202080001067.0A (priority and filing date 2020-05-26): Visual positioning method, device, equipment and readable storage medium; granted as CN111758118B (en); status Active

Applications Claiming Priority (1)

PCT/CN2020/092284 (published as WO2021237443A1); priority date 2020-05-26; filing date 2020-05-26; title: Visual positioning method and apparatus, device and readable storage medium

Publications (2)

CN111758118A (en), published 2020-10-09
CN111758118B (en), published 2024-04-16

Family

ID: 72713357

Family Applications (1)

CN202080001067.0A; Active; priority date 2020-05-26; filing date 2020-05-26; title: Visual positioning method, device, equipment and readable storage medium (CN111758118B, en)

Country Status (3)

JP (1) JP7446643B2 (en)
CN (1) CN111758118B (en)
WO (1) WO2021237443A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113724284A (en) * 2021-09-03 2021-11-30 四川智胜慧旅科技有限公司 Position locking device, mountain type scenic spot search and rescue system and search and rescue method
CN117289626B (en) * 2023-11-27 2024-02-02 杭州维讯机器人科技有限公司 Virtual simulation method and system for industrialization

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6671400B1 (en) * 2000-09-28 2003-12-30 Tateyama R & D Co., Ltd. Panoramic image navigation system using neural network for correction of image distortion
JP2005315746A (en) * 2004-04-28 2005-11-10 Mitsubishi Heavy Ind Ltd Own position identifying method, and device therefor
CN202818503U (en) * 2012-09-24 2013-03-20 天津市亚安科技股份有限公司 Multidirectional monitoring area early warning positioning automatic tracking and monitoring device
CN104200188A (en) * 2014-08-25 2014-12-10 北京慧眼智行科技有限公司 Method and system for rapidly positioning position detection patterns of QR code
CN109285178A (en) * 2018-10-25 2019-01-29 北京达佳互联信息技术有限公司 Image partition method, device and storage medium
CN110136136A (en) * 2019-05-27 2019-08-16 北京达佳互联信息技术有限公司 Scene Segmentation, device, computer equipment and storage medium
CN110298320A (en) * 2019-07-01 2019-10-01 北京百度网讯科技有限公司 A kind of vision positioning method, device and storage medium
CN110503037A (en) * 2019-08-22 2019-11-26 三星电子(中国)研发中心 A kind of method and system of the positioning object in region

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109308678B (en) * 2017-07-28 2023-10-27 株式会社理光 Method, device and equipment for repositioning by using panoramic image
CN108009588A (en) 2017-12-01 2018-05-08 深圳市智能现实科技有限公司 Localization method and device, mobile terminal
JP6676082B2 (en) 2018-01-18 2020-04-08 光禾感知科技股▲ふん▼有限公司 Indoor positioning method and system, and device for creating the indoor map
CN110298370A (en) * 2018-03-21 2019-10-01 北京猎户星空科技有限公司 Network model training method, device and object pose determine method, apparatus
US11195010B2 (en) * 2018-05-23 2021-12-07 Smoked Sp. Z O. O. Smoke detection system and method
KR102227583B1 (en) * 2018-08-03 2021-03-15 한국과학기술원 Method and apparatus for camera calibration based on deep learning
CN109829406A (en) * 2019-01-22 2019-05-31 上海城诗信息科技有限公司 A kind of interior space recognition methods
CN110636274A (en) * 2019-11-11 2019-12-31 成都极米科技股份有限公司 Ultrashort-focus picture screen alignment method and device, ultrashort-focus projector and storage medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6671400B1 (en) * 2000-09-28 2003-12-30 Tateyama R & D Co., Ltd. Panoramic image navigation system using neural network for correction of image distortion
JP2005315746A (en) * 2004-04-28 2005-11-10 Mitsubishi Heavy Ind Ltd Own position identifying method, and device therefor
CN202818503U (en) * 2012-09-24 2013-03-20 天津市亚安科技股份有限公司 Multidirectional monitoring area early warning positioning automatic tracking and monitoring device
CN104200188A (en) * 2014-08-25 2014-12-10 北京慧眼智行科技有限公司 Method and system for rapidly positioning position detection patterns of QR code
CN109285178A (en) * 2018-10-25 2019-01-29 北京达佳互联信息技术有限公司 Image partition method, device and storage medium
CN110136136A (en) * 2019-05-27 2019-08-16 北京达佳互联信息技术有限公司 Scene Segmentation, device, computer equipment and storage medium
CN110298320A (en) * 2019-07-01 2019-10-01 北京百度网讯科技有限公司 A kind of vision positioning method, device and storage medium
CN110503037A (en) * 2019-08-22 2019-11-26 三星电子(中国)研发中心 A kind of method and system of the positioning object in region

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
An image positioning system based on a ray model; 邓磊, 陈宝华, 黄思远, 段岳圻, 周杰; Acta Electronica Sinica (No. 01), pp. 4-10 *

Also Published As

Publication number Publication date
JP7446643B2 (en) 2024-03-11
JP2023523364A (en) 2023-06-02
WO2021237443A1 (en) 2021-12-02
CN111758118A (en) 2020-10-09

Similar Documents

Publication Publication Date Title
CN109520500B (en) Accurate positioning and street view library acquisition method based on terminal shooting image matching
US9530235B2 (en) Aligning panoramic imagery and aerial imagery
CN106529538A (en) Method and device for positioning aircraft
EP3274964B1 (en) Automatic connection of images using visual features
Houshiar et al. A study of projections for key point based registration of panoramic terrestrial 3D laser scan
CA2850937A1 (en) Using videogrammetry to fabricate parts
CN113408566A (en) Target detection method and related equipment
CN111758118B (en) Visual positioning method, device, equipment and readable storage medium
CN111383204A (en) Video image fusion method, fusion device, panoramic monitoring system and storage medium
CN112364843A (en) Plug-in aerial image target positioning detection method, system and equipment
Castillo-Carrión et al. SIFT optimization and automation for matching images from multiple temporal sources
CN108447092B (en) Method and device for visually positioning marker
CN117218201A (en) Unmanned aerial vehicle image positioning precision improving method and system under GNSS refusing condition
CN111383286A (en) Positioning method, positioning device, electronic equipment and readable storage medium
CN112270748A (en) Three-dimensional reconstruction method and device based on image
CN117253022A (en) Object identification method, device and inspection equipment
Božić-Štulić et al. Complete model for automatic object detection and localisation on aerial images using convolutional neural networks
Atik et al. An automatic image matching algorithm based on thin plate splines
CN107644394A (en) A kind of processing method and processing device of 3D rendering
Zhang et al. An UAV navigation aided with computer vision
Marbel et al. Star-tracker algorithm for smartphones and commercial micro-drones
Blazhko et al. Unmanned Aerial Vehicle (UAV): back to base without satellite navigation
Gotovac et al. A model for automatic geomapping of aerial images mosaic acquired by UAV
Porzi et al. An automatic image-to-DEM alignment approach for annotating mountains pictures on a smartphone
CN109269477A (en) A kind of vision positioning method, device, equipment and storage medium

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
TA01: Transfer of patent application right
Effective date of registration: 20210923
Address after: P.O. Box 4519, 30 de Castro Street, Wickham's Cay 1, Road Town, Tortola, British Virgin Islands
Applicant after: Fengtuzhi Technology Holding Co., Ltd.
Address before: Room 901, Cheung Sha Wan Building, 909 Cheung Sha Wan Road, Lai Chi Kok, Hong Kong, China
Applicant before: Fengtu Technology Co., Ltd.
GR01: Patent grant