WO2021027676A1 - Visual positioning method, terminal, and server - Google Patents

Visual positioning method, terminal, and server

Info

Publication number
WO2021027676A1
Authority
WO
WIPO (PCT)
Prior art keywords
descriptor
vertical line
information
image
location
Prior art date
Application number
PCT/CN2020/107364
Other languages
English (en)
French (fr)
Inventor
丁然 (Ding Ran)
周妍 (Zhou Yan)
王永亮 (Wang Yongliang)
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201910736244.9A (granted as CN112348886B)
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2021027676A1
Priority to US17/667,122 (published as US20220156969A1)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformation in the plane of the image
    • G06T3/40: Scaling the whole image or part thereof
    • G06T3/4038: Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/11: Region-based segmentation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/50: Depth or shape recovery
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/50: Depth or shape recovery
    • G06T7/55: Depth or shape recovery from multiple images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G06V20/176: Urban or other man-made structures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00: Indexing scheme for image data processing or generation, in general
    • G06T2200/32: Indexing scheme for image data processing or generation, in general, involving image mosaicing

Definitions

  • This application relates to the field of positioning technology, and in particular to a visual positioning method, terminal and server.
  • Outdoor positioning technology mainly relies on the global positioning system (GPS). GPS signals are susceptible to occlusion and multipath reflection caused by buildings, so the positioning accuracy is low. Visual positioning can achieve more accurate positioning by acquiring image information of the current scene.
  • In conventional visual positioning, a large number of images are collected in advance, feature points are extracted based on the brightness changes of image pixels, and a three-dimensional map of the scene is constructed.
  • The map contains the three-dimensional positions of the feature points and their corresponding descriptors; a feature-point descriptor describes the relationship between the feature point and its surrounding pixels.
  • At query time, feature points extracted from the currently captured image are matched against those extracted during database construction. If the scene has changed greatly between the current image and the database images, for example because of different lighting conditions or rain and snow, the extracted feature points cannot be matched effectively, and the accuracy of the visual positioning result is low.
  • The embodiments of the present application provide a visual positioning method that can achieve higher-accuracy positioning even when the scene of the captured image has changed greatly.
  • the first aspect of the embodiments of the present application provides a visual positioning method.
  • The method includes: a terminal acquires an image of a building; the terminal generates a descriptor according to the image, where the descriptor includes horizontal viewing angle information between a first characteristic vertical line and a second characteristic vertical line in the image, the first characteristic vertical line indicates a first elevation intersection line of the building, and the second characteristic vertical line indicates a second elevation intersection line of the building.
  • The terminal matches the descriptor in a preset descriptor database to obtain positioning information of the shooting location of the image. The preset descriptor database includes the geographic locations of candidate points and the descriptors of the candidate points; the descriptor of a candidate point includes, taking the candidate point as the viewpoint, the direction information of the visible elevation intersection lines of surrounding buildings.
  • The geographic location of the candidate point whose descriptor matches the descriptor of the image is the positioning information of the shooting location of the image.
  • the user can obtain the location information of the shooting location of the current image by taking an image.
  • the terminal can obtain an image of the building, and the terminal can generate a descriptor corresponding to the image based on the image.
  • the descriptor is used to indicate the spatial positional relationship between the shooting location of the image and the intersection of the building's elevation.
  • The descriptor includes the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line in the image; the first characteristic vertical line indicates the first elevation intersection line of the building, and the second characteristic vertical line indicates the second elevation intersection line. The terminal matches the descriptor in the preset descriptor database to obtain the positioning information of the shooting location of the image.
  • the preset descriptor database includes the geographic location of the candidate point and the descriptor of the candidate point.
  • A candidate point is a pre-selected point with a known geographic location, used as a reference point at which the direction information of the visible elevation intersection lines of surrounding buildings is collected.
  • This direction information includes the horizontal viewing angle information between any two visible elevation intersection lines with the candidate point as the viewpoint. Therefore, the descriptor obtained from the image can be matched against the descriptors of the candidate points, and the candidate-point descriptor with the highest matching degree is determined.
  • The geographic location of that candidate point is then used as the positioning information of the shooting location of the image.
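  • To make this matching step concrete, the following Python sketch (an illustration only, not the patented algorithm; representing a descriptor as a plain list of angles in degrees, the 5-degree tolerance, and all function names are assumptions) scores a query descriptor against each candidate point by trying rotation offsets, since the absolute heading of the camera is unknown:

```python
def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def match_score(image_angles, candidate_angles, tolerance=5.0):
    """Score how well the image descriptor matches a candidate point's
    descriptor.  Both are lists of horizontal angles (degrees) at which a
    characteristic vertical line is seen.  Because the camera's absolute
    heading is unknown, every pairing-induced rotation offset is tried and
    the best count of angle pairs agreeing within `tolerance` is kept."""
    best = 0
    for img_a in image_angles:
        for cand_a in candidate_angles:
            offset = cand_a - img_a  # hypothesis: img_a corresponds to cand_a
            matched = sum(
                1
                for a in image_angles
                if any(angular_diff(a + offset, c) <= tolerance
                       for c in candidate_angles)
            )
            best = max(best, matched)
    return best

def locate(image_angles, candidates):
    """Return the geographic location (dict key) of the best-matching
    candidate point; `candidates` maps location -> list of angles."""
    return max(candidates, key=lambda p: match_score(image_angles, candidates[p]))
```

The candidate whose descriptor admits the largest number of angle agreements under some rotation is reported as the estimated shooting location.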
  • Optionally, the terminal generating the descriptor according to the image includes: the terminal extracts the first characteristic vertical line and the second characteristic vertical line of the building from the image, and the terminal generates the descriptor according to the positional relationship between the first characteristic vertical line and the second characteristic vertical line, where the positional relationship includes the horizontal viewing angle information between the two lines.
  • the terminal can extract the first characteristic vertical line and the second characteristic vertical line of the building from the image, and then generate a descriptor.
  • Optionally, the descriptor includes height information and/or depth information of the first characteristic vertical line and the second characteristic vertical line. The height information of the first characteristic vertical line indicates the height of the first elevation intersection line, and the height information of the second characteristic vertical line indicates the height of the second elevation intersection line. The depth information of the first characteristic vertical line indicates the distance from the shooting location of the image to the first elevation intersection line, and the depth information of the second characteristic vertical line indicates the distance from the shooting location to the second elevation intersection line.
  • In other words, the descriptor may also include the height information and/or depth information of each characteristic vertical line: the height information indicates the height of the corresponding elevation intersection line of the building, and the depth information indicates the distance between the shooting location of the image and that intersection line.
  • In addition to the horizontal viewing angle information between the first and second characteristic vertical lines, the descriptor therefore carries richer position information about the elevation intersection lines.
  • The height and depth information can also be used in matching. For example, when the elevation intersection lines of the buildings around the shooting location are angularly distributed in a relatively uniform way, the additional height information can improve the accuracy of positioning, and it can also provide the orientation in which the image was taken.
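  • As a sketch of how height information can break such ties (illustrative only: the (angle, height) tuple representation and the tolerances are assumptions, and the heading offset is taken as already resolved), a pair of lines only counts as matched when both the angle and the height agree:

```python
def match_with_height(image_lines, candidate_lines, angle_tol=5.0, height_tol=2.0):
    """Count matched lines between two descriptors whose entries are
    (angle_deg, height_m) tuples.  When several candidates have nearly the
    same angular layout, the per-line facade height breaks the tie."""
    def ang_diff(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    return sum(
        1
        for a, h in image_lines
        if any(ang_diff(a, ca) <= angle_tol and abs(h - ch) <= height_tol
               for ca, ch in candidate_lines)
    )
```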
  • the descriptor is represented by a circular array
  • The circular array includes first data indicating the first characteristic vertical line and second data indicating the second characteristic vertical line.
  • the positional interval between the first data and the second data in the circular array is used to indicate horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
  • The visual positioning method provided by the embodiments of this application thus offers a specific storage form for descriptors, namely a circular array.
  • The positional interval between entries conveniently expresses the angle information without requiring an absolute orientation, and it also simplifies matching against the descriptor database.
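  • A minimal sketch of this circular-array form (illustrative, not the patented implementation: here a slot simply holds 1 where a characteristic vertical line is seen, whereas the patent's entries may also carry height or depth data; one slot per degree is an assumption):

```python
def to_circular_array(angles_deg, bins=360):
    """Encode a descriptor as a circular array with one slot per degree;
    the index interval between two non-zero slots encodes the horizontal
    viewing angle between the corresponding vertical lines."""
    arr = [0] * bins
    for a in angles_deg:
        arr[int(round(a * bins / 360.0)) % bins] = 1
    return arr

def best_cyclic_match(query, reference):
    """Try every cyclic rotation of `query` against `reference` and return
    (best_shift, score), where score counts coinciding non-zero slots.
    The best shift also estimates the camera heading relative to the
    candidate point's reference frame."""
    n = len(query)
    scores = [
        sum(q & reference[(i + shift) % n] for i, q in enumerate(query))
        for shift in range(n)
    ]
    best_shift = max(range(n), key=scores.__getitem__)
    return best_shift, scores[best_shift]
```

Because the array is circular, no absolute orientation is needed: a camera rotation only cyclically shifts the array, which the matcher searches over.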
  • Optionally, the descriptor is represented by a circular array that includes the first data, the second data, and third data, where the third data is used to indicate a direction in which no characteristic vertical line of the building appears.
  • the visual positioning method provided by the embodiment of the present application provides another specific storage form of the descriptor.
  • The circular array includes data corresponding to each characteristic vertical line as well as data indicating the directions in which no characteristic vertical line of the building appears, which improves the feasibility of the solution.
  • Optionally, the first data includes height information and/or depth information of the first characteristic vertical line, and the second data includes height information and/or depth information of the second characteristic vertical line.
  • the visual positioning method provided by the embodiment of the present application provides another specific storage form of the descriptor.
  • The data entries of the circular array may also carry height information and/or depth information, which provides a specific way of storing that information.
  • Optionally, the descriptor is represented by a circle, with a first feature point on the circle indicating the first characteristic vertical line and a second feature point indicating the second characteristic vertical line.
  • The center of the circle is the optical center corresponding to the image.
  • The first feature point is obtained by projecting the first characteristic vertical line onto a cylinder whose axis is the gravity axis passing through the optical center, and then projecting the result onto the horizontal plane containing the optical center; the second feature point is obtained by projecting the second characteristic vertical line in the same way.
  • The angle between the line connecting the first feature point to the center of the circle and the line connecting the second feature point to the center indicates the horizontal viewing angle information.
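  • Under a gravity-aligned pinhole camera, this projection reduces to a simple formula: the pixel column u of a vertical line maps to the horizontal direction atan2(u - cx, fx) about the optical center, where cx is the principal-point column and fx the focal length in pixels. This simplified camera model and the function names are assumptions of the sketch, not taken from the patent:

```python
import math

def vertical_line_to_angle(u, cx, fx):
    """Horizontal direction (radians) of a characteristic vertical line at
    pixel column u, i.e. the angle of the point it projects to on the
    horizontal circle through the optical center."""
    return math.atan2(u - cx, fx)

def horizontal_viewing_angle(u1, u2, cx, fx):
    """Horizontal viewing angle between two characteristic vertical lines
    observed at pixel columns u1 and u2."""
    return abs(vertical_line_to_angle(u1, cx, fx) - vertical_line_to_angle(u2, cx, fx))
```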
  • the descriptor can be expressed through intuitive geometric information, which can conveniently display the generation method of the descriptor, reflect the specific process of visual positioning, and enhance the interactive experience.
  • Optionally, the method further includes: the terminal obtains first positioning information, where the first positioning information includes location information obtained from a GPS signal, a Wi-Fi signal, the location of the base station serving the terminal, or an address manually entered by the user. The terminal matching the descriptor in the preset descriptor database to obtain the positioning information of the shooting location then includes: the terminal matches the descriptor in a first descriptor database, where the first descriptor database includes the geographic locations of first candidate points and the descriptors of the first candidate points, a first candidate point being a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
  • In this way, the terminal can obtain a rough first location range from the first positioning information and narrow the descriptor matching accordingly, matching only the descriptors of candidate points within that range, which reduces the amount of computation and increases the speed of visual positioning.
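  • A sketch of this pre-filtering step (illustrative only; planar coordinates and a circular search range are simplifying assumptions):

```python
def filter_candidates(candidates, coarse_xy, radius):
    """Keep only candidate points whose geographic location falls within
    `radius` of the coarse first-position fix (GPS / Wi-Fi / base station /
    manual address), so descriptor matching runs on a much smaller
    sub-database."""
    cx, cy = coarse_xy
    return {
        p: d for p, d in candidates.items()
        if (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= radius ** 2
    }
```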
  • Optionally, the method further includes: the terminal obtains first positioning information, where the first positioning information includes location information obtained from a GPS signal, a Wi-Fi signal, the location of the base station serving the terminal, or an address manually entered by the user; the terminal sends the first positioning information to the server; and the terminal receives the preset descriptor database sent by the server, which contains the geographic locations of the first candidate points and the descriptors of the first candidate points, where a first candidate point is a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
  • the terminal may send the first positioning information to the server, and obtain the preset descriptor sub-database from the server, so that local matching and positioning may be realized.
  • Optionally, the method further includes: the terminal obtains shooting orientation information of the image, and the terminal matching the descriptor in the preset descriptor database to obtain the positioning information of the shooting location includes: the terminal matches the descriptor against the preset descriptor database within a first angle range determined by the shooting orientation information.
  • the terminal can obtain the shooting orientation information of the image and match the descriptors of the candidate points within the angle range constrained by the orientation information, which can reduce the amount of calculation and increase the speed of visual positioning .
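  • As an illustrative sketch (not the patented method), with descriptors stored as 360-slot circular arrays of 0/1 values, the shooting orientation limits the cyclic rotations that need to be scored from 360 down to a small window:

```python
def match_with_heading(query, reference, heading_deg, tol_deg=20):
    """Score cyclic alignments of two equal-length circular arrays, but
    only at rotations consistent with the compass heading
    (heading_deg plus or minus tol_deg).  Returns (best_shift, score)."""
    n = len(query)
    best_shift, best_score = None, -1
    for d in range(-tol_deg, tol_deg + 1):
        shift = (int(round(heading_deg)) + d) % n
        score = sum(q & reference[(i + shift) % n] for i, q in enumerate(query))
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score
```

Compared with scanning all rotations, this tests only 2 * tol_deg + 1 shifts, which is the computational saving the constraint provides.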
  • Optionally, the terminal acquiring the image of the building includes: the terminal acquires a first image shot at a first location and a second image shot at a second location, where the first location and the second location are the same location or the distance between them is less than a preset threshold, and the first image and the second image contain partially overlapping image information.
  • By taking multiple images, the terminal can widen the field of view and capture more building information; the first image and the second image, which contain partially overlapping image information, can be stitched together and used to generate the descriptor.
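  • A simplified sketch of combining the two images' line observations (assuming, as a shortcut, that the angles of both images are already expressed in a common heading frame, for example via a compass; real stitching would align the images photometrically first):

```python
def stitch_descriptors(angles_a, angles_b, tol=2.0):
    """Merge the characteristic-vertical-line angles (degrees) extracted
    from two overlapping images: angles from the second image that
    duplicate one from the first within `tol` degrees are dropped, so the
    merged descriptor covers the widened field of view without repeats."""
    merged = sorted(angles_a)
    for b in angles_b:
        if all(min(abs(b - a) % 360.0, 360.0 - abs(b - a) % 360.0) > tol
               for a in merged):
            merged.append(b)
    return sorted(merged)
```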
  • the preset descriptor database is generated based on satellite images.
  • the preset descriptor database can be generated based on satellite images.
  • A level-of-detail (LOD) model is generated from the satellite images, candidate points are selected in the LOD model, the descriptor of each candidate point is computed, and the descriptor database is then constructed from them.
  • This solution uses satellite images to automatically construct a descriptor database on a large scale, eliminating the need to collect images on site, reducing the workload and the difficulty of building a database.
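  • The construction can be sketched as follows (a toy stand-in for the LOD pipeline: building corners in a plan-view map play the role of facade intersection lines, since a vertical edge projects to a single point in plan view; visibility testing against the full model is omitted, and all names are illustrative):

```python
import math

def candidate_descriptor(viewpoint, building_corners):
    """Descriptor of one candidate point: the sorted horizontal directions
    (degrees, 0-360) from the viewpoint to each building corner, each
    corner standing in for a facade intersection line."""
    vx, vy = viewpoint
    return sorted(
        math.degrees(math.atan2(cy - vy, cx - vx)) % 360.0
        for cx, cy in building_corners
    )

def build_database(candidate_points, building_corners):
    """Descriptor database: geographic location -> descriptor, computed
    offline for every pre-selected candidate point."""
    return {p: candidate_descriptor(p, building_corners) for p in candidate_points}
```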
  • The second aspect of the embodiments of the present application provides a visual positioning method, including: a server receives an image of a building sent by a terminal; the server generates a descriptor according to the image, where the descriptor includes the horizontal viewing angle information between a first characteristic vertical line and a second characteristic vertical line, the first characteristic vertical line indicates the first elevation intersection line of the building, and the second characteristic vertical line indicates the second elevation intersection line; and the server matches the descriptor in a preset descriptor database to obtain the positioning information of the shooting location of the image.
  • The preset descriptor database includes the geographic locations of candidate points and the descriptors of the candidate points; the descriptor of a candidate point includes, taking the candidate point as the viewpoint, the direction information of the visible elevation intersection lines of surrounding buildings. The geographic location of the candidate point whose descriptor matches the descriptor of the image is the positioning information of the shooting location of the image; the server sends the positioning information to the terminal.
  • Optionally, the server generating the descriptor according to the image includes: the server extracts the first characteristic vertical line and the second characteristic vertical line of the building from the image, and generates the descriptor according to the positional relationship between them, where the positional relationship includes the horizontal viewing angle information between the two lines.
  • the server can extract the first characteristic vertical line and the second characteristic vertical line of the building from the image, and then generate a descriptor.
  • Optionally, the descriptor includes height information and/or depth information of the first and second characteristic vertical lines: the height information of each line indicates the height of the corresponding elevation intersection line, and the depth information indicates the distance from the shooting location of the image to that intersection line.
  • In other words, the descriptor may also include height information and/or depth information of the characteristic vertical lines.
  • Besides the horizontal viewing angle between the first and second characteristic vertical lines, the descriptor then provides richer position information about the elevation intersection lines of the buildings.
  • The height and depth information can also be used in matching; for example, when the elevation intersection lines around the shooting location are angularly distributed in a relatively uniform way, the additional height information can improve positioning accuracy and also provide the orientation in which the image was taken.
  • the descriptor is represented by a circular array
  • The circular array includes first data indicating the first characteristic vertical line and second data indicating the second characteristic vertical line.
  • the positional interval between the first data and the second data in the circular array is used to indicate horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
  • The visual positioning method provided by the embodiments of this application thus offers a specific storage form for descriptors, namely a circular array.
  • The positional interval between entries conveniently expresses the angle information without requiring an absolute orientation, and it also simplifies matching against the descriptor database.
  • Optionally, the descriptor is represented by a circular array that includes the first data, the second data, and third data, where the third data is used to indicate a direction in which no characteristic vertical line of the building appears.
  • the visual positioning method provided by the embodiment of the present application provides another specific storage form of the descriptor.
  • The circular array includes data corresponding to each characteristic vertical line as well as data indicating the directions in which no characteristic vertical line of the building appears, which improves the feasibility of the solution.
  • Optionally, the first data includes height information and/or depth information of the first characteristic vertical line, and the second data includes height information and/or depth information of the second characteristic vertical line.
  • the visual positioning method provided by the embodiment of the present application provides another specific storage form of the descriptor.
  • The data entries of the circular array may also carry height information and/or depth information, which provides a specific way of storing that information.
  • Optionally, the descriptor is represented by a circle, with a first feature point on the circle indicating the first characteristic vertical line and a second feature point indicating the second characteristic vertical line.
  • The center of the circle is the optical center corresponding to the image.
  • The first feature point is obtained by projecting the first characteristic vertical line onto a cylinder whose axis is the gravity axis passing through the optical center, and then projecting the result onto the horizontal plane containing the optical center; the second feature point is obtained by projecting the second characteristic vertical line in the same way.
  • The angle between the line connecting the first feature point to the center of the circle and the line connecting the second feature point to the center indicates the horizontal viewing angle information.
  • the descriptor can be expressed through intuitive geometric information, which can conveniently display the generation method of the descriptor, reflect the specific process of visual positioning, and enhance the interactive experience.
  • Optionally, the method further includes: the server obtains first positioning information sent by the terminal, where the first positioning information includes location information obtained from a GPS signal, a Wi-Fi signal, the location of the base station serving the terminal, or an address manually entered by the user. The server matching the descriptor in the preset descriptor database then includes: the server matches the descriptor in a first descriptor database to obtain the positioning information.
  • The first descriptor database includes the geographic locations of first candidate points and the descriptors of the first candidate points, where a first candidate point is a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
  • In this way, the server can obtain a rough first location range from the first positioning information and narrow the descriptor matching accordingly, matching only the descriptors of candidate points within that range, which reduces the amount of computation and increases the speed of visual positioning.
  • Optionally, the method further includes: the server obtains the shooting orientation information of the image sent by the terminal, and the server matching the descriptor in the preset descriptor database to obtain the positioning information of the shooting location includes: the server matches the descriptor against the preset descriptor database within a first angle range determined by the shooting orientation information.
  • the server can obtain the shooting orientation information of the image and match the descriptors of the candidate points within the angle range constrained by the orientation information, which can reduce the amount of calculation and improve the speed of visual positioning .
  • Optionally, the server receiving the image of the building sent by the terminal includes: the server receives a first image shot at a first location and a second image shot at a second location, where the first location and the second location are the same location or the distance between them is less than a preset threshold, and the first image and the second image contain partially overlapping image information.
  • the server can receive multiple images to obtain more building image information, and the first image and the second image with partially repeated image information can be spliced and used to generate a descriptor.
  • the preset descriptor database is generated based on satellite images.
  • the preset descriptor database can be generated based on satellite images.
  • A level-of-detail (LOD) model is generated from the satellite images, candidate points are selected in the LOD model, the descriptor of each candidate point is computed, and the descriptor database is then constructed from them.
  • This solution uses satellite images to automatically construct a descriptor database on a large scale, eliminating the need to collect images on site, reducing the workload and the difficulty of building a database.
  • The third aspect of the embodiments of the present application provides a visual positioning method, including: a terminal obtains an image of a building; the terminal sends the image to a server; and the terminal obtains positioning information sent by the server, where the server obtains the positioning information by matching a descriptor generated from the image in a preset descriptor database.
  • The descriptor includes the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line in the image, where the first characteristic vertical line indicates the first elevation intersection line of the building and the second characteristic vertical line indicates the second elevation intersection line.
  • The preset descriptor database includes the geographic locations of candidate points and the descriptors of the candidate points; the descriptor of a candidate point includes, taking the candidate point as the viewpoint, the direction information of the visible elevation intersection lines of surrounding buildings. The geographic location of the candidate point whose descriptor matches the descriptor of the image is the positioning information of the shooting location of the image.
  • Optionally, the terminal obtains the shooting orientation information of the image and sends it to the server; the shooting orientation information is used to determine a first angle range, and the positioning information is obtained by the server matching, within the first angle range, the descriptor generated from the image in the server's preset descriptor database.
  • The fourth aspect of the embodiments of the present application provides a visual positioning method, including: a server obtains first positioning information sent by a terminal; the server sends a preset descriptor database to the terminal according to the first positioning information, where the preset descriptor database contains the geographic locations of first candidate points and the descriptors of the first candidate points, a first candidate point being a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
  • This first descriptor database is used for the visual positioning performed by the terminal.
  • The fifth aspect of the embodiments of the present application provides a terminal, including: an acquisition unit configured to acquire an image of a building; and a generating unit configured to generate a descriptor based on the image, where the descriptor includes the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line, the first characteristic vertical line indicates the first elevation intersection line of the building, and the second characteristic vertical line indicates the second elevation intersection line. The acquisition unit is further configured to match the descriptor in a preset descriptor database to obtain the positioning information of the shooting location of the image, where the preset descriptor database includes the geographic locations of candidate points and the descriptors of the candidate points; the descriptor of a candidate point includes, taking the candidate point as the viewpoint, the direction information of the visible elevation intersection lines of surrounding buildings, and the geographic location of the candidate point whose descriptor matches the descriptor of the image is the positioning information of the shooting location of the image.
• the generating unit is specifically configured to: extract the first characteristic vertical line and the second characteristic vertical line of the building from the image; and generate the descriptor according to the positional relationship between the first characteristic vertical line and the second characteristic vertical line, the positional relationship including the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
• the descriptor includes: height information and/or depth information of the first characteristic vertical line and the second characteristic vertical line, where the height information of the first characteristic vertical line is used to indicate the height of the first elevation intersection line, the height information of the second characteristic vertical line is used to indicate the height of the second elevation intersection line, the depth information of the first characteristic vertical line is used to indicate the distance from the shooting location of the image to the first elevation intersection line, and the depth information of the second characteristic vertical line is used to indicate the distance from the shooting location of the image to the second elevation intersection line.
  • the descriptor is represented by a circular array
• the circular array includes first data indicating the first characteristic vertical line and second data indicating the second characteristic vertical line.
  • the positional interval between the first data and the second data in the circular array is used to indicate horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
• the descriptor is represented by a circular array, the circular array includes the first data, the second data, and the third data, and the third data is used to indicate that no characteristic vertical line of the building appears.
  • the first data includes height information and/or depth information of the first characteristic vertical line
• the second data includes height information and/or depth information of the second characteristic vertical line.
• the descriptor is represented by a circle, with a first feature point on the circle indicating the first feature vertical line and a second feature point on the circle indicating the second feature vertical line.
• the optical center corresponding to the image is the center of the circle.
• the first feature point is the point obtained by projecting the first feature vertical line onto a cylinder whose axis is the gravity axis passing through the optical center, and then projecting it onto the horizontal plane containing the optical center.
  • the second feature point is the point obtained by projecting the second feature vertical line on the cylinder and then projecting it onto the horizontal plane
• the angle between the line connecting the first feature point to the center of the circle and the line connecting the second feature point to the center of the circle is used to indicate the horizontal viewing angle information.
• the acquiring unit is further configured to: acquire first positioning information, where the first positioning information includes location information obtained according to GPS signals, wifi signals, the location of a base station serving the terminal, or an address manually entered by the user; the acquiring unit is specifically configured to: match the descriptor in a first descriptor sub-database to obtain the positioning information, the first descriptor sub-database including the geographic location of the first candidate point and the descriptor of the first candidate point, where the first candidate point is a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
• the acquiring unit is further configured to: acquire first positioning information, where the first positioning information includes location information obtained according to GPS signals, wifi signals, the location of a base station serving the terminal, or an address manually entered by the user; the terminal further includes: a sending unit, configured to send the first positioning information to the server, and a receiving unit, configured to receive the preset descriptor sub-database sent by the server, the preset descriptor sub-database including the geographic location of the first candidate point and the descriptor of the first candidate point, where the first candidate point is a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
• the acquiring unit is further configured to: acquire shooting orientation information of the image; the acquiring unit is specifically configured to: match the descriptor with the preset descriptor sub-database within a first angle range determined by the shooting orientation information, to obtain the positioning information.
• the acquiring unit is specifically configured to: acquire a first image shot at a first location and a second image shot at a second location, where the first location and the second location are the same location, or the distance between the first location and the second location is less than a preset threshold, and the first image and the second image include partially repeated image information.
  • the preset descriptor database is generated based on satellite images.
• the sixth aspect of the embodiments of the present application provides a server, including: a receiving unit, configured to receive an image of a building sent by a terminal; a generating unit, configured to generate a descriptor according to the image, the descriptor including horizontal viewing angle information between a first characteristic vertical line and a second characteristic vertical line in the image, the first characteristic vertical line indicating the first elevation intersection line of the building and the second characteristic vertical line indicating the second elevation intersection line of the building; and an acquisition unit, configured to match the descriptor in a preset descriptor sub-database to obtain positioning information of the shooting location of the image, the preset descriptor sub-database including the geographic location of the candidate point and the descriptor of the candidate point, where the descriptor of the candidate point includes, taking the candidate point as a viewpoint, the direction information of the visible elevation intersection lines of surrounding buildings, and in the descriptor database, the geographic location of the candidate point indicated by the descriptor matching the descriptor of the image is the positioning information of the shooting location.
• the generating unit is specifically configured to: extract the first characteristic vertical line and the second characteristic vertical line of the building from the image; and generate the descriptor according to the positional relationship between the first characteristic vertical line and the second characteristic vertical line, the positional relationship including the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
• the descriptor includes: height information and/or depth information of the first feature vertical line and the second feature vertical line, where the height information of the first feature vertical line is used to indicate the height of the first elevation intersection line, the height information of the second feature vertical line is used to indicate the height of the second elevation intersection line, the depth information of the first feature vertical line is used to indicate the distance from the shooting location of the image to the first elevation intersection line, and the depth information of the second feature vertical line is used to indicate the distance from the shooting location of the image to the second elevation intersection line.
• the descriptor is represented by a circular array, and the circular array includes first data indicating the first characteristic vertical line and second data indicating the second characteristic vertical line.
  • the positional interval between the first data and the second data in the circular array is used to indicate horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
• the descriptor is represented by a circular array, the circular array includes the first data, the second data, and the third data, and the third data is used to indicate that no characteristic vertical line of the building appears.
  • the first data includes height information and/or depth information of the first characteristic vertical line
• the second data includes height information and/or depth information of the second characteristic vertical line.
• the descriptor is represented by a circle, with a first feature point on the circle indicating the first feature vertical line and a second feature point on the circle indicating the second feature vertical line.
• the optical center corresponding to the image is the center of the circle.
• the first feature point is the point obtained by projecting the first feature vertical line onto a cylinder whose axis is the gravity axis passing through the optical center, and then projecting it onto the horizontal plane containing the optical center.
  • the second feature point is the point obtained by projecting the second feature vertical line on the cylinder and then projecting it onto the horizontal plane
• the angle between the line connecting the first feature point to the center of the circle and the line connecting the second feature point to the center of the circle is used to indicate the horizontal viewing angle information.
• the preset descriptor database includes: the geographic location of the candidate point and the descriptor of the candidate point, and the descriptor of the candidate point includes, taking the candidate point as the viewpoint, the direction information of the visible elevation intersection lines of the surrounding buildings.
• the acquiring unit is further configured to: acquire first positioning information sent by the terminal, where the first positioning information includes location information obtained according to GPS signals, wifi signals, the location of a base station serving the terminal, or an address manually entered by the user; the acquiring unit is specifically configured to: match the descriptor in a first descriptor sub-database to obtain the positioning information, the first descriptor sub-database including the geographic location of the first candidate point and the descriptor of the first candidate point, where the first candidate point is a candidate point whose geographic location is within a first location range corresponding to the first positioning information.
• the acquiring unit is further configured to: acquire the shooting orientation information of the image sent by the terminal; the acquiring unit is specifically configured to: match the descriptor with the preset descriptor database within a first angle range determined by the shooting orientation information, to obtain the positioning information.
• the receiving unit is specifically configured to: receive a first image shot at a first location and a second image shot at a second location, where the first location and the second location are the same location, or the distance between the first location and the second location is less than a preset threshold, and the first image and the second image include partially repeated image information.
  • the preset descriptor database is generated based on satellite images.
• the seventh aspect of the embodiments of the present application provides a terminal, including: a processor and a memory; the memory is used to store instructions; the processor is used to execute, according to the instructions, the methods of the implementation manners provided in the foregoing first or third aspect.
• the eighth aspect of the embodiments of the present application provides a server, including: a processor and a memory; the memory is used to store instructions; the processor is used to execute, according to the instructions, the methods of the implementation manners provided in the foregoing second or fourth aspect.
  • the ninth aspect of the embodiments of the present application provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the methods of the implementation manners provided in the foregoing first to fourth aspects.
• the tenth aspect of the embodiments of the present application provides a computer-readable storage medium that stores instructions; when the instructions are run on a computer, the computer executes the methods of the implementation manners provided in the foregoing first to fourth aspects.
• the visual positioning method provided by the embodiments of the present application acquires an image obtained by shooting a building, generates a descriptor based on the image, and matches the descriptor in a preset descriptor database to obtain the positioning information of the shooting location of the image.
  • the descriptor includes the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line in the image, the first characteristic vertical line indicates the intersection line of the first elevation of the building, and the second characteristic vertical line The characteristic vertical line indicates the intersection line of the second elevation of the building.
  • the feature vertical lines extracted from the image have geometric semantic information and correspond to the elevation intersection lines of the physical buildings.
• the feature vertical lines are not easily affected by scene changes during image shooting: even when the building is partially obscured, the top skyline is not photographed, the lighting conditions change greatly, or it is rainy or snowy, the generation and use of the descriptor are not affected, so the visual positioning result obtained by descriptor matching is more accurate.
• in the prior art, feature points are extracted based on the brightness changes of image pixels, and the descriptor of a feature point describes the numerical relationship between the feature point and surrounding pixels, usually as a multi-dimensional feature vector, such as a 128-dimensional feature vector.
  • the number of feature points is large, and the amount of calculation to generate descriptors is relatively large.
• in the embodiments of the present application, the feature vertical lines corresponding to the elevation intersection lines of physical buildings are extracted from the image, and the descriptor includes the horizontal viewing angle between the feature vertical lines; the number of feature vertical lines is small, and computing the descriptor is easier.
  • the construction of the descriptor database requires on-site collection of a large number of images, and the database needs to be refreshed when the season changes, which is time-consuming and labor-intensive.
• the descriptor database in the embodiments of the present application can be constructed automatically on a large scale by using satellite images, without on-site image collection, which greatly reduces the workload and the difficulty of building the database.
  • the descriptor database stores a large number of two-dimensional images, three-dimensional feature point clouds, and feature point descriptors, which requires a large amount of data and requires a large amount of storage space, and can usually only be constructed based on a small geographic area.
• the descriptor generated from the image is directly associated with the descriptors of the candidate points of the map, and the descriptor of a candidate point records the direction information of the visible elevation intersection lines of the buildings around that candidate point, without storing a large amount of two-dimensional image and three-dimensional point cloud data; the amount of data is small, and little storage space is required.
• the descriptor database of this solution is on the order of one millionth the size of a visual simultaneous localization and mapping (vSLAM) feature library.
  • FIG. 1 is a schematic diagram of an embodiment of a visual positioning method in an embodiment of the application
  • Figure 2 is a schematic diagram of the camera coordinate system
  • FIG. 3 is a schematic diagram of semantic segmentation of an image in an embodiment of the application.
  • FIG. 4 is a schematic diagram of obtaining characteristic vertical lines of buildings in an embodiment of the application.
  • FIG. 5 is a schematic diagram of cylindrical projection conversion in an embodiment of the application.
  • FIG. 6 is a schematic diagram of cylindrical projection of multiple images in an embodiment of the application.
  • FIG. 7 is a schematic diagram of the data structure of the descriptor in an embodiment of the application.
  • FIG. 8 is a schematic diagram of an embodiment of GPS signal assisted visual positioning in an embodiment of the application.
  • FIG. 9 is a schematic diagram of an embodiment of constructing a descriptor database in an embodiment of the application.
  • FIG. 10 is a schematic diagram of an embodiment of selecting candidate points in an embodiment of the application.
  • FIG. 11 is a schematic diagram of cylindrical projection according to the LOD model in an embodiment of this application.
  • FIG. 12 is a schematic diagram of a descriptor for generating candidate points in an embodiment of this application.
  • FIG. 13 is an interaction diagram of an embodiment of a visual positioning method in an embodiment of the application.
• FIG. 15 is a schematic diagram of an embodiment of a terminal in an embodiment of the application.
  • FIG. 16 is a schematic diagram of an embodiment of a server in an embodiment of the application.
  • FIG. 17 is a schematic diagram of another embodiment of a terminal in an embodiment of this application.
  • FIG. 18 is a schematic diagram of another embodiment of a server in an embodiment of the application.
  • FIG. 19 is a schematic structural diagram of an AR device disclosed in an embodiment of this application.
  • the embodiments of the present application provide a visual positioning method, which is used to improve the accuracy of visual positioning in a scene with large changes in lighting conditions.
  • FIG. 1 is a schematic diagram of an embodiment of the visual positioning method in the embodiment of the application.
  • the terminal acquires an image of a building
  • the terminal acquires an image obtained by shooting a building, and the image will include an image of the building.
• the terminal has an image acquisition device for acquiring images, which may be a monocular camera, a binocular camera, a depth camera, or a lidar.
  • the specific types of image acquisition devices are not limited here.
• the image acquisition device may be a camera component installed in the terminal, or a device external to the terminal that can communicate with the terminal.
  • the specific setting form of the image acquisition device is not limited here.
  • the field of view of a single image is too small to capture a complete building, and the information provided is limited.
  • the terminal can expand the field of view by acquiring multiple images and increase the amount of image information.
  • the image acquired by the terminal may be one image or multiple images, and the number of acquired images is not limited here.
• the terminal acquires a first image shot at a first location and a second image shot at a second location; the first location and the second location are the same location, or the distance between the first location and the second location is less than a preset threshold, and the first image and the second image include partially repeated image information.
  • the multiple images can be taken at the same place.
• for example, the user photographs a building through the terminal while rotating it, capturing at equal intervals or using intelligent recognition to obtain multiple overlapping photos, and an image stitching algorithm combines the multiple images into one image containing the building.
  • the image capture device usually undergoes relative displacement.
  • the distance between the first location for shooting the first image and the second location for shooting the second image should be less than the preset threshold.
  • the preset threshold can be set according to actual use requirements.
• the preset threshold may be a displacement value that is negligible relative to the depth of field when shooting a distant scene.
• for example, the ratio of the depth of field to the displacement is greater than 100, and may be 200, 500, etc.
• for example, if the image depth of field is 200 meters, the preset threshold may be 1 meter.
• for handheld shooting, the displacement caused by the rotation of the arm is usually within 0.5 meters, and the obtained images can be regarded as shot at the same location.
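The viewpoint criterion above can be sketched as a simple ratio test; the function name and the default threshold are illustrative, not from the patent:

```python
def same_viewpoint(depth_of_field_m: float, displacement_m: float,
                   min_ratio: float = 100.0) -> bool:
    """Treat two shooting locations as one viewpoint when the camera
    displacement is negligible relative to the scene depth (ratio of
    depth of field to displacement greater than 100, per the example)."""
    if displacement_m <= 0.0:
        return True
    return depth_of_field_m / displacement_m > min_ratio

# 200 m depth of field with a 1 m displacement: ratio 200, accepted
assert same_viewpoint(200.0, 1.0)
# an arm rotation (~0.5 m) at the same depth is also negligible
assert same_viewpoint(200.0, 0.5)
```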
  • the camera coordinate system is a three-dimensional rectangular coordinate system with the optical center O of the camera as the origin and the optical axis as the z-axis.
• the x and y axes of the camera coordinate system are parallel to the X and Y axes of the image, and the z axis of the camera coordinate system is the camera optical axis, which is perpendicular to the image plane.
  • the intersection of the optical axis and the image plane is the origin of the image coordinate system, and the image coordinate system is a two-dimensional rectangular coordinate system.
  • Facade generally refers to the exterior wall of a building, including the front, side or back.
• an intersection line is the line where two solid surfaces meet.
• in the embodiments of the present application, the intersection line between adjacent faces of the outer wall of a building is referred to as the elevation intersection line for short.
• ideally, the characteristic vertical line of the building is a line segment parallel to the Y-axis in the obtained image.
• however, the x-axis of the camera coordinate system may not be horizontal when the image is taken, so in the obtained image there will be a certain angle between the characteristic vertical line of the building and the Y-axis of the image.
• through regularization processing, such as image correction and rotation transformation, characteristic vertical lines parallel to the Y axis of the image can be obtained.
• the process of regularizing the contour of the building includes rotating the camera coordinate system around its x-axis and z-axis, so that the x-axis of the camera coordinate system is parallel to the horizontal plane of the world coordinate system and the y-axis of the camera coordinate system is parallel to the gravity axis of the world coordinate system.
• in this way, the Y axis of the image coordinate system of the camera imaging is parallel to the gravity axis of the world coordinate system, the image correction is completed, and the vertical contour lines of the building in the image are parallel to the Y axis of the image coordinate system.
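As a hedged sketch of the gravity alignment described above: given a gravity direction measured in camera coordinates (for example from an IMU), a rotation can be computed that maps it onto the camera's -y axis, so that building verticals become parallel to the image Y axis after rectification. The sign convention for gravity and the function name are assumptions for illustration, not from the patent; the rotation uses the standard Rodrigues construction.

```python
import numpy as np

def gravity_align_rotation(g_cam: np.ndarray) -> np.ndarray:
    """Rotation mapping the measured gravity direction (camera coordinates)
    onto the camera's -y axis; assumed convention: gravity points toward -y
    once the camera is upright."""
    g = g_cam / np.linalg.norm(g_cam)
    target = np.array([0.0, -1.0, 0.0])
    v = np.cross(g, target)              # rotation axis (unnormalized)
    c = float(np.dot(g, target))         # cosine of the rotation angle
    s = float(np.linalg.norm(v))         # sine of the rotation angle
    if s < 1e-12:
        # already aligned, or flipped: rotate 180 degrees about the x axis
        return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])  # skew-symmetric cross-product matrix
    # Rodrigues formula: R = I + [v]x + [v]x^2 * (1 - c) / s^2
    return np.eye(3) + vx + vx @ vx * ((1.0 - c) / (s * s))

g = np.array([0.1, -0.99, 0.05])
R = gravity_align_rotation(g)
assert np.allclose(R @ (g / np.linalg.norm(g)), [0.0, -1.0, 0.0], atol=1e-9)
```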
  • a semantic segmentation algorithm is used to identify and distinguish image information of different types of objects in the image.
  • FIG. 3 is a schematic diagram of semantic segmentation of an image in an embodiment of this application.
  • the image is segmented into buildings, vegetation, ground and sky regions.
  • the contour lines of the building can be obtained, and then the vertical lines in the contour lines are extracted as the characteristic vertical lines of the image.
  • FIG. 4 is a schematic diagram of acquiring characteristic vertical lines of a building in an embodiment of this application.
  • the bold line segments in the figure indicate the acquired characteristic vertical lines.
• a straight line extraction algorithm such as a line segment detector (LSD) can be used to obtain the regularized line segment information in the image, and then the classification result mask obtained by semantic segmentation can be used as a constraint condition to eliminate the miscellaneous line segments inside and outside the building region, retaining only the target line segments at the junction of the building and the sky.
  • the image captured by the camera is usually a central projection image.
  • the central projection image is cylindrically projected and projected onto a cylinder with the y-axis of the camera coordinate system as the axis.
  • the conversion from the central projection to the cylindrical projection can be realized.
  • the cylindrical projection image is projected to the xOz plane where the x-axis and z-axis of the camera coordinate system are located, and a circle with the optical center O as the center is obtained, and the characteristic vertical line of the building obtained from the image is projected as a characteristic point on the circle.
  • the feature point represents the feature vertical line of the building obtained from the image.
• FIG. 5 is a schematic diagram of cylindrical projection conversion, where the straight line segment formed by point A and point B represents the imaging plane of the central projection, and the arc formed by point D, point I, and point E is the cylindrical projection imaging surface. Point C is the midpoint of line segment AB. Since the focal length of the camera that captured the central projection image is known, the distance from the imaging plane AB to the optical center O, that is, the length of line segment OC, can be obtained from the focal length. The radius of the cylinder is a set value whose specific size is not limited; in FIG. 5, the length of OC is used as the cylinder radius.
• point D is the intersection of line segment AO and the arc; that is, after the cylindrical projection transformation, point A in the image corresponds to point D on the cylindrical projection surface.
• similarly, point F corresponds to point I, point C corresponds to point C, and point B corresponds to point E.
• point F represents the characteristic vertical line of the building; connecting point F and point O, the intersection with the circle centered on point O is point I, which is a characteristic point of the geometric semantic descriptor.
  • the characteristic vertical lines of the building in the cylindrical projection are the characteristic points of the geometric semantic descriptor in the embodiment of the application.
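The mapping from a feature vertical line's image column to its feature point on the circle can be sketched as follows. This is a minimal model assuming an ideal pinhole camera with focal length f in pixels and principal point column cx; the function names are illustrative, not from the patent.

```python
import math

def column_to_azimuth(x: float, cx: float, f: float) -> float:
    """Horizontal angle (radians) between the optical axis and the ray
    through image column x, i.e. the direction of line OF in FIG. 5."""
    return math.atan2(x - cx, f)

def feature_point_on_circle(x: float, cx: float, f: float,
                            radius: float) -> tuple:
    """Intersection of that ray with the circle of the given radius around
    the optical center O, expressed as (x, z) in the xOz plane
    (the feature point I in FIG. 5)."""
    a = column_to_azimuth(x, cx, f)
    return (radius * math.sin(a), radius * math.cos(a))

# a vertical line at the principal point projects straight ahead (z axis)
px, pz = feature_point_on_circle(320.0, 320.0, 500.0, 1.0)
assert abs(px) < 1e-12 and abs(pz - 1.0) < 1e-12
```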
• the order of step 1002 and step 1003 is not limited: step 1002 can be executed first and then step 1003, or step 1003 can be executed first and then step 1002, as described below:
• FIG. 6 is a schematic diagram of cylindrical projection of multiple images.
  • P1, P2, and P3 respectively represent three images taken at the same location.
• point A is the characteristic vertical line of the building extracted from P1; since P1 and P2 partially overlap, the characteristic vertical line represented by point A and the characteristic vertical line represented by point B in P2 correspond to the same elevation intersection line of the same building entity.
  • both points A and B are projected to point C on the cylindrical projection.
• in the stitching process, cylindrical projection is usually performed first and stitching afterward. Since the captured images are all central projections with flat imaging surfaces, directly splicing multiple images may produce large deformation due to projection distortion; therefore, cylindrical projection transformation is performed first to convert the central projections to the cylindrical projection surface, keeping the projection surfaces of all images to be processed consistent. Since the multiple images are captured by the same device at the same location, for outdoor long-range shooting the shooting location can be approximated as the position of the camera's optical center; the capture is then effectively a single-viewpoint rotation, so the multiple images can be converted to the same coordinate system.
  • the first image can be used as a reference image, and the second image and the third image can be stitched into the coordinate system of the first image in sequence.
• there are many ways of image fusion, which are not limited here.
• for example, a fade-in/fade-out (feathering) method is adopted: the closer a pixel is to the splicing edge, the greater the weight of the image to be spliced and the smaller the weight of the already spliced image, and the weighted average is taken to obtain the stitched image.
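The fade-in/fade-out blending can be sketched with a linear weight ramp across the overlap region. This is a minimal sketch on grayscale strips; numpy and the function name are illustrative, not from the patent.

```python
import numpy as np

def feather_blend(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Weighted average over the overlap: the weight of the incoming (right)
    image grows linearly toward its side of the seam, and the weight of the
    already stitched (left) image shrinks correspondingly."""
    h, w = left.shape[:2]
    w_right = np.linspace(0.0, 1.0, w)           # 0 at left edge, 1 at right
    w_right = w_right.reshape(1, w, *([1] * (left.ndim - 2)))
    return (1.0 - w_right) * left + w_right * right

overlap_a = np.full((2, 5), 100.0)   # overlap strip from the stitched image
overlap_b = np.full((2, 5), 200.0)   # overlap strip from the image to stitch
blended = feather_blend(overlap_a, overlap_b)
# the leftmost column keeps the stitched image, the rightmost the incoming one
assert blended[0, 0] == 100.0 and blended[0, -1] == 200.0
```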
• the descriptor includes the horizontal viewing angle information between the first feature vertical line and the second feature vertical line in the image, the first feature vertical line indicating the first elevation intersection line of the building and the second feature vertical line indicating the second elevation intersection line of the building; for the image, a unique viewpoint can be determined, namely the optical center of the camera that took the image, and from this viewpoint, on the horizontal plane, the horizontal viewing angle information between the first feature vertical line and the second feature vertical line can be determined.
• the horizontal viewing angle information describes the angle between the two feature vertical lines as viewed from this point.
  • the horizontal viewing angle information carries the relative position information between the first characteristic vertical line and the second characteristic vertical line.
• the first characteristic vertical line and the second characteristic vertical line can indicate elevation intersection lines of the same building or of different buildings; this is not limited in the embodiments of this application.
• in step 1003, the first feature vertical line is projected onto a cylinder whose axis is the gravity axis passing through the optical center, and then projected onto the horizontal plane containing the optical center to obtain the first feature point; according to the angle relationship between the lines connecting the feature points and the optical center, a descriptor can be generated, which is also called a geometric semantic descriptor in the embodiments of the present application.
• the first characteristic vertical line and the second characteristic vertical line of the building extracted from the image are transformed by cylindrical projection and projected onto the xOz plane of the camera coordinate system, obtaining the first feature point and the second feature point on the circle centered on the optical center O; the descriptor includes the angle interval information corresponding to the arc between the first feature point and the second feature point.
• in the simplest case, the geometric semantic descriptor includes only the angle information between the feature points; optionally, the height information of the building contour vertical line in the image can also be recorded in the information corresponding to the feature point, so that the geometric semantic descriptor includes both the angle information between the feature points and the height information of the feature points.
  • the height information may be pixel height information.
• if the sensor provides depth data, or if the depth value of the contour vertical line is obtained through techniques such as binocular-camera depth calculation or monocular-camera depth recovery, the depth can also be recorded in the information corresponding to the feature point.
  • the geometric semantic descriptor includes the angle information between the feature points and the depth information of the feature points.
• the height information is used to indicate the height of the corresponding elevation intersection line, and the depth information is used to indicate the distance from the shooting location of the image to the elevation intersection line of the corresponding building.
• the descriptor includes a circular array; that is, the circular array represents or stores the information of the descriptor. The circular array includes first data indicating the first characteristic vertical line and second data indicating the second characteristic vertical line, and the position interval between the first data and the second data in the circular array is used to indicate the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
• the descriptor is represented by a circular array; the circular array includes the first data, the second data, and third data, and the third data is used to indicate positions at which no characteristic vertical line of a building appears.
  • the accuracy of the horizontal viewing angle information may be determined according to actual requirements, such as 1 degree or 0.1 degree, etc., which is not specifically limited here.
  • the circular array contains 360 data used to represent the feature point information within the range of 360 degrees.
  • the number "1” can represent the existence of a feature point at the angle, and the number "0" "Means no feature points are detected or not.
  • the field of view obtained from the image usually only contains part of the 360-degree field of view.
• the geometric semantic descriptor corresponding to the image can then be generated.
• the positions of the "1" entries are used here to describe the spatial relationship of the building contour vertical lines relative to the image shooting point.
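As a concrete illustration of this encoding, the following Python sketch (an assumed toy example, not the patent's code) builds the 360-entry circular array from a list of feature angles at 1-degree precision:

```python
def build_descriptor(feature_angles_deg, resolution_deg=1.0):
    """Encode feature vertical lines as a circular 0/1 array: a "1" at
    index i means a building contour vertical line was observed at roughly
    i * resolution_deg, and "0" means none was detected there."""
    size = int(round(360.0 / resolution_deg))
    descriptor = [0] * size
    for a in feature_angles_deg:
        descriptor[int(round(a / resolution_deg)) % size] = 1
    return descriptor

d = build_descriptor([26.86, 95.2, 181.0])
# entries 27, 95, and 181 become 1; all other entries stay 0
```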
• the descriptor is represented by a circle, with a first feature point on the circle indicating the first feature vertical line and a second feature point indicating the second feature vertical line, and the circle is centered on the optical center corresponding to the image;
• the first feature point is the point obtained by projecting the first feature vertical line onto a cylinder whose axis is the gravity axis passing through the optical center, and then projecting onto the horizontal plane containing the optical center;
• the second feature point is the point obtained by projecting the second feature vertical line onto the cylinder, and then projecting onto the horizontal plane;
• the angle between the line connecting the first feature point with the center of the circle and the line connecting the second feature point with the center of the circle is used to indicate the horizontal viewing angle information.
  • the restriction information acquired by the terminal includes first positioning information and/or orientation information when the image is taken.
  • the first positioning information includes positioning information obtained according to GPS signals, wifi signals, location information of a base station that provides services for the terminal, or a user manually inputting an address.
  • the positioning information has low accuracy and can provide a reference for visual positioning.
  • a more accurate visual positioning than the first positioning information is obtained according to a geometric semantic descriptor matching method.
  • the terminal matches the descriptor in a first descriptor sub-database according to the descriptor to obtain the positioning information.
  • the first descriptor sub-database includes the geographic location of the first candidate point and the descriptor of the first candidate point.
• the first candidate point is a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
• by matching the descriptor with the preset descriptor sub-database, the terminal obtains more accurate positioning information within the first location range.
• the orientation information refers to angle information, obtained from the terminal's sensors, describing the orientation when the image was taken.
• the orientation information is the orientation angle obtained from the magnetometer, which represents the orientation when the image was taken, specifically the direction in the east-north-up (ENU) coordinate system. It is understandable that the ENU direction provided by the magnetometer has a certain error and low accuracy, so it is used to provide a reference for visual positioning.
  • the terminal obtains the shooting orientation information of the image; the terminal matches the descriptor with a preset descriptor sub-database according to the first angle range determined by the shooting orientation information to obtain the positioning information.
• step 1005 is an optional step, which may or may not be performed, and this is not limited here. If step 1005 is performed, the constraint information is usually acquired when step 1001 is performed to obtain the image.
  • the preset descriptor sub-database includes: the geographic location of the candidate point and the descriptor of the candidate point.
• the descriptor of the candidate point includes: taking the candidate point as the viewpoint, the direction information of the visible elevation intersection lines of the surrounding buildings.
• the geometric semantic descriptor of the image is matched with the descriptors of the candidate points in the descriptor sub-database, and the candidate point related to the positioning information is obtained according to the matching similarity.
• from the coordinate position of the candidate point on the map, the visual positioning of the image shooting point can be determined. That is, the geographic location of the candidate point indicated by the descriptor that matches the descriptor of the image in the descriptor sub-database is the location information of the shooting location of the image.
  • the descriptor sub-database includes descriptors of multiple candidate points on the map. Specifically, the position coordinates of the candidate points are used as indexes, and the angle information of the characteristic vertical lines of the buildings around the candidate points is stored in the database. Optionally, the descriptor of the candidate point further includes depth and/or height information of the characteristic vertical lines of the surrounding buildings.
• in the data structure diagram of the descriptor in the embodiment of this application, the 360-degree circular ring is expanded into a one-dimensional array of 360 entries with a precision of 1 degree.
• the photo descriptor of the image is the array of angle information of the building contour vertical lines in the image under the current coordinate system, and the position of the number "1" represents the angle, in the current coordinate system, at which a building contour vertical line can be observed in the image.
• the map descriptor in the database is the array of angle information of the building outer-ring corner points in the ENU coordinate system, and the position of the number "1" represents the angle of a building outer-ring corner point in the ENU coordinate system.
  • the "1" in the 4th digit of the map descriptor means that the corner point of the outer ring of the building can be seen from the current map position at 4 degrees under the ENU coordinate system.
  • the "1" in the first position of the Photodescriptor means that the vertical line of the building outline in the image can be seen at 1 degree in the ENU coordinate system.
  • the position with the number "0" indicates that no vertical line of the building feature is detected or there is no building feature in the current orientation.
  • the image to be located can be directly matched and located with the candidate points on the map through the descriptor. There are many matching methods, which are not limited here.
• the interval angles between the feature points and the descriptor of the candidate point are unified to the same 360-degree scale; however, because the orientation of the shooting point is uncertain, the angle information of the two descriptors is not in the same coordinate system.
• with each slide, a new geometric semantic descriptor with an angular offset is formed and its similarity with the geometric semantic descriptor of the current map candidate point is calculated; the target candidate point is determined according to the similarity.
  • the terminal can obtain the location information of the shooting location of the image.
  • the descriptor matching can be performed within the error range of the magnetometer.
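The sliding match described above can be sketched as follows. This is a hypothetical brute-force version: the similarity is a simple dot product of the 0/1 arrays, and the rotation search is restricted to a band around the magnetometer-provided heading, as the text suggests:

```python
def match_descriptor(photo, map_desc, center_offset=0, search_half_width=180):
    """Rotate the photo descriptor against a candidate's map descriptor.
    Each offset is an angle-alignment hypothesis; the score counts how many
    "1" entries coincide. search_half_width narrows the search to the
    magnetometer's error band around center_offset (in degrees)."""
    n = len(map_desc)
    best_score, best_offset = -1, None
    for delta in range(-search_half_width, search_half_width + 1):
        offset = (center_offset + delta) % n
        score = sum(photo[i] * map_desc[(i + offset) % n] for i in range(n))
        if score > best_score:
            best_score, best_offset = score, offset
    return best_score, best_offset
```

The candidate point whose best score is highest across all rotations is taken as the matched point.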
• the geometric semantic descriptors are converted to the frequency domain through the discrete Fourier transform, the cross-power spectrum of the two is calculated, and the relative angular offset of the geometric semantic descriptors is obtained from the Dirac (impulse) peak of its inverse transform.
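The frequency-domain variant can be sketched with NumPy as classical phase correlation. The function name and parameters below are illustrative assumptions; a real implementation would additionally need to handle noise and partial visibility:

```python
import numpy as np

def phase_correlation_offset(photo, map_desc, eps=1e-9):
    """Estimate the angular offset between two circular descriptors:
    take the DFT of both, form the normalized cross-power spectrum, and
    invert it; for a pure rotation the result is a Dirac-like impulse
    whose peak index is the relative offset."""
    F = np.fft.fft(np.asarray(photo, dtype=float))
    G = np.fft.fft(np.asarray(map_desc, dtype=float))
    cross_power = F * np.conj(G)
    cross_power /= np.abs(cross_power) + eps
    impulse = np.fft.ifft(cross_power).real
    return int(np.argmax(impulse))  # offset in array positions
```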
• the first positioning information determines a rough position within the first location range, so only candidate points in the first location range need to be used for descriptor matching, which reduces the amount of calculation.
  • Point A in Figure 8 represents the GPS positioning point, and the box represents the first location range determined according to GPS accuracy information.
  • descriptor matching can be performed within the box. That is, the descriptors of all candidate points on the map belonging to the first location range are matched with the descriptors of the captured image.
  • a heat map may be drawn for the scores of all candidate points within the first position range, and the score represents the similarity between the candidate point and the shooting point.
  • a sliding window of a preset size is established, the heat map is traversed, and the point with the highest comprehensive score of the sliding window is the calculated positioning result.
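The heat-map traversal can be sketched as follows (a toy version; the window size is an assumed parameter):

```python
import numpy as np

def best_location(score_grid, window=3):
    """Traverse the per-candidate similarity heat map with a square sliding
    window and return the top-left index of the window with the highest
    summed ("comprehensive") score, together with that score."""
    h, w = score_grid.shape
    best, best_pos = -np.inf, None
    for r in range(h - window + 1):
        for c in range(w - window + 1):
            s = score_grid[r:r + window, c:c + window].sum()
            if s > best:
                best, best_pos = s, (r, c)
    return best_pos, best
```

Summing over a window rather than taking the single best candidate makes the result more robust to one spuriously high similarity score.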
  • FIG. 9 is a schematic diagram of an embodiment of constructing a descriptor database in an embodiment of the application.
  • the database construction process includes:
  • LOD0 is the outline of the building's top view
  • LOD1 is the three-dimensional outline of the building with height information
• LOD2 additionally contains building roof information.
  • the LOD model is generated based on satellite images, the road layer is extracted, and candidate points are selected at regular intervals on the road.
  • the interval may be, for example, one meter, and the specific interval value is not limited here.
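Sampling candidate points along an extracted road can be sketched like this (a hypothetical helper; `polyline` is assumed to be a list of (x, y) map coordinates in meters):

```python
import math

def sample_candidates(polyline, spacing=1.0):
    """Walk a road polyline and emit candidate points every `spacing`
    meters; spacing=1.0 matches the one-meter interval given as an
    example in the text."""
    points, leftover = [], 0.0
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue
        d = leftover
        while d <= seg:
            t = d / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        leftover = d - seg
    return points

pts = sample_candidates([(0.0, 0.0), (5.0, 0.0)])  # 6 points, 1 m apart
```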
  • FIG. 10 is a schematic diagram of an embodiment of selecting candidate points in the embodiment of this application.
  • the points on the road in the figure are candidate points.
  • FIG. 11 is a schematic diagram of cylindrical projection according to the LOD model in an embodiment of this application.
• the horizontal axis is the angle and the vertical axis is the perspective projection height of the surrounding buildings.
  • Different collection precisions can be set when constructing a cylindrical projection.
  • the precision can be set to 1 degree, 0.1 degree, etc. The specifics are not limited here.
• the contour vertical lines of the model projection result are identified as the extracted features for the geometric semantic descriptor sub-database; referring to the bolded line segments in Figure 11, the elevation intersection lines of the buildings in the LOD model are cylindrically projected to obtain the characteristic vertical lines.
  • a descriptor is generated according to the feature vertical line obtained in step 902.
  • the descriptor of the candidate point includes, using the candidate point as a reference point, the direction angle information of the intersection of the visible elevations of the surrounding buildings.
  • the descriptor subdatabase includes the geographic location of the candidate point and the descriptor of the candidate point.
  • the descriptor of the candidate point includes, taking the candidate point as the viewpoint, the direction of the visible elevation intersection of the surrounding buildings information.
  • FIG. 12 is a schematic diagram of a descriptor for generating candidate points in an embodiment of this application.
• Point O is a candidate point selected on the map, where the candidate point is a manually designated map coordinate; since map data is used as the input source, the coordinates are known in advance.
• standing at point O, for the building at the upper right, point A is blocked and invisible; the visible elevation intersection lines of the buildings are point B, point C, and point D. Connecting point O with point B, point C, and point D respectively, and projecting onto the unit circle centered at O, yields feature point b, feature point c, and feature point d; each feature point represents a contour vertical line of a building. Since, in the ENU coordinate system, the angle between the line connecting each feature point with the optical center and true north is known, the physical meaning of this angle is the direction of the building contour vertical line as seen from point O, and thus the direction angle information of the line connecting each feature point with the optical center can be obtained.
• point b, point c, and point d record the spatial constraint information between point O and the surrounding buildings when standing at point O. Specifically, this includes the direction angle information of the building elevation intersection lines with point O as the reference point. In addition, using the direction angle information of point b, point c, and point d as an index, the distance information and height information, relative to point O, of the elevation intersection lines represented by point B, point C, and point D can also be stored.
  • data is stored in a key-value structure.
  • the position (x, y) of the O point in the map is used as an index, and the descriptor samples a 360-degree unit circle.
  • an array of 360 data is formed.
• the angle with a feature is stored as "1", and the angle without a feature is stored as "0".
• the angle between Ob and true north in the figure is 26.86 degrees, so "1" is recorded at position 27 in the array.
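The key-value storage described above might look like the following sketch; the dict layout and the optional per-feature depth/height fields are illustrative assumptions:

```python
def store_candidate(db, x, y, features):
    """Store one candidate point: the map position (x, y) is the key, and
    the value is a 360-entry array indexed by rounded direction angle.
    `features` is a list of (angle_deg, depth, height) tuples; the
    26.86-degree example from the text rounds to index 27."""
    arr = [None] * 360
    for angle_deg, depth, height in features:
        arr[int(round(angle_deg)) % 360] = {"depth": depth, "height": height}
    db[(x, y)] = arr

db = {}
store_candidate(db, 12.0, 34.0, [(26.86, 18.5, 30.0), (118.4, 42.0, 12.0)])
# db[(12.0, 34.0)][27] == {"depth": 18.5, "height": 30.0}; other entries None
```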
  • the descriptor of the candidate point may also record the depth information of each feature point.
  • the descriptor of the candidate point may also record the height information of each feature point.
  • the descriptor matching can be performed on the terminal side, or the descriptor matching can be performed on the server side, which will be introduced separately below.
  • FIG. 13 is an interaction diagram of an embodiment of a visual positioning method in an embodiment of this application.
  • the server constructs a geometric semantic description sub-database
  • the terminal obtains the first positioning information and sends it to the server.
  • the server determines the first sub-database and sends it to the terminal.
  • the terminal acquires an image
• the order of step 1302 and step 1304 is not limited: step 1302 may be performed first and then step 1304, or step 1304 may be performed first and then step 1302.
  • the terminal extracts the characteristic vertical line of the building from the image
  • the terminal generates a geometric semantic descriptor
  • the terminal performs matching in the first descriptor sub-database to obtain a visual positioning result.
• for the specific content of steps 1301 to 1307, reference may be made to the embodiments corresponding to FIG. 1 and FIG. 9, which will not be repeated here.
  • FIG. 14 is an interaction diagram of another embodiment of a visual positioning method in an embodiment of this application.
  • the server constructs a geometric semantic description sub-database
  • the terminal obtains an image and sends it to the server;
  • the terminal obtains the first positioning information and sends it to the server.
• the order of step 1401 to step 1403 is not limited.
  • the server extracts the characteristic vertical line of the building from the image
  • the server generates a geometric semantic descriptor
  • the server performs matching in the first description sub-database, obtains the visual positioning result, and sends it to the terminal.
• for the specific content of steps 1401 to 1406, reference may be made to the embodiments corresponding to FIG. 1 and FIG. 9, which will not be repeated here.
  • FIG. 15 is a schematic diagram of an embodiment of the terminal in the embodiment of the present application.
  • the acquiring unit 1501 is used to acquire an image of a building
• the generating unit 1502 is configured to generate a descriptor according to the image, the descriptor including the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line in the image, where the first characteristic vertical line indicates the first elevation intersection line of the building and the second characteristic vertical line indicates the second elevation intersection line of the building;
  • the acquiring unit 1501 is further configured to match the descriptor in a preset descriptor database according to the descriptor to acquire the location information of the shooting location of the image.
• the generating unit 1502 is specifically configured to: extract the first characteristic vertical line and the second characteristic vertical line of the building from the image, and generate the descriptor according to the positional relationship between the first characteristic vertical line and the second characteristic vertical line, where the positional relationship includes the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
• the descriptor includes: height information and/or depth information of the first characteristic vertical line and the second characteristic vertical line, where the height information of the first characteristic vertical line is used to indicate the height of the first elevation intersection line;
• the height information of the second characteristic vertical line is used to indicate the height of the second elevation intersection line;
• the depth information of the first characteristic vertical line is used to indicate the distance from the shooting location of the image to the first elevation intersection line;
• the depth information of the second characteristic vertical line is used to indicate the distance from the shooting location of the image to the second elevation intersection line.
  • the descriptor is represented by a circular array
• the circular array includes first data indicating the first characteristic vertical line and second data indicating the second characteristic vertical line;
• the position interval between the first data and the second data in the circular array is used to indicate the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
• the descriptor is represented by a circular array; the circular array includes the first data, the second data, and third data, and the third data is used to indicate positions at which no characteristic vertical line of a building appears.
  • the first data includes height information and/or depth information of the first characteristic vertical line
  • the second data includes height information and/or depth information of the second characteristic vertical line.
• the descriptor is represented by a circle, with a first feature point on the circle indicating the first feature vertical line and a second feature point indicating the second feature vertical line, and the circle is centered on the optical center corresponding to the image;
• the first feature point is the point obtained by projecting the first feature vertical line onto a cylinder whose axis is the gravity axis passing through the optical center, and then projecting onto the horizontal plane containing the optical center;
• the second feature point is the point obtained by projecting the second feature vertical line onto the cylinder, and then projecting onto the horizontal plane;
• the angle between the line connecting the first feature point with the center of the circle and the line connecting the second feature point with the center of the circle is used to indicate the horizontal viewing angle information.
• the preset descriptor database includes: the geographic location of the candidate point and the descriptor of the candidate point, where the descriptor of the candidate point includes, taking the candidate point as the viewpoint, the direction information of the visible elevation intersection lines of the surrounding buildings.
  • the obtaining unit 1501 is further configured to: obtain first positioning information, the first positioning information including positioning information obtained according to GPS signals, wifi signals, location information of a base station that provides services for the terminal, or a user manually inputting an address
• the acquiring unit 1501 is specifically configured to: match the descriptor in the first descriptor sub-database according to the descriptor to acquire the positioning information, the first descriptor sub-database including the geographic location of the first candidate point and the descriptor of the first candidate point;
  • the first candidate point is a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
  • the obtaining unit 1501 is further configured to: obtain first positioning information, the first positioning information including positioning information obtained according to GPS signals, wifi signals, location information of a base station that provides services for the terminal, or a user manually inputting an address ;
  • the terminal also includes:
  • the sending unit 1503 is configured to send the first positioning information to the server,
• the receiving unit 1504 is configured to receive the preset descriptor database sent by the server, where the preset descriptor database includes the geographic location of the first candidate point and the descriptor of the first candidate point, and the first candidate point is a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
  • the obtaining unit 1501 is further configured to: obtain shooting orientation information of the image;
  • the acquiring unit 1501 is specifically configured to: match the descriptor with a preset descriptor database according to the first angle range determined by the shooting orientation information to acquire the positioning information.
• the acquiring unit 1501 is specifically configured to acquire a first image shot at a first location and a second image shot at a second location, where the first location and the second location are the same location or the distance between the first location and the second location is less than a preset threshold, and the first image and the second image include partially overlapping image information.
  • the preset descriptor sub-database is generated based on satellite images.
  • FIG. 16 is a schematic diagram of an embodiment of the server in the embodiment of the application.
  • the receiving unit 1601 is used to receive the image of the building sent by the terminal;
• the generating unit 1602 is configured to generate a descriptor according to the image, the descriptor including the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line in the image, where the first characteristic vertical line indicates the first elevation intersection line of the building and the second characteristic vertical line indicates the second elevation intersection line of the building;
  • the acquiring unit 1603 is configured to match the descriptor in a preset descriptor database according to the descriptor to acquire the location information of the shooting location of the image;
  • the sending unit 1604 is configured to send the positioning information to the terminal.
• the generating unit 1602 is specifically configured to: extract the first characteristic vertical line and the second characteristic vertical line of the building from the image, and generate the descriptor according to the positional relationship between the first characteristic vertical line and the second characteristic vertical line, where the positional relationship includes the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
• the descriptor includes: height information and/or depth information of the first characteristic vertical line and the second characteristic vertical line, where the height information of the first characteristic vertical line is used to indicate the height of the first elevation intersection line;
• the height information of the second characteristic vertical line is used to indicate the height of the second elevation intersection line;
• the depth information of the first characteristic vertical line is used to indicate the distance from the shooting location of the image to the first elevation intersection line;
• the depth information of the second characteristic vertical line is used to indicate the distance from the shooting location of the image to the second elevation intersection line.
  • the descriptor is represented by a circular array
• the circular array includes first data indicating the first characteristic vertical line and second data indicating the second characteristic vertical line, where the position interval between the first data and the second data in the circular array is used to indicate the horizontal viewing angle information between the first characteristic vertical line and the second characteristic vertical line.
• the descriptor is represented by a circular array; the circular array includes the first data, the second data, and third data, and the third data is used to indicate positions at which no characteristic vertical line of a building appears.
  • the first data includes height information and/or depth information of the first characteristic vertical line
  • the second data includes height information and/or depth information of the second characteristic vertical line.
• the descriptor is represented by a circle, with a first feature point on the circle indicating the first feature vertical line and a second feature point indicating the second feature vertical line, and the circle is centered on the optical center corresponding to the image;
• the first feature point is the point obtained by projecting the first feature vertical line onto a cylinder whose axis is the gravity axis passing through the optical center, and then projecting onto the horizontal plane containing the optical center;
• the second feature point is the point obtained by projecting the second feature vertical line onto the cylinder, and then projecting onto the horizontal plane;
• the angle between the line connecting the first feature point with the center of the circle and the line connecting the second feature point with the center of the circle is used to indicate the horizontal viewing angle information.
• the preset descriptor database includes: the geographic location of the candidate point and the descriptor of the candidate point, where the descriptor of the candidate point includes, taking the candidate point as the viewpoint, the direction information of the visible elevation intersection lines of the surrounding buildings.
• the obtaining unit 1603 is further configured to: obtain first positioning information sent by the terminal, the first positioning information including positioning information obtained according to GPS signals, wifi signals, location information of a base station that provides services for the terminal, or a user manually inputting an address;
• the acquiring unit 1603 is specifically configured to: match the descriptor in a first descriptor sub-database according to the descriptor to acquire the positioning information;
• the first descriptor sub-database includes the geographic location of the first candidate point and the descriptor of the first candidate point, where the first candidate point is a candidate point whose geographic location is within the first location range corresponding to the first positioning information.
  • the obtaining unit 1603 is further configured to: obtain shooting orientation information of the image sent by the terminal;
  • the acquiring unit 1603 is specifically configured to: match the descriptor with a preset descriptor database according to the first angle range determined by the shooting orientation information to acquire the positioning information.
• the receiving unit 1601 is specifically configured to: receive a first image shot at a first location and a second image shot at a second location, where the first location and the second location are the same location or the distance between the first location and the second location is less than a preset threshold, and the first image and the second image include partially overlapping image information.
  • the preset descriptor sub-database is generated based on satellite images.
  • FIG. 17 is a schematic diagram of another embodiment of a terminal in an embodiment of this application.
  • FIG. 17 shows a block diagram of a partial structure of a terminal provided in an embodiment of the present application.
  • the terminal includes: an image acquisition unit 1710, a global positioning system (GPS) module 1720, a sensor 1730, a display unit 1740, an input unit 1750, a memory 1760, a processor 1770, and a power supply 1780.
• the structure shown in FIG. 17 does not constitute a limitation on the terminal; the terminal may include more or fewer components than shown in the figure, combine certain components, or use a different arrangement of components.
• the image acquisition unit 1710 is used to acquire images; in the embodiment of the present application, it is used to acquire images of buildings. For example, it may be a monocular camera, a binocular camera, a depth camera, or a lidar; the type is not limited here.
• the GPS module 1720, a satellite-based global navigation system, provides positioning or navigation for users; its accuracy is lower than that of visual positioning, and it can be used to assist positioning in the embodiments of this application.
• the terminal may also include at least one sensor 1730, such as a magnetometer 1731 and an inertial measurement unit (IMU) 1732.
  • the inertial measurement unit is a device that measures the three-axis attitude angle (or angular rate) and acceleration of an object.
  • an IMU contains three single-axis accelerometers and three single-axis gyroscopes.
• the accelerometer detects the acceleration signal of an object in the carrier's independent three-axis coordinate system, while the gyroscope detects the angular velocity signal of the carrier relative to the navigation coordinate system. By measuring the angular velocity and acceleration of the object in three-dimensional space, the attitude of the object can be calculated. To improve reliability, more sensors can also be equipped for each axis.
  • the terminal may also include vibration recognition related functions (such as a pedometer, tapping) and so on.
  • it may also include light sensors, motion sensors, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor.
  • the ambient light sensor can adjust the brightness of the display panel 1741 according to the brightness of the ambient light.
  • the proximity sensor can turn off the display panel 1741 and/or the backlight when the terminal is moved to the ear.
  • as for other sensors such as gyroscopes, barometers, hygrometers, thermometers, and infrared sensors that may be configured on the terminal, details are not repeated here.
  • the display unit 1740 may be used to display information input by the user or information provided to the user and various menus of the terminal.
  • the display unit 1740 may include a display panel 1741.
  • the display panel 1741 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), etc.
  • the touch panel 1751 can cover the display panel 1741. When the touch panel 1751 detects a touch operation on or near it, it transmits the operation to the processor 1770 to determine the type of the touch event, and the processor 1770 then provides a corresponding visual output on the display panel 1741 according to the type of the touch event.
  • in FIG. 17 the touch panel 1751 and the display panel 1741 are two independent components implementing the input and output functions of the terminal, but in some embodiments the touch panel 1751 and the display panel 1741 may be integrated to implement the input and output functions of the terminal.
  • the input unit 1750 can be used to receive input digital or character information, and generate key signal input related to user settings and function control of the terminal.
  • the input unit 1750 may include a touch panel 1751 and other input devices 1752.
  • the touch panel 1751, also called a touch screen, can collect touch operations by the user on or near it (for example, operations performed on or near the touch panel 1751 with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection device according to a preset program.
  • the touch panel 1751 may include two parts: a touch detection device and a touch controller.
  • the touch detection device detects the user's touch position and the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the processor 1770, and can also receive and execute commands sent by the processor 1770.
  • the touch panel 1751 can be implemented in multiple types such as resistive, capacitive, infrared, and surface acoustic wave.
  • the input unit 1750 may also include other input devices 1752.
  • the other input device 1752 may include, but is not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackball, mouse, and joystick.
  • the memory 1760 may be used to store software programs and modules.
  • the processor 1770 runs the software programs and modules stored in the memory 1760 to execute various functional applications and data processing of the terminal.
  • the memory 1760 may mainly include a program storage area and a data storage area.
  • the program storage area may store an operating system, an application program required by at least one function (such as a sound playback function, an image playback function, etc.), and the like;
  • the data storage area may store data created according to the use of the terminal (such as audio data, a phone book, etc.), and the like.
  • the memory 1760 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • the processor 1770 is the control center of the terminal. It connects the various parts of the entire terminal through various interfaces and lines, and performs the various functions of the terminal and processes data by running or executing the software programs and/or modules stored in the memory 1760 and calling data stored in the memory 1760, thereby monitoring the terminal as a whole.
  • the processor 1770 may include one or more processing units; preferably, the processor 1770 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may alternatively not be integrated into the processor 1770.
  • the terminal also includes a power source 1780 (such as a battery) for supplying power to various components.
  • the power source can be logically connected to the processor 1770 through a power management system, so that functions such as charging, discharging, and power consumption management can be managed through the power management system.
  • the terminal may include an audio circuit, which includes a speaker and a microphone, and may provide an audio interface between the user and the terminal.
  • the terminal may include a wireless fidelity (WiFi) module.
  • WiFi is a short-distance wireless transmission technology.
  • through the WiFi module, the terminal can help users send and receive emails, browse web pages, access streaming media, and so on, providing users with wireless broadband Internet access.
  • the terminal may further include a radio frequency (RF) circuit.
  • the terminal may also include a Bluetooth module, etc., which will not be repeated here.
  • the processor 1770 included in the terminal also has the function of implementing the foregoing visual positioning method.
  • FIG. 18 is a schematic diagram of another embodiment of a server in an embodiment of this application.
  • the server provided in this embodiment is a target server for virtual machine hot migration, and its specific device form is not limited in the embodiment of this application.
  • the server 1800 may have relatively large differences due to different configurations or performance, and may include one or more processors 1801 and a memory 1802, and the memory 1802 stores programs or data.
  • the memory 1802 may be volatile storage or non-volatile storage.
  • the processor 1801 is one or more central processing units (CPU); the CPU may be a single-core CPU or a multi-core CPU.
  • the processor 1801 may communicate with the memory 1802 and execute a series of instructions in the memory 1802 on the server 1800.
  • the server 1800 also includes one or more wired or wireless network interfaces 1803, such as an Ethernet interface.
  • the server 1800 may also include one or more power supplies and one or more input/output interfaces, which can be used to connect a display, a mouse, a keyboard, a touch screen device, a sensor device, and the like.
  • the input/output interface is an optional component, which may or may not exist, and is not limited here.
  • the foregoing terminal device may also be an augmented reality (AR) device.
  • FIG. 19 is a schematic structural diagram of an AR device disclosed in an embodiment of the present application.
  • the AR device includes a processor 1901, and the processor 1901 may be coupled to one or more storage media.
  • the storage medium includes a storage medium 1911 and at least one memory 1902.
  • the storage medium 1911 may be read-only, such as read-only memory (ROM), or a readable/writable hard disk or flash memory.
  • the memory 1902 may be, for example, a random access memory (RAM).
  • the memory 1902 may be combined with the processor 1901, or integrated in the processor 1901, or composed of an independent unit or multiple units.
  • the processor 1901 is the control center of the AR device, and specifically provides time sequence and process equipment for executing instructions, completing interrupt events, providing time functions, and many other functions.
  • the processor 1901 includes one or more central processing units (CPU), such as CPU0 and CPU1 in FIG. 19.
  • the AR device may also include multiple processors, and each processor may be single-core or multi-core.
  • the specific implementations of the processor or memory described in this application include general-purpose components or special-purpose components: a general-purpose component is configured to perform a certain task at a specific moment, whereas a special-purpose component is manufactured to perform a special task.
  • the processor described in the embodiment of the present application at least includes an electronic device, a circuit, and/or a processor chip configured to process data (such as computer program instructions).
  • the program code executed by the processor 1901 and/or the processor 1912, or the single CPU in the processor 1901 and/or the processor 1912 may be stored in the memory 1902.
  • the AR device also includes a front camera 1903, a front rangefinder 1904, a rear camera 1905, a rear rangefinder 1906, an output module 1907 (such as an optical projector or a laser projector), and/or a communication interface 1908.
  • the front camera 1903, the front rangefinder 1904, the rear camera 1905, the rear rangefinder 1906, and the output module 1907 are coupled to the processor 1901.
  • the AR device may also include a receiving/transmitting circuit 1909 and an antenna 1910. The receiving/transmitting circuit 1909 and the antenna 1910 are used to realize the connection between the AR device and the external network.
  • the constituent units of the AR device may be coupled with each other through a communication bus, and the communication bus includes at least any one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus.
  • the AR device is only an example physical device form disclosed in the embodiment of the present application, and the embodiment of the present application does not uniquely limit the specific form of the AR device.
  • the processor 1901 of the AR device can be coupled to the at least one memory 1902.
  • the memory 1902 is pre-stored with program code.
  • the program code specifically includes an image acquisition module, a parameter detection module, a coefficient determination module, an image cropping module, an image generation module, and an image display module.
  • the memory 1902 further stores a kernel module.
  • the kernel module includes an operating system (such as WINDOWS™, ANDROID™, IOS™, etc.).
  • the processor 1901 of the AR device is configured to call the program code to execute the positioning method in the embodiment of the present application.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the unit described as a separate component may or may not be physically separated, and the component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of this application essentially, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods in the embodiments of this application.
  • the aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Abstract

A visual positioning method capable of achieving high-accuracy positioning even when the scene in which the image is captured has changed considerably. The method comprises: a terminal acquires an image of a building (1001); the terminal generates a descriptor from the image (1004), the descriptor comprising information on the horizontal view angle between a first feature vertical line and a second feature vertical line in the image, the first feature vertical line indicating a first facade intersection line of the building and the second feature vertical line indicating a second facade intersection line of the building; and the terminal matches the descriptor against a preset descriptor database to obtain positioning information of the location where the image was captured (1006).

Description

Visual positioning method, terminal, and server
This application claims priority to Chinese Patent Application No. 201910736244.9, filed with the Chinese Patent Office on August 9, 2019 and entitled "Visual positioning method, terminal, and server", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of positioning technologies, and in particular to a visual positioning method, a terminal, and a server.
Background
With the development of communication technologies and the growing computing power of terminal hardware, demand for outdoor location based services (LBS) is becoming ever richer, for example enhanced navigation, augmented reality (AR) advertising, and AR games. Outdoor positioning mainly relies on the global positioning system (GPS), but GPS signals are prone to occlusion and multipath reflection caused by buildings, so positioning accuracy is low. Visual positioning, by contrast, can use image information of the current scene to achieve more accurate positioning.
To meet the positioning-accuracy requirements of LBS, the prior art collects a large number of images in advance, extracts feature points based on brightness variations of image pixels, and builds a three-dimensional map of the scene. The map contains the three-dimensional positions of the feature points and their corresponding descriptors, where a feature point's descriptor describes the relationship between the feature point and its surrounding pixels. During visual positioning, the currently captured image is input, feature points are first extracted from it, these feature points are then matched against the feature points in the map, and the visual positioning result is obtained from the matches.
Because the prior art extracts feature points from pixel brightness variations and matches the feature points extracted from the currently captured image against those extracted from the images collected when the database was built, the extracted feature points cannot be matched effectively if the scene has changed considerably between capture times, for example under large lighting changes or in rain or snow, and the visual positioning result is then of low accuracy.
Summary
Embodiments of this application provide a visual positioning method that achieves high-accuracy positioning even when the scene in which the image is captured has changed considerably.
A first aspect of the embodiments of this application provides a visual positioning method, comprising: a terminal acquires an image of a building; the terminal generates a descriptor from the image, the descriptor comprising information on the horizontal view angle between a first feature vertical line and a second feature vertical line in the image, the first feature vertical line indicating a first facade intersection line of the building and the second feature vertical line indicating a second facade intersection line of the building; and the terminal matches the descriptor against a preset descriptor database to obtain positioning information of the location where the image was captured. The preset descriptor database comprises the geographic locations of candidate points and the descriptors of the candidate points, where a candidate point's descriptor comprises the orientation information of the visible facade intersection lines of surrounding buildings with the candidate point as the viewpoint; in the descriptor database, the geographic location of the candidate point indicated by the descriptor that matches the image's descriptor is the positioning information of the location where the image was captured.
During visual positioning, a user can obtain the positioning information of the current capture location by taking a picture. In the embodiments of this application, by photographing a building the terminal acquires an image of the building and generates the image's descriptor from it. The descriptor indicates the spatial relationship between the capture location and the facade intersection lines of the photographed building: it comprises information on the horizontal view angle between a first feature vertical line and a second feature vertical line in the image, the first feature vertical line indicating a first facade intersection line of the building and the second feature vertical line indicating a second facade intersection line. The terminal matches this descriptor against a preset descriptor database to obtain the positioning information of the capture location. The preset descriptor database comprises the geographic locations of candidate points and their descriptors; a candidate point is a pre-selected point with a known geographic location, used as a reference point from which the orientation information of the visible facade intersection lines of surrounding buildings is collected, including the horizontal view angle between any two facade intersection lines as seen from that candidate point. The descriptor derived from the image can therefore be matched against the candidate points' descriptors; once the candidate-point descriptor with the highest similarity is determined, the corresponding geographic location can be used to determine the positioning information of the capture location.
在第一方面的一种可能的实现方式中,该终端根据该图像生成描述子包括:该终端从该图像中提取该建筑物的该第一特征竖线和该第二特征竖线;该终端根据该第一特征竖线和该第二特征竖线的位置关系生成该描述子,该位置关系包括该第一特征竖线和该第二特征竖线的之间的水平视角的信息。
本申请实施例提供的视觉定位方法,终端可以从该图像中提取该建筑物的该第一特征竖线和该第二特征竖线,进而生成描述子。
在第一方面的一种可能的实现方式中,该描述子包括:该第一特征竖线和该第二特征竖线的高度信息和/或深度信息,该第一特征竖线的高度信息用于指示该第一立面交线的高度,该第二特征竖线的高度信息用于指示该第二立面交线的高度,该第一特征竖线的深度信息用于指示该图像的拍摄地点到该第一立面交线的距离,该第二特征竖线的深度信息用于指示该图像的拍摄地点到该第二立面交线的距离。
本申请实施例提供的视觉定位方法,描述子还可以包括特征竖线的高度信息和/或深度信息,特征竖线的高度信息用于指示建筑物立面交线的高度,深度信息用于指示该图像的拍摄地点到立面交线的距离,由此,描述子除了包括图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,还可以包括关于建筑物立面交线更丰富的位置信息,该高度信息和深度信息也可以用于匹配定位,例如拍摄地点周围的建筑物的立面交线分别较均匀时,如果同时具有立面交线的高度信息,可以用于提高定位的准确性,还可以提供图像拍摄时的朝向信息。
在第一方面的一种可能的实现方式中,该描述子以环形数组表示,该环形数组中包括指示第一特征竖线的第一数据,和指示第一特征竖线的第二数据,该第一数据和该第二数据在该环形数组中的位置间隔用于指示该第一特征竖线和该第二特征竖线之间水平视角信息。
本申请实施例提供的视觉定位方法,给出了描述子的一种具体的存储形式,即环形数组,可以利用数据的间隔方便的表达没有具体朝向的角度信息,还可以便于与描述子数据库进行匹配。
在第一方面的一种可能的实现方式中,该描述子以环形数组表示,该环形数组中包括该第一数据、该第二数据和第三数据,该第三数据用于指示未出现该建筑物的特征竖线。
本申请实施例提供的视觉定位方法,给出了描述子的另一种具体的存储形式,环形数组中包括各特征竖线对应的数据,以及指示未出现该建筑物的特征竖线的第三数据,提升了方案的可实现性。
在第一方面的一种可能的实现方式中,该第一数据包括该第一特征竖线的高度信息和/或深度信息,该第二数据包括该第二特征竖线的高度信息和/或深度信息。
本申请实施例提供的视觉定位方法,给出了描述子的另一种具体的存储形式,环形数组的第一数据中还可以包含高度信息和/或深度信息,提供了实现高度信息和/或深度信息存储的一种具体方式。
在第一方面的一种可能的实现方式中,该描述子以圆、圆上指示该第一特征竖线的第一特征点和指示该第二特征竖线第二特征点表示,该圆以该图像对应的光心为圆心,该第一特征点为该第一特征竖线投影在以经过该光心的重力轴为轴线的圆柱上,再投影至包含该光心的水平面上得到的点,该第二特征点为该第二特征竖线投影在该圆柱上,再投影至该水平面上得到的点,该第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示该水平视角的信息。
本申请实施例提供的视觉定位方法,描述子可以通过直观的几何信息来表达,可以方便地展示描述子的生成方式,体现视觉定位的具体过程,提升交互体验。
在第一方面的一种可能的实现方式中,该方法还包括:该终端获取第一定位信息,该第一定位信息包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;该终端根据该描述子在预设的描述子数据库中匹配,以获取该图像的拍摄地点的定位信息包括:该终端根据该描述子在第一描述子数据库中匹配,以获取该定位信息,该第一描述子数据库包括第一备选点的地理位置和该第一备选点的描述子,该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的备选点。
本申请实施例提供的视觉定位方法,终端可以通过第一定位信息获取粗略的第一位置范围,据此缩小描述子数据库匹配的范围,仅与属于该第一位置范围的备选点的描述子进行匹配,可以减少计算量,提升视觉定位速度。
在第一方面的一种可能的实现方式中,该方法还包括:该终端获取第一定位信息,该第一定位信息包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;该终端将该第一定位信息发送给服务器,该终端接收该服务器发送的该预设的描述子数据库,该预设的描述子数据库第一备选点的地理位置和该第一备选点的描述子,该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的该备选点。
本申请实施例提供的视觉定位方法,终端可以通过将第一定位信息发送给服务器,从服务器获取预设的描述子数据库,可以实现本地匹配和定位。
在第一方面的一种可能的实现方式中,该方法还包括:该终端获取该图像的拍摄朝向信息;该终端根据该描述子在预设的描述子数据库中匹配,以获取该图像的拍摄地点的定位信息:该终端根据该拍摄朝向信息确定的第一角度范围,将该描述子与预设的描述子数据库进行匹配,以获取该定位信息。
本申请实施例提供的视觉定位方法,终端可以通过获取该图像的拍摄朝向信息,在该朝向信息约束的角度范围内,与备选点的描述子进行匹配,可以减少计算量,提升视觉定位速度。
在第一方面的一种可能的实现方式中,该终端获取建筑物的图像包括:该终端获取在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,该第一地点和该第二地点为同一地点,或者该第一地点与该第二地点的距离小于预设阈值,该第一图像和该第二图像包括部分重复的图像信息。
本申请实施例提供的视觉定位方法,终端可以通过拍摄多张图像拓宽视场角,获取更多建筑物图像信息,具有部分重复的图像信息的第一图像和第二图像可以进行拼接,并用于生 成描述子。
在第一方面的一种可能的实现方式中,该预设的描述子数据库根据卫星图像生成。
本申请实施例提供的视觉定位方法,预设的描述子数据库可以根据卫星图像生成,例如卫星图像生成LOD模型,在LOD模型中选取备选点,获取每个备选点的描述子,进而构建描述子数据库。该方案采用卫星影像大规模自动化构建描述子数据库,无需现场采集图像,减少了工作量,降低了构建数据库的难度。
本申请实施例第二方面提供了一种视觉定位方法,包括:服务器接收终端发送的建筑物的图像;该服务器根据该图像生成描述子,该描述子包括该图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,该第一特征竖线指示该建筑物的第一立面交线,该第二特征竖线指示该建筑物的第二立面交线;该服务器根据该描述子在预设的描述子数据库中匹配,以获取该图像的拍摄地点的定位信息,所述预设的描述子数据库包括:备选点的地理位置和所述备选点的描述子,所述备选点的描述子包括,以所述备选点为视点,周围建筑物的可视的立面交线的朝向信息,其中,所述描述子数据库中,与所述图像的描述子匹配的描述子指示的备选点的地理位置为所述图像的拍摄地点的定位信息;该服务器将该定位信息发送给该终端。
在第二方面的一种可能的实现方式中,该服务器根据该图像生成描述子包括:该服务器从该图像中提取该建筑物的该第一特征竖线和该第二特征竖线;该服务器根据该第一特征竖线和该第二特征竖线的位置关系生成该描述子,该位置关系包括该第一特征竖线和该第二特征竖线的之间的水平视角的信息。
本申请实施例提供的视觉定位方法,服务器可以从该图像中提取该建筑物的该第一特征竖线和该第二特征竖线,进而生成描述子。
在第二方面的一种可能的实现方式中,该描述子包括:该第一特征竖线和该第二特征竖线的高度信息和/或深度信息,该第一特征竖线的高度信息用于指示该第一立面交线的高度,该第二特征竖线的高度信息用于指示该第二立面交线的高度,该第一特征竖线的深度信息用于指示该图像的拍摄地点到该第一立面交线的距离,该第二特征竖线的深度信息用于指示该图像的拍摄地点到该第二立面交线的距离。
本申请实施例提供的视觉定位方法,描述子还可以包括特征竖线的高度信息和/或深度信息,描述子除了包括图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,还可以提供关于建筑物立面交线更丰富的位置信息,该高度信息和深度信息也可以用于匹配定位,例如拍摄地点周围的建筑物的立面交线分别较均匀时,如果同时具有立面交线的高度信息,可以用于提高定位的准确性,还可以提供图像拍摄时的朝向信息。
在第二方面的一种可能的实现方式中,该描述子以环形数组表示,该环形数组中包括指示第一特征竖线的第一数据,和指示第一特征竖线的第二数据,该第一数据和该第二数据在该环形数组中的位置间隔用于指示该第一特征竖线和该第二特征竖线之间水平视角信息。
本申请实施例提供的视觉定位方法,给出了描述子的一种具体的存储形式,即环形数组,可以利用数据的间隔方便的表达没有具体朝向的角度信息,还可以便于与描述子数据库进行匹配。
在第二方面的一种可能的实现方式中,该描述子以环形数组表示,该环形数组中包括该 第一数据、该第二数据和第三数据,该第三数据用于指示未出现该建筑物的特征竖线。
本申请实施例提供的视觉定位方法,给出了描述子的另一种具体的存储形式,环形数组中包括各特征竖线对应的数据,以及指示未出现该建筑物的特征竖线的第三数据,提升了方案的可实现性。
在第二方面的一种可能的实现方式中,该第一数据包括该第一特征竖线的高度信息和/或深度信息,该第二数据包括该第二特征竖线的高度信息和/或深度信息。
本申请实施例提供的视觉定位方法,给出了描述子的另一种具体的存储形式,环形数组的第一数据中还可以包含高度信息和/或深度信息,提供了实现高度信息和/或深度信息存储的一种具体方式。
在第二方面的一种可能的实现方式中,该描述子以圆、圆上指示该第一特征竖线的第一特征点和指示该第二特征竖线第二特征点表示,该圆以该图像对应的光心为圆心,该第一特征点为该第一特征竖线投影在以经过该光心的重力轴为轴线的圆柱上,再投影至包含该光心的水平面上得到的点,该第二特征点为该第二特征竖线投影在该圆柱上,再投影至该水平面上得到的点,该第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示该水平视角的信息。
本申请实施例提供的视觉定位方法,描述子可以通过直观的几何信息来表达,可以方便地展示描述子的生成方式,体现视觉定位的具体过程,提升交互体验。
在第二方面的一种可能的实现方式中,该方法还包括:该服务器获取该终端发送的第一定位信息,该第一定位信息包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;该服务器根据该描述子在预设的描述子数据库中匹配,以获取该图像的拍摄地点的定位信息包括:该服务器根据该描述子在第一描述子数据库中匹配,以获取该定位信息,该第一描述子数据库包括第一备选点的地理位置和该第一备选点的描述子,该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的备选点。
本申请实施例提供的视觉定位方法,服务器可以通过第一定位信息获取粗略的第一位置范围,据此缩小描述子数据库匹配的范围,仅与属于该第一位置范围的备选点的描述子进行匹配,可以减少计算量,提升视觉定位速度。
在第二方面的一种可能的实现方式中,该方法还包括:该服务器获取该终端发送的该图像的拍摄朝向信息;该服务器根据该描述子在预设的描述子数据库中匹配,以获取该图像的拍摄地点的定位信息包括:该终端根据该拍摄朝向信息确定的第一角度范围,将该描述子与预设的描述子数据库进行匹配,以获取该定位信息。
本申请实施例提供的视觉定位方法,服务器可以通过获取该图像的拍摄朝向信息,在该朝向信息约束的角度范围内,与备选点的描述子进行匹配,可以减少计算量,提升视觉定位速度。
在第二方面的一种可能的实现方式中,服务器接收终端发送的建筑物的图像包括:该服务器接收在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,该第一地点和该第二地点为同一地点,或者该第一地点与该第二地点的距离小于预设阈值,该第一图像和该第二图像包括部分重复的图像信息。
本申请实施例提供的视觉定位方法,服务器可以接收多张图像,获取更多建筑物图像信 息,具有部分重复的图像信息的第一图像和第二图像可以进行拼接,并用于生成描述子。
在第二方面的一种可能的实现方式中,该预设的描述子数据库根据卫星图像生成。
本申请实施例提供的视觉定位方法,预设的描述子数据库可以根据卫星图像生成,例如卫星图像生成LOD模型,在LOD模型中选取备选点,获取每个备选点的描述子,进而构建描述子数据库。该方案采用卫星影像大规模自动化构建描述子数据库,无需现场采集图像,减少了工作量,降低了构建数据库的难度。
本申请实施例第三方面提供了一种视觉定位方法,包括:终端获取建筑物的图像;该终端将该图像发送给服务器;该终端获取该服务器发送的定位信息,该定位信息由该服务器根据该图像生成的描述子在预设的描述子数据库中匹配得到,该描述子包括该图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,该第一特征竖线指示该建筑物的第一立面交线,该第二特征竖线指示该建筑物的第二立面交线,所述预设的描述子数据库包括:备选点的地理位置和所述备选点的描述子,所述备选点的描述子包括,以所述备选点为视点,周围建筑物的可视的立面交线的朝向信息,其中,所述描述子数据库中,与所述图像的描述子匹配的描述子指示的备选点的地理位置为所述图像的拍摄地点的定位信息。
在第三方面的一种可能的实现方式中,该终端获取该图像的拍摄朝向信息;该终端将该拍摄朝向信息发送给该服务器,该拍摄朝向信息用于确定第一角度范围,该定位信息由该服务器,在该第一角度范围内,根据该图像生成的描述子在该服务器预设的描述子数据库中匹配得到。
本申请实施例第四方面提供了一种视觉定位方法,包括:服务器获取终端发送的第一定位信息;该服务器根据该第一定位信息向该终端发送预设的描述子数据库,该预设的描述子数据库第一备选点的地理位置和该第一备选点的描述子,该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的该备选点。该第一描述子数据库用于终端的视觉定位。
本申请实施例第五方面提供了一种终端,其特征在于,包括:获取单元,用于获取建筑物的图像;生成单元,用于根据该图像生成描述子,该描述子包括该图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,该第一特征竖线指示该建筑物的第一立面交线,该第二特征竖线指示该建筑物的第二立面交线;该获取单元还用于,根据该描述子在预设的描述子数据库中匹配,以获取该图像的拍摄地点的定位信息,所述预设的描述子数据库包括:备选点的地理位置和所述备选点的描述子,所述备选点的描述子包括,以所述备选点为视点,周围建筑物的可视的立面交线的朝向信息,其中,所述描述子数据库中,与所述图像的描述子匹配的描述子指示的备选点的地理位置为所述图像的拍摄地点的定位信息。
在第五方面的一种可能的实现方式中,该生成单元具体用于:从该图像中提取该建筑物的该第一特征竖线和该第二特征竖线;根据该第一特征竖线和该第二特征竖线的位置关系生成该描述子,该位置关系包括该第一特征竖线和该第二特征竖线的之间的水平视角的信息。
在第五方面的一种可能的实现方式中,该描述子包括:该第一特征竖线和该第二特征竖线的高度信息和/或深度信息,该第一特征竖线的高度信息用于指示该第一立面交线的高度,该第二特征竖线的高度信息用于指示该第二立面交线的高度,该第一特征竖线的深度信息用于指示该图像的拍摄地点到该第一立面交线的距离,该第二特征竖线的深度信息用于指示该 图像的拍摄地点到该第二立面交线的距离。
在第五方面的一种可能的实现方式中,该描述子以环形数组表示,该环形数组中包括指示第一特征竖线的第一数据,和指示第一特征竖线的第二数据,该第一数据和该第二数据在该环形数组中的位置间隔用于指示该第一特征竖线和该第二特征竖线之间水平视角信息。
在第五方面的一种可能的实现方式中,该描述子以环形数组表示,该环形数组中包括该第一数据、该第二数据和第三数据,该第三数据用于指示未出现该建筑物的特征竖线。
在第五方面的一种可能的实现方式中,该第一数据包括该第一特征竖线的高度信息和/或深度信息,该第二数据包括该第二特征竖线的高度信息和/或深度信息。
在第五方面的一种可能的实现方式中,该描述子以圆、圆上指示该第一特征竖线的第一特征点和指示该第二特征竖线第二特征点表示,该圆以该图像对应的光心为圆心,该第一特征点为该第一特征竖线投影在以经过该光心的重力轴为轴线的圆柱上,再投影至包含该光心的水平面上得到的点,该第二特征点为该第二特征竖线投影在该圆柱上,再投影至该水平面上得到的点,该第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示该水平视角的信息。
在第五方面的一种可能的实现方式中,该获取单元还用于:获取第一定位信息,该第一定位信息包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;该获取单元具体用于:根据该描述子在第一描述子数据库中匹配,以获取该定位信息,该第一描述子数据库包括第一备选点的地理位置和该第一备选点的描述子,该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的备选点。
在第五方面的一种可能的实现方式中,该获取单元还用于:获取第一定位信息,该第一定位信息包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;该终端还包括:发送单元,用于将该第一定位信息发送给服务器,接收单元,用于接收该服务器发送的该预设的描述子数据库,该预设的描述子数据库第一备选点的地理位置和该第一备选点的描述子,该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的该备选点。
在第五方面的一种可能的实现方式中,该获取单元还用于:获取该图像的拍摄朝向信息;该获取单元具体用于:根据该拍摄朝向信息确定的第一角度范围,将该描述子与预设的描述子数据库进行匹配,以获取该定位信息。
在第五方面的一种可能的实现方式中,该获取单元具体用于:获取在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,该第一地点和该第二地点为同一地点,或者该第一地点与该第二地点的距离小于预设阈值,该第一图像和该第二图像包括部分重复的图像信息。
在第五方面的一种可能的实现方式中,该预设的描述子数据库根据卫星图像生成。
本申请实施例第六方面提供了一种服务器,包括:接收单元,用于接收终端发送的建筑物的图像;生成单元,用于根据该图像生成描述子,该描述子包括该图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,该第一特征竖线指示该建筑物的第一立面交线,该第二特征竖线指示该建筑物的第二立面交线;获取单元,用于根据该描述子在预设的描述子数据库中匹配,以获取该图像的拍摄地点的定位信息,所述预设的描述子数据库包括:备选点的地理位置和所述备选点的描述子,所述备选点的描述子包括,以所述备选点为视点,周 围建筑物的可视的立面交线的朝向信息,其中,所述描述子数据库中,与所述图像的描述子匹配的描述子指示的备选点的地理位置为所述图像的拍摄地点的定位信息;发送单元,用于将该定位信息发送给该终端。
在第六方面的一种可能的实现方式中,该生成单元具体用于:从该图像中提取该建筑物的该第一特征竖线和该第二特征竖线;根据该第一特征竖线和该第二特征竖线的位置关系生成该描述子,该位置关系包括该第一特征竖线和该第二特征竖线的之间的水平视角的信息。
在第六方面的一种可能的实现方式中,该描述子包括:该第一特征竖线和该第二特征竖线的高度信息和/或深度信息,该第一特征竖线的高度信息用于指示该第一立面交线的高度,该第二特征竖线的高度信息用于指示该第二立面交线的高度,该第一特征竖线的深度信息用于指示该图像的拍摄地点到该第一立面交线的距离,该第二特征竖线的深度信息用于指示该图像的拍摄地点到该第二立面交线的距离。
在第六方面的一种可能的实现方式中,该描述子以环形数组表示,该环形数组中包括指示第一特征竖线的第一数据,和指示第一特征竖线的第二数据,该第一数据和该第二数据在该环形数组中的位置间隔用于指示该第一特征竖线和该第二特征竖线之间水平视角信息。
在第六方面的一种可能的实现方式中,该描述子以环形数组表示,该环形数组中包括该第一数据、该第二数据和第三数据,该第三数据用于指示未出现该建筑物的特征竖线。
在第六方面的一种可能的实现方式中,该第一数据包括该第一特征竖线的高度信息和/或深度信息,该第二数据包括该第二特征竖线的高度信息和/或深度信息。
在第六方面的一种可能的实现方式中,该描述子以圆、圆上指示该第一特征竖线的第一特征点和指示该第二特征竖线第二特征点表示,该圆以该图像对应的光心为圆心,该第一特征点为该第一特征竖线投影在以经过该光心的重力轴为轴线的圆柱上,再投影至包含该光心的水平面上得到的点,该第二特征点为该第二特征竖线投影在该圆柱上,再投影至该水平面上得到的点,该第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示该水平视角的信息。
在第六方面的一种可能的实现方式中,该预设的描述子数据库包括:备选点的地理位置和该备选点的描述子,该备选点的描述子包括,以该备选点为视点,周围建筑物的可视的立面交线的朝向信息。
在第六方面的一种可能的实现方式中,该获取单元还用于:获取该终端发送的第一定位信息,该第一定位信息包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;该获取单元具体用于:根据该描述子在第一描述子数据库中匹配,以获取该定位信息,该第一描述子数据库包括第一备选点的地理位置和该第一备选点的描述子,该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的备选点。
在第六方面的一种可能的实现方式中,该获取单元还用于:获取该终端发送的该图像的拍摄朝向信息;该获取单元具体用于:根据该拍摄朝向信息确定的第一角度范围,将该描述子与预设的描述子数据库进行匹配,以获取该定位信息。
在第六方面的一种可能的实现方式中,该接收单元具体用于:接收在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,该第一地点和该第二地点为同一地点,或者该第一地 点与该第二地点的距离小于预设阈值,该第一图像和该第二图像包括部分重复的图像信息。
在第六方面的一种可能的实现方式中,该预设的描述子数据库根据卫星图像生成。
本申请实施例第七方面提供了一种终端,包括:处理器和储存器;该存储器用于存储指令;该处理器用于根据该指令执行前述第一方面或第三方面提供的各实施方式的方法。
本申请实施例第八方面提供了一种服务器,包括:处理器和储存器;该存储器用于存储指令;该处理器用于根据该指令执行前述第二方面或第四方面提供的各实施方式的方法。
本申请实施例第九方面提供了包含指令的计算机程序产品,当其在计算机上运行时,使得该计算机执行前述第一方面至第四方面提供的各实施方式的方法。
本申请实施例第十方面提供了计算机可读存储介质,该计算机可读存储介质存储指令,当该指令在计算机上运行时,使得该计算机执行前述第一方面至第四方面提供的各实施方式的方法。
It can be seen from the above technical solutions that the embodiments of this application have the following advantages:
In the visual positioning method provided in the embodiments of this application, an image of a building obtained by photographing the building is acquired, a descriptor is generated from the image, and the descriptor is matched in a preset descriptor database to obtain the positioning information of the image's capture location. The descriptor comprises information on the horizontal view angle between a first feature vertical line and a second feature vertical line in the image, the first feature vertical line indicating a first facade intersection line of the building and the second feature vertical line indicating a second facade intersection line. The feature vertical lines extracted from the image carry geometric semantic information, corresponding to the facade intersection lines of the physical building, and are insensitive to scene changes at capture time: even if the building is partially occluded, the top skyline is not captured, the lighting has changed considerably, or it is raining or snowing, descriptor generation and use are unaffected, and the visual positioning result obtained by matching the descriptor has high accuracy.
Moreover, in the prior art, feature points are extracted based on pixel brightness variations, and a feature point's descriptor describes the numerical relationship between the feature point and its surrounding pixels, usually as a multi-dimensional feature vector, for example a 128-dimensional vector; the feature points are numerous and descriptor generation is computationally expensive. This application instead extracts the feature vertical lines in the image that correspond to the physical building's facade intersection lines; the descriptor comprises the horizontal view angle information between feature vertical lines, the feature vertical lines are few, and computing the descriptor is simpler.
Moreover, in the prior art, building the descriptor database requires collecting a large number of images on site, and the database must be refreshed when seasons change, which is time-consuming and laborious. The descriptor database of the embodiments of this application can be built automatically at large scale from satellite imagery, without on-site image collection, greatly reducing the workload and the difficulty of building the database.
Moreover, in the prior art, the descriptor database stores large quantities of two-dimensional images, three-dimensional feature point clouds, and feature-point descriptors; the data volume is large, requires large storage space, and usually can be built only for a small geographic range. In the visual positioning method provided in the embodiments of this application, the descriptor generated from the image is directly associated with the descriptors of the map candidate points, which record the orientation information of the visible facade intersection lines of the buildings around each candidate point; there is no need to store large quantities of two-dimensional images and three-dimensional point clouds, so the data volume and the storage footprint are small. For the same position range, for example, the descriptor database of this solution is on the order of one millionth of a visual simultaneous localization and mapping (vSLAM) feature library.
Brief Description of Drawings
FIG. 1 is a schematic diagram of an embodiment of the visual positioning method in an embodiment of this application;
FIG. 2 is a schematic diagram of the camera coordinate system;
FIG. 3 is a schematic diagram of semantic segmentation of an image in an embodiment of this application;
FIG. 4 is a schematic diagram of obtaining the feature vertical lines of a building in an embodiment of this application;
FIG. 5 is a schematic diagram of cylindrical projection conversion in an embodiment of this application;
FIG. 6 is a schematic diagram of cylindrical projection of multiple images in an embodiment of this application;
FIG. 7 is a schematic diagram of the data structure of a descriptor in an embodiment of this application;
FIG. 8 is a schematic diagram of an embodiment of GPS-signal-assisted visual positioning in an embodiment of this application;
FIG. 9 is a schematic diagram of an embodiment of building the descriptor database in an embodiment of this application;
FIG. 10 is a schematic diagram of an embodiment of selecting candidate points in an embodiment of this application;
FIG. 11 is a schematic diagram of cylindrical projection based on the LOD model in an embodiment of this application;
FIG. 12 is a schematic diagram of generating a candidate point's descriptor in an embodiment of this application;
FIG. 13 is an interaction diagram of an embodiment of a visual positioning method in an embodiment of this application;
FIG. 14 is an interaction diagram of another embodiment of a visual positioning method in an embodiment of this application;
FIG. 15 is a schematic diagram of an embodiment of a terminal in an embodiment of this application;
FIG. 16 is a schematic diagram of an embodiment of a server in an embodiment of this application;
FIG. 17 is a schematic diagram of another embodiment of a terminal in an embodiment of this application;
FIG. 18 is a schematic diagram of another embodiment of a server in an embodiment of this application;
FIG. 19 is a schematic structural diagram of an AR device disclosed in an embodiment of this application.
Detailed Description
Embodiments of this application provide a visual positioning method for improving the accuracy of visual positioning in scenes with large lighting changes.
The embodiments of this application are described below with reference to the accompanying drawings. Clearly, the described embodiments are only some rather than all of the embodiments of this application. A person of ordinary skill in the art will appreciate that, as technologies evolve and new scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.
The terms "first", "second", and the like in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described. Moreover, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or modules is not necessarily limited to the steps or modules expressly listed, but may include other steps or modules not expressly listed or inherent to the process, method, product, or device. The naming or numbering of steps in this application does not mean that the steps must be performed in the temporal or logical order indicated by the naming or numbering; named or numbered steps may be executed in a different order, as long as the same or a similar technical effect is achieved.
Please refer to FIG. 1, which is a schematic diagram of an embodiment of the visual positioning method in an embodiment of this application.
The visual positioning method provided in an embodiment of this application comprises:
1001. The terminal acquires an image of a building.
When a user needs precise positioning information for the current location, visual positioning can be performed by photographing surrounding buildings. The terminal acquires the captured image, which contains an image of the building.
Optionally, the image may be acquired by any of a variety of image acquisition devices, such as a monocular camera, a binocular camera, a depth camera, or a lidar; the specific type is not limited here. The image acquisition device may be a camera component inside the terminal, or a device external to the terminal that can communicate with it; the arrangement of the image acquisition device is likewise not limited here.
In some scenarios the field of view of a single image is too small to capture a complete building and provides limited information; the terminal can enlarge the field of view and increase the amount of image information by acquiring multiple images. The terminal may acquire one image or multiple images; the number of acquired images is not limited here. Optionally, the terminal acquires a first image captured at a first location and a second image captured at a second location, where the first location and the second location are the same location, or the distance between them is smaller than a preset threshold, and the first image and the second image contain partially overlapping image information.
The multiple images may be captured at the same location. For example, the user photographs a building with the terminal while rotating it, and multiple overlapping photos are captured at regular intervals or by intelligent recognition; an image stitching algorithm then stitches the images into one image containing the building. It should be noted that, because the image acquisition device usually moves somewhat while multiple images are captured, the distance between the first location where the first image is captured and the second location where the second image is captured should be smaller than a preset threshold. The preset threshold can be set according to actual needs; optionally, it can be a displacement that is negligible relative to the depth of field of a long-range shot, for example a depth-to-displacement ratio greater than 100, such as 200 or 500. For instance, with an image depth of field of 200 metres the preset threshold may be 1 metre; when the user shoots hand-held while standing still, the displacement produced by arm rotation is usually within 0.5 metres, so the resulting images can be treated as positioning for the same capture location.
1002. Extract the building's feature vertical lines from the image.
For ease of description, the camera coordinate system and the image coordinate system are introduced first. Referring to FIG. 2, the camera coordinate system is a three-dimensional Cartesian coordinate system with the camera's optical centre O as the origin and the optical axis as the z axis. Its x and y axes are parallel to the image's X and Y axes, and its z axis, the camera optical axis, is perpendicular to the image plane. The intersection of the optical axis and the image plane is the origin of the image coordinate system, which is a two-dimensional Cartesian coordinate system.
A facade generally refers to an exterior wall of a building, including the front, side, or rear face. In three-dimensional space, an intersection line is a line where solid surfaces meet. In the embodiments of this application, a facade intersection line refers to the line of intersection between adjacent faces of a building's exterior walls.
Because a building's facade intersection lines are usually perpendicular to the horizontal plane, if the camera x axis is level when the building is photographed, the building's feature vertical lines in the resulting image are segments parallel to the Y axis. In practice, however, the camera x axis may not be level during capture, and the building's feature vertical lines will then form an angle with the image's Y axis. The building's contour lines therefore need to be regularised, for example by a rotation transform using image rectification techniques, to obtain feature vertical lines parallel to the image's Y axis.
Optionally, regularising the building's contour lines includes rotating the camera coordinate system about its x and z axes so that the camera x axis is parallel to the horizontal axis of the world coordinate system and the camera y axis is parallel to the gravity axis of the world coordinate system. The Y axis of the image coordinate system is then parallel to the gravity axis y of the world coordinate system, completing image rectification, so that the building's vertical contour lines in the image are parallel to the Y axis of the image coordinate system.
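The rectification step above can be sketched as a pure-rotation homography. The following Python is an illustrative sketch only, not part of the patent's disclosure: it assumes the camera intrinsic matrix K and the roll and pitch angles (e.g. estimated from the IMU) are known, and the function name is invented for illustration.

```python
import numpy as np

def rectifying_homography(K, roll_rad, pitch_rad):
    """Homography x' ~ K @ R @ inv(K) @ x that re-renders the image as if
    the camera's x axis were level: R undoes roll about the optical (z)
    axis and pitch about the x axis, a pure rotation of the camera."""
    cr, sr = np.cos(roll_rad), np.sin(roll_rad)
    cp, sp = np.cos(pitch_rad), np.sin(pitch_rad)
    Rz = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])  # roll
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])  # pitch
    return K @ (Rx @ Rz) @ np.linalg.inv(K)
```

Warping the image with this homography (for example with `cv2.warpPerspective`) makes the building's vertical contour lines parallel to the image Y axis.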
There are various ways to extract the building's feature vertical lines from the image, and this application does not specifically limit them.
Optionally, a semantic segmentation algorithm is first used to identify and separate the image information of different object types in the image. For example, referring to FIG. 3, a schematic diagram of semantic segmentation of an image in an embodiment of this application, the image is segmented into building, vegetation, ground, and sky regions, which yields the building's contour lines; the vertical lines in the contour are then extracted as the image's feature vertical lines. Referring to FIG. 4, a schematic diagram of obtaining a building's feature vertical lines in an embodiment of this application, the bold segments indicate the obtained feature vertical lines. Optionally, a line extraction algorithm such as the line segment detector (LSD) can be used to obtain regularised line-segment information in the image; the classification mask produced by semantic segmentation is then used as a constraint to discard stray segments inside and outside the building region, keeping only target segments at the building/sky boundary. Finally, each line segment to be extracted is extended along its slope direction within the valid range of the building mask to obtain straight segments conforming to the building's outer contour.
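As an illustrative sketch of the idea that feature vertical lines lie at the building/sky boundary, the simplified Python below marks columns where the building silhouette height jumps sharply in a binary segmentation mask. This is an assumption-laden stand-in for the LSD-plus-mask pipeline described above (it only finds vertical lines at silhouette discontinuities); the function name and threshold are invented.

```python
import numpy as np

def skyline_vertical_lines(building_mask, jump_thresh=5):
    """From a binary building mask (1 = building pixel, row 0 at the top),
    compute the per-column silhouette height and mark columns where it
    jumps sharply; such discontinuities in the building/sky boundary are
    candidate feature vertical lines (facade intersection lines).
    Returns (heights, candidate_columns)."""
    h, w = building_mask.shape
    has_building = building_mask.any(axis=0)
    # topmost building pixel per column; columns with no building get height 0
    top = np.where(has_building, building_mask.argmax(axis=0), h)
    heights = h - top                                  # silhouette height in pixels
    jumps = np.abs(np.diff(heights.astype(int)))
    candidates = np.nonzero(jumps >= jump_thresh)[0] + 1
    return heights, candidates
```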
1003. Perform cylindrical projection conversion.
An image captured by a camera is usually a central-projection image. Projecting the central-projection image onto a cylinder whose axis is the camera coordinate system's y axis converts the central projection into a cylindrical projection.
Projecting the cylindrical-projection image onto the xOz plane spanned by the camera coordinate system's x and z axes yields a circle centred on the optical centre O, and a building feature vertical line obtained from the image projects to a feature point on the circle. That feature point represents the building feature vertical line obtained from the image.
Referring to FIG. 5, a schematic diagram of cylindrical projection conversion: the straight segment joining points A and B represents the imaging plane of the central projection, and the arc through points D, I, and E is the imaging surface of the cylindrical projection. Point C is the midpoint of segment AB. Since the focal length of the camera that captured the central-projection image is known, the distance between the central projection's imaging plane AB and the optical centre O, i.e. the length of segment OC, can be obtained from the focal length. The cylinder radius is a set value whose magnitude is not limited; optionally, in FIG. 5 the length OC is used as the cylinder radius.
As shown in FIG. 5, point D is the intersection of segment AO with the arc of circle O; that is, after the cylindrical projection transform, point A in the image corresponds to point D on the cylindrical projection surface. Similarly, F corresponds to I, C to C, and B to E. If point F represents a building feature vertical line, the intersection I of line FO with the circle centred at O is a feature point of the geometric-semantic descriptor. The building feature vertical lines in the cylindrical projection are the feature points of the geometric-semantic descriptor in the embodiments of this application.
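A minimal sketch of the central-to-cylindrical mapping above: an image column x maps to a horizontal azimuth on the cylinder through the focal length. The function name and parameters are assumptions for illustration, not the patent's notation.

```python
import math

def column_to_azimuth(x, cx, focal):
    """Map an image column x (central projection, principal point cx,
    focal length in pixels) to its horizontal azimuth in degrees on a
    cylinder centred at the optical centre O. In FIG. 5 terms, a point F
    on segment AB projects to point I on the arc; the azimuth of I is
    atan((x - cx) / f), measured from the optical axis OC."""
    return math.degrees(math.atan2(x - cx, focal))
```

A column at the principal point maps to 0 degrees, and columns one focal length to either side map to ±45 degrees.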
It should be noted that the execution order of step 1003 and step 1002 is not limited: step 1002 may be performed before step 1003, or step 1003 before step 1002. The two cases are described below.
1. The building's feature vertical lines are first extracted from the image, and the cylindrical projection conversion is performed afterwards.
In this case only the obtained feature vertical lines need be converted by cylindrical projection. If the building's feature vertical lines have been obtained from each of several overlapping images, feature vertical lines that appear repeatedly in the images coincide after cylindrical projection. Referring to FIG. 6, a schematic diagram of cylindrical projection of multiple images: P1, P2, and P3 represent three images captured at the same position, and point A is a building feature vertical line extracted from P1. Because P1 and P2 have partially overlapping image information, the feature vertical line represented by A and the feature vertical line represented by B in P2 correspond to the same facade intersection line of the same physical building, so when the central projections are converted to the cylindrical projection, both A and B project to point C on the cylindrical projection.
2. The cylindrical projection conversion is performed first, and the building's feature vertical lines are then extracted from the cylindrical-projection image.
In this case the entire central-projection image must be converted into a cylindrical-projection image, from which the building's feature vertical lines are then extracted.
It should be noted that when several images with overlapping information need to be stitched, cylindrical projection is usually performed before stitching. The captured images are all central projections whose imaging planes are flat; stitching them directly may produce large distortions in the result due to projective deformation. The central projections are therefore first converted onto the cylindrical projection surface so that all images to be processed share a consistent projection surface. Since the multiple images are captured by the same acquisition device at the same position, for outdoor long-range shots the position where the terminal captures the images can be taken approximately as the position of the camera's optical centre; because the images are taken standing at the same spot, they can be treated as single-viewpoint rotations and can therefore be converted into the same coordinate system.
Optionally, when overlapping images are stitched, visible seams often appear because the brightness of the captures may differ, so image fusion is needed to eliminate the stitching seams. For example, when three images are stitched, the first image can be used as the reference image, and the second and third images are stitched into the first image's coordinate system in turn. There are many fusion methods, which are not limited here. Optionally, the fade-in/fade-out method is used: the closer a pixel is to the stitching edge, the larger the weight of the image being stitched in and the smaller the weight of the already-stitched image, and the weighted average is taken to obtain the stitched image.
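The fade-in/fade-out fusion described above can be sketched for two single-channel images with a known column overlap. The linear weighting and all names below are illustrative assumptions, not the patent's exact procedure.

```python
import numpy as np

def feather_blend(left, right, overlap):
    """Blend two horizontally adjacent grayscale images whose last/first
    `overlap` columns cover the same scene, using fade-in/fade-out
    weights: the closer a column is to a seam edge, the more weight the
    image on that side receives, removing the visible seam caused by
    differing exposure."""
    h = left.shape[0]
    out_w = left.shape[1] + right.shape[1] - overlap
    out = np.zeros((h, out_w), dtype=float)
    out[:, : left.shape[1] - overlap] = left[:, :-overlap]
    out[:, left.shape[1]:] = right[:, overlap:]
    # weights go linearly from 1 -> 0 for the left image across the overlap
    w = np.linspace(1.0, 0.0, overlap)
    seam = w * left[:, -overlap:] + (1.0 - w) * right[:, :overlap]
    out[:, left.shape[1] - overlap : left.shape[1]] = seam
    return out
```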
1004. Generate the geometric-semantic descriptor.
The descriptor comprises information on the horizontal view angle between a first feature vertical line and a second feature vertical line in the image; the first feature vertical line indicates a first facade intersection line of the building, and the second feature vertical line indicates a second facade intersection line of the building. For an image, a unique viewpoint can be determined, namely the optical centre of the camera that captured it. Observed from this viewpoint, the horizontal view angle between the first feature vertical line and the second feature vertical line can be determined on the horizontal plane: the first feature vertical line has a first intersection point with the horizontal plane through the viewpoint, the second feature vertical line has a second intersection point with that plane, and connecting the first and second intersection points with the observation point forms an included angle, which is the horizontal view angle between the first and second feature vertical lines. The horizontal view angle information thus carries the relative positional information between the first and second feature vertical lines.
It should be noted that one or more buildings may be photographed in the image, and the first and second feature vertical lines may indicate facade intersection lines of the same building or of different buildings; this is not limited in the embodiments of this application.
In step 1003, the first feature point is obtained by projecting the first feature vertical line onto the cylinder whose axis is the gravity axis passing through the optical centre and then onto the horizontal plane containing the optical centre; the descriptor can be generated from the angular relationships between the lines joining the feature points and the optical centre. In the embodiments of this application this descriptor is also called the geometric-semantic descriptor.
Optionally, the building's first and second feature vertical lines extracted from the image yield, after cylindrical projection conversion, a first feature point and a second feature point on the circle centred on the optical centre O in the camera coordinate system's xOz plane, and the descriptor comprises the angular-interval information corresponding to the arc formed by the first and second feature points.
Optionally, the geometric-semantic descriptor comprises only the angle information between feature points. Optionally, the height of the building's vertical contour line in the image can also be recorded in the information corresponding to the feature point, in which case the geometric-semantic descriptor comprises the angle information between feature points and the height information of the feature points; the height information may be pixel height.
Optionally, if a sensor provides depth data, or the depth of a vertical contour line is obtained by techniques such as binocular depth computation or monocular depth recovery, the depth value can also be recorded in the information corresponding to the feature point. The geometric-semantic descriptor then comprises the angle information between feature points and the depth information of the feature points. The height information indicates the height of the facade intersection line, and the depth information indicates the distance from the image capture location to the corresponding building facade intersection line.
Optionally, the descriptor comprises a circular array, i.e. the descriptor's information is represented or stored by a circular array. The circular array contains first data indicating the first feature vertical line and second data indicating the second feature vertical line; the positional interval between the first data and the second data in the circular array indicates the horizontal view angle information between the first and second feature vertical lines.
Optionally, the descriptor is represented by a circular array that contains the first data, the second data, and third data, where the third data indicates that no feature vertical line of the building appears.
Optionally, the precision of the horizontal view angle information can be determined by actual needs, for example 1 degree or 0.1 degree; it is not specifically limited here.
For example, with an angular precision of 1 degree, the circular array contains 360 entries representing the feature-point information within the 360-degree range: the digit "1" indicates that a feature point exists at that angle, and "0" indicates that none was detected or none exists. Understandably, the field of view obtained from the image usually covers only part of the 360-degree field; the remaining field of view for which no image was acquired can be filled with "0". The geometric-semantic descriptor corresponding to the image is thus generated. The positions of the "1"s describe the spatial relationship of the building's vertical contour lines relative to the image capture point. If feature-point height and/or depth information needs to be recorded, the pixel height of the vertical contour line in the image can replace the "1"; likewise, monocularly estimated depths of the building's vertical contour lines, or the depths of the map building's vertical contour lines, can replace the "1". This raises the dimensionality of the information and increases robustness.
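The 360-entry circular array described above can be sketched as follows; the input format (a list of azimuth/height pairs) and the function name are assumptions for illustration.

```python
def build_descriptor(lines, resolution_deg=1.0, use_height=False):
    """Build the circular-array geometric-semantic descriptor.

    `lines` is a list of (azimuth_deg, pixel_height) pairs, one per
    extracted feature vertical line with azimuth in [0, 360).  Each line
    sets its bin to 1 (or to its pixel height when use_height=True);
    bins with no detected line keep the value 0, covering both the
    unseen part of the 360-degree field and directions with no facade
    intersection line."""
    n = int(round(360.0 / resolution_deg))
    desc = [0.0] * n
    for az, height in lines:
        idx = int(az / resolution_deg) % n
        desc[idx] = height if use_height else 1.0
    return desc
```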
可选的,该描述子以圆、圆上指示该第一特征竖线的第一特征点和指示该第二特征竖线第二特征点表示,该圆以该图像对应的光心为圆心,该第一特征点为该第一特征竖线投影在以经过该光心的重力轴为轴线的圆柱上,再投影至包含该光心的水平面上得到的点,该第二特征点为该第二特征竖线投影在该圆柱上,再投影至该水平面上得到的点,该第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示该水平视角的信息。
1005、获取约束信息;
终端获取约束信息包括拍摄该图像时的第一定位信息和/或方位信息。
第一定位信息,包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息,该定位信息精度较低,可以为视觉定位提供参考。在第一定位信息确定的第一位置范围内进一步根据几何语义描述子匹配的方式获取较第一定位信息更精确的视觉定位。
该终端根据该描述子在第一描述子数据库中匹配,以获取该定位信息,该第一描述子数据库包括第一备选点的地理位置和该第一备选点的描述子,该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的备选点。在该第一定位数据确定的第一位置范围内,将该描述子与预设的描述子数据库进行匹配得到,终端获取在该第一位置范围内的更精确的定位信息。
方位信息是指根据终端传感器获取的在拍摄该图像时的角度信息,可选的,该方位信息为根据磁力计获取的朝向角,代表了拍摄该图像时的朝向,具体为在东北天坐标系(east north up,ENU)坐标系的方向。可以理解的是,磁力计提供的该ENU坐标系的方向具有一定的误差,准确度较低,可以用于为视觉定位提供参考。该终端获取该图像的拍摄朝向信息;该终端根据该拍摄朝向信息确定的第一角度范围,将该描述子与预设的描述子数据库进行匹配,以获取该定位信息。
需要说明的是，步骤1005为可选步骤，可以执行，也可以不执行，此处不做限定。若执行步骤1005，通常获取的是执行步骤1001拍摄图像时的约束信息。
1006、匹配描述子并获取视觉定位信息;
预设的描述子数据库包括:备选点的地理位置和该备选点的描述子,备选点的描述子包括:以备选点为视点,周围建筑物的可视的立面交线的朝向信息。
根据图像的几何语义描述子,与描述子数据库中的备选点的描述子进行匹配,根据匹配的相似度获取与定位信息相关的备选点,根据备选点在地图中的坐标位置,可以确定图像拍摄点的视觉定位。即,描述子数据库中与图像的描述子匹配的描述子指示的备选点的地理位置为所述图像的拍摄地点的定位信息。
描述子数据库中包括了地图上多个备选点的描述子,具体的,以备选点的位置坐标为索引,备选点周围建筑物的特征竖线的角度信息存储在数据库。可选的,备选点的描述子还包括周围建筑物的特征竖线的深度和/或高度信息。
描述子的数据结构可能有多种，可选的，请参阅图7，本申请实施例中描述子的数据结构示意图，为按照1度精度，将360度环形数组展开成长度为360的一维数组。图像的描述子（photo descriptor）为当前坐标系下图像中建筑物轮廓竖线的角度信息数组，数字“1”的位置代表当前坐标系下可以获取到图像中建筑物轮廓竖线的角度。数据库中地图的描述子（map descriptor）为ENU坐标系下建筑物外环角点的角度信息数组，数字“1”的位置代表ENU坐标系下建筑物外环角点的角度。
例如map descriptor第4位的“1”是指当前地图位置ENU坐标系下4度朝向可以看到建筑物外环角点。Photo descriptor第1位的“1”是指ENU坐标系下1度朝向可以看到图像中建筑物轮廓竖线。数字为“0”的位置表示当前朝向上未检测到或者没有建筑物特征竖线。
需定位的图像可以直接与地图上的备选点通过描述子进行匹配定位。匹配的方式有多种,此处不做限定。
可选的，拍摄图像生成的描述子中，特征点的间隔角度与备选点的描述子统一到了一致的360度尺度上；但由于拍摄点的方位未知，两者的角度信息不在同一个坐标系中。为此，可以在首尾相接的圆环上滑动几何语义描述子数据，每次滑动形成一个新的带角度偏移的几何语义描述子，再与当前地图备选点的几何语义描述子计算相似度。根据描述子匹配的相似度，确定目标备选点，根据目标备选点的地理位置，终端可以获取该图像的拍摄地点的定位信息。
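上述在首尾相接的圆环上滑动描述子、逐偏移计算相似度的过程，可以用如下Python片段示意（相似度此处示例性地取互相关，即对应位置乘积之和；函数名为示例性假设，实际实现不限于此）：

```python
def match_offset(photo, cand):
    """在环形数组上滑动图像描述子，返回相似度最高的角度偏移及其得分。

    photo: 图像的环形描述子数组；cand: 某个备选点的环形描述子数组。
    每个偏移 off 相当于将 photo 整体旋转 off 个角度单位后与 cand 对齐。
    """
    n = len(photo)
    best_off, best_score = 0, float("-inf")
    for off in range(n):
        # 互相关：旋转后的 photo 与 cand 对应位置乘积之和
        score = sum(photo[(i - off) % n] * cand[i] for i in range(n))
        if score > best_score:
            best_off, best_score = off, score
    return best_off, best_score
```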
可选的，若获取了方位信息，可以仅在磁力计的误差范围对应的角度偏移内进行描述子匹配，以缩小滑动搜索的范围。
描述子相似度的计算方式有多种。可选的，通过离散傅里叶变换，将几何语义描述子转换到频域，计算二者的互功率谱；互功率谱经逆变换后近似为狄拉克函数，由其峰值可以获取图像的几何语义描述子相对于描述子数据库中某个备选点的描述子的偏移量和相关性。通过匹配定位计算，同时优化求解最优角度与最优位置，可以得到带有角度信息的定位结果。
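作为示意，下述Python片段以朴素DFT实现上述频域匹配思路（实际实现通常使用快速傅里叶变换以降低计算量；函数名为示例性假设）：互功率谱归一化后经逆变换近似为冲激函数，其峰值位置即为两描述子之间的角度偏移。

```python
import cmath

def dft(x):
    """朴素离散傅里叶变换（O(n^2)，仅作示意）。"""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def phase_correlate(photo, cand, eps=1e-9):
    """通过归一化互功率谱估计两个环形描述子之间的角度偏移。

    若 cand 为 photo 循环右移 k 位的结果，返回值即为 k。
    """
    n = len(photo)
    f, g = dft(photo), dft(cand)
    # 归一化互功率谱 conj(F)·G / |conj(F)·G|，幅值近零的频点置零以避免除零
    r = [(f[k].conjugate() * g[k]) / max(abs(f[k].conjugate() * g[k]), eps)
         for k in range(n)]
    # 逆变换（IDFT）后实部的峰值位置即为偏移量
    corr = [sum(r[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]
    return max(range(n), key=lambda t: corr[t])
```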
可选的,若步骤1005中获取了第一定位信息,第一定位信息可以确定粗略的定位为第一位置范围内,可以仅将该第一位置范围的备选点用于进行描述子匹配,可以减少计算量。
示例性的，请参阅图8，图8中的点A代表GPS定位的点，方框代表根据GPS精度信息确定的第一位置范围，本步骤可以在该方框的范围内进行描述子匹配，即将地图上属于该第一位置范围内的所有备选点的描述子与拍摄图像的描述子进行匹配。可选的，计算备选点与拍摄点相似度时，可以对第一位置范围内的所有备选点的得分绘制热力图，分数代表该备选点与拍摄点的相似度。建立预设大小的滑动窗口，遍历热力图，滑动窗口得分总和最高的点即为计算的定位结果。
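上述“建立预设大小的滑动窗口，遍历热力图”的做法可以用如下Python片段示意（窗口得分取窗口内备选点得分之和，窗口大小等参数为示例性假设）：

```python
def best_window_center(scores, win=3):
    """在得分热力图上滑动 win×win 窗口，返回窗口得分总和最高的窗口中心坐标。

    scores: 二维列表，scores[r][c] 为该备选点与拍摄点的相似度得分。
    """
    h, w = len(scores), len(scores[0])
    best, best_sum = (0, 0), float("-inf")
    for r in range(h - win + 1):
        for c in range(w - win + 1):
            # 当前窗口内所有得分之和
            s = sum(scores[r + i][c + j] for i in range(win) for j in range(win))
            if s > best_sum:
                best_sum, best = s, (r + win // 2, c + win // 2)
    return best
```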
下面介绍本申请实施例提供的视觉定位方法中,离线数据库的构建方法,包括:
请参阅图9,为本申请实施例中构建描述子数据库的一个实施例示意图。
数据库的构建过程,包括:
901、根据LOD模型在备选点进行采样;
基于卫星图像,根据网格简化算法,可以生成不同细节层次(level of detail,LOD)模型,示例性的,LOD0为建筑物俯视平面轮廓,LOD1为带有高度信息的建筑物立体轮廓,LOD2带有建筑物屋顶信息。
本实施例中,基于卫星图像生成LOD模型,提取道路层,在道路上每隔一定间隔选取备选点,该间隔例如可以为一米,具体间隔数值此处不做限定。请参阅图10,为本申请实施例中选取备选点的一个实施例示意图,图中道路上的点即为备选点。
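在道路折线上按固定间隔选取备选点的过程可以用如下Python片段示意（以平面坐标、间隔1米为例，函数名为示例性假设）：

```python
import math

def sample_along_polyline(points, step=1.0):
    """沿道路折线每隔 step 米取一个备选点（含起点）。

    points: 道路折线的 (x, y) 顶点列表，单位为米。
    """
    samples = [points[0]]
    residual = step  # 距下一个采样点还需前进的距离
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)  # 当前线段长度
        d = residual
        while d <= seg:
            t = d / seg  # 线段内插值比例
            samples.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += step
        residual = d - seg  # 剩余距离结转到下一线段
    return samples
```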
对于每一个备选点构建圆柱投影,获取360度展开的投影结果,示例性的,请参阅图11,为本申请实施例中根据LOD模型进行圆柱投影的示意图,横轴为角度,纵轴是周边建筑物的透视投影高度。构建圆柱投影时可设置不同的采集精度,例如,精度可以设置为1度、0.1度等,具体此处不做限定。
902、提取圆柱投影中建筑物的特征竖线;
识别模型投影结果中的轮廓竖线，作为几何语义描述子数据库的抽取特征。图11中加粗的线段，即为LOD模型中建筑物的立面交线经圆柱投影后获取的特征竖线。
903、构建几何语义描述子数据库;
记录地图中每一个备选点,以及该备选点的描述子,构建几何语义描述子数据库。
根据步骤902中获取的特征竖线生成描述子。备选点的描述子包括,以该备选点为参考点,周围的建筑物的可视立面交线的朝向角信息。
描述子数据库包括,备选点的地理位置和该备选点的描述子,该备选点的描述子包括,以该备选点为视点,周围建筑物的可视的立面交线的朝向信息。
示例性的:请参阅图12,为本申请实施例中生成备选点的描述子的示意图。
将LOD模型投影至水平面，图中的矩形分别代表建筑物ABCD、建筑物EFGH、建筑物IJKL和建筑物MNQP。如右上角的建筑物ABCD中，A点、B点、C点、D点分别代表建筑物ABCD的四条立面交线，每个点的坐标已知。点O为地图上选取的备选点，备选点是人为指定的地图坐标数据，由于地图数据作为输入源，其坐标是预先已知的。以备选点O点为例：站在O点，对于右上角建筑，A点被遮挡不可见，可以看到的建筑物立面交线为B点、C点、D点。分别连接B点、C点、D点与O点，连线与以O点为圆心的单位圆的交点即为几何语义描述子的一个特征点，在实际物理意义上，该特征点代表了建筑物的一条轮廓竖线。在ENU坐标系中，每个特征点与O点的连线与正北的夹角已知，此夹角的物理意义是站在O点看到该建筑物轮廓竖线的朝向，由此可以获取每个特征点的朝向角信息。
用此夹角作为描述特征的角度信息，用于后续的匹配定位。
b点、c点和d点记录下了站在O点,周围建筑物在O点产生的空间约束信息,具体的,包括以O点为参考点,建筑物立面交线的朝向角信息。此外,以b点、c点和d点的朝向角信息作为索引,还可以存储B点、C点、D点代表的立面交线,相对于点O的距离信息和高度信息。
计算机存储中，采用键值（key-value）结构存储数据。O点在地图中的位置（x，y）作为索引，描述子对360度单位圆进行采样。典型的，以1度为精度采样，则形成360个数据组成的数组：存在特征点的角度存储“1”，没有特征点的角度存储“0”。示例性的，图中Ob和正北的夹角是26.86度，则在数组的第27个位置记录“1”。
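上述由角点坐标计算朝向角并写入环形数组的过程可以用如下Python片段示意（ENU坐标系下以正北为0度、顺时针增大；函数名为示例性假设）：

```python
import math

def bearing_from_north(o, p):
    """返回在ENU坐标系下，从备选点 o 看向角点 p 的朝向角（正北为0度，顺时针增大）。"""
    de, dn = p[0] - o[0], p[1] - o[1]   # 东向分量、北向分量
    return math.degrees(math.atan2(de, dn)) % 360.0

def record_corner(desc, o, p, resolution=1.0):
    """在环形数组 desc 中，将角点 p 对应的朝向位置记为“1”，并返回该下标。"""
    idx = int(round(bearing_from_north(o, p) / resolution)) % len(desc)
    desc[idx] = 1
    return idx
```

例如，当夹角为26.86度、精度为1度时，四舍五入后写入数组的第27个位置。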
对地图上所有选定的备选点进行描述子生成即可产生描述子数据库。
可选的,由于备选点与建筑物立面交线的距离已知,备选点的描述子还可以记录下每个特征点的深度信息。
可选的,由于建筑物立面交线的透视投影高度已知,备选点的描述子还可以记录下每个特征点的高度信息。
在地图的道路约束下,可以仅在道路上选取备选点,生成所有道路上间隔采样的备选点的几何语义描述子。
本申请提供的视觉定位方法，在实现过程中，可以由终端侧进行描述子匹配，也可以由服务器侧进行描述子匹配，下面分别进行介绍。
一、由终端侧进行描述子匹配;
请参阅图13,为本申请实施例中一种视觉定位方法的一个实施例交互图;
1301、服务器构建几何语义描述子数据库;
1302、终端获取第一定位信息并发送给服务器;
1303、服务器确定第一子数据库并发送给终端;
1304、终端获取图像;
需要说明的是，步骤1304与步骤1302的执行顺序不做限定，可以先执行步骤1302，再执行步骤1304，也可以先执行步骤1304，再执行步骤1302。
1305、终端从图像中提取建筑物的特征竖线；
1306、终端生成几何语义描述子;
1307、终端在第一描述子数据库中进行匹配得到视觉定位结果。
步骤1301至1307的具体内容可以参考图1和图9对应的实施例,此处不再赘述。
二、由服务器侧进行描述子匹配;
请参阅图14,为本申请实施例中一种视觉定位方法的另一个实施例交互图;
1401、服务器构建几何语义描述子数据库;
1402、终端获取图像并发送给服务器;
1403、终端获取第一定位信息并发送给服务器;
需要说明的是,步骤1401至步骤1403的执行顺序不做限定。
1404、服务器从图像中提取建筑物的特征竖线;
1405、服务器生成几何语义描述子;
1406、服务器在第一描述子数据库中进行匹配,获取视觉定位结果并发送给终端。
步骤1401至1406的具体内容可以参考图1和图9对应的实施例,此处不再赘述。
上面介绍了本申请实施例提供的视觉定位方法,下面对实现该方法的终端和服务器进行介绍,请参阅图15,为本申请实施例中终端的一个实施例示意图。
本申请实施例提供的终端,包括:
获取单元1501,用于获取建筑物的图像;
生成单元1502,用于根据该图像生成描述子,该描述子包括该图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,该第一特征竖线指示该建筑物的第一立面交线,该第二特征竖线指示该建筑物的第二立面交线;
该获取单元1501还用于,根据该描述子在预设的描述子数据库中匹配,以获取该图像的拍摄地点的定位信息。
可选的，该生成单元1502具体用于：从该图像中提取该建筑物的该第一特征竖线和该第二特征竖线；根据该第一特征竖线和该第二特征竖线的位置关系生成该描述子，该位置关系包括该第一特征竖线和该第二特征竖线之间的水平视角的信息。
可选的,该描述子包括:该第一特征竖线和该第二特征竖线的高度信息和/或深度信息,该第一特征竖线的高度信息用于指示该第一立面交线的高度,该第二特征竖线的高度信息用于指示该第二立面交线的高度,该第一特征竖线的深度信息用于指示该图像的拍摄地点到该第一立面交线的距离,该第二特征竖线的深度信息用于指示该图像的拍摄地点到该第二立面交线的距离。
可选的，该描述子以环形数组表示，该环形数组中包括指示第一特征竖线的第一数据，和指示第二特征竖线的第二数据，该第一数据和该第二数据在该环形数组中的位置间隔用于指示该第一特征竖线和该第二特征竖线之间的水平视角信息。
可选的,该描述子以环形数组表示,该环形数组中包括该第一数据、该第二数据和第三数据,该第三数据用于指示未出现该建筑物的特征竖线。
可选的,该第一数据包括该第一特征竖线的高度信息和/或深度信息,该第二数据包括该第二特征竖线的高度信息和/或深度信息。
可选的，该描述子以圆、圆上指示该第一特征竖线的第一特征点和指示该第二特征竖线的第二特征点表示，该圆以该图像对应的光心为圆心，该第一特征点为该第一特征竖线投影在以经过该光心的重力轴为轴线的圆柱上，再投影至包含该光心的水平面上得到的点，该第二特征点为该第二特征竖线投影在该圆柱上，再投影至该水平面上得到的点，该第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示该水平视角的信息。
可选的,该预设的描述子数据库包括:备选点的地理位置和该备选点的描述子,该备选点的描述子包括,以该备选点为视点,周围建筑物的可视的立面交线的朝向信息。
可选的，该获取单元1501还用于：获取第一定位信息，该第一定位信息包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息；该获取单元1501具体用于：根据该描述子在第一描述子数据库中匹配，以获取该定位信息，该第一描述子数据库包括第一备选点的地理位置和该第一备选点的描述子，该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的备选点。
可选的，该获取单元1501还用于：获取第一定位信息，该第一定位信息包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息；
该终端还包括:
发送单元1503,用于将该第一定位信息发送给服务器,
接收单元1504，用于接收该服务器发送的该预设的描述子数据库，该预设的描述子数据库包括第一备选点的地理位置和该第一备选点的描述子，该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的备选点。
可选的,该获取单元1501还用于:获取该图像的拍摄朝向信息;
该获取单元1501具体用于:根据该拍摄朝向信息确定的第一角度范围,将该描述子与预设的描述子数据库进行匹配,以获取该定位信息。
可选的,该获取单元1501具体用于:获取在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,该第一地点和该第二地点为同一地点,或者该第一地点与该第二地点的距离小于预设阈值,该第一图像和该第二图像包括部分重复的图像信息。
可选的,该预设的描述子数据库根据卫星图像生成。
请参阅图16,为本申请实施例中服务器的一个实施例示意图。
本申请实施例提供的服务器,包括:
接收单元1601,用于接收终端发送的建筑物的图像;
生成单元1602,用于根据该图像生成描述子,该描述子包括该图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,该第一特征竖线指示该建筑物的第一立面交线,该第二特征竖线指示该建筑物的第二立面交线;
获取单元1603,用于根据该描述子在预设的描述子数据库中匹配,以获取该图像的拍摄地点的定位信息;
发送单元1604,用于将该定位信息发送给该终端。
可选的，该生成单元1602具体用于：从该图像中提取该建筑物的该第一特征竖线和该第二特征竖线；根据该第一特征竖线和该第二特征竖线的位置关系生成该描述子，该位置关系包括该第一特征竖线和该第二特征竖线之间的水平视角的信息。
可选的,该描述子包括:该第一特征竖线和该第二特征竖线的高度信息和/或深度信息,该第一特征竖线的高度信息用于指示该第一立面交线的高度,该第二特征竖线的高度信息用于指示该第二立面交线的高度,该第一特征竖线的深度信息用于指示该图像的拍摄地点到该第一立面交线的距离,该第二特征竖线的深度信息用于指示该图像的拍摄地点到该第二立面交线的距离。
可选的，该描述子以环形数组表示，该环形数组中包括指示第一特征竖线的第一数据，和指示第二特征竖线的第二数据，该第一数据和该第二数据在该环形数组中的位置间隔用于指示该第一特征竖线和该第二特征竖线之间的水平视角信息。
可选的,该描述子以环形数组表示,该环形数组中包括该第一数据、该第二数据和第三数据,该第三数据用于指示未出现该建筑物的特征竖线。
可选的,该第一数据包括该第一特征竖线的高度信息和/或深度信息,该第二数据包括该第二特征竖线的高度信息和/或深度信息。
可选的，该描述子以圆、圆上指示该第一特征竖线的第一特征点和指示该第二特征竖线的第二特征点表示，该圆以该图像对应的光心为圆心，该第一特征点为该第一特征竖线投影在以经过该光心的重力轴为轴线的圆柱上，再投影至包含该光心的水平面上得到的点，该第二特征点为该第二特征竖线投影在该圆柱上，再投影至该水平面上得到的点，该第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示该水平视角的信息。
可选的,该预设的描述子数据库包括:备选点的地理位置和该备选点的描述子,该备选点的描述子包括,以该备选点为视点,周围建筑物的可视的立面交线的朝向信息。
可选的,该获取单元1603还用于:获取该终端发送的第一定位信息,该第一定位信息包括根据GPS信号、wifi信号、为该终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;
可选的,该获取单元1603具体用于:根据该描述子在第一描述子数据库中匹配,以获取该定位信息,该第一描述子数据库包括第一备选点的地理位置和该第一备选点的描述子,该第一备选点为地理位置在该第一定位信息对应的第一位置范围内的备选点。
可选的,该获取单元1603还用于:获取该终端发送的该图像的拍摄朝向信息;
该获取单元1603具体用于:根据该拍摄朝向信息确定的第一角度范围,将该描述子与预设的描述子数据库进行匹配,以获取该定位信息。
可选的,该接收单元1601具体用于:接收在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,该第一地点和该第二地点为同一地点,或者该第一地点与该第二地点的距离小于预设阈值,该第一图像和该第二图像包括部分重复的图像信息。
可选的,该预设的描述子数据库根据卫星图像生成。
请参阅图17,为本申请实施例中终端的另一个实施例示意图。
图17示出的是与本申请实施例提供的终端的部分结构的框图。该终端包括:图像采集单元1710、全球卫星定位系统(global positioning system,GPS)模块1720、传感器1730、显示单元1740、输入单元1750、存储器1760、处理器1770、以及电源1780等部件。本领域技术人员可以理解,图17中示出的终端结构并不构成对终端的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面结合图17对终端的各个构成部件进行具体的介绍:
图像采集单元1710,用于采集图像,在本申请实施例中用于获取建筑物的图像,例如可以是单目相机、双目相机、深度相机或激光雷达等,此处对于图像采集单元的具体类型不做限定。
GPS模块1720,基于卫星的全球导航系统,为用户提供定位或导航,相较视觉定位,精度较低。本申请实施例中可用于辅助定位。
终端还可包括至少一种传感器1730,例如磁力计1731、惯性测量单元(inertial measurement unit,IMU)1732,惯性测量单元是测量物体三轴姿态角(或角速率)以及加速度的装置。一般的,一个IMU包含了三个单轴的加速度计和三个单轴的陀螺,加速度计检测物体在载体坐标系统独立三轴的加速度信号,而陀螺检测载体相对于导航坐标系的角速度信号,测量物体在三维空间中的角速度和加速度,并以此解算出物体的姿态。为了提高可靠性,还可以为每个轴配备更多的传感器。此外,终端还可以包括振动识别相关功能(比如计步器、敲击)等。此外,还可以包括比如光传感器、运动传感器以及其他传感器。具体地,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板1741的亮度,接近传感器可在终端移动到耳边时,关闭显示面板1741和/或背光。至于终端还可配置的陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
显示单元1740可用于显示由用户输入的信息或提供给用户的信息以及终端的各种菜单。显示单元1740可包括显示面板1741，可选的，可以采用液晶显示器（liquid crystal display，LCD）、有机发光二极管（organic light-emitting diode，OLED）等形式来配置显示面板1741。进一步的，触控面板1751可覆盖显示面板1741，当触控面板1751检测到在其上或附近的触摸操作后，传送给处理器1770以确定触摸事件的类型，随后处理器1770根据触摸事件的类型在显示面板1741上提供相应的视觉输出。虽然在图17中，触控面板1751与显示面板1741是作为两个独立的部件来实现终端的输入和输出功能，但是在某些实施例中，可以将触控面板1751与显示面板1741集成而实现终端的输入和输出功能。
输入单元1750可用于接收输入的数字或字符信息,以及产生与终端的用户设置以及功能控制有关的键信号输入。具体地,输入单元1750可包括触控面板1751以及其他输入设备1752。触控面板1751,也称为触摸屏,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触控面板1751上或在触控面板1751附近的操作),并根据预先设定的程式驱动相应的连接装置。可选的,触控面板1751可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器1770,并能接收处理器1770发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板1751。除了触控面板1751,输入单元1750还可以包括其他输入设备1752。具体地,其他输入设备1752可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。
存储器1760可用于存储软件程序以及模块,处理器1770通过运行存储在存储器1760的软件程序以及模块,从而执行终端的各种功能应用以及数据处理。存储器1760可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据终端的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器1760可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。
处理器1770是终端的控制中心，利用各种接口和线路连接整个终端的各个部分，通过运行或执行存储在存储器1760内的软件程序和/或模块，以及调用存储在存储器1760内的数据，执行终端的各种功能和处理数据，从而对终端进行整体监控。可选的，处理器1770可包括一个或多个处理单元；优选的，处理器1770可集成应用处理器和调制解调处理器，其中，应用处理器主要处理操作系统、用户界面和应用程序等，调制解调处理器主要处理无线通信。可以理解的是，上述调制解调处理器也可以不集成到处理器1770中。
终端还包括给各个部件供电的电源1780(比如电池),优选的,电源可以通过电源管理系统与处理器1770逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
尽管未示出,可选的,终端可以包括音频电路,音频电路包括扬声器和传声器,可提供用户与终端之间的音频接口。
尽管未示出，可选的，终端可以包括无线保真（wireless fidelity，WiFi）模块，WiFi属于短距离无线传输技术，终端通过WiFi模块可以帮助用户收发电子邮件、浏览网页和访问流式媒体等，它为用户提供了无线的宽带互联网访问。
尽管未示出,可选的,终端还可以包括射频(radio frequency,RF)电路。
尽管未示出,终端还可以包括蓝牙模块等,在此不再赘述。
在本申请实施例中，该终端所包括的处理器1770还具有实现上述视觉定位方法的功能。
请参阅图18，为本申请实施例中一种服务器的另一个实施例示意图。本申请实施例中对服务器的具体设备形态不做限定。
该服务器1800可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上处理器1801和存储器1802,该存储器1802中存储有程序或数据。
其中，存储器1802可以是易失性存储或非易失性存储。可选地，处理器1801是一个或多个中央处理器（central processing unit，CPU），该CPU可以是单核CPU，也可以是多核CPU。处理器1801可以与存储器1802通信，在服务器1800上执行存储器1802中的一系列指令。
该服务器1800还包括一个或一个以上有线或无线网络接口1803,例如以太网接口。
可选地,尽管图18中未示出,服务器1800还可以包括一个或一个以上电源;一个或一个以上输入输出接口,输入输出接口可以用于连接显示器、鼠标、键盘、触摸屏设备或传感设备等,输入输出接口为可选部件,可以存在也可以不存在,此处不做限定。
本实施例中服务器1800中的处理器1801所执行的流程可以参考前述方法实施例中描述的方法流程，此处不再赘述。
上述终端设备还可以为增强现实(augmented reality,AR)设备,请参阅图19,图19是本申请实施例公开的一种AR设备的结构示意图。
如图19所示，该AR设备包括处理器1901，该处理器1901可以耦合一个或多个存储介质。该存储介质包括存储媒介1911和至少一个存储器1902。该存储媒介1911可以是只读的，如只读存储器（read-only memory，ROM），或者可读/可写的硬盘或闪存。存储器1902例如可以是随机存取存储器（random access memory，RAM）。该存储器1902可以与处理器1901结合，或者，集成于处理器1901中，或者，由一个独立单元或多个单元构成。该处理器1901是该AR设备的控制中心，具体提供用于执行指令、完成中断事件、提供时间功能以及其他诸多功能的时间序列及工艺设备。可选的，该处理器1901包括一个或多个中央处理单元CPU，如CPU0和CPU1。可选的，该AR设备还可以包括多个处理器，每一个处理器可以是单核或多核。除非特别说明，本申请所描述的处理器或存储器的具体实现方式包括通用组件或专用组件，该通用组件在特定时刻被配置用于执行某一任务，该专用组件被生产用于执行专用任务。本申请实施例所描述的处理器至少包括一个电子设备、电路、和/或被配置为处理数据（如计算机程序指令）的处理器芯片。该处理器1901，或该处理器1901中的单个CPU所执行的程序代码可以存储在该存储器1902中。
进一步地,该AR设备还包括前置摄像头1903、前置测距仪1904、后置摄像头1905、后置测距仪1906,输出模块1907(如光学投影仪或激光投影仪等)和/或通信接口1908。其中,该前置摄像头1903、该前置测距仪1904、该后置摄像头1905、该后置测距仪1906以及该输出模块1907与该处理器1901耦合。此外,该AR设备还可以包括接收/发送电路1909和天线1910。该接收/发送电路1909和天线1910用于实现AR设备与外部网络的连接。该AR设备的组成单元可以通过通信总线相互耦合,该通信总线至少包括以下任意一种:数据总线、地址总线、控制总线、扩展总线和局部总线。需要注意的是,该AR设备仅仅是本申请实施例公开的一种示例实体装置形态,本申请实施例对AR设备的具体形态不做唯一限定。
该AR设备的处理器1901能够耦合该至少一个存储器1902，该存储器1902中预存有程序代码，该程序代码具体包括图像获取模块、参数检测模块、系数确定模块、图像裁剪模块、影像生成模块、影像显示模块，该存储器1902还进一步存储有内核模块，该内核模块包括操作系统（如WINDOWS™、ANDROID™、IOS™等）。
该AR设备的处理器1901被配置用于调用该程序代码,以执行本申请实施例中的定位方法。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,该单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
该作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以 采用硬件的形式实现,也可以采用软件功能单元的形式实现。
该集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例该方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。

Claims (50)

  1. 一种视觉定位方法,其特征在于,包括:
    终端获取建筑物的图像;
    所述终端根据所述图像生成描述子,所述描述子包括所述图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,所述第一特征竖线指示所述建筑物的第一立面交线,所述第二特征竖线指示所述建筑物的第二立面交线;
    所述终端根据所述描述子在预设的描述子数据库中匹配,以获取所述图像的拍摄地点的定位信息,所述预设的描述子数据库包括:备选点的地理位置和所述备选点的描述子,所述备选点的描述子包括,以所述备选点为视点,周围建筑物的可视的立面交线的朝向信息,其中,所述描述子数据库中,与所述图像的描述子匹配的描述子指示的备选点的地理位置为所述图像的拍摄地点的定位信息。
  2. 根据权利要求1所述的方法,其特征在于,所述终端根据所述图像生成描述子包括:
    所述终端从所述图像中提取所述建筑物的所述第一特征竖线和所述第二特征竖线;
    所述终端根据所述第一特征竖线和所述第二特征竖线的位置关系生成所述描述子，所述位置关系包括所述第一特征竖线和所述第二特征竖线之间的水平视角的信息。
  3. 根据权利要求1或2所述的方法,其特征在于,所述描述子包括:
    所述第一特征竖线和所述第二特征竖线的高度信息和/或深度信息,所述第一特征竖线的高度信息用于指示所述第一立面交线的高度,所述第二特征竖线的高度信息用于指示所述第二立面交线的高度,所述第一特征竖线的深度信息用于指示所述图像的拍摄地点到所述第一立面交线的距离,所述第二特征竖线的深度信息用于指示所述图像的拍摄地点到所述第二立面交线的距离。
  4. 根据权利要求1至3中任一项所述的方法,其特征在于,
    所述描述子以环形数组表示，所述环形数组中包括指示第一特征竖线的第一数据，和指示第二特征竖线的第二数据，所述第一数据和所述第二数据在所述环形数组中的位置间隔用于指示所述第一特征竖线和所述第二特征竖线之间的水平视角信息。
  5. 根据权利要求4所述的方法,其特征在于,所述描述子以环形数组表示,所述环形数组中包括所述第一数据、所述第二数据和第三数据,所述第三数据用于指示未出现所述建筑物的特征竖线。
  6. 根据权利要求4或5所述的方法,其特征在于,所述第一数据包括所述第一特征竖线的高度信息和/或深度信息,所述第二数据包括所述第二特征竖线的高度信息和/或深度信息。
  7. 根据权利要求1至3中任一项所述的方法,其特征在于,
    所述描述子以圆、圆上指示所述第一特征竖线的第一特征点和指示所述第二特征竖线的第二特征点表示，所述圆以所述图像对应的光心为圆心，所述第一特征点为所述第一特征竖线投影在以经过所述光心的重力轴为轴线的圆柱上，再投影至包含所述光心的水平面上得到的点，所述第二特征点为所述第二特征竖线投影在所述圆柱上，再投影至所述水平面上得到的点，所述第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示所述水平视角的信息。
  8. 根据权利要求1至7中任一项所述的方法,其特征在于,所述方法还包括:
    所述终端获取第一定位信息,所述第一定位信息包括根据GPS信号、wifi信号、为所述终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;
    所述终端根据所述描述子在预设的描述子数据库中匹配,以获取所述图像的拍摄地点的定位信息包括:
    所述终端根据所述描述子在第一描述子数据库中匹配,以获取所述定位信息,所述第一描述子数据库包括第一备选点的地理位置和所述第一备选点的描述子,所述第一备选点为地理位置在所述第一定位信息对应的第一位置范围内的备选点。
  9. 根据权利要求1至8中任一项所述的方法,其特征在于,所述方法还包括:
    所述终端获取第一定位信息,所述第一定位信息包括根据GPS信号、wifi信号、为所述终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;
    所述终端将所述第一定位信息发送给服务器,
    所述终端接收所述服务器发送的所述预设的描述子数据库，所述预设的描述子数据库包括第一备选点的地理位置和所述第一备选点的描述子，所述第一备选点为地理位置在所述第一定位信息对应的第一位置范围内的所述备选点。
  10. 根据权利要求1至9中任一项所述的方法,其特征在于,所述方法还包括:
    所述终端获取所述图像的拍摄朝向信息;
    所述终端根据所述描述子在预设的描述子数据库中匹配,以获取所述图像的拍摄地点的定位信息包括:
    所述终端根据所述拍摄朝向信息确定的第一角度范围,将所述描述子与预设的描述子数据库进行匹配,以获取所述定位信息。
  11. 根据权利要求1至10中任一项所述的方法,其特征在于,所述终端获取建筑物的图像包括:
    所述终端获取在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,所述第一地点和所述第二地点为同一地点,或者所述第一地点与所述第二地点的距离小于预设阈值,所述第一图像和所述第二图像包括部分重复的图像信息。
  12. 根据权利要求1至11中任一项所述的方法,其特征在于,所述预设的描述子数据库根据卫星图像生成。
  13. 一种视觉定位方法,其特征在于,包括:
    服务器接收终端发送的建筑物的图像;
    所述服务器根据所述图像生成描述子,所述描述子包括所述图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,所述第一特征竖线指示所述建筑物的第一立面交线,所述第二特征竖线指示所述建筑物的第二立面交线;
    所述服务器根据所述描述子在预设的描述子数据库中匹配,以获取所述图像的拍摄地点的定位信息,所述预设的描述子数据库包括:备选点的地理位置和所述备选点的描述子,所述备选点的描述子包括,以所述备选点为视点,周围建筑物的可视的立面交线的朝向信息,其中,所述描述子数据库中,与所述图像的描述子匹配的描述子指示的备选点的地理位置为所述图像的拍摄地点的定位信息;
    所述服务器将所述定位信息发送给所述终端。
  14. 根据权利要求13所述的方法,其特征在于,所述服务器根据所述图像生成描述子包括:
    所述服务器从所述图像中提取所述建筑物的所述第一特征竖线和所述第二特征竖线;
    所述服务器根据所述第一特征竖线和所述第二特征竖线的位置关系生成所述描述子，所述位置关系包括所述第一特征竖线和所述第二特征竖线之间的水平视角的信息。
  15. 根据权利要求13或14所述的方法,其特征在于,所述描述子包括:
    所述第一特征竖线和所述第二特征竖线的高度信息和/或深度信息,所述第一特征竖线的高度信息用于指示所述第一立面交线的高度,所述第二特征竖线的高度信息用于指示所述第二立面交线的高度,所述第一特征竖线的深度信息用于指示所述图像的拍摄地点到所述第一立面交线的距离,所述第二特征竖线的深度信息用于指示所述图像的拍摄地点到所述第二立面交线的距离。
  16. 根据权利要求13至15中任一项所述的方法,其特征在于,
    所述描述子以环形数组表示，所述环形数组中包括指示第一特征竖线的第一数据，和指示第二特征竖线的第二数据，所述第一数据和所述第二数据在所述环形数组中的位置间隔用于指示所述第一特征竖线和所述第二特征竖线之间的水平视角信息。
  17. 根据权利要求16所述的方法,其特征在于,所述描述子以环形数组表示,所述环形数组中包括所述第一数据、所述第二数据和第三数据,所述第三数据用于指示未出现所述建筑物的特征竖线。
  18. 根据权利要求16或17所述的方法,其特征在于,所述第一数据包括所述第一特征竖线的高度信息和/或深度信息,所述第二数据包括所述第二特征竖线的高度信息和/或深度信息。
  19. 根据权利要求13至15中任一项所述的方法,其特征在于,
    所述描述子以圆、圆上指示所述第一特征竖线的第一特征点和指示所述第二特征竖线的第二特征点表示，所述圆以所述图像对应的光心为圆心，所述第一特征点为所述第一特征竖线投影在以经过所述光心的重力轴为轴线的圆柱上，再投影至包含所述光心的水平面上得到的点，所述第二特征点为所述第二特征竖线投影在所述圆柱上，再投影至所述水平面上得到的点，所述第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示所述水平视角的信息。
  20. 根据权利要求13至19中任一项所述的方法,其特征在于,所述方法还包括:
    所述服务器获取所述终端发送的第一定位信息,所述第一定位信息包括根据GPS信号、wifi信号、为所述终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;
    所述服务器根据所述描述子在预设的描述子数据库中匹配,以获取所述图像的拍摄地点的定位信息包括:
    所述服务器根据所述描述子在第一描述子数据库中匹配,以获取所述定位信息,所述第一描述子数据库包括第一备选点的地理位置和所述第一备选点的描述子,所述第一备选点为地理位置在所述第一定位信息对应的第一位置范围内的备选点。
  21. 根据权利要求13至20中任一项所述的方法,其特征在于,所述方法还包括:
    所述服务器获取所述终端发送的所述图像的拍摄朝向信息;
    所述服务器根据所述描述子在预设的描述子数据库中匹配,以获取所述图像的拍摄地点的定位信息包括:
    所述服务器根据所述拍摄朝向信息确定的第一角度范围，将所述描述子与预设的描述子数据库进行匹配，以获取所述定位信息。
  22. 根据权利要求13至21中任一项所述的方法,其特征在于,所述服务器接收终端发送的建筑物的图像包括:
    所述服务器接收在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,所述第一地点和所述第二地点为同一地点,或者所述第一地点与所述第二地点的距离小于预设阈值,所述第一图像和所述第二图像包括部分重复的图像信息。
  23. 根据权利要求13至22中任一项所述的方法,其特征在于,所述预设的描述子数据库根据卫星图像生成。
  24. 一种终端,其特征在于,包括:
    获取单元,用于获取建筑物的图像;
    生成单元,用于根据所述图像生成描述子,所述描述子包括所述图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,所述第一特征竖线指示所述建筑物的第一立面交线,所述第二特征竖线指示所述建筑物的第二立面交线;
    所述获取单元还用于,根据所述描述子在预设的描述子数据库中匹配,以获取所述图像的拍摄地点的定位信息,所述预设的描述子数据库包括:备选点的地理位置和所述备选点的描述子,所述备选点的描述子包括,以所述备选点为视点,周围建筑物的可视的立面交线的朝向信息,其中,所述描述子数据库中,与所述图像的描述子匹配的描述子指示的备选点的地理位置为所述图像的拍摄地点的定位信息。
  25. 根据权利要求24所述的终端,其特征在于,所述生成单元具体用于:
    从所述图像中提取所述建筑物的所述第一特征竖线和所述第二特征竖线;
    根据所述第一特征竖线和所述第二特征竖线的位置关系生成所述描述子，所述位置关系包括所述第一特征竖线和所述第二特征竖线之间的水平视角的信息。
  26. 根据权利要求24或25所述的终端,其特征在于,所述描述子包括:
    所述第一特征竖线和所述第二特征竖线的高度信息和/或深度信息,所述第一特征竖线的高度信息用于指示所述第一立面交线的高度,所述第二特征竖线的高度信息用于指示所述第二立面交线的高度,所述第一特征竖线的深度信息用于指示所述图像的拍摄地点到所述第一立面交线的距离,所述第二特征竖线的深度信息用于指示所述图像的拍摄地点到所述第二立面交线的距离。
  27. 根据权利要求24至26中任一项所述的终端,其特征在于,
    所述描述子以环形数组表示，所述环形数组中包括指示第一特征竖线的第一数据，和指示第二特征竖线的第二数据，所述第一数据和所述第二数据在所述环形数组中的位置间隔用于指示所述第一特征竖线和所述第二特征竖线之间的水平视角信息。
  28. 根据权利要求27所述的终端,其特征在于,所述描述子以环形数组表示,所述环形数组中包括所述第一数据、所述第二数据和第三数据,所述第三数据用于指示未出现所述建筑物的特征竖线。
  29. 根据权利要求27或28所述的终端,其特征在于,所述第一数据包括所述第一特征竖线的高度信息和/或深度信息,所述第二数据包括所述第二特征竖线的高度信息和/或深度信息。
  30. 根据权利要求24至26中任一项所述的终端,其特征在于,
    所述描述子以圆、圆上指示所述第一特征竖线的第一特征点和指示所述第二特征竖线的第二特征点表示，所述圆以所述图像对应的光心为圆心，所述第一特征点为所述第一特征竖线投影在以经过所述光心的重力轴为轴线的圆柱上，再投影至包含所述光心的水平面上得到的点，所述第二特征点为所述第二特征竖线投影在所述圆柱上，再投影至所述水平面上得到的点，所述第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示所述水平视角的信息。
  31. 根据权利要求24至30中任一项所述的终端,其特征在于,所述获取单元还用于:
    获取第一定位信息,所述第一定位信息包括根据GPS信号、wifi信号、为所述终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;
    所述获取单元具体用于:
    根据所述描述子在第一描述子数据库中匹配,以获取所述定位信息,所述第一描述子数据库包括第一备选点的地理位置和所述第一备选点的描述子,所述第一备选点为地理位置在所述第一定位信息对应的第一位置范围内的备选点。
  32. 根据权利要求24至31中任一项所述的终端,其特征在于,所述获取单元还用于:
    获取第一定位信息,所述第一定位信息包括根据GPS信号、wifi信号、为所述终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;
    所述终端还包括:
    发送单元,用于将所述第一定位信息发送给服务器,
    接收单元，用于接收所述服务器发送的所述预设的描述子数据库，所述预设的描述子数据库包括第一备选点的地理位置和所述第一备选点的描述子，所述第一备选点为地理位置在所述第一定位信息对应的第一位置范围内的所述备选点。
  33. 根据权利要求24至32中任一项所述的终端,其特征在于,所述获取单元还用于:
    获取所述图像的拍摄朝向信息;
    所述获取单元具体用于:
    根据所述拍摄朝向信息确定的第一角度范围,将所述描述子与预设的描述子数据库进行匹配,以获取所述定位信息。
  34. 根据权利要求24至33中任一项所述的终端,其特征在于,所述获取单元具体用于:
    获取在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,所述第一地点和所述第二地点为同一地点,或者所述第一地点与所述第二地点的距离小于预设阈值,所述第一图像和所述第二图像包括部分重复的图像信息。
  35. 根据权利要求24至34中任一项所述的终端,其特征在于,所述预设的描述子数据库根据卫星图像生成。
  36. 一种服务器,其特征在于,包括:
    接收单元,用于接收终端发送的建筑物的图像;
    生成单元,用于根据所述图像生成描述子,所述描述子包括所述图像中的第一特征竖线和第二特征竖线之间的水平视角的信息,所述第一特征竖线指示所述建筑物的第一立面交线,所述第二特征竖线指示所述建筑物的第二立面交线;
    获取单元,用于根据所述描述子在预设的描述子数据库中匹配,以获取所述图像的拍摄地点的定位信息,所述预设的描述子数据库包括:备选点的地理位置和所述备选点的描述子,所述备选点的描述子包括,以所述备选点为视点,周围建筑物的可视的立面交线的朝向信息,其中,所述描述子数据库中,与所述图像的描述子匹配的描述子指示的备选点的地理位置为所述图像的拍摄地点的定位信息;
    发送单元,用于将所述定位信息发送给所述终端。
  37. 根据权利要求36所述的服务器,其特征在于,所述生成单元具体用于:
    从所述图像中提取所述建筑物的所述第一特征竖线和所述第二特征竖线;
    根据所述第一特征竖线和所述第二特征竖线的位置关系生成所述描述子，所述位置关系包括所述第一特征竖线和所述第二特征竖线之间的水平视角的信息。
  38. 根据权利要求36或37所述的服务器,其特征在于,所述描述子包括:
    所述第一特征竖线和所述第二特征竖线的高度信息和/或深度信息,所述第一特征竖线的高度信息用于指示所述第一立面交线的高度,所述第二特征竖线的高度信息用于指示所述第二立面交线的高度,所述第一特征竖线的深度信息用于指示所述图像的拍摄地点到所述第一立面交线的距离,所述第二特征竖线的深度信息用于指示所述图像的拍摄地点到所述第二立面交线的距离。
  39. 根据权利要求36至38中任一项所述的服务器,其特征在于,
    所述描述子以环形数组表示，所述环形数组中包括指示第一特征竖线的第一数据，和指示第二特征竖线的第二数据，所述第一数据和所述第二数据在所述环形数组中的位置间隔用于指示所述第一特征竖线和所述第二特征竖线之间的水平视角信息。
  40. 根据权利要求39所述的服务器,其特征在于,所述描述子以环形数组表示,所述环形数组中包括所述第一数据、所述第二数据和第三数据,所述第三数据用于指示未出现所述建筑物的特征竖线。
  41. 根据权利要求39或40所述的服务器,其特征在于,所述第一数据包括所述第一特征竖线的高度信息和/或深度信息,所述第二数据包括所述第二特征竖线的高度信息和/或深度信息。
  42. 根据权利要求36至38中任一项所述的服务器,其特征在于,
    所述描述子以圆、圆上指示所述第一特征竖线的第一特征点和指示所述第二特征竖线的第二特征点表示，所述圆以所述图像对应的光心为圆心，所述第一特征点为所述第一特征竖线投影在以经过所述光心的重力轴为轴线的圆柱上，再投影至包含所述光心的水平面上得到的点，所述第二特征点为所述第二特征竖线投影在所述圆柱上，再投影至所述水平面上得到的点，所述第一特征点和圆心的连线与第二特征点和圆心的连线之间的夹角用于指示所述水平视角的信息。
  43. 根据权利要求36至42中任一项所述的服务器,其特征在于,所述获取单元还用于:
    获取所述终端发送的第一定位信息,所述第一定位信息包括根据GPS信号、wifi信号、 为所述终端提供服务的基站的位置信息或用户手动输入地址获取的定位信息;
    所述获取单元具体用于:
    根据所述描述子在第一描述子数据库中匹配,以获取所述定位信息,所述第一描述子数据库包括第一备选点的地理位置和所述第一备选点的描述子,所述第一备选点为地理位置在所述第一定位信息对应的第一位置范围内的备选点。
  44. 根据权利要求36至43中任一项所述的服务器,其特征在于,所述获取单元还用于:
    获取所述终端发送的所述图像的拍摄朝向信息;
    所述获取单元具体用于:
    根据所述拍摄朝向信息确定的第一角度范围,将所述描述子与预设的描述子数据库进行匹配,以获取所述定位信息。
  45. 根据权利要求36至44中任一项所述的服务器,其特征在于,所述接收单元具体用于:
    接收在第一地点拍摄的第一图像和在第二地点拍摄的第二图像,所述第一地点和所述第二地点为同一地点,或者所述第一地点与所述第二地点的距离小于预设阈值,所述第一图像和所述第二图像包括部分重复的图像信息。
  46. 根据权利要求36至45中任一项所述的服务器,其特征在于,所述预设的描述子数据库根据卫星图像生成。
  47. 一种终端，其特征在于，包括：处理器和存储器；
    所述存储器用于存储指令;
    所述处理器用于根据所述指令执行如权利要求1至12中任一项所述的方法。
  48. 一种服务器，其特征在于，包括：处理器和存储器；
    所述存储器用于存储指令;
    所述处理器用于根据所述指令执行如权利要求13至23中任一项所述的方法。
  49. 一种包含指令的计算机程序产品,其特征在于,当其在计算机上运行时,使得所述计算机执行如权利要求1至23中任一项所述的方法。
  50. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储指令,当所述指令在计算机上运行时,使得所述计算机执行如权利要求1至23中任一项所述的方法。
PCT/CN2020/107364 2019-08-09 2020-08-06 视觉定位方法、终端和服务器 WO2021027676A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/667,122 US20220156969A1 (en) 2019-08-09 2022-02-08 Visual localization method, terminal, and server

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910736244.9A CN112348886B (zh) 2019-08-09 视觉定位方法、终端和服务器
CN201910736244.9 2019-08-09

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/667,122 Continuation US20220156969A1 (en) 2019-08-09 2022-02-08 Visual localization method, terminal, and server

Publications (1)

Publication Number Publication Date
WO2021027676A1 true WO2021027676A1 (zh) 2021-02-18

Family

ID=74367069

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/107364 WO2021027676A1 (zh) 2019-08-09 2020-08-06 视觉定位方法、终端和服务器

Country Status (2)

Country Link
US (1) US20220156969A1 (zh)
WO (1) WO2021027676A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113112544A (zh) * 2021-04-09 2021-07-13 国能智慧科技发展(江苏)有限公司 基于智能物联网与大数据的人员定位异常检测系统

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113987228A (zh) * 2018-06-20 2022-01-28 华为技术有限公司 一种数据库构建方法、一种定位方法及其相关设备
CN115564837B (zh) * 2022-11-17 2023-04-18 歌尔股份有限公司 一种视觉定位方法、装置和系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101114337A (zh) * 2007-08-08 2008-01-30 华中科技大学 一种地面建筑物识别定位方法
US20120314935A1 (en) * 2011-06-10 2012-12-13 Sri International Method and apparatus for inferring the geographic location of captured scene depictions
CN103139700A (zh) * 2011-11-28 2013-06-05 联想(北京)有限公司 一种终端定位的方法和系统
US20130166202A1 (en) * 2007-08-06 2013-06-27 Amrit Bandyopadhyay System and method for locating, tracking, and/or monitoring the status of personnel and/or assets both indoors and outdoors
CN104573735A (zh) * 2015-01-05 2015-04-29 广东小天才科技有限公司 基于图像拍摄以优化定位的方法、智能终端及服务器
CN107924590A (zh) * 2015-10-30 2018-04-17 斯纳普公司 增强现实系统中的基于图像的跟踪



Also Published As

Publication number Publication date
US20220156969A1 (en) 2022-05-19
CN112348886A (zh) 2021-02-09

Similar Documents

Publication Publication Date Title
CN112894832B (zh) 三维建模方法、装置、电子设备和存储介质
US11798190B2 (en) Position and pose determining method, apparatus, smart device, and storage medium
WO2021027676A1 (zh) 视觉定位方法、终端和服务器
US10643373B2 (en) Augmented reality interface for interacting with displayed maps
WO2019205850A1 (zh) 位姿确定方法、装置、智能设备及存储介质
JP5920352B2 (ja) 情報処理装置、情報処理方法及びプログラム
WO2018107679A1 (zh) 一种动态三维图像获取的方法和设备
US9888215B2 (en) Indoor scene capture system
US9269196B1 (en) Photo-image-based 3D modeling system on a mobile device
CN109461208B (zh) 三维地图处理方法、装置、介质和计算设备
US11557083B2 (en) Photography-based 3D modeling system and method, and automatic 3D modeling apparatus and method
CN110967011A (zh) 一种定位方法、装置、设备及存储介质
KR102200299B1 (ko) 3d-vr 멀티센서 시스템 기반의 도로 시설물 관리 솔루션을 구현하는 시스템 및 그 방법
WO2022028129A1 (zh) 位姿确定方法、装置、电子设备及存储介质
WO2018233623A1 (zh) 图像显示的方法和装置
EP4105766A1 (en) Image display method and apparatus, and computer device and storage medium
WO2021088498A1 (zh) 虚拟物体显示方法以及电子设备
CN110926478B (zh) 一种ar导航路线纠偏方法、系统及计算机可读存储介质
CN113610702B (zh) 一种建图方法、装置、电子设备及存储介质
CN111928861B (zh) 地图构建方法及装置
CN111093266B (zh) 一种导航校准方法及电子设备
CN112348886B (zh) 视觉定位方法、终端和服务器
US20220345621A1 (en) Scene lock mode for capturing camera images
CN115578432B (zh) 图像处理方法、装置、电子设备及存储介质
CA3102860C (en) Photography-based 3d modeling system and method, and automatic 3d modeling apparatus and method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20853498

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20853498

Country of ref document: EP

Kind code of ref document: A1