CN115439536B - Visual map updating method and device and electronic equipment


Info

Publication number
CN115439536B
Authority
CN
China
Prior art keywords
images
image
visual map
visual
coordinate system
Prior art date
Legal status
Active
Application number
CN202210992488.5A
Other languages
Chinese (zh)
Other versions
CN115439536A (en)
Inventor
王星博 (Wang Xingbo)
张晋川 (Zhang Jinchuan)
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210992488.5A
Publication of CN115439536A
Application granted
Publication of CN115439536B
Status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/42: Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75: Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757: Matching configurations of points or features

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The disclosure provides a visual map updating method, a visual map updating apparatus, and an electronic device, and relates to the technical field of image processing, in particular to the technical fields of artificial intelligence, computer vision, and augmented reality. The specific implementation scheme is as follows: acquiring M first images in a target scene, wherein the M first images comprise image contents of a first position and image contents of a second position, the first position is a position that can be located through a first visual map, the second position is a position that cannot be located through the first visual map, the first visual map is a visual map pre-constructed with a first coordinate system as reference, and the first visual map is associated with first data; constructing a second visual map under the target scene based on the M first images with a second coordinate system as reference, wherein the second visual map is associated with second data; determining a transformation relationship from the second coordinate system to the first coordinate system based on the M first images; and fusing the second data into the first data based on the transformation relationship to update the first visual map.

Description

Visual map updating method and device and electronic equipment
Technical Field
The disclosure relates to the technical field of image processing, in particular to the technical fields of artificial intelligence, computer vision, augmented reality, and the like, and specifically to a visual map updating method and apparatus and an electronic device.
Background
The Visual Positioning and Augmenting Service (VPAS) takes images captured by a user's camera as input and calculates the pose of the camera in the local map coordinate system through steps such as image retrieval, feature matching, and pose solving, thereby visually locating the user's position.
When local scenes covered by the map change, for example because of construction or store turnover, images shot at the affected positions either cannot be visually positioned at all or yield incorrect positioning results.
At present, such a change is usually discovered only after a user finds that positioning fails at some location and the failure is traced back to the scene change. The region is then re-surveyed manually with a designated camera device, a new visual map is constructed from the collected images, and the new visual map replaces the original one.
Disclosure of Invention
The disclosure provides a visual map updating method, a visual map updating device and electronic equipment.
According to a first aspect of the present disclosure, there is provided a visual map updating method including:
acquiring M first images in a target scene, wherein the M first images comprise image contents of a first position and image contents of a second position, the first position is a position that can be located through a first visual map, the second position is a position that cannot be located through the first visual map, the first visual map is a visual map pre-constructed with a first coordinate system as reference, the first visual map is associated with first data, and M is an integer greater than 1;
constructing a second visual map under the target scene based on the M first images and with a second coordinate system as reference, wherein the second visual map is associated with second data;
determining a transformation relationship from the second coordinate system to the first coordinate system based on the M first images;
and fusing the second data into the first data based on the transformation relation to update the first visual map.
According to a second aspect of the present disclosure, there is provided a visual map updating apparatus comprising:
the first acquisition module is used for acquiring M first images in a target scene, wherein the M first images comprise image contents of a first position and image contents of a second position, the first position is a position that can be located through a first visual map, the second position is a position that cannot be located through the first visual map, the first visual map is a visual map pre-constructed with a first coordinate system as reference, the first visual map is associated with first data, and M is an integer greater than 1;
the building module is used for building a second visual map under the target scene based on the M first images and with a second coordinate system as reference, and the second visual map is associated with second data;
a determining module, configured to determine a transformation relationship from the second coordinate system to the first coordinate system based on the M first images;
and the fusion module is used for fusing the second data into the first data based on the transformation relation so as to update the first visual map.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any one of the methods of the first aspect.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform any of the methods of the first aspect.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements any of the methods of the first aspect.
According to the technology of the present disclosure, the problem of low visual map updating efficiency is solved, and the updating efficiency of the visual map is improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a flow diagram of a visual map updating method according to a first embodiment of the present disclosure;
fig. 2 is a schematic structural view of a visual map updating apparatus according to a second embodiment of the present disclosure;
fig. 3 is a schematic block diagram of an example electronic device used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
First embodiment
As shown in fig. 1, the present disclosure provides a visual map updating method, including the steps of:
step S101: obtaining M first images in a target scene, wherein the M first images comprise image contents of a first position and image contents of a second position, the first position is a position which can be positioned through a first visual map, the second position is a position which can not be positioned through the first visual map, the first visual map is a visual map which is built in advance by taking a first coordinate system as a reference, and the first visual map is associated with first data.
Wherein M is an integer greater than 1.
In this embodiment, the visual map updating method relates to the technical field of image processing, in particular to the technical fields of artificial intelligence, computer vision, and augmented reality, and can be widely applied in visual map updating scenarios. The visual map updating method of the embodiment of the present disclosure may be performed by the visual map updating apparatus of the embodiment of the present disclosure, and the visual map updating apparatus may be configured in any electronic device to perform the method.
In one application scenario, the electronic device may be a server used for visual map positioning. In a VPAS visual positioning task, a user may photograph the surrounding environment with a terminal such as a mobile phone and upload the image to the server, which performs visual positioning in the pre-constructed first visual map; after the positioning result is resolved, the server sends the user's position back to the terminal so that the user can experience navigation. If the local scene has changed at some positions, images shot there either cannot be visually positioned or yield erroneous positioning results, and therefore the first visual map needs to be updated periodically.
The target scene may be a scene that includes positions at which visual localization in the first visual map is not possible; it may include one, two, or more such positions. These positions may be positions whose scene has changed: for example, when the first visual map was constructed the store at position A may have been a clothing store, and owing to store re-planning it has since become a food store.
The positions where visual positioning cannot be achieved may also be positions newly added relative to the previous scene. For example, a first visual map of a shopping square is pre-constructed; owing to scale expansion, some stores around the shopping square are newly incorporated into it, and these newly added store positions cannot be visually located in the first visual map.
Where the target scene includes two or more positions at which visual localization is not possible, the positions may all be adjacent to one another, none of them adjacent, or only some of them adjacent.
The target scene may also include at least one position at which visual localization in the first visual map is possible; these positions may or may not be adjacent, which is not specifically limited herein.
The positions in the target scene generally need to be relatively continuous and have definite relative positional relationships so that a visual map can be constructed; for example, the target scene may include a position A, a position B, and a position C that are pairwise adjacent and not far apart.
Accordingly, the M first images in the target scene may include image content of a first location that is locatable by the first visual map and image content of a second location that is not locatable by the first visual map.
The first visual map may be a visual map pre-constructed with reference to a first coordinate system. The first coordinate system may be a three-dimensional (3D) coordinate system, and the first visual map may be formed from 3D points representing positions. The first visual map is associated with first data, and the first data may include the 3D point data, image data, and the like used to construct the first visual map.
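For illustration only, the association between a visual map and its data can be pictured as a small container of 3D points plus the images used to build it. The following Python sketch is an assumption about representation, not the patent's actual schema; all field names are made up:

```python
from dataclasses import dataclass, field

import numpy as np

@dataclass
class VisualMap:
    """Hypothetical container for a visual map and its associated data."""
    points_3d: np.ndarray                        # (P, 3) 3D points in this map's coordinate system
    images: list = field(default_factory=list)   # images used to construct the map
    kp_to_point: dict = field(default_factory=dict)  # per-image 2D keypoint -> 3D point index

# A first visual map referenced to the first coordinate system, initially empty:
first_map = VisualMap(points_3d=np.zeros((0, 3)))
```

The `kp_to_point` association is what later allows 2D-2D matches against map images to be lifted into 2D-3D matches for pose solving.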
The M first images stored in advance may be acquired, or M first images sent by other electronic devices may be received. In an application scenario, a video of a target scene photographed by a user terminal through a camera may be received, and the video may include M first images.
For example, in an application scenario, a VPAS service is already deployed in a shopping mall. When a user cannot experience the VPAS service in a certain area of the mall, a common reason is that the surrounding scene of that area has changed, so the first visual map no longer recognizes it. In this case, if the user cannot successfully initiate positioning between points A and B of the mall but a nearby location C can still initiate visual positioning, the user may capture a video starting at position C and proceeding through positions A and B.
After the video acquisition is completed, the user can upload the video to the server, and correspondingly, the electronic equipment can receive the video in the target scene and acquire M first images in the target scene.
In a complex scene, for example a shopping square comprising a plurality of floors, the user may also specify to the electronic device the shopping square and the floor to be updated, so that the electronic device can select the first visual map corresponding to that shopping square from a map server for updating; in that case the first data associated with the first visual map may be the portion of its associated data belonging to that floor.
Step S102: and constructing a second visual map under the target scene based on the M first images by taking a second coordinate system as a reference, wherein the second visual map is associated with second data.
In this step, the second coordinate system may be a three-dimensional coordinate system different from the first coordinate system, where "different" means that at least one of the coordinate origin and the coordinate axis directions differs.
The second visual map may be a visual map constructed based on the second coordinate system, specifically, a 3D point at a position corresponding to any one of the M first images may be used as a coordinate origin or a fixed point of the second coordinate system, and based on the positional relationship of the M first images, the second visual map under the target scene may be constructed by adopting an existing or new visual map construction manner. The positional relationship of the M first images may refer to a relationship between positions related to image contents of the M first images, and the relationship may include a distance relationship, a direction relationship, and the like.
The second visual map may be a local visual map, which may be composed of 3D points characterizing positions in the M first images, which are associated with second data, which may include 3D point data characterizing positions in the target scene, and the M first images.
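The disclosure leaves the concrete construction method open ("an existing or new visual map construction manner"). As one minimal illustration, assuming relative camera motions between consecutive first images are available (for example from feature-based visual odometry, an assumption not stated in the text), fixing the first pose as the origin and chaining the relative motions anchors every frame in the second coordinate system:

```python
import numpy as np

def chain_poses(relative_rotations, relative_translations):
    """Accumulate relative motions between consecutive frames into absolute
    world-to-camera poses, with frame 0 defining the second coordinate system.

    Assumes x_{i+1} = R_i @ x_i + t_i maps frame-i coordinates to frame-(i+1)
    coordinates for each consecutive pair."""
    R_abs, t_abs = np.eye(3), np.zeros(3)   # frame 0 sits at the coordinate origin
    poses = [(R_abs, t_abs)]
    for R_rel, t_rel in zip(relative_rotations, relative_translations):
        R_abs = R_rel @ R_abs               # compose rotations
        t_abs = R_rel @ t_abs + t_rel       # compose translations
        poses.append((R_abs.copy(), t_abs.copy()))
    return poses

# Two frames, the second translated 1 unit along x relative to the first:
poses = chain_poses([np.eye(3)], [np.array([1.0, 0.0, 0.0])])
```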
Step S103: based on the M first images, a transformation relationship of the second coordinate system to the first coordinate system is determined.
In this step, the transformation relationship refers to transforming the second coordinate system by transformation parameters so that the transformed second coordinate system is aligned with the first coordinate system. The transformation parameters may include rotation parameters and translation parameters, i.e., alignment of the two coordinate systems can be achieved by a corresponding rotation and translation.
For each first image, the pose T_CW of the user camera in the second coordinate system can be calculated by visual positioning, where T_CW = [R_CW, t_CW]. This pose can be used to locate the position corresponding to the first image; that is, if the pose of the user camera in a map coordinate system can be calculated, visual positioning of the position in the first image can be achieved. Here R_CW is the rotation matrix from the second coordinate system to the camera coordinate system, and t_CW is the position of the camera coordinate system center in the second coordinate system.
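A minimal sketch of this pose convention, following the text's reading in which t_CW is the camera center expressed in the map (second) coordinate system; pose conventions differ between systems, so this is an assumption about notation rather than a definitive formulation:

```python
import numpy as np

def map_point_to_camera(x_w, R_cw, cam_center_w):
    """Express a map-frame point in the camera frame under T_CW = [R_CW, t_CW],
    reading R_CW as the map-to-camera rotation and t_CW as the camera center
    in the map frame: x_c = R_CW @ (x_w - t_CW)."""
    return np.asarray(R_cw) @ (np.asarray(x_w) - np.asarray(cam_center_w))

# A camera at (1, 0, 0) in the map with no rotation sees the map point (2, 0, 0)
# one unit in front of it along x:
print(map_point_to_camera([2.0, 0.0, 0.0], np.eye(3), [1.0, 0.0, 0.0]))  # [1. 0. 0.]
```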
Meanwhile, for each first image, the pose of the user camera in the first coordinate system can also be calculated by visual positioning. Because the target scene contains positions that cannot be located through the first visual map, this pose cannot necessarily be calculated for every first image. If the pose of the user camera in the first coordinate system can be calculated based on a first image, it is determined that the first image is visually located successfully in the first visual map.
Accordingly, a first image that is successfully located both in the first visual map and in the second visual map corresponds to two poses: a pose relative to the first visual map and a pose relative to the second visual map.
For a plurality of first images that are successfully located in the first visual map, for example 4 of them, equations for the transformation relationship from the second coordinate system to the first coordinate system can be constructed from their poses relative to the first visual map and their poses relative to the second visual map, and the transformation can then be solved with the Iterative Closest Point (ICP) algorithm.
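The text names ICP for this solve. One concrete closed-form realization, a sketch rather than necessarily the patent's exact formulation, is to align the camera centers of the successfully located images across the two maps with the Kabsch method; three non-collinear correspondences suffice mathematically, so the four or more pose pairs used here determine the rotation and translation well:

```python
import numpy as np

def align_rigid(src, dst):
    """Closed-form rigid alignment (Kabsch): find R, t with dst ~= src @ R.T + t.

    src, dst: (N, 3) corresponding camera centers of the same images expressed
    in the second and the first coordinate system, respectively."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflections
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Sanity check on synthetic correspondences:
rng = np.random.default_rng(0)
src = rng.normal(size=(4, 3))
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
dst = src @ R_true.T + np.array([1.0, 2.0, 3.0])
R, t = align_rigid(src, dst)
assert np.allclose(dst, src @ R.T + t)
```

If the two reconstructions also differ in scale, which is common when the second map is built from monocular video, the Umeyama extension of this solve additionally estimates a scale factor; the text does not state whether scale is estimated, so that remains an assumption either way.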
Step S104: and fusing the second data into the first data based on the transformation relation to update the first visual map.
In this step, the 3D point data in the second visual map may be transformed into the first coordinate system based on the transformation relationship of the second coordinate system to the first coordinate system, and the 3D point data transformed into the first coordinate system may be fused into the first data.
Correspondingly, the first visual map constructed from the fused first data, i.e., the updated first visual map, includes not only the previously constructed 3D point data but also the newly added 3D point data representing each position in the target scene, so that visual positioning at each position in the target scene can be achieved with the updated first visual map.
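Continuing the illustrative sketch from above, the fusion step then amounts to mapping every 3D point of the second map through (R, t) and appending the result, together with the M first images, to the first map's data; the `VisualMap` container is the hypothetical one introduced earlier:

```python
import numpy as np

def fuse_maps(first_map, second_map, R, t):
    """Transform the second map's 3D points into the first coordinate system
    and merge them, along with the new images, into the first map's data."""
    transformed = second_map.points_3d @ np.asarray(R).T + np.asarray(t)  # p' = R p + t, row-wise
    first_map.points_3d = np.vstack([first_map.points_3d, transformed])
    first_map.images.extend(second_map.images)   # keep the M first images associated
    return first_map
```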
In an application scenario, after the electronic device completes updating the first visual map, the updated first visual map may replace the visual map that is originally constructed, and upload the updated first visual map to the map server. Meanwhile, the electronic device can send a message of completion of updating the visual map to the user, and accordingly, the user can re-experience the VPAS service in an area where visual positioning cannot be realized before.
In this embodiment, by constructing the transformation relationship from the second coordinate system to the first coordinate system, the 3D point data in the second visual map are transformed into the first coordinate system as a whole, which simplifies the visual map updating flow and improves the updating efficiency. Moreover, compared with re-recognizing the first images whose visual positioning failed by triangulating from the first images whose visual positioning succeeded, the approach of constructing a new visual map, determining the transformation relationship from the second coordinate system to the first coordinate system, and transforming the 3D point data of the second visual map into the first coordinate system as a whole based on that relationship relaxes the scene constraints on the successfully located first images (triangulation imposes such constraints, for example the positions corresponding to two first images must not be too close together), thereby improving the success rate and accuracy of the visual map update.
Optionally, the step S103 specifically includes:
acquiring N target images from the M first images, wherein the target images are images which are successfully positioned in the first visual map, and N is an integer greater than 1;
acquiring N first poses of the N target images relative to the first visual map and N second poses of the N target images relative to the second visual map;
based on the N first poses and the N second poses, a transformation relationship from the second coordinate system to the first coordinate system is calculated.
In this embodiment, N target images may be acquired from the M first images, where a target image is an image that is successfully located in the first visual map. The N target images may be all of the images successfully located in the first visual map, or only some of them; for example, if 8 of the M first images are successfully located in the first visual map, 5 of them may be taken as the N target images for determining the transformation relationship from the second coordinate system to the first coordinate system.
An existing or newly designed visual positioning method may be adopted: visual positioning is attempted in the first visual map based on each first image, and if the positioning succeeds, that first image is taken as a target image. Accordingly, N target images can be obtained from the M first images, where N is smaller than M; the value of N should not be too small, for example, N may be greater than or equal to 4.
Under the condition that visual positioning is successful, N first poses of N target images relative to the first visual map and N second poses of N target images relative to the second visual map can be respectively obtained through pose solving in the visual positioning process.
Based on the N first poses and the N second poses, an equation of a transformation relation is constructed, and an ICP algorithm is adopted to calculate the transformation relation from the second coordinate system to the first coordinate system. In this way, determination of the transformation relationship of the second coordinate system to the first coordinate system can be achieved.
Optionally, the first data includes an image for constructing the first visual map, and before acquiring N target images from the M first images, the method further includes:
for each first image of the M first images, acquiring global features of the first image, and acquiring local features of the first image, wherein the local features comprise first features for characterizing key points in the first image;
Acquiring a second image matched with the global feature from the first data;
based on the first feature and a pre-acquired second feature, matching the key points in the first image with the key points in the second image to obtain a first matching pair, wherein the second feature is a feature for representing the key points in the second image;
and carrying out pose solving based on the first matching pair to obtain a visual positioning result, wherein the visual positioning result is used for representing whether the first image is successfully positioned in the first visual map or not.
In this embodiment, the visual positioning of the first image in the first visual map may be performed through steps such as image searching, feature matching, pose solving, and the like, to obtain a visual positioning result.
Specifically, the first data may include the images used to construct the first visual map. Global features and local features of the first image may be extracted with a deep learning model, and by matching the global features of the first image against the global features of the images in the first data, one or more second images identical or similar to the first image can be retrieved from the first data; the number of second images is at least one.
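A minimal sketch of this retrieval step, under the assumption that global descriptors have already been extracted as fixed-length vectors (the model producing them is not specified in the text); cosine similarity then ranks the map images:

```python
import numpy as np

def retrieve_top_k(query_desc, map_descs, k=5):
    """Rank map images by cosine similarity of global descriptors.

    query_desc: (D,) global descriptor of the first image.
    map_descs:  (N, D) global descriptors of the images in the first data.
    Returns the indices of the k most similar map images."""
    q = query_desc / np.linalg.norm(query_desc)
    m = map_descs / np.linalg.norm(map_descs, axis=1, keepdims=True)
    scores = m @ q                    # cosine similarities in [-1, 1]
    return np.argsort(-scores)[:k]
```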
Then, the key points in the first image are matched with the key points in the second image by local feature matching to obtain first matching pairs, where a first matching pair may be a pair consisting of a 2D key point in the first image and a 2D key point in the second image.
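The 2D-2D matching step can be sketched as nearest-neighbour search over the local descriptors with a ratio test; the disclosure does not fix a specific matcher, so both the mutual check and the 0.8 threshold below are assumptions:

```python
import numpy as np

def match_keypoints(desc_a, desc_b, ratio=0.8):
    """Mutual nearest-neighbour matching of local descriptors with a ratio test.

    desc_a: (Na, D) descriptors of keypoints in the first image.
    desc_b: (Nb, D) descriptors of keypoints in the second image (Nb >= 2).
    Returns index pairs (i, j) forming the first matching pairs."""
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    pairs = []
    for i, row in enumerate(dists):
        j, j2 = np.argsort(row)[:2]
        if row[j] < ratio * row[j2]:              # best clearly beats second best
            if np.argmin(dists[:, j]) == i:       # mutual consistency check
                pairs.append((i, j))
    return pairs
```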
Correspondingly, triangulation can be performed based on the first matching pairs to obtain second matching pairs between the key points in the first image and the 3D points in the first visual map, where a second matching pair may be a pair consisting of a 2D key point and a 3D point; pose solving is then performed based on the second matching pairs to obtain the visual positioning result. In this way, visual localization of the first image in the first visual map can be achieved.
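Pose solving from the lifted 2D-3D matches is standard PnP; the sketch below uses OpenCV's RANSAC variant (assuming OpenCV as the implementation, which the text does not state), and a failed solve doubles as the "positioning failed" signal discussed later:

```python
import numpy as np
import cv2

def solve_pose_2d3d(points_2d, points_3d, K):
    """Solve the camera pose from 2D keypoints matched to map 3D points.

    points_2d: (N, 2) pixel coordinates; points_3d: (N, 3) map points;
    K: (3, 3) camera intrinsic matrix. Returns (R, t) or None on failure."""
    if len(points_2d) < 4:                       # PnP needs at least 4 correspondences
        return None
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(points_3d, np.float32),
        np.asarray(points_2d, np.float32),
        np.asarray(K, np.float32),
        distCoeffs=None)
    if not ok or inliers is None:
        return None                              # visual positioning failed
    R, _ = cv2.Rodrigues(rvec)                   # rotation vector -> rotation matrix
    return R, tvec.reshape(3)
```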
Optionally, the performing pose solving based on the first matching pair to obtain a visual positioning result includes:
acquiring a second matching pair of the first image key point in the first matching pair and the three-dimensional point in the first visual map based on the matching relation between the key point of the second image and the three-dimensional point in the first visual map;
and carrying out pose solving based on the second matching pair to obtain the visual positioning result.
In this embodiment, during generation of the first visual map, the 3D points in the first visual map may be constructed by triangulating the 2D key points in the images, so that the 2D key points in each image and the corresponding 3D points in the first visual map can be stored in association. Correspondingly, based on the matching relationship between the key points of the second image and the three-dimensional points in the first visual map, second matching pairs between the 2D key points in the first image and the 3D points in the first visual map can be determined, so that pose solving can be performed based on the second matching pairs; this realizes visual positioning of the first image in the first visual map and simplifies the pose solving steps.
Optionally, the performing pose solving based on the second matching pair to obtain the visual positioning result includes:
under the condition that the pose of the first image relative to the first visual map can be obtained based on the second matching pair, determining that the visual positioning result is successful in visual positioning;
and under the condition that the pose of the first image relative to the first visual map can not be obtained based on the second matching pair, determining that the visual positioning result is visual positioning failure.
In this embodiment, if the pose of the first image relative to the first visual map can be solved based on the second matching pairs, this indicates that the first visual map contains enough 3D points matching the 2D key points in the first image; accordingly, visual positioning of the position corresponding to the first image can be achieved, and the visual positioning result is determined to be a success.
If the pose of the first image relative to the first visual map cannot be solved based on the second matching pairs, the first visual map does not contain enough 3D points matching the 2D key points in the first image, the position appearing in the first image cannot be determined in the first visual map, and the visual positioning result is accordingly determined to be a failure.
In this way, the visual localization process of an image in the first visual map can be simplified.
Optionally, the step S104 specifically includes:
transforming three-dimensional points used for representing positions in the second visual map into the first coordinate system based on the transformation relation, wherein the second data comprise the three-dimensional points used for representing positions in the second visual map and the M first images;
three-dimensional points transformed into the first coordinate system are added to the first data in association with the M first images.
In this embodiment, the transformation parameters representing the transformation relationship may be multiplied with the 3D point data representing positions in the second visual map, thereby transforming the 3D points of the second visual map into the first coordinate system to obtain new 3D point data. The new 3D point data and the M first images are then added to the first data in association with each other, fusing the two visual maps at the data level and updating the first visual map.
Optionally, the M first images are images in the video under the target scene, and the step S102 specifically includes:
taking the three-dimensional point corresponding to the first image that is the first frame of the video among the M first images as the coordinate origin of the second coordinate system, and constructing a second visual map under the target scene based on the positional relationship of the M first images.
In this embodiment, the M first images may be image frames in the video, and the video uploaded by the user may be split into image frames to obtain the M first images.
The 3D point corresponding to the first image that is the first frame of the video among the M first images can be used as the coordinate origin of the second coordinate system, and based on the positional relationship of the M first images, the second visual map under the target scene can be constructed with an existing or newly designed visual map construction method, which simplifies the construction of the second visual map.
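Splitting the uploaded video into the M first images is straightforward with OpenCV; a sketch, where the file path and the sampling stride are illustrative assumptions:

```python
import cv2

def split_video(path="scene.mp4", stride=10):
    """Extract every stride-th frame of the video as one of the M first images."""
    cap = cv2.VideoCapture(path)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:                  # end of video (or unreadable file)
            break
        if idx % stride == 0:       # subsample so M stays manageable
            frames.append(frame)
        idx += 1
    cap.release()
    return frames                   # frames[0] anchors the second coordinate system
```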
Second embodiment
As shown in fig. 2, the present disclosure provides a visual map updating apparatus 200, including:
a first obtaining module 201, configured to obtain M first images in a target scene, where the M first images include image contents of a first location and image contents of a second location, where the first location is a location that is locatable by a first visual map, the second location is a location that is not locatable by the first visual map, the first visual map is a visual map that is pre-constructed based on a first coordinate system, the first visual map is associated with first data, and M is an integer greater than 1;
a building module 202, configured to build a second visual map under the target scene based on the M first images and taking a second coordinate system as a reference, where the second visual map is associated with second data;
a determining module 203, configured to determine a transformation relationship from the second coordinate system to the first coordinate system based on the M first images;
and a fusion module 204, configured to fuse the second data into the first data based on the transformation relationship, so as to update the first visual map.
Optionally, the determining module 203 is specifically configured to:
acquiring N target images from the M first images, wherein the target images are images which are successfully positioned in the first visual map, and N is an integer greater than 1;
acquiring N first poses of the N target images relative to the first visual map and N second poses of the N target images relative to the second visual map;
based on the N first poses and the N second poses, a transformation relationship from the second coordinate system to the first coordinate system is calculated.
Optionally, the first data includes an image for constructing the first visual map, and the apparatus further includes:
a second obtaining module, configured to obtain, for each of the M first images, a global feature of the first image, and obtain a local feature of the first image, where the local feature includes a first feature for characterizing a key point in the first image;
a third obtaining module, configured to obtain a second image matched with the global feature from the first data;
the matching module is used for matching the key points in the first image with the key points in the second image based on the first features and the pre-acquired second features to obtain a first matching pair, wherein the second features are features used for representing the key points in the second image;
and the pose solving module is used for carrying out pose solving based on the first matching pair to obtain a visual positioning result, and the visual positioning result is used for representing whether the first image is successfully positioned in the first visual map or not.
Optionally, the pose solving module includes:
the acquisition unit is used for acquiring a second matching pair of the first image key points in the first matching pair and the three-dimensional points in the first visual map based on the matching relation between the key points of the second image and the three-dimensional points in the first visual map;
and the pose solving unit is used for carrying out pose solving based on the second matching pair to obtain the visual positioning result.
Optionally, the pose solving unit is specifically configured to:
under the condition that the pose of the first image relative to the first visual map can be obtained based on the second matching pair, determining that the visual positioning result is successful in visual positioning;
And under the condition that the pose of the first image relative to the first visual map can not be obtained based on the second matching pair, determining that the visual positioning result is visual positioning failure.
Optionally, the fusion module 204 is specifically configured to:
transforming three-dimensional points used for representing positions in the second visual map into the first coordinate system based on the transformation relation, wherein the second data comprise the three-dimensional points used for representing positions in the second visual map and the M first images;
three-dimensional points transformed into the first coordinate system are added to the first data in association with the M first images.
Optionally, the M first images are images in a video of the target scene, and the building module 202 is specifically configured to:
and taking three-dimensional points which are corresponding to the first images of the first frames of the video in the M first images as coordinate origins of the second coordinate system, and constructing a second visual map under the target scene based on the position relation of the M first images.
The visual map updating apparatus 200 provided in the present disclosure can implement each process implemented by the embodiments of the visual map updating method and achieve the same beneficial effects; to avoid repetition, details are not described here again.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and other handling of the user's personal information comply with relevant laws and regulations and do not violate public order and good morals.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
FIG. 3 illustrates a schematic block diagram of an example electronic device that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 3, the device 300 includes a computing unit 301 that can perform various suitable actions and processes according to a computer program stored in a read-only memory (ROM) 302 or a computer program loaded from a storage unit 308 into a random access memory (RAM) 303. In the RAM 303, various programs and data required for the operation of the device 300 can also be stored. The computing unit 301, the ROM 302, and the RAM 303 are connected to each other by a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
Various components in device 300 are connected to I/O interface 305, including: an input unit 306 such as a keyboard, a mouse, etc.; an output unit 307 such as various types of displays, speakers, and the like; a storage unit 308 such as a magnetic disk, an optical disk, or the like; and a communication unit 309 such as a network card, modem, wireless communication transceiver, etc. The communication unit 309 allows the device 300 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 301 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 301 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 301 performs the respective methods and processes described above, such as a visual map updating method. For example, in some embodiments, the visual map updating method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 308. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 300 via the ROM 302 and/or the communication unit 309. When the computer program is loaded into the RAM 303 and executed by the computing unit 301, one or more steps of the visual map updating method described above may be performed. Alternatively, in other embodiments, the computing unit 301 may be configured to perform the visual map updating method by any other suitable means (e.g. by means of firmware).
Various implementations of the systems and techniques described here above can be implemented in digital electronic circuitry, integrated circuit systems, Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs, where the one or more computer programs can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (12)

1. A visual map updating method, comprising:
acquiring M first images in a target scene, wherein the M first images comprise image contents of a first position and image contents of a second position, the first position is a position that can be located through a first visual map, the second position is a position that cannot be located through the first visual map, the first visual map is a visual map pre-constructed with a first coordinate system as reference, the first visual map is associated with first data, and M is an integer greater than 1;
constructing a second visual map under the target scene based on the M first images and with a second coordinate system as reference, wherein the second visual map is associated with second data;
determining a transformation relationship from the second coordinate system to the first coordinate system based on the M first images;
fusing the second data into the first data based on the transformation relationship to update the first visual map;
the determining, based on the M first images, a transformation relationship of the second coordinate system to the first coordinate system, includes:
acquiring N target images from the M first images, wherein the target images are images which are successfully positioned in the first visual map, and N is an integer greater than 1;
acquiring N first poses of the N target images relative to the first visual map and N second poses of the N target images relative to the second visual map;
based on the N first poses and the N second poses, calculating a transformation relation from the second coordinate system to the first coordinate system;
the first data includes images for constructing the first visual map, and the method further includes, prior to acquiring N target images from the M first images:
matching key points in the first image with key points in a second image in the first data based on first features of the first image and second features acquired in advance to obtain a first matching pair, wherein the second image is matched with global features of the first image, the first features are local features used for representing the key points in the first image, and the second features are local features used for representing the key points in the second image;
acquiring a second matching pair of the first image key point in the first matching pair and the three-dimensional point in the first visual map based on the matching relation between the key point of the second image and the three-dimensional point in the first visual map;
and carrying out pose solving based on the second matching pair to obtain a visual positioning result.
2. The method of claim 1, wherein prior to the acquiring N target images from the M first images, the method further comprises:
for each first image of the M first images, acquiring global features of the first image, and acquiring local features of the first image, wherein the local features comprise first features for characterizing key points in the first image;
acquiring a second image matched with the global feature from the first data;
based on the first feature and a pre-acquired second feature, matching the key points in the first image with the key points in the second image to obtain a first matching pair, wherein the second feature is a feature for representing the key points in the second image;
and carrying out pose solving based on the first matching pair to obtain a visual positioning result, wherein the visual positioning result is used for representing whether the first image is successfully positioned in the first visual map or not.
3. The method of claim 2, wherein the pose solving based on the second matching pair to obtain the visual positioning result comprises:
under the condition that the pose of the first image relative to the first visual map can be obtained based on the second matching pair, determining that the visual positioning result is successful in visual positioning;
and under the condition that the pose of the first image relative to the first visual map can not be obtained based on the second matching pair, determining that the visual positioning result is visual positioning failure.
4. The method of claim 1, wherein the fusing the second data into the first data based on the transformation relationship comprises:
transforming three-dimensional points used for representing positions in the second visual map into the first coordinate system based on the transformation relation, wherein the second data comprise the three-dimensional points used for representing positions in the second visual map and the M first images;
three-dimensional points transformed into the first coordinate system are added to the first data in association with the M first images.
5. The method of claim 1, wherein the M first images are images in a video of the target scene, and the constructing a second visual map under the target scene based on the M first images with a second coordinate system as reference comprises:
taking the three-dimensional point corresponding to the first image that is the first frame of the video among the M first images as the coordinate origin of the second coordinate system, and constructing a second visual map under the target scene based on the positional relationship of the M first images.
6. A visual map updating apparatus comprising:
the first acquisition module is used for acquiring M first images in a target scene, wherein the M first images comprise image contents of a first position and image contents of a second position, the first position is a position that can be located through a first visual map, the second position is a position that cannot be located through the first visual map, the first visual map is a visual map pre-constructed with a first coordinate system as reference, the first visual map is associated with first data, and M is an integer greater than 1;
the building module is used for building a second visual map under the target scene based on the M first images and with a second coordinate system as reference, and the second visual map is associated with second data;
a determining module, configured to determine a transformation relationship from the second coordinate system to the first coordinate system based on the M first images;
the fusion module is used for fusing the second data into the first data based on the transformation relation so as to update the first visual map;
the determining module is specifically configured to:
acquiring N target images from the M first images, wherein the target images are images which are successfully positioned in the first visual map, and N is an integer greater than 1;
acquiring N first poses of the N target images relative to the first visual map and N second poses of the N target images relative to the second visual map;
based on the N first poses and the N second poses, calculating a transformation relation from the second coordinate system to the first coordinate system;
the first data includes an image for constructing the first visual map, the apparatus further comprising: the matching module and the pose solving module are used for matching the pose of the object; wherein,
the matching module is used for matching the key points in the first image with the key points in the second image in the first data based on the first features of the first image and the second features acquired in advance to obtain a first matching pair, the second image is matched with the global features of the first image, the first features are local features used for representing the key points in the first image, and the second features are local features used for representing the key points in the second image;
the pose solving module comprises:
the acquisition unit is used for acquiring a second matching pair of the first image key points in the first matching pair and the three-dimensional points in the first visual map based on the matching relation between the key points of the second image and the three-dimensional points in the first visual map;
and the pose solving unit is used for carrying out pose solving based on the second matching pair to obtain a visual positioning result.
7. The apparatus of claim 6, wherein the apparatus further comprises:
a second obtaining module, configured to obtain, for each of the M first images, a global feature of the first image, and obtain a local feature of the first image, where the local feature includes a first feature for characterizing a key point in the first image;
a third obtaining module, configured to obtain a second image matched with the global feature from the first data;
the matching module is used for matching the key points in the first image with the key points in the second image based on the first features and the pre-acquired second features to obtain a first matching pair, wherein the second features are features used for representing the key points in the second image;
and the pose solving module is used for carrying out pose solving based on the first matching pair to obtain a visual positioning result, and the visual positioning result is used for representing whether the first image is successfully positioned in the first visual map or not.
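Claim 7's retrieve-then-match pipeline follows a common coarse-to-fine visual localization pattern: a global descriptor selects a candidate database image, after which local descriptors establish key-point correspondences. The sketch below is a simplified, hypothetical rendering of that pattern (numpy only; descriptor extraction is assumed to happen elsewhere), not the patented implementation:

```python
import numpy as np

def retrieve_second_image(global_feat: np.ndarray, db_global: np.ndarray) -> int:
    """Return the index of the database image whose (L2-normalized) global
    feature is most similar to the query image's global feature."""
    return int(np.argmax(db_global @ global_feat))

def match_keypoints(desc_query: np.ndarray, desc_db: np.ndarray,
                    ratio: float = 0.8):
    """Mutual-nearest-neighbour matching of local descriptors with Lowe's
    ratio test; returns (query_idx, db_idx) key-point index pairs."""
    if desc_db.shape[0] < 2:
        return []
    dists = np.linalg.norm(desc_query[:, None, :] - desc_db[None, :, :], axis=2)
    nn_q = dists.argmin(axis=1)      # best db match for each query keypoint
    nn_db = dists.argmin(axis=0)     # best query match for each db keypoint
    pairs = []
    for qi, di in enumerate(nn_q):
        d_sorted = np.sort(dists[qi])
        if nn_db[di] == qi and d_sorted[0] < ratio * d_sorted[1]:
            pairs.append((qi, int(di)))
    return pairs
```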
8. The apparatus of claim 7, wherein the pose solving unit is specifically configured to:
in a case where a pose of the first image relative to the first visual map can be obtained based on the second matching pair, determine that the visual positioning result is a successful visual positioning; and
in a case where the pose of the first image relative to the first visual map cannot be obtained based on the second matching pair, determine that the visual positioning result is a visual positioning failure.
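One plausible concrete realization of claim 8's success/failure decision is a RANSAC-based PnP solve over the second matching pair (2D key points of the first image paired with 3D points of the first visual map). Assuming OpenCV is available, a minimal sketch might look as follows; the inlier threshold is an illustrative choice, not taken from the patent:

```python
import numpy as np
import cv2

def solve_pose(pts_2d: np.ndarray, pts_3d: np.ndarray, K: np.ndarray,
               min_inliers: int = 12):
    """Attempt to recover the first image's pose with respect to the first
    visual map from 2D-3D matches, reporting success or failure.

    pts_2d : (n, 2) key-point pixel coordinates in the first image
    pts_3d : (n, 3) matched three-dimensional points of the first visual map
    K      : (3, 3) camera intrinsic matrix
    """
    if len(pts_2d) < 4:                      # PnP needs at least 4 matches
        return None, False                   # visual positioning failure
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts_3d.astype(np.float64), pts_2d.astype(np.float64), K, None)
    if not ok or inliers is None or len(inliers) < min_inliers:
        return None, False                   # pose could not be obtained
    return (rvec, tvec), True                # visual positioning success
```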
9. The apparatus of claim 6, wherein the fusion module is specifically configured to:
transform, based on the transformation relationship, three-dimensional points used to represent positions in the second visual map into the first coordinate system, wherein the second data includes the three-dimensional points used to represent positions in the second visual map and the M first images; and
add the three-dimensional points transformed into the first coordinate system to the first data in association with the M first images.
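For claim 9's fusion step, once a similarity transform (s, R, t) from the second coordinate system to the first has been estimated, applying it to the second map's three-dimensional points and appending the result to the first map's point set is straightforward. A minimal sketch under an assumed data layout of (P, 3) numpy arrays:

```python
import numpy as np

def fuse_maps(points_second: np.ndarray, s: float, R: np.ndarray,
              t: np.ndarray, points_first: np.ndarray) -> np.ndarray:
    """Transform the second map's 3D points into the first coordinate
    system and append them to the first map's point set."""
    transformed = s * (points_second @ R.T) + t   # row-wise p' = s * R @ p + t
    return np.vstack([points_first, transformed])
```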
10. The apparatus of claim 6, wherein the M first images are images from a video of the target scene, and the building module is specifically configured to:
take the three-dimensional point corresponding to the first-frame image of the video among the M first images as the origin of the second coordinate system, and construct the second visual map of the target scene based on the positional relationships among the M first images.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-5.
CN202210992488.5A 2022-08-18 2022-08-18 Visual map updating method and device and electronic equipment Active CN115439536B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210992488.5A CN115439536B (en) 2022-08-18 2022-08-18 Visual map updating method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN115439536A (en) 2022-12-06
CN115439536B (en) 2023-09-26

Family

ID=84242255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210992488.5A Active CN115439536B (en) 2022-08-18 2022-08-18 Visual map updating method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115439536B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115262B (en) * 2023-10-24 2024-03-26 锐驰激光(深圳)有限公司 Positioning method, device, equipment and storage medium based on vision and TOF

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110309330A (en) * 2019-07-01 2019-10-08 北京百度网讯科技有限公司 The treating method and apparatus of vision map
WO2020155615A1 (en) * 2019-01-28 2020-08-06 速感科技(北京)有限公司 Vslam method, controller, and mobile device
WO2020223975A1 (en) * 2019-05-09 2020-11-12 珊口(深圳)智能科技有限公司 Method of locating device on map, server, and mobile robot
CN112269851A (en) * 2020-11-16 2021-01-26 Oppo广东移动通信有限公司 Map data updating method and device, storage medium and electronic equipment
CN112435338A (en) * 2020-11-19 2021-03-02 腾讯科技(深圳)有限公司 Method and device for acquiring position of interest point of electronic map and electronic equipment
CN112785700A (en) * 2019-11-08 2021-05-11 华为技术有限公司 Virtual object display method, global map updating method and device
CN113029128A (en) * 2021-03-25 2021-06-25 浙江商汤科技开发有限公司 Visual navigation method and related device, mobile terminal and storage medium
CN113884006A (en) * 2021-09-27 2022-01-04 视辰信息科技(上海)有限公司 Space positioning method, system, equipment and computer readable storage medium
WO2022002039A1 (en) * 2020-06-30 2022-01-06 杭州海康机器人技术有限公司 Visual positioning method and device based on visual map
CN113989450A (en) * 2021-10-27 2022-01-28 北京百度网讯科技有限公司 Image processing method, image processing apparatus, electronic device, and medium
WO2022036980A1 (en) * 2020-08-17 2022-02-24 浙江商汤科技开发有限公司 Pose determination method and apparatus, electronic device, storage medium, and program
CN114241039A (en) * 2021-12-13 2022-03-25 Oppo广东移动通信有限公司 Map data processing method and device, storage medium and electronic equipment
CN114627268A (en) * 2022-03-11 2022-06-14 北京百度网讯科技有限公司 Visual map updating method and device, electronic equipment and medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108692720B (en) * 2018-04-09 2021-01-22 京东方科技集团股份有限公司 Positioning method, positioning server and positioning system
CN108717710B (en) * 2018-05-18 2022-04-22 京东方科技集团股份有限公司 Positioning method, device and system in indoor environment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a laser and vision mapping method for mobile robots based on an improved ICP algorithm; Zhang Jie; Zhou Jun; Journal of Mechanical & Electrical Engineering (12); full text *

Similar Documents

Publication Publication Date Title
CN115409933B (en) Multi-style texture mapping generation method and device
CN112652036B (en) Road data processing method, device, equipment and storage medium
CN112529097B (en) Sample image generation method and device and electronic equipment
CN115439536B (en) Visual map updating method and device and electronic equipment
CN114792355B (en) Virtual image generation method and device, electronic equipment and storage medium
CN114998433A (en) Pose calculation method and device, storage medium and electronic equipment
US20220198743A1 (en) Method for generating location information, related apparatus and computer program product
CN113587928B (en) Navigation method, navigation device, electronic equipment, storage medium and computer program product
CN113838217B (en) Information display method and device, electronic equipment and readable storage medium
CN113932796A (en) High-precision map lane line generation method and device and electronic equipment
CN113435462B (en) Positioning method, positioning device, electronic equipment and medium
CN113870439A (en) Method, apparatus, device and storage medium for processing image
CN114723894B (en) Three-dimensional coordinate acquisition method and device and electronic equipment
CN114549303B (en) Image display method, image processing method, image display device, image processing apparatus, image display device, image processing program, and storage medium
CN114119990B (en) Method, apparatus and computer program product for image feature point matching
CN115790621A (en) High-precision map updating method and device and electronic equipment
CN112990046B (en) Differential information acquisition method, related device and computer program product
CN113112398A (en) Image processing method and device
CN113838201B (en) Model adaptation method and device, electronic equipment and readable storage medium
CN113838200B (en) Model adaptation method, device, electronic equipment and readable storage medium
CN113658277B (en) Stereo matching method, model training method, related device and electronic equipment
CN115439331B (en) Corner correction method and generation method and device of three-dimensional model in meta universe
CN114820908B (en) Virtual image generation method and device, electronic equipment and storage medium
CN114463409B (en) Image depth information determining method and device, electronic equipment and medium
CN113099231B (en) Method and device for determining sub-pixel interpolation position, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant