WO2021208372A1 - Indoor visual navigation method, apparatus, and system, and electronic device

Info

Publication number: WO2021208372A1
Application number: PCT/CN2020/119479
Authority: WO
Inventors: 王金戈; 谢航; 庹东成; 陈南; 李正权; 刘诗文; 刘骁
Original assignee: 北京迈格威科技有限公司
Priority date: 2020-04-14
Filing date: 2020-09-30
Publication date: 2021-10-21
Also published as: CN111627114A; JP2023509099A

Abstract

Provided are an indoor visual navigation method, apparatus, and system, and an electronic device. The method comprises: uploading, by a mobile device, a collected indoor image to be positioned to a server; determining, by the server, the camera pose at the time the mobile device collected the indoor image, and sending the camera pose corresponding to the indoor image to the mobile device; establishing, by the mobile device, an AR coordinate system aligned with a world coordinate system according to the camera pose corresponding to the indoor image, and planning a shortest route in a pre-imported indoor topological map according to destination information set by a user; and finally displaying, on an interface, the current preview image collected by the mobile device, and, according to the AR coordinate system and the shortest route, superimposing on the current preview image a three-dimensional identifier indicating the route travel direction. The present application can provide an indoor navigation service for the user and guide the user to conveniently arrive at a destination.

Description

Indoor visual navigation method, device, system and electronic equipment

Cross-references to related applications

This disclosure claims priority to the Chinese patent application No. 202010292954X, filed with the Chinese Patent Office on April 14, 2020 and titled "Indoor visual navigation method, device, system and electronic equipment", the entire content of which is incorporated herein by reference.

Technical field

The present disclosure relates to the field of image processing technology, and in particular to an indoor visual navigation method, device, system and electronic equipment.

Background art

Electronic map navigation has become the main way people find their way when traveling. However, existing navigation technology mainly relies on GPS and targets outdoor navigation. When users are indoors, for example in a shopping mall, they can only rely on the floor plan provided at the mall entrance to learn the location of the store (destination) they need. As the user's location changes on the way to the destination, the user often cannot clearly know the route from the current location to the destination, and therefore usually has to spend more time and energy finding the way.

Summary of the invention

In view of this, the purpose of the present disclosure is to provide an indoor visual navigation method, device, system and electronic equipment, which can provide users with indoor navigation services and guide users to reach their destinations conveniently.

In order to achieve the foregoing objectives, the technical solutions adopted in the embodiments of the present disclosure are as follows:

In a first aspect, the embodiments of the present disclosure provide an indoor visual navigation method executed by a mobile device, the method including: if an indoor image to be positioned is collected, uploading the indoor image to a server, so that the server determines the camera pose at the time the mobile device collected the indoor image; receiving the camera pose corresponding to the indoor image returned by the server, and establishing an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image; planning the shortest route in a pre-imported indoor topology map based on destination information set by the user; and displaying the current preview image collected by the mobile device on the interface of the mobile device, and, based on the AR coordinate system and the shortest route, superimposing a three-dimensional mark indicating the direction of the route on the current preview image.

In a possible implementation, the step of establishing an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image includes: establishing an initial AR coordinate system; and adjusting the initial AR coordinate system based on the camera pose corresponding to the indoor image, so that the AR coordinate system is aligned with the world coordinate system.

In a possible implementation, the step of planning the shortest route in the pre-imported indoor topology map based on the destination information set by the user includes: based on the destination information set by the user, planning the shortest route in the pre-imported indoor topology map using a path planning algorithm.

In a possible implementation, the step of superimposing a three-dimensional mark indicating the direction of the route on the current preview image includes: detecting a ground plane in the current preview image; determining the three-dimensional coordinates of the shortest route in the AR coordinate system, and generating a three-dimensional mark indicating the direction of the route based on the determined three-dimensional coordinates; and drawing the three-dimensional mark on the ground plane of the current preview image.

In a possible implementation, the method further includes: if the current camera pose issued by the server is received during the navigation process, correcting the AR coordinate system based on the current camera pose, so that the corrected AR coordinate system remains aligned with the world coordinate system.

In a second aspect, the embodiments of the present disclosure also provide an indoor visual navigation method executed by a server, the method including: if an indoor image to be positioned uploaded by a mobile device is received, determining the camera pose at the time the mobile device collected the indoor image; and sending the camera pose corresponding to the indoor image to the mobile device, so that the mobile device establishes an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image and, based on the AR coordinate system and the shortest route, superimposes a three-dimensional mark indicating the direction of the route on the current preview image collected by the mobile device; wherein the shortest route is planned by the mobile device in the indoor topology map based on destination information set by the user.

In a possible implementation, the step of determining the camera pose at the time the mobile device collected the indoor image includes: performing feature matching between the indoor image and a visual map in a pre-established visual map library to obtain the camera pose at the time the mobile device collected the indoor image; wherein the visual map is represented by a sparse point cloud model of the indoor scene.

In a possible implementation, the step of performing feature matching between the indoor image and a visual map in a pre-established visual map library to obtain the camera pose at the time the mobile device collected the indoor image includes: calculating a global image descriptor of the indoor image using a deep hash algorithm; searching the visual map library for multiple key frame images similar to the global image descriptor; obtaining the key frame information of each key frame image; dividing the multiple key frame images into multiple clusters according to the key frame information; traversing each cluster, extracting local feature points of the indoor image, calculating local descriptors, and matching them against the local feature points of all key frames in the cluster; obtaining the 3D map points corresponding to the successfully matched local feature points; and, if the number of 3D map points is greater than a preset number, obtaining the camera pose corresponding to the indoor image.

In a possible implementation, the process of establishing the visual map library includes: acquiring multiple scene images collected by a mobile device in an indoor scene; and performing three-dimensional reconstruction on the multiple scene images based on the SFM algorithm to obtain a visual map library containing the sparse point cloud models corresponding to the multiple scene images.

In a possible implementation, the method further includes: aligning the visual map with a pre-imported indoor floor plan.

In a possible implementation manner, the method further includes: compressing the visual map in the visual map library.

In a possible implementation, the step of compressing the visual map in the visual map library includes: encoding the original features in the visual map; and saving the encoded features while discarding the original, unencoded features.

In a possible implementation, the method further includes: periodically acquiring the current preview image collected by the mobile device during the navigation process, and determining the current camera pose at the time the mobile device collected the current preview image; and sending the current camera pose to the mobile device, so that the mobile device corrects the AR coordinate system based on the current camera pose.

In a third aspect, an embodiment of the present disclosure provides an indoor visual navigation device arranged on the mobile device side, the device including: an image upload module configured to, if an indoor image to be positioned is collected, upload the indoor image to a server so that the server determines the camera pose at the time the mobile device collected the indoor image; a coordinate system establishment module configured to receive the camera pose corresponding to the indoor image returned by the server, and to establish an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image; a route planning module configured to plan the shortest route in a pre-imported indoor topology map based on destination information set by the user; and a navigation display module configured to display the current preview image collected by the mobile device on the interface of the mobile device and, based on the AR coordinate system and the shortest route, superimpose a three-dimensional mark indicating the direction of the route on the current preview image.

In a fourth aspect, an embodiment of the present disclosure provides an indoor visual navigation device arranged on the server side, the device including: a pose determination module configured to, if an indoor image to be positioned uploaded by a mobile device is received, determine the camera pose at the time the mobile device collected the indoor image; and a device navigation module configured to send the camera pose corresponding to the indoor image to the mobile device, so that the mobile device establishes an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image and, based on the AR coordinate system and the shortest route, superimposes a three-dimensional mark indicating the direction of the route on the current preview image collected by the mobile device; wherein the shortest route is planned by the mobile device in the indoor topology map according to destination information set by the user.

In a fifth aspect, an embodiment of the present disclosure provides an indoor visual navigation system, the system including a mobile device and a server that are communicatively connected; wherein the mobile device is configured to execute the method according to any one of the first aspect, and the server is configured to execute the method according to any one of the second aspect.

In a sixth aspect, an embodiment of the present disclosure provides an electronic device, including a processor and a storage device; the storage device stores a computer program which, when run by the processor, executes the method of any one of the first aspect or the method of any one of the second aspect.

In a seventh aspect, the embodiments of the present disclosure provide a computer-readable storage medium having a computer program stored thereon; when the computer program is run by a processor, the steps of the method of any one of the first aspect or the steps of the method of any one of the second aspect are executed.

The embodiments of the present disclosure provide an indoor visual navigation method, device, system, and electronic equipment. The mobile device uploads the collected indoor image to be positioned to a server; the server determines the camera pose at the time the mobile device collected the indoor image and sends the camera pose corresponding to the indoor image to the mobile device; the mobile device then establishes an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image, and plans the shortest route in a pre-imported indoor topology map based on the destination information set by the user; finally, the current preview image collected by the mobile device is displayed on the interface and, based on the AR coordinate system and the shortest route, a three-dimensional mark indicating the direction of the route is superimposed on the current preview image, thereby realizing indoor visual navigation. The method provided in this embodiment can guide the user indoors along the shortest route to the destination in AR mode, which improves the user experience.

Other features and advantages of the embodiments of the present disclosure will be described in the following specification, or part of the features and advantages can be inferred from the specification or determined without doubt, or can be learned by implementing the above-mentioned technology of the embodiments of the present disclosure.

In order to make the above objectives, features and advantages of the present disclosure more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.

Description of the drawings

In order to more clearly illustrate the specific embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present disclosure; those of ordinary skill in the art can obtain other drawings based on these drawings without creative work.

FIG. 1 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure;

FIG. 2 shows a flowchart of an indoor visual navigation method provided by an embodiment of the present disclosure;

FIG. 3 shows a flowchart of another indoor visual navigation method provided by an embodiment of the present disclosure;

FIG. 4 shows a flowchart of another indoor visual navigation method provided by an embodiment of the present disclosure;

FIG. 5 shows a structural block diagram of an indoor visual navigation device provided by an embodiment of the present disclosure;

FIG. 6 shows a structural block diagram of another indoor visual navigation device provided by an embodiment of the present disclosure.

Detailed description

In order to make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the present disclosure are described below in conjunction with the accompanying drawings. Obviously, the implementations described in this embodiment are only some of the possible implementations, not all of them.

To address the problem that users cannot use mobile terminals such as mobile phones for indoor navigation, the embodiments of the present disclosure provide an indoor visual navigation method, device, system, and electronic equipment. The technology can be applied to any occasion where indoor navigation is required. The embodiments of the present disclosure are described in detail below.

First, with reference to FIG. 1, an example electronic device 100 for implementing an indoor visual navigation method, apparatus, system, and electronic device according to an embodiment of the present disclosure will be described.

FIG. 1 is a schematic structural diagram of an electronic device. The electronic device 100 includes one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image acquisition device 110. These components are interconnected through a bus system 112 and/or other forms of connection mechanisms (not shown). It should be noted that the components and structure of the electronic device 100 shown in FIG. 1 are only exemplary and not restrictive, and the electronic device may also have other components and structures as required.

The processor 102 may be implemented in at least one hardware form among a digital signal processor (DSP), a field programmable gate array (FPGA), and a programmable logic array (PLA). The processor 102 may be a central processing unit (CPU), a graphics processing unit (GPU), or another form of processing unit with data processing capabilities and/or instruction execution capabilities, or a combination of several of these, and can control other components in the electronic device 100 to perform desired functions.

The storage device 104 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functions (implemented by the processor) in the embodiments of the present disclosure described below and/or other desired functions. Various application programs and various data, such as data used and/or generated by the application programs, can also be stored in the computer-readable storage medium.

The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, and a touch screen.

The output device 108 may output various information (for example, images or sounds) to the outside (for example, a user), and may include one or more of a display, a speaker, and the like.

The image acquisition device 110 may capture images (for example, photos, videos, etc.) desired by the user, and store the captured images in the storage device 104 for use by other components.

Exemplarily, the example electronic equipment used to implement the indoor visual navigation method, apparatus, system, and electronic equipment according to the embodiments of the present disclosure may be implemented as smart terminals such as smart phones, tablets, wearable electronic equipment, computers, and servers.

The embodiments of the present disclosure can provide an indoor visual navigation method on the mobile device side. The method can be executed by a mobile device such as a mobile phone, a tablet computer, or a wearable electronic device. Referring to the flowchart of the indoor visual navigation method shown in FIG. 2, the method mainly includes the following steps S202 to S208:

Step S202: If an indoor image to be located is collected, the indoor image is uploaded to the server so that the server can determine the camera pose when the mobile device collects the indoor image.

The following takes a mobile phone as an example of the mobile device. When the user is in an indoor scene, the user may not know the current location, let alone how to get from the current location to the destination. The user can therefore first take an indoor image of the current scene with the mobile phone and upload the indoor image to the server, which performs visual positioning based on the indoor image.

Step S204: Receive the camera pose corresponding to the indoor image returned by the server, and establish an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image.

AR (Augmented Reality) is a technology that integrates virtual information with the real world. By superimposing virtual information (such as virtual graphics) on real scenes, it shows virtual graphics and real scene images on the same screen or in the same space, so that users experience a scene that combines the virtual and the real. In one possible implementation, AR navigation is adopted. Since virtual graphics need to be displayed in the real scene, an AR coordinate system must be established and then adjusted based on the camera pose corresponding to the indoor image, so that the AR coordinate system is aligned with the world coordinate system. The AR coordinate system can be established using an existing SLAM (Simultaneous Localization and Mapping) algorithm, which will not be described in detail here.

Step S206: Plan the shortest route in the pre-imported indoor topology map based on the destination information set by the user.

The user can enter the name of the destination in the mobile APP, or directly tap the destination on the indoor floor plan presented by the mobile APP, which is not limited here. After the mobile phone obtains the destination information set by the user, it can plan the route in the pre-imported indoor topology map. In an indoor topology map, each store can be regarded as a node, and each indoor path as an edge. In a possible implementation, venues that need to provide indoor navigation services, such as shopping malls, libraries, and museums, can provide the server with an indoor topology map in advance; the server can also convert it directly from the indoor floor plan.

Step S208: The current preview image collected by the mobile device is displayed on the interface of the mobile device, and based on the AR coordinate system and the shortest route, a three-dimensional mark indicating the direction of the route is superimposed on the current preview image.

The camera of the mobile device enters a shooting state, continuously collects images, and displays the collected images on the screen of the mobile device; in other words, the camera is in image preview mode. While the user holds the mobile phone during navigation, the camera is in preview mode, and the user can see on the phone interface the three-dimensional mark indicating the direction of the route superimposed on the indoor scene (for example, an arrow drawn on the ground).

Through the indoor visual navigation method provided in this embodiment, the mobile device can guide the user to the destination according to the shortest route in the AR mode indoors, which better improves the user experience.

When establishing an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image, an initial AR coordinate system can be established first and then adjusted based on the camera pose corresponding to the indoor image, so that the AR coordinate system is aligned with the world coordinate system.

An AR system can be installed in the mobile phone. The AR system is usually a visual-inertial odometry (VIO) system that includes loop detection, so the AR coordinate system can also be referred to as the VIO coordinate system. The AR system can be the ARKit that comes with iOS, the ARCore that comes with Android, or any third-party system that can implement the navigation function of the mobile device, which is not limited here.

The AR coordinate system established at the initial stage usually takes the first frame image as the coordinate origin. For the virtual graphics drawn in the AR coordinate system to blend well with the real scene image in the world coordinate system, the initial AR coordinate system needs to be adjusted using the camera pose corresponding to the indoor image. The mobile phone can then convert the planned path between the world coordinate system and the AR coordinate system, so that the virtual graphics and the real scene image are combined correctly.
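The adjustment described above can be illustrated with a minimal sketch. It assumes a planar pose (x, y, heading), in line with a camera pose made of XY coordinates and an orientation, and it assumes the AR system reports the camera pose in its own frame for the same image. The function names `align_ar_to_world`, `ar_to_world`, and `world_to_ar` are hypothetical; a real AR system such as ARKit or ARCore works with full six-degree-of-freedom poses.

```python
import math

def align_ar_to_world(ar_pose, world_pose):
    """Return (theta, tx, ty), the rigid transform mapping AR coordinates to world coordinates.

    ar_pose:    (x, y, heading) of the camera in the AR (VIO) frame when the image was taken.
    world_pose: (x, y, heading) of the same camera as returned by the server.
    Both poses are planar here, which is an assumption of this sketch.
    """
    ax, ay, ah = ar_pose
    wx, wy, wh = world_pose
    theta = wh - ah                          # rotation between the two frames
    c, s = math.cos(theta), math.sin(theta)
    tx = wx - (c * ax - s * ay)              # translation that makes the two poses coincide
    ty = wy - (s * ax + c * ay)
    return theta, tx, ty

def ar_to_world(transform, point):
    """Map a 2D point from the AR frame into the world frame."""
    theta, tx, ty = transform
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)

def world_to_ar(transform, point):
    """Map a 2D point from the world frame into the AR frame (inverse transform)."""
    theta, tx, ty = transform
    c, s = math.cos(theta), math.sin(theta)
    x, y = point[0] - tx, point[1] - ty
    return (c * x + s * y, -s * x + c * y)
```

With such a transform, route waypoints planned in world coordinates can be passed through `world_to_ar` before being drawn by the AR system.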

In a possible implementation, the embodiment of the present disclosure provides a specific way of planning the shortest route in a pre-imported indoor topology map based on the destination information set by the user: based on the destination information set by the user, the shortest route is planned in the pre-imported indoor topology map using a path planning algorithm. The path planning algorithm can be the A* algorithm or any other path planning algorithm, which is not limited here.
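As an illustration of path planning on a topology map, the following is a minimal A* sketch. It assumes the topology map is given as a dictionary of node coordinates and an adjacency list, that edge cost is the straight-line distance between connected nodes, and that the heuristic is the straight-line distance to the goal. The data layout and the name `a_star` are assumptions of this sketch, not details from the disclosure.

```python
import heapq
import math

def a_star(nodes, edges, start, goal):
    """Plan the shortest route on a topology map.

    nodes: {name: (x, y)} node positions (for example, stores and junctions).
    edges: {name: [neighbor, ...]} adjacency list; list each undirected edge both ways.
    Returns (path, cost), or (None, inf) when the goal is unreachable.
    """
    def dist(a, b):
        return math.dist(nodes[a], nodes[b])

    open_heap = [(dist(start, goal), 0.0, start)]   # entries are (f = g + h, g, node)
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            path = []
            while current is not None:              # walk parents back to the start
                path.append(current)
                current = parent[current]
            return path[::-1], cost
        if cost > best_cost[current]:
            continue                                # stale heap entry, a cheaper one was processed
        for nxt in edges.get(current, []):
            new_cost = cost + dist(current, nxt)
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                parent[nxt] = current
                heapq.heappush(open_heap, (new_cost + dist(nxt, goal), new_cost, nxt))
    return None, math.inf
```

Because the straight-line heuristic never overestimates the remaining distance when edge costs are straight-line lengths, the route returned under these assumptions is a shortest one.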

In order to clearly show the user the navigation direction, this embodiment can perform the above step S208 as follows: (1) detect the ground plane in the current preview image; (2) determine the three-dimensional coordinates of the shortest route in the AR coordinate system, and generate a three-dimensional mark indicating the direction of the route based on the determined three-dimensional coordinates; (3) draw the three-dimensional mark on the ground plane of the current preview image. For example, the three-dimensional mark may be a virtual arrow, a dotted path mark, and the like. The user can walk in the direction indicated by the three-dimensional mark and finally reach the destination along the shortest path.

During AR navigation, the navigation route may gradually become inaccurate due to the error of the initial visual positioning pose and the drift of the AR system itself; for example, the navigation route may begin to intersect the building. Therefore, to keep navigation accurate throughout the process, if the mobile device in this embodiment receives the current camera pose issued by the server during navigation, it corrects the AR coordinate system based on the current camera pose so that the corrected AR coordinate system remains aligned with the world coordinate system. That is, the server periodically obtains the current preview image collected by the mobile device during navigation, determines the current camera pose at the time the mobile device collected the current preview image, and sends the current camera pose to the mobile device; the mobile device then corrects the AR coordinate system based on the current camera pose to improve navigation accuracy.
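Step (2) above can be sketched as follows, assuming the shortest route is already expressed as a list of 2D waypoints in the AR coordinate system and that ground-plane detection has produced a single ground height. The name `arrow_markers`, the `spacing` parameter, and the choice of z as the vertical axis are assumptions of this sketch; AR frameworks differ in their axis conventions.

```python
import math

def arrow_markers(route_xy, ground_height, spacing=1.0):
    """Place direction arrows at regular intervals along a polyline route.

    route_xy:      list of (x, y) waypoints in AR coordinates.
    ground_height: height of the detected ground plane (z is treated as up here).
    spacing:       distance between consecutive arrows, in the same units as the route.
    Returns a list of ((x, y, z), heading_in_radians), one entry per arrow.
    """
    markers = []
    offset = 0.0  # distance into the current segment at which the next arrow is placed
    for (x0, y0), (x1, y1) in zip(route_xy, route_xy[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue                                   # skip duplicate waypoints
        heading = math.atan2(y1 - y0, x1 - x0)         # arrows point along the segment
        while offset <= seg:
            t = offset / seg
            position = (x0 + t * (x1 - x0), y0 + t * (y1 - y0), ground_height)
            markers.append((position, heading))
            offset += spacing
        offset -= seg                                  # carry the remainder into the next segment
    return markers
```

Each returned position and heading could then be handed to the rendering layer of the AR system to draw an arrow lying on the ground plane.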

This embodiment can also provide an indoor visual navigation method on the server side. The method can be executed by, for example, a cloud server. Referring to the flowchart of the indoor visual navigation method shown in FIG. 3, the method mainly includes the following steps S302 to S304:

Step S302: If the indoor image to be located uploaded by the mobile device is received, the camera pose at the time the mobile device collected the indoor image is determined. The camera pose may include XY coordinates and camera orientation.

Step S304: The camera pose corresponding to the indoor image is sent to the mobile device, so that the mobile device establishes an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image and, based on the AR coordinate system and the shortest route, superimposes a three-dimensional mark indicating the direction of the route on the current preview image collected by the mobile device; the shortest route is planned by the mobile device in the indoor topology map based on the destination information set by the user.

Through the indoor visual navigation method provided by this embodiment, the server undertakes the computation-heavy steps of the navigation process, such as camera pose calculation, and the mobile device then quickly and conveniently guides the user along the shortest indoor route to the destination, which improves the user experience.

In this embodiment, the more time-consuming and space-consuming calculations can be executed on the server side, such as determining the camera pose at the time the mobile device collected the indoor image. In a possible implementation, the server can perform feature matching between the indoor image and the visual map in the pre-established visual map library to obtain the camera pose at the time the mobile device collected the indoor image; the visual map is represented by a sparse point cloud model of the indoor scene, which can be understood as a large collection of visual features of the scene. In a possible implementation, the PNP (Perspective-n-Point) problem can be solved according to the 3D-2D matching relationship; that is, the camera motion is estimated from 3D-to-2D point correspondences, so that the camera pose is solved from the matched feature point pairs.
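The quantity that a PNP solution is judged by is the reprojection error of the 3D-2D point pairs. The sketch below only evaluates that error for a given pose; it does not solve for the pose. It assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy` and a world-to-camera rotation and translation, and the names `project` and `mean_reprojection_error` are hypothetical. A production system would typically rely on a library PNP solver rather than these helpers.

```python
import math

def project(point_w, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point into pixel coordinates with a pinhole camera model.

    rotation:    3x3 world-to-camera rotation as nested lists.
    translation: world-to-camera translation as a length-3 sequence.
    """
    xc = [sum(rotation[i][j] * point_w[j] for j in range(3)) + translation[i]
          for i in range(3)]
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)

def mean_reprojection_error(points_w, pixels, rotation, translation, fx, fy, cx, cy):
    """Average pixel distance between observed pixels and projected 3D map points."""
    total = 0.0
    for point, (u, v) in zip(points_w, pixels):
        pu, pv = project(point, rotation, translation, fx, fy, cx, cy)
        total += math.hypot(pu - u, pv - v)
    return total / len(points_w)
```

A candidate camera pose that fits the matched point pairs well yields a small mean error, while a wrong pose or wrong matches yield a large one.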

This embodiment provides a possible implementation of matching the indoor image against the visual map in the pre-established visual map library to obtain the camera pose at the time the mobile device collected the indoor image. It mainly includes two stages, coarse positioning and fine positioning, which are introduced as follows:

In the coarse positioning stage, a deep hash algorithm is used to calculate the global image descriptor of the indoor image, the k most similar key frame images are searched for in the visual map library, and their key frame IDs are obtained. The stored key frame information is then looked up by key frame ID, including the pose of the key frame, its local feature points, its local descriptors, and the coordinates of the corresponding map points in the world coordinate system. The k key frames are clustered according to their poses, so that key frames with similar positions are grouped into one cluster. For each cluster, the cluster center provides a preliminary coarse positioning result for the subsequent fine positioning.
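The coarse positioning stage can be sketched as follows. The sketch assumes the deep hash algorithm has already produced a binary code per image, stored here as a Python integer, and that similarity is measured by Hamming distance. The greedy radius-based grouping is only one simple way to cluster key frames by position. The names `hamming`, `top_k_keyframes`, `cluster_by_position`, and the `radius` parameter are assumptions of this sketch.

```python
import math

def hamming(a, b):
    """Number of differing bits between two binary hash codes stored as integers."""
    return bin(a ^ b).count("1")

def top_k_keyframes(query_code, keyframe_codes, k):
    """Return the IDs of the k key frames whose hash codes are closest to the query.

    keyframe_codes: {keyframe_id: hash code as int}. Ties are broken by key frame ID.
    """
    ranked = sorted(keyframe_codes,
                    key=lambda kf: (hamming(query_code, keyframe_codes[kf]), kf))
    return ranked[:k]

def cluster_by_position(keyframe_ids, positions, radius):
    """Greedily group key frames whose positions lie within `radius` of a cluster center.

    positions: {keyframe_id: (x, y)} key frame positions in the world coordinate system.
    Returns a list of {"center": (x, y), "members": [keyframe_id, ...]}.
    """
    clusters = []
    for kf in keyframe_ids:
        x, y = positions[kf]
        for cluster in clusters:
            cx, cy = cluster["center"]
            if math.hypot(x - cx, y - cy) <= radius:
                cluster["members"].append(kf)
                n = len(cluster["members"])
                # running mean keeps the center at the average member position
                cluster["center"] = (cx + (x - cx) / n, cy + (y - cy) / n)
                break
        else:
            clusters.append({"center": (x, y), "members": [kf]})
    return clusters
```

Each cluster center then serves as a rough position estimate, and the members of the cluster are the key frames to match against in the fine positioning stage.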

In the fine positioning stage, each cluster is traversed in turn: local feature points are extracted from the indoor image, local descriptors are calculated and matched against the local features of all key frames in the cluster, and the 3D map points corresponding to the successfully matched feature points are retrieved. If the number of successfully matched 3D-2D point pairs is greater than a preset number (such as greater than 5), the PNP problem can be solved to obtain the camera pose corresponding to the indoor image. The pose obtained by PNP can be used as the initial value to construct a Bundle Adjustment graph optimization problem, which optimizes the pose corresponding to the indoor image by minimizing the reprojection error. After this pose optimization, the edges whose reprojection error is still large are removed, and the Bundle Adjustment graph optimization problem is constructed again with the remaining edges, finally yielding a more accurate pose corresponding to the indoor image. If the number of 3D-2D point pairs is too small or the reprojection error is still too large after optimization, the key frames in the current cluster are considered mismatches and the cluster is abandoned. If the reprojection error is small after optimization, the pose solution is considered correct and the result is output directly, without processing the remaining clusters.
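The control flow of the fine positioning stage can be sketched as below. Feature matching and pose solving are passed in as callbacks, because their implementations are outside the scope of this sketch. The names `localize`, `match_fn`, and `solve_pose_fn`, and the `max_error` threshold, are hypothetical; the default `min_pairs=5` follows the example value given in the text.

```python
def localize(clusters, match_fn, solve_pose_fn, min_pairs=5, max_error=2.0):
    """Try each cluster in turn and return the first pose that passes the checks.

    match_fn(cluster)    -> list of 3D-2D point pairs matched against that cluster
                            (hypothetical callback).
    solve_pose_fn(pairs) -> (pose, per_pair_errors), for example PNP followed by
                            Bundle Adjustment (hypothetical callback).
    max_error is an assumed reprojection-error threshold in pixels.
    Returns the accepted pose, or None if every cluster is rejected.
    """
    for cluster in clusters:
        pairs = match_fn(cluster)
        if len(pairs) <= min_pairs:
            continue                      # too few 3D-2D matches, abandon this cluster
        pose, errors = solve_pose_fn(pairs)
        # remove edges whose reprojection error is still large, then optimize again
        inliers = [pair for pair, err in zip(pairs, errors) if err <= max_error]
        if len(inliers) <= min_pairs:
            continue
        pose, errors = solve_pose_fn(inliers)
        if sum(errors) / len(errors) <= max_error:
            return pose                   # accepted, remaining clusters are not visited
    return None
```

Keeping the solver behind a callback means the same loop works whether the pose comes from a library PNP routine, a custom optimizer, or a test stub.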

The server can obtain a more accurate camera pose corresponding to the indoor image after the above-mentioned two-step positioning from coarse to fine. Of course, the above is only one method for determining the pose provided by this embodiment, and any other method for determining the pose of the camera can also be used, and it is not limited here.

In a possible implementation, the server also aligns the visual map with the pre-imported indoor plan distribution map (that is, the indoor floor plan), so that the mobile device used for indoor navigation can smoothly convert the planned path to the world coordinate system. Here, the indoor floor plan can be understood as a building structure drawing, and it can be uploaded to the server in advance by, for example, a shopping mall.

The server may construct a visual map library in advance, and the process of establishing the visual map library includes the following steps: (1) Obtain multiple scene images collected by the mobile device in the indoor scene. In a possible implementation manner, a large number of images of various indoor scenes can be collected in advance, so as to construct a more accurate sparse point cloud model. (2) Based on the SFM (Structure From Motion) algorithm, three-dimensional reconstruction of multiple scene images is performed, and a visual map library containing sparse point cloud models corresponding to multiple scene images is obtained. The SFM mapping process can be implemented using open source algorithms such as COLMAP, Theia, VisualSfM, OpenMVG, etc., and there is no restriction here.

In a possible implementation, in order to save disk storage space on the server, the visual map in the visual map library can also be compressed. For example, the original features in the visual map can be encoded by product quantization, and only the encoded result is saved in the visual map while the original features are discarded, thus greatly reducing the map size. When the server uses the visual map library for visual positioning, the encoded visual features can be decoded, and the decoded features can be used for matching and pose estimation.
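The encode and decode steps of product quantization can be sketched in a few lines of Python. The sketch assumes that the per-subspace codebooks have already been trained (for example by k-means, which is not shown) and that a feature vector splits evenly into one sub-vector per codebook; the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vector, codebooks):
    """Replace each sub-vector by the index of its nearest codebook centroid."""
    m = len(codebooks)
    if len(vector) % m != 0:
        raise ValueError("vector length must be divisible by the number of codebooks")
    d = len(vector) // m
    codes = []
    for i, book in enumerate(codebooks):
        sub = vector[i * d:(i + 1) * d]
        codes.append(min(range(len(book)), key=lambda c: squared_distance(sub, book[c])))
    return codes


def pq_decode(codes, codebooks):
    """Approximate the original vector by concatenating the selected centroids."""
    decoded = []
    for code, book in zip(codes, codebooks):
        decoded.extend(book[code])
    return decoded
```

Only the short list of integer codes needs to be stored per feature; the decoded vector is an approximation of the original, which is the trade-off behind the reduced map size.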

Through the above method, the server undertakes the computationally intensive construction of the visual map library and the determination of the camera pose, thereby reducing the hardware requirements of the mobile device and enabling the mobile device to provide navigation services to users more quickly based on the calculation results of the server.

In a possible implementation, considering that during the AR navigation process the navigation route may gradually become inaccurate due to the error of the initial visual positioning pose and the drift of the AR system itself, the server can also periodically obtain the current preview image collected by the mobile device during the navigation process, determine the current camera pose when the mobile device collects the current preview image, and send the current camera pose to the mobile device, so that the mobile device can correct the AR coordinate system based on the current camera pose.

Based on the indoor visual navigation method on the mobile device side and the indoor visual navigation method on the server side, the embodiments of the present disclosure also provide a possible implementation of an indoor visual navigation method. For details, refer to the flowchart of an indoor visual navigation method shown in FIG. 4, which includes the following steps:

Step S410: Collect multiple indoor images;

Step S412: Perform SFM mapping based on the indoor image to generate a visual map database.

Step S414: Align the visual map in the visual map database with the indoor plan distribution map. Here, the indoor plan distribution map can also be called the architectural structure map.

Step S420: receiving the indoor image to be located uploaded by the mobile phone;

Step S422: Extract the image features of the indoor image to be located, match the extracted image features with the visual map, and estimate the camera pose when the mobile phone collects the indoor image;

Step S424: Return the estimated camera pose to the mobile phone;

Step S430: Establish an AR coordinate system;

Step S432: align the AR coordinate system with the camera pose when the mobile phone collects indoor images;

Step S434: Receive destination information set by the user, and plan the shortest route in the indoor topology map;

Step S436: Detect the ground plane;

Step S438: Convert the shortest route into three-dimensional coordinates in the AR coordinate system, and draw the route on the ground plane with arrows.

For the specific implementation of the above steps, refer to the foregoing descriptions of the indoor visual navigation method on the mobile device side and the indoor visual navigation method on the server side, which will not be repeated here.

Steps S410 to S414 can be collectively referred to as the visual map construction operation, and steps S420 to S424 can be collectively referred to as the cloud visual positioning operation. Both the visual map construction operation and the cloud visual positioning operation can be performed by the server. Steps S430 to S438 can be collectively referred to as the AR navigation operation on the mobile terminal, which can be performed by a mobile device such as a mobile phone.

The indoor visual navigation method provided by this embodiment allows users to determine their location at any time and place by photographing the surrounding environment with a mobile phone. After selecting a destination, they can see on the mobile phone screen the best route planned by the mobile phone through the path selection algorithm, and reach the destination by walking along that route. Moreover, the method places the relatively time-consuming and space-consuming calculation on the server side, so that users can obtain real-time positioning and navigation on mobile devices.

The embodiment of the present disclosure also provides an indoor visual navigation device arranged on the side of the mobile device. Refer to the structural block diagram of the indoor visual navigation device shown in FIG. 5, which includes the following modules:

The image upload module 502 is configured to upload the indoor image to the server if the indoor image to be located is collected, so that the server can determine the camera pose when the mobile device collects the indoor image;

The coordinate system establishment module 504 is configured to receive the camera pose corresponding to the indoor image returned by the server, and establish an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image;

The route planning module 506 is configured to plan the shortest route in the pre-imported indoor topology map based on the destination information set by the user;

The navigation display module 508 is configured to display the current preview image collected by the mobile device on the interface of the mobile device, and based on the AR coordinate system and the shortest route, superimpose a three-dimensional mark indicating the direction of the route on the current preview image.

Through the aforementioned indoor visual navigation device provided in this embodiment, the mobile device can guide the user to the destination according to the shortest route in the AR mode indoors, which better improves the user experience.

In a possible implementation, the coordinate system establishing module 504 is configured to establish an initial AR coordinate system; adjust the initial AR coordinate system based on the camera pose corresponding to the indoor image, so that the AR coordinate system is aligned with the world coordinate system.
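One way to picture this adjustment is as a single rigid transform between the AR coordinate system and the world coordinate system. The following Python sketch assumes, beyond what the text states, that poses are 4x4 homogeneous matrices, that T_world_cam is the camera pose returned by the server, and that T_ar_cam is the pose of the same camera frame as reported by the AR tracking system when the indoor image was captured. All function names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(T):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    ti = [-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3)]
    return [list(Rt[0]) + [ti[0]],
            list(Rt[1]) + [ti[1]],
            list(Rt[2]) + [ti[2]],
            [0.0, 0.0, 0.0, 1.0]]


def align_ar_to_world(T_world_cam, T_ar_cam):
    """Transform mapping AR coordinates to world coordinates."""
    return mat_mul(T_world_cam, rigid_inverse(T_ar_cam))


def world_to_ar(T_world_ar, point_world):
    """Express a world-coordinate point in the AR coordinate system."""
    T_ar_world = rigid_inverse(T_world_ar)
    p = list(point_world) + [1.0]
    return tuple(sum(T_ar_world[i][j] * p[j] for j in range(4)) for i in range(3))
```

Under the same assumptions, recomputing align_ar_to_world with a newer camera pose from the server would be one way to correct the AR coordinate system for drift during navigation.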

In a possible implementation manner, the route planning module 506 is configured to use a route planning algorithm to plan the shortest route in the pre-imported indoor topology map based on the destination information.
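The text leaves the route planning algorithm open. As an illustration only, the sketch below uses Dijkstra's algorithm, one common choice for shortest paths on a graph, and assumes that the indoor topology map is available as a dictionary mapping each node to its neighbors and walking distances. The node names and the function shortest_route are hypothetical.

```python
import heapq


def shortest_route(topology, start, goal):
    """Dijkstra's algorithm on a weighted graph {node: {neighbor: distance}}.

    Returns (list of nodes from start to goal, total distance),
    or (None, inf) if the goal cannot be reached.
    """
    dist = {start: 0.0}
    prev = {}
    visited = set()
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == goal:
            break
        for neighbor, weight in topology.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if goal not in dist:
        return None, float("inf")
    route = [goal]
    while route[-1] != start:
        route.append(prev[route[-1]])
    return route[::-1], dist[goal]
```

The returned node list would then be mapped to the node positions stored in the topology map to obtain the route geometry.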

In a possible implementation manner, the navigation display module 508 is configured to detect the ground plane on the current preview image; determine the three-dimensional coordinates of the shortest route in the AR coordinate system, and generate a three-dimensional mark indicating the direction of the route based on the determined three-dimensional coordinates; and draw the three-dimensional mark on the ground plane of the current preview image.
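The generation of the three-dimensional marks can be sketched as placing arrows at a fixed spacing along the route on the detected ground plane. The sketch makes assumptions that the text does not state: the route is already expressed in the aligned AR coordinate system as horizontal (x, z) waypoints, the vertical axis is y, the ground plane is the horizontal plane at height ground_y, and the arrow heading is a yaw angle measured from the +z axis toward +x. The function name and parameters are hypothetical, and rendering is left to the AR framework.

```python
import math


def arrows_along_route(waypoints_xz, ground_y, spacing):
    """Return a list of ((x, y, z), yaw) arrow placements along the route.

    Arrows start at each segment's first waypoint and repeat every
    `spacing` units; yaw points along the segment's travel direction.
    """
    arrows = []
    for (x0, z0), (x1, z1) in zip(waypoints_xz, waypoints_xz[1:]):
        length = math.hypot(x1 - x0, z1 - z0)
        if length == 0:
            continue  # skip duplicated waypoints
        yaw = math.atan2(x1 - x0, z1 - z0)
        count = max(1, int(length // spacing))
        for i in range(count):
            s = i * spacing / length
            position = (x0 + s * (x1 - x0), ground_y, z0 + s * (z1 - z0))
            arrows.append((position, yaw))
    return arrows
```

Each placement gives the AR renderer a position on the ground plane and a heading, which is enough to draw one arrow pointing toward the next waypoint.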

In a possible implementation, the above-mentioned device further includes a coordinate system correction module configured to correct the AR coordinate system based on the current camera pose if the current camera pose issued by the server is received during the navigation process, so that the corrected AR coordinate system remains aligned with the world coordinate system.

The implementation principles and technical effects of the device provided in this embodiment are the same as those in the foregoing embodiment. For brevity, for parts not mentioned in the device embodiment, please refer to the corresponding content in the foregoing method embodiment.

The embodiment of the present disclosure also provides an indoor visual navigation device arranged on the server side. Refer to the structural block diagram of the indoor visual navigation device shown in FIG. 6, which includes the following modules:

The pose determination module 602 is configured to determine the camera pose when the mobile device collects the indoor image if an indoor image to be positioned uploaded by the mobile device is received;

The device navigation module 604 is configured to send the camera pose corresponding to the indoor image to the mobile device, so that the mobile device establishes an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image, and, based on the AR coordinate system and the shortest route, superimposes a three-dimensional mark indicating the direction of the route on the current preview image collected by the mobile device; wherein the shortest route is planned by the mobile device in the indoor topology map according to the destination information set by the user.

Through the above-mentioned indoor visual navigation device provided by this embodiment, the server can undertake the computationally intensive steps required in the navigation process, such as the calculation of the camera pose, and the mobile device can then quickly and conveniently guide the user indoors along the shortest route to the destination, which improves the user experience.

In one embodiment, the pose determination module 602 is configured to perform feature matching between the indoor image and the visual map in the pre-established visual map library to obtain the camera pose when the mobile device collects the indoor image; wherein the visual map is represented by a sparse point cloud model of the indoor scene.

In one embodiment, the above-mentioned device further includes a map building module configured to obtain multiple scene images collected by the mobile device in an indoor scene, and to perform three-dimensional reconstruction on the multiple scene images based on the SFM algorithm to obtain a visual map library containing the sparse point cloud models corresponding to the multiple scene images.

In one embodiment, the above-mentioned device further includes: an alignment module configured to align the visual map with a pre-imported indoor plan distribution map.

In one embodiment, the above-mentioned device further includes: a current pose determination module configured to periodically obtain the current preview image collected by the mobile device during the navigation process, and determine the current camera pose when the mobile device collects the current preview image; the module is further configured to send the current camera pose to the mobile device, so that the mobile device corrects the AR coordinate system based on the current camera pose.

The implementation principles and technical effects of the device provided by the embodiments of the present disclosure are the same as those of the previous embodiments. For brevity, for the parts not mentioned in the device embodiments, please refer to the corresponding content in the foregoing indoor visual navigation method embodiments.

The embodiment of the present disclosure also provides an indoor visual navigation system, which includes a mobile device and a server.

The embodiments of the present disclosure also provide an electronic device, including: a processor and a storage device; the storage device stores a computer program, and when run by the processor, the computer program executes the indoor visual navigation method on the mobile device side or the indoor visual navigation method on the server side.

The embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when run by a processor, the computer program executes the steps of the foregoing indoor visual navigation method.

Those skilled in the art can clearly understand that, for the convenience and conciseness of the description, the specific working process of the system described above can refer to the corresponding process in the foregoing embodiment, which will not be repeated here.

The computer program product of the indoor visual navigation method, device, system, and electronic equipment provided by the embodiments of the present disclosure includes a computer-readable storage medium storing program code, and the instructions included in the program code can be used to execute the methods described in the previous method embodiments. For the specific implementation, please refer to the foregoing content of the embodiments of the present disclosure, and details are not described herein again.

In addition, in the description of the embodiments of the present disclosure, unless otherwise clearly specified and limited, the terms "installed", "coupled", and "connected" should be interpreted broadly. For example, a connection may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection or an indirect connection through an intermediate medium; and it may be the internal communication between two components. For those of ordinary skill in the art, the specific meanings of the above-mentioned terms in the present disclosure can be understood according to the specific situation.

If the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various implementations of the present disclosure. The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

In the description of the present disclosure, it should be noted that the orientation or positional relationship indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", and "outer" is based on the orientation or positional relationship shown in the drawings. It is used only for the convenience of describing the present disclosure and simplifying the description, and does not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; it therefore cannot be construed as a limitation of the present disclosure. In addition, the terms "first", "second", and "third" are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance.

Finally, it should be noted that the above-mentioned embodiments are only optional implementations of the present disclosure, which are used to illustrate the technical solutions of the present disclosure rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person skilled in the art can, within the technical scope disclosed in the present disclosure, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features; such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and should be covered within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure should be subject to the protection scope of the claims.

Industrial applicability

The embodiments of the present disclosure provide an indoor visual navigation method, device, system, and electronic equipment. The mobile device uploads the collected indoor image to be positioned to a server; the server determines the camera pose when the mobile device collects the indoor image and sends the camera pose corresponding to the indoor image to the mobile device; the mobile device then establishes an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image, and plans the shortest route in the pre-imported indoor topology map based on the destination information set by the user; finally, the current preview image collected by the mobile device is displayed on the interface, and based on the AR coordinate system and the shortest route, a three-dimensional mark configured to indicate the direction of the route is superimposed on the current preview image to achieve indoor visual navigation. In this way, the user can be guided to the destination along the shortest route in AR mode indoors, which improves the user experience.

Claims

An indoor visual navigation method, wherein the method is executed by a mobile device, and the method includes:

If an indoor image to be located is collected, upload the indoor image to a server, so that the server can determine the camera pose when the mobile device collects the indoor image;

Receiving the camera pose corresponding to the indoor image returned by the server, and establishing an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image;

Plan the shortest route in the pre-imported indoor topology map based on the destination information set by the user;

Display the current preview image collected by the mobile device on the interface of the mobile device, and based on the AR coordinate system and the shortest route, superimpose a three-dimensional mark indicating the direction of the route on the current preview image.
The method according to claim 1, wherein the step of establishing an AR coordinate system aligned with a world coordinate system based on the camera pose corresponding to the indoor image comprises:

Establish the initial AR coordinate system;

The initial AR coordinate system is adjusted based on the camera pose corresponding to the indoor image, so that the AR coordinate system is aligned with the world coordinate system.
The method according to claim 1 or 2, wherein the step of planning the shortest route in a pre-imported indoor topology map based on the destination information set by the user comprises:

Based on the destination information set by the user, the route planning algorithm is used to plan the shortest route in the pre-imported indoor topology map.
The method according to any one of claims 1 to 3, wherein the step of superimposing a three-dimensional mark for indicating the direction of the route on the current preview image comprises:

Detecting the ground plane on the current preview image;

Determine the three-dimensional coordinates of the shortest route in the AR coordinate system, and generate a three-dimensional identifier for indicating the direction of the route based on the determined three-dimensional coordinates;

Draw the three-dimensional mark on the ground plane of the current preview image.
The method according to any one of claims 1-4, wherein the method further comprises:

During the navigation process, if the current camera pose issued by the server is received, the AR coordinate system is corrected based on the current camera pose, so that the corrected AR coordinate system remains aligned with the world coordinate system.
An indoor visual navigation method, wherein the method is executed by a server, and the method includes:

If an indoor image to be located uploaded by a mobile device is received, determine the camera pose when the mobile device collects the indoor image;

Send the camera pose corresponding to the indoor image to the mobile device, so that the mobile device establishes an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image, and, based on the AR coordinate system and the shortest route, superimposes a three-dimensional mark indicating the direction of the route on the current preview image collected by the mobile device; wherein the shortest route is planned by the mobile device in the indoor topology map based on the destination information set by the user.
The method according to claim 6, wherein the step of determining the camera pose when the mobile device collects the indoor image comprises:

The indoor image is feature-matched with the visual map in the pre-established visual map library to obtain the camera pose when the mobile device collects the indoor image; wherein the visual map is represented by a sparse point cloud model of the indoor scene.
The method according to claim 7, wherein the step of matching the features of the indoor image with a visual map in a pre-established visual map library to obtain the camera pose when the mobile device collects the indoor image comprises:

Calculating the global image descriptor of the indoor image by using a deep hash algorithm;

Searching for multiple key frame images similar to the global image descriptor in the visual map library;

Acquiring key frame information of each of the key frame images;

Dividing the plurality of key frame images into a plurality of clusters according to the key frame information;

Traverse each of the clusters to obtain local feature points of the indoor image;

Calculate local descriptors and match them with local feature points in the cluster;

Obtain the 3D map points corresponding to the successfully matched local feature points;

If the number of 3D map points is greater than the preset number, the camera pose corresponding to the indoor image is obtained.
The method according to claim 7 or 8, wherein the process of establishing the visual map library comprises:

Acquire multiple scene images collected by the mobile device in the indoor scene;

Performing three-dimensional reconstruction on the multiple scene images based on the SFM algorithm to obtain a visual map library containing sparse point cloud models corresponding to the multiple scene images.
The method according to any one of claims 7-9, wherein the method further comprises:

Align the visual map with the pre-imported indoor plan distribution map.
The method according to any one of claims 7-10, wherein the method further comprises:

Compress the visual map in the visual map library.
The method according to claim 11, wherein the step of compressing the visual map in the visual map library comprises:

Encoding the original features in the visual map;

Save the encoded features, and discard the original features that existed before encoding.
The method according to any one of claims 6-12, wherein the method further comprises:

Periodically acquiring the current preview image collected by the mobile device during the navigation process, and determining the current camera pose when the mobile device collects the current preview image;

Send the current camera pose to the mobile device, so that the mobile device corrects the AR coordinate system based on the current camera pose.
An indoor visual navigation device, wherein the device is arranged on the side of a mobile device, and the device includes:

An image upload module configured to upload the indoor image to a server if the indoor image to be located is collected, so that the server determines the camera pose when the mobile device collects the indoor image;

A coordinate system establishment module configured to receive the camera pose corresponding to the indoor image returned by the server, and establish an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image;

The route planning module is configured to plan the shortest route in the pre-imported indoor topology map based on the destination information set by the user;

The navigation display module is configured to display the current preview image collected by the mobile device on the interface of the mobile device, and based on the AR coordinate system and the shortest route, superimpose a three-dimensional mark indicating the direction of the route on the current preview image.
An indoor visual navigation device, wherein the device is arranged on the server side, and the device includes:

The pose determination module is configured to determine the camera pose when the mobile device collects the indoor image if an indoor image to be positioned uploaded by the mobile device is received;

A device navigation module configured to send the camera pose corresponding to the indoor image to the mobile device, so that the mobile device establishes an AR coordinate system aligned with the world coordinate system based on the camera pose corresponding to the indoor image, and, based on the AR coordinate system and the shortest route, superimposes a three-dimensional mark indicating the direction of the route on the current preview image collected by the mobile device; wherein the shortest route is planned by the mobile device in the indoor topology map according to the destination information set by the user.
An indoor visual navigation system, wherein the system includes a mobile device and a server that are communicatively connected; wherein the mobile device is configured to execute the method according to any one of claims 1 to 5, and the server is configured to execute The method according to any one of claims 6 to 13.
An electronic device, including: a processor and a storage device;

A computer program is stored on the storage device, and the computer program, when run by the processor, executes the method according to any one of claims 1 to 5, or the method according to any one of claims 6 to 13.
A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when run by a processor, executes the steps of the method according to any one of claims 1 to 5 or the steps of the method according to any one of claims 6 to 13.