CN112381877A - Positioning fusion and indoor positioning method, device, equipment and medium


Info

Publication number
CN112381877A
Authority
CN
China
Prior art keywords
visual positioning
feature
matching
dimensional
standard visual
Legal status
Granted
Application number
CN202011240872.7A
Other languages
Chinese (zh)
Other versions
CN112381877B (en)
Inventor
张晋川
赵晨旭
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011240872.7A
Publication of CN112381877A
Application granted
Publication of CN112381877B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/70 - Determining position or orientation of objects or cameras
    • G06T7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 - Instruments for performing navigational calculations
    • G01C21/206 - Instruments for performing navigational calculations specially adapted for indoor navigation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10004 - Still image; Photographic image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Automation & Control Theory (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a positioning fusion method, an indoor positioning method, a positioning fusion device, an indoor positioning device and a medium, relating to the fields of visual positioning, intelligent search, deep learning and augmented reality. The positioning fusion method includes: acquiring a plurality of environment acquisition images matched with an indoor environment; extracting multiple types of visual positioning features of the three-dimensional objects in each environment acquisition image to obtain a plurality of standard visual positioning features matched with the indoor environment, wherein each standard visual positioning feature is attached to a three-dimensional object and is associated with a three-dimensional position point in the indoor environment; and fusing, according to the positional relationship between every two standard visual positioning features, to obtain a heterogeneous probability matrix matched with the indoor environment, wherein each probability value in the heterogeneous probability matrix represents the probability of a direct connection relationship between two standard visual positioning features. With the technical scheme of the application, multiple kinds of visual positioning features can be fused, so that accurate positioning is achieved.

Description

Positioning fusion and indoor positioning method, device, equipment and medium
Technical Field
The present application relates to the field of image processing, and in particular, to the field of visual positioning, intelligent search, deep learning, and augmented reality, and more particularly, to a method, an apparatus, a device, and a medium for positioning fusion and indoor positioning.
Background
Visual positioning is a positioning approach in which a camera captures an environment image and the photographer's position is determined either by comparing the image with known map elements or by recursive calculation. Depending on the positioning scene, visual positioning can be divided into indoor positioning and outdoor positioning; indoor positioning can be applied to path planning and similar tasks in complex indoor scenes.
In the prior art, visual positioning methods are mainly based on feature points of the environment image. However, feature-point-based visual positioning depends heavily on the scene and performs poorly in scenes with obvious illumination or structural changes.
Disclosure of Invention
The embodiment of the application provides a positioning fusion method, an indoor positioning method, a positioning fusion device, an indoor positioning device, equipment and a medium.
In a first aspect, an embodiment of the present application discloses a positioning fusion method, including:
acquiring a plurality of environment acquisition images matched with an indoor environment;
extracting various types of visual positioning features of the three-dimensional object in each environment acquisition image to obtain a plurality of standard visual positioning features matched with the indoor environment;
wherein each standard visual positioning feature is attached to a three-dimensional object and is associated with a three-dimensional position point in the indoor environment;
fusing to obtain a heterogeneous probability matrix matched with the indoor environment according to the position relation between every two standard visual positioning features;
and each probability value in the heterogeneous probability matrix represents the direct connection relation probability between every two standard visual positioning features.
In a second aspect, an embodiment of the present application discloses an indoor positioning method, including:
acquiring a target retrieval image acquired by an object to be positioned, and extracting various types of visual positioning features from the target retrieval image;
matching each comparison visual positioning feature extracted from the target retrieval image with each standard visual positioning feature in an indoor environment to generate a one-dimensional coarse matching feature vector;
the one-dimensional coarse matching feature vector represents the occurrence probability of each standard visual positioning feature in the indoor environment in the target retrieval image;
determining a matching relation between each comparison visual positioning feature in the target retrieval image and a standard visual positioning feature in the indoor environment according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix matched with the indoor environment;
and performing indoor positioning on the object to be positioned according to the matching relation.
In a third aspect, an embodiment of the present application discloses a positioning fusion device, including:
the environment acquisition image acquisition module is used for acquiring a plurality of environment acquisition images matched with the indoor environment;
the standard visual positioning feature acquisition module is used for extracting various types of visual positioning features of the three-dimensional object in each environment acquisition image to obtain a plurality of standard visual positioning features matched with the indoor environment;
wherein each standard visual positioning feature is attached to a three-dimensional object and is associated with a three-dimensional position point in the indoor environment;
the heterogeneous probability matrix acquisition module is used for fusing to obtain a heterogeneous probability matrix matched with the indoor environment according to the position relation between every two standard visual positioning features;
and each probability value in the heterogeneous probability matrix represents the direct connection relation probability between every two standard visual positioning features.
In a fourth aspect, an embodiment of the present application discloses an indoor positioning device, which includes:
the visual positioning feature extraction module is used for acquiring a target retrieval image acquired by an object to be positioned and extracting various types of visual positioning features from the target retrieval image;
the one-dimensional coarse matching feature vector acquisition module is used for matching each comparison visual positioning feature extracted from the target retrieval image with each standard visual positioning feature in an indoor environment to generate a one-dimensional coarse matching feature vector;
the one-dimensional coarse matching feature vector represents the occurrence probability of each standard visual positioning feature in the indoor environment in the target retrieval image;
the matching relation determining module is used for determining the matching relation between each comparison visual positioning feature in the target retrieval image and the standard visual positioning feature in the indoor environment according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix matched with the indoor environment;
and the object to be positioned positioning module is used for carrying out indoor positioning on the object to be positioned according to the matching relation.
In a fifth aspect, an embodiment of the present application discloses an electronic device, which includes at least one processor and a memory communicatively connected to the at least one processor. Wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the localization fusion method or the indoor localization method as described in any of the embodiments of the present application.
In a sixth aspect, embodiments of the present application disclose a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the localization fusion method or the indoor localization method as described in any of the embodiments of the present application.
According to the technical scheme, the multiple visual positioning features are fused for positioning, and the positioning accuracy is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
fig. 1 is a flowchart of a positioning fusion method provided in an embodiment of the present application;
FIG. 2a is a flow chart of another localization fusion method provided by the embodiments of the present application;
FIG. 2b is a schematic diagram of a heterogeneous probability map provided by an embodiment of the present application;
fig. 3 is a flowchart of an indoor positioning method provided in an embodiment of the present application;
fig. 4a is a flowchart of another indoor positioning method provided in the embodiment of the present application;
fig. 4b is a flowchart of an indoor positioning method provided in a specific application scenario of the present application;
FIG. 5 is a schematic structural diagram of a positioning fusion device according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of an indoor positioning device provided in an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a flowchart of a positioning fusion method provided in an embodiment of the present application, and the technical solution of the embodiment of the present application may be applied to a situation where visual positioning features in an environment image are fused to obtain a heterogeneous probability matrix matching an indoor environment. The method can be implemented by a positioning fusion device, which can be implemented by software and/or hardware, and is generally integrated in an electronic device, typically, a server, a desktop, a mobile phone, a tablet, and the like.
As shown in fig. 1, the technical solution of the embodiment of the present application specifically includes the following steps:
and S110, acquiring a plurality of environment acquisition images matched with the indoor environment.
The indoor environment may be an area such as a parking garage, a shopping mall or a cinema. Owing to factors such as illumination and structure, positioning in an indoor environment with a single type of visual positioning feature gives relatively poor accuracy and effect. Therefore, the present application fuses multiple types of visual positioning features for indoor positioning.
The environment acquisition images are matched with the indoor environment, and the set of environment acquisition images can completely cover the panorama of the indoor environment. Illustratively, the environment acquisition images may be captured by a panoramic camera; the capture mode of the environment acquisition images is not limited in this embodiment.
And S120, extracting various types of visual positioning features of the three-dimensional object in each environment acquisition image to obtain multiple standard visual positioning features matched with the indoor environment.
Wherein each of the standard visual positioning features is attached to a three-dimensional object and is associated with a three-dimensional location point in the indoor environment.
A three-dimensional object refers to an object in the indoor environment whose constituent parts do not all lie in the same plane, and it is used as the source from which visual positioning features are extracted.
The visual positioning feature is obtained by extracting the three-dimensional object and is used for carrying out visual positioning. The standard visual positioning feature is a visual positioning feature extracted from a three-dimensional object included in the indoor environment. The three-dimensional location point is a three-dimensional spatial location point of a three-dimensional object in an indoor environment.
Optionally, the types of visual positioning features may include: two or more of feature point features in the three-dimensional object, line features in the three-dimensional object, plane features in the three-dimensional object, and semantic features in the three-dimensional object.
A feature point refers to a point where the gray value of the image changes drastically, or a point of large curvature on an image edge; in this embodiment, a feature point feature of a three-dimensional object refers to such a point belonging to the three-dimensional object in the environment acquisition image. Feature point extraction may be implemented by a Harris corner detection algorithm, an MOPS (Multi-Scale Oriented Patches) algorithm, or a SIFT (Scale Invariant Feature Transform) algorithm, which is not limited in the embodiment of the present application.
The line features of a three-dimensional object refer to the straight lines in the three-dimensional object and the correlations between those lines; line features can be extracted using an LSD (Line Segment Detector) algorithm or an LBD (Line Band Descriptor) algorithm, which is not limited in this embodiment.
The plane features of a three-dimensional object include its area, perimeter, centroid, third-order standard moments and the like. The perimeter can be calculated from the length of a chain code, where a chain code is a digital sequence formed by the starting point and direction symbols of a line segment. The third-order standard moments can be calculated from the feature points of the three-dimensional object, and the area and centroid can also be obtained during that calculation. The present embodiment does not limit the way the plane features are extracted.
The semantic features of a three-dimensional object are the information expressed by the environment acquisition image that is closest to human understanding. For example, when the three-dimensional object comprises a combination of a bowl, chopsticks and a spoon, the semantics corresponding to the three-dimensional object may be "tableware". The semantic features may be extracted by a pre-trained machine learning model; the present embodiment does not limit the manner of extracting the semantic features, the type of machine learning, or the method of training the machine learning model.
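For illustration only, the following minimal Python sketch shows how several of the feature types named above might be extracted with OpenCV (SIFT for feature points, the LSD detector for lines, image moments for simple plane statistics). The function name, thresholds and output layout are assumptions of the sketch, availability of SIFT and LSD depends on the OpenCV build, and the semantic branch is left as a placeholder since no particular model is prescribed here.

    # Illustrative sketch, not the patented implementation.
    import cv2
    import numpy as np

    def extract_visual_features(image_bgr):
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)

        # Feature point features: SIFT keypoints and descriptors.
        sift = cv2.SIFT_create()
        keypoints, descriptors = sift.detectAndCompute(gray, None)

        # Line features: LSD line segments (x1, y1, x2, y2).
        lsd = cv2.createLineSegmentDetector()
        lines = lsd.detect(gray)[0]   # may be None if no lines are found

        # Simple plane statistics (area, centroid) from image moments of a
        # thresholded region, standing in for the plane features above.
        _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        m = cv2.moments(mask)
        area = m["m00"]
        centroid = (m["m10"] / m["m00"], m["m01"] / m["m00"]) if m["m00"] else (0.0, 0.0)

        # Semantic features would come from a pre-trained recognition model;
        # left as a placeholder here.
        semantic = None

        return {"points": (keypoints, descriptors), "lines": lines,
                "plane": {"area": area, "centroid": centroid},
                "semantic": semantic}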
And S130, fusing to obtain a heterogeneous probability matrix matched with the indoor environment according to the position relation between every two standard visual positioning features.
And each probability value in the heterogeneous probability matrix represents the direct connection relation probability between every two standard visual positioning features.
The heterogeneous probability matrix is composed of the direct connection relationship probabilities between every two standard visual positioning features, and each direct connection relationship probability expresses the positional relationship between the two features. The larger the direct connection relationship probability, the closer the two standard visual positioning features are in position, and the more likely they belong to the same three-dimensional object.
According to the technical scheme, the environment acquisition images of the indoor environment are acquired, various visual positioning characteristics of the three-dimensional object contained in each environment acquisition image are extracted to serve as standard visual positioning characteristics, and the heterogeneous probability matrix is generated according to the position relation between every two standard visual positioning characteristics. The method solves the problems that the visual positioning method based on the feature points in the prior art is seriously dependent on scenes and has poor positioning effect under scenes with obvious illumination or structural change, and realizes the fusion of various visual positioning features for positioning, thereby improving the positioning accuracy.
Fig. 2a is a flowchart of another positioning fusion method in the embodiment of the present application, and the present embodiment further embodies the process of generating the heterogeneous probability matrix on the basis of the above embodiment.
Correspondingly, as shown in fig. 2a, the technical solution of the embodiment of the present application specifically includes the following steps:
s210, acquiring a plurality of environment acquisition images matched with the indoor environment.
S220, extracting various types of visual positioning features of the three-dimensional object in each environment acquisition image to obtain multiple standard visual positioning features matched with the indoor environment.
Specifically, feature point features, line features, plane features and semantic features are extracted for each three-dimensional object in the environment acquisition images.
And S230, calculating a common-view relation matrix matched with each standard visual positioning feature according to each standard visual positioning feature included in the environment acquisition image.
And each element in the common-view relationship matrix represents the position relationship of every two visual positioning features in the same environment acquisition image.
The common-view relationship represents the positional relationship of two visual positioning features in the same environment acquisition image; correspondingly, each element in the common-view relationship matrix is used to represent the positional relationship of two visual positioning features in the same environment acquisition image. This positional relationship can be represented by the number of pixels between the two visual positioning features.
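A possible realization of the common-view relation matrix is sketched below, interpreting the "number of pixels between" two features as their Euclidean pixel distance within one environment acquisition image; the data layout and helper name are assumptions of this sketch.

    # Illustrative sketch, not the patented implementation.
    import numpy as np

    def coview_distance_matrix(observations, num_features):
        # observations: list of dicts, one per environment acquisition image,
        # mapping a global standard-feature index to its (u, v) pixel location.
        # (This data layout is an assumption of the sketch.)
        D = np.full((num_features, num_features), np.inf)
        for obs in observations:
            idx = list(obs.keys())
            for a in range(len(idx)):
                for b in range(a + 1, len(idx)):
                    i, j = idx[a], idx[b]
                    d = np.linalg.norm(np.subtract(obs[i], obs[j]))
                    # Keep the smallest pixel distance over all images that
                    # co-observe this pair of features.
                    D[i, j] = D[j, i] = min(D[i, j], d)
        np.fill_diagonal(D, 0.0)
        return D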
And S240, calculating a three-dimensional spatial position relation matrix matched with each standard visual positioning feature according to the three-dimensional position points respectively corresponding to each standard visual positioning feature.
Each element in the three-dimensional spatial position relationship matrix represents the positional relationship of two standard visual positioning features in three-dimensional space.
The three-dimensional spatial position relationship represents the positional relationship between the three-dimensional position points corresponding to two standard visual positioning features. Correspondingly, the positional relationship of two standard visual positioning features in three-dimensional space can be represented by the distance, in three-dimensional space, between the three-dimensional position points corresponding to the two features.
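Analogously, the three-dimensional spatial position relationship matrix can be filled with the pairwise Euclidean distances between the three-dimensional position points; a brief sketch under the same assumptions follows.

    # Illustrative sketch: pairwise 3D distances between the position points
    # of the standard visual positioning features, given as an (N, 3) array.
    import numpy as np

    def spatial_distance_matrix(points_3d):
        diff = points_3d[:, None, :] - points_3d[None, :, :]
        return np.linalg.norm(diff, axis=-1)   # (N, N) pairwise 3D distances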
And S250, calculating to obtain a heterogeneous probability matrix matched with the indoor environment according to the common-view relation matrix and the three-dimensional spatial position relation matrix.
In the embodiment of the application, the common-view relation matrix and the three-dimensional spatial position relation matrix are superposed to obtain the heterogeneous probability matrix matched with the indoor environment.
In the embodiment of the present application, the heterogeneous probability matrix is represented by the sum of the common-view relationship matrix and the three-dimensional spatial position relationship matrix, but the composition of the heterogeneous probability matrix is not limited in the embodiment of the present application.
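One plausible way to turn the two distance matrices above into direct-connection probabilities before superposing them is a Gaussian kernel, as sketched below; the kernel widths and weights are assumptions of this sketch, since the text above only specifies that the two matrices are summed.

    # Illustrative sketch, not the patented implementation.
    import numpy as np

    def heterogeneous_probability_matrix(D_coview, D_spatial,
                                         sigma_px=50.0, sigma_m=1.0,
                                         w_coview=0.5, w_spatial=0.5):
        # Smaller distance -> higher direct-connection probability.
        P_coview = np.exp(-np.square(D_coview) / (2.0 * sigma_px ** 2))
        P_coview[~np.isfinite(D_coview)] = 0.0   # pairs never co-observed
        P_spatial = np.exp(-np.square(D_spatial) / (2.0 * sigma_m ** 2))
        P = w_coview * P_coview + w_spatial * P_spatial
        np.fill_diagonal(P, 0.0)
        return P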
For example, fig. 2b is a schematic diagram of a heterogeneous probability map provided in an embodiment of the present application, as shown in fig. 2b, each layer represents one type of standard visual positioning feature, and each node represents each actual standard visual positioning feature, for example, when the layer C is a feature point feature, the node on the layer C is a feature point feature of each three-dimensional object. The connecting lines represent direct connection relations between every two standard visual positioning features, and the direct connection relation probability value corresponding to each connecting line is calculated, so that the heterogeneous probability matrix can be obtained.
According to the technical scheme of the embodiment of the application, the environment acquisition images of the indoor environment are acquired, multiple types of visual positioning features of the three-dimensional objects contained in each environment acquisition image are extracted as standard visual positioning features, and the heterogeneous probability matrix is generated according to the positional relationships between every two standard visual positioning features, both within the environment acquisition images and in three-dimensional space. This solves the problems in the prior art that the feature-point-based visual positioning method depends heavily on the scene and has a poor positioning effect in scenes with obvious illumination or structural changes, and realizes the fusion of multiple visual positioning features for positioning, thereby improving the positioning accuracy.
Fig. 3 is a flowchart of an indoor positioning method in the embodiment of the present application, and the technical solution in the embodiment of the present application may be applied to a case where indoor positioning is performed on an object to be positioned according to a retrieval image acquired by the object to be positioned. The method may be implemented by an indoor positioning apparatus, which may be implemented by software and/or hardware, and is generally integrated in an electronic device, and typically may be integrated in a server, a desktop, a mobile phone, a tablet, and the like.
Correspondingly, as shown in fig. 3, the technical solution of the embodiment of the present application specifically includes the following steps:
s310, obtaining a target retrieval image acquired by the object to be positioned, and extracting various types of visual positioning characteristics of the target retrieval image.
The object to be positioned is the photographer of the target retrieval image; the target retrieval image is an image captured, by an image capture device, at the current position of the object to be positioned and is used to locate that current position.
Illustratively, when the object to be positioned is a user and the target retrieval image is an image taken inside a shopping mall, this embodiment can accurately locate the user's position in the mall, and path navigation inside the mall can then be realized. When the object to be positioned is a vehicle and the target retrieval image is an image of an underground parking lot, the vehicle's position in the underground parking lot can be accurately located, which can further assist the vehicle in automatic parking and other operations.
In the embodiment of the application, after a target retrieval image acquired by an object to be positioned is acquired, a plurality of types of visual positioning features extracted from the target retrieval image are used as comparison visual positioning features.
S320, matching each comparison visual positioning feature extracted from the target retrieval image with each standard visual positioning feature in an indoor environment to generate a one-dimensional coarse matching feature vector.
Wherein the one-dimensional coarse matching feature vector characterizes the occurrence probability of each standard visual positioning feature in the indoor environment in the target retrieval image.
The element value of each element in the one-dimensional coarse matching feature vector represents the probability that the corresponding standard visual positioning feature appears in the target retrieval image. The higher the probability value, the more likely the corresponding standard visual positioning feature appears in the target retrieval image, so the standard visual positioning features with higher probability values can be matched against the comparison visual positioning features.
S330, determining the matching relation between each comparison visual positioning feature in the target retrieval image and the standard visual positioning feature in the indoor environment according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix matched with the indoor environment.
The number of standard visual positioning features in the indoor environment is typically greater than the number of comparison visual positioning features. According to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix, the standard visual positioning feature matching each comparison visual positioning feature can be found among the standard visual positioning features of the indoor environment, and a one-to-one correspondence between the comparison visual positioning features and their matching standard visual positioning features can then be obtained.
Optionally, the determining, according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix matched with the indoor environment, a matching relationship between each comparison visual positioning feature in the target retrieval image and a standard visual positioning feature in the indoor environment may include: calculating to obtain a one-dimensional fine matching feature vector according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix; and determining the matching relation between each comparison visual positioning feature in the target retrieval image and the standard visual positioning feature in the indoor environment according to the one-dimensional fine matching feature vector.
Elements in the one-dimensional fine matching feature vector are also used for representing probability values of the standard visual positioning features appearing in the target retrieval image, but the probability values in the one-dimensional fine matching feature vector are more accurate than the probability values in the one-dimensional coarse matching feature vector.
According to the number of comparison visual positioning features in the target retrieval image, a corresponding number of probability values are selected, from high to low, from the one-dimensional fine matching feature vector; the standard visual positioning features corresponding to the selected probability values are the standard visual positioning features that appear in the target retrieval image. Each selected standard visual positioning feature is then matched with the comparison visual positioning features to obtain the one-to-one correspondence between the comparison visual positioning features and the standard visual positioning features.
S340, according to the matching relation, indoor positioning is carried out on the object to be positioned.
And after the comparison visual positioning feature is successfully matched with the standard visual positioning feature in the indoor environment, the position of the object to be positioned can be obtained according to the position of the standard visual positioning feature.
According to the technical scheme of the embodiment of the application, various types of visual positioning features are extracted from a target retrieval image acquired by an object to be positioned, the extracted comparison visual positioning features are matched with various standard visual positioning features of an indoor environment to generate a one-dimensional coarse matching feature vector, the matching relation between the comparison visual positioning features and the standard visual positioning features is obtained according to the one-dimensional coarse matching feature vector and a heterogeneous probability matrix of the indoor environment, and then the object to be positioned is positioned indoors. The problem that in the prior art, a visual positioning method based on feature points is seriously dependent on scenes and has a poor positioning effect in scenes with obvious illumination or structural changes is solved, and the accuracy of indoor positioning is improved.
Fig. 4a is a flowchart of another indoor positioning method in the embodiment of the present application. On the basis of the above embodiments, this embodiment further details the process of generating the one-dimensional coarse matching feature vector, the process of obtaining the matching relationship between the comparison visual positioning features and the standard visual positioning features, and the process of performing indoor positioning according to the matching relationship.
Correspondingly, as shown in fig. 4a, the technical solution of the embodiment of the present application specifically includes the following steps:
s410, obtaining a target retrieval image acquired by an object to be positioned, and extracting various types of visual positioning features of the target retrieval image.
And S420, matching each comparison visual positioning feature with each standard visual positioning feature respectively to obtain a single feature matching vector corresponding to each comparison visual positioning feature respectively.
Wherein each element in the single feature matching vector represents a matching probability of the comparison visual positioning feature and each standard visual positioning feature.
And S430, generating a one-dimensional coarse matching feature vector according to each single feature matching vector.
And adding the single feature matching vectors to obtain a one-dimensional coarse matching feature vector.
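A minimal sketch of this step follows, taking descriptor cosine similarity as the matching probability; the similarity measure and descriptor representation are assumptions of the sketch rather than requirements of the disclosure.

    # Illustrative sketch, not the patented implementation.
    import numpy as np

    def coarse_matching_vector(query_descs, standard_descs):
        # query_descs: (M, d) descriptors of the comparison visual positioning
        # features; standard_descs: (N, d) descriptors of the standard features.
        q = query_descs / np.linalg.norm(query_descs, axis=1, keepdims=True)
        s = standard_descs / np.linalg.norm(standard_descs, axis=1, keepdims=True)
        single_vectors = np.clip(q @ s.T, 0.0, 1.0)   # (M, N) single feature matching vectors
        return single_vectors.sum(axis=0)             # (N,) one-dimensional coarse matching vector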
And S440, multiplying the one-dimensional coarse matching feature vector by the heterogeneous probability matrix for multiple times until the calculation result is converged, and determining the converged calculation result as the one-dimensional fine matching feature vector.
The calculation result convergence means that all probabilities in the calculation result do not change any more, and all probability values in the one-dimensional fine matching feature vector obtained at the moment can accurately represent the probability of each standard visual positioning feature appearing in the target retrieval image.
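This iteration behaves like a power-iteration-style refinement over the heterogeneous graph. A brief sketch is given below; the normalization step and convergence tolerance are assumptions added so the iteration stays bounded.

    # Illustrative sketch, not the patented implementation.
    import numpy as np

    def fine_matching_vector(coarse_vec, P, tol=1e-6, max_iter=100):
        v = coarse_vec / (coarse_vec.sum() + 1e-12)
        for _ in range(max_iter):
            v_next = P @ v
            v_next /= (v_next.sum() + 1e-12)
            if np.linalg.norm(v_next - v, ord=1) < tol:   # result has converged
                return v_next
            v = v_next
        return v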
S450, selecting a target standard visual positioning feature matched with the quantity value from all standard visual positioning features according to the quantity value of the comparison visual positioning feature included in the target retrieval image and the element value of each element in the one-dimensional fine matching feature vector.
In the embodiment of the application, because the number of standard visual positioning features is usually greater than the number of comparison visual positioning features, the quantity value of the comparison visual positioning features is obtained, and that many probability values are selected, from high to low, from the one-dimensional fine matching feature vector; the corresponding standard visual positioning features serve as the set of target standard visual positioning features.
And S460, respectively matching each comparison visual positioning feature with each target standard visual positioning feature to establish a matching relationship between each comparison visual positioning feature and each target standard visual positioning feature.
After the set of target standard visual positioning features is obtained, each comparison visual positioning feature is matched with each target standard visual positioning feature again, and a one-to-one correspondence relationship between the comparison visual positioning features and the target standard visual positioning features can be established.
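One way to realize the selection and re-matching is sketched below: the highest-probability entries of the one-dimensional fine matching feature vector give the target standard visual positioning features, and each comparison feature is then assigned greedily by descriptor similarity. The greedy assignment is an assumption of the sketch; the disclosure only requires that a one-to-one correspondence be established.

    # Illustrative sketch, not the patented implementation.
    import numpy as np

    def match_to_targets(query_descs, standard_descs, fine_vec):
        M = query_descs.shape[0]
        target_idx = np.argsort(fine_vec)[::-1][:M]   # top-M target standard features

        q = query_descs / np.linalg.norm(query_descs, axis=1, keepdims=True)
        t = standard_descs[target_idx]
        t = t / np.linalg.norm(t, axis=1, keepdims=True)
        sim = q @ t.T                                 # (M, M) similarity

        matches, used = {}, set()
        # Assign pairs from the most similar downwards, keeping one-to-one.
        order = np.argsort(sim, axis=None)[::-1]
        for qi, ti in zip(*np.unravel_index(order, sim.shape)):
            if int(qi) not in matches and int(ti) not in used:
                matches[int(qi)] = int(target_idx[ti])
                used.add(int(ti))
        return matches   # comparison feature index -> standard feature index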
S470, according to the matching relation, adopting a camera attitude estimation algorithm to carry out indoor positioning on the object to be positioned.
The camera pose estimation algorithm is used for acquiring the current position of the image shooting device, namely the current position of the object to be positioned.
The camera pose estimation algorithm can be a PNP (perspective-n-point) algorithm, and the principle of the PNP algorithm is to calculate the pose of the camera according to the point pairs of a plurality of three-dimensional space points and image plane points. In this embodiment, after the one-to-one correspondence relationship is established between the visual positioning features and the target standard visual positioning features by comparison, the target standard visual positioning features correspond to three-dimensional position points, and the comparison visual positioning features can correspond to a pixel point in the target retrieval image, so that the current position of the object to be positioned can be obtained according to a PNP algorithm.
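With the matching relationship in hand, the position can be recovered with a standard PnP solver; the sketch below uses OpenCV's solvePnPRansac and assumes a calibrated intrinsic matrix K, neither of which is specified by the disclosure.

    # Illustrative sketch, not the patented implementation.
    import cv2
    import numpy as np

    def locate_with_pnp(matches, query_pixels, standard_points_3d, K):
        pairs = sorted(matches.items())               # needs at least 4 pairs
        obj = np.float32([standard_points_3d[j] for _, j in pairs])   # 3D position points
        img = np.float32([query_pixels[i] for i, _ in pairs])         # matching pixel points
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, None)
        if not ok:
            return None
        R, _ = cv2.Rodrigues(rvec)
        return (-R.T @ tvec).ravel()   # camera position in the indoor map frame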
According to the technical scheme of the embodiment of the application, various types of visual positioning features are extracted from a target retrieval image acquired by an object to be positioned, the extracted comparison visual positioning features are matched with various standard visual positioning features of an indoor environment to generate a one-dimensional coarse matching feature vector, the matching relation between the comparison visual positioning features and the standard visual positioning features is obtained according to the one-dimensional coarse matching feature vector and a heterogeneous probability matrix of the indoor environment, and then the object to be positioned is positioned indoors. The problem that in the prior art, a visual positioning method based on feature points is seriously dependent on scenes and has a poor positioning effect in scenes with obvious illumination or structural changes is solved, and the accuracy of indoor positioning is improved.
Fig. 4b is a flowchart of indoor positioning provided in a specific application scenario of the present application. As shown in fig. 4b, after a picture taken by a user is obtained, feature point features, line features, plane features and semantic features are extracted from the picture and matched with the standard visual positioning features of the indoor environment to obtain a coarse match. A one-dimensional coarse matching feature vector is formed from the coarse matching result and is repeatedly multiplied by the heterogeneous probability matrix until the result converges, yielding a one-dimensional fine matching feature vector whose probability values represent the probability that each standard visual positioning feature appears in the picture. The matched standard visual positioning features are then taken out to form a matching relationship with the features in the picture, and a positioning result can be obtained by using a PNP algorithm or a similar positioning method.
Fig. 5 is a schematic structural diagram of a positioning fusion apparatus in an embodiment of the present application, where the apparatus may be implemented by software and/or hardware, and is generally integrated in an electronic device, and may be typically integrated in a server, a desktop, a mobile phone, a tablet, and the like. The device includes: an environment capture image acquisition module 510, a standard visual positioning feature acquisition module 520, and a heterogeneous probability matrix acquisition module 530. Wherein:
an environment acquisition image acquisition module 510, configured to acquire a plurality of environment acquisition images matched with an indoor environment;
a standard visual positioning feature obtaining module 520, configured to extract multiple types of visual positioning features of a three-dimensional object in each environment-collected image, so as to obtain multiple standard visual positioning features matched with an indoor environment;
wherein each standard visual positioning feature is attached to a three-dimensional object and is associated with a three-dimensional location point in the indoor environment;
a heterogeneous probability matrix obtaining module 530, configured to obtain a heterogeneous probability matrix matching the indoor environment through fusion according to a position relationship between each two standard visual positioning features;
and each probability value in the heterogeneous probability matrix represents the direct connection relation probability between every two standard visual positioning features.
According to the technical scheme, the environment acquisition images of the indoor environment are acquired, various visual positioning characteristics of the three-dimensional object contained in each environment acquisition image are extracted to serve as standard visual positioning characteristics, and the heterogeneous probability matrix is generated according to the position relation between every two standard visual positioning characteristics. The method solves the problems that the visual positioning method based on the feature points in the prior art is seriously dependent on scenes and has poor positioning effect under scenes with obvious illumination or structural change, and realizes the fusion of various visual positioning features for positioning, thereby improving the positioning accuracy.
On the basis of the above embodiment, the types of the visual positioning features include: feature point features in a three-dimensional object, line features in a three-dimensional object, plane features in a three-dimensional object, and semantic features in a three-dimensional object.
On the basis of the foregoing embodiment, the heterogeneous probability matrix obtaining module 530 includes:
the common-view relation matrix calculating unit is used for calculating a common-view relation matrix matched with each standard visual positioning feature according to each standard visual positioning feature included in the environment acquisition image;
each element in the common-view relationship matrix represents the position relationship of every two visual positioning features in the same environment acquisition image;
the three-dimensional spatial position relation matrix calculation unit is used for calculating a three-dimensional spatial position relation matrix matched with each standard visual positioning feature according to the three-dimensional position points respectively corresponding to each standard visual positioning feature;
each element in the three-dimensional space position relation matrix represents the position relation of every two standard visual positioning features in the three-dimensional space;
and the heterogeneous probability matrix calculation unit is used for calculating to obtain a heterogeneous probability matrix matched with the indoor environment according to the common-view relation matrix and the three-dimensional spatial position relation matrix.
The positioning fusion device provided by the embodiment of the application can execute the positioning fusion method provided by any embodiment of the application, and has corresponding functional modules and beneficial effects of the execution method.
Fig. 6 is a schematic structural diagram of an indoor positioning apparatus in an embodiment of the present application, where the apparatus may be implemented by software and/or hardware, and is generally integrated in an electronic device, and may be integrated in a server, a desktop, a mobile phone, a tablet, and the like. The device includes: a visual positioning feature extraction module 610, a one-dimensional coarse matching feature vector acquisition module 620, a matching relationship determination module 630, and an object-to-be-positioned positioning module 640, wherein:
the visual positioning feature extraction module 610 is configured to acquire a target retrieval image acquired by an object to be positioned, and extract multiple types of visual positioning features from the target retrieval image;
a one-dimensional coarse matching feature vector obtaining module 620, configured to match each comparison visual positioning feature extracted from the target retrieval image with each standard visual positioning feature included in an indoor environment, so as to generate a one-dimensional coarse matching feature vector;
the one-dimensional coarse matching feature vector represents the occurrence probability of each standard visual positioning feature in the indoor environment in the target retrieval image;
a matching relationship determining module 630, configured to determine, according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix matching the indoor environment, a matching relationship between each comparison visual positioning feature in the target retrieval image and a standard visual positioning feature in the indoor environment;
and the object to be positioned positioning module 640 is used for performing indoor positioning on the object to be positioned according to the matching relationship.
According to the technical scheme of the embodiment of the application, various types of visual positioning features are extracted from a target retrieval image acquired by an object to be positioned, the extracted comparison visual positioning features are matched with various standard visual positioning features of an indoor environment to generate a one-dimensional coarse matching feature vector, the matching relation between the comparison visual positioning features and the standard visual positioning features is obtained according to the one-dimensional coarse matching feature vector and a heterogeneous probability matrix of the indoor environment, and then the object to be positioned is positioned indoors. The problem that in the prior art, a visual positioning method based on feature points is seriously dependent on scenes and has a poor positioning effect in scenes with obvious illumination or structural changes is solved, and the accuracy of indoor positioning is improved.
On the basis of the foregoing embodiment, the one-dimensional coarse matching feature vector obtaining module 620 includes:
a single feature matching vector obtaining unit, configured to match each comparison visual positioning feature with each standard visual positioning feature, respectively, to obtain a single feature matching vector corresponding to each comparison visual positioning feature, respectively;
wherein each element in the single feature matching vector represents a matching probability of a comparison visual positioning feature and each standard visual positioning feature;
and the one-dimensional coarse matching feature vector generating unit is used for generating one-dimensional coarse matching feature vectors according to the single feature matching vectors.
On the basis of the above embodiment, the matching relationship determining module 630 includes:
the one-dimensional fine matching feature vector calculating unit is used for calculating to obtain a one-dimensional fine matching feature vector according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix;
and the matching relation determining unit is used for determining the matching relation between each comparison visual positioning feature in the target retrieval image and the standard visual positioning feature in the indoor environment according to the one-dimensional fine matching feature vector.
On the basis of the above embodiment, the one-dimensional fine matching feature vector calculation unit is configured to:
multiplying the one-dimensional coarse matching feature vector by the heterogeneous probability matrix for multiple times until the calculation result is converged, and determining the converged calculation result as the one-dimensional fine matching feature vector.
On the basis of the above embodiment, the matching relationship determining unit is configured to:
selecting a target standard visual positioning feature matched with the quantity value from all standard visual positioning features according to the quantity value of the comparison visual positioning feature included in the target retrieval image and the element value of each element in the one-dimensional fine matching feature vector;
and respectively matching each comparison visual positioning feature with each target standard visual positioning feature to establish a matching relation between each comparison visual positioning feature and each target standard visual positioning feature.
On the basis of the above embodiment, the module 640 for positioning an object to be positioned includes:
and the indoor positioning unit is used for performing indoor positioning on the object to be positioned by adopting a camera attitude estimation algorithm according to the matching relation.
The indoor positioning device provided by the embodiment of the application can execute the indoor positioning method provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 7, the electronic apparatus includes: one or more processors 71, memory 72, and interfaces for connecting the various components, including a high speed interface and a low speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). Fig. 7 illustrates an example of a processor 71.
Memory 72 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform a position fusion method provided herein or to perform an indoor positioning method provided herein. A non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform a location fusion method provided by the present application or perform an indoor location method provided by the present application.
The memory 72, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the positioning fusion method or the indoor positioning method in the embodiments of the present application (for example, the environment acquisition image acquisition module 510, the standard visual positioning feature acquisition module 520, and the heterogeneous probability matrix acquisition module 530 shown in fig. 5, or the visual positioning feature extraction module 610, the one-dimensional coarse matching feature vector acquisition module 620, the matching relationship determination module 630, and the object to be positioned positioning module 640 shown in fig. 6). The processor 71 executes various functional applications and data processing of the server by running the non-transitory software programs, instructions and modules stored in the memory 72, thereby implementing the positioning fusion method or the indoor positioning method in the above method embodiments.
The memory 72 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 72 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 72 may optionally include memory located remotely from the processor 71, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input device 73 and an output device 74. The processor 71, the memory 72, the input device 73 and the output device 74 may be connected by a bus or other means, as exemplified by the bus connection in fig. 7.
The input device 73 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic apparatus, such as a touch screen, keypad, mouse, track pad, touch pad, pointer stick, one or more mouse buttons, track ball, joystick, or other input device. The output devices 74 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, Application Specific Integrated Circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and overcomes the defects of difficult management and weak service scalability found in traditional physical hosts and VPS services.
It should be understood that the various forms of flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present application is not limited herein, as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (20)

1. A method of localization fusion, comprising:
acquiring a plurality of environment acquisition images matched with an indoor environment;
extracting various types of visual positioning features of the three-dimensional object in each environment acquisition image to obtain a plurality of standard visual positioning features matched with the indoor environment;
wherein each standard visual positioning feature is attached to a three-dimensional object and is associated with a three-dimensional position point in the indoor environment;
fusing to obtain a heterogeneous probability matrix matched with the indoor environment according to the position relation between every two standard visual positioning features;
and each probability value in the heterogeneous probability matrix represents the direct connection relation probability between every two standard visual positioning features.
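As a purely illustrative aid to claim 1 (not part of the claimed method), the following minimal Python sketch shows one way a standard visual positioning feature, attached to a three-dimensional object and associated with a three-dimensional position point, might be represented in memory; all field names are hypothetical.

```python
# Illustrative sketch only; field names are hypothetical, not taken from the application.
from dataclasses import dataclass
from typing import Tuple
import numpy as np

@dataclass
class StandardVisualFeature:
    feature_id: int                            # index of the feature in the indoor map
    feature_type: str                          # "point", "line", "plane" or "semantic"
    object_id: int                             # three-dimensional object the feature is attached to
    position_3d: Tuple[float, float, float]    # associated three-dimensional position point
    descriptor: np.ndarray                     # appearance descriptor used for matching
```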
2. The method of claim 1, wherein the type of visual positioning feature comprises: feature point features in a three-dimensional object, line features in a three-dimensional object, face features in a three-dimensional object, and semantic features in a three-dimensional object.
3. The method according to claim 1, wherein the fusing to obtain a heterogeneous probability matrix matched with the indoor environment according to the position relation between every two standard visual positioning features comprises:
calculating a common-view relation matrix matched with each standard visual positioning feature according to each standard visual positioning feature included in the environment acquisition image;
each element in the common-view relationship matrix represents the position relationship of every two visual positioning features in the same environment acquisition image;
calculating a three-dimensional spatial position relation matrix matched with each standard visual positioning feature according to the three-dimensional position points respectively corresponding to each standard visual positioning feature;
each element in the three-dimensional space position relation matrix represents the position relation of every two standard visual positioning features in the three-dimensional space;
and calculating to obtain a heterogeneous probability matrix matched with the indoor environment according to the common-view relation matrix and the three-dimensional spatial position relation matrix.
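The following sketch illustrates the fusion step of claim 3 under stated assumptions: the application does not disclose the exact fusion formula, so the weighted element-wise combination of the common-view relation matrix and the three-dimensional spatial position relation matrix, and the row normalization that turns each row into a probability distribution, are hypothetical choices.

```python
import numpy as np

def fuse_heterogeneous_probability_matrix(coview: np.ndarray,
                                          spatial: np.ndarray,
                                          alpha: float = 0.5) -> np.ndarray:
    """Sketch of claim 3. coview[i, j] encodes how often features i and j appear
    in the same environment acquisition image; spatial[i, j] encodes how close
    their three-dimensional position points are. The weighted combination and
    the row normalization are assumptions, not the claimed formula."""
    combined = alpha * coview + (1.0 - alpha) * spatial
    row_sums = combined.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0.0] = 1.0      # avoid division by zero for isolated features
    return combined / row_sums           # each row becomes a probability distribution
```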
4. An indoor positioning method, comprising:
acquiring a target retrieval image acquired by an object to be positioned, and extracting various types of visual positioning features from the target retrieval image;
matching each comparison visual positioning feature extracted from the target retrieval image with each standard visual positioning feature in an indoor environment to generate a one-dimensional coarse matching feature vector;
the one-dimensional coarse matching feature vector represents the occurrence probability of each standard visual positioning feature in the indoor environment in the target retrieval image;
determining a matching relation between each comparison visual positioning feature in the target retrieval image and a standard visual positioning feature in the indoor environment according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix matched with the indoor environment;
and according to the matching relation, indoor positioning is carried out on the object to be positioned.
5. The method of claim 4, wherein the matching each comparison visual positioning feature extracted from the target retrieval image with each standard visual positioning feature in the indoor environment to generate a one-dimensional coarse matching feature vector comprises:
matching each comparison visual positioning feature with each standard visual positioning feature respectively to obtain a single feature matching vector corresponding to each comparison visual positioning feature respectively;
wherein each element in the single feature matching vector represents a matching probability of a comparison visual positioning feature and each standard visual positioning feature;
and generating a one-dimensional coarse matching feature vector according to each single feature matching vector.
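A minimal sketch of claim 5, assuming descriptor-based matching: each comparison visual positioning feature is turned into a single feature matching vector by a softmax over negative descriptor distances, and the single feature matching vectors are summed into the one-dimensional coarse matching feature vector; both choices are assumptions rather than the claimed computation.

```python
import numpy as np

def coarse_matching_vector(comparison_descs: np.ndarray,
                           standard_descs: np.ndarray) -> np.ndarray:
    """Sketch of claim 5. Returns one occurrence probability per standard feature."""
    single_vectors = []
    for desc in comparison_descs:                          # one comparison visual positioning feature
        dists = np.linalg.norm(standard_descs - desc, axis=1)
        probs = np.exp(-dists) / np.exp(-dists).sum()      # single feature matching vector
        single_vectors.append(probs)
    coarse = np.sum(single_vectors, axis=0)                # aggregate over comparison features
    return coarse / coarse.sum()                           # one-dimensional coarse matching feature vector
```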
6. The method of claim 4, wherein the determining a matching relationship between each comparison visual positioning feature in the target retrieval image and a standard visual positioning feature in the indoor environment according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix matched with the indoor environment comprises:
calculating to obtain a one-dimensional fine matching feature vector according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix;
and determining the matching relation between each comparison visual positioning feature in the target retrieval image and the standard visual positioning feature in the indoor environment according to the one-dimensional fine matching feature vector.
7. The method of claim 6, wherein the calculating a one-dimensional fine matching feature vector according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix comprises:
multiplying the one-dimensional coarse matching feature vector by the heterogeneous probability matrix for multiple times until the calculation result is converged, and determining the converged calculation result as the one-dimensional fine matching feature vector.
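A sketch of the iteration in claim 7: the one-dimensional coarse matching feature vector is repeatedly multiplied by the heterogeneous probability matrix until the result converges. The normalization after each multiplication and the convergence threshold are assumptions added to make the example run.

```python
import numpy as np

def fine_matching_vector(coarse: np.ndarray,
                         hetero_matrix: np.ndarray,
                         tol: float = 1e-6,
                         max_iter: int = 100) -> np.ndarray:
    """Sketch of claim 7: iterate vector-matrix multiplication to convergence."""
    v = coarse.copy()
    for _ in range(max_iter):
        v_next = v @ hetero_matrix
        v_next /= v_next.sum()                 # keep the vector on a probability scale (assumption)
        if np.linalg.norm(v_next - v) < tol:   # converged
            return v_next
        v = v_next
    return v                                   # one-dimensional fine matching feature vector
```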
8. The method of claim 6, wherein the determining a matching relationship between each comparison visual positioning feature in the target retrieval image and a standard visual positioning feature in the indoor environment according to the one-dimensional fine matching feature vector comprises:
selecting a target standard visual positioning feature matched with the quantity value from all standard visual positioning features according to the quantity value of the comparison visual positioning feature included in the target retrieval image and the element value of each element in the one-dimensional fine matching feature vector;
and respectively matching each comparison visual positioning feature with each target standard visual positioning feature to establish a matching relation between each comparison visual positioning feature and each target standard visual positioning feature.
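The selection and matching steps of claim 8 could look roughly as follows; taking the k largest element values of the one-dimensional fine matching feature vector and assigning each comparison feature to its nearest selected descriptor are illustrative choices, not the claimed procedure.

```python
import numpy as np

def select_and_match(fine_vector: np.ndarray,
                     comparison_descs: np.ndarray,
                     standard_descs: np.ndarray) -> dict:
    """Sketch of claim 8. The number of selected target standard features equals
    the quantity value of comparison features in the target retrieval image."""
    k = len(comparison_descs)                              # quantity value of comparison features
    target_idx = np.argsort(fine_vector)[-k:]              # k largest element values
    matches = {}
    for i, desc in enumerate(comparison_descs):
        dists = np.linalg.norm(standard_descs[target_idx] - desc, axis=1)
        matches[i] = int(target_idx[np.argmin(dists)])     # comparison feature -> target standard feature
    return matches
```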
9. The method of claim 4, wherein said indoor positioning of the object to be positioned according to the matching relationship comprises:
and according to the matching relation, performing indoor positioning on the object to be positioned by adopting a camera attitude estimation algorithm.
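Claim 9 only requires a camera attitude (pose) estimation algorithm; a common choice in practice is a RANSAC-based PnP solver, sketched below with OpenCV as one possible implementation rather than the method prescribed by the application.

```python
import numpy as np
import cv2

def locate_object(matches_3d: np.ndarray,    # N x 3 points of matched standard features
                  matches_2d: np.ndarray,    # N x 2 pixel coordinates of comparison features
                  camera_matrix: np.ndarray):
    """Sketch of claim 9: recover the camera pose of the object to be positioned
    from the established 2D-3D matching relationship."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        matches_3d.astype(np.float64),
        matches_2d.astype(np.float64),
        camera_matrix.astype(np.float64),
        distCoeffs=None)
    if not ok:
        raise RuntimeError("pose estimation failed")
    R, _ = cv2.Rodrigues(rvec)                 # rotation vector -> rotation matrix
    position = -R.T @ tvec                     # camera center in the indoor map frame
    return R, tvec, position
```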
10. A localization fusion device, comprising:
the environment acquisition image acquisition module is used for acquiring a plurality of environment acquisition images matched with the indoor environment;
the standard visual positioning feature acquisition module is used for extracting various types of visual positioning features of the three-dimensional object in each environment acquisition image to obtain a plurality of standard visual positioning features matched with the indoor environment;
wherein each standard visual positioning feature is attached to a three-dimensional object and is associated with a three-dimensional position point in the indoor environment;
the heterogeneous probability matrix acquisition module is used for fusing a heterogeneous probability matrix matched with an indoor environment according to the position relation between every two standard visual positioning features;
and each probability value in the heterogeneous probability matrix represents the direct connection relation probability between every two standard visual positioning features.
11. The apparatus of claim 10, wherein the types of visual positioning features comprise: feature point features in a three-dimensional object, line features in a three-dimensional object, face features in a three-dimensional object, and semantic features in a three-dimensional object.
12. The apparatus of claim 10, wherein the heterogeneous probability matrix obtaining module comprises:
the common-view relation matrix calculating unit is used for calculating a common-view relation matrix matched with each standard visual positioning feature according to each standard visual positioning feature included in the environment acquisition image;
each element in the common-view relationship matrix represents the position relationship of every two visual positioning features in the same environment acquisition image;
the three-dimensional spatial position relation matrix calculation unit is used for calculating a three-dimensional spatial position relation matrix matched with each standard visual positioning feature according to the three-dimensional position points respectively corresponding to each standard visual positioning feature;
each element in the three-dimensional space position relation matrix represents the position relation of every two standard visual positioning features in the three-dimensional space;
and the heterogeneous probability matrix calculation unit is used for calculating to obtain a heterogeneous probability matrix matched with the indoor environment according to the common-view relation matrix and the three-dimensional spatial position relation matrix.
13. An indoor positioning device comprising:
the visual positioning feature extraction module is used for acquiring a target retrieval image acquired by an object to be positioned and extracting various types of visual positioning features from the target retrieval image;
the one-dimensional coarse matching feature vector acquisition module is used for matching each comparison visual positioning feature extracted from the target retrieval image with each standard visual positioning feature in an indoor environment to generate a one-dimensional coarse matching feature vector;
the one-dimensional coarse matching feature vector represents the occurrence probability of each standard visual positioning feature in the indoor environment in the target retrieval image;
the matching relation determining module is used for determining the matching relation between each comparison visual positioning feature in the target retrieval image and the standard visual positioning feature in the indoor environment according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix matched with the indoor environment;
and the object to be positioned positioning module is used for carrying out indoor positioning on the object to be positioned according to the matching relation.
14. The apparatus of claim 13, wherein the one-dimensional coarse matching feature vector obtaining module comprises:
a single feature matching vector obtaining unit, configured to match each comparison visual positioning feature with each standard visual positioning feature, respectively, to obtain a single feature matching vector corresponding to each comparison visual positioning feature, respectively;
wherein each element in the single feature matching vector represents a matching probability of a comparison visual positioning feature and each standard visual positioning feature;
and the one-dimensional coarse matching feature vector generating unit is used for generating one-dimensional coarse matching feature vectors according to the single feature matching vectors.
15. The apparatus of claim 13, wherein the match relationship determination module comprises:
the one-dimensional fine matching feature vector calculating unit is used for calculating to obtain a one-dimensional fine matching feature vector according to the one-dimensional coarse matching feature vector and the heterogeneous probability matrix;
and the matching relation determining unit is used for determining the matching relation between each comparison visual positioning feature in the target retrieval image and the standard visual positioning feature in the indoor environment according to the one-dimensional fine matching feature vector.
16. The apparatus of claim 15, wherein the one-dimensional fine matching feature vector calculation unit is to:
multiplying the one-dimensional coarse matching feature vector by the heterogeneous probability matrix for multiple times until the calculation result is converged, and determining the converged calculation result as the one-dimensional fine matching feature vector.
17. The apparatus of claim 15, wherein the match relationship determination unit is configured to:
selecting a target standard visual positioning feature matched with the quantity value from all standard visual positioning features according to the quantity value of the comparison visual positioning feature included in the target retrieval image and the element value of each element in the one-dimensional fine matching feature vector;
and respectively matching each comparison visual positioning feature with each target standard visual positioning feature to establish a matching relation between each comparison visual positioning feature and each target standard visual positioning feature.
18. The apparatus of claim 13, wherein the object to be located positioning module comprises:
and the indoor positioning unit is used for performing indoor positioning on the object to be positioned by adopting a camera attitude estimation algorithm according to the matching relation.
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the content of the first and second substances,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the localization fusion method of any one of claims 1-3 or to perform the indoor localization method of any one of claims 4-9.
20. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the location fusion method of any one of claims 1-3 or perform the indoor location method of any one of claims 4-9.
CN202011240872.7A 2020-11-09 2020-11-09 Positioning fusion and indoor positioning method, device, equipment and medium Active CN112381877B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011240872.7A CN112381877B (en) 2020-11-09 2020-11-09 Positioning fusion and indoor positioning method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011240872.7A CN112381877B (en) 2020-11-09 2020-11-09 Positioning fusion and indoor positioning method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN112381877A true CN112381877A (en) 2021-02-19
CN112381877B CN112381877B (en) 2023-09-01

Family

ID=74578055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011240872.7A Active CN112381877B (en) 2020-11-09 2020-11-09 Positioning fusion and indoor positioning method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112381877B (en)



Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019157925A1 (en) * 2018-02-13 2019-08-22 视辰信息科技(上海)有限公司 Visual-inertial odometry implementation method and system
CN109029450A (en) * 2018-06-26 2018-12-18 重庆市勘测院 A kind of indoor orientation method
CN110675314A (en) * 2019-04-12 2020-01-10 北京城市网邻信息技术有限公司 Image processing method, image processing apparatus, three-dimensional object modeling method, three-dimensional object modeling apparatus, image processing apparatus, and medium
CN110807154A (en) * 2019-11-08 2020-02-18 内蒙古工业大学 Recommendation method and system based on hybrid deep learning model
CN111862205A (en) * 2019-12-18 2020-10-30 北京嘀嘀无限科技发展有限公司 Visual positioning method, device, equipment and storage medium
CN111292197A (en) * 2020-01-17 2020-06-16 福州大学 Community discovery method based on convolutional neural network and self-encoder
CN111339344A (en) * 2020-02-25 2020-06-26 北京百度网讯科技有限公司 Indoor image retrieval method and device and electronic equipment
CN111462029A (en) * 2020-03-27 2020-07-28 北京百度网讯科技有限公司 Visual point cloud and high-precision map fusion method and device and electronic equipment
CN111862199A (en) * 2020-06-17 2020-10-30 北京百度网讯科技有限公司 Positioning method, positioning device, electronic equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"基于局部单应性矩阵的图像拼接与定位算法研究", 《导航定位与授时》, pages 62 - 69 *
胡永利;朴星霖;孙艳丰;尹宝才;: "多源异构感知数据融合方法及其在目标定位跟踪中的应用", 中国科学:信息科学, no. 10, pages 110 - 128 *
赵矿军;: "基于RGB-D摄像机的室内三维彩色点云地图构建", 哈尔滨商业大学学报(自然科学版), no. 01, pages 70 - 78 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114742995A (en) * 2022-05-13 2022-07-12 盈嘉互联(北京)科技有限公司 Indoor positioning method based on digital twin building and heterogeneous feature fusion
CN114742995B (en) * 2022-05-13 2022-09-20 盈嘉互联(北京)科技有限公司 Indoor positioning method based on digital twin building and heterogeneous feature fusion

Also Published As

Publication number Publication date
CN112381877B (en) 2023-09-01

Similar Documents

Publication Publication Date Title
CN111259751B (en) Human behavior recognition method, device, equipment and storage medium based on video
CN111461203A (en) Cross-modal processing method and device, electronic equipment and computer storage medium
CN111709973B (en) Target tracking method, device, equipment and storage medium
CN111753961B (en) Model training method and device, prediction method and device
CN111723768B (en) Method, device, equipment and storage medium for vehicle re-identification
CN110659600B (en) Object detection method, device and equipment
CN111462029B (en) Visual point cloud and high-precision map fusion method and device and electronic equipment
CN111462174B (en) Multi-target tracking method and device and electronic equipment
CN112529073A (en) Model training method, attitude estimation method and apparatus, and electronic device
CN112668586B (en) Model training method, picture processing device, storage medium, and program product
CN112528786A (en) Vehicle tracking method and device and electronic equipment
US11423650B2 (en) Visual positioning method and apparatus, and computer-readable storage medium
US11557120B2 (en) Video event recognition method, electronic device and storage medium
CN111275827B (en) Edge-based augmented reality three-dimensional tracking registration method and device and electronic equipment
CN111462179B (en) Three-dimensional object tracking method and device and electronic equipment
CN111966856A (en) Picture data processing method and device, electronic equipment and storage medium
CN111652103B (en) Indoor positioning method, device, equipment and storage medium
CN112241716A (en) Training sample generation method and device
CN112017304B (en) Method, apparatus, electronic device and medium for presenting augmented reality data
CN111967481B (en) Visual positioning method, visual positioning device, electronic equipment and storage medium
CN112381877B (en) Positioning fusion and indoor positioning method, device, equipment and medium
CN111708477B (en) Key identification method, device, equipment and storage medium
CN112488126A (en) Feature map processing method, device, equipment and storage medium
CN112183484B (en) Image processing method, device, equipment and storage medium
CN111967299B (en) Unmanned aerial vehicle inspection method, unmanned aerial vehicle inspection device, unmanned aerial vehicle inspection equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant