CN114449651A

CN114449651A - Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning

Info

Publication number: CN114449651A
Application number: CN202210064098.1A
Authority: CN
Inventors: 王邦; 谭飞; 周婵欣; 莫益军
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2022-01-20
Filing date: 2022-01-20
Publication date: 2022-05-06
Anticipated expiration: 2042-01-20
Also published as: CN114449651B

Abstract

The invention discloses a method for generating a reliable crowdsourcing sample set, constructing a multi-resolution database, and positioning, and belongs to the technical field of communication and wireless networks. By constructing a multi-resolution positioning database, online positioning proceeds in sequence from low resolution to high resolution, which effectively reduces the computational complexity of searching a large database. A reliability measurement algorithm identifies reliable crowdsourced samples, addressing the problem of inaccurate position labels. A signal surface is fitted to the reliable samples and resampled to generate new fitted samples, addressing the uneven spatial distribution of crowdsourced samples. Auxiliary features based on the Euclidean distances between partitions increase the discriminability of the fingerprints and the robustness of the positioning system. A reasonable combination of classification and regression models improves positioning accuracy at different resolutions while reducing computational complexity.

Description

Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning

Technical Field

The invention belongs to the technical field of communication and wireless networks, and particularly relates to a method for generating a reliable crowdsourcing sample set, a method for constructing a multi-resolution database, and a positioning method.

Background

Positioning technology is of interest to many civilian and military applications, such as search and rescue, the Internet of Things, and robot navigation. Although the Global Positioning System (GPS) is mature, its accuracy indoors is insufficient, so it is not the best choice for indoor environments. In recent years, many indoor positioning technologies have been developed, such as visible light communication, Bluetooth, and ultra-wideband. Each has advantages and disadvantages, but Wi-Fi fingerprinting is the most widely used because of its broad applicability. However, as indoor environments grow in size and complexity, Wi-Fi fingerprinting faces two major challenges.

First, as the complexity of the indoor environment increases, the cost of the site survey required to build the offline Wi-Fi database rises sharply. Crowdsourcing, one of the most effective solutions, relies on unspecified members of the public, recruited over the Internet, to collect samples, thereby relieving the enormous burden of site surveying. However, crowdsourced samples are collected by non-professionals, and their position labels often contain errors. In addition, because crowdsourced collection is random, the sample density may be uneven, and some local areas may have no samples at all. These are the problems that a crowdsourcing-based approach must solve.

Second, as the indoor environment scales up, searching for a precise location requires more processing time and computing resources. Mainstream indoor positioning research has focused on accurate coordinate-level position estimation, that is, grid position estimates with sub-meter accuracy. However, some location-aware services do not require such precision, for example tracking medical devices in a hospital, finding a parking space at an airport, or locating a zone for short-term fire control. In such cases it is more meaningful to identify the area to which the user belongs, such as a building or a room, than to provide an exact position in coordinates, so location-aware services require different levels of positioning resolution. However, little indoor positioning research addresses this need.

Disclosure of Invention

In view of the defects of the prior art and the need to improve it, the invention provides a method for generating a reliable crowdsourcing sample set, constructing a multi-resolution database, and positioning. By setting a plurality of resolution levels, the invention aims to simplify the database architecture in large-scale, complex indoor positioning environments, meet the requirements of different location-aware services, and increase positioning speed in the online stage. It also mitigates, to a certain extent, the problems of incorrect position labels and uneven sampling in crowdsourced samples.

To achieve the above object, according to a first aspect of the present invention, there is provided a method for generating a Wi-Fi reliable crowdsourcing sample set, the method comprising:

S1, splitting a target area from top to bottom in a multi-level manner until it is split into grids, obtaining units at different levels (the target area, its sub-areas, their sub-areas, …, grids), wherein the grids correspond to the highest resolution;

S2, assigning the original crowdsourced samples based on the multi-level splitting result of the target area;

S3, measuring the reliability of the crowdsourced samples in each grid, and retaining the credible crowdsourced samples whose reliability is larger than a threshold value;

S4, performing surface fitting on all the retained credible crowdsourced samples;

S5, uniformly sampling the fitted surface function to obtain a Wi-Fi reliable crowdsourcing sample set.

Preferably, the reliability measure takes the form of a contour (silhouette) coefficient $h(s_n^k)$, calculated as follows:

$$h(s_n^k) = \frac{b(s_n^k) - a(s_n^k)}{\max\{a(s_n^k),\ b(s_n^k)\}}$$

$$a(s_n^k) = \frac{1}{|S_k| - 1} \sum_{s_j^k \in S_k,\ j \neq n} d(s_n^k, s_j^k)$$

$$b(s_n^k) = \min_{l \neq k} \frac{1}{|S_l|} \sum_{s_j^l \in S_l} d(s_n^k, s_j^l)$$

wherein $s_n^k$ represents the sample in grid $g_k$ whose RSS vector was collected at coordinates $(x_n, y_n)$; $a(s_n^k)$ represents the average signal distance between sample $s_n^k$ and the other crowdsourced samples in the grid to which it belongs; $b(s_n^k)$ represents the minimum, over all other grids, of the average signal distance between sample $s_n^k$ and the crowdsourced samples in that grid; $S_k$ represents the set of all samples in the grid to which sample $s_n^k$ belongs; and $d(s_n^k, s_j^k)$ represents the signal distance between sample $s_n^k$ and sample $s_j^k$.

Advantageous effects: To address the mislabeling of existing crowdsourced samples, the method uses the contour coefficient to evaluate how well each crowdsourced sample fits the area it is labeled with. The contour coefficient combines the two factors of cohesion and separation, giving a reasonable and effective reliability measure. Because low-compatibility (that is, low-reliability) samples are removed, the quality of the database is ensured and the accuracy of subsequent positioning is improved.

Preferably, the surface fitting is performed as follows:

Let the contour value set of the original crowdsourced samples be $S_h = \{h_1, h_2, \ldots, h_N\}$ and the contour value set of the credible crowdsourced samples be $S_h' = \{h_1, h_2, \ldots, h_\gamma\}$, where $N$ denotes the number of original crowdsourced samples, $\gamma$ denotes the number of credible crowdsourced samples, and $h_n$ is the contour value of sample $s_n^k$.

Using the contour value $h_n$, the weight coefficient $w_n$, $n = 1, \ldots, \gamma$, of each sample $s_n^k$ among the credible crowdsourced samples is computed:

$$w_n = \rho + (1 - \rho)\,\frac{h_n - h_{\min}}{h_{\max} - h_{\min}}, \qquad \rho = \frac{\gamma}{N}$$

where $\rho$ represents the ratio of reliable crowdsourced samples, and $h_{\min}$ and $h_{\max}$ are the minimum and maximum values in $S_h'$.

The weight coefficient $w_n$ weights the deviation between the value of the fitting function of the $m$-th AP at position $(x_n, y_n)$ and the signal strength $r_{n,m}$ of the $m$-th AP received in sample $s_n^k$. The objective function, which minimizes the sum of squared weighted residuals, is:

$$\min \sum_{n=1}^{\gamma} \left[ w_n \left( \phi_m(x_n, y_n) - r_{n,m} \right) \right]^2$$

wherein $\phi_m(x, y)$ is the fitting surface function used to fit the wireless signal propagation surface from the $m$-th AP in a given partition.

Advantageous effects: To address the inconsistent quality of existing crowdsourced samples, the weight assigned to each sample is calculated from its contour value. On the one hand, a sample's contribution to the surface is proportional to its reliability. On the other hand, confining the weight $w_n$ to the range $[\rho, 1]$ ensures that the difference between samples is not too great. Because the wireless signal surface function is constructed with weights, the differences between samples are taken into account when the surface is fitted: more reliable crowdsourced samples receive larger weights and have more influence on the surface, which enhances the reliability and validity of the fitted surface.
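The weighting idea can be illustrated with a short sketch. It assumes a linear min-max mapping of the contour values of the retained samples onto the range from ρ to 1, which is one simple scaling function with the stated properties and is not necessarily the exact function of the invention; the function name `reliability_weights` and the sample values are hypothetical.

```python
def reliability_weights(contour_values, n_original):
    """Map contour values of the retained samples linearly onto [rho, 1].

    Assumption: linear min-max scaling; rho = (retained count) / (original count).
    """
    gamma = len(contour_values)
    rho = gamma / n_original              # ratio of reliable samples
    h_min, h_max = min(contour_values), max(contour_values)
    if h_max == h_min:                    # all samples equally reliable
        return [1.0] * gamma
    span = h_max - h_min
    return [rho + (1.0 - rho) * (h - h_min) / span for h in contour_values]


# Three samples retained out of six originals, so rho = 0.5.
weights = reliability_weights([0.2, 0.5, 0.8], n_original=6)
```

The least reliable retained sample receives weight ρ and the most reliable receives weight 1, so no retained sample is ignored and none dominates the fit.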

To achieve the above object, according to a second aspect of the present invention, there is provided a method for constructing a multi-resolution offline database, the method comprising:

T1, generating a Wi-Fi reliable crowdsourcing sample set using the method described in the first aspect;

T2, constructing a multi-resolution offline database in a bottom-up manner based on the multi-level splitting result of the target area, wherein each partition fingerprint of the next coarser resolution layer is obtained from the set of partition fingerprints of the current resolution layer.

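Preferably, the resolution levels of the multi-resolution offline database, ordered from high resolution to low resolution, are named $L_1, L_2, \ldots, L_J$, wherein layer $L_j$ has $P$ partitions, each consisting of $K$ sub-partitions of layer $L_{j-1}$. The data for the $p$-th partition of layer $L_j$ is then:

$$
\Lambda^{\langle L_j,p\rangle} =
\begin{bmatrix}
AP_1 & AP_2 & \cdots & AP_M & Label \\
u_1^{\langle L_{j-1},1\rangle} & u_2^{\langle L_{j-1},1\rangle} & \cdots & u_M^{\langle L_{j-1},1\rangle} & \langle L_j,p\rangle \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
u_1^{\langle L_{j-1},K\rangle} & u_2^{\langle L_{j-1},K\rangle} & \cdots & u_M^{\langle L_{j-1},K\rangle} & \langle L_j,p\rangle \\
u_1^{\langle L_j,p\rangle} & u_2^{\langle L_j,p\rangle} & \cdots & u_M^{\langle L_j,p\rangle} & \langle L_j,p\rangle
\end{bmatrix}
$$

wherein $AP_1, AP_2, \ldots, AP_M$ represent the 1st, 2nd, …, $M$-th signal emission sources and $Label$ represents the label of the partition. Each data row except the last is the signal vector of a sub-partition, and the last row is the average of all the sub-partition vectors. Each column except the last holds the signal strength of one AP, and the last column holds the partition label of the data. $u_m^{\langle L_{j-1},k\rangle}$ represents the signal strength value of the $m$-th AP received in the $k$-th sub-partition of layer $L_{j-1}$, and $u_m^{\langle L_j,p\rangle}$ is the average, in the $m$-th AP dimension, of the representative vectors of the $K$ sub-partitions of layer $L_{j-1}$; $j = 1, 2, \ldots, J$, where $J$ is the number of resolution layers; $p = 1, 2, \ldots, P$; $m = 1, 2, \ldots, M$, where $M$ is the total number of signal emission sources.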

Advantageous effects: To address the lack of distinctiveness in existing partition data, the fingerprint of an upper-layer partition is formed from the fingerprint set of its lower-layer sub-partitions. The partition fingerprint therefore retains more of the original information of the sub-partitions, which increases the distinctiveness of each partition's data and improves the robustness of the positioning system.
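As a minimal sketch of this aggregation step, the following stacks the signal vectors of the sub-partitions and appends their column-wise mean as the last row, each row carrying the partition label. The function name `build_partition_data`, the label string, and the RSS values are hypothetical, and plain Python lists stand in for whatever storage the database actually uses.

```python
def build_partition_data(sub_vectors, label):
    """Stack K sub-partition vectors and append their column-wise mean row.

    Returns (rows, representative), where every row ends with the partition
    label and `representative` is the mean vector of the sub-partitions.
    """
    k = len(sub_vectors)
    representative = [sum(column) / k for column in zip(*sub_vectors)]
    rows = [list(vector) + [label] for vector in sub_vectors]
    rows.append(representative + [label])
    return rows, representative


# Two sub-partitions, two APs (RSS in dBm, illustrative values only).
rows, rep = build_partition_data([[-40.0, -70.0], [-50.0, -60.0]], label="L2-p1")
```

The returned representative vector is what the next coarser layer would consume as one of its sub-partition rows.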

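Preferably, auxiliary features are added, and the data for the $p$-th partition of layer $L_j$ is:

$$
\begin{bmatrix}
AP_1 & \cdots & AP_M & F_1 & \cdots & F_P & Label \\
u_1^{\langle L_{j-1},1\rangle} & \cdots & u_M^{\langle L_{j-1},1\rangle} & d_{1,1} & \cdots & d_{P,1} & \langle L_j,p\rangle \\
\vdots & & \vdots & \vdots & & \vdots & \vdots \\
u_1^{\langle L_{j-1},K\rangle} & \cdots & u_M^{\langle L_{j-1},K\rangle} & d_{1,K} & \cdots & d_{P,K} & \langle L_j,p\rangle \\
u_1^{\langle L_j,p\rangle} & \cdots & u_M^{\langle L_j,p\rangle} & \lVert u^{\langle L_j,p\rangle} - u^{\langle L_j,1\rangle} \rVert & \cdots & \lVert u^{\langle L_j,p\rangle} - u^{\langle L_j,P\rangle} \rVert & \langle L_j,p\rangle
\end{bmatrix}
$$

wherein $AP_1, AP_2, \ldots, AP_M$ represent the 1st, 2nd, …, $M$-th signal emission sources; $F_1, \ldots, F_P$ represent the 1st, …, $P$-th auxiliary features; and $Label$ represents the label of the partition. Each data row except the last is the fingerprint of a sub-partition, and the last row is the average fingerprint of all the sub-partitions. Each of the first $M$ columns holds the signal strength value of one AP, each of the next $P$ columns holds the value of one auxiliary feature, and the last column holds the partition label of the data. $u_m^{\langle L_j,p\rangle}$ is the average, in the $m$-th AP dimension, of the representative vectors of the sub-partitions of layer $L_{j-1}$. $d_{pq}$ represents the Euclidean distance between the row feature, $u^{\langle L_{j-1},q\rangle}$ or $u^{\langle L_j,p\rangle}$, and the representative vector $u^{\langle L_j,p\rangle}$ of each partition; $j = 1, 2, \ldots, J$, where $J$ is the number of resolution layers; $p = 1, 2, \ldots, P$; $q = 1, \ldots, Q$, where $Q$ is the number of sub-partitions of layer $L_{j-1}$; $m = 1, 2, \ldots, M$, where $M$ is the total number of signal emission sources.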

Advantageous effects: To address the limited feature set of existing fingerprints, the Euclidean distances between the signal strength vectors of different partitions are used as auxiliary features that capture global information. These auxiliary features improve the discriminability of the fingerprint and thereby enhance the robustness of the positioning system.

To achieve the above object, according to a third aspect of the present invention, there is provided a multi-resolution positioning method including:

(1) determining the partition to which the target belongs, layer by layer from the coarse-resolution layer to the fine-resolution layer in a top-down manner, using a classification model at each layer;

(2) if the user specifies a resolution level, then once the target has been classified into a partition at the user-specified level, applying a regression model for the final precise positioning of the target user; otherwise, applying the regression model for the final precise positioning after the target has been classified into a partition at the highest resolution.

Preferably, step (1) comprises the sub-steps of:

(1.1) forming the complete test fingerprint of the target user:

$$(r_1^t, r_2^t, \ldots, r_M^t, d_{1t}, \ldots, d_{pt}, \ldots, d_{Pt})$$

wherein $r_1^t, r_2^t, \ldots, r_M^t$ respectively indicate the signal strengths of the 1st, 2nd, …, $M$-th APs received by the target user, and $d_{1t}, \ldots, d_{pt}, \ldots, d_{Pt}$ respectively represent the vector distance between the original test fingerprint and the $p$-th partition fingerprint;

(1.2) based on the complete test fingerprint of the target user, using the K-nearest-neighbor method to find the closest fingerprint in the offline database, and taking the corresponding partition identifier as the estimated partition of the target user.
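Sub-steps (1.1) and (1.2) can be sketched as follows, assuming the nearest-neighbor search uses K = 1 and that the database rows already contain the auxiliary distance columns. The helper names (`euclidean`, `complete_fingerprint`, `nearest_partition`), the partition labels, and the RSS values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_fingerprint(rss, partition_reps):
    """Append the distance from `rss` to every partition representative vector."""
    return list(rss) + [euclidean(rss, rep) for rep in partition_reps]


def nearest_partition(rss, database, partition_reps):
    """database: list of (complete_fingerprint, label). Returns the 1-NN label."""
    query = complete_fingerprint(rss, partition_reps)
    _, label = min(database, key=lambda row: euclidean(query, row[0]))
    return label


# Two partitions with illustrative representative vectors (dBm).
reps = [[-40.0, -70.0], [-70.0, -40.0]]
database = [(complete_fingerprint(rep, reps), name) for rep, name in zip(reps, ["A", "B"])]
estimate = nearest_partition([-45.0, -68.0], database, reps)
```

A larger K with majority voting would follow the same pattern; K = 1 is chosen here only to keep the sketch short.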

Advantageous effects: To address the waste of resources in existing positioning systems, a resolution identifier is added so that unnecessary subsequent positioning operations can be stopped early. This reduces computational complexity, meets the requirements of different resolutions, and saves computing resources. In addition, when no identifier is given, all resolution layers are traversed automatically, which improves the generality of the system.

To achieve the above object, according to a fourth aspect of the present invention, there is provided a multi-resolution positioning system including: a computer-readable storage medium and a processor;

the computer-readable storage medium is used for storing executable instructions;

the processor is configured to read executable instructions stored in the computer-readable storage medium and execute the multi-resolution positioning method according to the third aspect.

To achieve the above object, according to a fifth aspect of the present invention, there is provided a computer-readable storage medium including a stored computer program; when being executed by a processor, the computer program controls the device on which the computer readable storage medium is positioned to execute the method.

Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:

(1) To address the mislabeling and uneven distribution of existing crowdsourced samples, reliability measurement and uniform sampling of a fitted surface are used to regenerate the crowdsourced samples of each partition. Removing unreliable samples ensures the quality of the database, and uniformly sampling the fitted surface ensures an even sample distribution, which makes the database reliable and stable.

(2) To address the single-resolution structure of existing databases, databases at different resolution levels are constructed in sequence in a bottom-up manner. Because the large database is divided into smaller components at different resolution levels, the clearly layered positioning database speeds up the matching of online fingerprints against offline fingerprints, laying a data foundation for faster positioning of online users.

(3) To address the inability of existing positioning methods to meet users' multi-resolution requirements, the user is positioned from top to bottom. Because the partition fingerprints of each layer are highly representative, classification at each layer is highly accurate, so the requirements of different resolutions are met.

Drawings

FIG. 1 is a flow chart of a multi-resolution fingerprint positioning system based on Wi-Fi crowdsourcing samples according to the present invention.

Fig. 2 is an architecture diagram of multi-resolution positioning according to an embodiment of the present invention.

Fig. 3 is a schematic diagram of a top-down online positioning stage according to an embodiment of the present invention.

Fig. 4 is a schematic diagram of the HUSTDataset data set according to the embodiment of the present invention.

Fig. 5 is a schematic diagram of the UJIIndoorLoc public data set provided in an embodiment of the present invention.

Fig. 6 is a graph comparing the hit rates of the embodiment of the present invention and other machine learning algorithms on the UJIIndoorLoc data set.

FIG. 7 is a graph comparing, on the HUSTDataset data set, the hit rates of the embodiment of the present invention and other machine learning algorithms as a function of resolution level.

FIG. 8 is a comparison of the positioning effect of the embodiment of the present invention and other comparison methods.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

The invention provides a multi-resolution fingerprint positioning system based on Wi-Fi crowdsourced samples. By constructing a multi-resolution positioning database, online positioning proceeds in sequence from low resolution to high resolution, which effectively reduces the computational complexity of searching a large database. A reliability measurement algorithm identifies reliable crowdsourced samples, addressing the problem of inaccurate position labels. A signal surface is fitted to the reliable samples and resampled to generate new fitted samples, addressing the uneven spatial distribution of crowdsourced samples. Auxiliary features based on the Euclidean distances between partitions increase the discriminability of the fingerprints and the robustness of the positioning system. A reasonable combination of classification and regression models improves positioning accuracy at different resolutions while reducing computational complexity.

As shown in FIG. 1, the multi-resolution fingerprint positioning method based on Wi-Fi crowdsourced samples provided by the present invention comprises the following steps:

Step S1, first splitting the target area into a plurality of partitions, then splitting each partition more finely, step by step, until the highest resolution level (the bottom layer) is reached. Finally, the crowdsourced samples are initially assigned based on the partitions of the bottom layer.

First, this embodiment partitions the target area according to the obstacles and walls inherent to the building, giving partitions such as a lecture hall, an office, or a restaurant. Each partition is then divided into four parts in a grid structure, and each partition of the resulting layer is in turn divided into four parts to form the next, higher-resolution layer. The sub-areas continue to be divided in the same manner until the highest resolution is reached. The space is thus divided into layers by resolution, and the layers, ordered from high resolution to low resolution, are named $L_1, L_2, \ldots$, as shown in FIG. 2.

After partitioning at every level is complete, each partition of the bottom layer $L_1$ is used as a basic grid, and the crowdsourced samples are assigned to the basic grids according to their position labels. Let the basic grids be $g_1, g_2, \ldots, g_K$, where $K$ is the number of grids in $L_1$. Let the crowdsourced sample set be $S = \{s_1, s_2, \ldots, s_N\}$, $s_n = (r_1, r_2, \ldots, r_M, x_n, y_n)$, where $N$ is the total number of crowdsourced samples, $M$ is the total number of APs, and $(r_1, r_2, \ldots, r_M)$ is the RSS vector collected by the sample at coordinates $(x_n, y_n)$. After assignment to a particular grid, each crowdsourced sample carries an additional label identifying its grid number: $s_n^k$ denotes a sample assigned to the $k$-th grid, and $S_k$ is the set of crowdsourced samples assigned to the $k$-th grid.
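The assignment of samples to basic grids can be sketched as below. As a simplifying assumption, the bottom layer is taken to be a regular array of square cells with a known origin and column count, whereas the embodiment first partitions along walls and obstacles; the function name `assign_to_grids` and all parameters are hypothetical.

```python
from collections import defaultdict


def assign_to_grids(samples, origin, cell_size, n_cols):
    """Group samples (r_1, ..., r_M, x, y) by the square cell containing (x, y).

    Assumption: the bottom layer is a regular array of square cells, numbered
    row by row starting from `origin`.
    """
    x0, y0 = origin
    grids = defaultdict(list)
    for sample in samples:
        x, y = sample[-2], sample[-1]
        col = int((x - x0) // cell_size)
        row = int((y - y0) // cell_size)
        grids[row * n_cols + col].append(sample)
    return dict(grids)


# Three samples with two APs each; the last two fields are the position label.
samples = [(-40, -70, 0.5, 0.5), (-42, -69, 0.9, 0.2), (-60, -50, 1.5, 0.5)]
grids = assign_to_grids(samples, origin=(0.0, 0.0), cell_size=1.0, n_cols=2)
```

Because the assignment relies on the reported position label, a mislabeled sample lands in the wrong grid, which is exactly what the subsequent reliability measurement is meant to detect.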

Step S2, measuring the reliability of the crowdsourced samples in each partition, and retaining only the credible crowdsourced samples whose reliability is larger than a threshold value.

First, this embodiment calculates the average signal distance between a crowdsourced sample $s_n^k$ and the other crowdsourced samples $s_j^k$ in the grid to which it belongs:

$$a(s_n^k) = \frac{1}{|S_k| - 1} \sum_{s_j^k \in S_k,\ j \neq n} d(s_n^k, s_j^k)$$

wherein $d(\cdot,\cdot)$ is the Euclidean signal distance between two crowdsourced samples, calculated as:

$$d(s_n^k, s_j^k) = \sqrt{\sum_{m=1}^{M} \left( r_{n,m} - r_{j,m} \right)^2}$$

where $r_{n,m}$ denotes the signal strength of the $m$-th AP in sample $s_n^k$. Since the summation does not include the signal distance between the sample and itself, that is, $d(s_n^k, s_n^k)$, the sum is divided by $|S_k| - 1$. $a(s_n^k)$ indicates how well $s_n^k$ fits its grid; a smaller value represents a better fit.

Similarly, this embodiment calculates the average signal distance between the crowdsourced sample and all crowdsourced samples in each of the other grids. Let $s_j^l$ be any crowdsourced sample in another grid $g_l$, $l \neq k$; $b(s_n^k)$ is calculated as:

$$b(s_n^k) = \min_{l \neq k} \frac{1}{|S_l|} \sum_{s_j^l \in S_l} d(s_n^k, s_j^l)$$

$b(s_n^k)$ is a measure of dissimilarity; the grid with the smallest average dissimilarity is called the adjacent grid of $s_n^k$.

Finally, the contour value of the crowdsourced sample is calculated:

$$h(s_n^k) = \frac{b(s_n^k) - a(s_n^k)}{\max\{a(s_n^k),\ b(s_n^k)\}}$$

The contour value can be written in the following form:

$$
h(s_n^k) =
\begin{cases}
1 - a(s_n^k)/b(s_n^k), & a(s_n^k) < b(s_n^k) \\
0, & a(s_n^k) = b(s_n^k) \\
b(s_n^k)/a(s_n^k) - 1, & a(s_n^k) > b(s_n^k)
\end{cases}
$$

From the above equation it follows that $-1 \le h(s_n^k) \le 1$.

If the contour value $h(s_n^k) < 0$, the sample is closer in signal space to another grid than to the grid to which it is labeled as belonging, so the sample is unreliable and this embodiment rejects it. Conversely, if the contour value $h(s_n^k) > 0$, the position label of the crowdsourced sample has a certain reliability, and the sample is retained.
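The reliability measurement of this step can be sketched with the standard silhouette computation: cohesion is the mean signal distance to the other samples of the same grid, separation is the smallest mean signal distance to any other grid. The sketch assumes at least two non-empty grids, treats a grid holding a single sample as having contour value 0, and uses a threshold of 0; these choices, the function names, and the RSS values are illustrative assumptions.

```python
import math


def signal_distance(a, b):
    """Euclidean distance between two RSS vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contour_values(grids):
    """grids: dict grid_id -> list of RSS vectors. Returns {(grid_id, index): h}."""
    values = {}
    for gid, members in grids.items():
        for n, sample in enumerate(members):
            others = [t for j, t in enumerate(members) if j != n]
            if not others:                      # singleton grid (assumed h = 0)
                values[(gid, n)] = 0.0
                continue
            a = sum(signal_distance(sample, t) for t in others) / len(others)
            b = min(
                sum(signal_distance(sample, t) for t in other) / len(other)
                for other_id, other in grids.items()
                if other_id != gid and other
            )
            denom = max(a, b)
            values[(gid, n)] = 0.0 if denom == 0 else (b - a) / denom
    return values


def keep_reliable(grids, threshold=0.0):
    """Keep only the samples whose contour value exceeds the threshold."""
    h = contour_values(grids)
    return {gid: [s for n, s in enumerate(m) if h[(gid, n)] > threshold]
            for gid, m in grids.items()}


# The third sample of grid 0 looks like grid 1 in signal space (mislabeled).
grids = {0: [[-40, -70], [-42, -68], [-70, -40]], 1: [[-71, -41], [-69, -39]]}
kept = keep_reliable(grids)
```

In this toy input the mislabeled sample has a strongly negative contour value and is dropped, while the four consistent samples are retained.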

Step S3, constructing a surface from the retained credible crowdsourced samples, then sampling the surface at fixed intervals to generate fitted samples.

The signal surface is first fitted using the crowdsourced samples that remain after rejection. Let the set of all available APs be $A = \{AP_1, AP_2, \ldots, AP_M\}$. This embodiment applies a polynomial function $\phi_m(x, y)$ to fit the wireless signal propagation surface from the $m$-th AP in a given partition:

$$\phi_m(x, y) = \sum_{i} \sum_{j} a_{ij}\, x^i y^j$$

wherein $a_{ij}$ are the fitting coefficients. If all reliable crowdsourced samples are assumed to have equal reliability, the objective function is to minimize the sum of squared residuals:

$$\min \sum_{n=1}^{\gamma} \left( \phi_m(x_n, y_n) - r_{n,m} \right)^2$$

where $r_{n,m}$ denotes the signal strength of the $m$-th AP in the $n$-th reliable sample. However, since the reliability values of the crowdsourced samples are not all equal, the sum of squared weighted residuals is minimized instead of the sum of squared residuals. The weights of the samples are set as follows.

Let the set of contour values of all samples $s_n^k$ be $S_h = \{h_1, h_2, \ldots, h_N\}$ before rejection and $S_h' = \{h_1, h_2, \ldots, h_\gamma\}$ after rejection, where $N$ is the total number of original crowdsourced samples in the partition and $\gamma$ is the number of reliable crowdsourced samples remaining after rejection according to the contour value. The ratio of reliable crowdsourced samples is expressed as

$$\rho = \frac{\gamma}{N}$$

Let the minimum and maximum contour values in the reliability value set be $\min(S_h') = h_{\min}$ and $\max(S_h') = h_{\max}$. This embodiment uses a scaling function $f(h_n)$ to confine the reliability weight $w_n$ to the range $[\rho, 1]$:

$$w_n = f(h_n) = \rho + (1 - \rho)\,\frac{h_n - h_{\min}}{h_{\max} - h_{\min}}$$

The objective function that minimizes the sum of squared weighted residuals then becomes:

$$\min \sum_{n=1}^{\gamma} \left[ w_n \left( \phi_m(x_n, y_n) - r_{n,m} \right) \right]^2$$

After the wireless signal propagation surface is constructed, the function $\phi_m(x, y)$ takes an arbitrary coordinate position as input and returns a fitted RSS value, so this embodiment samples the constructed surface at regular intervals to obtain new fitted samples. Let the resampled sample set be $\Omega = \{\Psi_1, \Psi_2, \ldots, \Psi_F\}$, $\Psi_f = (\psi_1, \psi_2, \ldots, \psi_M, x_f, y_f)$, where $(\psi_1, \psi_2, \ldots, \psi_M)$ is the fitted RSS vector at the sampling-center coordinates $(x_f, y_f)$.
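The fitting and resampling of this step can be sketched for a single AP as a weighted polynomial least-squares problem solved through its normal equations. The sketch assumes that each residual is multiplied by its weight before squaring (if the weights are instead meant to multiply the squared residuals, replace `w * w` by `w`), a low polynomial degree, and sampling on a regular lattice; all function names and values are hypothetical.

```python
def poly_terms(x, y, degree):
    """Monomials x^i * y^j with i + j <= degree."""
    return [x ** i * y ** j for i in range(degree + 1) for j in range(degree + 1 - i)]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_surface(points, rss, weights, degree=1):
    """Return phi(x, y) minimising sum((w_n * (phi(x_n, y_n) - r_n)) ** 2)."""
    rows = [poly_terms(x, y, degree) for x, y in points]
    t = len(rows[0])
    ata = [[sum(w * w * row[i] * row[j] for row, w in zip(rows, weights))
            for j in range(t)] for i in range(t)]
    atb = [sum(w * w * row[i] * r for row, r, w in zip(rows, rss, weights))
           for i in range(t)]
    coeffs = solve(ata, atb)
    return lambda x, y: sum(c * v for c, v in zip(coeffs, poly_terms(x, y, degree)))


def resample(phi, x_max, y_max, step):
    """Sample the fitted surface on a regular lattice; returns (rss, x, y) tuples."""
    nx, ny = int(x_max / step), int(y_max / step)
    return [(phi(i * step, j * step), i * step, j * step)
            for i in range(nx + 1) for j in range(ny + 1)]


# Illustrative data lying exactly on the plane r = -40 - 2x - 3y.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
phi = fit_surface(points, [-40.0, -42.0, -43.0, -45.0], [1.0, 0.8, 0.9, 0.7])
fitted_samples = resample(phi, 1.0, 1.0, 0.5)
```

In a full pipeline one such surface would be fitted per AP and per partition, and the per-AP values at each lattice point would be combined into one fitted RSS vector.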

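Step S4, constructing a multi-resolution offline database in a bottom-up manner based on the fitted samples, wherein each partition fingerprint of the next coarser resolution layer is obtained from the set of partition fingerprints of the current resolution layer.

This embodiment takes the construction of the upper layer $L_2$ from the bottom layer $L_1$ as an example; the construction of subsequent higher layers is similar. First, the structure of $L_1$ is introduced. Let $q$ be one of the partitions of the bottom layer $L_1$, and let the fitted sample set corresponding to $q$ be $\Gamma^{\langle L_1,q\rangle} = \{\Psi_1, \Psi_2, \ldots, \Psi_\omega\}$, where $\omega$ is the number of resampled fitted samples in the $q$-th partition of layer $L_1$. The data of the $q$-th partition of layer $L_1$ is thus an $\omega \times M$ RSSI matrix.

The row mean vector of $\Gamma^{\langle L_1,q\rangle}$, $u^{\langle L_1,q\rangle} = (u_1^{\langle L_1,q\rangle}, \ldots, u_M^{\langle L_1,q\rangle})$, is then calculated and used as the representative vector of partition $q$. This embodiment uses the representative vectors to construct the database of layer $L_2$.

Suppose layer $L_2$ has $P$ partitions, each consisting of $K$ partitions of layer $L_1$ with representative vectors $(u^{\langle L_1,1\rangle}, u^{\langle L_1,2\rangle}, \ldots, u^{\langle L_1,K\rangle})$. The data of the $p$-th partition of layer $L_2$ is then:

$$
\Lambda^{\langle L_2,p\rangle} =
\begin{bmatrix}
AP_1 & AP_2 & \cdots & AP_M & Label \\
u_1^{\langle L_1,1\rangle} & u_2^{\langle L_1,1\rangle} & \cdots & u_M^{\langle L_1,1\rangle} & \langle L_2,p\rangle \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
u_1^{\langle L_1,K\rangle} & u_2^{\langle L_1,K\rangle} & \cdots & u_M^{\langle L_1,K\rangle} & \langle L_2,p\rangle \\
u_1^{\langle L_2,p\rangle} & u_2^{\langle L_2,p\rangle} & \cdots & u_M^{\langle L_2,p\rangle} & \langle L_2,p\rangle
\end{bmatrix}
$$

wherein $u_m^{\langle L_2,p\rangle}$, $m = 1, 2, \ldots, M$, is the average, in the $m$-th AP dimension, of the representative vectors of the $K$ partitions of layer $L_1$. It is calculated as follows:

$$u_m^{\langle L_2,p\rangle} = \frac{1}{K} \sum_{k=1}^{K} u_m^{\langle L_1,k\rangle}$$

It should be noted that $u^{\langle L_2,p\rangle} = (u_1^{\langle L_2,p\rangle}, \ldots, u_M^{\langle L_2,p\rangle})$ is the representative vector of the $p$-th partition of layer $L_2$, and it serves as a sub-partition row when higher layers such as $L_3$ are subsequently constructed.

To enhance the discriminability of the fingerprint, this embodiment also adds auxiliary features to each partition. For each row vector of the $L_2$ database, the auxiliary features are the Euclidean distances between that row of the original $\Lambda^{\langle L_2,p\rangle}$, namely $u^{\langle L_1,q\rangle}$ or $u^{\langle L_2,p\rangle}$ ($q = 1, \ldots, Q$; $p = 1, \ldots, P$), and the representative vector $u^{\langle L_2,p\rangle}$ ($p = 1, \ldots, P$) of each partition, calculated as follows:

$$d_{pq} = \sqrt{\sum_{m=1}^{M} \left( u_m^{\langle L_1,q\rangle} - u_m^{\langle L_2,p\rangle} \right)^2}$$

Therefore, after the auxiliary features are added, the data of the $p$-th partition of layer $L_2$ is:

$$
\begin{bmatrix}
AP_1 & \cdots & AP_M & F_1 & \cdots & F_P & Label \\
u_1^{\langle L_1,1\rangle} & \cdots & u_M^{\langle L_1,1\rangle} & d_{1,1} & \cdots & d_{P,1} & \langle L_2,p\rangle \\
\vdots & & \vdots & \vdots & & \vdots & \vdots \\
u_1^{\langle L_1,K\rangle} & \cdots & u_M^{\langle L_1,K\rangle} & d_{1,K} & \cdots & d_{P,K} & \langle L_2,p\rangle \\
u_1^{\langle L_2,p\rangle} & \cdots & u_M^{\langle L_2,p\rangle} & \lVert u^{\langle L_2,p\rangle} - u^{\langle L_2,1\rangle} \rVert & \cdots & \lVert u^{\langle L_2,p\rangle} - u^{\langle L_2,P\rangle} \rVert & \langle L_2,p\rangle
\end{bmatrix}
$$

The same method is then applied layer by layer: the higher-resolution data of each lower layer is aggregated into the data of the layer above, up to the top layer, forming a bottom-up database structure.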
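The construction of one upper layer with auxiliary features can be sketched as follows: the representative vector of each upper-layer partition is the mean of its sub-partition vectors, and every row is extended with its Euclidean distance to the representative vector of each upper-layer partition. The function names, partition labels, and RSS values are hypothetical, and nested Python lists stand in for the actual database storage.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_layer(partitions):
    """partitions: dict label -> list of sub-partition vectors (length M each).

    Returns (database, reps): database[label] holds rows of the form
    [u_1..u_M, d_1..d_P, label], the last row being the mean (representative) row.
    """
    reps = {label: [sum(col) / len(subs) for col in zip(*subs)]
            for label, subs in partitions.items()}
    order = list(partitions)                  # fixes the order of F_1..F_P
    database = {}
    for label, subs in partitions.items():
        rows = []
        for vec in list(subs) + [reps[label]]:
            distances = [euclidean(vec, reps[p]) for p in order]
            rows.append(list(vec) + distances + [label])
        database[label] = rows
    return database, reps


# Two upper-layer partitions, each with two sub-partitions and two APs.
partitions = {"p1": [[-40.0, -70.0], [-44.0, -66.0]],
              "p2": [[-70.0, -40.0], [-66.0, -44.0]]}
database, reps = build_layer(partitions)
```

Calling the same function on the returned representative vectors, grouped by their parent partitions, would produce the next coarser layer.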

Step S5, in the multi-resolution online positioning stage, determining the partition of the target layer by layer from the coarse-resolution layer to the fine-resolution layer in a top-down manner, using a KNN classification model.

The pyramid structure of the database simplifies KNN classification, so this embodiment uses the classical KNN method for classification at every resolution. The KNN model requires no training process because the entire data set has been constructed before online positioning.

Multi-resolution online positioning processes each online request according to its required resolution level. In classical indoor positioning, the test fingerprint consists only of the RSS measurements from the APs. In multi-resolution online positioning, however, the test fingerprint also needs an identifier indicating the required resolution level. Let $F_t = (r_1^t, r_2^t, \ldots, r_M^t)$ denote the RSS measurement of the test fingerprint.

In addition to this RSS fingerprint, the user appends an identifier of the required resolution level, so that the test fingerprint consists of $F_t$ together with the level identifier.

Within each layer, the data of every partition has the same form $\Lambda^{\langle L_j,p\rangle}$. When the target user uploads the RSS vector, the system computes the auxiliary features for the target user to generate a complete test fingerprint. For example, when searching for the partition in layer $L_2$, the complete test fingerprint of the target user is:

$$(r_1^t, r_2^t, \ldots, r_M^t, d_{1t}, \ldots, d_{pt}, \ldots, d_{Pt})$$

wherein $d_{pt}$ is the Euclidean distance from $F_t$ to $u^{\langle L_2,p\rangle}$. The most similar fingerprint is then found among the data of all partitions of layer $L_2$, and the partition identifier $\langle L_2,p\rangle$ corresponding to it is taken as the estimated partition of the target user.

If no level identifier is given, precise positioning is assumed to be required: the classification chain is traversed through every layer, and a precise position estimate is finally returned, as shown in FIG. 3.
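The top-down traversal with an optional resolution identifier can be sketched as below. For brevity, the sketch matches the query only against the representative fingerprint of each candidate partition (a 1-nearest-neighbor search over mean rows), whereas the embodiment searches all rows of the layer; the tree layout with `label`, `level`, `rep`, and `children` keys and the function names are hypothetical. The final regression step is not included.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def locate(rss, root, stop_level=None):
    """Descend from the coarsest layer; stop at `stop_level` or at a leaf.

    Returns the list of partition labels visited on the way down.
    """
    node, path = root, []
    while node["children"]:
        reps = [child["rep"] for child in node["children"]]
        query = list(rss) + [euclidean(rss, r) for r in reps]

        def score(child):
            fingerprint = list(child["rep"]) + [euclidean(child["rep"], r) for r in reps]
            return euclidean(query, fingerprint)

        node = min(node["children"], key=score)
        path.append(node["label"])
        if stop_level is not None and node["level"] == stop_level:
            break
    return path


def leaf(label, rep):
    return {"label": label, "level": 1, "rep": rep, "children": []}


# Toy three-level hierarchy: root (level 3) -> A, B (level 2) -> leaves (level 1).
root = {"label": "root", "level": 3, "rep": None, "children": [
    {"label": "A", "level": 2, "rep": [-42.0, -68.0],
     "children": [leaf("A1", [-40.0, -70.0]), leaf("A2", [-44.0, -66.0])]},
    {"label": "B", "level": 2, "rep": [-68.0, -42.0],
     "children": [leaf("B1", [-70.0, -40.0]), leaf("B2", [-66.0, -44.0])]},
]}
full_path = locate([-41.0, -69.5], root)
coarse_path = locate([-41.0, -69.5], root, stop_level=2)
```

Stopping at the requested level avoids the distance computations of every finer layer, which is the source of the savings described for coarse-resolution requests.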

Step S6, after the target has been classified into the partition with the highest resolution, using an XGBoost regression model for the final precise positioning of the target user.

At the lowest layer $L_1$, each partition has its own database; for example, the fitted samples of the $k$-th partition are $S_k$. These samples are used to train a regression model whose input is $(r_1, r_2, \ldots, r_M)$ and whose regression target is $(x_n, y_n)$. This embodiment uses XGBoost as the regression model for precise positioning. XGBoost is an improved version of the gradient boosting algorithm GBDT that improves performance and increases computation speed.

The multi-resolution fingerprint positioning system based on Wi-Fi crowdsourcing samples provided by the above method embodiments is further explained with reference to an application example in a specific scenario.

In this example, there are two data sets, the first being HUSTDataset. Its target area is shown in fig. 4, and at least 70 AP signals can be received for each sample. In the data set acquisition process, a signal sample is randomly acquired by using a smart phone and then divided into training data and testing data. Finally, this example adds gaussian noise with a variance of 0.6 to the position label of the sample to simulate an error-carrying crowd-sourced sample in the HUSTDataset. The second data set is from a common data set named ujindorloc. The UJIIndoorLoc database covers three buildings as shown in fig. 5.

The present embodiment first evaluates the classification performance under low resolution, which refers to a sub-area with a large area or with obvious physical boundaries, for example: buildings, floors, rooms, etc. In addition to performing the experiment of the KNN algorithm adopted in the present embodiment, the present embodiment also compares other advanced machine learning enhancement algorithms, such as XGBoost, LightGBM. The experimental result based on the ujiindiororc dataset is shown in fig. 6, and the KNN algorithm adopted in the present embodiment has the highest hit rate, which can reach more than 98%. The effectiveness of the method provided by the embodiment in low resolution is verified, and a foundation is laid for subsequent high resolution.

Then, the present embodiment evaluates classification performance at high resolution, where high resolution refers to a partition structure with finer granularity below the room level. The experimental results on HUSTDataset are shown in FIG. 7: the classification model using KNN performs better than the other models, and the accuracy improves significantly as the hierarchy level increases (the higher the level, the lower the resolution).
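The layer-by-layer use of the partition hierarchy can be sketched as a descent from the lowest-resolution layer to an L₁ sub-area. The sketch below is only an illustration under stated assumptions: each partition is represented by a single representative fingerprint, the child whose representative is nearest to the test fingerprint is chosen (a 1-nearest-neighbour stand-in for the KNN classifier of the embodiment), and the node layout with the keys "label", "rep" and "children" is hypothetical.

```python
import math


def descend(node, fingerprint):
    """Walk from the lowest-resolution layer down to a leaf partition.

    node: {"label": str, "rep": tuple, "children": [node, ...]}
    Returns the list of partition labels visited, coarsest first.
    """
    path = [node["label"]]
    while node["children"]:
        # Only the children of the current partition are compared.
        node = min(node["children"], key=lambda c: math.dist(c["rep"], fingerprint))
        path.append(node["label"])
    return path


# Hypothetical three-layer hierarchy with two-AP representative fingerprints.
tree = {"label": "building", "rep": (-60.0, -60.0), "children": [
    {"label": "floor1", "rep": (-50.0, -70.0), "children": [
        {"label": "room101", "rep": (-45.0, -75.0), "children": []},
        {"label": "room102", "rep": (-55.0, -65.0), "children": []}]},
    {"label": "floor2", "rep": (-70.0, -50.0), "children": [
        {"label": "room201", "rep": (-65.0, -55.0), "children": []},
        {"label": "room202", "rep": (-75.0, -45.0), "children": []}]}]}

path = descend(tree, (-54.0, -66.0))
```

Because each step compares the test fingerprint only with the sub-partitions of the partition already selected, the number of comparisons grows with the depth of the hierarchy rather than with the total number of finest-level partitions.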

Finally, the present embodiment evaluates the accurate positioning performance of different methods. An experimental scenario based on HUSTDataset is set up to measure the fine positioning performance of the proposed system against other crowdsourcing-based methods. The comparison schemes are the nearest neighbor positioning method, the weighted surface positioning method of patent CN109059919A, and the multi-level positioning method of patent CN111474516A. FIG. 8 shows the cumulative distribution curves of the positioning error. It can be seen that the method provided by the embodiment significantly improves the positioning accuracy. This result shows that the proposed structure, which proceeds layer by layer according to resolution, improves not only the region hit performance but also the accurate positioning performance.
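A cumulative distribution curve of the positioning error, as plotted in such comparisons, can be computed as in the following minimal sketch. The function names and the toy positions are assumptions for illustration; they are not the experimental data.

```python
import math


def positioning_errors(true_positions, estimated_positions):
    """Euclidean distance between each true and estimated (x, y) position."""
    return [math.dist(t, e) for t, e in zip(true_positions, estimated_positions)]


def empirical_cdf(errors):
    """Pair the i-th smallest error with the fraction i / n."""
    ordered = sorted(errors)
    n = len(ordered)
    return [(e, (i + 1) / n) for i, e in enumerate(ordered)]


# Hypothetical ground truth and estimates for four test samples.
truth = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
estimates = [(3.0, 4.0), (0.0, 1.0), (0.0, 2.0), (6.0, 8.0)]
cdf = empirical_cdf(positioning_errors(truth, estimates))
```

Plotting the second element of each pair against the first gives the curve; a method whose curve lies further to the left reaches a given fraction of samples at a smaller error.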

It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A method for generating a Wi-Fi reliable crowdsourcing sample set, the method comprising:

2. The method of claim 1, wherein the reliability metric takes the form of a contour (silhouette) coefficient [symbol omitted in source], and the calculation formula is as follows:

[formula omitted in source]

wherein [symbol omitted] represents a sample, [symbol omitted] represents the set of all samples within the grid of sample [symbol omitted], and [symbol omitted] represents the signal distance between sample [symbol omitted] and sample [symbol omitted].

3. The method of claim 2, wherein the surface fitting is as follows:

the set of contour values of the original crowdsourced samples is S_h = {h₁, h₂, ..., h_N}, and the set of contour values of the trusted crowdsourced samples is S_h′ = {h₁, h₂, ..., h_γ}, where N denotes the number of original crowdsourced samples, γ denotes the number of trusted crowdsourced samples, and the contour value is [definition omitted in source];

using the contour value h_n, a weight coefficient w_n, n = 1, ..., γ, is computed for each sample [symbol omitted] among the trusted crowdsourced samples:

[formula omitted in source]

where p represents the ratio of reliable crowdsourced samples;

using the weight coefficients w_n, a function is fitted to the signal of the m-th AP so that the weighted squared difference between its value at position (x_n, y_n) and the sample [symbol omitted] is minimized; the objective function minimizing the weighted sum of squared residuals is:

[formula omitted in source]

4. A method for constructing a multi-resolution offline database, comprising the following steps:

T1. generating a reliable Wi-Fi crowdsourcing sample set using the method of any one of claims 1 to 3;

5. The method of claim 4, wherein the layers of the multi-resolution offline database, ordered from high resolution to low resolution, are named L₁, L₂, ..., L_J, wherein layer L_j has P partitions, each partition consisting of K sub-partitions of layer L_j-1, and the data for the p-th partition of layer L_j is:

[table omitted in source]

wherein AP₁, AP₂, ..., AP_M respectively denote the 1st, 2nd, ..., M-th signal emission sources; Label denotes the label of the partition to which the data belongs; each row except the last represents the signal vector of one sub-partition, and the last row represents the average vector of all sub-partition vectors; each column except the last represents the signal strength of one AP, and the last column represents the partition label of the data; [symbol omitted] is the average, over the K partitions of layer L_j-1, of the signal strength of their representative vectors in the m-th AP dimension; j = 1, 2, ..., J, where J denotes the number of resolution layers; p = 1, 2, ..., P; and m = 1, 2, ..., M.

6. The method of claim 4, wherein the layers of the multi-resolution offline database, ordered from high resolution to low resolution, are named L₁, L₂, ..., L_J, wherein layer L_j has P partitions, each partition consisting of K sub-partitions of layer L_j-1, and the data for the p-th partition of layer L_j is:

[table omitted in source]

wherein AP₁, AP₂, ..., AP_M respectively denote the 1st, 2nd, ..., M-th signal emission sources; F₁, ..., F_P respectively denote the 1st, ..., P-th auxiliary features; Label denotes the label of the partition to which the data belongs; each row except the last represents the fingerprint of one sub-partition, and the last row represents the average of all sub-partition fingerprints; each of the first M columns represents the signal strength value of one AP, each of the next P columns represents the value of one auxiliary feature, and the last column represents the partition label of the data; [symbol omitted] represents the signal strength values of the M APs received by the K-th partition of layer L_j-1; [symbol omitted] is the average, over the K partitions of layer L_j-1, of the signal strength of their representative vectors in the m-th AP dimension; d_pq denotes the distance between the feature [symbol omitted] or [symbol omitted] and the representative vector [symbol omitted] of each region, including its own; j = 1, 2, ..., J, where J denotes the number of resolution layers; p = 1, 2, ..., P; q = 1, 2, ... [range over layer L_j-1 garbled in source]; and M denotes the total number of signal emission sources.

7. A multi-resolution positioning method, the positioning method comprising:

8. The positioning method according to claim 7, wherein step (1) comprises the following sub-steps:

(1.1) forming a complete test fingerprint of the target user:

[formula omitted in source]

wherein [symbols omitted] respectively denote the signal strengths of the 1st, 2nd, ..., M-th APs received by the target user, and d_1t, ..., d_pt, ..., d_Pt denote the vector distances between the original test fingerprint and the partition fingerprints, d_pt being the distance to the p-th partition fingerprint;

9. A multi-resolution positioning system, comprising: a computer-readable storage medium and a processor;

the processor is configured to read executable instructions stored in the computer-readable storage medium and execute the multi-resolution positioning method of claim 7 or 8.

10. A computer-readable storage medium comprising a stored computer program; the computer program, when executed by a processor, controls an apparatus on which the computer-readable storage medium is located to perform the method of any of claims 1 to 8.