CN114449651A - Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning - Google Patents

Publication number
CN114449651A
Authority
CN
China
Prior art keywords
resolution
partition
samples
crowdsourcing
positioning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210064098.1A
Other languages
Chinese (zh)
Other versions
CN114449651B (en)
Inventor
王邦
谭飞
周婵欣
莫益军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN202210064098.1A
Publication of CN114449651A
Application granted
Publication of CN114449651B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W64/00 Locating users or terminals or network equipment for network management purposes, e.g. mobility management
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/02 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using radio waves
    • G01S5/0252 Radio frequency fingerprinting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30 Services specially adapted for particular environments, situations or purposes
    • H04W4/33 Services specially adapted for particular environments, situations or purposes for indoor environments, e.g. buildings
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/80 Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication

Abstract

The invention discloses a method for generating a reliable crowdsourcing sample set, constructing a multi-resolution database, and positioning, and belongs to the technical field of communication and wireless networks. A multi-resolution positioning database structure is constructed, and online positioning proceeds from low resolution to high resolution, which effectively reduces the computational complexity of searching a large database. Reliable crowdsourced samples are identified by a reliability measurement algorithm, which addresses the inaccurate position labels of crowdsourced samples. A surface is fitted to the reliable samples and resampled to generate new fitted samples, which addresses the uneven distribution of crowdsourced sample labels. Auxiliary features based on the Euclidean distances between partitions increase the discriminative power of the fingerprint and the robustness of the positioning system. By reasonably combining classification and regression models, positioning accuracy at different resolutions is improved while computational complexity is reduced.

Description

Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning
Technical Field
The invention belongs to the technical field of communication and wireless networks, and particularly relates to a method for generating a reliable crowdsourcing sample set, constructing a multi-resolution database, and positioning.
Background
Positioning technology is of interest in many civilian and military applications, such as search and rescue missions, the Internet of Things, and robot navigation. Although the Global Positioning System (GPS) is a mature technology, its accuracy makes it unsuitable for indoor environments. In recent years, many indoor positioning technologies have been developed, such as visible light communication, Bluetooth, and ultra-wideband. Each has its advantages and disadvantages, but Wi-Fi fingerprinting is the most widely used because of its broad applicability. However, as indoor environments grow in size and complexity, Wi-Fi fingerprinting faces two major challenges.
First, as the complexity of the indoor environment increases, the cost of the site survey required to build the Wi-Fi offline database rises dramatically. Crowdsourcing, one of the most effective solutions, relies on anonymous members of the public on the Internet to collect samples, thereby relieving the enormous burden of site surveying. However, crowdsourced samples are collected by laypersons and often carry labeling errors. In addition, because the crowdsourced collection process is random, the samples may be unevenly dense, and some local areas may have no samples at all. A crowdsourcing-based approach must solve these problems.
Second, as the indoor environment scales up, searching for a precise location requires longer processing time and more computing resources. Mainstream indoor positioning research has focused on accurate coordinate-level position estimation, that is, grid position estimates with sub-meter accuracy. However, some location-aware services do not require such precision, for example tracking medical devices in a hospital, locating a parking space at an airport, or quickly identifying the affected zone during firefighting. In such cases it is more meaningful to identify the area the user is in, such as a building or a room, than to provide an exact position given by coordinates, so location-aware services require different levels of positioning resolution. However, little indoor positioning research has addressed this need.
Disclosure of Invention
In view of the defects of the prior art and the need to improve on it, the invention provides a method for generating a reliable crowdsourcing sample set, constructing a multi-resolution database, and positioning. By setting multiple resolution levels, it aims to simplify the database architecture in large, complex indoor positioning environments, meet the requirements of different location-aware services, and speed up positioning in the online stage. It also mitigates, to a certain extent, the problems of mislabeled positions and uneven sampling in crowdsourced samples.
To achieve the above object, according to a first aspect of the present invention, there is provided a method for generating a Wi-Fi reliable crowdsourcing sample set, the method comprising:
S1. splitting a target area from top to bottom in a multi-level manner until it is split into grids, obtaining units at different levels (target area, sub-areas, sub-sub-areas, …, grids), wherein the grids correspond to the highest resolution;
S2. distributing the original crowdsourced samples based on the multi-level splitting result of the target area;
S3. measuring the reliability of the crowdsourced samples in each grid, and retaining the credible crowdsourced samples whose reliability is greater than a threshold;
S4. performing surface fitting on all retained credible crowdsourced samples;
S5. uniformly sampling the fitted surface function to obtain a Wi-Fi reliable crowdsourcing sample set.
Preferably, the reliability measure is the contour (silhouette) coefficient $h(s_n^k)$, calculated as follows:
$$a(s_n^k)=\frac{1}{|S_k|-1}\sum_{s_i^k\in S_k,\ i\neq n} d(s_n^k,s_i^k)$$
$$b(s_n^k)=\min_{l\neq k}\frac{1}{|S_l|}\sum_{s_j^l\in S_l} d(s_n^k,s_j^l)$$
$$h(s_n^k)=\frac{b(s_n^k)-a(s_n^k)}{\max\{a(s_n^k),\,b(s_n^k)\}}$$
where $s_n^k$ represents the RSS vector collected at coordinates $(x_n,y_n)$ in grid $g_k$; $a(s_n^k)$ represents the signal distance between sample $s_n^k$ and the other crowdsourced samples in the grid it belongs to; $b(s_n^k)$ represents the minimum signal distance between sample $s_n^k$ and the crowdsourced samples in the other grids; $S_k$ represents the set of all samples in the grid containing $s_n^k$; and $d(s_n^k,s_i^k)$ represents the signal distance between sample $s_n^k$ and sample $s_i^k$.
Beneficial effects: To address the mislabeling of existing crowdsourced samples, the method uses the contour coefficient to evaluate how well each crowdsourced sample fits the area it is labeled with. Because the contour coefficient combines cohesion and separation, it gives a reasonable and effective reliability measure. Removing low-compatibility (that is, low-reliability) samples safeguards the quality of the database and improves the accuracy of subsequent positioning.
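The reliability filter described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function names (`signal_distance`, `contour_value`, `filter_reliable`), the dictionary layout of the grids, the neutral score for a sample with no neighbours, and the default threshold of 0 are all assumptions made for the example. The signal distance is taken to be Euclidean and the distance to another grid is averaged per grid, as in the detailed description.

```python
import math

def signal_distance(r1, r2):
    """Euclidean distance between two RSS vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))

def contour_value(sample, own_grid, other_grids):
    """Contour (silhouette) value of `sample`.

    own_grid: RSS vectors of the other samples in the sample's grid.
    other_grids: list of grids, each a list of RSS vectors.
    """
    others = [g for g in other_grids if g]
    if not own_grid or not others:
        return 0.0  # assumption: a sample with nothing to compare gets a neutral score
    # a: mean distance to the other samples of the same grid (cohesion)
    a = sum(signal_distance(sample, s) for s in own_grid) / len(own_grid)
    # b: smallest mean distance to the samples of any other grid (separation)
    b = min(sum(signal_distance(sample, s) for s in g) / len(g) for g in others)
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom

def filter_reliable(grids, threshold=0.0):
    """Keep, per grid, the samples whose contour value exceeds `threshold`."""
    kept = {}
    for k, samples in grids.items():
        others = [g for key, g in grids.items() if key != k]
        kept[k] = [
            s for i, s in enumerate(samples)
            if contour_value(s, samples[:i] + samples[i + 1:], others) > threshold
        ]
    return kept
```

For example, with two grids where one sample labeled as grid 0 actually has an RSS vector close to the samples of grid 1, `filter_reliable` drops that sample and keeps the rest.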
Preferably, the surface fitting is performed as follows:
Let the contour value set of the original crowdsourced samples be $S_h=\{h_1,h_2,\dots,h_N\}$ and the contour value set of the credible crowdsourced samples be $S_{h'}=\{h_1,h_2,\dots,h_\gamma\}$, where N denotes the number of original crowdsourced samples, γ denotes the number of credible crowdsourced samples, and the contour value is $h_n=h(s_n^k)$.
The contour value $h_n$ is used to compute the weight coefficient $w_n$, n = 1, …, γ, of each sample $s_n^k$ among the credible crowdsourced samples:
$$\rho=\frac{\gamma}{N}$$
$$h_{\min}=\min(S_{h'})$$
$$h_{\max}=\max(S_{h'})$$
$$w_n=f(h_n),\qquad f:[h_{\min},h_{\max}]\to[\rho,1]$$
where ρ represents the ratio of reliable crowdsourced samples and $f$ is a scaling function.
The weight coefficient $w_n$ weights the deviation between the value of the fitting function of the m-th AP at position $(x_n,y_n)$ and the signal strength $r_{n,m}$ of the m-th AP received in sample $s_n^k$. The objective function, which minimizes the weighted sum of squared residuals, is:
$$\min\sum_{n=1}^{\gamma} w_n\left(\phi_m(x_n,y_n)-r_{n,m}\right)^2$$
where $\phi_m(x,y)$ is the fitting surface function used to fit the wireless signal propagation surface of the AP in a given partition.
Beneficial effects: To address the inconsistent quality of existing crowdsourced samples, the weight assigned to each sample is calculated from its contour value. On the one hand, a sample's reliability is proportional to its contribution to the surface. On the other hand, constraining the weight $w_n$ to the range $[\rho,1]$ ensures that the difference between samples does not become too large. Constructing the wireless signal surface function in this weighted manner takes the differences between samples fully into account: more reliable crowdsourced samples receive larger weights and therefore have more influence on the surface construction, which enhances the reliability and validity of the fitted surface.
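The weighting and fitting step can be illustrated with a small standard-library sketch. Several points are assumptions made for the example and are not stated in the text: the scaling function is taken to be a linear min-max map onto [ρ, 1], the polynomial uses all monomials with total degree up to `degree`, and the names `scale_weights`, `basis`, `solve`, and `fit_surface` are hypothetical. The fit solves the weighted normal equations for one AP.

```python
def scale_weights(h_values, rho):
    """Map contour values linearly onto [rho, 1] (assumed form of the scaling)."""
    h_min, h_max = min(h_values), max(h_values)
    if h_max == h_min:
        return [1.0] * len(h_values)
    return [rho + (1 - rho) * (h - h_min) / (h_max - h_min) for h in h_values]

def basis(x, y, degree):
    """Monomials x**i * y**j with i + j <= degree."""
    return [x ** i * y ** j for i in range(degree + 1) for j in range(degree + 1 - i)]

def solve(A, b):
    """Gaussian elimination with partial pivoting; assumes a non-singular system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_surface(points, rss, weights, degree=1):
    """Weighted least squares: minimise sum_n w_n * (phi(x_n, y_n) - r_n) ** 2."""
    rows = [basis(x, y, degree) for x, y in points]
    k = len(rows[0])
    # Normal equations (A^T W A) c = A^T W r for the polynomial coefficients c.
    AtWA = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows)) for j in range(k)]
            for i in range(k)]
    AtWr = [sum(w * r[i] * v for w, r, v in zip(weights, rows, rss)) for i in range(k)]
    coeffs = solve(AtWA, AtWr)
    return lambda x, y: sum(c * t for c, t in zip(coeffs, basis(x, y, degree)))
```

Calling `fit_surface` once per AP yields one surface function per AP. In practice a numerical library routine for least squares would likely be preferable to the hand-written solver, which is kept here only so that the sketch has no dependencies.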
To achieve the above object, according to a second aspect of the present invention, there is provided a method for constructing a multi-resolution offline database, the method comprising:
T1. generating a Wi-Fi reliable crowdsourcing sample set using the method described in the first aspect;
T2. constructing a multi-resolution offline database in a bottom-up manner based on the multi-level splitting result of the target area, wherein each partition fingerprint of the next-coarser resolution layer is obtained from a set of partition fingerprints of the current resolution layer.
Preferably, the resolution layers of the multi-resolution offline database, ordered from high resolution to low resolution, are named $L_1,L_2,\dots,L_J$, where layer $L_j$ has P partitions and each partition consists of K sub-partitions of layer $L_{j-1}$. The data for the p-th partition of layer $L_j$ is then:
$$
\Lambda_{\langle L_j,p\rangle}=
\begin{bmatrix}
AP_1 & AP_2 & \cdots & AP_M & \text{Label}\\
u^{1}_{\langle L_{j-1},1\rangle} & u^{2}_{\langle L_{j-1},1\rangle} & \cdots & u^{M}_{\langle L_{j-1},1\rangle} & p\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
u^{1}_{\langle L_{j-1},K\rangle} & u^{2}_{\langle L_{j-1},K\rangle} & \cdots & u^{M}_{\langle L_{j-1},K\rangle} & p\\
u^{1}_{\langle L_j,p\rangle} & u^{2}_{\langle L_j,p\rangle} & \cdots & u^{M}_{\langle L_j,p\rangle} & p
\end{bmatrix}
$$
$$u^{m}_{\langle L_j,p\rangle}=\frac{1}{K}\sum_{k=1}^{K}u^{m}_{\langle L_{j-1},k\rangle}$$
where $AP_1$, $AP_2$, $AP_M$ represent the 1st, 2nd, and M-th signal emission sources, and Label represents the label of the partition. Each row except the last represents the signal vector of a sub-partition, and the last row represents the average of all sub-partition vectors. Each column except the last represents the signal strength of one AP, and the last column represents the partition label of the data. $u^{m}_{\langle L_{j-1},k\rangle}$ represents the signal strength value of the m-th AP received in the k-th partition of layer $L_{j-1}$, and $u^{m}_{\langle L_j,p\rangle}$ is the average, in the m-th AP dimension, of the representative vectors of the K partitions of layer $L_{j-1}$. Here j = 1, 2, …, J, where J is the number of resolution layers; p = 1, 2, …, P; and m = 1, 2, …, M, where M is the total number of signal emission sources.
Beneficial effects: To address the lack of distinctiveness in existing partition data, the fingerprint of an upper-layer partition is formed from the fingerprint set of its lower-layer sub-partitions. The partition fingerprint thus retains more of the original information of the sub-partitions, which increases the distinctiveness of each partition's data and improves the robustness of the positioning system.
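As a rough illustration of how one upper-layer partition block could be assembled from its sub-partitions, the sketch below stacks the K representative vectors, appends their mean as the last row, and attaches the partition label to every row. The names `column_mean` and `build_partition` and the list-of-lists layout are assumptions for the example only.

```python
def column_mean(rows):
    """Element-wise mean of equally long vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def build_partition(sub_vectors, label):
    """Data block of one upper-layer partition.

    sub_vectors: K representative RSS vectors (length M) of its sub-partitions.
    Returns (rows, mean_vec): K + 1 rows of M signal values followed by the
    partition label, and the mean vector that represents the partition itself.
    """
    mean_vec = column_mean(sub_vectors)
    rows = [list(v) + [label] for v in sub_vectors]
    rows.append(mean_vec + [label])  # last row: average of the sub-partition vectors
    return rows, mean_vec
```

The returned `mean_vec` can then serve as one of the sub-partition vectors when the next, coarser layer is built in the same way.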
Preferably, the resolution layers of the multi-resolution offline database, ordered from high resolution to low resolution, are named $L_1,L_2,\dots,L_J$, where layer $L_j$ has P partitions and each partition consists of K sub-partitions of layer $L_{j-1}$. The data for the p-th partition of layer $L_j$ is then:
$$
\Lambda_{\langle L_j,p\rangle}=
\begin{bmatrix}
AP_1 & \cdots & AP_M & F_1 & \cdots & F_P & \text{Label}\\
u^{1}_{\langle L_{j-1},1\rangle} & \cdots & u^{M}_{\langle L_{j-1},1\rangle} & d_{1,1} & \cdots & d_{P,1} & p\\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots\\
u^{1}_{\langle L_{j-1},K\rangle} & \cdots & u^{M}_{\langle L_{j-1},K\rangle} & d_{1,K} & \cdots & d_{P,K} & p\\
u^{1}_{\langle L_j,p\rangle} & \cdots & u^{M}_{\langle L_j,p\rangle} & d_{1,K+1} & \cdots & d_{P,K+1} & p
\end{bmatrix}
$$
where $AP_1$, $AP_2$, $AP_M$ represent the 1st, 2nd, and M-th signal emission sources; $F_1$, $F_P$ represent the 1st and P-th auxiliary features; and Label represents the label of the partition. Each row except the last represents the fingerprint of a sub-partition, and the last row (indexed K+1 in the matrix above) represents the average of all sub-partition fingerprints. Each of the first M columns holds the signal strength value of one AP, each of the next P columns holds the value of one auxiliary feature, and the last column holds the partition label of the data. $u^{m}_{\langle L_{j-1},k\rangle}$ represents the signal strength value of the m-th AP received in the k-th partition of layer $L_{j-1}$, and $u^{m}_{\langle L_j,p\rangle}$ is the average, in the m-th AP dimension, of the representative vectors of the K partitions of layer $L_{j-1}$. $d_{pq}$ represents the Euclidean distance between a feature vector, $u_{\langle L_{j-1},q\rangle}$ or $u_{\langle L_j,p\rangle}$, and the representative vector $u_{\langle L_j,p\rangle}$ of each partition. Here j = 1, 2, …, J, where J is the number of resolution layers; p = 1, 2, …, P; q = 1, …, Q, where Q is the number of partitions of layer $L_{j-1}$; and m = 1, 2, …, M, where M is the total number of signal emission sources.
Beneficial effects: To address the limited features of existing fingerprints, the Euclidean distances between the signal strength vectors of different partitions are used as auxiliary features that capture global characteristics. These auxiliary features improve the discriminative power of the fingerprint and thereby the robustness of the positioning system.
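The auxiliary features amount to appending, to every RSS row, its Euclidean distance to each partition's representative vector. A minimal sketch follows; the function names and the plain-list data layout are assumptions for illustration.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def add_auxiliary_features(rows, partition_reps):
    """Append to each RSS row its distances to all P partition representative vectors.

    rows: RSS vectors of length M (sub-partition vectors or a partition mean).
    partition_reps: the P representative vectors of the layer's partitions.
    Returns rows of length M + P.
    """
    return [list(r) + [euclidean(r, rep) for rep in partition_reps] for r in rows]
```

A row that coincides with the representative vector of partition p gets 0 in the p-th auxiliary column, so the auxiliary block encodes how each fingerprint sits relative to all partitions of the layer.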
To achieve the above object, according to a third aspect of the present invention, there is provided a multi-resolution positioning method including:
(1) sequentially determining the subareas of the target from the coarse resolution layer to the fine resolution layer according to a top-down mode, and classifying by using a classification model;
(2) if the resolution level is customized by the user, after the target is classified into a sub-region of the resolution level defined by the user, a regression model is adopted to carry out final accurate positioning on the target user; otherwise, after the target is classified into the sub-region with the highest resolution, the regression model is adopted to carry out final accurate positioning on the target user.
Preferably, step (1) comprises the following sub-steps:
(1.1) forming the complete test fingerprint of the target user
$$F_t'=(r_1^t,r_2^t,\dots,r_M^t,d_{1t},\dots,d_{pt},\dots,d_{Pt})$$
where $r_1^t,r_2^t,\dots,r_M^t$ denote the signal strengths of the 1st, 2nd, …, M-th APs received by the target user, and $d_{1t},\dots,d_{pt},\dots,d_{Pt}$ denote the vector distances between the original test fingerprint and the fingerprint of the p-th partition, p = 1, …, P;
(1.2) based on the complete test fingerprint of the target user, using the K-nearest-neighbor method to find the closest fingerprint in the offline database, and taking the corresponding partition identifier as the estimated sub-region of the target user.
Beneficial effects: To address the waste of resources in existing positioning systems, a resolution identifier is added so that unnecessary subsequent positioning operations stop early, which reduces computational complexity, meets different resolution requirements, and saves computing resources. In addition, when no identifier is given, all resolution layers are traversed automatically, which improves the generality of the system.
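The top-down search with an optional resolution identifier can be sketched as below. This is an illustrative simplification under stated assumptions: the database is a hypothetical dictionary mapping a partition label to the fingerprints of its children, the fingerprints are plain RSS vectors without the auxiliary distance features, and the final regression step is omitted. The names `knn_label`, `locate`, and `stop_level` are not from the text.

```python
import math
from collections import Counter

def knn_label(query, fingerprints, k=1):
    """Majority label among the k stored fingerprints closest to `query`."""
    ranked = sorted(fingerprints, key=lambda fp: math.dist(query, fp[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def locate(query, database, root, stop_level=None, k=1):
    """Descend from the coarsest layer; stop early at `stop_level` if given.

    database maps a partition label to [(fingerprint, child_label), ...];
    labels that are not keys of `database` are finest-level grids.
    Returns the list of partition labels chosen at each layer.
    """
    path, current, level = [], root, 0
    while current in database and (stop_level is None or level < stop_level):
        current = knn_label(query, database[current], k)
        path.append(current)
        level += 1
    return path
```

Each step compares the query only with the children of the partition chosen at the previous layer, which is where the reduction in matching work comes from; passing `stop_level=1` returns after the coarsest classification.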
To achieve the above object, according to a fourth aspect of the present invention, there is provided a multi-resolution positioning system including: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is configured to read executable instructions stored in the computer-readable storage medium and execute the multi-resolution positioning method according to the third aspect.
To achieve the above object, according to a fifth aspect of the present invention, there is provided a computer-readable storage medium including a stored computer program; when being executed by a processor, the computer program controls the device on which the computer readable storage medium is positioned to execute the method.
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
(1) To address the mislabeling and uneven distribution of existing crowdsourced samples, reliability measurement and uniform sampling of a fitted surface are used to regenerate the crowdsourced samples of each partition. Removing unreliable samples safeguards the quality of the database, and uniformly sampling the fitted surface yields evenly distributed samples, which makes the database reliable and stable.
(2) Aiming at the problem that the existing database only has a single-layer resolution structure, the databases with different resolution levels are sequentially constructed in a bottom-up mode, and because the large database is divided into smaller database components with different resolution levels, the positioning database with clear layers can accelerate the matching work of online fingerprints and offline fingerprints, thereby laying a data foundation for improving the positioning speed of online users.
(3) Aiming at the problem that the multi-resolution requirement of a user cannot be met by the existing positioning method, the user is positioned from top to bottom, and because the partition fingerprints of each layer have strong representativeness, the classification of each layer has high accuracy, so that the requirements of different resolutions are met.
Drawings
FIG. 1 is a flow chart of a multi-resolution fingerprint positioning system based on Wi-Fi crowdsourcing samples according to the present invention.
Fig. 2 is an architecture diagram of multi-resolution positioning according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a top-down online positioning stage according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the HUSTDataset data set provided in an embodiment of the present invention.
Fig. 5 is a schematic diagram of the UJIIndoorLoc public data set provided in an embodiment of the present invention.
Fig. 6 is a graph comparing the hit rates of the embodiment of the present invention and other machine learning algorithms on the UJIIndoorLoc data set.
FIG. 7 is a graph comparing the hit rates of the embodiment of the present invention and other machine learning algorithms on the HUSTDataset data set as a function of resolution level.
FIG. 8 is a comparison of the positioning effect of the embodiment of the present invention and other comparison methods.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention provides a multi-resolution fingerprint positioning system based on Wi-Fi crowdsourced samples. A multi-resolution positioning database structure is constructed, and online positioning proceeds from low resolution to high resolution, which effectively reduces the computational complexity of searching a large database. Reliable crowdsourced samples are identified by a reliability measurement algorithm, which addresses the inaccurate position labels of crowdsourced samples. A surface is fitted to the reliable samples and resampled to generate new fitted samples, which addresses the uneven distribution of crowdsourced sample labels. Auxiliary features based on the Euclidean distances between partitions increase the discriminative power of the fingerprint and the robustness of the positioning system. By reasonably combining classification and regression models, positioning accuracy at different resolutions is improved while computational complexity is reduced.
As shown in fig. 1, the present invention provides a multi-resolution fingerprint positioning system based on Wi-Fi crowd-sourced samples, the method comprising the steps of:
Step S1. First, the target area is split into several partitions, and each partition is then split more finely until the highest resolution level (the bottom layer) is reached. Finally, the crowdsourced samples are preliminarily distributed based on the bottom-layer partitions.
First, this embodiment partitions the space according to the obstacles and walls inherent to the building, yielding areas such as a lecture hall, an office, and a restaurant. Each partition is then divided into four parts with a grid structure, and each partition of that layer is in turn divided into four parts to form the next, higher-resolution layer. The sub-regions continue to be divided in the same manner until the highest resolution is reached. The space is thus organized into layers by resolution; ordered from high resolution to low resolution, the layers are named $L_1,L_2,\dots$, as shown in Fig. 2.
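The repeated four-way division can be pictured with a short sketch. It assumes, purely for illustration, that every partition is an axis-aligned rectangle `(x0, y0, x1, y1)` split into four equal quadrants; real partitions that follow walls and obstacles need not be rectangular. The names `split_four` and `build_layers` are hypothetical.

```python
def split_four(rect):
    """Split the rectangle (x0, y0, x1, y1) into four equal quadrants."""
    x0, y0, x1, y1 = rect
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    return [(x0, y0, xm, ym), (xm, y0, x1, ym), (x0, ym, xm, y1), (xm, ym, x1, y1)]

def build_layers(partitions, depth):
    """Return the layers from coarsest to finest; each layer lists its rectangles.

    partitions: rectangles of the coarsest layer (e.g. rooms).
    depth: number of four-way refinements to apply.
    """
    layers = [list(partitions)]
    for _ in range(depth):
        layers.append([q for rect in layers[-1] for q in split_four(rect)])
    return layers
```

With the layer naming used in the text, the last element of the returned list corresponds to the bottom layer L1 and the first element to the coarsest layer.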
After the partitioning of every level is completed, each partition of the bottom layer $L_1$ is taken as a basic grid, and the crowdsourced samples are assigned to the basic grids according to their position labels. Let the basic grids be $g_1,g_2,\dots,g_K$, where K is the number of grids in $L_1$. Let the crowdsourced sample set be $S=\{s_1,s_2,\dots,s_N\}$ with $s_n=(r_1,r_2,\dots,r_M,x_n,y_n)$, where N is the total number of crowdsourced samples, M is the total number of APs, and $(r_1,r_2,\dots,r_M)$ is the RSS vector of the sample collected at coordinates $(x_n,y_n)$. After assignment to a specific grid, each crowdsourced sample carries an additional label identifying the number of its grid: $s_n^k$ denotes a sample assigned to the k-th grid, and $S_k$ is the set of crowdsourced samples assigned to the k-th grid.
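A minimal sketch of this assignment step is shown below, assuming rectangular basic grids given as `(x0, y0, x1, y1)` and samples given as `(rss_vector, x, y)` tuples. Both layouts and the name `assign_to_grids` are assumptions for the example, as is the choice to silently drop samples whose position label falls outside every grid.

```python
def assign_to_grids(samples, grids):
    """Group crowdsourced samples by the basic grid containing their position label.

    samples: list of (rss_vector, x, y); grids: list of (x0, y0, x1, y1).
    Returns {grid index k: [samples whose (x, y) lies in grid k]}.
    """
    assigned = {k: [] for k in range(len(grids))}
    for rss, x, y in samples:
        for k, (x0, y0, x1, y1) in enumerate(grids):
            # Half-open intervals so a point on a shared edge lands in one grid only.
            if x0 <= x < x1 and y0 <= y < y1:
                assigned[k].append((rss, x, y))
                break
    return assigned
```

The dictionary key plays the role of the additional grid label k attached to each sample.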
Step S2. The reliability of the crowdsourced samples in each partition is measured, and only the credible crowdsourced samples whose reliability is greater than a threshold are retained.
First, this embodiment calculates the average signal distance between a crowdsourced sample $s_n^k$ and the other crowdsourced samples $s_i^k$ in the grid it belongs to:
$$a(s_n^k)=\frac{1}{|S_k|-1}\sum_{s_i^k\in S_k,\ i\neq n} d(s_n^k,s_i^k)$$
where $d(\cdot,\cdot)$ is the Euclidean signal distance between two crowdsourced samples, calculated as:
$$d(s_n^k,s_i^k)=\sqrt{\sum_{m=1}^{M}\left(r_{n,m}-r_{i,m}\right)^2}$$
with $r_{n,m}$ denoting the m-th component of the RSS vector of sample $s_n^k$. Since the summation does not include the Euclidean signal distance between the sample and itself, $d(s_n^k,s_n^k)$, the sum is divided by $|S_k|-1$. The value $a(s_n^k)$ indicates how well $s_n^k$ fits its grid; a smaller value represents a better fit.
Similarly, this embodiment calculates the average signal distance between the crowdsourced sample and all the crowdsourced samples in each of the other grids. Let $s_j^l$ be any crowdsourced sample in another grid $g_l$, $l\neq k$; then $b(s_n^k)$ is calculated as:
$$b(s_n^k)=\min_{l\neq k}\frac{1}{|S_l|}\sum_{s_j^l\in S_l} d(s_n^k,s_j^l)$$
$b(s_n^k)$ is a measure of dissimilarity; the grid with the smallest average dissimilarity is called the adjacent grid of $s_n^k$.
Finally, the contour value of the crowdsourced sample is calculated:
$$h(s_n^k)=\frac{b(s_n^k)-a(s_n^k)}{\max\{a(s_n^k),\,b(s_n^k)\}}$$
The contour value can be rewritten in the following form:
$$
h(s_n^k)=
\begin{cases}
1-\dfrac{a(s_n^k)}{b(s_n^k)}, & a(s_n^k)<b(s_n^k)\\[2mm]
0, & a(s_n^k)=b(s_n^k)\\[2mm]
\dfrac{b(s_n^k)}{a(s_n^k)}-1, & a(s_n^k)>b(s_n^k)
\end{cases}
$$
From the above equation it follows that:
$$-1\le h(s_n^k)\le 1$$
If the contour value is $h(s_n^k)<0$, the sample is closer in signal space to another grid than to the grid it is labeled with, so the sample is unreliable and this embodiment rejects it. Conversely, if the contour value is $h(s_n^k)>0$, the position label of the crowdsourced sample has a certain reliability, and the sample is retained.
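As a quick numerical check of the contour value, write $a$ for the mean signal distance from a sample to the other samples in its own grid and $b$ for the smallest mean signal distance to the samples of another grid, so that $h=(b-a)/\max\{a,b\}$. The numbers below are illustrative only and are not taken from the data sets used in the embodiment.

```latex
% Well-labeled sample: close to its own grid, far from the nearest other grid.
a = 4,\quad b = 10 \;\Rightarrow\; h = \frac{10-4}{\max\{4,10\}} = \frac{6}{10} = 0.6
% Likely mislabeled sample: farther from its own grid than from another grid.
a = 12,\quad b = 3 \;\Rightarrow\; h = \frac{3-12}{\max\{12,3\}} = \frac{-9}{12} = -0.75
```

The first sample has a positive contour value and would be retained; the second has a negative contour value and would be rejected.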
Step S3. A surface is constructed from the retained credible crowdsourced samples, and the surface is then sampled at fixed intervals to generate fitted samples.
The signal surface is first fitted using the crowdsourced samples that remain after rejection. Let the set of all available APs be $A=\{AP_1,AP_2,\dots,AP_M\}$. This embodiment applies a polynomial function $\phi_m(x,y)$ to fit the wireless signal propagation surface of the AP in a given partition:
$$\phi_m(x,y)=\sum_{i}\sum_{j}a_{ij}\,x^{i}y^{j}$$
where $a_{ij}$ are the fitting coefficients. If all reliable crowdsourced samples were assumed to have equal reliability values, the objective function would minimize the sum of squared residuals:
$$\min\sum_{n=1}^{\gamma}\left(\phi_m(x_n,y_n)-r_{n,m}\right)^2$$
where $r_{n,m}$ is the signal strength of the m-th AP in the n-th sample. However, because the reliability values of the crowdsourced samples are not equal, the weighted sum of squared residuals is minimized instead. The weights of the samples are set as follows.
Let the set of contour values $h(s_n^k)$ of all samples be $S_h=\{h_1,h_2,\dots,h_N\}$ before rejection and $S_{h'}=\{h_1,h_2,\dots,h_\gamma\}$ after rejection, where N is the total number of original crowdsourced samples in the partition and γ is the number of reliable crowdsourced samples remaining after rejection according to the contour value. The ratio of reliable crowdsourced samples is expressed as $\rho=\gamma/N$. Let the minimum and maximum contour values in the reliability value set be $h_{\min}=\min(S_{h'})$ and $h_{\max}=\max(S_{h'})$. This embodiment uses a scaling function $f(h_n)$ that keeps the reliability weight $w_n$ within the range $[\rho,1]$:
$$w_n=f(h_n),\qquad f:[h_{\min},h_{\max}]\to[\rho,1]$$
The objective function that minimizes the weighted sum of squared residuals then becomes:
$$\min\sum_{n=1}^{\gamma} w_n\left(\phi_m(x_n,y_n)-r_{n,m}\right)^2$$
After the wireless signal propagation surface is constructed, the function $\phi_m(x,y)$ takes an arbitrary coordinate position as input and returns a fitted RSS value, so this embodiment samples the constructed surface at regular intervals to obtain new fitted samples. Let the resampled sample set be $\Omega=\{\Psi_1,\Psi_2,\dots,\Psi_F\}$ with $\Psi_f=(\psi_1,\psi_2,\dots,\psi_M,x_f,y_f)$, where $(\psi_1,\psi_2,\dots,\psi_M)$ is the fitted RSS vector at the sampling-center coordinates $(x_f,y_f)$.
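Resampling the fitted surfaces at regular intervals could look like the sketch below. It assumes a rectangular sampling region and one fitted callable per AP; the name `resample_surface`, the `bounds` and `step` parameters, and the choice to sample at cell centers are assumptions for the example.

```python
def resample_surface(surfaces, bounds, step):
    """Sample fitted surfaces on a regular lattice.

    surfaces: one callable phi_m(x, y) per AP; bounds: (x0, y0, x1, y1).
    Returns fitted samples (psi_1, ..., psi_M, x_f, y_f) taken at cell centers.
    """
    x0, y0, x1, y1 = bounds
    # Number of whole cells per axis; a partial cell at the border is ignored.
    nx, ny = int((x1 - x0) / step), int((y1 - y0) / step)
    samples = []
    for i in range(nx):
        for j in range(ny):
            xf, yf = x0 + (i + 0.5) * step, y0 + (j + 0.5) * step
            samples.append(tuple(phi(xf, yf) for phi in surfaces) + (xf, yf))
    return samples
```

Because every cell contributes exactly one fitted sample, the resulting set is uniformly distributed over the partition regardless of where the original crowdsourced samples were collected.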
Step S4. Based on the fitted samples, a multi-resolution offline database is constructed in a bottom-up manner, and each partition fingerprint of the next-coarser resolution layer is obtained from a set of partition fingerprints of the current resolution layer.
This embodiment takes the construction of the upper layer $L_2$ from the bottom layer $L_1$ as an example; the construction of subsequent higher layers is similar. First, the structure of $L_1$ is introduced. Let q be one of the partitions of the bottom layer $L_1$, and let the fitted sample set corresponding to q be
$$\Omega_q=\{\Psi_1,\Psi_2,\dots,\Psi_\omega\},\qquad \Psi_f=(\psi_{f,1},\psi_{f,2},\dots,\psi_{f,M},x_f,y_f)$$
where ω is the number of resampled fitted samples in the q-th partition of layer $L_1$. The data of the q-th partition of layer $L_1$ takes the form of an ω × M RSSI matrix:
$$
\Gamma_{\langle L_1,q\rangle}=
\begin{bmatrix}
\psi_{1,1} & \psi_{1,2} & \cdots & \psi_{1,M}\\
\vdots & \vdots & \ddots & \vdots\\
\psi_{\omega,1} & \psi_{\omega,2} & \cdots & \psi_{\omega,M}
\end{bmatrix}
$$
Then the mean of the rows of $\Gamma_{\langle L_1,q\rangle}$, denoted $u_{\langle L_1,q\rangle}$, is calculated and used as the representative vector of partition q. This embodiment uses these representative vectors to construct the database of layer $L_2$.
Suppose L2The layer has P partitions, each partition consisting of K L1Partition of layer (u)<L1,1>,u<L1,2>,…,u<L1,K>) And (4) forming. Then L2The data format of the p-th region of the layer is:
Figure BDA0003479570600000131
where
Figure BDA0003479570600000132
(m = 1, 2, …, M) is the average of the K L_1 partition representative vectors in the m-th AP dimension. It is calculated as follows:
Figure BDA0003479570600000133
Note that
Figure BDA0003479570600000134
is the representative vector of the p-th partition of layer L_2, and it becomes a component in the subsequent construction of higher layers such as L_3.
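The aggregation from L_1 to L_2 described above can be illustrated with a short sketch. It assumes that each L_1 partition is held as an ω × M NumPy array and that an L_2 partition is stored as its K child representative vectors followed by their mean as the last row; the partition label column is omitted, and the function names are hypothetical.

```python
import numpy as np

def representative_vector(gamma):
    """Mean RSS vector (length M) of an (omega x M) matrix of fitted samples."""
    return np.asarray(gamma, dtype=float).mean(axis=0)

def aggregate_partition(child_vectors):
    """Stack K child representative vectors and append their mean as the last row."""
    U = np.vstack(child_vectors)                 # shape (K, M)
    u_parent = U.mean(axis=0)                    # representative vector of the parent
    return np.vstack([U, u_parent]), u_parent    # data has shape (K + 1, M)
```

Applying `aggregate_partition` again to the parent vectors returned for layer L_2 would produce the data for L_3, and so on up the hierarchy.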
To enhance the discriminability of the fingerprints, this embodiment also adds auxiliary features to each partition. For each row vector of the L_2 database, the auxiliary features are the Euclidean distances between the original feature in Λ<L2,p>, namely u<L1,q> or u<L2,p> (q = 1, …, Q; p = 1, …, P), and the representative vector u<L2,p> of each partition (p = 1, …, P), calculated as follows:
Figure BDA0003479570600000135
Therefore, after the auxiliary features are added, the data of the p-th partition of layer L_2 is:
Figure BDA0003479570600000136
Then, by the same method, lower-layer data with higher resolution is aggregated layer by layer into upper-layer data until the highest layer is reached, forming a bottom-up database structure.
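A minimal sketch of the auxiliary-feature step follows. It assumes the RSS rows of one partition and the representative vectors of all P partitions in the same layer are available as NumPy arrays; `add_auxiliary_features` is a hypothetical name, not one used in the patent.

```python
import numpy as np

def add_auxiliary_features(rows, layer_reps):
    """Append to each row its Euclidean distance to every partition representative.

    rows:       (R, M) RSS row vectors of one partition (child vectors plus mean row)
    layer_reps: (P, M) representative vectors of all P partitions in the layer
    returns:    (R, M + P) matrix laid out as [RSS features | d_1 ... d_P]
    """
    rows = np.asarray(rows, dtype=float)
    layer_reps = np.asarray(layer_reps, dtype=float)
    # Broadcasting gives an (R, P) table of pairwise distances.
    d = np.linalg.norm(rows[:, None, :] - layer_reps[None, :, :], axis=2)
    return np.hstack([rows, d])
```

Each row grows from M to M + P columns, so a fingerprint carries both its raw RSS values and its position relative to every partition of the layer.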
S5. In the multi-resolution online positioning stage, determine the partition of the target layer by layer, from the coarse resolution layer to the fine resolution layer in a top-down manner, classifying with a KNN classification model.
The top-down pyramid structure of the database may make KNN classification easier, so the present embodiment uses a classical KNN classification method for classification at multiple resolutions. The KNN model employed does not require a training process because the entire data set has already been constructed prior to online localization.
Multi-resolution online positioning processes online requests according to the requested resolution level. In classical indoor positioning, the test fingerprint consists only of the RSS measurements from the APs. In multi-resolution online positioning, however, the test fingerprint also needs an identifier indicating the required resolution level. Let F_t denote the RSS measurement of the test fingerprint:
Figure BDA0003479570600000141
In addition to this RSS fingerprint, the user also needs to add a requirement identifier of one resolution level to the test fingerprint, which is formed as follows:
Figure BDA0003479570600000142
For each layer, the data of every partition has the same form:
Figure BDA0003479570600000143
When the target user uploads an RSS vector, the system computes the auxiliary features for the target user to generate a complete test fingerprint. For example, when searching for the sub-area at layer L_2, the complete test fingerprint of the target user is:
Figure BDA0003479570600000144
where d_pt is the Euclidean distance from F_t to u<L2,p>. The most similar fingerprint is then found among the data of all partitions of layer L_2, and the partition identifier <L2,p> corresponding to it is taken as the estimated sub-area of the target user.
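The single-layer matching step can be sketched as below. This is an illustrative sketch, assuming the layer's fingerprints are already stored with their auxiliary distance columns and that the KNN decision is a plain majority vote; `complete_test_fingerprint` and `knn_partition` are hypothetical names.

```python
from collections import Counter

import numpy as np

def complete_test_fingerprint(f_t, layer_reps):
    """Append the distances d_1t..d_Pt from F_t to each partition representative."""
    f_t = np.asarray(f_t, dtype=float)
    d = np.linalg.norm(np.asarray(layer_reps, dtype=float) - f_t, axis=1)
    return np.concatenate([f_t, d])

def knn_partition(f_t, layer_rows, layer_labels, layer_reps, k=1):
    """Return the partition label voted for by the k nearest stored fingerprints.

    layer_rows:   (R, M + P) stored fingerprints including auxiliary features
    layer_labels: length-R sequence of partition identifiers
    """
    query = complete_test_fingerprint(f_t, layer_reps)
    dist = np.linalg.norm(np.asarray(layer_rows, dtype=float) - query, axis=1)
    nearest = np.argsort(dist, kind="stable")[:k]
    votes = Counter(layer_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

With `k=1` this reduces to picking the single most similar fingerprint, which matches the description above; larger `k` is a design choice not fixed by the text.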
If no level identifier is given, the request is assumed to be for precise positioning: the classification chain is traversed layer by layer, and a precise position estimate is finally returned, as shown in FIG. 3.
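The top-down traversal with an optional resolution-level identifier can be sketched as follows. The nested-dictionary layout of the database and the use of a simple nearest-representative rule at each layer (in place of the full KNN match with auxiliary features) are simplifying assumptions made only for this illustration.

```python
import numpy as np

def classify_chain(f_t, root, level=None):
    """Descend from the coarsest layer, picking the nearest child at each step.

    root:  nested dict {"label": str, "rep": vector, "children": [nodes]}
    level: number of layers to descend; None means descend to the finest layer
    returns the list of partition labels visited, from coarse to fine
    """
    f_t = np.asarray(f_t, dtype=float)
    node, path = root, []
    while node["children"] and (level is None or len(path) < level):
        reps = np.array([c["rep"] for c in node["children"]], dtype=float)
        nearest = int(np.argmin(np.linalg.norm(reps - f_t, axis=1)))
        node = node["children"][nearest]
        path.append(node["label"])
    return path
```

Calling it without `level` walks the whole chain down to the finest partition, where the regression stage would take over; passing `level=1` stops at the coarsest partitioning.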
S6. After the target is classified into the highest-resolution sub-region, an XGBoost regression model is used for the final precise positioning of the target user.
At the lowest layer L_1, each sub-area has its own database; for example, the fitted sample set of the k-th area is S_k. These samples are used to train a regression model whose input is (r_1, r_2, …, r_M) and whose regression target is (x_n, y_n). This embodiment uses XGBoost as the regression model for precise positioning. XGBoost is an improved version of the gradient boosting algorithm GBDT that can improve performance and speed up computation.
The multi-resolution fingerprint positioning system based on Wi-Fi crowdsourcing samples provided by the above method embodiments is further explained with reference to an application example in a specific scenario.
In this example, there are two data sets. The first is HUSTDataset. Its target area is shown in FIG. 4, and at least 70 AP signals can be received in each sample. During data collection, signal samples were collected at random with a smartphone and then divided into training data and test data. Finally, this example adds Gaussian noise with a variance of 0.6 to the position labels of the samples to simulate error-bearing crowdsourcing samples in HUSTDataset. The second data set is the public data set UJIIndoorLoc, which covers three buildings, as shown in FIG. 5.
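The label perturbation used to simulate crowdsourcing error can be sketched as follows. The sketch assumes zero-mean noise drawn independently for each coordinate; the text states only the variance, so these details and the function name `add_label_noise` are assumptions.

```python
import numpy as np

def add_label_noise(xy, variance=0.6, seed=None):
    """Add zero-mean Gaussian noise of the given variance to (x, y) position labels."""
    rng = np.random.default_rng(seed)
    xy = np.asarray(xy, dtype=float)
    # normal() takes a standard deviation, hence the square root of the variance.
    return xy + rng.normal(0.0, np.sqrt(variance), size=xy.shape)
```

A fixed `seed` makes the simulated crowdsourcing data reproducible across experiments.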
This embodiment first evaluates classification performance at low resolution, which refers to sub-areas that are large or have obvious physical boundaries, for example buildings, floors, and rooms. In addition to the KNN algorithm adopted in this embodiment, other advanced boosting-based machine learning algorithms, such as XGBoost and LightGBM, are compared. The experimental results on the UJIIndoorLoc data set are shown in FIG. 6: the KNN algorithm adopted in this embodiment has the highest hit rate, exceeding 98%. This verifies the effectiveness of the proposed method at low resolution and lays a foundation for the subsequent high-resolution stages.
Next, this embodiment evaluates classification performance at high resolution, which refers to finer-grained partition structures below the room level. As the experimental results on HUSTDataset in FIG. 7 show, the KNN classification model performs better than the other models, and accuracy improves significantly as the level increases (the higher the level, the lower the resolution).
Finally, this embodiment evaluates the precise positioning performance of different methods. An experimental scenario based on HUSTDataset is set up to measure the precise positioning performance of the proposed system against other crowdsourcing-based methods. The comparison schemes are a nearest-neighbor positioning method, the weighted surface positioning method of patent CN109059919A, and the multi-level positioning method of patent CN111474516A. FIG. 8 shows the cumulative distribution curves of the positioning error. The method provided by this embodiment significantly improves positioning accuracy. This result shows that the layer-by-layer, resolution-progressive structure proposed in this embodiment improves not only the region hit performance but also the precise positioning performance.
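A cumulative distribution curve of positioning error, of the kind plotted in FIG. 8, can be computed as in the sketch below. It assumes the positioning error is the Euclidean distance between estimated and true coordinates; the function names are hypothetical and no values from the patent's experiments are reproduced.

```python
import numpy as np

def positioning_errors(estimated, truth):
    """Euclidean distance between each estimated and true (x, y) position."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(truth, dtype=float)
    return np.linalg.norm(diff, axis=1)

def error_cdf(errors):
    """Return sorted errors and the fraction of samples at or below each one."""
    e = np.sort(np.asarray(errors, dtype=float))
    p = np.arange(1, e.size + 1) / e.size
    return e, p
```

Plotting `p` against `e` for each method on the same axes gives curves that can be compared directly: a curve further to the left indicates smaller errors.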
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for generating a Wi-Fi reliable crowdsourcing sample set, the method comprising:
S1. splitting a target area from top to bottom in a multi-level manner until the target area is split into grids, obtaining units of different levels (target area, sub-areas, sub-sub-areas, …, grids), wherein the grids correspond to the highest resolution;
S2. allocating the original crowdsourcing samples based on the multi-level splitting result of the target area;
S3. measuring the reliability of the crowdsourcing samples in each grid, and retaining the trusted crowdsourcing samples whose reliability is greater than a threshold;
S4. performing surface fitting on all retained trusted crowdsourcing samples;
S5. uniformly sampling the fitted surface function to obtain the Wi-Fi reliable crowdsourcing sample set.
2. The method of claim 1, wherein the reliability metric takes the form of a silhouette coefficient
Figure FDA00034795705900000114
which is calculated as follows:
Figure FDA0003479570590000011
Figure FDA0003479570590000012
Figure FDA0003479570590000013
wherein
Figure FDA0003479570590000014
represents the RSS vector collected at coordinate (x_n, y_n) within grid g_k,
Figure FDA0003479570590000015
Figure FDA0003479570590000016
represents the signal distance between sample
Figure FDA0003479570590000017
and the other crowdsourcing samples in the same grid,
Figure FDA0003479570590000018
represents the minimum signal distance between sample
Figure FDA0003479570590000019
and all crowdsourcing samples in other grids, S_k represents the set of all samples within the grid containing sample
Figure FDA00034795705900000110
, and
Figure FDA00034795705900000111
represents the signal distance between sample
Figure FDA00034795705900000112
and sample
Figure FDA00034795705900000113
.
3. The method of claim 2, wherein the surface fitting is as follows:
the silhouette value set of the original crowdsourcing samples is S_h = {h_1, h_2, ..., h_N}, and the silhouette value set of the trusted crowdsourcing samples is S_h′ = {h_1, h_2, ..., h_γ}, where N denotes the number of original crowdsourcing samples, γ denotes the number of trusted crowdsourcing samples, and the silhouette value
Figure FDA0003479570590000021
using the silhouette value h_n, a weight coefficient is computed for sample
Figure FDA0003479570590000022
among the trusted crowdsourcing samples, denoted w_n, n = 1, ..., γ:
Figure FDA0003479570590000023
Figure FDA0003479570590000024
Figure FDA0003479570590000025
Figure FDA0003479570590000026
where ρ represents the proportion of reliable crowdsourcing samples;
the weight coefficient w_n is used to weight the residual between the value of the fitted function for the signal of the m-th AP at position (x_n, y_n) and sample
Figure FDA0003479570590000027
so that the objective function minimizing the sum of squared weighted residuals is:
Figure FDA0003479570590000028
wherein φ_m(x, y) is the fitted surface function used to fit the wireless signal propagation surface from the AP in a given partition.
4. A method for constructing a multi-resolution offline database, comprising:
T1. generating a Wi-Fi reliable crowdsourcing sample set using the method of any one of claims 1 to 3;
T2. constructing the multi-resolution offline database in a bottom-up manner based on the multi-level splitting result of the target area, wherein a partition fingerprint of the upper resolution layer is obtained from a set of partition fingerprints of the current resolution layer.
5. The method of claim 4, wherein the levels of the multi-resolution offline database, ordered from high resolution to low resolution, are named L_1, L_2, ..., L_J, wherein layer L_j has P partitions, each consisting of K sub-partitions of layer L_{j-1}, and the data of the p-th partition of layer L_j is:
Figure FDA0003479570590000031
Figure FDA0003479570590000032
wherein AP_1, AP_2, AP_M respectively denote the 1st, 2nd, and M-th signal emission sources; Label denotes the label of the partition to which the data belongs; every row except the last represents the signal vector of a sub-partition, and the last row represents the average vector of all sub-partition vectors; every column except the last represents the signal strength of an AP, and the last column represents the partition label of the data;
Figure FDA0003479570590000033
represents the signal strength value of the m-th AP received by the k-th partition of layer L_{j-1};
Figure FDA0003479570590000034
is the average signal strength, in the m-th AP dimension, of the representative vectors of the K L_{j-1} partitions; j = 1, 2, …, J, where J denotes the number of resolution layers; p = 1, 2, …, P; and m = 1, 2, …, M.
6. The method of claim 4, wherein the levels of the multi-resolution offline database, ordered from high resolution to low resolution, are named L_1, L_2, ..., L_J, wherein layer L_j has P partitions, each consisting of K sub-partitions of layer L_{j-1}, and the data of the p-th partition of layer L_j is:
Figure FDA0003479570590000041
wherein AP_1, AP_2, AP_M respectively denote the 1st, 2nd, and M-th signal emission sources; F_1, F_P respectively denote the 1st and P-th auxiliary features; Label denotes the label of the partition to which the data belongs; every row except the last represents the fingerprint of a sub-partition, and the last row represents the average of all sub-partition fingerprints; each of the first M columns represents the signal strength value of an AP, each of the next P columns represents the value of an auxiliary feature, and the last column represents the partition label of the data;
Figure FDA0003479570590000042
represents the signal strength value of the M-th AP received by the K-th partition of layer L_{j-1};
Figure FDA0003479570590000043
is the average signal strength, in the m-th AP dimension, of the representative vectors of the K L_{j-1} partitions; d_pq represents the distance between feature
Figure FDA0003479570590000044
or
Figure FDA0003479570590000045
and the representative vector of each partition
Figure FDA0003479570590000046
; j = 1, 2, …, J, where J denotes the number of resolution layers; p = 1, 2, …, P; q = 1, 2, … indexes the partitions of layer L_{j-1}; and M represents the total number of signal emission sources.
7. A multi-resolution positioning method, the positioning method comprising:
(1) sequentially determining the subareas of the target from the coarse resolution layer to the fine resolution layer according to a top-down mode, and classifying by using a classification model;
(2) if the resolution level is customized by the user, after the target is classified into a sub-region of the resolution level defined by the user, a regression model is adopted to carry out final accurate positioning on the target user; otherwise, after the target is classified into the sub-region with the highest resolution, the regression model is adopted to carry out final accurate positioning on the target user.
8. The positioning method according to claim 7, wherein the step (1) comprises the sub-steps of:
(1.1) forming the complete test fingerprint of the target user
Figure FDA0003479570590000047
wherein
Figure FDA0003479570590000051
respectively denote the signal strengths of the 1st, 2nd, ..., M-th APs received by the target user, and d_1t, ..., d_pt, ..., d_Pt respectively denote the vector distances between the original test fingerprint and the 1st, ..., p-th, ..., P-th partition fingerprints;
(1.2) based on the complete test fingerprint of the target user, using a K-nearest-neighbor method to find the closest fingerprint in the offline database, and taking the corresponding partition identifier as the estimated sub-region of the target user.
9. A multi-resolution positioning system, comprising: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is configured to read executable instructions stored in the computer-readable storage medium and execute the multi-resolution positioning method of claim 7 or 8.
10. A computer-readable storage medium comprising a stored computer program; the computer program, when executed by a processor, controls an apparatus on which the computer-readable storage medium is located to perform the method of any of claims 1 to 8.
CN202210064098.1A 2022-01-20 2022-01-20 Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning Active CN114449651B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210064098.1A CN114449651B (en) 2022-01-20 2022-01-20 Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning

Publications (2)

Publication Number Publication Date
CN114449651A true CN114449651A (en) 2022-05-06
CN114449651B CN114449651B (en) 2023-02-10

Family

ID=81367700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210064098.1A Active CN114449651B (en) 2022-01-20 2022-01-20 Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning

Country Status (1)

Country Link
CN (1) CN114449651B (en)

Also Published As

Publication number Publication date
CN114449651B (en) 2023-02-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant