CN114449651A - Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning
- Publication number
- CN114449651A, CN202210064098.1A
- Authority
- CN
- China
- Prior art keywords
- resolution
- partition
- samples
- crowdsourcing
- positioning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W64/00—Locating users or terminals or network equipment for network management purposes, e.g. mobility management
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/02—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using radio waves
- G01S5/0252—Radio frequency fingerprinting
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/30—Services specially adapted for particular environments, situations or purposes
- H04W4/33—Services specially adapted for particular environments, situations or purposes for indoor environments, e.g. buildings
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/80—Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication
Abstract
The invention discloses a method for generating a reliable crowdsourcing sample set, constructing a multi-resolution database, and positioning, and belongs to the technical field of communication and wireless networks. By constructing a multi-resolution positioning database structure, online positioning proceeds layer by layer from low resolution to high resolution, which effectively reduces the computational complexity of searching a large database. Reliable crowdsourced samples are identified by a reliability measurement algorithm, which addresses the problem of inaccurate position labels in crowdsourced samples. A surface is fitted to the reliable samples and new fitted samples are generated by resampling it, which addresses the problem of non-uniformly distributed crowdsourced samples. Auxiliary features based on the Euclidean distances between partitions increase the discriminability of the fingerprints and the robustness of the positioning system. By combining classification and regression models, positioning accuracy at different resolutions is improved while computational complexity is reduced.
Description
Technical Field
The invention belongs to the technical field of communication and wireless networks, and particularly relates to a method for generating a reliable crowdsourcing sample set, constructing a multi-resolution database, and positioning.
Background
Positioning technology has long been an active topic in many civilian and military applications, such as search and rescue missions, the Internet of Things, and robot navigation. Although the Global Positioning System (GPS) is mature, its accuracy indoors is insufficient, so it is not the best choice for indoor environments. In recent years, many indoor positioning technologies have been developed, such as visible light communication, Bluetooth, and ultra-wideband. Each has advantages and disadvantages, but Wi-Fi fingerprinting is the most widely used because of its broad applicability. However, as the size and complexity of indoor environments increase, Wi-Fi fingerprinting faces two major challenges.
First, as the complexity of the indoor environment increases, the cost of the site survey required to build the Wi-Fi offline database also increases dramatically. Crowdsourcing, one of the most effective solutions, relies on anonymous members of the public on the Internet to collect samples, thereby relieving the enormous burden of site surveying. However, crowdsourced samples are collected by non-professionals and often carry labeling errors. In addition, because crowdsourced collection is random, the sample density may be non-uniform, and some local areas may have no samples at all. These are the problems that a crowdsourcing-based method has to solve.
Second, as the indoor environment scales up, searching for a precise location requires longer processing time and additional computing resources. Mainstream indoor positioning research has focused on accurate coordinate-level position estimation, i.e., grid position estimates with sub-meter accuracy. However, some location-aware services do not require such precise positions, for example tracking medical devices in a hospital, locating a parking space at an airport, or rapid zone-level localization in firefighting. In such cases it is more meaningful to identify the area to which the user belongs, such as a building or a room, than to provide an accurate location given by coordinates, so many location-aware services require different levels of positioning resolution. However, little indoor positioning research has addressed this need.
Disclosure of Invention
In view of the defects of the prior art and the need for improvement, the invention provides a method for generating a reliable crowdsourcing sample set, constructing a multi-resolution database, and positioning. By setting a plurality of resolution levels, it aims to simplify the database architecture in large-scale, complex indoor positioning environments, to meet the requirements of different location-aware services, and to increase the positioning speed in the online stage. It also alleviates, to a certain extent, the problems of wrong position labels and non-uniform sampling in crowdsourced samples.
To achieve the above object, according to a first aspect of the present invention, there is provided a method for generating a Wi-Fi reliable crowdsourcing sample set, the method comprising:
S1. splitting a target area from top to bottom in a multi-level manner until it is split into grids, thereby obtaining units at different levels (target area, sub-areas, sub-sub-areas, ..., grids), wherein the grids correspond to the highest resolution;
S2. assigning the original crowdsourced samples based on the multi-level splitting result of the target area;
S3. measuring the reliability of the crowdsourced samples in each grid, and retaining the credible crowdsourced samples whose reliability is greater than a threshold;
S4. performing surface fitting on all retained credible crowdsourced samples;
S5. uniformly sampling the fitted surface function to obtain a Wi-Fi reliable crowdsourcing sample set.
Preferably, the reliability measure takes the form of a contour (silhouette) coefficient $h(s_n^k)$, calculated as follows:

$$h(s_n^k) = \frac{b(s_n^k) - a(s_n^k)}{\max\{a(s_n^k),\, b(s_n^k)\}}$$

$$a(s_n^k) = \frac{1}{|S_k| - 1} \sum_{s_{n'}^k \in S_k,\ n' \neq n} d(s_n^k, s_{n'}^k)$$

where $s_n^k$ represents the RSS vector collected at coordinate $(x_n, y_n)$ in grid $g_k$; $a(s_n^k)$ represents the average signal distance between sample $s_n^k$ and the other crowdsourced samples in the same grid; $b(s_n^k)$ represents the minimum, over all other grids, of the average signal distance between sample $s_n^k$ and the crowdsourced samples in that grid; $S_k$ represents the set of all samples in the grid to which $s_n^k$ belongs; and $d(s_n^k, s_{n'}^k)$ represents the signal distance between sample $s_n^k$ and sample $s_{n'}^k$.
Beneficial effects: to address the problem of wrongly labeled crowdsourced samples, the method uses the contour coefficient to evaluate how compatible each crowdsourced sample is with the area in which it is labeled. Because the contour coefficient combines cohesion and separation, it gives a reasonable and effective reliability measure. Because samples with low compatibility, i.e. low reliability, are removed, the quality of the database is ensured and the accuracy of subsequent positioning is improved.
Preferably, the surface fitting is performed as follows:
Let the contour value set of the original crowdsourced samples be $S_h = \{h_1, h_2, \dots, h_N\}$ and the contour value set of the credible crowdsourced samples be $S_h' = \{h_1, h_2, \dots, h_\gamma\}$, where $N$ denotes the number of original crowdsourced samples, $\gamma$ denotes the number of credible crowdsourced samples, and the contour value $h_n = h(s_n^k)$.
Using the contour value $h_n$, the weight coefficient $w_n$, $n = 1, \dots, \gamma$, of each credible crowdsourced sample $s_n^k$ is computed:

$$w_n = f(h_n) = \rho + (1 - \rho)\,\frac{h_n - h_{\min}}{h_{\max} - h_{\min}}$$

where $\rho = \gamma / N$ represents the ratio of reliable crowdsourced samples, and $h_{\min}$ and $h_{\max}$ are the minimum and maximum contour values in $S_h'$;
the weight coefficient $w_n$ weights the deviation between the value of the fitting function of the $m$-th AP at position $(x_n, y_n)$ and the signal strength value $r_{n,m}$ that sample $s_n^k$ received from the $m$-th AP. The objective function, which minimizes the sum of squared weighted residuals, is:

$$\min \sum_{n=1}^{\gamma} w_n \left( \phi_m(x_n, y_n) - r_{n,m} \right)^2$$

where $\phi_m(x, y)$ is the fitting surface function used to fit the wireless signal propagation surface of the $m$-th AP in a given partition.
Beneficial effects: to address the inconsistent quality of crowdsourced samples, the weight assigned to each sample is calculated from its contour value. On the one hand, a sample's contribution to the surface is proportional to its reliability. On the other hand, keeping the weight $w_n$ within the range $[\rho, 1]$ ensures that the difference between samples does not become too large. Because the wireless surface function is constructed in a weighted manner, the differences between samples are taken into account when the surface is fitted: more reliable crowdsourced samples receive larger weights and have more influence on the surface construction, which enhances the reliability and effectiveness of the fitted surface.
To achieve the above object, according to a second aspect of the present invention, there is provided a method for constructing a multi-resolution offline database, the method comprising:
T1. generating a Wi-Fi reliable crowdsourcing sample set using the method described in the first aspect;
T2. constructing a multi-resolution offline database from bottom to top based on the multi-level splitting result of the target area, wherein a partition fingerprint of the next coarser resolution layer is obtained from a set of partition fingerprints of the current resolution layer.
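Preferably, the layers of the multi-resolution offline database, ordered from high resolution to low resolution, are named $L_1, L_2, \dots, L_J$, where layer $L_j$ has $P$ partitions and each partition consists of $K$ sub-partitions of layer $L_{j-1}$. The data of the $p$-th partition of layer $L_j$ is then:

$$
\Lambda_{\langle L_j,p\rangle} =
\begin{array}{ccccc}
AP_1 & AP_2 & \cdots & AP_M & Label \\
u^{1}_{\langle L_{j-1},1\rangle} & u^{2}_{\langle L_{j-1},1\rangle} & \cdots & u^{M}_{\langle L_{j-1},1\rangle} & \langle L_j,p\rangle \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
u^{1}_{\langle L_{j-1},K\rangle} & u^{2}_{\langle L_{j-1},K\rangle} & \cdots & u^{M}_{\langle L_{j-1},K\rangle} & \langle L_j,p\rangle \\
u^{1}_{\langle L_j,p\rangle} & u^{2}_{\langle L_j,p\rangle} & \cdots & u^{M}_{\langle L_j,p\rangle} & \langle L_j,p\rangle
\end{array}
$$

where $AP_1, AP_2, \dots, AP_M$ represent the 1st, 2nd, ..., $M$-th signal emission sources and Label represents the label of the partition. Each row except the last represents the signal vector of a sub-partition, and the last row represents the average of all sub-partition vectors. Each column except the last represents the signal strength of one AP, and the last column represents the partition label of the data. $u^{m}_{\langle L_{j-1},k\rangle}$ represents the signal strength value of the $m$-th AP received in the $k$-th partition of layer $L_{j-1}$, and $u^{m}_{\langle L_j,p\rangle}$ is the average, in the $m$-th AP dimension, of the representative vectors of the $K$ partitions of layer $L_{j-1}$. Here $j = 1, 2, \dots, J$, where $J$ is the number of resolution layers; $p = 1, 2, \dots, P$; and $m = 1, 2, \dots, M$, where $M$ is the total number of signal emission sources.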
Beneficial effects: to address the lack of distinctiveness in existing partition data, the fingerprint of an upper-layer partition is formed from the fingerprint set of its lower-layer sub-partitions. The partition fingerprint thus retains more of the original information of the sub-partitions, which increases the distinctiveness of the data of each partition and improves the robustness of the positioning system.
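Preferably, the layers of the multi-resolution offline database, ordered from high resolution to low resolution, are named $L_1, L_2, \dots, L_J$, where layer $L_j$ has $P$ partitions and each partition consists of $K$ sub-partitions of layer $L_{j-1}$. The data of the $p$-th partition of layer $L_j$ is then:

$$
\begin{array}{ccccccc}
AP_1 & \cdots & AP_M & F_1 & \cdots & F_P & Label \\
u^{1}_{\langle L_{j-1},1\rangle} & \cdots & u^{M}_{\langle L_{j-1},1\rangle} & d_{11} & \cdots & d_{P1} & \langle L_j,p\rangle \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots \\
u^{1}_{\langle L_{j-1},K\rangle} & \cdots & u^{M}_{\langle L_{j-1},K\rangle} & d_{1K} & \cdots & d_{PK} & \langle L_j,p\rangle \\
u^{1}_{\langle L_j,p\rangle} & \cdots & u^{M}_{\langle L_j,p\rangle} & d_{1p} & \cdots & d_{Pp} & \langle L_j,p\rangle
\end{array}
$$

where $AP_1, AP_2, \dots, AP_M$ represent the 1st, 2nd, ..., $M$-th signal emission sources, $F_1, \dots, F_P$ represent the 1st, ..., $P$-th auxiliary features, and Label represents the label of the partition. Each row except the last represents the fingerprint of a sub-partition, and the last row represents the average of all sub-partition fingerprints. Each of the first $M$ columns holds the signal strength values of one AP, each of the next $P$ columns holds the values of one auxiliary feature, and the last column holds the partition label of the data. $u^{m}_{\langle L_{j-1},k\rangle}$ represents the signal strength value of the $m$-th AP received in the $k$-th partition of layer $L_{j-1}$, and $u^{m}_{\langle L_j,p\rangle}$ is the average, in the $m$-th AP dimension, of the representative vectors of the $K$ partitions of layer $L_{j-1}$. $d_{pq}$ represents the Euclidean distance between the feature vector of a row, $u_{\langle L_{j-1},q\rangle}$ or $u_{\langle L_j,p\rangle}$, and the representative vector $u_{\langle L_j,p\rangle}$ of each region. Here $j = 1, 2, \dots, J$, where $J$ is the number of resolution layers; $p = 1, 2, \dots, P$; $q = 1, \dots, Q$, where $Q$ is the number of partitions of layer $L_{j-1}$; and $m = 1, 2, \dots, M$, where $M$ is the total number of signal emission sources.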
Beneficial effects: to address the problem that existing fingerprints carry only a single kind of feature, the Euclidean distances between the signal strength vectors of different partitions are used as auxiliary features that capture global information. These auxiliary features improve the discriminability of the fingerprint and thereby enhance the robustness of the positioning system.
To achieve the above object, according to a third aspect of the present invention, there is provided a multi-resolution positioning method including:
(1) sequentially determining the subareas of the target from the coarse resolution layer to the fine resolution layer according to a top-down mode, and classifying by using a classification model;
(2) if the resolution level is customized by the user, after the target is classified into a sub-region of the resolution level defined by the user, a regression model is adopted to carry out final accurate positioning on the target user; otherwise, after the target is classified into the sub-region with the highest resolution, the regression model is adopted to carry out final accurate positioning on the target user.
Preferably, step (1) comprises the sub-steps of:
(1.1) computing the auxiliary features of the RSS vector uploaded by the target user to generate the complete test fingerprint $(r_1^t, r_2^t, \dots, r_M^t, d_{1t}, \dots, d_{pt}, \dots, d_{Pt})$, where $r_1^t, r_2^t, \dots, r_M^t$ respectively indicate the signal strength of the 1st, 2nd, ..., $M$-th APs received by the target user, and $d_{1t}, \dots, d_{pt}, \dots, d_{Pt}$ respectively represent the vector distance between the original test fingerprint and the $p$-th partition fingerprint;
(1.2) based on the complete test fingerprint of the target user, using the K-nearest-neighbor method to find the closest fingerprint in the offline database, and taking the corresponding partition identifier as the estimated sub-region of the target user.
Beneficial effects: to address the waste of resources in existing positioning systems, a resolution identifier is added so that unnecessary subsequent positioning operations can be stopped early. This reduces computational complexity, meets the requirements of different resolutions, and saves computing resources. In addition, when no identifier is given, all resolution layers are traversed automatically, which improves the generality of the system.
To achieve the above object, according to a fourth aspect of the present invention, there is provided a multi-resolution positioning system including: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is configured to read executable instructions stored in the computer-readable storage medium and execute the multi-resolution positioning method according to the third aspect.
To achieve the above object, according to a fifth aspect of the present invention, there is provided a computer-readable storage medium including a stored computer program; when executed by a processor, the computer program controls the device on which the computer-readable storage medium is located to execute the method.
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
(1) To address the wrong labels and uneven distribution of existing crowdsourced samples, reliability measurement followed by surface fitting and uniform sampling is used to regenerate the crowdsourced samples of each partition. Removing unreliable samples ensures the quality of the database, and uniformly sampling the fitted surface ensures a uniform distribution of samples, so that the database is reliable and stable.
(2) Aiming at the problem that the existing database only has a single-layer resolution structure, the databases with different resolution levels are sequentially constructed in a bottom-up mode, and because the large database is divided into smaller database components with different resolution levels, the positioning database with clear layers can accelerate the matching work of online fingerprints and offline fingerprints, thereby laying a data foundation for improving the positioning speed of online users.
(3) Aiming at the problem that the multi-resolution requirement of a user cannot be met by the existing positioning method, the user is positioned from top to bottom, and because the partition fingerprints of each layer have strong representativeness, the classification of each layer has high accuracy, so that the requirements of different resolutions are met.
Drawings
FIG. 1 is a flow chart of a multi-resolution fingerprint positioning system based on Wi-Fi crowdsourcing samples according to the present invention.
Fig. 2 is an architecture diagram of multi-resolution positioning according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a top-down online positioning stage according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the HUSTDataset data set according to the embodiment of the present invention.
Fig. 5 is a schematic diagram of the UJIIndoorLoc public data set provided in an embodiment of the present invention.
Fig. 6 is a graph comparing the hit rates of the embodiment of the present invention and other machine learning algorithms on the UJIIndoorLoc data set.
FIG. 7 is a graph comparing, as a function of resolution level, the hit rates of the embodiment of the present invention and other machine learning algorithms on the HUSTDataset data set.
FIG. 8 is a comparison of the positioning effect of the embodiment of the present invention and other comparison methods.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention provides a multi-resolution fingerprint positioning system based on Wi-Fi crowdsourcing samples. By constructing a multi-resolution positioning database structure, online positioning proceeds layer by layer from low resolution to high resolution, which effectively reduces the computational complexity of searching a large database. Reliable crowdsourced samples are identified by a reliability measurement algorithm, which addresses the problem of inaccurate position labels in crowdsourced samples. A surface is fitted to the reliable samples and new fitted samples are generated by resampling it, which addresses the problem of non-uniformly distributed crowdsourced samples. Auxiliary features based on the Euclidean distances between partitions increase the discriminability of the fingerprints and the robustness of the positioning system. By combining classification and regression models, positioning accuracy at different resolutions is improved while computational complexity is reduced.
As shown in fig. 1, the present invention provides a multi-resolution fingerprint positioning system based on Wi-Fi crowdsourced samples, whose method comprises the following steps:
Step S1. The target area is first split into a plurality of partitions, and each partition is then split more finely, level by level, until the highest resolution level (the bottom layer) is reached. Finally, the crowdsourced samples are preliminarily assigned based on the partitions of the bottom layer.
First, this embodiment partitions the area according to the obstacles and walls inherent to the building, for example into a lecture hall, an office, a restaurant, and so on. Each partition is then divided into four parts in a grid structure, and each partition of the resulting layer is in turn divided into four parts to form the next, higher-resolution layer. The sub-regions continue to be divided in the same manner until the highest resolution is reached. The space is thus divided into layers by resolution; ordered from high resolution to low resolution, the layers are named $L_1, L_2, \dots$, as shown in fig. 2.
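The repeated four-way split described above can be sketched as follows. This is only an illustration under stated assumptions: partitions are modelled as axis-aligned rectangles, and the names `Cell`, `split_four` and `build_layers` are hypothetical, not taken from the patent. Real top-level partitions follow walls and obstacles and need not be rectangular.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cell:
    """Axis-aligned rectangular partition (an assumed representation)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def split_four(cell: Cell) -> List[Cell]:
    """Split one cell into four equal quadrants."""
    xm = (cell.x_min + cell.x_max) / 2.0
    ym = (cell.y_min + cell.y_max) / 2.0
    return [
        Cell(cell.x_min, cell.y_min, xm, ym),
        Cell(xm, cell.y_min, cell.x_max, ym),
        Cell(cell.x_min, ym, xm, cell.y_max),
        Cell(xm, ym, cell.x_max, cell.y_max),
    ]


def build_layers(top_partitions: List[Cell], extra_levels: int) -> List[List[Cell]]:
    """Return the layers ordered coarse to fine; the last list is the grid layer L1."""
    layers = [list(top_partitions)]
    for _ in range(extra_levels):
        layers.append([child for cell in layers[-1] for child in split_four(cell)])
    return layers


# One 8 x 8 room split twice: 1 partition, then 4, then 16 basic grids.
layers = build_layers([Cell(0.0, 0.0, 8.0, 8.0)], extra_levels=2)
```

Each extra level multiplies the number of partitions by four, so the number of levels directly trades positioning resolution against database size.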
After the partitioning of every level is completed, each partition of the bottom layer $L_1$ is used as a basic grid, and the crowdsourced samples are assigned to the basic grids according to their position labels. Let the basic grids be $g_1, g_2, \dots, g_K$, where $K$ is the number of $L_1$ grids. Let the crowdsourced sample set be $S = \{s_1, s_2, \dots, s_N\}$, $s_n = (r_1, r_2, \dots, r_M, x_n, y_n)$, where $N$ is the total number of crowdsourced samples, $M$ is the total number of APs, and $(r_1, r_2, \dots, r_M)$ is the RSS vector that the sample collected at coordinate $(x_n, y_n)$. After being assigned to a particular grid, each crowdsourced sample carries an additional label identifying the number of its grid. Let $s_n^k$ denote a sample assigned to the $k$-th grid, and let $S_k$ be the set of crowdsourced samples assigned to the $k$-th grid.
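A minimal sketch of this assignment step, assuming the basic grids form a regular array of square cells. The function name `assign_to_grids`, the grid numbering (row-major) and the parameters `x0`, `y0`, `cell_size`, `n_cols`, `n_rows` are hypothetical and only serve the illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One crowdsourced sample: (RSS vector (r_1..r_M), x_n, y_n)
Sample = Tuple[Tuple[float, ...], float, float]


def assign_to_grids(
    samples: List[Sample],
    x0: float,
    y0: float,
    cell_size: float,
    n_cols: int,
    n_rows: int,
) -> Dict[int, List[Sample]]:
    """Group samples by the basic grid that contains their position label.

    Returns a mapping grid number k -> S_k. Samples whose label falls
    outside the target area are dropped (an assumption of this sketch).
    """
    grids: Dict[int, List[Sample]] = defaultdict(list)
    for rss, x, y in samples:
        col = int((x - x0) // cell_size)
        row = int((y - y0) // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grids[row * n_cols + col].append((rss, x, y))
    return dict(grids)


demo_samples = [
    ((-50.0, -70.0), 0.5, 0.5),
    ((-55.0, -65.0), 2.5, 0.2),
    ((-60.0, -60.0), 0.9, 1.9),
    ((-40.0, -80.0), 9.0, 9.0),  # label outside the 4 x 4 area
]
demo_grids = assign_to_grids(demo_samples, 0.0, 0.0, 2.0, n_cols=2, n_rows=2)
```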
Step S2. The reliability of the crowdsourced samples in each partition is measured, and only the credible crowdsourced samples whose reliability is greater than a threshold are retained.
First, the embodiment calculates the average signal distance between a crowdsourced sample $s_n^k$ and the other crowdsourced samples $s_{n'}^k$ in the grid to which it belongs:

$$a(s_n^k) = \frac{1}{|S_k| - 1} \sum_{s_{n'}^k \in S_k,\ n' \neq n} d(s_n^k, s_{n'}^k)$$

where $d(\cdot,\cdot)$ is the Euclidean signal distance between two crowdsourced samples, calculated as:

$$d(s_n^k, s_{n'}^k) = \sqrt{\sum_{m=1}^{M} \left( r_{n,m} - r_{n',m} \right)^2}$$

with $r_{n,m}$ denoting the RSS value of the $m$-th AP in sample $s_n^k$.
Because the summation does not contain the signal distance between the sample and itself, i.e. $d(s_n^k, s_n^k)$, the sum is divided by $|S_k| - 1$. The value $a(s_n^k)$ indicates how well $s_n^k$ fits its grid; a smaller value represents a better fit.
Similarly, the embodiment calculates the average signal distance between the crowdsourced sample and all crowdsourced samples in a different grid. Let $s_{n'}^{k'}$ be any crowdsourced sample in another grid $g_{k'}$; $b(s_n^k)$ is calculated as:

$$b(s_n^k) = \min_{k' \neq k} \frac{1}{|S_{k'}|} \sum_{s_{n'}^{k'} \in S_{k'}} d(s_n^k, s_{n'}^{k'})$$

$b(s_n^k)$ is a measure of dissimilarity; the grid with the smallest average dissimilarity is called the adjacent grid of $s_n^k$.
Finally, the contour value of the crowdsourced sample is calculated:

$$h(s_n^k) = \frac{b(s_n^k) - a(s_n^k)}{\max\{a(s_n^k),\, b(s_n^k)\}}$$

The contour value can be converted into the following form:

$$
h(s_n^k) =
\begin{cases}
1 - a(s_n^k)/b(s_n^k), & a(s_n^k) < b(s_n^k) \\
0, & a(s_n^k) = b(s_n^k) \\
b(s_n^k)/a(s_n^k) - 1, & a(s_n^k) > b(s_n^k)
\end{cases}
$$

From the above equation it follows that $-1 \le h(s_n^k) \le 1$. If the contour value is negative, the sample is on average farther in signal space from the samples of its own grid than from those of another grid; the sample is unreliable and this embodiment rejects it. Conversely, if the contour value is positive, the position label of the crowdsourced sample has a certain reliability, and the sample is retained.
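The reliability filter of step S2 can be sketched as below. This is an illustrative sketch, not the patent's implementation: the function names are hypothetical, the threshold of 0 is an assumption (the text only requires the reliability to exceed a threshold), and returning 0 for a sample that has no same-grid neighbours or no other grid to compare with is a convention chosen here.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def signal_distance(s: Vector, t: Vector) -> float:
    """Euclidean distance between two RSS vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def contour_value(grids: Dict[int, List[Vector]], gid: int, idx: int) -> float:
    """Contour (silhouette) value h = (b - a) / max(a, b) of one sample."""
    sample = grids[gid][idx]
    same = [t for j, t in enumerate(grids[gid]) if j != idx]
    other_means = [
        sum(signal_distance(sample, t) for t in members) / len(members)
        for other, members in grids.items()
        if other != gid and members
    ]
    if not same or not other_means:
        return 0.0  # undefined case; 0 is an assumed convention
    a = sum(signal_distance(sample, t) for t in same) / len(same)
    b = min(other_means)
    return 0.0 if max(a, b) == 0.0 else (b - a) / max(a, b)


def keep_reliable(
    grids: Dict[int, List[Vector]], threshold: float = 0.0
) -> Dict[int, List[Vector]]:
    """Retain only samples whose contour value exceeds the threshold."""
    return {
        gid: [s for i, s in enumerate(members)
              if contour_value(grids, gid, i) > threshold]
        for gid, members in grids.items()
    }


# The third sample of grid 0 looks like grid 1 in signal space (a wrong label).
demo = {
    0: [(-40.0, -70.0), (-42.0, -71.0), (-80.0, -30.0)],
    1: [(-80.0, -31.0), (-79.0, -29.0)],
}
reliable = keep_reliable(demo)
```

In the toy data the mislabeled sample has a strongly negative contour value and is dropped, while the other four samples are kept.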
Step S3. A surface is constructed from the retained credible crowdsourced samples, and fitted samples are then generated by sampling the surface at fixed intervals.
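First, the signal surface is fitted to the crowdsourced samples that remain after rejection. Let the set of all available APs be $A = \{AP_1, AP_2, \dots, AP_M\}$. This embodiment applies a polynomial function $\phi_m(x, y)$ to fit the wireless signal propagation surface of the $m$-th AP in a given partition:

$$\phi_m(x, y) = \sum_{i} \sum_{j} a_{ij}\, x^i y^j$$

where $a_{ij}$ are the fitting coefficients. If all reliable crowdsourced samples were assumed to have equal reliability values, the objective function would minimize the sum of squared residuals:

$$\min \sum_{n=1}^{\gamma} \left( \phi_m(x_n, y_n) - r_{n,m} \right)^2$$

where $r_{n,m}$ is the signal strength of the $m$-th AP in the $n$-th sample and $\gamma$ is the number of reliable samples.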
However, since the reliability values of the crowdsourced samples are not all equal, it is the sum of squared weighted residuals, rather than the sum of squared residuals, that is minimized. The weights of the samples are set as follows:
Let the set of contour values of all samples be $S_h = \{h_1, h_2, \dots, h_N\}$ before rejection and $S_h' = \{h_1, h_2, \dots, h_\gamma\}$ after rejection, where $N$ is the total number of original crowdsourced samples in the partition and $\gamma$ is the number of reliable crowdsourced samples remaining after rejection by contour value. The ratio of reliable crowdsourced samples is expressed as $\rho = \gamma / N$. Let the minimum and maximum contour values in the reliability value set be $\min(S_h') = h_{\min}$ and $\max(S_h') = h_{\max}$. This embodiment uses a scaling function to keep the reliability weight $w_n$ within the range $[\rho, 1]$. The scaling function $f(h_n)$ is:

$$w_n = f(h_n) = \rho + (1 - \rho)\,\frac{h_n - h_{\min}}{h_{\max} - h_{\min}}$$
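As an illustration of the weighting step, the sketch below maps contour values into $[\rho, 1]$. It assumes a linear min-max scaling, which is one function consistent with the stated range and with $h_{\min}$, $h_{\max}$ and $\rho$; the exact scaling function of the source is not asserted here. The name `reliability_weights` is hypothetical.

```python
from typing import List, Sequence


def reliability_weights(kept_values: Sequence[float], n_original: int) -> List[float]:
    """Scale the contour values of the retained samples into [rho, 1].

    rho = gamma / N, with gamma = number of retained samples and
    N = number of original samples in the partition.
    Assumption: linear min-max scaling between h_min and h_max.
    """
    rho = len(kept_values) / n_original
    h_min, h_max = min(kept_values), max(kept_values)
    if h_max == h_min:
        # All retained samples are equally reliable; give them full weight.
        return [1.0 for _ in kept_values]
    return [rho + (1.0 - rho) * (h - h_min) / (h_max - h_min) for h in kept_values]


# 3 of 4 samples were retained, so rho = 0.75 and the weights span [0.75, 1].
weights = reliability_weights([0.2, 0.5, 0.8], n_original=4)
```

The lower bound $\rho$ shrinks as more samples are rejected, so partitions with many unreliable samples spread their weights over a wider range.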
The objective function that minimizes the sum of squared weighted residuals then becomes:

$$\min \sum_{n=1}^{\gamma} w_n \left( \phi_m(x_n, y_n) - r_{n,m} \right)^2$$
After the wireless signal propagation surface has been constructed, the function $\phi_m(x, y)$ can take an arbitrary coordinate position as input and return a fitted RSS value, so the embodiment samples the constructed surface at regular intervals to obtain new fitted samples. Let the resampled sample set be $\Omega = \{\Psi_1, \Psi_2, \dots, \Psi_F\}$, $\Psi_f = (\psi_1, \psi_2, \dots, \psi_M, x_f, y_f)$, where $(\psi_1, \psi_2, \dots, \psi_M)$ is the fitted RSS vector at the sampling coordinate $(x_f, y_f)$.
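The fitting and resampling for a single AP can be sketched in plain Python as follows. The sketch rests on several assumptions that are not in the source: the polynomial uses the monomials $x^i y^j$ with $i + j \le$ `degree`, the weighted least-squares problem is solved through the normal equations with a small Gaussian elimination, and all function names are hypothetical. One such surface would be fitted per AP.

```python
from typing import Callable, List, Sequence, Tuple


def poly_terms(x: float, y: float, degree: int) -> List[float]:
    """Monomials x^i * y^j with i + j <= degree (assumed polynomial form)."""
    return [x ** i * y ** j for i in range(degree + 1) for j in range(degree + 1 - i)]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few or degenerate samples")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_surface(points: Sequence[Tuple[float, float]], rss: Sequence[float],
                weights: Sequence[float], degree: int = 1) -> Callable[[float, float], float]:
    """Minimise sum_n w_n * (phi(x_n, y_n) - r_n)^2 via weighted normal equations."""
    rows = [poly_terms(x, y, degree) for x, y in points]
    k = len(rows[0])
    ata = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows)) for j in range(k)]
           for i in range(k)]
    atb = [sum(w * r[i] * v for w, r, v in zip(weights, rows, rss)) for i in range(k)]
    coeffs = solve_linear(ata, atb)
    return lambda x, y: sum(c * t for c, t in zip(coeffs, poly_terms(x, y, degree)))


def resample(phi: Callable[[float, float], float], x_min: float, x_max: float,
             y_min: float, y_max: float, step: float) -> List[Tuple[float, float, float]]:
    """Evaluate the fitted surface on a regular grid of spacing `step`."""
    nx = int((x_max - x_min) / step + 1e-9) + 1
    ny = int((y_max - y_min) / step + 1e-9) + 1
    return [(x_min + i * step, y_min + j * step, phi(x_min + i * step, y_min + j * step))
            for i in range(nx) for j in range(ny)]


# Toy data lying exactly on the plane r = -40 - 2x - 3y.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
vals = [-40.0 - 2.0 * x - 3.0 * y for x, y in pts]
phi = fit_surface(pts, vals, weights=[1.0, 0.8, 0.9, 1.0, 0.75], degree=1)
fitted = resample(phi, 0.0, 2.0, 0.0, 1.0, step=1.0)
```

Normal equations are adequate for the low polynomial degrees used here; for higher degrees a QR-based least-squares solver would be numerically safer.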
Step S4. Based on the fitted samples, a multi-resolution offline database is constructed from bottom to top, and a partition fingerprint of the next coarser resolution layer is obtained from a set of partition fingerprints of the current resolution layer.
This embodiment takes the process of obtaining the upper layer $L_2$ from the bottom layer $L_1$ as an example; the subsequent higher layers are obtained in a similar way. First, the structure of $L_1$ is introduced. Let $q$ be one of the partitions of the bottom layer $L_1$, and let the fitted sample set corresponding to $q$ contain $\omega$ samples, where $\omega$ is the number of resampled fitted samples corresponding to the $q$-th partition of layer $L_1$. The data of partition $q$ of layer $L_1$ is an $\omega \times M$ RSSI matrix:

$$
\Lambda_{\langle L_1,q\rangle} =
\begin{bmatrix}
\psi_{1,1} & \psi_{1,2} & \cdots & \psi_{1,M} \\
\vdots & \vdots & \ddots & \vdots \\
\psi_{\omega,1} & \psi_{\omega,2} & \cdots & \psi_{\omega,M}
\end{bmatrix}
$$

The mean of the row vectors of $\Lambda_{\langle L_1,q\rangle}$ is then calculated and used as the representative vector $u_{\langle L_1,q\rangle}$ of partition $q$. The present embodiment uses these representative vectors to construct the database of layer $L_2$.
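Suppose layer $L_2$ has $P$ partitions, each consisting of $K$ partitions of layer $L_1$ with representative vectors $(u_{\langle L_1,1\rangle}, u_{\langle L_1,2\rangle}, \dots, u_{\langle L_1,K\rangle})$. The data of the $p$-th partition of layer $L_2$ then has the form:

$$
\Lambda_{\langle L_2,p\rangle} =
\begin{array}{ccccc}
AP_1 & AP_2 & \cdots & AP_M & Label \\
u^{1}_{\langle L_1,1\rangle} & u^{2}_{\langle L_1,1\rangle} & \cdots & u^{M}_{\langle L_1,1\rangle} & \langle L_2,p\rangle \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
u^{1}_{\langle L_1,K\rangle} & u^{2}_{\langle L_1,K\rangle} & \cdots & u^{M}_{\langle L_1,K\rangle} & \langle L_2,p\rangle \\
u^{1}_{\langle L_2,p\rangle} & u^{2}_{\langle L_2,p\rangle} & \cdots & u^{M}_{\langle L_2,p\rangle} & \langle L_2,p\rangle
\end{array}
$$

where $u^{m}_{\langle L_2,p\rangle}$, $m = 1, 2, \dots, M$, is the average, in the $m$-th AP dimension, of the representative vectors of the $K$ partitions of $L_1$. It is calculated as follows:

$$u^{m}_{\langle L_2,p\rangle} = \frac{1}{K} \sum_{k=1}^{K} u^{m}_{\langle L_1,k\rangle}$$

Note that $u_{\langle L_2,p\rangle}$ is the representative vector of the $p$-th partition of layer $L_2$, and it is used in the subsequent construction of higher layers such as $L_3$.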
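A minimal sketch of this bottom-up aggregation: representative vectors are computed as row means of the $L_1$ matrices, stacked, and completed with their average as the last row. The names `mean_vector` and `build_partition`, the string label and the toy RSS values are hypothetical.

```python
from typing import List, Sequence, Tuple


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a set of equal-length vectors dimension by dimension."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_partition(
    sub_vectors: Sequence[Sequence[float]], label: str
) -> List[Tuple[List[float], str]]:
    """Data of one upper-layer partition.

    One labelled row per sub-partition representative vector, plus a last
    row holding their average (the partition's own representative vector).
    """
    rows = [(list(v), label) for v in sub_vectors]
    rows.append((mean_vector(sub_vectors), label))
    return rows


# Fitted samples (omega x M matrices) of two L1 grids, with M = 2 APs.
l1_samples = {
    "g1": [[-40.0, -70.0], [-42.0, -72.0]],
    "g2": [[-50.0, -60.0], [-52.0, -62.0]],
}
representatives = {q: mean_vector(rows) for q, rows in l1_samples.items()}
partition = build_partition(
    [representatives["g1"], representatives["g2"]], label="<L2,1>"
)
```

Applying `build_partition` again to the last rows of several $L_2$ partitions would give the data of an $L_3$ partition in the same way.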
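To enhance the discriminability of the fingerprint, the present embodiment also adds auxiliary features to each partition. For each row vector of the $L_2$ database, the auxiliary features are the Euclidean distances between the original feature of $\Lambda_{\langle L_2,p\rangle}$, i.e. $u_{\langle L_1,q\rangle}$ or $u_{\langle L_2,p\rangle}$ ($q = 1, \dots, Q$; $p = 1, \dots, P$), and the representative vector $u_{\langle L_2,p\rangle}$ of each partition ($p = 1, \dots, P$), calculated as follows:

$$d_{pq} = \left\| u_{\langle L_1,q\rangle} - u_{\langle L_2,p\rangle} \right\|_2 = \sqrt{\sum_{m=1}^{M} \left( u^{m}_{\langle L_1,q\rangle} - u^{m}_{\langle L_2,p\rangle} \right)^2}$$

Therefore, after the auxiliary features are added, the data of the $p$-th partition of layer $L_2$ is:

$$
\begin{array}{ccccccc}
AP_1 & \cdots & AP_M & F_1 & \cdots & F_P & Label \\
u^{1}_{\langle L_1,1\rangle} & \cdots & u^{M}_{\langle L_1,1\rangle} & d_{11} & \cdots & d_{P1} & \langle L_2,p\rangle \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots \\
u^{1}_{\langle L_1,K\rangle} & \cdots & u^{M}_{\langle L_1,K\rangle} & d_{1K} & \cdots & d_{PK} & \langle L_2,p\rangle \\
u^{1}_{\langle L_2,p\rangle} & \cdots & u^{M}_{\langle L_2,p\rangle} & d_{1p} & \cdots & d_{Pp} & \langle L_2,p\rangle
\end{array}
$$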
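The auxiliary features can be illustrated with the short sketch below, which appends to every row its Euclidean distance to each partition's representative vector. The function names and the toy RSS values are hypothetical; the values were chosen so that the distances come out as round numbers.

```python
import math
from typing import List, Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two signal strength vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def add_aux_features(
    row_vectors: Sequence[Sequence[float]],
    representatives: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Append F_1..F_P to each row: its distance to every partition representative."""
    return [
        list(u) + [euclidean(u, rep) for rep in representatives]
        for u in row_vectors
    ]


# Two rows of one partition and the representative vectors of P = 2 partitions.
rows = [[-40.0, -70.0], [-43.0, -66.0]]
reps = [[-40.0, -70.0], [-46.0, -62.0]]
augmented = add_aux_features(rows, reps)
```

Each augmented row has $M + P$ entries, so the feature dimension grows with the number of partitions in the layer.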
Then, by the same method, the higher-resolution data of each lower layer is aggregated layer by layer into the data of the layer above it, up to the highest layer, so that a bottom-up database structure is formed.
Step S5. In the multi-resolution online positioning stage, the partition of the target is determined layer by layer, from the coarse resolution layer to the fine resolution layer, in a top-down manner, using a KNN classification model.
The top-down pyramid structure of the database may make KNN classification easier, so the present embodiment uses a classical KNN classification method for classification at multiple resolutions. The KNN model employed does not require a training process because the entire data set has already been constructed prior to online localization.
Multi-resolution online positioning processes each online request according to the resolution level it requires. In classical indoor positioning, the test fingerprint consists only of the RSS measurements from the APs. In multi-resolution online positioning, however, the test fingerprint also needs an identifier that indicates the required resolution level. Let $F_t = (r_1^t, r_2^t, \dots, r_M^t)$ denote the RSS measurement of the test fingerprint.
In addition to this RSS fingerprint, the user adds to the test fingerprint an identifier of the required resolution level, so that the test fingerprint takes the form $(r_1^t, r_2^t, \dots, r_M^t, \mathrm{ID})$, where ID denotes the resolution-level identifier.
In every layer, the data of each partition has the same form. When the target user uploads the RSS vector, the system computes the auxiliary features for the target user to generate a complete test fingerprint. For example, when the sub-area is searched for in layer $L_2$, the complete test fingerprint of the target user is:

$$(r_1^t, r_2^t, \dots, r_M^t, d_{1t}, \dots, d_{pt}, \dots, d_{Pt})$$

where $d_{pt}$ is the Euclidean distance from $F_t$ to $u_{\langle L_2,p\rangle}$. The most similar fingerprint is then found among the data of all partitions of layer $L_2$, and the partition identifier $\langle L_2,p\rangle$ corresponding to the most similar fingerprint is taken as the estimated sub-area of the target user.
If no level identifier is given, precise positioning is assumed to be required: the classification chain is traversed layer by layer, and a precise position estimate is finally returned, as shown in FIG. 3.
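The top-down search can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the patent's implementation: the hierarchy is stored in two hypothetical dictionaries (`children_of`, `rows_of`), the candidates at each layer are restricted to the children of the partition chosen in the layer above, `k = 1` is used for the nearest-neighbour vote, and the resolution identifier is modelled as `max_depth` (with `None` meaning full traversal).

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def complete_fingerprint(rss: Sequence[float],
                         reps: Sequence[Sequence[float]]) -> List[float]:
    """RSS vector followed by its distance to each candidate's representative."""
    return list(rss) + [euclidean(rss, rep) for rep in reps]


def knn_label(query: Sequence[float],
              database: List[Tuple[List[float], str]], k: int = 1) -> str:
    """Majority label among the k fingerprints closest to the query."""
    nearest = sorted(database, key=lambda fp: euclidean(query, fp[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def locate(rss: Sequence[float], children_of: Dict[str, List[str]],
           rows_of: Dict[str, List[List[float]]], root: str,
           max_depth: Optional[int] = None) -> List[str]:
    """Descend from coarse to fine; stop early when max_depth is reached."""
    path: List[str] = []
    node = root
    while node in children_of and (max_depth is None or len(path) < max_depth):
        candidates = children_of[node]
        reps = [mean_vector(rows_of[c]) for c in candidates]
        database = [
            (complete_fingerprint(row, reps), c)
            for c in candidates
            for row in rows_of[c] + [mean_vector(rows_of[c])]
        ]
        node = knn_label(complete_fingerprint(rss, reps), database)
        path.append(node)
    return path


children_of = {"building": ["east", "west"], "east": ["e1", "e2"], "west": ["w1", "w2"]}
rows_of = {
    "east": [[-40.0, -80.0], [-45.0, -75.0]], "west": [[-80.0, -40.0], [-75.0, -45.0]],
    "e1": [[-40.0, -80.0]], "e2": [[-45.0, -75.0]],
    "w1": [[-80.0, -40.0]], "w2": [[-75.0, -45.0]],
}
full_path = locate([-44.0, -76.0], children_of, rows_of, "building")
coarse_only = locate([-44.0, -76.0], children_of, rows_of, "building", max_depth=1)
```

With `max_depth=1` the search stops after the coarsest layer, which mirrors how a resolution identifier avoids the finer classification steps.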
S6. After the target has been classified into the sub-region with the highest resolution, an XGBoost regression model performs the final precise positioning of the target user.
At the lowest layer $L_1$, each sub-area has its own database; for example, the fitted samples of the k-th area are denoted $S_k$. These samples are used to train a regression model whose input is $(r_1, r_2, \dots, r_M)$ and whose regression target is $(x_n, y_n)$. This embodiment uses XGBoost as the regression model for precise positioning; XGBoost is an improved version of the gradient boosting algorithm GBDT that improves performance and increases computation speed.
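To illustrate the per-partition regression idea without depending on the XGBoost library, the sketch below swaps in a deliberately tiny gradient-boosting regressor built from decision stumps, written with the standard library only. It is not XGBoost and not the embodiment's model; all function names, hyperparameters, and sample values are assumptions. One model is trained per coordinate, with the RSS vector as input.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, lmean, rmean)
    if best is None:  # no split possible: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]

def stump_predict(stump, row):
    f, thr, left_value, right_value = stump
    return left_value if row[f] <= thr else right_value

def fit_boosted(X, y, rounds=200, lr=0.3):
    """Squared-loss gradient boosting: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + lr * stump_predict(stump, row) for p, row in zip(pred, X)]
    return base, lr, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)

def fit_partition_regressor(samples):
    """samples: list of ((r_1, ..., r_M), (x, y)) for one finest-layer partition."""
    X = [list(r) for r, _ in samples]
    model_x = fit_boosted(X, [pos[0] for _, pos in samples])
    model_y = fit_boosted(X, [pos[1] for _, pos in samples])
    return lambda rss: (predict(model_x, rss), predict(model_y, rss))

# Made-up fitted samples S_k for one partition: two APs, positions in arbitrary units.
S_k = [((-40.0, -70.0), (0.0, 0.0)), ((-45.0, -65.0), (1.0, 0.0)),
       ((-50.0, -60.0), (2.0, 0.0)), ((-55.0, -55.0), (3.0, 1.0))]
locate = fit_partition_regressor(S_k)
print(locate((-45.0, -65.0)))
```

In practice a library implementation would replace `fit_boosted`; the point of the sketch is only the structure of one regressor per finest-layer partition.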
The multi-resolution fingerprint positioning system based on Wi-Fi crowdsourcing samples provided by the above method embodiments is further explained with reference to an application example in a specific scenario.
In this example there are two data sets. The first is HUSTDataset; its target area is shown in FIG. 4, and signals from at least 70 APs can be received for each sample. During data collection, signal samples were collected at random with a smartphone and then divided into training data and test data. Finally, Gaussian noise with a variance of 0.6 was added to the position labels of the samples to simulate error-carrying crowdsourced samples in HUSTDataset. The second data set is the public data set UJIIndoorLoc, which covers three buildings, as shown in FIG. 5.
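The label-noise step can be sketched as follows. The text gives only the variance (0.6); applying independent zero-mean noise to each coordinate, the sample layout, and the function name are assumptions made for illustration.

```python
import math
import random

def add_label_noise(samples, variance=0.6, seed=0):
    """Return a copy of `samples` with Gaussian noise added to each position label.

    samples: list of (rss_vector, (x, y)). The RSS vectors are left untouched.
    random.gauss takes a standard deviation, so the variance is converted first.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(variance)
    return [(rss, (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)))
            for rss, (x, y) in samples]

# Made-up clean samples: one RSS vector and one true position each.
clean = [((-40.0, -70.0), (1.0, 2.0)), ((-55.0, -60.0), (3.0, 4.0))]
noisy = add_label_noise(clean, variance=0.6, seed=42)
print(noisy)
```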
This embodiment first evaluates classification performance at low resolution, where low resolution refers to sub-areas that are large or have obvious physical boundaries, for example buildings, floors, and rooms. In addition to the KNN algorithm adopted in this embodiment, other advanced boosting-based machine learning algorithms, such as XGBoost and LightGBM, are compared. The experimental results on the UJIIndoorLoc data set are shown in FIG. 6: the KNN algorithm adopted in this embodiment has the highest hit rate, which exceeds 98%. This verifies the effectiveness of the proposed method at low resolution and lays the foundation for the subsequent high-resolution positioning.
Next, this embodiment evaluates classification performance at high resolution, where high resolution refers to partition structures with finer granularity below the room level. The experimental results on HUSTDataset, shown in FIG. 7, indicate that the KNN classification model performs better than the other models, and that accuracy improves significantly as the level increases (the higher the level, the lower the resolution).
Finally, this embodiment evaluates the precise positioning performance of different methods. An experimental scenario based on HUSTDataset is set up to measure the fine positioning performance of the proposed system against other crowdsourcing-based methods. The comparison schemes are the nearest-neighbor positioning method, the weighted surface positioning method of patent CN109059919A, and the multi-level positioning method of patent CN111474516A. FIG. 8 shows the cumulative distribution curves of the positioning error. It can be seen that the method provided by this embodiment significantly improves positioning accuracy. This result shows that the structure proposed in this embodiment, which progresses layer by layer according to resolution, improves not only the region hit performance but also the precise positioning performance.
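A cumulative distribution curve of the positioning error, of the kind referred to above, can be computed as in the following sketch. The function names and the numbers are made up for illustration and are not results from the experiments.

```python
import math

def positioning_errors(estimates, truths):
    """Euclidean distance between each estimated and true position."""
    return [math.hypot(ex - tx, ey - ty)
            for (ex, ey), (tx, ty) in zip(estimates, truths)]

def empirical_cdf(errors):
    """Return (error, cumulative fraction) pairs sorted by error."""
    ordered = sorted(errors)
    n = len(ordered)
    return [(e, (i + 1) / n) for i, e in enumerate(ordered)]

def error_at_fraction(errors, q):
    """Smallest error e such that at least a fraction q of samples have error <= e."""
    ordered = sorted(errors)
    idx = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[idx]

# Illustrative positions only.
estimates = [(0.0, 0.0), (1.0, 1.0), (3.0, 4.0), (2.0, 2.0)]
truths = [(0.0, 0.0), (1.0, 2.0), (0.0, 0.0), (2.0, 2.0)]
errors = positioning_errors(estimates, truths)
for err, frac in empirical_cdf(errors):
    print(f"error <= {err:.2f}: {frac:.2f}")
```

Comparing two methods then amounts to plotting their `empirical_cdf` outputs on the same axes; the curve that rises faster has the smaller errors.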
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (10)
1. A method for generating a Wi-Fi reliable crowdsourcing sample set, the method comprising:
S1. splitting a target area from top to bottom in a multi-level manner until the target area is split into grids, obtaining units at different levels (the target area, sub-areas, sub-sub-areas, ..., grids), wherein the grids correspond to the highest resolution;
S2. assigning the original crowdsourced samples based on the multi-level splitting result of the target area;
S3. measuring the reliability of the crowdsourced samples in each grid, and retaining the trusted crowdsourced samples whose reliability is greater than a threshold;
S4. performing surface fitting on all the retained trusted crowdsourced samples;
S5. uniformly sampling the fitted surface function to obtain a Wi-Fi reliable crowdsourcing sample set.
2. The method of claim 1, wherein the reliability metric takes the form of a silhouette coefficient, calculated as follows:
where the sample is the RSS vector collected at coordinate $(x_n, y_n)$ within grid $g_k$; the intra-grid term is the signal distance between the sample and the other crowdsourced samples in the same grid; the inter-grid term is the minimum signal distance between the sample and all crowdsourced samples in other grids; $S_k$ is the set of all samples in the grid containing the sample; and the pairwise term is the signal distance between two samples.
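A reliability filter of this kind can be sketched as follows, using the standard silhouette form h = (b - a) / max(a, b). The symbols a, b, and h, the use of the mean for the intra-grid term, the Euclidean signal distance, and the threshold value are all assumptions; the claim's own formula is not reproduced here.

```python
import math

def signal_distance(u, v):
    """Euclidean distance between two RSS vectors (an assumed distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def silhouette(sample, own_grid, other_grids):
    """Silhouette-style score of one sample.

    a: mean distance to the other samples in the same grid (assumed mean).
    b: minimum distance to any sample in any other grid.
    """
    peers = [s for s in own_grid if s is not sample]
    others = [s for grid in other_grids for s in grid]
    if not peers or not others:
        return 0.0
    a = sum(signal_distance(sample, s) for s in peers) / len(peers)
    b = min(signal_distance(sample, s) for s in others)
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom

def filter_reliable(grids, threshold):
    """Keep the samples whose score exceeds `threshold`. grids: {grid_id: [rss, ...]}."""
    kept = {}
    for gid, samples in grids.items():
        others = [g for oid, g in grids.items() if oid != gid]
        kept[gid] = [s for s in samples if silhouette(s, samples, others) > threshold]
    return kept

# Made-up data: the third sample of g1 looks more like g2 than like its own grid.
grids = {"g1": [[-40.0, -70.0], [-41.0, -69.0], [-60.0, -50.0]],
         "g2": [[-70.0, -40.0], [-71.0, -39.0]]}
trusted = filter_reliable(grids, threshold=0.2)
print(trusted)
```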
3. The method of claim 2, wherein the surface fitting is as follows:
letting the silhouette value set of the original crowdsourced samples be $S_h = \{h_1, h_2, \dots, h_N\}$ and the silhouette value set of the trusted crowdsourced samples be $S_{h'} = \{h_1, h_2, \dots, h_\gamma\}$, where $N$ denotes the number of original crowdsourced samples, $\gamma$ denotes the number of trusted crowdsourced samples, and the silhouette value
using the silhouette value $h_n$ to compute the weight coefficient $w_n$, $n = 1, \dots, \gamma$, of each sample in the trusted crowdsourced samples:
where $p$ denotes the proportion of reliable crowdsourced samples;
using the weight coefficient $w_n$ to weight the squared residual between the value of the fitting function for the signal of the m-th AP at position $(x_n, y_n)$ and the corresponding RSS value of the sample, the objective function that minimizes the weighted sum of squared residuals being:
where $\phi_m(x, y)$ is the fitting surface function used to fit the propagation surface of the wireless signal from the m-th AP in a given partition.
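A weighted least-squares surface fit of this kind, followed by uniform sampling of the fitted surface, can be sketched as below. The planar form phi(x, y) = c0 + c1*x + c2*y, the function names, and the sample values and weights are assumptions chosen to keep the example short; the claim does not fix the surface family, and the weight formula is not reproduced.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_weighted_plane(points, rss, weights):
    """Minimize sum_n w_n * (phi(x_n, y_n) - rss_n)^2 via the normal equations."""
    basis = [(1.0, x, y) for x, y in points]
    A = [[sum(w * b[i] * b[j] for w, b in zip(weights, basis)) for j in range(3)]
         for i in range(3)]
    rhs = [sum(w * b[i] * r for w, b, r in zip(weights, basis, rss)) for i in range(3)]
    c0, c1, c2 = solve_linear(A, rhs)
    return lambda x, y: c0 + c1 * x + c2 * y

def sample_uniform(phi, x0, y0, step, nx, ny):
    """Evaluate the fitted surface on a regular nx-by-ny grid."""
    return [((x0 + i * step, y0 + j * step), phi(x0 + i * step, y0 + j * step))
            for i in range(nx) for j in range(ny)]

# Made-up trusted samples of one AP in one partition, with assumed weights.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
rss = [-40.0, -45.0, -43.0, -48.0]
weights = [1.0, 0.8, 0.6, 0.9]
phi = fit_weighted_plane(points, rss, weights)
virtual_samples = sample_uniform(phi, 0.0, 0.0, 0.5, 3, 3)
print(virtual_samples[:3])
```

The uniformly sampled values play the role of the reliable sample set: they replace the unevenly distributed crowdsourced samples with evenly spaced ones drawn from the fitted surface.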
4. A construction method of a multi-resolution off-line database is characterized by comprising the following steps:
T1. generating a Wi-Fi reliable crowdsourcing sample set using the method of any one of claims 1 to 3;
T2. constructing a multi-resolution offline database in a bottom-up manner based on the multi-level splitting result of the target area, wherein a partition fingerprint of the upper resolution layer is obtained from a set of partition fingerprints of the current resolution layer.
5. The method of claim 4, wherein the resolution layers of the multi-resolution offline database, ordered from high resolution to low resolution, are named $L_1, L_2, \dots, L_J$, wherein layer $L_j$ has P partitions, each consisting of K sub-partitions of layer $L_{j-1}$, and the data of the p-th partition of layer $L_j$ is:
where $AP_1, AP_2, \dots, AP_M$ denote the 1st, 2nd, ..., M-th signal sources; Label denotes the label of the partition to which the data belongs; every row except the last is the signal vector of one sub-partition, and the last row is the average of all sub-partition vectors; every column except the last holds the signal strength of one AP, and the last column holds the partition label of the data; the entry in row k, column m is the signal strength value of the m-th AP received in the k-th partition of layer $L_{j-1}$; the entry in the last row, column m is the average signal strength in the m-th AP dimension over the representative vectors of the K partitions of layer $L_{j-1}$; $j = 1, 2, \dots, J$, where J denotes the number of resolution layers; $p = 1, 2, \dots, P$; and $m = 1, 2, \dots, M$.
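The bottom-up construction of one partition's data can be sketched as follows: the K sub-partition vectors become the rows, and their element-wise average becomes the last row and the partition's own vector at the next, coarser layer. The function name, label strings, and RSS values are assumptions made for illustration.

```python
def build_partition_data(child_vectors, label):
    """Assemble the data of one partition from its K sub-partition vectors.

    Returns (rows, mean): each row is the M signal strengths followed by the
    partition label; the last row holds the element-wise average, which also
    serves as this partition's vector when the next layer is built.
    """
    K = len(child_vectors)
    M = len(child_vectors[0])
    mean = [sum(v[m] for v in child_vectors) / K for m in range(M)]
    rows = [list(v) + [label] for v in child_vectors]
    rows.append(mean + [label])
    return rows, mean

# Made-up grid vectors (layer L1) for two rooms, each seen by two APs.
rows_a, room_a_vec = build_partition_data([[-40.0, -70.0], [-44.0, -66.0]], "<L2,1>")
rows_b, room_b_vec = build_partition_data([[-70.0, -40.0], [-66.0, -44.0]], "<L2,2>")

# The room vectors in turn form the rows of the building-level partition (layer L3).
rows_building, building_vec = build_partition_data([room_a_vec, room_b_vec], "<L3,1>")
print(rows_a)
print(rows_building)
```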
6. The method of claim 4, wherein the resolution layers of the multi-resolution offline database, ordered from high resolution to low resolution, are named $L_1, L_2, \dots, L_J$, wherein layer $L_j$ has P partitions, each consisting of K sub-partitions of layer $L_{j-1}$, and the data of the p-th partition of layer $L_j$ is:
where $AP_1, AP_2, \dots, AP_M$ denote the 1st, 2nd, ..., M-th signal sources; $F_1, \dots, F_P$ denote the 1st, ..., P-th auxiliary features; Label denotes the label of the partition to which the data belongs; every row except the last is the fingerprint of one sub-partition, and the last row is the average of all sub-partition fingerprints; each of the first M columns holds the signal strength values of one AP, each of the next P columns holds the values of one auxiliary feature, and the last column holds the partition label of the data; the RSS entry in row k, column m is the signal strength value of the m-th AP received in the k-th partition of layer $L_{j-1}$; the RSS entry in the last row, column m is the average signal strength in the m-th AP dimension over the representative vectors of the K partitions of layer $L_{j-1}$; $d_{pq}$ denotes the value of an auxiliary feature, computed from the representative vector of each region; $j = 1, 2, \dots, J$, where J denotes the number of resolution layers; $p = 1, 2, \dots, P$; $q = 1, 2, \dots$ over layer $L_{j-1}$; and M denotes the total number of signal sources.
7. A multi-resolution positioning method, the positioning method comprising:
(1) determining the sub-area containing the target layer by layer in a top-down manner, from the coarse-resolution layer to the fine-resolution layer, using a classification model for the classification;
(2) if the user specifies a resolution level, performing final precise positioning of the target user with a regression model after the target has been classified into a sub-region at the user-specified resolution level; otherwise, performing final precise positioning of the target user with the regression model after the target has been classified into the sub-region with the highest resolution.
8. The positioning method according to claim 7, wherein the step (1) comprises the sub-steps of:
where the first M entries are the signal strengths of the 1st, 2nd, ..., M-th APs received by the target user, and $d_{1t}, \dots, d_{pt}, \dots, d_{Pt}$ respectively denote the vector distances between the original test fingerprint and the 1st, ..., p-th, ..., P-th partition fingerprints;
(1.2) based on the complete test fingerprint of the target user, finding the closest fingerprint in the offline database using the K-nearest-neighbor method, and taking the corresponding partition identifier as the estimated sub-region of the target user.
9. A multi-resolution positioning system, comprising: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is configured to read executable instructions stored in the computer-readable storage medium and execute the multi-resolution positioning method of claim 7 or 8.
10. A computer-readable storage medium comprising a stored computer program; the computer program, when executed by a processor, controls an apparatus on which the computer-readable storage medium is located to perform the method of any of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210064098.1A CN114449651B (en) | 2022-01-20 | 2022-01-20 | Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114449651A | 2022-05-06 |
CN114449651B | 2023-02-10 |
Family
ID=81367700
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210064098.1A Active CN114449651B (en) | 2022-01-20 | 2022-01-20 | Method for generating reliable crowdsourcing sample set, constructing multi-resolution database and positioning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114449651B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN114449651B (en) | 2023-02-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||