CN111639712A - Positioning method and system based on density peak clustering and gradient lifting algorithm - Google Patents

Publication number
CN111639712A
Authority
CN
China
Prior art keywords
value
sample
signal strength
algorithm
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010482361.XA
Other languages
Chinese (zh)
Inventor
魏爱辉
李卫宁
张晖
陈春海
方士琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beidou Shurui Beijing Technology Co ltd
Original Assignee
Beidou Shurui Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beidou Shurui Beijing Technology Co ltd
Priority to CN202010482361.XA
Publication of CN111639712A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques


Abstract

The application discloses a positioning method and system based on density peak clustering and a gradient boosting algorithm. The method comprises the following steps: setting a reference point in a preset area; collecting a first received signal strength value at the reference point, combining it with the position coordinates of the reference point to form a piece of fingerprint data, and storing the fingerprint data in a fingerprint database; acquiring a second received signal strength value of each access point (AP), collected by the positioning terminal at the point to be positioned; comparing the second received signal strength value with the first received signal strength value to obtain a comparison result; and obtaining the position coordinates of the point to be positioned according to the comparison result. Compared with the prior art, the application has the following beneficial effects: the positioning method computes the position coordinates of the point to be positioned; with a limited set of APs, it significantly reduces the indoor positioning error compared with the KNN algorithm and improves positioning accuracy by 20%. In addition, fewer APs are required to reach the same positioning accuracy.

Description

Positioning method and system based on density peak clustering and gradient lifting algorithm
Technical Field
The application relates to the field of positioning in wireless local area networks, and in particular to a positioning method and a positioning system based on density peak clustering and a gradient boosting algorithm.
Background
Location fingerprinting is a positioning method based on scene analysis and matching. In an indoor environment, the strength of the signals received from the APs (access points) differs from one location to another, so the current location can be described by the RSSI values of the different APs measured at that point. The algorithm comprises two stages, an off-line stage and an on-line stage.
In the off-line stage, reference points are first planned at reasonable positions in the area to be positioned; the RSSI values of all APs are then collected at each reference point, combined with the position coordinates of that point to form a piece of fingerprint data, and stored in a fingerprint database.
In the on-line stage, the received signal strength (RSS) values of each AP, collected by the positioning terminal at the point to be positioned, are compared with the data in the off-line database by a matching algorithm to obtain the position coordinates of the point to be positioned.
However, in the prior art the received signal strength is easily affected by many factors, such as co-channel interference, a complex and changeable indoor environment, and moving crowds. The received signal strength therefore fluctuates strongly, which seriously degrades indoor positioning accuracy and causes many problems for Wi-Fi indoor positioning based on fingerprint positioning algorithms.
Disclosure of Invention
The main objective of the present application is to provide a positioning method based on density peak clustering and gradient boosting algorithm, which includes:
setting a reference point in a preset area;
collecting a first received signal strength value of the reference point, forming a piece of fingerprint data together with the position coordinate of the reference point, and storing the fingerprint data into a fingerprint database;
acquiring a second received signal strength value of each access point acquired by the positioning terminal at a point to be positioned;
comparing the second received signal strength value with the first received signal strength value, and obtaining a comparison result;
and obtaining the position coordinates of the to-be-positioned point according to the comparison result.
Optionally, collecting a first received signal strength value of the reference point, forming a piece of fingerprint data together with the position coordinates of the reference point, and storing the fingerprint data in a fingerprint database includes:
establishing the fingerprint database by a density peak clustering algorithm;
calculating the Gaussian-kernel local density and the distance of each sample in the space to which it belongs by the density peak clustering algorithm;
screening the samples by these two attributes and taking the samples for which both values are high as cluster centers;
and taking the sample value of the cluster center as the first received signal strength value of the reference point.
Optionally, screening the samples and taking the samples for which both values are high as cluster centers includes:
letting the preliminarily filtered received signal strength data from reference point k to signal receiving points l, m, n be
Figure BDA0002515701250000021
taking the data of point l as the X coordinate of a three-dimensional coordinate system, the data of point m as the Y coordinate and the data of point n as the Z coordinate, with τ denoting the number of recordings, each sample is represented as a point distributed in the three-dimensional space S as follows:
Figure BDA0002515701250000022
the Euclidean distance from sample i to sample j in space S is defined as $d_{ij}$; the two attributes of sample i, namely the Gaussian-kernel local density $\rho_i$ and the distance $\delta_i$, are defined as follows:
Figure BDA0002515701250000031
Figure BDA0002515701250000032
where $d_c$ is the cutoff distance, and $\rho_i$ represents the number of samples in space S whose distance from sample i is less than $d_c$;
the distances $d_{ij}$ in space S number $N = n(n-1)/2$ in total, where n is the number of samples; sorted in ascending order they are:
$d_1 \le d_2 \le \ldots \le d_N$; $d_c = d_{f(N\mu)}$
where $f(N\mu)$ denotes the integer obtained by rounding $N\mu$, and $\mu \in (0, 1)$ is a given parameter;
for each sample i in space S, the pair $(\rho_i, \delta_i)$, $i \in I_S$, is obtained from the above formulas; the pairs $\{(\rho_i, \delta_i)\}$ of all samples in the space are plotted on a two-dimensional plane with ρ as the horizontal axis and δ as the vertical axis, and the samples satisfying $\max\{\rho_i \cdot \delta_i\}$, $i = 1, 2, \ldots, n$, are selected as cluster centers.
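The selection rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the Gaussian-kernel density `exp(-(d_ij/d_c)^2)` and the "distance to the nearest denser sample" rule are the forms commonly used in the density peak clustering literature and are assumed here because the equation images are not reproduced; the function name `dpc_centers`, the parameter `n_centers`, and the RSS values are hypothetical.

```python
import math

def dpc_centers(points, d_c, n_centers=1):
    """Return (center indices, rho, delta) for a list of RSS sample points."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Assumed Gaussian-kernel local density: close samples contribute almost 1 each.
    rho = [sum(math.exp(-(dist[i][j] / d_c) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    delta = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # The densest sample has no denser neighbour, so use its largest distance.
        delta.append(min(denser) if denser else max(dist[i]))
    # Rank samples by rho * delta and keep the top n_centers as cluster centers.
    gamma = [r * d for r, d in zip(rho, delta)]
    order = sorted(range(n), key=lambda i: gamma[i], reverse=True)
    return order[:n_centers], rho, delta

# Hypothetical RSS triples (dBm) recorded at one reference point from three receivers.
rss = [(-60.0, -70.0, -80.0), (-61.0, -71.0, -79.0), (-59.0, -69.0, -81.0),
       (-60.5, -70.5, -80.5), (-40.0, -90.0, -60.0)]
centers, rho, delta = dpc_centers(rss, d_c=2.0)
print(centers)
```

In this toy data the last triple is an isolated outlier: its distance attribute is large but its density is close to zero, so the product rule does not select it.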
Optionally, the positioning method based on density peak clustering and gradient boosting algorithm further includes: constructing a positioning model on the basis of a gradient boosting algorithm, wherein gradient boosting is an algorithm that uses an additive model and continually reduces the residuals produced during training in order to classify or regress data.
Optionally, constructing the positioning model on the basis of the gradient boosting algorithm comprises:
establishing a mapping between fingerprint data and physical position coordinates by the gradient boosting algorithm, taking the fingerprint database D as the input space, and initializing a classification and regression tree:
Figure BDA0002515701250000033
where $y_i$ represents the physical position coordinates of the i-th reference point; τ is the output value of a leaf node of the classification and regression tree, i.e. the predicted value of the position coordinates of the i-th reference point; N is the number of fingerprint samples; L is the loss function of the model;
the negative gradient of the loss function evaluated at the current model is used in place of the residual, as an approximation of the error, to fit the next classification and regression tree; the negative gradient of the loss function for the classification and regression tree $F_{m-1}(x)$ is expressed as:
Figure BDA0002515701250000041
the input space of the m-th classification and regression tree is $\Phi = \{(x_1, \alpha_{m1}), (x_2, \alpha_{m2}), \ldots, (x_N, \alpha_{mN})\}$;
calculating the output value of each child (leaf) node by line search:
Figure BDA0002515701250000042
and fitting the next classification and regression tree using the negative gradient of the loss function at the current model as an approximation of the error, wherein the final positioning model of the gradient boosting algorithm is:
Figure BDA0002515701250000043
where M represents the total number of classification and regression trees generated by iteration; during iteration each tree is multiplied by a regularization coefficient $\lambda_m$, with value range $(0, 1]$, to avoid overfitting the training data; $\tau_{mj}$ is the output value of a leaf node of the classification and regression tree; and I is an indicator that takes the value 1 when $x \in \beta_{mj}$ and 0 otherwise.
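The four equation images in this passage are not reproduced in the text. For orientation only, the standard gradient boosting forms that match the symbols defined above are sketched below. They are an assumption about what the missing figures contain, not a transcription; $J$ (the number of leaves of a tree) is a symbol introduced here for illustration, and $\beta_{mj}$ is read as the region of the j-th leaf of the m-th tree.

```latex
\begin{align*}
F_0(x) &= \arg\min_{\tau} \sum_{i=1}^{N} L(y_i, \tau) \\
\alpha_{mi} &= -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{m-1}} \\
\tau_{mj} &= \arg\min_{\tau} \sum_{x_i \in \beta_{mj}} L\bigl(y_i, F_{m-1}(x_i) + \tau\bigr) \\
F_M(x) &= F_0(x) + \sum_{m=1}^{M} \lambda_m \sum_{j=1}^{J} \tau_{mj}\, I\bigl(x \in \beta_{mj}\bigr)
\end{align*}
```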
Optionally, the loss function is a Huber loss function, which takes a quantile σ as the boundary and applies two different strategies to reduce the influence of abnormal values on the prediction result: for abnormal points far from the center, an absolute-value loss function is used, while for points near the center a mean-squared-error loss function is used; the Huber loss function is as follows:
Figure BDA0002515701250000044
Optionally, comparing the second received signal strength value with the first received signal strength value and obtaining a comparison result includes:
comparing the second received signal strength value with the first received signal strength values in the fingerprint database in sequence;
and taking the position coordinates corresponding to the first received signal strength value closest to the second received signal strength value as the position coordinates of the point to be positioned.
According to another aspect of the present application, there is also provided a positioning system based on density peak clustering and gradient boosting algorithm, including:
the reference point setting module is used for setting a reference point in a preset area;
the fingerprint database establishing module is used for collecting a first received signal strength value of the reference point, forming a piece of fingerprint data together with the position coordinate of the reference point and storing the fingerprint data into a fingerprint database;
the receiving signal strength value acquisition module is used for acquiring a second receiving signal strength value of each access point acquired by the positioning terminal at a point to be positioned;
a comparison module, configured to compare the second received signal strength value with the first received signal strength value, and obtain a comparison result;
and the coordinate acquisition module is used for acquiring the position coordinate of the to-be-positioned point according to the comparison result.
Optionally, the fingerprint database establishing module includes:
the fingerprint database establishing module is used for establishing a fingerprint database by adopting a density peak value clustering algorithm;
the Gaussian kernel local density and distance calculation module is used for calculating the Gaussian-kernel local density and the distance of each sample in the space to which it belongs by the density peak clustering algorithm;
the cluster center screening module is used for screening the samples by these two attributes and taking the samples for which both values are high as cluster centers;
and the first received signal strength value determining module is used for taking the sample value of the cluster center as the first received signal strength value of the reference point.
Optionally, the cluster center screening module includes: a filtering module for letting the preliminarily filtered received signal strength data from reference point k to signal receiving points l, m, n be
Figure BDA0002515701250000061
a three-dimensional space S distribution module for taking the data of point l as the X coordinate of a three-dimensional coordinate system, the data of point m as the Y coordinate and the data of point n as the Z coordinate, with τ denoting the number of recordings, and representing each sample as a point distributed in the three-dimensional space S:
Figure BDA0002515701250000062
a definition module for defining the Euclidean distance from sample i to sample j in space S as $d_{ij}$, the two attributes of sample i, namely the Gaussian-kernel local density $\rho_i$ and the distance $\delta_i$, being defined as follows:
Figure BDA0002515701250000063
Figure BDA0002515701250000064
where $d_c$ is the cutoff distance, and $\rho_i$ represents the number of samples in space S whose distance from sample i is less than $d_c$;
the distances $d_{ij}$ in space S number $N = n(n-1)/2$ in total, where n is the number of samples; sorted in ascending order they are:
$d_1 \le d_2 \le \ldots \le d_N$; $d_c = d_{f(N\mu)}$
where $f(N\mu)$ denotes the integer obtained by rounding $N\mu$, and $\mu \in (0, 1)$ is a given parameter;
a drawing module for obtaining, for each sample i in space S, the pair $(\rho_i, \delta_i)$, $i \in I_S$, from the above formulas, plotting the pairs $\{(\rho_i, \delta_i)\}$ of all samples in the space on a two-dimensional plane with ρ as the horizontal axis and δ as the vertical axis, and selecting the samples satisfying $\max\{\rho_i \cdot \delta_i\}$, $i = 1, 2, \ldots, n$, as cluster centers.
Compared with the prior art, the method has the following beneficial effects:
The invention provides a Wi-Fi indoor positioning algorithm based on density peak clustering (DPC) and a gradient boosting decision tree (GBDT). The algorithm first uses DPC to extract the main positioning features from the original position fingerprints and to remove redundancy and noise; it then constructs a GBDT positioning model using a forward stagewise algorithm and an additive model to compute the position coordinates of the point to be positioned. With a limited set of APs, the algorithm significantly reduces the indoor positioning error compared with the KNN (K-Nearest Neighbor) algorithm and improves positioning accuracy by 20%. In addition, fewer APs are required to reach the same positioning accuracy.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, serve to provide a further understanding of the application and to enable other features, objects, and advantages of the application to be more apparent. The drawings and their description illustrate the embodiments of the invention and do not limit it. In the drawings:
FIG. 1 is a schematic flow chart diagram of a positioning method based on density peak clustering and gradient boosting algorithm according to an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram of a positioning method based on density peak clustering and gradient boosting according to an embodiment of the present application;
FIG. 3 is a diagram of the distribution of samples in the three-dimensional space S according to one embodiment of the present application;
FIG. 4 is a schematic diagram of a cluster center according to one embodiment of the present application;
FIG. 5 is a comparison graph of different fingerprint data set positioning errors according to one embodiment of the present application;
FIG. 6 is a diagram illustrating maximum depths of classification regression trees, according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a computer device according to one embodiment of the present application; and
FIG. 8 is a schematic diagram of a computer-readable storage medium according to one embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be used. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Referring to fig. 1-2, an embodiment of the present application provides a positioning method based on density peak clustering and gradient boosting algorithm, including:
S2: setting a reference point in a preset area;
S4: collecting a first received signal strength value of the reference point, forming a piece of fingerprint data together with the position coordinates of the reference point, and storing the fingerprint data in a fingerprint database;
S6: acquiring a second received signal strength value of each access point, collected by the positioning terminal at the point to be positioned;
S8: comparing the second received signal strength value with the first received signal strength value, and obtaining a comparison result;
S10: obtaining the position coordinates of the point to be positioned according to the comparison result.
In an embodiment of the present application, the fingerprint database is established by a density peak clustering algorithm; the Gaussian-kernel local density and the distance of each sample in the space to which it belongs are calculated by the density peak clustering algorithm; the samples are screened by these two attributes and the samples for which both values are high are taken as cluster centers; and the sample value of the cluster center is taken as the first received signal strength value of the reference point.
In an embodiment of the present application, comparing the second received signal strength value with the first received signal strength value and obtaining the comparison result includes:
comparing the second received signal strength value with the first received signal strength values in the fingerprint database in sequence;
and taking the position coordinates corresponding to the first received signal strength value closest to the second received signal strength value as the position coordinates of the point to be positioned.
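The comparison step can be illustrated with a short nearest-fingerprint lookup. This is a sketch under stated assumptions: Euclidean distance between RSS vectors is assumed as the closeness measure (the text only says "closest"), and the database contents, the function name `locate`, and the coordinates are hypothetical.

```python
import math

# Hypothetical fingerprint database: (RSS vector over three APs in dBm, (x, y) in metres).
fingerprint_db = [
    ((-45.0, -70.0, -82.0), (0.0, 0.0)),
    ((-60.0, -55.0, -75.0), (5.0, 0.0)),
    ((-78.0, -62.0, -50.0), (5.0, 5.0)),
]

def locate(rss_online, database):
    """Return the coordinates of the stored fingerprint closest to rss_online."""
    # Compare the online reading with every stored reading and keep the nearest.
    _, best_xy = min(database, key=lambda rec: math.dist(rec[0], rss_online))
    return best_xy

print(locate((-58.0, -57.0, -74.0), fingerprint_db))
```

A linear scan is enough for a sketch; a larger database would typically use a spatial index, which is outside what the text describes.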
In an embodiment of the present application, the step of screening the samples and taking the samples for which both values are high as cluster centers includes:
letting the preliminarily filtered received signal strength data from reference point k to signal receiving points l, m, n be
Figure BDA0002515701250000091
taking the data of point l as the X coordinate of a three-dimensional coordinate system, the data of point m as the Y coordinate and the data of point n as the Z coordinate, with τ denoting the number of recordings, each sample is represented as a point distributed in the three-dimensional space S as follows:
Figure BDA0002515701250000092
the Euclidean distance from sample i to sample j in space S is defined as $d_{ij}$; the two attributes of sample i, namely the Gaussian-kernel local density $\rho_i$ and the distance $\delta_i$, are defined as follows:
Figure BDA0002515701250000093
Figure BDA0002515701250000094
where $d_c$ is the cutoff distance, and $\rho_i$ represents the number of samples in space S whose distance from sample i is less than $d_c$;
the distances $d_{ij}$ in space S number $N = n(n-1)/2$ in total, where n is the number of samples; sorted in ascending order they are:
$d_1 \le d_2 \le \ldots \le d_N$; $d_c = d_{f(N\mu)}$
where $f(N\mu)$ denotes the integer obtained by rounding $N\mu$, and $\mu \in (0, 1)$ is a given parameter;
for each sample i in space S, the pair $(\rho_i, \delta_i)$, $i \in I_S$, is obtained from the above formulas; the pairs $\{(\rho_i, \delta_i)\}$ of all samples in the space are plotted on a two-dimensional plane with ρ as the horizontal axis and δ as the vertical axis, and the samples satisfying $\max\{\rho_i \cdot \delta_i\}$, $i = 1, 2, \ldots, n$, are selected as cluster centers.
1. Establishing the fingerprint database
Because multiple groups of signals interfere with one another, and because RSS is affected by shielding, reflection and absorption by indoor objects, the affected RSS data must first be removed as a preliminary filtering of the RSS data. To further improve the reliability of the data, the fingerprint database is established with a density peak clustering (DPC) algorithm. The DPC algorithm calculates the Gaussian-kernel local density and the distance of each sample in the space to which it belongs; the samples for which both attributes are high are selected as cluster centers, and a cluster center has the highest density attribute in the sample space.
Referring to FIG. 3, let the preliminarily filtered RSS data from reference point k to signal receiving points l, m, n be
Figure BDA0002515701250000101
Taking the data of point l as the X coordinate of a three-dimensional coordinate system, the data of point m as the Y coordinate and the data of point n as the Z coordinate, with τ denoting the number of recordings, each sample is represented as a point distributed in the three-dimensional space S, as shown in FIG. 3:
Figure BDA0002515701250000102
The Euclidean distance from sample i to sample j in space S is defined as $d_{ij}$. The two attributes of sample i, namely the Gaussian-kernel local density $\rho_i$ and the distance $\delta_i$, are defined as follows.
Figure BDA0002515701250000103
Figure BDA0002515701250000104
Here $d_c$ is the cutoff distance, and $\rho_i$ represents the number of samples in space S whose distance from sample i is less than $d_c$.
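The two equation images above are not reproduced in the text. For orientation only, the forms most commonly used in the density peak clustering literature are sketched below, writing $\delta_i$ for the distance attribute. They are an assumption about what the missing figures contain. Note that the verbal description ("the number of samples closer than $d_c$") corresponds to a hard-cutoff count, of which the Gaussian-kernel sum is a smooth variant.

```latex
\rho_i = \sum_{j \neq i} \exp\!\left[-\left(\frac{d_{ij}}{d_c}\right)^{2}\right],
\qquad
\delta_i =
\begin{cases}
\min\limits_{j:\,\rho_j > \rho_i} d_{ij}, & \text{if some sample has } \rho_j > \rho_i,\\[4pt]
\max\limits_{j} d_{ij}, & \text{otherwise.}
\end{cases}
```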
The distances $d_{ij}$ in space S number $N = n(n-1)/2$ in total, where n is the number of samples; sorted in ascending order they are:
$d_1 \le d_2 \le \ldots \le d_N$; $d_c = d_{f(N\mu)}$
where $f(N\mu)$ denotes the integer obtained by rounding $N\mu$, and $\mu \in (0, 1)$ is a given parameter.
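The choice of the cutoff distance can be sketched as follows: sort all pairwise distances and take the one at rank f(Nμ). The function name `cutoff_distance` and the sample points are hypothetical, and two details are assumptions of this sketch rather than statements of the text: Python's built-in `round` (which rounds halves to the nearest even integer) stands in for the rounding function f, and the rank is clamped to 1..N so that a very small μ still yields a valid index.

```python
import math
from itertools import combinations

def cutoff_distance(points, mu):
    """Pick d_c as the f(N*mu)-th smallest pairwise distance, N = n(n-1)/2."""
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    total = len(dists)  # this is N, the number of sample pairs
    # Round N*mu to an integer rank and clamp it to the valid range 1..N.
    rank = min(max(round(total * mu), 1), total)
    return dists[rank - 1]

# Hypothetical sample points; their pairwise distances are 1, 2, 3, 3, 5, 6.
pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
print(cutoff_distance(pts, 0.5))
```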
For each sample i in space S, the pair $(\rho_i, \delta_i)$, $i \in I_S$, is obtained from the above formulas. The pairs $\{(\rho_i, \delta_i)\}$ of all samples in the space are plotted on a two-dimensional plane with ρ as the horizontal axis and δ as the vertical axis, and the samples satisfying $\max\{\rho_i \cdot \delta_i\}$, $i = 1, 2, \ldots, n$, are selected as cluster centers. The cluster center sample selected in the sample space by this calculation is stored in the fingerprint database and used as the position fingerprint of the reference point.
After DPC has extracted the positioning features, the fingerprint database is
Figure BDA0002515701250000111
where $x_i = (rss_1, \ldots, rss_p)$ is the new fingerprint data of the i-th reference point, $y_i$ is the physical position coordinates of the i-th reference point, and p is the feature dimension, which strongly affects the final prediction accuracy of the model. If p is too small, too few positioning features are introduced and the positioning accuracy is low; if p is too large, redundant information and noise in the fingerprint data are introduced, which degrades the final position coordinate prediction. The feature dimension p retained after DPC extracts the positioning features therefore has to be tuned in the off-line stage to find the optimal value.
In an embodiment of the present application, the positioning method based on density peak clustering and gradient boosting algorithm further includes: constructing a positioning model on the basis of a gradient boosting algorithm, wherein gradient boosting is an algorithm that uses an additive model and continually reduces the residuals produced during training in order to classify or regress data.
In an embodiment of the present application, constructing the positioning model on the basis of the gradient boosting algorithm includes:
establishing a mapping between fingerprint data and physical position coordinates by the gradient boosting algorithm, taking the fingerprint database D as the input space, and initializing a classification and regression tree:
Figure BDA0002515701250000112
where $y_i$ represents the physical position coordinates of the i-th reference point; τ is the output value of a leaf node of the classification and regression tree, i.e. the predicted value of the position coordinates of the i-th reference point; N is the number of fingerprint samples; L is the loss function of the model;
the negative gradient of the loss function evaluated at the current model is used in place of the residual, as an approximation of the error, to fit the next classification and regression tree; the negative gradient of the loss function for the classification and regression tree $F_{m-1}(x)$ is expressed as:
Figure BDA0002515701250000121
the input space of the m-th classification and regression tree is $\Phi = \{(x_1, \alpha_{m1}), (x_2, \alpha_{m2}), \ldots, (x_N, \alpha_{mN})\}$;
calculating the output value of each child (leaf) node by line search:
Figure BDA0002515701250000122
and fitting the next classification and regression tree using the negative gradient of the loss function at the current model as an approximation of the error, wherein the final positioning model of the gradient boosting algorithm is:
Figure BDA0002515701250000123
where M represents the total number of classification and regression trees generated by iteration; during iteration each tree is multiplied by a regularization coefficient $\lambda_m$, with value range $(0, 1]$, to avoid overfitting the training data; $\tau_{mj}$ is the output value of a leaf node of the classification and regression tree; and I is an indicator that takes the value 1 when $x \in \beta_{mj}$ and 0 otherwise.
2. Building the GBDT positioning model
The positioning model is built on GBDT. GBDT uses an additive model (that is, a linear combination of basis functions) and continually reduces the residuals produced during training in order to classify or regress data. Within the gradient boosting framework, base learners are trained iteratively and then fused by weighting, so that weak learners are combined into a strong learner; this improves the generalization ability and accuracy of the model and yields the final algorithm model.
GBDT is used to construct the mapping between fingerprint data and physical position coordinates, taking the fingerprint database D generated in step 1 as the input space, and a classification and regression tree is initialized:
Figure BDA0002515701250000131
In the formula: $y_i$ represents the physical position coordinates of the i-th reference point; τ is the output value of a leaf node of the classification and regression tree, i.e. the predicted value of the position coordinates of the i-th reference point; N is the number of fingerprint samples; L is the loss function of the model.
The negative gradient of the loss function evaluated at the current model is used in place of the residual, as an approximation of the error, to fit the next classification and regression tree. The negative gradient of the loss function for the classification and regression tree $F_{m-1}(x)$ can be expressed as:
Figure BDA0002515701250000132
The input space of the m-th classification and regression tree is $\Phi = \{(x_1, \alpha_{m1}), (x_2, \alpha_{m2}), \ldots, (x_N, \alpha_{mN})\}$.
To minimize the deviation of the predicted value output by the classification and regression tree, the invention uses line search to calculate the output value of each child (leaf) node:
Figure BDA0002515701250000133
The next classification and regression tree is fitted using the negative gradient of the loss function at the current model as an approximation of the error, and the final GBDT positioning model is:
Figure BDA0002515701250000134
In the formula: M represents the total number of classification and regression trees generated by iteration; during iteration each tree is multiplied by a regularization coefficient $\lambda_m$, with value range $(0, 1]$, to avoid overfitting the training data; $\tau_{mj}$ is the output value of a leaf node of the classification and regression tree; and I is an indicator that takes the value 1 when $x \in \beta_{mj}$ and 0 otherwise.
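The boosting loop described in this section can be illustrated with a deliberately small sketch. It makes several simplifying assumptions that differ from the text: squared-error loss is swapped in for the Huber loss (so the negative gradient is simply the residual and the optimal leaf value is the leaf mean), each tree is a depth-1 stump, a single constant `shrinkage` plays the role of every $\lambda_m$, and only one coordinate is predicted. All function names and data values are hypothetical.

```python
def stump_predict(stump, x):
    feature, threshold, left_value, right_value = stump
    return left_value if x[feature] <= threshold else right_value

def fit_stump(xs, targets):
    """Fit a depth-1 regression tree to targets by minimising squared error."""
    best = None
    for feature in range(len(xs[0])):
        # Candidate thresholds: every distinct value except the largest.
        for threshold in sorted({x[feature] for x in xs})[:-1]:
            left = [t for x, t in zip(xs, targets) if x[feature] <= threshold]
            right = [t for x, t in zip(xs, targets) if x[feature] > threshold]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - lv) ** 2 for t in left) + sum((t - rv) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, feature, threshold, lv, rv)
    if best is None:  # every feature is constant: fall back to a single leaf
        mean = sum(targets) / len(targets)
        return (0, float("inf"), mean, mean)
    return best[1:]

def fit_gbdt(xs, ys, n_trees=100, shrinkage=0.1):
    f0 = sum(ys) / len(ys)  # constant initial model for squared loss
    preds = [f0] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of 1/2 (y - F)^2 with respect to F is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + shrinkage * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return f0, stumps, shrinkage

def predict(model, x):
    f0, stumps, shrinkage = model
    return f0 + shrinkage * sum(stump_predict(s, x) for s in stumps)

# Hypothetical fingerprints: RSS from two APs (dBm) mapped to an x-coordinate (m).
xs = [(-40.0, -80.0), (-50.0, -70.0), (-60.0, -60.0), (-70.0, -50.0), (-80.0, -40.0)]
ys = [0.0, 2.0, 4.0, 6.0, 8.0]
model = fit_gbdt(xs, ys, n_trees=100, shrinkage=0.1)
print(predict(model, (-50.0, -70.0)))
```

Predicting a two-dimensional position with this sketch would mean fitting one such model per coordinate; whether the patent does so is not stated in the text.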
In an embodiment of the present application, the loss function is a Huber loss function, which takes a quantile σ as the boundary and applies two different strategies to reduce the influence of abnormal values on the prediction result: for abnormal points far from the center, an absolute-value loss function is used, while for points near the center a mean-squared-error loss function is used; the Huber loss function is as follows:
Figure BDA0002515701250000141
The choice of loss function has a great influence on the prediction accuracy of the GBDT positioning model. The invention selects the Huber loss function, which takes a quantile point σ as a boundary and applies two different strategies to reduce the influence of outliers on the prediction result. For outliers far from the center, the absolute loss function (LAD) is used, while for points near the center the mean square error loss function (LS) is applied. Choosing the Huber loss function therefore significantly reduces the influence of outliers in the fingerprint database D on the positioning result. The Huber loss function is as follows:
Figure BDA0002515701250000142
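The Huber formula itself is only available as a figure. A minimal Python sketch of the commonly used form is given below, assuming the boundary σ is taken as a quantile of the absolute residuals; the function names and the nearest-rank quantile rule are illustrative assumptions, not taken from the patent.

```python
import math


def huber_loss(y, f, sigma):
    """Squared error inside the boundary sigma, linear (absolute) outside."""
    r = y - f
    if abs(r) <= sigma:
        return 0.5 * r * r
    return sigma * (abs(r) - 0.5 * sigma)


def huber_negative_gradient(y, f, sigma):
    """Negative gradient of the Huber loss with respect to the prediction f."""
    r = y - f
    if abs(r) <= sigma:
        return r
    return sigma if r > 0 else -sigma


def quantile_boundary(residuals, q):
    """Assumed rule: sigma is the q-quantile (nearest rank) of |residual|."""
    ordered = sorted(abs(r) for r in residuals)
    index = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[index]


residuals = [1.0, -2.0, 3.0, -4.0]
sigma = quantile_boundary(residuals, 0.5)
print(sigma, huber_loss(5.0, 0.0, sigma), huber_negative_gradient(5.0, 0.0, sigma))
```

Because the gradient is clipped to ±σ outside the boundary, a single badly measured fingerprint cannot dominate the fit of the next tree, which is the outlier-damping effect the text describes.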
3. Experimental verification
Referring to fig. 5, when the dimension of the positioning features extracted by DPC is p = 4, the GBDT loss function is the Huber function, the learning rate is 0.02, the number of classification regression trees is 126, and the maximum depth of a single classification regression tree is 4, the average positioning accuracy reaches 1.51 m, which is significantly better than the KNN indoor positioning algorithm.
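As an illustration of how these hyperparameters could be set, the sketch below uses scikit-learn's `GradientBoostingRegressor`. The patent does not name a library, so this choice is an assumption, as are the synthetic fingerprint data (a log-distance path-loss model with noise), the `alpha` quantile, and the use of one regressor per coordinate. The printed error relates only to the synthetic data and not to the 1.51 m reported above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the fingerprint database D (NOT the patent's data):
# 200 reference points in a 20 m x 20 m area, p = 4 features per point.
positions = rng.uniform(0.0, 20.0, size=(200, 2))
anchors = np.array([[0.0, 0.0], [20.0, 0.0], [0.0, 20.0], [20.0, 20.0]])
dist = np.linalg.norm(positions[:, None, :] - anchors[None, :, :], axis=2)
features = -40.0 - 20.0 * np.log10(dist + 1.0) + rng.normal(0.0, 2.0, dist.shape)

# Hyperparameters quoted in the text; alpha (the Huber quantile) is assumed.
base = GradientBoostingRegressor(
    loss="huber",
    alpha=0.9,
    learning_rate=0.02,
    n_estimators=126,
    max_depth=4,
    random_state=0,
)
# Each boosted regressor predicts a scalar, so fit one per coordinate.
model = MultiOutputRegressor(base).fit(features[:150], positions[:150])

predicted = model.predict(features[150:])
errors = np.linalg.norm(predicted - positions[150:], axis=1)
print(f"mean positioning error on synthetic data: {errors.mean():.2f} m")
```

Sweeping `max_depth` over a small range with the other settings fixed is one way to reproduce the kind of depth study described for fig. 6.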
GBDT is a boosting algorithm that generates classification regression trees iteratively to reduce the prediction bias step by step. To avoid overfitting the training samples and to improve the generalization capability of the model, the maximum depth of each classification regression tree must be limited during iteration. Fig. 6 shows how the accuracy of the positioning algorithm varies with the maximum depth of the classification regression tree.
As can be seen from fig. 6, the average positioning error of the algorithm of the present invention decreases as the maximum depth of the classification regression tree increases. When the maximum depth is 4, the curve reaches its inflection point and the average positioning error is 1.51 m. As the maximum depth increases further, the average positioning error gradually increases and the generalization capability of the model decreases.
According to another aspect of the present application, there is also provided a positioning system based on density peak clustering and gradient boosting algorithm, including:
the reference point setting module is used for setting a reference point in a preset area;
the fingerprint database establishing module is used for collecting a first received signal strength value of the reference point, forming a piece of fingerprint data together with the position coordinate of the reference point and storing the fingerprint data into a fingerprint database;
the received signal strength value acquisition module is used for acquiring a second received signal strength value of each access point acquired by the positioning terminal at a point to be positioned;
a comparison module, configured to compare the second received signal strength value with the first received signal strength value, and obtain a comparison result;
and the coordinate acquisition module is used for acquiring the position coordinate of the to-be-positioned point according to the comparison result.
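In the simplest reading, the comparison and coordinate acquisition modules amount to a nearest-fingerprint lookup. The sketch below assumes Euclidean distance in signal space; the function names and the three sample fingerprints are hypothetical and serve only to illustrate the data flow between the modules.

```python
import math


def build_fingerprint_database(reference_points):
    """Store each reference point as (first RSS vector, position coordinates)."""
    return [(tuple(rss), tuple(xy)) for rss, xy in reference_points]


def locate(database, second_rss):
    """Return the coordinates of the fingerprint closest to the observed RSS."""
    _, coordinates = min(database, key=lambda fp: math.dist(fp[0], second_rss))
    return coordinates


# Hypothetical fingerprints: RSS in dBm from three access points, (x, y) in m.
database = build_fingerprint_database([
    ((-40.0, -70.0, -65.0), (0.0, 0.0)),
    ((-70.0, -42.0, -60.0), (10.0, 0.0)),
    ((-66.0, -61.0, -41.0), (5.0, 8.0)),
])
estimate = locate(database, (-68.0, -45.0, -62.0))
print(estimate)  # (10.0, 0.0)
```

A trained regression model can replace `locate` without changing the other modules, since both map a received signal strength vector to position coordinates.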
Compared with the prior art, the method has the following beneficial effects:
The invention provides a WIFI indoor positioning algorithm based on density peak clustering (DPC) and a gradient boosting decision tree (GBDT). The algorithm first uses DPC to extract the main positioning features from the original location fingerprints, removing redundancy and noise. It then constructs a GBDT positioning model using a forward stagewise algorithm and an additive model to calculate the position coordinates of the point to be measured. With a limited AP set, the algorithm significantly reduces the indoor positioning error compared with the KNN (K-Nearest Neighbor) algorithm and improves the positioning accuracy by 20%. In addition, fewer APs are required to reach the same positioning accuracy.
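The DPC step can be illustrated with a short pure-Python sketch. It assumes the commonly published form of density peak clustering: a Gaussian-kernel local density ρ, a distance δ to the nearest sample of higher density, and cluster centers chosen by the largest ρ·δ products. The patent's own formulas are given only as figures, so the exact kernel, the function name and the toy data are assumptions.

```python
import math


def density_peak_centers(points, mu, n_centers):
    """Return (center indices, rho, delta) for a list of coordinate tuples."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Cutoff distance d_c: the round(N * mu)-th smallest pairwise distance.
    pairwise = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    rank = min(len(pairwise), max(1, round(len(pairwise) * mu)))
    d_c = pairwise[rank - 1]
    if d_c == 0:
        raise ValueError("cutoff distance is zero; remove duplicate samples")

    # Gaussian-kernel local density of each sample.
    rho = [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Distance to the nearest sample with higher density; the densest
    # sample gets its largest distance to any other sample instead.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))

    # Samples with the largest rho * delta are taken as cluster centers.
    order = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    return order[:n_centers], rho, delta


# Toy 3-D samples: two tight groups, each with one central point.
group_a = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
           (-0.1, 0.0, 0.0), (0.0, -0.1, 0.0)]
group_b = [(5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0),
           (4.9, 5.0, 5.0), (5.0, 4.9, 5.0)]
centers, rho, delta = density_peak_centers(group_a + group_b, mu=0.2, n_centers=2)
print(sorted(centers))
```

Only samples that are both dense and far from any denser sample score highly, which is why isolated noisy readings are not selected as centers.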
Referring to fig. 7, the present application further provides a computer device including a memory, a processor, and a computer program stored in the memory and executable by the processor, wherein the processor implements any one of the methods described above when executing the computer program.
Referring to fig. 8, the present application further provides a computer-readable storage medium (a non-volatile readable storage medium) in which a computer program is stored; when executed by a processor, the computer program implements any one of the methods described above.
The present application further provides a computer program product comprising computer readable code which, when executed by a computer device, causes the computer device to perform any one of the methods described above.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and they may alternatively be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, or fabricated separately as individual integrated circuit modules, or fabricated as a single integrated circuit module from multiple modules or steps. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A positioning method based on density peak clustering and a gradient boosting algorithm, characterized by comprising the following steps:
setting a reference point in a preset area;
collecting a first received signal strength value of the reference point, forming a piece of fingerprint data together with the position coordinate of the reference point, and storing the fingerprint data into a fingerprint database;
acquiring a second received signal strength value of each access point, collected by the positioning terminal at the point to be positioned;
comparing the second received signal strength value with the first received signal strength value, and obtaining a comparison result;
and obtaining the position coordinates of the to-be-positioned point according to the comparison result.
2. The method of claim 1, wherein collecting the first received signal strength value of the reference point, forming a piece of fingerprint data together with the position coordinates of the reference point, and storing the fingerprint data into the fingerprint database comprises:
establishing a fingerprint database by adopting a density peak clustering algorithm;
calculating, through the density peak clustering algorithm, the Gaussian kernel local density and the distance of each sample in the space to which it belongs;
screening, through the density peak clustering algorithm, the samples for which both values are high, and taking them as cluster centers;
and taking the sample value of the cluster center as the first received signal strength value of the reference point.
3. The method of claim 2, wherein screening through the density peak clustering algorithm and taking the samples for which both values are high as cluster centers comprises:
the received signal strength data from reference point k to signal receiving points l, m, n, after preliminary filtering, are:
Figure FDA0002515701240000011
taking the data of point l as the X coordinate of a three-dimensional coordinate system, the data of point m as the Y coordinate, and the data of point n as the Z coordinate, with τ denoting the recording frequency, the samples are represented as points distributed in the three-dimensional space S:
Figure FDA0002515701240000021
the Euclidean distance from sample i to sample j in space S is defined as $d_{ij}$, and the two attributes of sample i, the Gaussian kernel local density $\rho_i$ and the distance $\delta_i$, are defined as follows:
Figure FDA0002515701240000022
Figure FDA0002515701240000023
wherein $d_c$ is the truncation distance, and $\rho_i$ represents the number of samples in space S whose distance from sample i is smaller than $d_c$;
the distances $d_{ij}$ in space S number $N = n(n-1)/2$ in total and, sorted in ascending order, are:
$d_1 \le d_2 \le \ldots \le d_N$; $d_c = d_{f(N\mu)}$
wherein $f(N\mu)$ represents the integer obtained by rounding $N\mu$, and $\mu \in (0, 1)$ is a given parameter;
for each sample i in the space S, the pair $(\rho_i, \delta_i)$, $i \in I_S$, is obtained from the above formulas; the pairs $\{(\rho_i, \delta_i)\}$ of all samples in the space are plotted on a two-dimensional plane with ρ as the horizontal axis and δ as the vertical axis, and the samples satisfying $\max\{\rho_i \cdot \delta_i\}$, $i = 1, 2, \ldots, n$, are selected as cluster centers.
4. The method of claim 3, further comprising: constructing a positioning model on the basis of a gradient boosting algorithm, wherein the gradient boosting algorithm is an algorithm that uses an additive model and continuously reduces the residuals generated during training in order to classify or regress data.
5. The method of claim 4, wherein constructing the positioning model on the basis of the gradient boosting algorithm comprises:
establishing a mapping relation between fingerprint data and physical position coordinates through the gradient boosting algorithm, taking the fingerprint database D as the input space, and initializing a classification regression tree:
Figure FDA0002515701240000031
wherein $y_i$ represents the physical position coordinates of the i-th reference point; τ is the output value of a leaf node of the classification regression tree, namely the predicted value of the position coordinates of the i-th reference point; N is the number of fingerprint samples; L is the loss function of the model;
the negative gradient of the loss function, evaluated at the current model, is used in place of the residual as an approximation of the error to fit the next classification regression tree; for the current model $F_{m-1}(x)$, the negative gradient value of the loss function is expressed as:
Figure FDA0002515701240000032
the input space of the m-th classification regression tree is $\Phi = \{(x_1, \alpha_{m1}), (x_2, \alpha_{m2}), \ldots, (x_N, \alpha_{mN})\}$;
calculating the output value of each leaf node through a line search:
Figure FDA0002515701240000033
fitting the next classification regression tree with the negative gradient of the loss function at the current model as the approximate error, the final positioning model of the gradient boosting algorithm being:
Figure FDA0002515701240000034
wherein M is the total number of classification regression trees generated by iteration; during iteration, each classification regression tree is multiplied by a regularization coefficient $\lambda_m$, with value range $(0, 1]$, to avoid overfitting the training data; $\tau_{mj}$ is the output value of a leaf node of the classification regression tree; $I$ is the indicator function, which takes the value 1 when $x \in \beta_{mj}$ and 0 otherwise.
6. The positioning method based on density peak clustering and gradient boosting algorithm according to claim 5, wherein the loss function is a Huber loss function, which takes a quantile point σ as a boundary and applies two different strategies to reduce the influence of outliers on the prediction result: for outliers far from the center, an absolute value loss function is used, and for points near the center, a mean square error loss function is used; the Huber loss function is as follows:
Figure FDA0002515701240000041
7. The method of claim 1, wherein comparing the second received signal strength value with the first received signal strength value and obtaining the comparison result comprises:
comparing the second received signal strength value with the first received signal strength values in the fingerprint database in sequence;
and taking the position coordinates corresponding to the first received signal strength value closest to the second received signal strength value as the position coordinates of the point to be positioned.
8. A positioning system based on density peak clustering and gradient boosting algorithm is characterized by comprising:
the reference point setting module is used for setting a reference point in a preset area;
the fingerprint database establishing module is used for collecting a first received signal strength value of the reference point, forming a piece of fingerprint data together with the position coordinate of the reference point and storing the fingerprint data into a fingerprint database;
the received signal strength value acquisition module is used for acquiring a second received signal strength value of each access point acquired by the positioning terminal at a point to be positioned;
a comparison module, configured to compare the second received signal strength value with the first received signal strength value, and obtain a comparison result;
and the coordinate acquisition module is used for acquiring the position coordinate of the to-be-positioned point according to the comparison result.
9. The density peak clustering and gradient boosting algorithm-based positioning system according to claim 8, wherein the fingerprint database building module comprises:
the fingerprint database establishing module is used for establishing a fingerprint database by adopting a density peak value clustering algorithm;
the Gaussian kernel local density and distance calculation module is used for calculating, through the density peak clustering algorithm, the Gaussian kernel local density and the distance of each sample in the space to which it belongs;
the cluster center screening module is used for screening, through the density peak clustering algorithm, the samples for which both values are high, and taking them as cluster centers;
and the first received signal strength value determining module is used for taking the sample value of the cluster center as the first received signal strength value of the reference point.
10. The positioning system based on density peak clustering and gradient boosting algorithm according to claim 9, wherein the cluster center screening module comprises:
a filtering module, wherein the received signal strength data from reference point k to signal receiving points l, m, n, after preliminary filtering, are:
Figure FDA0002515701240000051
a three-dimensional space S distribution module for taking the data of point l as the X coordinate of a three-dimensional coordinate system, the data of point m as the Y coordinate, and the data of point n as the Z coordinate, with τ denoting the recording frequency, and representing the samples as points distributed in the three-dimensional space S:
Figure FDA0002515701240000052
a definition module for defining the Euclidean distance from sample i to sample j in space S as $d_{ij}$, and defining the two attributes of sample i, the Gaussian kernel local density $\rho_i$ and the distance $\delta_i$, as follows:
Figure FDA0002515701240000053
Figure FDA0002515701240000061
wherein $d_c$ is the truncation distance, and $\rho_i$ represents the number of samples in space S whose distance from sample i is smaller than $d_c$;
the distances $d_{ij}$ in space S number $N = n(n-1)/2$ in total and, sorted in ascending order, are:
$d_1 \le d_2 \le \ldots \le d_N$; $d_c = d_{f(N\mu)}$
wherein $f(N\mu)$ represents the integer obtained by rounding $N\mu$, and $\mu \in (0, 1)$ is a given parameter;
a drawing module for obtaining, for each sample i in the space S, the pair $(\rho_i, \delta_i)$, $i \in I_S$, from the above formulas, plotting the pairs $\{(\rho_i, \delta_i)\}$ of all samples in the space on a two-dimensional plane with ρ as the horizontal axis and δ as the vertical axis, and selecting the samples satisfying $\max\{\rho_i \cdot \delta_i\}$, $i = 1, 2, \ldots, n$, as cluster centers.
CN202010482361.XA 2020-05-29 2020-05-29 Positioning method and system based on density peak clustering and gradient lifting algorithm Pending CN111639712A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010482361.XA CN111639712A (en) 2020-05-29 2020-05-29 Positioning method and system based on density peak clustering and gradient lifting algorithm

Publications (1)

Publication Number Publication Date
CN111639712A true CN111639712A (en) 2020-09-08

Family

ID=72331047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010482361.XA Pending CN111639712A (en) 2020-05-29 2020-05-29 Positioning method and system based on density peak clustering and gradient lifting algorithm

Country Status (1)

Country Link
CN (1) CN111639712A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination