CN110213710A

CN110213710A - High-performance indoor positioning method and indoor positioning system based on random forest

Info

Publication number: CN110213710A
Application number: CN201910319905.8A
Authority: CN
Inventors: 黄鹏宇; 赵豪杰; 刘伟
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2019-04-19
Filing date: 2019-04-19
Publication date: 2019-09-06

Abstract

The invention belongs to the technical field of wireless communication and discloses a high-performance indoor positioning method based on random forest. While the location fingerprint library is being built, the method first uses the signal strength of each access point to delete access points whose signals are weak or unstable; it then uses the information gain method to select, from the remaining access points, those with better positioning performance to form the access point set. The location fingerprint library representing position features is built on this basis. Next, a clustering algorithm divides all positions in the positioning scene into position groups, and a high-precision, high-stability random forest model is built for each position group. During positioning, the user's position is determined with the random forest model. By building high-precision, high-stability random forest models, the invention effectively addresses the limited positioning accuracy, unstable positioning results, and tendency to overfit of a single decision tree model, and improves both the stability and the accuracy of positioning.

Description

High-performance indoor positioning method and indoor positioning system based on random forest

Technical field

The invention belongs to the technical field of wireless communication, and in particular relates to a high-performance indoor positioning method and an indoor positioning system based on random forest.

Background art

With the rapid development of society and the high degree of urbanisation, people's demand for spatial position information keeps growing, and positioning technology receives more and more attention. In recent years in particular, location-based services built on technologies such as GPS, the mobile Internet, and smartphones have brought great convenience to daily life. People use these services to hail taxis, order takeaway food, find banks, and even make friends, among many other applications.

Among current application scenarios of positioning technology, besides outdoor positioning, where the industry is mature, indoor positioning is drawing increasing attention from users and technology developers. According to statistics, as the economy and society develop, people spend more and more time indoors. On average, 70% of mobile phone use and 80% of data connections in a person's day now originate indoors. Indoor positioning technology has therefore become a prominent research direction. For example, the demand for position information in complex indoor environments such as large shopping malls, factories, hospitals, office buildings, and underground coal mines is urgent.

However, in indoor environments buildings block and shield GPS satellite signals, so the signals cannot be detected effectively. The GPS positioning system widely used outdoors therefore cannot provide effective service indoors, nor position information accurate enough to meet people's positioning needs there. New technical approaches are urgently needed for a thorough and effective study of the indoor positioning problem.

Current mainstream indoor positioning technologies fall into the following categories:

(1) Proximity detection: a large number of positioning base stations are deployed, and the attenuation of the base station positioning signal is used to judge whether the user is near a given base station;

(2) Multilateration: positioning base stations are deployed, the distances from the user to base stations at known positions are measured, and the user's position is determined by trilateration or multilateration;

(3) Fingerprint positioning: a fingerprint database is built for the space to be covered, and positioning is performed by comparing measured information with the fingerprint database;

(4) Dead reckoning: the user's current position is determined from a previously determined position and an estimated or known speed and direction.

Proximity detection only has to determine which base station the user terminal is near, so the required equipment is relatively simple; but because it can only tell which base station the user is near, its positioning accuracy is mediocre. Moreover, its accuracy is closely tied to base station density, so many positioning base stations must be deployed to improve accuracy. Deploying many base stations costs money and time, and day-to-day operation also incurs high maintenance costs.

Multilateration also requires base stations, and its accuracy is likewise closely tied to deployment density: the denser the base stations, the higher the accuracy. In practice this method achieves relatively high accuracy. However, because it must compute the distance between the user and the base stations, both hardware and software are relatively complex. In addition, the performance of current ranging techniques, such as ultrasound, laser, and TOA, degrades markedly on non-line-of-sight paths. Multilateration is therefore better suited to open environments with few obstacles.

Dead reckoning computes the user's position coordinates step by step with inertial navigation equipment, starting from a known initial position. Its data sampling is stable and it does not depend on external base stations. However, it presupposes a known initial position and so must be used together with another positioning technology. Inertial navigation equipment also suffers from inherent structural errors that accumulate over time.

Fingerprint positioning is deployed and used in two stages, offline and online. In the offline stage, fingerprint information must be measured one by one at a large number of test points in the positioning environment to build the fingerprint information library, so the offline workload is relatively large. In the online positioning stage, however, the user terminal only needs to collect the fingerprint information at its current position and needs no support from positioning base stations. The equipment is therefore simple and the positioning accuracy relatively high.

The discussion above shows that fingerprint positioning needs simple equipment, achieves high accuracy, and is easy to use. The present invention therefore builds on and improves this method.

The conventional fingerprint positioning algorithm comprises an offline stage (collecting fingerprint features at reference positions, grouping the reference positions, and building a decision tree within each group) and an online stage (determining the group from the Euclidean distance between the current position's fingerprint and each group, then determining the position coordinates with the decision tree of that group).

The present invention improves on conventional fingerprint positioning mainly in two respects. First, while the fingerprint database is being built, the fingerprint feature information received at each reference position is screened to remove noise and unstable information. Building the fingerprint library from the screened fingerprint information removes interfering items and effectively improves the efficiency of the decision algorithm during positioning.

Second, the idea of random forest is introduced to replace the decision tree method of the conventional fingerprint positioning algorithm. Decision trees were adopted there mainly because they classify accurately, and because their top-down tree-building classification process is easy to visualise, clearly organised, and easy to understand. Building a decision tree also does not require much training data; the computational cost is low, energy consumption is small, and the structure is simple, which suits positioning systems with strict energy constraints. At the same time, although the reference points are grouped by a clustering algorithm while the fingerprint library is built, in order to reduce the number of levels of the decision tree built in each group, in real working environments the average number of reference positions per group is sometimes still large, and grouping often fails to reduce the depth of the decision tree. If the decision tree has too many levels or too many branches, overfitting occurs easily during the decision process and degrades classification and positioning. Pruning can alleviate overfitting to some extent, but it also increases decision ambiguity, which lowers positioning accuracy, makes classification results unstable, and harms the stability of the positioning algorithm.

To address these problems of the decision tree method in real scenes, the present invention introduces the idea of random forest: after the reference points are grouped, a random forest is built from the fingerprint information of the reference points in each group. Compared with the conventional approach of building a single decision tree per group, a random forest handles high-dimensional data better, trains quickly, and can be parallelised. Relative to the decision tree method, it has no obvious disadvantage in algorithmic complexity. Moreover, a random forest resists overfitting strongly, which addresses the decision tree algorithm's tendency to overfit. Faced with the heavy noise in reference point position information and the imbalance of data sets found in real positioning environments, a random forest balances errors better; even for positioning data with some features missing, it still maintains a certain decision accuracy. The random forest method therefore adapts better to the environment.

Summary of the invention

In view of the problems in the prior art, the present invention provides a high-performance indoor positioning method based on random forest.

The invention is realised as follows. In a high-performance indoor positioning method based on random forest, while the location fingerprint library is being built, the signal strength of each access point is first used to delete access points whose signals are weak or unstable; the information gain method is then used to select, from the remaining access points, those with better positioning performance to form the access point set. The location fingerprint library representing position features is built on this basis. Next, a clustering algorithm divides all positions in the positioning scene into position groups, and a high-precision, high-stability random forest model is built for each position group. During positioning, the user's position is determined with the random forest model.

Further, the high-performance indoor positioning method based on random forest specifically comprises:

Step 1: collect the initial access point data and build the original fingerprint database of the positioning environment:

The positioning environment is divided into multiple grid cells of equal size; the coordinates of the centre of each cell are taken as the coordinates of that position, and the position cells are numbered. Initial access point data are collected at each position coordinate, that is, RSSI data are collected at each position coordinate over a period of time, and the original fingerprint database of the positioning environment is built from the RSSI data of all surrounding Wi-Fi access points collected at each position coordinate;

Step 2: build the location fingerprint library:

(1) Set the deletion threshold th for access point signal strength; compute the average signal strength of each access point collected at each position in the location database, and delete the access points whose average signal strength at each position is below the threshold; retain the access points whose average signal strength is greater than or equal to the threshold, and the remaining valid pre-selected access points form the pre-selected access point set;

(2) From the valid access point data recorded in the location database, calculate the information gain of each access point in the pre-selected access point set, sort the access points in descending order of information gain, and select the first k access points to form the final fingerprint access point set;

(3) Filter the obtained location database according to the fingerprint access point set, retaining only the data of access points contained in the final fingerprint access point set, to obtain the fingerprint of each position and form the final location fingerprint library;
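The threshold screening in sub-step (1) can be sketched as follows. This is only an illustration under stated assumptions: the database layout (position → access point → list of RSSI samples), the function name `preselect_access_points`, and the sample values are all hypothetical, and the sketch reads "below the threshold at each position" as "an access point is kept if its mean RSSI reaches th at one position or more".

```python
from statistics import mean

def preselect_access_points(fingerprint_db, th):
    """Return access points whose mean RSSI reaches `th` at some position.

    fingerprint_db: {position: {ap_name: [rssi_sample, ...]}} (assumed layout).
    An AP whose mean RSSI is below `th` at every position is dropped.
    """
    keep = set()
    for readings in fingerprint_db.values():
        for ap, samples in readings.items():
            if samples and mean(samples) >= th:
                keep.add(ap)
    return sorted(keep)

# Made-up RSSI samples (dBm) for two positions.
db = {
    "G1": {"AP1": [-50, -52], "AP2": [-90, -92]},
    "G2": {"AP1": [-60, -58], "AP2": [-88, -95], "AP3": [-70, -72]},
}
preselected = preselect_access_points(db, th=-80)
print(preselected)
```

With a threshold of -80 dBm, AP2 is weak at both positions and is removed, while AP1 and AP3 remain in the pre-selected set.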

Step 3: according to the fingerprint of each position, group the positions in the positioning environment with the k-means algorithm to build the position groups;

Step 4: build a random forest for each position group;

(1) Set the number M of decision trees in the random forest and the number K of access points in the access point set used to build each decision tree;

(2) According to the positions contained in each position group, take from the location fingerprint library the data collected at these positions to form the pre-selected training set, whose size is N;

(3) Generate M training sample sets from the pre-selected training set, each the same size as the pre-selected training set; every training sample set is drawn at random from the pre-selected training set by sampling with replacement;

(4) Randomly generate one access point set for each training sample set; that is, M access point sets are generated at random from the pre-selected access point set obtained above, each of size K; the access points within one access point set are all different, while different access point sets may share access points;

(5) Build M decision tree models from the training sample sets and the access point sets, each obtained with the C4.5 decision tree method;
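Sub-steps (3) and (4) amount to bootstrap sampling of the training data plus a random feature subset per tree. A minimal sketch, assuming the training set is a list of (fingerprint, position) pairs; the function name `make_tree_inputs` and the `seed` parameter are hypothetical and only there to make the example repeatable.

```python
import random

def make_tree_inputs(training_set, preselected_aps, M, K, seed=0):
    """Prepare the inputs of M trees: a bootstrap sample and K distinct APs each."""
    rng = random.Random(seed)
    N = len(training_set)
    plans = []
    for _ in range(M):
        # Sampling with replacement: N draws, so a record may repeat.
        sample = [training_set[rng.randrange(N)] for _ in range(N)]
        # K different APs inside one set; sets of different trees may overlap.
        ap_subset = rng.sample(preselected_aps, K)
        plans.append((sample, ap_subset))
    return plans

# Made-up records: ({AP: RSSI}, position number).
training_set = [({"AP1": -50, "AP2": -70, "AP3": -64}, 1),
                ({"AP1": -55, "AP2": -66, "AP3": -71}, 2),
                ({"AP1": -80, "AP2": -45, "AP3": -59}, 3)]
plans = make_tree_inputs(training_set, ["AP1", "AP2", "AP3"], M=4, K=2)
print(len(plans), [aps for _, aps in plans])
```

Each (sample, ap_subset) pair would then be handed to a decision tree learner; the tree construction itself is not shown here.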

Step 5: positioning stage;

(1) Given the fingerprint of the positioning sample to be located, compute the Euclidean distance between the positioning sample fingerprint and each position group, select the position group with the smallest Euclidean distance, and take it as the group containing the target position;

(2) The positioning sample moves down from the root node of each of the M decision trees of that position group until it reaches a leaf node of each tree; the leaf node reached in each decision tree is the decision result of the positioning sample on that decision tree model;

(3) From the M decision results, the final target position of the positioning sample is obtained by voting; if one result occurs most often among the M decision results, the position coordinates corresponding to that result are the final target position coordinates; if several decision results tie for the most occurrences, the mean of their position coordinates is taken as the final target position coordinates.
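The voting rule of sub-step (3), including the tie case, can be illustrated with a short sketch. The function name `vote_position` and the coordinate table are hypothetical; positions are assumed to be identified by number and mapped to (x, y) coordinates.

```python
from collections import Counter

def vote_position(decisions, coords):
    """Combine the M tree decisions into final (x, y) coordinates.

    A single most frequent position wins outright; if several positions tie
    for the most votes, the mean of their coordinates is returned.
    """
    counts = Counter(decisions)
    top = max(counts.values())
    winners = [pos for pos, c in counts.items() if c == top]
    x = sum(coords[p][0] for p in winners) / len(winners)
    y = sum(coords[p][1] for p in winners) / len(winners)
    return (x, y)

coords = {1: (1.0, 1.0), 2: (3.0, 1.0), 3: (5.0, 1.0)}  # made-up grid centres
clear_winner = vote_position([1, 1, 2], coords)
tie = vote_position([1, 2, 1, 2, 3], coords)
print(clear_winner, tie)
```

In the second call positions 1 and 2 each receive two votes, so the result is the midpoint of their coordinates.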

Further, the information gain of each access point in the pre-selected access point set is calculated as follows:

1) Calculate the uncertainty H(G) of the position information in the positioning environment:

$$H(G) = -\sum_{i=1}^{m} P(G_i) \log_2 P(G_i)$$

where G denotes the position in the positioning environment, G_i the i-th position, P(G_i) the probability that position G_i occurs, and m the number of positions in the positioning environment;

2) Calculate the uncertainty H(G | AP_i) of the position information in the positioning environment given the access point:

$$H(G \mid AP_i) = \sum_{j=1}^{N} P(AP_i = v_j)\, H(G \mid AP_i = v_j)$$

where AP_i denotes the i-th access point, v_j a value of the signal strength of AP_i, N the number of values the signal strength of AP_i takes, P(AP_i = v_j) the probability that the signal strength of AP_i takes the value v_j, and H(G | AP_i = v_j) the information entropy of the position in the positioning environment given that the signal strength of AP_i is v_j, computed with the same formula as H(G);

3) From the results of 1) and 2), calculate the information gain Gain(AP_i) of the access point:

$$Gain(AP_i) = H(G) - H(G \mid AP_i)$$

where Gain(AP_i) denotes the reduction in the uncertainty of the position information in the positioning environment when access point AP_i is known.
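The three quantities can be computed directly from labelled samples. The sketch below assumes that probabilities are estimated by relative frequency, that entropy uses base-2 logarithms, and that the RSSI values have already been reduced to a small set of discrete values; the function names and data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H = -sum(p * log2(p)) over the relative frequencies of the labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(positions, rssi_values):
    """Gain(AP) = H(G) - H(G | AP) for one AP, one RSSI value per sample."""
    n = len(positions)
    by_value = {}
    for pos, v in zip(positions, rssi_values):
        by_value.setdefault(v, []).append(pos)
    conditional = sum(len(g) / n * entropy(g) for g in by_value.values())
    return entropy(positions) - conditional

positions = ["G1", "G1", "G2", "G2"]
gain_a = information_gain(positions, [-50, -50, -70, -70])  # separates G1 from G2
gain_b = information_gain(positions, [-60, -60, -60, -60])  # same value everywhere
print(gain_a, gain_b)
```

An access point whose readings differ between positions removes all the uncertainty in this toy case, whereas one that reads the same everywhere contributes no gain.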

Further, the positions in the environment are grouped with the k-means algorithm as follows:

1) Determine the number k of position groups, arbitrarily choose k positions as the centres of the k groups, and take the fingerprint of each chosen position as the group fingerprint, where k is greater than or equal to 2;

2) Calculate the Euclidean distance from every position to the k group centres and assign each position to the nearest group. After all positions have been assigned, take the mean of the fingerprints of all members of each position group as the new group centre;

3) Repeat 2) until the group centres no longer change, at which point the position grouping is complete.
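These three steps map onto a plain k-means loop. A minimal sketch using only the standard library, assuming each fingerprint is a tuple of RSSI values; the `seed` and `max_iter` parameters and the empty-group guard are additions for the example, not part of the described procedure.

```python
import math
import random

def kmeans(fingerprints, k, seed=0, max_iter=100):
    """Group fingerprints (tuples of RSSI values) into k position groups."""
    rng = random.Random(seed)
    centres = rng.sample(fingerprints, k)          # step 1: k arbitrary positions
    groups = [[] for _ in range(k)]
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for fp in fingerprints:                    # step 2: assign to nearest centre
            j = min(range(k), key=lambda c: math.dist(fp, centres[c]))
            groups[j].append(fp)
        new_centres = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centres[j]
            for j, g in enumerate(groups)          # mean fingerprint per group
        ]
        if new_centres == centres:                 # step 3: stop when unchanged
            break
        centres = new_centres
    return centres, groups

fps = [(-50, -80), (-52, -78), (-85, -45), (-83, -47)]  # made-up fingerprints
centres, groups = kmeans(fps, k=2)
print(sorted(centres))
```

The two returned centres are the mean fingerprints of the two well-separated pairs in the sample data.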

Further, the C4.5 decision tree method is used to build multiple decision trees for each group; each decision tree is built as follows:

1) Discretise the signal strength values of each access point, that is, divide the value range of the access point into several consecutive intervals, calculate the information gain ratio of each access point after discretisation, and select the access point with the largest information gain ratio as the root node;

2) Create one branch for each interval of the access point and use that interval as the decision condition of the branch;

3) Repeat the above procedure to determine the child node attached to each branch, until the child nodes of all final branches are leaf nodes, which completes the decision tree.
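The root selection in step 1) relies on the information gain ratio, which in C4.5 is the information gain divided by the split information (the entropy of the branch sizes). The sketch below assumes fixed interval edges chosen by the caller; how the intervals are actually chosen is not specified here, and the function names and data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(items):
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())

def to_interval(value, edges):
    """Index of the interval containing `value`, for sorted interval edges."""
    return sum(value >= e for e in edges)

def gain_ratio(positions, rssi_values, edges):
    """Gain ratio of one AP after discretising its RSSI values."""
    bins = [to_interval(v, edges) for v in rssi_values]
    n = len(positions)
    by_bin = {}
    for pos, b in zip(positions, bins):
        by_bin.setdefault(b, []).append(pos)
    gain = entropy(positions) - sum(len(g) / n * entropy(g) for g in by_bin.values())
    split_info = entropy(bins)
    return gain / split_info if split_info > 0 else 0.0

positions = ["G1", "G1", "G2", "G2"]
ratio_split = gain_ratio(positions, [-50, -52, -70, -72], edges=[-60])
ratio_flat = gain_ratio(positions, [-50, -52, -51, -53], edges=[-60])
print(ratio_split, ratio_flat)
```

The access point with the largest ratio would become the root node, with one branch per interval.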

Further, the Euclidean distance between the positioning sample fingerprint and each position group is calculated as follows:

$$D(T, C_j) = \sqrt{\sum_{i=1}^{k} \left( SS_i(T) - SS_i(C_j) \right)^2}$$

where D(T, C_j) denotes the Euclidean distance between positioning sample T and the j-th position group, C_j is the group centre of the j-th position group, k denotes the number of selected access points, SS_i(T) denotes the signal strength of the i-th access point in the positioning sample, and SS_i(C_j) denotes the signal strength of the i-th access point of the j-th position group.
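Selecting the position group then reduces to a nearest-centre search under this distance. A minimal sketch, assuming fingerprints and group centres are tuples listing the same access points in the same order; `nearest_group` and the values are hypothetical.

```python
import math

def nearest_group(sample_fp, group_centres):
    """Index j of the group centre C_j with the smallest D(T, C_j)."""
    return min(range(len(group_centres)),
               key=lambda j: math.dist(sample_fp, group_centres[j]))

group_centres = [(-51.0, -79.0), (-84.0, -46.0)]  # made-up centre fingerprints
group_index = nearest_group((-55, -75), group_centres)
print(group_index)
```

`math.dist` computes exactly the square root of the summed squared differences, so no separate distance function is needed.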

Further, the positioning sample moves down from the root node along each decision tree model of the group: at each node, the value of the positioning sample for that node's feature access point determines which branch's decision condition it satisfies, and hence the direction in which it moves, until it reaches a leaf node; that leaf node is the final position of the positioning sample in that decision tree.
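The traversal can be sketched with a small nested structure. The tree representation below (a dict with an "ap" key and a list of (low, high, child) branches, with leaves stored as position labels) is a hypothetical encoding chosen for the example, not a format given in the text.

```python
def classify(node, sample):
    """Walk from the root to a leaf; each branch covers the interval [low, high)."""
    while isinstance(node, dict):
        value = sample[node["ap"]]
        for low, high, child in node["branches"]:
            if low <= value < high:
                node = child          # follow the branch whose condition holds
                break
        else:
            raise ValueError(f"no branch of {node['ap']} covers {value}")
    return node                       # a leaf: the decided position

tree = {
    "ap": "AP1",
    "branches": [
        (-100, -70, "G3"),
        (-70, -30, {"ap": "AP2",
                    "branches": [(-100, -60, "G1"), (-60, -30, "G2")]}),
    ],
}
result = classify(tree, {"AP1": -55, "AP2": -65})
print(result)
```

Running the same sample through all M trees of the group yields the M decision results that are later combined by voting.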

Another object of the present invention is to provide an indoor positioning system that applies the high-performance indoor positioning method based on random forest.

Another object of the present invention is to provide an intelligent terminal that applies the high-performance indoor positioning method based on random forest.

Another object of the present invention is to provide an unmanned aerial vehicle that applies the high-performance indoor positioning method based on random forest.

In conclusion advantages of the present invention and good effect are as follows:

(1) present invention deletes the access point of poor signal quality, jitter by access point selection.And then use letter It ceases the better access point of gain method selective positioning effect from remaining access point and constructs location fingerprint.It is good to select positioning performance Access point set can more preferably indicate position feature, be conducive to the raising of positioning accuracy.Access point selection simultaneously simplifies position The data dimension of fingerprint reduces the computation complexity of localization method；

(2) present invention divides group to all reference point locations progress position by clustering algorithm, and group is divided to reduce each position The range of choice of judging process, to reduce the algorithm complexity of location algorithm.

(3) present invention is by dividing group to establish the random forest mould being made of more decision-tree models each position Type can effectively overcome overfitting problem present in single decision-tree model.On the basis of improving positioning accuracy, also Greatly improve the stability of positioning.

Brief description of the drawings

Fig. 1 is a flow chart of the high-performance indoor positioning method based on random forest provided by an embodiment of the present invention.

Fig. 2 is an implementation flow chart of the high-performance indoor positioning method based on random forest provided by an embodiment of the present invention.

Fig. 3 is a schematic diagram of the positioning scene provided by an embodiment of the present invention.

Fig. 4 compares the errors of the present invention and two other positioning methods over repeated positioning experiments in the positioning scene provided by an embodiment of the present invention.

Fig. 5 compares the positioning results of the present invention and two other positioning methods for different numbers of groups in the positioning scene provided by an embodiment of the present invention.

Fig. 6 is a schematic diagram of one of the M decision tree models of the random forest model provided by an embodiment of the present invention.

Detailed description of the embodiments

To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.

The positioning results of the prior art are not accurate enough and not stable enough. Based on the idea of ensemble learning, the present invention builds multiple decision trees to form a random forest and combines these decision trees through a decision rule, which effectively overcomes the problems of deciding with a single decision tree. The random forest provides highly stable positioning results, needs no pruning, and does not require the overfitting problem to be handled separately.

The application principle of the invention is explained in detail below with reference to the accompanying drawings.

As shown in Fig. 1, the high-performance indoor positioning method based on random forest provided by the embodiment of the present invention comprises the following steps:

S101: collect access point data;

S102: perform access point selection, selecting the set of access points with strong signals and strong discriminative power, and build the location fingerprint library;

S103: group all positions into position groups;

S104: build a random forest model for each position group;

S105: for a positioning sample, determine the position group containing its target position;

S106: use the random forest positioning model of that position group to determine the specific position of the positioning sample.

The high-performance indoor positioning method based on random forest provided by the embodiment of the present invention specifically comprises the following steps:

1) Collect the initial access point data and build the original fingerprint database of the positioning environment;

2) Build the location fingerprint library:

2a) Set the deletion threshold th for access point signal strength. Compute the average signal strength of each access point collected at each position in the location database, and delete the access points whose average signal strength at each position is below the threshold; retain the access points whose average signal strength is greater than or equal to the threshold, and the remaining valid pre-selected access points form the pre-selected access point set;

2b) From the valid access point data recorded in the location database, calculate the information gain of each access point in the pre-selected access point set, sort the access points in descending order of information gain, and select the first k access points to form the final fingerprint access point set;

2c) Filter the location database obtained in step 1) according to the fingerprint access point set, retaining only the data of access points contained in the final fingerprint access point set, to obtain the fingerprint of each position and form the final location fingerprint library;

3) According to the fingerprint of each position, group the positions of the positioning environment with the k-means algorithm to build the position groups;

4) Build a random forest for each position group:

4a) Determine the number M of decision trees in the random forest and the number K of access points in the access point set used to build each decision tree;

4b) According to the positions contained in the position groups obtained in step 3), take from the location database the data collected at these positions to form the pre-selected training set, whose size is N;

4c) Generate M training sample sets from the pre-selected training set, each the same size as the pre-selected training set. Every training sample set is drawn at random from the pre-selected training set by sampling with replacement;

4d) Randomly generate one access point set for each training sample set. That is, M access point sets are generated at random from the pre-selected access point set obtained in step 2b), each of size K; the access points within one access point set are all different, while different access point sets may share access points;

4f) Build M decision tree models from the training sample sets and the access point sets. Each decision tree model is obtained with the C4.5 decision tree method;

5) Positioning stage:

5a) Given the fingerprint of the positioning sample to be located, compute the Euclidean distance between the positioning sample fingerprint and each position group, and select the position group with the smallest Euclidean distance as the group containing the target position;

5b) The positioning sample moves down from the root node of each of the M decision tree models of that group until it reaches a leaf node of each decision tree; the leaf node reached in each decision tree is the decision result of the positioning sample on that decision tree model;

5c) From the M decision results obtained in 5b), the final target position of the positioning sample is obtained by voting. If one result occurs most often among the M decision results, the position coordinates corresponding to that result are the final target position coordinates; if several decision results tie for the most occurrences, the mean of their position coordinates is taken as the final target position coordinates;

The application principle of the invention is further described below with reference to the accompanying drawings.

As shown in Fig. 2, the high-performance indoor positioning method based on random forest provided by the embodiment of the present invention specifically comprises the following steps:

Step 1: build the original fingerprint database;

As shown in Fig. 3, the positioning scene of this example is part of the corridor on the 4th floor of Area I of the university's teaching building, with an area of 340 m²;

In this step, the positioning environment is first divided into 177 square grid cells with a side length of 0.8 m; the coordinates of the centre of each cell represent the position coordinates of the cell, and each position is numbered, with G_i denoting the i-th position. Data are then collected at every position: the signal strength (RSSI) of each access point detected at each position is recorded, forming an original fingerprint database that records the sampled RSSI data of every position.
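The grid construction can be illustrated for the simplest layout. The sketch assumes a rectangular block of cells numbered row by row from 1, which is a simplification: the real corridor of 177 cells need not be rectangular, and the function name `grid_centres` is hypothetical.

```python
def grid_centres(n_cols, n_rows, side=0.8):
    """Map position number -> (x, y) centre of each square cell, in metres."""
    centres = {}
    for r in range(n_rows):
        for c in range(n_cols):
            number = r * n_cols + c + 1          # numbering starts at 1
            centres[number] = ((c + 0.5) * side, (r + 0.5) * side)
    return centres

centres = grid_centres(n_cols=3, n_rows=2)
print(len(centres), centres[1], centres[4])
```

For a non-rectangular area, the same centre formula would simply be applied to the list of cells that lie inside the corridor.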

Step 2, access point selection；

When this step carries out access point selection, it is divided into following two step and carries out:

The threshold value th of access point signals intensity 2a) is set.It is calculated according to location database collected each on each position The average signal strength of access point deletes the access point that average signal strength in each position is less than threshold value；It is strong to retain average signal Degree is more than or equal to the access point of threshold value, filters out pre-selection access point, constitutes pre-selection access point set；

The information gain that access point in pre-selection access point set 2b) is calculated based on Information Gain Method, by information gain Descending sorts to access point, and maximum 15 access points of information gain is selected to constitute access point set, the information gain of access point Calculating process is as follows:

2b1) calculate the uncertainty H (G) of location information in localizing environment:

Wherein, G indicates the position in localizing environment, G_iIndicate i-th of position, P (G_i) indicate position G_iThe probability of appearance, 177 be the number of position in localizing environment；

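2b2) Calculate the uncertainty H(G | AP_i) of the location information in the positioning environment given a known access point AP_i:

$$H(G \mid AP_i) = \sum_{j=1}^{N} P(AP_i = v_j)\, H(G \mid AP_i = v_j)$$

where v_j denotes a value of the signal strength of AP_i, N denotes the number of values taken by the signal strength of AP_i, P(AP_i = v_j) is the probability that the signal strength of AP_i takes the value v_j, and H(G | AP_i = v_j), calculated in the same way as H(G), is the entropy of the position given that the signal strength of AP_i is v_j.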

2b3) From the results of 2b1) and 2b2), calculate the information gain Gain(AP_i) of the access point:

Gain(AP_i) = H(G) - H(G | AP_i)

where Gain(AP_i) denotes the reduction in the uncertainty of the location information in the positioning environment when access point AP_i is known. The larger the information gain, the greater the reduction in position uncertainty and the stronger the access point's ability to discriminate between positions.
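The two sub-steps of Step 2 can be sketched as follows. This is only an illustration under stated assumptions: RSSI is taken to be in dBm (negative values), an access point is assumed to be dropped when its mean RSSI averaged over all positions is below th, signal strengths are assumed to be already discretised, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def preselect(mean_rssi, th):
    """Step 2a sketch. mean_rssi: {ap: {position: mean RSSI in dBm}}.

    Assumption: an AP is kept when its mean over positions is >= th.
    """
    keep = []
    for ap, per_pos in mean_rssi.items():
        if sum(per_pos.values()) / len(per_pos) >= th:
            keep.append(ap)
    return keep


def entropy(labels):
    """Shannon entropy (base 2) of a list of labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def info_gain(positions, values):
    """Gain(AP) = H(G) - H(G | AP) for one access point.

    positions[i] is the position label of sample i,
    values[i] is the (discretised) RSSI of the AP in sample i.
    """
    by_value = defaultdict(list)
    for g, v in zip(positions, values):
        by_value[v].append(g)
    n = len(positions)
    conditional = sum(len(gs) / n * entropy(gs) for gs in by_value.values())
    return entropy(positions) - conditional


def select_aps(positions, samples, top=15):
    """Step 2b sketch. samples: {ap: [value per sample]}.

    Returns the `top` access points in descending order of information gain.
    """
    ranked = sorted(samples, key=lambda ap: info_gain(positions, samples[ap]),
                    reverse=True)
    return ranked[:top]


# Hypothetical data: AP1 separates the two positions, AP2 does not.
positions = ["G1", "G1", "G2", "G2"]
samples = {"AP1": [-60, -60, -80, -80], "AP2": [-70, -70, -70, -70]}
best = select_aps(positions, samples, top=1)
```

In this toy data AP1 removes all position uncertainty while AP2 removes none, so AP1 ranks first.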

Step 3: construct the location fingerprint database.

The access point signal strength database obtained in Step 1 is filtered according to the fingerprint access point set: only the data of access points contained in the fingerprint access point set are retained. This yields the fingerprint of each position and forms the final location fingerprint database.

Step 4: position grouping.

In this step, the positions are grouped according to their fingerprints using the classical k-means clustering method, as follows:

4a) Determine the number of groups k. In the positioning environment of this example, shown in Fig. 3, the number of groups is set to 4. Four positions are chosen arbitrarily as the centers of the 4 groups, and the fingerprints of these positions are used as the group fingerprints.

4b) Calculate the Euclidean distance from every position to each of the 4 group centers and assign each position to the nearest group. After all positions have been assigned, take the mean of the fingerprints of all members of each position group as the new group center.

4c) Repeat step 4b) until the group center fingerprints no longer change, at which point position grouping is complete.
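Steps 4a) to 4c) can be sketched in plain Python as below. The fingerprint layout (one list of RSSI values per position), the `seed` and `max_iter` parameters, and the handling of an empty group are assumptions added for this illustration and are not specified in the text.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two fingerprints of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(fingerprints, k, seed=0, max_iter=100):
    """fingerprints: {position: [rssi per AP]}. Returns (assignment, centers)."""
    rng = random.Random(seed)
    # 4a) pick k positions at random; their fingerprints are the initial centers
    centers = [list(fingerprints[p]) for p in rng.sample(sorted(fingerprints), k)]
    assign = {}
    for _ in range(max_iter):
        # 4b) assign every position to the nearest group center
        assign = {p: min(range(k), key=lambda j: dist(fp, centers[j]))
                  for p, fp in fingerprints.items()}
        new_centers = []
        for j in range(k):
            members = [fingerprints[p] for p in assign if assign[p] == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # assumption: keep an empty group's center
        # 4c) stop once the group centers no longer change
        if new_centers == centers:
            break
        centers = new_centers
    return assign, centers


# Hypothetical fingerprints over two access points.
fingerprints = {1: [-60, -80], 2: [-61, -79], 3: [-85, -55], 4: [-84, -56]}
assign, centers = kmeans(fingerprints, k=2)
```

With these four toy fingerprints, positions 1 and 2 end up in one group and positions 3 and 4 in the other, whichever two positions are drawn as initial centers.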

Step 5: build a random forest model for each position group.

5a) Determine the number M of decision trees in the random forest and the number K of access points in the access point set required to build each decision tree.

5b) According to the positions contained in the group, obtain from the location database the data collected at these positions during the sampling phase to form the pre-selection training set, whose size is N.

5c) Generate M training sample sets from the pre-selection training set, each of the same size as the pre-selection training set. Each training sample set is drawn at random from the pre-selection training set by sampling with replacement.

5d) Randomly generate one access point set for each training sample set. The M access point sets are generated at random from the pre-selection access point set, and each has size K. The access points within one set are all different, whereas different sets may contain the same access points.

5f) Build M decision tree models from the training sample sets and access point sets. Each decision tree model is obtained with the C4.5 decision tree method. C4.5 selects the access point at each node of the decision tree according to the information gain ratio of the access points, producing the decision model that reduces position uncertainty fastest. The process is as follows:

5f1) Discretize the signal strength values of each access point, i.e. divide the value range of the access point into several contiguous intervals. In this example the value range is divided into two intervals, although other numbers may be used. Calculate the information gain ratio of each access point after discretization, and select the access point with the largest information gain ratio as the root node.

5f2) Let each interval of the access point correspond to one branch, with the interval serving as the decision condition of that branch.

5f3) Repeat the above process to determine the child node attached to each branch, until the child nodes of all branches are leaf nodes, which completes the decision tree.
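The sampling in 5c) and 5d) and the root selection in 5f1) can be sketched as follows. This is a partial illustration, not a full C4.5 implementation: the recursion of 5f3) is omitted, each access point is split into two intervals at a midpoint between adjacent observed values (an assumed discretisation rule), and all function names and sample values are hypothetical.

```python
import math
import random
from collections import Counter


def bootstrap(train, rng):
    """5c) draw len(train) records from train with replacement."""
    return [rng.choice(train) for _ in train]


def random_ap_subset(aps, K, rng):
    """5d) draw K distinct access points from the pre-selected set."""
    return rng.sample(aps, K)


def entropy(labels):
    """Shannon entropy (base 2) of a list of labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(samples, ap, threshold):
    """Gain ratio of a two-interval split of `ap` at `threshold`.

    samples: list of (features_dict, position_label).
    """
    left = [g for x, g in samples if x[ap] <= threshold]
    right = [g for x, g in samples if x[ap] > threshold]
    if not left or not right:
        return 0.0
    n = len(samples)
    labels = [g for _, g in samples]
    gain = entropy(labels) - (len(left) / n * entropy(left)
                              + len(right) / n * entropy(right))
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info


def best_root(samples, aps):
    """5f1) return (ratio, ap, threshold) with the largest gain ratio."""
    best = (0.0, None, None)
    for ap in aps:
        values = sorted({x[ap] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate cut between adjacent observed values
            r = gain_ratio(samples, ap, t)
            if r > best[0]:
                best = (r, ap, t)
    return best


# Hypothetical training records for one position group.
train = [
    ({"AP1": -60, "AP2": -70}, "G1"),
    ({"AP1": -62, "AP2": -71}, "G1"),
    ({"AP1": -80, "AP2": -70}, "G2"),
    ({"AP1": -82, "AP2": -71}, "G2"),
]
rng = random.Random(0)
sample_set = bootstrap(train, rng)
ap_set = random_ap_subset(["AP1", "AP2", "AP3"], 2, rng)
root = best_root(train, ["AP1", "AP2"])
```

Repeating `bootstrap` and `random_ap_subset` M times would give the M pairs of training sample set and access point set from which the M trees are grown.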

Step 6: determine the position group to which the positioning sample belongs.

For the sample data to be located, calculate the Euclidean distance from the data to each position group. In this example, 4 Euclidean distances are obtained in the positioning environment shown in Fig. 3. The position group with the smallest Euclidean distance to the positioning sample is selected as the group containing the target position. The Euclidean distance is calculated as follows:

$$D(T, C_j) = \sqrt{\sum_{i=1}^{k} \left( SS_i(T) - SS_i(C_j) \right)^2}$$

where D(T, C_j) denotes the Euclidean distance between the positioning sample T and the j-th position group, C_j is the group center of the j-th position group, k is the number of access points, SS_i(T) denotes the signal strength of the i-th access point in the positioning sample, and SS_i(C_j) denotes the signal strength of the i-th access point of the j-th position group.
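A minimal sketch of Step 6, assuming the positioning sample and each group center are lists of signal strengths in the same access point order; the function names and the numeric values are illustrative only.

```python
import math


def euclidean(sample, center):
    """D(T, C_j): Euclidean distance between a sample and a group center."""
    return math.sqrt(sum((s - c) ** 2 for s, c in zip(sample, center)))


def nearest_group(sample, centers):
    """Index j of the group center C_j that minimises D(T, C_j)."""
    return min(range(len(centers)), key=lambda j: euclidean(sample, centers[j]))


# Hypothetical group centers over two access points.
centers = [[-60, -80], [-85, -55]]
group = nearest_group([-62, -78], centers)
```

Only the random forest of the selected group is consulted afterwards, so this step restricts the candidate positions before any tree is evaluated.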

Step 7: determine the specific position of the positioning sample.

7a) The positioning sample is passed down each of the M decision tree models of the selected group, starting from the root node, until it reaches a leaf node of each decision tree. The leaf node reached in each decision tree is the decision result for the positioning sample on that decision tree model.

For example, Table 1 shows a positioning sample, and Fig. 6 shows one of the decision tree models in the random forest model of the group containing the positioning sample. The sample is located in this decision tree model as follows:

7a1) The root node of the decision tree model is AP4, so it is first determined which branch's decision condition the AP4 value of the positioning sample satisfies. The sample satisfies the decision condition of the rightmost branch of node AP4, so it moves down along that branch.

7a2) On reaching node AP1, the AP1 value of the positioning sample is 63, which satisfies the decision condition of the rightmost branch of that node, so the sample moves down along the rightmost branch of AP1.

7a3) On moving down from AP1 to node AP6, the AP6 value of the positioning sample is 57, which satisfies the decision condition of the leftmost branch of that node, so the sample moves down along the leftmost branch of AP6.

7a4) Moving down from AP6, the sample reaches node G6, which is a leaf node, so the target position of the positioning sample in this decision tree model is G6.

Table 1

AP1	AP2	AP3	AP4	AP5	AP6
63	56	45	70	61	57

7b) From the M decision results obtained in 7a), the present invention obtains the final target position of the positioning sample by voting. If a single result occurs most often among the M decision results, the position coordinates corresponding to that result are the final target position coordinates. If several decision results tie for the most occurrences, the mean of their position coordinates is taken as the final target position coordinates.
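The tree walk of 7a) and the vote of 7b) can be sketched as follows. The tree below is hypothetical: it only reproduces the path AP4, AP1, AP6, G6 described for the Table 1 sample, while the thresholds, the other leaves, the two-branch node layout, and the position coordinates are invented for illustration and are not taken from Fig. 6.

```python
from collections import Counter

# Hypothetical binary tree; thresholds and leaves other than G6 are invented.
TREE = {"ap": "AP4", "threshold": 65,
        "left": "G1",
        "right": {"ap": "AP1", "threshold": 60,
                  "left": "G3",
                  "right": {"ap": "AP6", "threshold": 58,
                            "left": "G6", "right": "G7"}}}


def classify(tree, sample):
    """7a) walk from the root to a leaf; go left when value <= threshold."""
    node = tree
    while isinstance(node, dict):
        if sample[node["ap"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node


def vote(decisions, coords):
    """7b) majority vote; on a tie, average the coordinates of the tied positions."""
    counts = Counter(decisions)
    top = max(counts.values())
    winners = [g for g, c in counts.items() if c == top]
    xs = [coords[g][0] for g in winners]
    ys = [coords[g][1] for g in winners]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Positioning sample from Table 1.
sample = {"AP1": 63, "AP2": 56, "AP3": 45, "AP4": 70, "AP5": 61, "AP6": 57}
leaf = classify(TREE, sample)

# Hypothetical coordinates (metres) for two positions.
coords = {"G5": (0.0, 0.0), "G6": (2.0, 0.0)}
final = vote(["G6", "G6", "G5"], coords)
```

Averaging the coordinates of tied positions, as the text specifies, means the final estimate may fall between grid centers rather than on one of them.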

The application effect of the present invention is explained in detail below with reference to experiments.

Experiment 1: In the actual indoor experimental scene shown in Fig. 3, each experiment collects test data at 33 reference points, and 20 experiments are carried out in total. The experimental results are shown in Fig. 4. Over these 20 experiments, the indoor positioning method of the present invention has the smallest average positioning error, 1.3718 m; the indoor positioning method based on multiple-AP selection has an average positioning error of 1.4669 m; and the indoor positioning method based on information gain has the largest, 1.7074 m. The variance of the positioning results over the 20 experiments was then calculated: 0.0173 for the method of the present invention, 0.1445 for the method based on multiple-AP selection, and 0.0842 for the method based on information gain. The experimental results show that the indoor positioning method of the present invention has higher positioning accuracy and better positioning stability, which is consistent with the foregoing theoretical analysis.
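To put the reported figures of Experiment 1 side by side, the short calculation below derives the relative reduction in mean error and in variance of the present method against each comparison method. The input numbers are those stated above; the dictionary keys and the helper name `relative_reduction` are labels chosen for this sketch.

```python
# Mean positioning error (metres) and variance over 20 experiments, from the text.
mean_err = {"proposed": 1.3718, "multi_ap": 1.4669, "info_gain": 1.7074}
variance = {"proposed": 0.0173, "multi_ap": 0.1445, "info_gain": 0.0842}


def relative_reduction(baseline, ours):
    """Fractional reduction of `ours` relative to `baseline`."""
    return (baseline - ours) / baseline


for name in ("multi_ap", "info_gain"):
    err_cut = relative_reduction(mean_err[name], mean_err["proposed"])
    var_cut = relative_reduction(variance[name], variance["proposed"])
    print(f"{name}: error reduced by {err_cut:.1%}, variance reduced by {var_cut:.1%}")
```

On these figures the mean error is roughly 6.5% lower than that of the multiple-AP selection method and roughly 19.7% lower than that of the information gain method, while the variance reduction is considerably larger in both cases.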

Experiment 2: In the actual indoor positioning scene shown in Fig. 3, the influence of the number of position groups on the three positioning methods is tested, and the positioning performance of the three methods is compared under the same conditions. The test results are shown in Fig. 5. As the results in Fig. 5 show, under the same experimental conditions the positioning performance of the present invention is always better than that of the other two positioning methods, demonstrating that the stability of the method of the present invention is superior to that of the comparison methods.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A high-performance indoor positioning method based on random forest, characterized in that the method comprises access point selection: access points with poor signal quality and unstable signals are first removed according to their signal strength; the information gain method is then used to select, from the remaining access points, the access points with better positioning performance, yielding an access point set with good positioning performance; a location fingerprint database representing position features is established; all positions are grouped by a clustering algorithm; and a random forest model with high positioning accuracy and good stability is built for each position group.

2. The high-performance indoor positioning method based on random forest according to claim 1, characterized in that the method specifically comprises:

First step, divide the positioning environment into multiple grid cells of identical size, take the center coordinates of each cell as the coordinates of that position, and number the cells; collect initial access point data at each position coordinate, i.e. collect RSSI data for a period of time at each position coordinate, and construct the original fingerprint database of the positioning environment from the RSSI data of all surrounding WiFi access points collected at each position coordinate;

Second step, construct the location fingerprint database:

(1) set a deletion threshold th for access point signal strength; calculate the average signal strength of each access point collected at each position in the location database, and delete the access points whose average signal strength at each position is less than the threshold; retain the access points whose average signal strength is greater than or equal to the threshold, the remaining valid pre-selected access points constituting the pre-selection access point set;

(2) according to the valid access point data recorded in the location database, calculate the information gain of each access point in the pre-selection access point set, sort the access points in descending order of information gain, and select the first k access points to constitute the final fingerprint access point set;

(3) filter the location database according to the fingerprint access point set, retaining only the data of access points contained in the final fingerprint access point set, to obtain the fingerprint of each position and constitute the final location fingerprint database;

Third step, group the positions in the positioning environment according to the fingerprint of each position using the k-means algorithm, to construct the position groups;

Fourth step, build a random forest for each position group:

(1) set the number M of decision trees in the random forest and the number K of access points in the access point set required to build each decision tree;

(2) according to the positions contained in each position group, obtain from the location fingerprint database the data collected at these positions to constitute a pre-selection training set of size N;

(3) generate M training sample sets from the pre-selection training set, each of the same size as the pre-selection training set; each training sample set is drawn at random from the pre-selection training set by sampling with replacement;

(4) randomly generate one access point set for each training sample set; the M access point sets are generated at random from the obtained pre-selection access point set, each of size K; the access points within one set are all different, whereas different sets may contain the same access points;

(5) build M decision tree models from the training sample sets and access point sets, each decision tree model being obtained with the C4.5 decision tree method;

Fifth step, positioning stage:

(1) for the fingerprint of the positioning sample to be located, calculate the Euclidean distance from the positioning sample fingerprint to each position group, select the position group with the smallest Euclidean distance, and take that group as the group containing the target position;

(2) pass the positioning sample down each of the M decision trees of that position group, starting from the root node, until it reaches a leaf node of each decision tree, the leaf node reached in each decision tree being the decision result for the positioning sample on that decision tree model;

(3) from the M decision results obtained, determine the final target position of the positioning sample by voting; if a single result occurs most often among the M decision results, the position coordinates corresponding to that result are the final target position coordinates; if several decision results tie for the most occurrences, the mean of their position coordinates is taken as the final target position coordinates.

3. The high-performance indoor positioning method based on random forest according to claim 2, characterized in that the information gain of each access point in the pre-selection access point set is calculated as follows:

$$H(G) = -\sum_{i=1}^{m} P(G_i) \log_2 P(G_i)$$

where G denotes the position in the positioning environment, G_i denotes the i-th position, P(G_i) denotes the probability that position G_i occurs, and m denotes the number of positions in the positioning environment;

$$H(G \mid AP_i) = \sum_{j=1}^{N} P(AP_i = v_j)\, H(G \mid AP_i = v_j)$$

where AP_i denotes the i-th access point, v_j denotes a value of the signal strength of AP_i, N denotes the number of values taken by the signal strength of AP_i, and H(G | AP_i = v_j) denotes the entropy of the position in the positioning environment given that the signal strength of AP_i is v_j, calculated with the same formula as H(G);

Gain(AP_i) = H(G) - H(G | AP_i);

where Gain(AP_i) denotes the reduction in the uncertainty of the location information in the positioning environment when access point AP_i is known.

4. The high-performance indoor positioning method based on random forest according to claim 2, characterized in that grouping the positions in the environment with the k-means algorithm is carried out as follows:

1) determine the number of position groups k, choose any k positions as the centers of the k groups, and use the fingerprints of these positions as the group fingerprints, where k is greater than or equal to 2;

2) calculate the Euclidean distance from every position to each of the k group centers and assign each position to the nearest group; after all positions have been assigned, take the mean of the fingerprints of all members of each position group as the new group center;

5. The high-performance indoor positioning method based on random forest according to claim 2, characterized in that multiple decision trees are built for each group using the C4.5 decision tree method, each decision tree being built as follows:

1) discretize the signal strength values of each access point, i.e. divide the value range of the access point into several contiguous intervals, calculate the information gain ratio of each access point after discretization, and select the access point with the largest information gain ratio as the root node;

2) let each interval of the access point correspond to one branch, with the interval serving as the decision condition of that branch;

3) repeat the above process to determine the child node attached to each branch, until the child nodes of all branches are leaf nodes, which completes the decision tree.

6. The high-performance indoor positioning method based on random forest according to claim 2, characterized in that the Euclidean distance from the positioning sample fingerprint to each position group is calculated as follows:

$$D(T, C_j) = \sqrt{\sum_{i=1}^{k} \left( SS_i(T) - SS_i(C_j) \right)^2}$$

where D(T, C_j) denotes the Euclidean distance between the positioning sample T and the j-th position group, C_j is the group center of the j-th position group, k denotes the number of access points of the selected position group, SS_i(T) denotes the signal strength of the i-th access point in the positioning sample, and SS_i(C_j) denotes the signal strength of the i-th access point of the j-th position group.

7. The high-performance indoor positioning method based on random forest according to claim 2, characterized in that, when the positioning sample moves down a decision tree model of the selected group from the root node, its direction of movement is determined by which branch's decision condition is satisfied by the value of the corresponding feature access point in the positioning sample; the sample moves in this way until it reaches a leaf node, and that leaf node is the final position of the positioning sample in that decision tree.

8. An indoor positioning system using the high-performance indoor positioning method based on random forest according to any one of claims 1 to 7.

9. An intelligent terminal using the high-performance indoor positioning method based on random forest according to any one of claims 1 to 7.

10. An unmanned aerial vehicle using the high-performance indoor positioning method based on random forest according to any one of claims 1 to 7.