CN117221816A

CN117221816A - Multi-building floor positioning method based on Wavelet-CNN

Info

Publication number: CN117221816A
Application number: CN202311204239.6A
Authority: CN
Inventors: 毛永毅; 王晓甜
Original assignee: Xian University of Posts and Telecommunications
Current assignee: Xian University of Posts and Telecommunications
Priority date: 2023-09-19
Filing date: 2023-09-19
Publication date: 2023-12-12

Abstract

The invention relates to a multi-building floor positioning method based on Wavelet-CNN, belonging to the field of multi-building floor positioning. First, data acquired offline are preprocessed by wavelet transform, normalization and other operations. The vectors of preprocessed RSSI values are then converted into grayscale images that serve as the training set; a CNN model is trained on these images, extracts the relevant features of the fingerprint data, and establishes a feature map. Finally, the features extracted from RSSI data acquired in real time are matched against the trained feature map to estimate the final position.

Description

Multi-building floor positioning method based on Wavelet-CNN

Technical Field

The invention belongs to the field of multi-building floor positioning, and particularly relates to a multi-building floor positioning method based on Wavelet-CNN.

Background

With the development of the internet and mobile communication technologies, the demand for indoor location-based services is increasing, and navigation and positioning services have become an important part of many people's lives. Global navigation satellite systems (GNSS) are among the most important positioning tools because of their real-time performance and high accuracy, and they are widely used to determine position outdoors. However, people spend most of their time indoors, where satellite signals cannot penetrate walls and line of sight (LOS) is lacking, so GNSS is of limited use for indoor positioning and cannot provide indoor positioning services in daily life. Wi-Fi, Bluetooth, radio frequency identification (RFID), infrared and ultra-wideband (UWB) technologies have been developed and used for indoor positioning; among these, Wi-Fi is the most popular viable technology because of its cost-effectiveness.

Wi-Fi based positioning stands out among these methods because of its wide deployment and simplicity. Fingerprint-based indoor positioning in particular is widely used because it is easy to use and requires no additional hardware. Fingerprinting techniques use the Received Signal Strength Indication (RSSI) or Channel State Information (CSI) to predict target locations. A CSI-based positioning system uses information about the communication link, including the rank indication, knowledge of the encoder matrix and knowledge of the channel quality, to determine the target position. An RSS-based positioning system, by contrast, uses only the received signal strengths (RSS) collected from multiple MAC addresses to determine the target location. CSI therefore contains more information and is more robust than RSSI-based methods. However, CSI requires advanced network interface cards, which current smartphones do not contain. Because the RSSI fingerprint-based method requires no additional hardware, it is the most commonly used fingerprinting technique.

Fingerprint positioning plays an important role in improving the accuracy of indoor floor positioning. However, existing methods that perform fingerprint matching with deep learning still express the important features in fingerprints poorly, which reduces fingerprint-matching accuracy and therefore positioning accuracy, and fingerprint positioning based on Wi-Fi CSI additionally requires special equipment to acquire the information. Moreover, most deep learning approaches try to improve positioning accuracy only by modifying the model to improve its feature extraction from the signals.

Disclosure of Invention

To address the problems that deep-learning fingerprint matching expresses the important features in fingerprints poorly, which reduces fingerprint-matching accuracy and positioning accuracy, and that fingerprint positioning based on Wi-Fi CSI requires special equipment to acquire information, the invention provides a multi-building floor positioning method based on Wavelet-CNN. In the method, the Wi-Fi RSSI of the offline data is decomposed and reconstructed by wavelet transform, the data are converted into grayscale images that serve as input to a 2D-CNN classifier, and the CNN model is then optimized to extract features from and classify the building/floor signal data, which effectively improves the positioning accuracy of indoor floors.

To achieve the above purpose, the invention adopts the following technical scheme: a multi-building floor positioning method based on Wavelet-CNN, comprising an offline learning stage and an online positioning stage.

Offline learning: wavelet transform and normalization preprocessing are first applied to the Wi-Fi signal data acquired offline; the vectors of preprocessed RSSI values are then converted into grayscale images, and the data are divided into a training set and a validation set; the CNN model is trained on the grayscale images, extracts the relevant features of the fingerprint data, and establishes a feature map.

Online positioning:

Wi-Fi signal data acquired in real time are converted into real-time fingerprint images by the same preprocessing operations; each real-time fingerprint image is fed into the trained CNN model and matched against the trained feature map; finally, the trained CNN model outputs a building identification number (ID) and a floor ID to realize floor positioning.

Further, the RSSI values are converted into a grayscale image in the offline learning stage as follows: the given one-dimensional array is turned into a two-dimensional array by adding or removing virtual values, without affecting the original one-dimensional vector, so that the total number of APs becomes n²; the RSSI values of all added virtual APs are set to +100, i.e. the undetected value; the 1×n² vector is then reshaped into an n×n grayscale image.

Further, the wavelet transform uses the Haar wavelet with the denoising level set to 3, yielding approximation coefficients and detail coefficients at three scales; each coefficient is then thresholded according to a set threshold.

Further, the Haar wavelet function can be expressed as two functions:

In the wavelet decomposition process, the signal is decomposed into approximation coefficients and detail coefficients at different scales. For a Haar wavelet basis, the reconstruction formula of the first-level wavelet decomposition is as follows:

where $a_{1,k}$ is the first-level approximation coefficient, $d_{1,k}$ is the first-level detail coefficient, and $\psi_{0,k}(t)$ and $\psi_{1,k}(t)$ are the corresponding wavelet basis functions.

Further, the CNN model is a multi-output CNN model. The model uses 2-dimensional convolution kernels with ReLU activation functions and applies max pooling after several consecutive convolutions; a Dropout layer mitigates overfitting during training. The local features extracted by the convolution layers are then flattened into a vector by a Flatten function and fed into a fully connected (Dense) layer. The network branches at the first Dense layer, dividing building identification and floor identification into two paths; each path passes through a Dense layer, a ReLU activation, a Dense layer and a Softmax activation to the output, completing the classification of building numbers and floors respectively. The Softmax layer converts the output of the fully connected layer into a probability vector, the classification layer takes the class with the highest probability as the label of the corresponding input, and the cross-entropy error is used as the loss function of the CNN classification model.

Further, the consecutive convolutions comprise two Conv2D layers with 64 kernels of size 3×3 followed by two Conv2D layers with 128 kernels of size 3×3, arranged in sequence; a ReLU activation function is placed between the two 64-kernel layers and between the two 128-kernel layers, and a max pooling operation between the 64-kernel group and the 128-kernel group reduces the dimension of each feature map.

Compared with the prior art, the invention has the following beneficial effects: in the positioning method combining wavelet transform and a CNN model, the Wi-Fi RSSI of the offline data is decomposed and reconstructed by wavelet transform and converted into grayscale images that serve as input to a 2D-CNN classifier, and the CNN model is then optimized to extract features from and classify the building/floor signal data. The Wavelet-CNN based multi-building floor positioning method overcomes defects such as low positioning accuracy in indoor floor positioning and effectively improves the accuracy and stability of indoor floor positioning.

Drawings

Fig. 1 is a schematic diagram of a Wavelet-CNN based multi-building floor positioning.

Fig. 2 is a Wavelet-CNN classification structure.

Fig. 3 is an example of a fingerprint image.

FIG. 4 is a comparison of accuracy for different normalization methods.

Fig. 5 is a comparison of the accuracy of different K-value models.

Fig. 6 shows the building/floor accuracy-loss curves: Fig. 6 (a) is the training set accuracy-loss graph and Fig. 6 (b) is the validation set accuracy-loss graph.

Fig. 7 is a comparison of the performance of different models.

Detailed Description

The invention is described in further detail below with reference to the drawings and examples of implementation.

As shown in FIG. 1, in the Wavelet-CNN based multi-building floor positioning method, the data collected offline are first preprocessed by wavelet transform, normalization and other operations, and the vectors of preprocessed RSSI values are then converted into grayscale images, with the data divided into a training set and a validation set. The CNN model is trained on the grayscale images, extracts the relevant features of the fingerprint data, and establishes a feature map.

In the online stage, the features extracted from RSSI data acquired in real time are matched against the trained feature map to estimate the final position. The CNN model is a two-dimensional classifier: Wi-Fi signal data acquired in real time are converted into real-time fingerprint images by the same preprocessing operations, and each real-time fingerprint image is classified into a predefined label. The location label module contains information about the user's location. Finally, the classifier outputs the building identification number (ID) and the floor ID to realize floor positioning, yielding dynamic floor information and the user's position.

The wavelet transform is a time-frequency analysis method that decomposes a signal into sub-signals (low-frequency and high-frequency components) and supports processing such as compression and denoising without losing the original information. Different wavelet bases give different time-frequency characteristics; the Haar wavelet is chosen as the wavelet basis here because of its good localization properties. With the denoising level set to 3, a three-level wavelet transform is applied to the signal to obtain approximation and detail coefficients at three scales, and each coefficient is then thresholded according to a set threshold to achieve denoising. The three-level wavelet transform thus removes noise from the signal while retaining the main characteristics of the original signal.
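The three-level decomposition, thresholding and reconstruction described above can be sketched in plain Python. This is a minimal illustration rather than the patented implementation: the function names, the soft-thresholding rule, the threshold value `thr` and the choice to threshold only the detail coefficients are all assumptions, since the text does not specify them.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar transform: returns (approx, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds each pair of samples in order."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out

def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr (assumed thresholding rule)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def haar_denoise(signal, levels=3, thr=1.0):
    """Decompose `levels` times, threshold the detail coefficients, reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(soft_threshold(d, thr))
    # Rebuild from the coarsest level back to the original resolution.
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx
```

A 520-value RSSI vector works with three levels because 520 is divisible by 8. With `thr=0.0` the function returns the input unchanged (up to rounding), which is a convenient sanity check.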

The Haar wavelet basis is one of the simplest orthogonal wavelet bases and consists of two basis functions: a unit impulse function and a step function. In the continuous wavelet transform, the Haar wavelet can be represented by two functions:
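In standard textbook notation, the two Haar functions are usually written as the scaling function and the wavelet function below. The symbols $\phi$ and $\psi$ are the conventional names and are assumed here; the patent's own notation may differ.

```latex
\phi(t) =
\begin{cases}
1, & 0 \le t < 1 \\
0, & \text{otherwise}
\end{cases}
\qquad
\psi(t) =
\begin{cases}
1, & 0 \le t < \tfrac{1}{2} \\
-1, & \tfrac{1}{2} \le t < 1 \\
0, & \text{otherwise}
\end{cases}
```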

In the wavelet decomposition process, the signal is decomposed into approximation coefficients and detail coefficients at different scales. For a Haar wavelet basis, the reconstruction formula for the first-level wavelet decomposition is:
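One form consistent with the coefficient and basis-function symbols used in the text is given here as an assumed reconstruction; $f(t)$ is a hypothetical name for the signal being rebuilt.

```latex
f(t) = \sum_{k} a_{1,k}\,\psi_{0,k}(t) + \sum_{k} d_{1,k}\,\psi_{1,k}(t)
```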

where $a_{1,k}$ is the first-level approximation coefficient, $d_{1,k}$ is the first-level detail coefficient, and $\psi_{0,k}(t)$ and $\psi_{1,k}(t)$ are the corresponding wavelet basis functions. The reconstruction formula is similar for higher-level wavelet decompositions and is not described here.

CNN is a common and efficient image classification technique. A CNN consists of an input layer, hidden layers and an output layer. The hidden layers include convolution, activation, pooling and fully connected layers, which play an important role in extracting the positional features of the fingerprint image. To classify fingerprint images, the invention employs a multi-output, multi-label CNN model. The most common form of CNN classifier is a multi-label single-output classifier, in which only one set of fully connected branches is used for classification. To increase the efficiency of the system and reduce complexity, however, classification here is performed at the end of the network by fully connected branches of a multi-output multi-label classifier. The model assigns each image two different labels: in the invention, images converted from the positioning fingerprint data are classified by building and by floor, respectively.

After data preprocessing, wavelet transform and related operations, the two-dimensional grayscale images with their corresponding labels are fed into the CNN model for training. Wavelet-CNN performs multi-output building/floor classification at the end of the network, determining the building and the floor according to the respective classification tasks. Because the input is a 23×23 grayscale image, the model uses 2-dimensional convolution kernels with ReLU activation functions and applies max pooling after consecutive convolutions. The 2×2×128 data produced by max pooling pass through a Dropout layer, which mitigates overfitting during training; this 2×2×128 feature map extracted by the convolution layers is then flattened into a 1×512 vector by the Flatten function and fed as input to the fully connected (Dense) layer. Because buildings with different building numbers differ markedly in their features, identifying buildings and floors separately improves the classification accuracy of the neural network. The invention therefore branches at the first Dense layer, dividing building identification and floor identification into two paths: 200 values are allocated to building identification, 300 to floor identification, and the remaining 12 are temporarily unused. Each path passes through a Dense layer, a ReLU activation, a Dense layer and a Softmax activation to the output, completing the classification of building numbers and floors respectively. The Softmax layer converts the output of the fully connected layer into a probability vector.
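The shape figures quoted above (23×23 input, 2×2×128 after pooling, a 1×512 flattened vector split 200/300/12) can be checked with simple arithmetic. The sketch below assumes unpadded 3×3 convolutions with stride 1 and non-overlapping 2×2 pooling after each pair of convolutions; the padding mode and pooling window are assumptions chosen because they reproduce the stated sizes, and all function names are illustrative.

```python
def conv_out(size, kernel=3, stride=1):
    """Output width of a convolution without padding."""
    return (size - kernel) // stride + 1

def pool_out(size, window=2):
    """Output width of non-overlapping max pooling."""
    return size // window

def trace_shapes(size=23):
    """Follow a size x size image through conv64 x2, pool, conv128 x2, pool."""
    shapes = []
    channels = 1
    for filters in (64, 128):
        for _ in range(2):
            size, channels = conv_out(size), filters
            shapes.append((size, size, channels))
        size = pool_out(size)
        shapes.append((size, size, channels))
    return shapes

shapes = trace_shapes()
flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes)  # spatial size goes 21, 19, 9, 7, 5, 2
print(flat)    # 512, matching 200 + 300 + 12
```

Under these assumptions the spatial size shrinks 23 → 21 → 19 → 9 → 7 → 5 → 2, so the final feature map has 2·2·128 = 512 values.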

Finally, the classification layer takes the class with the highest probability as the label of the corresponding input. The cross-entropy error is used as the loss function of the classification model. In the real-time positioning stage, building and floor numbers are determined by the trained model. Furthermore, to monitor the training performance of Wavelet-CNN on the validation set, an early-stopping strategy is employed by adjusting the validation patience parameter ρ. Validation patience is the number of times the validation loss of the model may fail to improve on the minimum loss so far. Training is thus terminated after ρ such evaluations, and the model from the last training iteration becomes the final model. The roles of the individual modules in the Wavelet-CNN model are described below.
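The early-stopping rule can be illustrated with a small function that scans a list of validation losses. It is a sketch under one assumption not stated in the text: the non-improving evaluations are counted consecutively and the counter resets whenever a new minimum is reached. The function name is hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs; if that never
    happens, all epochs are run.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, waited = loss, 0   # new minimum: reset the counter
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)
```

For example, with losses `[0.9, 0.5, 0.4, 0.45, 0.41, 0.42, 0.3]` and a patience of 3, training stops at epoch 6 and never sees the later improvement, which shows why the patience value is worth tuning.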

Convolution layer: the function of the convolution layer of a CNN is to extract the features of the input data. The convolution layer contains multiple convolution kernels for extracting the fundamental features of the image. The more convolution kernels there are, the more feature information is extracted, so that the convolution layer captures more abstract and richer higher-order features. Applying the convolution operation to the fingerprint image yields the feature map. The convolution operation is shown in the formula:
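A common way to write this operation is the cross-correlation form used by most deep learning frameworks. It is given here as an assumption consistent with the symbols S, I, K, (x, y) and (m, n) that the text defines; the kernel-flipped convolution form differs only in the sign of the index offsets.

```latex
S(x, y) = (I * K)(x, y) = \sum_{m} \sum_{n} I(x + m,\, y + n)\, K(m, n)
```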

In the formula, S denotes the feature map, whose coordinates are (x, y); K is the convolution kernel; I is the two-dimensional matrix of input image pixel values; and (m, n) are the convolution kernel coordinates. The convolution layer parameters include the kernel size, stride and padding, which together determine the size of the feature map output by the convolution layer. The larger the convolution kernel, the more complex the input features that can be extracted. The CNN model of the invention uses 3×3 convolution kernels with a stride of 1.
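To make the roles of I, K, (x, y) and (m, n) concrete, the following is a minimal sketch of a single-channel convolution with stride 1 and no padding. It uses the cross-correlation convention (no kernel flip), which is an assumption, and the function name is illustrative.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # S(x, y) = sum over (m, n) of I(x + m, y + n) * K(m, n)
            sum(image[x + m][y + n] * kernel[m][n]
                for m in range(kh) for n in range(kw))
            for y in range(out_w)
        ]
        for x in range(out_h)
    ]
```

Applied to a 23×23 image with a 3×3 kernel, the output is 21×21, because each dimension shrinks by the kernel size minus one.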

Activation function: the mapping from input to output of each layer in a CNN is linear, whereas the data actually processed are non-linear. To address this, an activation function is added after each convolution and pooling layer. It maps the features into a high-dimensional non-linear space, solving problems that a linear model cannot. Common activation functions include the Sigmoid and ReLU functions. The invention uses the ReLU function, which alleviates the vanishing-gradient problem and simplifies computation. The ReLU function is defined as

f(x) = max(0, x) (4)

Pooling layer: the pooling layer of CNN consists of pooling functions. The role of the pooling layer is to reduce the dimensionality of each feature map without losing too much important information, the pooling layer being sandwiched between successive convolution layers, compressing the amount of data and the number of parameters to reduce the overfitting. The pooling layer may reduce the time complexity by reducing the number of operations of the next convolutional layer or full-connectivity layer. Among the various methods of configuring the pooling layer, the present invention employs a max-pooling method to extract the maximum value in the sliding window.
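The max-pooling step can be sketched as follows. The 2×2 window with non-overlapping placement is an assumption (the text only states that the maximum in a sliding window is taken), and the function name is illustrative.

```python
def max_pool2d(feature_map, window=2):
    """Non-overlapping max pooling.

    Trailing rows or columns that do not fill a whole window are dropped.
    """
    out_h = len(feature_map) // window
    out_w = len(feature_map[0]) // window
    return [
        [
            max(feature_map[r * window + i][c * window + j]
                for i in range(window) for j in range(window))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]
```

Each output value summarizes four inputs, so a 2×2 window cuts the number of values passed to the next layer to roughly a quarter.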

Fully connected layer: the fully connected layer maps the learned features to the label space of the samples and classifies the data after feature extraction. The local features extracted by the convolution and pooling layers are fed into a conventional neural network and classified together with the output layer. Dropout is introduced in the fully connected layer to prevent overfitting during training and improve the generalization ability of the model.
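The final classification step at each output branch (Softmax, highest-probability label, cross-entropy loss) can be illustrated with a few lines of plain Python. This is a per-sample sketch with hypothetical function names; a real training loop would average the loss over a batch and over both the building and floor branches.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    """Cross-entropy loss for one sample with an integer class label."""
    return -math.log(probs[true_index])

def predict(logits):
    """Index of the most probable class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)
```

For a floor branch with five classes, an untrained network that outputs equal scores has a loss of ln 5 (about 1.61), which is a useful baseline when reading loss curves.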

The effectiveness of the algorithm of the present invention is verified by specific experimental data as follows.

The invention uses the publicly available UJIIndoorLoc dataset to evaluate and analyze the performance of the proposed system. UJIIndoorLoc is a commonly used reference database for comparing indoor positioning methods based on Wi-Fi fingerprints. The data were collected in three buildings of Jaume I University, each with 4 or more floors. UJIIndoorLoc contains 21048 Wi-Fi fingerprint samples, comprising 19937 training samples and 1111 test samples, collected by more than 20 users with 25 different devices over 4 months at 933 reference points (RPs). Each fingerprint is identified by a location label comprising a building ID (0, 1, 2), a floor ID (0, 1, 2, 3, 4) and latitude and longitude coordinates. Each sample in the database is characterized by 520 WAPs and their corresponding RSSI values, which range from -104 dBm to 0 dBm, with +100 indicating an undetected access point.

Table 1 dataset attributes

CNNs typically process images, so, unlike traditional machine-learning fingerprint recognition methods, the training fingerprint data must be converted into "fingerprint images" before being input to the CNN. To match the CNN input, a suitable input structure must be generated from a given RSSI vector, since the CNN classifier needs a two-dimensional array as input. The given one-dimensional array is therefore turned into a two-dimensional array by adding virtual values without affecting the original vector. As shown in Table 2, 9 virtual APs are added to the 520 APs of the original dataset, and all their RSSI values are set to +100, the undetected value.

For image construction, each +100 value is replaced with -110, and 110 is then added to all WAP RSSI values, so that all undetected values (+100) become 0 and the RSSI values become positive. The 1×529 vector is then reshaped into a 23×23 grayscale image, as shown in fig. 2; the complete reference-point fingerprint data are thus converted into a fingerprint image, whose features are then extracted by the CNN to complete classification.
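The padding, value shift and reshape described in the last two paragraphs can be sketched as one small function. The constant and function names are illustrative assumptions; the numeric values (+100, -110, 23) come from the text.

```python
UNDETECTED = 100     # value used in the dataset for "AP not detected"
FLOOR_DBM = -110     # replacement for undetected readings, per the text
SIDE = 23            # 23 * 23 = 529 = 520 real APs + 9 virtual APs

def rssi_to_image(rssi, side=SIDE):
    """Turn a 1-D RSSI vector into a side x side grid of non-negative values."""
    n_virtual = side * side - len(rssi)
    if n_virtual < 0:
        raise ValueError("vector is longer than side * side")
    # Virtual APs are appended as undetected readings.
    padded = list(rssi) + [UNDETECTED] * n_virtual
    # Undetected -> -110, then shift everything by +110 so undetected becomes 0.
    shifted = [(FLOOR_DBM if v == UNDETECTED else v) - FLOOR_DBM for v in padded]
    return [shifted[r * side:(r + 1) * side] for r in range(side)]
```

After the shift, a reading of -104 dBm maps to 6 and a reading of 0 dBm maps to 110, so stronger signals appear as brighter pixels and undetected APs as black.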

Table 2 expanded dataset

In contrast to conventional Wi-Fi RSSI fingerprinting, CNNs can determine a particular floor without the need for additional equipment and data. Wi-Fi RSSI fingerprint data is used as a feature matrix, and building ID/floor ID is used as a tag. CNNs are trained using a two-dimensional gray scale image data format. The model generation process and model performance will be described in detail below, with accuracy and test set loss functions as performance indicators.

Experimental environment

The invention uses Keras to build Wavelet-CNN network model. Keras can be used as a high-level application program interface for Tensorflow, a framework developed by researchers and engineers in the Google Brain team to support all types of deep learning algorithms. The GPU version of TensorFlow was chosen because CNNs have a large amount of computational data and are very computationally intensive. The computer used in this experiment was a mechanic's notebook computer with 8GB RAM. The graphics card is a Nvidia GeForce GTX 1050Ti with 4GB graphics memory. The configuration of the computer environment is shown in table 3.

TABLE 3 Environment configuration

Hardware/Software	Version
System	Windows 10
GPU	Nvidia GeForce GTX 1050Ti (4GB)
CPU	Intel i7-7700HQ
RAM	8GB
Python	3.7
TensorFlow	2.6.0 (GPU)
Keras	2.6.0

Wavelet-CNN model optimization

The invention experimentally explores the generation and optimization of the Wavelet-CNN model. Before comparing its performance with other models, various structures were tested, and the best-performing model was found by varying the number of convolution layers and the filter size. In every training run, the input training data are the preprocessed 23×23 fingerprint images. Because the convolution filters and the hidden nodes of the fully connected layer are randomly initialized each time, the training result differs from run to run. Five tests were therefore performed for each structure and their results averaged, and the best-performing model was identified by comparing accuracy; the results are shown in Table 4.

Table 4 comparison of different model structures

All the structures in Table 4 consist of a fully connected layer and two output layers. The more filters are used in the convolution layers, the higher the performance. On the given dataset, the structure with two 64-filter convolution layers, two 128-filter convolution layers and two max pooling layers performs best, with test results better than those of the first four models. The parameters adjusted during the optimization and the fixed parameters are shown in Table 5.

During preprocessing of the UJIIndoorLoc dataset, data normalization effectively improves the learning ability of the CNN model. The input data of the CNN are grayscale images, and the traditional normalization method divides the RSSI values, once converted to positive numbers, by 255, compressing the data into the range 0-1 for training. The normalization method adopted by the invention instead normalizes each fingerprint by the maximum RSSI value of that sampled fingerprint, rather than by the global maximum over the whole training and test sets. Fig. 4 compares the three methods: relative to the traditional normalization method and global-maximum normalization, building recognition accuracy improves by 0.45% and 0.18%, and floor recognition accuracy improves markedly, by 1.81% and 2.98%, respectively.
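The difference between per-fingerprint normalization and the fixed divide-by-255 scaling can be shown with a short sketch. The function names are illustrative, and the handling of an all-zero fingerprint (no AP detected) is an assumption added to avoid division by zero.

```python
def normalize_per_sample(values):
    """Scale one fingerprint by its own maximum; an all-zero fingerprint stays zero."""
    peak = max(values)
    if peak == 0:
        return [0.0 for _ in values]
    return [v / peak for v in values]

def normalize_fixed(values, divisor=255.0):
    """The conventional image-style scaling, shown for comparison."""
    return [v / divisor for v in values]
```

With shifted RSSI values such as `[0, 55, 110]`, per-sample scaling yields `[0.0, 0.5, 1.0]` and uses the full 0-1 range, whereas dividing by 255 leaves every value below 0.44.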

TABLE 5 model parameter configuration

K-fold cross-validation was used to evaluate the generalization ability of the Wavelet-CNN model. The traditional test-set evaluation can assess only the predictive ability of the model, not its stability. K-fold cross-validation helps avoid overfitting and also helps select the optimal hyperparameter combination. It divides the training data into K parts; each time, one part is chosen at random as the validation set and the remaining K-1 parts serve as the training set. Averaging the prediction results obtained on the K validation sets over these different divisions gives a more accurate estimate of model performance.
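The K-fold split can be sketched with the standard library alone. The function name, the fixed seed and the round-robin assignment of shuffled indices to folds are illustrative assumptions; the text does not describe how the folds were drawn.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, val_indices) pairs for k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val
```

Each sample appears in exactly one validation fold, so averaging the K validation scores uses every training sample once for evaluation.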

Different K values in cross-validation give different proportions of training and validation data. As shown in Fig. 5, K was varied from 5 to 11, and different K values affect the positioning accuracy and performance of the model. With K=8, building and floor positioning both achieve relatively good accuracy, reaching 99.91% and 96.32% respectively, and the test-set loss values are also lowest, at 0.01 and 0.269 respectively, giving better performance and classification than the other K values.

Overfitting is a common problem in deep learning. To avoid overfitting of the CNN model, the invention uses two parameters, the Dropout factor and L2 regularization. L2 regularization adds a penalty to the loss function that grows with the size of the weights, so parameters with larger weights contribute more to the loss; this reduces model complexity and thereby helps avoid overfitting. The Dropout layer randomly discards neurons during training with a predefined probability. In the training stage, two optimizers, adaptive moment estimation (Adam) and stochastic gradient descent (SGD), are used, and the performance of the model is evaluated while varying the Dropout factor and the L2 regularization factor. The experimental results on the UJIIndoorLoc dataset are shown in Table 6. Regardless of the optimizer and the L2 regularization factor, classification accuracy is highest when Dropout is set to 0.4. The results indicate that Adam outperforms SGD in most cases. Adam reaches its highest accuracy with a Dropout factor of 0.4 and an L2 regularization factor of 0.001, whereas SGD reaches its highest accuracy with a Dropout factor of 0.4 and the default L2 parameter. The patience parameter of the early-stopping strategy stops the training cycle before the maximum number of epochs is reached. Compared with traditional gradient descent optimization algorithms such as SGD, Adam trains the model more stably and converges faster.
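How the L2 term enters the loss can be shown in a few lines. This sketch assumes the common form in which the penalty is the regularization factor times the sum of squared weights; the function names are hypothetical, and the default factor of 0.001 is the value reported for Adam in the text.

```python
def l2_penalty(weights, factor):
    """L2 regularization term: factor times the sum of squared weights."""
    return factor * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, factor=0.001):
    """Total loss = data loss (e.g. cross-entropy) + L2 penalty."""
    return data_loss + l2_penalty(weights, factor)
```

Because the penalty is quadratic, doubling a weight quadruples its contribution, which is what pushes the optimizer toward smaller weights.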

TABLE 6 influence of Dropout, L2 regularization factor and optimizer on Wavelet-CNN performance

Analysis of results

The performance of the proposed model is evaluated on the test set. Through repeated testing, the best-performing model is found, the model parameters are set, and the model is trained on the input data. The invention uses validation accuracy, loss and test accuracy to evaluate the Wavelet-CNN model. Validation accuracy is the proportion of validation data that is correctly classified, and test accuracy is the proportion of test data that is correctly classified. Fig. 6 shows the accuracy and loss on the training set and on the validation set for building and floor positioning.

As can be seen from Fig. 6 (b), the accuracy of building/floor recognition increases with the number of iterations. After 30 iterations the model is stable: the accuracy on the building training set reaches 99.84% with the loss function reduced to about 0.005, and the accuracy on the building validation set reaches 99.72% with the loss reduced to about 0.002. The accuracy on the floor training set reaches 99.04% with the loss reduced to about 0.0271, and the accuracy on the floor validation set reaches 98.72% with the loss reduced to about 0.03. Building recognition is more accurate than floor recognition, which is consistent with the characteristics of the dataset. Comparing the accuracy-loss curves of the validation and training sets shows that the model achieves a good classification effect.

Evaluating the model on test set data better reflects its performance and accuracy and also guards against overfitting. The invention evaluates the model on the test set several times and compares the model performance before and after applying the wavelet transform to the data. As shown in Table 7, building/floor accuracy before the wavelet transform was 99.82% and 93.97%, respectively; with wavelet-transformed data fed into the model, building recognition accuracy improved by 0.09% and floor recognition accuracy by 2.35%, a marked improvement. The loss on the test set also dropped significantly.

TABLE 7 comparison of accuracy before and after wavelet transform

Finally, the present invention further compares the localization performance of the Wavelet-CNN model with the DNN model proposed by the literature (Nowicki M, wietrzykowski J. Low-effort place recognition with WiFi fingerprints using deep learning [ C ]// Automation 2017:Innovations in Automation,Robotics and Measurement Techniques1.Springer International Publishing,2017:575-584.), the Scalable DNN model proposed by the literature (Kim K S, lee S, huang K.A scalabledeep neural network architecture for multi-building and multi-floor indoor localization based on Wi-Fi fingerprinting [ J ]. Big DataAnalytics,2018,3:1-17 ]), the RF+SAE+Stacking model proposed by the literature (Junlin G, xin Z, huaDeng W, et al WiFi fingerprint positioning methodbased on fusion of autoencoder and Stacking mode [ C ]//2020International Conference on Culture-Orient Science & Technology (ICCST), 2020:356-361.) and the RF+SAE+Stacking model proposed by the literature (ElesaA E A, kim S. Hiercal mu-building and multi-floor indoor localization based on recurrent neural networks [ C ]/2021 ]/Ind [ C ]/2021 ]. Ind.) (Kim K.K.K) with the data of the three-wire model of the invention, the blood vessel-CNN model, the blood vessel-N model, the blood vessel (C ]. Wietrykowski J.J.J. low-International Conference on Culture-Orient & Technology (ICCST), kim.) (Kim.)) and the blood vessel (Kim.J.)) with the blood-N model, the blood vessel model, and the blood vessel-CNN model. The results of the comparison of the performance of the different models are shown in figure 7. 
It is worth mentioning that most existing deep learning models for indoor positioning are single-input, single-output models, whereas the multi-output model of the invention locates buildings and floors separately. Because existing classification models already reach a building classification accuracy of more than 99%, only the floor positioning results are compared when assessing the positioning performance of Wavelet-CNN.

As can be seen from FIG. 7, on the UJIIndoorLoc dataset the Wavelet-CNN model improves accuracy by 4.63% compared with the DNN model and is significantly better than the Scalable DNN model. Compared with the RF+SAE+Stacking model, the RNN model and the CDAE-CNN model, the accuracy is also clearly improved. In general, the Wavelet-CNN model provided by the invention not only optimizes the input form of the original data but also simplifies the structure of the model. In building/floor positioning, it achieves a building identification accuracy of up to 99.91% and a floor identification accuracy of 96.32%, and has good classification performance and generalization capability.

The invention provides a network structure combining wavelet transformation and a CNN, which adopts a multi-output, multi-class model based on fingerprint images to achieve higher classification accuracy across multiple buildings and floors. The Wi-Fi signal data of each acquisition point are regarded as a signal, which is decomposed into a number of wavelet coefficients by wavelet transformation; the coefficients are filtered and denoised, and finally reconstructed into a new, clean signal. This effectively improves the accuracy of indoor position identification and reduces the influence of interference. Through hyperparameter tuning of the CNN model, deep features are extracted by the CNN, and the classification performance of the model is greatly improved. Finally, to assess the classification performance and generalization capability of the proposed model, a comparative analysis with models proposed in recent literature is carried out on the public UJIIndoorLoc dataset. The model provided by the invention achieves an identification accuracy of up to 99.91% for buildings and 96.32% for floors, which is superior to the other methods. The experimental results show that the proposed Wavelet-CNN model has higher identification accuracy and generalization capability in multi-building floor positioning.
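The decompose, threshold and reconstruct sequence described above can be illustrated with a minimal pure-Python sketch. It assumes an orthonormal Haar transform with three levels, as stated in the claims; the soft-thresholding rule, the threshold value `thr`, the choice to threshold only the detail coefficients, and all function names are illustrative assumptions and are not taken from the invention.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr (assumed thresholding rule)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def wavelet_denoise(signal, level=3, thr=1.0):
    """Decompose to `level` scales, threshold the details, then reconstruct."""
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    approx = list(signal)
    details = []
    for _ in range(level):
        approx, detail = haar_dwt(approx)
        details.append(soft_threshold(detail, thr))
    # Rebuild from the coarsest scale back to the original resolution.
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx
```

With `thr=0.0` the function returns the input signal unchanged (up to floating-point error), which is a quick way to check that the decomposition and reconstruction are consistent before choosing a threshold.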

The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A multi-building floor positioning method based on Wavelet-CNN is characterized in that: the method comprises an offline learning stage and an online positioning stage, wherein the offline learning stage comprises the following steps:

on-line positioning:

2. The Wavelet-CNN based multi-building floor positioning method according to claim 1, wherein: the specific process of converting the RSSI values into the gray image in step 1) is as follows: first, the given one-dimensional array is converted into a two-dimensional array; virtual values are added or removed without affecting the one-dimensional array vector, so that the total number of APs in the original data becomes n²; the RSSI values of all added virtual APs are set to +100, i.e. the undetected value; and the 1 × n² vector is then reshaped into an n × n gray image.
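As an illustration of the padding and reshaping step in claim 2, the following minimal sketch pads an RSSI vector with virtual APs up to the next square number and reshapes it into an n × n array. The function name, the constant name and the 520-AP example are illustrative assumptions; only the padding case is shown, and any later scaling of RSSI values to gray levels is left out.

```python
import math

UNDETECTED = 100  # RSSI value assigned to virtual (undetected) APs

def rssi_to_square(rssi):
    """Pad a 1-D RSSI vector to length n*n and reshape it into n rows of n values."""
    n = math.isqrt(len(rssi))
    if n * n < len(rssi):
        n += 1  # smallest n whose square holds every AP
    padded = list(rssi) + [UNDETECTED] * (n * n - len(rssi))
    return [padded[r * n:(r + 1) * n] for r in range(n)]

# Example: a fingerprint with 520 APs becomes a 23 x 23 image with 9 virtual APs.
image = rssi_to_square([-85] * 520)
```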

3. The Wavelet-CNN based multi-building floor positioning method according to claim 1, wherein: the wavelet transformation adopts the Haar wavelet, the denoising level is set to 3, the approximation coefficients and the detail coefficients at three scales are obtained, and each coefficient is then thresholded according to a set threshold value.

4. A Wavelet-CNN based multi-building floor positioning method according to claim 3, wherein: the Haar wavelet function can be represented by two functions:
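For reference, the standard textbook form of the two functions associated with the Haar wavelet is the mother wavelet ψ(t) and the scaling function φ(t) given below. This is the usual definition and is offered as an assumption about what the claim refers to; the notation used in the original claim is not recoverable from this text.

```latex
\psi(t) =
\begin{cases}
 1, & 0 \le t < \tfrac{1}{2}, \\
-1, & \tfrac{1}{2} \le t < 1, \\
 0, & \text{otherwise},
\end{cases}
\qquad
\varphi(t) =
\begin{cases}
 1, & 0 \le t < 1, \\
 0, & \text{otherwise}.
\end{cases}
```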

5. The Wavelet-CNN based multi-building floor positioning method according to claim 1, wherein: the CNN model is a multi-output CNN model that uses 2-dimensional convolution kernels with ReLU activation functions; max pooling is applied after several consecutive convolutions, and a Dropout layer is then used to mitigate overfitting during training; the local features extracted by the convolution layers are converted into a vector by a Flatten function and input to a fully connected (Dense) layer; the model branches at the first fully connected Dense layer, dividing building number identification and floor identification into two paths, each of which passes through a fully connected Dense layer, a ReLU activation function, a fully connected Dense layer and a Softmax activation function to the output end, completing the classification of building numbers and floors respectively; after the fully connected layer, the Softmax layer converts its output into a probability vector, and the classification layer finally takes the class with the highest probability as the label corresponding to the input; and the model uses the cross-entropy error as the loss function of the CNN classification model.
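The output stage of claim 5 (Softmax, selection of the most probable class, and cross-entropy error) can be sketched in pure Python as follows. The function names, the example logits and the equal-weight sum of the two branch losses are illustrative assumptions; the claim does not state how the building and floor losses are combined.

```python
import math

def softmax(logits):
    """Convert raw branch outputs into a probability vector."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    """Cross-entropy error for a single sample with an integer class label."""
    return -math.log(probs[true_index])

def classify(building_logits, floor_logits):
    """Pick the most probable class on each of the two output branches."""
    b = softmax(building_logits)
    f = softmax(floor_logits)
    return b.index(max(b)), f.index(max(f))

def joint_loss(building_logits, floor_logits, building_label, floor_label):
    """Assumed combination: unweighted sum of the two branch losses."""
    return (cross_entropy(softmax(building_logits), building_label)
            + cross_entropy(softmax(floor_logits), floor_label))

# Hypothetical logits for three building classes and five floor classes.
building, floor = classify([2.0, 0.1, -1.0], [0.0, 0.5, 3.0, 0.2, -0.3])
```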

6. The Wavelet-CNN based multi-building floor positioning method according to claim 5, wherein: the consecutive convolutions comprise two Conv2D layers with 64 filters and 3 × 3 convolution kernels and two Conv2D layers with 128 filters and 3 × 3 convolution kernels, arranged in sequence, wherein a ReLU activation function is arranged between the two Conv2D 64, 3 × 3 layers and the two Conv2D 128, 3 × 3 layers, and a max pooling operation is adopted between the Conv2D 64 and Conv2D 128 layers to reduce the dimension of each feature map.