CN106961434B - Method for fingerprint modeling and identification of wireless equipment

Info

Publication number: CN106961434B
Application number: CN201710172588.2A
Authority: CN
Inventors: 孙弘毅; 华景煜; 沈振宇; 陈晓宇; 仲盛
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2017-03-21
Filing date: 2017-03-21
Publication date: 2020-10-16
Anticipated expiration: 2037-03-21
Also published as: CN106961434A

Abstract

The invention discloses a method for fingerprint modeling and identification of wireless devices. The method first decomposes the phase values of the CSI. It then estimates the CFO-related term from the decomposed phase values by removing the effects of FDD and SFO on the CSI measurement and removing the effect of ToF. Next, it obtains a CFO value from the noisy CFO image produced by the previous step: a high-density region of the image is selected and converted into a binary image, the connected components of the image are extracted, and the point set of each connected component is fitted by the least squares method to obtain its slope. Finally, the CFO estimate is used as the fingerprint feature of the device to perform bidirectional identification between interconnected WiFi hotspots and wireless devices. The invention estimates the CFO from the CSI accurately and quickly, requires no additional equipment, and produces a fingerprint that is difficult to imitate.

Description

Method for fingerprint modeling and identification of wireless equipment

Technical Field

The invention relates to a method for fingerprint modeling and identification of wireless equipment, and belongs to the technical field of network security.

Background

WiFi attracts a wide variety of attacks because it is so widely deployed. Among them, rogue access points (rogue APs) and WiFi freeloading are the most common, and both pose significant security and privacy hazards.

A rogue AP is a network device that an attacker sets up in a public place to impersonate a legitimate access point. It typically uses the same BSSID and SSID as the original AP. Once a user is fooled into connecting to it, the attacker can intercept all of the user's network traffic by launching a man-in-the-middle attack. It is estimated that roughly 20% of companies face this problem.

WiFi freeloading refers to an unauthorized user bypassing the AP's authentication mechanism and entering a private WLAN free of charge. Once a freeloader is inside the network, he may steal more than just network bandwidth.

One important way to defend against these attacks is to establish a strong mutual authentication mechanism between clients and APs. Indeed, 802.11i RSNA (Robust Security Network Association) provides optional mutual authentication mechanisms (such as digital certificates) based on traditional cryptography which, if used properly, make such attacks less likely.

According to Jana et al., wireless networks using 802.11i RSNA still suffer from drawbacks in practice. For example, because signal strength is the only basis on which clients currently select an AP, a user can be drawn to a fake AP, without any defense, simply because its signal is stronger than that of other APs. Because managing and distributing digital certificates is burdensome, many networks provide only user authentication and no access point authentication (AP authentication), which makes it easy for an attacker to deploy a fake access point.

For freeloading attacks, authentication in most deployments (such as WPA2-PSK) relies on user-chosen passwords. Such passwords are generally weak, are often shared among users, and are easily stolen or disclosed.

For these reasons, researchers have in recent years proposed solutions based on (non-cryptographic) device fingerprinting. These measures are not intended to replace cryptographic methods, but to provide an additional layer of security that addresses the practical difficulties of traditional cryptographic methods.

One example scenario: without fingerprinting, a user entering a coffee shop he often visits can easily be fooled into connecting to a fake AP that uses the same BSSID and SSID as the real one. With fingerprinting, the user is alerted that he may be connected to a counterfeit AP.

However, no such applications exist in real life, owing to several important practical problems. First, existing approaches require special hardware, which hinders deployment. For example, Brik et al. propose using radio-frequency characteristics to identify devices, but this requires extra equipment to capture and analyze the radio signal. Second, some hardware features can be spoofed, so security cannot be guaranteed. Kohno et al. proposed a detection mechanism for rogue APs that uses the clock skew measured from TCP/ICMP timestamps as the device fingerprint, but Jana and Kasera have shown that TCP/ICMP timestamps are easily spoofed. Instead, they measure the Time Synchronization Function (TSF) timestamp in beacon/probe response frames, which is stamped by hardware and is therefore somewhat harder to spoof. However, there is evidence that such timestamps can still be spoofed by modifying the device driver of a fake AP.

Related knowledge:

CSI: channel state information (CSI) describes the combined effect of scattering, fading, and power decay on a signal as it propagates from the transmitter to the receiver. The IEEE 802.11 standard defines a mechanism for measuring CSI for each transmit-receive (Tx-Rx) antenna pair.

CSI continuously captures the signal strength and phase information of each OFDM subcarrier. The channel is modeled as

Y = H·X + N

where Y is the received signal vector, X is the transmitted signal vector, H is the channel matrix, and N is the noise vector.

H is a complex vector called the Channel Frequency Response (CFR), which describes the signal gain between a Tx-Rx pair. This information can be used to achieve reliable communication at high data rates. CSI is the CFR sampled at different subcarriers.

In the 2.4 GHz band with a 20 MHz channel, a CSI measurement consists of 30 complex numbers, each corresponding to a selected subcarrier.

Let N_tx and N_rx denote the numbers of transmit and receive antennas. A received 802.11 frame then has 30 × N_rx × N_tx CSI streams.

The CSI stream for the k-th subcarrier between the i-th transmit antenna and the j-th receive antenna may be expressed as

H_{k,i,j} = |H_{k,i,j}|·e^{√−1·φ_{k,i,j}}

where |H_{k,i,j}| is the amplitude of subcarrier k and φ_{k,i,j} is its phase.
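As a minimal sketch of this amplitude and phase split, the Python snippet below decomposes one complex CSI sample using only the standard library. The sample value and the function name amplitude_and_phase are illustrative assumptions and do not come from the patent.

```python
import cmath
from typing import Tuple


def amplitude_and_phase(csi_value: complex) -> Tuple[float, float]:
    """Split one complex CSI sample into (amplitude, phase in radians)."""
    return abs(csi_value), cmath.phase(csi_value)


# Hypothetical CSI sample for one subcarrier of one Tx-Rx antenna pair:
# amplitude 2.0 and phase 0.5 rad (illustrative values only).
sample = cmath.rect(2.0, 0.5)
amplitude, phase = amplitude_and_phase(sample)
```

Note that cmath.phase returns a value in [−π, π], so a phase read this way is already wrapped.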

CFO: CFO stands for carrier frequency offset. In an OFDM system, the carrier frequency f should ideally be identical at the transmitter and the receiver. However, due to hardware imperfections, the oscillators of a Tx-Rx pair typically differ slightly, which results in a CFO. Because a large CFO causes large noise at the receiver, the CFO is compensated in hardware. The compensation is itself imperfect, so a residual CFO Δf_c remains after it.

This residual CFO causes a phase shift of 2πΔf_c·t in the received signal, where Δf_c denotes the CFO after compensation. For commercial WiFi devices, residual CFO is unavoidable; according to the IEEE 802.11n standard it may reach 100 kHz. For convenience, CFO herein generally refers to the residual CFO.

Disclosure of Invention

Purpose of the invention: the present invention proposes a novel wireless device fingerprinting method that avoids the above problems and can be used to protect against rogue APs and WiFi freeloading. A device is fingerprinted by estimating its carrier frequency offset (CFO). CFO arises from oscillator drift, which is caused by crystal defects and cannot be imitated by any software; it remains consistent over long periods but varies considerably between devices. It can therefore serve as a device fingerprint.

The most significant challenge is that no software, on either mobile devices or APs, can read the CFO from the underlying hardware. State-of-the-art methods analyze the raw signal with additional signal analysis equipment (such as vector signal analyzers and USRPs), which greatly hinders their application in real life.

In contrast, the present invention proposes to extract the CFO indirectly from the channel state information (CSI), which software on off-the-shelf wireless devices can easily obtain.

The method is designed to cope with increasingly rampant rogue AP and WiFi freeloading attacks. A rogue AP attack succeeds because, on the one hand, a fake AP can forge the same BSSID as the real AP and, on the other hand, a mobile device cannot tell the fake AP from the real one and can select an AP only by signal strength. A WiFi freeloading attack succeeds because existing authentication methods are imperfect: in many cases the AP accepts a connection as long as the correct password is entered or the MAC address is changed to that of a legitimate device, and passwords and MAC addresses are easily obtained in various ways.

Therefore, based on the above problem, the present invention establishes a fingerprint for each AP and each device, so that the AP can decide whether to approve the device connection according to the fingerprint, and the device can select the correct AP to connect through the fingerprint.

A CFO-related term is selected as the fingerprint because the CFO is caused by the offset of the carrier oscillator in the WiFi network card: it does not change with time or place, differs only from device to device, and cannot be forged because it is a purely hardware-related feature. Moreover, in theory, using a CFO-related term is more efficient than using the CFO itself.

The technical scheme is as follows: a method for fingerprint modeling and identification of a wireless device comprises the following steps:

1. decomposing the phase values of the CSI;

2. estimating the CFO-related term from the decomposed CSI phase values:

a) removing the effects of FDD and SFO on the CSI measurement;

b) removing the effect of ToF;

3. obtaining a CFO value from the noisy CFO image produced by the previous step:

a) selecting a high-density region of the image;

b) converting the high-density region obtained in the previous step into a binary image;

c) obtaining the connected components of the image;

d) fitting the point set of each connected component by the least squares method to obtain the slope of the point set;

4. using the CFO estimate as the fingerprint feature of the device to perform bidirectional identification between interconnected WiFi hotspots and wireless devices:

a) collecting fingerprints of legitimate WiFi hotspots and establishing a white list;

b) when accessing a WiFi hotspot, obtaining the CFO estimate of the currently accessed hotspot by the foregoing method and comparing it with the fingerprint features of the hotspots in the white list; if the similarity is lower than a certain threshold, the hotspot is judged to be an illegitimate WiFi hotspot;

c) establishing in advance, on the WiFi hotspot, a white list of the wireless devices that need access, and comparing the fingerprint features of each accessing device, so that the WiFi hotspot can in turn identify the accessing wireless devices.

Decomposing the phase values of the CSI

Assume that the fingerprinting device receives n frames from the target device and obtains a CSI measurement from the network card driver for each frame. Consider the CSI of a frame between one Tx-Rx pair at time t. For the k-th subcarrier, the CSI measurement includes a phase field φ_{t,k}, which measures the phase offset of the frame between the sender and the receiver on that subcarrier.

φ_{t,k} = k(2παζ_d + 2πβζ_s) + ψ_{t,k} + 2πΔf_c·t (2)

where 2πΔf_c·t is the phase shift caused by the CFO, and Δf_c is the CFO term to be estimated.

Estimating the CFO from the decomposed CSI phase values

First, a new phase variable is defined and calculated for each frame at time t as the average φ̄_t = (φ_{t,1} + φ_{t,−1})/2, where φ_{t,1} and φ_{t,−1} are the phase values of the subcarriers with indices 1 and −1, respectively. Then, for each pair of adjacent frames, the phase difference Δφ̄ and the time difference of arrival (TDoA) Δt, in microseconds, are calculated. Finally, all points (Δt, Δφ̄) are plotted. The points form a series of periodic stripes, whose slope is estimated and taken as the value of the CFO.

Removal of FDD and SFO: both FDD and SFO introduce a time delay in the CSI measurement, which causes a phase offset that is linearly related to the subcarrier index. According to formula (2), if two subcarrier indices satisfy k₁ + k₂ = 0, adding φ_{t,k₁} and φ_{t,k₂} removes the phase offset due to FDD and SFO:

φ_{t,k₁} + φ_{t,k₂} = ψ_{t,k₁} + ψ_{t,k₂} + 4πΔf_c·t

For time t, the averaged phase φ̄_t = (φ_{t,1} + φ_{t,−1})/2 can therefore be expressed as follows:

φ̄_t = (ψ_{t,1} + ψ_{t,−1})/2 + 2πΔf_c·t (3)

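Removing ToF: the receiver and the sender are first required to remain stationary while frames are collected, which fixes their relative distance. The ToF term is then the same in adjacent frames and cancels in the phase difference, which becomes Δφ̄ = 2πΔf_c·Δt (4), where Δφ̄ is the difference between the averaged phases φ̄_t = (φ_{t,1} + φ_{t,−1})/2 of two adjacent frames and Δt is their time difference of arrival.

Obtaining the CFO value from the stripe image involves two steps: stripe extraction and slope estimation.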

Data feature extraction

To estimate the slope, the set of points that make up each stripe is first obtained. Stripe extraction is divided into three steps.

(1) Selecting high density regions

A sliding window algorithm is used to identify the high-density regions. The number of points in each fixed-length window is counted, and the window is then advanced by one window length. Finally, the window containing the most points is selected as the high-density region.

(2) Converting high density regions into binary images

The high-density region obtained in the previous step is converted into a binary image, which eliminates outliers by exploiting the large difference between normal points and outliers. The region is first rasterized into a grid of identical small rectangles, each corresponding to one pixel of the new binary image. Then the number of points in each rectangle is counted. If the count exceeds a predefined threshold, the corresponding pixel of the binary image is set to 1; otherwise it is set to 0. This removes many outliers. For images that still contain substantial noise, existing mechanisms such as PCA-based ALM are used for further processing.

(3) Obtaining connected components

The k longest connected components are identified in the binary image, and the problem is thereby converted from estimating the slope of the stripes to estimating the slope of the k longest connected components. In practice, some erroneous components that do not correspond to any stripe may be obtained; these are handled by the mechanism described below.

Estimation of the stripe slope

After stripe extraction, a set of points is obtained for each long connected component. Let S denote the point set of one connected component, with points (x_i, y_i). The slope k_c of the stripe can be obtained by the least squares method,

k_c = Σ_i (x_i − x̄)(y_i − ȳ) / Σ_i (x_i − x̄)²

where x̄ and ȳ are the means of the x and y coordinates of the points in S. One slope k_c is obtained for each connected component, and the final estimate is chosen among them based on the simple fact that correct CFO estimates are usually very close to one another, while wrong estimates usually differ.

Bidirectional identification of WiFi hotspots and wireless devices

The CFO estimate obtained by the foregoing method is used as the fingerprint of a wireless device for bidirectional identification of WiFi hotspots and wireless devices:

a) Fingerprints of legitimate WiFi hotspots can be collected in advance, or the owner of a legitimate hotspot can publish its fingerprint feature values, to establish a white list of device fingerprints.

b) When accessing a WiFi hotspot, the CFO estimate of the currently accessed hotspot is obtained by the foregoing method and compared with the fingerprint features of the hotspots in the white list; if the similarity is lower than a certain threshold, the hotspot is judged to be a fake WiFi hotspot.

c) The same approach also enables a WiFi hotspot to identify the wireless devices that access it, and thus to restrict access to particular wireless devices.
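The white-list comparison in steps a) to c) can be sketched as follows. The text does not specify the similarity measure or the threshold, so this sketch assumes a simple absolute-difference tolerance on the CFO estimate; the function name is_legitimate, the identifiers, and all numeric values are hypothetical.

```python
from typing import Dict


def is_legitimate(identifier: str, measured_cfo: float,
                  whitelist: Dict[str, float], tolerance: float) -> bool:
    """Accept a device only if it is enrolled and its measured CFO
    fingerprint lies within `tolerance` of the enrolled value.

    The absolute-difference test is an assumption standing in for the
    unspecified similarity measure.
    """
    enrolled_cfo = whitelist.get(identifier)
    if enrolled_cfo is None:
        return False  # unknown hotspot or device: not on the white list
    return abs(measured_cfo - enrolled_cfo) <= tolerance


# Hypothetical white list keyed by BSSID (for hotspots) or MAC (for clients).
whitelist = {"aa:bb:cc:dd:ee:ff": 1.50}

genuine = is_legitimate("aa:bb:cc:dd:ee:ff", 1.53, whitelist, tolerance=0.1)
forged = is_legitimate("aa:bb:cc:dd:ee:ff", 2.40, whitelist, tolerance=0.1)
unknown = is_legitimate("11:22:33:44:55:66", 1.50, whitelist, tolerance=0.1)
```

The same function serves both directions: a client checks a hotspot against its own white list, and a hotspot checks an accessing client against the list it holds.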

Advantageous effects: compared with the prior art, the method for fingerprint modeling and identification of wireless devices has the following advantages:

1. It is the first CFO-based wireless device fingerprinting mechanism that requires no additional hardware, and it is easily applied to existing devices such as notebook computers and smartphones. Experiments show that such fingerprints do not change with time or place.

2. It provides a novel way to estimate the CFO from the CSI accurately and quickly, without additional equipment, and the resulting fingerprint is difficult to counterfeit.

3. A prototype was implemented to evaluate the performance of the mechanism. The results show that the invention achieves high accuracy and can be used in real life as a rogue AP detection method.

Drawings

FIG. 1 shows the relationship between the phase difference and Δt for the same device at different times and locations, in which the slope of the stripes is closely related to the CFO: (a) location l₁, time t₁; (b) location l₁, time t₂; (c) location l₂, time t₁;

FIG. 2 shows the relationship between the phase difference and Δt for different devices in the same environment: (a) device 1, (b) device 2, (c) device 3;

FIG. 3 shows the effect of removing FDD and SFO with the method of the present invention: (a) the stripe image extracted from subcarrier −28, (b) the stripe image using the newly defined phase value;

FIG. 4 is a graph of the effect of selecting adjacent phase pairs after phase processing, (a) using a fringe pattern of all adjacent phase pairs, (b) using a fringe pattern based on phase pairs after processing;

FIG. 5 is a graph of the CFO values for two APs (NETGEAR R7000 and TP-LINK WDR4300) at different times of the day, where each interpolated value pair represents the maximum and minimum values among the 15 measurements;

FIG. 6 shows the CFO values collected from a Xiaomi Note phone over one month;

FIG. 7 is a schematic diagram of the laboratory setup, where the triangles represent the placement points;

FIG. 8 is a graph of measurements of different APs at different locations in the laboratory.

Detailed Description

The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.

The present invention addresses the following two major threats in WLANs:

Rogue AP (disguised hotspot): an attacker sets up an unauthorized AP in a public place such as an airport or cafe to impersonate the authorized AP. The attacker is assumed to be powerful enough to modify the SSID and BSSID fields of every frame so that they match those of the authorized AP. In addition, the attacker uses the same authentication scheme (such as pre-shared key or 802.1X authentication) as the authorized AP, but always lets the user pass authentication. Note that the rogue AP and the real AP are active at the same time, so signal strength is the only selection criterion. According to our experiments, if two overlapping WLANs use the same SSID, a mobile device shows only the one with the stronger signal in its WLAN list. If the user logs in to the fake AP, the attacker can mount a man-in-the-middle attack to capture user information or analyze the user's traffic without being discovered.

WiFi freeloading: an attacker steals the credentials for a private network and then logs in to it as a legitimate user. The credential may be a simple password; even if the WLAN adopts WPA-Enterprise mode, the attacker can obtain the credentials in many ways. Once inside the network, the attacker can do more than steal bandwidth.

The method of the present invention estimates the frequency offset of a device (i.e., its CFO) as fingerprint information in order to detect these attacks. The CFO arises from the offset of the carrier oscillator in the WiFi network card; it does not change with time or location and differs only from device to device. Furthermore, it is difficult to forge, because it is purely a hardware property and is hardly affected by any running software.

The CSI phase information captures the phase offset accumulated during signal transmission. We therefore consider whether the CFO can be estimated from CSI measurements. If so, the CFO can be estimated without additional equipment, since CSI can be obtained by an upper-layer application once the network card driver of an existing wireless device has been modified.

The composition of each phase value of the CSI is analyzed before the CFO is estimated from the CSI.

Decomposing the phase values of the CSI

Assume that the fingerprinting device receives n frames from the target device and obtains a CSI measurement from the network card driver for each frame. Consider the CSI of a frame between one Tx-Rx pair at time t. For the k-th subcarrier, the CSI measurement includes a phase field φ_{t,k}, which measures the phase offset of the frame between the sender and the receiver on that subcarrier. φ_{t,k} is composed of four components:

φ_{t,k} = ω_{t,k} + θ_{t,k} + ψ_{t,k} + 2πΔf_c·t (1)

where 2πΔf_c·t is the phase shift caused by the CFO, and the other three terms arise as follows:

ω_{t,k}: the phase offset caused by frame detection delay (FDD). When a frame arrives at the receiver, the receiver spends some time detecting it, which introduces a time delay τ_d in the CSI measurement. This delay results in a phase shift ω_{t,k} that is proportional to frequency and is expressed mathematically as ω_{t,k} = 2παkζ_d, where α is a constant coefficient, k is the subcarrier index, and ζ_d is a value highly correlated with τ_d. Since ζ_d varies with time, ω_{t,k} also differs from frame to frame.

θ_{t,k}: the phase offset caused by sampling frequency offset (SFO). SFO is caused by the sampling clocks of the sender and the receiver being unsynchronized. As with frame detection delay, this asynchrony introduces a time delay τ_s, which in turn causes a phase shift that is linearly related to the subcarrier index. Therefore θ_{t,k} = 2πβkζ_s, where β is a constant, k is the subcarrier index, and ζ_s is a variable determined by τ_s.

ψ_{t,k}: the phase offset caused by the time of flight (ToF), i.e., the time the signal takes to travel from the sender to the receiver. In the absence of multipath, ψ_{t,k} = 2πf_k·t_p, where t_p is the propagation time from sender to receiver and f_k is the frequency of the k-th subcarrier. Once multipath is considered, there is an additional term that is closely related to the environment. Because this offset is mainly determined by the time of flight, it is useful for indoor positioning, and many prior studies use the phase of CSI for that purpose. However, since the present invention aims to extract the CFO component rather than the ToF component, the existing algorithms cannot be used directly.

Based on the above analysis, equation (1) can be rewritten as:

φ_{t,k} = k(2παζ_d + 2πβζ_s) + ψ_{t,k} + 2πΔf_c·t (2)

Estimating the CFO from the decomposed CSI phase values

First, a new phase variable is defined as the average of the phases of subcarriers 1 and −1, φ̄_t = (φ_{t,1} + φ_{t,−1})/2. For each pair of adjacent frames, the phase difference Δφ̄ (mapped to [−π, π]) and the time difference of arrival (TDoA) Δt, in microseconds, are calculated. All points (Δt, Δφ̄) are then plotted, as shown in FIG. 1(a). The points form a series of periodic stripes, as marked by the boxes in FIG. 1(a). Furthermore, the stripes are oblique and appear to share the same slope. The slope of these stripes is estimated and taken as the value of the CFO.
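The construction of the plotted points can be sketched as below: for each pair of adjacent frames, take the arrival-time difference and the phase difference wrapped into the principal interval. This is a minimal sketch under stated assumptions; the function names, the input layout (one timestamp in microseconds and one phase value per frame), and the sample numbers are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def wrap_to_pi(angle: float) -> float:
    """Map an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def adjacent_frame_points(timestamps_us: Sequence[float],
                          phases: Sequence[float]) -> List[Tuple[float, float]]:
    """Return one (delta_t, wrapped delta_phase) point per adjacent frame pair."""
    points = []
    for i in range(1, len(timestamps_us)):
        delta_t = timestamps_us[i] - timestamps_us[i - 1]
        delta_phi = wrap_to_pi(phases[i] - phases[i - 1])
        points.append((delta_t, delta_phi))
    return points


# Illustrative input: three frames; the third phase has gone once around
# the circle, so its difference wraps back to about 0.1 rad.
points = adjacent_frame_points([0.0, 100.0, 250.0],
                               [0.0, 0.3, 0.3 + 2.0 * math.pi + 0.1])
```

The wrap uses a half-open interval, so a difference of exactly π is reported as −π; this does not change the stripe geometry.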

As an example, FIG. 1 shows stripe images of the same device measured at different times and places. FIG. 2 compares the stripe images of different devices, whose stripe slopes are found to differ. In summary, these images show that the stripe slope varies only from device to device and does not change with the environment, which is exactly the property expected of the CFO; the slope is therefore taken as an estimate of the CFO.

Removal of frame detection delay (FDD) and sampling frequency offset (SFO): both FDD and SFO introduce a time delay in the CSI measurement, which causes a phase offset that is linearly related to the subcarrier index. According to formula (2), if two subcarrier indices satisfy k₁ + k₂ = 0, adding φ_{t,k₁} and φ_{t,k₂} removes the phase offset due to FDD and SFO:

φ_{t,k₁} + φ_{t,k₂} = ψ_{t,k₁} + ψ_{t,k₂} + 4πΔf_c·t

In the 802.11n standard, when the bandwidth is 20 MHz, CSI measurements record the 30 subcarriers with indices [−28, −26, ..., −2, −1, 1, 3, ..., 27, 28]. Among them, only the two pairs [−1, 1] and [−28, 28] meet the requirement. This is why the present invention defines the averaged phase φ̄_t = (φ_{t,1} + φ_{t,−1})/2. For time t, φ̄_t can be expressed as follows:

φ̄_t = (ψ_{t,1} + ψ_{t,−1})/2 + 2πΔf_c·t (3)

FIG. 3 compares the stripe image of the raw phase values of subcarrier −28 with the stripe image of the averaged phase values of subcarriers −1 and 1; removing the phase offsets of FDD and SFO makes the image much clearer.
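The cancellation can be checked numerically with the phase model of equation (2). The sketch below is illustrative only: the parameter values are made up, α and β default to 1, and the ToF phase is assumed equal on both subcarriers to keep the example small.

```python
import math


def csi_phase(k: int, t: float, zeta_d: float, zeta_s: float,
              tof_phase: float, cfo_hz: float,
              alpha: float = 1.0, beta: float = 1.0) -> float:
    """Unwrapped phase model of equation (2) for subcarrier index k at time t."""
    linear_term = k * (2 * math.pi * alpha * zeta_d + 2 * math.pi * beta * zeta_s)
    return linear_term + tof_phase + 2 * math.pi * cfo_hz * t


def averaged_phase(phi_plus: float, phi_minus: float) -> float:
    """Average the phases of two subcarriers whose indices sum to zero."""
    return (phi_plus + phi_minus) / 2.0


# Illustrative values (assumptions): same frame time and CFO, but two
# different frame detection delays zeta_d.
t, tof, cfo = 1e-3, 0.2, 100.0
raw_a = csi_phase(1, t, 0.01, 0.02, tof, cfo)
raw_b = csi_phase(1, t, 0.05, 0.02, tof, cfo)
avg_a = averaged_phase(raw_a, csi_phase(-1, t, 0.01, 0.02, tof, cfo))
avg_b = averaged_phase(raw_b, csi_phase(-1, t, 0.05, 0.02, tof, cfo))
```

The raw phases of subcarrier 1 differ between the two delay settings, while the averaged phases agree and equal the ToF term plus 2πΔf_c·t, which is the behaviour the averaging is meant to achieve.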

From formula (4), Δf_c is evidently the slope relating the phase difference Δφ̄ of adjacent frames to Δt (up to the constant factor 2π), so in theory the points in the stripe image should form periodic lines rather than periodic stripes.

The difference between theory and reality is due to various measurement errors and to a known firmware problem of the Intel 5300 network card we use. The measurement errors make each stripe a band of noisy points rather than a smooth line. For example, if the clock frequency of the device is 10 MHz, the receiver can timestamp the arrival of a frame with a resolution of at best 0.1 microseconds, so the spacing along the Δt axis between any pair of points in the stripe image is at least 0.1 microseconds. The firmware problem creates extra stripes in the image, so that the phase difference is no longer a function of Δt. Such additional stripes, however, do not affect the slope estimation. Moreover, for detecting attacks, it is better to use a CFO-related quantity as the fingerprint than the CFO itself. In other words, the problem of estimating the CFO from CSI measurements is converted into calculating the stripe slope from a noisy stripe image.

When a mobile device (laptop or smartphone) wants to obtain the fingerprint of an AP, it first connects to the AP as usual. The mobile device then sends test data to the AP with the built-in ping tool and collects CSI measurements for all reply frames. To ensure high fingerprint accuracy, approximately 5000 frames are required. Since the transmission rate can reach 11 Mbps even in the 802.11b scenario, this process takes less than 10 seconds. From the CSI measurements, the stripe pattern described above is derived, and the slope of the stripes is taken as the fingerprint of the device. The process of fingerprinting a mobile device is the same, except that a WiFi P2P connection is configured between the two devices.

Obtaining the CFO value from the stripe image involves two steps: stripe extraction and slope estimation.

Data feature extraction

To estimate the slope, the set of points that make up each stripe is first obtained. Stripe extraction is divided into three steps, as shown in FIG. 4.

(1) Selecting high density regions

The distribution of TDoAs between adjacent frames is mainly determined by the transmission rate of the network. As a result, as illustrated in the first subgraph of FIG. 4, some intervals on the Δt axis contain more points than others and form high-density regions in the stripe pattern. Clearly, the more points a stripe has, the more accurate the linear fit. A sliding window algorithm is applied along the Δt axis to identify these high-density regions. The number of points in each fixed-length window is counted, and the window is then advanced by one window length. Finally, the window containing the most points is selected as the high-density region, and all subsequent processing is performed within this window. The window size was set empirically after examining 30 devices: if the average span of a stripe on the Δt axis is s, the window length is set to 6s in our experiments, which ensures that the extracted high-density region contains at least 3 distinct stripes.
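A minimal sketch of this window selection is given below. It steps a fixed-length window along the Δt axis one window length at a time, as described, and keeps the window with the most points. The function name densest_window and the sample points are assumptions; the window length would be 6s in the experiments described above.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (delta_t, delta_phase)


def densest_window(points: Sequence[Point],
                   window_len: float) -> Tuple[float, List[Point]]:
    """Return (window start, points inside) for the fullest window.

    Windows are non-overlapping and advance by one window length.
    """
    if not points or window_len <= 0:
        raise ValueError("need at least one point and a positive window length")
    first = min(dt for dt, _ in points)
    last = max(dt for dt, _ in points)
    best_start, best_count = first, -1
    start = first
    while start <= last:
        count = sum(1 for dt, _ in points if start <= dt < start + window_len)
        if count > best_count:
            best_start, best_count = start, count
        start += window_len
    selected = [p for p in points if best_start <= p[0] < best_start + window_len]
    return best_start, selected


# Illustrative points: the interval [10, 15) on the delta_t axis is densest.
sample_points = [(0.0, 0.1), (1.0, 0.2), (2.0, 0.3),
                 (10.0, 0.1), (11.0, 0.2), (12.0, 0.3), (13.0, 0.4), (14.0, 0.5),
                 (30.0, 0.1)]
window_start, dense_points = densest_window(sample_points, window_len=5.0)
```

Counting by a full scan per window keeps the sketch short; sorting the points once would be the natural refinement for the roughly 5000 frames collected per device.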

(2) Converting high density regions into binary images

The second subgraph of FIG. 4 shows that most points are concentrated along the centerline of each stripe, while some outliers are sparsely distributed between stripes. These outliers degrade the slope estimation and therefore need to be removed. For this purpose, the high-density region obtained in the previous step is converted into a binary image, which eliminates outliers by exploiting the large difference between normal points and outliers. The region is first rasterized into a grid of identical small rectangles, each corresponding to one pixel of the new binary image. Then the number of points in each rectangle is counted. If the count exceeds a predefined threshold, the corresponding pixel of the binary image is set to 1; otherwise it is set to 0. As shown in the third subgraph of FIG. 4, this removes many outliers. For images that still contain substantial noise, existing mechanisms such as PCA-based ALM are used for further processing.
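The rasterize-and-threshold step can be sketched as follows. The grid size, bounds, and threshold are free parameters that the text leaves unspecified, so the values used here are purely illustrative, and the function name rasterize is an assumption.

```python
from typing import List, Sequence, Tuple


def rasterize(points: Sequence[Tuple[float, float]],
              x_min: float, x_max: float, y_min: float, y_max: float,
              n_cols: int, n_rows: int, threshold: int) -> List[List[int]]:
    """Count points per grid cell and set a pixel to 1 where the count
    exceeds `threshold`, otherwise 0. Rows index y, columns index x."""
    counts = [[0] * n_cols for _ in range(n_rows)]
    cell_w = (x_max - x_min) / n_cols
    cell_h = (y_max - y_min) / n_rows
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore points outside the selected region
        col = min(int((x - x_min) / cell_w), n_cols - 1)
        row = min(int((y - y_min) / cell_h), n_rows - 1)
        counts[row][col] += 1
    return [[1 if c > threshold else 0 for c in row] for row in counts]


# Illustrative data: three points cluster in one cell, one outlier sits alone.
binary = rasterize([(0.1, 0.1), (0.2, 0.3), (0.5, 0.5), (1.5, 1.5)],
                   x_min=0.0, x_max=2.0, y_min=0.0, y_max=2.0,
                   n_cols=2, n_rows=2, threshold=1)
```

With a threshold of 1, the lone point in the far cell is dropped while the cluster survives, which is the outlier-removal effect the paragraph describes.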

(3) Obtaining connected components

The image segmentation algorithm proposed by Haralick and Shapiro is used to obtain the k longest connected components in the 2D binary image, and the problem thereby converts from estimating the slope of the stripes to estimating the slope of the k longest connected components. In practice, some erroneous components that do not correspond to any stripe may be obtained; these are handled by the mechanism described below.
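For illustration, connected components of a binary image can be found with a plain breadth-first flood fill. This is a stand-in sketch, not the Haralick and Shapiro algorithm named in the text, and it makes two assumptions: pixels are 8-connected, and "longest" is approximated by pixel count.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, col)


def connected_components(image: List[List[int]]) -> List[List[Pixel]]:
    """Group the 1-pixels of a binary image into 8-connected components."""
    n_rows = len(image)
    n_cols = len(image[0]) if image else 0
    seen = [[False] * n_cols for _ in range(n_rows)]
    components = []
    for r in range(n_rows):
        for c in range(n_cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, component = deque([(r, c)]), []
            while queue:
                cr, cc = queue.popleft()
                component.append((cr, cc))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < n_rows and 0 <= nc < n_cols
                                and image[nr][nc] == 1 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            components.append(component)
    return components


def k_largest_components(image: List[List[int]], k: int) -> List[List[Pixel]]:
    """Return the k components with the most pixels (assumed proxy for length)."""
    return sorted(connected_components(image), key=len, reverse=True)[:k]


# Illustrative image: a 3-pixel diagonal stripe and a 2-pixel fragment.
image = [[1, 0, 0, 0, 1],
         [0, 1, 0, 0, 1],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]
top = k_largest_components(image, k=1)
```

Eight-connectivity matters here because an oblique stripe rasterizes into diagonally adjacent pixels, which four-connectivity would split apart.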

Estimation of the stripe slope

After stripe extraction, a set of points is obtained for each long connected component. Let S denote the point set of one connected component, with points (x_i, y_i). The slope k_c of the stripe can be obtained by the least squares method,

k_c = Σ_i (x_i − x̄)(y_i − ȳ) / Σ_i (x_i − x̄)²

where x̄ and ȳ are the means of the x and y coordinates of the points in S.
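The least squares fit for one connected component can be sketched as below. The rule for combining the slopes of the k components is not legible in this text, so the median used in consensus_slope is an assumption, chosen only because it favours the cluster of close (correct) estimates over scattered (wrong) ones; all names and numbers are illustrative.

```python
import statistics
from typing import Sequence, Tuple


def least_squares_slope(points: Sequence[Tuple[float, float]]) -> float:
    """Ordinary least squares slope of y on x for one point set S."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return s_xy / s_xx


def consensus_slope(slopes: Sequence[float]) -> float:
    """Combine per-component slopes; the median is an assumed choice."""
    return statistics.median(slopes)


# Illustrative data: points on the line y = 2x + 1, and three component
# slopes of which one (9.0) comes from a spurious component.
slope = least_squares_slope([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
final = consensus_slope([2.0, 2.1, 9.0])
```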

Bidirectional identification of WiFi hotspots and wireless devices

Results of the experiment

The implementation effect of the method of the invention is tested through experiments.

CFO fingerprint data of the test devices (APs and smartphones) were collected using a ThinkPad X200 laptop as the data collector. The laptop is equipped with an Intel 5300 wireless network card, for which a modified third-party driver is installed so that CSI measurements can be obtained for each received frame. To fingerprint a router, the laptop connects to the router and uses the ping tool under Linux to send messages to it. For each reply frame, the laptop obtains and stores the corresponding CSI measurement. The default interval of 1 second between two ping messages is too long for capturing a device fingerprint, so the -i parameter is used to reduce the interval to 0.002 seconds, which allows 500 frames to be sent per second. For each device, a total of 5000 reply frames are collected, which typically takes approximately 10 seconds. The laptop then calculates the CFO as the fingerprint of the device by the method described above. To collect the fingerprint of a smartphone, the laptop is configured as a WiFi hotspot. The phone connects to the laptop and sends messages to it, just as when collecting the fingerprint of a router (AP), and the fingerprint of the phone is calculated from the CSI values of the received frames.
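The probing parameters above can be captured in a small helper that builds the ping invocation and checks the expected collection time. This is a sketch only: the text mentions the -i option, while the -c (count) option is added here as an assumption, the target address is hypothetical, and the command is constructed but not executed.

```python
from typing import List


def build_ping_command(target_ip: str, interval_s: float,
                       frame_count: int) -> List[str]:
    """Build a Linux ping argument list: -i sets the interval between
    probes in seconds, -c stops after the given number of probes."""
    return ["ping", "-i", str(interval_s), "-c", str(frame_count), target_ip]


def expected_duration_s(interval_s: float, frame_count: int) -> float:
    """Approximate collection time, ignoring round-trip latency."""
    return interval_s * frame_count


# Values from the experiment description; the router address is hypothetical.
command = build_ping_command("192.168.1.1", interval_s=0.002, frame_count=5000)
duration = expected_duration_s(0.002, 5000)   # about 10 seconds
frames_per_second = 1.0 / 0.002               # about 500 frames per second
```

On many Linux systems, very short ping intervals may require elevated privileges, so the list would typically be passed to a process launcher running with suitable rights.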

The evaluation focuses on two aspects of performance: first, the stability of the fingerprint over time and location; second, the detection rate and the false alarm rate of the method. A total of 23 smartphones and 30 APs, including both identical and different models, were tested.

Stability test

The fingerprint in the experiment refers to the value of CFO estimated from CSI measurements, which can be conveniently obtained from existing wireless devices.

Stability over time: the experiment considered two time spans, one day and one month. Fig. 5 shows CFO values collected at different times of the day for two different models of APs (NETGEAR R7000 and TP-LINK WDR 4300). Both APs were allowed to run throughout the day and the CSI values were recorded every 6 hours. It can be seen that the CFO estimates for both APs are consistent throughout any period of the day.

To verify that the CFO is stable over the long term, CFO values were collected from a Xiaomi Note handset every day for one month. As shown in Fig. 6, the CFO estimates over the month were essentially the same and differed by at most 0.1.

Spatial stability: the phase of the CSI contains a ToF offset, which depends strongly on the relative position of the devices and on the surrounding environment. We proposed several methods to cancel this interference factor. To obtain a complex multipath environment, an indoor setting, the 7.7 m × 6.5 m laboratory shown in Fig. 7, was chosen to demonstrate the spatial stability of the CFO fingerprints of four APs. Each AP was placed at four different locations, one of which was separated from the computer by a wall. In each test, two testers walked around the room to simulate a realistic, constantly changing environment. Fingerprinting was performed 15 times for each AP at each location. The average CFO values of these APs at the different locations are presented in Fig. 8. The variation in CFO caused by the environment is negligible compared with the differences between APs, which shows that the measured CFOs are spatially stable. The same experiments were performed on 4 different smartphones with similar results; only the AP results are shown due to space limitations.

Accuracy test

Each AP/handset is fingerprinted K times and the results are stored in a database. Assuming a total of N devices, K·N fingerprints are collected. In each test, for each device d (d being the device index), a sample set H_d is formed by randomly selecting M of its K fingerprints. A whitelist W is then established from H_d. In the experiment, M was set to about one third of K, so if K is 7, then M is 2. Let S denote the set of remaining fingerprints. Comparing a fingerprint in S with the fingerprint of a different device in W simulates the detection of a Rogue AP or Freeloading attack. Comparing a fingerprint in S with the fingerprint of the same device in W simulates a normal session. The detection rate P_d of the attack and the false alarm rate P_f can therefore be defined as follows:
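The split into sample sets and remaining fingerprints can be sketched as follows. The helper names, the toy database, and the choice M = K // 3 (rounding down, which gives M = 2 for K = 7) are assumptions; the quantity computed from H_d to fill the whitelist is not reproduced in the text, so the sketch simply stores H_d itself.

```python
import random

def split_fingerprints(fingerprints, rng):
    """Split one device's K fingerprints into H_d (M of them) and the rest."""
    k = len(fingerprints)
    m = k // 3  # assumption: "one third of K" rounded down
    chosen = set(rng.sample(range(k), m))
    h_d = [fingerprints[i] for i in range(k) if i in chosen]
    rest = [fingerprints[i] for i in range(k) if i not in chosen]
    return h_d, rest

def build_whitelist(database, seed=0):
    """`database` maps device id -> list of K fingerprints.

    Returns (whitelist, remaining): whitelist maps device id -> H_d, and
    remaining is the set S as a list of (device id, fingerprint) pairs.
    """
    rng = random.Random(seed)
    whitelist, remaining = {}, []
    for device, fps in database.items():
        h_d, rest = split_fingerprints(fps, rng)
        whitelist[device] = h_d
        remaining.extend((device, fp) for fp in rest)
    return whitelist, remaining

# Hypothetical database: two devices, K = 7 fingerprints each.
database = {
    "ap1": [30.1, 30.3, 29.9, 30.0, 30.2, 30.1, 29.8],
    "ap2": [45.2, 45.0, 44.9, 45.1, 45.3, 44.8, 45.0],
}
whitelist, remaining = build_whitelist(database)
```

With K = 7, each H_d holds 2 fingerprints and S holds the other 5 per device.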

where id(i) denotes the device number corresponding to fingerprint i, and match(i, j) is 1 if fingerprint i and fingerprint j match and 0 otherwise. The matching procedure computes a threshold T_h from H_{id(i)} (the sample set of the device with device number id(i)); if T_h is less than 1°, T_h is set to 1°. The absolute angular difference between the two stripes is taken as d_a. Whether two fingerprints match is then decided by comparing d_a with T_h: if d_a is less than T_h, the fingerprints match; otherwise they do not.
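The matching rule can be sketched as below. Because the formulas for P_d and P_f and the way T_h is computed from the sample set are not reproduced in the text, the sketch makes explicit assumptions: fingerprints are stripe angles in degrees, each whitelisted device has one reference angle and one precomputed T_h, P_d is the fraction of cross-device comparisons that do not match, and P_f is the fraction of same-device comparisons that do not match. All names and numbers are hypothetical.

```python
def is_match(d_a, t_h, floor_deg=1.0):
    """Match when the angular difference d_a is below T_h, with T_h floored at 1 degree."""
    return d_a < max(t_h, floor_deg)

def detection_and_false_alarm(remaining, reference, thresholds):
    """Return (P_d, P_f) under the assumptions stated above.

    remaining:  list of (device id, angle in degrees), the set S
    reference:  device id -> whitelisted angle (assumed representation of W)
    thresholds: device id -> T_h for that device
    """
    attacks = detected = sessions = alarms = 0
    for device, angle in remaining:
        for listed, ref_angle in reference.items():
            matched = is_match(abs(angle - ref_angle), thresholds[listed])
            if listed != device:
                # Cross-device comparison simulates an attack.
                attacks += 1
                detected += not matched
            else:
                # Same-device comparison simulates a normal session.
                sessions += 1
                alarms += not matched
    p_d = detected / attacks if attacks else 0.0
    p_f = alarms / sessions if sessions else 0.0
    return p_d, p_f

reference = {"ap1": 30.0, "ap2": 45.0}
thresholds = {"ap1": 0.4, "ap2": 2.0}  # 0.4 is raised to the 1 degree floor
remaining = [("ap1", 30.5), ("ap1", 31.5), ("ap2", 44.0), ("ap2", 30.2)]
p_d, p_f = detection_and_false_alarm(remaining, reference, thresholds)
```

In this toy data, three of the four simulated attacks are detected and two of the four normal sessions raise a false alarm.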

TABLE I

Details of the AP experiment scenarios

Scenario	AP brand	Total fingerprints	# of APs	Value of K
Teaching building	Huashi	84	12	7
Starbucks	Unknown	90	6	15
Laboratory	Net work, general union	120	8	15
Library	Huashi	60	4	15

TABLE II

AP identification accuracy in different scenarios

Scenario	Detection rate P_d	False alarm rate P_f
Teaching building	94.32%	4.52%
Starbucks	95.05%	2.31%
Laboratory	97.24%	1.47%
Library	94.37%	5.11%

Consider first the evaluation of APs. Experiments were performed in four scenarios, namely a teaching building, a laboratory, a library, and Starbucks, and a total of 354 fingerprints were collected. Details of the APs in these environments are shown in Table I. In Starbucks the equipment is not visible, so the brand is unknown. Except for Starbucks, all APs in each scenario are of the same vendor and model. This is in fact the worst-case attack scenario, in which an attacker sets up a fake AP of the same model as the legitimate AP; intuitively, APs from the same vendor are more likely to have similar fingerprints. According to the results in Table II, the detection rate exceeds 94% in all four scenarios, while the false alarm rate does not exceed 5.11%.

In addition to the APs, similar experiments were performed on smartphones. The experiment included 23 smartphones from different manufacturers, as shown in Table III, with K = 15 for each phone. The detection rate and the false alarm rate are defined as in the AP test. The detection rate reaches 94%, and the false alarm rate is below 3%.

TABLE III

Details of the smartphones

Smartphone brand	Number of devices
Meizu	3
Samsung	5
Xiaomi	7
Others	8

Experiments were also performed on phones of the same model, as described in Table IV. Because the data set is small, P_d and P_f are expressed as fractions. The results show that the method of the invention can still distinguish between different devices of the same model.

TABLE IV

Accuracy for mobile phones of the same model

Smartphone	Number of devices	Detection rate P_d	False alarm rate P_f
Xiaomi phones with the same WiFi network card	6	149/160	1/70
Samsung S4	3	70/70	0/35

Claims

1. A method for fingerprint modeling and identification for a wireless device, comprising the steps of:

(1) decomposing a phase value of the channel state information; the method comprises the following steps:

for a pair of communicating wireless devices, the wireless devices including a receiving device and a transmitting device, when the receiving device receives n frames from the transmitting device, for each frame it obtains the measured channel state information from the network card driver; for the k-th subcarrier, the channel state information includes a phase field φ_{t,k}, which measures the phase offset of the frame between the sender and the receiver on that subcarrier;

φ_{t,k} = k(2παζ_d + 2πβζ_s) + ψ_{t,k} + 2πΔf_c·t (Formula 1)

where α and β are constant coefficients; ζ_d is a quantity related to the frame detection delay; ζ_s is the time offset term introduced by the unsynchronized sampling clocks of the transmitting and receiving ends; ψ_{t,k} is the phase offset introduced by the propagation delay; Δf_c is the carrier frequency offset to be estimated; and 2πΔf_c is the carrier-frequency-offset-related term, which differs from the carrier frequency offset only by a constant coefficient;

(2) eliminating interference items irrelevant to carrier frequency offset from phase values resolved from channel state information; the method comprises the following steps:

(2.1) removing frame detection delay and sampling frequency offset: according to Formula 1, the phases of the two subcarriers satisfying k₁ = 1 and k₂ = −1, namely φ_{t,1} and φ_{t,−1}, are averaged, which removes the phase shift caused by the frame detection delay and the sampling frequency offset; for time t, according to Formula 1, define

φ_t = (φ_{t,1} + φ_{t,−1}) / 2 = (ψ_{t,1} + ψ_{t,−1}) / 2 + 2πΔf_c·t

where φ_{t,1} and φ_{t,−1} represent the phase values of the subcarriers with indices 1 and −1, respectively;

(2.2) removing propagation delay: the receiver and sender are first required to remain stationary while the frames are collected, so that their relative distance is fixed; the phase difference is then defined:

for each pair of adjacent frames, the phase difference and the arrival time difference Δt, in microseconds, are calculated; all of these points are then plotted in a phase-time coordinate system, where they form a series of periodic stripes;

(3) calculating an estimated value of the carrier frequency offset from the stripe image, comprising:

(3.1) selecting high-density regions

Identifying the high-density region using a sliding window algorithm; in each fixed-size window, the number of points is counted, and the window is then moved forward by one window length; finally, the window containing the largest number of points is selected as the high-density region;

(3.2) converting the high-density region into a binary image

Equally dividing the high-density area into small rectangles, wherein each rectangle corresponds to one pixel in the newly generated binary image; then, for each rectangle, calculating the number of points therein; if the total number exceeds a predefined threshold, the corresponding pixel is set to 1, otherwise to 0;

(3.3) obtaining the connected components

Identifying the k largest connected components in the binary image and the point set of each connected component, so that the problem of estimating the slope of the stripes becomes the problem of estimating the slopes of the k largest connected components;

(3.4) estimation of fringe slope

After steps (3.1) to (3.3), a point set is obtained for each of the k largest connected components; let S denote the point set of one connected component and k_c the slope of the stripe; an estimate of the slope is obtained by the least squares method

where ρ is the coefficient to be optimized by the least squares method and β is a constant; thus, each connected component is found, and the slope estimate calculated by the least squares method is taken as the slope of the stripe; finally, the slope values are clustered, and the mean slope of the largest cluster is taken as the final carrier frequency offset estimate;

(4) the carrier frequency offset estimation value is used as the fingerprint feature of the wireless equipment to perform bidirectional identification on the interconnected WiFi hotspot and the wireless equipment; the method comprises the following steps:

a) fingerprint collection is carried out on legal WiFi hotspots, and a white list is established;

b) when a WiFi hotspot is accessed, the carrier frequency offset estimate of the currently accessed WiFi hotspot is obtained by the method of steps (1) to (3) and compared with the fingerprint feature of that WiFi hotspot in the whitelist; if the similarity is lower than a threshold, the WiFi hotspot is judged to be illegitimate;

c) a whitelist is established in advance on the WiFi hotspot for the wireless devices that need access, and the WiFi hotspot performs reverse identification of an accessing wireless device by comparing its fingerprint feature with the whitelist.
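Steps (3.1) and (3.2) and the final clustering in step (3.4) can be sketched as follows. This is a minimal illustration under several assumptions that the claim does not fix: the window slides along the time axis, the binary image is indexed as rows of phase by columns of time, and clustering is done by grouping sorted slopes whose neighbours differ by at most a tolerance. All function names, parameters and data are hypothetical.

```python
def densest_window(points, window, axis_min, axis_max):
    """Step (3.1): move a fixed-length window along the time axis in steps of
    one window length and return the (start, end) interval with most points."""
    best, best_count = None, -1
    start = axis_min
    while start < axis_max:
        end = start + window
        count = sum(1 for t, _ in points if start <= t < end)
        if count > best_count:
            best, best_count = (start, end), count
        start = end
    return best

def binarize(points, t_range, p_range, cols, rows, threshold):
    """Step (3.2): divide the region into rows x cols rectangles; a pixel is 1
    when its rectangle holds more than `threshold` points, otherwise 0."""
    t0, t1 = t_range
    p0, p1 = p_range
    counts = [[0] * cols for _ in range(rows)]
    for t, p in points:
        if t0 <= t < t1 and p0 <= p < p1:
            c = min(int((t - t0) / (t1 - t0) * cols), cols - 1)
            r = min(int((p - p0) / (p1 - p0) * rows), rows - 1)
            counts[r][c] += 1
    return [[1 if n > threshold else 0 for n in row] for row in counts]

def majority_cluster_mean(slopes, tolerance):
    """Step (3.4): group sorted slopes whose neighbours differ by at most
    `tolerance`, then return the mean of the most populated group."""
    clusters = []
    for s in sorted(slopes):
        if clusters and s - clusters[-1][-1] <= tolerance:
            clusters[-1].append(s)
        else:
            clusters.append([s])
    largest = max(clusters, key=len)
    return sum(largest) / len(largest)

# Hypothetical (time, phase difference) points.
points = [(1, 0.1), (2, 0.2), (11, 0.1), (12, 0.5), (13, 0.9), (25, 0.3)]
region = densest_window(points, window=10, axis_min=0, axis_max=30)
image = binarize(points, region, (0.0, 1.0), cols=2, rows=2, threshold=0)
estimate = majority_cluster_mean([2.0, 2.1, 1.9, 5.0], tolerance=0.2)
```

The outlier slope 5.0 falls into its own cluster and is ignored, which mirrors how slopes from erroneous components are discarded before averaging.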