CN109600751B - Pseudo base station detection method based on network side user data - Google Patents

Info

Publication number
CN109600751B
Authority
CN
China
Legal status
Expired - Fee Related
Application number
CN201811376023.7A
Other languages
Chinese (zh)
Other versions
CN109600751A (en)
Inventor
戴彬
毛世奇
Current Assignee
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W12/00: Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/12: Detection or prevention of fraud

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a pseudo base station detection method based on network side user data, which comprises the following steps: analyzing a URL and corresponding access information from HTTP request data of multiple users of the mobile internet, and filtering the URL obtained by analysis according to a domain name white list of the current network so as to obtain an abnormal URL; for any abnormal URL, obtaining an access quantity threshold value of the abnormal URL, and determining a base station with the access quantity of the abnormal URL being larger than the access quantity threshold value in a target time interval as an abnormal base station, so as to obtain an abnormal base station set corresponding to the abnormal URL; and for any abnormal URL, if the corresponding abnormal base station set is not empty, determining that the abnormal URL is a malicious URL, and dividing the abnormal base station set to enable adjacent abnormal base stations in the geographical position to belong to the same abnormal base station subset, thereby obtaining the activity area of the pseudo base station. The invention can accurately obtain a plurality of activity areas of the pseudo base station without depending on the mobile terminal user.

Description

Pseudo base station detection method based on network side user data
Technical Field
The invention belongs to the field of mobile internet network security, and particularly relates to a pseudo base station detection method based on network side user data.
Background
A pseudo base station exploits the one-way authentication mechanism, under which a mobile phone cannot verify the identity of a base station, to acquire mobile phone card information and forcibly push spam short messages to mobile phones in bulk. In the mobile internet, pseudo base stations are one of the main channels for propagating malicious URLs (Uniform Resource Locators). Because the sender number of the spam messages mass-sent by a pseudo base station can be changed at will, phishing websites of a financial-fraud nature are often propagated under official service numbers that users are unlikely to be wary of, such as 95555, 95588 and 10086, so users are easily deceived and suffer financial losses. Pseudo base stations are also mobile, can send short messages efficiently, and offer a high return for a low investment. Detecting pseudo base stations at the data pipeline side can effectively reduce the harm done to users by malicious URL links in short messages mass-sent by pseudo base stations.
A communication operator cannot obtain data from mobile terminals and can only use network-side user data to monitor pseudo base stations. Current pseudo base station detection methods fall into two main categories: detection based on signaling interaction data and detection based on terminal APP data. The signaling-based method relies mainly on location update signaling observed on the mobile network side: a mobile phone that was forced to register with a pseudo base station initiates a location update from a source LAC (Location Area Code) to an LAC in the current network. The source LAC is the LAC of the pseudo base station, so it can be used to make a preliminary judgment on whether a pseudo base station is present. This method yields only the approximate position of a single pseudo base station and cannot track multiple pseudo base stations in real time. The method based on terminal APP data locates the pseudo base station precisely using information from the mobile terminal: it analyzes data related to the short message, such as the sending number and the new LAC, compares the processed information with data in a library to decide whether a pseudo base station is involved, and then estimates the position of the pseudo base station from the location information reported by users who received the spam short message. This approach requires positioning to be enabled on the mobile phone, which costs the phone traffic and battery power, and it cannot perceive the geographic activity track of a pseudo base station across a large area.
In general, the existing method for detecting the pseudo base station cannot accurately obtain the activity area of the pseudo base station without depending on the mobile terminal user, and is not beneficial to a communication operator to effectively track the pseudo base station in real time.
Disclosure of Invention
In view of the defects and the improvement requirements of the prior art, the invention provides a method for detecting pseudo base stations based on network side user data, which aims to accurately obtain the activity areas of a plurality of pseudo base stations without depending on mobile terminal users.
In order to achieve the above object, according to an aspect of the present invention, a method for detecting a pseudo base station based on network-side user data is provided, which includes the following steps:
(1) analyzing a URL and corresponding access information from HTTP request data of multiple users of the mobile internet, and filtering the URL obtained by analysis according to a domain name white list of the current network so as to obtain an abnormal URL;
(2) for any abnormal URL, obtaining an access quantity threshold value of the abnormal URL, and determining a base station with the access quantity of the abnormal URL being larger than the access quantity threshold value in a target time interval as an abnormal base station, so as to obtain an abnormal base station set corresponding to the abnormal URL;
(3) for any abnormal URL, if the corresponding abnormal base station set is not empty, determining that the abnormal URL is a malicious URL, and dividing the abnormal base station set to enable adjacent abnormal base stations in the geographic position to belong to the same abnormal base station subset, so that the activity range of the pseudo base station is obtained;
the access quantity threshold is the demarcation point between the URL access quantities of abnormal and non-abnormal base stations, and the starting time of the target time interval is the time when the corresponding abnormal URL is first detected.
According to a large amount of statistical information, after receiving spam messages carrying malicious URL links that are mass-sent by pseudo base station equipment, mobile terminal users access those links through HTTP requests within a certain time period. The HTTP requests that multiple victims initiate to such a URL within the coverage areas of the corresponding operator base stations show a distinctive geographic pattern: dense multi-user access in a local area, and sparse or even no access in the surrounding areas. The invention exploits this pattern: it counts the accesses to each abnormal URL through each base station in the target time interval and compares the counts with the access quantity threshold, thereby identifying the malicious URL and the corresponding abnormal base station set; it then divides geographically adjacent abnormal base stations in that set into the same abnormal base station subset, so that multiple activity areas of the pseudo base station can be obtained accurately and effectively.
The target time interval is set according to the behavior of pseudo base stations in sending spam short messages and the behavior of users in accessing the links in those messages, so that the collected data accurately reflects whether a base station exhibits the dense-access characteristic. This allows abnormal base stations to be identified accurately and improves the accuracy of pseudo base station detection.
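Steps (1) and (2) up to the per-base-station counts can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout `(user_id, base_station_id, timestamp, url)`, the function names, the suffix-based white list match, the example domains and the 7200-second interval are all assumptions made for the example, and URLs are assumed to be absolute (with a scheme).

```python
from collections import Counter
from urllib.parse import urlsplit


def is_whitelisted(url, whitelist):
    """True if the URL's host equals, or is a subdomain of, a white-listed domain."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in whitelist)


def count_abnormal_accesses(records, whitelist, interval_seconds):
    """Return {url: Counter({base_station_id: access_count})} for abnormal URLs.

    The target interval of each URL starts at its first observed access.
    """
    # Step (1): drop URLs whose domain is on the white list.
    abnormal = [r for r in records if not is_whitelisted(r[3], whitelist)]

    # Start of the target interval: first time each abnormal URL is seen.
    first_seen = {}
    for _, _, ts, url in abnormal:
        first_seen[url] = min(ts, first_seen.get(url, ts))

    # Step (2), counting part: accesses per base station inside the interval.
    counts = {}
    for _, bs_id, ts, url in abnormal:
        if ts - first_seen[url] <= interval_seconds:
            counts.setdefault(url, Counter())[bs_id] += 1
    return counts


# Hypothetical records: (user_id, base_station_id, unix_timestamp, url)
records = [
    ("u1", "bsid1", 0, "http://phish.example/login"),
    ("u2", "bsid1", 60, "http://phish.example/login"),
    ("u3", "bsid2", 120, "http://phish.example/login"),
    ("u4", "bsid1", 90000, "http://phish.example/login"),     # outside interval
    ("u5", "bsid3", 30, "http://www.operator.example/home"),  # white-listed
]
counts = count_abnormal_accesses(records, {"operator.example"}, interval_seconds=7200)
```

The resulting per-URL counters are the input that the access quantity threshold is later compared against.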
Further, in step (3), the method for dividing the abnormal base station set into the abnormal base station subset includes:
acquiring the longitude and latitude of each abnormal base station in the abnormal base station set, and calculating the distance between any two abnormal base stations;
if the distance between the first abnormal base station and the second abnormal base station is smaller than the distance threshold, judging that the first abnormal base station and the second abnormal base station are adjacent in geographic position;
if the first abnormal base station and the second abnormal base station are adjacent to the third abnormal base station in the geographical position respectively, judging that the first abnormal base station and the second abnormal base station are adjacent in the geographical position;
dividing the abnormal base stations adjacent to each other in the geographic position into the same abnormal base station subset;
the first abnormal base station, the second abnormal base station and the third abnormal base station are different abnormal base stations corresponding to the same malicious URL.
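The partition rule above (direct adjacency below a distance threshold, extended transitively through a shared neighbor) amounts to finding connected components. A minimal sketch using a union-find structure is shown below; union-find is a choice made for this example rather than something the source prescribes, the function and variable names are hypothetical, and the demo uses planar coordinates in meters with `math.dist` as a stand-in for a real latitude/longitude distance.

```python
import math
from itertools import combinations


def partition_by_adjacency(stations, distance, threshold):
    """Group station ids so that stations linked by a chain of pairwise
    distances below `threshold` end up in the same subset.

    stations: {station_id: position}; distance: callable on two positions.
    """
    parent = {s: s for s in stations}

    def find(s):
        # Follow parent links to the representative, halving the path as we go.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Directly adjacent pairs are merged; transitivity follows from the merges.
    for a, b in combinations(stations, 2):
        if distance(stations[a], stations[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in stations:
        groups.setdefault(find(s), set()).add(s)
    return list(groups.values())


# Demo: A-B and B-C are within 500 m, A-C is not, D is far away.
stations = {"A": (0, 0), "B": (400, 0), "C": (800, 0), "D": (5000, 0)}
subsets = partition_by_adjacency(stations, math.dist, threshold=500)
```

Here A and C land in the same subset only because both are adjacent to B, which mirrors the third rule in the list above.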
Further, the access quantity threshold is calculated as follows:
for any abnormal URL, acquiring the identification code of each abnormal base station in the corresponding abnormal base station set and its access quantity for the abnormal URL in the target time interval, and calculating the interquartile range;
and calculating the access quantity threshold from the first quartile and the third quartile.
Further, the formula for calculating the access quantity threshold is:
$Th_u = Q_3 + 1.5 \times (Q_3 - Q_1)$;
wherein $Th_u$ represents the access quantity threshold, and $Q_1$ and $Q_3$ represent the first and third quartiles, respectively, so that $Q_3 - Q_1$ is the interquartile range.
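As a worked example of the threshold formula, the sketch below computes Q1, Q3 and the threshold for one abnormal URL and flags the base stations above it. The access counts and identifiers are made up for illustration, and the quartile convention (linear interpolation, via the standard library's `statistics.quantiles` with `method="inclusive"`) is an assumption, since the source does not say which convention it uses.

```python
import statistics


def access_threshold(access_counts):
    """Th_u = Q3 + 1.5 * (Q3 - Q1) over per-base-station access counts."""
    q1, _median, q3 = statistics.quantiles(access_counts, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


# Hypothetical access counts for one abnormal URL in the target interval.
counts = {
    "bsid1": 34, "bsid2": 89, "bsid3": 283, "bsid4": 40,
    "bsid5": 52, "bsid6": 47, "bsid7": 38,
}
th_u = access_threshold(list(counts.values()))

# Base stations whose access count exceeds the threshold are abnormal.
abnormal = sorted(bs for bs, n in counts.items() if n > th_u)
```

With these numbers Q1 = 39 and Q3 = 70.5, so the threshold is 70.5 + 1.5 × 31.5 = 117.75 and only `bsid3` is flagged; `bsid2`, with 89 accesses, stays below the threshold.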
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
(1) the pseudo base station detection method based on the network side user data obtains the URL and the corresponding access information by analyzing the HTTP request data of the network side, counts the access amount of each base station accessing an abnormal URL in a target time interval, and compares the access amount with an access amount threshold value, thereby identifying a malicious URL and identifying an abnormal base station. On one hand, the method only utilizes the user data at the network side to complete the detection of the pseudo base station without depending on the mobile terminal user; on the other hand, the method conforms to the behavior characteristics of the pseudo base stations, and is favorable for accurately and effectively detecting the activity areas of the pseudo base stations.
(2) In the pseudo base station detection method based on network side user data, geographically adjacent abnormal base stations are added to the same abnormal base station subset, so that the activity area of the pseudo base station can be obtained accurately and a communication operator can track mobile pseudo base stations in real time.
Drawings
Fig. 1 is a schematic diagram illustrating a conventional pseudo base station sending a malicious URL to an end user by a short message;
Fig. 2 is a flowchart of a pseudo base station detection method based on network-side user data according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Fig. 1 is a schematic diagram of a pseudo base station sending a malicious URL to a terminal user through a short message. In brief, the pseudo base station sends the user a spam short message carrying a malicious URL link, and the user accesses that link through an operator base station. The pseudo base station itself cannot be detected directly; its activity area can only be determined through the operator base stations at which accesses to the malicious URL are detected. To obtain the activity range of the pseudo base station, the method provided by the invention parses URLs directly from the HTTP request data generated when network-side users access the network; if the data show that base stations in a local area have a high access quantity and that these base stations adjoin one another to form a strip, the data are judged to be pseudo base station data.
Specifically, the method for detecting a pseudo base station based on network side user data provided by the present invention is shown in fig. 2, and includes the following steps:
(1) analyzing a URL and corresponding access information from HTTP request data of multiple users of the mobile internet, and filtering the URL obtained by analysis according to a domain name white list of the current network so as to obtain an abnormal URL;
the access information corresponding to the URL comprises a user identifier, a base station identifier of an access base station and an access timestamp;
because the URL comprises domain name information, filtering is carried out according to a white list of the current network after the URL is extracted, namely normal URLs can be filtered out, and abnormal URLs are obtained;
(2) for any abnormal URL, obtaining an access quantity threshold of the abnormal URL, and determining each base station whose access quantity for the abnormal URL in a target time interval is larger than the access quantity threshold as an abnormal base station, so as to obtain the abnormal base station set corresponding to the abnormal URL;
the starting time of the target time interval is the time when the corresponding abnormal URL is first detected, and the target time interval is set according to the behavior of pseudo base stations in sending spam short messages and the behavior of users in accessing the links in those messages, so that the collected data accurately reflects whether a base station exhibits the dense-access characteristic, abnormal base stations are identified accurately, and the accuracy of pseudo base station detection is improved; relevant research shows that a pseudo base station stays at one geographic position for about one hour and can pass through the coverage areas of multiple base stations in one day, while most received short messages are read by the user within 2 hours; considering these two indicators together with the specific application scenario, the value of the target time interval can be determined through simple comparative verification;
the access quantity threshold is the demarcation point between the URL access quantities of abnormal and non-abnormal base stations, and in an optional implementation, the access quantity threshold is calculated as follows:
for any abnormal URL, acquiring the identification code of each abnormal base station in the corresponding abnormal base station set and its access quantity for the abnormal URL in the target time interval, and calculating the interquartile range; calculating the access quantity threshold from the first quartile and the third quartile;
specifically, from the identifier of each base station that accessed the abnormal URL and the corresponding access quantity in the target time interval, a data set of the form { bsid1:34, bsid2:89, bsid3:283, … } can be obtained, in which each element gives a base station identifier and the corresponding access quantity; for example, the first element "bsid1:34" indicates that the base station identified as bsid1 has an access quantity of 34; the elements are sorted by access quantity and each element is treated as a sample, yielding the sample count, mean, standard deviation, minimum, maximum and the three quartiles of the data set, namely the values at the 25%, 50% and 75% positions; the interquartile range can be computed with the relevant functions of the pandas library;
in this embodiment, the formula for calculating the access quantity threshold is:
$Th_u = Q_3 + 1.5 \times (Q_3 - Q_1)$;
wherein $Th_u$ represents the access quantity threshold, and $Q_1$ and $Q_3$ represent the first and third quartiles, respectively, so that $Q_3 - Q_1$ is the interquartile range;
this is the upper fence commonly used in statistics to flag outliers, so calculating the access quantity threshold in this way effectively detects base stations with abnormally high access to any abnormal URL;
(3) for any abnormal URL, if the corresponding abnormal base station set is not empty, determining that the abnormal URL is a malicious URL, and dividing the abnormal base station set to enable adjacent abnormal base stations in the geographic position to belong to the same abnormal base station subset, so that the activity range of the pseudo base station is obtained;
in an optional embodiment, the method for dividing the abnormal base station set into the abnormal base station subset includes:
acquiring the longitude and latitude of each abnormal base station in the abnormal base station set, and calculating the distance between any two abnormal base stations;
if the distance between the first abnormal base station and the second abnormal base station is smaller than the distance threshold, judging that the first abnormal base station and the second abnormal base station are adjacent in geographic position;
if the first abnormal base station and the second abnormal base station are adjacent to the third abnormal base station in the geographical position respectively, judging that the first abnormal base station and the second abnormal base station are adjacent in the geographical position;
dividing the abnormal base stations adjacent to each other in the geographic position into the same abnormal base station subset;
the first abnormal base station, the second abnormal base station and the third abnormal base station are different abnormal base stations corresponding to the same malicious URL;
in the above method for determining whether abnormal base stations are geographically adjacent, for two abnormal base stations whose latitudes are denoted by the symbols in formula images BDA0001870787170000071 and BDA0001870787170000072 and whose longitude difference is Δλ, the distance d between the two abnormal base stations can be calculated with the formula in image BDA0001870787170000073 (the formula images are not reproduced in this text);
in that formula, R represents the radius of the earth;
based on the above determination method, even if the distance between abnormal base stations A and C is greater than the distance threshold, as long as abnormal base station A is adjacent to abnormal base station B and abnormal base station C is adjacent to abnormal base station B, abnormal base station A is still determined to be adjacent to abnormal base station C, and abnormal base stations A, B and C belong to the same abnormal base station subset;
the distance threshold is set according to the spacing of base stations in the actual environment, so that the activity range of the pseudo base station can be obtained accurately; base stations are generally spaced within the range of (300, 1000) meters, so a single-step tuning method is used: candidate values in this range are traversed, the number of malicious URLs detected under each candidate distance threshold is compared with the result of manual judgment, and the candidate whose result is closest to the manual check is taken as the optimal distance threshold.
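The distance formula referred to in step (3) is an image that did not survive in this text, so its exact form cannot be confirmed here. As an assumption, the sketch below uses the spherical law of cosines, a standard great-circle formula that depends only on the quantities the description names (the two latitudes, the longitude difference Δλ and the earth radius R). The function name and the radius value of 6,371,000 m are choices made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean earth radius in meters (assumed value for R)


def great_circle_distance(p1, p2, radius=EARTH_RADIUS_M):
    """Distance in meters between two (latitude, longitude) points in degrees.

    Uses the spherical law of cosines:
        d = R * arccos(sin(phi1) sin(phi2) + cos(phi1) cos(phi2) cos(dlam))
    """
    phi1 = math.radians(p1[0])
    phi2 = math.radians(p2[0])
    dlam = math.radians(p2[1] - p1[1])
    c = (math.sin(phi1) * math.sin(phi2)
         + math.cos(phi1) * math.cos(phi2) * math.cos(dlam))
    # Clamp against rounding so acos never sees a value just outside [-1, 1].
    return radius * math.acos(max(-1.0, min(1.0, c)))


# Two points on the equator, 0.01 degrees of longitude apart (about 1.1 km).
d = great_circle_distance((0.0, 0.0), (0.0, 0.01))
```

For nearly coincident points the arccosine loses precision (errors on the order of a tenth of a meter), which is small against a distance threshold of hundreds of meters; the haversine form is a better-conditioned alternative if sub-meter accuracy matters.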
According to a large amount of statistical information, after receiving spam messages carrying malicious URL links that are mass-sent by pseudo base station equipment, mobile terminal users access those links through HTTP requests within a certain time period. The HTTP requests that multiple victims initiate to such a URL within the coverage areas of the corresponding operator base stations show a distinctive geographic pattern: dense multi-user access in a local area, and sparse or even no access in the surrounding areas. The invention exploits this pattern: it counts the accesses to each abnormal URL through each base station in the target time interval and compares the counts with the access quantity threshold, thereby identifying the malicious URL and, in turn, the pseudo base station, so that multiple activity areas of the pseudo base station can be determined accurately and effectively.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (2)

1. A pseudo base station detection method based on network side user data is characterized by comprising the following steps:
(1) analyzing a URL and corresponding access information from HTTP request data of multiple users of the mobile internet, and filtering the URL obtained by analysis according to a domain name white list of the current network so as to obtain an abnormal URL;
(2) for any abnormal URL, obtaining an access quantity threshold value of the abnormal URL, and determining a base station with the access quantity of the abnormal URL being larger than the access quantity threshold value in a target time interval as an abnormal base station, so as to obtain an abnormal base station set corresponding to the abnormal URL;
the access quantity threshold is calculated as follows: for any abnormal URL, acquiring the identification code of each abnormal base station in the corresponding abnormal base station set and its access quantity for the abnormal URL in the target time interval, and calculating the interquartile range; calculating the access quantity threshold from the first quartile and the third quartile;
(3) for any abnormal URL, if the corresponding abnormal base station set is not empty, determining that the abnormal URL is a malicious URL, and dividing the abnormal base station set to enable adjacent abnormal base stations in the geographic position to belong to the same abnormal base station subset, so that an activity area of a pseudo base station is obtained;
in the step (3), the method for dividing the abnormal base station set into the abnormal base station subset includes:
acquiring the longitude and latitude of each abnormal base station in the abnormal base station set, and calculating the distance between any two abnormal base stations;
if the distance between a first abnormal base station and a second abnormal base station is smaller than a distance threshold value, judging that the first abnormal base station and the second abnormal base station are adjacent in geographic position;
if the first abnormal base station and the second abnormal base station are adjacent to a third abnormal base station in the geographical position respectively, judging that the first abnormal base station and the second abnormal base station are adjacent in the geographical position;
dividing the abnormal base stations adjacent to each other in the geographic position into the same abnormal base station subset; the access quantity threshold is the demarcation point between the URL access quantities of abnormal and non-abnormal base stations, and the starting time of the target time interval is the time when the corresponding abnormal URL is first detected; the first abnormal base station, the second abnormal base station and the third abnormal base station are different abnormal base stations corresponding to the same malicious URL.
2. The method for detecting a pseudo base station based on network side user data according to claim 1, wherein the formula for calculating the access quantity threshold is:
$Th_u = Q_3 + 1.5 \times (Q_3 - Q_1)$;
wherein $Th_u$ represents the access quantity threshold, and $Q_1$ and $Q_3$ represent the first and third quartiles, respectively, so that $Q_3 - Q_1$ is the interquartile range.
CN201811376023.7A 2018-11-19 2018-11-19 Pseudo base station detection method based on network side user data Expired - Fee Related CN109600751B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811376023.7A CN109600751B (en) 2018-11-19 2018-11-19 Pseudo base station detection method based on network side user data

Publications (2)

Publication Number Publication Date
CN109600751A CN109600751A (en) 2019-04-09
CN109600751B true CN109600751B (en) 2020-09-18

Family

ID=65958748

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811376023.7A Expired - Fee Related CN109600751B (en) 2018-11-19 2018-11-19 Pseudo base station detection method based on network side user data

Country Status (1)

Country Link
CN (1) CN109600751B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111866848B (en) * 2019-04-28 2023-04-18 北京数安鑫云信息技术有限公司 Mobile base station identification method and device and computer equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103391547A (en) * 2012-05-08 2013-11-13 腾讯科技(深圳)有限公司 Information processing method and terminal
CN104219219B (en) * 2013-07-05 2018-02-27 腾讯科技(深圳)有限公司 A kind of method of data processing, server and system
CN104185158A (en) * 2014-09-01 2014-12-03 北京奇虎科技有限公司 Malicious short message processing method and client based on false base station
CN107155186B (en) * 2017-04-10 2020-02-14 中国移动通信集团江苏有限公司 Pseudo base station positioning method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
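Research on a method for accurately locating pseudo base stations using mobile phone information; 王德广 et al.; 《微信电脑应用》; 20141120; pp. 25-27 *
Visual analysis method for pseudo base station activity tracks based on multi-user spam short message data; 蒲誉文 et al.; 《计算机应用》; 20180410; pp. 1207-1212 *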

Also Published As

Publication number Publication date
CN109600751A (en) 2019-04-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200918

Termination date: 20211119