WO2021093823A1 - Pseudo base station identification method and system, and computer readable storage medium - Google Patents

Publication number
WO2021093823A1
Authority
WO
WIPO (PCT)
Prior art keywords
base station
pseudo base
performance data
counter
pseudo
Prior art date
Application number
PCT/CN2020/128476
Other languages
French (fr)
Chinese (zh)
Inventor
陆纬
Original Assignee
中兴通讯股份有限公司

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/06 Testing, supervising or monitoring using simulated traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/12 Detection or prevention of fraud

Definitions

  • The present invention relates to, but is not limited to, the field of communication technology, and more specifically to a pseudo base station identification method, a system, and a computer-readable storage medium.
  • In a communication equipment system, counters are defined so that the operating status of the base station system can be evaluated comprehensively and network services can be monitored and optimized. A counter is a parameter that measures how well an important aspect of the system is operating. The network management system regularly collects counter data from network equipment, and this counter data is the source of the base station performance big data used in this solution.
  • A pseudo base station is controlled by an illegal organization or individual, operates independently of the public mobile network, and disguises itself as a base station of a legitimate mobile operator (for example, by broadcasting the operator's public land mobile network ID (PLMN ID)). It thereby tricks user terminals into initiating a network registration or location update request to it and then extracts the terminals' information.
  • Pseudo base stations often send fraudulent short messages, malicious URL links, or advertising short messages to terminals, thereby endangering users.
  • Existing methods for locating pseudo base stations generally fall into two types.
  • The first uses SMS data for analysis.
  • This method inspects the short messages a user receives, extracts their characteristics, and compares them with cloud-based short message big data. If a message matches the characteristics of a problem short message, it is intercepted and the current terminal position is reported in order to locate the pseudo base station.
  • This method requires extracting the user's text messages and location information.
  • The second uses a dedicated detection terminal to extract characteristic information of the pseudo base station, such as frequency point and signal strength.
  • This method can detect pseudo base stations in real time, but because pseudo base stations are often vehicle-mounted and highly mobile, public security agencies often lack the manpower and resources to conduct large-scale dragnet searches.
  • The embodiments of the present invention provide a pseudo base station identification method, a system, and a computer-readable storage medium.
  • The pseudo base station identification method includes: determining a key counter from the performance data counters of the base station, where the data in the key counter is affected by the pseudo base station; establishing a corresponding regression analysis model according to the key counter; extracting the statistical data corresponding to the key counter from the network management database and substituting it into the regression analysis model for machine learning to determine the threshold value of the pseudo base station correlation coefficient; and calculating the performance data of the key counter and comparing it with the threshold value to determine whether a pseudo base station exists.
  • A pseudo base station identification system includes a base station and an analysis device. The base station is used to generate performance data using performance data counters. The analysis device includes a data extraction module, a rule learning module, and a data storage module: the data extraction module is connected to the network management database and extracts the data corresponding to the key counters from it; the rule learning module is used to formulate analysis models and calculate the threshold value of the correlation coefficient based on historical performance data; and the data storage module is used to store the data required by the pseudo base station identification method described in the embodiments of the present invention.
  • A computer-readable storage medium stores one or more programs, which can be executed by one or more processors to implement the steps of the pseudo base station identification method described in the embodiments of the present invention.
  • FIG. 1 is a schematic flowchart of a method for identifying a pseudo base station according to Embodiment 1 of the present invention.
  • FIG. 2 is a schematic diagram of the influence of a pseudo base station on the LTE average user count counter data of a normal base station, according to Embodiment 2 of the present invention.
  • FIG. 3 is a schematic diagram of the influence of a user's normal cell handover on the counters of successful handovers into and out of a base station, according to Embodiment 2 of the present invention.
  • FIG. 4 is a schematic diagram of the influence of a pseudo base station on the counters of successful handovers into and out of a normal base station, according to Embodiment 2 of the present invention.
  • FIG. 6 is a flowchart of drawing a pseudo base station trajectory in an unknown area according to the learned confidence interval, according to Embodiment 6 of the present invention.
  • FIG. 7 is a schematic diagram of a pseudo base station identification system provided by Embodiment 7 of the present invention.
  • FIG. 8 is a schematic diagram of another pseudo base station identification system provided by Embodiment 3 of the present invention.
  • The present invention provides a pseudo base station identification method and system based on base station performance big data and machine learning.
  • The pseudo base station can be located through the location information of the base station.
  • The method uses a machine learning algorithm to study and analyze fluctuations in the performance data of the base station, determines whether a pseudo base station is present, and combines the base station locations with road network information to draw the movement trajectory of the pseudo base station.
  • The pseudo base station identification method includes: determining a key counter from the performance data counters of the base station, where the data in the key counter is affected by the pseudo base station; establishing the corresponding regression analysis model according to the key counter; extracting the statistical data corresponding to the key counter from the network management database and substituting it into the regression analysis model for machine learning to determine the threshold value of the pseudo base station correlation coefficient; and calculating the performance data of the key counter and comparing it with the threshold value to determine whether a pseudo base station exists.
  • FIG. 1 is a schematic flowchart of a pseudo base station identification method provided in this embodiment.
  • The process of the pseudo base station identification method includes:
  • The data collected by a key counter must be affected by the presence of a pseudo base station; the greater and more obvious the effect, the more useful the counter is for identifying pseudo base stations.
  • The performance data of a base station cannot directly indicate whether a user has accessed a pseudo base station: if a user terminal connects to a pseudo base station, the normal base station perceives only an ordinary user handover-out. However, when a pseudo base station passes a normal base station, many users are usually affected, which causes the performance data of the normal base station to fluctuate or deteriorate. The counters whose data fluctuates in this way are the key counters, and their fluctuation can therefore be used to infer whether a pseudo base station is present.
  • The regression analysis model shows clearly how the data in the key counter changes and which factors influence it.
  • The statistical data corresponding to the key counter is extracted from the network management database and substituted into the regression analysis model for machine learning to determine the threshold value of the pseudo base station correlation coefficient.
  • The network management database contains the performance data recorded by the performance data counters of all base stations.
  • Machine learning requires a large amount of data. Therefore, after the regression analysis model is determined in step S102, machine learning can be performed not only on local data but also on the large volume of statistical data related to the key counters obtained from the network management database. This makes the learning results more accurate and brings the resulting threshold value of the pseudo base station correlation coefficient closer to the actual situation.
  • S104: Calculate the performance data of the key counter and compare it with the threshold to determine whether a pseudo base station exists.
  • Using the threshold value of the pseudo base station correlation coefficient, the current performance data is calculated from the key counter data collected by the local base station and compared with the threshold to determine whether a pseudo base station is currently near the base station.
  • The base station may be a single base station or the base stations in an area.
  • The selected key counter may be a single performance counter or several performance counters whose data is combined.
  • A single base station can be selected, or multiple base stations in an area can be selected to work together.
  • After it is determined that a pseudo base station exists, the method further includes determining the location of the pseudo base station and drawing its movement trajectory: the location is determined from the location information of the base station near which the pseudo base station was found, and the multiple locations calculated within a preset time interval are connected to obtain the movement trajectory.
  • Once a pseudo base station has been detected near a base station, its location can be roughly determined from the location information of that base station.
  • The location information of multiple base stations can also be compared to determine the location of the pseudo base station more accurately.
  • Pseudo base stations are usually mobile and rarely stay in one place. To combat the criminal use of pseudo base stations effectively, it is therefore useful to understand and analyze their movement trends. Monitoring within a time interval and determining the movement trajectory of a pseudo base station makes it easier to predict where it will go next, which helps in combating it.
  • Calculating the analysis data of the base station's key counter for comparison with the threshold includes, but is not limited to: substituting the statistical data of the key counter into the regression analysis model to obtain performance data, and comparing the performance data with the threshold to determine whether a pseudo base station exists. This is only one of many possible methods of judging a pseudo base station. In practical applications, correction parameters or other optional parameters can be added to the calculation to make the result more accurate. In some cases, encryption algorithms can also be added to prevent the results from being leaked and misused.
  • The performance data counters of the base station include, but are not limited to: the average number of users, and the numbers of successful intra-frequency and inter-frequency inter-cell handovers, into and out of the cell, over the X2 interface and over the S1 interface between base stations.
  • The regression analysis model is a calculation formula corresponding to the key counter, and the result of the formula yields the threshold value of the pseudo base station correlation coefficient for that counter.
  • The calculation formula also includes coefficients, which machine learning brings close to their actual values.
  • The performance data counter may be a counter within a base station or a counter between base stations.
  • When the pseudo base station stays within the coverage of a single base station, the data collected is that of the performance data counter within the base station; when the pseudo base station moves beyond the coverage of one base station or of the base stations in an area, the data collected may be that of the performance data counters between base stations. This avoids errors in data coordination between base stations, reduces data collection errors, and makes the recognition results more accurate.
  • This embodiment provides a pseudo base station identification method, including: determining a key counter from the performance data counters of the base station, where the data in the key counter is affected by the pseudo base station; establishing a corresponding regression analysis model according to the key counter; extracting the statistical data corresponding to the key counter from the network management database and substituting it into the regression analysis model for machine learning to determine the threshold of the pseudo base station correlation coefficient; and calculating the performance data of the key counter and comparing it with the threshold to determine whether a pseudo base station exists.
  • The regression analysis model and machine learning are used to obtain the threshold value of the pseudo base station correlation coefficient, and this threshold value is used to identify whether a pseudo base station exists, which makes the identification result more accurate.
  • This embodiment also determines the location of the pseudo base station from the location information of the base station, and draws the movement trajectory of the pseudo base station from its locations over a period of time. The trajectory is used to predict the subsequent position of the pseudo base station, which helps in combating pseudo base station crime.
  • This embodiment takes a 4G LTE base station as an example and describes an implementation of the pseudo base station identification method provided in the embodiments of the present invention.
  • For an LTE (4G) base station, the following key counters are selected: the average number of LTE users, and the numbers of successful intra-frequency and inter-frequency inter-cell handovers, in and out, between eNBs (base stations) over the X2 interface (the inter-base-station interface) and over the S1 interface.
  • R_u can be used as the correlation coefficient for identifying a pseudo base station.
  • The value of W_n can be brought closer and closer to a reasonable value through machine learning, and a confidence interval for R_u can be calculated.
  • The counters of the neighboring cells and of the current cell can be used for regression analysis modeling at the same time.
  • Figure 3 is a schematic diagram of the influence of a user's normal cell handover on the base station's successful handover-in and handover-out counters.
  • When a user terminal connects to a pseudo base station, the number of handovers out of the serving base station increases by 1, while the number of handovers into the adjacent cells does not. Figure 4 is a schematic diagram of the influence of a pseudo base station on these counters. If a large number of users connect to the pseudo base station, the number of handovers out of this base station rises well above its normal level, but the number of handovers into the adjacent cells does not rise significantly.
  • Here, the "number of handovers out" denotes the number of successful intra-frequency and inter-frequency handovers out over the S1 and X2 interfaces between eNBs, and the "number of handovers in" denotes the number of successful intra-frequency and inter-frequency handovers in over the same interfaces.
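  • The counter asymmetry described above can be illustrated with a small simulation. This is only a sketch: the base station IDs, the event counts, and the two helper functions are hypothetical and are not taken from the patent.

```python
from collections import Counter

handover_out = Counter()  # successful handovers out, per serving base station
handover_in = Counter()   # successful handovers in, per neighbouring base station


def normal_handover(src: str, dst: str) -> None:
    # A normal handover is counted on both sides.
    handover_out[src] += 1
    handover_in[dst] += 1


def captured_by_pseudo_station(src: str) -> None:
    # The serving station records an ordinary hand-out,
    # but no legitimate neighbour records a matching hand-in.
    handover_out[src] += 1


for _ in range(20):
    normal_handover("301", "302")
for _ in range(15):
    captured_by_pseudo_station("301")

unmatched = handover_out["301"] - sum(handover_in.values())
print(unmatched)  # 15
```

  • The unmatched hand-outs are what the handover-based coefficients are meant to expose; a real deployment would read these counts from the network management system rather than simulate them.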
  • Let the number of handovers out of this base station in the current statistical granularity interval be O_now, and the number of handovers out of the current cell at the same statistical granularity on the n-th day in the past be O_n. Let the number of handovers into the k-th adjacent cell in the current statistical granularity interval be IK_now, and the number of handovers in at the same statistical granularity on the n-th day in the past be IK_n. The following two formulas can then be obtained:
  • R_o is the correlation coefficient of the number of handovers out of the base station; a larger value of R_o indicates that a pseudo base station may exist.
  • R_i is the correlation coefficient of the number of handovers into the cells of adjacent base stations. The smaller R_i is, the more likely it is that a pseudo base station exists.
  • Other counters can also be used to build a regression model; they are not listed exhaustively here. For GSM (2G), UMTS (3G), TD-SCDMA (3G), and 5G base stations or controllers, similar counters can be selected to establish regression analysis models, and the above methods can be used for machine learning.
  • The data analysis algorithm can be specified by the user or obtained through deep machine learning.
  • Machine learning is performed on historical performance data of known pseudo base stations to obtain the thresholds of correlation coefficients such as R_u, R_o, and R_i, or of a user-defined model.
  • The system can be connected to the network management system to obtain performance data of base stations in unknown areas, and can use the analysis models and threshold information to calculate whether a pseudo base station is near the target base station.
  • Compared with existing pseudo base station positioning solutions, this identification method does not require a specially designed terminal to locate the pseudo base station.
  • One system can be used in multiple places; only the model and the performance data need to be supplied.
  • The performance data analysis model is universal and can be customized by users. It is applicable to 2G, 3G, 4G, and 5G networks and to communication equipment from any manufacturer, as long as the performance data required by the model is collected.
  • This method does not collect data from user terminals, and the performance data it relies on is entirely control plane data and involves no user plane data, so there is no risk of leaking user information.
  • FIG. 5 is a flowchart of machine learning based on performance data in an embodiment of the present invention. The process mainly includes the following steps (S501-S505):
  • Step S501: screen the performance data counters that a pseudo base station affects. In this embodiment, that is the average number of LTE users.
  • Step S502: establish a regression analysis model according to the selected counter.
  • The R_u formula described above is used to build the model, and the model is stored in the database.
  • The average number of LTE users at the current statistical granularity time point is X_now, and the number of users at the same statistical granularity on the n-th day in the past is X_n; W_n is the weight applied to the difference between the value on that past day and the value at the same time today.
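  • The R_u formula itself does not survive in this text. The sketch below therefore assumes, from the definitions of X_now, X_n, and W_n given above, that R_u is a weighted sum of differences, R_u = sum over n of W_n * (X_now - X_n). Both this form and the sample user counts are assumptions, not the patent's confirmed formula or data.

```python
from typing import Sequence


def weighted_deviation(x_now: float, x_past: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed form of R_u: sum over past days n of W_n * (X_now - X_n)."""
    if len(x_past) != len(weights):
        raise ValueError("one weight per past day is required")
    return sum(w * (x_now - x) for w, x in zip(weights, x_past))


# Hypothetical figures: 40 users now versus 120, 125 and 121 users
# at the same time on each of the previous three days, all weights 1.
r_u = weighted_deviation(40, [120, 125, 121], [1, 1, 1])
print(r_u)  # -246
```

  • Under this assumed form, a sharp drop in the user count produces a strongly negative coefficient.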
  • Step S503: the counter data required for this learning is extracted from the network management system, and the labeled pseudo base station data is imported.
  • Table 1: Performance counter data for the average number of LTE users imported into the system
  • Table 1 shows the average number of LTE users of the base station with ID 301 at the same statistical granularity over four days.
  • Collectime is the data collection time.
  • Subnetwork is the subnet ID.
  • EnbId is the base station ID.
  • The LTE average users column gives the number of users counted at the corresponding time.
  • The counter data collected from the network management system is stored using the structure in Table 1.
  • The Mark field indicates whether a pseudo base station is known to have been near the base station. This data comes from the records of known pseudo base stations in the public security system.
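  • The record structure described for Table 1 could be represented as follows. The field names follow the column names in the text; the Python types, the class name, and the sample values are assumptions, since the table contents are not reproduced here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CounterRecord:
    collectime: datetime         # Collectime: data collection time
    subnetwork: int              # Subnetwork: subnet ID
    enb_id: int                  # EnbId: base station ID
    lte_average_users: float     # LTE average users counted at that time
    mark: Optional[bool] = None  # Mark: True if a pseudo base station is known nearby; None if unlabeled


# Hypothetical labeled sample, not a row from the patent's Table 1.
record = CounterRecord(datetime(2020, 1, 4, 10, 0), subnetwork=1, enb_id=301,
                       lte_average_users=40, mark=True)
print(record.enb_id, record.mark)  # 301 True
```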
  • Step S504: machine learning is performed based on historical performance data and the labeled information of known pseudo base stations.
  • The W_n weights should be higher the closer the day is to the current time.
  • In this embodiment, the values of W_n are all set to 1, and n is 3.
  • For a base station whose status is to be determined, if the value of R_u calculated by the formula falls within the range R_min to R_max, it can be determined that a pseudo base station was present near that base station at the time point of the corresponding data.
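  • The text does not say how R_min and R_max are derived from the labeled data. One simple possibility, shown here purely as an assumption, is to take the smallest and largest R_u among the samples marked as pseudo base station events. The sample values are hypothetical and were chosen so that the result matches the interval of -258 to -238 quoted later in this document.

```python
from typing import Iterable, Tuple


def learn_interval(labeled: Iterable[Tuple[float, bool]]) -> Tuple[float, float]:
    """Return (r_min, r_max) spanning the R_u values of the marked samples."""
    positives = [r for r, mark in labeled if mark]
    if not positives:
        raise ValueError("no marked samples to learn from")
    return min(positives), max(positives)


def in_interval(r: float, interval: Tuple[float, float]) -> bool:
    r_min, r_max = interval
    return r_min <= r <= r_max


# (R_u, Mark) pairs; hypothetical values.
samples = [(-258, True), (-246, True), (-238, True), (-12, False), (5, False)]
interval = learn_interval(samples)
print(interval)                      # (-258, -238)
print(in_interval(-250, interval))   # True
```

  • A production system would more likely fit a statistical confidence interval and also tune the W_n weights; the min/max rule is only the simplest stand-in.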
  • Step S505: the learned confidence interval is stored in the database.
  • The algorithm consists of the formulas established in this document from the average number of LTE users and the numbers of handovers in and out over the S1 and X2 interfaces between eNBs.
  • The machine learning model used to identify the pseudo base station can be pre-modeled or customized by the user. Analysis models can be stored in the cloud, and a cloud model library can be established.
  • In this embodiment, the numbers of handovers in and out between eNBs are used to establish the analysis model.
  • The fourth embodiment is similar to the third embodiment, and the steps in FIG. 5 also apply.
  • The screened counters are the numbers of successful handovers in and out over the S1 and X2 interfaces between eNBs.
  • In step S502, the R_o and R_i formulas described above are used to build the model.
  • R_o is the correlation coefficient of the number of handovers out of the base station; a larger value of R_o indicates that a pseudo base station may exist.
  • R_i is the correlation coefficient of the number of handovers into the cells of adjacent base stations. The smaller R_i is, the more likely it is that a pseudo base station exists.
  • Step S503: the counter data required for this learning is extracted from the network management system, and the labeled pseudo base station data is imported.
  • Data from the S1 interface is used as an example:
  • Table 2: Performance counter data for the number of handovers out over the S1 interface between eNBs imported into this system
  • Table 2 shows the number of handovers out over the S1 interface between eNBs for the base station with ID 301 at the same statistical granularity over four days.
  • Table 3: Performance counter data for the number of S1 interface handovers between eNBs for two adjacent base stations
  • Table 3 shows the number of inter-eNB S1 interface handovers for the two base stations with IDs 302 and 303.
  • Base stations 302 and 303 are adjacent to base station 301.
  • Step S504: machine learning is performed based on historical performance data and the labeled information of known pseudo base stations.
  • The weight of each correlation coefficient is again set to 1.
  • The resulting value of R_o is 705, and the resulting value of R_i is -11.
  • For a base station to be evaluated, if its R_o is greater than or equal to 705 and the R_i of its neighboring base stations is less than or equal to -11, it can be determined that a pseudo base station is near the base station.
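  • The decision rule above can be sketched as follows. The two R_o and R_i formulas are not reproduced in this text, so the code assumes the same weighted-difference form for both, with R_i summed over the adjacent cells. That form, the function names, and the handover counts are assumptions; only the thresholds 705 and -11 come from the text.

```python
from typing import Dict, Sequence

R_O_THRESHOLD = 705  # value quoted in the text for hand-outs of the evaluated base station
R_I_THRESHOLD = -11  # value quoted in the text for hand-ins of its neighbours


def weighted_deviation(now: float, past: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed coefficient form: sum over past days of W_n * (value_now - value_n)."""
    return sum(w * (now - p) for w, p in zip(weights, past))


def handover_coefficients(o_now, o_past, in_now: Dict[int, float],
                          in_past: Dict[int, Sequence[float]], weights):
    r_o = weighted_deviation(o_now, o_past, weights)
    # Assumption: R_i adds up the deviations of every adjacent cell k.
    r_i = sum(weighted_deviation(in_now[k], in_past[k], weights) for k in in_now)
    return r_o, r_i


def pseudo_station_suspected(r_o: float, r_i: float) -> bool:
    return r_o >= R_O_THRESHOLD and r_i <= R_I_THRESHOLD


# Hypothetical counts for base station 301 and its neighbours 302 and 303.
r_o, r_i = handover_coefficients(
    o_now=500, o_past=[260, 270, 265],
    in_now={302: 100, 303: 90},
    in_past={302: [102, 103, 101], 303: [92, 91, 92]},
    weights=[1, 1, 1],
)
print(r_o, r_i, pseudo_station_suspected(r_o, r_i))  # 705 -11 True
```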
  • The more sample data is learned, the more accurate the thresholds and weights corresponding to R_o and R_i become.
  • Step S505: the learned confidence interval is stored in the database.
  • This embodiment combines the average number of LTE users with the numbers of handovers in and out between eNBs to establish a regression analysis model.
  • The fifth embodiment follows the same procedure as the third and fourth embodiments. To make the result more accurate, the data of those two embodiments is evaluated together.
  • Step S501: screen the performance data counters that a pseudo base station affects.
  • The counters of the third and fourth embodiments are selected.
  • Step S502: establish a regression analysis model according to the selected counters.
  • The models R_u, R_o, and R_i are used as before.
  • The thresholds of the coefficients also use the values from the above two embodiments, where:
  • R_u is the correlation coefficient of the LTE average number of users counter.
  • R_o is the correlation coefficient of the counter of successful handovers out of the base station.
  • R_i is the correlation coefficient of the counter of successful handovers into the neighboring base stations.
  • If the R_u correlation coefficient falls within the confidence interval of -258 to -238, R_o is greater than or equal to 705, and R_i is less than or equal to -11, it can be determined that a pseudo base station is suspected near the base station.
  • Steps S503, S504, and S505 are the same as in the third and fourth embodiments.
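  • The combined rule of this embodiment reduces to a conjunction of three checks. A minimal sketch, using the interval and thresholds quoted above; the function name and the example coefficient values are hypothetical.

```python
R_U_INTERVAL = (-258, -238)  # confidence interval quoted for R_u
R_O_THRESHOLD = 705          # threshold quoted for R_o
R_I_THRESHOLD = -11          # threshold quoted for R_i


def combined_decision(r_u: float, r_o: float, r_i: float) -> bool:
    """True only when all three coefficients point to a pseudo base station."""
    r_min, r_max = R_U_INTERVAL
    return r_min <= r_u <= r_max and r_o >= R_O_THRESHOLD and r_i <= R_I_THRESHOLD


print(combined_decision(-246, 705, -11))  # True
print(combined_decision(-246, 300, -11))  # False
```

  • Requiring all three conditions trades some sensitivity for fewer false alarms, which matches the stated aim of making the result more accurate.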
  • Embodiment 6
  • In this embodiment, pseudo base stations are identified in an unknown area and the pseudo base station trajectory is drawn.
  • FIG. 6 is a flowchart of drawing the pseudo base station trajectory in an unknown area according to the learned confidence interval. The process includes the following steps:
  • Step S601: extract the historical data of the required performance counters from the network management system and store it in the database in the format described in Table 1, leaving the Mark field blank.
  • Step S602: apply the formula of Embodiment 3, Embodiment 4, or Embodiment 5 to the data of each base station at each statistical granularity to obtain the pseudo base station correlation coefficient R of each base station at different time points. If R falls within the confidence interval R_min to R_max, set the Mark field of that base station record to True. After the data of all base stations has been processed, filter out the records whose Mark field is True to obtain the IDs of all base stations judged to have a pseudo base station nearby. See Table 4.
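  • Step S602 can be sketched as a pass over the stored records. The record layout loosely follows Table 1; the precomputed "R" field, the function name, and the sample rows are hypothetical, and the interval is the one quoted in Embodiment 5.

```python
from typing import Callable, Dict, List, Tuple


def mark_suspected(records: List[Dict], coefficient: Callable[[Dict], float],
                   interval: Tuple[float, float]) -> List[int]:
    """Set Mark to True where R lies in the interval; return the flagged base station IDs."""
    r_min, r_max = interval
    for rec in records:
        if r_min <= coefficient(rec) <= r_max:
            rec["Mark"] = True
    return sorted({rec["EnbId"] for rec in records if rec["Mark"] is True})


# Hypothetical records with the coefficient already computed in field "R".
records = [
    {"Collectime": "2020-01-04 10:00", "EnbId": 301, "R": -246, "Mark": None},
    {"Collectime": "2020-01-04 10:00", "EnbId": 302, "R": -3, "Mark": None},
    {"Collectime": "2020-01-04 10:15", "EnbId": 303, "R": -240, "Mark": None},
]
flagged = mark_suspected(records, coefficient=lambda rec: rec["R"], interval=(-258, -238))
print(flagged)  # [301, 303]
```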
  • Step S603: because the location of each base station is known, the longitude and latitude of the base stations in Table 4 can be combined with map data to mark a set of points on the map. Combined with the road network information, the trajectory of the pseudo base station can then be drawn.
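  • The trajectory step can be approximated by ordering the flagged base stations by detection time and connecting their coordinates. This sketch omits the matching to road network data that the text mentions; the class name, the coordinates, and the timestamps are hypothetical.

```python
from datetime import datetime
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    collectime: datetime  # time at which the base station was flagged
    enb_id: int           # flagged base station ID
    latitude: float
    longitude: float


def trajectory(detections: List[Detection]) -> List[Tuple[float, float]]:
    """Order flagged base stations by time and return their coordinates as a polyline."""
    ordered = sorted(detections, key=lambda d: d.collectime)
    return [(d.latitude, d.longitude) for d in ordered]


detections = [
    Detection(datetime(2020, 1, 4, 10, 30), 303, 31.2310, 121.4750),
    Detection(datetime(2020, 1, 4, 10, 0), 301, 31.2300, 121.4700),
    Detection(datetime(2020, 1, 4, 10, 15), 302, 31.2305, 121.4725),
]
path = trajectory(detections)
print(path)
```

  • The resulting list of points can be passed to any mapping tool to draw the line; snapping the points to roads would be a separate step.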
  • This system can be applied wherever the performance data required by the model is collected.
  • The analysis model can be modified according to the actual situation. After machine learning is complete, the system can analyze data from different regions and different manufacturers. It is low in cost and widely applicable, and it makes it easier for the public security system to carry out targeted arrests in the areas under its jurisdiction.
  • FIG. 7 is a schematic diagram of a pseudo base station identification system provided by this embodiment.
  • The pseudo base station identification system includes a base station, a network management server, and a network management database.
  • The network management database stores the performance data recorded by the performance data counters of all base stations.
  • The base station is configured to use the performance data counters to generate the performance data corresponding to the key counter.
  • The network management server includes a data extraction module, a rule learning module, and a data storage module.
  • The data extraction module is connected to the network management database and extracts the performance data corresponding to the key counter from it.
  • The rule learning module is used to formulate an analysis model and to calculate the threshold value of the pseudo base station correlation coefficient from the performance data corresponding to the key counter extracted from the network management database.
  • The data storage module is used to store the data required to implement the pseudo base station identification method described in Embodiment 1 to Embodiment 6 of the present application.
  • The network management server further includes a user interaction module and a graphics management module.
  • The user interaction module is used to send pseudo base station reminders to user terminals through their connection with the base station.
  • The graphics management module is used to generate the movement trajectory of the pseudo base station.
  • this embodiment also provides a schematic diagram of another pseudo base station identification system for reference.
  • FIG. 8 is a schematic diagram of another pseudo base station identification system provided by this embodiment.
  • Figure 8 includes: user interaction module, data extraction module, data storage module, database, rule learning module and graphics management module.
  • the rule learning module includes analysis model management module, threshold management module, counter model management module and base station model management module.
  • the network management database can be updated by connecting to an external system.
  • the external systems connected to this system include the network management system and the external database.
  • the network management system is responsible for providing historical performance data of the base station.
  • the external database is responsible for providing information such as maps, road networks, and known pseudo base stations.
  • the data extraction module interfaces with the network management system of the existing network, extracts the required counters, base station location information, base-station-to-cell mapping, cell neighbor relations, and other information from the network management system database, converts them into the format required by the system, and stores them in the data storage module.
  • the data storage module stores historical performance counter data, known pseudo base station location information, road network data, regression analysis models, decision thresholds, pseudo base station trajectory graphs, and other data.
  • the rule learning module formulates the analysis model and calculates the threshold of the correlation coefficient from historical performance data.
  • the user interaction module handles user interaction and displays the pseudo base station trajectory.
  • the graphics management module matches base station location information with road network information and draws pseudo base station trajectory graphs.
  • This embodiment also provides a computer-readable storage medium.
  • the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the embodiments of the present application.
  • the embodiments of the present invention provide a pseudo base station identification method, system, and computer-readable storage medium, so as to realize pseudo base station identification without involving user privacy.
  • the pseudo base station identification method includes: determining a key counter from the performance data counters of the base station, where the data in the key counter is affected by the pseudo base station; establishing a corresponding regression analysis model according to the key counter; extracting the statistical data corresponding to the key counter from the network management database and substituting it into the regression analysis model for machine learning to determine the threshold of the correlation coefficient of the pseudo base station; and calculating the performance data of the key counter and comparing it with the threshold to determine whether a pseudo base station exists.
  • the identification of pseudo base stations by obtaining key counter data from the base station avoids problems involving user privacy.
  • machine learning is also introduced to determine pseudo base stations, making the identification of pseudo base stations more accurate.
  • communication media usually contain computer-readable instructions, data structures, computer program modules, or other data in a modulated data signal such as carrier waves or other transmission mechanisms, and may include any information delivery medium. Therefore, the present invention is not limited to any specific combination of hardware and software.

Abstract

A pseudo base station identification method and system, and a computer readable storage medium. The pseudo base station identification method comprises: determining a key counter from performance data counters of a base station, wherein data in the key counter is affected by a pseudo base station (S101); establishing a corresponding regression analysis model according to the key counter (S102); extracting statistical data corresponding to the key counter from a network management database, substituting same into the regression analysis model for machine learning, to determine a threshold of a correlation coefficient of the pseudo base station (S103); and calculating performance data of the key counter and comparing same with the threshold, to determine whether the pseudo base station exists (S104).

Description

Pseudo base station identification method, system, and computer-readable storage medium
Cross-reference to related applications
This application is based on Chinese patent application No. 201911114985.X, filed on November 14, 2019, and claims priority to that application, the entire content of which is incorporated herein by reference.
Technical field
The present invention relates to, but is not limited to, the field of communication technology, and more specifically relates to, but is not limited to, a pseudo base station identification method, system, and computer-readable storage medium.
Background
In a communication equipment system, parameters called counters are defined to evaluate and indicate whether the system's important operating states are normal and how well they perform, so that the operating status of the base station system can be evaluated comprehensively and network services can be monitored and optimized. The network management system periodically collects counter data from network equipment, and this counter data is the source of the base station performance big data referred to in this solution.
A pseudo base station, controlled by an illegal organization or individual, operates independently of the public mobile network. It disguises itself as the base station of a legitimate mobile operator (for example, by broadcasting that operator's public land mobile network ID (PLMN ID)) to trick user terminals into sending it network registration or location update requests, and then extracts the terminals' information. In addition, pseudo base stations often send fraudulent short messages, malicious URL links, or advertising short messages to terminals, thereby harming users.
Existing methods for locating pseudo base stations generally fall into two types. The first analyzes short message data: it detects the short messages a user receives, extracts their features, and compares them with cloud-based short message big data. If a message matches the features of problematic messages, it is intercepted and the terminal's current position is reported in order to locate the pseudo base station. This approach requires extracting the user's short messages and location information, which users tend to resist now that network security and privacy are increasingly emphasized. The second uses a detection terminal to extract characteristic information of the pseudo base station, such as frequency and signal strength. This approach can detect a pseudo base station in real time, but because pseudo base stations are often carried in vehicles and are highly mobile, public security agencies often lack the manpower and resources for a large-scale dragnet search.
Summary of the invention
Embodiments of the present invention provide a pseudo base station identification method, system, and computer-readable storage medium.
The pseudo base station identification method according to an embodiment of the present invention includes: determining a key counter from the performance data counters of a base station, where the data in the key counter is affected by a pseudo base station; establishing a corresponding regression analysis model according to the key counter; extracting the statistical data corresponding to the key counter from a network management database and substituting it into the regression analysis model for machine learning to determine a threshold of the pseudo base station correlation coefficient; and calculating the performance data of the key counter and comparing it with the threshold to determine whether the pseudo base station exists.
According to another embodiment of the present invention, a pseudo base station identification system is provided. The system includes a base station and an analysis device. The base station is configured to generate performance data using performance data counters. The analysis device includes a data extraction module, a rule learning module, and a data storage module. The data extraction module is connected to a network management database and extracts the data corresponding to the key counter from it. The rule learning module is used to formulate an analysis model and calculate the threshold of the correlation coefficient from historical performance data. The data storage module is used to store the data required to implement the pseudo base station identification method described in the embodiments of the present invention.
According to yet another embodiment of the present invention, a computer-readable storage medium is provided. It stores one or more programs that can be executed by one or more processors to implement the steps of the pseudo base station identification method described in the embodiments of the present invention.
Other features and corresponding beneficial effects of the present invention are described later in the specification, and it should be understood that at least some of the beneficial effects will become apparent from the description.
Description of the drawings
The present invention is further described below with reference to the accompanying drawings and embodiments. In the drawings:
FIG. 1 is a schematic flowchart of a pseudo base station identification method provided by Embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of the influence of a pseudo base station on the LTE average-user-count counter data of a normal base station, provided by Embodiment 2 of the present invention;
FIG. 3 is a schematic diagram of the influence of a normal user cell handover on the base station's successful handover-in and handover-out counter data, provided by Embodiment 2 of the present invention;
FIG. 4 is a schematic diagram of the influence of a pseudo base station on a normal base station's successful handover-in and handover-out counter data, provided by Embodiment 2 of the present invention;
FIG. 5 is a flowchart of machine learning based on performance data, provided by Embodiments 3, 4, and 5 of the present invention;
FIG. 6 is a flowchart of drawing the pseudo base station trajectory for an unknown area using the confidence interval obtained by learning, provided by Embodiment 6 of the present invention;
FIG. 7 is a schematic diagram of a pseudo base station identification system provided by Embodiment 7 of the present invention;
FIG. 8 is a schematic diagram of another pseudo base station identification system provided by Embodiment 3 of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below through specific implementations with reference to the accompanying drawings. It should be understood that the specific embodiments described here only explain the present invention and do not limit it. It should also be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments can be combined with each other.
Embodiment 1:
The present invention provides a pseudo base station identification method and system based on base station performance big data and machine learning, which can also locate the pseudo base station using base station location information. The method uses a machine learning algorithm to analyze fluctuations in base station performance data and determine whether a pseudo base station is present; combining base station locations with road network information, it can draw the pseudo base station's movement trajectory.
To identify pseudo base stations, this embodiment provides a pseudo base station identification method that includes: determining a key counter from the performance data counters of a base station, where the data in the key counter is affected by a pseudo base station; establishing a corresponding regression analysis model according to the key counter; extracting the statistical data corresponding to the key counter from the network management database and substituting it into the regression analysis model for machine learning to determine the threshold of the pseudo base station correlation coefficient; and calculating the performance data of the key counter and comparing it with the threshold to determine whether a pseudo base station exists.
Referring to FIG. 1, which is a schematic flowchart of the pseudo base station identification method provided by this embodiment, the method includes:
S101. Determine a key counter from the performance data counters of the base station.
The key counter chosen from the base station must be one whose collected data is affected by a pseudo base station; the larger and more evident the effect, the easier the pseudo base station is to identify.
Base station performance data cannot directly show whether a user has connected to a pseudo base station. If a user terminal is captured by a pseudo base station, the normal base station perceives only an ordinary user handover-out. But when a pseudo base station passes near a normal base station, a relatively large number of users are usually affected, which causes the normal base station's performance data to fluctuate or degrade. The counters whose data fluctuates in this way are the key counters, so their fluctuations can be used to infer whether a pseudo base station is present.
S102. Establish a corresponding regression analysis model according to the key counter.
Once the key counter is determined, the corresponding regression analysis model is determined from it. The regression analysis model clearly shows the changes in the key counter's data and the factors that influence them.
S103. Perform machine learning to determine the threshold of the pseudo base station correlation coefficient.
The statistical data corresponding to the key counter is extracted from the network management database and substituted into the regression analysis model for machine learning to determine the threshold of the pseudo base station correlation coefficient. The network management database contains the performance data counted by the performance data counters of all base stations.
Machine learning requires a large amount of data. Therefore, after the regression analysis model is determined in step S102, machine learning can be performed not only on local data but also on the large volume of key-counter data and statistics available from the network management database, which makes the learning result more accurate and brings the resulting threshold of the pseudo base station correlation coefficient closer to the actual situation.
S104. Calculate the performance data of the key counter and compare it with the threshold to determine whether the pseudo base station exists.
Using the threshold of the pseudo base station correlation coefficient obtained by machine learning, the key counter data collected by the local base station is processed to obtain the current performance data, which is then compared with the threshold to determine whether a pseudo base station is currently present near the base station.
In this embodiment, the base station may be a single base station or the base stations within an area, and the selected key counter may be a single performance counter or several performance counters whose data are combined. In practice, either a single base station can be used or several base stations in an area can work together.
In this embodiment, after a pseudo base station is determined to be present, the method further includes determining its location and drawing its movement trajectory: the location of the pseudo base station is determined from the location information of the base station that detected it, and within a preset time interval, multiple locations of the pseudo base station are collected and connected to obtain its movement trajectory. Once a pseudo base station is known to be near a base station, its approximate location follows from that base station's location information; when the base stations in an area are used for detection, the location information of several base stations can be analyzed together to locate the pseudo base station more precisely.
In practice, pseudo base stations are usually mobile and rarely stay in one place. To combat the criminal use of pseudo base stations effectively and to understand their movement patterns, a pseudo base station is monitored over a time interval. Its movement trajectory makes it easier to judge where it will go next, which helps in taking action against it.
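The trajectory step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Detection` record, its field names, and `build_trajectory` are hypothetical, and the sketch simply treats the location of each detecting base station as the approximate location of the pseudo base station.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Detection:
    """One positive detection (hypothetical record layout)."""
    time: datetime      # statistical interval in which the detection occurred
    station_id: str     # identifier of the base station that detected it
    lat: float          # latitude of that base station
    lon: float          # longitude of that base station


def build_trajectory(detections, start, end):
    """Return the (lat, lon) points of detections within [start, end], ordered by time."""
    in_window = [d for d in detections if start <= d.time <= end]
    in_window.sort(key=lambda d: d.time)
    return [(d.lat, d.lon) for d in in_window]
```

Connecting the returned points in order gives a rough trajectory. Matching the points to road network data, as the graphics management module is described as doing, is outside this sketch.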
In this embodiment, comparing the analysis data of the base station's key counter with the threshold includes, but is not limited to: substituting the statistical data of the key counter into the regression analysis model to obtain performance data, and comparing the performance data with the threshold to determine whether a pseudo base station exists. The method described above is only one of many possible ways of identifying a pseudo base station. In practice, further correction parameters or other optional parameters can be added to the calculation to make the result more accurate, and in some cases an encryption algorithm can be added to keep the result from being leaked and exploited.
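The comparison step can be illustrated with a short sketch. The function name, the dictionary layout, and the `higher_is_suspicious` flag are assumptions made for illustration; the flag exists because, as Embodiment 2 notes, some coefficients indicate a pseudo base station when they are large and others when they are small.

```python
def flag_suspect_intervals(coefficients, threshold, higher_is_suspicious=True):
    """Return the interval labels whose coefficient crosses the threshold.

    coefficients maps a statistical-interval label to the coefficient
    computed for that interval (hypothetical layout).
    """
    if higher_is_suspicious:
        return [label for label, value in coefficients.items() if value > threshold]
    return [label for label, value in coefficients.items() if value < threshold]
```

For example, with coefficients of 4.0, 31.5, and 6.0 for three consecutive intervals and a threshold of 19.0, only the second interval is flagged.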
In this embodiment, the performance data counters of the base station include, but are not limited to, any one of: the average number of users; the number of successful inter-base-station, inter-cell intra-frequency handover-out executions over the X2 interface; the number of successful inter-base-station, inter-cell intra-frequency handover-in executions over the X2 interface; the number of successful inter-base-station, inter-cell inter-frequency handover-out executions over the X2 interface; the number of successful inter-base-station, inter-cell inter-frequency handover-in executions over the X2 interface; the number of successful inter-base-station, inter-cell inter-frequency handover-out executions over the S1 interface; the number of successful inter-base-station, inter-cell inter-frequency handover-in executions over the S1 interface; the number of successful inter-base-station, inter-cell intra-frequency handover-out executions over the S1 interface; and the number of successful inter-base-station, inter-cell intra-frequency handover-in executions over the S1 interface.
In this embodiment, the regression analysis model is a formula corresponding to the key counter, and the result of the formula is the threshold of the key counter's pseudo base station correlation coefficient. The formula also includes coefficients, which machine learning adjusts toward their actual values.
In this embodiment, the performance data counters include counters within a base station or counters between base stations. For example, when a moving pseudo base station stays within the coverage of one base station or of the base stations in one area, the performance data counted is that of the intra-base-station counters; when it moves beyond that coverage, the data of the inter-base-station performance counters can be used instead. This avoids errors in data coordination between base stations, reducing data collection error and making the identification result more accurate.
This embodiment provides a pseudo base station identification method that includes: determining a key counter from the performance data counters of a base station, where the data in the key counter is affected by a pseudo base station; establishing a corresponding regression analysis model according to the key counter; extracting the statistical data corresponding to the key counter from the network management database and substituting it into the regression analysis model for machine learning to determine the threshold of the pseudo base station correlation coefficient; and calculating the performance data of the key counter and comparing it with the threshold to determine whether a pseudo base station exists. Because pseudo base stations are identified from the base station's key counter data, no private user data is needed, which avoids user resistance. The regression analysis model and machine learning yield the threshold of the pseudo base station correlation coefficient, and identifying pseudo base stations against this threshold makes the result more accurate.
Further, this embodiment determines the location of the pseudo base station from base station location information and, by collecting the pseudo base station's locations over a period of time, draws its movement trajectory. The trajectory can be used to predict where the pseudo base station will move next, which helps in combating pseudo base station crime.
Embodiment 2:
Taking a 4G LTE base station as an example, this embodiment presents one implementation of the pseudo base station identification method provided by the embodiments of the present invention.
It includes the following steps:
1) Screen the performance data counters and determine the key counters.
Taking an LTE (4G) base station as an example, the following key counters are selected: the LTE average number of users; the number of successful inter-cell intra-frequency handover-out executions between eNBs (base stations) over the X2 interface (the interface between base stations); the number of successful inter-cell intra-frequency handover-in executions between eNBs over X2; the number of successful inter-cell inter-frequency handover-out executions between eNBs over X2; the number of successful inter-cell inter-frequency handover-in executions between eNBs over X2; the number of successful inter-cell inter-frequency handover-out executions between eNBs over the S1 interface (the interface between a base station and the core network); the number of successful inter-cell inter-frequency handover-in executions between eNBs over S1; the number of successful inter-cell intra-frequency handovers in between eNBs over S1; the number of successful inter-cell intra-frequency handover-out executions between eNBs over S1; and so on. For the LTE average number of users, if a pseudo base station passes through the area managed by a normal base station, the users it captures lower that base station's average number of users for the statistical granularity in question; when the pseudo base station leaves the area, the average number of users recovers. See FIG. 2, which is a schematic diagram of the influence of a pseudo base station on the LTE average-user-count counter data of a normal base station.
2) Establish a regression analysis model from the fluctuations that the pseudo base station causes in normal counter values.
The LTE key counters above are again used as the example.
Let the LTE average number of users at the current statistical-granularity time point be X_now, and let the number of users at the same statistical granularity on the n-th past day be X_n; W_n is the weight of the difference between the value on the n-th past day and the value at the same time today. The following basic formula is obtained:
Figure PCTCN2020128476-appb-000001
R_u can serve as one correlation coefficient for identifying a pseudo base station. Machine learning iteratively moves the values of W_n toward reasonable values and computes a confidence interval for R_u. Computing the performance data at a given time point then indicates directly whether a pseudo base station was present at that time.
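The formula for R_u is given only as an image in the source and is not reproduced in this text. The sketch below therefore assumes one plausible form, a weighted sum of the differences between each past day's value and the current value, based solely on the description of W_n as the weight of that difference. The function name and the exact form are assumptions, not the patent's formula.

```python
def user_count_coefficient(x_now, x_history, weights):
    """Assumed form: R_u = sum over n of W_n * (X_n - X_now).

    x_now     -- average user count in the current statistical interval
    x_history -- x_history[n] is the count in the same interval n+1 days ago
    weights   -- weights[n] is W_n for that day (learned elsewhere)
    """
    if len(x_history) != len(weights):
        raise ValueError("x_history and weights must have the same length")
    return sum(w * (x_n - x_now) for w, x_n in zip(weights, x_history))
```

Under this assumed form, a drop in the current user count relative to the same interval on past days yields a larger R_u, which matches the dip described for FIG. 2.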
Further, the counters of neighboring cells and of the serving cell can be used together for regression modeling.
When a user terminal hands over to a neighboring cell, the serving base station's handover-out count increases by 1 and the neighboring cell's handover-in count increases by 1, as shown in FIG. 3, which illustrates the influence of a normal user cell handover on the handover-in and handover-out success counters. When a user terminal is captured by a pseudo base station, however, the serving base station's handover-out count increases by 1 but the neighboring cell's handover-in count does not, as shown in FIG. 4, which illustrates the influence of a pseudo base station on a normal base station's handover-in and handover-out success counters. If many users are captured by a pseudo base station, the serving base station's handover-out count rises well above normal while the neighboring cells' handover-in counts do not rise significantly. For brevity, "handover-out count" here denotes the number of successful intra-frequency and inter-frequency handovers out between eNBs over the S1 and X2 interfaces, and "handover-in count" denotes the corresponding number of successful handovers in. Let the serving base station's handover-out count in the current statistical-granularity interval be O_now, and let the current cell's handover-out count at the same statistical granularity on the n-th past day be O_n; let the handover-in count of the k-th neighboring cell at the current statistical granularity be IK_now, and its handover-in count at the same statistical granularity on the n-th past day be IK_n. The following two formulas are obtained:
Figure PCTCN2020128476-appb-000002
Figure PCTCN2020128476-appb-000003
In the above formulas, R_o is the correlation coefficient for the base station's handover-out count; the larger R_o is, the more likely a pseudo base station is present. R_i is the correlation coefficient for the handover-in counts of the neighboring base stations' cells; the smaller R_i is, the more likely a pseudo base station is present.
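The counter behavior described above can be illustrated with a small simulation. This is only a minimal sketch, not the implementation of this application: the `CellCounters` class and both helper functions are hypothetical names introduced here, and the event counts are made-up values chosen only to show the imbalance between the two counters.

```python
from dataclasses import dataclass


@dataclass
class CellCounters:
    """Hypothetical per-cell counters, named for illustration only."""
    handover_out: int = 0
    handover_in: int = 0


def normal_handover(source: CellCounters, target: CellCounters) -> None:
    # A normal handover increments the counters on both sides.
    source.handover_out += 1
    target.handover_in += 1


def captured_by_pseudo_station(source: CellCounters) -> None:
    # Per the description: the serving cell counts a handover out,
    # but no legitimate neighboring cell counts a handover in.
    source.handover_out += 1


serving = CellCounters()
neighbor = CellCounters()
for _ in range(5):    # assumed number of normal handovers
    normal_handover(serving, neighbor)
for _ in range(20):   # assumed number of terminals captured by a pseudo base station
    captured_by_pseudo_station(serving)

# The gap between handovers out and handovers in grows with each capture.
imbalance = serving.handover_out - neighbor.handover_in
print(serving, neighbor, imbalance)
```

In this toy run, the serving cell's handover-out count grows while the neighbor's handover-in count stays at its normal level, which is the pattern R_o and R_i are meant to capture.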
Depending on what each counter measures, other counters can also be used to build regression models; this method does not enumerate them exhaustively. For GSM (a 2G protocol), UMTS (a 3G protocol), TD-SCDMA (a 3G protocol), and 5G base stations or controllers, similar counters can be selected to build regression analysis models, and the method above can be used for machine learning. The data analysis algorithm can be specified by the user or obtained through deep learning.
3) Perform machine learning on historical performance data and labeled base station information to obtain thresholds for the pseudo base station correlation coefficients.
Machine learning is performed on historical performance data for which pseudo base stations are known to have been present, yielding thresholds for correlation coefficients such as R_u, R_o, and R_i, or for a user-defined model.
4) Using the existing analysis model and thresholds, evaluate the performance data of the area to be analyzed, determine whether a pseudo base station is present, and obtain a list of base stations suspected of having a pseudo base station nearby.
The system can interface with the network management system to obtain performance data for base stations in an unknown area, then use the analysis model and threshold information to calculate and determine whether a pseudo base station exists near each target base station.
5) Finally, combine road network data with base station location information to plot the movement trajectory of the pseudo base station.
Compared with existing pseudo base station positioning solutions, the pseudo base station identification method provided in this embodiment does not require a specially designed terminal to locate the pseudo base station. One system can be used in multiple places; only the model and the performance data need to be supplied. The analysis model for the performance data is generic and user-customizable, and it applies to 2G, 3G, 4G, and 5G networks. The system can be applied to any vendor's communication equipment, provided that the equipment collects the performance data the analysis model relies on. The method does not collect data from user terminals; the performance data it relies on is entirely control plane data and involves no user plane data, so there is no risk of leaking user information.
Embodiment 3:
This embodiment builds an analysis model from the LTE average number of users. FIG. 5 is a flowchart of machine learning based on performance data according to an embodiment of the present invention. The process mainly includes the following steps (S501-S505):
Step S501: select the performance data counters that a pseudo base station would affect. In this embodiment, that is the LTE average number of users.
Step S502: build a regression analysis model from the selected counter. In this embodiment, the model is built using the R_u formula above and stored in the database.
Figure PCTCN2020128476-appb-000004
Here X_now is the LTE average number of users at the current statistical granularity time point, X_n is the number of users at the same statistical granularity on each of the past n days, and W_n is the weight applied to the difference between the value from a past day and the value at the same time today.
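The R_u formula itself appears only as a figure and is not reproduced in this text. As a rough illustration, the sketch below assumes that the coefficient is a weighted sum of the differences between the current value and the value at the same granularity on each past day. That form is an assumption inferred from the definitions of X_now, X_n, and W_n; the function name `weighted_deviation` and the sample numbers are hypothetical.

```python
from typing import Optional, Sequence


def weighted_deviation(
    current: float,
    history: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Assumed form: sum over past days n of W_n * (X_now - X_n)."""
    if weights is None:
        # Equal weights, matching the simplification W_n = 1 used in the text.
        weights = [1.0] * len(history)
    if len(weights) != len(history):
        raise ValueError("weights and history must have the same length")
    return sum(w * (current - past) for w, past in zip(weights, history))


# Hypothetical values: 120 users now versus 200, 210, and 190 on three past days.
score = weighted_deviation(120, [200, 210, 190])
print(score)
```

A strongly negative score would then indicate that the current average number of users is well below its recent level at the same time of day.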
Step S503: extract the counter data needed for this round of learning from the network management system, and import the labeled pseudo base station data.
Table 1: Example LTE average-number-of-users performance counter data imported into the system
Figure PCTCN2020128476-appb-000005
Table 1 shows example data for the LTE average number of users of the base station with ID 301 at the same statistical granularity over four days. Collectime is the data collection time; Subnetwork is the subnet ID; EnbId is the base station ID; the LTE average number of users column gives the number of users counted at the corresponding time. Counter data collected from the network management system is stored using the structure in Table 1. The Mark field indicates whether a pseudo base station is present near the base station; this data comes from records of known pseudo base stations in the public security system.
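The record structure described for Table 1 could be represented as follows. This is a sketch under assumptions: the Python field names are adapted from the column names in the text (Collectime, Subnetwork, EnbId, Mark), the field types are guesses, and the sample values other than base station ID 301 are hypothetical rather than taken from Table 1.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CounterRecord:
    collect_time: datetime       # Collectime: data collection time
    subnetwork: int              # Subnetwork: subnet ID
    enb_id: int                  # EnbId: base station ID
    avg_lte_users: float         # LTE average number of users
    mark: Optional[bool] = None  # Mark: pseudo base station nearby; None if unknown


# Hypothetical sample row; only the base station ID 301 comes from the text.
record = CounterRecord(
    collect_time=datetime(2019, 11, 1, 10, 0),
    subnetwork=1,
    enb_id=301,
    avg_lte_users=120.0,
)
print(record)
```

Leaving `mark` empty by default matches the later use of the same structure for unlabeled data.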
Step S504: perform machine learning based on the historical performance data and the labeled information on known pseudo base stations. The weights W_n should be higher the closer the day is to the current time. To simplify the description in this embodiment, every W_n is set to 1 and n is 3. The calculation gives R_u = -248; adding an error margin of plus or minus 10 gives the confidence interval of the R_u correlation coefficient, R_min to R_max = -258 to -238. This learning process is repeated, continually adjusting the weights and the error margin, until a more accurate confidence interval is obtained. For the performance data of a base station to be evaluated, if the value calculated with the R_u formula falls within the interval R_min to R_max, it can be determined that a pseudo base station is present near that base station at the time point corresponding to the data.
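The interval construction in step S504 can be sketched as below, using the values stated in the text (R_u = -248 with a margin of plus or minus 10). The function names are hypothetical, and the iterative adjustment of weights and margin is not shown.

```python
from typing import Tuple


def confidence_interval(score: float, margin: float = 10.0) -> Tuple[float, float]:
    """Return (R_min, R_max) as the labeled score plus or minus a margin."""
    return (score - margin, score + margin)


def in_interval(value: float, interval: Tuple[float, float]) -> bool:
    """True if the value lies within R_min..R_max, endpoints included."""
    low, high = interval
    return low <= value <= high


interval = confidence_interval(-248)
print(interval)                     # (-258.0, -238.0)
print(in_interval(-250, interval))  # True
print(in_interval(-100, interval))  # False
```

Whether the endpoints are included is a choice made here for the sketch; the text only says the value falls within the interval.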
The more performance data and pseudo base station samples are learned from, the more accurate the resulting confidence interval. Data subject to special influences, such as data from holidays or edge base stations, is removed before learning.
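The pre-learning cleanup could look like the following sketch. How holidays and edge base stations are actually identified is not described in the text, so the sets passed in here, the dictionary keys, and the sample rows are all assumptions made for illustration.

```python
from datetime import date
from typing import Dict, List, Set


def filter_training_rows(
    rows: List[Dict],
    holidays: Set[date],
    edge_enb_ids: Set[int],
) -> List[Dict]:
    """Drop rows collected on holidays or belonging to edge base stations."""
    return [
        row for row in rows
        if row["date"] not in holidays and row["enb_id"] not in edge_enb_ids
    ]


# Hypothetical rows and exclusion sets.
rows = [
    {"date": date(2019, 10, 1), "enb_id": 301, "avg_users": 40.0},   # holiday
    {"date": date(2019, 10, 9), "enb_id": 301, "avg_users": 118.0},
    {"date": date(2019, 10, 9), "enb_id": 999, "avg_users": 12.0},   # edge station
]
clean = filter_training_rows(rows, {date(2019, 10, 1)}, {999})
print(clean)
```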
Step S505: store the learned confidence interval in the database.
For 4G base stations, the algorithm is the set of formulas established in this document from the LTE average number of users and the inter-eNB S1 and X2 interface handover-in and handover-out counts. The machine learning model used to identify pseudo base stations can be pre-built or user-defined. Analysis models can be stored in the cloud, where a cloud model library can be established.
Embodiment 4:
This embodiment builds an analysis model from the inter-eNB handover-in and handover-out counts. Embodiment 4 is similar to Embodiment 3 and likewise follows the steps in FIG. 5.
Step S501: the selected counters are the numbers of successful handovers in and out over the inter-eNB S1 and X2 interfaces.
Step S502: build the model using R_o and R_i above.
Figure PCTCN2020128476-appb-000006
Figure PCTCN2020128476-appb-000007
In the formulas, R_o is the correlation coefficient for the base station's handover-out count; the larger R_o is, the more likely a pseudo base station is present. R_i is the correlation coefficient for the handover-in counts of the neighboring base stations' cells; the smaller R_i is, the more likely a pseudo base station is present.
Step S503: extract the counter data needed for this round of learning from the network management system, and import the labeled pseudo base station data. For simplicity, only S1 interface data is used as an example:
Table 2: Example inter-eNB S1 handover-out count performance counter data imported into the system
Figure PCTCN2020128476-appb-000008
Figure PCTCN2020128476-appb-000009
Table 2 shows example inter-eNB S1 handover-out count performance counter data for the base station with ID 301 at the same statistical granularity over four days.
Table 3: Example inter-eNB S1 handover-in count performance counter data for two neighboring base stations
Figure PCTCN2020128476-appb-000010
Figure PCTCN2020128476-appb-000011
Table 3 shows example inter-eNB S1 handover-in count performance counter data for the two base stations with IDs 302 and 303. Base stations 302 and 303 are both adjacent to base station 301.
Step S504: perform machine learning based on the historical performance data and the labeled information on known pseudo base stations.
To simplify the description, the weight of each correlation coefficient is again set to 1. Substituting the data into the formulas gives R_o = 705 and R_i = -11. Therefore, for a base station to be evaluated, if its R_o is greater than or equal to 705 and the R_i of its neighboring base stations is less than or equal to -11, it can be determined that a pseudo base station is suspected near that base station. As before, the more sample data is learned from, the more accurate the thresholds and weights for R_o and R_i become.
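The handover-based check can be sketched as follows. The R_o and R_i formulas appear only as figures, so this sketch assumes the same weighted-difference form for both, with R_i summed over all neighboring base stations; that form, the function names, and the counter values are assumptions. Only the thresholds 705 and -11 and the base station IDs come from the text.

```python
from typing import Dict, Sequence, Tuple


def weighted_deviation(current: float, history: Sequence[float]) -> float:
    """Assumed form with all weights equal to 1: sum of (current - past)."""
    return sum(current - past for past in history)


def handover_scores(
    out_now: float,
    out_history: Sequence[float],
    in_now: Dict[int, float],
    in_history: Dict[int, Sequence[float]],
) -> Tuple[float, float]:
    """Return (R_o, R_i); R_i is assumed to aggregate over neighbor IDs."""
    r_o = weighted_deviation(out_now, out_history)
    r_i = sum(weighted_deviation(in_now[k], in_history[k]) for k in in_now)
    return r_o, r_i


def is_suspect(r_o: float, r_i: float,
               r_o_min: float = 705, r_i_max: float = -11) -> bool:
    # Thresholds taken from the text: R_o >= 705 and R_i <= -11.
    return r_o >= r_o_min and r_i <= r_i_max


# Hypothetical counts for base station 301 and its neighbors 302 and 303.
r_o, r_i = handover_scores(
    out_now=500, out_history=[200, 210, 190],
    in_now={302: 95, 303: 50},
    in_history={302: [100, 98, 99], 303: [50, 51, 49]},
)
print(r_o, r_i, is_suspect(r_o, r_i))
```

With these made-up counts, handovers out rise sharply while handovers in at the neighbors dip slightly, so both conditions are met.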
Step S505: store the learned confidence interval in the database.
Embodiment 5:
This embodiment builds a regression analysis model from both the LTE average number of users and the inter-eNB handover-in and handover-out counts. The steps of Embodiment 5 are the same as those of Embodiments 3 and 4. To make the result more accurate, the data from the two preceding embodiments is evaluated together.
Step S501: select the performance data counters that a pseudo base station would affect. In this embodiment, the counters of both Embodiment 3 and Embodiment 4 are selected.
Step S502: build a regression analysis model from the selected counters. In this embodiment, the model is again built using R_u, R_o, and R_i above, and the thresholds of the coefficients take the values from the two preceding embodiments, where:
R_u is the correlation coefficient of the LTE average number of users counter.
R_o is the correlation coefficient of the serving base station's successful handover-out counter.
R_i is the correlation coefficient of the neighboring base stations' successful handover-in counter.
The following model is established:
R = (-258 < R_u < -238) AND (R_o ≥ 705) AND (R_i ≤ -11)
This formula means that if, after a base station's data has been evaluated, the R_u correlation coefficient falls within the confidence interval -258 to -238, R_o is greater than or equal to 705, and R_i is less than or equal to -11, it can be determined that a pseudo base station is suspected near that base station.
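The combined rule translates directly into a boolean predicate. The sketch below follows the formula as written, including its strict inequalities on R_u; the function name and the default parameter layout are choices made here for illustration.

```python
from typing import Tuple


def combined_rule(
    r_u: float,
    r_o: float,
    r_i: float,
    r_u_interval: Tuple[float, float] = (-258, -238),
    r_o_min: float = 705,
    r_i_max: float = -11,
) -> bool:
    """R = (R_min < R_u < R_max) AND (R_o >= 705) AND (R_i <= -11)."""
    low, high = r_u_interval
    return (low < r_u < high) and (r_o >= r_o_min) and (r_i <= r_i_max)


print(combined_rule(-248, 705, -11))   # True: all three conditions hold
print(combined_rule(-248, 700, -11))   # False: R_o is below its threshold
print(combined_rule(-258, 800, -20))   # False: R_u is on the excluded boundary
```

Requiring all three conditions at once trades some sensitivity for fewer false alarms, which matches the stated aim of making the result more accurate.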
Steps S503, S504, and S505 are the same as in Embodiments 3 and 4.
Embodiment 6:
This embodiment identifies pseudo base stations in an unknown area and plots the pseudo base station trajectory. FIG. 6 is a flowchart of plotting the pseudo base station trajectory for an unknown area based on the learned confidence interval. The process includes the following steps:
Step S601: extract historical data for the required performance counters from the network management system and store it in the database in the same format as described for Table 1, leaving the Mark field empty.
Step S602: for each base station and each statistical granularity, evaluate the data using the formulas in Embodiment 3, Embodiment 4, or Embodiment 5 to obtain the pseudo base station correlation coefficient R of each base station at each time point. If R falls within the confidence interval R_min to R_max, set the Mark field of that base station's record to True. After the data of all base stations has been evaluated, filter for the records whose Mark field is True to obtain the IDs of all base stations judged to have a pseudo base station nearby, as shown in Table 4 below.
Table 4: List of base stations with a pseudo base station nearby
Figure PCTCN2020128476-appb-000012
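Step S602 can be sketched as a pass over the stored records that fills in the Mark field and then collects the flagged base station IDs. The dictionary layout, the precomputed "R" field, and the sample values are assumptions for illustration; in practice the score would come from the formulas of Embodiments 3 to 5.

```python
from typing import Callable, Dict, List, Tuple


def mark_suspects(
    records: List[Dict],
    score: Callable[[Dict], float],
    interval: Tuple[float, float],
) -> List[int]:
    """Set Mark on every record and return the sorted IDs of flagged base stations."""
    low, high = interval
    for rec in records:
        rec["Mark"] = low <= score(rec) <= high
    return sorted({rec["EnbId"] for rec in records if rec["Mark"]})


# Hypothetical records with a precomputed coefficient R and an empty Mark field.
records = [
    {"EnbId": 301, "R": -250.0, "Mark": None},
    {"EnbId": 302, "R": -12.0, "Mark": None},
    {"EnbId": 303, "R": -240.0, "Mark": None},
]
suspects = mark_suspects(records, lambda rec: rec["R"], (-258.0, -238.0))
print(suspects)
```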
Step S603: because the location of every base station is known, the base stations in Table 4 can be marked as a set of points on a map using their latitude and longitude together with map data. Combining these points with road network information then yields the trajectory of the pseudo base station.
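The first half of step S603 can be sketched by ordering the flagged records in time and looking up each base station's coordinates. Matching the points to the road network is not shown, because the text does not describe how it is done. The function name, dictionary keys, coordinates, and timestamps below are hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Tuple


def trajectory_points(
    flagged: List[Dict],
    locations: Dict[int, Tuple[float, float]],
) -> List[Tuple[float, float]]:
    """Return (latitude, longitude) points ordered by collection time."""
    ordered = sorted(flagged, key=lambda rec: rec["Collectime"])
    return [locations[rec["EnbId"]] for rec in ordered]


# Hypothetical base station coordinates and flagged records.
locations = {301: (31.20, 121.40), 302: (31.21, 121.42), 303: (31.23, 121.45)}
flagged = [
    {"EnbId": 303, "Collectime": datetime(2019, 11, 1, 10, 30)},
    {"EnbId": 301, "Collectime": datetime(2019, 11, 1, 10, 0)},
    {"EnbId": 302, "Collectime": datetime(2019, 11, 1, 10, 15)},
]
points = trajectory_points(flagged, locations)
print(points)
```

Connecting the ordered points gives a coarse path; a road network match would then refine it to plausible streets.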
The pseudo base station positioning method described above, based on base station performance big data and machine learning, is broadly applicable. The system can be applied to any vendor's communication equipment, provided that the equipment collects the performance data the analysis model relies on. The analysis model can be modified to suit actual conditions. Once machine learning is complete, the system can analyze data from different regions and different vendors. The method is low in cost and widely applicable, and it helps the public security system carry out targeted arrest operations in the areas under its jurisdiction.
Embodiment 7:
Referring to FIG. 7, which is a schematic diagram of a pseudo base station identification system provided by this embodiment, the pseudo base station identification system includes a base station, a network management server, and a network management database.
The network management database stores the performance data collected by the performance data counters of all base stations.
The base station is configured to use its performance data counters to generate the performance data corresponding to the key counters.
The network management server includes a data extraction module, a rule learning module, and a data storage module.
The data extraction module is connected to the network management database and extracts the performance data corresponding to the key counters from the network management database.
The rule learning module is configured to formulate the analysis model and to calculate the thresholds of the pseudo base station correlation coefficients from the performance data corresponding to the key counters extracted from the network management database.
The data storage module is configured to store the data required to implement the pseudo base station identification method described in Embodiments 1 to 6 of the present application.
In this embodiment, the network management server further includes a user interaction module and a graphics management module.
The user interaction module is configured to connect to user terminals through the base station and issue pseudo base station alerts.
The graphics management module is configured to generate the movement trajectory of the pseudo base station.
For practical use, this embodiment also provides a schematic diagram of another pseudo base station identification system for reference. See FIG. 8, which is a schematic diagram of another pseudo base station identification system provided by this embodiment. FIG. 8 includes a user interaction module, a data extraction module, a data storage module, a database, a rule learning module, and a graphics management module. The rule learning module in turn contains an analysis model management module, a threshold management module, a counter model management module, and a base station model management module.
To ensure that the data in the network management database covers a wider area and provides more information, the network management database can be updated by connecting to external systems. The external systems that this system interfaces with include the network management system and an external database. The network management system provides the historical performance data of the base stations. The external database provides information such as maps, road networks, and known pseudo base stations.
In FIG. 8, the data extraction module interfaces with the live network's network management system, extracts the required counters, base station location information, base-station-to-cell mappings, cell neighbor relations, and other information from the network management system database, converts it into the format required by this system, and stores it in the data storage module.
The data storage module stores data such as performance counter history, known pseudo base station location information, road network data, regression analysis models, decision thresholds, and pseudo base station trajectory maps.
The rule learning module formulates the analysis model and calculates the thresholds of the correlation coefficients from historical performance data.
The user interaction module handles user interaction and displays pseudo base station trajectories.
The graphics management module matches base station location information against road network information and draws the pseudo base station trajectory map.
This embodiment also provides a computer-readable storage medium. The computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the pseudo base station identification method described in Embodiments 1 to 6 of the present application.
Beneficial Effects
The embodiments of the present invention provide a pseudo base station identification method, a system, and a computer-readable storage medium that identify pseudo base stations without involving user privacy. The pseudo base station identification method includes: determining key counters from the performance data counters of a base station, where the data in the key counters is affected by a pseudo base station; building a corresponding regression analysis model from the key counters; extracting the statistical data corresponding to the key counters from the network management database and substituting it into the regression analysis model for machine learning to determine the thresholds of the pseudo base station correlation coefficients; and calculating the performance data of the key counters and comparing it with the thresholds to determine whether a pseudo base station exists. Identifying pseudo base stations from key counter data obtained from the base stations avoids user privacy issues, and introducing machine learning into the decision makes pseudo base station identification more accurate.
It should be apparent to those skilled in the art that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and devices, can be implemented as software (which can be realized as computer program code executable by a computing device), firmware, hardware, or appropriate combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
In addition, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, computer program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery medium. The present invention is therefore not limited to any specific combination of hardware and software.
The above is a further detailed description of the embodiments of the present invention in connection with specific implementations, and the specific implementation of the present invention should not be considered limited to these descriptions. Those of ordinary skill in the technical field to which the present invention belongs may make a number of simple deductions or substitutions without departing from the concept of the present invention, and all of these shall be regarded as falling within the protection scope of the present invention.

Claims (11)

  1. A pseudo base station identification method, comprising:
    determining a key counter from the performance data counters of a base station, wherein the data in the key counter is affected by a pseudo base station;
    building a corresponding regression analysis model according to the key counter;
    extracting statistical data corresponding to the key counter from a network management database and substituting it into the regression analysis model for machine learning to determine a threshold of a pseudo base station correlation coefficient, wherein the network management database contains the performance data collected by the performance data counters of all base stations; and
    calculating the performance data of the key counter and comparing it with the threshold to determine whether the pseudo base station exists.
  2. The pseudo base station identification method according to claim 1, wherein the base station comprises a single base station or the base stations in an area, and the key counter comprises at least one type of performance data counter.
  3. The pseudo base station identification method according to claim 2, wherein, after determining that the pseudo base station exists, the method further comprises: determining the location of the pseudo base station and plotting the movement trajectory of the pseudo base station;
    determining the location of the pseudo base station according to the location information of the base station that detected the pseudo base station, and, within a preset time interval, collecting multiple pieces of location information of the pseudo base station and connecting them to obtain the movement trajectory of the pseudo base station.
  4. The pseudo base station identification method according to claim 2, wherein calculating the analysis data of the key counter of the base station and comparing it with the threshold comprises:
    substituting the statistical data of the key counter into the regression analysis model to obtain performance data, and comparing the performance data with the threshold to determine whether a pseudo base station exists.
  5. The pseudo base station identification method according to claim 2, wherein the performance data counters of the base station comprise any one of: the average number of users; the number of successful inter-cell intra-frequency handover-out executions over the inter-base-station X2 interface; the number of successful inter-cell intra-frequency handover-in executions over the inter-base-station X2 interface; the number of successful inter-cell inter-frequency handover-out executions over the inter-base-station X2 interface; the number of successful inter-cell inter-frequency handover-in executions over the inter-base-station X2 interface; the number of successful inter-cell inter-frequency handover-out executions over the inter-base-station S1 interface; the number of successful inter-cell inter-frequency handover-in executions over the inter-base-station S1 interface; the number of successful inter-cell intra-frequency handover-out executions over the inter-base-station S1 interface; and the number of successful inter-cell intra-frequency handover-in executions over the inter-base-station S1 interface.
  6. The pseudo base station identification method according to claim 2, wherein the regression analysis model is a calculation formula corresponding to the key counter, and the result of the calculation formula is the pseudo base station correlation coefficient of the key counter.
  7. The pseudo base station identification method according to claim 6, wherein the calculation formula further includes a reasonable coefficient, and the machine learning brings the reasonable coefficient close to its actual value.
  8. The pseudo base station identification method according to any one of claims 1-7, wherein the performance data counters comprise performance data counters within a base station or performance data counters between base stations.
  9. A pseudo base station identification system, comprising:
    a network management database, which stores the performance data collected by the performance data counters of all base stations;
    a base station, configured to use performance data counters to generate performance data corresponding to a key counter; and
    a network management server, which comprises:
    a data extraction module, configured to connect to the network management database and extract the performance data corresponding to the key counter from the network management database;
    a planning learning module, configured to formulate an analysis model and calculate a threshold of a pseudo base station correlation coefficient according to the performance data corresponding to the key counter extracted from the network management database; and
    a data storage module, configured to store the data required to implement the pseudo base station identification method according to any one of claims 1-8.
  10. The pseudo base station identification system according to claim 9, wherein the network management server further comprises:
    a user interaction module, configured to connect to a user terminal through the base station and to issue pseudo base station reminders;
    a graphics management module, configured to generate the movement trajectory of the pseudo base station.
  11. A computer-readable storage medium storing one or more programs, wherein the one or more programs are executable by one or more processors to implement the steps of the pseudo base station identification method according to any one of claims 1 to 8.
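
As an illustration only, the relationship between the calculation formula of claim 6, the reasonable coefficient of claim 7, and the threshold computed by the planning and learning module of claim 9 might be sketched as follows. The claims do not disclose the formula, the learning procedure, or the threshold rule, so everything here is an assumption: the relative-deviation formula, the use of ordinary least squares as a stand-in for the machine learning step, the mean-plus-n-sigma threshold, and all function and parameter names are hypothetical.

```python
from statistics import mean, stdev


def relative_deviation(counter_value, baseline):
    """Relative deviation of a key counter from its baseline (assumed feature).

    The baseline is assumed to be a non-zero reference value for the counter.
    """
    return abs(counter_value - baseline) / baseline


def correlation_coefficient(counter_value, baseline, k):
    """Hypothetical calculation formula for one key counter.

    k plays the role of the 'reasonable coefficient'; the actual formula
    is not given in the claims.
    """
    return k * relative_deviation(counter_value, baseline)


def fit_reasonable_coefficient(samples, baseline):
    """Estimate k by least squares from (counter_value, labelled_coefficient) pairs.

    Least squares is a stand-in for the unspecified machine learning step.
    """
    xs = [relative_deviation(value, baseline) for value, _ in samples]
    ts = [target for _, target in samples]
    return sum(x * t for x, t in zip(xs, ts)) / sum(x * x for x in xs)


def coefficient_threshold(normal_values, baseline, k, n_sigma=3.0):
    """Assumed threshold rule: mean plus n_sigma standard deviations of the
    coefficients observed on counter values known to be normal."""
    coeffs = [correlation_coefficient(v, baseline, k) for v in normal_values]
    return mean(coeffs) + n_sigma * stdev(coeffs)


# Made-up data: a counter with baseline 100 and three labelled samples.
labelled = [(110, 0.2), (150, 1.0), (80, 0.4)]
k = fit_reasonable_coefficient(labelled, baseline=100)
threshold = coefficient_threshold([98, 101, 103, 99, 100], baseline=100, k=k)
suspect = correlation_coefficient(150, 100, k) > threshold
```

In this sketch a counter value is flagged as possibly affected by a pseudo base station when its coefficient exceeds the learned threshold; a real deployment would use the formulas and thresholds defined by the method claims.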
PCT/CN2020/128476 2019-11-14 2020-11-12 Pseudo base station identification method and system, and computer readable storage medium WO2021093823A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911114985.XA CN112804701A (en) 2019-11-14 2019-11-14 Pseudo base station identification method, system and computer readable storage medium
CN201911114985.X 2019-11-14

Publications (1)

Publication Number Publication Date
WO2021093823A1 (en) 2021-05-20

Family

ID=75803786

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/128476 WO2021093823A1 (en) 2019-11-14 2020-11-12 Pseudo base station identification method and system, and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN112804701A (en)
WO (1) WO2021093823A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106304086A (en) * 2016-08-17 2017-01-04 努比亚技术有限公司 Pseudo-base station recognition methods and device
US9628994B1 (en) * 2015-12-30 2017-04-18 Argela Yazilim ve Bilisim Teknolojileri San. ve Tic. A.S. Statistical system and method for catching a man-in-the-middle attack in 3G networks
CN108260126A (en) * 2016-12-29 2018-07-06 中国移动通信集团浙江有限公司 A kind of pseudo-base station recognition positioning method and device
CN109219049A (en) * 2018-09-21 2019-01-15 新华三技术有限公司成都分公司 Pseudo-base station recognition methods, device and computer readable storage medium
CN110312259A (en) * 2019-08-20 2019-10-08 Oppo广东移动通信有限公司 Pseudo-base station recognition methods, device, terminal and storage medium

Also Published As

Publication number Publication date
CN112804701A (en) 2021-05-14

Similar Documents

Publication Publication Date Title
WO2017185664A1 (en) Method for positioning terminal, and network device
US11012461B2 (en) Network device vulnerability prediction
CN105744553B (en) Network association analysis method and device
US20060270400A1 (en) Methods and structures for improved monitoring and troubleshooting in wireless communication systems
US20140004829A1 (en) Mobile device and method to monitor a baseband processor in relation to the actions on an applicaton processor
EP4236240A2 (en) Network anomaly detection
EP3413512B1 (en) Alarm information processing method, apparatus and system
EP3132592A1 (en) Method and system for identifying significant locations through data obtainable from a telecommunication network
CN107147521B (en) Early warning and monitoring method for complaint service
CN105144773A (en) Updating stored information about wireless access points
CN106899948B (en) Pseudo base station discovery method, system, terminal and server
CN107567030A (en) A kind of method and system investigated with evading pseudo-base station interference
CN103731866A (en) Method and system for detecting performance of subscriber terminals
Moysen et al. Unsupervised learning for detection of mobility related anomalies in commercial LTE networks
WO2021093823A1 (en) Pseudo base station identification method and system, and computer readable storage medium
KR102333866B1 (en) Method and Apparatus for Checking Problem in Mobile Communication Network
US20180160314A1 (en) Identification method, device and system for wcdma network cell soft switching band and storage medium
CN106878965A (en) A kind of method and apparatus for assessing mobile terminal performance
US10721707B2 (en) Characterization of a geographical location in a wireless network
US20210368431A1 (en) Automatic evaluation and management of slice reselection experiences
CN111465030B (en) Indoor MDT longitude and latitude backfill method, device, computer equipment and storage medium
EP3952377B1 (en) Detection method, apparatus and system for unauthorized unmanned aerial vehicle
WO2022149149A1 (en) Artificial intelligence with dynamic causal model for failure analysis in mobile communication network
CN111372270B (en) Method, device, equipment and medium for determining suspected fault cell
Chernogorov et al. Data Mining Approach to Detection of Random Access Sleeping Cell Failures in Cellular Mobile Networks

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20887943

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 20887943

Country of ref document: EP

Kind code of ref document: A1