CN116567672A - High-speed rail user identification method, device and storage medium - Google Patents

Publication number
CN116567672A
Authority
CN
China
Prior art keywords
speed rail
cell
model
speed
serving cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310739330.1A
Other languages
Chinese (zh)
Inventor
杨飞虎
刘贤松
欧大春
姜志恒
易峰
赵南疆
徐静静
张忠平
尹劲松
曾毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Application filed by China United Network Communications Group Co Ltd filed Critical China United Network Communications Group Co Ltd
Priority to CN202310739330.1A priority Critical patent/CN116567672A/en
Publication of CN116567672A publication Critical patent/CN116567672A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/02 Arrangements for optimising operational condition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/08 Testing, supervising or monitoring using real traffic
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The application provides a high-speed rail user identification method, a high-speed rail user identification device and a storage medium, which relate to the field of communication and can identify high-speed rail users efficiently and accurately. The method comprises the following steps: determining a high-speed rail main control cell; the high-speed rail main control cell is used for providing services for the high-speed rail user group; determining a first model according to fingerprint characteristics of a high-speed rail main control cell; the first model is used for determining the confidence coefficient of the high-speed rail main control cell; determining a first target service cell according to the confidence coefficient of the high-speed rail main control cell; the first target serving cell is a serving cell with confidence coefficient larger than a preset confidence coefficient threshold value in a high-speed rail main control cell; determining a second model according to the first target service cell; the second model is used for identifying the high-speed rail user; and identifying the high-speed rail users in the users to be identified according to the second model. The method and the device are used for identifying the high-speed rail users.

Description

High-speed rail user identification method, device and storage medium
Technical Field
The present disclosure relates to the field of communications, and in particular, to a method and apparatus for identifying a high-speed rail user, and a storage medium.
Background
In recent years, high-speed rails are becoming the first choice for people to travel long distances and commute. As the number of people traveling on high-speed rails increases, the internet surfing demands of high-speed rail users on the high-speed rails also increase. In this context, it is important for operators to build a high-quality network to meet the internet surfing demands of high-speed rail users on high-speed rails. Therefore, accurate identification of the high-speed rail users is required, so that the quality of the high-speed rail network is optimized, and the perception of the users is improved. The method for identifying the high-speed rail users at the present stage still cannot identify the high-speed rail users efficiently and accurately.
Disclosure of Invention
The application provides a high-speed rail user identification method, a high-speed rail user identification device and a storage medium, which can identify high-speed rail users.
In order to achieve the above purpose, the present application adopts the following technical scheme:
In a first aspect, the present application provides a method for identifying a high-speed rail user, the method comprising: determining a high-speed rail main control cell, where the high-speed rail main control cell is used for providing services for the high-speed rail user group; determining a first model according to fingerprint characteristics of the high-speed rail main control cell, where the first model is used for determining the confidence coefficient of the high-speed rail main control cell; determining a first target serving cell according to the confidence coefficient of the high-speed rail main control cell, where the first target serving cell is a high-speed rail main control cell whose confidence coefficient is larger than a preset confidence coefficient threshold value; determining a second model according to the first target serving cell, where the second model is used for identifying high-speed rail users; and identifying the high-speed rail users among the users to be identified according to the second model.
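The threshold step in the first aspect can be illustrated with a minimal sketch. The cell identifiers, confidence values and the threshold of 0.80 below are hypothetical and are not taken from the application; the sketch only shows that a cell is kept when its confidence is strictly greater than the preset threshold.

```python
from typing import Dict, List


def select_first_target_cells(confidence: Dict[str, float], threshold: float) -> List[str]:
    """Keep the main control cells whose confidence is strictly greater than the threshold."""
    return sorted(eci for eci, score in confidence.items() if score > threshold)


# Hypothetical confidences, standing in for the output of the first model.
confidence_by_eci = {"cell-A": 0.97, "cell-B": 0.42, "cell-C": 0.88, "cell-D": 0.80}

first_target_cells = select_first_target_cells(confidence_by_eci, threshold=0.80)
print(first_target_cells)  # cell-D is excluded because 0.80 is not greater than 0.80
```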
In one possible implementation manner, determining the high-speed rail master cell specifically includes: acquiring a second target serving cell; the second target serving cell is a serving cell with switching times smaller than preset times in a preset time period on the high-speed railway line; determining a high-speed rail user group from the user groups corresponding to the second target service cell according to the long interval speed algorithm; and determining a high-speed railway master control cell according to the first preset characteristic condition and the high-speed railway user group.
In one possible implementation, the first preset feature condition includes one or more of the following: the moving route has periodicity, the distance between the moving route and the high-speed railway route is smaller than a first preset distance threshold value, and ECI of a corresponding serving cell in a preset period is the same.
In one possible implementation manner, acquiring the second target serving cell specifically includes: determining a third target serving cell, where the shortest distance between the third target serving cell and the high-speed railway line is smaller than a second preset distance threshold value; acquiring S1-MME interface data of users in the third target serving cell, where the S1-MME interface data comprises the ECI of the third target serving cell; sorting the S1-MME interface data by time; and acquiring the second target serving cell from the third target serving cells according to the sorted S1-MME interface data.
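A rough sketch of the time-sorting and handover-counting steps follows. The record layout `(user_id, timestamp_seconds, eci)` and the rule that a handover is attributed to the cell the user leaves are assumptions made for illustration; the application does not specify either.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical record layout: (user_id, timestamp_seconds, eci).
Record = Tuple[str, float, str]


def count_handovers_per_cell(records: Iterable[Record]) -> Counter:
    """Count, for each cell, how many times a user left it for a different cell."""
    by_user: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for user_id, timestamp, eci in records:
        by_user[user_id].append((timestamp, eci))

    handovers: Counter = Counter()
    for events in by_user.values():
        events.sort()  # time-order this user's records
        for (_, prev_eci), (_, next_eci) in zip(events, events[1:]):
            if prev_eci != next_eci:
                handovers[prev_eci] += 1  # assumption: attribute the handover to the source cell
    return handovers


def select_second_target_cells(
    third_target_cells: Set[str], records: Iterable[Record], max_handovers: int
) -> Set[str]:
    """Keep the candidate cells whose handover count is below the preset number."""
    handovers = count_handovers_per_cell(records)
    return {eci for eci in third_target_cells if handovers[eci] < max_handovers}


sample_records = [
    ("u1", 30.0, "B"), ("u1", 10.0, "A"), ("u1", 50.0, "C"),
    ("u2", 5.0, "A"), ("u2", 20.0, "B"), ("u2", 40.0, "A"),
]
handover_counts = count_handovers_per_cell(sample_records)
second_target_cells = select_second_target_cells({"A", "B", "C"}, sample_records, max_handovers=2)
```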
In one possible implementation, the first model and the second model are constructed according to the LightGBM algorithm. In one possible implementation manner, determining the second model according to the first target serving cell specifically includes: constructing an initial model according to a LightGBM algorithm; judging whether the recognition accuracy of the initial model is greater than or equal to an accuracy threshold according to a K-fold cross-validation algorithm; and determining the initial model as a second model in the case that the recognition accuracy of the initial model is greater than or equal to the accuracy threshold.
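The acceptance check based on K-fold cross-validation can be sketched as follows. To keep the example self-contained, a trivial speed-threshold learner (`fit_speed_threshold`, hypothetical) stands in for the LightGBM model, and the data, the value of K and the accuracy threshold of 0.95 are all illustrative assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds; return (train, validation) index lists."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, validation))
        start += size
    return folds


def cross_validated_accuracy(
    features: Sequence[float],
    labels: Sequence[int],
    fit: Callable[[List[float], List[int]], Callable[[float], int]],
    k: int,
) -> float:
    """Train on k-1 folds, score on the held-out fold, and average the k accuracies."""
    accuracies = []
    for train_idx, val_idx in k_fold_indices(len(labels), k):
        predict = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(features[i]) == labels[i] for i in val_idx)
        accuracies.append(correct / len(val_idx))
    return sum(accuracies) / k


def fit_speed_threshold(xs: List[float], ys: List[int]) -> Callable[[float], int]:
    """Stand-in learner (not LightGBM): cut halfway between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > cut else 0


# Hypothetical samples: average speed in km/h, label 1 = high-speed rail user.
speeds = [250.0, 40.0, 280.0, 60.0, 300.0, 30.0, 270.0, 50.0, 260.0, 70.0]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

accuracy = cross_validated_accuracy(speeds, labels, fit_speed_threshold, k=5)
ACCURACY_THRESHOLD = 0.95  # hypothetical threshold
accepted_as_second_model = accuracy >= ACCURACY_THRESHOLD
```

With a real gradient-boosting model, only the `fit` callable would change; the fold construction and the acceptance test stay the same.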
In a second aspect, the present application provides a high-speed rail user identification apparatus, the apparatus comprising: a processing unit; the processing unit is used for determining a high-speed rail main control cell; the high-speed rail main control cell is used for providing services for the high-speed rail user group; the processing unit is also used for determining a first model according to the fingerprint characteristics of the high-speed rail main control cell; the first model is used for determining the confidence coefficient of the high-speed rail main control cell; the processing unit is further used for determining a first target service cell according to the confidence coefficient of each cell; the first target serving cell is a serving cell with confidence coefficient larger than a preset confidence coefficient threshold value in a high-speed rail main control cell; the processing unit is further used for determining a second model according to the first target service cell; the second model is used for identifying the high-speed rail user; and the processing unit is also used for identifying the high-speed rail users in the users to be identified according to the second model.
In one possible implementation, the apparatus further includes: an acquisition unit; an acquisition unit, configured to acquire a second target serving cell; the second target serving cell is a serving cell with switching times smaller than preset times in a preset time period on the high-speed railway line; the processing unit is further used for determining a high-speed railway user group from the user groups corresponding to the second target service cell according to the long interval speed algorithm; the processing unit is further used for determining a high-speed railway master control cell according to the first preset characteristic condition and the high-speed railway user group.
In one possible implementation, the first preset feature condition includes one or more of the following: the moving route has periodicity, the distance between the moving route and the high-speed railway route is smaller than a first preset distance threshold value, and ECI of a corresponding serving cell in a preset period is the same.
In one possible implementation manner, acquiring the second target serving cell specifically includes: the processing unit is further used for determining a third target serving cell, where the shortest distance between the third target serving cell and the high-speed railway line is smaller than a second preset distance threshold value; the acquisition unit is further used for acquiring S1-MME interface data of users in the third target serving cell, where the S1-MME interface data comprises the ECI of the third target serving cell; the processing unit is also used for sorting the S1-MME interface data by time; and the processing unit is also used for acquiring the second target serving cell from the third target serving cells according to the sorted S1-MME interface data.
In a possible implementation, the processing unit is further configured to construct the first model and the second model according to a LightGBM algorithm. In a possible implementation manner, the processing unit is further configured to determine, according to the first target serving cell, a second model, and specifically includes: the processing unit is also used for constructing an initial model according to the LightGBM algorithm; the processing unit is also used for judging whether the recognition accuracy of the initial model is greater than or equal to an accuracy threshold according to a K-fold cross-validation algorithm; and the processing unit is further used for determining the initial model as the second model under the condition that the identification accuracy of the initial model is greater than or equal to the accuracy threshold.
In a third aspect, the present application provides a high-speed rail user identification apparatus, the apparatus comprising: a processor and a communication interface; the communication interface is coupled to the processor, and the processor is used for running a computer program or instructions to implement the high-speed rail user identification method described in the first aspect or any one of the possible implementations of the first aspect.
In a fourth aspect, the present application provides a computer readable storage medium having instructions stored therein which, when run on a terminal, cause the terminal to perform the high-speed rail user identification method described in the first aspect or any one of the possible implementations of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising instructions that, when run on a high-speed rail user identification device, cause the high-speed rail user identification device to perform the high-speed rail user identification method described in the first aspect or any one of the possible implementations of the first aspect.
In a sixth aspect, embodiments of the present application provide a chip comprising a processor and a communication interface, the communication interface and the processor being coupled, the processor being configured to execute a computer program or instructions to implement the high-speed rail user identification method described in the first aspect or any one of the possible implementations of the first aspect.
Specifically, the chip provided in the embodiments of the present application further includes a memory, configured to store a computer program or instructions.
Based on the above technical scheme, the high-speed rail user identification method provided by the embodiment of the application proceeds as follows. A high-speed rail main control cell is determined first. A first model is then determined according to the fingerprint features of the high-speed rail main control cell and the LightGBM algorithm, and the confidence of every high-speed rail main control cell is calculated according to the first model. A serving cell whose confidence is larger than a preset confidence threshold value is determined as a first target serving cell. An initial model is then constructed according to the LightGBM algorithm and the first target serving cell, and trained according to the features of the first target serving cell. When the identification accuracy of the initial model is larger than or equal to the accuracy threshold value, the initial model is determined as the second model, and the high-speed rail users among the users to be identified are identified according to the second model. Therefore, the method and the device can accurately identify high-speed rail users, so that potential coverage quality problems are found, network quality is optimized, user perception is improved, and a better network is created.
Drawings
Fig. 1 is a schematic diagram of a neighbor cell relationship of a serving cell according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a distance relationship between a high-speed rail base station and a high-speed rail according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of forming a histogram by a histogram algorithm according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of histogram difference optimization according to an embodiment of the present application;
Fig. 5 is a schematic diagram showing a comparison of different growth modes according to an embodiment of the present application;
Fig. 6 is a schematic flow chart of constructing a model according to the LightGBM algorithm according to an embodiment of the present application;
Fig. 7 is a schematic architecture diagram of a high-speed rail identification system according to an embodiment of the present application;
Fig. 8 is a schematic hardware structure diagram of a high-speed rail user identification device according to an embodiment of the present application;
Fig. 9 is a schematic hardware structure diagram of another high-speed rail user identification device according to an embodiment of the present application;
Fig. 10 is a flow chart of a high-speed rail user identification method according to an embodiment of the present application;
Fig. 11 is a schematic diagram of the confidence of some high-speed rail master cells according to an embodiment of the present application;
Fig. 12 is a schematic diagram of the prediction accuracy of the initial model under K-fold cross-validation according to an embodiment of the present application;
Fig. 13 is a schematic diagram of a travel track of an identified high-speed rail user on a high-speed rail line according to an embodiment of the present application;
Fig. 14 is a schematic diagram showing a comparison between the track of an identified high-speed rail user on a high-speed rail line and MR data according to an embodiment of the present application;
Fig. 15 is a flowchart of another method for identifying a high-speed rail user according to an embodiment of the present application;
Fig. 16 is a schematic diagram of serving cells along part of a high-speed railway line according to an embodiment of the present application;
Fig. 17 is a schematic diagram of the calculation results for some high-speed rail users identified according to the long interval algorithm according to an embodiment of the present application;
Fig. 18 is a flowchart of another method for identifying a high-speed rail user according to an embodiment of the present application;
Fig. 19 is a layout diagram of serving cells along a high-speed railway line in a certain region according to an embodiment of the present application;
Fig. 20 is a schematic diagram of S1-MME interface data of a user in a third target serving cell according to an embodiment of the present application;
Fig. 21 is a schematic diagram of S1-MME interface data of a user in another third target serving cell according to an embodiment of the present application;
Fig. 22 is a schematic diagram of a handover of a user between serving cells according to an embodiment of the present application;
Fig. 23 is a schematic structural diagram of a high-speed rail user identification device according to an embodiment of the present application.
Detailed Description
The following describes in detail a method, an apparatus and a storage medium for identifying a high-speed rail user according to an embodiment of the present application with reference to the accompanying drawings.
The term "and/or" herein merely describes an association relationship between associated objects and means that three relationships may exist. For example, "A and/or B" may represent: A exists alone, A and B exist together, or B exists alone.
The terms "first" and "second" and the like in the description and in the drawings are used for distinguishing between different objects or for distinguishing between different processes of the same object and not for describing a particular sequential order of objects.
Furthermore, references to the terms "comprising" and "having" and any variations thereof in the description of the present application are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed but may optionally include other steps or elements not listed or inherent to such process, method, article, or apparatus.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
In order to facilitate understanding of the technical solution of the present application, the following description refers to technical terms related to the present application:
1. High-speed rail signal coverage characteristics.
In a long term evolution (LTE) wireless network, high-speed rail signal coverage has the following main characteristics:
(1) The radio signal has a large transmission loss.
Because the carriages of high-speed trains are fully enclosed, the wireless signal suffers a large penetration loss. Table 1 below shows the loss of wireless signals penetrating various high-speed train models:
Table 1 Loss of wireless signals penetrating various high-speed train models
(2) Doppler shift.
Doppler shift means that the frequency of the received signal changes with the relative motion of the transmitter and the receiver; the faster the relative movement, the larger the shift. The frequency shift may cause inter-subcarrier interference, thereby affecting cell handover and selection. Table 2 below shows the Doppler shift in different frequency bands:
Table 2 Doppler shift in different frequency bands
The influence of Doppler shift can be reduced by choosing base station sites appropriately. In addition, the adaptive frequency-offset correction algorithms developed by the equipment manufacturers can correct the shift and improve baseband demodulation performance.
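As a worked illustration of how the shift scales with speed and carrier frequency, the sketch below uses the standard relation f_d = (v / c) · f_c · cos θ, where θ is the angle between the direction of travel and the line to the base station. The train speed and carrier frequencies are example inputs chosen for illustration; they are not the values of Table 2.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def doppler_shift_hz(speed_kmh: float, carrier_hz: float, angle_deg: float = 0.0) -> float:
    """Doppler shift f_d = (v / c) * f_c * cos(theta) for a terminal moving at speed v."""
    speed_m_s = speed_kmh / 3.6
    return speed_m_s * carrier_hz / SPEED_OF_LIGHT_M_S * math.cos(math.radians(angle_deg))


# Hypothetical inputs: a 350 km/h train and three example carrier frequencies.
for carrier_mhz in (900, 1800, 2100):
    shift = doppler_shift_hz(350.0, carrier_mhz * 1e6)
    print(f"{carrier_mhz} MHz: {shift:.1f} Hz")
```

The cosine term is one way to see why site selection matters: the shift is largest when the train moves directly toward or away from the base station and vanishes when the train is moving perpendicular to the line of sight.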
(3) Frequent switching.
High-speed trains move fast, so the time a train spends in each cell is short and user terminals hand over frequently between serving cells. Overly frequent handover raises the call drop rate and degrades user perception. The scheme currently adopted for this problem is multi-cell merging: several adjacent serving cells are combined into one logical serving cell. In a high-speed mobile scenario, the user terminal then does not hand over while inside one logical cell, which reduces the number of handovers and improves the handover success rate.
In the case of high-speed movement, the overlapping coverage required for handover between serving cells becomes larger, so the serving cell spacing needs to be re-planned.
2. High-speed rail signal coverage strategies.
(1) Private network coverage.
High-speed rail coverage is divided into public network networking and private network networking. Public network networking uses existing or newly built sites and the same frequency points as the surrounding base stations, thereby covering both the users around the high-speed rail and the high-speed rail users. A private network consists of sites along the high-speed rail and serves only high-speed rail users, while users around the high-speed rail are still served by the public network. The private network frequency may be the same as or different from that of the public network.
The high-speed rail LTE network covers high-speed rail signals in private network mode, adopts a chained neighbor cell design along the line, and does not hand over to the public network, so that users keep good network continuity when moving at high speed and communication quality is improved. Because frequency resources are limited, if a dedicated-frequency private network scheme is adopted and the private network and the public network each use 10 MHz, the peak rate of the high-speed rail LTE network is halved, spectrum utilization is reduced, and user perception is affected. The LTE network is therefore better suited to same-frequency private networking.
(2) Handover strategy.
When the high-speed rail LTE network adopts same-frequency private networking, handover between the private network and the public network along the high-speed rail needs to be configured. As shown in Fig. 1, a private network cell along the high-speed rail is configured with neighbor cell relationships with the preceding and following cells, and is not configured with neighbor cell relationships with the surrounding public network cells.
(3) Inter-site distance setting.
Illustratively, as shown in Fig. 2, the inter-site distance of high-speed rail base stations is related to factors such as the distance between the base station and the rail and the coverage radius of the cell. When the distance between the base station and the rail is small, the incident angle becomes small, the wireless signal suffers more loss when penetrating the vehicle body, and the corresponding inter-site distance becomes large; when the distance between the base station and the rail is large, the wireless signal attenuates too much and cannot meet the communication requirements of terminals by the time it reaches the high-speed train.
High-speed rail technology continues to develop and wireless signal coverage scenarios continue to change, so the coverage problem needs to be solved in a targeted manner to improve the user experience in high-speed mobile scenarios and earn operators a good reputation. In order to accurately reflect the real coverage of wireless signals and the service perception of high-speed rail users, operators use network management data, measurement report (MR) data, external data representation (XDR) data and the like to perform positioning analysis on high-speed rail users. The algorithms for identifying high-speed rail users rely on three rule models: the user's signaling occurs in the high-speed rail private network, the user's location trajectory matches the high-speed rail line, and the user's moving speed is greater than a certain threshold.
The current rule-based algorithms for identifying high-speed rail users have the following problems. First, identifying high-speed rail users by matching several consecutive cells against a handover chain easily misses users. Second, the user's moving speed is calculated from the line-mapped distance between cells that appear successively in the user's signaling and the corresponding time difference; the error is large, so users are easily missed. Third, passengers' waiting time and dwell time in stations are not considered when the moving distance is calculated, so the calculated speed is lower than the actual speed and high-speed rail users are missed.
Currently, the conventional high-speed rail user identification method is based on XDR data, as described below:
3. Principle of the conventional algorithm.
(1) DPI principle and XDR data acquisition.
The deep packet inspection (deep packet inspection, DPI) system performs inspection and analysis on the flow and the message content of the network key interface, and performs filtration control on the flow according to a strategy to realize collection of a signaling plane and a user plane, so as to filter and collect information generated by the user internet surfing behavior.
The DPI system has a three-layer architecture: an acquisition layer, a decoding layer and an application layer. The acquisition layer and the decoding layer are responsible for data acquisition, traffic analysis and log synthesis; the results are generally stored in the decoding layer's database as call detail records (CDR) and transaction detail records (TDR). The application layer mainly computes, organizes, aggregates, stores and presents the CDR and TDR data.
(2) XDR data mining.
After data cleaning, standardized warehousing and other operations, the collected XDR data yield the following four types of data: S1-mobility management entity (MME) interface data, hypertext transfer protocol (HTTP) data, video data and domain name system (DNS) data. S1-MME data are the user's signaling plane data, HTTP data are the user's web page browsing data, video data are the user's video viewing data, and DNS data are the user's internet access path data. High-speed rail user identification requires analyzing and mining the S1-MME data.
The S1 interface is the communication interface between the LTE base station and the evolved packet core (EPC). It is divided into two interfaces, S1-MME and S1-U: S1-MME is used for the control plane and S1-U for the user plane.
The S1-MME interface data in the XDR data mainly contain the signaling plane content of the user's internet access. Information such as the international mobile equipment identity (IMEI), the start time and end time of the user's occupation of a cell, and the cells the user occupies can be obtained from these data and used for high-speed rail user identification.
4. High-speed rail user behavior characteristics and recognition algorithm models.
The longitude and latitude of the high-speed railway service cell can be obtained from the basic information of the high-speed railway service cell, so that the distance between the high-speed railway service cells can be calculated. From the information of the S1-MME interface data, a list of the occupied service cell of the user and the start-stop time of the occupied service cell of the user can be obtained, so that the speed of the user in a certain interval can be calculated. And then analyzing and processing the information through an algorithm, and identifying the high-speed rail user.
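The distance and speed calculation described above can be sketched as follows. The great-circle (haversine) distance is used here as one reasonable choice for the inter-cell distance; the application does not state which distance formula is used, and the coordinates and timestamps below are hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def interval_speed_kmh(
    cell_a: Tuple[float, float], cell_b: Tuple[float, float], start_s: float, end_s: float
) -> float:
    """Average speed between two serving cells, from their coordinates and occupation times."""
    if end_s <= start_s:
        raise ValueError("end time must be later than start time")
    distance_km = haversine_km(cell_a[0], cell_a[1], cell_b[0], cell_b[1])
    return distance_km / (end_s - start_s) * 3600.0


# Hypothetical cells one degree of longitude apart on the equator, traversed in 1600 s.
speed = interval_speed_kmh((0.0, 0.0), (0.0, 1.0), start_s=0.0, end_s=1600.0)
print(f"{speed:.1f} km/h")
```

Computing the speed over a longer interval, rather than between adjacent cells, reduces the relative impact of cell-position and timestamp errors on the result.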
(1) Identifying high-speed rail users with long time spans according to the large-granularity model.
High-speed rail users with long time spans are analyzed once every 30 minutes, and summary information for the statistical period is recorded, such as the users' start and end cells, number of cells, distance, duration, speed and direction. Table 3 below shows the large-granularity model:
Table 3 Large-granularity model
(2) Identifying high-speed rail users with short time spans according to the small-granularity model.
High-speed rail users with short time spans are analyzed once every 10 minutes, and summary information for the statistical period is recorded, such as the users' start and end cells, number of cells, distance, duration, speed and direction. Table 4 below shows the small-granularity model:
Table 4 Small-granularity model
(3) Identifying high-speed rail users according to the fusion model.
The fusion model yields the final recognition result for high-speed rail users. Every hour, the users identified by the large-granularity model and the small-granularity model during the previous hour are merged to obtain the fusion model, and summary information for the statistical period, such as the users' start and end cells, number of cells, distance, duration, speed and direction, is recorded for upper-layer application analysis.
The conventional high-speed rail user identification method mainly obtains the list of serving cells occupied by each user from the S1-MME interface data, calculates the user's speed in each interval from the start and end times of cell occupation, and applies models of different granularity according to the time span of the high-speed rail user. High-speed rail users are then identified after this basic information has been analyzed and processed by the algorithm.
5. LightGBM algorithm.
The gradient boosting decision tree (GBDT) is a long-standing model in machine learning. Its main idea is to obtain an optimal model by iteratively training weak classifiers, and it has advantages such as good training results and resistance to overfitting. In 2017, Microsoft proposed the LightGBM algorithm, an improved algorithm based on GBDT that can process massive amounts of data more efficiently than GBDT. The LightGBM algorithm mainly includes the following features: the histogram algorithm, a leaf-wise growth strategy with a depth limit, gradient-based one-side sampling (GOSS), exclusive feature bundling (EFB), support for categorical features, efficient parallelism, and cache hit rate optimization.
(1) Histogram algorithm.
The histogram algorithm divides the data into discrete bins according to the feature values of the original data and then traverses the discretized data to find the optimal split point. Each "bucket" (bin) obtained from the feature values carries two kinds of information: one is the number of samples in the bucket; the other is the sum of the gradients of the samples in the bucket (the squared mean of the first order gradient sum is equivalent to the mean square loss).
As the description above shows, processing with the histogram algorithm reduces the complexity of the model. First, only discrete values are stored once the raw data has been bucketed by feature value, which greatly reduces memory usage. Second, replacing the raw data with bins is equivalent to adding regularization: some detail features are discarded and similar data fall into the same bucket. The degree of regularization therefore depends on the number of bins, and the fewer the bins, the lower the risk of overfitting.
Illustratively, the process of forming a histogram by a histogram algorithm is shown in fig. 3.
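Illustratively, the bucketing and split search described above can be sketched as follows. This is a simplified sketch under stated assumptions: bins of equal width are assumed, the split score is the squared-loss form G_L^2/n_L + G_R^2/n_R, and the function names `build_histogram` and `best_split` are hypothetical rather than part of the LightGBM library.

```python
def build_histogram(values, gradients, num_bins, lo, hi):
    """Return per-bin sample counts and per-bin gradient sums for one feature."""
    counts = [0] * num_bins
    grad_sums = [0.0] * num_bins
    width = (hi - lo) / num_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), num_bins - 1)  # clamp v == hi into the last bin
        counts[b] += 1
        grad_sums[b] += g
    return counts, grad_sums


def best_split(counts, grad_sums):
    """Scan bin boundaries only; return (last bin of the left side, score)."""
    total_n, total_g = sum(counts), sum(grad_sums)
    best_bin, best_score = None, float("-inf")
    n_left, g_left = 0, 0.0
    for b in range(len(counts) - 1):
        n_left += counts[b]
        g_left += grad_sums[b]
        n_right = total_n - n_left
        if n_left == 0 or n_right == 0:
            continue  # a split must leave samples on both sides
        score = g_left ** 2 / n_left + (total_g - g_left) ** 2 / n_right
        if score > best_score:
            best_bin, best_score = b, score
    return best_bin, best_score


counts, grad_sums = build_histogram(
    [0.1, 0.2, 0.3, 0.7, 0.8, 0.9], [-1, -1, -1, 1, 1, 1], num_bins=4, lo=0.0, hi=1.0)
split = best_split(counts, grad_sums)
```

The split search visits at most `num_bins - 1` candidate boundaries regardless of the number of samples, which is where the speed and memory savings come from.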
In the LightGBM algorithm, the histogram algorithm also includes a histogram-difference optimization: once the histogram of one leaf has been obtained, the histogram of its sibling leaf can be obtained at minimal cost by subtracting it from the histogram of their parent node.
Illustratively, as shown in FIG. 4, when a histogram of a leaf and a histogram of its parent node are obtained, the histogram of the sibling leaf of the leaf may also be obtained. In this way, the speed of the LightGBM algorithm can be further optimized.
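Illustratively, the histogram difference can be sketched as a bin-by-bin subtraction. The sketch assumes that a histogram is stored as a pair of lists (sample counts and gradient sums); the function name `subtract_histogram` is hypothetical.

```python
def subtract_histogram(parent, child):
    """Sibling histogram = parent histogram minus the known child histogram."""
    p_counts, p_grads = parent
    c_counts, c_grads = child
    counts = [p - c for p, c in zip(p_counts, c_counts)]
    grads = [p - c for p, c in zip(p_grads, c_grads)]
    return counts, grads


parent_hist = ([5, 3, 4], [1.5, -2.0, 0.5])
left_hist = ([2, 1, 4], [0.5, -1.0, 0.5])
right_hist = subtract_histogram(parent_hist, left_hist)
```

Only the smaller child needs to be scanned sample by sample; the larger one follows from the subtraction in time proportional to the number of bins.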
(2) Leaf-growth strategy with depth limitation.
The GBDT and extreme gradient boosting (XGBoost) models use a level-wise leaf growth strategy, which splits every node of the same layer in each iteration and therefore traverses the entire dataset each time. Although level-wise splitting allows the nodes of each layer to be processed in parallel and keeps the complexity of the model under control, traversing the entire dataset in every iteration produces many unnecessary searches and splits, which consumes more memory and increases the computational cost.
The LightGBM algorithm improves the leaf growth strategy by adopting leaf-wise growth, that is, each time it splits only the leaf with the largest split gain among all leaves. Leaf-wise growth yields a smaller error and speeds up learning. However, because the other leaves are not split, the splitting result is not sufficiently refined, and splitting only one leaf at a time increases the depth of the tree, which can cause the model to overfit. The LightGBM algorithm therefore limits the depth of the tree during leaf-wise growth to avoid overfitting.
Illustratively, the level-wise and leaf-wise growth modes are shown in FIG. 5.
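Illustratively, leaf-wise growth with a depth limit can be sketched with a priority queue that always pops the leaf with the largest split gain. This is a toy sketch: the split gains are supplied by a hypothetical callback `gain_of` instead of being computed from data, and nodes are numbered as in a binary heap.

```python
import heapq


def grow_leaf_wise(gain_of, max_leaves, max_depth):
    """Return the node ids that get split, in order.

    Children of node i are 2*i + 1 and 2*i + 2; the root (node 0) has depth 0.
    """
    heap = [(-gain_of(0), 0, 0)]  # (negative gain, node id, depth)
    num_leaves, order = 1, []
    while heap and num_leaves < max_leaves:
        neg_gain, node, depth = heapq.heappop(heap)
        if -neg_gain <= 0 or depth >= max_depth:
            continue  # no gain, or depth limit reached: the node stays a leaf
        order.append(node)
        num_leaves += 1  # splitting turns one leaf into two
        for child in (2 * node + 1, 2 * node + 2):
            heapq.heappush(heap, (-gain_of(child), child, depth + 1))
    return order


gains = {0: 10.0, 1: 8.0, 2: 1.0, 3: 7.0, 4: 0.5, 5: 0.2, 6: 0.1}
gain_of = lambda node: gains.get(node, 0.0)
deep = grow_leaf_wise(gain_of, max_leaves=4, max_depth=3)     # follows the best leaf downward
limited = grow_leaf_wise(gain_of, max_leaves=4, max_depth=2)  # depth limit forces a wider tree
```

With the same gains, the depth limit redirects the third split from node 3 (depth 2) to node 2 (depth 1), which is the balancing effect the depth constraint is meant to provide.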
(3)GOSS。
In GBDT, each sample has a different gradient value, and the gradient of a sample reflects its contribution to the model: the larger the gradient, the more information gain the sample contributes; the smaller the gradient, the better the sample is already fitted by the model.
The LightGBM algorithm introduces GOSS. Its basic idea is to reduce the number of samples by using the gradient magnitude of each sample as a measure of its importance: all samples with large gradients are retained, samples with small gradients are randomly sampled in proportion, and, so as not to change the data distribution, a constant is applied to the small-gradient samples as a balancing weight when the gain is calculated. This reduces the number of samples and speeds up training while leaving the original data distribution essentially unchanged.
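Illustratively, the sampling step of GOSS can be sketched as follows. The sketch assumes a retained fraction `top_rate` for large-gradient samples, a sampled fraction `other_rate` for the remaining samples, and the balancing constant (1 - top_rate) / other_rate; the function name `goss_sample` and the default rates are hypothetical.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) selected by gradient-based one-side sampling."""
    n = len(gradients)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_n = round(top_rate * n)
    other_n = round(other_rate * n)
    top = order[:top_n]                        # keep every large-gradient sample
    rng = random.Random(seed)
    rest = rng.sample(order[top_n:], other_n)  # random subset of small-gradient samples
    amplify = (1.0 - top_rate) / other_rate    # constant that compensates for the sampling
    weights = [1.0] * len(top) + [amplify] * len(rest)
    return top + rest, weights


indices, weights = goss_sample([i / 100 for i in range(100)])
```

In this example 30 of 100 samples are kept, yet the weights still sum to approximately 100, so gradient statistics computed on the subset remain on the scale of the full dataset.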
(4)EFB。
High-dimensional data is typically very sparse, and its features are often mutually exclusive. Illustratively, the several features generated by one-hot encoding are never nonzero at the same time. Such data affects both the effectiveness and the running speed of the model. EFB addresses the sparsity of high-dimensional data. If two features are not completely mutually exclusive, an index can be used to measure the degree of conflict between them, and when the index value is small, the two features can still be bundled without affecting the final accuracy.
For example, as shown in Table 5, let feature 1, feature 2 and feature 3 be mutually exclusive sparse features. The EFB algorithm bundles the three features into one dense new feature, which then replaces the original three. This reduces the feature dimension without losing information, avoids unnecessary computation on 0 values, and speeds up the gradient boosting algorithm. Table 5 below shows the bundling of three features into one dense new feature by the EFB algorithm:
Table 5 Bundling three features into one dense new feature
#    Feature 1    Feature 2    Feature 3    New feature
1    0            2            0            2
2    0            0            0            0
3    0            0            0            0
4    0            0            1            1
5    3            0            0            3
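Illustratively, the bundling shown in Table 5 can be reproduced with the following sketch. It implements only the simplified rule visible in the table (each row keeps its single nonzero value); it is assumed here, for illustration, that strictly exclusive features are given, and the function name `bundle_exclusive` is hypothetical. Practical implementations additionally shift the value ranges of the bundled features so that the original features remain distinguishable.

```python
def bundle_exclusive(rows):
    """Merge mutually exclusive sparse features into one value per row."""
    bundled = []
    for row in rows:
        nonzero = [v for v in row if v != 0]
        if len(nonzero) > 1:
            raise ValueError("features are not mutually exclusive in row %r" % (row,))
        bundled.append(nonzero[0] if nonzero else 0)
    return bundled


# Rows 1 to 5 of Table 5: (feature 1, feature 2, feature 3)
table5 = [(0, 2, 0), (0, 0, 0), (0, 0, 0), (0, 0, 1), (3, 0, 0)]
new_feature = bundle_exclusive(table5)  # [2, 0, 0, 1, 3]
```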
As the above description shows, the LightGBM algorithm is a GBDT algorithm with highly optimized performance and can be regarded as an optimized version of XGBoost.
Illustratively, the LightGBM algorithm may be formulated as equation 1 below:
LightGBM = XGBoost + Histogram + GOSS + EFB    (Equation 1)
(5) LightGBM algorithm parameters.
The parameters of the LightGBM algorithm are relatively complex and can generally be divided into core parameters, learning control parameters, IO parameters, objective parameters, metric parameters and so on. The core parameters, learning control parameters and metric parameters usually need to be tuned.
Illustratively, Table 6 below shows the default values and descriptions of commonly used important parameters:
Table 6 Default values and descriptions of commonly used important parameters
Illustratively, fig. 6 shows the specific steps of building a model according to the LightGBM algorithm.
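Illustratively, a few frequently tuned LightGBM parameters and their documented default values are collected in the dictionary below. The selection is an assumption made for illustration and is not a reconstruction of Table 6; the helper `leaves_fit_depth` is hypothetical and only checks that the leaf count is reachable within the depth limit.

```python
# Documented LightGBM defaults for some commonly tuned parameters.
# Assumption: this selection is illustrative; it does not reproduce Table 6.
default_params = {
    "boosting": "gbdt",        # core parameter: boosting type
    "num_leaves": 31,          # core parameter: maximum leaves per tree
    "learning_rate": 0.1,      # core parameter: shrinkage rate
    "num_iterations": 100,     # core parameter: number of boosting rounds
    "max_depth": -1,           # learning control: <= 0 means no depth limit
    "min_data_in_leaf": 20,    # learning control: minimum samples per leaf
    "feature_fraction": 1.0,   # learning control: fraction of features per tree
    "bagging_fraction": 1.0,   # learning control: fraction of samples per bag
    "lambda_l1": 0.0,          # learning control: L1 regularization
    "lambda_l2": 0.0,          # learning control: L2 regularization
    "max_bin": 255,            # IO parameter: maximum number of histogram bins
}


def leaves_fit_depth(num_leaves, max_depth):
    """A tree of depth d has at most 2**d leaves; max_depth <= 0 means no limit."""
    return max_depth <= 0 or num_leaves <= 2 ** max_depth
```

Because leaf-wise growth is bounded both by `num_leaves` and by `max_depth`, the two are usually tuned together, for example by keeping `num_leaves` no larger than `2 ** max_depth`.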
The technical terms related to the present application are described above.
At present, high-speed rail is gradually becoming the first choice for long-distance travel. High-speed rail users' demand for internet access is also growing, so the quality of the mobile communication network on high-speed rail affects operators' brand reputation. It is therefore important for operators to build a high-quality network that meets the internet access requirements of high-speed rail users. To this end, operators and related organizations identify and locate high-speed rail users based on network management data, measurement report (MR) data, XDR data and the like, and on this basis analyze the network coverage, service performance, capacity and other aspects of the high-speed rail network in order to propose corresponding network construction and optimization schemes.
The conventional high-speed rail user identification method has the following problems:
(1) High-speed rail users are identified by matching a handover chain of several consecutive serving cells, which easily causes high-speed rail users to be missed;
(2) The moving speed of the user is calculated from the line-mapped distance between the serving cells that appear successively in the user signaling and the corresponding time difference, which gives a large error;
(3) The user's waiting time and dwell time in the station are not considered in the calculation.
As a result, some high-speed rail users are missed, and high-speed rail users cannot be identified accurately.
In view of the above defects, the present application provides a high-speed rail user identification method. The method first determines the high-speed rail main control cells through preliminary screening, then determines a first model according to the fingerprint features of the high-speed rail main control cells and the LightGBM algorithm, calculates the confidence of every high-speed rail main control cell according to the first model, and determines the serving cells whose confidence is greater than a preset confidence threshold as first target serving cells. An initial model is then constructed according to the LightGBM algorithm and the first target serving cells and trained on the features of the first target serving cells; when the identification accuracy of the initial model is greater than or equal to an accuracy threshold, the initial model is determined as the second model, and the high-speed rail users among the users to be identified are identified according to the second model. In this way, after the serving cells have been preliminarily screened, the present application screens them again with the first model and uses the serving cells with the highest confidence as the training objects of the initial model, from which the second model is determined. High-speed rail users can thus be identified accurately by the second model, so that potential coverage quality problems can be found, network quality optimized, user perception improved, and a higher-quality high-speed rail network built. Before the high-speed rail user identification method is described in detail, the implementation environment and application scenario of the embodiments of the present application are described.
Illustratively, as shown in fig. 7, a high-speed rail user identification system according to an embodiment of the present application includes a base station 701 and a high-speed rail 702.
Optionally, the base station 701 may be a base transceiver station (BTS) in a global system for mobile communications (GSM) or code division multiple access (CDMA) network, a base station (node B) in wideband code division multiple access (WCDMA), a base station (eNB) in the internet of things (IoT) or narrowband internet of things (NB-IoT), or a base station in a future fifth generation mobile communication technology (5G) mobile communication network or a future evolved public land mobile network (PLMN); the embodiments of the present application are not limited in this respect.
It should be understood that only one base station 701 is shown in fig. 7, and in practical applications, the number of base stations 701 may be set according to practical situations, which is not specifically limited in this application. For example, the number of base stations 701 is specifically set according to the length of the travel route of the high-speed rail 702.
In this embodiment, the coverage area of the base station 701 is a service cell, where a high-speed rail user on the high-speed rail 702 can access the base station 701 in the service cell, and the high-speed rail user can access the base station 701 and then perform internet surfing service.
Optionally, the high-speed rail 702, in full a high-speed railway, refers to a railway built to a high design standard on which trains can run at a maximum speed of more than 200 km/h. The design standards for high-speed railways differ between periods, and the definition of a high-speed railway is continuously updated. The embodiments of the present application do not specifically limit this.
Fig. 8 is a schematic diagram of a hardware structure of a high-speed rail user identification device according to an embodiment of the present application. The high-speed rail user identification device comprises a processor 81, a memory 82, a communication interface 83 and a bus 84. The processor 81, the memory 82 and the communication interface 83 may be connected by a bus 84.
The processor 81 is a control center of the high-speed rail user identification device, and may be one processor or a collective name of a plurality of processing elements. For example, the processor 81 may be a general-purpose central processing unit (central processing unit, CPU), or may be another general-purpose processor. Wherein the general purpose processor may be a microprocessor or any conventional processor or the like.
As one example, processor 81 may include one or more CPUs, such as CPU 0 and CPU 1 shown in fig. 8.
The memory 82 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (random access memory, RAM) or other type of dynamic storage device that can store information and instructions, or an electrically erasable programmable read-only memory (EEPROM), magnetic disk storage or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
In a possible implementation, the memory 82 may exist separately from the processor 81, and the memory 82 may be connected to the processor 81 through the bus 84 for storing instructions or program codes. When the processor 81 calls and executes the instructions or program codes stored in the memory 82, the high-speed rail user identification method provided in the following embodiments of the present application can be implemented.
In another possible implementation, the memory 82 may also be integrated with the processor 81.
The communication interface 83 is used for connecting the high-speed railway user identification device with other devices through a communication network, wherein the communication network can be an ethernet, a wireless access network, a wireless local area network (wireless local area networks, WLAN) and the like. The communication interface 83 may include a receiving unit for receiving data, and a transmitting unit for transmitting data.
The bus 84 may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in fig. 8, but this does not mean that there is only one bus or one type of bus.
Fig. 9 shows another hardware configuration of the high-speed rail user identification apparatus in the embodiment of the present application. As shown in fig. 9, the high-speed rail user identification apparatus may include a processor 91 and a communication interface 92. The processor 91 is coupled to a communication interface 92.
The function of the processor 91 may be as described above with reference to the processor 81. The processor 91 also has a memory function and can function as the memory 82.
The communication interface 92 is used to provide data to the processor 91. The communication interface 92 may be an internal interface of the high-speed railway user identification device or an external interface of the high-speed railway user identification device (corresponding to the communication interface 83).
It should be noted that the structure shown in fig. 8 (or fig. 9) does not constitute a limitation of the high-speed rail user identification apparatus, and the high-speed rail user identification apparatus may include more or less components than those shown in fig. 8 (or fig. 9), or may combine some components, or may have a different arrangement of components.
The high-speed rail user identification method provided by the present application is described in detail below with reference to the accompanying drawings.
As shown in fig. 10, which is a schematic flow chart of the high-speed rail user identification method provided in the present application, the method includes the following steps:
S1001, the high-speed rail user identification device determines a high-speed rail main control cell.
The high-speed rail main control cell is used for providing services for the high-speed rail user group.
In one possible implementation manner, determining the high-speed rail master cell specifically includes the following steps: acquiring a second target serving cell; the second target serving cell is a serving cell with switching times smaller than preset times in a preset time period on a high-speed railway line; determining the high-speed railway user group from the user groups corresponding to the second target service cell according to a long interval speed algorithm and a first preset characteristic condition; and determining the high-speed railway master control cell according to the high-speed railway user group. It should be noted that, the process of determining the high-speed rail main control cell by the high-speed rail user identification device can be referred to S1501-S1503 described below, and will not be repeated here.
S1002, the high-speed railway user identification device determines a first model according to fingerprint characteristics of a high-speed railway master control cell.
The first model is used for determining the confidence of the high-speed railway master control cell.
Optionally, the first model is constructed according to a LightGBM algorithm, and the LightGBM algorithm is used for extracting and training fingerprint features of the high-speed railway master control cell, and determining the confidence coefficient of the high-speed railway master control cell according to the fingerprint features of the high-speed railway master control cell. The description and principle of the LightGBM algorithm are referred to in the section 5 of the description of the technical term, and will not be repeated here.
Confidence is also referred to as reliability, confidence level or confidence coefficient. When population parameters are estimated from a sample, the conclusion is always uncertain because of the randomness of the sample, so a probabilistic statement, namely interval estimation in mathematical statistics, is used: the probability that the estimated value lies within a given allowable error range of the population parameter is called the confidence. In this embodiment, the confidence is a weight value of the high-speed rail main control cell parameter.
Optionally, the fingerprint features of the high-speed rail main control cell include the start time of the user's occupation of the cell, the end time of the user's occupation of the cell, the signal quality while the user occupies the cell, and the like.
S1003, the high-speed railway user identification device determines a first target service cell according to the confidence coefficient of the high-speed railway main control cell.
The first target serving cell is a serving cell with a confidence coefficient greater than a preset confidence coefficient threshold value in the high-speed rail main control cell.
Alternatively, in practical application, the preset confidence threshold may be set according to practical requirements, which is not specifically limited in the present application.
Illustratively, fig. 11 shows the confidence of the high-speed rail main control cells after K-fold (K=4) cross-validation, where the last column is the average value of the confidence.
Taking a preset confidence threshold of 0.9999 as an example, the characteristic cells whose confidence is greater than the preset confidence threshold are taken as the first target serving cells, which simplifies the set of high-speed rail main control cells and improves computational performance.
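Illustratively, the threshold screening can be sketched as follows. It is assumed here that the threshold is applied to the average of the per-fold confidences (the last column of fig. 11); the function name `select_first_target_cells` and the cell identifiers in the example are hypothetical.

```python
def select_first_target_cells(fold_confidences, threshold=0.9999):
    """Keep the cells whose mean per-fold confidence exceeds the threshold.

    fold_confidences maps a cell identifier to its list of per-fold confidences.
    """
    selected = {}
    for cell_id, scores in fold_confidences.items():
        mean = sum(scores) / len(scores)
        if mean > threshold:
            selected[cell_id] = mean
    return selected


# Hypothetical confidences from K = 4 folds for three cells.
confidences = {
    "cell_a": [1.0, 1.0, 1.0, 1.0],
    "cell_b": [0.99, 1.0, 1.0, 1.0],
    "cell_c": [0.5, 0.6, 0.7, 0.8],
}
first_target_cells = select_first_target_cells(confidences)
```

Only `cell_a` survives in this example, which shows how a strict threshold shrinks the set of cells used to train the second model.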
S1004, the high-speed railway user identification device determines a second model according to the first target service cell.
Wherein the second model is used for identifying the high-speed rail user.
Optionally, the second model is constructed according to the LightGBM algorithm. The description and principle of the LightGBM algorithm are referred to in the section 5 of the description of the technical term, and will not be repeated here.
It should be noted that, according to the first target serving cell, the second model is determined, which specifically includes the following three steps:
(1) Constructing an initial model according to a LightGBM algorithm; the LightGBM algorithm is used for feature extraction and training of the first target serving cell. The description and principle of the LightGBM algorithm are referred to in the section 5 of the description of the technical term, and will not be repeated here.
(2) Judging whether the identification accuracy of the initial model is greater than or equal to an accuracy threshold according to a K-fold cross-validation algorithm;
(3) Determining the initial model as the second model when the identification accuracy of the initial model is greater than or equal to the accuracy threshold. It should be noted that although the first model can also achieve high-precision identification, it is extremely large because it is trained with the fingerprint features of all the high-speed rail main control cells. Therefore, the LightGBM algorithm is used a second time to construct the initial model, which is trained with the first target serving cells to obtain the second model; this optimizes computational performance without reducing identification accuracy.
Illustratively, fig. 12 shows the prediction accuracy of the initial model under K-fold (K=4) cross-validation. The positive samples are high-speed rail users determined according to the long interval speed algorithm, and the negative samples are other users. It can be seen that the identification accuracy reaches 98.61%, so accurate high-speed rail user identification can be achieved.
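Illustratively, the K-fold check of steps (2) and (3) can be sketched as follows. The sketch is model-agnostic: `fit` and `predict` are caller-supplied functions, and a toy threshold classifier stands in for the LightGBM model. The folds are contiguous and unshuffled for simplicity, and all function names are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_validated_accuracy(features, labels, fit, predict, k=4):
    """Fraction of samples predicted correctly when each fold is held out once."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        correct += sum(predict(model, features[i]) == labels[i] for i in test_idx)
    return correct / len(labels)


def accept_as_second_model(accuracy, accuracy_threshold):
    """Step (3): accept the initial model when accuracy >= threshold."""
    return accuracy >= accuracy_threshold


def fit_speed_threshold(features, labels):
    """Toy stand-in for training: midpoint between the two class means."""
    pos = [x for x, y in zip(features, labels) if y == 1]
    neg = [x for x, y in zip(features, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def predict_speed_threshold(threshold, feature):
    return 1 if feature > threshold else 0


# Hypothetical data: one speed-like feature per user; 1 = high-speed rail user.
speeds = [50, 60, 250, 300, 40, 280, 70, 310]
labels = [0, 0, 1, 1, 0, 1, 0, 1]
accuracy = cross_validated_accuracy(speeds, labels, fit_speed_threshold,
                                    predict_speed_threshold, k=4)
```

In practice the samples would be shuffled or stratified before splitting, and `fit` and `predict` would wrap the actual model.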
S1005, the high-speed rail user identification device identifies the high-speed rail user in the users to be identified according to the second model.
Illustratively, the identified high-speed rail users were tested on a section of a high-speed rail line in an actual scenario. As shown in figs. 13-14, fig. 13 shows the track of a user on a certain high-speed rail line, and fig. 14 shows the track and MR data of a high-speed rail user on a certain high-speed rail line. As can be seen from fig. 14, the MR data of the high-speed rail user coincides closely with the travel track, which illustrates that the high-speed rail user identification method provided by the present application can accurately identify high-speed rail users.
Based on the above technical scheme, the high-speed rail user identification method provided by the present application first determines the high-speed rail main control cells, and then determines a first model according to the fingerprint features of the high-speed rail main control cells and the LightGBM algorithm; it then calculates the confidence of every high-speed rail main control cell according to the first model and determines the serving cells whose confidence is greater than a preset confidence threshold as first target serving cells; it then constructs an initial model according to the LightGBM algorithm and the first target serving cells and trains the initial model on the features of the first target serving cells; and when the identification accuracy of the initial model is greater than or equal to the accuracy threshold, it determines the initial model as the second model, so that the high-speed rail users among the users to be identified are identified according to the second model. In this way, after the serving cells have been preliminarily screened, the present application screens them again with the first model and uses the serving cells with the highest confidence as the training objects of the initial model, from which the second model is determined. High-speed rail users can thus be identified accurately by the second model, so that potential coverage quality problems can be found, network quality optimized, user perception improved, and a higher-quality high-speed rail network built.
Illustratively, with reference to fig. 10 and as shown in fig. 15, the above S1001 may be specifically implemented by the following S1501-S1503:
S1501, the high-speed rail user identification device acquires a second target serving cell.
The second target serving cell is a serving cell with switching times smaller than preset times in a preset time period on the high-speed railway line.
It should be noted that, the preset period and the preset number of times may be set according to actual requirements, which is not specifically limited in the present application.
It should be noted that, for a specific manner of acquiring the second target serving cell by the high-speed rail user identification apparatus, see the following S1801-S1804, which are not repeated herein.
S1502, the high-speed railway user identification device determines a high-speed railway user group from the user groups corresponding to the second target service cell according to a long interval speed algorithm.
In one possible implementation manner, a plurality of users with the same motion characteristics and the moving speed reaching more than 200km/h once or a plurality of times in the user group corresponding to the second target service cell are determined as the high-speed rail user group.
Illustratively, the long interval speed algorithm is described in detail with reference to fig. 16.
Optionally, assuming that the coverage range of each serving cell is 5 km, the time difference is first taken between the user's handover from A to B and the user's handover from D to E; the perpendicular bisectors of AB and DE intersect the high-speed railway at points M and N respectively, and the distance between M and N is taken as the distance MN; the speed of the user corresponding to the second target serving cell is then calculated by the formula "speed = distance / time difference".
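Illustratively, the geometry of the long interval speed calculation can be sketched as follows. This is a simplified sketch under loudly stated assumptions: coordinates are planar and in kilometres, the high-speed railway is taken to be the x-axis, handover times are given in seconds, and the function names and example coordinates are hypothetical.

```python
def bisector_track_crossing(a, b):
    """x-coordinate where the perpendicular bisector of AB crosses the track.

    Assumption: the track is the x-axis of a planar coordinate system (km).
    """
    (ax, ay), (bx, by) = a, b
    dx, dy = bx - ax, by - ay
    if dx == 0:
        raise ValueError("AB is perpendicular to the track; its bisector never crosses it")
    mx, my = (ax + bx) / 2.0, (ay + by) / 2.0
    # Points p on the bisector satisfy (p - midpoint) . (B - A) = 0; solve for y = 0.
    return mx + my * dy / dx


def long_interval_speed_kmh(a, b, d, e, t_ab_s, t_de_s):
    """speed = distance MN / time difference between the A->B and D->E handovers."""
    x_m = bisector_track_crossing(a, b)
    x_n = bisector_track_crossing(d, e)
    hours = (t_de_s - t_ab_s) / 3600.0
    if hours <= 0:
        raise ValueError("the D->E handover must occur after the A->B handover")
    return abs(x_n - x_m) / hours


# Hypothetical layout: cells 5 km apart and 0.1 km from the track; 6 minutes between handovers.
speed = long_interval_speed_kmh((0, 0.1), (5, -0.1), (30, 0.1), (35, 0.1), 0, 360)
is_high_speed = speed > 200.0  # threshold of 200 km/h stated in the text
```

In this example M lies at 2.5 km and N at 32.5 km, so MN is 30 km and the resulting speed is about 300 km/h.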
In one possible implementation, the high-speed rail user group may be determined from the user groups corresponding to the second target serving cell by a long interval speed algorithm and a first preset feature condition.
As shown in fig. 17, the speed difference is small, and the high-speed railway user group can be effectively determined from the user groups corresponding to the second target serving cell through a long interval speed algorithm and the first preset characteristic condition.
It should be noted that there is inevitably an error between the distance MN and the actual length of the high-speed railway line, but when AB and DE are far enough apart the error is negligible, so its negative effect on the scheme is also negligible.
Illustratively, assuming that the distance from AB to DE is 5000 meters and that AB and DE are each 100 meters from the high-speed railway, the error is calculated as:
From this, it can be seen that the error is less than 0.1% and is negligible.
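Illustratively, the order of magnitude of this error can be checked with the following sketch. The exact formula of the present application is not reproduced in this text, so the sketch assumes one plausible geometric reading: the error is the relative excess of a straight-line distance with a lateral offset over the along-track distance. The function name `relative_path_error` is hypothetical.

```python
import math


def relative_path_error(along_track_m, lateral_offset_m):
    """Relative excess of the straight-line distance over the along-track distance."""
    straight = math.hypot(along_track_m, lateral_offset_m)
    return (straight - along_track_m) / along_track_m


# Assumed cases: a 100 m offset, and a worst case of 200 m (100 m on opposite sides).
error_100 = relative_path_error(5000, 100)
error_200 = relative_path_error(5000, 200)
```

Under this reading both cases stay below 0.1% (roughly 0.02% and 0.08% respectively), which is consistent with the conclusion stated above.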
S1503, the high-speed railway user identification device determines a high-speed railway master control cell according to the first preset characteristic condition and the high-speed railway user group.
It should be noted that the first preset feature condition includes one or more of the following: the moving route is periodic, the distance between the moving route and the high-speed railway line is smaller than a first preset distance threshold, and the unique cell identifiers (E-UTRAN cell identifier, ECI) of the corresponding serving cells in a preset period are the same.
In one possible implementation manner, in combination with methods such as big data analysis, a serving cell meeting a first preset feature condition may be obtained from serving cells traversed by the high-speed rail user group, and determined as a high-speed rail master cell.
Based on the above technical scheme, the high-speed rail user identification method provided by the present application can screen the serving cells: it first acquires the serving cells whose number of handovers within a preset time period is smaller than a preset number and determines them as second target serving cells; it then determines the high-speed rail user group from the user groups corresponding to the second target serving cells according to the long interval speed algorithm and the first preset feature condition; and it determines the high-speed rail main control cells according to the high-speed rail user group. In this way, after the serving cells have been preliminarily screened, the present application screens them again with the first model and uses the serving cells with the highest confidence as the training objects of the initial model, from which the second model is determined. High-speed rail users can thus be identified accurately by the second model, so that potential coverage quality problems can be found, network quality optimized, user perception improved, and a higher-quality high-speed rail network built.
Illustratively, as shown in fig. 18 in conjunction with fig. 15, the above S1501 may be implemented specifically by the following S1801-S1804:
S1801, the high-speed rail user identification device acquires a third target serving cell.
The shortest distance between the third target serving cell and the high-speed rail line is smaller than a second preset distance threshold.
It should be noted that the second preset distance threshold may be set according to actual requirements, which is not specifically limited in the present application.
Optionally, using the railway layer in the GIS layers, the in-use cells along the high-speed rail line are marked as third target serving cells.
For example, taking area A as an example, fig. 19 is a diagram of the serving cells along the railway lines of area A, where there are currently 6 railway lines, namely railway line 1, railway line 2, railway line 3, railway line 4, railway line 5 and railway line 6. The in-use cells within N kilometers (N may be 0 to 10) of the high-speed rail line are extracted as third target serving cells.
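The distance-based extraction described above can be sketched as follows. This is a minimal illustration only, not the implementation of the application: it assumes the railway line is available as a polyline of (latitude, longitude) vertices with at least two points, uses a local equirectangular projection that is adequate only for distances of a few kilometers, and the names `distance_to_line_km` and `select_third_target_cells` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def _to_local_km(lat, lon, ref_lat, ref_lon):
    """Project (lat, lon) to planar km offsets around a reference point."""
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_KM
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_KM
    return x, y


def distance_to_line_km(cell, line):
    """Shortest distance (km) from a cell (lat, lon) to a polyline of (lat, lon) vertices."""
    ref_lat, ref_lon = cell  # the cell becomes the origin of the local plane
    pts = [_to_local_km(lat, lon, ref_lat, ref_lon) for lat, lon in line]
    best = float("inf")
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len_sq = dx * dx + dy * dy
        # Position of the closest point on the segment to the origin, clamped to [0, 1].
        t = 0.0 if seg_len_sq == 0 else max(0.0, min(1.0, -(x1 * dx + y1 * dy) / seg_len_sq))
        best = min(best, math.hypot(x1 + t * dx, y1 + t * dy))
    return best


def select_third_target_cells(cells, line, n_km):
    """Return the ECIs of cells closer than n_km to the line (cells: {eci: (lat, lon)})."""
    return [eci for eci, pos in cells.items() if distance_to_line_km(pos, line) < n_km]
```

Here `n_km` plays the role of the second preset distance threshold (N in the example above); a production system would more likely query the GIS railway layer directly.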
S1802, the high-speed railway user identification device acquires S1-MME interface data of the user in the third target service cell.
Wherein the S1-MME interface data comprises ECI of the third target serving cell.
Illustratively, fig. 20 is S1-MME interface data for a user within a third target serving cell.
S1803, the high-speed rail user identification device performs time sequencing on the S1-MME interface data.
Illustratively, fig. 21 is a time ordered result of the S1-MME interface data of the user in fig. 20.
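A minimal sketch of the time ordering in S1803 is given below. It assumes each S1-MME record has been parsed into a dictionary with the hypothetical keys `user_id`, `time` and `eci`, and that timestamps use the `HH:MM:SS:microseconds` layout seen in the example data; the actual field layout of S1-MME interface data is not specified here.

```python
from collections import defaultdict
from datetime import datetime

TIME_FORMAT = "%H:%M:%S:%f"  # assumed layout, e.g. "08:08:08:237548"


def sort_records_by_user(records):
    """Group records by user and order each user's records by timestamp."""
    by_user = defaultdict(list)
    for rec in records:
        by_user[rec["user_id"]].append(rec)
    for recs in by_user.values():
        recs.sort(key=lambda r: datetime.strptime(r["time"], TIME_FORMAT))
    return dict(by_user)
```

The ordered per-user sequence of ECIs is what the handover filtering in S1804 operates on.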
S1804, the high-speed railway user identification device acquires a second target serving cell from the third target serving cell according to the ordered S1-MME interface data.
The second target serving cell is a serving cell which is not frequently switched in the third target serving cell.
Illustratively, with reference to fig. 21, the time-ordered result shows that users switch frequently between serving cells. For example, at 08:08:08:237548 a user occupies cell 18349057, switches to cell 18131459 after 2 seconds, and switches back to cell 18349057 after 17 seconds; this pattern occurs only in a static scenario.
Illustratively, fig. 22 is a diagram of a user's handover procedure among three cells. As can be seen from the figure, the user frequently switches between serving cell A and serving cell B, and finally switches to serving cell C. In practical applications a user may frequently switch among a plurality of serving cells, so to obtain the second target serving cell an algorithm is needed to remove the frequently switched cells from the third target serving cells.
Illustratively, the algorithm employed is:
(1) If frequent switching occurs among n cells within 10 seconds (n is currently supported for values less than 5), only the last moment at which each serving cell appears is taken as the sequence time of that serving cell; that is, the moment of leaving the cell is used as the basis for judgment. Some time-slice values may be lost in this way, but whether the moment of entering or of leaving the cell is used has little influence on the final calculated speed; see the description of the speed algorithm.
(2) If frequent switching among n cells (n is currently supported for values less than 5) lasts for more than 30 seconds and a neighbor relation exists between the cells, the user is judged to be likely static, and the speed of the user is marked as 0.
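The two rules above can be sketched as follows. This is an illustrative interpretation under stated assumptions, not the algorithm of the application: a user's events are a time-ordered list of `(t_seconds, eci)` pairs, "frequent switching" is taken to mean that some cell reappears within the observed span, bursts are formed greedily from the first event of each 10-second window, and the function names and the `neighbors` mapping are hypothetical.

```python
def collapse_ping_pong(events, window_s=10.0, max_cells=5):
    """Rule (1): inside a short burst with repeated cells, keep each cell's last appearance."""
    out = []
    i = 0
    while i < len(events):
        # Grow a burst of events that all fall within window_s of its first event.
        j = i
        while j + 1 < len(events) and events[j + 1][0] - events[i][0] <= window_s:
            j += 1
        burst = events[i:j + 1]
        cells = {eci for _, eci in burst}
        if len(cells) < len(burst) and len(cells) < max_cells:
            last_seen = {}
            for t, eci in burst:
                last_seen[eci] = t  # later events overwrite earlier ones
            burst = sorted((t, eci) for eci, t in last_seen.items())
        out.extend(burst)
        i = j + 1
    return out


def is_probably_static(events, neighbors, min_duration_s=30.0, max_cells=5):
    """Rule (2): prolonged switching among a few mutually neighboring cells."""
    # Drop consecutive repeats so that only real cell changes remain.
    switches = [e for k, e in enumerate(events) if k == 0 or e[1] != events[k - 1][1]]
    cells = {eci for _, eci in switches}
    if len(cells) >= max_cells or len(switches) <= len(cells):
        return False  # too many cells, or no cell is ever revisited
    duration = switches[-1][0] - switches[0][0]
    mutual = all(b in neighbors.get(a, set()) for a in cells for b in cells if a != b)
    return duration > min_duration_s and mutual
```

A user for whom `is_probably_static` returns true would have the speed marked as 0; otherwise the collapsed sequence from `collapse_ping_pong` would feed the speed calculation.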
Based on the above technical solution, the high-speed rail user identification method provided by the application performs a preliminary screening of the serving cells: it determines the serving cells whose shortest distance from the high-speed rail line is smaller than the second preset distance threshold as the third target serving cells, acquires the S1-MME interface data of users in the third target serving cells, and orders the S1-MME interface data by time. According to the algorithm above, the serving cells whose number of handovers within the preset time period is smaller than the preset number are then selected from the third target serving cells and determined as the second target serving cells. After this preliminary screening, the application can further screen the serving cells again through the first model and take the serving cells that rank highest in confidence as the training objects of the initial model, from which the second model is determined. The second model can then accurately identify high-speed rail users, which helps to discover potential coverage quality problems, optimize network quality, improve user perception, and build a higher-quality high-speed rail network.
Fig. 23 is an exemplary structural diagram of a high-speed rail user identification device according to an embodiment of the present application. The device includes a processing unit 2301 and an acquisition unit 2302.
Optionally, the processing unit 2301 is configured to determine a high-speed rail master cell; the high-speed rail master cell is used for providing services for the high-speed rail user group.
Optionally, the processing unit 2301 is further configured to determine a first model according to fingerprint features of the high-speed rail master cell; the first model is used for determining the confidence of the high-speed railway master control cell.
Optionally, the processing unit 2301 is further configured to determine the first target serving cell according to a confidence level of the high-speed rail master cell.
Optionally, the processing unit 2301 is further configured to determine a second model according to the first target serving cell.
Optionally, the processing unit 2301 is further configured to identify a high-speed rail user of the users to be identified according to the second model.
Optionally, the acquisition unit 2302 is configured to acquire a second target serving cell.
Optionally, the processing unit 2301 is further configured to determine a high-speed rail user group from the user groups corresponding to the second target serving cell according to the long interval speed algorithm and the first preset feature condition.
Optionally, the processing unit 2301 is further configured to determine a high-speed rail master cell according to the high-speed rail user group.
Optionally, the processing unit 2301 is further configured to determine a third target serving cell.
Optionally, the acquisition unit 2302 is further configured to acquire S1-MME interface data of the user in the third target serving cell.
Optionally, the processing unit 2301 is further configured to order the S1-MME interface data by time.
Optionally, the processing unit 2301 is further configured to obtain the second target serving cell from the third target serving cell according to the ordered S1-MME interface data.
Optionally, the processing unit 2301 is further configured to construct the first model and the second model according to a LightGBM algorithm.
Optionally, the processing unit 2301 is further configured to construct an initial model according to a LightGBM algorithm.
Optionally, the processing unit 2301 is further configured to determine whether the recognition accuracy of the initial model is greater than or equal to an accuracy threshold according to a K-fold cross-validation algorithm.
Optionally, the processing unit 2301 is further configured to determine the initial model as the second model if the recognition accuracy of the initial model is greater than or equal to the accuracy threshold.
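The accuracy gate performed by the processing unit can be sketched as below. This is a dependency-free illustration of K-fold cross-validation only: the application builds its models with the LightGBM algorithm, whereas `train_speed_threshold` is a hypothetical stand-in trainer on a single toy feature (speed, with labels 0 and 1), and `k_fold_accuracy` and `accept_as_second_model` are illustrative names. The sketch assumes at least `k` samples and that every training split contains both classes.

```python
def k_fold_accuracy(samples, labels, train_fn, k=5):
    """Mean accuracy over k folds; train_fn(X, y) must return a predict(x) callable."""
    n = len(samples)
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample forms this fold
        train_x = [samples[i] for i in range(n) if i not in test_idx]
        train_y = [labels[i] for i in range(n) if i not in test_idx]
        predict = train_fn(train_x, train_y)
        correct = sum(predict(samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


def accept_as_second_model(samples, labels, train_fn, accuracy_threshold, k=5):
    """Accept the initial model only if cross-validated accuracy reaches the threshold."""
    return k_fold_accuracy(samples, labels, train_fn, k) >= accuracy_threshold


def train_speed_threshold(train_x, train_y):
    """Hypothetical stand-in for the real model: cut at the midpoint of the class means."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x >= cut else 0
```

With a real gradient-boosting model, `train_fn` would wrap model fitting and return the fitted model's prediction function; the gate itself stays the same.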
In addition, the technical effects of the high-speed rail user identification apparatus of fig. 23 may refer to the technical effects of the high-speed rail user identification method of the above embodiment, and will not be described herein.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be implemented by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to implement all or part of the functions described above. The specific working processes of the above-described systems, devices and units may refer to the corresponding processes in the foregoing method embodiments, which are not described herein.
The present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the high-speed rail user identification method of the method embodiments described above.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores instructions, and when the instructions run on a computer, the instructions cause the computer to execute the high-speed rail user identification method in the method flow shown in the method embodiment.
The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a register, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing, or any other form of computer readable storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application specific integrated circuit (ASIC). In the context of the present application, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
Since the high-speed rail user identification apparatus, the computer readable storage medium and the computer program product in the embodiments of the present application may be applied to the above-mentioned method, the technical effects that can be obtained by the method may also refer to the above-mentioned method embodiments, and the embodiments of the present application are not described herein again.
The foregoing is merely a specific implementation of the present application, but the protection scope of the present application is not limited thereto; any change or substitution within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (14)

1. A method for identifying a high-speed rail user, the method comprising:
determining a high-speed rail main control cell; the high-speed rail main control cell is used for providing services for the high-speed rail user group;
determining a first model according to the fingerprint characteristics of the high-speed rail main control cell; the first model is used for determining the confidence level of the high-speed rail main control cell;
determining a first target serving cell according to the confidence coefficient of the high-speed rail main control cell; the first target serving cell is a serving cell in which the confidence coefficient is larger than a preset confidence coefficient threshold value in the high-speed rail main control cell;
determining a second model according to the first target serving cell; the second model is used for identifying the high-speed rail user;
and identifying the high-speed rail users in the users to be identified according to the second model.
2. The method according to claim 1, wherein the determining the high-speed rail master cell specifically comprises:
acquiring a second target serving cell; the second target serving cell is a serving cell with switching times smaller than preset times in a preset time period on a high-speed railway line;
determining the high-speed railway user group from the user groups corresponding to the second target service cell according to a long interval speed algorithm;
and determining the high-speed railway master control cell according to a first preset characteristic condition and the high-speed railway user group.
3. The method of claim 2, wherein the first preset feature condition comprises one or more of: the moving route has periodicity, the distance between the moving route and the high-speed railway route is smaller than a first preset distance threshold value, and the cell unique identifiers ECI of the corresponding serving cells in a preset period are the same.
4. The method according to claim 3, wherein the acquiring the second target serving cell specifically includes:
determining a third target serving cell; the shortest distance between the third target serving cell and the high-speed railway line is smaller than a second preset distance threshold value;
acquiring S1-Mobility Management Entity (MME) interface data of a user in the third target serving cell; the S1-MME interface data comprises ECI of the third target serving cell;
time sequencing is carried out on the S1-MME interface data;
and acquiring the second target serving cell from the third target serving cell according to the ordered S1-MME interface data.
5. The method of claim 4, wherein the first model and the second model are constructed according to a LightGBM algorithm.
6. The method according to claim 5, wherein said determining a second model from said first target serving cell comprises:
constructing an initial model according to the LightGBM algorithm;
judging whether the identification accuracy of the initial model is greater than or equal to an accuracy threshold according to a K-fold cross-validation algorithm;
and determining the initial model as the second model under the condition that the identification accuracy of the initial model is greater than or equal to an accuracy threshold.
7. A high-speed rail user identification apparatus, the apparatus comprising: a processing unit;
the processing unit is used for determining a high-speed rail main control cell; the high-speed rail main control cell is used for providing services for the high-speed rail user group;
the processing unit is further used for determining a first model according to the fingerprint characteristics of the high-speed rail main control cell; the first model is used for determining the confidence level of the high-speed rail main control cell;
the processing unit is further configured to determine a first target serving cell according to the confidence level of the high-speed rail main control cell; the first target serving cell is a serving cell in which the confidence coefficient is larger than a preset confidence coefficient threshold value in the high-speed rail main control cell;
the processing unit is further configured to determine a second model according to the first target serving cell; the second model is used for identifying the high-speed rail user;
and the processing unit is further used for identifying the high-speed rail users in the users to be identified according to the second model.
8. The apparatus of claim 7, wherein the apparatus further comprises: an acquisition unit;
the acquisition unit is used for acquiring a second target serving cell; the second target serving cell is a serving cell with switching times smaller than preset times in a preset time period on a high-speed railway line;
the processing unit is further configured to determine the high-speed railway user group from the user groups corresponding to the second target serving cell according to a long interval speed algorithm;
the processing unit is further configured to determine the high-speed rail master control cell according to a first preset feature condition and the high-speed rail user group.
9. The apparatus of claim 8, wherein the first preset feature condition comprises one or more of: the moving route has periodicity, the distance between the moving route and the high-speed railway route is smaller than a first preset distance threshold value, and ECI of a corresponding serving cell in a preset period is the same.
10. The apparatus of claim 9, wherein the obtaining the second target serving cell specifically comprises:
the processing unit is further configured to determine a third target serving cell; the shortest distance between the third target serving cell and the high-speed railway line is smaller than a second preset distance threshold value;
the acquiring unit is further configured to acquire S1-MME interface data of a user in the third target serving cell; the S1-MME interface data comprises ECI of the third target service cell;
the processing unit is further configured to time sequence the S1-MME interface data;
the processing unit is further configured to obtain the second target serving cell from the third target serving cell according to the ordered S1-MME interface data.
11. The apparatus of claim 10, wherein the processing unit is further configured to construct the first model and the second model according to the LightGBM algorithm.
12. The apparatus of claim 10, wherein the processing unit is further configured to determine a second model based on the first target serving cell, specifically comprising:
the processing unit is further used for constructing an initial model according to the LightGBM algorithm;
the processing unit is further used for judging whether the identification accuracy of the initial model is greater than or equal to an accuracy threshold according to a K-fold cross-validation algorithm;
the processing unit is further configured to determine the initial model as the second model if the recognition accuracy of the initial model is greater than or equal to an accuracy threshold.
13. A high-speed rail user identification device, comprising: a processor and a communication interface; the communication interface is coupled to the processor for running a computer program or instructions to implement the high-speed rail user identification method as claimed in any one of claims 1-6.
14. A computer readable storage medium having instructions stored therein, characterized in that when executed by a computer, the computer performs the high-speed rail user identification method as claimed in any one of the preceding claims 1-5.
CN202310739330.1A 2023-06-20 2023-06-20 High-speed rail user identification method, device and storage medium Pending CN116567672A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310739330.1A CN116567672A (en) 2023-06-20 2023-06-20 High-speed rail user identification method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310739330.1A CN116567672A (en) 2023-06-20 2023-06-20 High-speed rail user identification method, device and storage medium

Publications (1)

Publication Number Publication Date
CN116567672A true CN116567672A (en) 2023-08-08

Family

ID=87486343

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310739330.1A Pending CN116567672A (en) 2023-06-20 2023-06-20 High-speed rail user identification method, device and storage medium

Country Status (1)

Country Link
CN (1) CN116567672A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination