CN108573265B - People flow statistical method and statistical system - Google Patents

People flow statistical method and statistical system

Info

Publication number
CN108573265B
Authority
CN
China
Prior art keywords
serving cell
historical
reported
area
reported information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710141863.4A
Other languages
Chinese (zh)
Other versions
CN108573265A (en)
Inventor
陈霖
游龙涛
韩桂鲁
李志阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201710141863.4A
Publication of CN108573265A
Application granted
Publication of CN108573265B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/10 Scheduling measurement reports; Arrangements for measurement reports
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/021 Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/025 Services making use of location information using location based information parameters
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention discloses a people flow statistical method comprising the following steps: acquiring the boundary of a designated area, and classifying historical UE-reported data that carries an accurate position as inside or outside the area; labeling the historical UE-reported data inside and outside the area according to the classification result; acquiring historical UE (user equipment) reported information, extracting features from it, and constructing a feature classification model in combination with the inside/outside labels; classifying, according to the feature classification model, whether collected UE reported information samples without an accurate position fall within the designated area; and performing people flow statistics on the classified sample points that fall within the designated area. The invention also discloses a statistical system. The invention addresses the technical problems that data samples from a GPS data source are insufficient and that position estimates based on UE reported information contain errors, thereby improving the accuracy of people flow statistics for a specific area.

Description

People flow statistical method and statistical system
Technical Field
The invention relates to the technical field of wireless communication, in particular to a people flow statistical method and a statistical system.
Background
With the rise of the mobile internet, applications based on mobile terminals such as mobile phones keep emerging, especially mobile phone positioning technologies related to location services. Because mobile phone ownership is now very high, people flow statistics based on mobile phones are accurate and of practical value. For example, in commerce, store layout can be planned by analyzing people flow; in transportation, traffic can be improved by analyzing the characteristics of people flow; and in public safety, emergencies can be detected in time from changes in people flow.
Mobile phone positioning generally has two data sources: position information acquired from GPS (Global Positioning System), and position estimates derived from the various types of wireless measurement information that the UE (User Equipment) reports to the wireless base station. GPS positioning is very accurate, with a typical error of less than 10 meters, but it has serious limitations. First, although current smartphones are almost all equipped with a GPS module, many users, to save power, turn the module on only when using positioning and navigation applications and leave it off otherwise. Second, GPS works only outdoors in an unobstructed environment; indoors, GPS satellite signals cannot be received, so no position information is available. Regional people flow statistics that rely only on GPS position information therefore suffer from a serious shortage of samples, and the results rarely reach a satisfactory level.
Position estimation based on UE reported information has the opposite strengths and weaknesses. It is not limited by the environment or by user settings and works both indoors and outdoors, so it has a clear advantage over GPS positioning in sample size. On the other hand, the error of positions estimated from UE reported information is much larger than that of GPS positions. People flow statistics for a hotspot area that rely only on these estimated positions therefore struggle to meet accuracy requirements.
The above is provided only to assist understanding of the technical solution of the present invention and is not an admission that it is prior art.
Disclosure of Invention
The main object of the invention is to provide a people flow statistical method and a statistical system that address the technical problems that data samples from a GPS data source are insufficient and that position estimates based on UE (user equipment) reported information contain errors, so as to improve the accuracy of people flow statistics for a specific area.
To achieve the above object, the present invention provides a people flow statistical method comprising the following steps:
acquiring the boundary of a designated area, and classifying historical UE-reported data that carries an accurate position as inside or outside the area;
labeling the historical UE-reported data inside and outside the area according to the classification result;
acquiring historical UE reported information, extracting features from it, and constructing a feature classification model in combination with the inside/outside labels;
classifying, according to the feature classification model, whether a collected UE reported information sample without an accurate position falls within the designated area;
and performing people flow statistics on the classified sample points that fall within the designated area.
Preferably, the step of acquiring historical UE reported information, extracting features from it, and constructing a feature classification model in combination with the inside/outside labels includes:
acquiring historical UE reported information, and counting the distribution of primary serving cell IDs and neighboring serving cell IDs;
and extracting a primary serving cell ID list and a neighboring serving cell ID list covering the designated area, and constructing a primary-neighbor cell sequence set in combination with the inside/outside labels.
Preferably, the step of classifying, according to the feature classification model, whether a collected UE reported information sample without an accurate position falls within the designated area includes:
classifying the collected UE reported information samples without an accurate position according to the constructed primary-neighbor cell sequence set;
if the primary serving cell ID of the UE reported information sample is not in the primary serving cell ID list, determining that the sample falls outside the designated area;
if the primary serving cell ID of the UE reported information sample is in the primary serving cell ID list, counting how many of the sample's neighboring serving cell IDs appear in the neighboring serving cell ID list, and when the count is greater than a preset threshold, determining that the sample falls within the designated area.
Preferably, the step of acquiring historical UE reported information, extracting features from it, and constructing a feature classification model in combination with the inside/outside labels includes:
acquiring historical UE reported information, and extracting from it the signal strengths of the primary serving cell and the neighboring serving cells;
preprocessing the acquired signal strengths of the primary serving cell and the neighboring serving cells;
and training a classification model with a data mining classification algorithm on the labeled historical data and the preprocessed samples.
Preferably, the step of classifying, according to the feature classification model, whether a collected UE reported information sample without an accurate position falls within the designated area includes:
acquiring the signal strengths of the primary serving cell and the neighboring serving cells of the UE reported information sample;
and determining whether the sample falls within the designated area according to the classification model and the signal strengths of the primary serving cell and the neighboring serving cells.
To achieve the above object, the present invention further provides a statistical system, including:
the area classification module, used for acquiring the boundary of a designated area and classifying historical UE-reported data that carries an accurate position as inside or outside the area;
the label processing module, used for labeling the historical UE-reported data inside and outside the area according to the classification result;
the construction module, used for acquiring historical UE reported information, extracting features from it, and constructing a feature classification model in combination with the inside/outside labels;
the classification processing module, used for classifying, according to the feature classification model, whether a collected UE reported information sample without an accurate position falls within the designated area;
and the statistics module, used for performing people flow statistics on the classified sample points that fall within the designated area.
Preferably, the construction module includes:
the statistical unit, used for acquiring historical UE reported information and counting the distribution of primary serving cell IDs and neighboring serving cell IDs;
and the construction unit, used for extracting a primary serving cell ID list and a neighboring serving cell ID list covering the designated area and constructing a primary-neighbor cell sequence set in combination with the inside/outside labels.
Preferably, the classification processing module includes:
the classification unit, used for classifying the collected UE reported information samples without an accurate position according to the constructed primary-neighbor cell sequence set;
the determining unit, used for determining that a UE reported information sample falls outside the designated area if its primary serving cell ID is not in the primary serving cell ID list;
the determining unit is further used for, if the primary serving cell ID of the sample is in the primary serving cell ID list, counting how many of the sample's neighboring serving cell IDs appear in the neighboring serving cell ID list, and determining that the sample falls within the designated area when the count is greater than a predetermined threshold.
Preferably, the construction module includes:
the extracting unit, used for acquiring historical UE reported information and extracting from it the signal strengths of the primary serving cell and the neighboring serving cells;
the preprocessing unit, used for preprocessing the acquired signal strengths of the primary serving cell and the neighboring serving cells;
and the construction unit, used for training a classification model with a data mining classification algorithm on the labeled historical data and the preprocessed samples.
Preferably, the classification processing module includes:
the obtaining unit, used for acquiring the signal strengths of the primary serving cell and the neighboring serving cells of the UE reported information sample;
and the determining unit, used for determining whether the sample falls within the designated area according to the classification model and the signal strengths of the primary serving cell and the neighboring serving cells.
According to the people flow statistical method and statistical system of the invention, the boundary of the designated area is acquired, and historical UE-reported data that carries an accurate position is classified as inside or outside the area; the historical UE-reported data inside and outside the area is labeled according to the classification result; historical UE reported information is acquired, features are extracted from it, and a feature classification model is constructed in combination with the inside/outside labels; collected UE reported information samples without an accurate position are classified by the feature classification model as falling within the designated area or not; and people flow statistics are performed on the classified sample points that fall within the designated area. The pattern of UE reported information is thus learned from accurate positions to establish the feature classification model. The approach is robust and adaptable, addresses the technical problems that data samples from a GPS data source are insufficient and that position estimates based on UE reported information contain errors, and thereby improves the accuracy of people flow statistics for a specific area.
Drawings
FIG. 1 is a schematic flow chart of a people flow statistical method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of historical data falling inside and outside a designated area;
FIG. 3 is a detailed flowchart of a first embodiment of the step S3 in FIG. 1;
FIG. 4 is a diagram illustrating the distribution statistics of the primary serving cell;
FIG. 5 is a diagram illustrating the distribution statistics of the neighboring serving cells;
FIG. 6 is a detailed flowchart of the first embodiment of the step S4 in FIG. 1;
FIG. 7 is a detailed flowchart of a second embodiment of the step S3 in FIG. 1;
FIG. 8 is a detailed flowchart of a second embodiment of the step S4 in FIG. 1;
FIG. 9 is a schematic flow chart of a first embodiment of the people flow statistical method of the present invention;
FIG. 10 is a functional block diagram of a statistical system according to a first embodiment of the present invention;
FIG. 11 is a detailed functional block diagram of a first embodiment of the construction module of FIG. 10;
FIG. 12 is a detailed functional block diagram of a first embodiment of the classification processing module of FIG. 10;
FIG. 13 is a detailed functional block diagram of a second embodiment of the construction module of FIG. 10;
FIG. 14 is a detailed functional block diagram of a second embodiment of the classification processing module of FIG. 10;
FIG. 15 is a functional block diagram of a statistical system according to a second embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides a people flow statistical method. Referring to FIG. 1, in an embodiment, the people flow statistical method includes:
S1, acquiring the boundary of a designated area, and classifying historical UE-reported data that carries an accurate position as inside or outside the area;
In this embodiment, a polygon (point-in-polygon) algorithm uses the coordinates of the corner points of the designated area to classify the historical UE-reported data that carries an accurate position as inside or outside the area. The accurate position is obtained by the Assisted Global Positioning System (AGPS) in the UE (User Equipment), and the historical UE-reported data is taken from the information reported by the UE.
S2, labeling the historical UE-reported data inside and outside the area according to the classification result;
In this embodiment, as shown in FIG. 2, once the inside/outside classification of the historical UE-reported data that carries an accurate position is complete, data falling within the designated area is labeled 1 and data falling outside it is labeled 0. The historical UE reported information includes, but is not limited to, MR (Measurement Report) information.
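Steps S1 and S2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text only names "a polygon algorithm", so the ray-casting test used here is an assumed choice, and the record fields (`ue`, `pos`) and planar coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) in an assumed planar coordinate system


def point_in_polygon(p: Point, corners: List[Point]) -> bool:
    """Ray casting: toggle 'inside' each time a ray going right from p crosses an edge."""
    x, y = p
    inside = False
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_reports(reports: List[Dict], corners: List[Point]) -> List[Dict]:
    """Attach label 1 (inside the designated area) or 0 (outside) to each record."""
    return [
        dict(r, label=1 if point_in_polygon(r["pos"], corners) else 0)
        for r in reports
    ]


# Hypothetical designated area (a square) and two AGPS-located reports.
area = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
history = [{"ue": "a", "pos": (5.0, 5.0)}, {"ue": "b", "pos": (15.0, 5.0)}]
labeled = label_reports(history, area)
print([r["label"] for r in labeled])
```

Points lying exactly on the boundary are not treated specially in this sketch; a production system would need to decide how to handle them.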
S3, acquiring historical UE reported information, extracting features from it, and constructing a feature classification model in combination with the inside/outside labels;
In this embodiment, the feature set extracted from the historical UE reported information includes, but is not limited to: the primary serving cell ID list, the neighboring serving cell ID list, the primary serving cell signal strength, the neighboring serving cell signal strength, and the distance from each cell (primary and neighboring serving cells alike) to the center point of the designated area.
In this embodiment, the feature classification model, built from the features extracted from the historical UE reported information together with the inside/outside labels, may be a classification rule specified manually from the feature set, or a classification model learned algorithmically from the pattern of UE reported information in the historical data.
S4, classifying, according to the feature classification model, whether a collected UE reported information sample without an accurate position falls within the designated area;
In this embodiment, the inside/outside feature classification model is established from the extracted feature set and the inside/outside labels of the historical data by data mining and/or statistical analysis. Applicable data mining methods include, but are not limited to, decision trees, logistic regression, and random forests.
The collected UE reported information without an accurate position is then classified by the feature classification model, which determines for each record whether it falls inside or outside the designated area.
S5, performing people flow statistics on the classified sample points that fall within the designated area.
In this embodiment, people flow statistics are performed on the classified sample points that fall within the designated area. Further, UE reported information for a specified time period can be acquired and classified with the feature classification model, so that the people flow of the designated area can be counted for whatever time period is required.
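The counting in step S5 could look like the sketch below. The text does not say how samples are aggregated, so two choices here are assumptions: samples are bucketed by hour, and each UE is counted once per hour (distinct identifiers) rather than once per report. The field names `ue`, `time` and `in_area` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def hourly_people_flow(samples):
    """Count distinct UEs per hour among samples classified as inside the area."""
    ues_per_hour = defaultdict(set)
    for s in samples:
        if s["in_area"] == 1:
            # Truncate the timestamp to the start of its hour.
            hour = s["time"].replace(minute=0, second=0, microsecond=0)
            ues_per_hour[hour].add(s["ue"])
    return {hour: len(ues) for hour, ues in sorted(ues_per_hour.items())}


# Hypothetical classified samples (in_area is the classifier output from S4).
samples = [
    {"ue": "ue1", "time": datetime(2017, 3, 10, 9, 5), "in_area": 1},
    {"ue": "ue1", "time": datetime(2017, 3, 10, 9, 40), "in_area": 1},
    {"ue": "ue2", "time": datetime(2017, 3, 10, 9, 50), "in_area": 1},
    {"ue": "ue3", "time": datetime(2017, 3, 10, 9, 55), "in_area": 0},
    {"ue": "ue2", "time": datetime(2017, 3, 10, 10, 10), "in_area": 1},
]
flow = hourly_people_flow(samples)
for hour, count in flow.items():
    print(hour.isoformat(), count)
```

Counting distinct UEs avoids inflating the figure when one device reports several times within the same hour; counting raw reports instead would only require replacing the set with a counter.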
In the people flow statistical method provided by the invention, the boundary of the designated area is acquired and historical UE-reported data that carries an accurate position is classified as inside or outside the area; the historical UE-reported data inside and outside the area is labeled according to the classification result; features are extracted from the historical UE reported information and a feature classification model is constructed in combination with the inside/outside labels; collected UE reported information samples without an accurate position are classified by the model as falling within the designated area or not; and people flow statistics are performed on the classified sample points that fall within the designated area. The pattern of UE reported information is thus learned from accurate positions to establish the feature classification model. The approach is robust and adaptable, addresses the technical problems that data samples from a GPS data source are insufficient and that position estimates based on UE reported information contain errors, and thereby improves the accuracy of people flow statistics for a specific area.
In the first embodiment, as shown in FIG. 3, on the basis of the embodiment shown in FIG. 1, the step S3 includes:
Step S31, acquiring historical UE reported information, and counting the distribution of primary serving cell IDs and neighboring serving cell IDs;
Step S32, extracting a primary serving cell ID list and a neighboring serving cell ID list covering the designated area, and constructing a primary-neighbor cell sequence set in combination with the inside/outside labels.
In this embodiment, the feature set is the sequence of primary serving cell IDs and neighboring serving cell IDs in the historical UE reported information; these IDs are extracted as the feature set to establish a feature classification model that decides whether a sample falls within the designated area.
Specifically, the distribution of primary serving cell IDs and neighboring serving cell IDs is counted, with the results shown in FIG. 4 and FIG. 5. The figures show the share of occurrences of each primary serving cell ID (FIG. 4) and each neighboring serving cell ID (FIG. 5) in the historical data sample, together with the cumulative share, with the IDs arranged in descending order of frequency. They show that, for both primary and neighboring serving cells, a small number of cells provide most of the coverage of the designated area. The most frequent primary serving cell and neighboring serving cell IDs for the area are therefore extracted to construct the primary-neighbor cell ID feature sequence set of the area. A cumulative coverage percentage threshold can be configured according to actual needs; the primary and neighboring serving cells that mainly cover the area are screened out with this threshold and together form the primary-neighbor cell sequence feature set.
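The screening by cumulative coverage threshold can be sketched as below. This is one possible reading under stated assumptions: only samples labeled 1 (inside the area) contribute to the lists, the threshold is 0.9, and the record fields `primary`, `neighbors` and `label` are hypothetical names.

```python
from collections import Counter


def dominant_ids(ids, coverage=0.9):
    """Most frequent IDs, taken in descending order until their cumulative
    share of all occurrences reaches `coverage`."""
    counts = Counter(ids)
    total = sum(counts.values())
    selected, covered = [], 0
    for cell_id, n in counts.most_common():
        if covered / total >= coverage:
            break
        selected.append(cell_id)
        covered += n
    return selected


def build_cell_lists(labeled_reports, coverage=0.9):
    """Build the primary and neighboring serving cell ID lists for the area."""
    inside = [r for r in labeled_reports if r["label"] == 1]
    primary = dominant_ids([r["primary"] for r in inside], coverage)
    neighbors = dominant_ids([n for r in inside for n in r["neighbors"]], coverage)
    return set(primary), set(neighbors)


# Hypothetical labeled history: cells A and B dominate inside the area.
reports = (
    [{"primary": "A", "neighbors": ["N1", "N2"], "label": 1}] * 6
    + [{"primary": "B", "neighbors": ["N1", "N3"], "label": 1}] * 3
    + [{"primary": "C", "neighbors": ["N4"], "label": 1}]
    + [{"primary": "Z", "neighbors": ["N9"], "label": 0}] * 5
)
primary_ids, neighbor_ids = build_cell_lists(reports)
print(sorted(primary_ids), sorted(neighbor_ids))
```

Raising the coverage threshold admits rarer cells into the lists, which tends to increase recall for the area at the cost of more false positives from cells that only marginally cover it.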
In an embodiment, as shown in FIG. 6, on the basis of the embodiment shown in FIG. 1, the step S4 includes:
Step S41, classifying the collected UE reported information samples without an accurate position according to the constructed primary-neighbor cell sequence set;
In this embodiment, the collected UE reported information samples that do not contain an accurate position are classified according to the constructed primary-neighbor cell sequence set, and the records falling within the designated area are identified. One decision rule is given below; other decision rules may be defined in other embodiments.
Step S42, if the primary serving cell ID of the UE reported information sample is not in the primary serving cell ID list, determining that the sample falls outside the designated area;
In this embodiment, for a collected UE reported information sample, if its primary serving cell ID is not in the obtained primary serving cell ID list, the sample is directly classified as falling outside the designated area;
Step S43, if the primary serving cell ID of the UE reported information sample is in the primary serving cell ID list, counting how many of the sample's neighboring serving cell IDs appear in the neighboring serving cell ID list, and when the count is greater than a predetermined threshold, determining that the sample falls within the designated area.
In this embodiment, if the primary serving cell ID of the sample is in the primary serving cell ID list, the number of the sample's neighboring serving cell IDs that appear in the neighboring serving cell ID list is counted. If the count is greater than a predetermined threshold, the sample is classified as falling within the designated area; otherwise it is classified as falling outside. Counting the sample records that fall within the designated area then gives the change in people flow for the area. If the UE reported information is acquired for a specified time, the people flow of the designated area can be counted for the required time period.
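The decision rule of steps S42 and S43 translates almost directly into code. The sketch below assumes the two ID lists are already available as sets; the ID values, the field names and the threshold of 1 are illustrative only.

```python
def classify_by_cell_ids(sample, primary_ids, neighbor_ids, threshold=1):
    """Return 1 if the sample is judged to fall within the designated area, else 0."""
    # S42: an unknown primary serving cell means the sample is outside.
    if sample["primary"] not in primary_ids:
        return 0
    # S43: count the sample's neighboring cell IDs that appear in the list.
    matches = sum(1 for n in sample["neighbors"] if n in neighbor_ids)
    return 1 if matches > threshold else 0


# Hypothetical ID lists for the designated area.
primary_ids = {"A", "B"}
neighbor_ids = {"N1", "N2", "N3"}

samples = [
    {"primary": "A", "neighbors": ["N1", "N2", "N7"]},  # 2 matches, above threshold
    {"primary": "A", "neighbors": ["N1", "N8", "N9"]},  # 1 match, not above threshold
    {"primary": "Z", "neighbors": ["N1", "N2", "N3"]},  # primary cell not in the list
]
results = [classify_by_cell_ids(s, primary_ids, neighbor_ids) for s in samples]
print(results)
```

Note that the comparison is strict ("greater than"), as in the text, so a sample whose match count equals the threshold is classified as outside.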
In a second embodiment, as shown in FIG. 7, on the basis of the embodiment shown in FIG. 1, the step S3 includes:
Step S33, acquiring historical UE reported information, and extracting from it the signal strengths of the primary serving cell and the neighboring serving cells;
In this embodiment, the signal strengths of the primary serving cell and the neighboring serving cells in the historical UE reported information are extracted as the feature set, which is used to establish a feature classification model that decides whether a sample falls within the designated area. The signal strength is preferably RSRP (Reference Signal Received Power).
Step S34, preprocessing the acquired signal strengths of the primary serving cell and the neighboring serving cells;
In this embodiment, the UE reported information may specifically include the RSRP of 1 primary serving cell and the RSRPs of 3 neighboring serving cells. If an RSRP was not acquired, so that a missing value occurs, preprocessing may be performed; the preprocessing includes, but is not limited to, removing invalid values and significantly deviating values and handling missing values.
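A minimal preprocessing sketch for step S34 follows. The text leaves the concrete rules open, so everything specific here is an assumption: the valid range of -140 to -44 dBm, dropping a record whose primary RSRP is missing or out of range, and filling missing or invalid neighbor values with the lower bound.

```python
RSRP_MIN, RSRP_MAX = -140.0, -44.0  # assumed valid RSRP range in dBm


def is_valid(v):
    """True if v is present and lies inside the assumed valid range."""
    return v is not None and RSRP_MIN <= v <= RSRP_MAX


def preprocess(rows, n_neighbors=3, fill=RSRP_MIN):
    """Turn (primary_rsrp, [neighbor_rsrps]) rows into fixed-length feature rows."""
    cleaned = []
    for primary, neighbors in rows:
        # Drop the whole record if the primary cell measurement is unusable.
        if not is_valid(primary):
            continue
        # Replace missing or out-of-range neighbor values with the fill value.
        vals = [v if is_valid(v) else fill for v in list(neighbors)[:n_neighbors]]
        # Pad records that report fewer than n_neighbors neighboring cells.
        vals += [fill] * (n_neighbors - len(vals))
        cleaned.append([primary] + vals)
    return cleaned


# Hypothetical raw measurements: None marks a value that was not acquired.
rows = [
    (-80.0, [-90.0, None, -100.0]),
    (None, [-90.0, -95.0, -99.0]),
    (-75.0, [-85.0, 20.0]),
    (-300.0, [-90.0, -91.0, -92.0]),
]
features = preprocess(rows)
print(features)
```

Filling a missing neighbor with the weakest possible value encodes "this cell was not heard", which is one reasonable convention; mean imputation or dropping the record are alternatives the text equally allows.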
Step S35, training a classification model with a data mining classification algorithm on the labeled historical data and the preprocessed samples.
In this embodiment, a model is trained on the labeled historical data and the preprocessed samples with an algorithm such as logistic regression, yielding a binary classification model.
Accurate positioning of each UE is therefore unnecessary: characteristics of the designated area are learned from historical data, each UE sample can be directly identified as belonging to the designated area or not, and the people flow of the designated area is then obtained.
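The training in step S35 can be illustrated with a small logistic regression written in plain Python. This is a sketch under stated assumptions: batch gradient descent stands in for whatever library implementation a real system would use, the features are reduced to two RSRP values per sample, and the scaling constants and sample values are invented for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def scale(row):
    """Map RSRP in dBm (assumed range -140..-44) to roughly [0, 1]."""
    return [(v + 140.0) / 96.0 for v in row]


def train_logistic(X, y, lr=0.5, epochs=3000):
    """Batch gradient descent on the mean log loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Binary decision: 1 = inside the designated area, 0 = outside."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical labeled history: [primary RSRP, strongest neighbor RSRP] in dBm.
X_raw = [[-70.0, -80.0], [-72.0, -85.0], [-68.0, -82.0], [-75.0, -88.0],
         [-100.0, -110.0], [-105.0, -95.0], [-98.0, -112.0], [-110.0, -105.0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = train_logistic([scale(x) for x in X_raw], y)
print(predict(w, b, scale([-71.0, -83.0])), predict(w, b, scale([-104.0, -108.0])))
```

The same `predict` call is what the classification step would apply to a new UE reported information sample without an accurate position, after the sample has gone through the same preprocessing and scaling as the training data.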
In an embodiment, as shown in FIG. 8, on the basis of the embodiment shown in FIG. 7, the step S4 includes:
Step S44, acquiring the signal strengths of the primary serving cell and the neighboring serving cells of the UE reported information sample;
Step S45, determining whether the sample falls within the designated area according to the classification model and the signal strengths of the primary serving cell and the neighboring serving cells.
In this embodiment, the reported RSRP of the primary serving cell and the RSRPs of the neighboring serving cells are evaluated with the trained classification model, here the logistic regression model, to determine whether the sample falls within the designated area. Counting the sample records that fall within the designated area then gives the change in people flow at hourly granularity. If the UE reported information is acquired for a specified time, the people flow of the designated area can be counted for the required time period.
In an embodiment, as shown in fig. 9, on the basis of the embodiment shown in fig. 1, the step S3 further includes:
S6, locating the historical UE reported information, and extending the designated area to obtain an extended area;
and S7, selecting the historical UE reported information that falls within the extended area.
In this embodiment, the historical UE reported information is located using an algorithm such as fingerprint positioning or triangulation, the designated area is extended by a certain range to obtain an extended area, for example by extending outward 300 meters from the edge of the designated area, and the historical UE reported information sample points that fall within the extended area are then selected.
After this coarse screening, the set of candidate sample points is much smaller than the original city-wide sample set. A feature classification model is then constructed from the sample points that passed the coarse screening, the sample points falling within the designated area are identified, and the people flow is counted. Computational complexity is thus effectively reduced without reducing statistical accuracy, which improves practicability.
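The coarse screening against the extended area can be sketched as below. The sketch assumes the coarse positions and the area corners are already expressed in a planar coordinate system in meters, and it treats a point as inside the extended area when it lies inside the polygon or within the margin of any polygon edge. The function names and the 300-meter default are illustrative.

```python
import math

def point_in_polygon(pt, polygon):
    """Ray-casting test: True if pt lies inside the polygon."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def dist_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def in_extended_area(pt, polygon, margin_m=300.0):
    """True if pt is inside the polygon or within margin_m of its edge."""
    if point_in_polygon(pt, polygon):
        return True
    n = len(polygon)
    return any(
        dist_to_segment(pt, polygon[i], polygon[(i + 1) % n]) <= margin_m
        for i in range(n)
    )

# Hypothetical 1 km square area and coarse positions, in meters.
area = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
points = [(500, 500), (1200, 500), (1400, 500), (1250, 1250)]
kept = [p for p in points if in_extended_area(p, area)]
```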
Referring to fig. 10, a statistical system 100 according to an embodiment of the present invention includes:
the regional classification module 10 is configured to obtain the boundary of a designated area, and to classify historical UE reported data that has an accurate position as inside or outside the area;
in this embodiment, a polygon algorithm is applied to the coordinates of the corner points of the designated area to classify historical UE reported data that has an accurate position as inside or outside the designated area, where the accurate position is obtained by the Assisted Global Positioning System (AGPS) in the UE (User Equipment); the historical UE reported data is obtained from the information reported by the UE.
A tag processing module 20, configured to perform tag processing on the historical UE reported data inside and outside the area according to the classification result;
in this embodiment, as shown in fig. 2, after the inside/outside classification of the historical UE reported data with an accurate position is completed, the historical UE reported data falling within the designated area is labeled 1, and the historical UE reported data falling outside the designated area is labeled 0; the historical UE reported information includes, but is not limited to, MR (Measurement Report) information.
A building module 30, configured to obtain historical UE report information, extract features from the historical UE report information, and build a feature classification model by combining tags that fall inside and outside an area;
in this embodiment, the feature set extracted from the historical UE reported information includes, but is not limited to: a main serving cell ID list, a neighboring serving cell ID list, the main serving cell signal strength, the neighboring serving cell signal strengths, and the distances from all cells (including the main serving cell and the neighboring serving cells) to the central point of the designated area.
In this embodiment, the feature classification model constructed by combining the features extracted from the historical UE reported information with the inside/outside-area labels may be a classification rule specified manually according to the feature set, or a classification model established by learning the UE reporting patterns from historical data through a series of algorithms.
The classification processing module 40 is configured to perform classification processing on whether the collected UE reported information sample that does not include an accurate position falls in a designated area or not according to the feature classification model;
in this embodiment, the extracted feature set is combined with the labels that mark the historical data as inside or outside the designated area, and an inside/outside-area feature classification model is established by a data mining and/or statistical analysis method. Applicable data mining methods include, but are not limited to, decision trees, logistic regression, and random forests.
And classifying the collected UE reported information without accurate positions according to the characteristic classification model, and distinguishing whether each piece of data information falls in a specified area or outside the specified area.
And the statistic module 50 is configured to perform people flow statistics on the classified sample points falling in the designated area.
In this embodiment, the classified sample points falling in the designated area may be subjected to pedestrian flow statistics. Further, UE reported information in a specified time period can be obtained, and after a classification result is obtained according to the characteristic classification model, the pedestrian flow of the specified area in the specified time period can be counted according to a required time period.
The statistical system provided by the invention obtains the boundary of the designated area and classifies historical UE reported data that has an accurate position as inside or outside the area; labels the historical reported data inside and outside the area according to the classification result; extracts features from the historical UE reported information and constructs a feature classification model by combining them with the inside/outside-area labels; classifies, according to the feature classification model, whether each collected UE reported information sample that lacks an accurate position falls within the designated area; and performs people flow statistics on the sample points classified as falling within the designated area. Because the feature classification model is built by learning the UE reporting patterns from accurately located data, it is robust and adaptable, and it addresses the technical problems that GPS-based data sources provide too few samples and that UE reported information contains errors, thereby improving the accuracy of people flow statistics for a specific area.
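The area classification and label processing performed by modules 10 and 20 can be sketched as follows. The source only names "a polygon algorithm"; the ray-casting test used here is one common choice and is an assumption, as are the record fields `lon` and `lat` and the sample coordinates. Treating longitude and latitude as planar coordinates is acceptable for the inside/outside test over a small area.

```python
def point_in_polygon(x, y, corners):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def label_records(records, corners):
    """Attach label 1 (inside) or 0 (outside) to each AGPS-located record."""
    return [
        dict(rec, label=1 if point_in_polygon(rec["lon"], rec["lat"], corners) else 0)
        for rec in records
    ]

# Hypothetical corner points (lon, lat) and AGPS-located historical records.
corners = [(113.90, 22.50), (113.92, 22.50), (113.92, 22.52), (113.90, 22.52)]
records = [
    {"ue": "a", "lon": 113.91, "lat": 22.51},
    {"ue": "b", "lon": 113.95, "lat": 22.51},
]
labeled = label_records(records, corners)
```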
In a first embodiment, as shown in fig. 11, on the basis of the embodiment shown in fig. 10, the building module 30 includes:
a counting unit 301, configured to obtain historical UE reporting information, and count distribution characteristics of IDs of a main serving cell and neighboring serving cells;
a constructing unit 302, configured to extract a primary serving cell ID list and a neighbor serving cell ID list that cover the specified area, and construct a primary neighbor sequence set in combination with tags that fall inside and outside the specified area.
In this embodiment, the ID feature sequences of the main serving cell and the neighboring serving cells in the historical UE reported information are selected as the feature set: the main serving cell ID list and the neighboring serving cell ID list are extracted and used to establish a feature classification model for determining whether a sample falls within the designated area.
Specifically, the distribution characteristics of the IDs of the main serving cell and the neighboring serving cells are counted, and the statistical results are shown in fig. 4 and fig. 5. Figs. 4 and 5 show, for each serving cell ID, its share of occurrences in the historical data samples and the cumulative share, with the IDs sorted in descending order of frequency. It can be seen that, for both main serving cells and neighboring serving cells, the designated area is covered mainly by a small number of cells; the IDs of the main serving cells/neighboring serving cells that occur most often in the specific area are therefore extracted to construct the main neighboring cell ID feature sequence set of the specific area. The cumulative coverage percentage threshold can be configured according to actual needs; the main serving cells/neighboring serving cells that mainly cover the specific area are selected using this threshold, and together they form the main neighboring cell sequence feature set.
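The selection by cumulative coverage percentage can be sketched as follows, assuming the input is the list of cell IDs observed in historical samples labeled as inside the area. The function name and the 0.9 default threshold are illustrative; the source leaves the threshold configurable.

```python
from collections import Counter

def select_covering_cells(cell_ids, coverage_threshold=0.9):
    """Return the most frequent cell IDs whose cumulative share of
    occurrences first reaches coverage_threshold."""
    counts = Counter(cell_ids)
    total = sum(counts.values())
    selected = []
    cumulative = 0
    # Descending frequency; ties are broken by ID for determinism.
    for cell_id, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if cumulative >= coverage_threshold * total:
            break
        selected.append(cell_id)
        cumulative += count
    return selected

# Hypothetical main serving cell IDs seen in in-area historical samples.
observed = ["A"] * 60 + ["B"] * 25 + ["C"] * 10 + ["D"] * 5
main_id_list = select_covering_cells(observed, coverage_threshold=0.9)
```

The same function applied to the neighboring serving cell IDs yields the neighboring serving cell ID list.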
In an embodiment, as shown in fig. 12, on the basis of the embodiment shown in fig. 10, the classification processing module 40 includes:
a classifying unit 401, configured to classify, according to the constructed primary neighboring cell sequence set, the collected UE reported information samples that do not contain an accurate position;
in this embodiment, according to the constructed main neighboring cell sequence set, the collected UE reported information samples that do not include an accurate position are classified, and a record falling in the designated area is identified. One of the decision rules is listed below, but of course, other decision rules may be defined in other embodiments.
A determining unit 402, configured to determine that the UE-reported information sample falls outside the designated area if the primary serving cell ID of the UE-reported information sample is not in the primary serving cell ID list;
in this embodiment, for an acquired information sample reported by a UE, if the primary serving cell ID is not in the obtained primary serving cell ID list, the sample is directly classified as falling outside the specified area;
the determining unit 402 is further configured to, if the primary serving cell ID of the reported information is in the primary serving cell ID list, obtain the number of neighboring serving cell IDs in the neighboring serving cell ID list, and determine that the UE reported information sample falls in the designated area when the number is greater than a predetermined threshold.
In this embodiment, if the main serving cell ID of the reported information is in the main serving cell ID list, the number of the sample's neighboring serving cell IDs that appear in the neighboring serving cell ID list is further obtained; if this number is greater than a predetermined threshold, the sample is classified as falling within the designated area, otherwise it is classified as falling outside the designated area. By counting the sample records that fall within the designated area, the change in people flow of the designated area can be obtained. Of course, if the UE reported information is collected for a specified time, the people flow of the designated area can be counted for any required time period.
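The decision rule applied by units 401 and 402 can be sketched as follows. The function and parameter names and the example cell IDs are hypothetical; the strict "greater than" comparison follows the wording of the source.

```python
def classify_sample(main_cell_id, neighbor_cell_ids,
                    main_id_list, neighbor_id_list, threshold=1):
    """Return 1 if the sample is judged inside the designated area, else 0."""
    # Rule 1: an unknown main serving cell means the sample is outside.
    if main_cell_id not in main_id_list:
        return 0
    # Rule 2: count the sample's neighbors that appear in the neighbor list.
    hits = sum(1 for cid in neighbor_cell_ids if cid in neighbor_id_list)
    return 1 if hits > threshold else 0

# Hypothetical ID lists built from historical data.
main_ids = {"A", "B", "C"}
neighbor_ids = {"N1", "N2", "N3", "N4"}

inside = classify_sample("A", ["N1", "N2", "X9"], main_ids, neighbor_ids)
too_few = classify_sample("A", ["N1", "X8", "X9"], main_ids, neighbor_ids)
wrong_main = classify_sample("Z", ["N1", "N2", "N3"], main_ids, neighbor_ids)
```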
In a second embodiment, as shown in fig. 13, on the basis of the embodiment shown in fig. 10, the building module 30 includes:
an extracting unit 303, configured to acquire historical UE report information, and extract signal strengths of a main serving cell and a neighboring serving cell from the historical UE report information;
in this embodiment, the signal strengths of the main serving cell and the neighboring serving cells in the historical UE reported information are extracted as the feature set, which is used to establish a feature classification model for determining whether a sample falls within the designated area. The signal strength is preferably RSRP (Reference Signal Received Power).
A preprocessing unit 304, configured to preprocess the acquired signal strengths of the primary serving cell and the neighboring serving cells;
in this embodiment, the information reported by the UE may specifically include the RSRP of 1 main serving cell and the RSRPs of 3 neighboring serving cells; because an RSRP may not be collected, leaving a missing value, the data may be preprocessed, where the preprocessing includes, but is not limited to, removing invalid values and obvious outliers, and handling missing values.
The constructing unit 302 is configured to train and construct a classification model by applying a data-mining classification algorithm to the label-processed historical data and the preprocessed samples.
In this embodiment, an algorithm such as logistic regression is trained on the label-processed historical data and the preprocessed samples to obtain a binary classification model.
Therefore, precise point positioning of the UE is not required: the characteristics of the designated area are learned from historical data, so whether a UE sample belongs to the designated area can be determined directly, and the people flow in the designated area is then obtained.
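The preprocessing performed by unit 304 can be sketched as follows. The valid range of -140 to -44 dBm, the choice to discard a sample whose main serving cell RSRP is unusable, and the choice to fill missing neighbor values with the range minimum are all assumptions made for illustration; the source only says that invalid values and obvious outliers are removed and missing values are handled.

```python
# Assumed valid RSRP range in dBm; values outside it are treated as invalid.
RSRP_MIN_DBM = -140.0
RSRP_MAX_DBM = -44.0

def is_valid(value):
    return value is not None and RSRP_MIN_DBM <= value <= RSRP_MAX_DBM

def preprocess_sample(main_rsrp, neighbor_rsrps, n_neighbors=3,
                      fill_value=RSRP_MIN_DBM):
    """Return [main, n1, n2, n3] as floats, or None to discard the sample."""
    if not is_valid(main_rsrp):
        return None  # no usable main serving cell measurement
    # Drop invalid neighbor values, keep at most n_neighbors of the rest.
    cleaned = [v for v in neighbor_rsrps if is_valid(v)][:n_neighbors]
    # Pad missing neighbors so every sample has the same feature length.
    cleaned += [fill_value] * (n_neighbors - len(cleaned))
    return [float(main_rsrp)] + [float(v) for v in cleaned]

# Hypothetical raw reports: one neighbor missing, one out of range.
row = preprocess_sample(-80, [-90, None, -30])
dropped = preprocess_sample(None, [-90, -95, -99])
```

The fixed-length rows returned here are the kind of input a binary classifier such as logistic regression expects.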
In an embodiment, as shown in fig. 14, on the basis of the embodiment shown in fig. 10, the classification processing module 40 includes:
an obtaining unit 403, configured to obtain the signal strengths of the main serving cell and the neighboring serving cells from the UE reported information sample;
a determining unit 404, configured to determine whether the sample falls within the designated area according to the classification model and the signal strengths of the main serving cell and the neighboring serving cells.
In this embodiment, the reported RSRP of the main serving cell and the RSRPs of the neighboring serving cells are fed to the trained classification model, for example the logistic regression classification model, to determine whether the sample falls within the designated area. By counting the sample records that fall within the designated area, the change in people flow at hourly granularity can be obtained. Of course, if the UE reported information is collected for a specified time, the people flow of the designated area can be counted for any required time period.
In an embodiment, as shown in fig. 15, based on the embodiment shown in fig. 10, the statistical system 100 further includes:
an extension module 60, configured to locate the historical UE reported information, and extend the designated area to obtain an extended area;
and a screening module 70, configured to select the historical UE reported information that falls within the extended area.
In this embodiment, the historical UE reported information is located using an algorithm such as fingerprint positioning or triangulation, the designated area is extended by a certain range to obtain an extended area, for example by extending outward 300 meters from the edge of the designated area, and the historical UE reported information sample points that fall within the extended area are then selected.
After this coarse screening, the set of candidate sample points is much smaller than the original city-wide sample set. A feature classification model is then constructed from the sample points that passed the coarse screening, the sample points falling within the designated area are identified, and the people flow is counted. Computational complexity is thus effectively reduced without reducing statistical accuracy, which improves practicability.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A people flow rate statistical method is characterized by comprising the following steps:
acquiring the boundary of a designated area, and carrying out internal and external classification on historical UE reported data with accurate positions;
according to the classification result, label processing is respectively carried out on the historical UE reported data inside and outside the region;
acquiring historical UE (user equipment) reported information, extracting features from the historical UE reported information, and constructing a feature classification model by combining tags falling inside and outside an area;
classifying whether the collected UE reported information sample without the accurate position falls in a designated area or not according to the characteristic classification model;
and carrying out people flow statistics on the classified sample points falling in the specified area.
2. The people flow rate statistical method according to claim 1, wherein the step of obtaining historical UE reported information, extracting features from the historical UE reported information, and constructing a feature classification model by combining tags falling inside and outside an area comprises:
acquiring historical UE (user equipment) reported information, and counting distribution characteristics of IDs (identity) of a main serving cell and an adjacent serving cell;
and extracting a main service cell ID list and an adjacent service cell ID list covering the specified area, and constructing a main adjacent area sequence set by combining labels falling inside and outside the specified area.
3. The people flow rate statistical method according to claim 2, wherein the step of classifying whether the collected UE reported information samples that do not include an accurate location fall in a designated area according to the feature classification model comprises:
classifying the collected UE reported information samples without accurate positions according to the constructed main adjacent region sequence set;
if the ID of the main service cell of the information sample reported by the UE is not in the ID list of the main service cell, judging that the information sample reported by the UE falls outside the designated area;
if the main serving cell ID of the reported information is in the main serving cell ID list, acquiring the number of the neighbor serving cell IDs in the neighbor serving cell ID list, and when the number is greater than a preset threshold value, judging that the UE reported information sample falls in the specified area.
4. The people flow rate statistical method according to claim 1, wherein the step of obtaining historical UE reported information, extracting features from the historical UE reported information, and constructing a feature classification model by combining tags falling inside and outside an area comprises:
acquiring historical UE (user equipment) reporting information, and extracting the signal intensity of a main serving cell and a neighbor serving cell from the historical UE reporting information;
preprocessing the acquired signal intensity of the main serving cell and the adjacent serving cells;
and training and constructing a classification model by adopting a classification algorithm in data mining according to the historical data subjected to label processing and the preprocessed samples.
5. The people flow rate statistical method according to claim 4, wherein the step of classifying whether the collected UE reported information samples without accurate positions fall in a designated area according to the feature classification model comprises:
acquiring the signal intensity of a main serving cell and an adjacent serving cell of the UE reported information sample;
and determining whether the sample falls in the designated area or not according to the classification model and the signal strengths of the main serving cell and the adjacent serving cells.
6. A statistical system, characterized in that the statistical system comprises:
the regional classification module is used for acquiring the boundary of the designated region and performing regional internal and external classification on the historical UE reported data with accurate positions;
the label processing module is used for respectively carrying out label processing on the historical UE reported data inside and outside the region according to the classification result;
the construction module is used for acquiring historical UE (user equipment) reported information, extracting features from the historical UE reported information and constructing a feature classification model by combining tags falling inside and outside an area;
the classification processing module is used for classifying whether the collected UE reported information sample without the accurate position falls in a designated area or not according to the characteristic classification model;
and the counting module is used for carrying out people flow counting on the classified sample points falling in the specified area.
7. The statistical system of claim 6, wherein the construction module comprises:
the statistical unit is used for acquiring historical UE reported information and counting the distribution characteristics of the IDs of the main serving cell and the adjacent serving cells;
and the constructing unit is used for extracting a main service cell ID list and an adjacent service cell ID list covering the specified area and constructing a main adjacent area sequence set by combining labels falling inside and outside the specified area.
8. The statistical system of claim 7, wherein the classification processing module comprises:
the classification unit is used for classifying the collected UE reported information samples without accurate positions according to the constructed main adjacent region sequence set;
a determining unit, configured to determine that the UE-reported information sample falls outside the designated area if the primary serving cell ID of the UE-reported information sample is not in the primary serving cell ID list;
the determining unit is further configured to, if the primary serving cell ID of the reported information is in the primary serving cell ID list, obtain the number of neighboring serving cell IDs in the neighboring serving cell ID list, and determine that the UE reported information sample falls in the designated area when the number is greater than a predetermined threshold.
9. The statistical system of claim 6, wherein the construction module comprises:
the extracting unit is used for acquiring historical UE reported information and extracting the signal intensity of a main serving cell and a neighboring serving cell from the historical UE reported information;
the preprocessing unit is used for preprocessing the acquired signal strength of the main serving cell and the adjacent serving cells;
and the construction unit is used for training and constructing a classification model by adopting a classification algorithm in data mining according to the historical data after the label processing and the preprocessed sample.
10. The statistical system of claim 9, wherein the classification processing module comprises:
an obtaining unit, configured to obtain signal strengths of a main serving cell and an adjacent serving cell of the information sample reported by the UE;
and the determining unit is used for determining whether the sample falls in the specified area or not according to the classification model and the signal strengths of the main serving cell and the adjacent serving cells.
CN201710141863.4A 2017-03-10 2017-03-10 People flow statistical method and statistical system Active CN108573265B (en)

Publications (2)

Publication Number Publication Date
CN108573265A (en) 2018-09-25
CN108573265B (en) 2023-04-07

Family

ID=63577395

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant