CN108573265B - People flow statistical method and statistical system

Info

Publication number: CN108573265B
Application number: CN201710141863.4A
Authority: CN
Inventors: 陈霖; 游龙涛; 韩桂鲁; 李志阳
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2017-03-10
Filing date: 2017-03-10
Publication date: 2023-04-07
Anticipated expiration: 2037-03-10
Also published as: CN108573265A

Abstract

The invention discloses a people flow statistical method comprising the following steps: acquiring the boundary of a designated area and classifying historical UE (user equipment) reported data that carries an accurate position as inside or outside the area; labeling the historical UE reported data inside and outside the area according to the classification result; acquiring historical UE reported information, extracting features from it, and constructing a feature classification model in combination with the inside/outside labels; using the feature classification model to classify whether collected UE reported information samples without an accurate position fall within the designated area; and performing people flow statistics on the classified sample points that fall within the designated area. The invention also discloses a statistical system. The invention addresses the technical problems that data samples from a GPS data source are insufficient and that positions estimated from UE reported information contain errors, thereby improving the accuracy of people flow statistics for a specific area.

Description

People flow statistical method and statistical system

Technical Field

The invention relates to the technical field of wireless communication, in particular to a people flow statistical method and a statistical system.

Background

With the rise of the mobile internet, applications based on mobile terminals such as mobile phones continue to emerge, especially those that rely on mobile phone positioning technologies for location-based services. Because mobile phone ownership is now very high, people flow statistics based on mobile phones offer high accuracy and practical value. For example, in commerce, people flow analysis informs store layout; in transportation, analyzing the characteristics of people flow helps improve traffic; and in public safety, changes in people flow help detect emergencies in time.

Mobile phone positioning generally relies on two sources of position information. One is position information acquired from GPS (Global Positioning System); the other is a position estimate derived from the various kinds of wireless measurement information that the UE (User Equipment) reports to the wireless base station. The advantage of GPS is its very high positioning accuracy, with a typical positioning error of less than 10 meters. Its limitations, however, are also considerable. First, although smartphones are now almost always equipped with a GPS module, many users, in order to save power, turn the module on only when starting positioning or navigation applications and leave it off otherwise. Second, GPS can be used only outdoors in unobstructed environments; indoors, GPS satellite signals cannot be received, so no positioning information is available. Regional people flow statistics that use only GPS position information therefore suffer from a serious shortage of samples, and the statistical results rarely reach a satisfactory level.

By contrast, position estimation based on UE reported information has the opposite strengths and weaknesses. It is not limited by the environment or by user settings and can be used both indoors and outdoors, so it has a clear advantage over GPS in sample adequacy. On the other hand, the positioning error of positions estimated from UE reported information is much larger than that of GPS. People flow statistics for a hot spot area that rely only on such estimated positions therefore struggle to meet accuracy requirements.

The above is only for the purpose of assisting understanding of the technical solution of the present invention, and does not represent an admission that the above is the prior art.

Disclosure of Invention

The main objective of the invention is to provide a people flow statistical method and a statistical system that address the technical problems that data samples from a GPS data source are insufficient and that positions estimated from UE (user equipment) reported information contain errors, thereby improving the accuracy of people flow statistics for a specific area.

In order to achieve the above object, the present invention provides a people flow statistical method, which comprises the following steps:

acquiring the boundary of a designated area, and carrying out internal and external classification on historical UE reported data with accurate positions;

according to the classification result, label processing is respectively carried out on the historical UE reported data inside and outside the region;

acquiring historical UE reported information, extracting features from the historical UE reported information, and constructing a feature classification model by combining labels falling inside and outside the area;

classifying whether the collected UE reported information sample without the accurate position falls in a designated area or not according to the characteristic classification model;

and carrying out people flow statistics on the classified sample points falling in the specified area.

Preferably, the step of obtaining the historical UE report information, extracting the features from the historical UE report information, and constructing the feature classification model by combining the tags inside and outside the area includes:

acquiring historical UE reported information, and counting the distribution characteristics of the IDs of the main serving cell and the adjacent serving cells;

and extracting a main service cell ID list and an adjacent service cell ID list covering the designated area, and constructing a main adjacent area sequence set by combining labels falling inside and outside the designated area.

Preferably, the step of classifying whether the collected UE reported information sample without an accurate location falls in a designated area according to the feature classification model includes:

classifying the collected UE reported information samples without accurate positions according to the constructed main adjacent region sequence set;

if the ID of the main service cell of the information sample reported by the UE is not in the ID list of the main service cell, judging that the information sample reported by the UE falls outside the specified area;

if the main serving cell ID of the reported information is in the main serving cell ID list, acquiring the number of the neighbor serving cell IDs in the neighbor serving cell ID list, and when the number is greater than a preset threshold value, judging that the UE reported information sample falls in the specified area.

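Preferably, the step of obtaining the historical UE report information, extracting the features from the historical UE report information, and constructing the feature classification model by combining the tags inside and outside the area includes:

acquiring historical UE reported information, and extracting the signal strengths of the main serving cell and the adjacent serving cells from the historical UE reported information;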

preprocessing the acquired signal intensity of the main serving cell and the adjacent serving cells;

and training and constructing a classification model by adopting a classification algorithm in data mining according to the historical data subjected to label processing and the preprocessed samples.

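Preferably, the step of classifying whether the collected UE reported information sample without an accurate location falls in a designated area according to the feature classification model includes:

acquiring the signal strengths of the main serving cell and the adjacent serving cells of the UE reported information sample;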

and determining whether the sample falls in the designated area or not according to the classification model and the signal strengths of the main serving cell and the adjacent serving cells.

In order to achieve the above object, the present invention further provides a statistical system, including:

the regional classification module is used for acquiring the boundary of the designated region and performing regional internal and external classification on the historical UE reported data with accurate positions;

the label processing module is used for respectively carrying out label processing on the historical UE reported data inside and outside the region according to the classification result;

the building module is used for acquiring historical UE reported information, extracting features from the historical UE reported information, and constructing a feature classification model by combining labels falling inside and outside an area;

the classification processing module is used for classifying whether the collected UE reported information sample without the accurate position falls in a specified area or not according to the characteristic classification model;

and the counting module is used for carrying out people flow counting on the classified sample points falling in the specified area.

Preferably, the building module comprises:

the statistical unit is used for acquiring historical UE reported information and counting the distribution characteristics of the IDs of the main serving cell and the adjacent serving cells;

and the constructing unit is used for extracting a main service cell ID list and an adjacent service cell ID list covering the specified area and constructing a main adjacent area sequence set by combining labels falling inside and outside the specified area.

Preferably, the classification processing module includes:

the classification unit is used for classifying the collected UE reported information samples without accurate positions according to the constructed main adjacent cell sequence set;

a determining unit, configured to determine that the UE-reported information sample falls outside the designated area if the primary serving cell ID of the UE-reported information sample is not in the primary serving cell ID list;

the determining unit is further configured to, if the primary serving cell ID of the reported information is in the primary serving cell ID list, obtain the number of neighboring serving cell IDs in the neighboring serving cell ID list, and determine that the UE reported information sample falls in the designated area when the number is greater than a predetermined threshold.

Preferably, the building module comprises:

the extracting unit is used for acquiring historical UE reported information and extracting the signal intensity of a main serving cell and a neighboring serving cell from the historical UE reported information;

the preprocessing unit is used for preprocessing the acquired signal intensity of the main serving cell and the adjacent serving cell;

and the construction unit is used for training and constructing a classification model by adopting a classification algorithm in data mining according to the historical data after the label processing and the preprocessed sample.

Preferably, the classification processing module includes:

an obtaining unit, configured to obtain signal strengths of a main serving cell and an adjacent serving cell of the information sample reported by the UE;

and the determining unit is used for determining whether the sample falls in the specified area or not according to the classification model and the signal strengths of the main serving cell and the adjacent serving cells.

According to the people flow statistical method and the statistical system, the boundary of the designated area is obtained, and the internal and external classification of the area is carried out on the historical UE reported data with accurate positions; according to the classification result, label processing is respectively carried out on the historical UE reported data inside and outside the region; acquiring historical UE (user equipment) reported information, extracting features from the historical UE reported information, and constructing a feature classification model by combining tags falling inside and outside an area; classifying whether the collected UE reported information samples without accurate positions fall in a specified area or not according to the characteristic classification model; and carrying out people flow statistics on the classified sample points falling into the designated area. Therefore, the mode of the UE reporting information is learned based on the accurate position, the characteristic classification model is established, the method has the characteristics of good robustness and strong adaptability, the technical problems that data samples based on a GPS data source are insufficient and errors exist in the UE reporting information can be solved, and therefore the accuracy of people flow statistics for a specific area is improved.

Drawings

FIG. 1 is a schematic flow chart of a people flow statistical method according to a first embodiment of the present invention;

FIG. 2 is a schematic diagram of historical data falling inside and outside a designated area;

FIG. 3 is a detailed flowchart of a first embodiment of the step S3 in FIG. 1;

FIG. 4 is a diagram illustrating the distribution statistics of the primary serving cell;

FIG. 5 is a diagram illustrating the distribution statistics of the neighboring serving cells;

FIG. 6 is a detailed flowchart of the first embodiment of the step S4 in FIG. 1;

FIG. 7 is a detailed flowchart of a second embodiment of the step S3 in FIG. 1;

FIG. 8 is a detailed flowchart of a second embodiment of the step S4 in FIG. 1;

FIG. 9 is a flow chart of a further embodiment of the people flow statistical method of the present invention;

FIG. 10 is a functional block diagram of a statistical system according to a first embodiment of the present invention;

FIG. 11 is a schematic diagram of the detailed functional modules of the first embodiment of the building module of FIG. 10;

FIG. 12 is a schematic diagram of a detailed functional block diagram of a first embodiment of the classification processing block of FIG. 10;

FIG. 13 is a schematic diagram of the detailed functional modules of a second embodiment of the building module of FIG. 10;

FIG. 14 is a schematic diagram of a refinement function module of a second embodiment of the classification processing module of FIG. 10;

FIG. 15 is a functional block diagram of a statistical system according to a second embodiment of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The invention provides a people flow statistical method. Referring to FIG. 1, in an embodiment, the people flow statistical method includes:

S1, acquiring the boundary of a designated area, and classifying historical UE reported data with accurate positions as inside or outside the area;

in this embodiment, according to the coordinates of the corner points of the designated area, a polygon algorithm (a point-in-polygon test) is used to classify historical UE reported data with an accurate position as inside or outside the designated area. The accurate position is obtained by the Assisted Global Positioning System (AGPS) in the UE (User Equipment); the historical UE reported data is obtained from the information reported by the UE.

S2, according to the classification result, performing label processing on the historical UE reported data inside and outside the area respectively;

in this embodiment, as shown in FIG. 2, after the historical UE reported data with an accurate position has been classified as inside or outside the area, the historical UE reported data falling within the designated area is labeled 1, and the historical UE reported data falling outside the designated area is labeled 0; the historical UE reported information includes, but is not limited to, MR (Measurement Report) information.
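Steps S1 and S2 can be illustrated with a minimal sketch. The text only says that "a polygon algorithm" is used; the ray-casting test below is one common choice and is an assumption, as are the function names `point_in_polygon` and `label_positions`, the planar coordinates, and the example data. Points lying exactly on the boundary are not treated specially here.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), for example (longitude, latitude)


def point_in_polygon(point: Point, corners: List[Point]) -> bool:
    """Ray-casting test: True if `point` lies inside the polygon `corners`."""
    x, y = point
    inside = False
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # one more crossing to the right
    return inside


def label_positions(positions: List[Point], corners: List[Point]) -> List[int]:
    """Label each accurate (AGPS) position: 1 = inside the area, 0 = outside."""
    return [1 if point_in_polygon(p, corners) else 0 for p in positions]


# Hypothetical rectangular designated area and four accurate positions.
area = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
positions = [(1.0, 1.0), (5.0, 1.0), (2.0, 2.5), (-1.0, 1.0)]
labels = label_positions(positions, area)
```

The resulting labels would then be attached to the measurement information reported together with each accurate position.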

S3, obtaining historical UE reported information, extracting features from the historical UE reported information, and constructing a feature classification model by combining labels falling inside and outside an area;

in this embodiment, the feature set is extracted from the history UE reported information, which includes but is not limited to: a main service cell ID list, an adjacent service cell ID list, a main service cell signal strength, an adjacent service cell signal strength, distances from all cells (including the main service cell and the adjacent service cell) to a central point of a designated area and the like.

In this embodiment, the feature classification model constructed by combining the features extracted from the historical UE reported information with the labels falling inside and outside the area may be a classification rule specified manually according to the feature set, or a classification model established by learning the pattern of the UE reported information from historical data through a series of algorithms.

S4, classifying whether the collected UE reported information sample without the accurate position falls in a designated area or not according to the feature classification model;

in this embodiment, according to the extracted feature set, by combining with the above-mentioned label for labeling the historical data inside and outside the specified area, a feature classification model inside/outside the area is established by a data mining and/or statistical analysis method. Data mining methods including, but not limited to, decision trees, logistic regression, random forests, and the like can be selected.

And classifying the collected UE reported information without the accurate position according to the characteristic classification model, and distinguishing whether each piece of data information falls in the designated area or outside the designated area.

And S5, carrying out people flow statistics on the classified sample points falling in the specified area.

In this embodiment, the classified sample points falling in the designated area may be subjected to pedestrian flow statistics. Further, UE reported information in a specified time period can be obtained, and after a classification result is obtained according to the characteristic classification model, the pedestrian flow of the specified area in the specified time period can be counted according to a required time period.
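Step S5 can be sketched as a simple aggregation over classified samples. The text does not say whether people flow counts sample points or distinct devices; the sketch below assumes that each sample carries a UE identifier and counts distinct UEs per hour, which is an assumption. The function name `people_flow_per_hour` and the sample tuples are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def people_flow_per_hour(samples):
    """Count distinct UEs classified as inside the area, per hour.

    `samples` is an iterable of (ue_id, timestamp, label) tuples, where
    label 1 means the classifier placed the sample inside the area.
    """
    ues_per_hour = defaultdict(set)
    for ue_id, timestamp, label in samples:
        if label == 1:
            # Truncate the timestamp to the start of its hour.
            hour = timestamp.replace(minute=0, second=0, microsecond=0)
            ues_per_hour[hour].add(ue_id)
    return {hour: len(ues) for hour, ues in sorted(ues_per_hour.items())}


# Hypothetical classified samples.
samples = [
    ("ue-a", datetime(2017, 3, 10, 9, 5), 1),
    ("ue-a", datetime(2017, 3, 10, 9, 40), 1),
    ("ue-b", datetime(2017, 3, 10, 9, 10), 1),
    ("ue-c", datetime(2017, 3, 10, 9, 20), 0),
    ("ue-a", datetime(2017, 3, 10, 10, 1), 1),
]
flow = people_flow_per_hour(samples)
```

Counting raw sample points instead would only require replacing the set with a counter.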

The invention provides a people flow statistical method, which comprises the steps of classifying historical UE reported data with accurate positions inside and outside a region by obtaining the boundary of a specified region, respectively carrying out label processing on the historical reported data inside and outside the region according to a classification result, extracting features from the historical UE reported information, constructing a feature classification model by combining labels falling inside and outside the region, classifying whether collected UE reported information samples without accurate positions fall in the specified region according to the feature classification model, and carrying out people flow statistics on sample points which are obtained by classification and fall in the specified region. Therefore, the mode of the UE reporting information is learned based on the accurate position, the characteristic classification model is established, the method has the characteristics of good robustness and strong adaptability, the technical problems that data samples based on a GPS data source are insufficient and errors exist in the UE reporting information can be solved, and therefore the accuracy of people flow statistics for a specific area is improved.

In the first embodiment, as shown in fig. 3, on the basis of the embodiment shown in fig. 1, the step S3 includes:

step S31, obtaining historical UE reported information, and counting distribution characteristics of IDs of a main service cell and an adjacent service cell;

and step S32, extracting a main service cell ID list and an adjacent service cell ID list covering the designated area, and constructing a main adjacent area sequence set by combining labels falling inside and outside the designated area.

In this embodiment, the feature set is the sequence of main serving cell IDs and adjacent serving cell IDs in the historical UE reported information; these IDs are extracted as the feature set in order to establish a feature classification model that determines whether a sample falls within the designated area.

Specifically, the distribution characteristics of the IDs of the main serving cell and the adjacent serving cells are counted, and the statistical results are shown in FIG. 4 and FIG. 5. FIG. 4 and FIG. 5 show, for each cell ID, its share of occurrences in the historical data samples and the cumulative share after the IDs are sorted in descending order of frequency. It can be seen that, for both main serving cells and adjacent serving cells, the designated area is covered mainly by a small number of cells. The IDs of the main serving cells/adjacent serving cells that occur most often in the specific area are therefore extracted to construct the main adjacent cell ID feature sequence set of the specific area. The cumulative coverage percentage threshold can be configured according to actual needs; the main serving cells/adjacent serving cells that mainly cover the specific area are screened out by this threshold, and together they form the main adjacent cell sequence feature set.
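The cumulative-share screening can be sketched as follows. The record layout (`main`, `adjacent`, `label`), the function names, the use of inside-labeled records only, and the 0.8 threshold are all assumptions made for illustration; the text leaves the threshold configurable.

```python
from collections import Counter


def dominant_cell_ids(cell_ids, coverage_threshold):
    """Return the most frequent IDs whose cumulative share reaches the threshold."""
    counts = Counter(cell_ids)
    total = sum(counts.values())
    selected, cumulative = [], 0
    for cell_id, count in counts.most_common():  # descending frequency
        if cumulative >= coverage_threshold * total:
            break
        selected.append(cell_id)
        cumulative += count
    return selected


def build_sequence_set(records, coverage_threshold=0.8):
    """Build the main and adjacent cell ID lists from records labeled inside (1)."""
    inside = [r for r in records if r["label"] == 1]
    main_ids = [r["main"] for r in inside]
    adjacent_ids = [cid for r in inside for cid in r["adjacent"]]
    return (dominant_cell_ids(main_ids, coverage_threshold),
            dominant_cell_ids(adjacent_ids, coverage_threshold))


# Hypothetical labeled historical records.
records = [
    {"main": 101, "adjacent": [201, 202], "label": 1},
    {"main": 101, "adjacent": [201, 203], "label": 1},
    {"main": 101, "adjacent": [201, 202], "label": 1},
    {"main": 102, "adjacent": [201, 202], "label": 1},
    {"main": 102, "adjacent": [202, 203], "label": 1},
    {"main": 103, "adjacent": [201, 204], "label": 1},
    {"main": 999, "adjacent": [901], "label": 0},
]
main_list, adjacent_list = build_sequence_set(records)
```

In this toy data, cells 103 and 204 each appear only once inside the area and fall below the cumulative threshold, so they are left out of the lists.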

In an embodiment, as shown in fig. 6, on the basis of the embodiment shown in fig. 1, the step S4 includes:

Step S41, classifying the collected UE reported information samples without accurate positions according to the constructed main adjacent region sequence set;

in this embodiment, according to the constructed main neighboring cell sequence set, the collected UE reported information samples that do not include an accurate position are classified, and a record falling in the designated area is identified. One of the decision rules is listed below, but of course, other decision rules may be defined in other embodiments.

Step S42, if the ID of the main service cell of the information sample reported by the UE is not in the ID list of the main service cell, the information sample reported by the UE is judged to fall outside the specified area;

in this embodiment, for an acquired information sample reported by a UE, if the primary serving cell ID is not in the obtained primary serving cell ID list, the sample is directly classified as falling outside the specified area;

step S43, if the primary serving cell ID of the information reported by the UE is in the primary serving cell ID list, obtaining the number of neighboring serving cell IDs in the neighboring serving cell ID list, and when the number is greater than a predetermined threshold, determining that the information sample reported by the UE falls in the designated area.

In this embodiment, if the main serving cell ID of the reported information is in the main serving cell ID list, it is further necessary to count how many of the sample's neighboring serving cell IDs appear in the neighboring serving cell ID list. If that number is greater than a predetermined threshold, the sample is classified as falling within the designated area; otherwise, it is classified as falling outside the designated area. Thus, by counting the sample records falling in the designated area, the change in people flow in the designated area can be obtained. Of course, if the UE reported information is obtained for a specified time, the people flow of the designated area can be counted for the required time period.
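The decision rule of steps S42 and S43 can be written as a short function. The function name `falls_in_area`, the example ID lists, and the threshold value of 1 are hypothetical; the text only requires that the match count be greater than a predetermined threshold.

```python
def falls_in_area(main_id, neighbor_ids, main_list, neighbor_list, threshold=1):
    """Apply the rule of steps S42/S43 to one UE reported sample."""
    # S42: a main serving cell outside the list means the sample is outside.
    if main_id not in main_list:
        return False
    # S43: count the sample's neighbor cell IDs that appear in the list.
    matches = sum(1 for cell_id in neighbor_ids if cell_id in neighbor_list)
    return matches > threshold


# Hypothetical ID lists covering the designated area.
main_list = {101, 102}
neighbor_list = {201, 202, 203}

inside_sample = falls_in_area(101, [201, 202, 204], main_list, neighbor_list)
wrong_main = falls_in_area(999, [201, 202], main_list, neighbor_list)
few_matches = falls_in_area(102, [201, 204, 205], main_list, neighbor_list)
```

Storing the lists as sets keeps each membership check constant-time, which matters when the rule is applied to a large volume of samples.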

In a second embodiment, as shown in fig. 7, on the basis of the embodiment shown in fig. 1, the step S3 includes:

step S33, obtaining historical UE reported information, and extracting the signal intensity of a main service cell and an adjacent service cell from the historical UE reported information;

in this embodiment, the signal strengths of the main serving cell and the adjacent serving cells in the historical UE reported information are extracted as the feature set, which is used to establish a feature classification model that determines whether a sample falls within the designated area. The signal strength is preferably RSRP (Reference Signal Received Power).

Step S34, preprocessing the acquired signal intensity of the main serving cell and the adjacent serving cell;

in this embodiment, the UE reported information may specifically include the RSRP of 1 main serving cell and the RSRPs of 3 adjacent serving cells. Because an RSRP may fail to be acquired, leaving a missing value, the data may be preprocessed; the preprocessing includes, but is not limited to, removing invalid values and significantly deviating values (outliers), and handling missing values.
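One possible preprocessing routine is sketched below. The text does not specify how invalid or missing values are handled, so the choices here are assumptions: values outside an assumed valid range of -140 dBm to -44 dBm are treated as invalid, a sample whose main serving cell RSRP is unusable is discarded, and a missing or invalid adjacent cell RSRP is filled with the floor value. The name `preprocess` is hypothetical.

```python
from typing import List, Optional

RSRP_MIN_DBM = -140.0  # assumed lower bound of valid RSRP values
RSRP_MAX_DBM = -44.0   # assumed upper bound of valid RSRP values
NUM_ADJACENT = 3       # the embodiment mentions 3 adjacent serving cells


def is_valid(value: Optional[float]) -> bool:
    return value is not None and RSRP_MIN_DBM <= value <= RSRP_MAX_DBM


def preprocess(sample: List[Optional[float]]) -> Optional[List[float]]:
    """Clean one sample [main_rsrp, adj1, adj2, adj3].

    Returns a fixed-length feature vector, or None if the sample is discarded.
    """
    if not sample or not is_valid(sample[0]):
        return None  # without a usable main cell RSRP the sample is dropped
    adjacent = list(sample[1:1 + NUM_ADJACENT])
    adjacent += [None] * (NUM_ADJACENT - len(adjacent))  # pad short reports
    # Missing or invalid adjacent values are filled with the floor value.
    cleaned = [float(v) if is_valid(v) else RSRP_MIN_DBM for v in adjacent]
    return [float(sample[0])] + cleaned


complete = preprocess([-85, -95, None, -100])
dropped = preprocess([None, -90, -91, -92])
short = preprocess([-80, -20, -99])
```

Filling with the floor value encodes "cell not heard" as the weakest possible signal; other strategies, such as mean imputation, would fit the same interface.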

And S35, training and constructing a classification model by adopting a classification algorithm in data mining according to the historical data subjected to label processing and the preprocessed samples.

In this embodiment, model training is performed on the history data after label processing and the preprocessed samples by using algorithms such as logistic regression, so as to obtain a binary classification model.
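The training step can be illustrated with a small, dependency-free logistic regression. The text names logistic regression but not the training procedure; batch gradient descent, the scaling of RSRP to the range 0 to 1, the learning rate, the epoch count, and the toy data below are all assumptions made for this sketch.

```python
import math


def scale(rsrp_vector):
    """Map RSRP values from the assumed range [-140, -44] dBm to [0, 1]."""
    return [(v + 140.0) / 96.0 for v in rsrp_vector]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, learning_rate=1.0, epochs=3000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += error * xi[j]
            grad_b += error
        w = [wj - learning_rate * g / n for wj, g in zip(w, grad_w)]
        b -= learning_rate * grad_b / n
    return w, b


def predict(w, b, rsrp_vector):
    """Return 1 if the sample is classified as inside the area, else 0."""
    x = scale(rsrp_vector)
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical preprocessed samples: [main, adj1, adj2, adj3] RSRP in dBm.
inside = [[-70, -80, -85, -90], [-72, -82, -88, -91],
          [-68, -79, -84, -92], [-75, -83, -86, -89]]
outside = [[-100, -105, -110, -112], [-98, -104, -108, -115],
           [-103, -107, -111, -113], [-96, -102, -109, -110]]
X = [scale(v) for v in inside + outside]
y = [1] * len(inside) + [0] * len(outside)
w, b = train_logistic_regression(X, y)
```

In practice a library implementation of logistic regression, decision trees, or random forests would typically replace this hand-written loop; the sketch only shows how labeled historical samples and RSRP features fit together.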

Therefore, accurate point positioning of the UE is not needed, certain characteristics of the designated area are learned based on historical data, whether the UE sample belongs to the designated area or not can be directly identified and judged, and then the flow of people in the designated area is obtained.

In an embodiment, as shown in fig. 8, on the basis of the embodiment shown in fig. 7, the step S4 includes:

step S44, obtaining the signal intensity of a main service cell and an adjacent service cell of the information sample reported by the UE;

and step S45, determining whether the sample is in the designated area according to the classification model and the signal strength of the main service cell and the adjacent service cells.

In this embodiment, according to the obtained classification model, the reported RSRP of the main serving cell and the RSRP of the neighboring serving cells are analyzed to determine whether the sample falls within the designated area. In this embodiment, through the learning training of the logistic regression classification model, whether the sample falls within the designated area can be determined according to the reported RSRP of the main serving cell and the neighboring serving cells. Thus, by counting the sample records falling in the designated area, the change situation of the people flow according to the hour granularity can be obtained. Of course, if the UE report information is obtained according to the specified time, the traffic volume of the specified area in the specified time period may be counted according to the required time period.

In an embodiment, as shown in fig. 9, on the basis of the embodiment shown in fig. 1, the step S3 further includes:

S6, positioning the historical UE reported information, and expanding the specified area to obtain an expanded area;

and S7, screening out historical UE reporting information falling in the expansion area.

In this embodiment, a positioning algorithm such as fingerprint positioning or triangulation is used to position the historical UE reported information, and the designated area is expanded by a certain range to obtain an expanded area, for example by extending 300 meters outward from the edge of the designated area. The historical UE reported information sample points falling within the expanded area are then retained.

After this coarse screening, the retained sample point set is far smaller than the original sample set for the whole city. A feature classification model is then constructed from the sample points that pass the coarse screening, the sample points falling in the designated area are determined, and people flow statistics are performed. In this way, the computational complexity can be effectively reduced without reducing statistical accuracy, which improves practicability.
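The coarse screening can be sketched with a deliberately simplified expansion. The text extends the area outward along its edge; the sketch below instead expands the area's bounding box by the same margin, which is an assumption chosen for brevity and keeps a superset of the points that an exact edge buffer would keep. Coordinates are assumed to be planar and in meters, the rough positions are assumed to have been estimated already, and the names `expanded_bounding_box` and `coarse_screen` are hypothetical.

```python
def expanded_bounding_box(corners, margin_m=300.0):
    """Bounding box of the area, grown by `margin_m` meters on every side."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs) - margin_m, min(ys) - margin_m,
            max(xs) + margin_m, max(ys) + margin_m)


def coarse_screen(samples, corners, margin_m=300.0):
    """Keep samples whose roughly estimated position lies in the expanded box."""
    x_min, y_min, x_max, y_max = expanded_bounding_box(corners, margin_m)
    return [s for s in samples
            if x_min <= s["x"] <= x_max and y_min <= s["y"] <= y_max]


# Hypothetical designated area (meters) and samples with estimated positions.
area = [(0.0, 0.0), (500.0, 0.0), (500.0, 400.0), (0.0, 400.0)]
samples = [
    {"id": 1, "x": 250.0, "y": 200.0},    # inside the area
    {"id": 2, "x": -200.0, "y": 100.0},   # within 300 m of the edge
    {"id": 3, "x": 5000.0, "y": 5000.0},  # elsewhere in the city
]
kept = coarse_screen(samples, area)
```

Because the rough position estimates carry large errors, the margin is what prevents samples that truly lie inside the designated area from being discarded at this stage.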

Referring to fig. 10, a statistical system 100 according to an embodiment of the present invention includes:

the regional classification module 10 is configured to obtain a boundary of a designated region, and classify inside and outside the region for historical UE reported data with an accurate position;

A tag processing module 20, configured to perform tag processing on the historical UE reported data inside and outside the area according to the classification result;

in this embodiment, as shown in FIG. 2, after the historical UE reported data with an accurate position has been classified as inside or outside the area, the historical UE reported data falling within the designated area is labeled 1, and the historical UE reported data falling outside the designated area is labeled 0; the historical UE reported information includes, but is not limited to, MR (Measurement Report) information.

A building module 30, configured to obtain historical UE report information, extract features from the historical UE report information, and build a feature classification model by combining tags that fall inside and outside an area;

in this embodiment, the feature set is extracted from the historical UE reporting information, which includes but is not limited to: a main service cell ID list, an adjacent service cell ID list, a main service cell signal strength, an adjacent service cell signal strength, distances from all cells (including the main service cell and the adjacent service cell) to a central point of a designated area and the like.

The classification processing module 40 is configured to perform classification processing on whether the collected UE reported information sample that does not include an accurate position falls in a designated area or not according to the feature classification model;

in this embodiment, according to the extracted feature set, by combining with the above-mentioned label for marking the historical data inside and outside the specified area, a feature classification model inside/outside the area is established by a data mining and/or statistical analysis method. Data mining methods including, but not limited to, decision trees, logistic regression, random forests, and the like can be selected.

And classifying the collected UE reported information without accurate positions according to the characteristic classification model, and distinguishing whether each piece of data information falls in a specified area or outside the specified area.

The statistics module 50 is configured to perform people flow statistics on the classified sample points falling in the designated area.
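As an illustration of the statistics step, the sketch below counts the distinct UEs per hour among the samples classified as falling in the designated area. Counting distinct UEs per hourly window is an assumption made for this example, as are the field names; the description does not fix the counting rule or the time granularity.

```python
from collections import defaultdict


def people_flow(samples):
    """Count distinct UEs per hour among samples classified as inside (1)."""
    ues_per_hour = defaultdict(set)
    for s in samples:
        if s["in_area"] == 1:
            ues_per_hour[s["hour"]].add(s["ue"])
    return {hour: len(ues) for hour, ues in sorted(ues_per_hour.items())}


# Hypothetical classified samples: UE identifier, hour of day, classifier output.
classified = [
    {"ue": "a", "hour": 9, "in_area": 1},
    {"ue": "a", "hour": 9, "in_area": 1},  # same UE again, counted once
    {"ue": "b", "hour": 9, "in_area": 1},
    {"ue": "c", "hour": 9, "in_area": 0},  # outside the area, ignored
    {"ue": "b", "hour": 10, "in_area": 1},
]
flow = people_flow(classified)
```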

The statistical system provided by the invention obtains the boundary of the designated area and classifies the historical UE reported data with accurate positions as inside or outside the area, then labels the historical reported data inside and outside the area according to the classification result. It extracts features from the historical UE reported information and constructs a feature classification model by combining the labels falling inside and outside the area. Finally, according to the feature classification model, it classifies whether each collected UE reported information sample without an accurate position falls in the designated area, and performs people flow statistics on the sample points classified as falling in the designated area. Because the feature classification model is built by learning the patterns of UE reported information from accurate positions, the approach is robust and adaptable. It can address the technical problems that data samples from a GPS data source are insufficient and that UE reported information contains errors, thereby improving the accuracy of people flow statistics for a specific area.

In a first embodiment, as shown in fig. 11, on the basis of the embodiment shown in fig. 10, the building module 30 includes:

a counting unit 301, configured to obtain historical UE reporting information, and count distribution characteristics of IDs of a main serving cell and neighboring serving cells;

a constructing unit 302, configured to extract a primary serving cell ID list and a neighbor serving cell ID list that cover the specified area, and construct a primary neighbor sequence set in combination with tags that fall inside and outside the specified area.

In this embodiment, the feature set consists of the main serving cell and neighboring serving cell ID feature sequences in the historical UE reported information. The main serving cell ID list and the neighboring serving cell ID list are extracted as the feature set to establish a feature classification model that determines whether a sample falls in the designated area.

Specifically, the distribution characteristics of the IDs of the main serving cell and the neighboring serving cells are counted, and the statistical results are shown in fig. 4 and fig. 5. Fig. 4 and fig. 5 show, for each cell ID, the proportion of its occurrences in the historical data samples and the cumulative proportion, with the IDs arranged in descending order of frequency. It can be seen that, for both main serving cells and neighboring serving cells, the designated area is covered mainly by a small number of cells. The IDs of the most frequent main serving cells/neighboring serving cells in the specific area are therefore extracted to construct the main neighboring cell ID feature sequence set of the specific area. The cumulative coverage percentage threshold can be configured according to actual needs; the main serving cells/neighboring serving cells that mainly cover the specific area are screened out through this threshold and together form the main neighboring cell sequence feature set.
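The threshold-based screening can be sketched as follows. The function name, the default threshold and the sample data are hypothetical; the input is assumed to be the cell IDs taken from historical records labelled as inside the area, and the same routine would be run once for main serving cell IDs and once for neighboring serving cell IDs.

```python
from collections import Counter


def dominant_cells(cell_ids, coverage=0.9):
    """Return the most frequent cell IDs whose cumulative share reaches `coverage`."""
    counts = Counter(cell_ids)
    total = sum(counts.values())
    selected, cumulative = [], 0
    # Walk the IDs in descending order of frequency.
    for cell_id, n in counts.most_common():
        selected.append(cell_id)
        cumulative += n
        if cumulative / total >= coverage:
            break
    return selected


# Hypothetical main serving cell IDs observed in in-area historical records.
primary_ids = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
primary_list = dominant_cells(primary_ids, coverage=0.85)
```

With these sample counts, cells A and B together account for 90% of the occurrences, so the rarely seen cell C is left out of the list.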

In an embodiment, as shown in fig. 12, on the basis of the embodiment shown in fig. 10, the classification processing module 40 includes:

a classifying unit 401, configured to classify, according to the constructed primary neighboring cell sequence set, the collected UE reported information samples that do not contain an accurate position;

A determining unit 402, configured to determine that the UE-reported information sample falls outside the designated area if the primary serving cell ID of the UE-reported information sample is not in the primary serving cell ID list;

the determining unit 402 is further configured to, if the primary serving cell ID of the UE reported information sample is in the primary serving cell ID list, obtain the number of the sample's neighboring serving cell IDs that appear in the neighboring serving cell ID list, and determine that the UE reported information sample falls in the designated area when the number is greater than a predetermined threshold.
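The two-stage decision made by the determining unit 402 can be sketched as below. The function name, the sample layout and the threshold value are illustrative assumptions; the comparison is strict because the description requires the number to be greater than the predetermined threshold.

```python
def classify_by_cell_ids(sample, primary_list, neighbor_list, min_neighbors=1):
    """Return 1 if the sample is judged to fall in the designated area, else 0."""
    # Stage 1: an unknown main serving cell means the sample is outside.
    if sample["primary_id"] not in primary_list:
        return 0
    # Stage 2: count the sample's neighbour IDs that appear in the neighbour list.
    matches = sum(1 for c in sample["neighbor_ids"] if c in neighbor_list)
    return 1 if matches > min_neighbors else 0


# Hypothetical ID lists built from historical data, and three unlabelled samples.
primary_cells = {"A", "B"}
neighbor_cells = {"N1", "N2", "N3"}
s_inside = {"primary_id": "A", "neighbor_ids": ["N1", "N2", "X"]}
s_unknown_primary = {"primary_id": "Z", "neighbor_ids": ["N1", "N2", "N3"]}
s_few_neighbors = {"primary_id": "A", "neighbor_ids": ["N1", "X", "Y"]}
results = [
    classify_by_cell_ids(s, primary_cells, neighbor_cells)
    for s in (s_inside, s_unknown_primary, s_few_neighbors)
]
```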

In a second embodiment, as shown in fig. 13, on the basis of the embodiment shown in fig. 10, the building module 30 includes:

an extracting unit 303, configured to acquire historical UE report information, and extract signal strengths of a main serving cell and a neighboring serving cell from the historical UE report information;

A preprocessing unit 304, configured to preprocess the acquired signal strengths of the primary serving cell and the neighboring serving cells;

In this embodiment, the UE reported information may specifically include the RSRP (Reference Signal Received Power) of 1 main serving cell and the RSRPs of 3 neighboring serving cells. If an RSRP is not collected, a missing value results, so preprocessing may be performed; the preprocessing includes, but is not limited to, removing invalid values and significantly deviating values, and handling missing values.

The constructing unit 302 is configured to train and construct a classification model by applying a data mining classification algorithm to the labelled historical data and the preprocessed samples.
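The preprocessing performed by unit 304 might look like the following sketch. The valid range of -140 dBm to -44 dBm, the choice to discard records with an unusable main serving cell RSRP, and the choice to fill missing neighbour values with the lowest valid value are all assumptions made for illustration; the description leaves the concrete treatment open.

```python
RSRP_MIN, RSRP_MAX = -140.0, -44.0  # assumed valid RSRP range in dBm
FILL_VALUE = RSRP_MIN               # assumed stand-in for "cell not measured"


def preprocess(records, n_neighbors=3):
    """Drop records with an unusable main RSRP and pad/fill neighbour RSRPs."""
    cleaned = []
    for r in records:
        p = r.get("primary_rsrp")
        if p is None or not (RSRP_MIN <= p <= RSRP_MAX):
            continue  # invalid or missing main serving cell measurement: discard
        neigh = []
        for v in list(r.get("neighbor_rsrp", []))[:n_neighbors]:
            ok = v is not None and RSRP_MIN <= v <= RSRP_MAX
            neigh.append(float(v) if ok else FILL_VALUE)
        # Pad so that every record has exactly n_neighbors values.
        neigh += [FILL_VALUE] * (n_neighbors - len(neigh))
        cleaned.append({**r, "primary_rsrp": float(p), "neighbor_rsrp": neigh})
    return cleaned


# Hypothetical raw records: one usable, one out of range, one missing.
raw = [
    {"primary_rsrp": -80, "neighbor_rsrp": [-90, None]},
    {"primary_rsrp": 10, "neighbor_rsrp": []},
    {"primary_rsrp": None},
]
clean = preprocess(raw)
```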

In an embodiment, as shown in fig. 14, on the basis of the embodiment shown in fig. 10, the classification processing module 40 includes:

an obtaining unit 403, configured to obtain signal strengths of a main serving cell and an adjacent serving cell of the information sample reported by the UE;

a determining unit 404, configured to determine whether the sample falls within the specified area according to the classification model and the signal strengths of the primary serving cell and the neighboring serving cells.
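As a minimal sketch of the signal-strength embodiment, the example below trains a logistic regression (one of the methods named in the description) by plain batch gradient descent and uses it to classify unlabelled samples. The pure-Python training loop, the scaling range, the hyperparameters and the sample RSRP values are assumptions chosen to keep the example self-contained; a practical system would more likely rely on an established machine learning library.

```python
import math


def scale(rsrp):
    """Map RSRP in dBm from the assumed range [-140, -44] to [0, 1]."""
    return (rsrp + 140.0) / 96.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    xs = [[1.0] + [scale(v) for v in row] for row in rows]
    w = [0.0] * len(xs[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(xs, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad)]
    return w


def predict(w, row, threshold=0.5):
    """Return 1 (inside the area) or 0 (outside) for one RSRP vector."""
    x = [1.0] + [scale(v) for v in row]
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x))) >= threshold else 0


# Hypothetical labelled history: [main RSRP, 3 neighbour RSRPs] in dBm.
train_x = [
    [-70, -85, -90, -95], [-72, -88, -91, -99], [-68, -84, -92, -96],
    [-110, -100, -105, -112], [-112, -98, -108, -115], [-108, -102, -104, -110],
]
train_y = [1, 1, 1, 0, 0, 0]
weights = train_logistic(train_x, train_y)

# Unlabelled samples without accurate positions.
inside_guess = predict(weights, [-69, -86, -91, -97])
outside_guess = predict(weights, [-111, -99, -106, -113])
```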

In an embodiment, as shown in fig. 15, based on the embodiment shown in fig. 10, the statistical system 100 further includes:

an extension module 60, configured to locate the historical UE reported information and to extend the designated area to obtain an extended area;

and a screening module 70, configured to screen out historical UE reporting information that falls within the extended area.

After the coarse screening, the set of sample points to be processed is greatly reduced compared with the original sample set covering the whole city. A feature classification model is then constructed for the sample points that pass the coarse screening, the sample points falling in the designated area are determined, and people flow statistics are performed. In this way, the computational complexity can be effectively reduced without reducing the statistical accuracy, which improves practicability.
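The coarse screening can be sketched as below. Extending the area by growing its axis-aligned bounding box with a fixed margin, and the `x`/`y` fields holding the rough position of each record, are assumptions made for this example; the description does not specify how the extended area is formed.

```python
def expand_bbox(polygon, margin):
    """Axis-aligned bounding box of the polygon, grown by `margin` on each side."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)


def coarse_screen(records, bbox):
    """Keep only records whose rough position lies inside the extended box."""
    x_min, y_min, x_max, y_max = bbox
    return [
        r for r in records
        if x_min <= r["x"] <= x_max and y_min <= r["y"] <= y_max
    ]


# Hypothetical designated area and roughly located historical records.
designated = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
extended = expand_bbox(designated, margin=1.0)
candidates = coarse_screen(
    [
        {"ue": "a", "x": 2.0, "y": 2.0},    # inside the designated area
        {"ue": "b", "x": 4.5, "y": 2.0},    # just outside, kept by the margin
        {"ue": "c", "x": 10.0, "y": 10.0},  # far away, dropped
    ],
    extended,
)
```

Only the records that survive this screening would then be passed on to model construction and classification.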

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A people flow rate statistical method is characterized by comprising the following steps:

2. The people flow rate statistical method according to claim 1, wherein the step of obtaining historical UE reported information, extracting features from the historical UE reported information, and constructing a feature classification model by combining tags falling inside and outside an area comprises:

and extracting a main service cell ID list and an adjacent service cell ID list covering the specified area, and constructing a main adjacent area sequence set by combining labels falling inside and outside the specified area.

3. The people flow rate statistical method according to claim 2, wherein the step of classifying whether the collected UE reported information samples that do not include an accurate location fall in a designated area according to the feature classification model comprises:

if the ID of the main service cell of the information sample reported by the UE is not in the ID list of the main service cell, judging that the information sample reported by the UE falls outside the designated area;

4. The people flow rate statistical method according to claim 1, wherein the step of obtaining historical UE reported information, extracting features from the historical UE reported information, and constructing a feature classification model by combining tags falling inside and outside an area comprises:

5. The people flow rate statistical method according to claim 4, wherein the step of classifying whether the collected UE reported information samples without accurate positions fall in a designated area according to the feature classification model comprises:

6. A statistical system, characterized in that the statistical system comprises:

a construction module, configured to acquire historical UE (user equipment) reported information, extract features from the historical UE reported information, and construct a feature classification model by combining tags falling inside and outside an area;

the classification processing module is used for classifying whether the collected UE reported information sample without the accurate position falls in a designated area or not according to the characteristic classification model;

7. The statistical system of claim 6, wherein the construction module comprises:

8. The statistical system of claim 7, wherein the classification processing module comprises:

the classification unit is used for classifying the collected UE reported information samples without accurate positions according to the constructed main adjacent region sequence set;

9. The statistical system of claim 6, wherein the construction module comprises:

the preprocessing unit is used for preprocessing the acquired signal strength of the main serving cell and the adjacent serving cells;

10. The statistical system of claim 9, wherein the classification processing module comprises: