CN113133035B

CN113133035B - LTE high-load cell discrimination method and system

Info

Publication number: CN113133035B
Application number: CN202110490568.6A
Authority: CN
Inventors: 吴宇庆; 徐俊凯; 刘飞浪
Original assignee: HUBEI POST TELECOMMUNICATION PLANNING DESIGN CO Ltd
Current assignee: HUBEI POST TELECOMMUNICATION PLANNING DESIGN CO Ltd
Priority date: 2018-06-06
Filing date: 2018-06-06
Publication date: 2022-07-22
Anticipated expiration: 2038-06-06
Also published as: CN113133035A; CN108777870B; CN108777870A

Abstract

The invention relates to a method and system for screening high-load cells, belongs to the technical field of communication, and in particular relates to a method and system for screening LTE high-load cells. With the invention, the high-load cells to be expanded can be screened out using only the wireless performance statistical data extracted from the LTE network management system. Compared with the prior art, the method has two positive effects. First, its error is at least one order of magnitude smaller than that of the traditional algorithm. Second, it is a non-parametric algorithm: no threshold needs to be specified manually, and the result is derived directly from the wireless performance statistical data, which makes it more objective.

Description

LTE high-load cell discrimination method and system

Technical Field

The invention relates to a high-load cell screening method and system, belongs to the technical field of communication, and particularly relates to an LTE high-load cell screening method and system.

Background

With the rapid construction and development of domestic LTE networks, the demand for network capacity analysis and optimization keeps growing, and the core task is to identify the LTE high-load cells to be expanded. For example, an operator plans capacity expansion based on channel utilization, with reference to equipment carrying capacity, the number of effective RRC connected users, and cell throughput. It analyzes the performance statistical data extracted from the network management system, and a cell is deemed "high load, to be expanded" when it meets the following conditions:

Threshold one (large traffic): the busy-hour average downlink PRB utilization of the cell is greater than 50%, and the busy-hour cell throughput is greater than 6 GB;

Threshold two (multi-user): the busy-hour average downlink PRB utilization of the cell is greater than 50%, and the maximum number of effective RRC connections is greater than 200;

Statistical condition: the big data platform extracts a full month of data each month; a cell whose busy-hour statistics reach expansion threshold one or two on at least 4 days within 7 consecutive days is output to the expansion list.
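The two thresholds above can be sketched as a simple predicate. This is only an illustration of the conventional threshold rule as described in this section; the record type `BusyHourSample` and its field names are assumptions made for the example, not names used by any network management system.

```python
from dataclasses import dataclass


@dataclass
class BusyHourSample:
    """One busy-hour record for a cell (field names are illustrative assumptions)."""
    prb_dl_util_pct: float     # average downlink PRB utilization, in percent
    throughput_gb: float       # busy-hour cell throughput, in GB
    max_rrc_connections: int   # maximum number of effective RRC connections


def meets_expansion_threshold(s: BusyHourSample) -> bool:
    """Return True if the sample reaches threshold one or threshold two."""
    large_traffic = s.prb_dl_util_pct > 50 and s.throughput_gb > 6
    multi_user = s.prb_dl_util_pct > 50 and s.max_rrc_connections > 200
    return large_traffic or multi_user
```

Both thresholds use strict inequalities, so a sample sitting exactly on 50%, 6 GB, or 200 connections does not count as reaching the threshold.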

The traditional algorithm, which identifies high-load cells to be expanded using specified expansion thresholds, is a parametric algorithm. However, because the expansion threshold parameters directly affect several billion dollars of investment every year in China alone, operators generally set these important parameters uniformly nationwide at the group level and issue them to provinces and cities for execution. In actual projects, user models differ greatly between regions because user behavior varies from place to place; in addition, user models fluctuate sharply because of marketing measures such as "unlimited data" packages. A one-size-fits-all parameter is therefore difficult to refine for a particular user group or area, and its error is large.

Therefore, it is a technical problem that needs to be solved urgently at present to improve an intelligent monitoring system in the prior art to meet the requirements of different application scenarios.

Disclosure of Invention

The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.

The invention mainly aims to solve the problems in the prior art that the capacity expansion index is too simplistic and cannot be refined for a user group or a specific area, and provides a method and a system for discriminating LTE high-load cells. With the method and the system, no threshold needs to be specified manually, the result is derived directly from the wireless performance statistical data, and the capacity expansion decision is more accurate.

In order to solve the problems, the scheme of the invention is as follows:

An LTE high-load cell discrimination method comprises the following steps:

step 1, extracting performance index statistical data of the cells in the area to be evaluated, and eliminating invalid data;

step 2, selecting a first index and at least one second index from the performance index statistical data, and calculating the Pearson correlation coefficient between each second index and the first index;

step 3, fitting the first index and the second index with the first index as the abscissa and the second index as the ordinate; obtaining a screening inflection point K based on the mean of the several top-ranked samples of the second index and the corresponding Pearson correlation coefficient;

step 4, rejecting samples whose ordinate is smaller than the ordinate of the inflection point K and whose abscissa is smaller than the abscissa of the inflection point K, and repeating steps 3-4 until the screening inflection point is at the leftmost or rightmost end of the fitted line;

step 5, screening out the high-load cells according to a preset condition based on the final samples.

In at least one embodiment of the present invention, in the step 1, samples with an average number of RRC connected users smaller than a predetermined value are rejected.

In at least one embodiment of the invention, the first index is the average number of RRC connected users; the second index is the downlink PRB average occupancy rate and/or the PDCP layer downlink user plane traffic.

In at least one embodiment of the invention, the mean of the several top-ranked samples of the second index is multiplied by the square r² of the Pearson correlation coefficient to obtain the ordinate of the inflection point, and the abscissa of the point on the fitted curve with that ordinate is taken as the abscissa of the inflection point.

In at least one embodiment of the present invention, in the step 4, the condition for screening the high load cell is:

at least 4 high-load samples within 7 consecutive days;

or

at least 7 high-load samples within 15 consecutive days;

or

at least 14 high-load samples within 30 consecutive days.

An LTE high load cell screening system, comprising:

a network data extraction module, used for extracting the performance index statistical data of the cells in the area to be evaluated and eliminating invalid data;

a correlation coefficient determining module, used for selecting a first index and at least one second index from the performance index statistical data and calculating the Pearson correlation coefficient between each second index and the first index;

a screening inflection point determining module, used for fitting the first index and the second index with the first index as the abscissa and the second index as the ordinate, and for obtaining a screening inflection point K based on the mean of the several top-ranked samples of the second index and the corresponding Pearson correlation coefficient;

a data eliminating module, used for eliminating samples whose ordinate is smaller than the ordinate of the inflection point K and whose abscissa is smaller than the abscissa of the inflection point K; the screening inflection point determining module and the data eliminating module are called repeatedly until the screening inflection point is at the leftmost or rightmost end of the fitted line;

a high-load cell screening module, used for screening out the cells to be expanded according to preset conditions based on the final samples.

In at least one embodiment of the present invention, in the network data extraction module, samples with an average number of RRC connected users smaller than a predetermined value are rejected.

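In at least one embodiment of the invention, the first index is the average number of RRC connected users; the second index is the downlink PRB average occupancy rate and/or the PDCP layer downlink user plane traffic.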

In at least one embodiment of the invention, the ordinate of the inflection point is obtained by multiplying the mean of the several top-ranked samples of the second index by the square r² of the Pearson correlation coefficient, and the abscissa of the point on the fitted curve with that ordinate is taken as the abscissa of the inflection point.

In at least one embodiment of the present invention, in the high load cell screening module, the conditions for screening the high load cell are:

at least 4 high-load samples within 7 consecutive days;

or

at least 7 high-load samples within 15 consecutive days;

or

at least 14 high-load samples within 30 consecutive days.

As can be seen from the above description, the error of the method of the invention is at least one order of magnitude smaller than that of the traditional method; moreover, the method is a nonparametric algorithm that needs no manually specified threshold and derives its result directly from the wireless performance statistical data, which makes it more objective.

Drawings

The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments of the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the disclosure.

Fig. 1 is a scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (1472 high-load samples obtained by the threshold method);

Fig. 2 is a scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (1472 high-load samples obtained by the threshold method);

Fig. 3 is a scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (total data);

Fig. 4 is a scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (total data);

Fig. 5 is a scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (high-load sample list V1);

Fig. 6 is a scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (high-load sample list V1);

Fig. 7 is a scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (high-load sample final list);

Fig. 8 is a scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (high-load sample final list).

Embodiments of the present invention will be described with reference to the accompanying drawings.

Detailed Description

Examples

This embodiment relies on the following principle: under the influence of the scheduling algorithm, the capacity and performance indexes of an LTE network influence and constrain one another. Under low load, the load and throughput indexes are quasi-linearly and positively correlated with the number of effective users; under high load, once resources run short, the relation between the load and throughput indexes and the number of effective users becomes a negative correlation, that is, the cell is in a "saturated" state. Exploiting this characteristic, the embodiment analyzes the wireless performance statistical data, identifies the inflection point at which the indexes change from positive to negative correlation, and identifies the samples in the saturated state, thereby discriminating the high-load cells to be expanded.

The specific steps of this example are as follows:

Step 1, extracting N days of 24-hour performance statistical data for m cells in a certain area from the network management system.

The data should include: load indexes (uplink PRB average occupancy rate and downlink PRB average occupancy rate), cell busy-hour throughput indexes (PDCP layer uplink user plane traffic and PDCP layer downlink user plane traffic), and the average number of RRC connected users.

Description of related indexes:

N: the number of days for which statistics are extracted, typically 7, 15 or 30.

Uplink PRB average occupancy rate: average number of occupied uplink PRBs / number of PRBs in the cell × 100%; it reflects the usage of uplink physical resources.

Downlink PRB average occupancy rate: average number of occupied downlink PRBs / number of PRBs in the cell × 100%; it reflects the usage of downlink physical resources.

Average number of RRC connected users: the average number of simultaneously existing RRC connections. The number of RRC connections simultaneously present in the given cell is sampled at a preset measurement interval and then averaged.

Cell busy-hour throughput: the air interface service traffic, counted separately as the number of bytes of uplink and downlink services; this index reflects the uplink and downlink traffic of the air interface.
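The two ratio-style indexes defined above can be written out as a minimal sketch. The function and parameter names are assumptions chosen for illustration; real counters come from the network management system.

```python
from typing import Sequence


def prb_occupancy_pct(avg_occupied_prbs: float, total_prbs: int) -> float:
    """Average PRB occupancy = average occupied PRBs / PRBs in the cell x 100%."""
    if total_prbs <= 0:
        raise ValueError("total_prbs must be positive")
    return avg_occupied_prbs / total_prbs * 100.0


def average_rrc_users(sampled_counts: Sequence[int]) -> float:
    """Mean of the simultaneous RRC connection counts sampled at a fixed interval."""
    if not sampled_counts:
        raise ValueError("need at least one sampled count")
    return sum(sampled_counts) / len(sampled_counts)


# Example: 50 of 100 PRBs occupied on average gives an occupancy of 50.0 percent.
occupancy = prb_occupancy_pct(50, 100)
```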

Step 2, preprocessing.

Eliminate samples whose average number of RRC connected users is 0. Hereinafter, the term "total data" refers to the N×24-hour performance statistical data of m cells in a certain area after the samples whose average number of RRC connected users is 0 have been removed.
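A minimal sketch of this preprocessing step follows. The dictionary keys (`avg_rrc_users`, `prb_dl_pct`) are hypothetical names for the extracted counters.

```python
from typing import Dict, List

Sample = Dict[str, float]  # hypothetical record layout, one entry per cell-hour


def preprocess(samples: List[Sample]) -> List[Sample]:
    """Keep only samples whose average number of RRC connected users is non-zero."""
    return [s for s in samples if s["avg_rrc_users"] != 0]


raw = [
    {"avg_rrc_users": 0.0, "prb_dl_pct": 0.1},
    {"avg_rrc_users": 12.5, "prb_dl_pct": 18.0},
]
total_data = preprocess(raw)  # only the second record remains
```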

Step 3, calculating the Pearson correlation coefficients of the total data.

The Pearson correlation coefficient is a linear correlation coefficient, denoted r, which describes the degree of linear correlation between two variables. The value of r lies between -1 and +1. If r > 0, the two variables are positively correlated, that is, the larger the value of one variable, the larger the value of the other; if r < 0, the two variables are negatively correlated, that is, the larger the value of one variable, the smaller the value of the other.

r² is the proportion of variation that can be explained by the fitted line (linear relationship). For example, if r² of y and x is 0.7, then 70% of the variation of y can be explained by the best fit line of x and y, and the remaining 30% is due to other factors. A larger absolute value of r indicates a stronger correlation.
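For reference, the coefficient r and its square can be computed directly from the definition. This is a standard-library sketch; the function name `pearson_r` is chosen for this example.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear, so r is 1 up to rounding
r_squared = r * r                          # share of variation explained by the fitted line
```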

Step 4, load index analysis based on the total data

Generate a scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" from the samples in the total data, together with a linear fitting line.

Calculate the mean of the 5 largest statistical values of "downlink PRB average occupancy rate" and multiply it by the square r² of the Pearson correlation coefficient obtained in step 3; the product is taken as the ordinate of point K0. The abscissa of K0 is calculated from the fitting line. See fig. 3.

The significance of point K0 is that it defines an ideal case: samples to the lower left of point K0 ("downlink PRB average occupancy rate < K0 ordinate" and "average number of RRC connected users < K0 abscissa") all follow the fitting line (linear relationship), that is, they are all low-load samples; samples above or to the right of point K0 ("downlink PRB average occupancy rate > K0 ordinate" or "average number of RRC connected users > K0 abscissa") do not follow the fitting line (linear relationship) but are influenced by other factors, that is, they are all high-load samples.
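The calculation of a screening point such as K0 can be sketched as follows, assuming an ordinary least-squares fitting line. The function name `screening_point` and the parameter `top_n` are illustrative; the text fixes the number of top values at 5.

```python
from typing import Sequence, Tuple


def screening_point(x: Sequence[float], y: Sequence[float],
                    top_n: int = 5) -> Tuple[float, float]:
    """Return (abscissa, ordinate) of the screening point K for one index pair."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0 or sxy == 0:
        raise ValueError("need a non-constant, correlated pair of indexes")

    slope = sxy / sxx                      # least-squares line y = slope * x + intercept
    intercept = mean_y - slope * mean_x
    r_squared = sxy * sxy / (sxx * syy)    # square of the Pearson coefficient

    top = sorted(y, reverse=True)[:top_n]  # the top_n largest values of the second index
    k_y = sum(top) / len(top) * r_squared  # ordinate of K
    k_x = (k_y - intercept) / slope        # abscissa read off the fitting line
    return k_x, k_y


# Toy data: y = 2x exactly, so r² = 1 and K lies on the line at the top-5 mean.
xs = list(range(1, 11))
ys = [2 * v for v in xs]
k_x, k_y = screening_point(xs, ys)
```

With this toy data the five largest ordinates are 20, 18, 16, 14 and 12, so the ordinate of K is their mean and the abscissa follows from the line.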

Step 5, throughput analysis based on the total data

Generate a scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" from the samples in the total data, together with a fitting line (linear relationship).

Calculate the mean of the 5 largest statistical values of "PDCP layer downlink user plane traffic" and multiply it by the square r² of the Pearson correlation coefficient obtained in step 3; the product is taken as the ordinate of point K1. The abscissa of K1 is calculated from the linear fitting line. See fig. 4.

The significance of point K1 is that it defines an ideal case: samples to the lower left of point K1 ("PDCP layer downlink user plane traffic < K1 ordinate" and "average number of RRC connected users < K1 abscissa") all follow the fitting line (linear relationship), that is, they are all low-load samples; samples above or to the right of point K1 ("PDCP layer downlink user plane traffic > K1 ordinate" or "average number of RRC connected users > K1 abscissa") do not follow the fitting line (linear relationship) but are influenced by other factors, that is, they are all high-load samples.

Step 6, obtaining a high load sample list

Remove from the statistical data of the area the low-load samples satisfying "downlink PRB average occupancy rate < K0 ordinate" and "PDCP layer downlink user plane traffic < K1 ordinate" and "average number of RRC connected users < Min(K0 abscissa, K1 abscissa)"; the remaining samples form the high-load sample list.
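The removal rule of step 6 can be sketched as a filter. The sample keys and the `(abscissa, ordinate)` tuple layout for K0 and K1 are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, float]     # hypothetical record layout
Point = Tuple[float, float]   # (abscissa, ordinate) of a screening point


def drop_low_load(samples: List[Sample], k0: Point, k1: Point) -> List[Sample]:
    """Remove samples lying below both screening points and left of the smaller abscissa."""
    x_limit = min(k0[0], k1[0])

    def is_low_load(s: Sample) -> bool:
        return (s["prb_dl_pct"] < k0[1]
                and s["pdcp_dl_traffic"] < k1[1]
                and s["avg_rrc_users"] < x_limit)

    return [s for s in samples if not is_low_load(s)]


data = [
    {"avg_rrc_users": 10, "prb_dl_pct": 20, "pdcp_dl_traffic": 1.0},  # low load
    {"avg_rrc_users": 10, "prb_dl_pct": 80, "pdcp_dl_traffic": 1.0},  # high PRB occupancy
    {"avg_rrc_users": 90, "prb_dl_pct": 20, "pdcp_dl_traffic": 1.0},  # many users
]
high_load = drop_low_load(data, k0=(50.0, 60.0), k1=(40.0, 3.0))
```

A sample is dropped only when all three conditions hold, so a sample that exceeds any one of the limits stays in the high-load list.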

Step 7, carrying out load analysis based on the high load sample list

A scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (fig. 5) and a LOESS fitting line (α = 0.5, a nonlinear relationship) are generated from the high-load sample list.

LOESS refers to locally weighted polynomial fitting: around each point of the data set a low-degree polynomial is fitted, with samples closer to the point being fitted given higher weight and samples farther away given lower weight.
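A compact LOESS sketch is given below to make the idea concrete. It assumes a local linear (degree 1) fit with tricube weights and treats α as the fraction of samples in each neighbourhood; the text does not state which degree or weight function was actually used, so these are illustrative choices. `peak_of_fit` is a hypothetical helper that locates the highest point of the fitted line and reports whether it lies at an end of the interval.

```python
import math
from typing import List, Sequence, Tuple


def loess(x: Sequence[float], y: Sequence[float], alpha: float = 0.5) -> List[float]:
    """Fitted value at every x[i]: local linear regression with tricube weights."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    q = min(n, max(2, math.ceil(alpha * n)))    # samples per local neighbourhood
    fitted = []
    for x0 in x:
        dist = [abs(xi - x0) for xi in x]
        d = sorted(dist)[q - 1]                 # distance to the q-th nearest sample
        if d == 0:                              # neighbourhood collapses to one x value
            same = [yi for di, yi in zip(dist, y) if di == 0]
            fitted.append(sum(same) / len(same))
            continue
        # Tricube weights: 1 at x0, falling to 0 at distance d and beyond.
        w = [(1 - min(di / d, 1.0) ** 3) ** 3 for di in dist]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))   # weighted line evaluated at x0
    return fitted


def peak_of_fit(x: Sequence[float], fitted: Sequence[float]) -> Tuple[float, float, bool]:
    """Return (x_peak, y_peak, at_edge) for the highest point of the fitted line."""
    i = max(range(len(fitted)), key=fitted.__getitem__)
    return x[i], fitted[i], x[i] in (min(x), max(x))
```

A peak in the interior of the interval suggests that low-load samples are still present; a peak at the left or right end corresponds to the stopping condition used later in the procedure.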

If the highest point (point K0') of the generated fitting line is not at the left or right end of the interval, the fitting line contains a segment in which "downlink PRB average occupancy rate" is linearly related to "average number of RRC connected users". This is because the "high-load sample list V1" still contains some low-load samples (located to the lower left of point K0' in fig. 5, that is, "downlink PRB average occupancy rate < K0' ordinate" and "average number of RRC connected users < K0' abscissa").

Step 8, carrying out throughput analysis based on high load sample list

A scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (fig. 6) and a LOESS fitting line (α = 0.5, a nonlinear relationship) are generated from the samples in the high-load sample list.

If the highest point (point K1') of the fitting line is not at the left or right end of the interval, the fitting line contains a segment in which "PDCP layer downlink user plane traffic" is linearly related to "average number of RRC connected users", which indicates that some low-load samples are present (located to the lower left of point K1' in fig. 6, that is, "PDCP layer downlink user plane traffic < K1' ordinate" and "average number of RRC connected users < K1' abscissa").

Step 9, correcting to obtain a second high-load sample list

Remove from the total data the low-load samples satisfying "downlink PRB average occupancy rate < K0' ordinate" and "PDCP layer downlink user plane traffic < K1' ordinate" and "average number of RRC connected users < Min(K0' abscissa, K1' abscissa)". The correction yields a second high-load sample list.

Step 10, repeating steps 7-9 to correct the high-load sample list until, for both the fitting line of "downlink PRB average occupancy rate" versus "average number of RRC connected users" and the fitting line of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users", the highest point is at the left or right end of the interval.

A third high-load sample list is obtained. Its samples are used to generate the scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (fig. 7) and the scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (fig. 8). The highest points of both fitting lines are at the left end of the interval, that is, "downlink PRB average occupancy rate" and "PDCP layer downlink user plane traffic" are negatively correlated with "average number of RRC connected users", which is the typical radio performance characteristic of a high-load state.
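The repetition of steps 7-9 can be sketched as a loop. The sketch assumes, following step 9, that each correction filters the total data again with the newest peak points. `peak_of` is a hypothetical callback that fits one index against the average number of RRC connected users (for example with LOESS) and returns the peak of the fitted line; the sample keys and the `max_rounds` safety limit are also assumptions.

```python
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, float]
Peak = Tuple[float, float, bool]  # (abscissa, ordinate, peak is at an end of the interval)


def refine_high_load(total_data: List[Sample],
                     first_list: List[Sample],
                     peak_of: Callable[[List[Sample], str], Peak],
                     max_rounds: int = 50) -> List[Sample]:
    """Correct the high-load sample list until both fitted peaks sit at an interval end."""
    current = list(first_list)
    for _ in range(max_rounds):
        x0, y0, edge0 = peak_of(current, "prb_dl_pct")
        x1, y1, edge1 = peak_of(current, "pdcp_dl_traffic")
        if edge0 and edge1:
            break                             # stopping condition of step 10
        x_limit = min(x0, x1)
        corrected = [
            s for s in total_data
            if not (s["prb_dl_pct"] < y0
                    and s["pdcp_dl_traffic"] < y1
                    and s["avg_rrc_users"] < x_limit)
        ]
        if corrected == current:
            break                             # nothing changed; avoid looping forever
        current = corrected
    return current
```

Passing the peak finder in as a callback keeps the loop independent of the fitting method, so the same sketch works with any smoother that reports a peak and whether it lies at an interval end.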

Step 11, identifying 'high load cell to be expanded'

Statistical condition: when a cell has at least 4 samples in the third (final) high-load sample list within 7 consecutive days (or at least 7 within 15 consecutive days, or at least 14 within 30 consecutive days), the cell is a cell to be expanded.
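The statistical condition can be checked with a sliding window over day indexes. This sketch assumes each high-load sample of a cell is tagged with an integer day index and counts samples, as the condition is worded; if only one busy-hour sample per day is intended, the day indexes would be deduplicated first. The function and parameter names are illustrative.

```python
from typing import Iterable


def is_cell_to_expand(sample_days: Iterable[int],
                      window: int = 7, min_samples: int = 4) -> bool:
    """True if some run of `window` consecutive days holds at least `min_samples` samples."""
    days = sorted(sample_days)
    start = 0
    for end in range(len(days)):
        # Shrink the window until it spans fewer than `window` days.
        while days[end] - days[start] >= window:
            start += 1
        if end - start + 1 >= min_samples:
            return True
    return False


# Samples on days 1, 2, 4 and 7 all fall inside the 7-day window covering days 1 to 7.
qualifies = is_cell_to_expand([1, 2, 4, 7])
```

The alternative conditions correspond to calling the function with `window=15, min_samples=7` or `window=30, min_samples=14`.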

In summary, the traditional algorithm, which identifies high-load cells to be expanded using specified expansion thresholds, is a parametric algorithm. However, because the expansion threshold parameters directly affect several billion dollars of investment every year in China alone, operators generally set these important parameters uniformly nationwide at the group level and issue them to provinces and cities for execution. In actual projects, user models differ greatly between regions because user behavior varies from place to place; in addition, user models fluctuate sharply because of marketing measures such as "unlimited data" packages. A one-size-fits-all parameter is therefore difficult to refine for a particular user group or area, and its error is large.

The present embodiment will be described in detail with reference to specific application examples.

First, a conventional expansion method will be described.

For example, when an operator carried out an expansion project at a university, the 7×24-hour wireless performance statistical data (10447 samples) of each cell of the university were extracted, and the above threshold algorithm screened out 1472 samples reaching expansion threshold one or two. However, among the 1472 high-load samples obtained by the threshold method, 300 turn out to be low-load samples:

In the scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (fig. 1), the highest point (point K0) of the fitting line is not at the left or right end of the interval, and the fitting line contains a segment in which "downlink PRB average occupancy rate" is positively (approximately linearly) correlated with "average number of RRC connected users". This is because some low-load samples are present (located to the lower left of point K0, that is, "downlink PRB average occupancy rate < K0 ordinate" and "average number of RRC connected users < K0 abscissa"). Likewise, in the scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (fig. 2), the highest point (point K1) of the fitting line is not at the left or right end of the interval, and the fitting line contains a segment in which "PDCP layer downlink user plane traffic" is positively (approximately linearly) correlated with "average number of RRC connected users", because some low-load samples are present (located to the lower left of point K1, that is, "PDCP layer downlink user plane traffic < K1 ordinate" and "average number of RRC connected users < K1 abscissa").

The "high-load sample final list" has 1596 samples. In the scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (fig. 7) and the scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (fig. 8), the highest point of each fitting line is at the left end of the interval, and "downlink PRB average occupancy rate" and "PDCP layer downlink user plane traffic" are negatively correlated with "average number of RRC connected users", which is the typical radio performance characteristic of a high-load state. Therefore, the threshold method misses 1596 - (1472 - 300) = 424 high-load samples.

Therefore, the error of the high-load samples obtained by the threshold method is (300 + 424)/1596 = 45.4%.

According to the statistical condition "a cell whose busy-hour statistics reach expansion threshold one or two on at least 4 days within 7 consecutive days is output to the expansion list", the expansion list of the threshold method (based on the 1472 high-load samples) contains 30 high-load cells to be expanded. However, compared with the expansion list based on the "high-load sample final list" (1596 high-load samples), the expansion list of the threshold method contains 3 misjudged and 3 missed cells to be expanded, so the error of the expansion scheme is 20%.
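The error figures quoted for the threshold method can be re-derived with a few lines of arithmetic. The variable names are illustrative, and taking the 30 listed cells as the denominator of the expansion-list error is an assumption that happens to reproduce the stated 20%.

```python
threshold_samples = 1472   # high-load samples flagged by the threshold method
false_positives = 300      # low-load samples found among them
final_samples = 1596       # samples in the high-load sample final list

# High-load samples the threshold method failed to flag.
missed = final_samples - (threshold_samples - false_positives)

# Sample error: misjudged plus missed samples, relative to the final list.
sample_error = (false_positives + missed) / final_samples

cells_listed = 30          # cells in the threshold method's expansion list
misjudged_cells, missed_cells = 3, 3
list_error = (misjudged_cells + missed_cells) / cells_listed  # assumed denominator

print(f"missed={missed}, sample error={sample_error:.1%}, list error={list_error:.0%}")
```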

The following describes a method employing this embodiment:

1. Extract the N×24-hour performance statistical data of m cells in a certain area from the network management system.

The data shall include: load indexes (uplink PRB average occupancy rate and downlink PRB average occupancy rate), cell busy-hour throughput indexes (PDCP layer uplink user plane traffic and PDCP layer downlink user plane traffic), and the average number of RRC connected users.

2. Preprocessing.

Eliminate samples whose average number of RRC connected users is 0; the resulting "statistical data" contain 9412 samples.

3. Calculate the Pearson correlation coefficients of the "statistical data".

The correlations are significant at the 0.01 level (two-sided).

4. Load index analysis of the "statistical data"

Generate a scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" from the "statistical data", together with a fitting line (linear relationship).

Calculate the mean of the 5 largest statistical values of "downlink PRB average occupancy rate" and multiply it by the square r² of the Pearson correlation coefficient obtained in step 3; the product is taken as the ordinate of point K0. The abscissa of K0 is calculated from the fitting line. See fig. 3.

5. Throughput analysis of the "statistical data"

Generate a scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" from the "statistical data", together with a fitting line (linear relationship).

Calculate the mean of the 5 largest statistical values of "PDCP layer downlink user plane traffic" and multiply it by the square r² of the Pearson correlation coefficient obtained in step 3; the product is taken as the ordinate of point K1. The abscissa of K1 is calculated from the fitting line. See fig. 4.

6. Obtain the "high-load sample list V1"

Remove from the "statistical data" of the area the low-load samples satisfying "downlink PRB average occupancy rate < K0 ordinate" and "PDCP layer downlink user plane traffic < K1 ordinate" and "average number of RRC connected users < Min(K0 abscissa, K1 abscissa)", obtaining the "high-load sample list V1", which contains 1680 samples.

7. Load analysis of the "high-load sample list V1"

A scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (fig. 5) and a LOESS fitting line (α = 0.5, a nonlinear relationship) are generated from the "high-load sample list V1".

The highest point (point K0') of the generated fitting line is not at the left or right end of the interval, and the fitting line contains a segment in which "downlink PRB average occupancy rate" is linearly related to "average number of RRC connected users". This is because the "high-load sample list V1" still contains some low-load samples (located to the lower left of point K0' in fig. 5, that is, "downlink PRB average occupancy rate < K0' ordinate" and "average number of RRC connected users < K0' abscissa").

8. Throughput analysis of the "high-load sample list V1"

A scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (fig. 6) and a LOESS fitting line (α = 0.5, a nonlinear relationship) are generated from the "high-load sample list V1".

The highest point (point K1') of the fitting line is not at the left or right end of the interval, and the fitting line contains a segment in which "PDCP layer downlink user plane traffic" is linearly related to "average number of RRC connected users", which indicates that some low-load samples are present (located to the lower left of point K1' in fig. 6, that is, "PDCP layer downlink user plane traffic < K1' ordinate" and "average number of RRC connected users < K1' abscissa").

9. Correct to obtain the "high-load sample list V2"

Remove from the "statistical data" the low-load samples satisfying "downlink PRB average occupancy rate < K0' ordinate" and "PDCP layer downlink user plane traffic < K1' ordinate" and "average number of RRC connected users < Min(K0' abscissa, K1' abscissa)". The correction yields the "high-load sample list V2".

10. Repeat steps 7-9 to correct the high-load sample list until, for both the fitting line of "downlink PRB average occupancy rate" versus "average number of RRC connected users" and the fitting line of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users", the highest point is at the left or right end of the interval.

The "high-load sample final list" is obtained; it contains 1596 samples. In the scatter diagram of "downlink PRB average occupancy rate" versus "average number of RRC connected users" (fig. 7) and the scatter diagram of "PDCP layer downlink user plane traffic" versus "average number of RRC connected users" (fig. 8), the highest point of each fitting line is at the left end of the interval, and "downlink PRB average occupancy rate" and "PDCP layer downlink user plane traffic" are negatively correlated with "average number of RRC connected users", which is the typical radio performance characteristic of a high-load state.

The sample error e of the "final list of high load samples" can be quantitatively estimated as follows:

The sample error of the "high-load sample list V1" is calculated first.

In the "high-load sample list V1", the low-load samples ("downlink PRB average occupancy rate < K0' ordinate" and "PDCP layer downlink user plane traffic < K1' ordinate" and "average number of RRC connected users < Min(K0' abscissa, K1' abscissa)") are misjudgments, and their number is 303; the high load samples are 1680-.

Sample error "high load sample list v 1":

sample error "high load sample Final List":

Since the "high-load sample list V1" is calculated from the 5 largest statistical values of "downlink PRB average occupancy rate" and "PDCP layer downlink user plane traffic", the "total number of samples of the high-load sample final list" in the above formula is divided by 5.

11. Discrimination of the "high-load cell to be expanded"

The statistical condition is as follows: when a cell has at least 4 samples in the "high load sample final list" within 7 consecutive days (or at least 7 within 15 consecutive days, or at least 14 within 30 consecutive days), the cell is a cell to be expanded.
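The statistical condition can be checked with a sliding window over the dates of a cell's samples in the final list. The following is a minimal sketch, assuming each entry of `sample_dates` is the date of one sample of the cell; the names `RULES` and `needs_expansion` are hypothetical.

```python
from datetime import date

# (window length in days, minimum number of samples), as stated in the text.
RULES = ((7, 4), (15, 7), (30, 14))


def needs_expansion(sample_dates, rules=RULES):
    """Return True if any rule is met within some run of consecutive days."""
    days = sorted(sample_dates)
    for window, minimum in rules:
        start = 0
        for end, day in enumerate(days):
            # Advance the left edge until the span is at most `window` days.
            while (day - days[start]).days >= window:
                start += 1
            if end - start + 1 >= minimum:
                return True
    return False


# Illustrative dates only (hypothetical cell data).
hit = needs_expansion([date(2018, 6, 1), date(2018, 6, 2),
                       date(2018, 6, 4), date(2018, 6, 7)])
miss = needs_expansion([date(2018, 6, 1), date(2018, 6, 2),
                        date(2018, 6, 4), date(2018, 6, 8)])
```

In the first call the four samples fall within 1 to 7 June, so the 7-day rule is met; in the second call the fourth sample falls on the eighth day, so no window holds four samples and the longer rules are not met either.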

The expansion list is derived from the "high load sample final list" of 1596 high-load samples, and the error of the expansion list is as follows:

In summary, the conventional algorithm, which discriminates the high-load cell to be expanded by means of a specified expansion threshold, is a parametric algorithm. When the threshold method is applied in this embodiment, the error of the high-load samples is 45.4%; under the statistical condition "the busy-hour statistics reach the first threshold or the second threshold on at least 4 of 7 consecutive days", the error of the output expansion list is 20%.

With the nonparametric algorithm of the invention, the error of the high-load samples is 1.8%; under the corresponding statistical condition, "a cell has at least 4 samples in the high load sample final list within 7 consecutive days", the error of the output expansion list is 0.9%.

Compared with the traditional algorithm, the error of the method of the invention is at least one order of magnitude smaller.
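As a quick check of the order-of-magnitude statement, the ratios of the error figures quoted for this embodiment (threshold method versus nonparametric method) work out as:

\[
\frac{45.4\%}{1.8\%} \approx 25.2, \qquad \frac{20\%}{0.9\%} \approx 22.2
\]

Both ratios exceed 10, for the high-load samples and for the output expansion list respectively.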

In conclusion, the nonparametric algorithm can discriminate the high-load cell to be expanded using only the wireless performance statistical data extracted by the LTE network management system. Compared with the prior art, the method has two positive effects:

Compared with the traditional algorithm, the error of the method of the invention is at least one order of magnitude smaller.

The method of the invention is a nonparametric algorithm: it does not require a manually specified threshold, derives its result directly from the wireless performance statistical data, and is therefore more objective.

In this embodiment, while, for purposes of simplicity of explanation, the methodologies are shown and described as a series of acts, it is to be understood and appreciated that the methodologies are not limited by the order of acts, as some acts may, in accordance with one or more embodiments, occur in different orders and/or concurrently with other acts from that shown and described herein or not shown and described herein, as would be understood by one skilled in the art.

Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.

In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a web site, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

It is noted that references in the specification to "one embodiment," "an example embodiment," "some embodiments," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.

The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. An LTE high-load cell discrimination method, characterized by comprising the following steps:

step 1, extracting statistical data of cell performance indexes of an area to be evaluated, and rejecting samples with the average RRC connection user number of 0;

step 3, generating a scatter diagram and a LOESS fitted line of the first index and the second index by taking the first index as the abscissa and the second index as the ordinate; obtaining a screening inflection point K based on the mean value of a plurality of top-ranked samples of the second index and the corresponding Pearson correlation coefficient;

step 4, rejecting samples whose ordinate is smaller than the ordinate of the inflection point K and whose abscissa is smaller than the abscissa of the inflection point K, and repeating steps 3-4 until the screening inflection point is at the leftmost end or the rightmost end of the fitted line;

step 5, screening out the high-load cells according to preset conditions based on the final samples.

2. The LTE high-load cell discrimination method according to claim 1, wherein in step 1, samples with an average RRC connection user number smaller than a predetermined value are rejected.

3. The method for discriminating the LTE high-load cell according to claim 1, wherein the first index is an average number of RRC connected users; the second index is the average occupancy rate of the downlink PRB and/or the downlink user plane traffic of the PDCP layer.

4. The LTE high-load cell discrimination method according to claim 1, wherein the mean value of a plurality of top-ranked samples of the second index is multiplied by the square r² of the Pearson correlation coefficient to obtain the ordinate of the inflection point, and the abscissa of the corresponding point on the fitted curve is taken as the abscissa of the inflection point.

5. The LTE high-load cell discrimination method according to claim 1, wherein in said step 4, the conditions for screening the high-load cell are:

there are at least 4 high-load samples within 7 consecutive days;

or

there are at least 7 high-load samples within 15 consecutive days;

or

there are at least 14 high-load samples within 30 consecutive days.

6. An LTE high-load cell screening system, characterized by comprising:

a screening inflection point determining module, used for generating a scatter diagram and a LOESS fitted line of the first index and the second index by taking the first index as the abscissa and the second index as the ordinate, and for obtaining a screening inflection point K based on the mean value of a plurality of top-ranked samples of the second index and the corresponding Pearson correlation coefficient;

7. The LTE high-load cell screening system according to claim 6, wherein in the network data extraction module, samples with an average RRC connection user number smaller than a predetermined value are rejected.

8. The LTE high-load cell screening system according to claim 6, wherein the first index is an average number of RRC connected users; the second index is the average occupancy rate of the downlink PRB and/or the downlink user plane traffic of the PDCP layer.

9. The LTE high-load cell screening system according to claim 6, wherein the mean value of a plurality of top-ranked samples of the second index is multiplied by the square r² of the Pearson correlation coefficient to obtain the ordinate of the inflection point, and the abscissa of the corresponding point on the fitted curve is taken as the abscissa of the inflection point.

10. The LTE high-load cell screening system according to claim 6, wherein in the high-load cell screening module, the conditions for screening the high-load cell are:

there are at least 4 high-load samples within 7 consecutive days;

or

there are at least 7 high-load samples within 15 consecutive days;

or

there are at least 14 high-load samples within 30 consecutive days.