WO2018014674A1

WO2018014674A1 - Method, apparatus, and system for determining degree of association of input and output of black box system

Info

Publication number: WO2018014674A1
Application number: PCT/CN2017/087940
Authority: WO
Inventors: 孟晟; 施风; 眭鸿飞; 赵黎波; 王士刚
Original assignee: 中兴通讯股份有限公司
Priority date: 2016-07-20
Filing date: 2017-06-12
Publication date: 2018-01-25
Also published as: CN107645393A

Abstract

A method, an apparatus, and a system for determining the degree of association of the input and output of a black box system, the method comprising: matching service quality indicator KQI data in the black box system with key performance indicator KPI data to form a data vector space; on the basis of the service type of the KQI data, clustering the KPI data; breaking down the data vector space to separate out association features of the KPI data to the KQI data, and calculating a normalised degree of association of the KPI data to the KQI data; on the basis of the normalised degree of association, respectively determining KPI data associated with each KQI data, and calculating an association weighting of the associated KPI to the KQI data; and determining that the associated KPI data and association weighting are the degree of association of the input and output of the black box system. The present invention solves the problems in the prior art of low precision and finding incomplete items of association when determining the degree of association of the input and output of a black box system.

Description

Method, device and system for determining input and output correlation degree of black box system

Technical field

The present invention relates to the field of network and communication technologies, and in particular, to a method, device and system for determining the degree of input and output association of a black box system.

Background technique

When a service quality indicator KQI (Key Quality Indicator) deteriorates in a service network (including a communication system), it is desirable to identify the key performance indicators (KPIs) that cause the KQI deterioration so that parameters can be adjusted or the network optimized. The relationship between the KQI and the KPIs is a black box system. FIG. 1 is a schematic diagram of the black box system between the KQI and the KPIs according to the related art of the present invention. Traditional solutions rely mainly on manual experience. Faced with numerous system or local KPIs, reported parameters, alarm information, auxiliary messages and so on (usually as many as hundreds), it is very difficult to troubleshoot problems in a timely and accurate manner. Therefore, cracking the black box association between KQI and KPI has always been a hot topic in the industry.

The theoretical basis of the related art is classical statistics, which assumes that the association between KQI and KPI is a fixed function. Methods such as parameter estimation and regression analysis are used to fit the degree of association (that is, KQI=f(KPI)), but this approach cannot reach a correct conclusion and cannot explain the commonly observed situation in which "the KPI is good but the KQI is poor." FIG. 2 is a schematic diagram showing this phenomenon in a scatter plot according to the related art of the present invention. As shown in FIG. 2, the fitting/regression of the existing method does not correctly reflect the relationship between KQI and KPI.

Limited by this theoretical basis, the existing method can only select a small number of KPI items that "should be relevant", manually specify the correlation model, assign the weights between the KPIs based on experience, and then calculate the degree of association. This causes the following problems:

1) Not comprehensive. Existing methods cannot simultaneously calculate the degree of association between all KPIs to be evaluated and the KQI, nor can they cope with situations where the KPI list is variable. This typically shows up as the omission of strongly associated KPI items.

2) Not accurate. Existing methods cannot accurately quantify the degree of association between all KPIs to be evaluated and the current KQI. This typically shows up as wrong association conclusions.

3) Because of 1) and 2), the existing method cannot find the complete orthogonal KPI set of a KQI, and thus cannot accurately guide parameter optimization or value-added service mining.

4) There are two methods for KPI clustering in the related art: manual division, and simple clustering algorithms based only on Euclidean distance (such as K-Means). The latter require the number of clusters to be specified manually and do not take business characteristics or engineering significance into account.

5) Because of 1) to 4), the related art cannot quantitatively use the thresholds of multiple KPIs to jointly decide whether the KQI exceeds its limit, or give a confidence level for KQI overrun. Furthermore, it is impossible to optimize or pre-optimize the network parameters according to multi-dimensional KPI thresholds when KQI data is temporarily missing.

In view of the above problems in the related art, no effective solution has been found yet.

Summary of the invention

The embodiments of the invention provide a method, a device and a system for determining the input and output association degree of a black box system, so as to at least solve the problems in the related art that accuracy is too low and associated items are incompletely found when determining the input and output association degree of a black box system.

According to an embodiment of the present invention, a method for determining an input/output association degree of a black box system is provided, comprising: matching service quality indicator KQI data in a black box system with key performance indicator KPI data to form a data vector space; clustering the KPI data according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly correlated KPI items and to assist in judging the health of the indicator data; decomposing the data vector space to separate the association features of the KPI data with respect to the KQI data, and calculating a normalized degree of association of the KPI data with the KQI data; determining, according to the normalized degree of association, the KPI data associated with each KQI data, and calculating an association weight of the associated KPI data with respect to the KQI data; and determining the associated KPI data and the association weight as the input and output association degree of the black box system.

Optionally, clustering the KPI data according to the service type of the KQI data includes: dividing the KQI data and the KPI data into a KQI data layer and a KPI data layer; adding, between the KQI data layer and the KPI data layer, abstraction layer parameters related to the service type corresponding to the KQI data, wherein the abstraction layer normalizes or maps the KPI data to fit the corresponding mining algorithm; and clustering the KPI data using the abstraction layer parameters.

Optionally, decomposing the data vector space includes: decomposing the data vector space in at least one of the following manners to separate the association features of the KPI data with respect to the KQI data: dimension reduction in the spatial dimension, direct decomposition in the spatial dimension, and dimension expansion in the spatial dimension.

Optionally, matching the service quality indicator KQI data with the key performance indicator KPI data to form a data vector space includes: matching the KQI data with the KPI data to form a data vector space from which the degree of association of the KPI data with the KQI data can be separated.

Optionally, matching the service quality indicator KQI data with the key performance indicator KPI data to form a data vector space includes: performing distribution statistics and graphical display on the KPI data according to the service characteristic information of the KQI data to determine the validity of the KPI data, and performing missing-value, outlier or mapping processing as needed; and matching the KQI data with the valid KPI data to form a data vector space from which the degree of association of the KPI data with the KQI data can be separated.

Optionally, the manner of separating the degree of association of the KPI data with the KQI data from the data vector space includes: dimension reduction, direct decomposition, and dimension expansion followed by extraction of effective features.

Optionally, the service characteristic information is obtained according to matching a preset database and/or according to a service requirement.

Optionally, performing a dimensionality reduction operation on the data vector space includes: decision tree pruning, regression merging, clustering, expert-assisted decision, and the like.

Optionally, performing direct decomposition on the data vector space includes: decomposing the data vector space based on a Bayesian statistical algorithm, an equivalent numerical calculation method based on a singular value decomposition idea, and the like.

Optionally, performing dimension expansion on the data vector space and then extracting the effective features includes: performing dimension expansion on the data vector space using an algorithm based on a support vector machine (SVM); or performing dimension expansion based on a neural network algorithm, that is, with a number of hidden layer units higher than the input dimension.

Optionally, after determining the KPI data associated with each KQI data according to the normalized degree of association, the method further includes: calculating a quantized one-dimensional threshold and a quantized multi-dimensional threshold of the KPI data associated with each KQI data; obtaining a false positive rate and/or a missed rate of KQI data overrun according to the quantized multi-dimensional threshold; and analyzing, according to the false positive rate and/or the missed rate, whether the associated KPI data contains a complete basis of the KQI data overrun space.

Optionally, after analyzing, according to the false positive rate and/or the missed rate, whether the associated KPI data contains a complete basis of the KQI data overrun space, the method further includes: analyzing, according to the false positive rate and/or the missed rate, the probability that the associated KPI data does not contain a complete basis of the KQI data overrun space.

Optionally, after determining the KPI data associated with each KQI data according to the normalized degree of association, the method further includes: determining whether the KQI data is missing; and, when the KQI data is missing, inferring in reverse the probability of misjudging KQI data overrun based on the quantized multi-dimensional threshold of historical KPI data, and performing system pre-optimization and parameter adjustment.

Optionally, the KPI data includes at least one of the following: Radio Resource Control (RRC) connection establishment success rate, Evolved Radio Access Bearer (E-RAB) success rate, wireless connection rate, E-RAB drop rate, evolved NodeB (eNB) handover success rate, cell user plane packet loss rate, cell user plane downlink packet loss rate, cell user plane downlink average delay, cell downlink packet number, Media Access Control (MAC) layer uplink block error rate, MAC layer downlink block error rate, uplink initial Hybrid Automatic Repeat Request (HARQ) retransmission ratio, HARQ retransmission ratio, downlink dual-stream traffic ratio, Quadrature Phase Shift Keying (QPSK) ratio, uplink 16 Quadrature Amplitude Modulation (16QAM) ratio, downlink QPSK ratio, downlink 16QAM ratio, downlink 64QAM ratio, air interface uplink service byte count, air interface service byte count, uplink Physical Resource Block (PRB) average utilization, downlink PRB average utilization, average throughput per uplink PRB, average throughput per downlink PRB, -110 dBm coverage, Signal to Interference plus Noise Ratio (SINR), subband 0 average Channel Quality Indicator (CQI), and channel user level.

Optionally, the KQI data includes a Hypertext Transfer Protocol (HTTP) response delay.

Optionally, the cluster includes at least one of the following: a capacity indicator cluster, an access indicator cluster, an efficiency indicator cluster, and a complete retention indicator cluster.

Optionally, the complete retention indicator cluster further includes at least one of the following: a group service cluster, an uplink complete cluster, and a downlink complete cluster.

According to another embodiment of the present invention, there is provided an apparatus for determining an input/output association degree of a black box system, comprising: a separation module, configured to match service quality indicator KQI data in a black box system with key performance indicator KPI data to form a data vector space from which the degree of association of the KPI data with the KQI data can be separated; a clustering module, configured to cluster the KPI data according to the service type of the KQI data; a first calculating module, configured to decompose the data vector space and calculate a normalized degree of association of the KPI data with the KQI data; a second calculating module, configured to determine, according to the normalized degree of association, the KPI data associated with each KQI data, and to calculate an association weight of the associated KPI data with respect to the KQI data; and a determination module, configured to determine the associated KPI data and the association weight as the input-output association degree of the black box system.

According to still another embodiment of the present invention, there is provided an association analysis system, comprising: a storage unit, configured to store KQI data and KPI data in a service network; a data pre-processing unit, configured to pre-process the KQI data and the KPI data, wherein the pre-processing includes data matching, data cleaning, statistical feature extraction, and statistical data presentation; a clustering unit, configured to intelligently cluster the KPI data and output a clustering table; a vector space decomposition unit, connected to the data pre-processing unit and configured to decompose the vector space formed by the pre-processed KQI data and KPI data, and to extract the association component of the KPI data with respect to the KQI data that can be normalized and quantized; a quantization association calculation unit, connected to the vector space decomposition unit and configured to perform normalized quantization calculation on the association component, obtain a quantitative degree of association of the KQI data with the KPI data, calculate its total ranking weight, and output a quantized association matrix including the weight; a multi-dimensional threshold calculation unit, configured to calculate, according to the association matrix, a multi-dimensional quantization threshold of the associated KPI data, to infer in reverse a false positive rate and/or a missed rate of KQI overrun, and to output a multi-dimensional quantization threshold matrix and KQI overrun evaluation data; and an optimization unit, configured to perform network optimization on the service network according to the multi-dimensional quantization threshold matrix and the KQI overrun evaluation data.

Optionally, the system further includes: a service data interface, including a presentation interface, configured to receive an external command to perform an auxiliary decision on the output data of the system.

Optionally, the system further includes: a data mining analysis algorithm pool, configured to store the data mining algorithms of the system; and a database, configured to store the data analysis and mining conclusions of the system as well as intermediate process information.

According to still another embodiment of the present invention, a storage medium is also provided. The storage medium is arranged to store program code for performing the following steps:

Matching the service quality indicator KQI data in the black box system with the key performance indicator KPI data to form a data vector space;

Clustering the KPI data according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly correlated KPI items and to assist in judging the health of the indicator data;

Decomposing the data vector space to separate associated features of the KPI data from the KQI data, and calculating a normalized degree of association of the KPI data with the KQI data;

Determining, according to the normalized degree of association, KPI data associated with each of the KQI data, and calculating an associated weight of the associated KPI data to the KQI data;

The associated KPI data and the associated weight are determined as the input and output relevance of the black box system.

Through the present invention, the service quality indicator KQI data in the black box system is matched with the key performance indicator KPI data to form a data vector space; the KPI data is clustered according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly correlated KPI items and to assist in judging the health of the indicator data; the data vector space is decomposed to separate the association features of the KPI data with respect to the KQI data, and a normalized degree of association of the KPI data with the KQI data is calculated; the KPI data associated with each KQI data is determined according to the normalized degree of association, and an association weight of the associated KPI data with respect to the KQI data is calculated; and the associated KPI data and the association weight are determined as the input-output association degree of the black box system. Because the association between the KQI data and the KPI data can be quantitatively calculated, normalized and compared, the problems in the related art of low accuracy and incomplete discovery of associated items when determining the input-output association degree of a black box system are solved, which improves service quality and reduces manual workload.

Brief description of the drawings

The drawings described herein are intended to provide a further understanding of the invention and constitute a part of the invention. In the drawings:

FIG. 1 is a schematic diagram of a black box system between a KQI and a KPI according to the related art of the present invention;

FIG. 2 is a schematic diagram showing the phenomenon in which the "KPI is good but the KQI is poor" in a scatter plot according to the related art of the present invention;

FIG. 3 is a flowchart of a method for determining an input and output association degree of a black box system according to an embodiment of the present invention;

FIG. 4 is a structural block diagram of an apparatus for determining an input/output association degree of a black box system according to an embodiment of the present invention;

FIG. 5 is a structural block diagram of an association analysis system according to an embodiment of the present invention;

FIG. 6 is a schematic flowchart of a KQI-KPI multi-dimensional quantitative association analysis solution according to an embodiment of the present invention;

FIG. 7 is a block diagram of a KQI-KPI multi-dimensional quantitative association analysis system according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of initial hierarchical clustering in an intelligent clustering process according to an embodiment of the present invention;

FIG. 9 is a frequency fitting curve of a parameter that passes both the normal distribution test and the data source health test according to an embodiment of the present invention;

FIG. 10 is a frequency fitting curve of a parameter that does not pass the normal distribution test but passes the data source health test according to an embodiment of the present invention;

FIG. 11 is a frequency fitting curve of a parameter that passes neither the normal distribution test nor the data source health test according to an embodiment of the present invention;

FIG. 12 is a schematic diagram of calculating the weights of the associated KPIs by using a primary quantization decision point and a secondary quantization decision point according to an embodiment of the present invention.

Detailed description

The invention will be described in detail below with reference to the drawings in conjunction with the embodiments. It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other without conflict.

It is to be understood that the terms "first", "second" and the like in the specification and claims of the present invention are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence.

Example 1

In this embodiment, a method for determining the input and output association degree of a black box system is provided. FIG. 3 is a flowchart of a method for determining the input and output association degree of a black box system according to an embodiment of the present invention. As shown in FIG. 3, the process includes the following steps:

Step S302, matching the service quality indicator KQI data in the black box system with the key performance indicator KPI data to form a data vector space to separate the correlation degree of the KPI data to the KQI data;

Step S304, clustering the KPI data according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly correlated KPI items and to assist in judging the health of the indicator data;

Step S306, decomposing the data vector space to separate the correlation feature of the KPI data to the KQI data, and calculating the normalized correlation degree of the KPI data to the KQI data;

Step S308, respectively determining KPI data associated with each KQI data according to the normalized association degree, and calculating an associated weight of the associated KPI data to the KQI data;

Step S310, determining the associated KPI data and the associated weight as the input and output relevance of the black box system.

Optionally, this embodiment may be applied to scenarios such as network optimization, user profiling, reviews and recommendation, but is not limited thereto.

Through the above steps, the service quality indicator KQI data in the black box system is matched with the key performance indicator KPI data to form a data vector space; the KPI data is clustered according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly correlated KPI items and to assist in judging the health of the indicator data; the data vector space is decomposed to separate the association features of the KPI data with respect to the KQI data, and the normalized degree of association of the KPI data with the KQI data is calculated; the KPI data associated with each KQI data is determined according to the normalized degree of association, and the association weight of the associated KPI data with respect to the KQI data is calculated; and the associated KPI data and the association weight are determined as the input/output association degree of the black box system. Because the association between the KQI data and the KPI data can be quantitatively calculated, normalized and compared, the problems in the related art of low accuracy and incomplete discovery of associated items when determining the input-output association degree of a black box system are solved, which improves service quality and reduces manual workload.

Optionally, the degree of association between KQI and KPI is regarded as a random variable rather than a fixed function, preferably on the basis of Bayesian statistical theory, and the method flow is as follows:

The KQI and KPI data collected at the two ends of the network/communication system are aligned by matching their time granularity and spatial granularity.
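The alignment step can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each record already carries a time bucket and a cell identifier as its time and spatial granularity; the field names (`time`, `cell`, `name`, `value`) and the sample values are hypothetical and do not come from the source.

```python
from collections import defaultdict

def align_kqi_kpi(kqi_rows, kpi_rows):
    """Join KQI and KPI samples that share the same (time bucket, cell) key."""
    # Group all KPI values by their (time, cell) key.
    kpi_by_key = defaultdict(dict)
    for row in kpi_rows:
        kpi_by_key[(row["time"], row["cell"])][row["name"]] = row["value"]

    aligned = []
    for row in kqi_rows:
        key = (row["time"], row["cell"])
        if key in kpi_by_key:  # keep only samples present on both sides
            aligned.append({"time": row["time"], "cell": row["cell"],
                            "kqi": row["value"], "kpi": dict(kpi_by_key[key])})
    return aligned

# Hypothetical samples: HTTP response delay (ms) as the KQI, two KPIs per cell.
kqi_rows = [
    {"time": "10:00", "cell": "A", "value": 120.0},
    {"time": "10:15", "cell": "A", "value": 340.0},
    {"time": "10:00", "cell": "B", "value": 95.0},   # no KPI match, dropped
]
kpi_rows = [
    {"time": "10:00", "cell": "A", "name": "rrc_success", "value": 0.99},
    {"time": "10:00", "cell": "A", "name": "dl_prb_util", "value": 0.41},
    {"time": "10:15", "cell": "A", "name": "rrc_success", "value": 0.97},
]
aligned = align_kqi_kpi(kqi_rows, kpi_rows)
```

Each aligned row pairs one KQI value with the KPI values observed at the same time and place, which is one way to read "forming a data vector space".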

The aligned KQI and KPI data are further cleaned according to the protocol, specifications, and actual business requirements. Data cleaning content includes: eliminating outliers and filling in missing values.
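As a rough illustration of this cleaning step, the sketch below removes values outside an allowed range and fills gaps. The range bounds stand in for limits taken from a protocol or specification, and the median fill is an assumption chosen for simplicity; the source does not prescribe a particular filling rule.

```python
import statistics

def clean_series(values, lower, upper):
    """Treat None and out-of-range samples as invalid and replace them
    with the median of the valid samples (illustrative fill rule)."""
    def is_valid(v):
        return v is not None and lower <= v <= upper

    valid = [v for v in values if is_valid(v)]
    if not valid:
        raise ValueError("no valid samples to fill from")
    fill = statistics.median(valid)
    return [v if is_valid(v) else fill for v in values]

# Hypothetical success-rate KPI: 1.7 is impossible for a ratio, None is missing.
raw = [0.98, None, 0.97, 1.7, 0.99]
cleaned = clean_series(raw, 0.0, 1.0)
```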

The statistical characteristics of the cleaned KQI and KPI data are computed and presented in a variety of charts. Optionally, business data experts use the statistical indicators to judge the health of the data source (whether there are misreports or omissions that cannot be detected by numerical methods), the type of data distribution and so on, in order to better adapt the data mining algorithm.

Intelligent hierarchical clustering is performed on the cleaned KPI data to obtain a clustering table. Optionally, business data experts decide the number of categories based on business requirements and engineering significance, and fine-tune the classification of KPIs whose boundaries are blurred.
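One way to cluster without fixing the number of clusters in advance is to cut a single-linkage hierarchy at a distance threshold. The sketch below is only a stand-in for the "intelligent hierarchical clustering" described here: it uses 1 - |Pearson correlation| as the distance between two KPI series, which is an assumption, and the KPI names, values and the `max_distance` parameter are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def cluster_kpis(series, max_distance):
    """Single-linkage clusters cut at max_distance, via union-find:
    two KPIs are linked when 1 - |r| <= max_distance."""
    names = list(series)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if 1.0 - abs(pearson(series[a], series[b])) <= max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())

series = {
    "ul_prb_util": [0.2, 0.4, 0.6, 0.8],
    "dl_prb_util": [0.25, 0.45, 0.62, 0.85],
    "rrc_success": [0.99, 0.97, 0.99, 0.96],
}
clusters = cluster_kpis(series, max_distance=0.1)
```

Here the two utilization KPIs move together and fall into one cluster, so only one of them would need to be kept when selecting orthogonal KPI items; the number of clusters follows from the data and the threshold rather than being specified up front.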

The vector space formed by the KQI-KPI data is decomposed, and the quantizable degree of association of each KPI-KQI is separated and normalized. Optionally, the vector space is expanded or coordinate transformed to obtain a clearer KQI-KPI quantizable degree of association.

The KPI-KQI normalized degrees of association are sorted and judged, and the KPIs are graded according to their degree of association.

The clusters to which the correlated KPIs belong are examined to determine the final correlation KPIs. Optionally, the business data expert specifies the final correlation KPIs based on statistical information, business requirements, clustering tables, and engineering significance.

The normalized weight of the final correlation KPI is calculated according to the KPI-KQI normalized degree of association of the final correlation KPI.
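A simple reading of this step is that each final correlation KPI receives a weight proportional to its normalized degree of association, with the weights summing to 1. The sketch below assumes that proportional rule; the source does not spell out the formula, and the KPI names and degrees are hypothetical.

```python
def normalized_weights(association):
    """Scale association degrees so that the resulting weights sum to 1."""
    total = sum(association.values())
    if total <= 0:
        raise ValueError("association degrees must have a positive sum")
    return {name: degree / total for name, degree in association.items()}

# Hypothetical normalized association degrees of three final correlation KPIs.
final_kpis = {"dl_prb_util": 0.8, "rrc_success": 0.6, "erab_drop": 0.2}
weights = normalized_weights(final_kpis)
```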

The one-dimensional threshold of each final correlation KPI is calculated based on the KPI-KQI normalized correlation degree of the final correlation KPI and the preset warning threshold of the KQI.

According to the preset warning threshold of KQI and the accuracy requirement of false positive/missing rate, the joint threshold of multiple final correlation KPIs is calculated. Optionally, the joint threshold dimension is between 2 and the number of final correlation KPIs.

The multi-dimensional threshold of the final correlation KPIs is calculated based on the one-dimensional thresholds and joint thresholds of all final correlation KPIs.

The KQI overrun missed rate (KQI->KPI) under the expected false positive rate (KPI->KQI) is calculated using the multi-dimensional threshold of the final correlation KPIs. The false positive rate is defined as the proportion of the KQI data selected by the KPI multi-dimensional threshold in which the KQI does not exceed its limit; the missed rate is defined as the proportion of KQI over-limit data that falls outside the screening range of the KPI multi-dimensional threshold.
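The two definitions above can be made concrete with a short Python sketch. It assumes that a sample is selected by the multi-dimensional threshold when every listed KPI is on its "bad" side of its bound, which is one possible reading of a joint threshold; the threshold encoding (`"high"`/`"low"`), KPI names, KQI limit and sample values are all hypothetical.

```python
def breaches(kpi_values, thresholds):
    """True when every KPI in the joint threshold is on its bad side."""
    return all(
        kpi_values[name] >= bound if direction == "high" else kpi_values[name] <= bound
        for name, (direction, bound) in thresholds.items()
    )

def overrun_rates(samples, thresholds, kqi_limit):
    """Return (false positive rate, missed rate) as defined in the text."""
    flagged = [breaches(s["kpi"], thresholds) for s in samples]
    overrun = [s["kqi"] > kqi_limit for s in samples]
    n_flagged, n_overrun = sum(flagged), sum(overrun)
    false_pos = sum(f and not o for f, o in zip(flagged, overrun))
    missed = sum(o and not f for f, o in zip(flagged, overrun))
    fp_rate = false_pos / n_flagged if n_flagged else 0.0
    miss_rate = missed / n_overrun if n_overrun else 0.0
    return fp_rate, miss_rate

thresholds = {"dl_prb_util": ("high", 0.8), "rrc_success": ("low", 0.97)}
samples = [
    {"kqi": 350.0, "kpi": {"dl_prb_util": 0.90, "rrc_success": 0.95}},  # flagged, overrun
    {"kqi": 120.0, "kpi": {"dl_prb_util": 0.85, "rrc_success": 0.96}},  # flagged, no overrun
    {"kqi": 400.0, "kpi": {"dl_prb_util": 0.50, "rrc_success": 0.99}},  # overrun, not flagged
    {"kqi": 100.0, "kpi": {"dl_prb_util": 0.30, "rrc_success": 0.99}},  # neither
    {"kqi": 320.0, "kpi": {"dl_prb_util": 0.82, "rrc_success": 0.97}},  # flagged, overrun
]
fp_rate, miss_rate = overrun_rates(samples, thresholds, kqi_limit=300.0)
```

In this toy data, one of the three flagged samples has no KQI overrun and one of the three overrun samples is not flagged, so both rates come out at one third.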

The association matrix between KQI and its final related item KPI and the KPI multidimensional threshold matrix are presented in the interface, and the associated extraction features are stored in the background expert intelligence database. Optionally, the intermediate output result in the entire calculation decision process may be selectively presented by the service data expert on the interface, and the auxiliary decision and the manual fine adjustment are performed.

After new KQI and KPI data arrive, KQI overrun is judged using the existing multi-dimensional threshold. If the increase in the false positive rate exceeds the offset threshold, the entire data set is recalculated; otherwise, the existing association analysis conclusions, combined with adaptive fine adjustment, are used to guide KQI optimization and parameter adjustment.
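The decision rule in this last step amounts to a simple comparison, sketched below. The function name, the returned labels and the numeric values are hypothetical; only the rule itself (recalculate when the increase in false positive rate exceeds the offset threshold) comes from the text.

```python
def next_action(baseline_fp_rate, current_fp_rate, offset_threshold):
    """Choose between full recalculation and reuse with fine adjustment."""
    increase = current_fp_rate - baseline_fp_rate
    if increase > offset_threshold:
        return "recalculate"   # thresholds have drifted too far from the data
    return "fine_tune"         # keep existing conclusions, adjust adaptively

action_drifted = next_action(0.10, 0.18, offset_threshold=0.05)
action_stable = next_action(0.10, 0.12, offset_threshold=0.05)
```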

Through this embodiment, operators, network builders and network maintainers can quickly and comprehensively find, in a quantitative manner, the KPIs strongly associated with a given KQI, calculate the corresponding one-dimensional and multi-dimensional KPI thresholds, and then infer in reverse whether the KQI exceeds its limit. This provides accurate and convenient guidance for network evaluation, performance optimization, parameter adjustment and the like, improves service quality, and greatly reduces manual workload.

Through the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by means of software plus a necessary general hardware platform, or by hardware, although in many cases the former is the better implementation. Based on such understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and including a number of instructions for causing a terminal device (which may be a cell phone, a computer, a server, or a network device, etc.) to perform the methods of the various embodiments of the present invention.

Example 2

In this embodiment, a device and a system for determining the input and output association degree of a black box system are provided. The device and the system are used to implement the above embodiments and preferred implementations, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

FIG. 4 is a structural block diagram of an apparatus for determining an input/output association degree of a black box system according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes:

The separation module 40 is configured to match the service quality indicator KQI data in the black box system with the key performance indicator KPI data to form a data vector space to separate the correlation degree of the KPI data to the KQI data;

The clustering module 42 is configured to cluster the KPI data according to the service type of the KQI data;

The first calculating module 44 is configured to decompose the data vector space and calculate a normalized correlation degree of the KPI data to the KQI data;

The second calculating module 46 is configured to determine, according to the normalized degree of association, the KPI data associated with each KQI data, and to calculate the association weight of the associated KPI data with respect to the KQI data;

A determination module 48 is arranged to determine the associated KPI data and associated weights as the input and output associations of the black box system.

FIG. 5 is a structural block diagram of an association analysis system according to an embodiment of the present invention. As shown in FIG. 5, the system includes:

The storage unit 50 is configured to store KQI data and KPI data in the service network;

The data pre-processing unit 52 is configured to pre-process the KQI data and the KPI data, wherein the pre-processing includes: data matching, data cleaning, statistical feature extraction, and statistical data presentation.

The clustering unit 54 is configured to perform intelligent clustering on the KPI data, and output a cluster table;

The vector space decomposition unit 56 is connected to the data pre-processing unit and configured to decompose the vector space formed by the pre-processed KQI data and KPI data, and to extract the association component of the KPI data with respect to the KQI data that can be normalized and quantized;

The quantization correlation calculation unit 58 is connected to the vector space decomposition unit, and is configured to perform normalized quantization calculation on the correlation component, obtain a quantitative relevance degree of the KQI data to the KPI data, calculate a total ranking weight thereof, and output a quantization correlation matrix including the weight;

The multi-dimensional threshold calculation unit 60 is configured to calculate, according to the correlation matrix, a multi-dimensional quantization threshold of the correlation item KPI data and the false positive rate and/or missed rate obtained when that threshold is used to infer KQI overrun, and to output a multi-dimensional quantization threshold matrix and KQI over-limit evaluation data;

The optimization unit 32 is configured to perform network optimization on the service network according to the multi-dimensional quantization threshold matrix and the KQI over-limit evaluation data.

Optionally, the system further includes: a service data interface, including a presentation interface, configured to receive an external command to perform an auxiliary decision on the output data of the system; a data mining analysis algorithm pool, set to store the data mining algorithms of the system; and a database, set to store the data analysis and mining conclusions of the system, as well as intermediate process information.

Optionally, in the system according to this embodiment, the base database is set to store all indicators and operation management data.

The business data expert interface includes a data and conclusion presentation interface, and is set to let business data experts view statistical analysis information and input auxiliary conditions or judgments.

The KQI/KPI storage unit is configured to store all KQI/KPI items, including information such as service type, customer requirements, and expert assistance.

The data preprocessing unit extracts the complete set or a subset of KQI and KPI items from the storage unit according to the information of the dispatch order, the service type and the customer demand, and sequentially performs data matching, data cleaning, statistical feature extraction and statistical data presentation for the correlation analysis calculation.

The intelligent clustering unit intelligently clusters KPI items according to the type, distribution and engineering meaning of the business, and combines the business data expert judgment to output the intelligent clustering table. The clustering process is not limited to a single clustering algorithm, nor does it pre-specify the number of fixed clusters, based entirely on data and services.

The vector space decomposition unit is configured to decompose the vector space formed by the preprocessed KQI and KPI data, and extract the KQI-KPI correlation component that can be normalized and quantized. It includes direct decomposition and post-expansion decomposition.

The quantization correlation calculation unit is configured to perform a normalized quantization calculation on the KQI-KPI correlation component extracted by the vector space decomposition unit to obtain a KQI-KPI quantization correlation degree. According to the quantitative correlation degree and the intelligent clustering table, all KPI items participating in the association analysis are divided into four types: the final related item KPI, the related similar item KPI, the reminder item KPI and the unrelated item KPI. The total ranking weight is then calculated based on the quantitative relevance of all final related item KPIs. Finally, the quantization correlation matrix (including the weights) is output on the presentation interface.

The multi-dimensional threshold calculation unit is configured to calculate a multi-dimensional quantization threshold of the final correlation KPI, and then calculate the false positive rate and the missed rate obtained when the KPI multi-dimensional quantization threshold is used to infer KQI overrun. Finally, the multi-dimensional quantization threshold matrix and the KQI over-limit false positive rate and missed rate are output on the presentation interface.

The data mining analysis algorithm pool is set to store all the data mining algorithms that may be used in the entire association analysis calculation process. Each computing module selects the appropriate algorithm, automatically or with expert assistance, based on the data and business characteristics.

The data expert intelligence library is set to store business-based association analysis conclusions and intermediate process information, which can be used for business association analysis, network/system parameter optimization and other value-added services.

It should be noted that each of the above modules may be implemented by software or hardware. For the latter, this may be achieved in, but is not limited to, the following ways: the foregoing modules are all located in the same processor; or the foregoing modules are located, in any combination, in different processors.

Example 3

The implementation of the technical solution is described in further detail below with reference to a specific scenario, which does not limit the present invention. A KQI-KPI multi-dimensional quantitative correlation analysis solution is provided in this embodiment of the present invention. FIG. 6 is a flow diagram of the KQI-KPI multi-dimensional quantitative correlation analysis scheme provided by an embodiment of the present invention, as shown in FIG. 6. FIG. 7 is a block diagram of the KQI-KPI multi-dimensional quantitative correlation analysis system provided by an embodiment of the present invention; the system module structure corresponding to the scheme is shown in FIG. 7. The scheme includes the following steps:

S101. The main corresponding functional unit is module 40. The module 40 reads the KQI items and the KPI items to be analyzed for association from the module 30 according to the service type, the protocol specification, and the like; the module 40 then reads the corresponding KQI and KPI data from the module 10, and performs pre-processing operations such as data matching, data cleaning, statistical analysis, fitting, and feature extraction. In the data pre-processing process, the module 40 calls the module 90; optionally, the module 40 calls the module 100.

S102. The main corresponding functional unit is module 50. The module 50 invokes the module 90 and, using part of the indicator feature values obtained by the module 40 in S101, performs initial hierarchical clustering without a pre-defined number of categories. FIG. 8 is a schematic diagram of the initial hierarchical clustering in the intelligent clustering process provided by the embodiment of the present invention, as shown in FIG. 8. Module 50 then invokes module 100 for the final clustering decision and outputs a clustering table. Optionally, the module 40 invokes the module 20 to perform on-site expert-assisted clustering decisions.

S103. The main corresponding functional unit is module 60. The module 60 takes the KQI and KPI data output by the module 40 to form a KQI-KPI data vector space, and calls the module 90 to perform a vector space decomposition operation that separates the quantizable KQI-KPI correlation components.

S104. The main corresponding functional unit is module 70. The module 70 performs a normalized quantization operation on the KQI-KPI correlation components to obtain the degree of association between each KPI and the corresponding KQI. The KQI-KPI correlation degree lies in the range [0, 1]. By setting an irrelevant threshold and a relevant threshold, each KPI can be judged to be a related item KPI, a reminder item KPI, or an unrelated item KPI. To minimize the overlap between related item KPIs, the module 70 calls the cluster table output by the module 50 and determines the final related item KPI from among the related item KPIs of the same cluster; the other related item KPIs in that cluster are called related similar item KPIs. After calculating the normalized weight of each final related item KPI, the module 70 outputs the related KPI matrix of each KQI, including the weights.

S105. The main corresponding functional unit is module 80. The module 80 sequentially calculates the single threshold and the multi-dimensional threshold of the final correlation KPI according to the final correlation KPI and weight of the KQI determined by the module 70. According to the requirements of business and system accuracy, the dimension of the multi-dimensional threshold starts from two-dimensional and does not exceed the number of final related items KPI.

S106. The main corresponding functional unit is module 80. Using the single-item and multi-dimensional thresholds of the final correlation KPI output in step S105, together with the cleaned KQI and KPI data obtained in step S101, the false positive rate and the missed rate of the KQI overrun are verified. The false positive rate is then used to test whether the limit of the missed rate is equal to or close to zero. "Yes" indicates that the current final KPI items contain the complete base of the KQI overrun. "No" indicates that the KQI still has relevant KPIs that are not covered by module 30.

This embodiment further includes an example in which a mobile communication carrier looks for the main relevant KPIs behind a low Internet access rate in the main urban area of a provincial capital. Using 24-hour granularity cell-level data of 4000 cells on the selected date, the degree of association between the KQI (HTTP response delay) and 30 wireless-side KPIs was examined.

Step 1 belongs to S101. According to the abnormal service dispatch order, the KQI to be evaluated and the list of the corresponding 30 wireless-side KPI items are read from the module 30, as shown in Table 1.

Table 1

Step 2 belongs to S101. The 24-hour granularity cell-level data of the KQI and KPIs to be analyzed for association is read from the module 10, and aligned by acquisition time and cell number.
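The alignment in step 2 can be pictured as a join on the (acquisition time, cell number) pair. The following is a minimal sketch only; the record layout, the function name `match_kqi_kpi` and the sample indicator names are assumptions made for illustration and do not come from the embodiment.

```python
from collections import defaultdict


def match_kqi_kpi(kqi_rows, kpi_rows):
    """Join KQI and KPI samples on (hour, cell_id).

    kqi_rows: iterable of (hour, cell_id, kqi_value)
    kpi_rows: iterable of (hour, cell_id, kpi_name, kpi_value)
    Returns {(hour, cell_id): {"kqi": value, "kpi": {name: value}}},
    keeping only keys that have both a KQI sample and KPI samples.
    """
    kpis = defaultdict(dict)
    for hour, cell_id, name, value in kpi_rows:
        kpis[(hour, cell_id)][name] = value

    matched = {}
    for hour, cell_id, value in kqi_rows:
        key = (hour, cell_id)
        if key in kpis:  # membership test does not create a new entry
            matched[key] = {"kqi": value, "kpi": dict(kpis[key])}
    return matched


# Hypothetical sample records (values are illustrative only).
kqi_rows = [(0, "cell_1", 82.0), (1, "cell_1", 97.5), (0, "cell_2", 110.0)]
kpi_rows = [
    (0, "cell_1", "rrc_success_rate", 0.99),
    (0, "cell_1", "dl_prb_utilization", 0.41),
    (1, "cell_1", "rrc_success_rate", 0.97),
    (3, "cell_2", "rrc_success_rate", 0.95),  # no KQI at this hour, dropped
]
matched = match_kqi_kpi(kqi_rows, kpi_rows)
```

Samples that lack a counterpart on either side are dropped here; a real system might instead flag them for the cleaning stage.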

Step 3 belongs to S101. The matched KQI and KPI data are subjected to outlier processing, statistical analysis, distribution fitting, etc., and statistical feature values are extracted. Examples of the presentation of module 20 are as follows.

FIG. 9 is a graph showing a frequency fitting curve of a parameter that passes the normal distribution test and the data source health test according to an embodiment of the present invention. As shown in FIG. 9, the index parameter passes the lognormal distribution test and passes the health degree determination; all statistics and mining algorithms in module 90 are available.

FIG. 10 is a graph showing a frequency fitting curve of a parameter that does not pass the normal distribution test but passes the data source health test in the embodiment of the present invention. As shown in FIG. 10, the index parameter does not pass the normal/lognormal distribution test, so matching against other common distributions is attempted automatically; the health degree determination is passed; the statistics or mining algorithms in module 90 that presuppose a normal distribution are not available.

FIG. 11 is a graph showing a frequency fitting curve of a parameter that does not pass the normal distribution test and fails the data source health check according to an embodiment of the present invention. As shown in FIG. 11, the index parameter does not pass the normal/lognormal distribution test and fails to match other common distributions; it does not pass the health degree determination, does not participate in the subsequent calculation, and is immediately submitted for troubleshooting. Upon examination, this parameter was found to suffer from missed reporting and burst repeated reporting in the cell.
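The embodiment does not state how the health degree is computed. As one hedged illustration of the kind of defect mentioned above (missed reporting and burst repeated reporting), the sketch below counts hours with no report and hours with more than one report for a single parameter in a single cell. The function name `reporting_health` and the tolerance `max_bad_ratio` are hypothetical and are not taken from the embodiment.

```python
from collections import Counter


def reporting_health(hours_reported, expected_hours=range(24), max_bad_ratio=0.1):
    """Flag missed and repeated reports for one parameter in one cell.

    hours_reported: list of hour indices, one entry per received report.
    max_bad_ratio: assumed tolerance on the share of defective hours.
    """
    counts = Counter(hours_reported)
    missed = [h for h in expected_hours if counts[h] == 0]
    repeated = [h for h in expected_hours if counts[h] > 1]
    bad_ratio = (len(missed) + len(repeated)) / len(expected_hours)
    return {
        "missed": missed,
        "repeated": repeated,
        "healthy": bad_ratio <= max_bad_ratio,
    }


# One clean day, and one day with three missing hours and a burst at hour 7.
clean = reporting_health(list(range(24)))
faulty_hours = [h for h in range(24) if h not in (3, 4, 5)] + [7, 7]
faulty = reporting_health(faulty_hours)
```

A parameter that fails such a check would be excluded from the later steps, in line with the handling described for FIG. 11.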

Step 4 belongs to S102. Using the KPI data and statistical feature values obtained in steps 1 through 3, an initial hierarchical clustering of the 30 KPIs is performed without a pre-specified number of categories, as shown in FIG. 8.
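One way to cluster without fixing the number of categories in advance is agglomerative clustering that stops at a distance cut-off. The sketch below assumes a distance of 1 minus the absolute Pearson correlation between KPI series, average linkage, and a cut-off `max_distance`; none of these choices is specified by the embodiment, and the KPI names and values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cluster_kpis(series, max_distance=0.3):
    """Average-linkage agglomerative clustering of KPI time series.

    Merging stops once the two closest clusters are farther apart than
    max_distance, so the number of clusters is an outcome, not an input.
    """
    names = list(series)
    dist = {
        (a, b): 1.0 - abs(pearson(series[a], series[b]))
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical series: two load-related KPIs and one access KPI.
series = {
    "ul_prb_utilization": [0.1, 0.2, 0.3, 0.4, 0.5],
    "ul_bytes": [10, 21, 29, 41, 50],
    "rrc_success_rate": [0.99, 0.97, 0.99, 0.96, 0.99],
}
clusters = cluster_kpis(series)
```

With these values the two load-related series merge and the access KPI stays alone; the resulting grouping would still be subject to the final, expert-assisted decision described in S102.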

Step 5 belongs to S102. Based on step 4, combining the expert clustering information in the module 100, the 30 KPIs are grouped into six classes. An example is shown in Table 2.

Table 2

Step 6 belongs to S103. This example uses the Bayesian statistical principle to perform the KQI-KPI data vector space decomposition, and describes the KPI->KQI mapping association component with a conditional probability. Since the KQI-KPI data in module 10 has been strictly matched in the spatial and temporal dimensions and the values are known, then according to Bayes' formula, for a given KQI value (usually the warning threshold), the KPI->KQI correlation component can be calculated numerically.
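As an illustration of computing such a conditional probability from matched samples, the sketch below estimates P(KQI over limit | KPI beyond a decision point) through Bayes' formula, P(A|B) = P(B|A) P(A) / P(B), with all terms taken as empirical frequencies. The function name, the convention that larger KPI values are worse, and the sample values are assumptions for illustration and are not the embodiment's exact computation.

```python
def bayes_association(samples, kqi_limit, kpi_point):
    """Estimate P(KQI > kqi_limit | KPI > kpi_point) via Bayes' formula.

    samples: list of (kqi_value, kpi_value) pairs already matched in
    time and space. Returns 0.0 when either event never occurs.
    """
    n = len(samples)
    kqi_over = [s for s in samples if s[0] > kqi_limit]
    kpi_over = [s for s in samples if s[1] > kpi_point]
    if not kqi_over or not kpi_over:
        return 0.0
    p_kqi = len(kqi_over) / n                      # P(A)
    p_kpi = len(kpi_over) / n                      # P(B)
    p_kpi_given_kqi = (
        sum(1 for s in kqi_over if s[1] > kpi_point) / len(kqi_over)
    )                                              # P(B|A)
    return p_kpi_given_kqi * p_kqi / p_kpi         # P(A|B)


# Hypothetical matched samples: (HTTP response delay in ms, KPI value).
samples = [(80, 0.2), (85, 0.3), (120, 0.8), (130, 0.9), (90, 0.7), (125, 0.4)]
degree = bayes_association(samples, kqi_limit=100, kpi_point=0.6)
```

Because the result is a probability, it already lies in [0, 1], which is convenient for the normalized degree of association used in the next step.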

Step 7 belongs to S104. Based on step 6, the KPI->KQI correlation component is quantized and normalized, and the KPI-KQI normalized correlation degree ∈ [0, 1] is obtained. In this example, a KPI is judged to be a correlation item when its degree is higher than 0.7 and an independent (unrelated) item when its degree is lower than 0.5; values between 0.5 and 0.7 are judged to be reminder items. The decision method is determined according to business requirements and numerical characteristics, and is not limited to hard threshold decisions.
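The hard-threshold variant of this decision can be sketched as follows. The 0.7 and 0.5 thresholds are the ones given in this example; the function name, the label strings and the treatment of values exactly equal to a threshold (counted as reminder items here) are assumptions.

```python
def classify_kpi(degree, related_above=0.7, unrelated_below=0.5):
    """Classify a KPI by its normalized degree of association in [0, 1]."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree of association must lie in [0, 1]")
    if degree > related_above:
        return "related"
    if degree < unrelated_below:
        return "unrelated"
    return "reminder"  # includes the boundary values by assumption


labels = {d: classify_kpi(d) for d in (0.82, 0.6, 0.31)}
```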

Step 8 belongs to S104. For the KPIs judged to be related items, the cluster table obtained in step 5 is queried, and the KPI with the highest degree of relevance within each cluster is taken as the final correlation KPI, so the number of final related items does not exceed the 6 classes. FIG. 12 is a schematic diagram of the calculation of the correlation item KPI weight by using the primary quantization decision point and the secondary quantization decision point in the embodiment of the present invention. In this example, an auxiliary decision point ("quantization decision point 2" in FIG. 12) is added to step 7, and the normalized weight of each final correlation item is calculated from its value together with the correlation value at the main decision point ("quantization decision point 1" in FIG. 12). For example, if four final correlation items $KPI_{F1}$ to $KPI_{F4}$ are determined, then, referring to FIG. 12, the unnormalized weight of each final correlation item $KPI_{Fi}$ ($i = 1$ to $4$) is $W_{Oi} = (P_{1,i} + P_{2,i})/2 - 0.5$, where $P_{1,i}$ and $P_{2,i}$ are the probability values at quantization decision point 1 and quantization decision point 2; the normalized weight is then $W_{Ni} = W_{Oi} / \sum_{j=1}^{4} W_{Oj}$, $i = 1$ to $4$.
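A short sketch of step 8 follows: it picks the highest-scoring related KPI in each cluster and then applies the weight formula given above (the mean of the two decision-point probabilities minus 0.5, normalized by the sum). The function names, the data layout and the numeric values are hypothetical.

```python
def final_items(degrees, cluster_of, related_above=0.7):
    """Keep, per cluster, the related KPI with the highest degree."""
    best = {}
    for kpi, degree in degrees.items():
        if degree <= related_above:
            continue  # not a related item
        cluster = cluster_of[kpi]
        if cluster not in best or degree > degrees[best[cluster]]:
            best[cluster] = kpi
    return sorted(best.values())


def normalized_weights(points):
    """points: {kpi: (p1, p2)}, probabilities at decision points 1 and 2."""
    raw = {k: (p1 + p2) / 2 - 0.5 for k, (p1, p2) in points.items()}
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("unnormalized weights must sum to a positive value")
    return {k: w / total for k, w in raw.items()}


# Illustrative values only.
degrees = {"a": 0.9, "b": 0.8, "c": 0.75, "d": 0.4}
cluster_of = {"a": "capacity", "b": "capacity", "c": "access", "d": "access"}
finals = final_items(degrees, cluster_of)

points = {"F1": (0.9, 0.8), "F2": (0.8, 0.7), "F3": (0.75, 0.65), "F4": (0.7, 0.6)}
weights = normalized_weights(points)
```

Subtracting 0.5 means a KPI whose averaged probability is no better than chance contributes no weight; the guard on the sum covers the degenerate case where no item exceeds that level.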

An example of the conclusion output of module 70 is shown in Table 3.

Table 3

Step 9, belonging to S105. In this example, according to the Bayesian formula and the expected false positive rate, the single-term false positive rate of the final correlation KPI and the expected distribution of the two-dimensional joint false positive rate are calculated. According to the final correlation KPI weight ratio obtained in step 8 and the two-dimensional expected false positive rate distribution, the numerical index number of the final correlation KPI of the pairwise pairing is adaptively determined, thereby obtaining a two-dimensional joint threshold matrix.

Step 10 belongs to S106. According to the single threshold and the two-dimensional joint threshold of the final correlation KPI, combined with the KQI-KPI matching data space obtained in step 2, the final false positive rate and missed rate of KQI: HTTP response delay are calculated. An example of the final KPI multidimensional threshold output presentation after step 10 is shown in Table 4.
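The verification in step 10 can be illustrated with a simple counting sketch. It assumes that a sample is flagged as a predicted KQI overrun when any listed KPI exceeds its limit, that the false positive rate is the flagged share of non-overrun samples, and that the missed rate is the unflagged share of overrun samples. The embodiment does not spell out these definitions or the joint-threshold rule, so all of them, along with the names and values, are assumptions.

```python
def overrun_rates(samples, kqi_limit, kpi_limits):
    """Return (false_positive_rate, missed_rate) for KQI overrun detection.

    samples: list of (kqi_value, {kpi_name: value}) matched records.
    kpi_limits: {kpi_name: limit}; a sample is flagged when any listed
    KPI exceeds its limit (assumed combination rule).
    """
    false_pos = missed = over = normal = 0
    for kqi, kpis in samples:
        flagged = any(kpis[name] > limit for name, limit in kpi_limits.items())
        if kqi > kqi_limit:
            over += 1
            missed += not flagged
        else:
            normal += 1
            false_pos += flagged
    fp_rate = false_pos / normal if normal else 0.0
    miss_rate = missed / over if over else 0.0
    return fp_rate, miss_rate


# Hypothetical records: (HTTP response delay in ms, KPI values).
samples = [
    (80, {"a": 0.2, "b": 0.1}),
    (85, {"a": 0.7, "b": 0.1}),
    (120, {"a": 0.8, "b": 0.2}),
    (130, {"a": 0.3, "b": 0.9}),
    (110, {"a": 0.3, "b": 0.2}),
    (90, {"a": 0.1, "b": 0.3}),
]
single = overrun_rates(samples, kqi_limit=100, kpi_limits={"a": 0.6})
joint = overrun_rates(samples, kqi_limit=100, kpi_limits={"a": 0.6, "b": 0.5})
```

In this toy data the two-KPI rule misses fewer overruns than the single-KPI rule, which mirrors the motivation for moving from single thresholds to multi-dimensional ones.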

Table 4

Step 11 belongs to S106. With the expected false positive rate as the variable, under the given KQI alert threshold, the calculations of steps 9 and 10 are performed to obtain the missed rate curve of the KQI (HTTP response delay). Whether the final related item KPIs are complete is judged according to whether the missed rate can reach or approach 0. In this example, when the HTTP response delay threshold is 90 ms, 95 ms, 100 ms, and 105 ms, zero missed judgments can be achieved, as shown in Table 5. This shows that the association analysis scheme and system described in the present invention find the complete set of KQI related items from the 30 KPI items, which is consistent with the description in the beneficial effects.

Table 5

It will be apparent to those skilled in the art that the various modules or steps of the invention described above can be implemented by a general-purpose computing device; they can be centralized on a single computing device or distributed over a network of multiple computing devices. Alternatively, they can be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that described herein, or the modules or steps may be separately fabricated into individual integrated circuit modules, or multiple modules or steps among them may be fabricated into a single integrated circuit module. Thus, the invention is not limited to any specific combination of hardware and software.

The above description is only the preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes can be made to the present invention. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and scope of the present invention are intended to be included within the scope of the present invention.

Claims

A method for determining the correlation between input and output of a black box system, comprising:

Matching the service quality indicator KQI data in the black box system with the key performance indicator KPI data to form a data vector space;

Clustering the KPI data according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly correlated KPI items and to assist in judging the health degree of the indicator data;

Decomposing the data vector space to separate associated features of the KPI data from the KQI data, and calculating a normalized degree of association of the KPI data with the KQI data;

Determining, according to the normalized degree of association, KPI data associated with each of the KQI data, and calculating an associated weight of the associated KPI data to the KQI data;

The associated KPI data and the associated weight are determined as the input and output relevance of the black box system.
The method according to claim 1, wherein matching the service quality indicator KQI data with the key performance indicator KPI data to form a data vector space comprises:

The KQI data is matched with the KPI data in one or more dimensions to form a data vector space to separate the degree of association of the KPI data with the KQI data.
The method of claim 1, wherein clustering the KPI data according to a service type of the KQI data comprises:

Dividing the KQI data and the KPI data into a KQI data layer and a KPI data layer;

Adding an abstraction layer parameter related to the service type corresponding to the KQI data between the KQI data layer and the KPI data layer, wherein the abstraction layer normalizes or maps the KPI data to fit the corresponding mining algorithm;

The KPI data is clustered using the abstraction layer parameters.
The method of claim 1 wherein decomposing the data vector space comprises:

The data vector space is decomposed in at least one of the following manners to separate the associated features of the KPI data for the KQI data: dimension reduction in the spatial dimension, direct segmentation in the spatial dimension, and dimension expansion in the spatial dimension.
The method according to claim 1, wherein matching the service quality indicator KQI data with the key performance indicator KPI data to form a data vector space comprises:

Performing distribution fitting and graphic display on the KPI data according to the service characteristic information of the KQI data, and determining reasonable KPI data;

Matching the KQI data with the reasonable KPI data to form a data vector space so as to separate the degree of association of the KPI data with the KQI data.
The method of claim 5, wherein the service characteristic information is obtained according to a matching preset database and/or according to a business requirement.
The method of claim 1 wherein decomposing the data vector space comprises:

Performing dimensionality reduction on the data vector space;

Directly decomposing the data vector space;

After the data vector space is expanded, the effective feature value is extracted.
The method of claim 7, wherein performing dimensionality reduction on the data vector space comprises at least one of the following methods:

Decision tree pruning, regression merging, clustering, expert-assisted judgment.
The method of claim 7 wherein directly decomposing said data vector space comprises:

Decomposing the data vector space based on a Bayesian statistical algorithm;

Equivalent numerical calculation based on singular value decomposition.
The method according to claim 7, wherein expanding the data vector space and then decomposing it comprises:

The algorithm based on the support vector machine SVM performs dimension expansion on the data vector space and then decomposes;

The dimension expansion processing based on the neural network algorithm, that is, the number of hidden layer units is higher than the input dimension.
The method according to any one of claims 1 to 10, wherein after determining the KPI data associated with each of the KQI data according to the normalized degree of association, the method further comprises:

Calculating a quantized one-dimensional threshold or multi-dimensional threshold of KPI data associated with each of the KQI data;

Obtaining a false positive rate and/or a missed rate of the KQI data overrun according to the quantized one-dimensional threshold or multi-dimensional threshold;

And analyzing whether the associated KPI data includes a complete base of the KQI data overrun space according to the false positive rate and/or the missed rate.
The method according to claim 11, wherein after analyzing whether the associated KPI data contains a complete base of the KQI data overrun space according to the false positive rate and/or the missed rate, the method further comprises:

And analyzing, according to the false positive rate and/or the missed rate, the probability that the associated KPI data does not include the KQI data overrun space.
The method according to any one of claims 1 to 10, wherein after determining the KPI data associated with each of the KQI data according to the normalized degree of association, the method further comprises:

Determining whether the KQI data is missing;

In the case of judging that the KQI data is missing, the probability of misjudgement of the KQI data is inferred based on the quantized multi-dimensional threshold of the historical KPI data, and system pre-optimization and parameter adjustment are performed.
The method of claim 1 wherein said KPI data comprises at least one of:

Radio resource control RRC connection establishment success rate, evolved radio access bearer E-RAB establishment success rate, wireless connection rate, E-RAB drop rate, base station ENB handover success rate, cell user plane packet loss rate, cell user plane downlink packet loss rate, cell user plane downlink average delay, cell user plane downlink packet rejection rate, cell downlink packet number, media access control MAC layer uplink error block rate, MAC layer downlink error block rate, uplink initial hybrid automatic repeat request HARQ retransmission ratio, downlink initial HARQ retransmission ratio, downlink dual stream traffic ratio, uplink quadrature phase shift keying QPSK ratio, uplink 16 quadrature amplitude modulation QAM ratio, downlink QPSK ratio, downlink 16QAM ratio, downlink 64QAM ratio, air interface uplink service byte number, air interface downlink service byte number, uplink physical resource block PRB average utilization rate, downlink PRB average utilization rate, uplink per PRB average throughput, downlink per PRB average throughput, -110dBm coverage rate, average signal to interference plus noise ratio SINR, subband 0 average channel quality indicator CQI, user plane average number of active user equipment UE.
The method of claim 1 wherein said KQI data comprises a hypertext transfer protocol HTTP response time delay.
The method of claim 1, wherein the clustering comprises at least one of the following: a capacity indicator cluster, an access indicator cluster, an efficiency indicator cluster, and a complete retention indicator cluster.
The method of claim 16 wherein said complete retention indicator cluster further comprises at least one of the following: packet service clustering, uplink complete keep clustering, and downlink complete keep clustering.
An association analysis system comprising:

a storage unit configured to store KQI data and KPI data in the service network;

a data pre-processing unit, configured to perform pre-processing on the KQI data and the KPI data, where the pre-processing includes: data matching, data cleaning, statistical feature extraction, and statistical data presentation;

a clustering unit configured to intelligently cluster the KPI data and output a cluster table;

a vector space decomposition unit, connected to the data pre-processing unit, configured to decompose the vector space formed by the pre-processed KQI data and the KPI data, and extract the KQI data-to-KPI data correlation component that can be normalized and quantized;

a quantized association calculation unit, connected to the vector spatial decomposition unit, configured to perform a normalized quantization calculation on the correlation component, obtain a quantitative relevance degree of the KQI data to the KPI data, calculate a total ranking weight thereof, and output a quantization correlation matrix including the weights;

a multi-dimensional threshold calculation unit, configured to calculate, according to the correlation matrix, a multi-dimensional quantization threshold of the correlation item KPI data and a false positive rate and/or a missed rate of the KQI overrun, and to output a multi-dimensional quantization threshold matrix and KQI over-limit evaluation data;

And an optimization unit configured to perform performance optimization on the service network according to the multi-dimensional quantization threshold matrix and the KQI overrun evaluation data.
The system of claim 18, wherein the system further comprises:

The service data interface includes a presentation interface configured to receive an external command to perform an auxiliary decision on the output data of the system.
The system of claim 18, wherein the system further comprises:

a data mining analysis algorithm pool, configured to store a data mining algorithm of the system;

A database, set to store data analysis and mining conclusions of the system, and intermediate process information.
A device for determining the correlation between input and output of a black box system, comprising:

a separation module, configured to match the service quality indicator KQI data in the black box system with the key performance indicator KPI data to form a data vector space to separate the correlation degree of the KPI data to the KQI data;

a clustering module, configured to cluster the KPI data according to a service type of the KQI data;

a first calculating module configured to decompose the data vector space and calculate a normalized degree of association of the KPI data with the KQI data;

a second calculating module, configured to respectively determine KPI data associated with each of the KQI data according to the normalized association degree, and calculate an associated weight of the associated KPI data to the KQI data;

A determination module is configured to determine the associated KPI data and the associated weight as an input-output association of the black box system.