WO2018014674A1 - Method, apparatus, and system for determining degree of association of input and output of black box system

Info

Publication number
WO2018014674A1
WO2018014674A1 PCT/CN2017/087940 CN2017087940W WO2018014674A1 WO 2018014674 A1 WO2018014674 A1 WO 2018014674A1 CN 2017087940 W CN2017087940 W CN 2017087940W WO 2018014674 A1 WO2018014674 A1 WO 2018014674A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
kqi
kpi
association
vector space
Prior art date
Application number
PCT/CN2017/087940
Other languages
French (fr)
Chinese (zh)
Inventor
孟晟
施风
眭鸿飞
赵黎波
王士刚
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2018014674A1

Definitions

  • The present invention relates to the field of network and communication technologies, and in particular to a method, apparatus, and system for determining the degree of association between the input and output of a black box system.
  • FIG. 1 is a schematic diagram of the black box system between the KQI and the KPI according to the related art of the present invention.
  • Traditional solutions rely mainly on manual experience. Faced with numerous systems or local KPIs, reporting parameters, alarm information, auxiliary messages, etc. (usually as many as hundreds), it is very difficult to troubleshoot problems in a timely and accurate manner. Therefore, cracking the black box association between KQI and KPI has always been a hot topic in the industry.
  • FIG. 2 is a scatter plot according to the related art of the present invention showing the phenomenon in which the KPI is good but the KQI is poor. As FIG. 2 shows, the fitting/regression of the existing methods does not correctly reflect the relationship between KQI and KPI.
  • KPI clustering: the related art offers two methods for KPI clustering. One is manual division; the other is a simple clustering algorithm based only on Euclidean distance (such as K-Means), which requires the number of clusters to be specified manually and takes no account of business characteristics or engineering significance. 5) Given 1) to 4), the related-art methods cannot quantitatively use the thresholds of multiple KPIs to jointly decide whether a KQI exceeds its limit, nor give a confidence level for the KQI overrun. Furthermore, they cannot optimize or pre-optimize the network parameters from the multi-dimensional KPI thresholds when KQI data is temporarily missing.
  • The embodiments of the invention provide a method, an apparatus, and a system for determining the degree of association between the input and output of a black box system, so as to at least solve the problem in the related art that, when the association between the input and output of a black box system is determined, the accuracy is too low and not all associated items are found.
  • A method for determining the input/output association degree of a black box system, comprising: matching service quality indicator (KQI) data in the black box system with key performance indicator (KPI) data to form a data vector space; clustering the KPI data according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly associated KPI items and to assist in judging the health of the indicator data; decomposing the data vector space to separate and extract the association features of the KPI data with respect to the KQI data, and calculating a normalized degree of association of the KPI data with the KQI data; determining, according to the normalized degree of association, the KPI data associated with each item of KQI data, and calculating an association weight of the associated KPI data for the KQI data; and determining the associated KPI data and the association weight as the input/output association degree of the black box system.
  • Clustering the KPI data according to the service type of the KQI data includes: dividing the KQI data and the KPI data into a KQI data layer and a KPI data layer; adding, between the KQI data layer and the KPI data layer, abstraction layer parameters related to the service type corresponding to the KQI data, wherein the abstraction layer normalizes or maps the KPI data to fit the corresponding mining algorithm; and clustering the KPI data using the abstraction layer parameters.
  • Decomposing the data vector space includes: decomposing the data vector space in at least one of the following manners to separate the association features of the KPI data with respect to the KQI data: dimension reduction in the spatial dimension, direct decomposition in the spatial dimension, and dimension expansion in the spatial dimension.
  • Matching the KQI data with the KPI data includes: matching the KQI data with the KPI data to form a data vector space, so as to separate the degree of association of the KPI data with the KQI data.
  • Matching the KQI data with the KPI data to form the data vector space further includes: according to the service characteristic information of the KQI data, computing and graphically displaying the distribution of the KPI data to determine the validity of the KPI data, and handling missing values, abnormal values, or mappings as needed; and matching the KQI data with the valid KPI data to form the data vector space, so as to separate the degree of association of the KPI data with the KQI data.
  • The manner of decomposing the data vector space to separate the degree of association of the KPI data with the KQI data includes: dimension reduction, direct decomposition, and dimension expansion followed by extraction of effective features.
  • the service characteristic information is obtained according to matching a preset database and/or according to a service requirement.
  • performing a dimensionality reduction operation on the data vector space includes: decision tree pruning, regression merging, clustering, expert-assisted decision, and the like.
  • performing direct decomposition on the data vector space includes: decomposing the data vector space based on a Bayesian statistical algorithm, an equivalent numerical calculation method based on a singular value decomposition idea, and the like.
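As one illustration of what a Bayesian treatment could look like, the sketch below treats the probability that a KQI overruns when a KPI breaches its threshold as a random variable with a Beta prior and updates it from matched samples. This is an assumption for illustration only: the text does not disclose the specific Bayesian algorithm, and the function name `posterior_association` and the uniform Beta(1, 1) prior are hypothetical.

```python
def posterior_association(pairs, prior_a=1.0, prior_b=1.0):
    """Posterior mean and variance of P(KQI overrun | KPI breached).

    pairs is a list of (kpi_breached, kqi_overrun) booleans taken from
    matched KQI/KPI samples. The Beta(prior_a, prior_b) prior is an
    illustrative assumption, not a value from the source text.
    """
    n = sum(1 for kpi_bad, _ in pairs if kpi_bad)            # KPI breaches
    k = sum(1 for kpi_bad, kqi_bad in pairs if kpi_bad and kqi_bad)
    a = prior_a + k                                           # breaches with overrun
    b = prior_b + (n - k)                                     # breaches without overrun
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance
```

The posterior variance gives a rough sense of how much confidence the available samples support, which fits the idea of treating the association as a random variable rather than a fixed function.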
  • Performing dimension expansion on the data vector space and then extracting effective features includes: expanding the dimension of the data vector space with an algorithm based on the support vector machine (SVM); or expanding the dimension with a neural network algorithm, that is, one in which the number of hidden layer units is higher than the input dimension.
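To make the idea of dimension expansion concrete, the following minimal sketch maps a KPI vector into a higher-dimensional space with an explicit degree-2 polynomial feature map, similar in spirit to what a polynomial-kernel SVM does implicitly (up to scaling constants). The function name `expand_degree2` is hypothetical, and this is not the SVM or neural network procedure of the embodiment itself.

```python
from itertools import combinations_with_replacement


def expand_degree2(kpi_vector):
    """Append all pairwise products to the original KPI values.

    An input of dimension d becomes a vector of dimension d + d*(d+1)/2,
    so relationships that are not linear in the original space may
    become linear in the expanded one.
    """
    values = list(kpi_vector)
    products = [a * b for a, b in combinations_with_replacement(values, 2)]
    return values + products
```

For a two-dimensional input the expanded vector has five components: the two original values, their squares, and their product.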
  • The method further includes: calculating quantized one-dimensional thresholds and quantized multi-dimensional thresholds of the KPI data associated with each item of KQI data; obtaining a false positive rate and/or a missed rate of KQI overrun according to the quantized multi-dimensional thresholds; and analyzing, according to the false positive rate and/or the missed rate, whether the associated KPI data contains a complete basis of the KQI overrun space.
  • The method further includes: determining whether the KQI data is missing; and, when the KQI data is missing, inversely inferring the probability of misjudging the KQI data based on the quantized multi-dimensional thresholds of the historical KPI data, and performing system pre-optimization and parameter adjustment.
  • The KPI data includes at least one of the following: Radio Resource Control (RRC) connection establishment success rate, Evolved Radio Access Bearer (E-RAB) success rate, wireless connection rate, E-RAB drop rate, evolved NodeB (eNB) handover success rate, cell user plane packet loss rate, cell user plane downlink packet loss rate, cell user plane downlink average delay, cell downlink packet count, Media Access Control (MAC) layer uplink block error rate, MAC layer downlink block error rate, uplink initial Hybrid Automatic Repeat Request (HARQ) retransmission ratio, HARQ retransmission ratio, downlink dual-stream traffic ratio, Quadrature Phase Shift Keying (QPSK) ratio, uplink 16QAM ratio, downlink QPSK ratio, downlink 16QAM ratio, downlink 64 Quadrature Amplitude Modulation (QAM) ratio, air interface uplink service byte count, air interface service byte count, number of uplink Physical Resource Blocks (PRB)
  • the KQI data includes a Hypertext Transfer Protocol (HTTP) response delay.
  • the cluster includes at least one of the following: a capacity indicator cluster, an access indicator cluster, an efficiency indicator cluster, and a complete retention indicator cluster.
  • the complete maintenance indicator cluster further includes at least one of the following: a group service cluster, an uplink complete cluster, and a downlink complete cluster.
  • An apparatus for determining the input/output association degree of a black box system, comprising: a separation module, configured to match service quality indicator (KQI) data in the black box system with key performance indicator (KPI) data to form a data vector space, so as to separate the degree of association of the KPI data with the KQI data; a clustering module, configured to cluster the KPI data according to the service type of the KQI data; a first calculating module, configured to decompose the data vector space and calculate a normalized degree of association of the KPI data with the KQI data; a second calculating module, configured to determine, according to the normalized degree of association, the KPI data associated with each item of KQI data, and to calculate an association weight of the associated KPI data for the KQI data; and a determination module, configured to determine the associated KPI data and the association weight as the input/output association degree of the black box system.
  • An association analysis system, comprising: a storage unit, configured to store KQI data and KPI data in a service network; a data pre-processing unit, configured to pre-process the KQI data and the KPI data, wherein the pre-processing includes data matching, data cleaning, statistical feature extraction, and statistical data presentation;
  • a clustering unit, configured to intelligently cluster the KPI data and output a clustering table; a vector space decomposition unit, connected to the data pre-processing unit and configured to decompose the vector space formed by the pre-processed KQI data and KPI data, and to extract the association component of the KQI data with respect to the KPI data that can be normalized and quantized;
  • a quantization association calculation unit, connected to the vector space decomposition unit and configured to perform a normalized quantization calculation on the association component, obtain a quantized degree of association of the KQI data with the KPI data, calculate the total ranking weight, and output a quantized association matrix including the weight;
  • a multi-dimensional threshold calculation unit, configured to calculate, according to the association matrix, a multi-dimensional quantization threshold of the associated KPI data and to inversely infer a false positive rate and/or a missed rate of KQI overrun, and to output a multi-dimensional quantization threshold matrix and KQI overrun evaluation data; and an optimization unit, configured to perform network optimization on the service network according to the multi-dimensional quantization threshold matrix and the KQI overrun evaluation data.
  • the system further includes: a service data interface, including a presentation interface, configured to receive an external command to perform an auxiliary decision on the output data of the system.
  • the system further includes: a data mining analysis algorithm pool, configured to store a data mining algorithm of the system.
  • a database set to store data analysis and mining conclusions of the system, and intermediate process information.
  • a storage medium is also provided.
  • the storage medium is arranged to store program code for performing the following steps:
  • the associated KPI data and the associated weight are determined as the input and output relevance of the black box system.
  • The service quality indicator (KQI) data in the black box system is matched with the key performance indicator (KPI) data to form a data vector space; the KPI data is clustered according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly associated KPI items and to assist in judging the health of the indicator data; the data vector space is decomposed to separate the association features of the KPI data with respect to the KQI data, and a normalized degree of association of the KPI data with the KQI data is calculated; the KPI data associated with each item of KQI data is determined according to the normalized degree of association, and an association weight of the associated KPI data for the KQI data is calculated; and the associated KPI data and the association weight are determined as the input/output association degree of the black box system.
  • Because the KQI data is quantitatively calculated, normalized, and compared with the KPI data, the problem in the related art of low accuracy and incomplete association items when determining the association between the input and output of a black box system can be solved.
  • FIG. 1 is a schematic diagram of a black box system between a KQI and a KPI according to the related art of the present invention
  • FIG. 2 is a schematic diagram showing a phenomenon in which a "KPI is good KQI is poor" in a scatter diagram according to the related art of the present invention
  • FIG. 3 is a flow chart of a method of determining an input and output association degree of a black box system according to an embodiment of the present invention
  • FIG. 4 is a structural block diagram of an apparatus for determining an input/output association degree of a black box system according to an embodiment of the present invention
  • FIG. 5 is a structural block diagram of an association analysis system according to an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of a KQI-KPI multi-dimensional quantitative association analysis solution according to an embodiment of the present invention
  • FIG. 7 is a block diagram of a KQI-KPI multi-dimensional quantitative correlation analysis system according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of initial hierarchical clustering in a smart clustering process according to an embodiment of the present invention.
  • FIG. 9 is a graph showing a frequency fitting curve of a parameter that passes both the normal distribution test and the data source health test according to an embodiment of the present invention.
  • FIG. 10 is a graph showing a frequency fitting curve of a parameter that does not pass the normal distribution test but passes the data source health test according to an embodiment of the present invention
  • FIG. 11 is a graph showing a frequency fitting curve of a parameter that does not pass the normal distribution test and fails the data source health test according to an embodiment of the present invention
  • FIG. 12 is a schematic diagram of calculating KPI weights of related items by using a primary quantization decision point and a secondary quantization decision point according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a method for determining the input and output association degree of the black box system according to an embodiment of the present invention. The process includes the following steps:
  • Step S302: match the service quality indicator (KQI) data in the black box system with the key performance indicator (KPI) data to form a data vector space, so as to separate the degree of association of the KPI data with the KQI data;
  • Step S304: cluster the KPI data according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly associated KPI items and to assist in judging the health of the indicator data;
  • Step S306: decompose the data vector space to separate the association features of the KPI data with respect to the KQI data, and calculate the normalized degree of association of the KPI data with the KQI data;
  • Step S308: determine, according to the normalized degree of association, the KPI data associated with each item of KQI data, and calculate the association weight of the associated KPI data for the KQI data;
  • Step S310: determine the associated KPI data and the association weight as the input/output association degree of the black box system.
  • The scenario of this embodiment may be applied to network optimization, user profiling, reviews, recommendation, and the like, but is not limited thereto.
  • The degree of association between KQI and KPI is regarded as a random variable rather than a fixed function, preferably under Bayesian statistical theory, and the method flow is as follows:
  • the KQI and KPI data collected at both ends of the network/communication system are aligned by time granularity and spatial granularity matching.
  • the aligned KQI and KPI data are further cleaned according to the protocol, specifications, and actual business requirements.
  • Data cleaning content includes: eliminating outliers and filling in missing values.
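The alignment and cleaning steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (`cell`, `ts`, `value`, `values`), the 900-second time bucket, the valid range passed to `clean`, and the median fill are all hypothetical choices, not details taken from the text.

```python
from statistics import median


def align(kqi_rows, kpi_rows, granularity_s=900):
    """Match KQI and KPI samples that share a cell and a time bucket."""
    def key(row):
        # Spatial granularity: cell id. Time granularity: fixed-width bucket.
        return (row["cell"], row["ts"] // granularity_s)

    kpi_by_key = {key(row): row for row in kpi_rows}
    matched = []
    for q in kqi_rows:
        p = kpi_by_key.get(key(q))
        if p is not None:
            matched.append({"cell": q["cell"], "bucket": key(q)[1],
                            "kqi": q["value"], "kpi": dict(p["values"])})
    return matched


def clean(matched, name, lo, hi):
    """Treat out-of-range values of KPI `name` as missing, then fill gaps.

    Missing and outlier values are replaced by the median of the valid
    values (raises statistics.StatisticsError if no valid value exists).
    """
    def is_valid(v):
        return v is not None and lo <= v <= hi

    fill = median(m["kpi"][name] for m in matched if is_valid(m["kpi"].get(name)))
    for m in matched:
        if not is_valid(m["kpi"].get(name)):
            m["kpi"][name] = fill
    return matched
```

KQI samples with no KPI counterpart in the same cell and bucket are simply dropped here; a real deployment would need to decide how to handle them according to the protocol and business requirements.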
  • The characteristics of the cleaned KQI and KPI data are computed and presented in a variety of charts.
  • Business data experts combine the statistical indicators to judge the health of the data source (whether there are misreports or omissions that can be detected by numerical methods), the type of data distribution, and so on, so that the data mining algorithm can be better adapted.
  • Intelligent hierarchical clustering is performed on the cleaned KPI data to obtain a clustering table: the KPIs are divided into several categories based on a combination of business requirements and engineering significance, and KPI classifications with blurred boundaries are fine-tuned.
  • the vector space formed by the KQI-KPI data is decomposed, and the quantizable degree of association of each KPI-KQI is separated and normalized.
  • the vector space is expanded or coordinate transformed to obtain a clearer KQI-KPI quantizable degree of association.
  • the KPI-KQI normalized relevance is sorted and judged, and the classification is sorted according to the degree of relevance.
  • the business data expert specifies the final correlation KPI based on statistical information, business requirements, clustering tables, and engineering significance.
  • the normalized weight of the final correlation KPI is calculated according to the KPI-KQI normalized degree of association of the final correlation KPI.
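The text does not give the weight formula. One simple possibility, shown here purely as an assumption, is to make each final correlation KPI's weight proportional to its normalized degree of association so that the weights sum to 1. The function name `normalized_weights` is hypothetical.

```python
def normalized_weights(association):
    """Scale association degrees so that the resulting weights sum to 1.

    association maps each final correlation KPI name to its normalized
    degree of association with the KQI (a value in [0, 1]).
    """
    total = sum(association.values())
    if total <= 0:
        raise ValueError("at least one association degree must be positive")
    return {name: degree / total for name, degree in association.items()}
```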
  • the one-dimensional threshold of each final correlation KPI is calculated based on the KPI-KQI normalized correlation degree of the final correlation KPI and the preset warning threshold of the KQI.
  • the joint threshold of multiple final correlation KPIs is calculated.
  • the joint threshold dimension is between 2 and the number of final correlation KPIs.
  • The missed rate of KQI overrun (KQI->KPI) under the expected false positive rate (KPI->KQI) is calculated using the multi-dimensional threshold of the final correlation KPI.
  • The false positive rate is defined as the proportion of the data selected by the KPI multi-dimensional threshold in which the KQI does not exceed the limit; the missed rate is defined as the proportion of the KQI over-limit data that is not within the screening range of the KPI multi-dimensional threshold.
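The two rates defined above can be computed directly from matched samples. In the sketch below, a sample is assumed to be selected by the multi-dimensional threshold when every listed KPI is at or above its threshold, and the KQI is assumed to overrun when it is above `kqi_limit`; both rules, and the function name `overrun_rates`, are illustrative assumptions.

```python
def overrun_rates(samples, kpi_thresholds, kqi_limit):
    """Return (false_positive_rate, missed_rate) for a joint KPI threshold.

    samples is a list of (kpi_dict, kqi_value) pairs.
    """
    flagged = [all(kpi[name] >= limit for name, limit in kpi_thresholds.items())
               for kpi, _ in samples]
    overrun = [kqi > kqi_limit for _, kqi in samples]

    n_flagged = sum(flagged)
    n_overrun = sum(overrun)
    # Selected by the KPI threshold although the KQI is within its limit.
    false_pos = sum(f and not o for f, o in zip(flagged, overrun))
    # KQI over the limit but outside the KPI threshold screening range.
    missed = sum(o and not f for f, o in zip(flagged, overrun))

    fpr = false_pos / n_flagged if n_flagged else 0.0
    missed_rate = missed / n_overrun if n_overrun else 0.0
    return fpr, missed_rate
```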
  • The association matrix between the KQI and its final related item KPIs and the KPI multi-dimensional threshold matrix are presented in the interface, and the extracted association features are stored in the background expert intelligence database.
  • The intermediate output of the entire calculation and decision process may be selectively presented on the interface for the business data expert, who may make auxiliary decisions and manual fine adjustments.
  • The KQI overrun judgment is performed using the existing multi-dimensional threshold. If the false positive rate increases beyond the offset threshold, the entire data set is recalculated; otherwise, the existing association analysis conclusion, combined with adaptive fine adjustment, is used to guide KQI optimization and parameter adjustment.
  • Operators, network builders, and network maintainers can quickly and comprehensively find, in a quantitative manner, the KPIs strongly associated with a given KQI, calculate the corresponding one-dimensional and multi-dimensional KPI thresholds, and then infer in reverse whether the KQI exceeds the limit. This provides accurate and convenient guidance for network evaluation, performance optimization, parameter adjustment, and the like, improves service quality, and greatly reduces the labor burden.
  • The method according to the above embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • The technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc), including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods of the various embodiments of the present invention.
  • An apparatus and a system for determining the degree of association between the input and output of a black box system are also provided.
  • The apparatus and the system are used to implement the above embodiments and preferred embodiments; what has already been described is not repeated.
  • The term "module" may refer to a combination of software and/or hardware that implements a predetermined function.
  • Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 4 is a structural block diagram of an apparatus for determining an input/output association degree of a black box system according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes:
  • the separation module 40 is configured to match the service quality indicator KQI data in the black box system with the key performance indicator KPI data to form a data vector space to separate the correlation degree of the KPI data to the KQI data;
  • the clustering module 42 is configured to cluster the KPI data according to the service type of the KQI data;
  • the first calculating module 44 is configured to decompose the data vector space and calculate a normalized degree of association of the KPI data with the KQI data;
  • the second calculating module 46 is configured to determine, according to the normalized degree of association, the KPI data associated with each item of KQI data, and to calculate the association weights of the associated KPI data for the KQI data;
  • the determination module 48 is configured to determine the associated KPI data and the association weights as the input/output association degree of the black box system.
  • FIG. 5 is a structural block diagram of an association analysis system according to an embodiment of the present invention. As shown in FIG. 5, the system includes:
  • the storage unit 50 is configured to store KQI data and KPI data in the service network;
  • the data pre-processing unit 52 is configured to pre-process the KQI data and the KPI data, wherein the pre-processing includes: data matching, data cleaning, statistical feature extraction, and statistical data presentation;
  • the clustering unit 54 is configured to perform intelligent clustering on the KPI data and output a clustering table;
  • the vector space decomposition unit 56 is connected to the data pre-processing unit and configured to decompose the vector space formed by the pre-processed KQI data and KPI data, and to extract the association component of the KQI data with respect to the KPI data that can be normalized and quantized;
  • the quantization correlation calculation unit 58 is connected to the vector space decomposition unit and configured to perform a normalized quantization calculation on the association component, obtain a quantized degree of association of the KQI data with the KPI data, calculate the total ranking weight, and output a quantized association matrix including the weight;
  • the multi-dimensional threshold calculation unit 60 is configured to calculate, according to the association matrix, a multi-dimensional quantization threshold of the associated KPI data and to inversely infer a false positive rate and/or a missed rate of KQI overrun, and to output a multi-dimensional quantization threshold matrix and KQI overrun evaluation data;
  • the optimization unit 32 is configured to perform network optimization on the service network according to the multi-dimensional quantization threshold matrix and the KQI overrun evaluation data.
  • The system further includes: a service data interface, including a presentation interface, configured to receive an external command to make an auxiliary decision on the output data of the system; a data mining analysis algorithm pool, configured to store the data mining algorithms of the system; and a database, configured to store the data analysis and mining conclusions of the system as well as intermediate process information.
  • The system according to the embodiment is configured to store all indicator data and operation management data.
  • The business data expert interface includes a data and conclusion presentation interface, and is configured for business data experts to view statistical analysis information and to input auxiliary conditions or judgments.
  • the KQI/KPI storage unit is configured to store all KQI/KPI items, including information such as service type, customer requirements, and expert assistance.
  • The data preprocessing unit extracts the complete set or a subset of KQI and KPI items from the storage unit according to the dispatch order, the service type, and the customer demand, and sequentially performs data matching, data cleaning, statistical feature extraction, and statistical data presentation in preparation for the association analysis calculation.
  • the intelligent clustering unit intelligently clusters KPI items according to the type, distribution and engineering meaning of the business, and combines the business data expert judgment to output the intelligent clustering table.
  • the clustering process is not limited to a single clustering algorithm, nor does it pre-specify the number of fixed clusters, based entirely on data and services.
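As an illustration of clustering without a pre-specified number of clusters, the sketch below groups KPI time series whose absolute Pearson correlation reaches a cut-off, which is equivalent to cutting a single-linkage hierarchy at that level. It covers only the data-driven part; the business and expert judgments described above are not modeled. The names `pearson` and `cluster_kpis` and the 0.9 cut-off are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cluster_kpis(series, min_abs_corr=0.9):
    """Group KPI names; the number of clusters falls out of the data."""
    names = list(series)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(series[a], series[b])) >= min_abs_corr:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in clusters.values())
```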
  • the vector space decomposition unit is configured to decompose the vector space formed by the preprocessed KQI and KPI data, and extract the KQI-KPI correlation component that can be normalized and quantized. It includes direct decomposition and post-expansion decomposition.
  • the quantization correlation calculation unit is configured to perform a normalized quantization calculation on the KQI-KPI correlation component extracted by the vector space decomposition unit to obtain a KQI-KPI quantization correlation degree.
  • All KPI items participating in the association analysis are divided into four types: the final related item KPI, the related similar item KPI, the reminder item KPI, and the unrelated item KPI.
  • the total ranking weight is then calculated based on the quantitative relevance of all final correlation KPIs.
  • the quantization correlation matrix (including the weight) is output on the presentation interface.
  • The multi-dimensional threshold calculation unit is configured to calculate a multi-dimensional quantization threshold of the final correlation KPI, and then to calculate the false positive rate and the missed rate with which the KPI multi-dimensional quantization threshold infers KQI overrun in reverse. Finally, the multi-dimensional quantization threshold matrix and the false positive rate and missed rate of KQI overrun are output on the presentation interface.
  • the data mining analysis algorithm pool is set to store all the data mining algorithms that may be used in the entire association analysis calculation process.
  • Each computing module automatically or expertly assists in selecting the appropriate algorithm based on the data and business characteristics.
  • The data expert intelligence library is configured to store business-based association analysis conclusions and intermediate process information, which can be used for business association analysis, network/system parameter optimization, and other value-added services.
  • each of the above modules may be implemented by software or hardware.
  • For the latter, this may be implemented in, but is not limited to, the following ways: the foregoing modules are all located in the same processor; or the modules are located in different processors in any combination.
  • FIG. 6 is a schematic flowchart of the KQI-KPI multi-dimensional quantitative association analysis scheme provided by an embodiment of the present invention; the scheme flow is as shown in FIG. 6.
  • FIG. 7 is a block diagram of the KQI-KPI multi-dimensional quantitative correlation analysis system provided by an embodiment of the present invention; the corresponding system module structure of the scheme is as shown in FIG. 7.
  • The scheme includes the following steps:
  • the main corresponding functional unit is module 40.
  • the module 40 reads the KQI item and the KPI item to be associated with the analysis from the module 30 according to the service type, the protocol specification, and the like; then the module 40 reads the corresponding KQI and KPI data from the module 10, and performs data matching, data cleaning, and statistics. Pre-processing operations such as analysis, fitting, and feature extraction. In the data pre-processing process, the module 40 calls the module 90; optionally, the module 40 calls the module 100.
  • FIG. 8 is a schematic diagram of the initial hierarchical clustering in the smart clustering process provided by the embodiment of the present invention, as shown in FIG. 8.
  • Module 50 then invokes module 100 for the final clustering decision and outputs a clustering table.
  • the module 40 invokes the module 20 to perform on-site expert-assisted clustering decisions.
  • the main corresponding functional unit is module 60.
  • Module 60 takes the KQI and KPI data output by module 40 to form a KQI-KPI data vector space, and calls module 90 to perform a vector space decomposition operation that separates out the quantifiable KQI-KPI association components.
  • the main corresponding functional unit is module 70.
  • the module 70 performs a normalized quantization operation on the KQI-KPI correlation components to obtain the degree of association between each KPI and the corresponding KQI.
  • The KQI-KPI degree of association lies in the range [0, 1]. By setting an unrelated threshold and a related threshold, related-item KPIs, reminder-item KPIs, and unrelated-item KPIs can be distinguished.
  • Module 70 calls the cluster table output by module 50 and, among the related-item KPIs of the same cluster, determines the final related-item KPI; the other related-item KPIs in that cluster are called same-type related KPIs. After calculating the normalized weights of the final related-item KPIs, module 70 outputs the related-item KPI matrix of each KQI, including the weights.
  • the main corresponding functional unit is module 80.
  • the module 80 sequentially calculates the single threshold and the multi-dimensional threshold of the final correlation KPI according to the final correlation KPI and weight of the KQI determined by the module 70. According to the requirements of business and system accuracy, the dimension of the multi-dimensional threshold starts from two-dimensional and does not exceed the number of final related items KPI.
  • the main corresponding functional unit is module 80.
  • The cleaned KQI and KPI data obtained in step S101 are used to verify the false positive rate and the missed rate of KQI overrun. The false positive rate is then used to test whether the limit of the missed rate is equal to or close to zero. “Yes” indicates that the current final KPI items contain a complete basis of the KQI overrun space. “No” indicates that the KQI still has related KPIs that are not covered by module 30.
  • This embodiment further includes an example in which a mobile communication operator looks for the main related KPIs behind slow Internet access in the main urban area of a provincial capital.
  • The degree of association between the KQI (HTTP response delay) and 30 radio-side KPIs is examined.
  • Step 1 belongs to S101.
  • The KQI to be evaluated and the list of the corresponding 30 radio-side KPI items are read from module 30, as shown in Table 1.
  • Step 2 belongs to S101.
  • Cell-level data at 24-hour granularity for the KQI and KPIs to be analyzed is read from module 10 and matched by acquisition time and cell number.
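As a rough illustration of the matching in step 2, the sketch below joins KQI and KPI samples on the pair (acquisition time, cell number). It is only a minimal sketch: the function name `match_kqi_kpi`, the record layouts, the KPI names, and the sample values are assumptions made for illustration and do not come from the patent.

```python
from collections import defaultdict


def match_kqi_kpi(kqi_rows, kpi_rows):
    """Join KQI and KPI samples on (timestamp, cell_id).

    kqi_rows: iterable of (timestamp, cell_id, kqi_value)
    kpi_rows: iterable of (timestamp, cell_id, kpi_name, kpi_value)
    Returns {(timestamp, cell_id): {"kqi": value, "kpis": {name: value}}}
    for keys that appear in both inputs. Layout is hypothetical.
    """
    kpis_by_key = defaultdict(dict)
    for ts, cell, name, value in kpi_rows:
        kpis_by_key[(ts, cell)][name] = value

    matched = {}
    for ts, cell, kqi in kqi_rows:
        key = (ts, cell)
        if key in kpis_by_key:  # keep only samples with both KQI and KPI data
            matched[key] = {"kqi": kqi, "kpis": dict(kpis_by_key[key])}
    return matched


# Hypothetical daily, cell-level samples (HTTP response delay in ms).
kqi_rows = [("2016-07-01", 101, 92.0), ("2016-07-01", 102, 110.0), ("2016-07-02", 101, 88.0)]
kpi_rows = [
    ("2016-07-01", 101, "rrc_setup_success", 0.99),
    ("2016-07-01", 101, "dl_prb_util", 0.41),
    ("2016-07-01", 102, "rrc_setup_success", 0.93),
    ("2016-07-03", 101, "rrc_setup_success", 0.97),
]
matched = match_kqi_kpi(kqi_rows, kpi_rows)
```

Samples that lack a counterpart on the other side (here the KQI row for 2016-07-02 and the KPI row for 2016-07-03) are dropped, so every remaining key carries one KQI value and at least one KPI value.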
  • Step 3 belongs to S101.
  • The matched KQI and KPI data undergo outlier processing, statistical analysis, distribution fitting, and the like, and statistical feature values are extracted.
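The outlier processing and feature extraction of step 3 might look like the sketch below. The patent does not name a specific outlier method or feature set, so the interquartile-range (IQR) rule, the function name `clean_and_summarize`, the chosen features, and the sample values are all assumptions for illustration.

```python
import statistics


def clean_and_summarize(values, k=1.5):
    """Drop outliers with the IQR rule, then extract simple statistical features.

    The IQR rule and the feature set are illustrative assumptions only.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles of the raw data
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    cleaned = [v for v in values if low <= v <= high]
    features = {
        "count": len(cleaned),
        "mean": statistics.mean(cleaned),
        "stdev": statistics.stdev(cleaned) if len(cleaned) > 1 else 0.0,
        "median": statistics.median(cleaned),
    }
    return cleaned, features


# Hypothetical samples of one KPI for one cell; 200 is an obvious outlier.
samples = [10, 11, 12, 13, 14, 15, 16, 200]
cleaned, features = clean_and_summarize(samples)
```

A distribution test such as the normal or lognormal check described for module 20 would be applied to `cleaned` afterwards; it is left out here because it needs a statistics library beyond the standard one.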
  • Examples of the presentation of module 20 are as follows:
  • FIG. 9 is a frequency fitting curve of a parameter that passes the normal distribution test and the data source health test according to an embodiment of the present invention. As shown in FIG. 9, the indicator parameter passes the lognormal distribution test and the health determination; all statistical and mining algorithms in module 90 are available.
  • FIG. 10 is a frequency fitting curve of a parameter that does not pass the normal distribution test but passes the data source health test according to an embodiment of the present invention.
  • The indicator parameter does not pass the normal/lognormal distribution test, and the system automatically attempts to match other common distributions; it passes the health determination; the statistical or mining algorithms in module 90 that assume a normal distribution are not available.
  • FIG. 11 is a frequency fitting curve of a parameter that does not pass the normal distribution test and fails the data source health test according to an embodiment of the present invention.
  • The indicator parameter does not pass the normal/lognormal distribution test and fails to match other common distributions; it does not pass the health determination; it is excluded from subsequent calculation and is immediately submitted for troubleshooting.
  • Step 4 belongs to S102. Using the KPI data and statistical feature values obtained in steps 1 through 3, an initial hierarchical clustering of the 30 KPIs is performed without a pre-specified number of categories, as shown in FIG.
  • Step 5 belongs to S102. Based on step 4, combining the expert clustering information in the module 100, 30 KPIs are grouped into six classes. An example is shown in Table 2.
  • Step 6 belongs to S103.
  • This example uses Bayesian statistics to decompose the KQI-KPI data vector space and describes the KPI->KQI mapping association component as a conditional probability. Since the KQI-KPI data in module 10 has been strictly matched in the spatial and temporal dimensions and its values are known, the KPI->KQI association component can be calculated numerically from Bayes' formula for a given KQI value (usually the warning threshold).
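The conditional-probability reading of step 6 can be sketched as follows. This is one possible interpretation under stated assumptions, not the patent's exact computation: event A is "the KQI exceeds its warning threshold", event B is "the KPI is in a degraded state", and the association component is taken to be P(A | B) = P(B | A) · P(A) / P(B), with every probability estimated by counting matched samples. The function name, the degradation predicate, and the sample values are hypothetical.

```python
def bayes_association(samples, kqi_threshold, kpi_is_degraded):
    """Estimate P(KQI over threshold | KPI degraded) via Bayes' formula.

    samples: list of matched (kqi_value, kpi_value) pairs.
    kpi_is_degraded: hypothetical predicate deciding whether a KPI value is bad.
    Returns None when a required probability cannot be estimated.
    """
    n = len(samples)
    n_a = sum(1 for kqi, _ in samples if kqi > kqi_threshold)
    n_b = sum(1 for _, kpi in samples if kpi_is_degraded(kpi))
    n_ab = sum(1 for kqi, kpi in samples if kqi > kqi_threshold and kpi_is_degraded(kpi))
    if n == 0 or n_a == 0 or n_b == 0:
        return None
    p_a = n_a / n               # P(A): KQI over the warning threshold
    p_b = n_b / n               # P(B): KPI degraded
    p_b_given_a = n_ab / n_a    # P(B | A)
    return p_b_given_a * p_a / p_b  # Bayes' formula for P(A | B)


# Hypothetical matched pairs: (HTTP response delay in ms, RRC setup success rate).
pairs = [(120, 0.90), (130, 0.88), (80, 0.99), (85, 0.98), (125, 0.97), (90, 0.91)]
association = bayes_association(pairs, kqi_threshold=100, kpi_is_degraded=lambda v: v < 0.95)
```

Because the result is a probability, it already lies in [0, 1], which makes it directly comparable across KPIs with different units.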
  • Step 7 belongs to S104.
  • The KPI->KQI association component is quantized and normalized to obtain the KPI-KQI normalized degree of association ∈ [0, 1].
  • The threshold for judging a KPI to be a related item is set to higher than 0.7.
  • The threshold for judging a KPI to be an unrelated item is set to lower than 0.5.
  • Values between 0.5 and 0.7 are judged to be reminder items.
  • The decision method is determined according to service requirements and numerical characteristics, and is not limited to hard threshold decisions.
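The hard-threshold variant of the step 7 decision can be written in a few lines. This is a minimal sketch: the function name and labels are made up for illustration, and treating the boundary values 0.5 and 0.7 as reminder items is an assumption drawn from the wording "higher than 0.7" and "lower than 0.5".

```python
def classify_kpi(degree, related_threshold=0.7, unrelated_threshold=0.5):
    """Classify a KPI by its normalized degree of association in [0, 1]."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree of association must lie in [0, 1]")
    if degree > related_threshold:
        return "related"
    if degree < unrelated_threshold:
        return "unrelated"
    return "reminder"  # between the two thresholds, boundaries included


labels = [classify_kpi(d) for d in (0.85, 0.6, 0.3)]
```

As the text notes, the decision need not be a hard threshold; a soft or service-specific rule could replace this function without changing the later steps.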
  • Step 8 belongs to S104.
  • The cluster table obtained in step 5 is queried; within each category, the related-item KPI with the highest degree of association is taken as the final related-item KPI, so the number of final related items does not exceed six (one per category).
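Selecting one final related-item KPI per cluster, as in step 8, can be sketched as below. The KPI names, cluster labels, and degrees of association are hypothetical, and the 0.7 related threshold is carried over from step 7.

```python
def select_final_kpis(degrees, clusters, related_threshold=0.7):
    """Pick, for each cluster, the related-item KPI with the highest degree.

    degrees:  {kpi_name: normalized degree of association}
    clusters: {kpi_name: cluster_label} (the cluster table)
    Returns {cluster_label: kpi_name}; clusters without a related item are omitted.
    """
    best = {}
    for kpi, degree in degrees.items():
        if degree <= related_threshold:
            continue  # only related items are candidates
        cluster = clusters[kpi]
        if cluster not in best or degree > degrees[best[cluster]]:
            best[cluster] = kpi
    return best


# Hypothetical inputs for illustration.
degrees = {"kpi_a": 0.91, "kpi_b": 0.78, "kpi_c": 0.74, "kpi_d": 0.42}
clusters = {"kpi_a": "access", "kpi_b": "access", "kpi_c": "capacity", "kpi_d": "efficiency"}
final_kpis = select_final_kpis(degrees, clusters)
```

Since at most one KPI survives per cluster, the number of final related items is bounded by the number of clusters, which is six in this example.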
  • FIG. 12 is a schematic diagram of calculating related-item KPI weights using a primary quantization decision point and a secondary quantization decision point according to an embodiment of the present invention.
  • A secondary decision point is added to step 7 ("quantization decision point 2" in FIG. 12).
  • The normalized weight of each final related item is calculated from its association value at the secondary decision point together with its association value at the primary decision point ("quantization decision point 1" in FIG. 12). For example, if four final related items KPI F1 to KPI F4 are determined, referring to FIG.
  • Step 9 belongs to S105.
  • The single-item false positive rate of each final related-item KPI and the expected distribution of the two-dimensional joint false positive rate are calculated.
  • The numerical values for each pair of final related-item KPIs are determined adaptively, yielding a two-dimensional joint threshold matrix.
  • Step 10 belongs to S106. According to the single threshold and the two-dimensional joint threshold of the final correlation KPI, combined with the KQI-KPI matching data space obtained in step 2, the final false positive rate and missed rate of KQI: HTTP response delay are calculated.
  • An example of the final KPI multidimensional threshold output presentation after step 10 is shown in Table 4.
  • Step 11 belongs to S106.
  • The calculations of steps 9 and 10 are performed to obtain the missed-rate curve of the KQI (HTTP response delay). Whether the set of final related-item KPIs is complete is judged by whether the missed rate can reach or approach 0.
  • When the HTTP response delay is 90 ms, 95 ms, 100 ms, or 105 ms, zero missed judgments can be achieved, as shown in Table 5. This shows that the association analysis scheme and system described in the present invention find the complete set of KQI related items among the 30 KPI items, consistent with the description of the beneficial effects.
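The false positive rate and missed rate used in steps 10 and 11 can be sketched as follows. The patent does not spell out the denominators or the joint decision rule, so this sketch assumes the false positive rate is the share of non-overrun samples flagged as overrun, the missed rate is the share of true overrun samples that were not flagged, and the KPI-based decision is supplied as a predicate. The KPI names, thresholds, and samples are hypothetical.

```python
def overrun_rates(samples, kqi_threshold, predict_overrun):
    """Return (false_positive_rate, missed_rate) of a KPI-based overrun decision.

    samples: list of (kqi_value, kpi_dict) matched records.
    predict_overrun: hypothetical predicate on kpi_dict built from KPI thresholds.
    """
    false_pos = missed = positives = negatives = 0
    for kqi, kpis in samples:
        actual = kqi > kqi_threshold
        predicted = predict_overrun(kpis)
        if actual:
            positives += 1
            if not predicted:
                missed += 1
        else:
            negatives += 1
            if predicted:
                false_pos += 1
    fpr = false_pos / negatives if negatives else 0.0
    miss = missed / positives if positives else 0.0
    return fpr, miss


# Hypothetical decision: flag an overrun if either KPI crosses its single threshold.
def flag(kpis):
    return kpis["sinr"] < 5 or kpis["dl_prb_util"] > 0.8


records = [
    (120, {"sinr": 3, "dl_prb_util": 0.5}),
    (110, {"sinr": 8, "dl_prb_util": 0.9}),
    (90, {"sinr": 10, "dl_prb_util": 0.4}),
    (95, {"sinr": 4, "dl_prb_util": 0.3}),
    (85, {"sinr": 12, "dl_prb_util": 0.2}),
    (80, {"sinr": 9, "dl_prb_util": 0.5}),
]
fpr, miss = overrun_rates(records, kqi_threshold=100, predict_overrun=flag)
```

Evaluating `overrun_rates` over a range of `kqi_threshold` values gives a missed-rate curve of the kind described in step 11; a missed rate at or near zero suggests the chosen KPIs cover the KQI overrun cases in the data.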

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method, an apparatus, and a system for determining the degree of association of the input and output of a black box system, the method comprising: matching service quality indicator KQI data in the black box system with key performance indicator KPI data to form a data vector space; on the basis of the service type of the KQI data, clustering the KPI data; breaking down the data vector space to separate out association features of the KPI data to the KQI data, and calculating a normalised degree of association of the KPI data to the KQI data; on the basis of the normalised degree of association, respectively determining KPI data associated with each KQI data, and calculating an association weighting of the associated KPI to the KQI data; and determining that the associated KPI data and association weighting are the degree of association of the input and output of the black box system. The present invention solves the problems in the prior art of low precision and finding incomplete items of association when determining the degree of association of the input and output of a black box system.

Description

Method, apparatus, and system for determining the degree of association of the input and output of a black box system
Technical Field
The present invention relates to the field of network and communication technologies, and in particular to a method, apparatus, and system for determining the degree of association of the input and output of a black box system.
Background
When a service quality indicator (KQI, Key Quality Indicator) in a service network (including a communication system) deteriorates, it is desirable to identify the main key performance indicators (KPIs, Key Performance Indicators) that cause the deterioration, so that parameters can be adjusted or the network optimized. However, the relationship between a KQI and the KPIs is a black box system. FIG. 1 is a schematic diagram of the black box system between KQI and KPI according to the related art, as shown in FIG. 1. Traditional solutions rely mainly on manual experience. Faced with numerous system-level or local KPIs, reported parameters, alarm information, auxiliary messages, and so on (usually several hundred items), it is very difficult to troubleshoot problems promptly and accurately. Cracking the black box association between KQI and KPI has therefore long been a focus of the industry.
The theoretical basis of the related art is classical statistics, which assumes that the association between KQI and KPI follows a fixed function. Such approaches attempt to fit the association with a function (that is, KQI = f(KPI)) by means of parameter estimation, regression analysis, and the like, but they cannot reach correct conclusions and cannot explain the observed situation in which the KPIs are good while the KQI is poor. FIG. 2 is a scatter plot illustrating this "good KPI, poor KQI" phenomenon in the related art; fitting or regression with the existing methods does not correctly reflect the association between KQI and KPI, as shown in FIG. 2.
Limited by this theoretical basis, existing methods can only select, based on experience, a small number of KPI items that "should be related", manually specify an association model, assign weights among the KPIs, and then calculate the degree of association. This leads to the following problems:
1) Incompleteness. Existing methods cannot calculate the degree of association between all KPIs to be evaluated and the KQI at the same time, nor can they cope with a variable KPI list. This typically shows up as strongly related KPI items being missed.
2) Inaccuracy. Existing methods cannot accurately quantify the degree of association between all KPIs to be evaluated and the current KQI. This typically shows up as wrong association conclusions.
3) Because of 1) and 2), existing methods cannot find the complete orthogonal KPI set of a KQI, and therefore cannot accurately guide parameter optimization or value-added service mining.
4) KPI clustering in the related art takes one of two forms: manual partitioning, or a simple clustering algorithm based only on Euclidean distance (for example, K-Means). Both require the number of clusters to be specified manually, and neither takes service characteristics and engineering meaning into account together.
5) Because of 1) to 4), the related art cannot quantitatively use the thresholds of multiple KPIs to jointly decide whether the KQI exceeds its limit or to give a confidence level for the KQI overrun. Consequently, when KQI data is temporarily unavailable, network parameters cannot be optimized or pre-optimized according to multi-dimensional KPI thresholds.
No effective solution to the above problems in the related art has yet been found.
Summary of the Invention
Embodiments of the present invention provide a method, an apparatus, and a system for determining the degree of association of the input and output of a black box system, so as to at least solve the problems in the related art of low precision and incomplete identification of associated items when determining that degree of association.
According to an embodiment of the present invention, a method for determining the degree of association of the input and output of a black box system is provided, comprising: matching service quality indicator (KQI) data in the black box system with key performance indicator (KPI) data to form a data vector space; clustering the KPI data according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly related KPI items and to assist in judging the health of the indicator data; decomposing the data vector space to separate out the association features of the KPI data with respect to the KQI data, and calculating the normalized degree of association of the KPI data with the KQI data; determining, according to the normalized degree of association, the KPI data associated with each item of KQI data, and calculating the association weight of the associated KPI data with respect to the KQI data; and determining the associated KPI data and the association weights as the degree of association of the input and output of the black box system.
Optionally, clustering the KPI data according to the service type of the KQI data comprises: dividing the KQI data and the KPI data into a KQI data layer and a KPI data layer; adding, between the KQI data layer and the KPI data layer, abstraction layer parameters related to the service type corresponding to the KQI data, wherein the abstraction layer normalizes or maps the KPI data to suit the corresponding mining algorithm; and clustering the KPI data using the abstraction layer parameters.
Optionally, decomposing the data vector space comprises: decomposing the data vector space in at least one of the following ways to separate out the association features of the KPI data with respect to the KQI data: reducing the spatial dimension, directly partitioning the spatial dimension, and raising the spatial dimension.
Optionally, matching the service quality indicator (KQI) data with the key performance indicator (KPI) data to form a data vector space so as to separate out the degree of association of the KPI data with the KQI data comprises: matching the KQI data with the KPI data to form a data vector space so as to separate out the degree of association of the KPI data with the KQI data.
Optionally, matching the service quality indicator (KQI) data with the key performance indicator (KPI) data to form a data vector space so as to separate out the degree of association of the KPI data with the KQI data comprises: performing distribution fitting and graphical presentation of the KPI data according to the service characteristic information of the KQI data, judging the validity of the KPI data, and handling missing values, outliers, or mappings as needed; and matching the KQI data with the valid KPI data to form a data vector space so as to separate out the degree of association of the KPI data with the KQI data.
Optionally, methods for separating out the degree of association of the KPI data with the KQI data from the data vector space formed by matching the KQI data with the KPI data include: dimension reduction, direct decomposition, and dimension raising followed by extraction of effective features.
Optionally, the service characteristic information is obtained by matching against a preset database and/or according to service requirements.
Optionally, performing dimension reduction on the data vector space includes: decision tree pruning, regression merging, clustering, expert-assisted decision, and the like.
Optionally, directly decomposing the data vector space includes: decomposing the data vector space based on a Bayesian statistical algorithm, using an equivalent numerical method based on the idea of singular value decomposition, and the like.
Optionally, raising the dimension of the data vector space and then extracting effective features includes: raising the dimension of the data vector space with a support vector machine (SVM) based algorithm and then decomposing it; or dimension raising based on a neural network algorithm, in which the number of hidden layer units is greater than the input dimension.
Optionally, after the KPI data associated with each item of KQI data is determined according to the normalized degree of association, the method further comprises: calculating quantized one-dimensional and quantized multi-dimensional thresholds of the KPI data associated with each item of KQI data; obtaining the false positive rate and/or missed rate of KQI data overrun according to the quantized multi-dimensional thresholds; and analyzing, according to the false positive rate and/or missed rate, whether the associated KPI data contains a complete basis of the KQI data overrun space.
Optionally, after analyzing, according to the false positive rate and/or missed rate, whether the associated KPI data contains a complete basis of the KQI data overrun space, the method further comprises:
analyzing, according to the false positive rate and/or missed rate, the probability that the associated KPI data does not contain the KQI data overrun space.
Optionally, after the KPI data associated with each item of KQI data is determined according to the normalized degree of association, the method further comprises: judging whether the KQI data is missing; and, when the KQI data is missing, inferring the probability of KQI data misjudgment in reverse from the quantized multi-dimensional thresholds of historical KPI data, and performing system pre-optimization and parameter adjustment.
Optionally, the KPI data includes at least one of the following: Radio Resource Control (RRC) connection establishment success rate, Evolved Radio Access Bearer (E-RAB) establishment success rate, radio connection rate, E-RAB drop rate, inter-eNB (evolved NodeB) handover success rate, cell user-plane uplink packet loss rate, cell user-plane downlink packet loss rate, cell user-plane downlink average delay, cell user-plane downlink packet discard rate, number of cell downlink packets, Media Access Control (MAC) layer uplink block error rate, MAC layer downlink block error rate, uplink initial Hybrid Automatic Repeat Request (HARQ) retransmission ratio, downlink initial HARQ retransmission ratio, downlink dual-stream traffic ratio, uplink Quadrature Phase Shift Keying (QPSK) ratio, uplink 16QAM ratio, downlink QPSK ratio, downlink 16QAM ratio, downlink 64 Quadrature Amplitude Modulation (QAM) ratio, air-interface uplink service bytes, air-interface downlink service bytes, uplink Physical Resource Block (PRB) average utilization, downlink PRB average utilization, uplink average throughput per PRB, downlink average throughput per PRB, -110 dBm coverage rate, average Signal to Interference plus Noise Ratio (SINR), subband 0 average Channel Quality Indicator (CQI), and average number of active user equipments (UEs) on the user plane.
Optionally, the KQI data includes Hypertext Transfer Protocol (HTTP) response delay.
Optionally, the clusters include at least one of the following: a capacity indicator cluster, an access indicator cluster, an efficiency indicator cluster, and an integrity and retainability indicator cluster.
Optionally, the integrity and retainability indicator cluster further includes at least one of the following: a packet service cluster, an uplink integrity and retainability cluster, and a downlink integrity and retainability cluster.
According to another embodiment of the present invention, an apparatus for determining the degree of association of the input and output of a black box system is provided, comprising: a separation module, configured to match service quality indicator (KQI) data in the black box system with key performance indicator (KPI) data to form a data vector space so as to separate out the degree of association of the KPI data with the KQI data; a clustering module, configured to cluster the KPI data according to the service type of the KQI data; a first calculation module, configured to decompose the data vector space and calculate the normalized degree of association of the KPI data with the KQI data; a second calculation module, configured to determine, according to the normalized degree of association, the KPI data associated with each item of KQI data and to calculate the association weight of the associated KPI data with respect to the KQI data; and a determination module, configured to determine the associated KPI data and the association weights as the degree of association of the input and output of the black box system.
According to yet another embodiment of the present invention, an association analysis system is provided, comprising: a storage unit, configured to store KQI data and KPI data of a service network; a data pre-processing unit, configured to pre-process the KQI data and the KPI data, wherein the pre-processing includes data matching, data cleaning, statistical feature extraction, and statistical data presentation; a clustering unit, configured to perform intelligent clustering on the KPI data and output a cluster table; a vector space decomposition unit, connected to the data pre-processing unit and configured to decompose the vector space formed by the pre-processed KQI data and KPI data and to extract the association components of the KQI data with respect to the KPI data that can be normalized and quantized; a quantitative association calculation unit, connected to the vector space decomposition unit and configured to perform normalized quantization on the association components to obtain the quantized degree of association of the KQI data with the KPI data, calculate the overall ranking weights, and output a quantized association matrix containing the weights; a multi-dimensional threshold calculation unit, configured to calculate multi-dimensional quantization thresholds of the related-item KPI data according to the association matrix, infer back the false positive rate and/or missed rate of KQI overrun, and output a multi-dimensional quantization threshold matrix and KQI overrun evaluation data; and an optimization unit, configured to perform network optimization on the service network according to the multi-dimensional quantization threshold matrix and the KQI overrun evaluation data.
Optionally, the system further includes: a service data interface, which includes a presentation interface and is configured to receive external instructions for making auxiliary decisions on the output data of the system.
Optionally, the system further includes: a data mining and analysis algorithm pool, configured to store the data mining algorithms of the system; and a database, configured to store the data analysis and mining conclusions of the system as well as intermediate process information.
According to yet another embodiment of the present invention, a storage medium is also provided. The storage medium is configured to store program code for performing the following steps:
matching service quality indicator (KQI) data in the black box system with key performance indicator (KPI) data to form a data vector space;
clustering the KPI data according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly related KPI items and to assist in judging the health of the indicator data;
decomposing the data vector space to separate out the association features of the KPI data with respect to the KQI data, and calculating the normalized degree of association of the KPI data with the KQI data;
determining, according to the normalized degree of association, the KPI data associated with each item of KQI data, and calculating the association weight of the associated KPI data with respect to the KQI data;
determining the associated KPI data and the association weights as the degree of association of the input and output of the black box system.
With the present invention, service quality indicator (KQI) data in the black box system is matched with key performance indicator (KPI) data to form a data vector space; the KPI data is clustered according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly related KPI items and to assist in judging the health of the indicator data; the data vector space is decomposed to separate out the association features of the KPI data with respect to the KQI data, and the normalized degree of association of the KPI data with the KQI data is calculated; the KPI data associated with each item of KQI data is determined according to the normalized degree of association, and the association weight of the associated KPI data with respect to the KQI data is calculated; and the associated KPI data and the association weights are determined as the degree of association of the input and output of the black box system. Because the degree of association of KQI data with KPI data can be calculated quantitatively and compared after normalization, the problems in the related art of low precision and incomplete identification of associated items when determining the degree of association of the input and output of a black box system are solved, achieving the effects of improving service quality and reducing manual workload.
Brief Description of the Drawings
The drawings described here are provided for a further understanding of the present invention and form a part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
FIG. 1 is a schematic diagram of the black box system between KQI and KPI according to the related art of the present invention;
FIG. 2 is a scatter plot illustrating the "good KPI, poor KQI" phenomenon according to the related art of the present invention;
FIG. 3 is a flowchart of a method for determining the degree of association of the input and output of a black box system according to an embodiment of the present invention;
FIG. 4 is a structural block diagram of an apparatus for determining the degree of association of the input and output of a black box system according to an embodiment of the present invention;
FIG. 5 is a structural block diagram of an association analysis system according to an embodiment of the present invention;
FIG. 6 is a schematic flowchart of the KQI-KPI multi-dimensional quantitative association analysis scheme provided by an embodiment of the present invention;
FIG. 7 is a block diagram of the KQI-KPI multi-dimensional quantitative association analysis system provided by an embodiment of the present invention;
FIG. 8 is a schematic diagram of the initial hierarchical clustering in the smart clustering process provided by an embodiment of the present invention;
FIG. 9 is a frequency fitting curve of a parameter that passes the normal distribution test and the data source health test according to an embodiment of the present invention;
FIG. 10 is a frequency fitting curve of a parameter that does not pass the normal distribution test but passes the data source health test according to an embodiment of the present invention;
FIG. 11 is a frequency fitting curve of a parameter that does not pass the normal distribution test and fails the data source health test according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of calculating related-item KPI weights using a primary quantization decision point and a secondary quantization decision point according to an embodiment of the present invention.
Detailed Description
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. Note that, where no conflict arises, the embodiments in this application and the features in those embodiments may be combined with one another.
Note that the terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence.
Embodiment 1
This embodiment provides a method for determining the input-output degree of association of a black box system. FIG. 3 is a flowchart of the method according to an embodiment of the present invention. As shown in FIG. 3, the flow includes the following steps:
Step S302: match the service quality indicator (KQI) data and the key performance indicator (KPI) data in the black box system to form a data vector space, so that the degree of association of the KPI data with the KQI data can be separated out;
Step S304: cluster the KPI data according to the service type of the KQI data, where the clustering result is used to select orthogonal, strongly correlated KPI items and to assist in judging the health of the indicator data;
Step S306: decompose the data vector space to separate out the association features of the KPI data with respect to the KQI data, and calculate the normalized degree of association of the KPI data with the KQI data;
Step S308: according to the normalized degree of association, determine the KPI data associated with each item of KQI data, and calculate the association weight of the associated KPI data with respect to the KQI data;
Step S310: determine the associated KPI data and the association weights as the input-output degree of association of the black box system.
Optionally, this embodiment can be applied to scenarios such as network optimization, user profiling, reviews, and recommendation, but is not limited to these.
Through the above steps, the service quality indicator (KQI) data and the key performance indicator (KPI) data in the black box system are matched to form a data vector space; the KPI data is clustered according to the service type of the KQI data, where the clustering result is used to select orthogonal, strongly correlated KPI items and to assist in judging the health of the indicator data; the data vector space is decomposed to separate out the association features of the KPI data with respect to the KQI data, and the normalized degree of association of the KPI data with the KQI data is calculated; the KPI data associated with each item of KQI data is determined according to the normalized degree of association, and the association weight of the associated KPI data with respect to the KQI data is calculated; and the associated KPI data and the association weights are determined as the input-output degree of association of the black box system. Because the degree of association between KQI data and KPI data can be calculated quantitatively and compared after normalization, this solves the problem in the related art that the input-output degree of association of a black box system is determined with low accuracy and with associated items missed, thereby improving service quality and reducing manual effort.
Optionally, the degree of association between KQI and KPI is treated as a random variable rather than a fixed function, preferably using Bayesian statistical theory. The method flow is as follows:
Match and align the KQI and KPI data collected at the two ends of the network/communication system by time granularity and spatial granularity.
With reference to protocols, specifications, actual service requirements, and so on, further clean the matched and aligned KQI and KPI data. Data cleaning consists of removing outliers and filling in missing values.
Compute statistical features of the cleaned KQI and KPI data and present them in a variety of charts. Optionally, a service data expert analyzes them together with the statistical indicators to judge the health of the data source (for example, whether there are misreports or missing reports detectable by numerical methods), the type of data distribution, and so on, so that the data mining algorithms can be better matched to the data.
Perform intelligent hierarchical clustering on the cleaned KPI data to obtain a clustering table. Optionally, a service data expert, drawing on service requirements and engineering meaning, helps decide how many classes to use and fine-tunes the classification of KPIs whose boundaries are ambiguous.
Decompose the vector space formed by the KQI-KPI data, separate out a quantifiable degree of association for each KPI-KQI pair, and normalize it. Optionally, expand the dimensions of the vector space or apply a coordinate transformation to obtain a clearer quantifiable KQI-KPI degree of association.
Rank the normalized KPI-KQI degrees of association and classify the KPIs in order of degree of association.
Look up the cluster to which each correlated KPI belongs and determine the final correlated KPIs. Optionally, a service data expert designates the final correlated KPIs based on statistical information, service requirements, the clustering table, and engineering meaning.
Calculate the normalized weights of the final correlated KPIs from their normalized KPI-KQI degrees of association.
Calculate a one-dimensional threshold for each final correlated KPI from its normalized KPI-KQI degree of association and the preset warning threshold of the KQI.
Calculate joint thresholds over multiple final correlated KPIs from the preset KQI warning threshold and the accuracy requirements for the misjudgment rate and miss rate. Optionally, the dimension of a joint threshold is between 2 and the number of final correlated KPIs.
Calculate the multi-dimensional threshold of the final correlated KPIs from the one-dimensional thresholds and joint thresholds of all final correlated KPIs.
Using the multi-dimensional threshold of the final correlated KPIs, calculate the KQI over-limit miss rate (KQI->KPI) under the expected misjudgment rate (KPI->KQI). The misjudgment rate is defined as the proportion of the KQI data selected by the KPI multi-dimensional threshold in which the KQI does not exceed its limit. The miss rate is defined as the proportion of the KQI over-limit data that falls outside the range selected by the KPI multi-dimensional threshold.
Present on the interface the association matrix between the KQI and its final correlated KPIs together with the KPI multi-dimensional threshold matrix, and store the extracted association features in the back-end expert knowledge database. Optionally, intermediate outputs of the whole calculation and decision flow can be selectively presented on the interface by a service data expert for assisted decisions and manual fine-tuning.
When new KQI and KPI data arrives, perform the KQI over-limit decision with the existing multi-dimensional threshold. If the increase in the misjudgment rate exceeds the offset threshold, collect the full data set and recalculate; otherwise, continue to use the existing association analysis conclusions, combined with adaptive fine-tuning, to guide KQI optimization and parameter adjustment.
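The two rates defined in the flow above (the share of threshold-selected samples whose KQI is not over limit, and the share of over-limit samples that the threshold fails to select) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `overrun_rates` and the boolean-flag input format are assumptions, and the flags are taken to be already aligned sample by sample.

```python
from typing import Sequence, Tuple


def overrun_rates(
    kqi_over: Sequence[bool], kpi_selected: Sequence[bool]
) -> Tuple[float, float]:
    """Return (misjudgment_rate, miss_rate) for aligned samples.

    kqi_over[i]     -- True if the KQI of sample i exceeds its warning threshold
    kpi_selected[i] -- True if sample i is selected by the KPI multi-dimensional threshold
    (Both input conventions are hypothetical.)
    """
    if len(kqi_over) != len(kpi_selected):
        raise ValueError("inputs must be aligned sample by sample")
    selected = sum(kpi_selected)
    over = sum(kqi_over)
    # Selected by the KPI threshold although the KQI is not over limit.
    false_hits = sum(1 for q, p in zip(kqi_over, kpi_selected) if p and not q)
    # KQI over limit but not selected by the KPI threshold.
    missed = sum(1 for q, p in zip(kqi_over, kpi_selected) if q and not p)
    misjudgment_rate = false_hits / selected if selected else 0.0
    miss_rate = missed / over if over else 0.0
    return misjudgment_rate, miss_rate
```

Returning 0.0 when a denominator is empty is a convenience choice made for this sketch; a real system might prefer to report the rate as undefined.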
This embodiment enables operators, network builders, and network maintainers to find the KPIs strongly correlated with a given KQI quickly, comprehensively, and accurately in a quantitative manner, and to calculate the corresponding single-item and multi-dimensional KPI thresholds, from which it can be inferred whether the KQI exceeds its limit. This provides precise and convenient guidance for network evaluation, performance optimization, parameter adjustment, and so on, improves service quality, and greatly reduces manual effort.
From the description of the above implementations, those skilled in the art can clearly understand that the method according to the above embodiment can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the essence of the technical solution of the present invention, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods of the embodiments of the present invention.
Embodiment 2
This embodiment further provides an apparatus and a system for determining the input-output degree of association of a black box system. The apparatus and system are used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
FIG. 4 is a structural block diagram of an apparatus for determining the input-output degree of association of a black box system according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes:
a separation module 40, configured to match the service quality indicator (KQI) data and the key performance indicator (KPI) data in the black box system to form a data vector space, so that the degree of association of the KPI data with the KQI data can be separated out;
a clustering module 42, configured to cluster the KPI data according to the service type of the KQI data;
a first calculation module 44, configured to decompose the data vector space and calculate the normalized degree of association of the KPI data with the KQI data;
a second calculation module 46, configured to determine, according to the normalized degree of association, the KPI data associated with each item of KQI data, and to calculate the association weight of the associated KPI data with respect to the KQI data;
a determination module 48, configured to determine the associated KPI data and the association weights as the input-output degree of association of the black box system.
FIG. 5 is a structural block diagram of an association analysis system according to an embodiment of the present invention. As shown in FIG. 5, the system includes:
a storage unit 50, configured to store the KQI data and KPI data of the served network;
a data preprocessing unit 52, configured to preprocess the KQI data and KPI data, where the preprocessing includes data matching, data cleaning, statistical feature extraction, and statistical data presentation;
a clustering unit 54, configured to perform intelligent clustering on the KPI data and output a clustering table;
a vector space decomposition unit 56, connected to the data preprocessing unit and configured to decompose the vector space formed by the preprocessed KQI data and KPI data and to extract association components between the KQI data and the KPI data that can be normalized and quantized;
a quantitative association calculation unit 58, connected to the vector space decomposition unit and configured to perform normalized quantization on the association components to obtain the quantitative degree of association between the KQI data and the KPI data, calculate the overall ranking weights, and output a quantitative association matrix that includes the weights;
a multi-dimensional threshold calculation unit 60, configured to calculate, according to the association matrix, the multi-dimensional quantization thresholds of the correlated KPI data and the misjudgment rate and/or miss rate when inferring KQI over-limit events from them, and to output a multi-dimensional quantization threshold matrix and KQI over-limit evaluation data;
an optimization unit 32, configured to optimize the served network according to the multi-dimensional quantization threshold matrix and the KQI over-limit evaluation data.
Optionally, the system further includes: a service data interface, which contains a presentation interface and is configured to receive external instructions for assisted decisions on the output data of the system; a data mining analysis algorithm pool, configured to store the data mining algorithms of the system; and a database, configured to store the data analysis and mining conclusions of the system as well as intermediate process information.
The system according to this embodiment optionally includes the following.
A base database, configured to store all indicator data and operation management data.
A service data expert interface, which contains an interface for presenting data and conclusions and is configured to let a service data expert view statistical analysis information and enter auxiliary conditions or decisions.
A KQI/KPI storage unit, configured to store all KQI/KPI items, including information such as service type, customer requirements, and expert assistance.
A data preprocessing unit, which, according to information such as the dispatched work order, the service type, and customer requirements, extracts the full set or a subset of KQI and KPI items from the storage unit and performs, in order, data matching, data cleaning, statistical feature extraction, and statistical data presentation for use in the association analysis calculation.
An intelligent clustering unit, which performs intelligent clustering on the KPI items according to the type, distribution, and engineering meaning of the service, combined with the decisions of a service data expert, and outputs an intelligent clustering table. The clustering process is not limited to a single clustering algorithm and does not fix the number of clusters in advance; it is based entirely on the data and the service.
A vector space decomposition unit, configured to decompose the vector space formed by the preprocessed KQI and KPI data and to extract KQI-KPI association components that can be normalized and quantized. The decomposition may be direct or may follow a dimension expansion.
A quantitative association calculation unit, configured to perform normalized quantization on the KQI-KPI association components extracted by the vector space decomposition unit to obtain the quantitative KQI-KPI degree of association. Based on the quantitative degree of association and the intelligent clustering table, it divides all KPI items taking part in the association analysis into four kinds: final correlated KPIs, correlated same-cluster KPIs, reminder KPIs, and unrelated KPIs. It then calculates the overall ranking weights from the quantitative degrees of association of all final correlated KPIs and finally outputs the quantitative association matrix (including the weights) on the presentation interface.
A multi-dimensional threshold calculation unit, configured to calculate the multi-dimensional quantization thresholds of the final correlated KPIs and then the misjudgment rate and miss rate when KQI over-limit events are inferred from those thresholds. It finally outputs the multi-dimensional quantization threshold matrix and the KQI over-limit misjudgment and miss rates on the presentation interface.
A data mining analysis algorithm pool, configured to store all data mining algorithms that may be used in the whole association analysis calculation flow. Each calculation module selects a suitable algorithm according to the data and service characteristics, either automatically or with expert assistance.
A data expert knowledge base, configured to store service-based association analysis conclusions and intermediate process information, which can be used for service association analysis, network/system parameter optimization, and other value-added services.
Note that each of the above modules can be implemented in software or hardware. In the latter case, this can be done in the following ways, among others: the modules are all located in the same processor, or the modules are distributed across different processors in any combination.
Embodiment 3
The implementation of the technical solution is described in further detail below with reference to a specific scenario; the embodiment given does not limit the present invention. This embodiment of the present invention provides a KQI-KPI multi-dimensional quantitative association analysis scheme. FIG. 6 is a schematic flowchart of the scheme, and FIG. 7 is a block diagram of the corresponding KQI-KPI multi-dimensional quantitative association analysis system, showing the system module structure. The scheme includes the following steps:
S101: the main corresponding functional unit is module 40. Based on conditions such as the service type and protocol specifications, module 40 reads from module 30 the KQI items and KPI items to be analyzed for association; module 40 then reads the corresponding KQI and KPI data from module 10 and performs preprocessing operations such as data matching, data cleaning, statistical analysis, fitting, and feature extraction. During data preprocessing, module 40 must invoke module 90 and may optionally invoke module 100.
S102: the main corresponding functional unit is module 50. Module 50 invokes module 90 and, using some of the indicator feature values obtained by module 40 in S101, performs initial hierarchical clustering without a fixed number of classes, as shown in FIG. 8, which is a schematic diagram of the initial hierarchical clustering in the intelligent clustering process provided by an embodiment of the present invention. Module 50 then invokes module 100 to make the final clustering decision and outputs the clustering table. Optionally, module 40 invokes module 20 for an on-site, expert-assisted clustering decision.
S103: the main corresponding functional unit is module 60. Module 60 takes the KQI and KPI data output by module 40 to form the KQI-KPI data vector space and invokes module 90 to perform the vector space decomposition, separating out the quantifiable KQI-KPI association components.
S104: the main corresponding functional unit is module 70. Module 70 performs normalized quantization on the KQI-KPI association components to obtain the degree of association between each KPI and the corresponding KQI. The KQI-KPI degree of association lies in [0, 1]; by setting an unrelated threshold and a related threshold, correlated KPIs, reminder KPIs, and unrelated KPIs can be distinguished. To minimize overlap among correlated KPIs, module 70 consults the clustering table output by module 50 and selects a final correlated KPI from the correlated KPIs within each cluster; the other correlated KPIs in the same cluster are called correlated same-cluster KPIs. After calculating the normalized weights of the final correlated KPIs, module 70 outputs the correlated KPI matrix of each KQI, including the weights.
S105: the main corresponding functional unit is module 80. Based on the final correlated KPIs of the KQI and their weights as determined by module 70, module 80 calculates in turn the single-item thresholds and the multi-dimensional thresholds of the final correlated KPIs. Depending on service and system accuracy requirements, the dimension of a multi-dimensional threshold starts at two and does not exceed the number of final correlated KPIs.
S106: the main corresponding functional unit is module 80. Using the single-item and multi-dimensional thresholds of the final correlated KPIs output in step S105 and the cleaned KQI and KPI data obtained in step S101, verify the misjudgment rate and miss rate of the KQI over-limit decision. Then, with the misjudgment rate as the variable, check whether the limit of the miss rate equals or approaches zero. If it does, the current final KPI items contain a complete basis for KQI over-limit events; if it does not, the KQI still has correlated KPIs that are not covered by module 30.
This embodiment further includes the following example: finding, for a mobile communications operator, the main KPIs correlated with low Internet access rates in the main urban area of a provincial capital. Using 24-hour-granularity cell-level data from 4,000 cells on a selected date, the example examines the degree of association between the KQI (HTTP response delay) and 30 radio-side KPIs.
Step 1 (part of S101). According to the abnormal-service work order, read from module 30 the KQI to be evaluated and the list of the corresponding 30 radio-side KPI items, as shown in Table 1.
Table 1
Figure PCTCN2017087940-appb-000001
Figure PCTCN2017087940-appb-000002
Figure PCTCN2017087940-appb-000003
Step 2 (part of S101). Read from module 10 the 24-hour-granularity cell-level data of the KQI and KPIs to be analyzed for association, and match and align the data by collection time and cell ID.
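The matching in step 2 amounts to a join on the pair (collection time, cell ID). The sketch below shows one way to express it with plain dictionaries. The record layout, the key format, and the name `align_records` are assumptions made for illustration, and the sample values are invented.

```python
from typing import Dict, List, Tuple

# (collection time, cell ID); both formats are hypothetical.
Key = Tuple[str, str]


def align_records(
    kqi: Dict[Key, float],
    kpi: Dict[Key, Dict[str, float]],
) -> List[Tuple[Key, float, Dict[str, float]]]:
    """Inner-join KQI and KPI records on (collection time, cell ID).

    Only keys present on both sides are kept, so every output row carries
    one KQI value together with the KPI values measured for the same cell
    at the same time.
    """
    common_keys = sorted(kqi.keys() & kpi.keys())
    return [(key, kqi[key], kpi[key]) for key in common_keys]


# Invented sample data: only ("t0", "cell-1") appears on both sides.
kqi_records = {("t0", "cell-1"): 120.0, ("t0", "cell-2"): 340.0}
kpi_records = {("t0", "cell-1"): {"kpi_a": 0.98}, ("t1", "cell-1"): {"kpi_a": 0.91}}
aligned = align_records(kqi_records, kpi_records)
```

An inner join silently drops unmatched records; whether those should instead be kept and filled during cleaning is a design decision the source leaves open.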
Step 3 (part of S101). Perform outlier handling, statistical analysis, distribution fitting, and so on, on the matched KQI and KPI data, and extract statistical feature values. Examples of what module 20 presents are as follows.
FIG. 9 is a frequency fitting curve of a parameter that passes both the normal distribution test and the data source health test in an embodiment of the present invention. As shown in FIG. 9, this indicator parameter passes the log-normal distribution test and also passes the health check. It can be used with all statistical and mining algorithms in module 90.
FIG. 10 is a frequency fitting curve of a parameter that fails the normal distribution test but passes the data source health test in an embodiment of the present invention. As shown in FIG. 10, this indicator parameter fails the normal/log-normal distribution test, so the system automatically tries to match other common distributions. It passes the health check, but the statistical or mining algorithms in module 90 that presuppose a normal distribution cannot be used with it.
FIG. 11 is a frequency fitting curve of a parameter that fails both the normal distribution test and the data source health test in an embodiment of the present invention. As shown in FIG. 11, this indicator parameter fails the normal/log-normal distribution test and does not match any other common distribution. It fails the health check, is excluded from subsequent calculation, and is immediately submitted for troubleshooting. On inspection, the cell was found to have missing reports and bursts of repeated reports for this parameter.
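The three cases illustrated by FIG. 9 to FIG. 11 can be read as a small decision rule on two test outcomes. The sketch below encodes that reading. The function name, the returned labels, and the choice to let a failed health check take precedence over the distribution test are assumptions; the source does not describe a parameter that is normally distributed yet unhealthy.

```python
def route_indicator(passes_normality: bool, passes_health: bool) -> str:
    """Decide how an indicator parameter is used after preprocessing.

    Returns one of three hypothetical labels:
      "all_algorithms"           -- usable with every statistical/mining algorithm
      "non_normal_algorithms"    -- usable, except by algorithms that presuppose normality
      "exclude_and_troubleshoot" -- dropped from later calculation and reported
    """
    if not passes_health:
        # Assumption: an unhealthy data source is excluded regardless of
        # what the distribution test says.
        return "exclude_and_troubleshoot"
    if passes_normality:
        return "all_algorithms"
    return "non_normal_algorithms"
```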
Step 4 (part of S102). Using the KPI data and statistical feature values obtained in steps 1 to 3, perform initial hierarchical clustering of the 30 KPIs without a fixed number of classes, as shown in FIG. 8.
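The source does not state which distance or linkage the initial hierarchical clustering uses. As a hedged illustration of clustering without a preset number of classes, the sketch below uses single-linkage clustering cut at a distance threshold, with distance defined as 1 - |Pearson correlation| between KPI series. Cutting single linkage at a threshold is equivalent to taking connected components of the graph that links every pair within that distance, which is what the union-find structure computes. All names and the distance definition are assumptions.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cluster_kpis(
    series: Dict[str, Sequence[float]], max_distance: float
) -> List[List[str]]:
    """Group KPIs whose distance 1 - |r| is within max_distance (single linkage)."""
    names = sorted(series)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if 1.0 - abs(pearson(series[a], series[b])) <= max_distance:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())
```

The number of clusters falls out of the threshold rather than being chosen up front, which matches the spirit of the step; the expert-assisted regrouping into a final set of classes is outside this sketch.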
Step 5 (part of S102). Building on step 4 and combining the expert clustering information in module 100, group the 30 KPIs into 6 classes. An example is shown in Table 2.
Table 2
Figure PCTCN2017087940-appb-000004
Figure PCTCN2017087940-appb-000005
Step 6 (part of S103). This example uses Bayesian statistics to decompose the KQI-KPI data vector space and describes the KPI->KQI mapping association component as a conditional probability. Because the KQI-KPI data has been strictly matched in the space and time dimensions and the values are known in module 10, the KPI->KQI association component for a given KQI value (usually the warning threshold) can be computed numerically using Bayes' formula.
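The source does not give the exact conditional probability used in step 6. One plausible reading is the probability that the KQI is over its warning threshold given that a KPI is in a degraded range, obtained from Bayes' formula with every term estimated by counting aligned samples. The sketch below follows that reading only; the function name, the boolean inputs, and the notion of a "degraded" KPI are all assumptions.

```python
from typing import Sequence


def association_component(
    kqi_over: Sequence[bool], kpi_degraded: Sequence[bool]
) -> float:
    """Estimate P(KQI over limit | KPI degraded) via Bayes' formula.

    P(Q | K) = P(K | Q) * P(Q) / P(K), with each term estimated by
    counting over aligned samples. Returns 0.0 when a term is undefined.
    """
    n = len(kqi_over)
    if n == 0 or n != len(kpi_degraded):
        return 0.0
    n_q = sum(kqi_over)
    n_k = sum(kpi_degraded)
    if n_q == 0 or n_k == 0:
        return 0.0
    n_both = sum(1 for q, k in zip(kqi_over, kpi_degraded) if q and k)
    likelihood = n_both / n_q  # P(KPI degraded | KQI over limit)
    prior = n_q / n            # P(KQI over limit)
    evidence = n_k / n         # P(KPI degraded)
    return likelihood * prior / evidence
```

With count-based estimates the expression reduces to `n_both / n_k`; the Bayes form is kept so that a prior from another source could be substituted.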
Step 7 (part of S104). Building on step 6, quantize and normalize the KPI->KQI association components to obtain the normalized quantitative KPI-KQI degree of association, which lies in [0, 1]. In this example, a KPI is judged correlated if its degree of association is above 0.7, unrelated if it is below 0.5, and a reminder item if it lies between 0.5 and 0.7. The decision method is chosen according to service requirements and numerical characteristics and is not limited to hard thresholding.
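The hard-threshold decision of step 7 can be sketched as below, using the example values 0.7 and 0.5. The function name and labels are illustrative, and treating the exact boundary values 0.5 and 0.7 as reminder items is an assumption drawn from the "above" and "below" wording.

```python
def classify_kpi(
    degree: float, related_above: float = 0.7, unrelated_below: float = 0.5
) -> str:
    """Classify a KPI by its normalized degree of association in [0, 1]."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree of association must lie in [0, 1]")
    if degree > related_above:
        return "correlated"
    if degree < unrelated_below:
        return "unrelated"
    # Between the two thresholds, boundaries included (assumption).
    return "reminder"
```

For instance, `classify_kpi(0.82)` yields `"correlated"` and `classify_kpi(0.6)` yields `"reminder"`.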
Step 8 (part of S104). For the KPIs judged to be correlated items, look up the cluster table obtained in step 5 and take the KPI with the largest association degree in each class as the final correlated KPI, so the number of final correlated items cannot exceed six (one per class). FIG. 12 is a schematic diagram of computing the weights of the associated KPIs using a primary and an auxiliary quantization decision point according to an embodiment of the present invention. In this example an auxiliary decision point ("quantization decision point 2" in FIG. 12) is added in step 7, and its association value is used together with that of the main decision point ("quantization decision point 1" in FIG. 12) to compute the normalized weights of the final correlated items. For example, if four final correlated items KPI_F1 to KPI_F4 are determined, then, referring to FIG. 12, the unnormalized weight of each final correlated item KPI_Fi (i = 1 to 4) is $W_{Oi} = (P_{1,i} + P_{2,i})/2 - 0.5$, where $P_{1,i}$ and $P_{2,i}$ are the probability values at quantization decision points 1 and 2, and the normalized weight is $W_{Ni} = W_{Oi} / \sum_{j=1}^{4} W_{Oj}$.
An example of the conclusion output of module 70 is shown in Table 3.
Table 3
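Step 8 combines two operations: keeping one representative KPI per class and turning the two decision-point probabilities into normalized weights. The following sketch follows the weight formula given in the text; the function names, the dictionary layout, and the sample values are assumptions made for illustration.

```python
def select_final_kpis(association, cluster_of):
    """Keep, for each class, the correlated KPI with the largest association.

    association: dict KPI name -> association degree (correlated items only).
    cluster_of: dict KPI name -> class label from the cluster table.
    """
    best = {}
    for kpi, value in association.items():
        label = cluster_of[kpi]
        if label not in best or value > association[best[label]]:
            best[label] = kpi
    return sorted(best.values())


def normalized_weights(p1, p2):
    """W_O = (P1 + P2) / 2 - 0.5 per KPI, then divide by the sum of all W_O."""
    raw = {kpi: (p1[kpi] + p2[kpi]) / 2 - 0.5 for kpi in p1}
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("raw weights must sum to a positive value")
    return {kpi: w / total for kpi, w in raw.items()}


finals = select_final_kpis(
    {"A": 0.9, "B": 0.8, "C": 0.75},
    {"A": "access", "B": "access", "C": "capacity"},
)
weights = normalized_weights({"A": 0.9, "C": 0.8}, {"A": 0.7, "C": 0.6})
```

Subtracting 0.5 before normalizing means that a KPI whose averaged probability sits near the uncorrelated boundary contributes almost nothing to the final weighting.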
Figure PCTCN2017087940-appb-000006
Figure PCTCN2017087940-appb-000007
Step 9 (part of S105). Using Bayes' formula and the expected false positive rate, this example computes the single-item false positive rate of each final correlated KPI and the expected distribution of the two-dimensional joint false positive rate. Based on the weight proportions of the final correlated KPIs from step 8 and the two-dimensional expected false positive rate distribution, the numerical index numbers of each pair of final correlated KPIs are determined adaptively, which yields the two-dimensional joint threshold matrix.
Step 10 (part of S106). Using the single-item thresholds and the two-dimensional joint thresholds of the final correlated KPIs, together with the matched KQI-KPI data space obtained in step 2, compute the final false positive rate and miss rate for the KQI (HTTP response delay). An example of the final multi-dimensional KPI threshold output after step 10 is shown in Table 4.
Table 4
Figure PCTCN2017087940-appb-000008
Figure PCTCN2017087940-appb-000009
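The evaluation in step 10 can be sketched as a simple count over the matched samples. The patent does not define the two rates formally, so this sketch assumes that a sample is flagged when any final KPI exceeds its single-item threshold, that the false positive rate is the share of non-overrun samples that are flagged, and that the miss rate is the share of overrun samples that are not flagged. Two-dimensional joint thresholds are left out, and all names and values are hypothetical.

```python
def error_rates(kqi, kpi_rows, kqi_threshold, kpi_thresholds):
    """Return (false_positive_rate, miss_rate) for single-item KPI thresholds.

    kqi: list of KQI samples (for example HTTP response delay in ms).
    kpi_rows: list of dicts, one per sample, mapping KPI name -> value.
    kpi_thresholds: dict KPI name -> threshold; exceeding any one flags the sample.
    """
    actual = [q > kqi_threshold for q in kqi]
    flagged = [
        any(row[name] > limit for name, limit in kpi_thresholds.items())
        for row in kpi_rows
    ]
    positives = sum(actual)
    negatives = len(actual) - positives
    missed = sum(a and not f for a, f in zip(actual, flagged))
    false_alarms = sum(f and not a for a, f in zip(actual, flagged))
    false_positive_rate = false_alarms / negatives if negatives else 0.0
    miss_rate = missed / positives if positives else 0.0
    return false_positive_rate, miss_rate


rows = [
    {"prb_util": 0.90, "bler": 0.01},
    {"prb_util": 0.40, "bler": 0.02},
    {"prb_util": 0.50, "bler": 0.01},
    {"prb_util": 0.85, "bler": 0.01},
]
rates = error_rates([120, 80, 110, 90], rows, 100, {"prb_util": 0.8, "bler": 0.05})
```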
Step 11 (part of S106). With the expected false positive rate as the variable and the KQI warning threshold fixed, repeat the calculations of steps 9 and 10 to obtain the miss rate curve for the KQI (HTTP response delay). Whether the set of final correlated KPIs is complete is judged by whether the miss rate can reach or approach 0. In this example, zero misses are achieved when the HTTP response delay takes the values 90 ms, 95 ms, 100 ms, and 105 ms, as shown in Table 5. This indicates that the association analysis scheme and system described in the present invention found a complete set of KQI-related items among the 30 KPIs, consistent with the beneficial effects described above.
Table 5
Figure PCTCN2017087940-appb-000010
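The sweep in step 11 can be sketched for a single KPI. How the patent derives a threshold from an expected false positive rate is not spelled out, so this sketch assumes the simplest rule: for each expected rate, pick the KPI threshold that lets at most that share of non-overrun samples exceed it, then measure the miss rate on the overrun samples. The function names and the toy data are hypothetical.

```python
def miss_rate_curve(kqi, kpi, kqi_threshold, expected_false_rates):
    """Return a list of (expected_false_rate, miss_rate) pairs for one KPI."""
    normal = sorted(p for q, p in zip(kqi, kpi) if q <= kqi_threshold)
    overrun = [p for q, p in zip(kqi, kpi) if q > kqi_threshold]
    curve = []
    for rate in expected_false_rates:
        allowed = int(rate * len(normal))  # tolerated false alarms
        if allowed >= len(normal):
            kpi_threshold = float("-inf")  # everything is flagged
        else:
            kpi_threshold = normal[len(normal) - allowed - 1]
        missed = sum(p <= kpi_threshold for p in overrun)
        curve.append((rate, missed / len(overrun) if overrun else 0.0))
    return curve


def is_complete(curve, tolerance=0.0):
    """The KPI set is treated as complete if some point reaches the tolerance."""
    return any(miss <= tolerance for _, miss in curve)


delay_ms = [80, 85, 90, 95, 120, 130]
kpi_values = [0.2, 0.3, 0.4, 0.6, 0.5, 0.9]
curve = miss_rate_curve(delay_ms, kpi_values, 100, [0.0, 0.25, 0.5])
```

In the toy data the miss rate drops to zero once a quarter of the normal samples may be flagged, which is the kind of behavior the completeness check looks for.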
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed across a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described can be performed in an order different from the one given here, or the modules or steps can each be made into separate integrated circuit modules, or several of them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (21)

  1. A method for determining the degree of association between the input and output of a black box system, comprising:
    matching service quality indicator (KQI) data of the black box system with key performance indicator (KPI) data to form a data vector space;
    clustering the KPI data according to the service type of the KQI data, wherein the clustering result is used to select orthogonal, strongly correlated KPI items and to assist in judging the health of the indicator data;
    decomposing the data vector space to separate out the association features of the KPI data with respect to the KQI data, and calculating the normalized degree of association of the KPI data with the KQI data;
    determining, according to the normalized degree of association, the KPI data associated with each item of KQI data, and calculating the association weight of the associated KPI data with respect to the KQI data;
    determining the associated KPI data and the association weights as the input-output degree of association of the black box system.
  2. The method according to claim 1, wherein matching the service quality indicator (KQI) data with the key performance indicator (KPI) data to form a data vector space so as to separate out the degree of association of the KPI data with the KQI data comprises:
    matching the KQI data with the KPI data in one or more dimensions to form the data vector space so as to separate out the degree of association of the KPI data with the KQI data.
  3. The method according to claim 1, wherein clustering the KPI data according to the service type of the KQI data comprises:
    dividing the KQI data and the KPI data into a KQI data layer and a KPI data layer;
    adding, between the KQI data layer and the KPI data layer, abstraction layer parameters related to the service type corresponding to the KQI data, wherein the abstraction layer regularizes or maps the KPI data to suit the corresponding mining algorithm;
    clustering the KPI data using the abstraction layer parameters.
  4. The method according to claim 1, wherein decomposing the data vector space comprises:
    decomposing the data vector space in at least one of the following ways to separate out the association features of the KPI data with respect to the KQI data: reducing the dimensionality of the space, directly partitioning the space, or raising the dimensionality of the space.
  5. The method according to claim 1, wherein matching the service quality indicator (KQI) data with the key performance indicator (KPI) data to form a data vector space so as to separate out the degree of association of the KPI data with the KQI data comprises:
    performing distribution fitting and graphical display of the KPI data according to service characteristic information of the KQI data, and determining reasonable KPI data;
    matching the KQI data with the reasonable KPI data to form the data vector space so as to separate out the degree of association of the KPI data with the KQI data.
  6. The method according to claim 5, wherein the service characteristic information is obtained by matching against a preset database and/or according to service requirements.
  7. The method according to claim 1, wherein decomposing the data vector space comprises:
    performing dimensionality reduction on the data vector space;
    directly decomposing the data vector space;
    expanding the dimensionality of the data vector space and then extracting effective feature values.
  8. The method according to claim 7, wherein performing dimensionality reduction on the data vector space comprises at least one of the following methods:
    decision tree pruning, regression merging, clustering, expert-assisted decision.
  9. The method according to claim 7, wherein directly decomposing the data vector space comprises:
    decomposing the data vector space based on a Bayesian statistical algorithm;
    equivalent numerical calculation based on the idea of singular value decomposition.
  10. The method according to claim 7, wherein expanding the dimensionality of the data vector space and then decomposing it comprises:
    expanding the dimensionality of the data vector space with a support vector machine (SVM) based algorithm and then decomposing it;
    dimensionality expansion based on a neural network algorithm, in which the number of hidden-layer units is greater than the input dimension.
  11. The method according to any one of claims 1 to 10, wherein, after the KPI data associated with each item of KQI data are determined according to the normalized degree of association, the method further comprises:
    calculating a quantized one-dimensional threshold or multi-dimensional threshold of the KPI data associated with each item of KQI data;
    obtaining the false positive rate and/or miss rate of KQI data overrun according to the quantized one-dimensional threshold or multi-dimensional threshold;
    analyzing, according to the miss rate and/or the miss rate, whether the associated KPI data contain a complete basis of the KQI data overrun space.
  12. The method according to claim 11, wherein, after analyzing, according to the miss rate and/or the miss rate, whether the associated KPI data contain a complete basis of the KQI data overrun space, the method further comprises:
    analyzing, according to the miss rate and/or the miss rate, the probability that the associated KPI data do not contain the KQI data overrun space.
  13. The method according to any one of claims 1 to 10, wherein, after the KPI data associated with each item of KQI data are determined according to the normalized degree of association, the method further comprises:
    determining whether the KQI data are missing;
    when the KQI data are determined to be missing, inferring the probability of misjudging the KQI data in reverse from the quantized multi-dimensional thresholds of historical KPI data, and performing system pre-optimization and parameter adjustment.
  14. The method according to claim 1, wherein the KPI data comprise at least one of the following:
    radio resource control (RRC) connection establishment success rate, evolved radio access bearer (E-RAB) establishment success rate, wireless connection rate, E-RAB drop rate, inter-eNB (base station) handover success rate, cell user-plane uplink packet loss rate, cell user-plane downlink packet loss rate, cell user-plane downlink average delay, cell user-plane downlink packet discard rate, number of cell downlink packets, MAC-layer uplink block error rate, media access control (MAC) layer downlink block error rate, uplink initial hybrid automatic repeat request (HARQ) retransmission ratio, downlink initial HARQ retransmission ratio, downlink dual-stream traffic ratio, uplink quadrature phase shift keying (QPSK) ratio, uplink 16QAM ratio, downlink QPSK ratio, downlink 16QAM ratio, downlink 64 quadrature amplitude modulation (64QAM) ratio, number of air-interface uplink service bytes, number of air-interface downlink service bytes, uplink physical resource block (PRB) average utilization, downlink PRB average utilization, uplink average throughput per PRB, downlink average throughput per PRB, -110 dBm coverage rate, average signal to interference plus noise ratio (SINR), subband 0 average channel quality indicator (CQI), average number of active user equipments (UEs) on the user plane.
  15. The method according to claim 1, wherein the KQI data comprise hypertext transfer protocol (HTTP) response delay.
  16. The method according to claim 1, wherein the clustering comprises at least one of the following: capacity indicator clustering, access indicator clustering, efficiency indicator clustering, and integrity and retainability indicator clustering.
  17. The method according to claim 16, wherein the integrity and retainability indicator clustering further comprises at least one of the following: packet service clustering, uplink integrity and retainability clustering, and downlink integrity and retainability clustering.
  18. An association analysis system, comprising:
    a storage unit, configured to store KQI data and KPI data of a service network;
    a data preprocessing unit, configured to preprocess the KQI data and the KPI data, wherein the preprocessing comprises data matching, data cleaning, statistical feature extraction, and statistical data presentation;
    a clustering unit, configured to intelligently cluster the KPI data and output a cluster table;
    a vector space decomposition unit, connected to the data preprocessing unit and configured to decompose the vector space formed by the preprocessed KQI data and KPI data and to extract association components of the KQI data with respect to the KPI data that can be normalized and quantized;
    a quantized association calculation unit, connected to the vector space decomposition unit and configured to perform normalized quantization of the association components to obtain the quantized degree of association of the KQI data with the KPI data, to calculate its overall ranking weight, and to output a quantized association matrix containing the weights;
    a multi-dimensional threshold calculation unit, configured to calculate, according to the association matrix, the multi-dimensional quantization thresholds of the correlated KPI data, to infer in reverse the false positive rate and/or miss rate of KQI overrun, and to output a multi-dimensional quantization threshold matrix and KQI overrun evaluation data;
    an optimization unit, configured to optimize the performance of the service network according to the multi-dimensional quantization threshold matrix and the KQI overrun evaluation data.
  19. The system according to claim 18, wherein the system further comprises:
    a service data interface, which includes a presentation interface and is configured to receive external instructions for assisted decisions on the output data of the system.
  20. The system according to claim 18, wherein the system further comprises:
    a data mining and analysis algorithm pool, configured to store the data mining algorithms of the system;
    a database, configured to store the data analysis and mining conclusions of the system, as well as intermediate process information.
  21. An apparatus for determining the degree of association between the input and output of a black box system, comprising:
    a separation module, configured to match service quality indicator (KQI) data of the black box system with key performance indicator (KPI) data to form a data vector space so as to separate out the degree of association of the KPI data with the KQI data;
    a clustering module, configured to cluster the KPI data according to the service type of the KQI data;
    a first calculation module, configured to decompose the data vector space and calculate the normalized degree of association of the KPI data with the KQI data;
    a second calculation module, configured to determine, according to the normalized degree of association, the KPI data associated with each item of KQI data, and to calculate the association weight of the associated KPI data with respect to the KQI data;
    a determination module, configured to determine the associated KPI data and the association weights as the input-output degree of association of the black box system.
PCT/CN2017/087940 2016-07-20 2017-06-12 Method, apparatus, and system for determining degree of association of input and output of black box system WO2018014674A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610579303.2 2016-07-20
CN201610579303.2A CN107645393A (en) 2016-07-20 2016-07-20 Determine the method, apparatus and system of the black-box system input and output degree of association

Publications (1)

Publication Number Publication Date
WO2018014674A1 true WO2018014674A1 (en) 2018-01-25

Family

ID=60991875

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/087940 WO2018014674A1 (en) 2016-07-20 2017-06-12 Method, apparatus, and system for determining degree of association of input and output of black box system

Country Status (2)

Country Link
CN (1) CN107645393A (en)
WO (1) WO2018014674A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110288467A (en) * 2019-04-19 2019-09-27 平安科技(深圳)有限公司 Data digging method, device, electronic equipment and storage medium
CN110837841A (en) * 2018-08-17 2020-02-25 北京亿阳信通科技有限公司 KPI (Key performance indicator) degradation root cause identification method and device based on random forest
US20210081833A1 (en) * 2019-09-18 2021-03-18 International Business Machines Corporation Finding root cause for low key performance indicators
CN112950908A (en) * 2021-02-03 2021-06-11 重庆川仪自动化股份有限公司 Data monitoring and early warning method, system, medium and electronic terminal
CN114386728A (en) * 2020-10-19 2022-04-22 中国移动通信集团北京有限公司 KQI perception limited condition determining method, device, equipment and computer storage medium
CN117596133A (en) * 2024-01-18 2024-02-23 山东中测信息技术有限公司 Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110365503B (en) * 2018-03-26 2022-08-19 华为技术有限公司 Index determination method and related equipment thereof
CN108875365B (en) * 2018-04-22 2023-04-07 湖南省金盾信息安全等级保护评估中心有限公司 Intrusion detection method and intrusion detection device
CN110659731B (en) * 2018-06-30 2022-05-17 华为技术有限公司 Neural network training method and device
CN111327450B (en) * 2018-12-17 2022-09-27 中国移动通信集团北京有限公司 Method, device, equipment and medium for determining quality difference reason
CN109729540B (en) * 2019-01-18 2022-05-17 福建福诺移动通信技术有限公司 Base station parameter automatic optimization method based on neural network
CN112153663B (en) * 2019-06-26 2022-04-05 大唐移动通信设备有限公司 Wireless network evaluation method and device
CN110493803B (en) * 2019-09-17 2023-07-11 南京邮电大学 Cell scene division method based on machine learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102612060A (en) * 2012-03-31 2012-07-25 西安交通大学 Evaluation method based on entropy value calculation and used for compatibility of cross-layer design
CN102625344A (en) * 2012-03-13 2012-08-01 重庆信科设计有限公司 Model and method for evaluating user experience quality of mobile terminal
CN102685789A (en) * 2012-05-22 2012-09-19 北京东方文骏软件科技有限责任公司 Method for evaluating QoE (Quality Of Experience) of voice service user perception experience by simulating user behaviors
CN102685717A (en) * 2012-05-08 2012-09-19 中国联合网络通信集团有限公司 Network service quality parameter identification method and device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102098719A (en) * 2011-01-11 2011-06-15 大唐移动通信设备有限公司 Method and device for determining network quality
CN102104900A (en) * 2011-01-27 2011-06-22 大唐移动通信设备有限公司 Method and equipment for analyzing user perception
CN103138963B (en) * 2011-11-25 2016-08-03 华为技术有限公司 A kind of network problem localization method based on user's perception and device
CN102685791B (en) * 2012-05-22 2014-09-10 北京东方文骏软件科技有限责任公司 Method for evaluating user quality of experience (QoE) of WAP (Wireless Application Protocol) services by simulating user behavior
US9424121B2 (en) * 2014-12-08 2016-08-23 Alcatel Lucent Root cause analysis for service degradation in computer networks
CN104994133B (en) * 2015-05-22 2018-08-21 华中科技大学 A kind of mobile Web web page access user experience perception evaluating method based on network KPI
CN105050125B (en) * 2015-06-23 2019-01-29 武汉虹信通信技术有限责任公司 A kind of mobile data service quality evaluating method and device of user oriented experience

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110837841A (en) * 2018-08-17 2020-02-25 北京亿阳信通科技有限公司 KPI (Key performance indicator) degradation root cause identification method and device based on random forest
CN110837841B (en) * 2018-08-17 2024-05-21 北京亿阳信通科技有限公司 KPI degradation root cause identification method and device based on random forest
CN110288467A (en) * 2019-04-19 2019-09-27 平安科技(深圳)有限公司 Data digging method, device, electronic equipment and storage medium
CN110288467B (en) * 2019-04-19 2023-07-25 平安科技(深圳)有限公司 Data mining method and device, electronic equipment and storage medium
US20210081833A1 (en) * 2019-09-18 2021-03-18 International Business Machines Corporation Finding root cause for low key performance indicators
US11816542B2 (en) * 2019-09-18 2023-11-14 International Business Machines Corporation Finding root cause for low key performance indicators
CN114386728A (en) * 2020-10-19 2022-04-22 中国移动通信集团北京有限公司 KQI perception limited condition determining method, device, equipment and computer storage medium
CN112950908A (en) * 2021-02-03 2021-06-11 重庆川仪自动化股份有限公司 Data monitoring and early warning method, system, medium and electronic terminal
CN117596133A (en) * 2024-01-18 2024-02-23 山东中测信息技术有限公司 Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data
CN117596133B (en) * 2024-01-18 2024-04-05 山东中测信息技术有限公司 Service portrayal and anomaly monitoring system and monitoring method based on multidimensional data

Also Published As

Publication number Publication date
CN107645393A (en) 2018-01-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17830310

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17830310

Country of ref document: EP

Kind code of ref document: A1