WO2024114567A1 - Network problem analysis method and related device - Google Patents

Network problem analysis method and related device

Info

Publication number
WO2024114567A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
kpi
information
abnormal
affected
Prior art date
Application number
PCT/CN2023/134285
Other languages
English (en)
French (fr)
Inventor
林科
缪丹丹
陈昕
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2024114567A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis

Definitions

  • the present application relates to the field of network technology, and in particular to a network problem analysis method and related devices.
  • Network problems include but are not limited to: network element alarms, network element performance degradation, and network element change accidents.
  • network problems are associated, compressed, or filtered to reduce the number of network problems that need to be processed. Based on the experience or rules of network operation and maintenance engineers, the compressed network problems are dispatched for processing.
  • the above method performs co-occurrence correlation analysis on multiple network problems to describe the correlation of these network problems. Then, predefined rules are used to classify the severity of network problems with high correlation.
  • an embodiment of the present application provides a network problem analysis method, the method being applied to a first network, the first network including a plurality of network elements, the method including:
  • acquiring network problem data information, where the network problem data information indicates a network problem occurring in a network element in the first network;
  • the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator KPI of the user;
  • the network problem analysis result indicates the degree to which the KPI is affected by the network problem
  • the network problem analysis results include:
  • the confidence of the KPI, and/or the degree of degradation of the KPI, wherein the confidence of the KPI is the confidence that the KPI is affected by the network problem, and the degree of degradation of the KPI indicates the degree of change of the KPI after the network problem occurs.
  • the network problem data information includes, but is not limited to: network element alarm information, network element performance degradation information, and network element change information.
  • the alarm information of a network element depends on the type of the network element. When the network element is a cell, the alarm information may be unavailability alarm information of the cell. Here the cell is regarded as a logical network element: a network device may include (or manage) one or more cells and provides wireless communication services to users (such as terminal devices) through the cells it manages. When the network element is a network device such as a radio frequency unit, the alarm information may be link interruption information of the network device. When the network element is a network device such as a site, the alarm information may be out-of-service alarm information of the network device.
  • the performance degradation information of the network element includes: a connection establishment failure rate greater than a preset threshold, or a device load greater than a preset threshold.
  • the change information of the network element includes: the configuration information of the neighboring cell is changed, or a new carrier frequency is added.
  • the network problem analysis device can also perform standardization processing on the information to obtain standardized network problem data information.
  • the standardized network problem data information includes: the identification of the network problem, the type of the network problem, and the location of the network problem (i.e., the identification information of the network element where the network problem occurs).
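As an illustration of the standardization step, the following is a minimal sketch that maps a raw record onto the three standardized fields named above. The raw field names (`id`, `type`, `ne`) and the class and function names are hypothetical; the source does not define a concrete record layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkProblem:
    """Standardized network problem record (field names are illustrative)."""
    problem_id: str    # identification of the network problem
    problem_type: str  # e.g. "alarm", "performance_degradation", "change"
    element_id: str    # location: the network element where the problem occurs


def standardize_problem(raw: dict) -> NetworkProblem:
    """Map a raw, source-specific record onto the standardized fields.

    The raw keys "id", "type" and "ne" are assumptions made for this sketch.
    """
    return NetworkProblem(
        problem_id=str(raw["id"]).strip(),
        problem_type=str(raw["type"]).strip().lower().replace(" ", "_"),
        element_id=str(raw["ne"]).strip(),
    )


example = standardize_problem(
    {"id": 17, "type": "Performance Degradation", "ne": " cell-42 "}
)
```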
  • the user data information refers to the user data information of each user served by the network in the first network.
  • the user data information includes but is not limited to: signaling, messages, or information sent and received by the network element.
  • the user data information also includes: the user's KPI.
  • the user's KPI includes but is not limited to: rate, number of scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial transmission bit error rate, signal-to-interference and noise ratio (SINR), channel quality information (CQI), resource allocation success rate, available resources, interference, or reference signal received power (RSRP), etc.
  • the network problem analysis device can also standardize the information to obtain standardized user data information.
  • different users' KPIs may use different units, and the KPIs of these users are standardized to obtain KPIs of the same unit.
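A minimal sketch of unit standardization for one KPI (the rate), assuming the inputs arrive in mixed bit-rate units. The unit names, the choice of Mbit/s as the common unit, and the function name are assumptions for illustration only.

```python
# Conversion factors to a common unit (Mbit/s). The set of units is an
# assumption; the source only says that different users' KPIs may use
# different units.
TO_MBPS = {"bps": 1e-6, "kbps": 1e-3, "mbps": 1.0, "gbps": 1e3}


def standardize_rate(value: float, unit: str) -> float:
    """Convert a rate expressed in `unit` to Mbit/s."""
    key = unit.strip().lower()
    if key not in TO_MBPS:
        raise ValueError(f"unknown rate unit: {unit!r}")
    return value * TO_MBPS[key]


# Two users report the same kind of KPI in different units.
rates = [standardize_rate(2500, "kbps"), standardize_rate(3, "Mbps")]
```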
  • a network problem analysis device integrates network problem data information and user data information to obtain a network problem analysis result, and the network problem analysis result indicates the degree to which the user data information is affected by the network problem.
  • the user data information includes the user's key performance indicator KPI.
  • the network problem analysis result includes the confidence of the KPI and/or the degree of degradation of the KPI, wherein the confidence of the KPI is the confidence of the degree to which the KPI is affected by the network problem, and the degree of degradation of the KPI indicates the degree of change of the KPI after the network problem occurs.
  • the network problem analysis result indicates the degree of impact caused by the quantified network problem, and improves the accuracy of analyzing the impact caused by the network problem.
  • generating the network problem analysis result according to the network problem data information and the user data information includes:
  • the abnormal user is a user who migrates into, migrates out of, and/or resides in the abnormal network element after the network problem occurs;
  • the network problem analysis result is generated according to the KPI of the abnormal user, and the KPI indicated by the network problem analysis result belongs to the abnormal user.
  • an abnormal network element is determined from multiple network elements of the first network.
  • the network element in the first network affected by the network problem is referred to as an abnormal network element.
  • the abnormal network element includes: a network element where a network problem occurs and a network element having a business association relationship with the network element where the network problem occurs.
  • the network element where the network problem occurs is referred to as the first network element
  • the network element having a business association relationship with the first network element is referred to as the second network element.
  • the network problem analysis device associates the network problem with the first network element according to the network problem data information.
  • abnormal users are identified from the users served by the abnormal network element.
  • Abnormal users are more affected by the network problem. For example, users who migrated into, migrated out of, or stayed in the abnormal network element within 1 hour after the network problem occurred are considered abnormal users.
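The one-hour example can be sketched as a simple filter over user mobility events. The event tuple layout `(user_id, element_id, kind, timestamp)` and the kind labels are hypothetical; the source does not specify how migration and residence are recorded.

```python
from datetime import datetime, timedelta

# Event kinds assumed for this sketch: migrate in, migrate out, reside.
KINDS = {"in", "out", "stay"}


def find_abnormal_users(events, abnormal_elements, problem_time,
                        window=timedelta(hours=1)):
    """Return users with an in/out/stay event on an abnormal network element
    within `window` after the network problem occurred."""
    end = problem_time + window
    return {
        user
        for user, element, kind, ts in events
        if element in abnormal_elements
        and kind in KINDS
        and problem_time <= ts <= end
    }


t0 = datetime(2023, 1, 1, 12, 0)
events = [
    ("u1", "A", "in", t0 + timedelta(minutes=5)),
    ("u2", "A", "stay", t0 + timedelta(minutes=90)),  # outside the window
    ("u3", "Z", "out", t0 + timedelta(minutes=10)),   # not an abnormal element
    ("u4", "B", "out", t0 + timedelta(minutes=59)),
]
abnormal_users = find_abnormal_users(events, {"A", "B"}, t0)
```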
  • generating the network problem analysis result according to the KPI of the abnormal user includes:
  • the network problem analysis result is generated according to the affected KPI of the abnormal user, and the KPI indicated by the network problem analysis result is the affected KPI.
  • the KPIs affected by the network problem are referred to as the affected KPIs.
  • the KPIs affected by the network problem are further determined, thereby further reducing the complexity of the calculation process and improving the accuracy of the network problem analysis.
  • determining the abnormal network element from a plurality of network elements included in the first network according to the network problem data information includes:
  • First network information is generated, where the first network information indicates a network topology structure of the abnormal network element, and the network topology structure indicates a service association relationship between network elements.
  • abnormal network elements are found from the first network according to the network problem data information, and then first network information is generated, which indicates the network topology structure of the abnormal network elements. The business association relationship between the abnormal network elements can then be determined conveniently from the first network information, reducing the complexity of calculation processing and improving the accuracy of network problem analysis.
  • marking the abnormal network element from a plurality of network elements included in the first network according to the network problem data information includes:
  • Network resource data information includes any one or more of the following information: configuration information of the first network or engineering parameter information of the first network;
  • the second network information indicates a network topology structure of the first network
  • the abnormal network element is determined from a plurality of network elements indicated by the second network information, wherein the plurality of network elements indicated by the second network information are a plurality of network elements included in the first network.
  • the network problem analysis device obtains network resource data information, and the network resource data information includes any one or more of the following information: configuration information of the first network, or engineering parameter information of the first network.
  • the second network information indicates the network topology of the first network.
  • the network topology indicates the service association relationship between network elements. Regarding the service association relationship and the physical connection relationship, taking network element A (the network element A is a base station) and network element B (the network element B is a base station) in a wireless network as an example, there is a physical connection between network element A and network element B.
  • network element A and network element B are configured as logical neighboring cells to achieve a service association relationship between network element A and network element B.
  • an abnormal network element is determined from multiple network elements indicated by the second network information (that is, multiple network elements of the first network).
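A minimal sketch of these two steps, assuming the service association relationships are available as pairs of logical neighbors taken from the configuration data. The adjacency-set representation and the function names are illustrative choices, not the format used in the source.

```python
def build_topology(neighbor_config):
    """Build an undirected service-association topology.

    `neighbor_config` is an iterable of (element_a, element_b) pairs that are
    configured as logical neighbors (an assumed input format).
    """
    topology = {}
    for a, b in neighbor_config:
        topology.setdefault(a, set()).add(b)
        topology.setdefault(b, set()).add(a)
    return topology


def find_abnormal_elements(topology, first_elements):
    """Abnormal elements: each element where a problem occurs (first network
    element) plus the elements associated with it (second network elements)."""
    abnormal = set()
    for first in first_elements:
        abnormal.add(first)
        abnormal |= topology.get(first, set())
    return abnormal


topology = build_topology([("A", "B"), ("B", "C"), ("C", "D")])
abnormal = find_abnormal_elements(topology, {"B"})
```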
  • the network element affected by the network problem in the first network is referred to as an abnormal network element.
  • the first network information is generated, and the first network information indicates the network topology of the abnormal network element.
  • network element sets include the first network element and the second network element
  • network element set #1 and network element set #2 include the same network element.
  • the network problem occurring in network element set #1 may be different from the network problem occurring in network element set #2, so the same network element included in network element set #1 and network element set #2 means that the network element may be affected by two different network problems at the same time.
  • network element set #1 and network element set #2 are merged, the merged network element set is output, and the corresponding network topology subgraph is generated for subsequent processing.
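The merge of network element sets that share a network element can be sketched as follows. This is one straightforward way to do it, under the assumption that each set is a plain collection of element identifiers; generating the topology subgraph for the merged set is omitted.

```python
def merge_overlapping(sets):
    """Merge network element sets that share at least one network element."""
    merged = []
    for current in map(set, sets):
        # Absorb every already-merged set that overlaps the current one.
        overlapping = [s for s in merged if s & current]
        for s in overlapping:
            current |= s
            merged.remove(s)
        merged.append(current)
    return merged


# Set #1 and set #2 both contain element "B", so they are merged.
result = merge_overlapping([{"A", "B"}, {"B", "C"}, {"D"}])
```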
  • the first network information indicates the network topology of a first network element and a second network element, wherein the first network element is a network element in the first network where the network problem occurs, and the second network element is a network element having a business association relationship with the first network element.
  • the network problem analysis device associates the network problem with the first network element based on the network problem data information.
  • determining the abnormal network element from a plurality of network elements indicated by the second network information according to the network problem data information includes:
  • the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements included in the abnormal network element group, and a change in the service migration information of the abnormal network elements before and after the network problem occurs meets a first threshold;
  • the service association relationship between the abnormal network elements is marked as abnormal, and third network information is generated according to the first network information, wherein the third network information indicates a topological structure of the abnormal network element group.
  • an abnormal network element group can also be determined from the first network information indicating the first network element and the second network element, and the abnormal network element group includes at least two abnormal network elements.
  • the change in the service migration information of the abnormal network element before and after the network problem occurs meets the first threshold.
  • the abnormal network element group that is greatly affected by the network problem is screened from the first network element and the second network element.
  • network elements that are significantly affected by the network problem can be screened from a large number of second network elements.
  • these second network elements and the first network element are then analyzed together as abnormal network elements to find the abnormal users served by the abnormal network elements. This effectively reduces the complexity of data processing, reduces the difficulty of calculation, and improves the accuracy of network problem analysis.
  • the service migration information indicates statistical characteristic information of the users served by the network element who migrate into, migrate out of, and/or reside in the network element.
  • the statistical characteristic information includes, but is not limited to: total number, proportion, mean, and/or variance, etc.
  • take network element #A as an example, where one of the network elements that has a business association relationship with network element #A is network element #B.
  • the total number of users who migrated from network element #A to network element #B within the hour before the network problem occurred is counted every 10 minutes, giving: 100, 99, 98, 102, 103, 95.
  • in the 10 minutes after the network problem occurs, the total number of users who migrated in is counted in the same way, giving 25.
  • hypothesis testing is used to determine whether the change between the statistical feature information #1 and the statistical feature information #2 is significant.
  • for example, the probability that the total number of migrated users after the network problem occurs has no significant change compared with the totals in the hour before the network problem occurs is 0.04. Because 0.04 is less than the preset first threshold of 0.1, the change is significant, and the business association relationship between network element #A and network element #B is marked as an abnormal business association relationship.
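The source does not say which hypothesis test is used. As a minimal sketch, the example counts can be checked with a two-sided z-test under a normal approximation, which is an assumption made here for illustration. The p-value this particular test produces for the example counts is far below 0.1 and is not meant to reproduce the illustrative value of 0.04.

```python
from statistics import NormalDist, mean, stdev


def migration_change_p_value(before_counts, after_count):
    """Two-sided p-value for `after_count` under a normal model fitted to the
    counts observed before the problem (an assumed test, not the source's)."""
    mu = mean(before_counts)
    sigma = stdev(before_counts)
    if sigma == 0:
        return 1.0 if after_count == mu else 0.0
    z = (after_count - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


FIRST_THRESHOLD = 0.1  # preset first threshold from the example

before = [100, 99, 98, 102, 103, 95]  # users migrating #A -> #B per 10 minutes
p_value = migration_change_p_value(before, 25)

# A p-value below the threshold marks the #A-#B association as abnormal.
association_is_abnormal = p_value < FIRST_THRESHOLD
```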
  • determining the affected KPI of the abnormal user from the KPI of the abnormal user includes:
  • the affected KPI is determined from the KPI of the abnormal user according to the confidence of the KPI of the abnormal user, wherein the confidence of the affected KPI is greater than a second threshold.
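Selecting the affected KPIs by confidence can be sketched as a threshold filter. The confidence values and the threshold of 0.6 below are made-up numbers for illustration; the source does not give a value for the second threshold.

```python
SECOND_THRESHOLD = 0.6  # hypothetical value


def select_affected_kpis(confidence_by_kpi, second_threshold=SECOND_THRESHOLD):
    """Keep only KPIs whose confidence exceeds the second threshold."""
    return {
        kpi: confidence
        for kpi, confidence in confidence_by_kpi.items()
        if confidence > second_threshold
    }


# Illustrative confidences for one abnormal user's KPIs.
confidences = {"RSRP": 0.92, "SINR": 0.75, "rate": 0.40}
affected = select_affected_kpis(confidences)
```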
  • At least one KPI of the abnormal user is determined from the user data information of the abnormal user served by the abnormal network element group (that is, by the abnormal network elements included in the abnormal network element group).
  • the KPI may be a standardized KPI.
  • obtaining the KPI value of the abnormal user includes: obtaining the KPI value before the network problem occurs (KPIbefore) and the KPI value after the network problem occurs (KPIafter).
  • the RSRP value range is divided into four intervals, namely -130 dBm (decibels relative to one milliwatt) to -120 dBm, -120 dBm to -110 dBm, -110 dBm to -100 dBm, and -100 dBm to -90 dBm.
  • 100 RSRP values are collected in the 5 minutes before the network problem occurs. The proportions falling into these four intervals are 10%, 30%, 40%, and 20%.
  • these proportions are denoted f(KPI_before), and f(KPI_before) is used as the distribution of KPIbefore (i.e., the probability distribution of KPIbefore).
  • 100 RSRP values are collected in the 5 minutes after the network problem occurs. The proportions falling into these four intervals are 20%, 40%, 30%, and 10%.
  • these proportions are denoted f(KPI_after), and f(KPI_after) is used as the distribution of KPIafter (i.e., the probability distribution of KPIafter).
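Turning raw RSRP samples into such a distribution can be sketched as a histogram over the four intervals. The sketch assumes half-open intervals and an upper bound of -90 dBm for the last interval; the sample values are synthetic and chosen to reproduce the "before" proportions from the example.

```python
from bisect import bisect_right

# Interval edges in dBm; the last edge (-90) is assumed from the 10 dB spacing.
EDGES = [-130, -120, -110, -100, -90]


def interval_proportions(samples, edges=EDGES):
    """Proportion of samples in each half-open interval [edges[i], edges[i+1]).

    Samples outside the overall range are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for value in samples:
        if edges[0] <= value < edges[-1]:
            counts[bisect_right(edges, value) - 1] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside the interval range")
    return [count / total for count in counts]


# 100 synthetic RSRP values matching the "before" example proportions.
samples_before = [-125] * 10 + [-115] * 30 + [-105] * 40 + [-95] * 20
f_kpi_before = interval_proportions(samples_before)
```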
  • the difference between the distributions of the two KPIs is calculated.
  • the method for calculating the difference is, for example, JS divergence (value range [0,1]). The smaller the JS divergence, the smaller the difference; the larger the JS divergence, the larger the difference.
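A minimal sketch of the JS divergence for the two example distributions. Base-2 logarithms are used so that the result lies in [0, 1], matching the stated value range; the function names are illustrative.

```python
from math import log2


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 are 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value is in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


f_kpi_before = [0.10, 0.30, 0.40, 0.20]  # proportions before the problem
f_kpi_after = [0.20, 0.40, 0.30, 0.10]   # proportions after the problem
difference = js_divergence(f_kpi_before, f_kpi_after)
```

For these two distributions the divergence is small (well under 0.1), reflecting that the RSRP distribution shifted by one interval rather than changing shape entirely.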
  • the KPIs of abnormal users include: rate, number of scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial transmission bit error rate, signal-to-interference-plus-noise ratio (SINR), channel quality information (CQI), resource allocation success rate, available resources, interference, and reference signal received power (RSRP).
  • quantified network problem analysis results are output, making the network problem analysis results more intuitive and improving the accuracy of network problem analysis.
  • generating the network problem analysis result according to the affected KPI of the abnormal user includes:
  • the network problem analysis result is generated according to the fourth network information and the network problem.
  • the communication mechanism relationship can also be called a communication mechanism relationship diagram.
  • the communication mechanism relationship diagram is used to indicate the impact relationship between various KPIs.
  • the impact relationship indicated by the communication mechanism relationship diagram conforms to the impact mechanism described by the communication protocol.
  • a communication network engineer can predefine the communication mechanism relationship diagram, which describes the impact of different types of network problems on KPIs, as well as the causal order of impact between different KPIs.
  • the fourth network information further includes: a degradation degree of a key KPI and a confidence level of the affected KPI, wherein the key KPI belongs to the affected KPI, so that the network problem analysis result more accurately reflects the impact of the network problem on the KPI.
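One way to picture the communication mechanism relationship diagram is as a directed graph whose edges point from cause to effect, with the fourth network information being the part of that graph that connects the network problem to the affected KPIs. The edges below are invented for illustration and are not taken from the source or from any communication protocol.

```python
# Hypothetical mechanism diagram: node -> nodes it influences.
MECHANISM = {
    "cell_unavailable": ["RSRP"],
    "interference": ["SINR"],
    "RSRP": ["SINR"],
    "SINR": ["CQI"],
    "CQI": ["MCS"],
    "MCS": ["rate"],
}


def impact_subgraph(mechanism, problem, affected_kpis):
    """Edges reachable from `problem` that end in an affected KPI.

    The walk stops at KPIs that are not affected, so every returned edge is
    also an edge of the mechanism diagram.
    """
    keep = set(affected_kpis)
    edges, stack, seen = [], [problem], {problem}
    while stack:
        node = stack.pop()
        for successor in mechanism.get(node, []):
            if successor in keep:
                edges.append((node, successor))
                if successor not in seen:
                    seen.add(successor)
                    stack.append(successor)
    return edges


edges = impact_subgraph(MECHANISM, "cell_unavailable", {"RSRP", "SINR", "rate"})
```

In this example `rate` is affected but is not returned, because the intermediate KPIs `CQI` and `MCS` are not in the affected set. Whether such gaps should be bridged is a design decision the source does not address.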
  • the network problem analysis result includes any one or more of the following:
  • the network problem, the abnormal network element where the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degree of degradation of the affected KPI;
  • the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem
  • the degree of degradation of the affected KPI indicates the degree of change of the affected KPI after the occurrence of the network problem.
  • the degradation degree of the KPI includes any one or more of the following:
  • the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
  • the statistic includes any one or more of the following:
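The ratio form of the degradation degree can be sketched as follows. The list of statistics is not given in the text above, so the mean is used here purely as an assumed example of a statistic, and the KPI values and threshold are made up.

```python
from statistics import mean


def degradation_degree(kpi_after_values, third_threshold):
    """Ratio of a KPI statistic after the problem to the third threshold.

    The statistic is the mean, which is an assumption for this sketch.
    """
    if third_threshold == 0:
        raise ValueError("third threshold must be non-zero")
    return mean(kpi_after_values) / third_threshold


# Illustrative values: rate samples (Mbit/s) after the problem, threshold 10.
degree = degradation_degree([4.0, 5.0, 6.0], third_threshold=10.0)
```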
  • the network problem data information includes any one or more of the following information:
  • the alarm information of the network element, the performance degradation information of the network element, or the change information of the network element.
  • an embodiment of the present application provides a network problem analysis device, the network problem analysis device is applied to a first network, the first network includes multiple network elements, and the network problem analysis device includes:
  • a transceiver module configured to obtain network problem data information, wherein the network problem data information indicates a network problem occurring in a network element in the first network
  • the transceiver module is further used to obtain user data information, where the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator KPI of the user;
  • a processing module configured to generate a network problem analysis result according to the network problem data information and the user data information, wherein the network problem analysis result indicates the degree to which the KPI is affected by the network problem
  • the network problem analysis results include:
  • the confidence of the KPI, and/or the degree of degradation of the KPI, wherein the confidence of the KPI is the confidence that the KPI is affected by the network problem, and the degree of degradation of the KPI indicates the degree of change of the KPI after the network problem occurs.
  • the processing module is further configured to determine an abnormal network element from a plurality of network elements included in the first network according to the network problem data information, wherein the abnormal network element is a network element affected by the network problem;
  • the processing module is further used to determine an abnormal user from a plurality of users served by the abnormal network element, wherein the abnormal user is a user who migrates into, migrates out of, and/or resides in the abnormal network element after the network problem occurs;
  • the processing module is further configured to generate the network problem analysis result according to the KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result belongs to the abnormal user.
  • the processing module is further configured to determine an affected KPI of the abnormal user from the KPI of the abnormal user, wherein the affected KPI is a KPI of the abnormal user that is affected by the network problem among the multiple KPIs of the abnormal user;
  • the processing module is further configured to generate the network problem analysis result according to the affected KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result is the affected KPI.
  • the processing module is further used to determine the abnormal network element from the multiple network elements included in the first network according to the network problem data information;
  • the processing module is further used to generate first network information, where the first network information indicates the network topology structure of the abnormal network element, and the network topology structure indicates the service association relationship between the network elements.
  • the transceiver module is further used to obtain network resource data information, where the network resource data information includes any one or more of the following information: configuration information of the first network or engineering parameter information of the first network;
  • the processing module is further configured to generate second network information according to the network resource data information, wherein the second network information indicates a network topology structure of the first network;
  • the processing module is further used to determine the abnormal network element from the multiple network elements indicated by the second network information according to the network problem data information, wherein the multiple network elements indicated by the second network information are the multiple network elements included in the first network.
  • the first network information indicates a network topology of a first network element and a second network element, wherein the first network element is a network element in the first network where the network problem occurs, and the second network element is a network element having a business association relationship with the first network element.
  • the processing module is further used to determine at least one group of abnormal network elements from the first network element and the second network element according to the service migration information of the first network element and the service migration information of the second network element, wherein the service migration information indicates the statistical characteristic information of the migration in, migration out and/or residence of the user served by the network element;
  • the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements included in the abnormal network element group, and a change in the service migration information of the abnormal network elements before and after the network problem occurs meets a first threshold;
  • the processing module is further used to mark the service association relationship between the abnormal network elements as abnormal, generate third network information according to the first network information, and the third network information indicates the topological structure of the abnormal network element group.
  • the processing module is further used to determine the abnormal user served by the abnormal network element according to the third network information
  • the processing module is further configured to determine the affected KPI from the KPI of the abnormal user according to the confidence of the KPI of the abnormal user, wherein the confidence of the affected KPI is greater than a second threshold.
  • the processing module is further configured to generate fourth network information according to the communication mechanism relationship and the affected KPI of the abnormal user, wherein the fourth network information indicates an influence relationship between the affected KPI and the network problem, the influence relationship indicated by the fourth network information is consistent with the influence relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates an influence relationship between multiple KPIs and the network problem;
  • the processing module is further used to generate the network problem analysis result according to the fourth network information and the network problem.
  • the fourth network information further includes: a degradation degree of a key KPI and a confidence level of the affected KPI, and the key KPI belongs to the affected KPI.
  • the network problem analysis result includes any one or more of the following:
  • the network problem, the abnormal network element where the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degree of degradation of the affected KPI;
  • the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem
  • the degree of degradation of the affected KPI indicates the degree of change of the affected KPI after the occurrence of the network problem.
  • the degradation degree of the KPI includes any one or more of the following:
  • the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
  • the statistics include any one or more of the following:
  • the network problem data information includes any one or more of the following information:
  • the alarm information of the network element, the performance degradation information of the network element, or the change information of the network element.
  • an embodiment of the present application provides a computing device, the computing device is applied to a first network, the first network includes a plurality of network elements, the computing device includes: a communication interface and a processor;
  • the communication interface is used to obtain network problem data information, where the network problem data information indicates a network problem occurring in a network element in the first network;
  • the communication interface is further used to obtain user data information, where the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator KPI of the user;
  • a processor configured to generate a network problem analysis result according to the network problem data information and the user data information, wherein the network problem analysis result indicates the degree to which the KPI is affected by the network problem
  • the network problem analysis results include:
  • the confidence of the KPI, and/or the degree of degradation of the KPI, wherein the confidence of the KPI is the confidence that the KPI is affected by the network problem, and the degree of degradation of the KPI indicates the degree of change of the KPI after the network problem occurs.
  • the processor is further configured to determine an abnormal network element from a plurality of network elements included in the first network according to the network problem data information, wherein the abnormal network element is a network element affected by the network problem;
  • the processor is further configured to determine an abnormal user from among a plurality of users served by the abnormal network element, wherein the abnormal user is a user who migrates into, migrates out of, and/or resides in the abnormal network element after the network problem occurs;
  • the processor is further configured to generate the network problem analysis result according to the KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result belongs to the abnormal user.
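  • As a rough illustration of the abnormal-user selection described above, the following Python sketch assumes that user mobility is available as a list of timestamped events. The `MobilityEvent` record, its field names, and the function `abnormal_users` are hypothetical and are not defined in the present application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MobilityEvent:
    """Hypothetical record of one user's relation to a network element."""
    user_id: str
    ne_id: str
    kind: str         # assumed values: "migrate_in", "migrate_out", "reside"
    timestamp: float  # seconds, same clock as the problem time


def abnormal_users(events, abnormal_nes, problem_time):
    """Return the users that migrate into, migrate out of, or reside in
    an abnormal network element at or after the problem time."""
    return {
        e.user_id
        for e in events
        if e.ne_id in abnormal_nes and e.timestamp >= problem_time
    }


events = [
    MobilityEvent("u1", "cell-1", "reside", 120.0),
    MobilityEvent("u2", "cell-1", "migrate_out", 90.0),   # before the problem
    MobilityEvent("u3", "cell-2", "migrate_in", 130.0),   # unaffected element
    MobilityEvent("u4", "cell-1", "migrate_in", 150.0),
]
selected = abnormal_users(events, abnormal_nes={"cell-1"}, problem_time=100.0)
```

  • Only the KPIs of the users in `selected` would then be passed on to the later analysis steps.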
  • the processor is further configured to determine an affected KPI of the abnormal user from the KPI of the abnormal user, wherein the affected KPI is a KPI of the abnormal user that is affected by the network problem among the multiple KPIs of the abnormal user;
  • the processor is further configured to generate the network problem analysis result according to the affected KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result is the affected KPI.
  • the processor is further configured to determine the abnormal network element from a plurality of network elements included in the first network according to the network problem data information;
  • the processor is further configured to generate first network information, wherein the first network information indicates a network topology structure of the abnormal network element, and the network topology structure indicates a service association relationship between network elements.
  • the communication interface is further used to obtain network resource data information, wherein the network resource data information includes any one or more of the following information: configuration information of the first network or engineering parameter information of the first network;
  • the processor is further configured to generate second network information according to the network resource data information, wherein the second network information indicates a network topology structure of the first network;
  • the processor is further used to determine the abnormal network element from multiple network elements indicated by the second network information according to the network problem data information, wherein the multiple network elements indicated by the second network information are multiple network elements included in the first network.
  • the first network information indicates a network topology of a first network element and a second network element, wherein the first network element is a network element in the first network where the network problem occurs, and the second network element is a network element having a business association relationship with the first network element.
  • the processor is further configured to determine at least one group of abnormal network elements from the first network element and the second network element according to the service migration information of the first network element and the service migration information of the second network element, wherein the service migration information indicates statistical feature information of migration in, migration out and/or residence of the user served by the network element;
  • the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements included in the abnormal network element group, and a change in the service migration information of the abnormal network elements before and after the network problem occurs meets a first threshold;
  • the processor is further configured to mark the service association relationship between the abnormal network elements as abnormal, and generate third network information according to the first network information, wherein the third network information indicates a topological structure of the abnormal network element group.
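  • The grouping of abnormal network elements described above can be sketched as follows. This is a minimal Python sketch under stated assumptions: the service migration information of each network element is reduced to a single number (for example a hand-in count), and "meets a first threshold" is read as "absolute change is at least the threshold". The function name and the data layout are hypothetical.

```python
def find_abnormal_associations(edges, migration_before, migration_after,
                               first_threshold):
    """Return the service associations (edges) whose two endpoints both
    show a change in migration statistics of at least first_threshold."""

    def changed(ne):
        # Assumption: one scalar migration statistic per network element.
        return abs(migration_after[ne] - migration_before[ne]) >= first_threshold

    return [(a, b) for a, b in edges if changed(a) and changed(b)]


# Service associations of the first network information (hypothetical data).
edges = [("A", "B"), ("B", "C")]
before = {"A": 100, "B": 90, "C": 50}
after = {"A": 20, "B": 170, "C": 52}

abnormal_edges = find_abnormal_associations(edges, before, after, 30)
abnormal_group = {ne for edge in abnormal_edges for ne in edge}
```

  • Here the association between A and B is marked as abnormal because both elements changed by 80, while C changed by only 2 and stays outside the group.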
  • the processor is further configured to determine, according to the third network information, the abnormal user served by the abnormal network element;
  • the processor is further configured to determine the affected KPI from the KPI of the abnormal user according to the confidence of the KPI of the abnormal user, wherein the confidence of the affected KPI is greater than a second threshold.
  • the processor is further configured to generate fourth network information according to the communication mechanism relationship and the affected KPI of the abnormal user, wherein the fourth network information indicates an influence relationship between the affected KPI and the network problem, the influence relationship indicated by the fourth network information is consistent with the influence relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates an influence relationship between multiple KPIs and the network problem;
  • the processor is further configured to generate the network problem analysis result according to the fourth network information and the network problem.
  • the fourth network information further includes: a degradation degree of a key KPI and a confidence level of the affected KPI, and the key KPI belongs to the affected KPI.
  • the network problem analysis result includes any one or more of the following:
  • the network problem, the abnormal network element where the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degree of degradation of the affected KPI;
  • the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem;
  • the degree of degradation of the affected KPI indicates the degree of change of the affected KPI after the occurrence of the network problem.
  • the degradation degree of the KPI includes any one or more of the following:
  • the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
  • the statistics include any one or more of the following:
  • the network problem data information includes any one or more of the following information:
  • the alarm information of the network element, the performance degradation information of the network element, or the change information of the network element.
  • a computing device executes the method in any one of the implementation modes in the first aspect.
  • a computing device cluster is provided, and a cloud computing system includes the computing device of the fourth aspect.
  • the present application provides a computer storage medium, which may be non-volatile; the computer storage medium stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the method in any one of the implementation modes in the first aspect is implemented.
  • the seventh aspect of the present application provides a computer program product comprising instructions, which, when executed on a computer, enables the computer to execute the method in any one of the implementations of the first aspect.
  • a chip system which includes a processor and an interface circuit, and is used to support the network device to implement the functions involved in the above aspects, for example, to send or process the data and/or information involved in the above methods.
  • the chip system also includes a memory, which is used to store program instructions and data necessary for the network device.
  • the chip system can be composed of a chip, or it can include a chip and other discrete devices.
  • FIG1 is a schematic diagram of an application scenario involved in an embodiment of the present application.
  • FIG2 is a schematic diagram of the structure of a computing device 100 in an embodiment of the present application.
  • FIG3 is a schematic diagram of the structure of a computing device cluster in an embodiment of the present application.
  • FIG4 is a schematic diagram of the structure of a computing device cluster in an embodiment of the present application.
  • FIG5 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application.
  • FIG6 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application.
  • FIG7 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application.
  • FIG8 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application.
  • FIG9 is a schematic diagram of a network topology structure of a first network in an embodiment of the present application.
  • FIG10 is a schematic diagram of a network topology structure of first network information in an embodiment of the present application.
  • FIG11 is a schematic diagram of another embodiment of the first network information in the embodiment of the present application.
  • FIG12 is a schematic diagram of a network topology of the third network information in an embodiment of the present application.
  • FIG13 is a schematic diagram of the structure of a communication mechanism relationship diagram in an embodiment of the present application.
  • FIG14 is a schematic diagram of a topology of fourth network information in an embodiment of the present application.
  • FIG15 is a schematic diagram of a possible computing device 1500 provided in an embodiment of the present application.
  • FIG. 16 is a schematic diagram of the structure of a computing device 1600 in an embodiment of the present application.
  • the term “including” and its variations represent open inclusion, i.e., “including but not limited to”. Unless otherwise stated, the term “or” means “and/or”. The term “based on” means “based at least in part on”. The terms “embodiment” and “some embodiments” mean “at least some embodiments”. The terms “first”, “second”, etc. are used to distinguish different objects, etc., and do not represent a sequence, nor do they limit the "first” and “second” to different types.
  • An autonomous network is a new-generation telecommunication network that can self-configure, self-heal, self-optimize, and self-evolve.
  • Network element: the smallest unit that can be monitored and managed in network management.
  • Co-occurrence analysis refers to an analysis method that quantifies the co-occurrence information in various information carriers. It can reveal the content association of information and the co-occurrence relationship implied by the feature items.
  • Network topology refers to the specific connection structure between network elements that constitute the network, where the nodes of the topology are network elements and the edges are the business association relationships between network elements.
  • the business association relationships can be physical or logical.
  • a graph in which the edges are directed is called a directed graph.
  • a connected path refers to a sequence of alternating nodes and edges between any two nodes in a graph. This sequence is called a connected path between the two nodes.
  • hypothesis testing refers to a statistical inference method used to determine whether the differences between samples and samples, or between samples and the population, are caused by sampling errors or essential differences. When the significance level of the test is lower than the preset threshold, the hypothesis is satisfied, otherwise it is not satisfied.
  • the significance level ranges from (0,1), and is generally set to 0.1, 0.05 or 0.01.
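  • As one possible illustration of hypothesis testing, the sketch below runs a two-sample permutation test on the difference of means and compares the resulting p-value with a preset significance level. The choice of a permutation test, the function name, and the sample values are assumptions made for illustration; the present application does not prescribe a particular test.

```python
import random
from statistics import mean


def permutation_test(sample_a, sample_b, n_resamples=2000, seed=0):
    """Estimate the p-value for 'both samples come from the same
    population' using the absolute difference of means as statistic."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)


significance_level = 0.05
kpi_before = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
kpi_after = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
p_value = permutation_test(kpi_before, kpi_after)
is_essential_difference = p_value < significance_level
```

  • A small p-value suggests that the difference between the two samples is unlikely to be explained by sampling error alone.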
  • Probability distribution refers to the probability law used to express the value of random variables.
  • Distribution similarity indicates the similarity between two probability distributions. Commonly used measures include cosine similarity, KL divergence, and earth mover's distance.
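  • Two of these measures can be sketched for discrete distributions as follows. The sketch assumes that both distributions are given as probability lists over the same equally spaced bins; the function names and the `eps` guard against division by zero are illustrative choices.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) in nats; q is floored at eps so that
    empty bins in q do not cause a division by zero."""
    return sum(a * math.log(a / max(b, eps)) for a, b in zip(p, q) if a > 0)


def earth_movers_distance(p, q):
    """1-D earth mover's distance for unit-spaced bins, computed as the
    sum of absolute differences between the two cumulative sums."""
    total = 0.0
    carried = 0.0
    for a, b in zip(p, q):
        carried += a - b
        total += abs(carried)
    return total


p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
emd = earth_movers_distance(p, q)   # all mass is shifted by one bin
kl_same = kl_divergence(p, p)
```

  • Note that KL divergence is not symmetric, whereas the earth mover's distance is, which may matter when choosing a measure.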
  • graph search method refers to the method of finding all paths from the start node to the target node by traversing the graph.
  • Commonly used search methods include: depth-first search, breadth-first search, etc.
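  • A depth-first variant of such a search can be sketched as follows. The adjacency-list representation and the function name `all_paths` are assumptions made for illustration; nodes already on the current path are skipped so that cycles do not cause endless recursion.

```python
def all_paths(graph, start, target):
    """Return every simple path from start to target in a directed
    graph given as {node: [successor, ...]}."""
    paths = []

    def dfs(node, path):
        if node == target:
            paths.append(list(path))
            return
        for nxt in graph.get(node, []):
            if nxt not in path:      # skip nodes already on this path
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(start, [start])
    return paths


# Hypothetical directed graph: A points to B and C, both point to D.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
found = all_paths(graph, "A", "D")
```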
  • Cumulative probability distribution: the cumulative probability distribution is used to express the probability of a random variable falling within any interval.
  • the cumulative probability distribution can be obtained by integrating the probability distribution.
  • Operation Support System (OSS): an OSS is a support platform necessary for the development and operation of telecommunications services.
  • OSS is an integrated, information resource-sharing support system for telecommunications operators. It is mainly composed of network management, system management, billing, business, accounting and customer service, and the systems are organically integrated through a unified information bus.
  • the operation and support system includes an operation and maintenance center and a network management center. It is responsible for the inspection and management of the communication quality and operation of the entire network, and records and collects various data in the operation of the entire network. It has connecting lines between all devices in the entire network, and performs monitoring and control functions on each device.
  • the network problem analysis device proposed in the embodiment of the present application can be divided into: a data monitoring module, a data storage module and an analysis and processing module according to the functional division.
  • the data monitoring module can collect network element data generated in real time by the OSS system during operation.
  • the data monitoring module monitors the network element data of the OSS system in real time, and uploads the network problem data information obtained based on the network element data to the data storage module for storage.
  • the data storage module is also used to store user data information of users served by the OSS system, and the user data information includes the user's key performance indicator (KPI).
  • the standardization processing refers to preprocessing the network problem data information and user data information into a standardized format to facilitate subsequent analysis and processing.
  • the analysis and processing module generates a network problem analysis result based on the network problem data information and the user data information, and the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
  • the network problem analysis device feeds back the network problem analysis result to the network maintenance engineer, and the network maintenance engineer processes the network running in the OSS system.
  • a network maintenance engineer sorts network problems and creates work orders based on the results of network problem analysis, and then uploads the work orders to the work order management server.
  • the work order management server dispatches multiple work orders to network maintenance engineers for processing according to priority.
  • the analysis and processing module, data storage module and data monitoring module in the network problem analysis device can be implemented by software or by hardware. The following takes the analysis and processing module as an example; the data storage module and the data monitoring module can be implemented in the same way.
  • the analysis and processing module may include code running on a computing instance.
  • the computing instance may include at least one of a physical host (computing device), a virtual machine, and a container. Further, the above-mentioned computing instance may be one or more.
  • the analysis and processing module may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code may be distributed in the same region (region) or in different regions.
  • the multiple hosts/virtual machines/containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, each AZ including one data center or multiple data centers with close geographical locations. Among them, usually a region may include multiple AZs.
  • multiple hosts/virtual machines/containers used to run the code can be distributed in the same virtual private cloud (VPC) or in multiple VPCs.
  • a VPC is set up in a region.
  • a communication gateway needs to be set up in each VPC to achieve interconnection between VPCs through the communication gateway.
  • the analysis and processing module may include at least one computing device, such as a server, etc.
  • the analysis and processing module may also be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
  • the PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL) or any combination thereof.
  • the multiple computing devices included in the analysis and processing module can be distributed in the same region or in different regions.
  • the multiple computing devices included in the analysis and processing module can be distributed in the same AZ or in different AZs.
  • the multiple computing devices included in the analysis and processing module can be distributed in the same VPC or in multiple VPCs.
  • the multiple computing devices can be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
  • the network problem analysis device may be implemented by software or hardware. As an example, the implementation of the network problem analysis device is described below.
  • a network problem analysis device may include code running on a computing instance.
  • the computing instance may be at least one of a physical host (computing device), a virtual machine, a container, and other computing devices.
  • the computing device may be one or more.
  • the network problem analysis device may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the application may be distributed in the same region or in different regions. The multiple hosts/virtual machines/containers used to run the code may be distributed in the same AZ or in different AZs, each AZ including a data center or multiple data centers with close geographical locations. Typically, a region may include multiple AZs.
  • multiple hosts/virtual machines/containers used to run the code can be distributed in the same VPC or in multiple VPCs.
  • a VPC is set up in a region.
  • a communication gateway must be set up in each VPC to achieve interconnection between VPCs through the communication gateway.
  • the network problem analysis device may include at least one computing device, such as a server, etc.
  • the network problem analysis device may also be a device implemented by ASIC or PLD, etc.
  • the PLD may be implemented by CPLD, FPGA, GAL or any combination thereof.
  • the multiple computing devices included in the network problem analysis device can be distributed in the same region or in different regions.
  • the multiple computing devices included in the network problem analysis device can be distributed in the same AZ or in different AZs.
  • the multiple computing devices included in the network problem analysis device can be distributed in the same VPC or in multiple VPCs.
  • the multiple computing devices can be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
  • the present application also provides a computing device 100.
  • the computing device 100 includes: a bus 102, a processor 104, a memory 106, and a communication interface 108.
  • the processor 104, the memory 106, and the communication interface 108 communicate with each other through the bus 102.
  • the computing device 100 may be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 100.
  • the bus 102 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the bus may be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one line is used in FIG. 2 , but it does not mean that there is only one bus or one type of bus.
  • the bus 102 may include a path for transmitting information between various components of the computing device 100 (e.g., the memory 106, the processor 104, and the communication interface 108).
  • the processor 104 may include any one or more processors such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP) or a digital signal processor (DSP).
  • the memory 106 may include a volatile memory, such as a random access memory (RAM).
  • the memory 106 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
  • the memory 106 stores executable program codes, and the processor 104 executes the executable program codes to respectively implement the functions of the aforementioned analysis and processing module, data storage module, and data monitoring module, thereby implementing the network problem analysis method. That is, the memory 106 stores instructions for executing the network problem analysis method.
  • the memory 106 stores executable codes
  • the processor 104 executes the executable codes to respectively implement the functions of the aforementioned network problem analysis device, thereby implementing the network problem analysis method. That is, the memory 106 stores instructions for executing the network problem analysis method.
  • the communication interface 108 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 100 and other devices or a communication network.
  • the embodiment of the present application also provides a computing device cluster.
  • the computing device cluster includes at least one computing device.
  • the computing device can be a server, such as a central server, an edge server, or a local server in a local data center.
  • the computing device can also be a terminal device such as a desktop computer, a laptop computer, or a smart phone.
  • the computing device cluster includes at least one computing device 100.
  • The memory 106 in one or more computing devices 100 in the computing device cluster may store the same instructions for executing the network problem analysis method.
  • the memory 106 of one or more computing devices 100 in the computing device cluster may also store partial instructions for executing the network problem analysis method.
  • the combination of one or more computing devices 100 may jointly execute instructions for executing the network problem analysis method.
  • the memory 106 in different computing devices 100 in the computing device cluster can store different instructions, which are respectively used to execute part of the functions of the network problem analysis device. That is, the instructions stored in the memory 106 in different computing devices 100 can implement the functions of one or more modules among the analysis and processing module, the data storage module and the data monitoring module.
  • one or more computing devices in a computing device cluster may be connected via a network.
  • the network may be a wide area network or a local area network, etc.
  • FIG. 4 shows a possible implementation. As shown in FIG. 4 , two computing devices 100A and 100B are connected via a network. Specifically, the network is connected via a communication interface in each computing device.
  • the memory 106 in the computing device 100A stores instructions for executing the functions of the analysis and processing module.
  • the memory 106 in the computing device 100B stores instructions for executing the functions of the data storage module and the data monitoring module.
  • The arrangement shown in FIG. 4 reflects the consideration that the network problem analysis method provided in the present application requires a large amount of data information; the functions implemented by the data storage module and the data monitoring module are therefore handed over to the computing device 100B for execution.
  • the functions of the computing device 100A shown in FIG4 may also be completed by multiple computing devices 100.
  • the functions of the computing device 100B may also be completed by multiple computing devices 100.
  • the embodiment of the present application also provides another computing device cluster.
  • the connection relationship between the computing devices in the computing device cluster can be similar to the connection mode of the computing device cluster described in Figures 3 and 4.
  • the difference is that the memory 106 in one or more computing devices 100 in the computing device cluster can store the same instructions for executing the network problem analysis method.
  • the memory 106 of one or more computing devices 100 in the computing device cluster may also store partial instructions for executing the network problem analysis method.
  • the combination of one or more computing devices 100 may jointly execute instructions for executing the network problem analysis method.
  • the memory 106 in different computing devices 100 in the computing device cluster can store different instructions for executing part of the functions of the network problem analysis system. That is, the instructions stored in the memory 106 in different computing devices 100 can implement the functions of one or more devices in the network problem analysis device.
  • FIG5 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application.
  • a network problem analysis method proposed in an embodiment of the present application includes:
  • the network problem analysis device obtains network problem data information of a network element in a first network.
  • the network problem data information includes but is not limited to: alarm information of the network element, performance degradation information of the network element, and change information of the network element.
  • the alarm information of a network element includes: when the network element is a cell, the alarm information of the network element may be unavailability alarm information of the cell, the cell is regarded as a logical network element, the network device may include (or manage) one or more cells, and the network device provides wireless communication services to users (such as terminal devices) through the cells managed by the network device; when the network element is a network device such as a radio frequency unit, the alarm information of the network element may be link interruption information of the network device; when the network element is a network device such as a site, the alarm information of the network element may be de-service alarm information of the network device.
  • the performance degradation information of the network element includes: a connection establishment failure rate greater than a preset threshold, or a device load greater than a preset threshold.
  • the change information of the network element includes: the configuration information of the neighboring cell is changed, or a new carrier frequency is added.
  • the network problem analysis device can also perform standardization processing on the information to obtain standardized network problem data information.
  • the standardized network problem data information includes: the identification of the network problem, the type of the network problem, and the location where the network problem occurs (i.e., the identification information of the network element where the network problem occurs). For example, as shown in Table 1:
  • the network problem analysis device obtains user data information, which refers to user data information of each user served by the network in the first network.
  • the user data information includes but is not limited to: signaling, messages, or information sent and received by the network element.
  • the user data information also includes: the user's KPI.
  • the user's KPI includes but is not limited to: rate, number of scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial transmission bit error rate, signal-to-interference-plus-noise ratio (SINR), channel quality indicator (CQI), resource allocation success rate, available resources, interference, or reference signal received power (RSRP).
  • the network problem analysis device can also standardize the information to obtain standardized user data information.
  • different users' KPIs may use different units, and the KPIs of these users are standardized to obtain KPIs of the same unit.
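  • For instance, the unit standardization mentioned above could look like the following sketch for a rate KPI. The target unit (kbit/s), the conversion table `RATE_TO_KBPS`, and the function name are assumptions made for illustration only.

```python
# Conversion factors from an assumed set of source units to kbit/s.
RATE_TO_KBPS = {
    "bps": 0.001,
    "kbps": 1.0,
    "Mbps": 1000.0,
    "Gbps": 1_000_000.0,
}


def standardize_rate(value, unit):
    """Convert a rate KPI reported in `unit` to kbit/s."""
    if unit not in RATE_TO_KBPS:
        raise ValueError(f"unknown rate unit: {unit}")
    return value * RATE_TO_KBPS[unit]


# Rates of three users reported in different units (hypothetical values).
reported = [(2, "Mbps"), (750, "kbps"), (500, "bps")]
standardized = [standardize_rate(value, unit) for value, unit in reported]
```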
  • the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
  • the abnormal network element and the abnormal user served by the abnormal network element are determined respectively according to the network problem data information and the user data information.
  • the network problem analysis result is determined according to the user data information (KPI) of the abnormal user, and the network problem analysis result indicates the degree to which the affected KPI in the user data information of the abnormal user is affected by the network problem.
  • the network problem analysis result includes: the confidence of the KPI, and/or the degree of degradation of the KPI.
  • the confidence of the KPI is the confidence that the KPI is affected by the network problem
  • the degree of degradation of the KPI indicates the degree of change of the KPI after the network problem occurs.
  • key KPIs can also be selected.
  • the KPIs of the user include: rate, number of scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial transmission bit error rate, signal-to-interference-plus-noise ratio (SINR), channel quality indicator (CQI), resource allocation success rate, available resources, interference, and reference signal received power (RSRP).
  • RSRP and rate are selected as key KPIs.
  • the confidence of the key KPI and the degree of degradation of the key KPI are included in the network problem analysis results.
  • the degree of degradation of the KPI includes any one or more of the following: the difference between the statistics of the KPI after the network problem occurs and the statistics of the KPI before the network problem occurs, the ratio of the statistics of the KPI after the network problem occurs to the statistics of the KPI before the network problem occurs, the difference between the statistics of the KPI after the network problem occurs and a third threshold, or the ratio of the statistics of the KPI after the network problem occurs to the third threshold.
  • the statistics include any one or more of the following: mean, median, lower quantile, upper quantile, or cumulative probability distribution of any interval.
  • the embodiments of the present application do not limit the method for calculating the confidence of the KPI and the method for calculating the degree of degradation of the KPI.
  • a method for calculating the confidence of a KPI includes: collecting data of the KPI before a network problem occurs, recorded as KPIbefore; collecting data of the KPI after the network problem occurs, recorded as KPIafter. Then, the probability distribution of KPIbefore and KPIafter is calculated. Finally, using a distribution similarity measurement method, the difference between the probability distribution before and after the network problem occurs is calculated as the confidence of the KPI.
  • a method for calculating the degree of degradation of a KPI includes: calculating the cumulative probability distribution of KPIbefore in a certain interval, recorded as S-KPIbefore; and calculating the cumulative probability distribution of KPIafter in the same interval, recorded as S-KPIafter. Then, the difference between S-KPIbefore and S-KPIafter is calculated as the degree of degradation of the KPI.
  • the statistic of this KPI is the cumulative probability distribution of a certain interval, which is sensitive to local degradation.
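  • As a minimal sketch of the degradation calculation described above, the following computes the interval-based degradation (S-KPIafter compared with S-KPIbefore) and, alternatively, the difference or ratio of a summary statistic. The function names, the sample RSRP values, and the sign convention (after minus before) are assumptions for illustration; the embodiments do not limit the calculation method.

```python
from statistics import mean


def interval_probability(values, low, high):
    """Fraction of samples in [low, high): the cumulative probability of that interval."""
    values = list(values)
    if not values:
        raise ValueError("at least one KPI sample is required")
    return sum(low <= v < high for v in values) / len(values)


def degradation_by_interval(kpi_before, kpi_after, low, high):
    """S-KPIafter minus S-KPIbefore for the interval [low, high)."""
    s_before = interval_probability(kpi_before, low, high)
    s_after = interval_probability(kpi_after, low, high)
    return s_after - s_before


def degradation_by_statistic(kpi_before, kpi_after, statistic=mean, as_ratio=False):
    """Difference (or ratio) of a statistic of the KPI after vs. before the problem."""
    before, after = statistic(kpi_before), statistic(kpi_after)
    return after / before if as_ratio else after - before


# Illustrative RSRP samples in dBm (made-up values).
rsrp_before = [-95, -105, -108, -115, -98]
rsrp_after = [-125, -118, -112, -105, -122]

# Share of weak-signal samples (below -110 dBm) before and after the problem.
weak_signal_shift = degradation_by_interval(rsrp_before, rsrp_after, -130, -110)
mean_shift = degradation_by_statistic(rsrp_before, rsrp_after)
```

  • The interval-based variant reacts when samples move into one part of the value range even if the mean barely changes, which matches the remark above that this statistic is sensitive to local degradation.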
  • a network problem analysis device integrates network problem data information and user data information to obtain a network problem analysis result, and the network problem analysis result indicates the degree to which the user data information is affected by the network problem.
  • the user data information includes the user's key performance indicator KPI.
  • the network problem analysis result includes the confidence of the KPI and/or the degree of degradation of the KPI, wherein the confidence of the KPI is the confidence of the degree to which the KPI is affected by the network problem, and the degree of degradation of the KPI indicates the degree of change of the KPI after the network problem occurs.
  • the network problem analysis result indicates the degree of impact caused by the quantified network problem, and improves the accuracy of analyzing the impact caused by the network problem.
  • in step 503 (generating a network problem analysis result based on the network problem data information and the user data information), the abnormal network elements in the first network affected by the network problem can also be determined. Then, the abnormal users are determined from the users served by the abnormal network elements. From the user data information (KPI) of the abnormal users, the KPIs affected by the network problem (referred to as affected KPIs) are determined. Finally, based on the values of the affected KPIs before and after the network problem occurs, a network problem analysis result is generated, and the network problem analysis result indicates the degree to which the affected KPIs are affected by the network problem. These steps are explained separately below.
  • Figure 6 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application.
  • the network problem analysis method proposed in the embodiment of the present application also includes:
  • the network problem analysis device obtains network resource data information, where the network resource data information includes any one or more of the following information: configuration information of the first network, or engineering parameter information of the first network.
  • the network resource data information is shown in Table 2.
  • the network resource data information is shown in Table 3.
  • the network resource data information indicates the service association relationship and/or physical connection relationship between network elements.
  • second network information can be generated, and the second network information indicates the topology structure of the first network.
  • Figure 9 is a schematic diagram of the network topology structure of the first network in an embodiment of the present application.
  • the second network information is generated, and the second network information indicates the network topology structure of the first network.
  • the network topology structure indicates the service association relationship between network elements.
  • for example, network element A is a base station and network element B is a base station.
  • network element A and network element B are configured as logical neighboring cells to establish a service association relationship between network element A and network element B.
  • User (for example, terminal) services can be switched between network element A and network element B.
  • according to the network problem data information, an abnormal network element is determined from the multiple network elements indicated by the second network information, where the abnormal network element is a network element affected by the network problem, and first network information is generated.
  • an abnormal network element is determined from the multiple network elements indicated by the second network information (i.e., multiple network elements of the first network).
  • the network element affected by the network problem in the first network is referred to as an abnormal network element.
  • the abnormal network elements typically include: the network element where the network problem occurs and the network element having a business association relationship with the network element where the network problem occurs.
  • the network element where the network problem occurs is called the first network element
  • the network element having a business association relationship with the first network element is called the second network element.
  • the network problem analysis device associates the network problem with the first network element according to the network problem data information.
  • the first network information is generated, and the first network information indicates the network topology structure of the abnormal network element.
  • the first network information indicates the network topology of the first network element and the network topology of the second network element.
  • the network problem analysis device marks the first network element and the second network element from the second network information, and then generates the first network information.
  • a second network element having a service association relationship with a first network element is output as a network element set.
  • Network element set #1 includes a network element with a network problem (a first network element) and other network elements with no network problem (a second network element) that have a business association relationship with the first network element.
  • network element sets include the first network element and the second network element
  • multiple network element sets with the same network element are merged into one network element set, and the merged network element set is analyzed as a topology subgraph.
  • network element set #1 and network element set #2 include the same network element.
  • the network problem occurring in network element set #1 may be different from the network problem occurring in network element set #2, so the same network element included in network element set #1 and network element set #2 means that the network element may be affected by two different network problems at the same time.
  • FIG. 11 is a schematic diagram of another embodiment of the first network information in the embodiment of the present application.
  • Network element set #1 and network element set #2 are merged, the merged network element set is output, and the corresponding network topology subgraph is generated, that is, the topology subgraph #1 in Figure 11.
  • the first network information (topology subgraph #1) is generated according to the second network information (network element set #1 and network element set #2).
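  • The construction and merging of network element sets described above can be sketched as follows. This is a minimal illustration under assumptions: the neighbor table `neighbours`, the element names, and both function names are hypothetical and are not taken from this application.

```python
def build_element_set(first_element, neighbours):
    """The first network element plus the second network elements associated with it."""
    return {first_element} | set(neighbours.get(first_element, ()))


def merge_overlapping_sets(element_sets):
    """Merge all sets that share at least one network element (transitively)."""
    merged = []
    for current in element_sets:
        current = set(current)
        remaining = []
        for group in merged:
            if group & current:
                current |= group  # absorb every existing group that overlaps
            else:
                remaining.append(group)
        remaining.append(current)
        merged = remaining
    return merged


# Hypothetical service association relationships (logical neighboring cells).
neighbours = {"NE1": ["NE2", "NE3"], "NE4": ["NE3", "NE5"], "NE7": ["NE8"]}

# Network problems occurred on NE1, NE4 and NE7 (the first network elements).
element_sets = [build_element_set(ne, neighbours) for ne in ("NE1", "NE4", "NE7")]
topology_subgraphs = merge_overlapping_sets(element_sets)
```

  • In this example the sets built around NE1 and NE4 share NE3, so they are merged into one set that would be analyzed as a single topology subgraph, while the set around NE7 stays separate.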
  • an abnormal network element group can also be determined from the first network information indicating the first network element and the second network element, and the abnormal network element group includes at least two abnormal network elements.
  • the change in the service migration information of the abnormal network element before and after the network problem occurs meets the first threshold.
  • the abnormal network element group that is greatly affected by the network problem is screened from the first network element and the second network element.
  • Figure 7 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application.
  • a network problem analysis method proposed in an embodiment of the present application also includes:
  • step S1 service migration information of a first network element and service migration information of a second network element are obtained, the service migration information indicating statistical characteristic information of users served by the network element, migrating into the network element, migrating out of the network element, and/or residing in the network element.
  • the statistical characteristic information includes, but is not limited to, total number, proportion, mean, and/or variance.
  • a hypothesis verification method is used to determine whether the change between statistical feature information #1 and statistical feature information #2 is significant.
  • the service migration information includes: the statistical feature information of the users who migrated in, migrated out, and/or resident users of the network element (the network element is the first network element or the second network element) before the network problem occurred, recorded as statistical feature information #1; the statistical feature information of the users who migrated in, migrated out, and/or resident users of the network element after the network problem occurred, recorded as statistical feature information #2.
  • taking network element #A as an example, one of the network elements that has a business association relationship with network element #A is network element #B.
  • the total number of users who migrated from network element #A to network element #B within one hour before the network problem occurred can be counted every 10 minutes, giving: 100, 99, 98, 102, 103, 95.
  • in the 10 minutes after the network problem occurs, the total number of users who migrated in, counted over the same 10-minute period, is 25.
  • the hypothesis verification method is used to determine whether the changes between the statistical feature information #1 and the statistical feature information #2 are significant.
  • the probability that the total number of migrated users after the network problem occurs has no significant change compared with the total number one hour before the network problem occurs is 0.04, and the preset first threshold is 0.1, which indicates that there is a significant change, and the business association relationship between network element #A and network element #B is an abnormal business association relationship.
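  • A minimal sketch of such a significance check is given below. This application does not name the hypothesis verification method, so the two-sided test under a normal approximation used here is an assumption, as are the function and variable names. The probability it produces is therefore not expected to equal the value 0.04 quoted in the example above; only the comparison against the first threshold is illustrated.

```python
from statistics import NormalDist, mean, stdev

FIRST_THRESHOLD = 0.1  # preset first threshold from the example above


def no_change_probability(before_counts, after_count):
    """Two-sided probability that after_count is consistent with before_counts.

    Assumption: the counts before the problem are roughly normally distributed.
    """
    mu = mean(before_counts)
    sigma = stdev(before_counts)  # needs at least two samples
    if sigma == 0:
        return 1.0 if after_count == mu else 0.0
    z = (after_count - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Statistical feature information #1: migrated-in users per 10 minutes before the problem.
before_counts = [100, 99, 98, 102, 103, 95]
# Statistical feature information #2: migrated-in users in the 10 minutes after the problem.
after_count = 25

p_value = no_change_probability(before_counts, after_count)
association_is_abnormal = p_value < FIRST_THRESHOLD
```

  • With these counts the probability of "no significant change" is far below the first threshold, so the business association relationship between network element #A and network element #B would be marked as abnormal.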
  • if the statistical feature information changes significantly before and after the network problem occurs (satisfies the first threshold), proceed to step S3; if the statistical feature information does not change significantly before and after the network problem occurs (does not meet the first threshold), proceed to step S4.
  • the statistical characteristic information changes significantly (satisfies the first threshold) before and after the network problem occurs, and the service association relationship between the first network element and the second network element is marked as abnormal.
  • step S3 when the service association relationship between the first network element and a second network element is abnormal, the first network element and the second network element are considered as a pair of abnormal network element groups, and the first network element and the second network element are abnormal network elements. It is considered that the second network element is significantly affected by the network problem.
  • the network problem analysis device marks the abnormal network element group in the first network information, and the network problem analysis device marks the service association relationship between the abnormal network element groups as an abnormal service association relationship, and generates the third network information.
  • Figure 12 is a network topology diagram of the third network information in the embodiment of the present application. Taking the topology subgraph #2 in Figure 11 as an example, after finding the abnormal network element group through the method of steps S1 to S3, the service association relationship between the abnormal network element groups is marked as an abnormal service association relationship.
  • the statistical characteristic information does not change significantly before and after the network problem occurs (does not meet the first threshold), and the service association relationship between the first network element and the second network element is marked as normal.
  • step S4 when the service association relationship between the first network element and a second network element is normal, it is considered that the second network element is not significantly affected by the network problem, and the second network element is regarded as a normal network element.
  • step 604 after the abnormal network element is determined, users who migrated in, migrated out, and/or resided in the network element after the network problem occurred are found and regarded as abnormal users. For example, users who migrated in, migrated out, and resided in the network element within 1 hour after the network problem occurred are regarded as abnormal users.
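  • The selection of abnormal users can be sketched as follows. The record layout `(user, network element, timestamp)`, the function name, and the sample data are assumptions for illustration; the 1-hour window follows the example above.

```python
from datetime import datetime, timedelta


def find_abnormal_users(events, abnormal_elements, problem_time,
                        window=timedelta(hours=1)):
    """Users seen on an abnormal network element within `window` after the problem.

    `events` holds (user_id, element_id, timestamp) records for migrating in,
    migrating out, or residing; this record layout is hypothetical.
    """
    end = problem_time + window
    return {
        user
        for user, element, timestamp in events
        if element in abnormal_elements and problem_time <= timestamp <= end
    }


problem_time = datetime(2024, 1, 1, 12, 0)
events = [
    ("user1", "NE-A", problem_time + timedelta(minutes=10)),  # in window
    ("user2", "NE-B", problem_time + timedelta(minutes=30)),  # in window
    ("user3", "NE-A", problem_time - timedelta(minutes=5)),   # before the problem
    ("user4", "NE-C", problem_time + timedelta(minutes=20)),  # normal network element
    ("user5", "NE-A", problem_time + timedelta(hours=2)),     # after the window
]
abnormal_users = find_abnormal_users(events, {"NE-A", "NE-B"}, problem_time)
```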
  • network elements significantly affected by network problems can be screened from a large number of second network elements.
  • the second network elements significantly affected by network problems and the first network elements are analyzed together as abnormal network elements to find abnormal users served by the abnormal network elements.
  • the complexity of data processing and the difficulty of calculation are effectively reduced, and the accuracy of network problem analysis is improved.
  • FIG. 8 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application.
  • a network problem analysis method proposed in an embodiment of the present application also includes:
  • At least one KPI of the abnormal user is determined from the user data information of the abnormal user served by the abnormal network element group (that is, by the abnormal network elements included in the abnormal network element group) according to the third network information and the user data information.
  • the KPI may be a standardized KPI.
  • obtaining the KPI value of the abnormal user includes: obtaining the KPI value before the network problem occurs (KPIbefore) and the KPI value after the network problem occurs (KPIafter).
  • step 802 after obtaining the KPI of the abnormal user, the distribution similarity measurement method is used to calculate the difference between the distribution of KPIbefore and KPIafter.
  • the RSRP value range is divided into four intervals, namely -130 dBm (decibels relative to one milliwatt) to -120 dBm, -120 dBm to -110 dBm, -110 dBm to -100 dBm, and -100 dBm to -90 dBm.
  • 100 RSRP values are collected in the 5 minutes before the network problem occurs. The proportions falling into the four intervals are 10%, 30%, 40%, and 20%. These proportions are recorded as f(KPIbefore), and f(KPIbefore) is used as the distribution of KPIbefore (i.e., the probability distribution of KPIbefore).
  • 100 RSRP values are collected in the 5 minutes after the network problem occurs. The proportions falling into the four intervals are 20%, 40%, 30%, and 10%. These proportions are recorded as f(KPIafter), and f(KPIafter) is used as the distribution of KPIafter (i.e., the probability distribution of KPIafter).
  • the difference between the distributions of the two KPIs is calculated.
  • the method for calculating the difference is, for example, JS divergence (value range [0,1]). The smaller the JS divergence, the smaller the difference; the larger the JS divergence, the larger the difference.
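  • As a worked sketch of the JS divergence mentioned above, the following applies it to the two RSRP distributions from the example. The base-2 logarithm is an assumption chosen so that the value falls in the range [0, 1]; the function names are illustrative only.

```python
from math import log2


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logarithm, bounded to [0, 1]."""
    if len(p) != len(q):
        raise ValueError("distributions must use the same intervals")
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Proportions over the four RSRP intervals from the example above.
f_kpi_before = [0.10, 0.30, 0.40, 0.20]
f_kpi_after = [0.20, 0.40, 0.30, 0.10]

rsrp_confidence = js_divergence(f_kpi_before, f_kpi_after)
```

  • For these two distributions the divergence is small (roughly 0.035), because the proportions only shift by 10 percentage points per interval; identical distributions give 0 and distributions with no overlapping interval give 1.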
  • step 803 the confidence of the KPI is obtained according to the probability distribution of KPIbefore and the probability distribution of KPIafter calculated in step 802.
  • the confidence of the KPI indicates the confidence that the KPI is affected by the network problem.
  • the affected KPI is screened from at least one KPI according to the confidence of the KPI. Specifically, the confidence of the affected KPI is greater than the second threshold.
  • the second threshold is 0.75.
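  • The screening of affected KPIs against the second threshold can be sketched as follows. The confidence values in `kpi_confidence` are made-up numbers for illustration only and are not results from this application.

```python
SECOND_THRESHOLD = 0.75  # example value of the second threshold given above


def select_affected_kpis(kpi_confidence, threshold=SECOND_THRESHOLD):
    """Keep the KPIs whose confidence is strictly greater than the threshold."""
    return [name for name, confidence in kpi_confidence.items()
            if confidence > threshold]


# Hypothetical confidences of an abnormal user's KPIs (illustrative values).
kpi_confidence = {"RSRP": 0.91, "SINR": 0.83, "CQI": 0.75, "rate": 0.88, "MCS": 0.40}
affected_kpis = select_affected_kpis(kpi_confidence)
```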
  • the KPIs of abnormal users include: rate, scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial transmission bit error rate, signal-to-interference-plus-noise ratio (SINR), channel quality information (CQI), resource allocation success rate, available resources, interference, and reference signal received power (RSRP).
  • fourth network information is generated based on the communication mechanism relationship and the affected KPI, the fourth network information indicates the impact relationship between the affected KPI and the network problem, the impact relationship indicated by the fourth network information is consistent with the impact relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates the impact relationship between multiple KPIs and the network problem. Then, a network problem analysis result is generated based on the fourth network information.
  • the communication mechanism relationship may also be referred to as a communication mechanism relationship diagram.
  • FIG. 13 is a schematic diagram of the structure of the communication mechanism relationship diagram in an embodiment of the present application.
  • the communication mechanism relationship diagram is used to indicate the influence relationship between each KPI, and the influence relationship indicated by the communication mechanism relationship diagram conforms to the influence mechanism described by the communication protocol.
  • a communication network engineer may predefine the communication mechanism relationship diagram, which describes the influence of different types of network problems on the KPI, as well as the causal order of influence between different KPIs. For example, as shown in FIG. 13, the deterioration of RSRP affects SINR.
  • the graph search method is used from the network problem to determine the path connecting each affected KPI and the network problem. Then, the KPI topology structure describing the path is output as the fourth network information.
  • Figure 14 is a topological schematic diagram of the fourth network information in an embodiment of the present application.
  • the fourth network information indicates the influence relationship between each affected KPI, and the influence relationship between the affected KPI and the network problem.
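  • The graph search described above can be sketched as a breadth-first search over the communication mechanism relationship diagram. Apart from the RSRP-to-SINR edge mentioned above, the edges in `mechanism` are hypothetical, as are the function name and the choice of breadth-first search; this application only states that a graph search method is used.

```python
from collections import deque


def impact_subgraph(mechanism, problem, affected_kpis):
    """Edges on the paths that connect the network problem to each affected KPI.

    `mechanism` maps a node to the nodes it influences (cause -> effects).
    """
    parent = {problem: None}
    queue = deque([problem])
    while queue:  # breadth-first search starting from the network problem
        node = queue.popleft()
        for influenced in mechanism.get(node, ()):
            if influenced not in parent:
                parent[influenced] = node
                queue.append(influenced)

    edges = set()
    for kpi in affected_kpis:
        if kpi not in parent:
            continue  # no influence path from the problem to this KPI
        node = kpi
        while parent[node] is not None:  # walk back towards the problem
            edges.add((parent[node], node))
            node = parent[node]
    return edges


# Hypothetical relationship diagram; only RSRP -> SINR is taken from the text.
mechanism = {
    "network problem": ["RSRP"],
    "RSRP": ["SINR"],
    "SINR": ["MCS"],
    "MCS": ["rate"],
    "interference": ["SINR"],
}
fourth_network_edges = impact_subgraph(mechanism, "network problem",
                                       ["RSRP", "SINR", "rate"])
```

  • In this sketch a path may pass through a KPI that was not screened as affected (MCS here); whether such intermediate KPIs should be kept in the fourth network information is not specified above, so this is a design choice of the example.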
  • a key KPI may be determined among the affected KPIs. For example, rate is selected as the key KPI among the multiple affected KPIs shown in FIG. 14.
  • the network problem analysis result includes any one or more of the following: the network problem, the abnormal network element where the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degree of degradation of the affected KPI, the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem, and the degree of degradation of the affected KPI indicates the degree of change of the affected KPI after the network problem occurs.
  • Exemplarily, as shown in Table 4:
  • quantified network problem analysis results are output, making the network problem analysis results more intuitive and improving the accuracy of network problem analysis.
  • the computing device includes hardware structures and/or software modules corresponding to the execution of each function. It should be easily appreciated by those skilled in the art that, in combination with the units and method steps of each example described in the embodiments disclosed in this application, the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed in the form of hardware or computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.
  • FIG. 15 is a schematic diagram of a possible computing device 1500 provided in an embodiment of the present application. The computing device can be used to implement the functions of the network problem analysis device in the above method embodiments, and can therefore also achieve the beneficial effects of the above method embodiments.
  • the computing device 1500 includes a processing unit 1510 and a transceiver unit 1520.
  • the computing device 1500 is used to implement the functions of the network problem analysis device in the method embodiments shown in Figs. 5 to 14 above.
  • a computing device 1500 is applied to a first network, the first network includes a plurality of network elements, and the computing device includes:
  • the transceiver unit 1520 is configured to obtain network problem data information, where the network problem data information indicates a network problem occurring in a network element in the first network;
  • the transceiver unit 1520 is further configured to obtain user data information, where the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator KPI of the user;
  • the processing unit 1510 is configured to generate a network problem analysis result according to the network problem data information and the user data information, wherein the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
  • the network problem analysis results include:
  • the confidence of the KPI, and/or the degree of degradation of the KPI, wherein the confidence of the KPI is the confidence that the KPI is affected by the network problem, and the degree of degradation of the KPI indicates the degree of change of the KPI after the network problem occurs.
  • the processing unit 1510 is further configured to determine an abnormal network element from a plurality of network elements included in the first network according to the network problem data information, wherein the abnormal network element is a network element affected by the network problem;
  • the processing unit 1510 is further configured to determine an abnormal user from among a plurality of users served by the abnormal network element, wherein the abnormal user is a user who migrates into, migrates out of, and/or resides in the abnormal network element after the network problem occurs;
  • the processing unit 1510 is further configured to generate the network problem analysis result according to the KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result belongs to the abnormal user.
  • the processing unit 1510 is further configured to determine an affected KPI of the abnormal user from the KPI of the abnormal user, wherein the affected KPI is a KPI of the abnormal user that is affected by the network problem among the multiple KPIs of the abnormal user;
  • the processing unit 1510 is further configured to generate the network problem analysis result according to the affected KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result is the affected KPI.
  • the processing unit 1510 is further configured to determine the abnormal network element from the plurality of network elements included in the first network according to the network problem data information;
  • the processing unit 1510 is further configured to generate first network information, where the first network information indicates a network topology structure of the abnormal network element, and the network topology structure indicates a service association relationship between network elements.
  • the transceiver unit 1520 is further used to obtain network resource data information, where the network resource data information includes any one or more of the following information: configuration information of the first network or engineering parameter information of the first network;
  • the processing unit 1510 is further configured to generate second network information according to the network resource data information, where the second network information indicates a network topology structure of the first network;
  • the processing unit 1510 is further configured to determine the abnormal network element from the multiple network elements indicated by the second network information according to the network problem data information, wherein the multiple network elements indicated by the second network information are multiple network elements included in the first network.
  • the first network information indicates a network topology of a first network element and a second network element, wherein the first network element is a network element in the first network where the network problem occurs, and the second network element is a network element having a business association relationship with the first network element.
  • the processing unit 1510 is further configured to determine at least one group of abnormal network elements from the first network element and the second network element according to the service migration information of the first network element and the service migration information of the second network element, wherein the service migration information indicates statistical characteristic information of migration in, migration out and/or residence of the user served by the network element;
  • the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements included in the abnormal network element group, and a change in the service migration information of the abnormal network elements before and after the network problem occurs meets a first threshold;
  • the processing unit 1510 is further configured to mark the service association relationship between the abnormal network elements as abnormal, and generate third network information according to the first network information, wherein the third network information indicates a topological structure of the abnormal network element group.
  • the processing unit 1510 is further configured to determine the abnormal user served by the abnormal network element according to the third network information
  • the processing unit 1510 is further configured to determine the affected KPI from the KPI of the abnormal user according to the confidence of the KPI of the abnormal user, wherein the confidence of the affected KPI is greater than a second threshold.
  • the processing unit 1510 is further configured to generate fourth network information according to the communication mechanism relationship and the affected KPI of the abnormal user, wherein the fourth network information indicates an impact relationship between the affected KPI and the network problem, the impact relationship indicated by the fourth network information is consistent with the impact relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates an impact relationship between multiple KPIs and the network problem;
  • the processing unit 1510 is further configured to generate the network problem analysis result according to the fourth network information and the network problem.
  • the fourth network information further includes: a degradation degree of a key KPI and a confidence level of the affected KPI, and the key KPI belongs to the affected KPI.
  • the network problem analysis result includes any one or more of the following:
  • the network problem, the abnormal network element where the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degree of degradation of the affected KPI
  • the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem
  • the degree of degradation of the affected KPI indicates the degree of change of the affected KPI after the occurrence of the network problem.
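  • the degradation degree of the KPI includes any one or more of the following:
  • the difference between the statistic of the KPI after the network problem occurs and the statistic of the KPI before the network problem occurs, the ratio of the statistic of the KPI after the network problem occurs to the statistic of the KPI before the network problem occurs, the difference between the statistic of the KPI after the network problem occurs and a third threshold, or the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
  • the statistics include any one or more of the following:
  • mean, median, lower quantile, upper quantile, or cumulative probability distribution of any interval.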
  • the network problem data information includes any one or more of the following information:
  • the alarm information of the network element, the performance degradation information of the network element, or the change information of the network element.
  • FIG. 16 is a schematic diagram of the structure of a computing device 1600 in an embodiment of the present application.
  • the computing device 1600 includes a processor 1610 and an interface circuit 1620.
  • the processor 1610 and the interface circuit 1620 are coupled to each other.
  • the interface circuit 1620 can be a transceiver or an input-output interface.
  • the computing device 1600 may also include a memory 1630 for storing instructions executed by the processor 1610 or storing input data required by the processor 1610 to execute instructions or storing data generated after the processor 1610 executes instructions.
  • the processor 1610 is used to implement the function of the processing unit 1510, and the interface circuit 1620 is used to implement the function of the transceiver unit 1520.
  • the computing device 1600 is applied to a first network, the first network includes a plurality of network elements, and the computing device includes: an interface circuit 1620 and a processor 1610;
  • the interface circuit 1620 is used to obtain network problem data information, where the network problem data information indicates a network problem occurring in a network element in the first network;
  • the interface circuit 1620 is further configured to obtain user data information, where the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator KPI of the user;
  • the processor 1610 is configured to generate a network problem analysis result according to the network problem data information and the user data information, wherein the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
  • the network problem analysis results include:
  • the confidence level of the KPI, and/or the degree of degradation of the KPI, wherein the confidence level of the KPI is the confidence that the KPI is affected by the network problem
  • the degradation degree of the KPI indicates the degree of change of the KPI after the network problem occurs.
  • the processor 1610 is further configured to determine an abnormal network element from a plurality of network elements included in the first network according to the network problem data information, wherein the abnormal network element is a network element affected by the network problem;
  • the processor 1610 is further configured to determine an abnormal user from among a plurality of users served by the abnormal network element, wherein the abnormal user is a user who migrates into, migrates out of, and/or resides in the abnormal network element after the network problem occurs;
  • the processor 1610 is further configured to generate the network problem analysis result according to the KPI of the abnormal user, where the KPI indicated by the network problem analysis result belongs to the abnormal user.
  • the processor 1610 is further configured to determine an affected KPI of the abnormal user from the KPI of the abnormal user, wherein the affected KPI is a KPI affected by the network problem among the multiple KPIs of the abnormal user;
  • the processor 1610 is further configured to generate the network problem analysis result according to the affected KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result is the affected KPI.
  • the processor 1610 is further configured to determine the abnormal network element from the plurality of network elements included in the first network according to the network problem data information;
  • the processor 1610 is further configured to generate first network information, where the first network information indicates a network topology structure of the abnormal network element, and the network topology structure indicates a service association relationship between network elements.
  • the interface circuit 1620 is further used to obtain network resource data information, wherein the network resource data information includes any one or more of the following information: configuration information of the first network or engineering parameter information of the first network;
  • the processor 1610 is further configured to generate second network information according to the network resource data information, where the second network information indicates a network topology structure of the first network;
  • the processor 1610 is further configured to determine the abnormal network element from a plurality of network elements indicated by the second network information according to the network problem data information, wherein the plurality of network elements indicated by the second network information are a plurality of network elements included in the first network.
  • the first network information indicates a network topology of a first network element and a second network element, wherein the first network element is a network element in the first network where the network problem occurs, and the second network element is a network element having a business association relationship with the first network element.
  • the processor 1610 is further configured to determine at least one group of abnormal network elements from the first network element and the second network element according to the service migration information of the first network element and the service migration information of the second network element, wherein the service migration information indicates statistical feature information of migration in, migration out and/or residence of the user served by the network element;
  • the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements included in the abnormal network element group, and a change in the service migration information of the abnormal network elements before and after the network problem occurs meets a first threshold;
  • the processor 1610 is further configured to mark the service association relationship between the abnormal network elements as abnormal, and generate third network information according to the first network information, wherein the third network information indicates a topological structure of the abnormal network element group.
  • the processor 1610 is further configured to determine, according to the third network information, the abnormal user served by the abnormal network element;
  • the processor 1610 is further configured to determine the affected KPI from the KPIs of the abnormal user according to the confidence of the KPI of the abnormal user, wherein the confidence of the affected KPI is greater than a second threshold.
  • the processor 1610 is further configured to generate fourth network information according to the communication mechanism relationship and the affected KPI of the abnormal user, wherein the fourth network information indicates an impact relationship between the affected KPI and the network problem, the impact relationship indicated by the fourth network information is consistent with the impact relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates an impact relationship between multiple KPIs and the network problem;
  • the processor 1610 is further configured to generate the network problem analysis result according to the fourth network information and the network problem.
  • the fourth network information further includes: a degradation degree of a key KPI and a confidence level of the affected KPI, and the key KPI belongs to the affected KPI.
  • the network problem analysis result includes any one or more of the following:
  • the network problem, the abnormal network element where the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degree of degradation of the affected KPI
  • the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem
  • the degree of degradation of the affected KPI indicates the degree of change of the affected KPI after the occurrence of the network problem.
  • the degradation degree of the KPI includes any one or more of the following:
  • the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
  • the statistics include any one or more of the following:
  • the network problem data information includes any one or more of the following information:
  • the alarm information of the network element, the performance degradation information of the network element, or the change information of the network element.
  • the interface circuit 1620 can also be connected to a transceiver, which can be used to support the reception or transmission of air interface signals between a computing device and a network device, and between a computing device and a terminal device.
  • the transceiver can be connected to multiple antennas.
  • the transceiver includes a transmitter Tx and a receiver Rx.
  • one or more antennas can receive air interface signals
  • the receiver Rx of the transceiver is used to receive the air interface signal from the antenna, and convert the air interface signal into a digital baseband signal or a digital intermediate frequency signal, and provide the digital baseband signal or the digital intermediate frequency signal to the processor 1610, so that the processor 1610 performs further processing on the digital baseband signal or the digital intermediate frequency signal, such as demodulation processing and decoding processing.
  • the transmitter Tx in the transceiver is also used to receive a modulated digital baseband signal or a digital intermediate frequency signal from the processor 1610, and convert the modulated digital baseband signal or the digital intermediate frequency signal into an air interface signal, and send the air interface signal through one or more antennas.
  • the above-mentioned computing device may also be a chip.
  • the interface circuit 1620 may be an input and/or output circuit of a chip, or a communication interface.
  • the chip may be used in a terminal or a base station or other network device.
  • the computing device includes a means for generating data, and a means for sending data.
  • the functions of the means for generating data and the means for sending data may be implemented by one or more processors.
  • data may be generated by one or more processors, and the data may be sent through a transceiver, or an input/output circuit, or an interface of a chip.
  • for details of the data, please refer to the relevant description in the embodiments of the present application.
  • the computing device includes a means for receiving data, and a means for sending uplink data.
  • for the data and how to send uplink data based on the data, please refer to the relevant description in the embodiments of the present application.
  • data may be received through a transceiver, or an input/output circuit, or an interface of a chip.
  • when the computing device is a chip applied to a terminal, the terminal chip implements the functions of the terminal in the above method embodiment.
  • the terminal chip receives information from other modules in the terminal (such as a radio frequency module or an antenna), and the information is sent by the base station to the terminal; or the terminal chip sends information to other modules in the terminal (such as a radio frequency module or an antenna), and the information is sent by the terminal to the base station.
  • the base station module implements the functions of the base station in the above-mentioned method embodiment.
  • the base station module receives information from other modules in the base station (such as a radio frequency module or an antenna), and the information is sent by the terminal to the base station; or, the base station module sends information to other modules in the base station (such as a radio frequency module or an antenna), and the information is sent by the base station to the terminal.
  • the base station module here can be a baseband chip of a base station, or it can be a DU or other module, and the DU here can be a DU under an open radio access network (O-RAN) architecture.
  • the processor in the embodiments of the present application may be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSP), application specific integrated circuits (ASIC), field programmable gate arrays (FPGA) or other programmable logic devices, transistor logic devices, hardware components or any combination thereof.
  • the general-purpose processor may be a microprocessor or any conventional processor.
  • the method steps in the embodiments of the present application can be implemented in hardware or in software instructions that can be executed by a processor.
  • the software instructions may be composed of corresponding software modules, which may be stored in random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, registers, hard disk, mobile hard disk, CD-ROM or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor so that the processor can read information from the storage medium and write information to the storage medium.
  • the storage medium may also be a component of the processor.
  • the processor and the storage medium may be located in an ASIC.
  • the ASIC may be located in a base station or a terminal.
  • the processor and the storage medium may also exist in a base station or a terminal as discrete components.
  • the computer program product includes one or more computer programs or instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user device or other programmable device.
  • the computer program or instruction may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer program or instruction may be transmitted from one website site, computer, server or data center to another website site, computer, server or data center by wired or wireless means.
  • the computer-readable storage medium may be any available medium that a computer can access or a data storage device such as a server, data center, etc. that integrates one or more available media.
  • the available medium may be a magnetic medium, for example, a floppy disk, a hard disk, a tape; it may also be an optical medium, for example, a digital video disc; it may also be a semiconductor medium, for example, a solid-state hard disk.
  • the computer-readable storage medium may be a volatile or nonvolatile storage medium, or may include both volatile and nonvolatile types of storage media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

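Embodiments of the present application disclose a network problem analysis method and a related apparatus. The method includes: obtaining network problem data information, where the network problem data information indicates a network problem occurring in a network element in a first network; obtaining user data information, where the user data information is user data information of at least one user served by the first network and includes a key performance indicator (KPI) of the user; and generating a network problem analysis result according to the network problem data information and the user data information, where the network problem analysis result indicates the degree to which the KPI is affected by the network problem. The method yields a quantified network problem analysis result, which makes the result more intuitive and improves the accuracy of network problem analysis.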

Description

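A network problem analysis method and related apparatus
This application claims priority to Chinese patent application No. CN202211521732.6, filed with the China National Intellectual Property Administration on November 30, 2022 and entitled "Network problem analysis method and related apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of network technology, and in particular to a network problem analysis method and a related apparatus.
Background
At present, telecommunication networks aim to keep the service quality of their users stable and reliable, so a large share of operation and maintenance (O&M) activities use user service quality as the yardstick for their business value. Network problem management accounts for a large part of these activities and covers monitoring network problems, analysing and dispatching them, and diagnosing and repairing them. Network problems include but are not limited to network element alarms, network element performance degradation, and network element change incidents.
In current network problem analysis, because O&M resources are limited, the main concern is how quickly problems are handled. Network problems are usually associated, compressed, or filtered to reduce the number that must be processed, and the compressed problems are then dispatched according to the experience of network O&M engineers or according to rules. This approach performs co-occurrence association analysis on multiple network problems to describe how strongly they are related, and then applies predefined rules to grade the severity of the strongly related problems.
However, the operating environment of a telecommunication network and the way users use services are highly uncertain, so alarms occur with considerable randomness. In addition, an analysis based only on the degree of co-occurrence association between user data is easily disturbed by fluctuations in that data and cannot accurately quantify the impact of a network problem, so the accuracy of the analysis result is low.
Summary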
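According to a first aspect, an embodiment of the present application provides a network problem analysis method. The method is applied to a first network that includes a plurality of network elements, and the method includes:
obtaining network problem data information, where the network problem data information indicates a network problem occurring in a network element in the first network;
obtaining user data information, where the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator (KPI) of the user; and
generating a network problem analysis result according to the network problem data information and the user data information, where the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
The network problem analysis result includes:
a confidence of the KPI and/or a degradation degree of the KPI, where the confidence of the KPI is the confidence that the KPI is affected by the network problem, and the degradation degree of the KPI indicates how much the KPI changes after the network problem occurs.
Specifically, the network problem data information includes but is not limited to alarm information of a network element, performance degradation information of a network element, and change information of a network element.
For example, regarding alarm information: when the network element is a cell, the alarm information may be a cell-unavailable alarm. The cell is regarded as a logical network element; a network device may include (or manage) one or more cells and provides wireless communication services to users (for example, terminal devices) through the cells it manages. When the network element is a network device such as a radio frequency unit, the alarm information may be link interruption information of that device. When the network element is a network device such as a site, the alarm information may be an out-of-service alarm of that device.
For example, the performance degradation information of a network element includes: the connection establishment failure rate exceeds a preset threshold, or the device load exceeds a preset threshold.
For example, the change information of a network element includes: the configuration information of a neighboring cell changes, or a carrier frequency is added.
Optionally, after obtaining the network problem data information, the network problem analysis apparatus may further standardize it to obtain standardized network problem data information. For example, the standardized network problem data information includes an identifier of the network problem, a type of the network problem, and a location of the network problem (that is, identification information of the network element where the network problem occurs).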
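The standardized record described above (identifier, type, location) could be represented as follows. This is only a minimal sketch: the class name `NetworkProblem`, the raw keys `id`, `type`, and `ne`, and the sample values are hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkProblem:
    problem_id: str    # identifier of the network problem
    problem_type: str  # e.g. "alarm", "performance_degradation", "change"
    element_id: str    # identifier of the network element where it occurred


def standardize(raw: dict) -> NetworkProblem:
    """Map a raw record (hypothetical keys) onto the standardized form."""
    return NetworkProblem(
        problem_id=str(raw["id"]).strip(),
        problem_type=str(raw["type"]).strip().lower(),
        element_id=str(raw["ne"]).strip(),
    )


problem = standardize({"id": " P-001 ", "type": "Alarm", "ne": "cell-17"})
```

A frozen dataclass is used here so that standardized records can be compared and used as dictionary keys in later steps.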
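The user data information refers to the user data information of each user served by the first network. For example, it includes but is not limited to signaling, messages, information, or packets sent and received by network elements. The user data information further includes the KPIs of the user.
For example, the KPIs of a user include but are not limited to: rate, number of scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial-transmission bit error rate, signal to interference plus noise ratio (SINR), channel quality information (CQI), resource allocation success rate, available resources, interference, or reference signal received power (RSRP).
Optionally, after obtaining the user data information, the network problem analysis apparatus may further standardize it to obtain standardized user data information. For example, the KPIs of different users may use different units; standardizing them yields KPIs in the same unit.
In this embodiment, the network problem analysis apparatus combines the network problem data information and the user data information to obtain the network problem analysis result, which indicates the degree to which the user data information is affected by the network problem. The user data information includes the KPI of the user. Specifically, the analysis result includes the confidence of the KPI and/or the degradation degree of the KPI, where the confidence is the confidence that the KPI is affected by the network problem and the degradation degree indicates how much the KPI changes after the network problem occurs. The analysis result thus quantifies the impact of the network problem, which improves the accuracy of analysing that impact.
With reference to the first aspect, in a possible implementation, generating the network problem analysis result according to the network problem data information and the user data information includes:
determining an abnormal network element from the plurality of network elements included in the first network according to the network problem data information, where the abnormal network element is a network element affected by the network problem;
determining an abnormal user from a plurality of users served by the abnormal network element, where the abnormal user is a user who migrates into, migrates out of, and/or resides in the abnormal network element after the network problem occurs; and
generating the network problem analysis result according to the KPI of the abnormal user, where the KPI indicated by the network problem analysis result belongs to the abnormal user.
Specifically, the abnormal network element is determined from the plurality of network elements of the first network according to the network problem data information. In this embodiment, a network element in the first network that is affected by the network problem is called an abnormal network element. Abnormal network elements include the network element where the network problem occurs and the network elements that have a service association relationship with it. For ease of description, the network element where the network problem occurs is called the first network element, and a network element that has a service association relationship with the first network element is called a second network element. The network problem analysis apparatus associates the network problem with the first network element according to the network problem data information.
Then, abnormal users are determined from the users served by the abnormal network element. Abnormal users are the users most affected by the network problem. For example, users who migrate into, migrate out of, or reside in the network element within 1 hour after the network problem occurs are taken as abnormal users.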
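The one-hour selection rule in the example above can be sketched as a simple filter. The event tuple layout `(user_id, element_id, kind, timestamp)`, the `kind` labels, and the sample data are assumptions made for illustration; the application does not specify a record format.

```python
from datetime import datetime, timedelta


def abnormal_users(events, element_id, problem_time, window=timedelta(hours=1)):
    """Return users that migrated into, migrated out of, or resided in the
    element within `window` after the problem occurred.

    `events` is an iterable of (user_id, element_id, kind, timestamp) tuples,
    where kind is "in", "out" or "reside" (hypothetical layout).
    """
    end = problem_time + window
    return {
        user
        for user, ne, kind, ts in events
        if ne == element_id
        and kind in {"in", "out", "reside"}
        and problem_time <= ts <= end
    }


t0 = datetime(2023, 1, 1, 12, 0)
events = [
    ("u1", "A", "in", t0 + timedelta(minutes=5)),
    ("u2", "A", "reside", t0 + timedelta(minutes=90)),  # outside the window
    ("u3", "B", "out", t0 + timedelta(minutes=10)),     # different element
    ("u4", "A", "out", t0 + timedelta(minutes=59)),
]
users = abnormal_users(events, "A", t0)
```

The window length is a parameter because the text gives 1 hour only as an example.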
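The above method reduces the complexity of the computation and improves the accuracy of network problem analysis.
With reference to the first aspect, in a possible implementation, generating the network problem analysis result according to the KPI of the abnormal user includes:
determining an affected KPI of the abnormal user from the KPIs of the abnormal user, where the affected KPI is a KPI, among the multiple KPIs of the abnormal user, that is affected by the network problem; and
generating the network problem analysis result according to the affected KPI of the abnormal user, where the KPI indicated by the network problem analysis result is the affected KPI.
With the above method, after the abnormal user is determined, the KPIs affected by the network problem (the affected KPIs) are further determined among the abnormal user's KPIs. This further reduces the complexity of the computation and improves the accuracy of network problem analysis.
With reference to the first aspect, in a possible implementation, determining the abnormal network element from the plurality of network elements included in the first network according to the network problem data information includes:
determining the abnormal network element from the plurality of network elements included in the first network according to the network problem data information; and
generating first network information, where the first network information indicates a network topology structure of the abnormal network element, and the network topology structure indicates service association relationships between network elements.
With the above method, the abnormal network elements are found in the first network according to the network problem data information, and first network information indicating the network topology structure of the abnormal network elements is then generated. The service association relationships between abnormal network elements can later be determined conveniently from the first network information, which reduces the complexity of the computation and improves the accuracy of network problem analysis.
With reference to the first aspect, in a possible implementation, marking the abnormal network element among the plurality of network elements included in the first network according to the network problem data information includes:
obtaining network resource data information, where the network resource data information includes any one or more of the following: configuration information of the first network or engineering parameter information of the first network;
generating second network information according to the network resource data information, where the second network information indicates a network topology structure of the first network; and
determining the abnormal network element from a plurality of network elements indicated by the second network information according to the network problem data information, where the plurality of network elements indicated by the second network information are the plurality of network elements included in the first network.
Specifically, the network problem analysis apparatus obtains network resource data information, which includes any one or more of the following: configuration information of the first network, or engineering parameter information of the first network. The second network information indicates the network topology structure of the first network, and the network topology structure indicates service association relationships between network elements. To illustrate the difference between a service association relationship and a physical connection, take network element A and network element B in a wireless network, both of which are base stations. A physical connection exists between network element A and network element B. In addition, network element A and network element B are configured as logical neighboring cells of each other, which establishes a service association relationship between them, so that the service of a user (for example, a terminal) can be handed over between network element A and network element B. According to the network problem data information, the abnormal network elements are determined from the plurality of network elements indicated by the second network information (that is, the plurality of network elements of the first network). In this embodiment, a network element in the first network that is affected by the network problem is called an abnormal network element. Then, first network information is generated according to the service association relationships between the abnormal network elements; the first network information indicates the network topology structure of the abnormal network elements.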
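One way to picture the step from the second network information (the full topology) to the first network information (the sub-topology of abnormal network elements) is the sketch below. The adjacency-dictionary representation and the function name `first_network_info` are assumptions for illustration only.

```python
def first_network_info(topology, faulty):
    """Build the sub-topology of abnormal network elements.

    `topology` maps each element to the set of elements it has a service
    association with (treated as undirected); `faulty` is the set of first
    network elements, i.e. those where a network problem occurred.
    Returns (abnormal_elements, edges); each edge is a frozenset of two elements.
    """
    abnormal = set(faulty)
    for ne in faulty:
        abnormal |= topology.get(ne, set())  # add the second network elements
    edges = {
        frozenset((a, b))
        for a in abnormal
        for b in topology.get(a, set())
        if b in abnormal
    }
    return abnormal, edges


topology = {"A": {"B", "C"}, "B": {"A"}, "C": {"A", "D"}, "D": {"C"}}
abnormal, edges = first_network_info(topology, {"A"})
```

In this sample, element D is only associated with C, not with the faulty element A, so it is not part of the resulting sub-topology.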
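To further improve the accuracy of network problem analysis, the multiple network element sets generated from the second network information (each set includes a first network element and its second network elements) are merged. For example, network element set #1 and network element set #2 contain one network element in common. The network problem occurring in set #1 may differ from the one occurring in set #2, so the shared network element may be affected by two different network problems at the same time. To improve the accuracy of the analysis, set #1 and set #2 are merged, and the merged set is output together with a corresponding network topology subgraph for subsequent processing.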
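The merging of network element sets that share an element can be sketched as follows. The function name and the sample sets are illustrative assumptions; the application does not prescribe a particular merging algorithm.

```python
def merge_overlapping(sets):
    """Merge network element sets that share at least one element."""
    merged = []
    for current in map(set, sets):
        overlapping = [s for s in merged if s & current]
        for s in overlapping:
            current |= s       # absorb every set that overlaps
            merged.remove(s)
        merged.append(current)
    return merged


result = merge_overlapping([{"A", "B"}, {"C", "D"}, {"B", "C"}, {"E"}])
```

Note that merging is transitive: the third set bridges the first two, so all three end up in one merged set, while `{"E"}` stays separate.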
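With reference to the first aspect, in a possible implementation, the first network information indicates a network topology of a first network element and a second network element, where the first network element is a network element in the first network where the network problem occurs, and the second network element is a network element having a service association relationship with the first network element. Specifically, abnormal network elements include the network element where the network problem occurs and the network elements that have a service association relationship with it. For ease of description, the network element where the network problem occurs is called the first network element, and a network element that has a service association relationship with the first network element is called a second network element. The network problem analysis apparatus associates the network problem with the first network element according to the network problem data information.
With reference to the first aspect, in a possible implementation, determining the abnormal network element from the plurality of network elements indicated by the second network information according to the network problem data information includes:
determining at least one abnormal network element group from the first network element and the second network element according to service migration information of the first network element and service migration information of the second network element, where the service migration information indicates statistical feature information about users served by the network element migrating in, migrating out, and/or residing;
where the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements in the group, and the change in the service migration information of the abnormal network elements before and after the network problem occurs meets a first threshold; and
marking the service association relationship between the abnormal network elements as abnormal, and generating third network information according to the first network information, where the third network information indicates a topology structure of the abnormal network element group.
To improve the accuracy of the analysis result, abnormal network element groups can also be determined from the first network element and the second network elements indicated by the first network information, where each group includes at least two abnormal network elements. The change in the service migration information of an abnormal network element before and after the network problem occurs meets the first threshold. In other words, the abnormal network element groups that are more strongly affected by the network problem are selected from the first and second network elements. In this way, the network elements significantly affected by the network problem can be selected from a large number of second network elements. These significantly affected second network elements are analysed together with the first network element as abnormal network elements in order to find the abnormal users they serve. This effectively reduces the complexity of data processing and the difficulty of computation, and improves the accuracy of network problem analysis.
The service migration information of the first network element and of the second network element is obtained. The service migration information indicates statistical feature information about the users served by a network element migrating into the network element, migrating out of it, and/or residing in it. The statistical feature information includes but is not limited to a total, a proportion, a mean, and/or a variance.
For example, suppose the network problem occurs in network element #A, and network element #B is one of the network elements having a service association relationship with network element #A. The total number of users migrating from network element #A to network element #B can be counted every 10 minutes during the hour before the network problem occurs, giving 100, 99, 98, 102, 103, 95. In the 10 minutes after the network problem occurs, the count for that 10-minute interval is 25. Based on statistical feature information #1 "100, 99, 98, 102, 103, 95" and statistical feature information #2 "25", a hypothesis test is used to decide whether the change between the two is significant. With some hypothesis test, the probability that the number of migrating users after the problem shows no significant change from the hour before the problem is computed as 0.04; since this is below the preset first threshold of 0.1, the change is significant, and the service association relationship between network element #A and network element #B is an abnormal service association relationship.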
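The application does not name the hypothesis test it uses, so the sketch below substitutes a simple normal approximation (a two-sided z-test against the pre-problem samples) purely for illustration. The function name is hypothetical, and the probability this test produces differs from the 0.04 quoted in the example, which comes from an unspecified test.

```python
from statistics import NormalDist, mean, stdev


def no_change_probability(before, after_value):
    """Two-sided p-value that `after_value` comes from the same normal
    distribution as the `before` samples (normal approximation, an assumption).
    Requires at least two distinct values in `before`.
    """
    mu, sigma = mean(before), stdev(before)
    z = (after_value - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


before = [100, 99, 98, 102, 103, 95]  # migrating users per 10 minutes, before
p = no_change_probability(before, 25)  # 25 users in the 10 minutes after
FIRST_THRESHOLD = 0.1
association_abnormal = p < FIRST_THRESHOLD
```

With these counts the drop to 25 is far outside the pre-problem spread, so the association between #A and #B is flagged as abnormal; a post-problem count of 100 would not be.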
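With reference to the first aspect, in a possible implementation, determining the affected KPI of the abnormal user from the KPIs of the abnormal user includes:
determining, according to the third network information, the abnormal user served by the abnormal network element; and
determining the affected KPI from the KPIs of the abnormal user according to the confidence of each KPI of the abnormal user, where the confidence of the affected KPI is greater than a second threshold.
Specifically, according to the third network information and the user data information, at least one KPI of the abnormal user is determined from the user data information of the abnormal users served by the abnormal network element group (that is, by the abnormal network elements in the group). The KPI may be a standardized KPI. Obtaining the KPI values of the abnormal user includes obtaining the KPI value before the network problem occurs (KPIbefore) and the KPI value after the network problem occurs (KPIafter).
For example, take RSRP as the KPI and divide the RSRP range into four intervals: -130 dBm to -120 dBm, -120 dBm to -110 dBm, -110 dBm to -100 dBm, and -100 dBm to -90 dBm, where dBm denotes decibels relative to one milliwatt. In the 5 minutes before the network problem occurs, 100 RSRP values are collected, and the proportions falling in the four intervals are 10%, 30%, 40%, and 20%; these proportions, f(KPIbefore), serve as the distribution (that is, the probability distribution) of KPIbefore. In the 5 minutes after the network problem occurs, 100 RSRP values are collected, and the proportions falling in the four intervals are 20%, 40%, 30%, and 10%; these proportions, f(KPIafter), serve as the distribution (that is, the probability distribution) of KPIafter. The degree of difference between the two distributions is then calculated, for example with the JS divergence (value range [0, 1]). The smaller the JS divergence, the smaller the difference; the larger the JS divergence, the larger the difference.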
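The JS divergence between the two RSRP distributions in the example can be computed as sketched below. Base-2 logarithms are assumed, since that is the convention under which the value range is [0, 1]; the function and variable names are illustrative.

```python
from math import log2


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logarithms (range [0, 1])."""
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a == 0 contribute nothing.
        return sum(x * log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]  # mixture distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


f_before = [0.10, 0.30, 0.40, 0.20]  # proportions per RSRP interval, before
f_after = [0.20, 0.40, 0.30, 0.10]   # proportions per RSRP interval, after
difference = js_divergence(f_before, f_after)
```

Unlike the KL divergence, the JS divergence is symmetric and always finite, which makes it convenient for comparing histograms in which some intervals may be empty.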
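For example, the second threshold is 0.75, and the KPIs of the abnormal user include rate, number of scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial-transmission bit error rate, signal to interference plus noise ratio (SINR), channel quality information (CQI), resource allocation success rate, available resources, interference, and reference signal received power (RSRP). If the confidences of rate, MCS, SINR, CQI, and RSRP are greater than 0.75, then rate, MCS, SINR, CQI, and RSRP are regarded as affected KPIs. The degradation degree of each affected KPI is then calculated from its statistic before the network problem occurs and its statistic after the network problem occurs.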
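The threshold selection in this example can be sketched as a one-line filter. The confidence values below are hypothetical and chosen only so that the same five KPIs as in the text pass the 0.75 threshold.

```python
SECOND_THRESHOLD = 0.75


def affected_kpis(confidence, threshold=SECOND_THRESHOLD):
    """Keep the KPIs whose confidence is strictly greater than the threshold."""
    return {name for name, value in confidence.items() if value > threshold}


# Hypothetical confidence values for one abnormal user.
confidence = {
    "rate": 0.91, "MCS": 0.80, "SINR": 0.88, "CQI": 0.79, "RSRP": 0.95,
    "scheduling_count": 0.40, "interference": 0.75,
}
selected = affected_kpis(confidence)
```

The comparison is strict, following the wording "greater than a second threshold", so a KPI whose confidence equals 0.75 exactly is not selected.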
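The above method outputs a quantified network problem analysis result, which makes the result more intuitive and improves the accuracy of network problem analysis.
With reference to the first aspect, in a possible implementation, generating the network problem analysis result according to the affected KPI of the abnormal user includes:
generating fourth network information according to a communication mechanism relationship and the affected KPI of the abnormal user, where the fourth network information indicates an impact relationship between the affected KPI and the network problem, the impact relationship indicated by the fourth network information is consistent with the impact relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates impact relationships between multiple KPIs and network problems; and
generating the network problem analysis result according to the fourth network information and the network problem.
The communication mechanism relationship may also be called a communication mechanism relationship graph. The graph indicates the impact relationships between KPIs, and these relationships conform to the impact mechanisms described by communication protocols. For example, a communication network engineer may predefine the graph, which describes how different types of network problems affect KPIs and the causal order in which different KPIs affect one another.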
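One possible reading of how the fourth network information is derived is to keep only those edges of the predefined mechanism graph whose endpoints are the network problem or affected KPIs. The sketch below follows that reading; the edge list, the problem label `cell_unavailable`, and the function name are hypothetical, and the application may construct the graph differently.

```python
def fourth_network_info(mechanism_edges, problem_type, affected):
    """Keep the mechanism-graph edges whose two endpoints are either the
    network problem or an affected KPI (one possible interpretation)."""
    nodes = set(affected) | {problem_type}
    return [(src, dst) for src, dst in mechanism_edges
            if src in nodes and dst in nodes]


# Hypothetical mechanism graph as (cause, effect) pairs.
mechanism_edges = [
    ("cell_unavailable", "RSRP"),
    ("RSRP", "SINR"),
    ("SINR", "CQI"),
    ("CQI", "MCS"),
    ("MCS", "rate"),
    ("interference", "SINR"),
]
edges = fourth_network_info(
    mechanism_edges, "cell_unavailable", {"RSRP", "SINR", "rate"}
)
```

Because every retained edge comes from the mechanism graph, the impact relationships in the result are consistent with it by construction.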
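With reference to the first aspect, in a possible implementation, the fourth network information further includes a degradation degree of a key KPI and the confidence of the affected KPI, where the key KPI belongs to the affected KPIs. This allows the network problem analysis result to reflect the impact of the network problem on the KPIs more accurately.
With reference to the first aspect, in a possible implementation, the network problem analysis result includes any one or more of the following:
the network problem, the abnormal network element where the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degradation degree of the affected KPI, where the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem, and the degradation degree of the affected KPI indicates how much the affected KPI changes after the network problem occurs.
With reference to the first aspect, in a possible implementation, the degradation degree of the KPI includes any one or more of the following:
the difference between the statistic of the KPI after the network problem occurs and the statistic of the KPI before the network problem occurs,
the ratio of the statistic of the KPI after the network problem occurs to the statistic of the KPI before the network problem occurs,
the difference between the statistic of the KPI after the network problem occurs and a third threshold,
or the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
With reference to the first aspect, in a possible implementation, the statistic includes any one or more of the following:
mean, median, lower quantile, upper quantile, or the cumulative probability distribution over any interval.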
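The four degradation measures and the choice of statistic can be combined in a small helper, sketched here with the mean and the median as statistics. The sample rates, the third threshold of 40.0, and the dictionary keys are hypothetical.

```python
from statistics import mean, median


def degradation(before, after, statistic=mean, third_threshold=None):
    """Return the degradation measures for one KPI.

    The first two compare the post-problem statistic with the pre-problem
    statistic; the last two compare it with a third threshold, if given.
    """
    s_before, s_after = statistic(before), statistic(after)
    result = {
        "diff_vs_before": s_after - s_before,
        "ratio_vs_before": s_after / s_before,
    }
    if third_threshold is not None:
        result["diff_vs_threshold"] = s_after - third_threshold
        result["ratio_vs_threshold"] = s_after / third_threshold
    return result


# Hypothetical rate samples in Mbit/s before and after the problem.
rate_before = [50.0, 48.0, 52.0, 50.0]
rate_after = [20.0, 25.0, 15.0, 20.0]
d = degradation(rate_before, rate_after, statistic=mean, third_threshold=40.0)
```

Passing `statistic=median` (or any other function of a sample list) switches the statistic without changing the rest of the calculation.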
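With reference to the first aspect, in a possible implementation, the network problem data information includes any one or more of the following:
the alarm information of the network element, the performance degradation information of the network element, or the change information of the network element.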
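According to a second aspect, an embodiment of the present application provides a network problem analysis apparatus. The apparatus is applied to a first network that includes a plurality of network elements, and the apparatus includes:
a transceiver module, configured to obtain network problem data information, where the network problem data information indicates a network problem occurring in a network element in the first network;
the transceiver module being further configured to obtain user data information, where the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator (KPI) of the user; and
a processing module, configured to generate a network problem analysis result according to the network problem data information and the user data information, where the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
The network problem analysis result includes:
a confidence of the KPI and/or a degradation degree of the KPI, where the confidence of the KPI is the confidence that the KPI is affected by the network problem, and the degradation degree of the KPI indicates how much the KPI changes after the network problem occurs.
In a possible implementation,
the processing module is further configured to determine an abnormal network element from the plurality of network elements included in the first network according to the network problem data information, where the abnormal network element is a network element affected by the network problem;
the processing module is further configured to determine an abnormal user from a plurality of users served by the abnormal network element, where the abnormal user is a user who migrates into, migrates out of, and/or resides in the abnormal network element after the network problem occurs; and
the processing module is further configured to generate the network problem analysis result according to the KPI of the abnormal user, where the KPI indicated by the network problem analysis result belongs to the abnormal user.
In a possible implementation,
the processing module is further configured to determine an affected KPI of the abnormal user from the KPIs of the abnormal user, where the affected KPI is a KPI, among the multiple KPIs of the abnormal user, that is affected by the network problem; and
the processing module is further configured to generate the network problem analysis result according to the affected KPI of the abnormal user, where the KPI indicated by the network problem analysis result is the affected KPI.
In a possible implementation,
the processing module is further configured to determine the abnormal network element from the plurality of network elements included in the first network according to the network problem data information; and
the processing module is further configured to generate first network information, where the first network information indicates a network topology structure of the abnormal network element, and the network topology structure indicates service association relationships between network elements.
In a possible implementation,
the transceiver module is further configured to obtain network resource data information, where the network resource data information includes any one or more of the following: configuration information of the first network or engineering parameter information of the first network;
the processing module is further configured to generate second network information according to the network resource data information, where the second network information indicates a network topology structure of the first network; and
the processing module is further configured to determine the abnormal network element from a plurality of network elements indicated by the second network information according to the network problem data information, where the plurality of network elements indicated by the second network information are the plurality of network elements included in the first network.
In a possible implementation, the first network information indicates a network topology of a first network element and a second network element, where the first network element is a network element in the first network where the network problem occurs, and the second network element is a network element having a service association relationship with the first network element.
In a possible implementation,
the processing module is further configured to determine at least one abnormal network element group from the first network element and the second network element according to service migration information of the first network element and service migration information of the second network element, where the service migration information indicates statistical feature information about users served by the network element migrating in, migrating out, and/or residing;
where the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements in the group, and the change in the service migration information of the abnormal network elements before and after the network problem occurs meets a first threshold; and
the processing module is further configured to mark the service association relationship between the abnormal network elements as abnormal, and generate third network information according to the first network information, where the third network information indicates a topology structure of the abnormal network element group.
In a possible implementation,
the processing module is further configured to determine, according to the third network information, the abnormal user served by the abnormal network element; and
the processing module is further configured to determine the affected KPI from the KPIs of the abnormal user according to the confidence of each KPI of the abnormal user, where the confidence of the affected KPI is greater than a second threshold.
In a possible implementation,
the processing module is further configured to generate fourth network information according to a communication mechanism relationship and the affected KPI of the abnormal user, where the fourth network information indicates an impact relationship between the affected KPI and the network problem, the impact relationship indicated by the fourth network information is consistent with the impact relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates impact relationships between multiple KPIs and network problems; and
the processing module is further configured to generate the network problem analysis result according to the fourth network information and the network problem.
In a possible implementation, the fourth network information further includes a degradation degree of a key KPI and the confidence of the affected KPI, where the key KPI belongs to the affected KPIs.
In a possible implementation, the network problem analysis result includes any one or more of the following:
the network problem, the abnormal network element where the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degradation degree of the affected KPI, where the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem, and the degradation degree of the affected KPI indicates how much the affected KPI changes after the network problem occurs.
In a possible implementation, the degradation degree of the KPI includes any one or more of the following:
the difference between the statistic of the KPI after the network problem occurs and the statistic of the KPI before the network problem occurs,
the ratio of the statistic of the KPI after the network problem occurs to the statistic of the KPI before the network problem occurs,
the difference between the statistic of the KPI after the network problem occurs and a third threshold,
or the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
In a possible implementation, the statistic includes any one or more of the following:
mean, median, lower quantile, upper quantile, or the cumulative probability distribution over any interval.
In a possible implementation, the network problem data information includes any one or more of the following:
the alarm information of the network element, the performance degradation information of the network element, or the change information of the network element.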
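According to a third aspect, an embodiment of the present application provides a computing device. The computing device is applied to a first network that includes a plurality of network elements, and the computing device includes a communication interface and a processor;
the communication interface is configured to obtain network problem data information, where the network problem data information indicates a network problem occurring in a network element in the first network;
the communication interface is further configured to obtain user data information, where the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator (KPI) of the user; and
the processor is configured to generate a network problem analysis result according to the network problem data information and the user data information, where the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
The network problem analysis result includes:
a confidence of the KPI and/or a degradation degree of the KPI, where the confidence of the KPI is the confidence that the KPI is affected by the network problem, and the degradation degree of the KPI indicates how much the KPI changes after the network problem occurs.
In a possible implementation,
the processor is further configured to determine an abnormal network element from the plurality of network elements included in the first network according to the network problem data information, where the abnormal network element is a network element affected by the network problem;
the processor is further configured to determine an abnormal user from a plurality of users served by the abnormal network element, where the abnormal user is a user who migrates into, migrates out of, and/or resides in the abnormal network element after the network problem occurs; and
the processor is further configured to generate the network problem analysis result according to the KPI of the abnormal user, where the KPI indicated by the network problem analysis result belongs to the abnormal user.
In a possible implementation,
the processor is further configured to determine an affected KPI of the abnormal user from the KPIs of the abnormal user, where the affected KPI is a KPI, among the multiple KPIs of the abnormal user, that is affected by the network problem; and
the processor is further configured to generate the network problem analysis result according to the affected KPI of the abnormal user, where the KPI indicated by the network problem analysis result is the affected KPI.
In a possible implementation,
the processor is further configured to determine the abnormal network element from the plurality of network elements included in the first network according to the network problem data information; and
the processor is further configured to generate first network information, where the first network information indicates a network topology structure of the abnormal network element, and the network topology structure indicates service association relationships between network elements.
In a possible implementation,
the communication interface is further configured to obtain network resource data information, where the network resource data information includes any one or more of the following: configuration information of the first network or engineering parameter information of the first network;
the processor is further configured to generate second network information according to the network resource data information, where the second network information indicates a network topology structure of the first network; and
the processor is further configured to determine the abnormal network element from a plurality of network elements indicated by the second network information according to the network problem data information, where the plurality of network elements indicated by the second network information are the plurality of network elements included in the first network.
In a possible implementation, the first network information indicates a network topology of a first network element and a second network element, where the first network element is a network element in the first network where the network problem occurs, and the second network element is a network element having a service association relationship with the first network element.
In a possible implementation,
the processor is further configured to determine at least one abnormal network element group from the first network element and the second network element according to service migration information of the first network element and service migration information of the second network element, where the service migration information indicates statistical feature information about users served by the network element migrating in, migrating out, and/or residing;
where the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements in the group, and the change in the service migration information of the abnormal network elements before and after the network problem occurs meets a first threshold; and
the processor is further configured to mark the service association relationship between the abnormal network elements as abnormal, and generate third network information according to the first network information, where the third network information indicates a topology structure of the abnormal network element group.
In a possible implementation,
the processor is further configured to determine, according to the third network information, the abnormal user served by the abnormal network element; and
the processor is further configured to determine the affected KPI from the KPIs of the abnormal user according to the confidence of each KPI of the abnormal user, where the confidence of the affected KPI is greater than a second threshold.
In a possible implementation,
the processor is further configured to generate fourth network information according to a communication mechanism relationship and the affected KPI of the abnormal user, where the fourth network information indicates an impact relationship between the affected KPI and the network problem, the impact relationship indicated by the fourth network information is consistent with the impact relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates impact relationships between multiple KPIs and network problems; and
the processor is further configured to generate the network problem analysis result according to the fourth network information and the network problem.
In a possible implementation, the fourth network information further includes a degradation degree of a key KPI and the confidence of the affected KPI, where the key KPI belongs to the affected KPIs.
In a possible implementation, the network problem analysis result includes any one or more of the following:
the network problem, the abnormal network element where the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degradation degree of the affected KPI, where the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem, and the degradation degree of the affected KPI indicates how much the affected KPI changes after the network problem occurs.
In a possible implementation, the degradation degree of the KPI includes any one or more of the following:
the difference between the statistic of the KPI after the network problem occurs and the statistic of the KPI before the network problem occurs,
the ratio of the statistic of the KPI after the network problem occurs to the statistic of the KPI before the network problem occurs,
the difference between the statistic of the KPI after the network problem occurs and a third threshold,
or the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
In a possible implementation, the statistic includes any one or more of the following:
mean, median, lower quantile, upper quantile, or the cumulative probability distribution over any interval.
In a possible implementation, the network problem data information includes any one or more of the following:
the alarm information of the network element, the performance degradation information of the network element, or the change information of the network element.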
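According to a fourth aspect, a computing device is provided, where the computing device performs the method in any implementation of the first aspect.
According to a fifth aspect, a computing device cluster is provided, where the computing device cluster includes the computing device according to the fourth aspect.
According to a sixth aspect, the present application provides a computer storage medium, which may be non-volatile. The computer storage medium stores computer-readable instructions that, when executed by a processor, implement the method in any implementation of the first aspect.
According to a seventh aspect, the present application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method in any implementation of the first aspect.
According to an eighth aspect, the present application provides a chip system. The chip system includes a processor and an interface circuit and is configured to support a network device in implementing the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods. In a possible design, the chip system further includes a memory configured to store the program instructions and data necessary for the network device. The chip system may consist of a chip, or may include a chip and other discrete components.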
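Brief Description of Drawings
FIG. 1 is a schematic diagram of an application scenario involved in embodiments of the present application;
FIG. 2 is a schematic diagram of the structure of a computing device 100 in an embodiment of the present application;
FIG. 3 is a schematic diagram of the structure of a computing device cluster in an embodiment of the present application;
FIG. 4 is a schematic diagram of the structure of a computing device cluster in an embodiment of the present application;
FIG. 5 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application;
FIG. 6 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application;
FIG. 7 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application;
FIG. 8 is a schematic diagram of an embodiment of a network problem analysis method in an embodiment of the present application;
FIG. 9 is a schematic diagram of the network topology structure of the first network in an embodiment of the present application;
FIG. 10 is a schematic diagram of a network topology structure of the first network information in an embodiment of the present application;
FIG. 11 is a schematic diagram of another embodiment of the first network information in an embodiment of the present application;
FIG. 12 is a schematic diagram of a network topology of the third network information in an embodiment of the present application;
FIG. 13 is a schematic diagram of the structure of the communication mechanism relationship graph in an embodiment of the present application;
FIG. 14 is a schematic diagram of a topology of the fourth network information in an embodiment of the present application;
FIG. 15 is a schematic diagram of the structure of a possible computing device 1500 provided in an embodiment of the present application;
FIG. 16 is a schematic diagram of the structure of a computing device 1600 in an embodiment of the present application.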
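Detailed Description
Some example implementations of the present disclosure are described in more detail below with reference to the accompanying drawings. Although some example implementations are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the example implementations set forth here. Rather, these implementations are provided so that the present disclosure is more thorough and complete and conveys its scope fully to those skilled in the art.
As used herein, the term "include" and its variants are open-ended, that is, "including but not limited to". Unless otherwise stated, the term "or" means "and/or". The term "based on" means "based at least in part on". The terms "embodiment" and "some embodiments" mean "at least some embodiments". Terms such as "first" and "second" are used to distinguish different objects; they do not imply an order, nor do they require that the "first" and the "second" be of different types.
First, some concepts involved in embodiments of the present application are introduced.
Autonomous Networks: an autonomous network is a new generation of telecommunication network capable of self-configuration, self-healing, self-optimization, and self-evolution.
Network Element: a network element is the smallest unit that can be monitored and managed in network management.
Co-occurrence association analysis (Co-occurrence Analysis): a method for quantitatively analysing the co-occurrence information in various information carriers. It can reveal content associations between pieces of information and the co-occurrence relationships implied by feature items.
Network Topology: the specific connection structure between the network elements that make up a network, where the nodes of the topology are network elements and the edges are service association relationships between network elements. A service association relationship may be physical or logical.
Directed Graph: a graph is a pair G = (V, E), where the elements of the set V are called nodes and the elements of the set E are pairs of nodes, called edges. A graph whose edges have a direction, that is, whose edges are ordered pairs of nodes, is called a directed graph.
Connected Path: for any two nodes in a graph, if there exists an alternating sequence of nodes and edges between them, the sequence is called a connected path between the two nodes.
Hypothesis Testing: a statistical inference method for judging whether the difference between samples, or between a sample and the population, is caused by sampling error or by an essential difference. When the significance level obtained by the test is lower than a preset threshold, the hypothesis is satisfied; otherwise it is not. The significance level takes values in (0, 1), and the threshold is generally set to 0.1, 0.05, or 0.01.
Probability Distribution: describes the probability law governing the values taken by a random variable.
Distribution Similarity: expresses how similar two probability distributions are. Common measures include cosine similarity, KL divergence, and earth mover's distance.
Graph Search: a method that traverses a graph to find all paths from a start node to a target node. Common search methods include depth-first search and breadth-first search.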
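As a concrete instance of the graph search concept, the sketch below uses depth-first search to enumerate every simple path between two nodes of a small directed graph. The adjacency-list representation and the sample graph are illustrative assumptions.

```python
def all_paths(graph, start, goal, path=None):
    """Depth-first search returning every simple path from start to goal.

    `graph` maps each node to a list of its successor nodes.
    """
    path = (path or []) + [start]
    if start == goal:
        return [path]
    paths = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # do not revisit nodes, so cycles cannot loop
            paths.extend(all_paths(graph, nxt, goal, path))
    return paths


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
paths = all_paths(graph, "A", "D")
```

A breadth-first variant would find the same set of paths but in order of increasing length.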
Cumulative Probability Distribution: describes the probability that a random variable falls in any given interval. It can be obtained by integrating the probability distribution.
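For a discrete distribution over ordered bins, the integration reduces to a running sum. The sketch below assumes a four-bin histogram such as the RSRP proportions 10%, 30%, 40%, 20% used elsewhere in the description; the function names are hypothetical.

```python
from itertools import accumulate


def cumulative(distribution):
    """Running sum of a discrete probability distribution over ordered bins."""
    return list(accumulate(distribution))


def interval_probability(distribution, first_bin, last_bin):
    """Probability that the variable falls in bins first_bin..last_bin inclusive."""
    return sum(distribution[first_bin:last_bin + 1])


bins = [0.10, 0.30, 0.40, 0.20]
cdf = cumulative(bins)
low_signal = interval_probability(bins, 0, 1)  # share of the two lowest bins
```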
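For example, refer to FIG. 1, which is a schematic diagram of an application scenario involved in embodiments of the present application. An operation support system (OSS) is the supporting platform required for launching and operating telecommunication services. The OSS is an integrated support system with shared information resources for telecom operators. It mainly consists of network management, system management, billing, business operations, accounting, and customer service, which are integrated through a unified information bus. The OSS includes an operation and maintenance center and a network management center. It is responsible for checking and managing the communication quality and operation of the entire network, and for recording and collecting the data generated during network operation. It has connections to every device in the network and monitors and controls each of them.
The network problem analysis apparatus proposed in embodiments of the present application can be divided by function into a data monitoring module, a data storage module, and an analysis and processing module. The data monitoring module collects the network element data generated in real time while the OSS is running. It monitors this data in real time and uploads the network problem data information derived from it to the data storage module for storage. The data storage module also stores the user data information of the users served by the OSS, which includes the users' key performance indicators (KPIs). After extracting the network problem data information and the user data information from the data storage module, the analysis and processing module standardizes them, that is, preprocesses them into a standardized format to facilitate subsequent analysis. The analysis and processing module then generates a network problem analysis result according to the network problem data information and the user data information; the result indicates the degree to which the KPIs are affected by the network problem. Finally, the apparatus feeds the analysis result back to network maintenance engineers, who handle the network running in the OSS. For example, a network maintenance engineer sorts the network problems according to the analysis result, creates work orders, and uploads them to a work order management server, which dispatches the work orders to network maintenance engineers by priority.
The analysis and processing module, the data storage module, and the data monitoring module in the apparatus can each be implemented in software or in hardware. The following takes the analysis and processing module as an example; the data storage module and the data monitoring module can be implemented in the same way.
As an example of a software functional unit, the analysis and processing module may include code running on a computing instance. The computing instance may include at least one of a physical host (computing device), a virtual machine, and a container, and there may be one or more such computing instances. For example, the analysis and processing module may include code running on multiple hosts, virtual machines, or containers. The multiple hosts, virtual machines, or containers used to run the code may be distributed in the same region or in different regions. Furthermore, they may be distributed in the same availability zone (AZ) or in different AZs, where each AZ includes one data center or multiple geographically close data centers. A region usually includes multiple AZs.
Similarly, the multiple hosts, virtual machines, or containers used to run the code may be distributed in the same virtual private cloud (VPC) or in multiple VPCs. A VPC is usually set up within one region. For communication between two VPCs in the same region, and for cross-region communication between VPCs in different regions, a communication gateway must be set up in each VPC, and the VPCs are interconnected through the communication gateways.
As an example of a hardware functional unit, the analysis and processing module may include at least one computing device, such as a server. Alternatively, the analysis and processing module may be a device implemented with an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
The multiple computing devices included in the analysis and processing module may be distributed in the same region or in different regions, in the same AZ or in different AZs, and in the same VPC or in multiple VPCs. The multiple computing devices may be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
The network problem analysis apparatus as a whole can likewise be implemented in software or in hardware. Its implementation is described next.
As an example of a software functional unit, the network problem analysis apparatus may include code running on a computing instance. The computing instance may be at least one of a physical host (computing device), a virtual machine, and a container, and there may be one or more such computing instances. For example, the apparatus may include code running on multiple hosts, virtual machines, or containers. The multiple hosts, virtual machines, or containers used to run the code may be distributed in the same region or in different regions, and in the same AZ or in different AZs, where each AZ includes one data center or multiple geographically close data centers. A region usually includes multiple AZs.
Similarly, the multiple hosts, virtual machines, or containers used to run the code may be distributed in the same VPC or in multiple VPCs. A VPC is usually set up within one region. For communication between two VPCs in the same region, and for cross-region communication between VPCs in different regions, a communication gateway must be set up in each VPC, and the VPCs are interconnected through the communication gateways.
As an example of a hardware functional unit, the network problem analysis apparatus may include at least one computing device, such as a server. Alternatively, the apparatus may be a device implemented with an ASIC or a PLD. The PLD may be a CPLD, an FPGA, a GAL, or any combination thereof.
The multiple computing devices included in the network problem analysis apparatus may be distributed in the same region or in different regions, in the same AZ or in different AZs, and in the same VPC or in multiple VPCs. The multiple computing devices may be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.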
本申请还提供一种计算设备100。如图2所示,计算设备100包括:总线102、处理器104、存储器106和通信接口108。处理器104、存储器106和通信接口108之间通过总线102通信。计算设备100可以是服务器或终端设备。应理解,本申请不限定计算设备100中的处理器、存储器的个数。
总线102可以是外设部件互连标准(peripheral component interconnect,PCI)总线或扩展工业标准结构(extended industry standard architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图2中仅用一条线表示,但并不表示仅有一根总线或一种类型的总线。总线104可包括在计算设备100各个部件(例如,存储器106、处理器104、通信接口108)之间传送信息的通路。
处理器104可以包括中央处理器(central processing unit,CPU)、图形处理器(graphics processing unit,GPU)、微处理器(micro processor,MP)或者数字信号处理器(digital signal processor,DSP)等处理器中的任意一种或多种。
存储器106可以包括易失性存储器(volatile memory),例如随机存取存储器(random access memory,RAM)。处理器104还可以包括非易失性存储器(non-volatile memory),例如只读存储器(read-only memory,ROM),快闪存储器,机械硬盘(hard disk drive,HDD)或固态硬盘(solid state drive,SSD)。
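The memory 106 stores executable program code, and the processor 104 executes the executable program code to implement the functions of the foregoing analysis and processing module, data storage module, and data monitoring module respectively, thereby implementing the network problem analysis method. That is, the memory 106 stores instructions for performing the network problem analysis method.
Alternatively, the memory 106 stores executable code, and the processor 104 executes the executable code to implement the functions of the foregoing network problem analysis apparatus, thereby implementing the network problem analysis method. That is, the memory 106 stores instructions for performing the network problem analysis method.
The communication interface 108 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 100 and other devices or communication networks.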
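An embodiment of this application further provides a computing device cluster. The computing device cluster includes at least one computing device. The computing device may be a server, for example a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device may also be a terminal device such as a desktop computer, a laptop, or a smartphone.
As shown in FIG. 3, the computing device cluster includes at least one computing device 100. The memories 106 of one or more computing devices 100 in the computing device cluster may store the same instructions for performing the network problem analysis method.
In some possible implementations, the memories 106 of one or more computing devices 100 in the computing device cluster may each store a part of the instructions for performing the network problem analysis method. In other words, a combination of one or more computing devices 100 may jointly execute the instructions for performing the network problem analysis method.
It should be noted that the memories 106 of different computing devices 100 in the computing device cluster may store different instructions, each used to perform a part of the functions of the network problem analysis apparatus. That is, the instructions stored in the memories 106 of different computing devices 100 may implement the functions of one or more of the analysis and processing module, the data storage module, and the data monitoring module.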
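In some possible implementations, one or more computing devices in the computing device cluster may be connected through a network. The network may be a wide area network, a local area network, or the like. FIG. 4 shows one possible implementation. As shown in FIG. 4, two computing devices 100A and 100B are connected through a network. Specifically, each computing device connects to the network through its communication interface. In this type of implementation, the memory 106 of the computing device 100A stores instructions for performing the functions of the analysis and processing module, while the memory 106 of the computing device 100B stores instructions for performing the functions of the data storage module and the data monitoring module.
The arrangement of the computing device cluster shown in FIG. 4 may reflect the consideration that the network problem analysis method provided in this application requires a large amount of data; the functions of the data storage module and the data monitoring module are therefore assigned to the computing device 100B.
It should be understood that the functions of the computing device 100A shown in FIG. 4 may also be performed by multiple computing devices 100. Likewise, the functions of the computing device 100B may also be performed by multiple computing devices 100.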
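An embodiment of this application further provides another computing device cluster. For the connection relationships between the computing devices in this cluster, refer similarly to the connection arrangements of the computing device clusters described for FIG. 3 and FIG. 4. The difference is that the memories 106 of one or more computing devices 100 in this cluster may store the same instructions for performing the network problem analysis method.
In some possible implementations, the memories 106 of one or more computing devices 100 in the computing device cluster may each store a part of the instructions for performing the network problem analysis method. In other words, a combination of one or more computing devices 100 may jointly execute the instructions for performing the network problem analysis method.
It should be noted that the memories 106 of different computing devices 100 in the computing device cluster may store different instructions for performing a part of the functions of the network problem analysis system. That is, the instructions stored in the memories 106 of different computing devices 100 may implement the functions of one or more apparatuses within the network problem analysis apparatus.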
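The embodiments of this application are described below with reference to the accompanying drawings. FIG. 5 is a schematic diagram of an embodiment of a network problem analysis method in the embodiments of this application. The network problem analysis method proposed in the embodiments of this application includes:
501. Obtain network problem data information.
In step 501, the network problem analysis apparatus obtains network problem data information of the network elements in a first network. Specifically, the network problem data information includes, but is not limited to, alarm information of a network element, performance degradation information of a network element, and change information of a network element.
For example, the alarm information of a network element includes the following. When the network element is a cell, the alarm information may be an unavailability alarm for the cell. The cell is regarded as a logical network element: a network device may include (or manage) one or more cells, and the network device provides wireless communication services to users (for example, terminal devices) through the cells it manages. When the network element is a network device such as a radio frequency unit, the alarm information may be link interruption information of the network device. When the network element is a network device such as a site, the alarm information may be an out-of-service alarm of the network device.
For example, the performance degradation information of a network element includes: the connection setup failure rate being greater than a preset threshold, or the load of a device being greater than a preset threshold.
For example, the change information of a network element includes: a change in the configuration information of a neighboring cell, or the addition of a carrier frequency.
Optionally, after obtaining the network problem data information, the network problem analysis apparatus may further standardize it to obtain standardized network problem data information. For example, the standardized network problem data information includes an identifier of the network problem, a type of the network problem, and a location of the network problem (that is, identification information of the network element on which the network problem occurs), as shown in Table 1:
Table 1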
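502. Obtain user data information, where the user data information includes KPIs of users.
In step 502, the network problem analysis apparatus obtains user data information, which is the user data information of each user served by the first network. For example, the user data information includes, but is not limited to, signaling, messages, information, or packets sent and received by network elements. The user data information further includes the KPIs of the users.
For example, the KPIs of a user include, but are not limited to: rate, number of scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial transmission bit error rate, signal to interference plus noise ratio (SINR), channel quality information (CQI), resource allocation success rate, available resources, interference, or reference signal received power (RSRP).
Optionally, after obtaining the user data information, the network problem analysis apparatus may further standardize it to obtain standardized user data information. For example, the KPIs of different users may be expressed in different units; standardizing them yields KPIs in the same unit.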
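The unit standardization mentioned above can be sketched as follows. This is a minimal illustration only: the unit names, the conversion table `RATE_TO_MBPS`, and the helper `normalize_rate` are assumptions introduced for the example and do not come from the application.

```python
# Hypothetical conversion factors to one common unit (Mbit/s); not from the source.
RATE_TO_MBPS = {"bps": 1e-6, "kbps": 1e-3, "Mbps": 1.0, "Gbps": 1e3}


def normalize_rate(value, unit):
    """Convert one rate KPI sample to Mbit/s."""
    if unit not in RATE_TO_MBPS:
        raise ValueError(f"unknown rate unit: {unit!r}")
    return value * RATE_TO_MBPS[unit]


# Rate samples reported by different users in different units.
samples = [(2500.0, "kbps"), (3.0, "Mbps"), (0.004, "Gbps")]
normalized = [normalize_rate(value, unit) for value, unit in samples]
```

After this step every sample is expressed in the same unit, so later statistics can compare users directly.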
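503. Generate a network problem analysis result based on the network problem data information and the user data information, where the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
In step 503, the abnormal network elements and the abnormal users served by them are determined based on the network problem data information and the user data information. The network problem analysis result is then determined from the user data information (KPIs) of the abnormal users; it indicates the degree to which the affected KPIs in that user data information are affected by the network problem. Specifically, the network problem analysis result includes the confidence of a KPI and/or the degradation degree of the KPI. The confidence of a KPI is the confidence that the KPI is affected by the network problem, and the degradation degree of a KPI indicates the degree of change of the KPI after the network problem occurs.
Further, key KPIs may be selected from the multiple KPIs of a user. For example, the KPIs of the user include: rate, number of scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial transmission bit error rate, signal to interference plus noise ratio (SINR), channel quality information (CQI), resource allocation success rate, available resources, interference, and reference signal received power (RSRP). The RSRP and the rate are selected as key KPIs. The network problem analysis result then includes the confidence and the degradation degree of these key KPIs.
The degradation degree of a KPI includes any one or more of the following: the difference between the statistic of the KPI after the network problem occurs and the statistic of the KPI before the network problem occurs; the ratio of the statistic of the KPI after the network problem occurs to the statistic of the KPI before the network problem occurs; the difference between the statistic of the KPI after the network problem occurs and a third threshold; or the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
The statistic includes any one or more of the following: mean, median, lower quantile, upper quantile, or the cumulative probability distribution over any interval.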
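The four forms of degradation degree listed above (difference or ratio, against the pre-problem statistic or against a third threshold) can be sketched as below. The function names, the choice of quartiles for the lower and upper quantiles, and the sample values are assumptions made for illustration; the application does not prescribe any of them.

```python
from statistics import mean, median, quantiles


def statistic(samples, kind="mean"):
    """Compute one of the statistics named in the text."""
    if kind == "mean":
        return mean(samples)
    if kind == "median":
        return median(samples)
    if kind == "lower_quantile":   # assumed here to be the first quartile
        return quantiles(samples, n=4)[0]
    if kind == "upper_quantile":   # assumed here to be the third quartile
        return quantiles(samples, n=4)[2]
    raise ValueError(f"unsupported statistic: {kind!r}")


def degradation(before, after, kind="mean", mode="difference", threshold=None):
    """Compare the post-problem statistic with a reference value.

    The reference is the pre-problem statistic, or the (third) threshold
    when one is supplied.
    """
    reference = threshold if threshold is not None else statistic(before, kind)
    post = statistic(after, kind)
    if mode == "difference":
        return post - reference
    if mode == "ratio":
        return post / reference
    raise ValueError(f"unsupported mode: {mode!r}")


# Hypothetical rate samples (Mbit/s) before and after a network problem.
rate_before = [20.0, 22.0, 18.0, 20.0]
rate_after = [10.0, 12.0, 8.0, 10.0]
diff_vs_before = degradation(rate_before, rate_after)
ratio_vs_before = degradation(rate_before, rate_after, mode="ratio")
diff_vs_threshold = degradation(rate_before, rate_after, threshold=15.0)
```

With these sample values the mean rate drops from 20 to 10, so the difference form reports a change of -10 and the ratio form reports 0.5.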
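It should be noted that the embodiments of this application do not limit the method for calculating the confidence of a KPI or the method for calculating the degradation degree of a KPI.
In one example, a method for calculating the confidence of a KPI includes: collecting the data of the KPI before the network problem occurs, denoted KPIbefore; collecting the data of the KPI after the network problem occurs, denoted KPIafter; then calculating the probability distributions of KPIbefore and KPIafter; and finally using a distribution similarity measure to calculate the degree of difference between the probability distributions before and after the network problem occurs, which is used as the confidence of the KPI.
In one example, a method for calculating the degradation degree of a KPI includes: calculating the cumulative probability of KPIbefore over a given interval, denoted S-KPIbefore; calculating the cumulative probability of KPIafter over a given interval, denoted S-KPIafter; and then taking the difference between S-KPIbefore and S-KPIafter as the degradation degree of the KPI. Here the statistic of the KPI is the cumulative probability over an interval, which has the advantage of being sensitive to localized degradation.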
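The interval-based degradation example can be sketched as follows, assuming the same interval is used before and after the problem. The RSRP samples, the interval [-130, -110) dBm, and the helper name `interval_probability` are hypothetical choices for illustration.

```python
def interval_probability(samples, low, high):
    """Fraction of samples in [low, high): an empirical cumulative probability."""
    if not samples:
        raise ValueError("no samples")
    return sum(1 for s in samples if low <= s < high) / len(samples)


# Hypothetical RSRP samples (dBm) before and after a network problem.
kpi_before = [-95, -98, -105, -112, -101, -99, -108, -96, -103, -97]
kpi_after = [-118, -112, -109, -121, -115, -99, -113, -125, -111, -104]

# Share of weak-coverage samples, here taken as RSRP in [-130, -110) dBm.
s_kpi_before = interval_probability(kpi_before, -130, -110)
s_kpi_after = interval_probability(kpi_after, -130, -110)

# Degradation degree as the difference of the two interval probabilities.
degradation_degree = s_kpi_before - s_kpi_after
```

Because only one interval is inspected, a shift of samples into the weak-coverage range shows up clearly even when the overall mean moves little, which is the sensitivity to localized degradation that the text refers to.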
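In the embodiments of this application, the network problem analysis apparatus combines the network problem data information and the user data information to obtain a network problem analysis result, which indicates the degree to which the user data information is affected by the network problem. The user data information includes the key performance indicators (KPIs) of users. Specifically, the network problem analysis result includes the confidence of a KPI and/or the degradation degree of the KPI, where the confidence of a KPI is the confidence that the KPI is affected by the network problem, and the degradation degree of a KPI indicates the degree of change of the KPI after the network problem occurs. The network problem analysis result quantifies the impact caused by the network problem, which improves the accuracy of analyzing that impact.
With reference to the foregoing embodiments, to further improve the accuracy of the network problem analysis result, in step 503 (generating the network problem analysis result based on the network problem data information and the user data information), the abnormal network elements in the first network that are affected by the network problem may also be determined. Abnormal users are then determined from the users served by the abnormal network elements. From the user data information (KPIs) of the abnormal users, the KPIs affected by the network problem (called affected KPIs) are determined. Finally, the network problem analysis result is generated based on the affected KPIs before the network problem occurs and the KPIs after the network problem occurs; the result indicates the degree to which the affected KPIs are affected by the network problem. These steps are described separately below.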
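First, how to determine abnormal network elements is described. Refer to FIG. 6, which is a schematic diagram of an embodiment of a network problem analysis method in the embodiments of this application. The network problem analysis method proposed in the embodiments of this application further includes:
601. Obtain network resource data information.
In step 601, the network problem analysis apparatus obtains network resource data information, which includes any one or more of the following: configuration information of the first network, or engineering parameter information of the first network.
For example, taking a base station (eNodeB) as the network element, the network resource data information is shown in Table 2.
Table 2
In another example, the network resource data information is shown in Table 3.
Table 3
602. Generate second network information based on the network resource data information, where the second network information indicates the topology of the first network.
In step 602, the network resource data information indicates the service association relationships and/or physical connection relationships between network elements. Second network information, which indicates the topology of the first network, can be generated from the network resource data information. For example, refer to FIG. 9, which is a schematic diagram of the network topology of the first network in the embodiments of this application. The second network information generated from the network resource data information of the first network indicates this network topology, and the network topology indicates the service association relationships between network elements.
Regarding service association relationships and physical connection relationships, take network element A and network element B in a wireless network (each a base station) as an example. A physical connection exists between network element A and network element B. In addition, network element A and network element B are configured as logical neighboring cells of each other, which establishes a service association relationship between them. The service of a user (for example, a terminal) can be handed over between network element A and network element B.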
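One way to picture the second network information is as an undirected graph whose edges are the configured neighbor relations. The sketch below assumes the neighbor configuration is available as pairs of network element identifiers; the identifiers and the function name `build_topology` are hypothetical.

```python
from collections import defaultdict


def build_topology(neighbor_pairs):
    """Build an adjacency map from mutually configured neighbor relations."""
    topology = defaultdict(set)
    for a, b in neighbor_pairs:
        topology[a].add(b)
        topology[b].add(a)
    return dict(topology)


# Hypothetical neighbor configuration: each pair is a service association.
neighbor_pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
topology = build_topology(neighbor_pairs)
```

An adjacency map keeps neighbor lookup cheap, which matters in the later steps that collect the neighbors of each faulty network element.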
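603. Determine abnormal network elements from the plurality of network elements indicated by the second network information based on the network problem data information, where an abnormal network element is a network element affected by the network problem, and generate first network information.
In step 603, abnormal network elements are determined, based on the network problem data information, from the plurality of network elements indicated by the second network information (that is, the plurality of network elements of the first network). In the embodiments of this application, a network element in the first network that is affected by the network problem is called an abnormal network element. Specifically, abnormal network elements include the network element on which the network problem occurs and the network elements that have a service association relationship with it. For ease of description, the network element on which the network problem occurs is called a first network element, and a network element that has a service association relationship with the first network element is called a second network element. The network problem analysis apparatus associates the network problem with the first network element based on the network problem data information.
Then, first network information is generated based on the service association relationship between the first network element and the second network element; the first network information indicates the network topology of the abnormal network elements. In other words, the first network information indicates the network topology of the first network element and of the second network element. In a possible implementation, the network problem analysis apparatus marks the first network element and the second network element in the second network information and then generates the first network information. A first network element, together with the second network elements that have a service association relationship with it, is output as one network element set.
Refer to FIG. 10, which is a schematic diagram of a network topology of the first network information in the embodiments of this application. Network element set #1 includes the network element on which the network problem occurs (the first network element) and the other network elements, on which no network problem occurs, that have a service association relationship with the first network element (the second network elements).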
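Forming one network element set per faulty network element can be sketched as follows. The adjacency map and the list of faulty elements are hypothetical, and `element_sets` is an illustrative name rather than anything defined in the application.

```python
def element_sets(topology, faulty_elements):
    """For each first (faulty) element, collect it with its associated neighbors."""
    return {ne: {ne} | set(topology.get(ne, ())) for ne in faulty_elements}


# Hypothetical service-association topology (adjacency map).
topology = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

# Network problems were reported on A and D (the first network elements).
sets_by_problem = element_sets(topology, ["A", "D"])
```

In this example the two sets share element C, which is the situation the merging step described next is meant to handle.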
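Optionally, to further improve the accuracy of network problem analysis, the multiple network element sets generated from the second network information (each including a first network element and second network elements) are merged. Specifically, network element sets that share a network element are merged into one network element set, and the merged set is analyzed as a topology subgraph. Taking FIG. 10 as an example, network element set #1 and network element set #2 include one common network element. The network problem occurring in network element set #1 may differ from the one occurring in network element set #2, so the common network element may be affected by two different network problems at the same time. To improve the accuracy of network problem analysis, network element set #1 and network element set #2 are merged, and the merged set is output together with a corresponding network topology subgraph for subsequent processing. For ease of understanding, see FIG. 11, which is a schematic diagram of another embodiment of the first network information in the embodiments of this application: network element set #1 and network element set #2 are merged into topology subgraph #1 in FIG. 11. In this way, the first network information (topology subgraph #1) is generated from the second network information (network element set #1 and network element set #2).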
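The merging of network element sets that share a member can be sketched with a simple absorb-and-replace loop. This is one possible approach under the assumption that sets are small enough for pairwise comparison; the function name and sample sets are hypothetical.

```python
def merge_overlapping(sets):
    """Merge sets that share at least one element, transitively."""
    merged = []
    for current in sets:
        current = set(current)
        kept = []
        for existing in merged:
            if existing & current:
                current |= existing   # absorb every overlapping set
            else:
                kept.append(existing)
        kept.append(current)
        merged = kept
    return merged


# Set #1 and set #2 share element "C"; the third set is unrelated.
element_sets = [{"A", "B", "C"}, {"C", "D", "E"}, {"X", "Y"}]
subgraphs = merge_overlapping(element_sets)
```

Each resulting set corresponds to one topology subgraph; a union-find structure would do the same job more efficiently on large networks.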
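Further, to improve the accuracy of the network problem analysis result, abnormal network element groups may also be determined from the first network elements and second network elements indicated by the first network information. An abnormal network element group includes at least two abnormal network elements, and the change in the service migration information of these abnormal network elements before and after the network problem occurs satisfies a first threshold. In other words, the abnormal network element groups that are strongly affected by the network problem are selected from the first network elements and the second network elements. For the specific procedure, refer to FIG. 7, which is a schematic diagram of an embodiment of a network problem analysis method in the embodiments of this application. The network problem analysis method proposed in the embodiments of this application further includes:
S1. Obtain service migration information of the first network element and service migration information of the second network element.
In step S1, the service migration information of the first network element and the service migration information of the second network element are obtained. The service migration information indicates statistical feature information about the users served by a network element moving into the network element, moving out of the network element, and/or camping on the network element. The statistical feature information includes, but is not limited to, a total, a proportion, a mean, and/or a variance.
S2. Hypothesis testing.
In step S2, the service migration information includes two parts: the statistical feature information of the users moving into, moving out of, and/or camping on the network element (a first network element or a second network element) before the network problem occurs, denoted statistical feature information #1; and the statistical feature information of the users moving into, moving out of, and/or camping on the network element after the network problem occurs, denoted statistical feature information #2. A hypothesis testing method is used to determine whether the change between statistical feature information #1 and statistical feature information #2 is significant.
For example, suppose the network element on which the network problem occurs is network element #A, and one of the network elements that has a service association relationship with network element #A is network element #B. The total number of users moving from network element #A into network element #B can be counted during the hour before the network problem occurs, once every 10 minutes, giving 100, 99, 98, 102, 103, 95. In the 10 minutes after the network problem occurs, the total number of users moving in, counted over the same 10-minute period, is 25. Based on statistical feature information #1 ("100, 99, 98, 102, 103, 95") and statistical feature information #2 ("25"), a hypothesis testing method is used to determine whether the change is significant. Suppose that, with some hypothesis test, the probability that the total number of users moving in after the network problem occurs shows no significant change relative to the hour before the network problem occurs is 0.04, and the preset first threshold is 0.1. Because 0.04 is below the first threshold, the change is significant, and the service association relationship between network element #A and network element #B is an abnormal service association relationship.
The embodiments of this application do not limit the hypothesis testing method.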
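Since the hypothesis test is left open, the sketch below picks one simple option purely for illustration: a two-sided z-test that treats the pre-problem counts as samples from a normal distribution. This is an assumption, not the test used in the application, and with so few samples it is only a rough approximation. For the counts in the example it yields a probability far smaller than the 0.04 quoted in the text, which comes from an unspecified test.

```python
from statistics import NormalDist, mean, stdev


def no_change_probability(before, after_value):
    """Two-sided p-value of after_value under a normal model fitted to `before`.

    Assumed test for illustration only; the source does not fix a method.
    """
    mu = mean(before)
    sigma = stdev(before)
    if sigma == 0:
        return 1.0 if after_value == mu else 0.0
    z = (after_value - mu) / sigma
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


FIRST_THRESHOLD = 0.1  # preset first threshold from the example

# Users moving from element #A into element #B, counted every 10 minutes.
moved_in_before = [100, 99, 98, 102, 103, 95]
moved_in_after = 25

p_value = no_change_probability(moved_in_before, moved_in_after)
significant = p_value < FIRST_THRESHOLD
```

A count of 25 against a history centered near 100 is flagged as a significant change, while a count close to the historical mean would not be.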
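If the statistical feature information changes significantly before and after the network problem occurs (the first threshold is satisfied), go to step S3; if it does not change significantly (the first threshold is not satisfied), go to step S4.
S3. The statistical feature information changes significantly before and after the network problem occurs (the first threshold is satisfied); mark the service association relationship between the first network element and the second network element as abnormal.
In step S3, when the service association relationship between the first network element and a given second network element is abnormal, the first network element and that second network element form an abnormal network element group, and both are abnormal network elements. The second network element is considered to be significantly affected by the network problem.
Then, the network problem analysis apparatus marks the abnormal network element groups in the first network information, marks the service association relationships within the abnormal network element groups as abnormal service association relationships, and generates third network information. For ease of understanding, refer to FIG. 12, which is a schematic diagram of a network topology of the third network information in the embodiments of this application. Taking topology subgraph #2 in FIG. 11 as an example, after the abnormal network element groups are found using steps S1 to S3, the service association relationships within them are marked as abnormal service association relationships.
S4. The statistical feature information does not change significantly before and after the network problem occurs (the first threshold is not satisfied); mark the service association relationship between the first network element and the second network element as normal.
In step S4, when the service association relationship between the first network element and a given second network element is normal, the second network element is considered not to be significantly affected by the network problem and is treated as a normal network element.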
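The S3/S4 branching can be sketched as a labeling pass over the service association relationships, given one test probability per relationship. The probabilities, element names, and function name are hypothetical; only the comparison against the first threshold follows the text.

```python
def mark_relationships(p_values, first_threshold=0.1):
    """Label each (first element, second element) relationship (steps S3/S4)."""
    return {
        pair: "abnormal" if p < first_threshold else "normal"
        for pair, p in p_values.items()
    }


# Hypothetical test probabilities for the relationships of first element "A".
p_values = {("A", "B"): 0.04, ("A", "C"): 0.62}
marks = mark_relationships(p_values)

# Elements that belong to at least one abnormal network element group.
abnormal_elements = sorted(
    {ne for pair, mark in marks.items() if mark == "abnormal" for ne in pair}
)
```

The labeled relationships play the role of the third network information: the topology is unchanged, but each edge now carries a normal or abnormal mark.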
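604. Determine abnormal users from the plurality of users served by the abnormal network elements, where an abnormal user is a user that moves into, moves out of, and/or camps on an abnormal network element after the network problem occurs.
In step 604, after the abnormal network elements are determined, the users that move into, move out of, and/or camp on these network elements after the network problem occurs are found and treated as abnormal users. For example, the users that move into, move out of, or camp on the network element within 1 hour after the network problem occurs are treated as abnormal users.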
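Selecting abnormal users by time window can be sketched as below. The event tuple layout (user, network element, event kind, timestamp), the event kinds, and the sample data are assumptions for illustration; the 1-hour window follows the example in the text.

```python
from datetime import datetime, timedelta


def abnormal_users(events, abnormal_elements, problem_time,
                   window=timedelta(hours=1)):
    """Users with a move-in, move-out, or camping event on an abnormal
    network element within `window` after the problem occurred."""
    end = problem_time + window
    return {
        user
        for user, element, _kind, timestamp in events
        if element in abnormal_elements and problem_time <= timestamp <= end
    }


problem_time = datetime(2023, 11, 27, 12, 0)
# Hypothetical mobility events: (user, network element, kind, timestamp).
events = [
    ("u1", "A", "in", datetime(2023, 11, 27, 12, 10)),
    ("u2", "B", "out", datetime(2023, 11, 27, 12, 59)),
    ("u3", "A", "camp", datetime(2023, 11, 27, 13, 30)),  # outside the window
    ("u4", "Z", "in", datetime(2023, 11, 27, 12, 5)),     # not an abnormal element
    ("u5", "A", "out", datetime(2023, 11, 27, 11, 50)),   # before the problem
]
selected = abnormal_users(events, {"A", "B"}, problem_time)
```

Only users whose events fall on an abnormal network element and inside the window are kept, which narrows the KPI analysis to the users most likely to be affected.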
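With the foregoing method, the second network elements that are significantly affected by the network problem can be selected from a large number of second network elements. These second network elements, together with the first network element, are analyzed as abnormal network elements to find the abnormal users they serve. This effectively reduces the complexity of data processing and the computational difficulty, and improves the accuracy of network problem analysis.
With reference to the foregoing embodiments, after the abnormal users are determined, the KPIs affected by the network problem (that is, the affected KPIs) among the multiple KPIs of the abnormal users are further determined. Refer to FIG. 8, which is a schematic diagram of an embodiment of a network problem analysis method in the embodiments of this application. The network problem analysis method proposed in the embodiments of this application further includes:
801. Obtain the KPIs of the abnormal users served by the abnormal network element group.
In step 801, based on the third network information and the user data information, at least one KPI of an abnormal user is determined from the user data information of the abnormal users served by the abnormal network element group (that is, by the abnormal network elements included in the group). The KPI may be a standardized KPI. Specifically, obtaining the KPI values of an abnormal user includes obtaining the KPI values before the network problem occurs (KPIbefore) and the KPI values after the network problem occurs (KPIafter).
802. Calculate probability distributions.
In step 802, after the KPIs of the abnormal users are obtained, a distribution similarity measure is used to calculate the degree of difference between the distributions of KPIbefore and KPIafter.
For example, taking RSRP as the KPI, the RSRP value range is divided into four intervals: -130 dBm to -120 dBm, -120 dBm to -110 dBm, -110 dBm to -100 dBm, and -100 dBm to -90 dBm (dBm: decibels relative to one milliwatt). In the 5 minutes before the network problem occurs, 100 RSRP values are collected, and the proportions falling into the four intervals are 10%, 30%, 40%, and 20%. These proportions, denoted f(KPIbefore), serve as the distribution (that is, the probability distribution) of KPIbefore. In the 5 minutes after the network problem occurs, 100 RSRP values are collected, and the proportions falling into the four intervals are 20%, 40%, 30%, and 10%. These proportions, denoted f(KPIafter), serve as the distribution (that is, the probability distribution) of KPIafter. Based on the distribution of KPIbefore and the distribution of KPIafter, the degree of difference between the two distributions is calculated, for example using the JS (Jensen-Shannon) divergence, whose range is [0, 1] when computed with base-2 logarithms. A smaller JS divergence indicates a smaller difference, and a larger JS divergence indicates a larger difference.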
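The JS divergence for the two RSRP distributions in the example can be computed as sketched below. Base-2 logarithms are used so the value lies in [0, 1]; the function names are illustrative and not taken from the application.

```python
from math import log2


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value is in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Proportions of RSRP samples in the four intervals from the example.
f_kpi_before = [0.10, 0.30, 0.40, 0.20]
f_kpi_after = [0.20, 0.40, 0.30, 0.10]

difference = js_divergence(f_kpi_before, f_kpi_after)
```

For these two distributions the divergence is about 0.035, a small value on the [0, 1] scale; identical distributions give 0 and distributions with no overlap give 1.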
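803. Determine the confidence of the KPI based on the probability distributions of the KPI before and after the network problem occurs.
In step 803, the confidence of the KPI is obtained from the probability distribution of KPIbefore and the probability distribution of KPIafter calculated in step 802. The confidence of the KPI indicates the confidence that the KPI is affected by the network problem.
804. Determine the affected KPIs based on the confidence of the KPIs.
In step 804, the affected KPIs are selected from the at least one KPI based on the confidence of each KPI. Specifically, the confidence of an affected KPI is greater than a second threshold.
For example, the second threshold is 0.75, and the KPIs of the abnormal user include: rate, number of scheduling times, modulation and coding scheme (MCS), dual-stream transmission ratio, initial transmission bit error rate, signal to interference plus noise ratio (SINR), channel quality information (CQI), resource allocation success rate, available resources, interference, and reference signal received power (RSRP). If the confidence of the rate, MCS, SINR, CQI, and RSRP is greater than 0.75, then the rate, MCS, SINR, CQI, and RSRP are considered affected KPIs. Then, the degradation degree of each affected KPI is calculated from the statistic of the affected KPI before the network problem occurs and the statistic of the affected KPI after the network problem occurs.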
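The threshold filter in step 804 can be sketched as follows. The confidence values are invented for illustration (the text gives only the threshold 0.75 and the resulting list of affected KPIs), and the dictionary keys are hypothetical labels.

```python
SECOND_THRESHOLD = 0.75  # second threshold from the example

# Hypothetical confidence values per KPI; the source gives no numbers.
confidence = {
    "rate": 0.91,
    "MCS": 0.83,
    "SINR": 0.88,
    "CQI": 0.80,
    "RSRP": 0.95,
    "scheduling_count": 0.40,
    "dual_stream_ratio": 0.22,
    "resource_allocation_success_rate": 0.75,  # equal to the threshold: not kept
}

affected_kpis = sorted(
    name for name, value in confidence.items() if value > SECOND_THRESHOLD
)
```

The comparison is strict, matching the wording that the confidence of an affected KPI is greater than the second threshold.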
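In a possible implementation, fourth network information is generated based on a communication mechanism relationship and the affected KPIs. The fourth network information indicates the influence relationship between the affected KPIs and the network problem. The influence relationship indicated by the fourth network information is consistent with the influence relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates the influence relationships between multiple KPIs and network problems. Then the network problem analysis result is generated based on the fourth network information.
The communication mechanism relationship may also be called a communication mechanism relationship graph, as illustrated in FIG. 13, which is a schematic structural diagram of the communication mechanism relationship graph in the embodiments of this application. The communication mechanism relationship graph indicates the influence relationships between KPIs, and these influence relationships conform to the influence mechanisms described by communication protocols. For example, a communication network engineer may predefine the communication mechanism relationship graph, which describes the impact of different types of network problems on KPIs and the causal order of influence between different KPIs. As illustrated in FIG. 13, for example, degradation of the RSRP affects the SINR.
After the affected KPIs are obtained, a graph search starting from the network problem is performed according to the influence relationships indicated by the communication mechanism relationship graph, to determine the paths connecting each affected KPI and the network problem. The KPI topology describing these paths is then output as the fourth network information, as illustrated in FIG. 14, which is a schematic topology diagram of the fourth network information in the embodiments of this application. The fourth network information indicates the influence relationships among the affected KPIs, and the influence relationships between the affected KPIs and the network problem.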
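The graph search can be sketched as a breadth-first traversal of the communication mechanism relationship graph that only passes through affected KPIs. The graph below is hypothetical apart from the RSRP-to-SINR edge mentioned in the text, and breadth-first search is one possible choice since the application does not name a search method.

```python
from collections import deque


def influence_edges(mechanism, problem, affected):
    """Breadth-first search from the problem, visiting only affected KPIs.

    Returns the (cause, effect) edges that connect the problem to every
    reachable affected KPI.
    """
    affected = set(affected)
    parent = {problem: None}
    queue = deque([problem])
    while queue:
        node = queue.popleft()
        for successor in mechanism.get(node, ()):
            if successor in affected and successor not in parent:
                parent[successor] = node
                queue.append(successor)
    return {(cause, effect) for effect, cause in parent.items() if cause is not None}


# Hypothetical mechanism graph; only the RSRP -> SINR edge is from the text.
mechanism = {
    "cell_unavailable": ["RSRP"],
    "RSRP": ["SINR"],
    "SINR": ["CQI"],
    "CQI": ["MCS"],
    "MCS": ["rate"],
    "load": ["scheduling_count"],
    "scheduling_count": ["rate"],
}
affected = ["rate", "MCS", "SINR", "CQI", "RSRP"]
edges = influence_edges(mechanism, "cell_unavailable", affected)
```

The returned edge set is a small directed graph rooted at the network problem, which corresponds to the KPI topology output as the fourth network information.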
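Optionally, a key KPI may be determined among the affected KPIs. For example, the rate is selected as the key KPI of the multiple affected KPIs illustrated in FIG. 14.
Finally, the network problem analysis result is generated based on the fourth network information (and the network problem data information). The network problem analysis result includes any one or more of the following: the network problem, the abnormal network element on which the network problem occurs, the abnormal users affected by the network problem, the affected KPIs of the abnormal users, the confidence of the affected KPIs, or the degradation degree of the affected KPIs. The confidence of an affected KPI is the confidence that the affected KPI is affected by the network problem, and the degradation degree of an affected KPI indicates the degree of change of the affected KPI after the network problem occurs. An example is shown in Table 4:
Table 4
With the foregoing method, a quantified network problem analysis result is output, which makes the result more intuitive and improves the accuracy of network problem analysis.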
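It can be understood that, to implement the functions in the foregoing embodiments, the computing device includes corresponding hardware structures and/or software modules for performing each function. Those skilled in the art will readily appreciate that, in combination with the units and method steps of the examples described in the embodiments disclosed in this application, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.
FIG. 15 is a schematic structural diagram of a possible computing device 1500 provided in an embodiment of this application. The computing device can be used to implement the functions of the network problem analysis apparatus in the foregoing method embodiments, and can therefore also achieve the beneficial effects of those method embodiments.
As shown in FIG. 15, the computing device 1500 includes a processing unit 1510 and a transceiver unit 1520. The computing device 1500 is configured to implement the functions of the network problem analysis apparatus in the method embodiments shown in FIG. 5 to FIG. 14.
For a more detailed description of the processing unit 1510 and the transceiver unit 1520, refer to the related descriptions in the method embodiments shown in FIG. 3 to FIG. 8.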
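In one example, the computing device 1500 is applied to a first network, the first network includes a plurality of network elements, and the computing device includes:
a transceiver unit 1520, configured to obtain network problem data information, where the network problem data information indicates a network problem occurring on a network element in the first network;
the transceiver unit 1520 is further configured to obtain user data information, where the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator (KPI) of the user;
a processing unit 1510, configured to generate a network problem analysis result based on the network problem data information and the user data information, where the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
The network problem analysis result includes:
a confidence of the KPI and/or a degradation degree of the KPI, where the confidence of the KPI is the confidence that the KPI is affected by the network problem, and the degradation degree of the KPI indicates the degree of change of the KPI after the network problem occurs.
In a possible implementation,
the processing unit 1510 is further configured to determine, based on the network problem data information, an abnormal network element from the plurality of network elements included in the first network, where the abnormal network element is a network element affected by the network problem;
the processing unit 1510 is further configured to determine an abnormal user from a plurality of users served by the abnormal network element, where the abnormal user is a user that moves into, moves out of, and/or camps on the abnormal network element after the network problem occurs;
the processing unit 1510 is further configured to generate the network problem analysis result based on a KPI of the abnormal user, where the KPI indicated by the network problem analysis result belongs to the abnormal user.
In a possible implementation,
the processing unit 1510 is further configured to determine an affected KPI of the abnormal user from the KPIs of the abnormal user, where the affected KPI is a KPI, among the plurality of KPIs of the abnormal user, that is affected by the network problem;
the processing unit 1510 is further configured to generate the network problem analysis result based on the affected KPI of the abnormal user, where the KPI indicated by the network problem analysis result is the affected KPI.
In a possible implementation,
the processing unit 1510 is further configured to determine, based on the network problem data information, the abnormal network element from the plurality of network elements included in the first network;
the processing unit 1510 is further configured to generate first network information, where the first network information indicates a network topology of the abnormal network element, and the network topology indicates service association relationships between network elements.
In a possible implementation,
the transceiver unit 1520 is further configured to obtain network resource data information, where the network resource data information includes any one or more of the following: configuration information of the first network, or engineering parameter information of the first network;
the processing unit 1510 is further configured to generate second network information based on the network resource data information, where the second network information indicates a network topology of the first network;
the processing unit 1510 is further configured to determine, based on the network problem data information, the abnormal network element from a plurality of network elements indicated by the second network information, where the plurality of network elements indicated by the second network information are the plurality of network elements included in the first network.
In a possible implementation, the first network information indicates a network topology of a first network element and a second network element, where the first network element is a network element in the first network on which the network problem occurs, and the second network element is a network element that has a service association relationship with the first network element.
In a possible implementation,
the processing unit 1510 is further configured to determine at least one abnormal network element group from the first network element and the second network element based on service migration information of the first network element and service migration information of the second network element, where the service migration information indicates statistical feature information of the users served by a network element moving in, moving out, and/or camping;
the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements included in the abnormal network element group, and the change in the service migration information of the abnormal network elements before and after the network problem occurs satisfies a first threshold;
the processing unit 1510 is further configured to mark the service association relationship between the abnormal network elements as abnormal, and generate third network information based on the first network information, where the third network information indicates a topology of the abnormal network element group.
In a possible implementation,
the processing unit 1510 is further configured to determine, based on the third network information, the abnormal user served by the abnormal network element;
the processing unit 1510 is further configured to determine the affected KPI from the KPIs of the abnormal user based on the confidence of the KPIs of the abnormal user, where the confidence of the affected KPI is greater than a second threshold.
In a possible implementation,
the processing unit 1510 is further configured to generate fourth network information based on a communication mechanism relationship and the affected KPI of the abnormal user, where the fourth network information indicates an influence relationship between the affected KPI and the network problem, the influence relationship indicated by the fourth network information is consistent with the influence relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates influence relationships between a plurality of KPIs and network problems;
the processing unit 1510 is further configured to generate the network problem analysis result based on the fourth network information and the network problem.
In a possible implementation, the fourth network information further includes a degradation degree of a key KPI and the confidence of the affected KPI, where the key KPI belongs to the affected KPIs.
In a possible implementation, the network problem analysis result includes any one or more of the following:
the network problem, the abnormal network element on which the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degradation degree of the affected KPI, where the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem, and the degradation degree of the affected KPI indicates the degree of change of the affected KPI after the network problem occurs.
In a possible implementation, the degradation degree of the KPI includes any one or more of the following:
the difference between the statistic of the KPI after the network problem occurs and the statistic of the KPI before the network problem occurs,
the ratio of the statistic of the KPI after the network problem occurs to the statistic of the KPI before the network problem occurs,
the difference between the statistic of the KPI after the network problem occurs and a third threshold,
or the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
In a possible implementation, the statistic includes any one or more of the following:
mean, median, lower quantile, upper quantile, or the cumulative probability distribution over any interval.
In a possible implementation, the network problem data information includes any one or more of the following:
alarm information of the network element, performance degradation information of the network element, or change information of the network element.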
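FIG. 16 is a schematic structural diagram of a computing device 1600 in the embodiments of this application. The computing device 1600 includes a processor 1610 and an interface circuit 1620, which are coupled to each other. It can be understood that the interface circuit 1620 may be a transceiver or an input/output interface. Optionally, the computing device 1600 may further include a memory 1630, configured to store instructions executed by the processor 1610, input data required by the processor 1610 to run the instructions, or data generated after the processor 1610 runs the instructions.
When the computing device 1600 is used to implement the methods shown in FIG. 5 to FIG. 14, the processor 1610 implements the functions of the processing unit 1510, and the interface circuit 1620 implements the functions of the transceiver unit 1520.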
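In one example, the computing device 1600 is applied to a first network, the first network includes a plurality of network elements, and the computing device includes an interface circuit 1620 and a processor 1610;
the interface circuit 1620 is configured to obtain network problem data information, where the network problem data information indicates a network problem occurring on a network element in the first network;
the interface circuit 1620 is further configured to obtain user data information, where the user data information is user data information of at least one user served by the first network, and the user data information includes a key performance indicator (KPI) of the user;
the processor 1610 is configured to generate a network problem analysis result based on the network problem data information and the user data information, where the network problem analysis result indicates the degree to which the KPI is affected by the network problem.
The network problem analysis result includes:
a confidence of the KPI and/or a degradation degree of the KPI, where the confidence of the KPI is the confidence that the KPI is affected by the network problem, and the degradation degree of the KPI indicates the degree of change of the KPI after the network problem occurs.
In a possible implementation,
the processor 1610 is further configured to determine, based on the network problem data information, an abnormal network element from the plurality of network elements included in the first network, where the abnormal network element is a network element affected by the network problem;
the processor 1610 is further configured to determine an abnormal user from a plurality of users served by the abnormal network element, where the abnormal user is a user that moves into, moves out of, and/or camps on the abnormal network element after the network problem occurs;
the processor 1610 is further configured to generate the network problem analysis result based on a KPI of the abnormal user, where the KPI indicated by the network problem analysis result belongs to the abnormal user.
In a possible implementation,
the processor 1610 is further configured to determine an affected KPI of the abnormal user from the KPIs of the abnormal user, where the affected KPI is a KPI, among the plurality of KPIs of the abnormal user, that is affected by the network problem;
the processor 1610 is further configured to generate the network problem analysis result based on the affected KPI of the abnormal user, where the KPI indicated by the network problem analysis result is the affected KPI.
In a possible implementation,
the processor 1610 is further configured to determine, based on the network problem data information, the abnormal network element from the plurality of network elements included in the first network;
the processor 1610 is further configured to generate first network information, where the first network information indicates a network topology of the abnormal network element, and the network topology indicates service association relationships between network elements.
In a possible implementation,
the interface circuit 1620 is further configured to obtain network resource data information, where the network resource data information includes any one or more of the following: configuration information of the first network, or engineering parameter information of the first network;
the processor 1610 is further configured to generate second network information based on the network resource data information, where the second network information indicates a network topology of the first network;
the processor 1610 is further configured to determine, based on the network problem data information, the abnormal network element from a plurality of network elements indicated by the second network information, where the plurality of network elements indicated by the second network information are the plurality of network elements included in the first network.
In a possible implementation, the first network information indicates a network topology of a first network element and a second network element, where the first network element is a network element in the first network on which the network problem occurs, and the second network element is a network element that has a service association relationship with the first network element.
In a possible implementation,
the processor 1610 is further configured to determine at least one abnormal network element group from the first network element and the second network element based on service migration information of the first network element and service migration information of the second network element, where the service migration information indicates statistical feature information of the users served by a network element moving in, moving out, and/or camping;
the abnormal network element group includes at least two abnormal network elements, a service association relationship exists between the abnormal network elements included in the abnormal network element group, and the change in the service migration information of the abnormal network elements before and after the network problem occurs satisfies a first threshold;
the processor 1610 is further configured to mark the service association relationship between the abnormal network elements as abnormal, and generate third network information based on the first network information, where the third network information indicates a topology of the abnormal network element group.
In a possible implementation,
the processor 1610 is further configured to determine, based on the third network information, the abnormal user served by the abnormal network element;
the processor 1610 is further configured to determine the affected KPI from the KPIs of the abnormal user based on the confidence of the KPIs of the abnormal user, where the confidence of the affected KPI is greater than a second threshold.
In a possible implementation,
the processor 1610 is further configured to generate fourth network information based on a communication mechanism relationship and the affected KPI of the abnormal user, where the fourth network information indicates an influence relationship between the affected KPI and the network problem, the influence relationship indicated by the fourth network information is consistent with the influence relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates influence relationships between a plurality of KPIs and network problems;
the processor 1610 is further configured to generate the network problem analysis result based on the fourth network information and the network problem.
In a possible implementation, the fourth network information further includes a degradation degree of a key KPI and the confidence of the affected KPI, where the key KPI belongs to the affected KPIs.
In a possible implementation, the network problem analysis result includes any one or more of the following:
the network problem, the abnormal network element on which the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degradation degree of the affected KPI, where the confidence of the affected KPI is the confidence that the affected KPI is affected by the network problem, and the degradation degree of the affected KPI indicates the degree of change of the affected KPI after the network problem occurs.
In a possible implementation, the degradation degree of the KPI includes any one or more of the following:
the difference between the statistic of the KPI after the network problem occurs and the statistic of the KPI before the network problem occurs,
the ratio of the statistic of the KPI after the network problem occurs to the statistic of the KPI before the network problem occurs,
the difference between the statistic of the KPI after the network problem occurs and a third threshold,
or the ratio of the statistic of the KPI after the network problem occurs to the third threshold.
In a possible implementation, the statistic includes any one or more of the following:
mean, median, lower quantile, upper quantile, or the cumulative probability distribution over any interval.
In a possible implementation, the network problem data information includes any one or more of the following:
alarm information of the network element, performance degradation information of the network element, or change information of the network element.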
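The interface circuit 1620 may also be connected to a transceiver. The transceiver may be used to support the reception or transmission of air-interface signals between the computing device and a network device, and between the computing device and a terminal device, and may be connected to multiple antennas. The transceiver includes a transmitter Tx and a receiver Rx. Specifically, one or more antennas may receive an air-interface signal; the receiver Rx of the transceiver receives the air-interface signal from the antennas, converts it into a digital baseband signal or a digital intermediate-frequency signal, and provides that signal to the processor 1610, so that the processor 1610 can process it further, for example by demodulation and decoding. In addition, the transmitter Tx of the transceiver receives a modulated digital baseband signal or digital intermediate-frequency signal from the processor 1610, converts it into an air-interface signal, and sends the air-interface signal through one or more antennas.
The computing device may also be a chip. The interface circuit 1620 may be an input and/or output circuit of the chip, or a communication interface. The chip may be used in a terminal, a base station, or another network device. In one possible design, the computing device includes means for generating data and means for sending data. The functions of the means for generating data and the means for sending data may be implemented by one or more processors. For example, data may be generated by one or more processors and sent through a transceiver, an input/output circuit, or an interface of the chip. For the data, refer to the related descriptions in the embodiments of this application. In another possible design, the computing device includes means for receiving data and means for sending uplink data. For the data, and for how uplink data is sent based on the data, refer to the related descriptions in the embodiments of this application. For example, the data may be received through a transceiver, an input/output circuit, or an interface of the chip.
When the computing device is a chip applied to a terminal, the terminal chip implements the functions of the terminal in the foregoing method embodiments. The terminal chip receives information from other modules in the terminal (such as a radio frequency module or an antenna), where the information is sent by a base station to the terminal; or the terminal chip sends information to other modules in the terminal (such as a radio frequency module or an antenna), where the information is sent by the terminal to a base station.
When the computing device is a module applied to a base station, the base station module implements the functions of the base station in the foregoing method embodiments. The base station module receives information from other modules in the base station (such as a radio frequency module or an antenna), where the information is sent by a terminal to the base station; or the base station module sends information to other modules in the base station (such as a radio frequency module or an antenna), where the information is sent by the base station to a terminal. The base station module here may be a baseband chip of the base station, or may be a DU or another module, where the DU may be a DU under the open radio access network (O-RAN) architecture.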
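It can be understood that the processor in the embodiments of this application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general-purpose processor may be a microprocessor or any conventional processor.
The method steps in the embodiments of this application may be implemented in hardware, or in software instructions executable by a processor. The software instructions may consist of corresponding software modules, and the software modules may be stored in random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor, so that the processor can read information from the storage medium and write information to it. The storage medium may also be a component of the processor. The processor and the storage medium may be located in an ASIC. In addition, the ASIC may be located in a base station or a terminal. The processor and the storage medium may also exist as discrete components in a base station or a terminal.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer programs or instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are performed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus. The computer programs or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless manner. The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium, for example a floppy disk, a hard disk, or a magnetic tape; an optical medium, for example a digital video disc; or a semiconductor medium, for example a solid state drive. The computer-readable storage medium may be a volatile or non-volatile storage medium, or may include both volatile and non-volatile types of storage media.
In the embodiments of this application, unless otherwise specified or logically conflicting, the terms and/or descriptions in different embodiments are consistent and may be referenced by one another, and the technical features in different embodiments may be combined according to their inherent logical relationships to form new embodiments.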

Claims (32)

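  1. A network problem analysis method, wherein the method is applied to a first network, the first network comprises a plurality of network elements, and the method comprises:
    obtaining network problem data information, wherein the network problem data information indicates a network problem occurring on a network element in the first network;
    obtaining user data information, wherein the user data information is user data information of at least one user served by the first network, and the user data information comprises a key performance indicator (KPI) of the user;
    generating a network problem analysis result based on the network problem data information and the user data information, wherein the network problem analysis result indicates a degree to which the KPI is affected by the network problem,
    wherein the network problem analysis result comprises:
    a confidence of the KPI and/or a degradation degree of the KPI, wherein the confidence of the KPI is a confidence that the KPI is affected by the network problem, and the degradation degree of the KPI indicates a degree of change of the KPI after the network problem occurs.
  2. The method according to claim 1, wherein generating the network problem analysis result based on the network problem data information and the user data information comprises:
    determining, based on the network problem data information, an abnormal network element from the plurality of network elements comprised in the first network, wherein the abnormal network element is a network element affected by the network problem;
    determining an abnormal user from a plurality of users served by the abnormal network element, wherein the abnormal user is a user that moves into, moves out of, and/or camps on the abnormal network element after the network problem occurs;
    generating the network problem analysis result based on a KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result belongs to the abnormal user.
  3. The method according to claim 2, wherein generating the network problem analysis result based on the KPI of the abnormal user comprises:
    determining an affected KPI of the abnormal user from the KPIs of the abnormal user, wherein the affected KPI is a KPI, among the plurality of KPIs of the abnormal user, that is affected by the network problem;
    generating the network problem analysis result based on the affected KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result is the affected KPI.
  4. The method according to claim 2 or 3, wherein determining, based on the network problem data information, the abnormal network element from the plurality of network elements comprised in the first network comprises:
    determining, based on the network problem data information, the abnormal network element from the plurality of network elements comprised in the first network;
    generating first network information, wherein the first network information indicates a network topology of the abnormal network element, and the network topology indicates service association relationships between network elements.
  5. The method according to claim 4, wherein marking, based on the network problem data information, the abnormal network element from the plurality of network elements comprised in the first network comprises:
    obtaining network resource data information, wherein the network resource data information comprises any one or more of the following: configuration information of the first network, or engineering parameter information of the first network;
    generating second network information based on the network resource data information, wherein the second network information indicates a network topology of the first network;
    determining, based on the network problem data information, the abnormal network element from a plurality of network elements indicated by the second network information, wherein the plurality of network elements indicated by the second network information are the plurality of network elements comprised in the first network.
  6. The method according to claim 4 or 5, wherein the first network information indicates a network topology of a first network element and a second network element, the first network element is a network element in the first network on which the network problem occurs, and the second network element is a network element that has a service association relationship with the first network element.
  7. The method according to claim 6, wherein determining, based on the network problem data information, the abnormal network element from the plurality of network elements indicated by the second network information comprises:
    determining at least one abnormal network element group from the first network element and the second network element based on service migration information of the first network element and service migration information of the second network element, wherein the service migration information indicates statistical feature information of the users served by a network element moving in, moving out, and/or camping;
    wherein the abnormal network element group comprises at least two abnormal network elements, a service association relationship exists between the abnormal network elements comprised in the abnormal network element group, and a change in the service migration information of the abnormal network elements before and after the network problem occurs satisfies a first threshold;
    marking the service association relationship between the abnormal network elements as abnormal, and generating third network information based on the first network information, wherein the third network information indicates a topology of the abnormal network element group.
  8. The method according to claim 7, wherein determining the affected KPI of the abnormal user from the KPIs of the abnormal user comprises:
    determining, based on the third network information, the abnormal user served by the abnormal network element;
    determining the affected KPI from the KPIs of the abnormal user based on the confidence of the KPIs of the abnormal user, wherein the confidence of the affected KPI is greater than a second threshold.
  9. The method according to any one of claims 3 to 8, wherein generating the network problem analysis result based on the affected KPI of the abnormal user comprises:
    generating fourth network information based on a communication mechanism relationship and the affected KPI of the abnormal user, wherein the fourth network information indicates an influence relationship between the affected KPI and the network problem, the influence relationship indicated by the fourth network information is consistent with the influence relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates influence relationships between a plurality of KPIs and network problems;
    generating the network problem analysis result based on the fourth network information and the network problem.
  10. The method according to claim 9, wherein the fourth network information further comprises a degradation degree of a key KPI and the confidence of the affected KPI, and the key KPI belongs to the affected KPIs.
  11. The method according to any one of claims 1 to 10, wherein the network problem analysis result comprises any one or more of the following:
    the network problem, the abnormal network element on which the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degradation degree of the affected KPI, wherein the confidence of the affected KPI is a confidence that the affected KPI is affected by the network problem, and the degradation degree of the affected KPI indicates a degree of change of the affected KPI after the network problem occurs.
  12. The method according to claims 1 to 11, wherein the degradation degree of the KPI comprises any one or more of the following:
    a difference between the statistic of the KPI after the network problem occurs and the statistic of the KPI before the network problem occurs,
    a ratio of the statistic of the KPI after the network problem occurs to the statistic of the KPI before the network problem occurs,
    a difference between the statistic of the KPI after the network problem occurs and a third threshold,
    or a ratio of the statistic of the KPI after the network problem occurs to the third threshold.
  13. The method according to claim 12, wherein the statistic comprises any one or more of the following:
    mean, median, lower quantile, upper quantile, or a cumulative probability distribution over any interval.
  14. The method according to any one of claims 1 to 13, wherein the network problem data information comprises any one or more of the following:
    alarm information of the network element, performance degradation information of the network element, or change information of the network element.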
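  15. A network problem analysis apparatus, wherein the network problem analysis apparatus is applied to a first network, the first network comprises a plurality of network elements, and the network problem analysis apparatus comprises:
    a transceiver module, configured to obtain network problem data information, wherein the network problem data information indicates a network problem occurring on a network element in the first network;
    the transceiver module is further configured to obtain user data information, wherein the user data information is user data information of at least one user served by the first network, and the user data information comprises a key performance indicator (KPI) of the user;
    a processing module, configured to generate a network problem analysis result based on the network problem data information and the user data information, wherein the network problem analysis result indicates a degree to which the KPI is affected by the network problem,
    wherein the network problem analysis result comprises:
    a confidence of the KPI and/or a degradation degree of the KPI, wherein the confidence of the KPI is a confidence that the KPI is affected by the network problem, and the degradation degree of the KPI indicates a degree of change of the KPI after the network problem occurs.
  16. The network problem analysis apparatus according to claim 15, wherein the processing module is further configured to determine, based on the network problem data information, an abnormal network element from the plurality of network elements comprised in the first network, wherein the abnormal network element is a network element affected by the network problem;
    the processing module is further configured to determine an abnormal user from a plurality of users served by the abnormal network element, wherein the abnormal user is a user that moves into, moves out of, and/or camps on the abnormal network element after the network problem occurs;
    the processing module is further configured to generate the network problem analysis result based on a KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result belongs to the abnormal user.
  17. The network problem analysis apparatus according to claim 16, wherein the processing module is further configured to determine an affected KPI of the abnormal user from the KPIs of the abnormal user, wherein the affected KPI is a KPI, among the plurality of KPIs of the abnormal user, that is affected by the network problem;
    the processing module is further configured to generate the network problem analysis result based on the affected KPI of the abnormal user, wherein the KPI indicated by the network problem analysis result is the affected KPI.
  18. The network problem analysis apparatus according to claim 16 or 17, wherein
    the processing module is further configured to determine, based on the network problem data information, the abnormal network element from the plurality of network elements comprised in the first network;
    the processing module is further configured to generate first network information, wherein the first network information indicates a network topology of the abnormal network element, and the network topology indicates service association relationships between network elements.
  19. The network problem analysis apparatus according to claim 18, wherein the transceiver module is further configured to obtain network resource data information, wherein the network resource data information comprises any one or more of the following: configuration information of the first network, or engineering parameter information of the first network;
    the processing module is further configured to generate second network information based on the network resource data information, wherein the second network information indicates a network topology of the first network;
    the processing module is further configured to determine, based on the network problem data information, the abnormal network element from a plurality of network elements indicated by the second network information, wherein the plurality of network elements indicated by the second network information are the plurality of network elements comprised in the first network.
  20. The network problem analysis apparatus according to claim 18 or 19, wherein the first network information indicates a network topology of a first network element and a second network element, the first network element is a network element in the first network on which the network problem occurs, and the second network element is a network element that has a service association relationship with the first network element.
  21. The network problem analysis apparatus according to claim 20, wherein the processing module is further configured to determine at least one abnormal network element group from the first network element and the second network element based on service migration information of the first network element and service migration information of the second network element, wherein the service migration information indicates statistical feature information of the users served by a network element moving in, moving out, and/or camping;
    wherein the abnormal network element group comprises at least two abnormal network elements, a service association relationship exists between the abnormal network elements comprised in the abnormal network element group, and a change in the service migration information of the abnormal network elements before and after the network problem occurs satisfies a first threshold;
    the processing module is further configured to mark the service association relationship between the abnormal network elements as abnormal, and generate third network information based on the first network information, wherein the third network information indicates a topology of the abnormal network element group.
  22. The network problem analysis apparatus according to claim 20, wherein the processing module is further configured to determine, based on the third network information, the abnormal user served by the abnormal network element;
    the processing module is further configured to determine the affected KPI from the KPIs of the abnormal user based on the confidence of the KPIs of the abnormal user, wherein the confidence of the affected KPI is greater than a second threshold.
  23. The network problem analysis apparatus according to any one of claims 17 to 22, wherein
    the processing module is further configured to generate fourth network information based on a communication mechanism relationship and the affected KPI of the abnormal user, wherein the fourth network information indicates an influence relationship between the affected KPI and the network problem, the influence relationship indicated by the fourth network information is consistent with the influence relationship indicated by the communication mechanism relationship, and the communication mechanism relationship indicates influence relationships between a plurality of KPIs and network problems;
    the processing module is further configured to generate the network problem analysis result based on the fourth network information and the network problem.
  24. The network problem analysis apparatus according to claim 23, wherein the fourth network information further comprises a degradation degree of a key KPI and the confidence of the affected KPI, and the key KPI belongs to the affected KPIs.
  25. The network problem analysis apparatus according to any one of claims 15 to 24, wherein the network problem analysis result comprises any one or more of the following:
    the network problem, the abnormal network element on which the network problem occurs, the abnormal user affected by the network problem, the affected KPI of the abnormal user, the confidence of the affected KPI, or the degradation degree of the affected KPI, wherein the confidence of the affected KPI is a confidence that the affected KPI is affected by the network problem, and the degradation degree of the affected KPI indicates a degree of change of the affected KPI after the network problem occurs.
  26. The network problem analysis apparatus according to claims 15 to 25, wherein the degradation degree of the KPI comprises any one or more of the following:
    a difference between the statistic of the KPI after the network problem occurs and the statistic of the KPI before the network problem occurs,
    a ratio of the statistic of the KPI after the network problem occurs to the statistic of the KPI before the network problem occurs,
    a difference between the statistic of the KPI after the network problem occurs and a third threshold,
    or a ratio of the statistic of the KPI after the network problem occurs to the third threshold.
  27. The network problem analysis apparatus according to claim 26, wherein the statistic comprises any one or more of the following:
    mean, median, lower quantile, upper quantile, or a cumulative probability distribution over any interval.
  28. The network problem analysis apparatus according to any one of claims 15 to 27, wherein the network problem data information comprises any one or more of the following:
    alarm information of the network element, performance degradation information of the network element, or change information of the network element.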
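  29. A computing device, wherein the computing device comprises a processor and a memory;
    the processor of the computing device is configured to execute instructions stored in the memory of the computing device, so that the computing device performs the method according to claims 1 to 14.
  30. A computing device cluster, comprising at least one computing device, wherein each computing device comprises a processor and a memory;
    the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the computing device cluster performs the method according to claims 1 to 14.
  31. A computer program product comprising instructions, wherein when the instructions are run by a computing device cluster, the computing device cluster is caused to perform the method according to claims 1 to 14.
  32. A computer-readable storage medium, comprising computer program instructions, wherein when the computer program instructions are executed by a computing device cluster, the computing device cluster performs the method according to claims 1 to 14.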
PCT/CN2023/134285 2022-11-30 2023-11-27 Network problem analysis method and related devices WO2024114567A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211521732.6A CN118118317A (zh) 2022-11-30 2022-11-30 Network problem analysis method and related devices
CN202211521732.6 2022-11-30

Publications (1)

Publication Number Publication Date
WO2024114567A1 true WO2024114567A1 (zh) 2024-06-06

Family

ID=91211115

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/134285 WO2024114567A1 (zh) 2022-11-30 2023-11-27 Network problem analysis method and related devices

Country Status (2)

Country Link
CN (1) CN118118317A (zh)
WO (1) WO2024114567A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108848515A (zh) * 2018-05-31 2018-11-20 武汉虹信技术服务有限责任公司 一种基于大数据的物联网业务质量监测平台及方法
CN109951856A (zh) * 2017-12-20 2019-06-28 中国电信股份有限公司 网元状态的检测方法、装置以及计算机可读存储介质
US20210385670A1 (en) * 2018-10-02 2021-12-09 Cellwize Wireless Technologies Ltd. Method of controlling traffic in a cellular network and system thereof
WO2021250445A1 (en) * 2020-06-10 2021-12-16 Telefonaktiebolaget Lm Ericsson (Publ) Network performance assessment
CN114128226A (zh) * 2019-05-30 2022-03-01 三星电子株式会社 使用机器学习的根本原因分析和自动化

Also Published As

Publication number Publication date
CN118118317A (zh) 2024-05-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23896718

Country of ref document: EP

Kind code of ref document: A1