US20210160124A1 - Automatic Integrated Control Loop For Fault, Performance And Service Quality Management - Google Patents
Automatic Integrated Control Loop For Fault, Performance And Service Quality Management
- Publication number
- US20210160124A1
- Authority
- US
- United States
- Prior art keywords
- management data
- faults
- root cause
- network
- service
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0631—Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
- H04L41/065—Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving logical or physical relationship, e.g. grouping and hierarchies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0604—Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time
- H04L41/0613—Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time based on the type or category of the network elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0654—Management of faults, events, alarms or notifications using network fault recovery
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/50—Network service management, e.g. ensuring proper service fulfilment according to agreements
- H04L41/5003—Managing SLA; Interaction between SLA and QoS
- H04L41/5009—Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF]
Definitions
- This disclosure relates generally to adaptation of control loops, and more particularly, to a system and method for integrating fault, performance and service quality management in an integrated control loop.
- a failure may trigger an alert for performance issues and service quality issues, meaning multiple teams may be investigating a single failure.
- this can manifest itself in the failure of a cell site in the mobility Radio Access Network (RAN).
- Such a failure may cause traffic to be re-routed to neighboring cell sites which can cause unexpected congestion due to the increased load. The congestion may lead to service impacts including increased blocked and/or dropped call rates.
- the failure of a single cell site may generate an alert for that failure but also multiple downstream alerts for the congested cell sites without reference to the failed cell site. This is clearly inefficient and requires intelligent systems to relate, or correlate, these different symptoms of the underlying cell tower failure.
- the present disclosure is directed to a method comprising, collecting operational data from a system, segregating the data into a first component comprising one or more service quality anomalies and a second component comprising one or more network-based events, wherein the network-based events are one of a network-based fault or a degradation of network performance, correlating the first component and the second component to determine whether the one or more network-based events have an impact on service quality, and resolving the network-based events that have an impact on service quality.
- the method may further include prioritizing a resolution of the network-based events that have a greater impact on service quality.
- the analysis of the second component may include collecting network-based event data and correlating the network-based event data to determine a common cause of symptoms associated with the network-based event and wherein an analysis of the first component may include comparing service quality measurements to key performance indicators to determine the one or more service quality anomalies.
- the correlating step may annotate the network-based events with the service quality anomalies.
- the method may further include determining that a silent fault has occurred wherein the silent fault includes multiple symptoms and wherein the method further comprises analyzing the multiple symptoms to localize the silent fault based on the multiple symptoms.
- the silent fault may then be published for further analysis.
- the method may further include determining that a silent fault has not occurred and if a silent fault has not occurred, then a root cause of the service quality anomaly is determined.
- the present disclosure is also directed to a system including a data collector configured to collect service metrics and network metrics, a service anomaly detector in communication with the data collector wherein the service anomaly detector determines service level impacts, a network event processor in communication with the data collector, and a service impact analyzer configured to annotate network-based events with the service quality impacts.
- the system may also include a filter in communication with the service anomaly detector and the service impact analyzer, wherein the filter is configured to separate silent failures from other faults and wherein a silent failure has multiple symptoms and wherein the system further comprises a silent failure localization processor configured to correlate the multiple symptoms to assist in further resolution of the network-based events having service level impact.
- the network metrics include network fault data and network performance data.
- the present disclosure is also directed to an analyzer including a data collector configured to collect service metrics and network metrics wherein the analyzer is in communication with the data collector, the analyzer having an input-output interface, a processor coupled to the input-output interface wherein the processor is coupled to a memory, the memory having stored thereon executable instructions that when executed by the processor cause the processor to effectuate operations including segregating the data into a first component comprising the service metrics and a second component comprising the network metrics, analyzing the first component to identify at least one service impact anomaly, analyzing the second component to identify at least one network event, correlating the first component and the second component based on the analyzing step, and annotating the network event with the service impact anomaly.
- the operations may further include resolving the network event if the network event has an impact on service quality metrics.
- the operations further include prioritizing a resolution of the network event having a greater impact on the service quality metrics.
- the network event is one of a network-based fault or a degradation of network performance.
- the operations may further include determining that a silent fault has occurred wherein the silent fault comprises multiple symptoms and wherein the operations further comprise analyzing the multiple symptoms to localize the silent fault based on the multiple symptoms.
- the operations may further determine that a silent fault has not occurred and if a silent fault has not occurred, then a root cause of the service impact anomaly is determined.
- FIG. 1 is a representation of an exemplary functional integrated service development and delivery platform on which the systems and methods of the present disclosure may be implemented.
- FIG. 2 is a representation of an exemplary platform architecture which may implement the functionality of FIG. 1 .
- FIG. 3 is a representation of an exemplary architecture of the data collection and analytics events engine shown in FIG. 2 .
- FIG. 4 is an exemplary flowchart showing an aspect of the operations of a method in accordance with the present disclosure.
- FIG. 5 is an exemplary flowchart showing another aspect of the operations of a method in accordance with the present disclosure.
- the present disclosure is directed to a system and method to more effectively and efficiently manage network operations.
- the disclosure provides a holistic approach to detecting and isolating network and service issues, quantifying the customer impact of network issues and eliminating operational inefficiencies associated with service and network teams investigating common issues.
- the disclosure focuses on three aspects of network management.
- the first aspect may be fault management which may, for example, include alarms, traps, Syslog messages, and hard errors such as cut fiber optic cables, application services unexpectedly going offline, power failures and any other hard errors.
- the second aspect of network management may include performance management which may, for example, include measurements with respect to packet loss rates, latency, speed degradation and the like.
- the third aspect of network management may include service quality management which is directed to an end-to-end customer-perceived quality measurement. This third aspect may include, for example, quality issues such as video delivery in a lower resolution.
- the network operator may be able to prioritize the resolution of network issues. For example, an increased latency between routers in the network may trigger a network alarm, but it may not result in degradation of a service and thus have no impact on the customer experience. As such, that problem may be placed on a lower priority for resolution. Conversely, the larger the impact on the customer experience, the higher the priority for resolution. Moreover, the integration of these three network management functions may result in more efficient use of resources assigned to resolve a single issue that may manifest itself in two or more areas.
- the present disclosure provides an integrated control loop which is an integrated tool to combine the various aspects of network management.
- At the core of the disclosure is a data collection analytics and events (“DCAE”) engine.
- the DCAE engine will collect and analyze data and trigger certain actions based on the correlation of symptoms of any anomalies detected in the network. For example, anomalies impacting the customer experience may be detected by comparing performance metrics to key performance indicators (KPIs), while anomalies in the network may include symptoms such as alarms or traps.
- the DCAE engine may include a service impact analyzer which annotates all detected network events with the service indicators to identify which events may influence the customer experience.
- the service impact analyzer may identify the “where” and “how” a network event impacts a customer.
- a generic spatial-temporal event correlation engine may produce a flow of network events annotated with their service impact and a flow of silent service events, which may, for example, include service issues with no known associated network root causes. These flows can then be fed into various downstream systems for manual exploration and inspection, which may, for example, include SQM Explorer, ticket dispatch optimization, or closed-loop automation updates to policy engines and controllers.
- a run-time execution platform 10 shown in FIG. 1 which may, for example, be configured as an ECOMP platform such as the platform developed by AT&T, the assignee of the present application, and which may be considered the intelligence in a software defined network (SDN) that provides lifecycle management and control of software-centric network resources, infrastructure and services.
- ECOMP is but one example of a run-time execution platform 10 which may serve as the operating environment for the present disclosure.
- the run time execution platform 10 may be a reusable platform of software assets comprising the following functionality:
- SDC Service Design and Creation
- MSO Master Service Orchestration
- DCAE Data Collection, Analytics and Events
- A&AI Active and Available Inventory
- DN & Application Controller configure, monitor and maintain the health of network services and elements throughout their lifecycle.
- Application controllers configure, monitor and maintain the health of applications throughout their lifecycle.
- Portal Provides a common portal framework and UI architecture for creating and operating virtual infrastructure and services. Examples of capabilities include common look and feel, role administration, access control, SDK with plug-ins and web components.
- the system 9 may include a model-driven software architecture which makes services and operations available to users.
- the system 9 may include functionality to facilitate user development of software applications, products & solutions.
- the system may have a central platform 1 which may, for example, be an open source platform and may include machine learning and other analytics and provide the interactions of additional functionality relating to creating services and policies and operating network functionality. Interacting with the platform 1 may be an access portal 2 and a master service orchestrator 3 .
- the master service orchestrator 3 may manage the delivery, modification or removal of networks and services and provide cross-domain orchestration to optimize the utilization of resources or to take corrective action.
- the master services orchestrator 3 may interact with various applications to collect data to determine network parameters.
- the system 9 may include a service design and creation function 4 interacting with the platform 1 .
- the service design and creation function 4 may provide a rich catalog-driven environment to construct and maintain service and resource definitions, constraints and management processes and policies.
- Also shown in FIG. 1 is a set of controllers 5.
- the controllers 5 may include network controllers for instantiating, configuring, and managing the lifecycle of transport virtual network functions (VNFs) and network infrastructure, service/app controllers for instantiating, configuring and managing the lifecycle of service virtual functions (VFs), and infrastructure controllers for instantiating, configuring, and managing the lifecycle of infrastructure such as computational capacity and storage.
- the data collection analytics and event function 6 may collect data and events necessary to manage networks and make data available to real time applications.
- the data collection analytics and event function 6 may provide the framework for analytics applications to identify patterns and anomalies and to publish events to drive closed-loop control.
- an active and availability inventory function 7 which may provide a real-time topology map with views of virtual networks, services and applications.
- a policy function 8 which creates, maintains and modifies policies associated with system 9 .
- the system 9 may be configured to support the provision, operation and maintenance of a wireless communication network which may, for example, be 3G, 4G/LTE, 5G, or any other wireless communication system now known or to be developed.
- the wireless communication network may be configured in part or in whole using software defined network components.
- the system 9 may include a run-time execution platform 10 which may, for example, be an ECOMP platform or other platform to support edge computing.
- the run-time execution platform 10 may also be referred to simply as platform 10 herein.
- the platform 10 may be a single entity that provides management functions in a uniform manner to onboard, deliver and manage the lifecycle of SDN resources and services.
- the applications that use the platform 10 may interact with it via application program interfaces (APIs) 18 so applications need not have to know all inner components that comprise platform 10 and system 9 .
- the platform 10 may be accessible by a portal 12 which may, for example, be a server or other device that is operable to access and control the platform 10 and through which user or service provider generated models may be deployed on platform 10 .
- a user guide 14 may assist a user in creating models 16 for an application.
- the models 16, together with APIs 18, may then determine the resources desired for the business support systems (BSS) 2 which, in turn, may access the platform 10 for placing orders, monitoring usage and performing other accounting functions.
- the models 16 and APIs 18 may also be used to determine and request features, functionality and capacity of the operational support systems (OSS) 22 , which may, for example, include capacity, fallout, and key performance indicators (KPIs) and the like.
- the portal 12 may also provide access to best practices 11 for SDN networks with respect to on-boarding and configuring new applications 15 .
- users may access software development kits (SDKs) and virtual function on-boarding functions 13 to support models 16 .
- the platform 10 may control the instantiation of virtual network functions (VNFs) 17 , physical network functions (PNFs) 18 and virtual machines (VMs) 21 .
- the data collection analytics and events engine 30 may collect data and events necessary to manage and evolve networks and services.
- platform 110 which may, for example, be similar to platform 10 described above.
- platform 110 may be controlled and/or accessed by an integrated control loop 130 .
- the integrated control loop 130 may be a DCAE module and include a data collection module 132 from platform 110 .
- Outputs from platform 110 to DCAE engine 130 may include service metrics and network events and metrics.
- Service metrics may include, for example, service quality management (SQM) metrics, that is, those metrics that affect the customer experience.
- the data being collected may include both service metrics and network events and metrics.
- the data may include alarms, KPIs, service level quality measurements, and the like.
- the data collector 132 may parse the collected data into two or more components.
- one set of data may be related to end-to-end customer service measurement events and performance issues as they may relate to KPIs, shown as the top path in FIG. 3.
- the other set of data shown as the lower path in FIG. 3 , may be related to network performance in view of faults, alarms, Syslogs, change activities, and other network events, along with performance-impacting events such as latency between servers or data losses.
- the lower path of the data flow may detect and correlate network events and then quantify the customer service impact of the correlated network events.
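The split of collected data into a service path and a network path may be sketched as follows. This is a minimal illustration only: the `Record` type, the `kind` labels and the `segregate` function are hypothetical names chosen for the example, since the disclosure names the categories of data but does not define a schema.

```python
from dataclasses import dataclass

# Hypothetical record kinds (assumptions for illustration, not from the disclosure).
SERVICE_KINDS = {"service_kpi", "service_quality_measurement"}
NETWORK_KINDS = {"alarm", "trap", "syslog", "change_activity", "latency", "packet_loss"}


@dataclass(frozen=True)
class Record:
    kind: str          # category of the collected datum
    source: str        # reporting element, e.g. a cell site or router
    timestamp: float   # seconds since an arbitrary epoch
    value: float = 0.0


def segregate(records):
    """Split collected records into (service_path, network_path) lists."""
    service, network = [], []
    for record in records:
        if record.kind in SERVICE_KINDS:
            service.append(record)
        elif record.kind in NETWORK_KINDS:
            network.append(record)
        # Records of unknown kind are ignored in this sketch.
    return service, network
```

A real collector would likely retain or log unrecognized records rather than drop them; that choice is outside what the text describes.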
- a service anomaly detection function 134 which analyzes the service quality measurements captured by the data collector module 132 .
- the service anomaly detection function may, for example, be performed with assistance by a system provided by Argus Controls. This function may compare quality of service measurements to key performance indicators such as response time, quality of service measurements, resolution of audio and video streams, download and upload speeds, and other data that may impact the customer experience.
- the detected anomalies are then passed to the filtering silent failures function 142 discussed in more detail below and may also be fed into a service impact.
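One simple way to compare service quality measurements to key performance indicators is a per-KPI threshold check, sketched below. The `KpiTarget` type, the KPI names and the limits are assumptions made for illustration; the disclosure does not specify how the comparison is performed, and this sketch is not a description of any vendor's detection system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiTarget:
    name: str
    limit: float
    higher_is_worse: bool = True  # e.g. response time; False for throughput


def detect_service_anomalies(measurements, targets):
    """Return the names of KPIs whose measured value violates its limit.

    measurements: dict mapping KPI name to measured value.
    """
    anomalies = []
    for target in targets:
        value = measurements.get(target.name)
        if value is None:
            continue  # no measurement collected for this KPI
        if target.higher_is_worse:
            violated = value > target.limit
        else:
            violated = value < target.limit
        if violated:
            anomalies.append(target.name)
    return anomalies
```

For example, with a hypothetical 200 ms response-time limit and a 25 Mbps download floor, a measurement of 350 ms and 40 Mbps would flag only the response time.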
- the data acquisition analytics and event engine 130 may include an event processing function 136 .
- the event processing function may filter the collected event data to detect the anomalies generated by the network.
- the detected network anomalies may then be passed to the network event correlation function 138, which correlates the detected anomalies based on the type of anomaly. These may, for example, be event-based anomalies, such as when a detected network event creates two or more alarms that may be caused by the same fault.
- Such faults may be correlated with other network event indicators such as Syslogs, planned maintenance outages and performance-based network events such as packet loss rates, latency, speed degradation and the like.
- the network-based and performance-based events may also be correlated based on geography, network element types, time of day, loading, or any other factor.
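Correlation of network events by geography and time may be illustrated by grouping events that occur in the same region within a short interval of one another. The `NetworkEvent` type, the `region` field and the 300-second window are hypothetical; the disclosure lists geography and time of day among possible correlation factors but does not prescribe a grouping rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkEvent:
    event_id: str
    region: str
    timestamp: float  # seconds


def correlate_events(events, window_s=300.0):
    """Group events from the same region that occur close together in time.

    An event joins the open group for its region when it falls within
    window_s of the previous event in that group; otherwise it starts
    a new group.
    """
    groups = []
    open_group = {}  # region -> most recent group for that region
    for event in sorted(events, key=lambda e: (e.region, e.timestamp)):
        group = open_group.get(event.region)
        if group and event.timestamp - group[-1].timestamp <= window_s:
            group.append(event)
        else:
            group = [event]
            groups.append(group)
            open_group[event.region] = group
    return groups
```

Each resulting group is a candidate set of symptoms sharing a common cause; additional keys such as network element type or loading could be added to the grouping in the same way.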
- the output of the event correlation function 138 may be fed into a service impact analyzer 140 .
- the service impact analyzer 140 may also receive service inputs from the platform directly in the form of service paths and service KPIs.
- the service impact analyzer 140 may quantify the impact that various network events may have on the customer experience.
- the service impact analyzer 140 may include programs that track planned network outages for maintenance and upgrades, a cell tower outage network analyzer, and inputs from other network systems.
- the service impact analyzer 140 may correlate network-based events with service quality issues that are highlighted by the service KPIs or other issues affecting the service path.
- the service impact analyzer 140 may correlate a network event whereby an individual cell tower goes off-line to create an outage and generate an alarm condition with service quality issues being generated by neighboring cell towers whose KPIs indicate excessive loading, higher blocked or dropped call rates or other issues that affect service quality.
- a single fault in the network which would otherwise generate multiple trouble tickets for a plurality of cell sites can now be detected and addressed with a single trouble ticket with the resolution prioritized based on the severity of the customer impact.
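The cell tower example may be sketched as a function that attaches the KPI anomalies of neighboring cells to the outage event, yielding one ticket instead of several. The neighbor map, the KPI names and the `impact_score` (a simple count of violated KPIs) are assumptions for illustration; the disclosure does not define how severity is quantified.

```python
def annotate_with_service_impact(outage_cell, neighbors, kpi_anomalies):
    """Build a single ticket for an outage, annotated with neighbor impacts.

    neighbors: dict mapping a cell to its adjacent cells (hypothetical topology input).
    kpi_anomalies: dict mapping a cell to the list of KPI names it violates.
    """
    impacted = {
        cell: kpi_anomalies[cell]
        for cell in neighbors.get(outage_cell, [])
        if kpi_anomalies.get(cell)
    }
    return {
        "root_event": f"outage:{outage_cell}",
        "impacted_cells": impacted,
        # Crude severity measure: total number of violated KPIs on neighbors.
        "impact_score": sum(len(kpis) for kpis in impacted.values()),
    }
```

Anomalies on cells that are not neighbors of the failed site are left out of the ticket and would be handled separately.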
- a filtering silent failures function 142 which may be a spatial-temporal event correlation engine to produce two data flows, namely a flow of network-based events annotated with their respective service level impacts and a flow of silent service events which are defined as service issues with no known associated network root cause.
- the filtering silent failures function 142 thus generates two outputs, one output being those events whose root causes can be explained and another output being those events whose root causes cannot be explained.
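The two output flows may be illustrated with a simple spatial-temporal match: a service anomaly is attached to every network event in the same region within a time window, and anomalies with no match are emitted as silent service events. The dictionary fields (`id`, `region`, `timestamp`) and the 600-second window are hypothetical choices for this sketch.

```python
def filter_silent_failures(service_anomalies, network_events, window_s=600.0):
    """Return (annotated_network_events, silent_service_events).

    Both inputs are lists of dicts with 'id', 'region' and 'timestamp' keys.
    """
    # Copy each event so the caller's data is not modified.
    annotated = [{**event, "service_impacts": []} for event in network_events]
    silent = []
    for anomaly in service_anomalies:
        matched = False
        for event in annotated:
            same_region = event["region"] == anomaly["region"]
            close_in_time = abs(event["timestamp"] - anomaly["timestamp"]) <= window_s
            if same_region and close_in_time:
                event["service_impacts"].append(anomaly["id"])
                matched = True
        if not matched:
            silent.append(anomaly)  # no known associated network root cause
    return annotated, silent
```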
- the cause of the anomaly and the priority of resolution may be sent to a portal 112 where an operator may generate a trouble ticket.
- this data is fed back into platform 110 through the integrated control loop whereby the event that triggered the anomaly may be resolved automatically.
- the root cause of the problem may now be investigated by one maintenance team instead of two or three maintenance teams.
- the silent service events are then fed into a silent failure localization processor 144 which may narrow the potential causes of the service anomalies.
- the silent failure localization function 144 may be used when a cause of the service anomaly cannot be determined.
- the silent failure localization processor 144 may correlate similar issues and summarize the commonalities among the problem set. For example, there may be commonality with respect to gaming which may be localized to a particular video game such as Madden NFL 18 . There may be a commonality with respect to geography, such as Atlanta, Ga. or South Bend, Ind. There may be commonality with respect to certain types of customer equipment such as a Samsung Galaxy® handset or an Apple iPhone®.
- the silent failure localization processor 144 may attempt to classify and correlate as many common characteristics as possible to assist with the further analysis and resolution of the anomalies. This correlated silent failure data may then be provided through portal 112 to a system operator for further investigation as well as being fed back as a control loop to platform 110 .
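Summarizing commonalities among silent failures may be sketched by counting attribute values across the problem set and reporting those shared by most events. The attribute names and the 80% share threshold are assumptions for illustration; the disclosure gives application, geography and customer equipment as examples of attributes but does not state a threshold.

```python
from collections import Counter


def summarize_commonalities(silent_events, attributes, min_share=0.8):
    """Return {attribute: value} for values shared by at least min_share of events.

    silent_events: list of dicts describing each silent service event.
    """
    common = {}
    total = len(silent_events)
    if total == 0:
        return common
    for attribute in attributes:
        counts = Counter(ev[attribute] for ev in silent_events if attribute in ev)
        if not counts:
            continue
        value, count = counts.most_common(1)[0]
        if count / total >= min_share:
            common[attribute] = value
    return common
```

An attribute that varies widely across the events, such as a mix of handset models, is omitted from the summary, which narrows the range of potential causes presented to the operator.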
- Those outputs may be input to data collectors 132 in the DCAE engine 130.
- Data collectors 132 may perform initial analysis and segregation with respect to the types of data.
- Data associated with SQM may be passed to a service anomaly detection function 134 which may, for example, be performed by software programs created by Argus Control Systems Ltd.
- In FIG. 4 there is shown a flow chart beginning with a data collection function at 41.
- data is segregated as previously described with respect to service quality and network events.
- network events such as alarms, outages, and the like are detected while at 44 , service quality anomalies such as customer-impacting performance issues are detected.
- the network events and customer-impacting service quality issues may be correlated, and the network events are annotated with the service quality issues.
- the network-based events are analyzed to determine whether those network-based events are impacting customer service quality measurements. If not, then resolution of those network-based events is set to a lower priority at 47 . If the network-based events do impact customer service quality measurements, then the resolution of those network-based issues is set to a higher priority.
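The priority decision of FIG. 4 may be sketched as follows, assuming each network event already carries a list of the service quality issues it was annotated with. The `service_impacts` field and the use of its length as the measure of impact are hypothetical; the disclosure states only that events with greater service impact receive higher priority.

```python
def prioritize(annotated_events):
    """Order events by service impact and label each 'high' or 'low' priority.

    annotated_events: list of dicts with 'id' and 'service_impacts' (a list).
    Returns a list of (event_id, priority) tuples, most impactful first.
    """
    ranked = sorted(
        annotated_events,
        key=lambda event: len(event["service_impacts"]),
        reverse=True,
    )
    return [
        (event["id"], "high" if event["service_impacts"] else "low")
        for event in ranked
    ]
```

Under this sketch, a latency alarm with no associated service quality issue is placed last with low priority, while an outage linked to several issues is placed first.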
- the flow chart in FIG. 5 shows the data flow starting with the data analytics function 51 as previously described.
- the analytics function results are filtered based on whether the root cause of the anomaly has been resolved. If yes, the cause of the anomaly may be published through the portal and resolution processes initiated, either by issuing trouble tickets or through automatic error correction methodologies. If the root cause of the anomaly is not determined at 52 , i.e., the fault is a silent fault, then common symptoms of the fault are determined at 54 and based on those commonalities, a narrower range of possible root causes is determined at 55 and published to the portal for further analysis and trouble shooting. In both cases, the feedback loop to the platform is completed at 56 .
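The branching of FIG. 5 may be sketched as a small dispatcher. The action names, the `root_cause` field and the `localize` callback are hypothetical placeholders for the portal, resolution and localization steps; only the order of the steps is taken from the description.

```python
def handle_anomaly(anomaly, localize):
    """Return the ordered actions for one anomaly, following the FIG. 5 flow.

    anomaly: dict with an 'id' and, when determined, a 'root_cause'.
    localize: callable that returns candidate causes for a silent fault.
    """
    actions = []
    if anomaly.get("root_cause"):
        # Root cause determined: publish it and start resolution.
        actions.append(("publish", anomaly["root_cause"]))
        actions.append(("resolve", anomaly["root_cause"]))
    else:
        # Silent fault: narrow the possible causes and publish them.
        candidates = localize(anomaly)
        actions.append(("publish_candidates", tuple(candidates)))
    # In both cases the loop back to the platform is completed.
    actions.append(("feedback_to_platform", anomaly["id"]))
    return actions
```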
- the present disclosure has been described with respect to an integrated control loop 130 applied to a platform 110 whereby integrated control loop may be agnostic with respect to the services offered.
- the systems and methods of the present invention may work for any platform whereby certain events may or may not lead to service quality issues affecting customers or other operations.
- Other applications may include, but are not limited to, autonomous vehicles, transportation, factory management systems, home automation, virtual reality, augmented reality, and computer gaming, just to name a few.
- the present disclosure provides an integrated control loop which is able to operate independently of any particular service and thus is available across multiple service offerings, including but not limited to mobility telecommunications functionality, Voice over LTE (VoLTE) and video services. Enabling an integrated control loop on a platform as described is able to reduce alarm fatigue, improve operational efficiency by eliminating duplication of effort in troubleshooting common issues among network and service operations teams, and quantify the impact of network events to prioritize resolution activities.
- the integrated control loop is able to scale the analytics across both physical and network functions and services.
- the architecture allows operators and service designers, including customers and contractors, to create control loop policies to address the needs of new or modified services and establish KPIs more efficiently.
- the disclosure permits the use of a data collection analytics event engine to provide an integrated control loop for an operational platform.
- the systems and methods of the disclosure are directed to solving a practical problem by providing a real world and practical solution that removes or minimizes duplication of effort for network and service maintenance and upgrade applications while allowing customers and users to develop their own policies to control and gain additional insight into the fault detection and service degradation functionality.
- the system and methods help customer service personnel understand and explain degradation of service levels and, by isolating or narrowing the scope of the root causes of such degradation, better predict the time before service is restored to normal levels while prioritizing the resolution of those problems that affect the customer experience.
- the present disclosure allows explicit incorporation of changing policies and KPIs to vary the control of the integrated control loop process and thus provide valuable additional capabilities of monitoring and maintaining platforms 110 or other systems employing integrated control loops and associated algorithms.
- While examples of systems and methods for creating and applying integrated control loops for an operational platform have been described in connection with various computing devices/processors, the underlying concepts may be applied to any computing device, processor, or system capable of facilitating or hosting an integrated control loop system, including cloud-based systems, proprietary systems or open source systems in which the framework of creating an integrated automatic control loop with data collections and analytics engine solutions may be implemented.
- the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination of both.
- the methods and devices may take the form of program code (i.e., instructions) embodied in concrete, tangible, storage media having a concrete, tangible, physical structure.
- tangible storage media examples include floppy diskettes, Compact Disc-Read-Only Memory devices (CD-ROMs), Digital Versatile Discs or Digital Video Discs (DVDs), hard drives, or any other tangible machine-readable storage medium (computer-readable storage medium).
- a computer-readable storage medium is not a signal.
- a computer-readable storage medium is not a transient signal.
- a computer-readable storage medium is not a propagating signal.
- a computer-readable storage medium as described herein is an article of manufacture. When the program code is loaded into and executed by a machine, such as a computer, the machine becomes a device for telecommunications.
- the computing device will generally include a processor, a storage medium readable by the processor (including volatile or nonvolatile memory or storage elements), at least one input device, and at least one output device.
- the program(s) can be implemented in assembly or machine language, if desired.
- the language can be a compiled or interpreted language and may be combined with hardware implementations.
- the methods and devices associated with a system as described herein also may be practiced via communications embodied in the form of program code that is transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via any other form of transmission, over the air (OTA), or firmware over the air (FOTA), wherein, when the program code is received and loaded into and executed by a machine, such as an Erasable Programmable Read-Only Memory (EPROM), a gate array, a programmable logic device (PLD), a client computer, or the like, the machine becomes a device for implementing telecommunications as described herein.
- When implemented on a general-purpose processor, the program code combines with the processor to provide a unique device that operates to invoke the functionality of a telecommunications system.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
Description
- This application is a continuation of, and claims priority to, U.S. patent application Ser. No. 16/399,750 filed Apr. 30, 2019, entitled “Automatic integrated Control Loop For Fault, Performance And Service Quality Management,” the entire contents of which are hereby incorporated herein by reference.
- This disclosure relates generally to adaptation of control loops, and more particularly, to a system and method for integrating fault, performance and service quality management in an integrated control loop.
- In run-time platform environments, fault, performance and service quality have traditionally been managed in separate silos. This is true with respect to both systems and operations organizations. Such a separation of functionality leads to inherent inefficiencies. For example, a single failure may trigger alerts for both performance issues and service quality issues, meaning multiple teams may be investigating the same failure. In the field of telecommunications, this can manifest itself in the failure of a cell site in the mobility Radio Access Network (RAN). Such a failure may cause traffic to be re-routed to neighboring cell sites, which can cause unexpected congestion due to the increased load. The congestion may lead to service impacts including increased blocked and/or dropped call rates. Thus, the failure of a single cell site may generate an alert for that failure but also multiple downstream alerts for the congested cell sites without reference to the failed cell site. This is clearly inefficient and requires intelligent systems to relate, or correlate, these different symptoms of the underlying cell tower failure.
- Accordingly, there is a need for an intelligent system and method to correlate the different symptoms to focus on the root cause of the symptoms.
- The present disclosure is directed to a method comprising, collecting operational data from a system, segregating the data into a first component comprising one or more service quality anomalies and a second component comprising one or more network-based events, wherein the network-based events are one of a network-based fault or a degradation of network performance, correlating the first component and the second component to determine whether the one or more network-based events have an impact on service quality, and resolving the network-based events that have an impact on service quality. The method may further include prioritizing a resolution of the network-based events that have a greater impact on service quality. In an aspect, the analysis of the second component may include collecting network-based event data and correlating the network-based event data to determine a common cause of symptoms associated with the network-based event and wherein an analysis of the first component may include comparing service quality measurements to key performance indicators to determine the one or more service quality anomalies. In an aspect, the correlating step may annotate the network-based events with the service quality anomalies.
- In an aspect, the method may further include determining that a silent fault has occurred wherein the silent fault includes multiple symptoms and wherein the method further comprises analyzing the multiple symptoms to localize the silent fault based on the multiple symptoms. The silent fault may then be published for further analysis. The method may further include determining that a silent fault has not occurred and if a silent fault has not occurred, then a root cause of the service quality anomaly is determined.
- The present disclosure is also directed to a system including a data collector configured to collect service metrics and network metrics, a service anomaly detector in communication with the data collector wherein the service anomaly detector determines service level impacts, a network event processor in communication with the data collector, and a service impact analyzer configured to annotate network-based events with the service quality impacts. The system may also include a filter in communication with the service anomaly detector and the service impact analyzer, wherein the filter is configured to separate silent failures from other faults and wherein a silent failure has multiple symptoms and wherein the system further comprises a silent failure localization processor configured to correlate the multiple symptoms to assist in further resolution of the network-based events having service level impact. In an aspect, the network metrics include network fault data and network performance data.
- The present disclosure is also directed to an analyzer including a data collector configured to collect service metrics and network metrics wherein the analyzer is in communication with the data collector, the analyzer having an input-output interface, a processor coupled to the input-output interface wherein the processor is coupled to a memory, the memory having stored thereon executable instructions that when executed by the processor cause the processor to effectuate operations including segregating the data into a first component comprising the service metrics and a second component comprising the network metrics, analyzing the first component to identify at least one service impact anomaly, analyzing the second component to identify at least one network event, correlating the first component and the second component based on the analyzing step, and annotating the network event with the service impact anomaly. The operations may further include resolving the network event if the network event has an impact on service quality metrics. In an aspect, there may be a plurality of network events and wherein the operations further include prioritizing a resolution of the network event having a greater impact on the service quality metrics. In an aspect, the network event is one of a network-based fault or a degradation of network performance. The operations may further include determining that a silent fault has occurred wherein the silent fault comprises multiple symptoms and wherein the operations further comprise analyzing the multiple symptoms to localize the silent fault based on the multiple symptoms. In an aspect, the operations may further determine that a silent fault has not occurred and if a silent fault has not occurred, then a root cause of the service impact anomaly is determined.
- In the following description, for purposes of explanation, numerous specific details are set forth in order to provide an understanding of the variations in implementing the disclosed technology. However, the instant disclosure may take many different forms and should not be construed as limited to the examples set forth herein. Where practical, like numbers refer to like elements throughout.
- FIG. 1 is a representation of an exemplary functional integrated service development and delivery platform on which the systems and methods of the present disclosure may be implemented.
- FIG. 2 is a representation of an exemplary platform architecture which may implement the functionality of FIG. 1.
- FIG. 3 is a representation of an exemplary architecture of the data collection and analytics events engine shown in FIG. 2.
- FIG. 4 is an exemplary flowchart showing an aspect of the operations of a method in accordance with the present disclosure.
- FIG. 5 is an exemplary flowchart showing another aspect of the operations of a method in accordance with the present disclosure.
- Overview. As detailed herein, the present disclosure is directed to a system and method to more effectively and efficiently manage network operations. The disclosure provides a holistic approach to detecting and isolating network and service issues, quantifying the customer impact of network issues and eliminating operational inefficiencies associated with service and network teams investigating common issues.
- The disclosure focuses on three aspects of network management. The first aspect may be fault management which may, for example, include alarms, traps, Syslog messages, and hard errors such as cut fiber optic cables, application services unexpectedly going offline, power failures and any other hard errors. The second aspect of network management may include performance management which may, for example, include measurements with respect to packet loss rates, latency, speed degradation and the like. The third aspect of network management may include service quality management which is directed to an end-to-end customer-perceived quality measurement. This third aspect may include, for example, quality issues such as video delivery in a lower resolution.
- By integrating these three aspects of network management, the network operator may be able to prioritize the resolution of network issues. For example, an increased latency between routers in the network may trigger a network alarm, but it may not result in degradation of a service and thus have no impact on the customer experience. As such, that problem may be placed on a lower priority for resolution. Conversely, the larger the impact on the customer experience, the higher the priority for resolution. Moreover, the integration of these three network management functions may result in more efficient use of resources assigned to resolve a single issue that may manifest itself in two or more areas.
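- As a minimal sketch of the prioritization described above, the following Python example orders network events by their customer impact. The `NetworkEvent` class, the `impacted_customers` field and the sample values are hypothetical and chosen only for illustration; the disclosure does not specify how impact is measured.

```python
from dataclasses import dataclass


@dataclass
class NetworkEvent:
    """Hypothetical record for a detected network event."""
    event_id: str
    description: str
    impacted_customers: int  # assumed stand-in for the service impact


def prioritize(events):
    """Return events ordered so that greater customer impact comes first."""
    return sorted(events, key=lambda e: e.impacted_customers, reverse=True)


events = [
    NetworkEvent("E1", "increased router latency, no service degradation", 0),
    NetworkEvent("E2", "cell site failure, dropped calls on neighbors", 1200),
    NetworkEvent("E3", "packet loss, lower video resolution", 300),
]

ordered = prioritize(events)
for event in ordered:
    print(event.event_id, event.impacted_customers)
```

- In this sketch the latency alarm with no service impact falls to the bottom of the queue, mirroring the lower-priority treatment described in the text.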
- The present disclosure provides an integrated control loop which is an integrated tool to combine the various aspects of network management. At the core of the disclosure is a data collection analytics and events (“DCAE”) engine. The DCAE engine will collect and analyze data and trigger certain actions based on the correlation of symptoms of any anomalies detected in the network. For example, anomalies impacting the customer experience may be detected by comparing performance metrics to key performance indicators (KPIs), while anomalies in the network may include symptoms such as alarms or traps.
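- The comparison of performance metrics to KPIs can be illustrated with a simple threshold check. This is only a sketch under stated assumptions: the metric names, the limits and the `("max" | "min", limit)` encoding are hypothetical, and an actual service anomaly detector may use far more elaborate statistics.

```python
def detect_service_anomalies(measurements, kpi_limits):
    """Return (metric, value, limit) for each measurement that violates its KPI.

    kpi_limits maps a metric name to (direction, limit): "max" means the
    value must not exceed the limit, "min" means it must not fall below it.
    """
    anomalies = []
    for metric, value in measurements.items():
        if metric not in kpi_limits:
            continue  # no KPI defined for this metric
        direction, limit = kpi_limits[metric]
        violated = value > limit if direction == "max" else value < limit
        if violated:
            anomalies.append((metric, value, limit))
    return anomalies


# Hypothetical KPI targets and one set of collected measurements.
kpi_limits = {
    "dropped_call_rate": ("max", 0.02),
    "response_time_ms": ("max", 250),
    "download_mbps": ("min", 10.0),
}
measurements = {
    "dropped_call_rate": 0.05,
    "response_time_ms": 180,
    "download_mbps": 4.2,
}

anomalies = detect_service_anomalies(measurements, kpi_limits)
print(anomalies)
```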
- The DCAE engine may include a service impact analyzer which annotates all detected network events with the service indicators to identify which events may influence the customer experience. In other words, the service impact analyzer may identify the “where” and “how” a network event impacts a customer. A generic spatial-temporal event correlation engine may produce a flow of network events annotated with their service impact and a flow of silent service events, which may, for example, include service issues with no known associated network root causes. These flows can then be fed into various downstream systems for manual exploration and inspection, which may, for example, include SQM Explorer, ticket dispatch optimization, or closed-loop automation updates to policy engines and controllers.
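- One way to picture the two flows described above is a toy spatial-temporal correlation in which a service anomaly is attached to every network event at the same site within a time window, and anomalies with no match become silent service events. The class names, the `site` and `time_s` fields and the 300-second window are assumptions made for this sketch, not details taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NetworkEvent:
    name: str
    site: str
    time_s: float
    service_impacts: List[str] = field(default_factory=list)


@dataclass
class ServiceAnomaly:
    name: str
    site: str
    time_s: float


def annotate_events(events, anomalies, window_s=300.0):
    """Annotate events with co-located, near-in-time anomalies.

    Returns (events, silent), where silent holds the anomalies that could
    not be associated with any network event.
    """
    silent = []
    for anomaly in anomalies:
        matched = False
        for event in events:
            same_place = event.site == anomaly.site
            close_in_time = abs(event.time_s - anomaly.time_s) <= window_s
            if same_place and close_in_time:
                event.service_impacts.append(anomaly.name)
                matched = True
        if not matched:
            silent.append(anomaly)
    return events, silent


events = [
    NetworkEvent("fiber_cut", "site-A", 100.0),
    NetworkEvent("router_latency_alarm", "site-B", 50.0),
]
anomalies = [
    ServiceAnomaly("low_video_resolution", "site-A", 160.0),
    ServiceAnomaly("dropped_calls", "site-C", 200.0),
]

annotated, silent = annotate_events(events, anomalies)
```

- Here the fiber cut is annotated with the video quality anomaly, the latency alarm carries no service impact, and the dropped-call anomaly is left as a silent service event for further localization.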
- System Architecture. The present disclosure is applicable across multiple industries and with various configurations. For the purposes of this disclosure, an exemplary but non-limiting software defined network telecommunications system utilizing virtual network functions will be used.
- Generally, a run-time execution platform 10 is shown in FIG. 1. The platform may, for example, be configured as an ECOMP platform such as the platform developed by AT&T, the assignee of the present application, and may be considered the intelligence in a software defined network (SDN) that provides lifecycle management and control of software-centric network resources, infrastructure and services. It will be understood that ECOMP is but one example of a run-time execution platform 10 which may serve as the operating environment for the present disclosure. The run-time execution platform 10 may be a reusable platform of software assets comprising the following functionality:
- Service Design and Creation (SDC): Provides a well-structured organization of visual design & simulation tools, templates and catalogs to model and create resources, services and products;
- Master Service Orchestration (MSO): Arranges, sequences and implements tasks based on policies and rules/recipes to coordinate the creation, modification or removal of logical and physical resources in the managed environment;
- Data Collection, Analytics and Events (DCAE): Provides real-time fault, performance, event and other data collection, correlation & analysis to manage service, network and infrastructure health and support closed loop automation;
- Policy: Modifiable rules, assertions and/or conditions to enable real-time decision making on corrective actions and configuration changes in the software-centric network ecosystem;
- Active and Available Inventory (A&AI): Tracks the dynamic relationships and status of virtualized networks, services & resources to maintain ECOMP platform integrity and the global inventory it manages;
- SDN & Application Controller: SDN controllers configure, monitor and maintain the health of network services and elements throughout their lifecycle. Application controllers configure, monitor and maintain the health of applications throughout their lifecycle; and
- Portal: Provides a common portal framework and UI architecture for creating and operating virtual infrastructure and services. Examples of capabilities include common look and feel, role administration, access control, SDK with plug-ins and web components.
- While the present disclosure describes the automatic variable automation of control loops in the context of run-time execution platform policies, the system and method described herein, and the claims appended hereto, apply to the variation of automated control loops generally.
- With reference to FIG. 1, there is shown a conceptual diagram of a system 9. The system 9 may include a model-driven software architecture which makes services and operations available to users. The system 9 may include functionality to facilitate user development of software applications, products & solutions. The system may have a central platform 1 which may, for example, be an open source platform and may include machine learning and other analytics and provide the interactions of additional functionality relating to creating services and policies and operating network functionality. Interacting with the platform 1 may be an access portal 2 and a master service orchestrator 3. The master service orchestrator 3 may manage the delivery, modification or removal of networks and services and provide cross-domain orchestration to optimize the utilization of resources or to take corrective action. The master service orchestrator 3 may interact with various applications to collect data to determine network parameters.
- The system 9 may include a service design and creation function 4 interacting with the platform 1. The service design and creation function 4 may provide a rich catalog-driven environment to construct and maintain service and resource definitions, constraints and management processes and policies. Also shown in FIG. 1 is a set of controllers 5. The controllers 5 may include network controllers for instantiating, configuring, and managing the lifecycle of transport virtual network functions (VNFs) and network infrastructure, service/app controllers for instantiating, configuring and managing the lifecycle of service virtual functions (VFs), and infrastructure controllers for instantiating, configuring, and managing the lifecycle of infrastructure such as computational capacity and storage.
- There is also shown a data collection analytics and events function 6. The data collection analytics and events function 6 may collect data and events necessary to manage networks and make data available to real time applications. The data collection analytics and events function 6 may provide the framework for analytics applications to identify patterns and anomalies and to publish events to drive closed-loop control. Also shown is an active and available inventory function 7 which may provide a real-time topology map with views of virtual networks, services and applications. Finally, there is shown a policy function 8 which creates, maintains and modifies policies associated with system 9.
- With reference to FIG. 2, there is shown an exemplary architecture in which system 9 may operate. The system 9 may be configured to support the provision, operation and maintenance of a wireless communication network which may, for example, be 3G, 4G/LTE, 5G, or any other wireless communication system now known or to be developed. The wireless communication network may be configured in part or in whole using software defined network components. The system 9 may include a run-time execution platform 10 which may, for example, be an ECOMP platform or other platform to support edge computing. The run-time execution platform 10 may also be referred to simply as platform 10 herein. The platform 10 may be a single entity that provides management functions in a uniform manner to onboard, deliver and manage the lifecycle of SDN resources and services. The applications that use the platform 10 may interact with it via application program interfaces (APIs) 18 so applications need not know all of the inner components that comprise platform 10 and system 9.
- The platform 10 may be accessible by a portal 12 which may, for example, be a server or other device that is operable to access and control the platform 10 and through which user or service provider generated models may be deployed on platform 10. A user guide 14 may assist a user in creating models 16 for an application. The models 16, together with APIs 18, may then determine the resources desired for the business support systems (BSS) 2 which, in turn, may access the platform 10 for placing orders, monitoring usage and performing other accounting functions. The models 16 and APIs 18 may also be used to determine and request features, functionality and capacity of the operational support systems (OSS) 22, which may, for example, include capacity, fallout, and key performance indicators (KPIs) and the like.
- The portal 12 may also provide access to best practices 11 for SDN networks with respect to on-boarding and configuring new applications 15. Through the portal 12, users may access software development kits (SDKs) and virtual function on-boarding functions 13 to support models 16. From there, the platform 10 may control the instantiation of virtual network functions (VNFs) 17, physical network functions (PNFs) 18 and virtual machines (VMs) 21.
- There is also shown a data collection analytics and events engine 30 that is in communication with platform 10. The data collection analytics and events engine 30, described in more detail below, may collect data and events necessary to manage and evolve networks and services.
- With reference to FIG. 3, there is shown platform 110 which may, for example, be similar to platform 10 described above. In an aspect, platform 110 may be controlled and/or accessed by an integrated control loop 130. The integrated control loop 130 may be a DCAE module and include a data collection module 132 that receives data from platform 110. Outputs from platform 110 to DCAE engine 130 may include service metrics and network events and metrics. Service metrics may include, for example, metrics relating to service quality management (SQM), i.e., those metrics that affect the customer experience. As such, the data being collected may include both service metrics and network events and metrics. The data may include alarms, KPIs, service level quality measurements, and the like.
- In an aspect, the data collector 132 may parse the collected data into two or more components. At a high level, one set of data may be related to end-to-end customer service measurement events and performance issues as they may relate to KPIs, shown as the top path in FIG. 3. The other set of data, shown as the lower path in FIG. 3, may be related to network performance in view of faults, alarms, Syslogs, change activities, and other network events, along with performance-impacting events such as latency between servers or data losses. As will be described below, the lower path of the data flow may detect and correlate network events and then quantify the customer service impact of the correlated network events.
- Following the top path, there is shown a service anomaly detection function 134 which analyzes the service quality measurements captured by the data collector module 132. The service anomaly detection function may, for example, be performed with assistance by a system provided by Argus Controls. This function may compare quality of service measurements to key performance indicators such as response time, quality of service measurements, resolution of audio and video streams, download and upload speeds, and other data that may impact the customer experience. The detected anomalies are then passed to the filtering silent failures function 142 discussed in more detail below and may also be fed into the service impact analyzer 140 described below.
- The data collection analytics and events engine 130 may include an event processing function 136. The event processing function may filter the collected event data to detect the anomalies generated by the network. The detected network anomalies may then be passed to the network event correlation function 138, which correlates the detected anomalies based on the type of anomaly. These may be, for example, event-based anomalies in which a detected network event creates two or more alarms that may be caused by the same fault. Such faults may be correlated with other network event indicators such as Syslogs, planned maintenance outages and performance-based network events such as packet loss rates, latency, speed degradation and the like. The network-based and performance-based events may also be correlated based on geography, network element types, time of day, loading, or any other factor.
- The output of the event correlation function 138 may be fed into a service impact analyzer 140. The service impact analyzer 140 may also receive service inputs from the platform directly in the form of service paths and service KPIs. The service impact analyzer 140 may quantify the impact that various network events may have on the customer experience. The service impact analyzer 140 may include programs that track planned network outages for maintenance and upgrades, a cell tower outage network analyzer, and inputs from other network systems. In an aspect, the service impact analyzer 140 may correlate network-based events with service quality issues that are highlighted by the service KPIs or other issues affecting the service path. For example, the service impact analyzer 140 may correlate a network event whereby an individual cell tower goes off-line to create an outage and generate an alarm condition with service quality issues being generated by neighboring cell towers whose KPIs indicate excessive loading, higher blocked or dropped call rates or other issues that affect service quality. Thus, a single fault in the network which would otherwise generate multiple trouble tickets for a plurality of cell sites can now be detected and addressed with a single trouble ticket with the resolution prioritized based on the severity of the customer impact.
- There is also shown a filtering silent failures function 142 which may be a spatial-temporal event correlation engine to produce two data flows, namely a flow of network-based events annotated with their respective service level impacts and a flow of silent service events which are defined as service issues with no known associated network root cause. The filtering silent failures function 142 thus generates two outputs, one output being those events whose root causes can be explained and another output being those events whose root causes cannot be explained.
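- The cell tower example given for the service impact analyzer 140 can be sketched as follows: congestion alerts at sites neighboring a failed site are folded into one trouble ticket for the failed site, and any remaining alerts are left unexplained. The neighbor map, the site identifiers and the function name are hypothetical and serve only to illustrate the idea of collapsing multiple symptoms into a single ticket.

```python
def group_into_tickets(failed_sites, congested_sites, neighbors):
    """Fold congestion alerts into the ticket of a failed neighboring site.

    neighbors maps a site to the set of its adjacent sites. Returns
    (tickets, unexplained), where tickets maps each failed site to the
    congested sites attributed to it.
    """
    tickets = {site: [] for site in failed_sites}
    unexplained = []
    for site in congested_sites:
        roots = [f for f in failed_sites if site in neighbors.get(f, ())]
        if roots:
            tickets[roots[0]].append(site)  # attribute to first matching fault
        else:
            unexplained.append(site)
    return tickets, unexplained


# Hypothetical topology: cell-17 is adjacent to cell-16 and cell-18.
neighbors = {
    "cell-17": {"cell-16", "cell-18"},
    "cell-42": {"cell-41"},
}

tickets, unexplained = group_into_tickets(
    failed_sites=["cell-17"],
    congested_sites=["cell-16", "cell-18", "cell-42"],
    neighbors=neighbors,
)
print(tickets, unexplained)
```

- In this toy case three alerts collapse into one ticket for cell-17, while the congestion at cell-42 has no known network cause and would follow the silent failure path.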
- With respect to those events whose root causes can be explained, the cause of the anomaly and the priority of resolution may be sent to a portal 112 where an operator may generate a trouble ticket. Alternatively, or additionally, this data is fed back into platform 110 through the integrated control loop whereby the event that triggered the anomaly may be resolved automatically. In either case, the root cause of the problem may now be investigated by one maintenance team instead of two or three maintenance teams.
- Continuing with the description of FIG. 3, the silent service events are then fed into a silent failure localization processor 144 which may narrow the potential causes of the service anomalies. The silent failure localization function 144 may be used when a cause of the service anomaly cannot be determined. The silent failure localization processor 144 may correlate similar issues and summarize the commonalities among the problem set. For example, there may be commonality with respect to gaming which may be localized to a particular video game such as Madden NFL 18. There may be a commonality with respect to geography, such as Atlanta, Ga. or South Bend, Ind. There may be commonality with respect to certain types of customer equipment such as a Samsung Galaxy® handset or an Apple iPhone®. The silent failure localization processor 144 may attempt to classify and correlate as many common characteristics as possible to assist with the further analysis and resolution of the anomalies. This correlated silent failure data may then be provided through portal 112 to a system operator for further investigation as well as being fed back as a control loop to platform 110.
- Those outputs may be input to data collectors 132 in the DCAE engine 130. Data collectors 132 may perform initial analysis and segregation with respect to the types of data. Data associated with SQM may be passed to a service anomaly detection function 134 which may, for example, be performed by software programs created by Argus Control Systems Ltd.
- Methods of Use. With reference to FIG. 4, there is shown a flow chart beginning with a data collection function at 41. At 42, data is segregated as previously described with respect to service quality and network events. At 43, network events such as alarms, outages, and the like are detected while at 44, service quality anomalies such as customer-impacting performance issues are detected. At 45, the network events and customer-impacting service quality issues may be correlated, and the network events are annotated with the service quality issues. At 46, the network-based events are analyzed to determine whether those network-based events are impacting customer service quality measurements. If not, then resolution of those network-based events is set to a lower priority at 47. If the network-based events do impact customer service quality measurements, then the resolution of those network-based issues is set to a higher priority.
- As set forth above, there are times when the root cause of an anomaly cannot be determined with certainty. The flow chart in FIG. 5 shows the data flow starting with the data analytics function 51 as previously described. At 52, the analytics function results are filtered based on whether the root cause of the anomaly has been determined. If yes, the cause of the anomaly may be published through the portal and resolution processes initiated, either by issuing trouble tickets or through automatic error correction methodologies. If the root cause of the anomaly is not determined at 52, i.e., the fault is a silent fault, then common symptoms of the fault are determined at 54 and, based on those commonalities, a narrower range of possible root causes is determined at 55 and published to the portal for further analysis and troubleshooting. In both cases, the feedback loop to the platform is completed at 56.
- The present disclosure has been described with respect to an integrated control loop 130 applied to a platform 110 whereby the integrated control loop may be agnostic with respect to the services offered. The systems and methods of the present invention may work for any platform whereby certain events may or may not lead to service quality issues affecting customers or other operations. Other applications may include, but are not limited to, autonomous vehicles, transportation, factory management systems, home automation, virtual reality, augmented reality, and computer gaming, just to name a few.
- The present disclosure provides an integrated control loop which is able to operate independently of any particular service and thus is available across multiple service offerings, including but not limited to mobility telecommunications functionality, Voice over LTE (VoLTE) and video services. Enabling an integrated control loop on a platform as described is able to reduce alarm fatigue, improve operational efficiency by eliminating duplication of effort in troubleshooting common issues among network and service operations teams, and quantify the impact of network events to prioritize resolution activities. The integrated control loop is able to scale the analytics across both physical and network functions and services. Moreover, the architecture allows operators and service designers, including customers and contractors, to create control loop policies to address the needs of new or modified services and establish KPIs more efficiently.
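- As an illustration of the silent failure localization described with reference to FIG. 3 and FIG. 5, the following sketch summarizes which attribute values are shared by most silent service events, which narrows the range of possible root causes. The attribute names, the sample events and the 60% share threshold are hypothetical; the disclosure does not prescribe a particular algorithm for finding commonalities.

```python
from collections import Counter


def summarize_commonalities(silent_events, min_share=0.6):
    """Return {(attribute, value): share} for values common to most events.

    Each silent event is a dict of attributes. A value is reported when it
    appears in at least min_share of the events.
    """
    total = len(silent_events)
    counts = Counter()
    for event in silent_events:
        for attribute, value in event.items():
            counts[(attribute, value)] += 1
    return {
        key: n / total
        for key, n in counts.items()
        if n / total >= min_share
    }


# Hypothetical silent service events with no known network root cause.
silent_events = [
    {"application": "game-A", "city": "Atlanta", "handset": "model-X"},
    {"application": "game-A", "city": "South Bend", "handset": "model-Y"},
    {"application": "game-A", "city": "Atlanta", "handset": "model-Y"},
    {"application": "video", "city": "Atlanta", "handset": "model-X"},
]

common = summarize_commonalities(silent_events)
print(common)
```

- In this sample, one application and one city each account for three of the four events, so an operator reviewing the portal would start the investigation there rather than with the handset type.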
- As described, the disclosure permits the use of a data collection analytics event engine to provide an integrated control loop for an operational platform. The systems and methods of the disclosure are directed to solving a practical problem by providing a real-world solution that removes or minimizes duplication of effort for network and service maintenance and upgrade applications while allowing customers and users to develop their own policies to control and gain additional insight into the fault detection and service degradation functionality. The systems and methods help customer service personnel understand and explain degradation of service levels and, by isolating or narrowing the scope of the root causes of such degradation, better predict the time before service is restored to normal levels while prioritizing the resolution of those problems that affect the customer experience. The present disclosure allows explicit incorporation of changing policies and KPIs to vary the control of the integrated control loop process and thus provide valuable additional capabilities of monitoring and maintaining platforms 110 or other systems employing integrated control loops and associated algorithms.
- While examples of systems and methods for creating and applying integrated control loops for an operational platform have been described in connection with various computing devices/processors, the underlying concepts may be applied to any computing device, processor, or system capable of facilitating or hosting an integrated control loop system, including cloud-based systems, proprietary systems or open source systems in which the framework of creating an integrated automatic control loop with data collections and analytics engine solutions may be implemented. The various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination of both. Thus, the methods and devices may take the form of program code (i.e., instructions) embodied in concrete, tangible storage media having a concrete, tangible, physical structure. Examples of tangible storage media include floppy diskettes, Compact Disc-Read-Only Memory devices (CD-ROMs), Digital Versatile Discs or Digital Video Discs (DVDs), hard drives, or any other tangible machine-readable storage medium (computer-readable storage medium). Thus, a computer-readable storage medium is not a signal. A computer-readable storage medium is not a transient signal. Further, a computer-readable storage medium is not a propagating signal. A computer-readable storage medium as described herein is an article of manufacture. When the program code is loaded into and executed by a machine, such as a computer, the machine becomes a device for telecommunications. In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile or nonvolatile memory or storage elements), at least one input device, and at least one output device. The program(s) can be implemented in assembly or machine language, if desired. The language can be a compiled or interpreted language and may be combined with hardware implementations.
- The methods and devices associated with a system as described herein also may be practiced via communications embodied in the form of program code that is transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, over the air (OTA), firmware over the air (FOTA), or via any other form of transmission, wherein, when the program code is received and loaded into and executed by a machine, such as an Erasable Programmable Read-Only Memory (EPROM), a gate array, a programmable logic device (PLD), a client computer, or the like, the machine becomes a device for implementing telecommunications as described herein. When implemented on a general-purpose processor, the program code combines with the processor to provide a unique device that operates to invoke the functionality of a telecommunications system.
- The patentable scope of the disclosure is defined by the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal language of the claims.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/163,877 US20210160124A1 (en) | 2019-04-30 | 2021-02-01 | Automatic Integrated Control Loop For Fault, Performance And Service Quality Management |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/399,750 US10938624B2 (en) | 2019-04-30 | 2019-04-30 | Automatic integrated control loop for fault, performance and service quality management |
US17/163,877 US20210160124A1 (en) | 2019-04-30 | 2021-02-01 | Automatic Integrated Control Loop For Fault, Performance And Service Quality Management |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/399,750 Continuation US10938624B2 (en) | 2019-04-30 | 2019-04-30 | Automatic integrated control loop for fault, performance and service quality management |
Publications (1)
Publication Number | Publication Date |
---|---|
US20210160124A1 (en) | 2021-05-27 |
Family
ID=73016816
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/399,750 Active US10938624B2 (en) | 2019-04-30 | 2019-04-30 | Automatic integrated control loop for fault, performance and service quality management |
US17/163,877 Abandoned US20210160124A1 (en) | 2019-04-30 | 2021-02-01 | Automatic Integrated Control Loop For Fault, Performance And Service Quality Management |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/399,750 Active US10938624B2 (en) | 2019-04-30 | 2019-04-30 | Automatic integrated control loop for fault, performance and service quality management |
Country Status (1)
Country | Link |
---|---|
US (2) | US10938624B2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230156481A1 (en) * | 2021-11-12 | 2023-05-18 | T-Mobile Innovations Llc | Downtime optimized network upgrade process |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11271835B2 (en) | 2019-05-10 | 2022-03-08 | Cisco Technology, Inc. | Composite key performance indicators for network health monitoring |
US20230070701A1 (en) * | 2021-08-31 | 2023-03-09 | Juniper Networks, Inc. | Network performance monitoring and fault management based on wide area network link health assessments |
WO2023106975A1 (en) * | 2021-12-06 | 2023-06-15 | Telefonaktiebolaget Lm Ericsson (Publ) | Method and device for configuring a service |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5872911A (en) * | 1995-12-29 | 1999-02-16 | Mci Communications Corporations | Method and system of service impact analysis in a communications network |
US20090177354A1 (en) * | 2008-01-09 | 2009-07-09 | International Business Machines Corporation | Method for vehicle fault diagnosis using audio sensors |
US9313091B1 (en) * | 2013-09-26 | 2016-04-12 | Emc Corporation | Analytics platform for automated diagnosis, remediation, and proactive supportability |
US9807229B1 (en) * | 2014-10-10 | 2017-10-31 | Carrier Iq, Inc. | Wireless customer experience fault analytics system and method of operation |
US20170019312A1 (en) * | 2015-07-17 | 2017-01-19 | Brocade Communications Systems, Inc. | Network analysis and management system |
Also Published As
Publication number | Publication date |
---|---|
US10938624B2 (en) | 2021-03-02 |
US20200351149A1 (en) | 2020-11-05 |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: AT&T MOBILITY II LLC, GEORGIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: MAGRINI, HEATHER; HALL, BRANDON; CHILDS, KATHRYN W.; SIGNING DATES FROM 20190910 TO 20190911; REEL/FRAME: 055190/0827. Owner name: AT&T INTELLECTUAL PROPERTY I, L.P., GEORGIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: YAN, HE; GE, ZIHUI; RANA, GIRITHARAN; AND OTHERS; SIGNING DATES FROM 20190911 TO 20191104; REEL/FRAME: 055190/0433 |
STPP | Information on status: patent application and granting procedure in general | Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE |