WO2023156827A1 - Anomaly detection - Google Patents
- Publication number
- WO2023156827A1 (PCT/IB2022/051475)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data points
- data
- anomaly detection
- cell
- detection process
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/142—Network analysis or design using statistical or mathematical methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/16—Threshold monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/02—Capturing of monitoring data
- H04L43/022—Capturing of monitoring data by sampling
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0805—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0852—Delays
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
- H04L43/0888—Throughput
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/10—Active monitoring, e.g. heartbeat, ping or trace-route
- H04L43/106—Active monitoring, e.g. heartbeat, ping or trace-route using time related information in packets, e.g. by adding timestamps
Definitions
- the method further includes performing the anomaly detection process using at least N-1 of the N data points as a result of determining that the condition is satisfied. In some embodiments the method further includes, after performing the anomaly detection process, obtaining a new current data point and using the new current data point to determine whether or not to perform the anomaly detection process using at least N-1 of the most recent data points included in the first set of N data points. In some embodiments the method further includes performing the anomaly detection process using the N-1 of the most recent data points included in the first set of N data points.
- the method further includes, as a result of determining that the condition is not satisfied: refraining from performing the anomaly detection process; collecting a new set of N data points; and using the most current data point from the new set of N data points to determine whether or not to perform the anomaly detection process using at least N-1 of the new set of N data points.
- the first set of N data points is associated with a cell of a mobile communication network, and each one of N data points included in the first set of N data points is a measure of a first performance metric for the cell.
- the first performance metric is: an average throughput of the cell, an average latency associated with the cell, or a downtime for the cell.
- the time series data further comprises a second set of N data points, wherein N > 2 and each data point in the second set of data points was obtained at a different point in time.
- the method further comprises using the most current data point from the second set of N data points to determine whether or not to perform the anomaly detection process using the second set of N data points, wherein each one of N data points included in the second set of N data points is a measure of a second performance metric for the cell, and the second performance metric for the cell is different than the first performance metric for the cell.
- FIG. 4 is a block diagram of network management node 104, according to some embodiments, for performing network node methods disclosed herein.
- network node 104 may comprise: processing circuitry (PC) 402, which may include one or more processors (P) 455 (e.g., one or more general purpose microprocessors and/or one or more other processors, such as an application specific integrated circuit (ASIC), field-programmable gate arrays (FPGAs), and the like), which processors may be co-located in a single housing or in a single data center or may be geographically distributed (i.e., network node 104 may be a distributed computing apparatus where some functions are performed in one location and other functions performed in another location); at least one network interface 448 comprising a transmitter (Tx) 445 and a receiver (Rx) 447 for enabling network node 104 to transmit data to and receive data from other nodes connected to a network 110 (e.g., an Internet Protocol (IP) network) to which network
- a computer readable medium (CRM) 442 may be provided and store a computer program (CP) 443 comprising computer readable instructions (CRI) 444.
- CRM 442 may be a non-transitory computer readable medium, such as, magnetic media (e.g., a hard disk), optical media, memory devices (e.g., random access memory, flash memory), and the like.
- the CRI 444 of computer program 443 is configured such that when executed by PC 402, the CRI causes network node 104 to perform steps described herein (e.g., steps described herein with reference to the flow charts).
- network node 104 may be configured to perform steps described herein without the need for code. That is, for example, PC 402 may consist merely of one or more ASICs. Hence, the features of the embodiments described herein may be implemented in hardware and/or software.
- the AD process is triggered less often without missing any anomalies. That is, the embodiments have the same performance in anomaly detection as a conventional method, but the embodiments use fewer computation resources. The saved computation resources could be used to process more cells or for other purposes or not used at all, thereby reducing energy consumption.
- Table 2 below shows the benchmark results over a 7-day period with and without the smart sampling approach described above.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Pure & Applied Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Algebra (AREA)
- Environmental & Geological Engineering (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
A method (300) for anomaly detection. The method includes storing (s302) time series data, the stored time series data comprising a first set of N data points, wherein N > 2 and each data point in the first set of data points was obtained at a different point in time. The method further includes using (s304) the most current data point from the first set of N data points to determine whether or not to perform an anomaly detection process using at least N-1 of the N data points.
Description
ANOMALY DETECTION
TECHNICAL FIELD
[001] Disclosed are embodiments related to anomaly detection.
BACKGROUND
[002] Anomaly detection (AD) is a well-established use case in telecommunication systems to identify abnormal behavior in a network. The method is normally applied to time series data measuring different metrics characterizing the health of a part of the network (e.g., a cell). The method works by monitoring normal behavior to understand the pattern of data under standard conditions and set thresholds. When successive values over a prescribed period are beyond the threshold, an anomaly is flagged.
[003] In general, there are two types of anomaly detection techniques: simple statistical methods and machine learning-based approaches. Simple statistical methods are characterized by a light “footprint” (i.e., few computing resources) but limited by their inability to handle more challenging scenarios. Machine learning (ML)-based approaches can learn more sophisticated patterns but have larger footprints (i.e., require more computing resources). A survey of different anomaly detection techniques can be found at blogs(dot)oracle(dot)com/ai-and-datascience/post/introduction-to-anomaly-detection.
[004] When there is a need to perform anomaly detection in a large network with limited resources, the statistical approach is generally preferred over an ML-based approach.
SUMMARY
[005] Certain challenges presently exist. For instance, conventional AD systems perform an AD process each time that a new data point in the time series is obtained. This, however, is often inefficient because anomalies are detected only when several consecutive data points show an abnormal pattern. In particular, this inefficiency makes it difficult to scale conventional AD systems to analyze hundreds or thousands of cells, which is a common occurrence in telecommunication systems.
[006] For example, assume that one were to perform the AD process for each cell in a network with N cells, where, for each cell, a time series data point is obtained every M minutes. In such a scenario, the total number of daily executions of the AD process is: N × (1440 / M). Because anomalies, by definition, are rare, the efficiency of the AD system is low (i.e., the number of detected anomalies divided by the total number of daily executions is a low number).
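As a rough illustration of the count above, the following Python sketch evaluates N × (1440 / M). The function name and the sample values (10,000 cells, one data point every 15 minutes) are assumptions chosen for illustration only.

```python
def daily_ad_executions(num_cells: int, sampling_minutes: float) -> float:
    """AD process runs per day when every new data point triggers a run."""
    minutes_per_day = 1440
    return num_cells * (minutes_per_day / sampling_minutes)


# Hypothetical network: 10,000 cells, one data point every 15 minutes.
print(daily_ad_executions(10_000, 15))  # 960000.0
```

Under these assumed values the AD process would run 960,000 times per day, while only a small fraction of those runs could be expected to find an anomaly.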
[007] Accordingly, in one aspect there is provided an improved method for anomaly detection. The method may be performed by a network management node. The method includes storing time series data, the stored time series data comprising a first set of N data points, wherein N > 2 and each data point in the first set of data points was obtained at a different point in time. The method further includes using the most current data point from the first set of N data points to determine whether or not to perform an anomaly detection process using at least N-1 of the N data points.
[008] In another aspect there is provided a computer program comprising instructions which, when executed by processing circuitry of a network management node, cause the network management node to perform the above described method. In another aspect there is provided a carrier containing the computer program, wherein the carrier is one of an electronic signal, an optical signal, a radio signal, or a computer readable storage medium.
[009] In another aspect there is provided a network management node where the network management node is configured to perform the methods disclosed herein. In some embodiments, the network management node includes processing circuitry and a memory containing instructions executable by the processing circuitry, whereby the network management node is configured to perform the methods disclosed herein.
[0010] The embodiments are advantageous in that they have a small energy footprint (i.e., reduced computation), which is beneficial for many reasons, including energy savings, which lead to a low carbon footprint, and scalability. With respect to scalability, with reduced computational effort per cell, the embodiments can be scaled to handle more cells within a sampling interval. This is a critical requirement for large scale Communication Service Providers who deploy tens of thousands of cells in a geographical region.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments.
[0012] FIG. 1 illustrates a communication system according to an embodiment.
[0013] FIG. 2 is a flowchart illustrating a process according to some embodiments.
[0014] FIG. 3 is a flowchart illustrating a process according to some embodiments.
[0015] FIG. 4 illustrates a network management node according to some embodiments.
DETAILED DESCRIPTION
[0016] FIG. 1 illustrates a communication system 100 according to an embodiment. Communication system 100 includes network nodes (e.g., base stations 111 and 112) that each serve one or more cells (e.g., cell 121 and 122). While only two network nodes are shown, it is possible that a communication system includes tens of thousands of network nodes or more, where each network node serves one or more cells.
[0017] Communication system 100 further includes a network management node 104, which, in the illustrated embodiment, includes (1) a data gathering function (DGF) 132 that functions to obtain and store time series data for each cell in system 100 and (2) an AD function (ADF) 134 that, for each cell for which time series data is collected, uses the stored time series data for the cell to detect whether or not the cell is experiencing an anomaly.
[0018] For instance, in one embodiment, data gathering function 132 creates and updates a database (DB) 190 that stores time series data having the following form:
[0019] As shown in Table 1, the database 190 stores, for each of cell1, cell2, and cell3, first time series data (i.e., a first set of data points) corresponding to a first performance metric (PM-1) (e.g., average latency, average throughput, cell downtime, etc.) for the cell and second time series data (i.e., a second set of data points) corresponding to a second performance metric (PM-2) (e.g., average latency, average throughput, cell downtime, etc.) for the cell.
[0020] The database is not limited to this form shown as the database can store time series data for any number of cells and/or any number of performance metrics. For instance, for each cell, the database may only store time series data for a single performance metric.
[0021] Moreover, for each data point stored in the database (e.g., data point v111), the database may also store a timestamp that indicates the time at which the data point was generated or received by data gathering function 132. In one embodiment, to save space, the database 190 only stores the most recent N data points for a given cell and given performance metric (this feature is illustrated in the table above which shows that for each cell/performance metric pair, the database stores at most N data points). Thus, in such an embodiment, when data gathering function 132 receives a new data point for a particular cell/performance metric pair and the database already has N data points for this particular cell/performance metric pair, the data gathering function 132 will remove from the database the oldest data point for this particular cell/performance metric pair and then add to the database the new data point for this particular cell/performance metric pair.
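The "keep only the most recent N data points" behavior described above can be sketched with a bounded queue. This is a minimal in-memory illustration, not the database 190 itself: the names `db` and `store_data_point`, the metric label, and the use of integer timestamps are assumptions, and N = 8 is taken from the example value used later in this description.

```python
from collections import defaultdict, deque

N = 8  # window size (example value)

# Maps (cell_id, metric_name) to the most recent N (timestamp, value) pairs.
db = defaultdict(lambda: deque(maxlen=N))


def store_data_point(cell_id, metric, timestamp, value):
    """Add a data point; a full deque drops its oldest entry automatically."""
    db[(cell_id, metric)].append((timestamp, value))


# Feed ten points for one cell/performance metric pair.
for t in range(10):
    store_data_point("cell1", "PM-1", t, float(t))

window = db[("cell1", "PM-1")]
print(len(window), window[0], window[-1])  # 8 (2, 2.0) (9, 9.0)
```

After ten insertions only the last eight points remain, so the two oldest points (timestamps 0 and 1) have been evicted.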
[0022] AD function 134 provides an efficient anomaly detection method by utilizing an “AD trigger function” to reduce the computational load of AD function 134 by reducing how often AD function 134 uses time series data to make a determination as to whether or not an anomaly is present. For example, in one embodiment, the AD trigger function is employed at least once every X units of time (e.g., at least once every 2 hours), and, based on the output of the AD trigger function, a decision is made as to whether to use the current time series data in database 190 to detect an anomaly or to wait until a later time and use at that later time the current time series data in database 190 to detect an anomaly. Thus, for instance, if the AD trigger function returns FALSE, then anomaly detection is not triggered until some later point in time.
[0023] FIG. 2 is a flowchart illustrating a process 200 according to an embodiment that is performed by AD function 134 for a given cell/performance metric pair. That is, AD function 134 performs process 200 for each cell/performance metric pair. Process 200 may begin in step s202. Step s202 comprises AD function 134 determining whether database 190 contains at least N data points for the given cell/performance metric pair under consideration. If not, AD function 134 goes back to performing step s202, otherwise it proceeds to step s204.
[0024] Step s204 comprises AD function 134 determining whether the most recently obtained data point for the given cell/performance metric pair under consideration (denoted VN) satisfies an AD triggering condition (e.g., in one embodiment in which the performance metric is average latency, AD function 134 determines whether VN is greater than a threshold; in another embodiment in which the performance metric is average throughput, AD function 134 determines whether VN is less than a threshold). If VN satisfies the AD triggering condition (e.g., VN is less than a threshold), then AD function 134 proceeds to step s206 in which AD function performs an AD process, otherwise it proceeds to step s210.
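The triggering check of step s204 depends on the performance metric: a latency value is suspicious when it is above the threshold, a throughput value when it is below. One possible way to express this is sketched below; the metric names, the `TRIGGER_COMPARATORS` mapping, and the function name `ad_trigger` are illustrative assumptions.

```python
import operator

# Assumed mapping from metric name to the comparison that signals trouble.
TRIGGER_COMPARATORS = {
    "average_latency": operator.gt,     # trigger when VN > threshold
    "average_throughput": operator.lt,  # trigger when VN < threshold
}


def ad_trigger(metric: str, latest_value: float, threshold: float) -> bool:
    """Return True if the most recent data point satisfies the AD triggering condition."""
    return TRIGGER_COMPARATORS[metric](latest_value, threshold)


print(ad_trigger("average_latency", 120.0, 100.0))     # True
print(ad_trigger("average_throughput", 120.0, 100.0))  # False
```

Only a single comparison on the latest data point is needed here, which is what keeps the trigger much cheaper than the AD process itself.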
[0025] Step s206 comprises AD function 134 performing the AD process. That is, in step s206, AD function 134 uses at least the most recent N-1 data points for the given cell/performance metric pair under consideration to determine whether an anomaly is present. In one embodiment, AD function 134 determines that an anomaly is present if all of the most recent N-1 data points satisfy the AD triggering condition (e.g., if the performance metric is average throughput such that each data point is an average throughput value, then AD function determines that an anomaly is present if each one of the most recent N-1 data points is less than the threshold). In another embodiment where the performance metric is average throughput, AD function 134 determines that an anomaly is present if all of the most recent N-1 data points satisfy the AD triggering condition and VN is less than VN-1. Here, the second condition that VN is less than VN-1 is a check to see if the anomalous trend is continuing (i.e., that the average throughput below the threshold is continuing to decrease with the latest data point VN).
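The two throughput embodiments of step s206 can be sketched as a single function with an optional trend check. This is an illustrative reading of the paragraph above, not the original implementation; the function name `is_anomalous`, the `check_trend` flag, and the sample values are assumptions. The window is assumed to be a list of N values ordered oldest first, with N > 2.

```python
from typing import Sequence


def is_anomalous(window: Sequence[float], threshold: float,
                 check_trend: bool = False) -> bool:
    """Throughput variant of the AD process on a window of N points (oldest first)."""
    recent = window[1:]  # the most recent N-1 data points
    if not all(v < threshold for v in recent):
        return False
    if check_trend:
        # Second condition: VN < VN-1, i.e. throughput is still decreasing.
        return window[-1] < window[-2]
    return True


print(is_anomalous([50, 9, 8, 7], threshold=10))                      # True
print(is_anomalous([50, 9, 8, 8.5], threshold=10, check_trend=True))  # False
print(is_anomalous([5, 9, 12, 7], threshold=10))                      # False
```

In the second call all of the most recent N-1 points are below the threshold, but the latest value rose from 8 to 8.5, so the trend condition rejects it.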
[0026] In one embodiment, when an anomaly is detected, one or more actions are taken. These actions may include one or more of: (1) network management node 104 generating an alarm notification to notify the network operator that an anomaly has been detected; (2) network management node 104 adjusting one or more configuration parameters for the network node experiencing the anomaly; (3) network management node 104 attempting to reduce the load on the network node experiencing the anomaly by adjusting a load balancer to steer traffic away from said node; etc.
[0027] Step s208 comprises AD function 134 waiting until the next new data point is obtained. For instance, in one embodiment, a new data point is obtained every fifteen minutes. Thus, in this embodiment, in step s208 AD function 134 ends up waiting about fifteen minutes or less. After step s208 (e.g., after the next new data point is obtained), AD function 134 goes back to step s204.
[0028] Step s210 comprises AD function 134 waiting until N new data points for the given cell/performance metric are obtained. In the embodiment in which data gathering function 132 obtains a new data point every fifteen minutes and N=8, AD function 134 ends up waiting about 2 hours in step s210. After step s210, AD function 134 goes back to step s204.
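The waiting time in step s210 follows directly from the sampling interval. As a small worked example (the function name `s210_wait` is hypothetical), the wait is simply N times the interval between data points:

```python
from datetime import timedelta


def s210_wait(n: int, interval: timedelta) -> timedelta:
    """Approximate time spent in step s210: N new data points at a fixed interval."""
    return n * interval


# Values from the embodiment above: one data point every fifteen minutes, N = 8.
wait = s210_wait(8, timedelta(minutes=15))
```

Here `wait` equals `timedelta(hours=2)`, matching the roughly 2 hours stated for this embodiment.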
[0029] The table below contains pseudo code of a computer program that can be used to implement process 200 where the performance metric is average latency and a second condition that VN is greater than VN-1 is a check to see if the anomalous trend is continuing (i.e., that the average latency above the threshold Th is continuing to increase with the latest data point VN).
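The pseudo code table itself is not reproduced in this text. The following Python sketch is therefore not the original pseudo code but an independent illustration of process 200 for average latency, written from the step descriptions in paragraphs [0024] to [0028]. It replays a pre-recorded series instead of waiting for live data, and the name `run_process_200` and its return values are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def run_process_200(series: Sequence[float], th: float, n: int) -> Tuple[List[int], int]:
    """Replay process 200 over recorded average-latency values.

    Returns (indices at which an anomaly was detected, number of AD process runs).
    """
    if n <= 2:
        raise ValueError("N must be greater than 2")

    anomalies: List[int] = []
    ad_runs = 0
    i = n - 1  # index of V_N once the first N data points are available

    while i < len(series):
        v_n = series[i]
        if v_n > th:  # step s204: triggering condition for latency
            ad_runs += 1  # step s206: perform the AD process
            window = series[i - (n - 2): i + 1]  # most recent N-1 data points
            # Anomaly if all N-1 points exceed Th and latency is still increasing.
            if all(v > th for v in window) and v_n > series[i - 1]:
                anomalies.append(i)
            i += 1  # step s208: wait for the next new data point
        else:
            i += n  # step s210: wait for N new data points
    return anomalies, ad_runs
```

In this sketch, "waiting" is modelled by advancing the index into the recorded series; a live implementation would instead block until the data gathering function delivers new data points.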
[0030] FIG. 3 is a flowchart illustrating a process 300 according to an embodiment that is performed by network management node 104. Process 300 may begin in step s302. Step s302 comprises storing time series data, the stored time series data comprising a first set of N data points, wherein N > 2 and each data point in the first set of data points was obtained at a different point in time. Step s304 comprises using the most current data point from the first set of N data points to determine whether or not to perform an anomaly detection process using at least N-1 of the N data points.
[0031] In some embodiments, using the most current data point to determine whether or not to perform the anomaly detection process comprises comparing the data point to a threshold. In some embodiments, using the most current data point to determine whether or not to perform the anomaly detection process further comprises determining, based on the comparing, that a condition is satisfied (e.g., the data point is less than the threshold, the data point is greater than the threshold, the data point is not greater than the threshold, etc.).
[0032] In some embodiments the method further includes performing the anomaly detection process using at least N-1 of the N data points as a result of determining that the condition is satisfied. In some embodiments the method further includes, after performing the anomaly detection process, obtaining a new current data point and using the new current data point to determine whether or not to perform the anomaly detection process using at least N-1 of the most recent data points included in the first set of N data points. In some embodiments the method further includes performing the anomaly detection process using the N-1 of the most recent data points included in the first set of N data points.
[0033] In some embodiments the method further includes, as a result of determining that the condition is not satisfied: refraining from performing the anomaly detection process; collecting a new set of N data points; and using the most current data point from the new set of N data points to determine whether or not to perform the anomaly detection process using at least N-1 of the new set of N data points.
[0034] In some embodiments, the first set of N data points is associated with a cell of a mobile communication network, and each one of N data points included in the first set of N data points is a measure of a first performance metric for the cell. In some embodiments, the first performance metric is: an average throughput of the cell, an average latency associated with the cell, or a downtime for the cell.
[0035] In some embodiments, the time series data further comprises a second set of N data points, wherein N > 2 and each data point in the second set of data points was obtained at a different point in time, and the method further comprises using the most current data point from the second set of N data points to determine whether or not to perform the anomaly detection process using the second set of N data points, wherein each one of N data points included in the second set of N data points is a measure of a second performance metric for the cell, and the second performance metric for the cell is different than the first performance metric for the cell.
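Handling several performance metrics for the same cell can be sketched as one trigger check per metric. The sketch below is an assumption-laden illustration: `MetricRule`, `metrics_to_check`, and the metric names used in the usage note are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class MetricRule:
    """Hypothetical per-metric trigger configuration."""
    threshold: float
    trigger_when_above: bool  # True for latency-like metrics, False for throughput-like


def metrics_to_check(
    series_by_metric: Dict[str, Sequence[float]],
    rules: Dict[str, MetricRule],
) -> List[str]:
    """Return the metrics whose most current data point triggers the AD process."""
    selected: List[str] = []
    for metric, points in series_by_metric.items():
        rule = rules[metric]
        latest = points[-1]  # most current data point of this set
        if rule.trigger_when_above:
            fired = latest > rule.threshold
        else:
            fired = latest < rule.threshold
        if fired:
            selected.append(metric)
    return selected
```

For instance, with a latency series ending at 140 against a threshold of 100 and a throughput series ending at 90 against a threshold of 50, only the latency set would be selected for the anomaly detection process.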
[0036] FIG. 4 is a block diagram of network management node 104, according to some embodiments, for performing network node methods disclosed herein. As shown in FIG. 4, network node 104 may comprise: processing circuitry (PC) 402, which may include one or more processors (P) 455 (e.g., one or more general purpose microprocessors and/or one or more other processors, such as an application specific integrated circuit (ASIC), field-programmable gate arrays (FPGAs), and the like), which processors may be co-located in a single housing or in a single data center or may be geographically distributed (i.e., network node 104 may be a distributed computing apparatus where some functions are performed in one location and other functions are performed in another location); at least one network interface 448 comprising a transmitter (Tx) 445 and a receiver (Rx) 447 for enabling network node 104 to transmit data to and receive data from other nodes connected to a network 110 (e.g., an Internet Protocol (IP) network) to which network interface 448 is connected; and a local storage unit (a.k.a., “data storage system”) 408, which may include one or more nonvolatile storage devices and/or one or more volatile storage devices. In embodiments where PC 402 includes a programmable processor, a computer readable medium (CRM) 442 may be provided and store a computer program (CP) 443 comprising computer readable instructions (CRI) 444. CRM 442 may be a non-transitory computer readable medium, such as magnetic media (e.g., a hard disk), optical media, memory devices (e.g., random access memory, flash memory), and the like. In some embodiments, the CRI 444 of computer program 443 is configured such that when executed by PC 402, the CRI causes network node 104 to perform steps described herein (e.g., steps described herein with reference to the flow charts). In other embodiments, network node 104 may be configured to perform steps described herein without the need for code. That is, for example, PC 402 may consist merely of one or more ASICs. Hence, the features of the embodiments described herein may be implemented in hardware and/or software.
[0037] Conclusion
[0038] By employing the “AD trigger function,” the AD process is triggered less often without missing any anomalies. That is, the embodiments have the same performance in anomaly detection as a conventional method, but the embodiments use fewer computation resources. The saved computation resources could be used to process more cells or for other purposes or not used at all, thereby reducing energy consumption.
[0039] Results
[0040] Table 2 below shows the benchmark results over a 7-day period with and without the smart sampling approach described above.
[0041] While various embodiments are described herein, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of this disclosure should not be limited by any of the above-described exemplary embodiments. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the disclosure unless otherwise indicated herein or otherwise clearly contradicted by context.
[0042] Additionally, while the processes described above and illustrated in the drawings are shown as a sequence of steps, this was done solely for the sake of illustration. Accordingly, it is contemplated that some steps may be added, some steps may be omitted, the order of the steps may be re-arranged, and some steps may be performed in parallel.
Claims
1. A method (300) for anomaly detection, the method comprising: storing (s302) time series data, the stored time series data comprising a first set of N data points, wherein N > 2 and each data point in the first set of data points was obtained at a different point in time; and using (s304) the most current data point from the first set of N data points to determine whether or not to perform an anomaly detection process using at least N-1 of the N data points.
2. The method of claim 1, wherein using the most current data point to determine whether or not to perform the anomaly detection process comprises comparing the data point to a threshold.
3. The method of claim 2, wherein using the most current data point to determine whether or not to perform the anomaly detection process further comprises determining (s204), based on the comparing, that a condition is satisfied.
4. The method of claim 3, further comprising, as a result of determining that the condition is satisfied, performing (s206) the anomaly detection process using at least N-1 of the N data points.
5. The method of claim 4, further comprising: after performing the anomaly detection process, obtaining (s208) a new current data point; and using the new current data point to determine whether or not to perform the anomaly detection process using the new current data point and N-1 of the most recent data points included in the first set of N data points.
6. The method of claim 5, further comprising performing the anomaly detection process using the new current data point and N-1 of the most recent data points included in the first set of N data points.
7. The method of claim 3, further comprising, as a result of determining that the condition is not satisfied: refraining from performing the anomaly detection process; collecting (s210) a new set of N data points; and using the most current data point from the new set of N data points to determine whether or not to perform the anomaly detection process using at least N-1 of the new set of N data points.
8. The method of any one of claims 1-7, wherein the first set of N data points is associated with a cell of a mobile communication network, and each one of N data points included in the first set of N data points is a measure of a first performance metric for the cell.
9. The method of claim 8, wherein the first performance metric is: a throughput of the cell, a latency associated with the cell, or a downtime for the cell.
10. The method of claim 8 or 9, wherein the time series data further comprises a second set of N data points, wherein N > 2 and each data point in the second set of data points was obtained at a different point in time, and the method further comprises using the most current data point from the second set of N data points to determine whether or not to perform the anomaly detection process using the second set of N data points, wherein each one of N data points included in the second set of N data points is a measure of a second performance metric for the cell, and the second performance metric for the cell is different than the first performance metric for the cell.
11. A computer program (443) comprising instructions (444) which when executed by processing circuitry (402) of a network management node (104) causes the network management node to perform the method of any one of claims 1-10.
12. A network management node, the network management node being configured to: store time series data, the stored time series data comprising a first set of N data points, wherein N > 2 and each data point in the first set of data points was obtained at a different point in time; and use the most current data point from the first set of N data points to determine whether or not to perform an anomaly detection process using at least N-1 of the N data points.
13. The network management node of claim 12, wherein using the most current data point to determine whether or not to perform the anomaly detection process comprises comparing the data point to a threshold.
14. The network management node of claim 13, wherein using the most current data point to determine whether or not to perform the anomaly detection process further comprises determining, based on the comparing, that a condition is satisfied.
15. The network management node of claim 14, wherein the network management node is further configured to, as a result of determining that the condition is satisfied, perform the anomaly detection process using at least N-1 of the N data points.
16. The network management node of claim 15, wherein the network management node is further configured to: after performing the anomaly detection process, obtain a new current data point; and use the new current data point to determine whether or not to perform the anomaly detection process using the new current data point and N-1 of the most recent data points included in the first set of N data points.
17. The network management node of claim 16, wherein the network management node is further configured to perform the anomaly detection process using the new current data point and N-1 of the most recent data points included in the first set of N data points.
18. The network management node of claim 14, wherein the network management node is further configured to, as a result of determining that the condition is not satisfied: refrain from performing the anomaly detection process; collect a new set of N data points; and use the most current data point from the new set of N data points to determine whether or not to perform the anomaly detection process using at least N-1 of the new set of N data points.
19. The network management node of any one of claims 12-18, wherein the first set of N data points is associated with a cell of a mobile communication network, and each one of N data points included in the first set of N data points is a measure of a first performance metric for the cell.
20. The network management node of claim 19, wherein the first performance metric is: a throughput of the cell, a latency associated with the cell, or a downtime for the cell.
21. The network management node of claim 19 or 20, wherein the time series data further comprises a second set of N data points, wherein N > 2 and each data point in the second set of data points was obtained at a different point in time, and the method further comprises using the most current data point from the second set of N data points to determine whether or not to perform the anomaly detection process using the second set of N data points, wherein each one of N data points included in the second set of N data points is a measure of a second performance metric for the cell, and the second performance metric for the cell is different than the first performance metric for the cell.
22. A network management node (104) comprising: processing circuitry (402); and a memory (442), the memory containing instructions (444) executable by the processing circuitry, whereby the network management node is operative to perform the method of any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB2022/051475 WO2023156827A1 (en) | 2022-02-18 | 2022-02-18 | Anomaly detection |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023156827A1 true WO2023156827A1 (en) | 2023-08-24 |
Family
ID=80682840
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2022/051475 WO2023156827A1 (en) | 2022-02-18 | 2022-02-18 | Anomaly detection |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023156827A1 (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160285700A1 (en) * | 2015-03-24 | 2016-09-29 | Futurewei Technologies, Inc. | Adaptive, Anomaly Detection Based Predictor for Network Time Series Data |
EP3326330A1 (en) * | 2015-07-22 | 2018-05-30 | Dynamic Network Services, Inc. | Methods, systems, and apparatus to generate information transmission performance alerts |
US10200262B1 (en) * | 2016-07-08 | 2019-02-05 | Splunk Inc. | Continuous anomaly detection service |
US20200267057A1 (en) * | 2019-02-15 | 2020-08-20 | Oracle International Corporation | Systems and methods for automatically detecting, summarizing, and responding to anomalies |
US20200382361A1 (en) * | 2019-05-30 | 2020-12-03 | Samsung Electronics Co., Ltd | Root cause analysis and automation using machine learning |
WO2021176460A1 (en) * | 2020-03-03 | 2021-09-10 | Telefonaktiebolaget Lm Ericsson (Publ) | Adaptive thresholding heuristic for anomaly detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22708605 Country of ref document: EP Kind code of ref document: A1 |