WO2021189730A1 - Method, apparatus, device, and storage medium for detecting abnormal high-density subgraphs

Info

Publication number
WO2021189730A1
WO2021189730A1 PCT/CN2020/103200 CN2020103200W WO2021189730A1 WO 2021189730 A1 WO2021189730 A1 WO 2021189730A1 CN 2020103200 W CN2020103200 W CN 2020103200W WO 2021189730 A1 WO2021189730 A1 WO 2021189730A1
Authority
WO
WIPO (PCT)
Prior art keywords
density
data
feature
abnormal
graph
Prior art date
Application number
PCT/CN2020/103200
Inventor
赵世泉
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021189730A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G06Q50/265 Personal security, identity or safety

Definitions

  • This application relates to the field of big data technology, and in particular to methods, devices, equipment, and storage media for detecting abnormally high-density subgraphs.
  • Complex relationship networks play a pivotal role in risk control and anti-fraud, especially in identifying malicious groups and in preventing and controlling fraud-risk groups.
  • Current analysis methods based on complex high-density subgraphs are all static: the overall content of a high-density subgraph is analyzed at a single moment to obtain various predefined indicators, and the subgraph is then classified by its properties to identify fraudulent groups.
  • The inventor realized that, as the capabilities of the black-market (illicit) industry improve, it is difficult to identify a fraud group (that is, an abnormal high-density subgraph) from a static perspective alone, which reduces the accuracy of detecting whether a high-density subgraph is abnormal.
  • The main purpose of this application is to improve the accuracy of detecting whether a high-density subgraph is abnormal.
  • The first aspect of this application provides a method for detecting abnormal high-density subgraphs, including: obtaining a complex relationship network to be analyzed, and performing real-time graph segmentation on it through a preset algorithm to obtain a high-density subgraph, where the high-density subgraph indicates communities and the association relationships between communities; sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, where the dynamic feature change data indicates the network topology feature data of the high-density subgraph as it changes dynamically over time; obtaining static feature data from a historical complex relationship network, and performing statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, where the historical complex relationship network is a complex relationship network generated or stored before the complex relationship network to be analyzed, and the confidence interval indicates the average range of variation of the static feature data between time periods; dividing the dynamic feature change data into non-abnormal features (inside the confidence interval) and abnormal features (outside the confidence interval), and using the non-abnormal features and the abnormal features as target derived features; and performing anomaly detection on the high-density subgraph by combining the target derived features with an anomaly detection model to obtain the target abnormal high-density subgraph.
  • The second aspect of this application provides a device for detecting abnormal high-density subgraphs, including a memory, a processor, and computer-readable instructions stored in the memory and runnable on the processor. When executing the computer-readable instructions, the processor implements the following steps: obtaining a complex relationship network to be analyzed, and performing real-time graph segmentation on it through a preset algorithm to obtain a high-density subgraph, where the high-density subgraph indicates communities and the association relationships between communities; sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, where the dynamic feature change data indicates the network topology feature data of the high-density subgraph as it changes dynamically over time; obtaining static feature data from a historical complex relationship network, and performing statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, where the historical complex relationship network is a complex relationship network generated or stored before the complex relationship network to be analyzed, and the confidence interval indicates the average range of variation of the static feature data between time periods; dividing the dynamic feature change data into non-abnormal features (inside the confidence interval) and abnormal features (outside the confidence interval), and using the non-abnormal features and the abnormal features as target derived features; and performing anomaly detection on the high-density subgraph by combining the target derived features with an anomaly detection model to obtain the target abnormal high-density subgraph.
  • The third aspect of this application provides a computer-readable storage medium that stores computer instructions which, when run on a computer, cause the computer to execute the following steps: obtaining a complex relationship network to be analyzed, and performing real-time graph segmentation on it through a preset algorithm to obtain a high-density subgraph, where the high-density subgraph indicates communities and the association relationships between communities; sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, where the dynamic feature change data indicates the network topology feature data of the high-density subgraph as it changes dynamically over time; obtaining static feature data from a historical complex relationship network, and performing statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, where the historical complex relationship network is a complex relationship network generated or stored before the complex relationship network to be analyzed, and the confidence interval indicates the average range of variation of the static feature data between time periods; dividing the dynamic feature change data into non-abnormal features (inside the confidence interval) and abnormal features (outside the confidence interval), and using the non-abnormal features and the abnormal features as target derived features; and performing anomaly detection on the high-density subgraph by combining the target derived features with an anomaly detection model to obtain the target abnormal high-density subgraph.
  • The fourth aspect of this application provides an apparatus for detecting abnormal high-density subgraphs, including: a segmentation processing module, used to obtain a complex relationship network to be analyzed and perform real-time graph segmentation on it through a preset algorithm to obtain a high-density subgraph, where the high-density subgraph indicates communities and the association relationships between communities; a sampling processing module, used to sample the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, where the dynamic feature change data indicates the network topology feature data of the high-density subgraph as it changes dynamically over time; a statistical calculation module, used to obtain static feature data from a historical complex relationship network and perform statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, where the historical complex relationship network is a complex relationship network generated or stored before the complex relationship network to be analyzed, and the confidence interval indicates the average range of variation of the static feature data between time periods; a judgment analysis module, used to divide the dynamic feature change data into non-abnormal features (inside the confidence interval) and abnormal features (outside the confidence interval), and to use the non-abnormal features and the abnormal features as target derived features; and an anomaly detection module, used to perform anomaly detection on the high-density subgraph by combining the target derived features with an anomaly detection model to obtain the target abnormal high-density subgraph.
  • In the technical solution provided by this application, real-time graph segmentation is performed on the complex relationship network to be analyzed through a preset algorithm to obtain a high-density subgraph; the network topology features of the high-density subgraph are sampled at a first preset time interval to obtain dynamic feature change data; static feature data is obtained from the historical complex relationship network, and statistics and calculation are performed on it through a preset statistical model to obtain a confidence interval; the dynamic feature change data is divided into non-abnormal features (inside the confidence interval) and abnormal features (outside the confidence interval), and both are used as target derived features; and anomaly detection is performed on the high-density subgraph by combining the target derived features with the anomaly detection model to obtain the target abnormal high-density subgraph.
  • the embodiment of the present application analyzes the risk capability of the high-density sub-graph by combining the static index of the high-density sub-graph and the dynamic index in the dynamic evolution process, and improves the accuracy of detecting whether the high-density sub-graph is abnormal.
  • FIG. 1 is a schematic diagram of an embodiment of a method for detecting abnormally high-density subgraphs in an embodiment of this application;
  • FIG. 2 is a schematic diagram of another embodiment of the method for detecting abnormal high-density subgraphs in an embodiment of the application;
  • FIG. 3 is a schematic diagram of an embodiment of an apparatus for detecting abnormal high-density subgraphs in an embodiment of the application;
  • FIG. 4 is a schematic diagram of another embodiment of the device for detecting abnormal high-density subgraphs in an embodiment of the application;
  • FIG. 5 is a schematic diagram of an embodiment of a device for detecting abnormal high-density subgraphs in an embodiment of the application.
  • The embodiments of this application provide a method, apparatus, device, and storage medium for detecting abnormal high-density subgraphs, which combine the static indicators of a high-density subgraph with the dynamic indicators of its dynamic evolution to analyze the risk of the high-density subgraph, and which improve the accuracy of detecting whether the high-density subgraph is abnormal.
  • Referring to FIG. 1, an embodiment of the method for detecting abnormal high-density subgraphs in the embodiments of this application includes:
  • A complex relationship network is formed by the connections between items of business content, for example how people in a campus use a certain platform, how heavily they use it, and how the companies using it are related. Because the complex relationship network changes continuously with business and time, when the server receives an instruction sent by a terminal or client, it uses a preset algorithm to perform real-time graph segmentation and community division on the complex relationship network at the current moment, obtains high-density subgraphs whose nodes are more strongly and more closely related, and triggers a data collection instruction when a high-density subgraph is generated.
  • Specifically, step 101 may include: obtaining the complex relationship network to be analyzed, initializing each node of the complex relationship network to a different first community, and calculating the first modularity metric value of the first community; moving each node into the second community in which one of its neighboring nodes is located, and calculating the second modularity metric value of the second community; calculating, for each node, the difference between the first modularity metric value and the second modularity metric value; and determining whether the difference is positive. If the difference is not positive, community division continues for each node until the difference is positive, and the divided communities are obtained. Community division here refers to the processing that starts from initializing each node to a different first community [...]; the resulting graph is regarded as a high-density subgraph.
  • For example, when the server receives the instruction sent by the terminal or client, it reads the complex relationship network stored in the database and takes two adjacent nodes in it as node A and node B. It places node A and node B in separate communities, that is, node A in community A1 and node B in community B1, and calculates the first modularity metric value of community A1 and of community B1. It then calculates the second modularity metric value of the second communities A2 and B2. The difference between the first and second modularity metric values measures the network of the A1 and A2 communities (or of the B1 and B2 communities).
  • A high connecting-edge weight indicates a high degree and complexity of relationship. Therefore, the graph formed by the divided communities whose connecting-edge weight is greater than a preset threshold is taken as the high-density subgraph, which improves the quality of the generated high-density subgraphs.
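  • The modularity comparison in step 101 can be illustrated with a minimal sketch. It assumes the standard Newman modularity for a weighted, undirected graph without self-loops, and it takes the difference as "second value minus first value"; the function names `modularity` and `gain_if_moved` are hypothetical, and the patent text does not specify which modularity formula or sign convention the preset algorithm uses.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of a weighted, undirected graph.

    edges: list of (u, v, weight) tuples without self-loops.
    community_of: dict mapping each node to its community label.
    """
    total = sum(w for _, _, w in edges)   # m: total edge weight
    inside = defaultdict(float)           # edge weight inside each community
    degree = defaultdict(float)           # summed weighted degree per community
    for u, v, w in edges:
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            inside[community_of[u]] += w
    return sum(inside[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)


def gain_if_moved(edges, community_of, node, neighbor):
    """Second modularity value minus first value when `node` joins `neighbor`'s community."""
    first = modularity(edges, community_of)
    moved = dict(community_of)
    moved[node] = community_of[neighbor]
    return modularity(edges, moved) - first


# Nodes A and B start in separate communities A1 and B1 (assumed example data).
edges = [("A", "B", 1.0)]
start = {"A": "A1", "B": "B1"}
print(gain_if_moved(edges, start, "A", "B"))  # positive: merging improves modularity
```

  • Under this convention a positive difference means the move is kept, which matches the stopping condition described for step 101.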
  • Dynamic feature change data consists of network topology features, such as the number of vertices, degree, average degree, and average correlation coefficient, that change dynamically over time.
  • The data collection instruction starts the relevant data collection tool, which captures the features of the high-density subgraph at the first preset time interval to obtain dynamic feature change data in consecutive equal time slices. The dynamic feature change data within each time slice is averaged (plain or weighted average) to obtain data that represents the overall change within that slice.
  • Specifically, step 102 may include: performing real-time network topology feature extraction on the high-density subgraph to obtain dynamic feature data; capturing the dynamic feature data at the first preset time interval to obtain candidate dynamic feature change data; and performing performance analysis and reliability analysis on the candidate dynamic feature change data to obtain the dynamic feature change data.
  • For example, the server assigns a weight to the network topology feature of each dimension of the high-density subgraph, sorts the features by weight in descending order, and selects features in that order to obtain the specified network topology features. It extracts the specified network topology features by eigenvalue decomposition to obtain dynamic feature data. Using the collection tool Fluentd, which has a flexible plug-in system, requires few resources, and supports memory-based and file-based buffering to prevent data loss between nodes, it captures the dynamic feature data at the first preset time interval to obtain candidate dynamic feature change data. It then performs performance analysis and reliability analysis on the candidate data to obtain dynamic feature change data whose performance and reliability are assured.
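  • The sampling and per-slice averaging in step 102 can be sketched as follows. This is an illustration only: it assumes each sample is an undirected edge list with a timestamp, computes just two of the listed features (number of vertices and average degree), and uses a plain rather than weighted average. The names `topology_features` and `slice_averages` are hypothetical, and the Fluentd collection path is not modeled.

```python
from statistics import mean


def topology_features(edges):
    """Number of vertices and average degree of one undirected snapshot."""
    nodes = {n for edge in edges for n in edge}
    count = len(nodes)
    return {
        "vertices": count,
        "avg_degree": 2 * len(edges) / count if count else 0.0,
    }


def slice_averages(samples, slice_len):
    """Average the sampled features inside consecutive equal time slices.

    samples: list of (timestamp, edges), taken at the sampling interval.
    slice_len: length of one time slice, in the same unit as the timestamps.
    """
    buckets = {}
    for ts, edges in samples:
        buckets.setdefault(int(ts // slice_len), []).append(topology_features(edges))
    return {
        idx: {key: mean(f[key] for f in feats) for key in feats[0]}
        for idx, feats in sorted(buckets.items())
    }


# Three snapshots sampled every 5 time units, grouped into slices of length 10.
samples = [
    (0, [(1, 2)]),
    (5, [(1, 2), (2, 3)]),
    (10, [(1, 2), (2, 3), (1, 3)]),
]
print(slice_averages(samples, slice_len=10))
```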
  • The historical complex relationship network is a complex relationship network generated or stored before the complex relationship network to be analyzed, and the confidence interval indicates the average range of variation of the static feature data between time periods.
  • Static feature change data consists of network topology features, such as the number of vertices, degree, average degree, and average correlation coefficient, at a specific moment in the historical complex relationship network.
  • Specifically, step 103 may include: obtaining the historical complex relationship network, and performing feature selection and extraction on it to obtain static feature data; taking the static feature data as nodes, obtaining the association relationships among the static feature data in the historical complex relationship network, using those relationships as the division condition, and generating static high-density subgraphs from the nodes and the division condition; obtaining the time series data of the static high-density subgraphs, and sampling it at a second preset time interval to obtain static feature change data; computing statistics of the static feature change data over a third preset time interval to obtain the statistical data for each time interval, which includes the number of static high-density subgraphs and the mean and variance of the static feature change data in the third preset time interval; and calculating, through a preset formula, a first confidence threshold and a second confidence threshold from the statistical data of each time interval, and generating the confidence interval from the first confidence threshold and the second confidence threshold.
  • For example, the server assigns a weight to the static feature of each dimension of the historical complex relationship network, sorts the static features by weight in descending order, selects static features in that order to obtain the specified static features, and extracts the specified static features to obtain the static feature data.
  • Suppose the static feature data is A (number of vertices: 5, average degree: 25, average correlation coefficient: 4.5), B (number of vertices: 5, average degree: 30, average correlation coefficient: 5), and C (number of vertices: 6, average degree: 35, average correlation coefficient: 5.5).
  • Suppose further that the correlation between A and C is low (low similarity, low relevance) and the correlation between B and C is high (high similarity, high relevance). The historical complex relationship networks corresponding to A, B, and C are then placed in the same region, the networks corresponding to A and B are connected, and the networks corresponding to B and C are connected, so that the network topologies of A and B are adjacent and those of B and C are adjacent. The static feature change data of each static high-density subgraph is traced back from its generation time at equal time slices Δt, so the corresponding static feature change data, for example the number of nodes of the high-density subgraph at time t0, can be calculated for each static high-density subgraph.
  • The second confidence threshold is greater than the first confidence threshold, and together they give the confidence interval [first confidence threshold, second confidence threshold]. The preset formula takes as inputs the mean of the static feature change data in the third preset time interval, the variance of the static feature change data in the third preset time interval, the number n of historical high-density subgraphs, and a coefficient looked up in a preset confidence-level table.
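  • The preset formula itself is not legible in the text, so the following is only a plausible sketch. It assumes the usual normal-approximation interval, mean ± z·sqrt(variance / n), which uses exactly the four inputs named above (mean, variance, n, and a table coefficient). The function name `confidence_interval` and the contents of `Z_TABLE` are assumptions, not taken from the patent.

```python
import math

# Two-sided standard normal coefficients (assumed stand-in for the
# "preset confidence-level table" mentioned in the text).
Z_TABLE = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}


def confidence_interval(mean, variance, n, level=0.95):
    """Return (first_threshold, second_threshold) for the mean of n observations.

    mean, variance: statistics of the static feature change data in one interval.
    n: number of historical high-density subgraphs contributing to the statistics.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    half_width = Z_TABLE[level] * math.sqrt(variance / n)
    return mean - half_width, mean + half_width


low, high = confidence_interval(mean=10.0, variance=4.0, n=16, level=0.95)
print(low, high)
```

  • The first value returned is always smaller than the second, consistent with the statement that the second confidence threshold is greater than the first.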
  • the server can intuitively and clearly display whether the dynamic feature change data is abnormal in the confidence interval with a statistical analysis graph through a preset statistical analysis tool.
  • the ID of the high-density subgraph is also marked. By marking the initial derived feature and the ID of the high-density subgraph, it is convenient to track the dynamic change of the high-density subgraph corresponding to the derived feature in real time.
  • Specifically, step 104 may include: performing time continuity analysis on the dynamic feature change data to obtain time-continuous first feature data and second feature data, where time continuity means that the end time point of the first feature data and the start time point of the second feature data are the same or adjacent; calculating the feature difference between the first feature data and the second feature data; determining whether the feature difference is outside the confidence interval; if the feature difference is not outside the confidence interval, setting it to 0 and treating the corresponding first feature data and second feature data as non-abnormal features; if the feature difference is outside the confidence interval, setting it to 1 and treating the corresponding first feature data and second feature data as abnormal features; and using the non-abnormal features and the abnormal features as the target derived features.
  • For example, for each equal time slice Δt the server calculates the specified feature change data (that is, the first feature data and the second feature data) of the generated high-density subgraph, and uses a statistical analysis chart, such as a line chart or histogram, to analyze whether the difference between them (that is, the feature difference) falls within the confidence interval at the current moment. The first feature data and second feature data whose feature difference falls outside the confidence interval are taken as abnormal features, and those whose feature difference falls within the confidence interval are taken as non-abnormal features, which yields the target derived features.
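  • The flagging rule of step 104 can be sketched in a few lines. The sketch assumes one scalar feature value per time slice, treats consecutive slices as the first and second feature data, and uses a closed interval; the name `derive_features` and the dictionary layout are hypothetical.

```python
def derive_features(series, interval):
    """Flag each pair of consecutive feature values against a confidence interval.

    series: feature values of one high-density subgraph, one per time slice.
    interval: (first_threshold, second_threshold).
    Returns one record per consecutive pair; flag 0 = non-abnormal, 1 = abnormal.
    """
    low, high = interval
    derived = []
    for first, second in zip(series, series[1:]):
        diff = second - first
        inside = low <= diff <= high
        derived.append({"first": first, "second": second, "flag": 0 if inside else 1})
    return derived


# Vertex counts per slice; the jump from 6 to 12 falls outside [-2, 2].
for record in derive_features([5, 6, 12, 12], (-2, 2)):
    print(record)
```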
  • The server constructs an anomaly detection model, which is a combined model integrating several models. The sample data of the anomaly detection model (sample data with derived features) is screened through expert rules to obtain initial sample data; risk prediction is performed on the initial sample data to obtain a risk value; it is determined whether the risk value is greater than a preset value, and the initial sample data whose risk value is greater than the preset value is taken as candidate sample data, which is based on the Gaussian (normal) distribution in the unsupervised learning algorithm [...]. The detection model then combines the target derived features to detect anomalies in the high-density subgraph.
  • Dynamic-evolution anomaly detection on high-density subgraphs copes well with a surge of black-market activity or fraud within a short period: while the static features of the whole high-density subgraph have not yet deteriorated, the evolution trend of each static feature is detected in time to curb the deterioration of the whole high-density subgraph.
  • Specifically, step 105 may include: creating and marking, through the anomaly detection model, the correspondence between the target derived features and the high-density subgraph to obtain a labeled high-density subgraph; performing anomaly detection on the labeled high-density subgraph with the isolation forest algorithm to obtain an initial abnormal high-density subgraph; and performing anomaly detection on the initial abnormal high-density subgraph with a clustering-based subspace anomaly detection algorithm to obtain the target abnormal high-density subgraph.
  • For example, the server creates and marks, through the anomaly detection model, the correspondence between the target derived features and the high-density subgraph they belong to, and obtains the labeled high-density subgraph, so that anomaly detection on the high-density subgraph can be performed and displayed intuitively and conveniently by analyzing the target derived features. Anomaly detection is then performed on the labeled high-density subgraph with the isolation forest algorithm to obtain the initial abnormal high-density subgraph.
  • the high-density sub-graph E at the current moment can be obtained as the target abnormal high-density sub-graph.
  • Because the derived features may be high-dimensional data, and the accuracy of the isolation forest algorithm is reduced on high-dimensional data, the initial abnormal high-density subgraph obtained by the isolation forest algorithm is further checked with the clustering-based subspace anomaly detection algorithm. This improves detection accuracy and thereby ensures the quality and accuracy of the target abnormal high-density subgraph.
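  • The first stage of step 105 can be illustrated with a simplified, pure-Python isolation forest. This is a sketch of the general technique only (random axis-aligned splits, with the anomaly score derived from average path length); it is not the patent's implementation. The parameters `n_trees`, `sample_size`, and `seed`, and all function names, are assumptions. The clustering-based subspace stage is not shown.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    return {
        "dim": dim,
        "split": split,
        "left": build_tree([r for r in rows if r[dim] < split], depth + 1, max_depth, rng),
        "right": build_tree([r for r in rows if r[dim] >= split], depth + 1, max_depth, rng),
    }


def path_length(tree, row, depth=0):
    """Depth at which `row` lands, plus the expected remaining depth of its leaf."""
    if "size" in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if row[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], row, depth + 1)


def isolation_scores(rows, n_trees=100, sample_size=64, seed=0):
    """Anomaly score in (0, 1) for each row; higher means easier to isolate."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    m = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(m))
    trees = [build_tree(rng.sample(rows, m), 0, max_depth, rng) for _ in range(n_trees)]
    return [
        2 ** (-sum(path_length(t, row) for t in trees) / n_trees / c_factor(m))
        for row in rows
    ]


# Derived-feature vectors of 20 ordinary subgraphs plus one far-away subgraph.
rows = [(0.1 * i, 0.05 * i) for i in range(20)] + [(100.0, 100.0)]
scores = isolation_scores(rows)
print(scores.index(max(scores)))  # index of the most anomalous row
```

  • Rows whose score exceeds a chosen threshold would form the initial abnormal set that the second-stage algorithm re-examines; how that threshold is chosen is not stated in the text.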
  • the embodiment of the present application analyzes the risk capability of the high-density sub-graph by combining the static index of the high-density sub-graph and the dynamic index in the dynamic evolution process, and improves the accuracy of detecting whether the high-density sub-graph is abnormal.
  • another embodiment of the method for detecting abnormal high-density subgraphs in the embodiment of the present application includes:
  • The historical complex relationship network is a complex relationship network generated or stored before the complex relationship network to be analyzed, and the confidence interval indicates the average range of variation of the static feature data between time periods.
  • For steps 201 to 205, refer to steps 101 to 105; the details are not repeated here.
  • The server uses the k-nearest neighbor algorithm to classify the target abnormal high-density subgraph by degree of abnormality and obtains classification information for the different anomalies; uses a time series prediction algorithm to predict how the anomaly of the target abnormal high-density subgraph will develop and obtains anomaly information on its predicted future changes; and uses a clustering algorithm to analyze anomalies of the same type and obtains clustering information on the subgraphs that share the same type of anomaly as the target abnormal high-density subgraph. It then scores the classification information, anomaly information, and clustering information with preset weights, and sorts the target abnormal high-density subgraphs by score in descending order to obtain the final target abnormal high-density subgraphs. This comprehensive evaluation improves the accuracy and quality of the target abnormal high-density subgraphs obtained.
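  • The weighted scoring and ranking at the end of this step can be sketched as below. The sketch assumes the three kinds of information have already been reduced to numeric values per subgraph; the keys `classification`, `prediction`, and `clustering`, the example weights, and the function name `rank_subgraphs` are all hypothetical.

```python
def rank_subgraphs(evaluations, weights):
    """Rank subgraph IDs by weighted score, highest first.

    evaluations: dict mapping subgraph ID to a dict of numeric evaluation values.
    weights: dict mapping each evaluation key to its preset weight.
    """
    scored = {
        gid: sum(weights[key] * values[key] for key in weights)
        for gid, values in evaluations.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


weights = {"classification": 0.5, "prediction": 0.3, "clustering": 0.2}
evaluations = {
    "E": {"classification": 0.9, "prediction": 0.8, "clustering": 0.7},
    "F": {"classification": 0.4, "prediction": 0.9, "clustering": 0.2},
    "G": {"classification": 0.6, "prediction": 0.5, "clustering": 0.9},
}
print(rank_subgraphs(evaluations, weights))
```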
  • the embodiment of the application analyzes the risk capability of the high-density sub-graph by combining the static index of the high-density sub-graph and the dynamic index in the dynamic evolution process, and improves the accuracy of detecting whether the high-density sub-graph is abnormal, and performs processing on the target abnormal high-density sub-graph.
  • The method for detecting abnormal high-density subgraphs in the embodiments of this application is described above; the apparatus for detecting abnormal high-density subgraphs in the embodiments of this application is described below. Referring to FIG. 3, one embodiment of the apparatus for detecting abnormal high-density subgraphs in the embodiments of this application includes:
  • the segmentation processing module 301, used to obtain the complex relationship network to be analyzed and perform real-time graph segmentation on it through a preset algorithm to obtain a high-density subgraph, where the high-density subgraph indicates communities and the association relationships between communities;
  • the sampling processing module 302, used to sample the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, where the dynamic feature change data indicates the network topology feature data of the high-density subgraph as it changes dynamically over time;
  • the statistical calculation module 303, used to obtain the static feature data from the historical complex relationship network and perform statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, where the historical complex relationship network is a complex relationship network generated or stored before the complex relationship network to be analyzed, and the confidence interval indicates the average range of variation of the static feature data between time periods;
  • the judgment analysis module 304, used to divide the dynamic feature change data into non-abnormal features (inside the confidence interval) and abnormal features (outside the confidence interval), and to use the non-abnormal features and abnormal features as target derived features;
  • the anomaly detection module 305, used to perform anomaly detection on the high-density subgraph by combining the target derived features with the anomaly detection model to obtain the target abnormal high-density subgraph.
  • each module in the above apparatus for detecting abnormal high-density subgraph corresponds to each step in the above-mentioned method embodiment of detecting abnormal high-density subgraph, and the functions and implementation processes are not repeated here.
  • the embodiment of the present application analyzes the risk capability of the high-density sub-graph by combining the static index of the high-density sub-graph and the dynamic index in the dynamic evolution process, and improves the accuracy of detecting whether the high-density sub-graph is abnormal.
  • another embodiment of the apparatus for detecting abnormal high-density subgraphs in the embodiment of the present application includes:
  • the segmentation processing module 301 is configured to acquire the complex relationship network to be analyzed and perform real-time graph segmentation on it through a preset algorithm to obtain a high-density subgraph, which indicates communities and the association relationships between communities;
  • the sampling processing module 302 is configured to sample the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, which indicates network topology feature data of the high-density subgraph that changes dynamically over time;
  • the statistical calculation module 303 is configured to acquire static feature data in the historical complex relationship network and perform statistics and calculation on it through a preset statistical model to obtain a confidence interval; the historical complex relationship network indicates a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicates the average range of change of the static feature data between time periods;
  • the judgment analysis module 304 is configured to divide the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and to take the non-abnormal features and abnormal features as derived features;
  • the anomaly detection module 305 is configured to perform anomaly detection on the high-density subgraph through the anomaly detection model combined with the target derived features to obtain the target abnormal high-density subgraph;
  • the processing module 306 is configured to perform anomaly-degree classification, anomaly-development prediction, and same-type anomaly analysis on the target abnormal high-density subgraph to obtain the final target abnormal high-density subgraph.
  • the segmentation processing module 301 is specifically configured to: acquire the complex relationship network to be analyzed, initialize each node of the complex relationship network as a different first community, and calculate the first modularity metric value of the first community;
  • the community division processing refers to initializing each node as a different first community and assigning each node to the second community in which its neighboring node is located;
  • the sampling processing module 302 is specifically configured to: perform feature extraction on the high-density subgraph to obtain dynamic feature data;
  • the statistical calculation module 303 is specifically configured to: acquire the historical complex relationship network, and select and extract the network topology features of the historical complex relationship network to obtain static feature data;
  • statistics are gathered on the static feature change data to obtain the statistical data corresponding to each time interval.
  • the statistical data corresponding to each time interval includes the number of static high-density subgraphs and the static feature change data.
  • the statistical data corresponding to each time interval is calculated with a preset formula to obtain a first confidence threshold and a second confidence threshold, and the confidence interval is generated from the first confidence threshold and the second confidence threshold.
  • the judgment analysis module 304 is specifically configured to: perform a time continuity analysis on the dynamic feature change data to obtain time-continuous first feature data and second feature data, where time-continuous means that the end time point of the first feature data is the same as, or connected to, the start time point of the second feature data;
  • if the feature difference value is not outside the confidence interval, the feature difference value is set to zero, and the first feature data and second feature data corresponding to the feature difference value are taken as non-abnormal features;
  • if the feature difference value is outside the confidence interval, the feature difference value is set to 1, and the first feature data and second feature data corresponding to the feature difference value are taken as abnormal features;
  • the anomaly detection module 305 is specifically configured to: create and mark, through the anomaly detection model, the correspondence between the target derived features and the high-density subgraph, to obtain the marked high-density subgraph;
  • anomaly detection is performed on the marked high-density subgraph by the isolation forest algorithm to obtain the initial abnormal high-density subgraph;
  • anomaly detection is performed on the initial abnormal high-density subgraph by a cluster-based subspace anomaly detection algorithm to obtain the target abnormal high-density subgraph.
  • each module in the above-mentioned abnormal high-density subgraph detection apparatus corresponds to each step in the above-mentioned abnormal high-density subgraph detection method embodiment, and its functions and implementation processes will not be repeated here.
  • the embodiment of the application analyzes the risk capability of the high-density subgraph by combining its static indicators with the dynamic indicators of its dynamic evolution, which improves the accuracy of detecting whether the high-density subgraph is abnormal; performing anomaly-degree classification, anomaly-development prediction, and same-type anomaly analysis on the target abnormal high-density subgraph further improves the accuracy and quality with which it is obtained.
  • FIGS. 3 and 4 above describe in detail the apparatus for detecting abnormal high-density subgraphs in the embodiments of the present application from the perspective of modular functional entities; the following describes the device for detecting abnormal high-density subgraphs in the embodiments of the present application in detail from the perspective of hardware processing.
  • FIG. 5 is a schematic structural diagram of a device for detecting abnormally high-density subgraphs provided by an embodiment of the present application.
  • the device 500 for detecting abnormal high-density subgraphs may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPU) 501 (for example, one or more processors), a memory 509, and one or more storage media 508 (for example, one or more mass storage devices) storing application programs 507 or data 506.
  • the memory 509 and the storage medium 508 may be short-term storage or persistent storage.
  • the program stored in the storage medium 508 may include one or more modules (not shown in the figure), and each module may include a series of command operations on the sign-in management device. Further, the processor 501 may be configured to communicate with the storage medium 508, and execute a series of instruction operations in the storage medium 508 on the device 500 for detecting abnormal high-density subgraphs.
  • the device 500 for detecting abnormal high-density subgraphs may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems 505, for example Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
  • the device structure shown in FIG. 5 does not constitute a limitation on the device for detecting abnormal high-density subgraphs; the device may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • the processor 501 can perform the functions of the segmentation processing module 301, the sampling processing module 302, the statistical calculation module 303, the judgment analysis module 304, the abnormality detection module 305, and the processing module 306 in the foregoing embodiment.
  • the processor 501 is the control center of the device for detecting abnormal high-density subgraphs, and can perform processing according to the method of detecting abnormal high-density subgraphs.
  • the processor 501 uses various interfaces and lines to connect the parts of the entire device for detecting abnormal high-density subgraphs, and performs the device's functions and processes data by running or executing software programs and/or modules stored in the memory 509 and calling data stored in the memory 509.
  • the storage medium 508 and the memory 509 are both carriers for storing data.
  • the storage medium 508 may refer to an internal memory with a small storage capacity but a fast speed, while the memory 509 may be an external memory with a large storage capacity but a slow storage speed.
  • the memory 509 may be used to store software programs and modules.
  • the processor 501 executes various functional applications and data processing of the device 500 for detecting abnormal high-density subgraphs by running the software programs and modules stored in the memory 509.
  • the memory 509 may mainly include a storage program area and a storage data area.
  • the storage program area may store an operating system and the application program required by at least one function (for example, acquiring the complex relationship network to be analyzed and performing real-time graph segmentation on it through a preset algorithm to obtain a high-density subgraph).
  • the storage data area can store data created through use of the sign-in management device (for example, sampling the network topology features of the high-density subgraph at the first preset time interval to obtain dynamic feature change data).
  • the memory 509 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • the program implementing the method for detecting abnormal high-density subgraphs and the received data stream provided in the embodiment of the present application are stored in the memory, and the processor 501 calls them from the memory 509 when needed.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or twisted pair) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any usable medium on which a computer can store data, or a data storage device, such as a server or a data center, that integrates one or more usable media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, an optical disc), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the present application also provides a device for detecting abnormal high-density subgraphs, including: a memory and at least one processor, the memory stores instructions, and the memory and the at least one processor are interconnected by wires; the at least one processor The processor invokes the instructions in the memory, so that the intelligent path planning device executes the steps in the above-mentioned method for detecting abnormally high-density subgraphs.
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
  • the computer-readable storage medium stores computer instructions, and when the computer instructions are executed on the computer, the computer executes the following steps:
  • Acquire static feature data in a historical complex relationship network, and perform statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval; the historical complex relationship network indicates a complex relationship network generated or stored before the complex relationship network.
  • Anomaly detection is performed on the high-density sub-graph by combining the target-derived features with an anomaly detection model to obtain the target abnormal high-density sub-graph.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative; for example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of this application.
  • the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical discs, and other media that can store program code.


Abstract

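A method, apparatus, device, and storage medium for detecting abnormal high-density subgraphs, relating to the field of big data and capable of improving the accuracy of detecting whether a high-density subgraph is abnormal. The method includes: performing real-time graph segmentation on an acquired complex relationship network to be analyzed through a preset algorithm to obtain a high-density subgraph; sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data; acquiring static feature data in a historical complex relationship network, and performing statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval; dividing the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and taking the non-abnormal features and abnormal features as target derived features; and performing anomaly detection on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph.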

Description

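Method, apparatus, device, and storage medium for detecting abnormal high-density subgraphs
This application claims priority to Chinese patent application No. 202010226309.8, filed with the China Patent Office on March 27, 2020 and entitled "Method, apparatus, device, and storage medium for detecting abnormal high-density subgraphs", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of big data technology, and in particular to a method, apparatus, device, and storage medium for detecting abnormal high-density subgraphs.
Background
Complex relationship networks play a pivotal role in risk control and anti-fraud, and are especially effective for identifying malicious groups and for group-level prevention and control of fraud risk. Current analysis methods based on complex high-density subgraphs are all static: at a given moment the overall content of a high-density subgraph is analyzed to obtain various predefined indicators, the nature of the subgraph is then classified, and fraudulent groups are identified accordingly. The inventor realized that, as the capabilities of the illicit (black-market) industry grow, analyzing a high-density subgraph only from a static perspective makes it difficult to identify a fraudulent group (that is, an abnormal high-density subgraph) well, which lowers the accuracy of detecting whether a high-density subgraph is abnormal.
Summary
The main purpose of this application is to improve the accuracy of detecting whether a high-density subgraph is abnormal.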
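To achieve the above purpose, a first aspect of this application provides a method for detecting abnormal high-density subgraphs, including: acquiring a complex relationship network to be analyzed, and performing real-time graph segmentation on the complex relationship network through a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the association relationships between communities; sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating network topology feature data of the high-density subgraph that changes dynamically over time; acquiring static feature data in a historical complex relationship network, and performing statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of change of the static feature data between time periods; dividing the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and taking the non-abnormal features and the abnormal features as target derived features; and performing anomaly detection on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph.
A second aspect of this application provides a device for detecting abnormal high-density subgraphs, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer-readable instructions: acquiring a complex relationship network to be analyzed, and performing real-time graph segmentation on the complex relationship network through a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the association relationships between communities; sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating network topology feature data of the high-density subgraph that changes dynamically over time; acquiring static feature data in a historical complex relationship network, and performing statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of change of the static feature data between time periods; dividing the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and taking the non-abnormal features and the abnormal features as target derived features; and performing anomaly detection on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph.
A third aspect of this application provides a computer-readable storage medium storing computer instructions that, when run on a computer, cause the computer to execute the following steps: acquiring a complex relationship network to be analyzed, and performing real-time graph segmentation on the complex relationship network through a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the association relationships between communities; sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating network topology feature data of the high-density subgraph that changes dynamically over time; acquiring static feature data in a historical complex relationship network, and performing statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of change of the static feature data between time periods; dividing the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and taking the non-abnormal features and the abnormal features as target derived features; and performing anomaly detection on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph.
A fourth aspect of this application provides an apparatus for detecting abnormal high-density subgraphs, including: a segmentation processing module, configured to acquire a complex relationship network to be analyzed and perform real-time graph segmentation on the complex relationship network through a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the association relationships between communities; a sampling processing module, configured to sample the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating network topology feature data of the high-density subgraph that changes dynamically over time; a statistical calculation module, configured to acquire static feature data in a historical complex relationship network and perform statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of change of the static feature data between time periods; a judgment analysis module, configured to divide the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and to take the non-abnormal features and the abnormal features as target derived features; and an anomaly detection module, configured to perform anomaly detection on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph.
In the technical solution provided by this application, real-time graph segmentation is performed on the acquired complex relationship network to be analyzed through a preset algorithm to obtain a high-density subgraph; the network topology features of the high-density subgraph are sampled at a first preset time interval to obtain dynamic feature change data; static feature data in a historical complex relationship network are acquired, and statistics and calculation are performed on them through a preset statistical model to obtain a confidence interval; the dynamic feature change data are divided into non-abnormal features and abnormal features according to whether they fall inside or outside the confidence interval, and the non-abnormal and abnormal features are taken as target derived features; and anomaly detection is performed on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph. The embodiments of this application analyze the risk capability of the high-density subgraph by combining its static indicators with the dynamic indicators of its dynamic evolution, which improves the accuracy of detecting whether a high-density subgraph is abnormal.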
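Brief Description of the Drawings
FIG. 1 is a schematic diagram of one embodiment of the method for detecting abnormal high-density subgraphs in the embodiments of this application;
FIG. 2 is a schematic diagram of another embodiment of the method for detecting abnormal high-density subgraphs in the embodiments of this application;
FIG. 3 is a schematic diagram of one embodiment of the apparatus for detecting abnormal high-density subgraphs in the embodiments of this application;
FIG. 4 is a schematic diagram of another embodiment of the apparatus for detecting abnormal high-density subgraphs in the embodiments of this application;
FIG. 5 is a schematic diagram of one embodiment of the device for detecting abnormal high-density subgraphs in the embodiments of this application.
Detailed Description
The embodiments of this application provide a method, apparatus, device, and storage medium for detecting abnormal high-density subgraphs, which analyze the risk capability of a high-density subgraph by combining its static indicators with the dynamic indicators of its dynamic evolution, improving the accuracy of detecting whether a high-density subgraph is abnormal.
To help those skilled in the art better understand the solution of this application, the embodiments of this application are described below with reference to the accompanying drawings.
The terms "first", "second", "third", "fourth", and so on (if any) in the specification, claims, and drawings of this application are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
For ease of understanding, the specific flow of the embodiments of this application is described below. Referring to FIG. 1, one embodiment of the method for detecting abnormal high-density subgraphs in the embodiments of this application includes:
In one embodiment, the method for detecting abnormal high-density subgraphs includes: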
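101. Acquire a complex relationship network to be analyzed, and perform real-time graph segmentation on the complex relationship network through a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the association relationships between communities.
A complex relationship network consists of business content and the connections between items of business content, for example: how people in a certain campus use a certain platform, how heavily they use it, and what relationships exist among the companies that use the platform. Because the complex relationship network changes constantly with business and time, when the server receives an instruction sent by a terminal or client, it performs real-time graph segmentation and community planning on the complex relationship network at the current moment through the preset algorithm to obtain high-density subgraphs whose relationships are more strongly associated and closer, and it triggers a data collection instruction at the same time as the high-density subgraphs are generated.
Specifically, step 101 may include: acquiring the complex relationship network to be analyzed, initializing each node of the complex relationship network as a different first community, and calculating a first modularity metric value of the first community; assigning each node to the second community in which one of its neighboring nodes is located, and calculating a second modularity metric value of the second community; calculating, for each node, the difference between the first modularity metric value and the second modularity metric value; analyzing whether the difference is positive, and if it is not, continuing the community division processing on the nodes until the difference is positive, to obtain divided communities, where the community division processing refers to initializing each node as a different first community and assigning each node to the second community in which its neighboring node is located; and acquiring and analyzing the connecting-edge weights between the communities in the divided communities, and taking the graph formed by the divided communities whose connecting-edge weights are all greater than a preset threshold as the high-density subgraph.
For example, when the server receives an instruction sent by a terminal or client, it reads the complex relationship network stored in the database. Take two adjacent nodes of the network, node A and node B, as an illustration. Node A and node B are each placed in a separate community, so that node A corresponds to community A1 and node B to community B1, and the first modularity metric values of community A1 and community B1 are calculated. Node A is then assigned to the community containing node B, giving community A2, and node B is assigned to the community containing node A, giving community B2, and the second modularity metric values of community A2 and community B2 are calculated. The difference between the first and second modularity metric values measures the strength of the network community structure of communities A1 and A2 (or of communities B1 and B2). A large connecting-edge weight indicates high relationship complexity and strong association, so the graph obtained from the divided communities whose connecting-edge weights are all greater than the preset threshold is taken as the high-density subgraph, which improves the quality of the generated high-density subgraph.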
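The modularity comparison in step 101 can be illustrated with a small sketch. It assumes an unweighted, undirected graph and uses the standard Newman modularity as the "modularity metric value"; the source does not name the exact metric or the sign convention of the difference, so both are assumptions here, and the function name `modularity` and the sample graph are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity of a partition of an unweighted, undirected graph."""
    m = len(edges)
    internal = defaultdict(int)  # edges with both ends inside a community
    degree = defaultdict(int)    # summed node degree per community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Two triangles joined by a single edge c-d (illustrative data only).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]

first = {node: node for node in "abcdef"}   # every node in its own community
second = dict(first, b="a")                 # move node b into a's community
gain = modularity(edges, second) - modularity(edges, first)

triangles = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
print(gain > 0, modularity(edges, triangles))
```

In this sketch a positive `gain` means that merging a node into its neighbor's community strengthened the community structure, which is the condition the step waits for before accepting a division.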
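102. Sample the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating network topology feature data of the high-density subgraph that changes dynamically over time.
Dynamic feature change data are network topology features, such as the number of vertices, degree, average degree, and average correlation coefficient, that change dynamically over time. The data collection instruction starts the relevant data collection tool, which captures the features of the high-density subgraph at every first preset time interval to obtain dynamic feature change data over consecutive, equal time slices. The dynamic feature change data within each time slice can be averaged, or combined by a weighted average, to obtain dynamic feature change data that represent the overall change within that slice.
Specifically, step 102 may include: performing real-time network topology feature extraction on the high-density subgraph to obtain dynamic feature data; capturing the dynamic feature data at the first preset time interval to obtain candidate dynamic feature change data; and performing performance analysis and reliability analysis on the candidate dynamic feature change data to obtain the dynamic feature change data.
The server assigns a weight to each dimension of the network topology features in the high-density subgraph, sorts the features by weight in descending order, performs feature selection on the features in the specified order to obtain specified network topology features, and extracts the specified network topology features through eigenvalue decomposition to obtain dynamic feature data. It then uses the collection tool Fluentd, which has a flexible plug-in system, requires few resources, and supports memory-based and file-based buffering to prevent data loss between nodes, to capture the dynamic feature data at the first preset time interval and obtain candidate dynamic feature change data. Performance analysis and reliability analysis are performed on the candidate dynamic feature change data to obtain dynamic feature change data whose performance and reliability are assured.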
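The per-slice averaging mentioned for step 102 can be sketched as follows. This is a minimal illustration that assumes samples arrive as (timestamp in seconds, value) pairs for a single feature; the function name `slice_means` and the sample values are hypothetical, and the collection tool itself is not modeled.

```python
from collections import defaultdict
from statistics import mean

def slice_means(samples, interval):
    """Average one feature's samples within consecutive, equal time slices."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[int(timestamp // interval)].append(value)
    # One representative value per slice, ordered by slice index.
    return {index: mean(values) for index, values in sorted(buckets.items())}

# Node counts of one subgraph sampled every 30 minutes, grouped into 1-hour slices.
node_counts = [(0, 10), (1800, 12), (3600, 20), (5400, 22)]
per_slice = slice_means(node_counts, interval=3600)
print(per_slice)
```

A weighted average could replace `mean` where more recent samples within a slice should count for more.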
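103. Acquire static feature data in a historical complex relationship network, and perform statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of change of the static feature data between time periods.
Static feature change data are the network topology features, such as the number of vertices, degree, average degree, and average correlation coefficient, of the historical complex relationship network at specific moments. Using equal time intervals (the length depends on the business scenario; it is typically one hour, and can be measured in minutes for more demanding scenarios), statistics and calculation are performed on the static feature data of the historical complex relationship network: the overall state of the static feature change data in each time slice (number of vertices, degree, average degree, average correlation coefficient, and so on) is tallied, and the average range of change of each static feature change datum between time slices (that is, the confidence interval) is calculated. These average ranges of change serve as the baseline against which the dynamic feature change data are judged.
Specifically, step 103 may include: acquiring the historical complex relationship network, and selecting and extracting features of the historical complex relationship network to obtain static feature data; taking the static feature data as nodes, acquiring the association relationships between the static feature data in the historical complex relationship network, taking the association relationships as the division condition, and generating static high-density subgraphs from the nodes and the division condition; acquiring the time-series data of the static high-density subgraphs, and sampling the time-series data at a second preset time interval to obtain static feature change data; gathering statistics on the static feature change data at a third preset time interval to obtain the statistical data corresponding to each time interval, which include the number of static high-density subgraphs and the mean and variance of the static feature change data within the third preset time interval; and calculating the statistical data corresponding to each time interval with a preset formula to obtain a first confidence threshold and a second confidence threshold, and generating the confidence interval from the first confidence threshold and the second confidence threshold.
The server assigns a weight to each dimension of the static features in the historical complex relationship network, sorts the static features by weight in descending order, performs feature selection on the static features in the specified order to obtain specified static features, and extracts the specified static features through eigenvalue decomposition to obtain static feature data. With the static feature data as nodes and the association relationships between static feature data in the historical complex relationship network as the division condition, the historical complex relationship network is divided into high-density subgraphs to obtain static high-density subgraphs. For example, suppose the static feature data are A (number of vertices: 5, average degree: 25, average correlation coefficient: 4.5), B (number of vertices: 5, average degree: 30, average correlation coefficient: 5), and C (number of vertices: 6, average degree: 35, average correlation coefficient: 5.5), where the regions of the historical complex relationship network corresponding to A, B, and C are very far apart, A and B have very high similarity and very high association, A and C have low similarity and low association, and B and C have fairly high similarity and fairly high association. The historical complex relationship networks corresponding to A, B, and C are then placed in the same region, the networks corresponding to A and B are joined, and the networks corresponding to B and C are joined; that is, on the static high-density subgraph, the network topologies corresponding to A and B are adjacent, and those corresponding to B and C are adjacent. For each static high-density subgraph, the static feature change data are traced back from the moment the subgraph was generated at equal time-slice intervals Δt, so that the corresponding static feature change data can be computed, for example: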
Figure PCTCN2020103200-appb-000001
denotes the number of nodes in the high-density subgraph at time t_0;
Figure PCTCN2020103200-appb-000002
For each individual static feature change datum, its change at each moment is calculated:
Figure PCTCN2020103200-appb-000003
Figure PCTCN2020103200-appb-000004
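All static high-density subgraphs are tallied, and the mean change of the above static feature change data in each time slice Δt (that is, the mean within the third preset time interval) and the confidence interval are calculated. The preset formula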
Figure PCTCN2020103200-appb-000005
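is applied to the statistical data corresponding to each time interval to obtain a first confidence threshold and a second confidence threshold, the second being greater than the first; the confidence interval [first confidence threshold, second confidence threshold] is obtained from the two thresholds, where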
Figure PCTCN2020103200-appb-000006
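is the mean of the static feature change data within the third preset time interval, σ is the variance of the static feature change data within the third preset time interval, n is the number of historical high-density subgraphs, and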
Figure PCTCN2020103200-appb-000007
is the corresponding value obtained by looking up the preset percentage confidence interval table.
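The preset formula itself survives here only as a figure placeholder, so the following is a hedged sketch rather than the patent's formula. It assumes the usual normal-theory interval, mean ± z · s / √n, with s taken as the sample standard deviation (the text calls σ the variance) and z computed from the standard normal distribution instead of a lookup table. The function name `confidence_interval` and the sample data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(changes, level=0.95):
    """Return (first threshold, second threshold) for the mean per-slice change."""
    n = len(changes)
    center = mean(changes)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    half_width = z * stdev(changes) / sqrt(n)   # assumes s is a standard deviation
    return center - half_width, center + half_width

# Change in node count over one time slice, one value per historical subgraph.
node_count_changes = [2, 3, 1, 2, 4, 2, 3, 3]
low, high = confidence_interval(node_count_changes)
wide = confidence_interval(node_count_changes, level=0.99)
print(low, high)
```

Raising `level` widens the interval, so fewer changes are flagged as abnormal; the appropriate level would depend on the business scenario.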
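104. Divide the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and take the non-abnormal features and abnormal features as target derived features.
The server can use a preset statistical analysis tool to show, intuitively and clearly in statistical charts, whether the dynamic feature change data are abnormal with respect to the confidence interval. Dynamic feature change data inside the confidence interval are judged (defined) to be non-abnormal features, and dynamic feature change data outside the confidence interval are judged (defined) to be abnormal features; the non-abnormal and abnormal features are the target derived features. In addition, the ID of the high-density subgraph is marked. Marking the initial derived features and the ID of the high-density subgraph makes it possible to track, in real time, the dynamic changes of the high-density subgraph to which the derived features correspond.
Specifically, step 104 may include: performing a time continuity analysis on the dynamic feature change data to obtain time-continuous first feature data and second feature data, where time-continuous means that the end time point of the first feature data is the same as, or connected to, the start time point of the second feature data; calculating the feature difference value between the first feature data and the second feature data; determining whether the feature difference value is outside the confidence interval; if the feature difference value is not outside the confidence interval, setting the feature difference value to zero and taking the corresponding first feature data and second feature data as non-abnormal features; if the feature difference value is outside the confidence interval, setting the feature difference value to 1 and taking the corresponding first feature data and second feature data as abnormal features; and taking the non-abnormal features and abnormal features as the target derived features.
For the generated high-density subgraph, the server calculates the specified feature change data (that is, the first feature data and the second feature data) at every equal time slice Δt, and analyzes the difference between them (the feature difference value) through statistical charts, generating a line chart, histogram, or other chart to determine whether the feature difference value falls within the confidence interval at the current moment. The first and second feature data whose feature difference value falls outside the confidence interval are taken as abnormal features, and those whose feature difference value falls within it are taken as non-abnormal features, which gives the target derived features. In this way, all the dynamic feature change data (the target derived features) and abnormal features of each high-density subgraph are available at every moment. The feature difference value, that is, the indicator change, is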
Figure PCTCN2020103200-appb-000008
Figure PCTCN2020103200-appb-000009
The derived feature for t_0 to t_1 is (0, 0, 1, ...).
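The 0/1 encoding of step 104 can be sketched as below, under the assumption that each feature has its own confidence interval for the change between two consecutive samples. The function name `derived_features`, the feature names, and all numbers are hypothetical.

```python
def derived_features(first, second, intervals):
    """Flag each feature change: 0 if inside its confidence interval, else 1."""
    flags = []
    for name, (low, high) in intervals.items():
        difference = second[name] - first[name]   # feature difference value
        flags.append(0 if low <= difference <= high else 1)
    return tuple(flags)

# Assumed per-feature confidence intervals for the change over one slice.
intervals = {
    "nodes": (-2, 3),
    "avg_degree": (-1.0, 1.5),
    "avg_coefficient": (-0.2, 0.2),
}
at_t0 = {"nodes": 5, "avg_degree": 25.0, "avg_coefficient": 4.5}
at_t1 = {"nodes": 6, "avg_degree": 26.0, "avg_coefficient": 5.5}

features = derived_features(at_t0, at_t1, intervals)
print(features)
```

Here only the third feature changes by more than its interval allows, so only it is flagged with 1.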
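105. Perform anomaly detection on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph.
The server constructs an anomaly detection model, which is a combined model that integrates several performance models. The sample data in the anomaly detection model (sample data with derived features) are screened by expert rules to obtain initial sample data; risk prediction is performed on the initial sample data to obtain risk values; the initial sample data whose risk value exceeds a preset value are taken as candidate sample data; and a normal-distribution analysis is performed on the candidate sample data with a Gaussian (normal) distribution based anomaly detection algorithm, which is an unsupervised learning algorithm, to obtain the target abnormal high-density subgraphs corresponding to the anomalies in the target derived features. This completes the training and yields the final target anomaly detection model, and anomaly detection is performed on the high-density subgraph through the anomaly detection model combined with the target derived features. Detecting anomalies in the dynamic evolution of high-density subgraphs copes well with a large influx of illicit or fraudulent activity in a short time: before the static features of the whole high-density subgraph have deteriorated, the deterioration of the subgraph can be curbed in time through the evolution trend of each static feature.
Specifically, step 105 may include: creating and marking, through the anomaly detection model, the correspondence between the target derived features and the high-density subgraph to obtain the marked high-density subgraph; performing anomaly detection on the marked high-density subgraph with the isolation forest algorithm to obtain an initial abnormal high-density subgraph; and performing anomaly detection on the initial abnormal high-density subgraph with a cluster-based subspace anomaly detection algorithm to obtain the target abnormal high-density subgraph.
Through the anomaly detection model, the server creates and marks the correspondence between the target derived features and the high-density subgraphs they belong to, obtaining the marked high-density subgraphs, so that analyzing the target derived features allows anomalies in the high-density subgraphs to be detected and displayed intuitively and conveniently. Anomaly detection is performed on the marked high-density subgraphs with the isolation forest algorithm to obtain the initial abnormal high-density subgraphs. For example, suppose there are five high-density subgraphs A, B, C, D, and E at the current moment, whose derived features over the previous time interval are A (0,0,0,0,1), B (0,0,0,0,0), C (0,0,0,0,1), D (0,0,0,0,0), and E (0,1,1,0,1); analysis with the isolation forest algorithm of the anomaly detection model shows that high-density subgraph E is the target abnormal high-density subgraph at the current moment. Because the derived features may be high-dimensional and the accuracy of the isolation forest algorithm is affected on high-dimensional data, the initial abnormal high-density subgraphs obtained by the isolation forest algorithm are further checked with the cluster-based subspace anomaly detection algorithm, which improves the accuracy of anomaly detection and thus ensures the quality and accuracy of the target abnormal high-density subgraph.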
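The five-subgraph example above can be reproduced with a compact, pure-Python isolation forest. This is an illustrative sketch, not the patent's implementation: it grows each tree on all five points without subsampling, and the names `c_factor`, `build_tree`, `path_length`, and `anomaly_scores`, the tree count, and the seed are assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(points, rng, depth, limit):
    """Recursively split points on a random feature at a random threshold."""
    if depth >= limit or len(points) <= 1:
        return {"size": len(points)}
    dims = [j for j in range(len(points[0]))
            if min(p[j] for p in points) < max(p[j] for p in points)]
    if not dims:  # only duplicates remain, so nothing can be split
        return {"size": len(points)}
    j = rng.choice(dims)
    split = rng.uniform(min(p[j] for p in points), max(p[j] for p in points))
    return {
        "dim": j,
        "split": split,
        "left": build_tree([p for p in points if p[j] < split], rng, depth + 1, limit),
        "right": build_tree([p for p in points if p[j] >= split], rng, depth + 1, limit),
    }

def path_length(x, node, depth=0):
    """Depth at which x lands in a leaf, adjusted for the leaf's unsplit size."""
    if "dim" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return path_length(x, child, depth + 1)

def anomaly_scores(samples, n_trees=200, seed=0):
    """Isolation-forest score per sample: closer to 1 means easier to isolate."""
    rng = random.Random(seed)
    points = list(samples.values())
    limit = math.ceil(math.log2(max(len(points), 2)))
    trees = [build_tree(points, rng, 0, limit) for _ in range(n_trees)]
    norm = c_factor(len(points))
    return {
        name: 2.0 ** (-sum(path_length(x, t) for t in trees) / (n_trees * norm))
        for name, x in samples.items()
    }

subgraphs = {
    "A": (0, 0, 0, 0, 1),
    "B": (0, 0, 0, 0, 0),
    "C": (0, 0, 0, 0, 1),
    "D": (0, 0, 0, 0, 0),
    "E": (0, 1, 1, 0, 1),
}
scores = anomaly_scores(subgraphs)
most_anomalous = max(scores, key=scores.get)
print(most_anomalous)
```

Subgraph E is the only one that differs in the second and third features, so most trees isolate it at the first split, which gives it the shortest average path and the highest score. The cluster-based subspace check that follows in the text is not modeled here.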
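The embodiments of this application analyze the risk capability of the high-density subgraph by combining its static indicators with the dynamic indicators of its dynamic evolution, which improves the accuracy of detecting whether a high-density subgraph is abnormal.
Referring to FIG. 2, another embodiment of the method for detecting abnormal high-density subgraphs in the embodiments of this application includes:
201. Acquire a complex relationship network to be analyzed, and perform real-time graph segmentation on the complex relationship network through a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the association relationships between communities.
202. Sample the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating network topology feature data of the high-density subgraph that changes dynamically over time.
203. Acquire static feature data in a historical complex relationship network, and perform statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of change of the static feature data between time periods.
204. Divide the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and take the non-abnormal features and abnormal features as target derived features.
205. Perform anomaly detection on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph.
In this embodiment, steps 201 to 205 correspond to steps 101 to 105 and are not described again here.
206. Perform anomaly-degree classification, anomaly-development prediction, and same-type anomaly analysis on the target abnormal high-density subgraph to obtain the final target abnormal high-density subgraph.
The server classifies the target abnormal high-density subgraphs by degree of anomaly with the k-nearest neighbors algorithm to obtain classification information for the different degrees of anomaly; predicts the development of the anomaly with a time-series forecasting algorithm to obtain anomaly information on the abnormal changes that can be predicted for future periods; and performs same-type anomaly analysis with a clustering algorithm to obtain clustering information on anomalies of the same type as the target abnormal high-density subgraph. The classification information, anomaly information, and clustering information are scored with preset weights, and the target abnormal high-density subgraphs are sorted by score in descending order to obtain the final target abnormal high-density subgraph. This comprehensive evaluation improves the accuracy and quality with which target abnormal high-density subgraphs are obtained.
The embodiments of this application analyze the risk capability of the high-density subgraph by combining its static indicators with the dynamic indicators of its dynamic evolution, which improves the accuracy of detecting whether a high-density subgraph is abnormal; performing anomaly-degree classification, anomaly-development prediction, and same-type anomaly analysis on the target abnormal high-density subgraph further improves the accuracy and quality with which it is obtained.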
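The weighted scoring and ranking in step 206 can be sketched as follows. It assumes that the classification, forecasting, and clustering stages have each already been reduced to a number between 0 and 1 per subgraph; how that reduction is done is not specified in the source, and the names `rank_subgraphs`, `severity`, `trend`, and `cluster`, along with every weight and value, are hypothetical.

```python
def rank_subgraphs(evaluations, weights):
    """Sort subgraph names by weighted score, highest first."""
    def score(name):
        return sum(weights[key] * evaluations[name][key] for key in weights)
    return sorted(evaluations, key=score, reverse=True)

# Preset weights for the three kinds of information (assumed values).
weights = {"severity": 0.5, "trend": 0.3, "cluster": 0.2}

# Assumed per-subgraph results of classification, forecasting, and clustering.
evaluations = {
    "A": {"severity": 0.4, "trend": 0.2, "cluster": 0.9},
    "C": {"severity": 0.6, "trend": 0.7, "cluster": 0.1},
    "E": {"severity": 0.9, "trend": 0.8, "cluster": 0.5},
}

ranking = rank_subgraphs(evaluations, weights)
print(ranking)
```

With these numbers the weighted scores are 0.79 for E, 0.53 for C, and 0.44 for A, so E is ranked first.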
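The method for detecting abnormal high-density subgraphs in the embodiments of this application is described above; the apparatus for detecting abnormal high-density subgraphs is described below. Referring to FIG. 3, one embodiment of the apparatus for detecting abnormal high-density subgraphs in the embodiments of this application includes:
a segmentation processing module 301, configured to acquire a complex relationship network to be analyzed and perform real-time graph segmentation on the complex relationship network through a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the association relationships between communities;
a sampling processing module 302, configured to sample the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating network topology feature data of the high-density subgraph that changes dynamically over time;
a statistical calculation module 303, configured to acquire static feature data in a historical complex relationship network and perform statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of change of the static feature data between time periods;
a judgment analysis module 304, configured to divide the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and to take the non-abnormal features and abnormal features as target derived features;
an anomaly detection module 305, configured to perform anomaly detection on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph.
The functions of the modules in the above apparatus for detecting abnormal high-density subgraphs correspond to the steps in the above method embodiment, and their functions and implementation are not repeated here.
The embodiments of this application analyze the risk capability of the high-density subgraph by combining its static indicators with the dynamic indicators of its dynamic evolution, which improves the accuracy of detecting whether a high-density subgraph is abnormal.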
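Referring to FIG. 4, another embodiment of the apparatus for detecting abnormal high-density subgraphs in the embodiments of this application includes:
a segmentation processing module 301, configured to acquire a complex relationship network to be analyzed and perform real-time graph segmentation on the complex relationship network through a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the association relationships between communities;
a sampling processing module 302, configured to sample the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating network topology feature data of the high-density subgraph that changes dynamically over time;
a statistical calculation module 303, configured to acquire static feature data in a historical complex relationship network and perform statistics and calculation on the static feature data through a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of change of the static feature data between time periods;
a judgment analysis module 304, configured to divide the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and to take the non-abnormal features and abnormal features as derived features;
an anomaly detection module 305, configured to perform anomaly detection on the high-density subgraph through an anomaly detection model combined with the target derived features to obtain a target abnormal high-density subgraph;
a processing module 306, configured to perform anomaly-degree classification, anomaly-development prediction, and same-type anomaly analysis on the target abnormal high-density subgraph to obtain the final target abnormal high-density subgraph.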
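Optionally, the segmentation processing module 301 is specifically configured to: acquire the complex relationship network to be analyzed, initialize each node of the complex relationship network as a different first community, and calculate a first modularity metric value of the first community;
assign each node to the second community in which one of its neighboring nodes is located, and calculate a second modularity metric value of the second community;
calculate, for each node, the difference between the first modularity metric value and the second modularity metric value;
analyze whether the difference is positive, and if it is not, continue the community division processing on the nodes until the difference is positive, to obtain divided communities, where the community division processing refers to initializing each node as a different first community and assigning each node to the second community in which its neighboring node is located;
acquire and analyze the connecting-edge weights between the communities in the divided communities, and take the graph formed by the divided communities whose connecting-edge weights are all greater than a preset threshold as the high-density subgraph.
Optionally, the sampling processing module 302 is specifically configured to: perform feature extraction on the high-density subgraph to obtain dynamic feature data;
perform real-time network topology feature extraction on the high-density subgraph to obtain dynamic feature data;
capture the dynamic feature data at the first preset time interval to obtain candidate dynamic feature change data;
perform performance analysis and reliability analysis on the candidate dynamic feature change data to obtain the dynamic feature change data.
Optionally, the statistical calculation module 303 is specifically configured to: acquire the historical complex relationship network, and select and extract the network topology features of the historical complex relationship network to obtain static feature data;
take the static feature data as nodes, acquire the association relationships between the static feature data in the historical complex relationship network, take the association relationships as the division condition, and generate static high-density subgraphs from the nodes and the division condition;
acquire the time-series data in the static high-density subgraphs, and sample the time-series data at a second preset time interval to obtain static feature change data;
gather statistics on the static feature change data at a third preset time interval to obtain the statistical data corresponding to each time interval, which include the number of static high-density subgraphs and the mean and variance of the static feature change data within the third preset time interval;
calculate the statistical data corresponding to each time interval with a preset formula to obtain a first confidence threshold and a second confidence threshold, and generate the confidence interval from the first confidence threshold and the second confidence threshold.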
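Optionally, the judgment analysis module 304 is specifically configured to: perform a time continuity analysis on the dynamic feature change data to obtain time-continuous first feature data and second feature data, where time-continuous means that the end time point of the first feature data is the same as, or connected to, the start time point of the second feature data;
calculate the feature difference value between the first feature data and the second feature data;
determine whether the feature difference value is outside the confidence interval;
if the feature difference value is not outside the confidence interval, set the feature difference value to zero and take the corresponding first feature data and second feature data as non-abnormal features;
if the feature difference value is outside the confidence interval, set the feature difference value to 1 and take the corresponding first feature data and second feature data as abnormal features;
take the non-abnormal features and abnormal features as the target derived features.
Optionally, the anomaly detection module 305 is specifically configured to: create and mark, through the anomaly detection model, the correspondence between the target derived features and the high-density subgraph to obtain the marked high-density subgraph;
perform anomaly detection on the marked high-density subgraph with the isolation forest algorithm to obtain an initial abnormal high-density subgraph;
perform anomaly detection on the initial abnormal high-density subgraph with a cluster-based subspace anomaly detection algorithm to obtain the target abnormal high-density subgraph.
The functions of the modules in the above apparatus for detecting abnormal high-density subgraphs correspond to the steps in the above method embodiment, and their functions and implementation are not repeated here.
The embodiments of this application analyze the risk capability of the high-density subgraph by combining its static indicators with the dynamic indicators of its dynamic evolution, which improves the accuracy of detecting whether a high-density subgraph is abnormal; performing anomaly-degree classification, anomaly-development prediction, and same-type anomaly analysis on the target abnormal high-density subgraph further improves the accuracy and quality with which it is obtained.
FIGS. 3 and 4 above describe the apparatus for detecting abnormal high-density subgraphs in the embodiments of this application in detail from the perspective of modular functional entities; the device for detecting abnormal high-density subgraphs in the embodiments of this application is described in detail below from the perspective of hardware processing.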
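FIG. 5 is a schematic structural diagram of a device for detecting abnormal high-density subgraphs provided by an embodiment of this application. The device 500 for detecting abnormal high-density subgraphs may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPU) 501 (for example, one or more processors), a memory 509, and one or more storage media 508 (for example, one or more mass storage devices) storing application programs 507 or data 506. The memory 509 and the storage medium 508 may be transient or persistent storage. The program stored in the storage medium 508 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations on the sign-in management device. Further, the processor 501 may be configured to communicate with the storage medium 508 and to execute, on the device 500 for detecting abnormal high-density subgraphs, the series of instruction operations in the storage medium 508.
The device 500 for detecting abnormal high-density subgraphs may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems 505, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and so on. Those skilled in the art will understand that the device structure shown in FIG. 5 does not limit the device for detecting abnormal high-density subgraphs, which may include more or fewer components than shown, combine certain components, or arrange the components differently. The processor 501 can perform the functions of the segmentation processing module 301, the sampling processing module 302, the statistical calculation module 303, the judgment analysis module 304, the anomaly detection module 305, and the processing module 306 in the above embodiments.
The components of the device for detecting abnormal high-density subgraphs are described in detail below with reference to FIG. 5:
The processor 501 is the control center of the device for detecting abnormal high-density subgraphs and can perform processing according to the method for detecting abnormal high-density subgraphs. The processor 501 uses various interfaces and lines to connect the parts of the entire device, and performs the device's functions and processes data by running or executing software programs and/or modules stored in the memory 509 and calling data stored in the memory 509, thereby improving the accuracy of detecting whether a high-density subgraph is abnormal. The storage medium 508 and the memory 509 are both carriers for storing data. In the embodiments of this application, the storage medium 508 may refer to an internal memory with a small storage capacity but a fast speed, while the memory 509 may be an external memory with a large storage capacity but a slow storage speed.
The memory 509 may be used to store software programs and modules, and the processor 501 executes the functional applications and data processing of the device 500 for detecting abnormal high-density subgraphs by running the software programs and modules stored in the memory 509. The memory 509 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application program required by at least one function (for example, acquiring the complex relationship network to be analyzed and performing real-time graph segmentation on it through a preset algorithm to obtain a high-density subgraph); the data storage area may store data created through use of the sign-in management device (for example, sampling the network topology features of the high-density subgraph at the first preset time interval to obtain dynamic feature change data). In addition, the memory 509 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. The program implementing the method for detecting abnormal high-density subgraphs and the received data stream provided in the embodiments of this application are stored in the memory, and the processor 501 calls them from the memory 509 when needed.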
在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、双绞线)或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存储的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,光盘)、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
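This application further provides a device for detecting abnormal high-density subgraphs, including a memory and at least one processor, where the memory stores instructions and the memory and the at least one processor are interconnected by a line; the at least one processor calls the instructions in the memory to cause the device for detecting abnormal high-density subgraphs to perform the steps of the above method for detecting abnormal high-density subgraphs.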
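This application further provides a computer-readable storage medium, which may be a non-volatile or a volatile computer-readable storage medium. The computer-readable storage medium stores computer instructions which, when run on a computer, cause the computer to perform the following steps:
obtaining a complex relationship network to be analyzed, and performing real-time graph partitioning on the complex relationship network by a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the associations between communities;
sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating the network topology feature data of the high-density subgraph that changes dynamically over time;
obtaining static feature data from a historical complex relationship network, and performing statistics and calculation on the static feature data by a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of variation of the static feature data between time periods;
dividing the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and taking the non-abnormal features and the abnormal features as target derived features;
performing anomaly detection on the high-density subgraph by an anomaly detection model in combination with the target derived features to obtain a target abnormal high-density subgraph.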
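As a rough illustration of the confidence-interval and feature-division steps above, the sketch below derives an interval from historical change data and labels each change between consecutive samples as 0 (inside the interval) or 1 (outside it). The preset statistical model and formula are not specified in this section, so the interval of mean plus or minus z standard deviations, the value z = 1.96, and the function names `confidence_interval` and `derive_features` are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def confidence_interval(historical_changes, z=1.96):
    """Assumed formula: mean +/- z * population standard deviation."""
    mu = mean(historical_changes)
    sigma = pstdev(historical_changes, mu)
    return mu - z * sigma, mu + z * sigma


def derive_features(samples, interval):
    """Label each change between consecutive samples.

    Returns a list with one entry per consecutive pair:
    0 if the difference lies inside the interval (non-abnormal),
    1 if it lies outside (abnormal).
    """
    low, high = interval
    labels = []
    for first, second in zip(samples, samples[1:]):
        diff = second - first
        labels.append(0 if low <= diff <= high else 1)
    return labels


# Usage: changes observed in a historical network, then samples of one
# topology feature (for example, edge count) taken at a fixed interval.
interval = confidence_interval([1.0, -1.0, 1.0, -1.0])
labels = derive_features([10, 11, 10.5, 20, 19], interval)
```

In this sketch the labels play the role of the target derived features: they can be attached to the high-density subgraph and passed to the anomaly detection model together with the raw feature values.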
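Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.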

Claims (20)

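  1. A method for detecting abnormal high-density subgraphs, comprising:
    obtaining a complex relationship network to be analyzed, and performing real-time graph partitioning on the complex relationship network by a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the associations between communities;
    sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating the network topology feature data of the high-density subgraph that changes dynamically over time;
    obtaining static feature data from a historical complex relationship network, and performing statistics and calculation on the static feature data by a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of variation of the static feature data between time periods;
    dividing the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and taking the non-abnormal features and the abnormal features as target derived features;
    performing anomaly detection on the high-density subgraph by an anomaly detection model in combination with the target derived features to obtain a target abnormal high-density subgraph.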
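  2. The method for detecting abnormal high-density subgraphs according to claim 1, wherein the obtaining static feature data from a historical complex relationship network, and performing statistics and calculation on the static feature data by a preset statistical model to obtain a confidence interval, comprises:
    obtaining the historical complex relationship network, and selecting and extracting network topology features of the historical complex relationship network to obtain the static feature data;
    taking the static feature data as nodes, obtaining the associations among the static feature data in the historical complex relationship network, taking the associations as a division condition, and generating a static high-density subgraph according to the nodes and the division condition;
    obtaining time-series data in the static high-density subgraph, and sampling the time-series data at a second preset time interval to obtain static feature change data;
    performing statistics on the static feature change data at a third preset time interval to obtain statistical data corresponding to each time interval, the statistical data corresponding to each time interval including the number of static high-density subgraphs and the mean and variance of the static feature change data within the third preset time interval;
    performing calculation on the statistical data corresponding to each time interval by a preset formula to obtain a first confidence threshold and a second confidence threshold, and generating the confidence interval according to the first confidence threshold and the second confidence threshold.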
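  3. The method for detecting abnormal high-density subgraphs according to claim 1, wherein the dividing the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and taking the non-abnormal features and the abnormal features as target derived features, comprises:
    performing time-continuity analysis on the dynamic feature change data to obtain first feature data and second feature data that are continuous in time, time continuity indicating that the end time point of the first feature data is the same as, or adjoins, the start time point of the second feature data;
    calculating a feature difference value between the first feature data and the second feature data;
    determining whether the feature difference value is outside the confidence interval;
    if the feature difference value is not outside the confidence interval, setting the feature difference value to 0, and taking the first feature data and the second feature data corresponding to the feature difference value as non-abnormal features;
    if the feature difference value is outside the confidence interval, setting the feature difference value to 1, and taking the first feature data and the second feature data corresponding to the feature difference value as abnormal features;
    taking the non-abnormal features and the abnormal features as the target derived features.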
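  4. The method for detecting abnormal high-density subgraphs according to claim 1, wherein the performing anomaly detection on the high-density subgraph by an anomaly detection model in combination with the target derived features to obtain a target abnormal high-density subgraph comprises:
    creating and marking, by the anomaly detection model, a correspondence between the target derived features and the high-density subgraph to obtain a marked high-density subgraph;
    performing anomaly detection on the marked high-density subgraph by an isolation forest algorithm to obtain an initial abnormal high-density subgraph;
    performing anomaly detection on the initial abnormal high-density subgraph by a clustering-based subspace anomaly detection algorithm to obtain the target abnormal high-density subgraph.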
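  5. The method for detecting abnormal high-density subgraphs according to claim 1, wherein the obtaining a complex relationship network to be analyzed, and performing real-time graph partitioning on the complex relationship network by a preset algorithm to obtain a high-density subgraph, comprises:
    obtaining the complex relationship network to be analyzed, initializing each node of the complex relationship network as a distinct first community, and calculating a first modularity measure of the first communities;
    assigning each node to a second community in which a neighboring node of that node is located, and calculating a second modularity measure of the second communities;
    calculating, for each node, the difference between the first modularity measure and the second modularity measure;
    analyzing whether the difference is positive, and if the difference is not positive, continuing to perform community division on the nodes until the difference is positive, to obtain divided communities, the community division indicating initializing each node as a distinct first community and assigning each node to the second community in which a neighboring node of that node is located;
    obtaining and analyzing the weights of the connecting edges between the communities among the divided communities, and taking the graph formed by the divided communities whose connecting-edge weights are all greater than a preset threshold as the high-density subgraph.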
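  6. The method for detecting abnormal high-density subgraphs according to claim 5, wherein the sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data comprises:
    performing real-time network topology feature extraction on the high-density subgraph to obtain dynamic feature data;
    capturing the dynamic feature data at the first preset time interval to obtain candidate dynamic feature change data;
    performing performance analysis and reliability analysis on the candidate dynamic feature change data to obtain the dynamic feature change data.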
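  7. The method for detecting abnormal high-density subgraphs according to any one of claims 1 to 6, wherein, after the performing anomaly detection on the high-density subgraph by an anomaly detection model in combination with the target derived features to obtain a target abnormal high-density subgraph, the method further comprises:
    performing anomaly-degree classification, anomaly-development prediction, and same-type anomaly analysis on the target abnormal high-density subgraph to obtain a final target abnormal high-density subgraph.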
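  8. A device for detecting abnormal high-density subgraphs, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
    obtaining a complex relationship network to be analyzed, and performing real-time graph partitioning on the complex relationship network by a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the associations between communities;
    sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating the network topology feature data of the high-density subgraph that changes dynamically over time;
    obtaining static feature data from a historical complex relationship network, and performing statistics and calculation on the static feature data by a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of variation of the static feature data between time periods;
    dividing the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and taking the non-abnormal features and the abnormal features as target derived features;
    performing anomaly detection on the high-density subgraph by an anomaly detection model in combination with the target derived features to obtain a target abnormal high-density subgraph.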
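  9. The device for detecting abnormal high-density subgraphs according to claim 8, wherein the processor, when executing the computer-readable instructions, further implements the following steps:
    obtaining the historical complex relationship network, and selecting and extracting network topology features of the historical complex relationship network to obtain the static feature data;
    taking the static feature data as nodes, obtaining the associations among the static feature data in the historical complex relationship network, taking the associations as a division condition, and generating a static high-density subgraph according to the nodes and the division condition;
    obtaining time-series data in the static high-density subgraph, and sampling the time-series data at a second preset time interval to obtain static feature change data;
    performing statistics on the static feature change data at a third preset time interval to obtain statistical data corresponding to each time interval, the statistical data corresponding to each time interval including the number of static high-density subgraphs and the mean and variance of the static feature change data within the third preset time interval;
    performing calculation on the statistical data corresponding to each time interval by a preset formula to obtain a first confidence threshold and a second confidence threshold, and generating the confidence interval according to the first confidence threshold and the second confidence threshold.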
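  10. The device for detecting abnormal high-density subgraphs according to claim 8, wherein the processor, when executing the computer-readable instructions, further implements the following steps:
    performing time-continuity analysis on the dynamic feature change data to obtain first feature data and second feature data that are continuous in time, time continuity indicating that the end time point of the first feature data is the same as, or adjoins, the start time point of the second feature data;
    calculating a feature difference value between the first feature data and the second feature data;
    determining whether the feature difference value is outside the confidence interval;
    if the feature difference value is not outside the confidence interval, setting the feature difference value to 0, and taking the first feature data and the second feature data corresponding to the feature difference value as non-abnormal features;
    if the feature difference value is outside the confidence interval, setting the feature difference value to 1, and taking the first feature data and the second feature data corresponding to the feature difference value as abnormal features;
    taking the non-abnormal features and the abnormal features as the target derived features.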
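  11. The device for detecting abnormal high-density subgraphs according to claim 8, wherein the processor, when executing the computer-readable instructions, further implements the following steps:
    creating and marking, by the anomaly detection model, a correspondence between the target derived features and the high-density subgraph to obtain a marked high-density subgraph;
    performing anomaly detection on the marked high-density subgraph by an isolation forest algorithm to obtain an initial abnormal high-density subgraph;
    performing anomaly detection on the initial abnormal high-density subgraph by a clustering-based subspace anomaly detection algorithm to obtain the target abnormal high-density subgraph.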
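  12. The device for detecting abnormal high-density subgraphs according to claim 8, wherein the processor, when executing the computer-readable instructions, further implements the following steps:
    obtaining the complex relationship network to be analyzed, initializing each node of the complex relationship network as a distinct first community, and calculating a first modularity measure of the first communities;
    assigning each node to a second community in which a neighboring node of that node is located, and calculating a second modularity measure of the second communities;
    calculating, for each node, the difference between the first modularity measure and the second modularity measure;
    analyzing whether the difference is positive, and if the difference is not positive, continuing to perform community division on the nodes until the difference is positive, to obtain divided communities, the community division indicating initializing each node as a distinct first community and assigning each node to the second community in which a neighboring node of that node is located;
    obtaining and analyzing the weights of the connecting edges between the communities among the divided communities, and taking the graph formed by the divided communities whose connecting-edge weights are all greater than a preset threshold as the high-density subgraph.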
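  13. The device for detecting abnormal high-density subgraphs according to claim 12, wherein the processor, when executing the computer-readable instructions, further implements the following steps:
    performing real-time network topology feature extraction on the high-density subgraph to obtain dynamic feature data;
    capturing the dynamic feature data at the first preset time interval to obtain candidate dynamic feature change data;
    performing performance analysis and reliability analysis on the candidate dynamic feature change data to obtain the dynamic feature change data.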
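  14. The device for detecting abnormal high-density subgraphs according to any one of claims 8 to 13, wherein the processor, when executing the computer-readable instructions, further implements the following steps:
    performing anomaly-degree classification, anomaly-development prediction, and same-type anomaly analysis on the target abnormal high-density subgraph to obtain a final target abnormal high-density subgraph.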
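  15. A computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the following steps:
    obtaining a complex relationship network to be analyzed, and performing real-time graph partitioning on the complex relationship network by a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the associations between communities;
    sampling the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating the network topology feature data of the high-density subgraph that changes dynamically over time;
    obtaining static feature data from a historical complex relationship network, and performing statistics and calculation on the static feature data by a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of variation of the static feature data between time periods;
    dividing the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and taking the non-abnormal features and the abnormal features as target derived features;
    performing anomaly detection on the high-density subgraph by an anomaly detection model in combination with the target derived features to obtain a target abnormal high-density subgraph.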
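  16. The computer-readable storage medium according to claim 15, wherein the computer instructions, when run on a computer, further cause the computer to perform the following steps:
    obtaining the historical complex relationship network, and selecting and extracting network topology features of the historical complex relationship network to obtain the static feature data;
    taking the static feature data as nodes, obtaining the associations among the static feature data in the historical complex relationship network, taking the associations as a division condition, and generating a static high-density subgraph according to the nodes and the division condition;
    obtaining time-series data in the static high-density subgraph, and sampling the time-series data at a second preset time interval to obtain static feature change data;
    performing statistics on the static feature change data at a third preset time interval to obtain statistical data corresponding to each time interval, the statistical data corresponding to each time interval including the number of static high-density subgraphs and the mean and variance of the static feature change data within the third preset time interval;
    performing calculation on the statistical data corresponding to each time interval by a preset formula to obtain a first confidence threshold and a second confidence threshold, and generating the confidence interval according to the first confidence threshold and the second confidence threshold.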
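  17. The computer-readable storage medium according to claim 15, wherein the computer instructions, when run on a computer, further cause the computer to perform the following steps:
    performing time-continuity analysis on the dynamic feature change data to obtain first feature data and second feature data that are continuous in time, time continuity indicating that the end time point of the first feature data is the same as, or adjoins, the start time point of the second feature data;
    calculating a feature difference value between the first feature data and the second feature data;
    determining whether the feature difference value is outside the confidence interval;
    if the feature difference value is not outside the confidence interval, setting the feature difference value to 0, and taking the first feature data and the second feature data corresponding to the feature difference value as non-abnormal features;
    if the feature difference value is outside the confidence interval, setting the feature difference value to 1, and taking the first feature data and the second feature data corresponding to the feature difference value as abnormal features;
    taking the non-abnormal features and the abnormal features as the target derived features.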
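  18. The computer-readable storage medium according to claim 15, wherein the computer instructions, when run on a computer, further cause the computer to perform the following steps:
    creating and marking, by the anomaly detection model, a correspondence between the target derived features and the high-density subgraph to obtain a marked high-density subgraph;
    performing anomaly detection on the marked high-density subgraph by an isolation forest algorithm to obtain an initial abnormal high-density subgraph;
    performing anomaly detection on the initial abnormal high-density subgraph by a clustering-based subspace anomaly detection algorithm to obtain the target abnormal high-density subgraph.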
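  19. The computer-readable storage medium according to claim 15, wherein the computer instructions, when run on a computer, further cause the computer to perform the following steps:
    obtaining the complex relationship network to be analyzed, initializing each node of the complex relationship network as a distinct first community, and calculating a first modularity measure of the first communities;
    assigning each node to a second community in which a neighboring node of that node is located, and calculating a second modularity measure of the second communities;
    calculating, for each node, the difference between the first modularity measure and the second modularity measure;
    analyzing whether the difference is positive, and if the difference is not positive, continuing to perform community division on the nodes until the difference is positive, to obtain divided communities, the community division indicating initializing each node as a distinct first community and assigning each node to the second community in which a neighboring node of that node is located;
    obtaining and analyzing the weights of the connecting edges between the communities among the divided communities, and taking the graph formed by the divided communities whose connecting-edge weights are all greater than a preset threshold as the high-density subgraph.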
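  20. An apparatus for detecting abnormal high-density subgraphs, comprising:
    a partitioning module, configured to obtain a complex relationship network to be analyzed, and perform real-time graph partitioning on the complex relationship network by a preset algorithm to obtain a high-density subgraph, the high-density subgraph indicating communities and the associations between communities;
    a sampling module, configured to sample the network topology features of the high-density subgraph at a first preset time interval to obtain dynamic feature change data, the dynamic feature change data indicating the network topology feature data of the high-density subgraph that changes dynamically over time;
    a statistical calculation module, configured to obtain static feature data from a historical complex relationship network, and perform statistics and calculation on the static feature data by a preset statistical model to obtain a confidence interval, the historical complex relationship network indicating a complex relationship network generated or stored before the complex relationship network, and the confidence interval indicating the average range of variation of the static feature data between time periods;
    a judgment and analysis module, configured to divide the dynamic feature change data into non-abnormal features and abnormal features according to whether the data falls inside or outside the confidence interval, and take the non-abnormal features and the abnormal features as target derived features;
    an anomaly detection module, configured to perform anomaly detection on the high-density subgraph by an anomaly detection model in combination with the target derived features to obtain a target abnormal high-density subgraph.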
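The community-division steps recited in claims 5, 12, and 19 (start with every node in its own community, try moving each node into a neighbor's community, and compare modularity before and after) resemble the local-moving phase of modularity-based methods such as Louvain. The sketch below illustrates that general idea only. It uses the standard Newman modularity and keeps a move when the modularity difference is strictly positive, which may not match the exact termination rule of the claims. The function names `modularity` and `local_moving` and the parameter `max_passes` are hypothetical, and the final filtering of communities by connecting-edge weight is not shown.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected weighted graph.

    edges: list of (u, v, weight); community: dict mapping node -> label.
    """
    total = sum(w for _, _, w in edges)  # m, the total edge weight
    degree = defaultdict(float)
    inside = defaultdict(float)
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        if community[u] == community[v]:
            inside[community[u]] += w
    comm_degree = defaultdict(float)
    for node, d in degree.items():
        comm_degree[community[node]] += d
    # Q = sum over communities of (internal weight / m) - (degree sum / 2m)^2
    return sum(
        inside[c] / total - (comm_degree[c] / (2 * total)) ** 2
        for c in comm_degree
    )


def local_moving(edges, max_passes=10):
    """Greedily move nodes into neighboring communities while Q increases."""
    edges = list(edges)
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})
    neighbors = defaultdict(set)
    for u, v, _ in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    community = {n: n for n in nodes}  # every node starts in its own community
    for _ in range(max_passes):
        moved = False
        for node in nodes:
            base = modularity(edges, community)
            best_gain, best_label = 0.0, community[node]
            for label in sorted({community[n] for n in neighbors[node]}):
                if label == community[node]:
                    continue
                trial = dict(community)
                trial[node] = label
                gain = modularity(edges, trial) - base
                if gain > best_gain:  # keep only strictly positive gains
                    best_gain, best_label = gain, label
            if best_label != community[node]:
                community[node] = best_label
                moved = True
        if not moved:
            break
    return community
```

Recomputing the full modularity for every trial move keeps the sketch short but costs one pass over all edges per trial; a practical implementation would update the gain incrementally.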
PCT/CN2020/103200 2020-03-27 2020-07-21 检测异常高密子图的方法、装置、设备及存储介质 WO2021189730A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010226309.8A CN111475680A (zh) 2020-03-27 2020-03-27 检测异常高密子图的方法、装置、设备及存储介质
CN202010226309.8 2020-03-27

Publications (1)

Publication Number Publication Date
WO2021189730A1 true WO2021189730A1 (zh) 2021-09-30

Family

ID=71750252

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/103200 WO2021189730A1 (zh) 2020-03-27 2020-07-21 检测异常高密子图的方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN111475680A (zh)
WO (1) WO2021189730A1 (zh)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837874A (zh) * 2021-11-22 2021-12-24 北京芯盾时代科技有限公司 一种数据的识别方法、装置、存储介质及电子设备
CN114257493A (zh) * 2021-12-17 2022-03-29 中国电信股份有限公司 网络节点的故障预警方法、装置、介质及电子设备
US20220156234A1 (en) * 2020-11-13 2022-05-19 Hitachi, Ltd. Data integration method and data integration system
CN115912359A (zh) * 2023-02-23 2023-04-04 豪派(陕西)电子科技有限公司 基于大数据的数字化安全隐患识别排查治理方法
CN116055385A (zh) * 2022-12-30 2023-05-02 中国联合网络通信集团有限公司 路由方法、管理节点、路由节点及介质
CN116151511A (zh) * 2023-03-01 2023-05-23 国网山东省电力公司菏泽供电公司 基于数据处理的配电馈线和台区智能诊断管理方法及系统
CN116204690A (zh) * 2023-04-28 2023-06-02 泰力基业股份有限公司 一种具有自动灭火功能的配电箱数据传输系统
CN116269738A (zh) * 2023-05-25 2023-06-23 深圳市科医仁科技发展有限公司 射频治疗仪的智能控制方法、装置、设备及存储介质
CN116628554A (zh) * 2023-05-31 2023-08-22 烟台大学 一种工业互联网数据异常的检测方法、系统和设备
CN116844684A (zh) * 2023-05-18 2023-10-03 首都医科大学附属北京朝阳医院 一种医学检验结果的质控处理方法、装置、设备及介质
CN117282261A (zh) * 2023-11-23 2023-12-26 天津恩纳社环保有限公司 一种微生物废气处理系统
CN117436006A (zh) * 2023-12-22 2024-01-23 圣道天德电气(山东)有限公司 一种智慧环网柜故障实时监测方法及系统

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112134862B (zh) * 2020-09-11 2023-09-08 国网电力科学研究院有限公司 基于机器学习的粗细粒度混合网络异常检测方法及装置
CN112214499B (zh) 2020-12-03 2021-03-19 腾讯科技(深圳)有限公司 图数据处理方法、装置、计算机设备和存储介质
CN112669299B (zh) * 2020-12-31 2023-04-07 上海智臻智能网络科技股份有限公司 瑕疵检测方法及装置、计算机设备和存储介质
CN115134246B (zh) * 2021-03-22 2023-07-21 中国移动通信集团河南有限公司 网络性能指标监控方法、装置、设备和存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018203956A1 (en) * 2017-05-02 2018-11-08 Google Llc Systems and methods to detect clusters in graphs
CN109711746A (zh) * 2019-01-02 2019-05-03 中国联合网络通信集团有限公司 一种基于复杂网络的信用评估方法和系统
CN109788001A (zh) * 2019-03-07 2019-05-21 武汉极意网络科技有限公司 可疑互联网协议地址发现方法、用户设备、存储介质及装置
CN109816535A (zh) * 2018-12-13 2019-05-28 中国平安财产保险股份有限公司 欺诈识别方法、装置、计算机设备及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018203956A1 (en) * 2017-05-02 2018-11-08 Google Llc Systems and methods to detect clusters in graphs
CN109816535A (zh) * 2018-12-13 2019-05-28 中国平安财产保险股份有限公司 欺诈识别方法、装置、计算机设备及存储介质
CN109711746A (zh) * 2019-01-02 2019-05-03 中国联合网络通信集团有限公司 一种基于复杂网络的信用评估方法和系统
CN109788001A (zh) * 2019-03-07 2019-05-21 武汉极意网络科技有限公司 可疑互联网协议地址发现方法、用户设备、存储介质及装置

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220156234A1 (en) * 2020-11-13 2022-05-19 Hitachi, Ltd. Data integration method and data integration system
CN113837874A (zh) * 2021-11-22 2021-12-24 北京芯盾时代科技有限公司 一种数据的识别方法、装置、存储介质及电子设备
CN113837874B (zh) * 2021-11-22 2022-04-12 北京芯盾时代科技有限公司 一种数据的识别方法、装置、存储介质及电子设备
CN114257493A (zh) * 2021-12-17 2022-03-29 中国电信股份有限公司 网络节点的故障预警方法、装置、介质及电子设备
CN114257493B (zh) * 2021-12-17 2024-04-23 中国电信股份有限公司 网络节点的故障预警方法、装置、介质及电子设备
CN116055385A (zh) * 2022-12-30 2023-05-02 中国联合网络通信集团有限公司 路由方法、管理节点、路由节点及介质
CN115912359A (zh) * 2023-02-23 2023-04-04 豪派(陕西)电子科技有限公司 基于大数据的数字化安全隐患识别排查治理方法
CN116151511A (zh) * 2023-03-01 2023-05-23 国网山东省电力公司菏泽供电公司 基于数据处理的配电馈线和台区智能诊断管理方法及系统
CN116151511B (zh) * 2023-03-01 2023-10-20 国网山东省电力公司菏泽供电公司 一种基于数据处理的配电馈线和台区智能诊断管理方法及系统
CN116204690B (zh) * 2023-04-28 2023-07-18 泰力基业股份有限公司 一种具有自动灭火功能的配电箱数据传输系统
CN116204690A (zh) * 2023-04-28 2023-06-02 泰力基业股份有限公司 一种具有自动灭火功能的配电箱数据传输系统
CN116844684A (zh) * 2023-05-18 2023-10-03 首都医科大学附属北京朝阳医院 一种医学检验结果的质控处理方法、装置、设备及介质
CN116844684B (zh) * 2023-05-18 2024-04-02 首都医科大学附属北京朝阳医院 一种医学检验结果的质控处理方法、装置、设备及介质
CN116269738A (zh) * 2023-05-25 2023-06-23 深圳市科医仁科技发展有限公司 射频治疗仪的智能控制方法、装置、设备及存储介质
CN116628554A (zh) * 2023-05-31 2023-08-22 烟台大学 一种工业互联网数据异常的检测方法、系统和设备
CN116628554B (zh) * 2023-05-31 2023-11-03 烟台大学 一种工业互联网数据异常的检测方法、系统和设备
CN117282261B (zh) * 2023-11-23 2024-02-23 天津恩纳社环保有限公司 一种微生物废气处理系统
CN117282261A (zh) * 2023-11-23 2023-12-26 天津恩纳社环保有限公司 一种微生物废气处理系统
CN117436006A (zh) * 2023-12-22 2024-01-23 圣道天德电气(山东)有限公司 一种智慧环网柜故障实时监测方法及系统
CN117436006B (zh) * 2023-12-22 2024-03-15 圣道天德电气(山东)有限公司 一种智慧环网柜故障实时监测方法及系统

Also Published As

Publication number Publication date
CN111475680A (zh) 2020-07-31

Similar Documents

Publication Publication Date Title
WO2021189730A1 (zh) 检测异常高密子图的方法、装置、设备及存储介质
Jiang et al. Saliency detection via absorbing markov chain
CN111833172A (zh) 一种基于孤立森林的消费信贷欺诈行为检测方法及其系统
Verma et al. On evaluation of network intrusion detection systems: Statistical analysis of CIDDS-001 dataset using machine learning techniques
WO2022037130A1 (zh) 网络流量异常的检测方法、装置、电子装置和存储介质
CN111385297B (zh) 无线设备指纹识别方法、系统、设备及可读存储介质
CN107196953A (zh) 一种基于用户行为分析的异常行为检测方法
Li et al. A supervised clustering and classification algorithm for mining data with mixed variables
CN107579846B (zh) 一种云计算故障数据检测方法及系统
Bai et al. Entropic dynamic time warping kernels for co-evolving financial time series analysis
KR100628329B1 (ko) 네트워크 세션 특성 정보에 대한 공격 행위 탐지규칙 생성장치 및 그 방법
CN116662817B (zh) 物联网设备的资产识别方法及系统
US20090043536A1 (en) Use of Sequential Clustering for Instance Selection in Machine Condition Monitoring
WO2020258598A1 (zh) 图像处理方法、提名评估方法及相关装置
CN113822366A (zh) 业务指标异常检测方法及装置、电子设备、存储介质
CN110825545A (zh) 一种云服务平台异常检测方法与系统
CN113125903A (zh) 线损异常检测方法、装置、设备及计算机可读存储介质
CN111291824A (zh) 时间序列的处理方法、装置、电子设备和计算机可读介质
CN112202718A (zh) 一种基于XGBoost算法的操作系统识别方法、存储介质及设备
CN117156442A (zh) 基于5g网络的云数据安全保护方法及系统
CN111708890A (zh) 一种搜索词确定方法和相关装置
KR102014234B1 (ko) 무선 프로토콜 자동 분석 방법 및 그를 위한 장치
CN113794653B (zh) 一种基于抽样数据流的高速网络流量分类方法
Han et al. Time series segmentation to discover behavior switching in complex physical systems
KR102433598B1 (ko) 데이터 경계 도출 시스템 및 방법

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20927486

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 190123)

122 Ep: pct application non-entry in european phase

Ref document number: 20927486

Country of ref document: EP

Kind code of ref document: A1