CN116361059A - Diagnosis method and diagnosis system for abnormal root cause of banking business - Google Patents
- Publication number
- CN116361059A CN202310567128.5A
- Authority
- CN
- China
- Prior art keywords
- abnormal
- data
- business
- root cause
- anomaly
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/079—Root cause analysis, i.e. error or fault diagnosis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
- G06N5/025—Extracting rules from data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/02—Banking, e.g. interest calculation or account maintenance
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Quality & Reliability (AREA)
- Computational Linguistics (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Technology Law (AREA)
- General Business, Economics & Management (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to the technical field of data processing, and in particular to a method and a system for diagnosing abnormal root causes of banking business. The method comprises the following steps: performing data acquisition and data preprocessing on a banking system to obtain banking abnormal data; performing offline computation on the banking abnormal data using association rule mining to obtain anomaly detection rule data; performing anomaly detection on the anomaly detection rule data through a preset anomaly detection model to obtain abnormal data; extracting features from the abnormal data to obtain abnormal data features; constructing a root cause classification model based on a decision tree, and inputting the abnormal data features into the root cause classification model for root cause analysis to obtain abnormal root cause data; and locating anomalies in the abnormal root cause data using confidence propagation analysis combined with data lineage ("blood margin") analysis to obtain anomaly source data. The invention further corrects the abnormal root cause points through the business anomaly lineage map, thereby improving the accuracy of root cause diagnosis.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a method and a system for diagnosing abnormal root causes of banking businesses.
Background
With the rapid development of internet technology, banking systems play an increasingly important role in delivering efficient financial services. As business complexity and data throughput have grown sharply, banking systems frequently experience anomalies such as process failures, data errors, and network connection problems. These anomalies seriously disrupt the normal operation of banking processes and can even threaten customers' interests and the bank's reputation. Root cause diagnosis of banking business anomalies has therefore become an important research direction. Existing root cause diagnosis methods focus mainly on finding the source of an anomaly and identifying related factors such as its type, process, and cause. These techniques use machine learning, data mining, knowledge graphs, and other artificial intelligence methods to diagnose anomalies in banking systems. In large data sets in particular, diagnosing abnormal data nodes and abnormal data relationships in a banking system is highly complex. To address this problem, methods based on graph analysis and deep learning have been developed to identify abnormal data and diagnose root causes efficiently, accurately, and in real time. However, a bank runs numerous business systems, a single business process involves a very large number of them, and the supporting infrastructure is correspondingly large. When a business anomaly occurs, the related applications and facilities usually have to be examined and diagnosed, which is time-consuming and laborious, prolongs the anomaly, and lowers the efficiency of anomaly handling.
Disclosure of Invention
Based on the above, the present invention provides a method and a system for diagnosing abnormal root cause of banking business, so as to solve at least one of the above technical problems.
To achieve the above purpose, the method for diagnosing abnormal root causes of banking business comprises the following steps:
Step S1: performing data acquisition and data preprocessing on a banking system to obtain banking abnormal data;
Step S2: performing offline computation on the banking abnormal data using association rule mining to obtain anomaly detection rule data;
Step S3: performing anomaly detection on the anomaly detection rule data through a preset anomaly detection model to obtain abnormal data;
Step S4: extracting features from the abnormal data to obtain abnormal data features; constructing a root cause classification model based on a decision tree, and inputting the abnormal data features into the root cause classification model for root cause analysis to obtain abnormal root cause data;
Step S5: locating anomalies in the abnormal root cause data using a method that combines confidence propagation analysis with data lineage ("blood margin") analysis to obtain anomaly source data;
Step S6: constructing a business anomaly lineage map from the anomaly source data, and using the map to understand the relationships between banking business anomalies so as to execute the corresponding abnormal root cause diagnosis process.
Through data acquisition, the invention accurately captures abnormal data in the banking system and thereby provides the basic data for root cause analysis; preprocessing operations such as cleaning and normalization improve the quality and accuracy of the data and give subsequent analysis a more reliable basis. Association rule mining uncovers anomaly rules and patterns in the banking system, so that banking abnormal data can be detected more accurately. Computing the anomaly detection rule data offline reduces the computational load of online detection and improves detection efficiency. The preset anomaly detection model detects abnormal data accurately and so provides the necessary data basis for root cause analysis. Feature extraction from the abnormal data, together with the decision-tree-based root cause classification model, identifies the root cause of a banking anomaly accurately and supports subsequent anomaly localization and resolution. Feature extraction also yields data features that are clearer, more concise, and easier to understand, which improves the interpretability and application value of the data. Combining confidence propagation analysis with data lineage ("blood margin") analysis to locate anomalies in the abnormal root cause data improves the accuracy and precision of localization. Anomaly localization then finds the anomaly source data quickly and accurately, further narrows the range of candidate root causes, and provides the necessary data basis for subsequent root cause diagnosis.
Constructing the business anomaly lineage ("blood-margin") map makes it possible to mine the associations between banking business anomalies in depth and to uncover hidden associations in the abnormal data, which provides a richer data basis for root cause analysis. Finally, the business anomaly lineage map allows the relationships between banking business anomalies to be identified quickly, which improves the efficiency and accuracy of root cause diagnosis, shortens repair time, and reduces the risk of loss to banking business, helping operation and maintenance personnel locate and resolve anomalies quickly.
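As a rough illustration of step S6, the sketch below builds a lineage map from anomaly source records and walks it to list the business anomalies downstream of a given node. The record fields (`source`, `affected`), the system names, and the helper functions are hypothetical; the text does not specify a data structure for the map.

```python
from collections import defaultdict, deque

def build_lineage_map(source_records):
    """Map each upstream node to the set of nodes it directly affects."""
    lineage = defaultdict(set)
    for rec in source_records:
        lineage[rec["source"]].add(rec["affected"])
    return lineage

def impacted(lineage, start):
    """Breadth-first walk: every node reachable downstream of `start`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in lineage.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def root_candidates(lineage):
    """Nodes that affect other nodes but are not affected by any node."""
    affected = {n for targets in lineage.values() for n in targets}
    return sorted(n for n in lineage if n not in affected)

# Hypothetical anomaly source records (names are illustrative only).
records = [
    {"source": "core_db", "affected": "account_service"},
    {"source": "account_service", "affected": "payment_api"},
    {"source": "core_db", "affected": "ledger_service"},
]
lineage = build_lineage_map(records)
print(root_candidates(lineage))
print(sorted(impacted(lineage, "core_db")))
```

An adjacency map keeps both queries cheap: listing downstream anomalies is a single graph walk, and root candidates are the nodes with no incoming edge.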
Preferably, step S1 comprises the steps of:
acquiring banking data by acquiring data of a banking system;
and carrying out data preprocessing on the banking data to obtain banking abnormal data.
By collecting and processing banking data, the invention automatically finds abnormal data in the banking system. This lets banking operators identify system faults and abnormal business fluctuations in time, take the necessary measures sooner, and reduce losses. The data collection and preprocessing steps also provide useful information about banking system performance. Based on this information, banking operators can make better decisions to optimize system performance and improve business efficiency and results, while reducing human error and false reports.
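The preprocessing in step S1 is described in the text only as operations such as cleaning and normalization. A minimal sketch of those two operations, assuming records arrive as dictionaries with hypothetical numeric fields (`latency_ms`, `error_count`), might look like this:

```python
def preprocess(records, fields):
    """Drop incomplete records, then min-max normalize the given fields."""
    # Cleaning: keep only records in which every required field is present.
    clean = [r for r in records if all(r.get(f) is not None for f in fields)]
    if not clean:
        return []
    # Normalization: rescale each field to [0, 1]; constant fields map to 0.0.
    bounds = {f: (min(r[f] for r in clean), max(r[f] for r in clean))
              for f in fields}
    result = []
    for r in clean:
        row = dict(r)
        for f in fields:
            lo, hi = bounds[f]
            row[f] = 0.0 if hi == lo else (r[f] - lo) / (hi - lo)
        result.append(row)
    return result

# Hypothetical banking records; field names are assumptions for illustration.
records = [
    {"txn_id": 1, "latency_ms": 100, "error_count": 0},
    {"txn_id": 2, "latency_ms": None, "error_count": 1},
    {"txn_id": 3, "latency_ms": 300, "error_count": 4},
    {"txn_id": 4, "latency_ms": 200, "error_count": 2},
]
normalized = preprocess(records, ["latency_ms", "error_count"])
for row in normalized:
    print(row)
```

How the cleaned records are then labeled as banking abnormal data is not specified in the text, so the sketch stops at normalization.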
Preferably, step S2 comprises the steps of:
step S21: converting the banking abnormal data to obtain business abnormal rule data;
step S22: using frequent item set mining based on the Apriori algorithm to find association rules in the business abnormal rule data, obtaining the business abnormal frequent item sets;
step S23: calculating the confidence of the business abnormal frequent item sets according to an association rule algorithm to obtain the rule confidence;
wherein the association rule algorithm formula is as follows:
[formula image not reproduced in the source text]
In the formula, the terms are, in order: the rule confidence; the exponential function; the number of sets in the business abnormal frequent item sets; a first indexed frequent item set among the business abnormal frequent item sets; a second indexed frequent item set among them; the probability of occurrence of the first frequent item set; the probability of occurrence of the second frequent item set; the probability that the two frequent item sets occur simultaneously; and the correction coefficient of the rule confidence.
To obtain an accurate rule confidence, the invention constructs the association rule algorithm formula. The formula helps business operators quickly locate abnormal business conditions and handle problems, and improves the operating efficiency of banking business. Calculating the confidence of the business abnormal frequent item sets with this formula yields the rule confidence and improves the efficiency and accuracy of banking anomaly detection. The formula takes into account the number of sets in the business abnormal frequent item sets, pairs of frequent item sets within them, the probability of occurrence of each frequent item set in a pair, and the probability that both occur simultaneously. To prevent deviation in the resulting rule confidence, the result is also normalized. The relationship between the rule confidence and these parameters forms the functional relation of the formula. The correction coefficient of the rule confidence can be adjusted according to the actual situation, which improves the accuracy and applicability of the association rule algorithm.
Step S24: sorting the business abnormal frequent item sets in descending order of rule confidence, and selecting those with higher rule confidence to obtain the anomaly detection rule data.
By converting the banking abnormal data and mining frequent item sets with the Apriori algorithm, the invention helps banks find business abnormal frequent item sets quickly and effectively. Calculating the confidence of these item sets then screens out meaningful business abnormal rule data, and selecting item sets by confidence further improves the accuracy of business anomaly detection. Deriving association rules from business anomaly data, mining frequent item sets, and using confidence to determine the validity of each rule helps guard against business risk. A high-confidence association rule indicates a strong association among the related business abnormal rule data and reflects an actual regularity. The rule confidence also helps the bank understand its business anomalies better, improve operating quality, and reduce business risk. Selecting the business abnormal frequent item sets in descending order of rule confidence for anomaly detection improves detection accuracy, lowers the false alarm rate, and supports the bank's business management and decision-making.
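Steps S22 to S24 can be sketched as follows. This is a minimal illustration, not the patented formula: it uses the textbook confidence, count(A ∪ B) / count(A), in place of the invention's own confidence formula with its exponential term and correction coefficient. The event names and thresholds are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset with support >= min_support."""
    n = len(transactions)
    current = {frozenset([i]) for t in transactions for i in t}
    frequent, k = {}, 1
    while current:
        # Count each candidate and keep those meeting the support threshold.
        level = {}
        for cand in current:
            count = sum(cand <= t for t in transactions)
            if count / n >= min_support:
                level[cand] = count
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-item sets into k-item candidates, then prune
        # candidates that contain an infrequent (k-1)-item subset.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def rules(frequent, min_conf):
    """Rank rules A -> B by textbook confidence count(A ∪ B) / count(A)."""
    out = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(sorted(itemset), r)):
                conf = count / frequent[ante]
                if conf >= min_conf:
                    out.append((ante, itemset - ante, conf))
    # Step S24: descending order of rule confidence.
    return sorted(out, key=lambda rule: rule[2], reverse=True)

# Hypothetical anomaly events observed together in each abnormal transaction.
transactions = [
    {"timeout", "db_slow", "payment_fail"},
    {"timeout", "db_slow"},
    {"timeout", "payment_fail"},
    {"db_slow", "payment_fail", "timeout"},
    {"login_fail"},
]
frequent = apriori(transactions, min_support=0.4)
ranked = rules(frequent, min_conf=0.7)
for ante, cons, conf in ranked:
    print(sorted(ante), "->", sorted(cons), conf)
```

Storing counts rather than ratios keeps the confidence a single integer division, which avoids floating-point drift when comparing rules.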
Preferably, step S3 comprises the steps of:
step S31: performing data preprocessing and data conversion on the anomaly detection rule data to obtain anomaly detection data; dividing the anomaly detection data into a training data set, an optimization data set and a detection data set;
step S32: constructing an anomaly detection model based on a support vector machine, wherein the anomaly detection model comprises a training model, a model optimization and a detection model;
step S33: inputting a training data set into a training model based on a support vector machine for model training, and inputting an optimized data set into the training model for parameter tuning through the following abnormal detection loss function to generate a detection model;
wherein the formula of the anomaly detection loss function is as follows:
[formula image not reproduced in the source text]
In the formula, the terms are, in order: the anomaly detection loss function; the anomaly detection loss value; the number of data items in the optimization data set; an indexed data item of the optimization data set; the corresponding indexed data item in the data set of model training results; the regularization parameter; the parameters of the training model; and the correction coefficient of the anomaly detection loss function.
The invention constructs an anomaly detection loss function to calculate the loss value of the anomaly detection training model. When the anomaly detection model is trained on the anomaly detection data, the loss function serves as the index for optimizing model parameters: it evaluates the accuracy of the training model so that the model fits the data as closely as possible. The formula takes into account the number of data items in the optimization data set and the anomaly detection loss value; relates each data item of the optimization data set to the corresponding data item in the model training results; and uses the regularization parameter together with the parameters of the training model for parameter tuning. The interaction between the anomaly detection loss function and these parameters forms the functional relation of the formula, which yields the loss value of the anomaly detection model. The correction coefficient of the anomaly detection loss function allows adjustment for special conditions during model training, which further improves the applicability and stability of the loss function and hence the generalization ability and robustness of the anomaly detection training model.
Step S34: inputting the detection data set into the parameter-tuned detection model for model detection to obtain an optimal anomaly detection model; and re-inputting the anomaly detection rule data into the optimal anomaly detection model for anomaly analysis and detection to obtain abnormal data.
Preprocessing and converting the anomaly detection rule data makes the data easier to classify, analyze, and diagnose, and dividing the anomaly detection data into separate sets makes training of the anomaly detection model more accurate, so that normal and abnormal data are both fully used. The support vector machine is an efficient and accurate machine learning algorithm that lets the model detect abnormal data more accurately, which raises the level of banking business management and risk control. The anomaly detection model built on the support vector machine is efficient, stable, and has strong detection capability, so the bank can handle anomalies more quickly. During training, the model is fitted to the training data set and its parameters are tuned through the anomaly detection loss function, finally producing an accurate detection model. Tuning parameters through the loss function also improves the model's generalization ability, so that it can detect abnormal data accurately in unseen data sets. Detection with the optimal anomaly detection model captures abnormal data more accurately; when abnormal data is detected, the model helps the bank find potential risks, handle them in time, and reduce risk losses. It also speeds up the handling of business management problems and improves the accuracy and efficiency of abnormal data detection and risk control, supporting banking business management and decision-making.
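The train, tune, and detect flow of steps S31 to S34 can be sketched with a small linear support vector machine. Everything specific here is an assumption made for illustration: the loss is the standard mean hinge loss plus an L2 penalty, standing in for the invention's own loss function and its correction coefficient; the 60/20/20 split, the candidate regularization values, and the synthetic two-feature data are hypothetical.

```python
import random

def split(rows, seed=0):
    """Shuffle, then split 60/20/20 into training, optimization, detection sets."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    a, b = len(rows) * 6 // 10, len(rows) * 8 // 10
    return rows[:a], rows[a:b], rows[b:]

def score(w, b, x):
    """Signed distance-like score; positive means 'abnormal'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def loss(w, b, rows, lam):
    """Mean hinge loss over the rows plus the L2 penalty lam * ||w||^2."""
    hinge = sum(max(0.0, 1.0 - y * score(w, b, x)) for x, y in rows) / len(rows)
    return hinge + lam * sum(wi * wi for wi in w)

def train(rows, lam, epochs=200, lr=0.1):
    """Fit a linear SVM by per-sample subgradient descent on the loss above."""
    w, b = [0.0] * len(rows[0][0]), 0.0
    for _ in range(epochs):
        for x, y in rows:
            violated = y * score(w, b, x) < 1.0
            w = [wi - lr * (2 * lam * wi - (y * xi if violated else 0.0))
                 for wi, xi in zip(w, x)]
            if violated:
                b += lr * y
    return w, b

# Synthetic data: label -1 = normal, +1 = abnormal (illustrative only).
data = ([([0.1 + 0.01 * i, 0.2 - 0.01 * i], -1) for i in range(10)] +
        [([0.8 + 0.01 * i, 0.9 - 0.01 * i], 1) for i in range(10)])
train_set, opt_set, detect_set = split(data)

# Parameter tuning: pick the regularization value with the lowest loss
# on the optimization set, then evaluate on the detection set.
LAMBDAS = (0.001, 0.01, 0.1)
models = {lam: train(train_set, lam) for lam in LAMBDAS}
best_lam = min(LAMBDAS, key=lambda lam: loss(*models[lam], opt_set, lam))
w, b = models[best_lam]
accuracy = sum((score(w, b, x) > 0) == (y > 0)
               for x, y in detect_set) / len(detect_set)
print(best_lam, accuracy)
```

Keeping the optimization set separate from the training set means the regularization value is chosen on data the model has not fitted, which is the generalization benefit the text attributes to tuning through the loss function.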
Preferably, step S4 comprises the steps of:
step S41: preprocessing the abnormal data and extracting the relevant features to obtain abnormal data features;
step S42: constructing a root cause classification model based on a decision tree, and inputting the abnormal data features into the root cause classification model for model training to obtain abnormal point data features; verifying and optimizing the model by cross-validation to generate an optimal root cause classification model;
step S43: inputting the abnormal point data features into the optimal root cause classification model for root cause analysis, and calculating weights for the abnormal point data features using a feature importance analysis function to obtain abnormal point feature weights;
the formula of the feature importance analysis function is as follows:
[formula image not reproduced in the source text]
In the formula, the terms are, in order: the k-th node in the decision tree; the abnormal point feature weight of the k-th node; the number of decision trees; the number of non-leaf nodes of a given tree; the split abnormal point feature of a given non-leaf node of that tree; the indicator function specifying that this split feature is the k-th node in the decision tree; the indicator function specifying that this split feature is the (k-1)-th node in the decision tree; the sum of first derivatives on the left node of the non-leaf node; the sum of second derivatives on the left node, and its smoothing parameter; the sum of first derivatives on the right node; the sum of second derivatives on the right node, and its smoothing parameter; the sum of first derivatives of all nodes on the non-leaf node; the sum of second derivatives of all nodes on the non-leaf node, and its smoothing parameter; and the correction coefficient of the abnormal point feature weight.
The invention constructs the feature importance analysis function to improve the accuracy of root cause diagnosis. The function calculates and evaluates the weight of each abnormal point data feature; the decision tree algorithm classifies and regresses the abnormal point data features, which are then sorted and screened by weight to determine the most important features for subsequent root cause analysis. The formula takes into account: the k-th node in the decision tree; the number of decision trees; the number of non-leaf nodes of each tree; the split abnormal point feature of each non-leaf node of each tree; the indicator function specifying that this split feature is the k-th node in the decision tree, and the indicator function specifying that it is the (k-1)-th node; and, for each non-leaf node of each tree, the sums of the first and second derivatives on its left node, on its right node, and over all of its nodes. To keep the sums of second derivatives on the left and right nodes from introducing deviation, smoothing parameters are introduced for adjustment: one for the sum of second derivatives on the left node, one for the sum on the right node, and one for the sum over all nodes. The interaction between the abnormal point feature weight of the k-th node and these parameters forms the functional relation of the formula, which yields the abnormal point feature weight. The correction coefficient of the abnormal point feature weight allows adjustment for special conditions that arise while the weights are calculated, which further improves the applicability and stability of the feature importance analysis function and hence the accuracy of root cause diagnosis.
Step S44: sorting the abnormal point feature weights in descending order to obtain the abnormal root cause data.
Preprocessing the abnormal data and extracting the relevant features reduces noise and redundancy in the data, which lowers computational complexity and improves efficiency. The decision-tree-based root cause classification model is accurate and stable and classifies the root causes of abnormal data more precisely; verifying and optimizing it by cross-validation improves its generalization ability, so that it performs better on unseen data. Root cause analysis with the optimal root cause classification model determines the root cause of abnormal data accurately, and the feature importance analysis function calculates the importance of each abnormal point feature in the classification, making the analysis result more accurate and reliable. Sorting the abnormal point feature weights identifies the main root causes of the abnormal data; handling these main root causes in a targeted way resolves anomalies more effectively, improves service efficiency, and reduces risk losses. In summary, the steps of processing abnormal data, constructing the model, analyzing root causes, and determining the abnormal root cause form a complete root cause analysis flow that improves the efficiency and accuracy of resolving anomalies and supports business management and decision-making.
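The terms listed for the feature importance analysis function (sums of first and second derivatives on the left node, right node, and all nodes, each with a smoothing parameter) resemble the split gain used in gradient-boosted trees. The sketch below assumes that reading: it sums a gain of that form over every non-leaf node that splits on a feature, then sorts in descending order as in step S44. It uses a single smoothing parameter `lam` and omits the correction coefficient and the second indicator term, so it is an approximation under stated assumptions rather than the invention's formula. The node records and feature names are hypothetical.

```python
from collections import defaultdict

def split_gain(gl, hl, gr, hr, lam=1.0):
    """Gain of one split from first (g) and second (h) derivative sums."""
    g, h = gl + gr, hl + hr  # sums over all samples reaching the node
    return gl * gl / (hl + lam) + gr * gr / (hr + lam) - g * g / (h + lam)

def feature_weights(trees, lam=1.0):
    """Sum split gains per feature across all trees, largest first."""
    totals = defaultdict(float)
    for tree in trees:
        for node in tree:  # each entry describes one non-leaf node
            totals[node["feature"]] += split_gain(
                node["gl"], node["hl"], node["gr"], node["hr"], lam)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical non-leaf nodes of two trees (values chosen for easy arithmetic).
trees = [
    [
        {"feature": "db_latency", "gl": 4.0, "hl": 1.0, "gr": -4.0, "hr": 1.0},
        {"feature": "queue_depth", "gl": 2.0, "hl": 1.0, "gr": -2.0, "hr": 1.0},
    ],
    [
        {"feature": "db_latency", "gl": 3.0, "hl": 2.0, "gr": -3.0, "hr": 2.0},
    ],
]
weights = feature_weights(trees)
print(weights)
```

A feature that produces large gains in many splits accumulates a high weight, which is what lets the descending sort surface the main root cause candidates first.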
Preferably, step S41 comprises the steps of:
step S411: preprocessing the abnormal data to obtain an abnormal data set;
step S412: extracting features from the abnormal data set to obtain an abnormal data feature data set; constructing an abnormal data feature database, and storing the abnormal data feature data set in the abnormal data feature database;
step S413: obtaining the feature data packet of the abnormal data feature data set from the abnormal data feature database, and extracting the feature data in the feature data packet to obtain the abnormal data features.
Preprocessing the abnormal data cleans it, reduces noise, and screens out the abnormal records, yielding a cleaned abnormal data set that gives subsequent anomaly detection an accurate data basis and improves detection precision. Feature extraction then maps the abnormal data set into a feature space, reduces the dimensionality of high-dimensional feature data, and extracts the key features, which provides the basis for subsequent anomaly detection and reduces its time and computational cost. Constructing an abnormal data feature database and storing the abnormal data feature data set in it makes it possible to manage and maintain the abnormal data features and to safeguard their integrity, availability, and confidentiality. Obtaining the feature data packet of the abnormal data feature data set from the database and extracting the feature data in it allows the abnormal data features to be analyzed and mined for potential patterns and trends, which provides a reference for the bank's business decisions and management and valuable data support for optimizing the root cause classification model.
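Steps S412 and S413 (extract features, store them in a feature database, read them back) can be sketched with Python's built-in `sqlite3` module. The table layout, the summary features, and the anomaly identifier are all hypothetical; the text does not name a database or a feature set.

```python
import json
import sqlite3

def extract_features(latencies_ms):
    """Summarize one abnormal data set as a small feature dictionary."""
    return {
        "count": len(latencies_ms),
        "mean_latency": sum(latencies_ms) / len(latencies_ms),
        "max_latency": max(latencies_ms),
    }

def store_features(conn, anomaly_id, features):
    """Save a feature data set as a JSON 'feature data packet'."""
    conn.execute(
        "INSERT INTO anomaly_features (anomaly_id, features) VALUES (?, ?)",
        (anomaly_id, json.dumps(features)),
    )

def load_features(conn, anomaly_id):
    """Fetch and decode the feature data packet, or None if absent."""
    row = conn.execute(
        "SELECT features FROM anomaly_features WHERE anomaly_id = ?",
        (anomaly_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None

# In-memory database for illustration; a real deployment would persist it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE anomaly_features ("
    "anomaly_id TEXT PRIMARY KEY, features TEXT NOT NULL)"
)
store_features(conn, "A-001", extract_features([100, 300, 200]))
loaded = load_features(conn, "A-001")
print(loaded)
```

Serializing each feature set as one JSON value keeps the schema stable while the set of extracted features changes.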
Preferably, step S5 comprises the steps of:
step S51: preprocessing and cleaning the abnormal root cause data to obtain standard root cause data;
step S52: analyzing the abnormal root cause points of the standard root cause data using data lineage ("blood margin") analysis, and constructing a lineage relationship graph of the abnormal root cause points to obtain an initial business anomaly lineage map;
step S53: optimizing the localization of the abnormal root cause points in the initial business anomaly lineage map using a preset confidence propagation analysis technique to obtain anomaly source data.
Preprocessing the abnormal root cause data cleans it and quickly screens out standard root cause data, which gives subsequent root cause analysis an accurate data basis and improves its precision. Blood-margin (data lineage) analysis is then used to analyze the abnormal root cause points to which the standard root cause data belong and to construct a blood-margin relation graph of those points. With this graph the bank can analyze the associations between business data and map how abnormal data influences and propagates to other data, so that parallel abnormal points can be located and analyzed quickly and effective information is available for locating the abnormal root cause. The confidence propagation analysis technology iteratively adjusts the confidence of each node until the confidence value of the abnormal source node is obtained. Applying it to the abnormal root cause points in the initial business abnormal blood-margin map therefore locates the abnormal source quickly, improves the accuracy of anomaly detection, supports optimization and redesign of business processes, and yields accurate abnormal source data in support of banking management and decision making.
Preferably, step S53 includes the steps of:
step S530: constructing a confidence propagation analysis technology, wherein the confidence propagation analysis technology comprises a confidence analysis technology, a confidence propagation technology and a confidence tracking technology;
step S531: solving an initial business abnormal blood margin map by a confidence analysis technology to obtain initial node confidence;
step S532: carrying out confidence coefficient propagation on the initial node confidence coefficient, and transmitting and updating the confidence coefficient of each node through a confidence coefficient propagation technology according to the interdependence relationship among the nodes to obtain final confidence coefficient;
step S533: and sorting the final confidence coefficient, and searching a precursor node of a node with higher final confidence coefficient by using a confidence coefficient tracking technology to obtain abnormal source data.
By constructing a confidence propagation analysis technology, the invention can analyze and locate the abnormal root cause of a business, giving the bank a useful analysis tool and method for optimizing its business management and decisions. The confidence analysis technology solves the initial business abnormal blood-margin map to obtain the initial node confidences, which provide the basis for the subsequent confidence propagation and also give the bank reference information for business root cause analysis. The confidence propagation technology then transmits and updates the confidence of each node according to the interdependence among the nodes until the final confidences are obtained, so that the abnormal root cause is located accurately and the bank has a basis for precise anomaly analysis and handling. Propagating confidence among nodes screens candidate root causes step by step, which speeds up root cause location and improves its accuracy; it also avoids gaps in the data information, helping to avoid missed and misjudged root causes and improving the accuracy of anomaly detection.
In addition, propagating the initial node confidences improves the accuracy and speed of anomaly detection and of root cause location, makes business decisions better founded, and helps optimize business processes. Sorting the final confidences and applying the confidence tracking technology quickly finds the abnormal nodes with higher final confidence and searches their precursor nodes, so that the abnormal business source data is located more accurately, helping the bank resolve business problems and raise its service level. Because the confidence tracking technology searches precursor nodes and retrieves their information efficiently, it helps the bank receive and process massive abnormal data in a timely manner, which is of practical value for the bank's daily business operation and risk management.
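Steps S531 to S533 can be sketched as follows. The text does not specify the update rule, so this sketch assumes a simple damped neighbour-averaging scheme in place of a full belief propagation algorithm. The node names, edges, initial confidences, `damping` and `iterations` values are all hypothetical.

```python
def propagate_confidence(edges, initial, damping=0.5, iterations=20):
    """S532 (sketch): blend each node's initial confidence with its neighbours' mean."""
    neighbours = {node: set() for node in initial}
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    conf = dict(initial)
    for _ in range(iterations):
        updated = {}
        for node, linked in neighbours.items():
            mean = sum(conf[m] for m in linked) / len(linked) if linked else conf[node]
            updated[node] = (1 - damping) * initial[node] + damping * mean
        conf = updated
    return conf

def trace_sources(edges, start):
    """S533 (sketch): follow precursor links from `start` back to nodes with no upstream input."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(dst, []).append(src)
    seen, stack, roots = set(), [start], []
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        parents = preds.get(node, [])
        if not parents:
            roots.append(node)
        stack.extend(parents)
    return sorted(roots)

# Hypothetical lineage edges (upstream, downstream) and initial node confidences (S531).
edges = [
    ("core_ledger", "payments"),
    ("payments", "settlement"),
    ("payments", "reporting"),
    ("fx_rates", "settlement"),
]
initial = {"core_ledger": 0.2, "payments": 0.9, "settlement": 0.7,
           "reporting": 0.6, "fx_rates": 0.1}

final = propagate_confidence(edges, initial)
ranked = sorted(final, key=final.get, reverse=True)   # sort by final confidence
sources = trace_sources(edges, ranked[0])             # precursors of the top-ranked node
```

Keeping part of the initial confidence at every iteration (the `1 - damping` term) prevents all nodes from converging to the same value, so the ranking still reflects where the anomaly evidence was observed.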
Preferably, step S6 comprises the steps of:
step S61: carrying out data analysis on the abnormal source data to construct abnormal blood-margin maps of different types of businesses;
step S62: carrying out detailed analysis on different types of abnormal business blood-margin maps according to abnormal conditions, and knowing the attribute and type among abnormal banking businesses to obtain an abnormal root cause diagnosis scheme;
step S63: the bank staff performs the corresponding abnormal root cause diagnosis process by checking and analyzing the abnormal root cause diagnosis scheme.
According to the invention, data analysis of the abnormal source data constructs a business abnormal blood-margin map for each type of abnormality, providing a useful tool and data support for analyzing and diagnosing abnormalities. Analyzing the different types of business abnormal blood-margin maps in detail according to the abnormal conditions gives a deeper understanding of the attributes and types of the abnormalities, locates the abnormal root cause more accurately and quickly, examines the problem from multiple dimensions, and produces statistics on abnormal data that support risk control. This improves the management level of the banking business and the customer's business experience, and the deeper understanding of the types and attributes of banking abnormalities provides information and guidance for preparing the corresponding abnormal root cause diagnosis scheme. By checking and analyzing the abnormal root cause diagnosis scheme, bank staff can quickly locate the source of a business abnormality and handle it accordingly, which improves banking efficiency, customer satisfaction and trust, helps prevent similar abnormal events from recurring, and improves team cooperation efficiency.
Preferably, in the present specification, there is also provided a banking abnormality cause diagnosis system including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the banking anomaly root diagnosis method as claimed in any one of the preceding claims.
In summary, the present invention provides a banking abnormal root cause diagnosis system that can implement any one of the banking abnormal root cause diagnosis methods of the present invention through the cooperation of a memory, a processor and a computer program running on the memory; the internal structures of the system cooperate with each other. Based on the business abnormal blood-margin map technology, the system analyzes the relationships and connection modes of banking data nodes using advanced algorithms and models, so as to locate and identify abnormal data sources quickly. The core technologies of the system include abnormal blood-margin map construction, abnormal source data location, abnormal source data influence analysis and abnormal root cause diagnosis. The system combines technologies such as artificial intelligence, machine learning and knowledge graphs; it can monitor abnormal data in the banking system in real time and analyze the relationships and organizational structure among abnormal data as well as their influence on other business processes, thereby helping the operation and maintenance personnel of the banking system to locate and resolve abnormal conditions quickly. The technology can improve the stability and reliability of banking business processes and provide higher-quality financial services for customers.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of a non-limiting implementation, made with reference to the accompanying drawings in which:
FIG. 1 is a schematic flow chart of steps of a method for diagnosing abnormal root cause of banking business according to the present invention;
FIG. 2 is a flowchart illustrating the detailed implementation of step S2 in FIG. 1;
FIG. 3 is a flowchart illustrating the detailed implementation of step S3 in FIG. 1;
FIG. 4 is a flowchart illustrating the detailed implementation of step S4 in FIG. 1;
FIG. 5 is a flowchart illustrating the detailed implementation of step S41 in FIG. 4;
FIG. 6 is a flowchart illustrating the detailed implementation of step S5 in FIG. 1;
FIG. 7 is a flowchart illustrating the detailed implementation of step S53 in FIG. 6;
FIG. 8 is a flowchart illustrating the detailed implementation of step S6 in FIG. 1.
Detailed Description
The following is a clear and complete description of the technical method of the present patent in conjunction with the accompanying drawings, and it is evident that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, are intended to fall within the scope of the present invention.
Furthermore, the drawings are merely schematic illustrations of the present invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. The functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
It will be understood that, although the terms "first," "second," etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
In order to achieve the above objective, referring to fig. 1 to 8, the present invention provides a method for diagnosing abnormal root cause of banking, the method comprising the following steps:
step S1: carrying out data acquisition and data preprocessing on a banking system to obtain abnormal banking data;
step S2: offline calculation is carried out on abnormal data of banking business by using an association rule mining technology, so as to obtain abnormal detection rule data;
step S3: performing anomaly detection on anomaly detection rule data through a preset anomaly detection model to obtain anomaly data;
step S4: extracting the characteristics of the abnormal data to obtain the characteristics of the abnormal data; constructing a root cause classification model based on a decision tree, and inputting abnormal data characteristics into the root cause classification model to perform root cause analysis to obtain abnormal root cause data;
step S5: performing anomaly positioning on the anomaly root data by using a method combining confidence propagation analysis and blood margin analysis technology to obtain anomaly source data;
step S6: and constructing a business abnormal blood-margin map according to the abnormal source data, and further knowing the relationship between banking business abnormalities through the business abnormal blood-margin map so as to execute a corresponding abnormal root cause diagnosis process.
In the embodiment of the present invention, FIG. 1 shows a flow chart of the steps of the method for diagnosing the abnormal root cause of banking business; in this example, the method includes:
step S1: carrying out data acquisition and data preprocessing on a banking system to obtain abnormal banking data;
In the embodiment of the invention, data acquisition and data preprocessing are performed on the banking system. Data acquisition generally takes two forms, real-time acquisition and offline acquisition: real-time acquisition uses a monitoring system to monitor and collect transaction data, log data and the like as they are produced, while offline acquisition relies on means such as backups and data transfers. The acquired data then undergoes operations such as deduplication, cleaning and missing-value filling, finally yielding the banking abnormal data.
Step S2: offline calculation is carried out on abnormal data of banking business by using an association rule mining technology, so as to obtain abnormal detection rule data;
In the embodiment of the invention, the association rule mining technology is used to compute offline, from the banking abnormal data, information such as the anomaly detection rules and anomaly detection thresholds of the different business processes, finally yielding the anomaly detection rule data.
Step S3: performing anomaly detection on anomaly detection rule data through a preset anomaly detection model to obtain anomaly data;
In the embodiment of the invention, anomaly detection is performed with a pre-constructed anomaly detection model based on a support vector machine. The anomaly detection rule data is divided into a training data set, an optimization data set and a detection data set, which are used for model training, parameter optimization and model verification respectively; the resulting anomaly detection model then identifies the abnormal data, finally yielding the abnormal data.
Step S4: extracting the characteristics of the abnormal data to obtain the characteristics of the abnormal data; constructing a root cause classification model based on a decision tree, and inputting abnormal data characteristics into the root cause classification model to perform root cause analysis to obtain abnormal root cause data;
In the embodiment of the invention, feature extraction is performed on the abnormal data to obtain its key characteristics; a root cause classification model based on a decision tree is then constructed, and the extracted abnormal data characteristics are input into it for root cause analysis, finally yielding the abnormal root cause data.
Step S5: performing anomaly positioning on the anomaly root data by using a method combining confidence propagation analysis and blood margin analysis technology to obtain anomaly source data;
According to the embodiment of the invention, through combining confidence propagation analysis and blood margin analysis technology, the abnormal root data is subjected to abnormal positioning, the source and related factors of the abnormal root data are determined, and finally the abnormal source data are obtained.
Step S6: and constructing a business abnormal blood-margin map according to the abnormal source data, and further knowing the relationship between banking business abnormalities through the business abnormal blood-margin map so as to execute a corresponding abnormal root cause diagnosis process.
According to the embodiment of the invention, the abnormal source data and the abnormal fields are used to construct a business abnormal blood-margin map. The business abnormal blood-margin map is a topological graph representing data sources and data flow directions, and it helps in understanding the relationships among the abnormal source data. The sources, flow directions and changes of the abnormal source data are analyzed and determined through this map, and solutions are selected and executed according to the abnormal root cause, for example repairing possible faults or modifying abnormal configurations and parameters, finally yielding an abnormal root cause diagnosis scheme. By checking and analyzing this scheme, bank staff can quickly locate the source of a business abnormality and carry out the corresponding abnormal root cause diagnosis processing.
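A blood-margin (lineage) map of the kind described in step S6, a topological graph of data sources and data flow directions, can be sketched as two adjacency maps. This is an illustrative sketch only; the field names and flows are hypothetical, and the text does not prescribe a particular graph representation.

```python
from collections import defaultdict, deque

def build_lineage(flows):
    """Build upstream and downstream adjacency maps from (source, target) flow records."""
    upstream, downstream = defaultdict(set), defaultdict(set)
    for src, dst in flows:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return upstream, downstream

def reachable(adjacency, start):
    """Breadth-first search: every node reachable from `start` along `adjacency`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical field-level data flows in a banking system.
flows = [
    ("atm_log.amount", "txn.amount"),
    ("txn.amount", "daily_balance.total"),
    ("fx.rate", "daily_balance.total"),
    ("daily_balance.total", "risk_report.exposure"),
]
upstream, downstream = build_lineage(flows)

# Where could an anomaly in daily_balance.total have come from, and what does
# an anomaly in txn.amount affect further down the flow?
origins = reachable(upstream, "daily_balance.total")
impact = reachable(downstream, "txn.amount")
```

Walking the upstream map answers the source question, and walking the downstream map answers the influence question, which correspond to the source and flow-direction analysis described above.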
Data acquisition helps capture the abnormal data in the banking system accurately, providing the basic data for abnormal root cause analysis, and preprocessing operations such as cleaning and normalization improve the quality and accuracy of the data, giving subsequent analysis and processing a more reliable data basis. The association rule mining technology effectively finds abnormal rules and patterns in the banking system, so that banking abnormal data can be detected more accurately, and computing the anomaly detection rule data offline reduces the computation required for online detection and improves detection efficiency. The preset anomaly detection model detects abnormal data accurately, providing the data basis needed for subsequent abnormal root cause analysis. Feature extraction from the abnormal data together with the decision-tree-based root cause classification model identifies the root cause of abnormal banking business accurately and supports subsequent anomaly location and resolution; feature extraction also yields data characteristics that are clearer, more concise and easier to understand, improving the interpretability and application value of the data. Combining confidence propagation analysis with blood-margin analysis to locate anomalies in the abnormal root cause data improves the accuracy and precision of anomaly location. Anomaly location in turn finds the abnormal source data quickly and accurately, further narrows the range of candidate root causes, and provides the data basis needed for subsequent root cause diagnosis.
Constructing the business abnormal blood-margin map allows the association relations between abnormal banking businesses to be mined in depth and hidden associations in the abnormal data to be found, providing a richer data basis for abnormal root cause analysis. Finally, the business abnormal blood-margin map allows the relationships between abnormal banking businesses to be identified quickly, which improves the efficiency and accuracy of root cause diagnosis, shortens the time to repair an abnormality and reduces the risk of loss to the banking business, thereby helping the operation and maintenance personnel of the banking system to locate and resolve abnormal conditions quickly.
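The decision-tree-based root cause classification of step S4 can be illustrated with a small Gini-impurity tree. This is a sketch under stated assumptions rather than the model of the invention: the three features (latency in ms, error rate, amount deviation), the root cause labels and the training rows are hypothetical, and the text does not state which splitting criterion is used.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (weighted_gini, feature_index, threshold) of the best binary split, or None."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively build a tree; leaves are labels, inner nodes are (f, t, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def classify(tree, row):
    """Walk from the root to a leaf and return the predicted root cause label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical abnormal data characteristics: (latency_ms, error_rate, amount_deviation).
rows = [(900, 0.01, 0.1), (950, 0.02, 0.2), (120, 0.30, 0.1),
        (110, 0.25, 0.3), (100, 0.01, 4.0), (130, 0.02, 5.0)]
labels = ["network_delay", "network_delay", "service_fault",
          "service_fault", "data_entry_error", "data_entry_error"]

tree = build_tree(rows, labels)
root_cause = classify(tree, (105, 0.28, 0.2))
```

A decision tree suits this step because each path from root to leaf reads as an explicit rule, which makes the predicted root cause easy for operations staff to check.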
Preferably, step S1 comprises the steps of:
acquiring banking data by acquiring data of a banking system;
the embodiment of the invention collects transaction data, customer data, account data and the like in a banking system through real-time collection and off-line collection, and performs corresponding desensitization and encryption processing on the data to finally obtain banking data.
Preferably, the banking data is subjected to data preprocessing to obtain banking abnormal data.
According to the embodiment of the invention, the collected original banking data is subjected to preprocessing operations such as removing the repeated value, filling the missing value, converting the data type and the like, so that the banking abnormal data is detected from the banking data, and finally the banking abnormal data is obtained.
By collecting and processing the banking data, the invention can automatically find the abnormal data in the banking system, enabling banking operators to identify system faults and abnormal business fluctuations in time, take the necessary measures sooner and reduce losses. The data collection and preprocessing steps also provide useful information about banking system performance. Based on this information, banking operators can make better decisions to optimize system performance and improve business efficiency and results, while also reducing human error and false reports.
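The preprocessing operations named for step S1 (removing repeated values, filling missing values, converting data types) can be sketched as below. The record layout, the `txn_id` and `amount` field names, and the choice of the median as the fill value are assumptions made for illustration.

```python
import statistics

def preprocess(records):
    """Deduplicate by txn_id, convert amounts to float, fill missing amounts with the median."""
    seen, cleaned = set(), []
    for rec in records:
        key = rec["txn_id"]
        if key in seen:          # drop repeated records
            continue
        seen.add(key)
        amount = rec.get("amount")
        cleaned.append({
            "txn_id": key,
            # convert the string amount to a number; keep None for missing values
            "amount": float(amount) if amount not in (None, "") else None,
        })
    known = [r["amount"] for r in cleaned if r["amount"] is not None]
    fill = statistics.median(known) if known else 0.0
    for r in cleaned:
        if r["amount"] is None:  # fill missing values
            r["amount"] = fill
    return cleaned

# Hypothetical raw banking records with a duplicate and a missing amount.
raw = [
    {"txn_id": "t1", "amount": "100.5"},
    {"txn_id": "t1", "amount": "100.5"},
    {"txn_id": "t2", "amount": None},
    {"txn_id": "t3", "amount": "20"},
]
cleaned = preprocess(raw)
```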
Preferably, step S2 comprises the steps of:
step S21: obtaining abnormal business rule data by converting the abnormal banking business data;
step S22: utilizing frequent item set mining based on Apriori algorithm to acquire association rules in the business exception rule data, and obtaining a business exception frequent item set;
step S23: calculating the confidence coefficient of the business abnormal frequent item set according to the association rule algorithm, and obtaining the association rule of the confidence coefficient to obtain the rule confidence coefficient;
wherein the association rule algorithm formula is as follows:
[formula not reproduced in this text]
where the symbols of the formula denote, in order: the rule confidence; the exponential function; the number of sets in the business abnormal frequent item set; a first indexed frequent item set in the business abnormal frequent item set; a second indexed frequent item set in the business abnormal frequent item set; the probability of occurrence of the first indexed frequent item set; the probability of occurrence of the second indexed frequent item set; the probability that the first and second indexed frequent item sets occur simultaneously; and the correction coefficient of the rule confidence;
step S24: and sequencing the business abnormal frequent item sets corresponding to the rule confidence degrees according to the sequence from large to small, and selecting the business abnormal frequent item set with higher rule confidence degrees to obtain abnormal detection rule data.
As an embodiment of the present invention, referring to fig. 2, a detailed step flow chart of step S2 in fig. 1 is shown, in which step S2 includes the following steps:
step S21: obtaining abnormal business rule data by converting the abnormal banking business data;
according to the embodiment of the invention, the business abnormal rule is defined according to the characteristics and the requirements of the banking business, the collected and preprocessed banking business abnormal data is converted into the business abnormal rule data, and finally the business abnormal rule data is obtained.
Step S22: utilizing frequent item set mining based on Apriori algorithm to acquire association rules in the business exception rule data, and obtaining a business exception frequent item set;
In the embodiment of the invention, frequent item set mining based on the Apriori algorithm is used to obtain the association rules in the business exception rule data: the association rules that meet the minimum support and minimum confidence thresholds are determined, the mined rules are evaluated and optimized by adjusting the confidence threshold or deleting invalid rules, and the business exception frequent item set is finally obtained.
Step S23: calculating the confidence coefficient of the business abnormal frequent item set according to the association rule algorithm, and obtaining the association rule of the confidence coefficient to obtain the rule confidence coefficient;
According to the embodiment of the invention, the confidence of each mined business abnormal frequent item set is calculated with the association rule algorithm; rules whose confidence does not meet the previously set confidence threshold are pruned, leaving the set of confidences that meets the requirement, and the rule confidence is finally obtained.
Wherein, the association rule algorithm formula is as follows:
[formula not reproduced in this text]
where the symbols of the formula denote, in order: the rule confidence; the exponential function; the number of sets in the business abnormal frequent item set; a first indexed frequent item set in the business abnormal frequent item set; a second indexed frequent item set in the business abnormal frequent item set; the probability of occurrence of the first indexed frequent item set; the probability of occurrence of the second indexed frequent item set; the probability that the first and second indexed frequent item sets occur simultaneously; and the correction coefficient of the rule confidence;
The invention constructs the association rule algorithm formula in order to obtain an accurate rule confidence. Calculating the confidence of the business abnormal frequent item set with this formula helps business operators locate abnormal business conditions quickly and handle them better, improves the efficiency of banking operations, and improves the efficiency and accuracy of banking anomaly detection. The formula takes into account the two indexed frequent item sets in the business abnormal frequent item set, the probability of occurrence of each of them, and the probability that they occur simultaneously. To prevent deviation in the resulting rule confidence, the value is also normalized, and the correlation between the rule confidence and these parameters constitutes the functional relationship of the formula [not reproduced in this text]. The confidence of the business abnormal frequent item set is calculated through the association rule algorithm to obtain the rule confidence, and the correction coefficient of the rule confidence in the formula can be adjusted according to actual conditions, which improves the accuracy and applicability of the association rule algorithm.
Step S24: and sequencing the business abnormal frequent item sets corresponding to the rule confidence degrees according to the sequence from large to small, and selecting the business abnormal frequent item set with higher rule confidence degrees to obtain abnormal detection rule data.
According to the embodiment of the invention, the business abnormal frequent item sets are ordered according to the rule confidence degree from large to small, the business abnormal frequent item set with higher rule confidence degree is selected according to the ordered result, and the corresponding business abnormal rule is established according to the selected business abnormal frequent item set, so that the abnormal detection rule data is finally obtained.
By converting the banking abnormal data and mining frequent item sets with the Apriori algorithm, the invention helps the bank find the business abnormal frequent item sets quickly and effectively. Calculating the confidence of those item sets then screens out meaningful business abnormal rule data, and selecting item sets by confidence further improves the accuracy of business anomaly detection. Deriving association rules from the business abnormal data, mining frequent item sets and using confidence to determine the validity of each rule helps guard against the bank's business risk. An association rule with high confidence indicates a strong association within the related business abnormal rule data and reflects a regularity that actually occurs, so selecting rules with higher confidence improves detection accuracy. The rule confidence obtained from the association rule algorithm helps the bank understand its business anomalies better, improve operation quality and reduce business risk, and provides support for the bank's business development. Selecting the business abnormal frequent item sets in descending order of rule confidence for anomaly detection improves detection accuracy, reduces the false alarm rate and supports the bank's business management and decisions.
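Steps S22 to S24 can be sketched with a compact Apriori implementation. Because the invention's own confidence formula (with its exponential term and correction coefficient) is not reproduced in this text, the sketch substitutes the textbook definition, confidence(A => B) = support(A and B together) / support(A). The event names, transactions and thresholds are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all itemsets meeting min_support."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]
    support = {}
    current = {frozenset([item]) for t in sets for item in t}
    k = 1
    while current:
        counts = {c: sum(1 for t in sets if c <= t) for c in current}
        frequent = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        support.update(frequent)
        k += 1
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in frequent for s in combinations(c, k - 1))}
    return support

def rank_rules(support, min_confidence):
    """S23/S24 (sketch): compute textbook confidence and sort rules from high to low."""
    rules = []
    for itemset, supp in support.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = supp / support[antecedent]
                if conf >= min_confidence:
                    rules.append((sorted(antecedent), sorted(itemset - antecedent), conf))
    return sorted(rules, key=lambda rule: rule[2], reverse=True)

# Hypothetical business anomaly events observed together in the same time window.
transactions = [
    {"timeout", "retry_storm", "queue_backlog"},
    {"timeout", "retry_storm"},
    {"timeout", "queue_backlog"},
    {"balance_mismatch", "duplicate_posting"},
    {"timeout", "retry_storm", "queue_backlog"},
]
support = apriori(transactions, min_support=0.4)
ranked = rank_rules(support, min_confidence=0.9)
```

With these sample transactions, the surviving rules all predict `timeout` from its co-occurring events, which is the kind of high-confidence rule step S24 would keep as anomaly detection rule data.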
Preferably, step S3 comprises the steps of:
step S31: performing data preprocessing and data conversion on the anomaly detection rule data to obtain anomaly detection data; dividing the anomaly detection data into a training data set, an optimization data set and a detection data set;
step S32: constructing an anomaly detection model based on a support vector machine, wherein the anomaly detection model comprises a training model, a model optimization stage and a detection model;
step S33: inputting the training data set into the training model based on the support vector machine for model training, and inputting the optimization data set into the training model for parameter tuning by means of the following anomaly detection loss function to generate the detection model;
wherein the formula of the anomaly detection loss function is as follows:
[formula not reproduced]
wherein the symbols of the formula denote, in order of appearance: the anomaly detection loss function; the anomaly detection loss value; the number of data items in the optimization data set; an indexed data item of the optimization data set; the correspondingly indexed data item of the model training result data set; the regularization parameter; the parameters of the training model; and the correction coefficient of the anomaly detection loss function;
step S34: inputting the detection data set into the parameter-optimized detection model for model detection to obtain an optimal anomaly detection model; and re-inputting the anomaly detection rule data into the optimal anomaly detection model for anomaly analysis and detection to obtain abnormal data.
As an embodiment of the present invention, referring to FIG. 3, which shows a detailed flow chart of step S3 in FIG. 1, step S3 in this embodiment includes the following steps:
Step S31: performing data preprocessing and data conversion on the anomaly detection rule data to obtain anomaly detection data; dividing the anomaly detection data into a training data set, an optimization data set and a detection data set;
According to the embodiment of the invention, the anomaly detection rule data is cleaned, de-duplicated and filled to ensure the accuracy and consistency of the data, and the cleaned data is converted into a data format suitable for model training and anomaly detection, finally yielding the anomaly detection data. The preprocessed and converted anomaly detection data is divided in a 7:2:1 ratio into three data sets for training, optimization and detection: the training data set, 70% of the anomaly detection data, is used to train the anomaly detection model; the optimization data set, 20%, is used to tune the parameters of the anomaly detection model during the training stage; and the detection data set, 10%, is used to test and verify the trained anomaly detection model.
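The 7:2:1 division can be sketched as below. The shuffle, the fixed seed and the function name `split_7_2_1` are assumptions made for illustration; the description only fixes the proportions.

```python
import random


def split_7_2_1(records, seed=0):
    """Split records into training, optimization and detection sets (70% / 20% / 10%).

    The shuffle and the fixed seed are assumptions; the source only states the ratio.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 7 // 10          # integer arithmetic avoids float rounding
    n_optimization = n * 2 // 10
    training = shuffled[:n_train]
    optimization = shuffled[n_train:n_train + n_optimization]
    detection = shuffled[n_train + n_optimization:]  # remainder, about 10%
    return training, optimization, detection
```

For time-ordered banking data a chronological split may be preferable to a random shuffle; that choice is outside what the description specifies.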
Step S32: constructing an anomaly detection model based on a support vector machine, wherein the anomaly detection model comprises a training model, a model optimization stage and a detection model;
According to the embodiment of the invention, an anomaly detection model is constructed based on a support vector machine and comprises a training model, a model optimization stage and a detection model. The training model trains the anomaly detection model on the training data set and evaluates it; the model optimization stage tunes the parameters of the trained anomaly detection model on the optimization data set to obtain the detection model; and the detection model is tested and verified on the detection data set so as to find potential anomalies and risks.
Step S33: inputting the training data set into the training model based on the support vector machine for model training, and inputting the optimization data set into the training model for parameter tuning by means of the following anomaly detection loss function to generate the detection model;
According to the embodiment of the invention, the anomaly detection model based on the support vector machine is trained on the divided training data set, and the anomaly detection loss function is selected as the objective for optimizing the model parameters. The optimization data set is input into the training model, and gradient descent on the anomaly detection loss function is computed iteratively, so that the model parameters are optimized and the optimal hyperparameters are obtained, finally generating the detection model.
Wherein the formula of the anomaly detection loss function is as follows:
[formula not reproduced]
wherein the symbols of the formula denote, in order of appearance: the anomaly detection loss function; the anomaly detection loss value; the number of data items in the optimization data set; an indexed data item of the optimization data set; the correspondingly indexed data item of the model training result data set; the regularization parameter; the parameters of the training model; and the correction coefficient of the anomaly detection loss function;
The invention constructs an anomaly detection loss function for calculating the loss value of the anomaly detection training model. When the anomaly detection model is trained on the anomaly detection data, the anomaly detection loss function serves as the index for model parameter optimization, so that the model fits the data as closely as possible. The formula takes into account the number of data items in the optimization data set and the anomaly detection loss value; it relates each data item of the optimization data set to the corresponding data item of the model training result data set, and uses the regularization parameter together with the parameters of the training model for parameter tuning. The loss value of the anomaly detection model is calculated from the functional relation formed between the anomaly detection loss function and these parameters. In addition, the correction coefficient of the anomaly detection loss function allows adjustment for special situations that arise during model training, which further improves the applicability and stability of the anomaly detection loss function and thereby the generalization capability and robustness of the anomaly detection training model.
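Because the loss formula itself is not legible in this text, the following sketch substitutes a standard regularized hinge loss for a linear support vector machine and should not be read as the patented loss function. It shows the general pattern the step describes: train by gradient descent on the training data set, then choose the regularization parameter by the loss measured on the optimization data set. The labels (+1 abnormal, -1 normal), the learning rate, the candidate values and all function names are assumptions.

```python
def hinge_loss(w, b, X, y, lam):
    """Mean hinge loss plus an L2 penalty (stand-in for the patent's loss function)."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        total += max(0.0, 1.0 - margin)
    return total / len(X) + lam * sum(wj * wj for wj in w)


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the regularized hinge loss."""
    d, n = len(X[0]), len(X)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [2.0 * lam * wj for wj in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:                     # only margin violators contribute
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def tune_regularization(train, optimization, candidates=(0.001, 0.01, 0.1)):
    """Pick the regularization parameter with the lowest hinge loss on the optimization set."""
    (X_train, y_train), (X_opt, y_opt) = train, optimization
    best = None
    for lam in candidates:
        w, b = train_linear_svm(X_train, y_train, lam=lam)
        loss = hinge_loss(w, b, X_opt, y_opt, lam=0.0)  # compare data fit only
        if best is None or loss < best[0]:
            best = (loss, lam, w, b)
    return best


def predict(w, b, x):
    """Return +1 (abnormal) or -1 (normal) under the assumed label convention."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

The correction coefficient mentioned in the description has no counterpart in this sketch, since its role in the original formula cannot be recovered from the text.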
Step S34: inputting the detection data set into the parameter-optimized detection model for model detection to obtain an optimal anomaly detection model; and re-inputting the anomaly detection rule data into the optimal anomaly detection model for anomaly analysis and detection to obtain abnormal data.
According to the embodiment of the invention, the performance and accuracy of the trained detection model are verified on the detection data set to obtain the optimal anomaly detection model. The optimal anomaly detection model is then applied to the anomaly detection rule data, potential risks and anomalies are identified by the model, and the abnormal data is finally obtained.
By preprocessing and converting the anomaly detection rule data, the invention makes the data easier to classify, analyze and diagnose, and dividing the anomaly detection data into separate data sets makes training of the anomaly detection model more accurate, so that both normal and abnormal data are fully used. The support vector machine is an efficient and accurate machine learning algorithm, and an anomaly detection model built on it is efficient, stable and capable of detecting abnormal data accurately, so that the bank can handle anomalies more quickly and improve its business management and risk control. During training the model is fitted to the training data set, and optimizing the model parameters with the anomaly detection loss function improves the generalization capability of the model, so that it can accurately detect abnormal data in unseen data sets. Detection with the optimal anomaly detection model captures abnormal data more accurately; when abnormal data is detected, the bank can find potential risk problems, handle them in time and reduce risk losses. This improves the accuracy and efficiency of abnormal data detection and risk control, and supports banking business management and decision-making.
Preferably, step S4 comprises the steps of:
step S41: preprocessing the abnormal data and extracting relevant features to obtain abnormal data features;
step S42: constructing a root cause classification model based on a decision tree, and inputting the abnormal data features into the root cause classification model for model training to obtain abnormal point data features; verifying and optimizing the model by cross-validation to generate an optimal root cause classification model;
step S43: inputting the abnormal point data features into the optimal root cause classification model for root cause analysis, and calculating the weights of the abnormal point data features with a feature importance analysis function to obtain abnormal point feature weights;
wherein the formula of the feature importance analysis function is as follows:
[formula not reproduced]
wherein the symbols of the formula denote, in order of appearance: the k-th node in the decision tree; the abnormal point feature weight of the k-th node; the number of decision trees; the number of non-leaf nodes of a given tree; the split abnormal point feature of a given non-leaf node of that tree; an indicator specifying that this split abnormal point feature is the k-th node in the decision tree; an indicator specifying that this split abnormal point feature is the (k-1)-th node in the decision tree; the sum of the first derivatives on the left child node of the non-leaf node; the sum of the second derivatives on the left child node, together with its smoothing parameter; the sum of the first derivatives on the right child node; the sum of the second derivatives on the right child node, together with its smoothing parameter; the sum of the first derivatives over all nodes under the non-leaf node; the sum of the second derivatives over all nodes under the non-leaf node, together with its smoothing parameter; and the correction coefficient of the abnormal point feature weight;
step S44: sorting the abnormal point feature weights in descending order to obtain the abnormal root cause data.
As an embodiment of the present invention, referring to FIG. 4, which shows a detailed flow chart of step S4 in FIG. 1, step S4 in this embodiment includes the following steps:
Step S41: preprocessing the abnormal data and extracting relevant features to obtain abnormal data features;
According to the embodiment of the invention, the obtained abnormal data is cleaned, de-duplicated and has its missing values filled, and relevant features are then extracted from the preprocessed abnormal data by a feature extraction algorithm, finally yielding the abnormal data features.
Step S42: constructing a root cause classification model based on a decision tree, and inputting the abnormal data features into the root cause classification model for model training to obtain abnormal point data features; verifying and optimizing the model by cross-validation to generate an optimal root cause classification model;
According to the embodiment of the invention, the root cause classification model is constructed with a decision-tree-based classification algorithm. The abnormal data features are fed into the root cause classification model for training, and abnormal data features carrying similar anomaly information are assigned to the same class, yielding the corresponding abnormal point data features. The root cause classification model is optimized by cross-validation, during which the optimal hyperparameter combination is searched for by grid search, random search or similar methods, finally generating the optimal root cause classification model.
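The combination of cross-validation and grid search can be illustrated with a deliberately small stand-in model. The sketch below uses a depth-1 decision tree (a stump) whose "hyperparameters" are the split feature and threshold; a real root cause classifier would be a full decision tree with parameters such as depth or minimum leaf size. All names, the interleaved fold scheme and the accuracy metric are assumptions.

```python
from itertools import product


def k_fold_indices(n, k):
    """Yield (train_indices, validation_indices) for k interleaved folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


def majority(labels, default):
    """Most frequent label; ties are broken deterministically by label order."""
    if not labels:
        return default
    return max(sorted(set(labels)), key=labels.count)


def fit_stump(X, y, feature, threshold):
    """Depth-1 decision tree: one majority label on each side of the threshold."""
    fallback = majority(list(y), None)
    left = [yi for xi, yi in zip(X, y) if xi[feature] <= threshold]
    right = [yi for xi, yi in zip(X, y) if xi[feature] > threshold]
    return feature, threshold, majority(left, fallback), majority(right, fallback)


def stump_predict(stump, x):
    feature, threshold, left_label, right_label = stump
    return left_label if x[feature] <= threshold else right_label


def grid_search_cv(X, y, grid, k=3):
    """Return the (feature, threshold) pair with the best cross-validated accuracy."""
    best_params, best_accuracy = None, -1.0
    for feature, threshold in product(grid["feature"], grid["threshold"]):
        correct = total = 0
        for train_idx, val_idx in k_fold_indices(len(X), k):
            stump = fit_stump([X[i] for i in train_idx],
                              [y[i] for i in train_idx], feature, threshold)
            correct += sum(stump_predict(stump, X[i]) == y[i] for i in val_idx)
            total += len(val_idx)
        accuracy = correct / total
        if accuracy > best_accuracy:
            best_params, best_accuracy = (feature, threshold), accuracy
    return best_params, best_accuracy
```

Random search follows the same loop with sampled rather than enumerated parameter combinations.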
Step S43: inputting the abnormal point data features into the optimal root cause classification model for root cause analysis, and calculating the weights of the abnormal point data features with a feature importance analysis function to obtain abnormal point feature weights;
According to the embodiment of the invention, root cause analysis is performed on the abnormal point data features with the optimal root cause classification model to identify the specific features that cause the anomaly. The importance of each feature in the model is then calculated with the feature importance analysis function; the feature importance measures how strongly a feature influences the abnormal data, which helps identify the root cause and finally yields the abnormal point feature weights.
The formula of the feature importance analysis function is as follows:
[formula not reproduced]
wherein the symbols of the formula denote, in order of appearance: the k-th node in the decision tree; the abnormal point feature weight of the k-th node; the number of decision trees; the number of non-leaf nodes of a given tree; the split abnormal point feature of a given non-leaf node of that tree; an indicator specifying that this split abnormal point feature is the k-th node in the decision tree; an indicator specifying that this split abnormal point feature is the (k-1)-th node in the decision tree; the sum of the first derivatives on the left child node of the non-leaf node; the sum of the second derivatives on the left child node, together with its smoothing parameter; the sum of the first derivatives on the right child node; the sum of the second derivatives on the right child node, together with its smoothing parameter; the sum of the first derivatives over all nodes under the non-leaf node; the sum of the second derivatives over all nodes under the non-leaf node, together with its smoothing parameter; and the correction coefficient of the abnormal point feature weight;
The invention constructs the formula of a feature importance analysis function to improve the accuracy of root cause diagnosis. The function calculates and evaluates the weight of each abnormal point data feature; the abnormal point data features are classified and regressed by the decision tree algorithm, then sorted and screened by weight to determine the most important features for subsequent root cause analysis. The formula takes into account: the k-th node in the decision tree; the number of decision trees; the number of non-leaf nodes of each tree; the split abnormal point feature of each non-leaf node of each tree; the indicator function specifying that this split feature is the k-th node in the decision tree; the indicator function specifying that it is the (k-1)-th node; and, for each non-leaf node of each tree, the sums of the first and second derivatives on its left child node, on its right child node, and over all of its nodes. To keep the sums of the second derivatives on the left and right child nodes from introducing deviation, a smoothing parameter is introduced to adjust each second-derivative sum: one for the left child node, one for the right child node, and one for all nodes under the non-leaf node. The abnormal point feature weight of the k-th node is calculated from the functional relation formed between the weight and these parameters. In addition, the correction coefficient of the abnormal point feature weight allows adjustment for special situations that arise while calculating the abnormal point data feature weights, which further improves the applicability and stability of the feature importance analysis function and thereby the accuracy of root cause diagnosis.
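The quantities listed above (sums of first and second derivatives on the left child, the right child and the whole node, each smoothed by a parameter) resemble the split gain used in gradient-boosted trees, so a gain-based importance is sketched here under that assumption. It is not a reconstruction of the patent's formula: a single smoothing parameter `lam` replaces the three separate ones, the correction coefficient is omitted, and the node dictionary layout is hypothetical.

```python
from collections import defaultdict


def split_gain(g_left, h_left, g_right, h_right, lam=1.0):
    """Gradient-boosting style split gain (assumed form, single smoothing parameter)."""
    g_all, h_all = g_left + g_right, h_left + h_right
    return 0.5 * (g_left ** 2 / (h_left + lam)
                  + g_right ** 2 / (h_right + lam)
                  - g_all ** 2 / (h_all + lam))


def feature_weights(trees, lam=1.0):
    """Sum split gains over all non-leaf nodes of all trees, grouped by split feature.

    trees: list of trees, each a list of non-leaf node dicts with keys
    'feature', 'g_left', 'h_left', 'g_right', 'h_right' (hypothetical layout).
    """
    weights = defaultdict(float)
    for tree in trees:
        for node in tree:
            weights[node["feature"]] += split_gain(
                node["g_left"], node["h_left"],
                node["g_right"], node["h_right"], lam)
    return dict(weights)


def rank_features(weights):
    """Sort features by weight in descending order, as in step S44."""
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)
```

Grouping by `node["feature"]` plays the role of the indicator function in the description: a node contributes to a feature's weight only when that feature is the one used for the split.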
Step S44: sorting the abnormal point feature weights in descending order to obtain the abnormal root cause data.
According to the embodiment of the invention, the abnormal point feature weights are sorted in descending order, the most important feature causing the anomaly is determined and taken as the basis of the abnormal root cause, and the abnormal root cause data is finally obtained.
Preprocessing the abnormal data and extracting the relevant features reduces noise and redundancy in the data, which lowers computational complexity and improves computational efficiency. The root cause classification model built on a decision tree offers high precision and stability and classifies the root causes of abnormal data more accurately, and verifying and optimizing the model by cross-validation improves its generalization capability, so that it performs better on unseen data. Root cause analysis with the optimal root cause classification model determines the root cause of the abnormal data accurately, and the feature importance analysis function calculates the importance of each abnormal point feature in the root cause classification, making the analysis result more accurate and reliable. Sorting the abnormal point feature weights identifies the main root cause of the abnormal data, which can then be handled in a targeted way, so that the anomaly is resolved more effectively, business efficiency is improved and risk losses are reduced. In summary, a complete abnormal root cause analysis flow is constructed: processing the abnormal data, building the model, analyzing the root cause and determining the abnormal root cause together improve the efficiency and accuracy of resolving anomalies and provide guidance and support for business management and decision-making.
Preferably, step S41 comprises the steps of:
step S411: preprocessing the abnormal data to obtain an abnormal data set;
step S412: extracting features from the abnormal data set to obtain an abnormal data feature data set; constructing an abnormal data feature database, and storing the abnormal data feature data set in the abnormal data feature database;
step S413: acquiring a feature data packet of the abnormal data feature data set from the abnormal data feature database, and extracting the feature data in the feature data packet to obtain the abnormal data features.
As an embodiment of the present invention, referring to FIG. 5, which shows a detailed flow chart of step S41 in FIG. 4, step S41 in this embodiment includes the following steps:
Step S411: preprocessing the abnormal data to obtain an abnormal data set;
According to the embodiment of the invention, the abnormal data is cleaned, de-duplicated and has its missing values filled, and is then standardized and normalized, finally yielding the abnormal data set.
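A minimal sketch of this preprocessing, assuming numeric records stored as tuples with `None` marking a missing value: duplicates are dropped, gaps are filled with the column mean, and each column is standardized to zero mean and unit variance. Mean imputation and z-score scaling are illustrative choices; the description does not name specific methods, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def preprocess(rows):
    """De-duplicate, fill missing values with the column mean, then z-score each column."""
    # De-duplicate while preserving first-seen order.
    seen, unique = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)

    cleaned_columns = []
    for column in zip(*unique):
        present = [value for value in column if value is not None]
        fill = mean(present) if present else 0.0
        filled = [fill if value is None else value for value in column]
        mu, sigma = mean(filled), pstdev(filled)
        # Constant columns carry no information; map them to zero.
        cleaned_columns.append(
            [(value - mu) / sigma if sigma > 0 else 0.0 for value in filled])
    return [tuple(row) for row in zip(*cleaned_columns)]
```

Min-max normalization to a fixed range could be applied instead of, or after, the z-score step if the downstream model expects bounded inputs.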
Step S412: extracting features from the abnormal data set to obtain an abnormal data feature data set; constructing an abnormal data feature database, and storing the abnormal data feature data set in the abnormal data feature database;
According to the embodiment of the invention, relevant information is extracted from the abnormal data set by a feature extraction algorithm, an abnormal data feature database is constructed from the extracted feature data, and the extracted abnormal data feature data set is stored in the abnormal data feature database.
Step S413: acquiring a feature data packet of the abnormal data feature data set from the abnormal data feature database, and extracting the feature data in the feature data packet to obtain the abnormal data features.
According to the embodiment of the invention, the feature data packets of the abnormal data feature data set are retrieved from the abnormal data feature database with query and access statements, the feature data packets are parsed, and the most valuable feature data is extracted as the basis for root cause analysis of the abnormal data, finally yielding the abnormal data features.
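Storing the feature data set and reading it back with a query can be sketched with an in-memory SQLite database. The table name `anomaly_features`, its three columns and both function names are hypothetical; the description does not specify a schema or a database product.

```python
import sqlite3


def build_feature_db(feature_rows):
    """Create an in-memory feature database and store (anomaly_id, name, value) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE anomaly_features ("
        "anomaly_id TEXT, feature_name TEXT, feature_value REAL)")
    conn.executemany(
        "INSERT INTO anomaly_features VALUES (?, ?, ?)", feature_rows)
    conn.commit()
    return conn


def load_features(conn, anomaly_id):
    """Query one anomaly's feature packet and return it as a name -> value mapping."""
    cursor = conn.execute(
        "SELECT feature_name, feature_value FROM anomaly_features "
        "WHERE anomaly_id = ? ORDER BY feature_name",
        (anomaly_id,))
    return dict(cursor.fetchall())
```

Parameterized queries (the `?` placeholders) keep the anomaly identifier out of the SQL text, which matters when identifiers originate from upstream business systems.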
According to the invention, preprocessing the abnormal data cleans the data, reduces data noise and screens out the abnormal data, yielding a cleaned abnormal data set that gives subsequent anomaly detection an accurate data basis and improves detection precision. Extracting features from the abnormal data set then maps the abnormal data into a feature space, reduces the dimensionality of high-dimensional feature data and extracts the key features, which provides the basis for subsequent anomaly detection and reduces the time and computational cost it requires. Constructing an abnormal data feature database and storing the abnormal data feature data set in it allows the abnormal data features to be managed and maintained and their integrity, availability and confidentiality to be guaranteed. Acquiring the feature data packets of the abnormal data feature data set from the database and extracting their feature data makes it possible to analyze and mine the abnormal data features, find latent patterns and trends in them, provide a reference for the bank's business decisions and management, and supply valuable data for optimizing the root cause classification model.
Preferably, step S5 comprises the steps of:
step S51: preprocessing and cleaning the abnormal root cause data to obtain standard root cause data;
step S52: analyzing the abnormal root cause points to which the standard root cause data belongs by using a data lineage (blood-relationship) analysis technology, and constructing a lineage relation graph of the abnormal root cause points to obtain an initial business abnormal lineage graph;
step S53: performing abnormal optimization positioning on the abnormal root cause points in the initial business abnormal lineage graph by using a preset confidence propagation analysis technology to obtain abnormal source data.
As an embodiment of the present invention, referring to FIG. 6, which shows a detailed flow chart of step S5 in FIG. 1, step S5 in this embodiment includes the following steps:
Step S51: preprocessing and cleaning the abnormal root cause data to obtain standard root cause data;
According to the embodiment of the invention, the abnormal root cause data is converted into standardized data by removing duplicate data, filling missing values, handling abnormal values and similar measures, finally yielding the standard root cause data.
Step S52: analyzing the abnormal root cause points to which the standard root cause data belongs by using a data lineage (blood-relationship) analysis technology, and constructing a lineage relation graph of the abnormal root cause points to obtain an initial business abnormal lineage graph;
According to the embodiment of the invention, the data lineage analysis technology is used to analyze the abnormal root cause points to which the standard root cause data belongs, a lineage relation graph of the abnormal root cause points is constructed, and the initial business abnormal lineage graph is finally obtained.
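One way to picture the lineage graph is as a directed graph of upstream-to-downstream data dependencies, restricted to whatever lies upstream of the abnormal root cause points. The sketch below assumes lineage is available as `(upstream, downstream)` pairs; that edge format, the node names in the usage and the function name are assumptions, since the description does not say how lineage is recorded.

```python
from collections import defaultdict


def build_lineage_graph(lineage_edges, root_cause_points):
    """Return {node: set of direct upstream nodes} for everything upstream of the given points.

    lineage_edges: iterable of (upstream, downstream) pairs (assumed format).
    root_cause_points: nodes identified as abnormal root cause points.
    """
    predecessors = defaultdict(set)
    for upstream, downstream in lineage_edges:
        predecessors[downstream].add(upstream)

    graph, visited = {}, set()
    stack = list(root_cause_points)
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        upstream_nodes = set(predecessors.get(node, ()))
        graph[node] = upstream_nodes
        stack.extend(upstream_nodes)   # keep walking towards the data sources
    return graph
```

Nodes with an empty upstream set are the candidate data sources; unrelated branches of the bank's overall lineage are left out of the graph.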
Step S53: performing abnormal optimization positioning on the abnormal root cause points in the initial business abnormal lineage graph by using a preset confidence propagation analysis technology to obtain abnormal source data.
According to the embodiment of the invention, the preset confidence propagation analysis technology is used to perform abnormal optimization positioning on the abnormal root cause points and determine the most probable abnormal source data: the correlations among the abnormal root cause points in the initial business abnormal lineage graph and the distance and direction of confidence propagation are analyzed, and corresponding measures are taken to correct the abnormal root cause points, so that the abnormal source data is finally obtained.
Preprocessing the abnormal root cause data cleans the data and quickly screens out the standard root cause data, which gives the subsequent abnormal root cause analysis an accurate data basis and improves its precision. Analyzing the abnormal root cause points to which the standard root cause data belongs with the data lineage analysis technology, and constructing their lineage relation graph, helps the bank analyze the associations among business data and build a graph of the influence and propagation of the abnormal data, so that parallel abnormal points can be located and analyzed quickly and effective information is provided for locating the abnormal root cause. The confidence propagation analysis technology iteratively adjusts the confidence of each node until the confidence value of the abnormal source node is obtained. Applying it to the abnormal root cause points in the initial business abnormal lineage graph locates the abnormal source quickly, improves the accuracy of anomaly detection, supports the optimization of business processes and process transformation, and yields accurate abnormal source data for banking business management and decision-making.
Preferably, step S53 includes the steps of:
step S530: constructing a confidence propagation analysis technology, wherein the confidence propagation analysis technology comprises a confidence analysis technology, a confidence propagation technology and a confidence tracking technology;
step S531: solving the initial business abnormal lineage graph by the confidence analysis technology to obtain the initial node confidence;
step S532: propagating the initial node confidence, and transmitting and updating the confidence of each node through the confidence propagation technology according to the interdependence among the nodes to obtain the final confidence;
step S533: sorting the final confidence values, and searching for the predecessor nodes of the nodes with higher final confidence by using the confidence tracking technology to obtain the abnormal source data.
As an embodiment of the present invention, referring to fig. 7, a detailed step flow chart of step S53 in fig. 6 is shown, in which step S53 includes the following steps:
step S530: constructing a confidence propagation analysis technology, wherein the confidence propagation analysis technology comprises a confidence analysis technology, a confidence propagation technology and a confidence tracking technology;
the embodiment of the invention determines the most probable abnormal source data by constructing a confidence propagation analysis technology, wherein the confidence propagation analysis technology is a root cause analysis method based on graph theory and probability statistics. In this technique, the confidence of an outlier indicates the probability that the outlier is a root node. The confidence analysis technology is used for calculating initial confidence of each abnormal point; the confidence coefficient propagation technology is used for transmitting the confidence coefficient from the precursor node to the subsequent node and updating the confidence coefficient of the subsequent node; the confidence tracking technology is used for searching the precursor node of the abnormal root node and determining the data or the process corresponding to the abnormal root node.
Step S531: solving an initial business abnormal blood margin map by a confidence analysis technology to obtain initial node confidence;
according to the embodiment of the invention, the initial confidence coefficient of each node in the initial business abnormal blood-edge map is calculated by using a confidence coefficient analysis technology, so that the probability that the nodes possibly become root cause nodes is measured, and finally the initial node confidence coefficient is obtained.
Step S532: propagating the initial node confidences, passing and updating the confidence of each node through the confidence propagation technique according to the dependencies among the nodes, to obtain the final confidences;
The embodiment of the invention uses the confidence propagation technique to pass confidence from predecessor nodes to successor nodes, and updates the confidence of each successor by probability statistics according to the dependencies among the nodes, which yields the final confidences.
Step S533: sorting the final confidences, and using the confidence tracking technique to search for the predecessor nodes of the nodes with higher final confidence, to obtain the abnormal source data.
The embodiment of the invention sorts the abnormal root cause nodes by their final confidence to find those with higher confidence. The confidence tracking technique then searches for the predecessor nodes of these nodes and determines the data or process corresponding to each of them, which yields the abnormal source data.
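Steps S531 to S533 can be illustrated with a minimal sketch. The text does not give the update rule, so the noisy-OR combination, the `damping` weight, the node names and the initial confidence values below are all assumptions chosen for illustration; only the overall flow (initial confidence, predecessor-to-successor propagation, ranking, predecessor tracing) follows the description.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def propagate_confidence(preds, initial, damping=0.5):
    """Return final confidences after one predecessor-to-successor pass.

    preds   : dict mapping each node to the list of its predecessor nodes
    initial : dict mapping each node to its initial confidence in [0, 1]
    damping : assumed weight of a predecessor's influence on its successor
    """
    final = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(preds).static_order():
        keep = 1.0 - initial.get(node, 0.0)
        for p in preds.get(node, ()):
            keep *= 1.0 - damping * final[p]
        final[node] = 1.0 - keep  # noisy-OR combination (assumed rule)
    return final


def trace_sources(preds, node):
    """Follow predecessor links back from node; return ancestors without predecessors."""
    sources, stack, seen = set(), [node], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        parents = preds.get(current, ())
        if parents:
            stack.extend(parents)
        else:
            sources.add(current)
    return sources


# Hypothetical lineage: core_ledger and fx_feed feed settlement, which feeds daily_report.
preds = {
    "core_ledger": [],
    "fx_feed": [],
    "settlement": ["core_ledger", "fx_feed"],
    "daily_report": ["settlement"],
}
initial = {"core_ledger": 0.4, "fx_feed": 0.1, "settlement": 0.5, "daily_report": 0.2}

final = propagate_confidence(preds, initial)
ranked = sorted(final, key=final.get, reverse=True)  # step S533: sort by final confidence
top = ranked[0]
print(top, trace_sources(preds, top))
```

A single pass in topological order is enough here because the lineage is assumed to be acyclic; a graph with cycles would need an iterative scheme instead.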
By constructing the confidence propagation analysis technique, the invention can analyze and locate the root cause of a business abnormality, which gives the bank a useful analysis tool and supports its business management and decisions. Solving the initial business-anomaly lineage graph with the confidence analysis technique yields the initial node confidences, which are the basis for the subsequent propagation and also serve the bank as reference information for root cause analysis. Propagating the initial node confidences, with the confidence of each node passed and updated according to the dependencies among the nodes, yields the final confidences, so that the abnormal root cause can be located precisely and the bank has a sound basis for analyzing and handling the abnormality. Propagation among the nodes screens the candidate root causes step by step, which makes root cause location faster and more accurate. It also helps avoid gaps in the data, so that missed and misjudged root causes are avoided and the accuracy of abnormality detection improves.
In addition, propagating the initial node confidences improves the accuracy and speed of abnormality detection and of root cause location, which supports better-founded business decisions and optimized business processes. With sorting and the confidence tracking technique, the abnormal nodes with higher final confidence can be found quickly and their predecessor nodes searched, so that the abnormal business source data is located more accurately. This helps the bank resolve business problems and raise its service level. Because the confidence tracking technique can search predecessor nodes and retrieve their information efficiently, it helps the bank receive and process large volumes of abnormal data in time, which is of practical value for its daily business operations and risk management.
Preferably, step S6 comprises the steps of:
Step S61: performing data analysis on the abnormal source data to construct business-anomaly lineage graphs of different types;
Step S62: analyzing the different types of business-anomaly lineage graphs in detail according to the abnormal conditions, to learn the attributes and types of the banking business abnormalities and obtain an abnormal root cause diagnosis scheme;
Step S63: bank staff review and analyze the abnormal root cause diagnosis scheme and carry out the corresponding abnormal root cause diagnosis process.
As an embodiment of the present invention, FIG. 8 shows a detailed flow chart of step S6 in FIG. 1. In this embodiment, step S6 includes the following steps:
Step S61: performing data analysis on the abnormal source data to construct business-anomaly lineage graphs of different types;
The embodiment of the invention analyzes, aggregates and processes the abnormal source data to construct business-anomaly lineage graphs of different types, which identify the different types of business abnormality.
Step S62: analyzing the different types of business-anomaly lineage graphs in detail according to the abnormal conditions, to learn the attributes and types of the banking business abnormalities and obtain an abnormal root cause diagnosis scheme;
The embodiment of the invention further analyzes the different types of business-anomaly lineage graphs to learn the attributes and types of the banking business abnormalities, identifies the abnormal root cause nodes in the graphs, and analyzes the relationships among those nodes to obtain the abnormal root cause diagnosis scheme.
Step S63: bank staff review and analyze the abnormal root cause diagnosis scheme and carry out the corresponding abnormal root cause diagnosis process.
In the embodiment of the invention, bank staff learn the specific condition of each abnormal root cause node by reviewing and analyzing the abnormal root cause diagnosis scheme, and formulate a corresponding abnormal root cause diagnosis process. This process may involve further analysis and investigation of data, business flows, regulatory policy and the like, so that the abnormal root cause can be identified accurately and the related problems resolved.
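Steps S61 and S62 can be sketched as follows. The record layout `(anomaly_type, upstream, downstream)`, the function names and the sample data are hypothetical; the text does not specify how the abnormal source data is stored or what form the diagnosis scheme takes. The sketch builds one lineage graph per abnormality type, treats nodes with no incoming edge as candidate root cause nodes, and lists what each candidate affects downstream.

```python
from collections import defaultdict


def build_lineage_graphs(records):
    """Group lineage edges by anomaly type.

    records: iterable of (anomaly_type, upstream, downstream) tuples.
    Returns {anomaly_type: {node: set of direct downstream nodes}}.
    """
    graphs = defaultdict(lambda: defaultdict(set))
    for anomaly_type, upstream, downstream in records:
        graphs[anomaly_type][upstream].add(downstream)
        graphs[anomaly_type][downstream]  # register nodes that have no outgoing edge
    return {t: dict(g) for t, g in graphs.items()}


def root_cause_candidates(graph):
    """Nodes that no other node feeds into."""
    fed = set().union(*graph.values())
    return sorted(n for n in graph if n not in fed)


def diagnosis_scheme(graphs):
    """Per anomaly type: each candidate root and every node reachable from it."""
    scheme = {}
    for anomaly_type, graph in graphs.items():
        entry = {}
        for root in root_cause_candidates(graph):
            affected, stack = set(), [root]
            while stack:
                for nxt in graph[stack.pop()]:
                    if nxt not in affected:
                        affected.add(nxt)
                        stack.append(nxt)
            entry[root] = sorted(affected)
        scheme[anomaly_type] = entry
    return scheme


# Hypothetical abnormal source data.
records = [
    ("timeout", "payment_gateway", "transfer_service"),
    ("timeout", "transfer_service", "account_statement"),
    ("amount_mismatch", "fx_rate_table", "remittance"),
]
scheme = diagnosis_scheme(build_lineage_graphs(records))
print(scheme)
```

The resulting dictionary is the kind of summary that staff could review in step S63: for each abnormality type, where the problem appears to start and which business data it touches.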
By analyzing the abnormal source data and constructing a business-anomaly lineage graph for each type of abnormality, the invention provides a useful tool and data support for analyzing and diagnosing abnormalities. Detailed analysis of the different types of lineage graph according to the abnormal conditions gives a deeper understanding of the attributes and types of the abnormalities, locates root causes more accurately and quickly, examines the problem from several dimensions, and produces abnormality statistics that support risk control. This can improve the management of banking business and the customer experience, and the understanding gained of the types and attributes of banking business abnormalities informs the corresponding abnormal root cause diagnosis schemes. By reviewing and analyzing the diagnosis scheme, bank staff can quickly locate the source of a business abnormality and handle it, which improves banking efficiency, customer satisfaction and trust, helps prevent similar abnormal events from recurring, and improves team cooperation.
Preferably, the present specification also provides a system for diagnosing the abnormal root cause of banking business, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, to enable the at least one processor to perform the method for diagnosing the abnormal root cause of banking business described in any of the above.
In summary, the present invention provides a system for diagnosing the abnormal root cause of banking business. The system can implement any of the diagnosis methods of the invention through the cooperation of a memory, a processor and a computer program stored in the memory, and its internal components cooperate with one another. Based on the business-anomaly lineage graph, the system uses algorithms and models to analyze the relationships and connections among banking data nodes, so that abnormal data sources are located and identified quickly. Its core technologies include construction of the anomaly lineage graph, location of abnormal source data, impact analysis of abnormal source data, and abnormal root cause diagnosis. Combining artificial intelligence, machine learning and knowledge graph technologies, the system can monitor abnormal data in the banking system in real time and analyze the relationships and organizational structure among the abnormal data, as well as their influence on other business processes, which helps the operation and maintenance personnel of the banking system locate and resolve abnormal conditions quickly. The technology can improve the stability and reliability of banking business processes and provide customers with higher-quality financial services.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A method for diagnosing the abnormal root cause of banking business, characterized by comprising the following steps:
Step S1: performing data acquisition and data preprocessing on a banking system to obtain banking abnormal data;
Step S2: performing offline calculation on the banking abnormal data using an association rule mining technique to obtain abnormality detection rule data;
Step S3: performing abnormality detection on the abnormality detection rule data through a preset abnormality detection model to obtain abnormal data;
Step S4: extracting the characteristics of the abnormal data to obtain abnormal data characteristics; constructing a root cause classification model based on a decision tree, and inputting the abnormal data characteristics into the root cause classification model for root cause analysis to obtain abnormal root cause data;
Step S5: performing abnormality location on the abnormal root cause data using a method that combines confidence propagation analysis with lineage analysis, to obtain abnormal source data;
Step S6: constructing a business-anomaly lineage graph from the abnormal source data, and using the graph to further understand the relationships among banking business abnormalities, so as to execute a corresponding abnormal root cause diagnosis process.
2. The method for diagnosing a cause of an abnormality in banking according to claim 1, wherein step S1 includes the steps of:
collecting data from a banking system to obtain banking data;
performing data preprocessing on the banking data to obtain banking abnormal data.
3. The method for diagnosing a cause of an abnormality in banking according to claim 1, wherein step S2 includes the steps of:
Step S21: converting the banking abnormal data to obtain business abnormality rule data;
Step S22: mining frequent item sets with the Apriori algorithm to acquire the association rules in the business abnormality rule data, to obtain a business abnormal frequent item set;
Step S23: calculating the confidence of the business abnormal frequent item set according to the association rule algorithm, to obtain the rule confidence;
wherein, the association rule algorithm formula is as follows:
[The formula is not reproduced in the source text.]
where the terms of the formula denote, in order: the rule confidence; an exponential function; the number of sets in the business abnormal frequent item set; a first frequent item set, selected by index, in the business abnormal frequent item set; a second frequent item set, selected by a second index, in the business abnormal frequent item set; the probability of occurrence of the first frequent item set; the probability of occurrence of the second frequent item set; the probability that the first and second frequent item sets occur simultaneously; and the correction coefficient of the rule confidence;
Step S24: sorting the business abnormal frequent item sets by rule confidence in descending order, and selecting the business abnormal frequent item sets with higher rule confidence to obtain the abnormality detection rule data.
4. The method for diagnosing a cause of an abnormality in banking according to claim 1, wherein step S3 includes the steps of:
step S31: performing data preprocessing and data conversion on the anomaly detection rule data to obtain anomaly detection data; dividing the anomaly detection data into a training data set, an optimization data set and a detection data set;
step S32: constructing an anomaly detection model based on a support vector machine, wherein the anomaly detection model comprises a training model, a model optimization and a detection model;
Step S33: inputting the training data set into the training model based on a support vector machine for model training, and inputting the optimization data set into the training model for parameter tuning using the following abnormality detection loss function, to generate the detection model;
wherein, the formula of the abnormality detection loss function is as follows:
[The formula is not reproduced in the source text.]
where the terms of the formula denote, in order: the abnormality detection loss function; the abnormality detection loss value; the number of data items in the optimization data set; an indexed data item of the optimization data set; an indexed data item in the data set of model training results; the regularization parameter; the training parameters of the model; and the correction coefficient of the abnormality detection loss function;
Step S34: inputting the detection data set into a detection model subjected to parameter optimization for model detection to obtain an optimal abnormal detection model; and re-inputting the abnormality detection rule data into the optimal abnormality detection model for abnormality analysis detection to obtain abnormality data.
5. The method for diagnosing a cause of an abnormality in banking according to claim 1, wherein step S4 includes the steps of:
step S41: carrying out data preprocessing on the abnormal data and extracting relevant characteristics to obtain abnormal data characteristics;
step S42: constructing a root cause classification model based on a decision tree, and inputting the abnormal data characteristics into the root cause classification model for model training to obtain abnormal point data characteristics; verifying and optimizing the model by a cross verification method to generate an optimal root cause classification model;
step S43: inputting the abnormal point data characteristics into an optimal root cause classification model for root cause analysis, and carrying out weight calculation on the abnormal point data characteristics by using a characteristic importance analysis function to obtain abnormal point characteristic weight values;
the formula of the feature importance analysis function is as follows:
[The formula is not reproduced in the source text.]
where the terms of the formula denote, in order: the k-th node; the abnormal point characteristic weight of the k-th node; the number of decision trees; the number of non-leaf nodes of an indexed tree; the splitting abnormal point characteristic of an indexed non-leaf node of that tree; the condition that this splitting abnormal point characteristic is the k-th node; the condition that this splitting abnormal point characteristic is the (k-1)-th node in the decision tree; the sum of the first derivatives on the left child of the non-leaf node; the sum of the second derivatives on the left child of the non-leaf node; the smoothing parameter associated with that sum; the sum of the first derivatives on the right child of the non-leaf node; the sum of the second derivatives on the right child of the non-leaf node; the smoothing parameter associated with that sum; the sum of the first derivatives of all nodes on the non-leaf node; the sum of the second derivatives of all nodes on the non-leaf node; the smoothing parameter associated with that sum; and the correction coefficient of the abnormal point characteristic weight;
Step S44: sorting the abnormal point characteristic weights in descending order to obtain the abnormal root cause data.
6. The method of diagnosing a root cause of a banking abnormality as recited in claim 5, wherein step S41 includes the steps of:
step S411: obtaining an abnormal data set by carrying out data preprocessing on the abnormal data;
step S412: extracting the characteristics of the abnormal data set to obtain an abnormal data characteristic data set; constructing an abnormal data characteristic database, and storing the abnormal data characteristic data set into the abnormal data characteristic database;
step S413: and acquiring a characteristic data packet of an abnormal data characteristic data set in the abnormal data characteristic database, and extracting characteristic data in the characteristic data packet to obtain an abnormal data characteristic.
7. The method for diagnosing a cause of an abnormality in banking according to claim 1, wherein step S5 includes the steps of:
Step S51: preprocessing and cleaning the abnormal root cause data to obtain standard root cause data;
Step S52: analyzing the abnormal root cause points of the standard root cause data using a lineage analysis technique, and constructing a lineage relationship graph of the abnormal root cause points, to obtain an initial business-anomaly lineage graph;
Step S53: performing optimized abnormality location on the abnormal root cause points in the initial business-anomaly lineage graph using a preset confidence propagation analysis technique, to obtain abnormal source data.
8. The method of diagnosing a root cause of a banking abnormality as recited in claim 7, wherein the step S53 includes the steps of:
step S530: constructing a confidence propagation analysis technology, wherein the confidence propagation analysis technology comprises a confidence analysis technology, a confidence propagation technology and a confidence tracking technology;
Step S531: solving the initial business-anomaly lineage graph with the confidence analysis technique to obtain the initial node confidences;
Step S532: propagating the initial node confidences, passing and updating the confidence of each node through the confidence propagation technique according to the dependencies among the nodes, to obtain the final confidences;
Step S533: sorting the final confidences, and using the confidence tracking technique to search for the predecessor nodes of the nodes with higher final confidence, to obtain the abnormal source data.
9. The method for diagnosing a cause of an abnormality in banking according to claim 1, wherein step S6 includes the steps of:
Step S61: performing data analysis on the abnormal source data to construct business-anomaly lineage graphs of different types;
Step S62: analyzing the different types of business-anomaly lineage graphs in detail according to the abnormal conditions, to learn the attributes and types of the banking business abnormalities and obtain an abnormal root cause diagnosis scheme;
Step S63: bank staff review and analyze the abnormal root cause diagnosis scheme and carry out the corresponding abnormal root cause diagnosis process.
10. A system for diagnosing the abnormal root cause of banking business, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, to enable the at least one processor to perform the method for diagnosing the abnormal root cause of banking business of any one of claims 1 to 9.
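As an illustration of steps S22 to S24 of claim 3, the sketch below mines frequent item sets level by level in the Apriori manner and ranks the resulting rules by confidence. The confidence formula of the claim is not reproduced in the text, so the textbook definition, confidence(A => B) = support(A and B together) / support(A), is used here as a stand-in; it has no exponential term and no correction coefficient. The function names, the support threshold and the sample transactions are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {frozenset: support}."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    frequent, level = {}, [frozenset([i]) for i in items]
    while level:
        counts = {c: sum(c <= t for t in transactions) / n for c in level}
        kept = {c: s for c, s in counts.items() if s >= min_support}
        frequent.update(kept)
        # Join step: merge kept sets whose union is exactly one item larger.
        # Prune step: keep a candidate only if all its smaller subsets were kept.
        size = len(next(iter(kept))) + 1 if kept else 0
        cands = {a | b for a in kept for b in kept if len(a | b) == size}
        level = [c for c in cands
                 if all(frozenset(s) in kept for s in combinations(c, size - 1))]
    return frequent


def ranked_rules(frequent):
    """Rules A => B from each frequent set, sorted by confidence, descending."""
    rules = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), k):
                lhs = frozenset(lhs)
                # Textbook confidence (stand-in for the formula of the claim).
                rules.append((set(lhs), set(itemset - lhs), support / frequent[lhs]))
    return sorted(rules, key=lambda r: r[2], reverse=True)


# Hypothetical abnormality flags observed together in four time windows.
transactions = [
    {"timeout", "retry_spike"},
    {"timeout", "retry_spike", "balance_mismatch"},
    {"timeout"},
    {"balance_mismatch"},
]
rules = ranked_rules(frequent_itemsets(transactions, min_support=0.5))
for lhs, rhs, confidence in rules:
    print(lhs, "=>", rhs, round(confidence, 3))
```

Selecting the top entries of `rules` corresponds to keeping the item sets with higher rule confidence in step S24.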
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310567128.5A CN116361059B (en) | 2023-05-19 | 2023-05-19 | Diagnosis method and diagnosis system for abnormal root cause of banking business |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310567128.5A CN116361059B (en) | 2023-05-19 | 2023-05-19 | Diagnosis method and diagnosis system for abnormal root cause of banking business |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116361059A (en) | 2023-06-30 |
CN116361059B (en) | 2023-08-08 |
Family
ID=86909993
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310567128.5A Active CN116361059B (en) | 2023-05-19 | 2023-05-19 | Diagnosis method and diagnosis system for abnormal root cause of banking business |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116361059B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140115403A1 (en) * | 2012-03-26 | 2014-04-24 | Nec Laboratories America, Inc. | Method and System for Software System Performance Diagnosis with Kernel Event Feature Guidance |
US20150242856A1 (en) * | 2014-02-21 | 2015-08-27 | International Business Machines Corporation | System and Method for Identifying Procurement Fraud/Risk |
CN105373894A (en) * | 2015-11-20 | 2016-03-02 | 广州供电局有限公司 | Inspection data-based power marketing service diagnosis model establishing method and system |
KR20180060044A (en) * | 2016-11-28 | 2018-06-07 | 주식회사 나라시스템 | Security System for Cloud Computing Service |
CN110046145A (en) * | 2019-04-02 | 2019-07-23 | 国网安徽省电力有限公司 | Expert intelligence Analysis Service platform based on electric flux big data research |
CN115344449A (en) * | 2021-05-14 | 2022-11-15 | 中国移动通信集团浙江有限公司 | Alarm analysis method, device, equipment and storage medium |
CN115576731A (en) * | 2022-10-27 | 2023-01-06 | 中国农业银行股份有限公司 | System fault root cause positioning method and device, equipment and storage medium |
US20230091638A1 (en) * | 2021-09-21 | 2023-03-23 | Rakuten Mobile, Inc. | Method, device and computer program product for anomaly detection |
CN116028467A (en) * | 2022-11-15 | 2023-04-28 | 珠海市新德汇信息技术有限公司 | Intelligent service big data modeling method, system, storage medium and computer equipment |
Non-Patent Citations (3)
Title |
---|
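Zhang Qing, Ding Peng: "Research and Implementation of an Analysis Method for Abnormal Data in Bank Card Business" (银行卡业务异常数据分析方法的研究与实现), Computer Engineering (计算机工程), no. 1, pages 48 - 52 * |
Jiang Guorui; Wang Xiaoyi; Xie Fengling: "Research on Classification of Telecom Home Telephones Based on Data Mining" (基于数据挖掘的电信家庭电话分类研究), Science and Technology Management Research (科技管理研究), no. 12, pages 111 - 115 * |
Zhao Yukuo: "Design Method of a Communication Trace and Intelligence Analysis System" (通讯痕迹与情报分析系统的设计方法), Computer & Telecommunication (电脑与电信), no. 09, pages 56 - 57 * |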
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116661426A (en) * | 2023-07-14 | 2023-08-29 | 创域智能(常熟)网联科技有限公司 | Abnormal AI diagnosis method and system of sensor operation control system |
CN116661426B (en) * | 2023-07-14 | 2023-09-22 | 创域智能(常熟)网联科技有限公司 | Abnormal AI diagnosis method and system of sensor operation control system |
CN117874686A (en) * | 2024-03-11 | 2024-04-12 | 中信证券股份有限公司 | Abnormal data positioning method, device, electronic equipment and computer readable medium |
CN117874686B (en) * | 2024-03-11 | 2024-05-10 | 中信证券股份有限公司 | Abnormal data positioning method, device, electronic equipment and computer readable medium |
Also Published As
Publication number | Publication date |
---|---|
CN116361059B (en) | 2023-08-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||