CN118133339A - Compliance intelligent early warning system and early warning method based on data behavior feature analysis


Publication number
CN118133339A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410158124.6A
Other languages
Chinese (zh)
Inventor
王佳妮
杨磊
曹秦畅
李嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Anmai Information Technology Co ltd
Original Assignee
Xi'an Anmai Information Technology Co ltd
Application filed by Xi'an Anmai Information Technology Co ltd
Priority to CN202410158124.6A
Publication of CN118133339A

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

A compliance intelligent early warning system and early warning method based on data behavior feature analysis. The system comprises a data source module, a data acquisition module, a capability component module, a behavior monitoring module, and an intelligent early warning module. The method comprises: collecting original data in real time through the data acquisition module; cleaning, converting, parsing, encapsulating, storing, and standardizing the collected data; analyzing the stored data through the capability component module; identifying abnormal behavior patterns through the behavior monitoring module, that is, judging the compliance of data behavior, and sending alarm information to the intelligent early warning module if abnormal behavior is detected; and receiving the alarm information through the intelligent early warning module, sending risk warnings to the relevant responsible persons or departments, and compiling statistics on the warning information along different dimensions to generate statistical analysis reports showing the warning distribution. The invention offers efficient identification, high security, low operating cost, and comprehensive data processing.

Description

Compliance intelligent early warning system and early warning method based on data behavior feature analysis
Technical Field
The invention relates to the technical field of data analysis and early warning, in particular to a compliance intelligent early warning system and method based on data behavior feature analysis.
Background
In the digital age, data has become the core of enterprise operations. However, as data volumes keep growing and data exchange between enterprises increases, data security and compliance problems are becoming more prominent, and threats such as data leakage, abuse, and unauthorized access are increasing, which may lead to serious consequences such as privacy violations, regulatory violations, and reputational damage. Data security and compliance have therefore become a primary concern across industries. Against this background, data behavior analysis technology has emerged, aiming to monitor, analyze, and give early warning of data-related behavior to ensure data security and compliance. It captures multiple aspects of data handling, including access, transmission, sharing, and use, to identify abnormal behavior or potential compliance issues in real time; through it, an enterprise can better understand its data lifecycle, identify potential threats, improve compliance, and take precautions to ensure data integrity and security.
Existing technical solutions, including access control, Security Information and Event Management (SIEM) systems, and User Behavior Analysis (UBA) tools, have made some progress in data security and compliance; however, these solutions often face limitations such as complex configuration, insufficient monitoring, and a lack of intelligent early warning.
The invention patent application with publication number CN114338221A discloses a network detection system based on big data analysis, comprising a network interface, a network management module, a detection module, a security module, an analysis module, a processing module, an early warning module, and a processor. The detection module detects the states of the network interface and the network management module to acquire state data of the network communication channel; the security module filters the data transmitted over the network interface to actively protect network communication data; the analysis module analyzes the data received from the user access terminal; the processing module handles network anomalies and reports abnormal data to the user access terminal; and the early warning module gives early warning of the on-off state of the network transmission channel to provide interactive feedback on the network state. However, the analysis module analyzes the security state data and operation state data received by the network management module and triggers the security module to handle vulnerabilities when such data reveals them; because network intrusions today are highly concealed, that invention cannot accurately detect and analyze network intrusions, which easily exposes users to the risk of data loss and data damage.
Disclosure of Invention
To overcome the defects of the prior art, the invention aims to provide a compliance intelligent early warning system and early warning method based on data behavior feature analysis. By collecting original data, performing behavior feature analysis on the data, building a behavior feature library from the data behaviors, and identifying abnormal data behaviors, the invention judges the compliance of data behaviors, automatically identifies abnormal activities, and issues risk warnings, thereby ensuring data security and compliance; it offers efficient identification, high security, low operating cost, and comprehensive data processing.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
A compliance intelligent early warning system based on data behavior feature analysis, comprising:
The data source module is used for providing original data, wherein the original data comprises dynamic data, namely refreshed database traffic, and static data, namely database log information;
The data acquisition module is used for acquiring original data from the data source module, processing, analyzing, and caching the original data, and transmitting the processed data to the capability component module for data processing;
The capability component module is used for analyzing and processing the data transmitted by the data acquisition module and transmitting the processed data to the behavior monitoring module;
The behavior monitoring module is used for monitoring the access behaviors of users and systems to data in real time, carrying out characteristic analysis, establishing a behavior characteristic library according to the data behaviors, and transmitting alarm information obtained by analysis to the intelligent early warning module, wherein the behavior monitoring module is also associated with compliance rule libraries of different industries so as to judge whether the user behaviors violate rules or not, namely judging compliance of the user behaviors;
and the intelligent early warning module is used for receiving the warning information transmitted by the behavior monitoring module and immediately sending an early warning notice to the related receiving end.
The data acquisition module acquires original data, processes, analyzes and caches the original data, and specifically comprises the following steps:
Firstly, dynamic data is collected in real time by the Filebeat collection tool, buffered through a Kafka message bus, and fed into a Storm or Flink stream processing engine in the stream computing area, where the data is cleaned, formatted, and aggregated; the data is examined using Complex Event Processing (CEP) functions to extract data features, and a rule engine combined with machine learning models completes behavior analysis, risk identification, and anomaly early warning online in real time; the stream computing results are temporarily stored in a Redis cache or a Kafka partition in the data cache layer;
Meanwhile, static data enters the batch computing area, where multiple prediction models are trained through machine learning and deep learning to form a model library;
Finally, the data analysis modeling area selects suitable statistical models and deep learning models according to how well they match the service requirements and the data features, their prediction accuracy, and their generality; the selected models are deployed in the stream computing area or the batch computing area; a high-quality feature space is constructed by combining the multi-source heterogeneous data from stream computing and batch processing; and contingency plan revision and risk assessment for the service system are realized through simulation.
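The stream-side processing described above (cleaning, then detecting a pattern over recent events) can be illustrated with a small sketch. It is not Filebeat, Kafka, Storm, or Flink code: it uses only the Python standard library, and the event fields (`user`, `op`, `ts`), the 60-second window, and the limit of 3 DELETE operations are all assumptions chosen for illustration. Events are assumed to arrive in time order.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60        # assumed window length
MAX_EVENTS_IN_WINDOW = 3   # assumed limit on DELETEs per user per window


def clean(raw):
    """Normalize one raw record; return None if it is unusable."""
    user = str(raw.get("user", "")).strip().lower()
    op = str(raw.get("op", "")).strip().upper()
    ts = raw.get("ts")
    if not user or not op or not isinstance(ts, (int, float)):
        return None
    return {"user": user, "op": op, "ts": float(ts)}


class WindowDetector:
    """Flags a user who issues too many DELETE operations inside one time window."""

    def __init__(self, window=WINDOW_SECONDS, limit=MAX_EVENTS_IN_WINDOW):
        self.window = window
        self.limit = limit
        self.recent = defaultdict(deque)  # user -> timestamps of recent DELETEs

    def process(self, raw):
        event = clean(raw)
        if event is None or event["op"] != "DELETE":
            return None
        stamps = self.recent[event["user"]]
        stamps.append(event["ts"])
        # Drop timestamps that have fallen out of the window.
        while stamps and event["ts"] - stamps[0] > self.window:
            stamps.popleft()
        if len(stamps) > self.limit:
            return {"user": event["user"], "count": len(stamps), "ts": event["ts"]}
        return None


stream = [
    {"user": " Alice ", "op": "delete", "ts": 0},
    {"user": "alice", "op": "delete", "ts": 10},
    {"user": "alice", "op": "select", "ts": 15},
    {"user": "alice", "op": "delete", "ts": 20},
    {"user": "alice", "op": "delete", "ts": 30},
    {"user": "", "op": "delete", "ts": 31},   # dropped by clean()
    {"user": "bob", "op": "delete", "ts": 40},
]
detector = WindowDetector()
alerts = [a for a in (detector.process(r) for r in stream) if a]
print(alerts)
```

In a deployment along the lines the text describes, the same per-key window logic would presumably live inside the stream engine, with the alert written to the cache layer rather than printed.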
The construction of the high-quality feature space is specifically as follows:
Firstly, data features are extracted, transformed, and updated to construct effective features that express the regularities in the data;
Secondly, filter, wrapper, and embedded feature selection algorithms are used to select the feature subset that has a significant effect on the target task, and redundant and irrelevant features are deleted;
Then, dimensionality reduction algorithms, namely principal component analysis and linear discriminant analysis, are used to obtain a low-dimensional feature representation;
Finally, the data is divided into a training data set and a test data set, so that the constructed feature space is suitable for model training and evaluation.
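Two of the steps above, deleting irrelevant and redundant features and splitting the data, can be sketched in plain Python. This is only an illustration of the filter family of feature selection (a variance threshold plus a Pearson-correlation redundancy check); wrapper and embedded methods, principal component analysis, and linear discriminant analysis are not shown. The thresholds, the sample matrix, and the column meanings are assumptions.

```python
import random
from statistics import mean, pvariance


def column(rows, j):
    return [row[j] for row in rows]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def select_features(rows, min_variance=1e-6, max_corr=0.95):
    """Filter-type selection: drop near-constant columns, then redundant ones."""
    kept = []
    for j in range(len(rows[0])):
        col = column(rows, j)
        if pvariance(col) <= min_variance:
            continue  # irrelevant: the column carries no information
        if any(abs(pearson(col, column(rows, k))) >= max_corr for k in kept):
            continue  # redundant: nearly duplicates an already kept column
        kept.append(j)
    return kept


def train_test_split(rows, test_ratio=0.25, seed=0):
    """Shuffle with a fixed seed and cut into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


# Assumed columns: query count, 2 * query count (redundant), constant flag, rows touched
data = [
    [1.0, 2.0, 7.0, 30.0],
    [2.0, 4.0, 7.0, 10.0],
    [3.0, 6.0, 7.0, 50.0],
    [4.0, 8.0, 7.0, 20.0],
]
kept = select_features(data)
reduced = [[row[j] for j in kept] for row in data]
train, test = train_test_split(reduced)
print(kept, len(train), len(test))
```

With this sample, the second column is dropped as a duplicate of the first and the third as constant, leaving columns 0 and 3.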
The capability component module comprises a data analysis component, a database audit component, and an information operation and maintenance management component. The database audit component provides SQL operation records and analyzes how SQL statements access data; the information operation and maintenance management component provides information on whether a user accesses, modifies, and deletes data according to the specified work order flow; and the data analysis component uses an association analysis algorithm to associate the database audit component with the information operation and maintenance management component, thereby realizing full-link tracking of data circulation.
The association analysis algorithm is specifically implemented by the following steps:
Firstly, SQL logs of the database audit component and work order data of the operation and maintenance management component are collected in real time through the data acquisition module;
Secondly, the SQL logs of the database audit component and the work orders of the operation and maintenance management component are parsed to extract key fields, which comprise: operation object, time, and version number;
Then, an association rule base is loaded, and the extracted key fields are matched against the characteristic fields in the rule base;
Then, when a matching association rule is found, the other systems and operations associated with the operation are determined;
Finally, the association result is recorded and verified to confirm whether the association is correct, the association quality is fed back to adjust the rules, and the finally confirmed association analysis result is output, realizing full-link tracking of the data flow.
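The association steps above can be sketched as follows. The record layouts, the field names (`table`, `target`, `version`, `start`, `end`), and the single rule that links a SQL operation to a work order when the object and version agree and the operation falls inside the approved time window are all assumptions for illustration; the patent does not specify the rule base format.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical association rule base: fields that must be equal for a SQL
# operation and a work order to be linked.
ASSOCIATION_RULES = [
    {"name": "same object and version", "equal_fields": ("object", "version")},
]


def sql_key_fields(log):
    """Extract operation object, time, and version number from a SQL log entry."""
    return {
        "object": log["table"].lower(),
        "time": datetime.strptime(log["time"], TIME_FORMAT),
        "version": log["version"],
    }


def order_key_fields(order):
    """Extract the same key fields, plus the approved window, from a work order."""
    return {
        "id": order["id"],
        "object": order["target"].lower(),
        "version": order["version"],
        "start": datetime.strptime(order["start"], TIME_FORMAT),
        "end": datetime.strptime(order["end"], TIME_FORMAT),
    }


def associate(sql_logs, work_orders, rules=ASSOCIATION_RULES):
    """Return (linked, unlinked): SQL log ids with or without a matching work order."""
    orders = [order_key_fields(o) for o in work_orders]
    linked, unlinked = [], []
    for log in sql_logs:
        fields = sql_key_fields(log)
        match = None
        for rule in rules:
            for order in orders:
                same = all(fields[f] == order[f] for f in rule["equal_fields"])
                in_window = order["start"] <= fields["time"] <= order["end"]
                if same and in_window:
                    match = order["id"]
                    break
            if match is not None:
                break
        if match is None:
            unlinked.append(log["id"])
        else:
            linked.append((log["id"], match))
    return linked, unlinked


sql_logs = [
    {"id": "sql-1", "table": "Customer", "time": "2024-01-10 09:30:00", "version": "v2"},
    {"id": "sql-2", "table": "orders", "time": "2024-01-10 23:00:00", "version": "v1"},
]
work_orders = [
    {"id": "wo-7", "target": "customer", "version": "v2",
     "start": "2024-01-10 09:00:00", "end": "2024-01-10 10:00:00"},
]
linked, unlinked = associate(sql_logs, work_orders)
print(linked, unlinked)
```

Operations left in `unlinked` are the ones with no covering work order, which is the kind of result the verification and feedback step would then review.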
The behavior monitoring module comprises a data behavior monitoring component, a feature analysis component, and a compliance analysis component. The data behavior monitoring component monitors data circulation; the feature analysis component is associated with the behavior feature library and matches the monitored data behavior against it to identify potential abnormal behavior; meanwhile, the compliance analysis component is associated with the compliance rule library to judge whether the data behavior violates the rules, that is, to judge the compliance of the data behavior; and if non-compliant behavior is detected, alarm information is sent to the intelligent early warning module.
The intelligent early warning module comprises an intelligent early warning analysis component and an intelligent early warning notification component, wherein the intelligent early warning analysis component is used for summarizing and counting early warning times of different departments or units and ranking the early warning times so as to judge an early warning aggregation area, and meanwhile, various statistical analysis reports are automatically generated to intuitively display early warning distribution conditions of different dimensions, and the dimensions comprise early warning types, early warning objects and early warning frequencies; the intelligent early warning notification component is used for sending early warning notification to related responsible persons or departments, and the channels of the early warning notification comprise work orders, mails and short messages.
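The summarizing and ranking performed by the intelligent early warning analysis component can be sketched with `collections.Counter`. The warning records, the dimension names (`department`, `type`, `object`), and the text report layout are assumptions for illustration only.

```python
from collections import Counter


def rank_by(warning_log, dimension):
    """Count warnings per value of one dimension, most frequent first."""
    return Counter(w[dimension] for w in warning_log).most_common()


def build_report(warning_log, dimensions=("department", "type", "object")):
    """Render a plain-text report of warning counts for each dimension."""
    lines = []
    for dim in dimensions:
        lines.append(f"[{dim}]")
        for value, count in rank_by(warning_log, dim):
            lines.append(f"  {value}: {count}")
    return "\n".join(lines)


# Hypothetical warning records
warning_log = [
    {"department": "Finance", "type": "bulk export", "object": "customer table"},
    {"department": "Finance", "type": "bulk modify", "object": "order table"},
    {"department": "Finance", "type": "bulk export", "object": "customer table"},
    {"department": "Sales", "type": "bulk export", "object": "customer table"},
]
department_ranking = rank_by(warning_log, "department")
report = build_report(warning_log)
print(report)
```

The first entry of `department_ranking` is the department with the most warnings, which corresponds to the warning aggregation area mentioned above.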
A compliance intelligent early warning method based on data behavior feature analysis comprises the following steps:
Step 1, acquiring original data from the data source module in real time through the data acquisition module;
Step 2, cleaning, converting, parsing, encapsulating, and storing the acquired data through the data acquisition module, and standardizing the data;
Step 3, carrying out data analysis on the stored data through the capability component module;
Step 4, identifying abnormal behavior patterns through the behavior monitoring module, that is, judging the compliance of data behavior, and if abnormal behavior is detected, sending alarm information to the intelligent early warning module;
Step 5, receiving the alarm information through the intelligent early warning module, sending risk warnings to the relevant responsible persons or departments by work order, short message, and mail, and compiling statistics on the warning information along different dimensions to generate statistical analysis reports showing the warning distribution.
The original data types in the step 1 comprise business system logs, network traffic, database operation logs and data asset information.
In step 3, the data analysis performed on the stored data through the capability component module is specifically: the database audit component is associated with the information operation and maintenance management component to obtain the database SQL operation records and the work order processing logs, and the correlations among SQL statements, work order records, and data assets are analyzed to track data flow and access control across the full link.
Compared with the prior art, the invention has the beneficial effects that:
1. According to the invention, through characteristic analysis of the data behaviors, a behavior characteristic library is established according to the data behaviors, and abnormal data behaviors are identified, so that the compliance of the data behaviors is judged, potential abnormal activities are identified, risk early warning is carried out, and the method has the advantage of high-efficiency identification.
2. The invention can make risk early warning by identifying abnormal data behaviors, and generates various statistical analysis reports by combining the early warning information of different dimensions so as to intuitively display the early warning distribution conditions of different dimensions, and has the advantage of high system safety.
3. According to the invention, by analyzing the data behaviors, abnormal activities are automatically identified, and risk early warning is intelligently carried out, so that the manual operation cost is reduced, and the identification efficiency is improved.
4. The invention manages and monitors large-scale data by monitoring the data circulation among departments and service systems, thereby ensuring the data safety and ensuring more comprehensive data processing.
In summary, the invention collects the original data, performs behavior feature analysis on the data, establishes a behavior feature library according to the data behaviors, and identifies abnormal data behaviors, thereby judging the compliance of the data behaviors, automatically identifying abnormal activities and performing risk early warning, ensuring the safety and compliance of the data, and having the advantages of efficient identification, high safety, low operation cost and comprehensive data processing.
Drawings
Fig. 1 is a schematic diagram of a system structure according to the present invention.
Fig. 2 is a schematic diagram of a data processing flow of the data acquisition module in the present invention.
FIG. 3 is a schematic flow chart of the method of the present invention.
Fig. 4 is a schematic diagram of an early warning result obtained in the first embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to the accompanying drawings.
Referring to fig. 1, a compliance intelligent pre-warning system based on data behavior feature analysis includes:
The data source module is used for providing original data, wherein the original data comprises dynamic data, namely refreshed database traffic, and static data, namely database log information. This module interfaces with the various original data sources and is the foundation of the system's reliability; it covers multiple types of data sources such as business system logs, network traffic, user operation logs, and asset databases, providing basic support for accurate behavior analysis. The integrity and quality of the collected data directly affect the system's detection effectiveness and accuracy. As the most basic acquisition module, the data source module needs to select a suitable acquisition range according to the system's analysis requirements and ensure that complete, accurate, and timely original data is obtained;
The data acquisition module is used for acquiring original data from the data source module, processing, analyzing, and caching the original data, and transmitting the processed data to the capability component module for data processing;
The capability component module is used for analyzing and processing the data transmitted by the data acquisition module and transmitting the processed data to the behavior monitoring module;
The behavior monitoring module is used for monitoring the access behaviors of users and systems to data in real time, carrying out characteristic analysis, establishing a behavior characteristic library according to the data behaviors, and transmitting alarm information obtained by analysis to the intelligent early warning module, wherein the behavior monitoring module is also associated with compliance rule libraries of different industries so as to judge whether the user behaviors violate rules or not, namely judging compliance of the user behaviors; by carrying out feature analysis on the data behaviors, a behavior feature library is established according to the data behaviors, and abnormal data behaviors are identified, so that the compliance of the data behaviors is judged, potential abnormal activities are identified, risk early warning is carried out, and the method has the advantage of high-efficiency identification;
and the intelligent early warning module is used for receiving the warning information transmitted by the behavior monitoring module, sending an early warning notice to a related receiving end in real time, and carrying out intelligent early warning by analyzing the data behaviors, so that the manual operation cost is reduced, and the recognition efficiency is improved.
Referring to fig. 2, the data acquisition module acquires original data, and processes, analyzes and caches the original data, specifically:
Firstly, dynamic data is collected in real time by the Filebeat collection tool, buffered through a Kafka message bus, and fed into a Storm or Flink stream processing engine in the stream computing area, where the data is cleaned, formatted, and aggregated; the data is examined using Complex Event Processing (CEP) functions to extract data features, and a rule engine combined with machine learning models completes behavior analysis, risk identification, and anomaly early warning online in real time; the stream computing results are temporarily stored in a Redis cache or a Kafka partition in the data cache layer;
Meanwhile, static data enters the batch computing area, where multiple prediction models are trained through machine learning and deep learning to form a model library;
Finally, the data analysis modeling area selects suitable statistical models and deep learning models according to how well they match the service requirements and the data features, their prediction accuracy, and their generality; the selected models are deployed in the stream computing area or the batch computing area; a high-quality feature space is constructed by combining the multi-source heterogeneous data from stream computing and batch processing; and contingency plan revision and risk assessment for the service system are realized through simulation.
The construction of the high-quality feature space is specifically as follows:
Firstly, data features are extracted, transformed, and updated to construct effective features that express the regularities in the data;
Secondly, filter, wrapper, and embedded feature selection algorithms are used to select the feature subset that has a significant effect on the target task, and redundant and irrelevant features are deleted;
Then, dimensionality reduction algorithms, namely principal component analysis and linear discriminant analysis, are used to obtain a low-dimensional feature representation;
Finally, the data is divided into a training data set and a test data set, so that the constructed feature space is suitable for model training and evaluation.
The capability component module comprises a data analysis component, a database audit component, and an information operation and maintenance management component. The database audit component provides SQL operation records and analyzes how SQL statements access data; the information operation and maintenance management component provides information on whether a user accesses, modifies, and deletes data according to the specified work order flow; and the data analysis component uses an association analysis algorithm to associate the database audit component with the information operation and maintenance management component, thereby realizing full-link tracking of data circulation.
The association analysis algorithm is specifically implemented by the following steps:
Firstly, SQL logs of the database audit component and work order data of the operation and maintenance management component are collected in real time through the data acquisition module;
Secondly, the SQL logs of the database audit component and the work orders of the operation and maintenance management component are parsed to extract key fields, which comprise: operation object, time, and version number;
Then, an association rule base is loaded, and the extracted key fields are matched against the characteristic fields in the rule base;
Then, when a matching association rule is found, the other systems and operations associated with the operation are determined;
Finally, the association result is recorded and verified to confirm whether the association is correct, the association quality is fed back to adjust the rules, and the finally confirmed association analysis result is output, realizing full-link tracking of the data flow.
The behavior monitoring module comprises a data behavior monitoring component, a feature analysis component, and a compliance analysis component. The data behavior monitoring component monitors data circulation; the feature analysis component is associated with the behavior feature library and matches the monitored data behavior against it to identify potential abnormal behavior; meanwhile, the compliance analysis component is associated with the compliance rule library to judge whether the data behavior violates the rules, that is, to judge the compliance of the data behavior; and if non-compliant behavior is detected, alarm information is sent to the intelligent early warning module.
The intelligent early warning module comprises an intelligent early warning analysis component and an intelligent early warning notification component, wherein the intelligent early warning analysis component is used for summarizing and counting early warning times of different departments or units and ranking the early warning times so as to judge an early warning aggregation area, and meanwhile, various statistical analysis reports are automatically generated to intuitively display early warning distribution conditions of different dimensions, and the dimensions comprise early warning types, early warning objects and early warning frequencies; the intelligent early warning notification component is used for sending early warning notification to related responsible persons or departments, and the channels of the early warning notification comprise work orders, mails and short messages; the intelligent early warning module is used for monitoring data circulation among departments and service systems, so that large-scale data are managed and monitored, data safety is guaranteed, and data processing is more comprehensive.
Referring to fig. 3, an early warning method of a compliance intelligent early warning system based on data behavior feature analysis includes the following steps:
Step 1, acquiring original data in a data source module in real time through a data acquisition module, wherein the original data types comprise service system logs, network traffic, database operation logs and data asset information;
Step 2, cleaning, converting, parsing, encapsulating, and storing the acquired data through the data acquisition module, and standardizing the data;
Step 3, carrying out data analysis on the stored data through the capability component module, specifically: the database audit component is associated with the information operation and maintenance management component to obtain the database SQL operation records and the work order processing logs, and the correlations among SQL statements, work order records, and data assets are analyzed to track data flow and access control across the full link;
Step 4, identifying abnormal behavior patterns through the behavior monitoring module, that is, judging the compliance of data behavior, and if abnormal behavior is detected, sending alarm information to the intelligent early warning module;
Step 5, receiving the alarm information through the intelligent early warning module, sending risk warnings to the relevant responsible persons or departments by work order, short message, and mail, and compiling statistics on the warning information along different dimensions to generate statistical analysis reports showing the warning distribution. By collecting original data, performing behavior feature analysis on the data, building a behavior feature library from the data behaviors, and identifying abnormal data behaviors, the invention judges the compliance of data behaviors, automatically identifies abnormal activities, and issues risk warnings, thereby ensuring data security and compliance; it offers efficient identification, high security, low operating cost, and comprehensive data processing.
The present invention provides three embodiments.
Example 1:
The original data in the data source module is acquired in real time through the data acquisition module; the data types and the performance indexes are shown in Tables 1 and 2:
TABLE 1
Heterogeneous data type | Collection result
XML format data | Accurate
Protobuf format | Accurate
Avro format data | Accurate
TABLE 2
Performance index | Setting
Average acquisition speed | 1000 records/second
Acquisition throughput | 50 MB/second
Abnormal-case response time | 5 seconds on average
The data acquisition module cleans, converts, parses, encapsulates, and stores the acquired data and standardizes it, specifically:
Data extraction and normalization: extracting the data to be analyzed from each system, ensuring the consistency of data formats, and performing standardization processing, which in this embodiment includes data cleaning, format conversion, and the like;
loading association rules: defining logical relationships between systems and loading association rules based on business requirements, security policies, and previous analysis results, the definition of rules requiring consideration of which events or behaviors should be considered relevant;
matching and associating data: matching and associating the data extracted from the systems, and connecting, merging or associating the related data so as to establish the relationship between the data of different systems;
Abnormality detection: detecting anomalies or unusual patterns by monitoring the correlated data, which may include identifying anomalies using statistical methods, machine learning models, or specialized rules;
extracting associated events: based on the results of the anomaly detection, events related to the anomaly are extracted, which can help the system identify anomalies in the user's access behavior.
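The anomaly detection step mentions statistical methods as one option. A minimal sketch of one such method, a z-score test against a historical baseline, is given below; the choice of z-score, the threshold of 3, and the sample counts are assumptions, and the patent does not state which statistic is used.

```python
from statistics import mean, pstdev


def zscore_detector(baseline, threshold=3.0):
    """Build a check that flags values far from the baseline mean."""
    mu = mean(baseline)
    sigma = pstdev(baseline)

    def is_anomalous(value):
        if sigma == 0:
            # A constant baseline: any different value is treated as anomalous.
            return value != mu
        return abs(value - mu) / sigma > threshold

    return is_anomalous


# Hypothetical history: rows accessed per hour by one user
baseline = [12, 15, 11, 14, 13, 12]
is_anomalous = zscore_detector(baseline)

print(is_anomalous(14))   # a value close to the baseline
print(is_anomalous(90))   # a value far above the baseline
```

Building the baseline from history, rather than from a batch that already contains the outlier, keeps a single extreme value from inflating the standard deviation and hiding itself.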
The stored data are subjected to data analysis through the capability assembly module, and abnormal data behaviors are identified and alarmed, specifically:
(1) Data behavior monitoring is based mainly on real-time analysis of log and stream data and monitors data access and usage behavior. The record format is as follows:
Time | User | Operation | Business system | Accessed data | Details
2021-05-21 14:32:11 | Zhang San | Query | CRM system | Customer information table | SELECT FROM customer WHERE NAME LIKE '% less%'
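A record in this pipe-delimited format could be parsed along the following lines. The field names, the timestamp layout with a space between date and time, and the sample SQL text are assumptions for illustration; the split is limited to the first five separators so that any `|` inside the SQL detail is preserved.

```python
from datetime import datetime

FIELDS = ("time", "user", "operation", "system", "data", "detail")  # assumed names


def parse_record(line):
    """Split one monitoring record into named fields and parse its timestamp."""
    parts = [p.strip() for p in line.split("|", len(FIELDS) - 1)]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    record["time"] = datetime.strptime(record["time"], "%Y-%m-%d %H:%M:%S")
    return record


# Hypothetical record in the format shown above
line = ("2021-05-21 14:32:11|Zhang San|query|CRM System|"
        "customer information table|SELECT name FROM customer WHERE name LIKE '%a%'")
record = parse_record(line)
print(record["user"], record["time"].isoformat(), record["detail"])
```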
(2) Compliance judgment accuracy assessment:
Compliance requirements for network behavior exist in various industries; however, existing approaches cannot provide a complete evaluation strategy to judge non-compliant behavior and give early warning in time. The compliance rules of the system are as follows:
Rule 1: the data contains sensitive vocabulary, level = important;
The action is to send a serious alarm;
Rule 2: more than 500 pieces of key service data are modified, level = suspicious;
The action is to send a warning alarm;
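The two rules above can be sketched as a small rule evaluator. This is an illustration under assumptions: the sensitive vocabulary, the `Operation` fields and the way an operation is flagged as touching key service data are all hypothetical; only the levels, the 500-record threshold and the alarm actions come from the rules themselves.

```python
from dataclasses import dataclass

# Hypothetical sensitive vocabulary; a real deployment would load its own list.
SENSITIVE_WORDS = {"password", "id number"}


@dataclass
class Operation:
    kind: str                      # e.g. "query" or "modify"
    text: str                      # data content or SQL text of the operation
    rows_affected: int
    key_service_data: bool = False


def evaluate(op):
    """Return a (rule, level, action) tuple for every rule the operation hits."""
    hits = []
    lowered = op.text.lower()
    if any(word in lowered for word in SENSITIVE_WORDS):
        hits.append(("Rule 1", "important", "serious alarm"))
    if op.kind == "modify" and op.key_service_data and op.rows_affected > 500:
        hits.append(("Rule 2", "suspicious", "warning alarm"))
    return hits
```

Because the rules are evaluated independently, a single operation may hit both of them and produce two alarms.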
(3) Alarm and response: once an anomaly is detected, the corresponding alarm mechanism is triggered and security personnel or system administrators are notified; at the same time, automatic response mechanisms may need to be defined, such as automatically blocking the user account or reducing its permissions;
The format of an issued alarm is as follows:
Time | Rule | Problem detail | Operation data | Processing advice
2022-05-21 | Rule 1 | Bulk export of user information | Export 99999 | To be verified;
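An alarm line in the five-field format above, together with an automatic response, could be produced as in the sketch below. The mapping from alarm level to response actions is an assumption made for illustration; the text only states that actions such as blocking an account or reducing permissions may be defined.

```python
from datetime import date

# Hypothetical mapping from alarm level to automatic responses.
RESPONSES = {
    "important": ["notify_security", "block_account"],
    "suspicious": ["notify_security", "reduce_permissions"],
}


def format_alarm(day, rule, problem, operation_data, advice):
    """Join the five alarm fields with '|' in the order used by the alarm format."""
    fields = [day.isoformat(), rule, problem, operation_data, advice]
    if any("|" in field for field in fields):
        raise ValueError("alarm fields must not contain the '|' separator")
    return "|".join(fields)


def responses_for(level):
    """Return the automatic responses configured for an alarm level."""
    return RESPONSES.get(level, ["notify_security"])
```

For example, `format_alarm(date(2022, 5, 21), "Rule 1", "bulk export user information", "export 99999", "for verification")` reproduces the sample alarm line.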
The early warning accuracy is evaluated and tested as follows: a simulation environment is built from historical data, different types of abnormal data access behaviors are replayed, and indexes of the system such as early warning accuracy and recall rate are evaluated; compared with the prior art based on static threshold rules, identifying complex behavior patterns through machine learning can improve the early warning accuracy by more than 15%;
(1) Experimental objective:
Evaluating the performance of the compliance intelligent early warning system based on data behavior feature analysis in terms of early warning accuracy, and comparing it with the conventional method based on static threshold rules.
(2) Data preparation:
Constructing a historical dataset:
Compliance data from the past year is used, containing normal behavior and known abnormal behavior, 100,000 records in total;
Labeling data: normal and abnormal behaviors are labeled according to compliance standards, ensuring the diversity and representativeness of the dataset, as shown in Table 3:
Data type | Quantity
Total amount of historical data | 100000
Number of abnormal data labels | 1000
Table 3
(4) Establishing a simulation environment:
Setting test scenarios: three test scenarios, namely malicious access, unauthorized operation and abnormal data transmission, were simulated 500 times, 200 times and 100 times respectively, as shown in Table 4:
Test scenario | Number of simulations
Malicious access | 500
Unauthorized operation | 200
Abnormal data transmission | 100
Table 4
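A replay schedule for the three scenarios could be generated as in the following sketch. The scenario counts are those of Table 4; the shuffling with a fixed seed and the function name `build_replay_plan` are assumptions added so that a replay run is reproducible.

```python
import random
from collections import Counter

# Scenario counts taken from Table 4.
SCENARIOS = {
    "malicious access": 500,
    "unauthorized operation": 200,
    "abnormal data transmission": 100,
}


def build_replay_plan(scenarios, seed=0):
    """Expand the scenario counts into a shuffled, reproducible replay order."""
    plan = [name for name, count in scenarios.items() for _ in range(count)]
    random.Random(seed).shuffle(plan)
    return plan


plan = build_replay_plan(SCENARIOS)
counts = Counter(plan)  # per-scenario totals are unchanged by shuffling
```

Interleaving the scenarios keeps the system under test from seeing each behavior type in one contiguous batch.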
(5) Evaluation index selection:
Early warning accuracy:
A threshold R = 0.7 is set for triggering early warning; the system triggers an early warning when the probability of abnormal behavior that it detects exceeds R (that is, exceeds 0.7);
Early warning accuracy = number of correct early warnings issued by the system / total number of early warnings
Recall rate:
The ratio of the number of anomalies detected by the system to the total number of anomalies, that is, the recall rate, is calculated from the known abnormal data;
Recall rate = number of anomalies detected by the system / total number of anomalies
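The two indexes can be computed from model scores and ground-truth labels as in this minimal sketch. It assumes each record carries an anomaly probability and a boolean label, and that a warning is issued only when the probability strictly exceeds the 0.7 threshold. The function name and the zero-division handling are illustrative choices.

```python
def evaluate_warnings(scores, labels, threshold=0.7):
    """Return (early warning accuracy, recall rate) for one evaluation run.

    scores: anomaly probabilities produced by the system
    labels: True where the record is a known anomaly
    """
    warned = [score > threshold for score in scores]
    correct = sum(1 for w, is_anomaly in zip(warned, labels) if w and is_anomaly)
    total_warnings = sum(warned)
    total_anomalies = sum(1 for is_anomaly in labels if is_anomaly)
    # Guard against empty denominators instead of raising ZeroDivisionError.
    accuracy = correct / total_warnings if total_warnings else 0.0
    recall = correct / total_anomalies if total_anomalies else 0.0
    return accuracy, recall
```

The early warning accuracy defined here corresponds to what is usually called precision: correct warnings divided by all warnings issued.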
In this embodiment, the evaluation indexes are shown in Table 5:
Index | Value
Early warning accuracy | 80%
Recall rate | 85%
Performance improvement percentage | 15%
Table 5
(6) Specific evaluation procedure, as shown in Table 6:
Table 6
(7) Evaluation results:
The test data in the embodiment of the present invention are shown in Table 7:
Table 7
In the embodiment of the invention, the whole evaluation test set contains 2000 explicitly labeled abnormal data access records; the results after system early warning are shown in Table 8 and Fig. 4:
Item | Quantity
Number of correct early warnings | 1900
Number of missed/false reports | 100
Table 8
Combining the data in Table 8 and Fig. 4, it can be seen that the accuracy of the early warning results obtained by the early warning system and early warning method provided by the invention reaches 95% (1900 correct early warnings out of 2000 records), which demonstrates excellent data behavior early warning capability.
Example 2: Bank data security compliance early warning system
The solution is as follows:
By adopting the compliance intelligent early warning system based on data behavior feature analysis, abnormal behaviors and potential violations such as unauthorized data access and data tampering can be identified through real-time monitoring and analysis of bank data; at the same time, by combining the data risk analysis model of each business scenario, the system evaluates the compliance of data access behaviors and triggers intelligent early warning notifications in time;
The implementation process comprises the following steps:
a. System configuration and initialization: the system is configured and initialized according to the data characteristics and business requirements of the bank, including establishing a behavior feature library, configuring the data risk analysis model and the like;
b. Data acquisition and processing: the bank's data access and usage behaviors, including viewing, modifying, copying, transmitting and other activities on data, are collected in real time through interfaces or other means. The system cleans and integrates the data and extracts features from it;
c. Behavior feature analysis: the system compares and analyzes the data against the behavior feature library so as to quickly identify possible violations or abnormal activities; at the same time, it evaluates the compliance of data access behaviors in combination with the data risk analysis model;
d. Intelligent early warning: once a potential compliance problem or abnormal data access behavior is detected, the system automatically triggers an early warning notification and reminds the relevant personnel through work orders, emails, short messages and the like, and the relevant personnel verify and handle it according to the early warning information; the early warning effect is shown in Table 9 and Table 10:
Table 9
Index | Before implementation | After implementation
Data access anomaly detection coverage | 50% | 95%
Data access anomaly early warning accuracy | 60% | 90%
Average early warning response time | 2 hours | 5 minutes
Monthly data security compliance violation events | 120 | 20
Table 10
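The before/after figures in the table above can be turned into relative changes with a few lines of arithmetic. The sketch below uses only the tabulated values; the variable and function names are illustrative.

```python
def relative_reduction(before, after):
    """Fraction by which a quantity dropped, relative to its starting value."""
    return (before - after) / before


# Monthly violation events: 120 before, 20 after.
events_drop = relative_reduction(120, 20)

# Average response time: 2 hours before, 5 minutes after.
response_speedup = (2 * 60) / 5

# Coverage and accuracy gains, in percentage points.
coverage_gain = 95 - 50
accuracy_gain = 90 - 60

print(f"violation events reduced by {events_drop:.1%}")
print(f"response time shortened {response_speedup:.0f}-fold")
print(f"coverage +{coverage_gain} points, accuracy +{accuracy_gain} points")
```

Coverage and accuracy are reported as percentage-point differences because both columns are already percentages.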
The implementation effect is as follows:
By implementing the compliance intelligent early warning system based on data behavior feature analysis, the bank achieves real-time monitoring of data security and compliance; the accuracy and timeliness of early warning are greatly improved, threats such as data leakage, abuse and unauthorized data access are effectively reduced, data security compliance is improved, and compliance risk is lowered.
Example 3: medical industry data operation compliance management system
The solution is as follows:
The data operation compliance management system for the medical industry combines data behavior feature analysis, real-time monitoring and intelligent early warning functions, so as to ensure the compliance of data operations;
The implementation process comprises the following steps:
Data source integration: the system is integrated with the various databases and information systems of the medical institution, ensuring that all data sources are fully covered;
Data behavior feature analysis: by analyzing the data operation behaviors of the medical institution, a behavior feature library is established, and abnormal or illegal operations are identified;
Real-time monitoring and early warning: once a potential violation or abnormal operation is detected, the system automatically triggers an early warning and notifies the relevant personnel through preset channels (such as email, short message, desktop notification, etc.);
Compliance review and correction: the relevant personnel verify the early warning information; if an illegal operation does exist, the system can provide correction advice and monitor the corrected operation in real time.
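One possible way to build and use a behavior feature library is sketched below. It assumes, for illustration only, that the feature is each user's daily number of data operations and that an operation count is abnormal when it exceeds the user's historical mean by more than three standard deviations. The embodiment does not specify which features or statistical test are used, so both are assumptions.

```python
from statistics import mean, pstdev


def build_feature_library(history):
    """history maps a user to a list of daily operation counts;
    the library stores (mean, standard deviation) per user."""
    return {user: (mean(counts), pstdev(counts))
            for user, counts in history.items() if counts}


def is_abnormal(library, user, count, k=3.0):
    """Flag a daily count that lies more than k deviations above the baseline."""
    if user not in library:
        return True  # assumption: users without a baseline are sent for review
    mu, sigma = library[user]
    # A floor of 1.0 on sigma avoids flagging tiny fluctuations of very
    # regular users.
    return count > mu + k * max(sigma, 1.0)
```

A flagged operation would then go through the compliance review step, where personnel confirm whether it is an actual violation.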
The implementation effect is as follows:
According to the embodiment, the system provided by the invention obviously improves the compliance of data operation in the medical industry, greatly reduces the risks of data leakage and abuse through real-time monitoring and early warning, ensures the privacy safety of patients, and simultaneously improves the data management efficiency and the compliance of medical institutions.

Claims (10)

1. A compliance intelligent early warning system based on data behavior feature analysis is characterized by comprising:
The data source module is used for providing original data, wherein the original data comprises dynamic data, namely refreshed database flow, and static data, namely database log information;
the data acquisition module is used for acquiring original data from the data source module, processing, analyzing and caching the original data, and transmitting the processed data to the capability component module for data processing;
the capability component module is used for analyzing and processing the data transmitted by the data acquisition module and transmitting the processed data to the behavior monitoring module;
The behavior monitoring module is used for monitoring the access behaviors of users and systems to data in real time, carrying out characteristic analysis, establishing a behavior characteristic library according to the data behaviors, and transmitting alarm information obtained by analysis to the intelligent early warning module, wherein the behavior monitoring module is also associated with compliance rule libraries of different industries so as to judge whether the user behaviors violate rules or not, namely judging compliance of the user behaviors;
and the intelligent early warning module is used for receiving the warning information transmitted by the behavior monitoring module and immediately sending an early warning notice to the related receiving end.
2. The compliance intelligent early warning system based on data behavior feature analysis according to claim 1, wherein the data acquisition module acquires raw data, processes, analyzes and caches the raw data, and specifically comprises:
Firstly, dynamic data are collected in real time by the Filebeat collection tool, buffered through a Kafka message bus, and then input into a Storm or Flink stream processing engine in the stream computing area, where the data are cleaned, formatted and aggregated; the data are examined using the Complex Event Processing (CEP) function, data features are extracted, and behavior analysis, risk identification and abnormality early warning are completed online in real time by combining a rule engine and a machine learning model; the stream computing results are temporarily stored in a Redis cache or a Kafka partition in the data cache layer;
meanwhile, static data enter a batch calculation area, and a plurality of prediction models are trained through machine learning and deep learning to form a model library;
And finally, selecting a proper statistical model and a proper deep learning model by the data analysis modeling area according to the service demand matching degree, the data feature matching degree, the model prediction precision and the model universality, deploying by the stream calculation area or the batch calculation area, constructing a high-quality feature space by combining the multi-source heterogeneous data sources of stream calculation and batch processing, and realizing the plan revision and the risk assessment of the service system by simulation and emulation.
3. The compliance intelligent early warning system based on data behavior feature analysis according to claim 2, wherein the construction of the high-quality feature space is specifically as follows:
Firstly, extracting, converting and updating data characteristics to construct effective characteristics expressing data rules;
Secondly, selecting a feature subset which has obvious effect on a target task by using a filtering type, packaging type and embedded type feature selection algorithm, and deleting redundant and irrelevant features;
Then, a dimensionality reduction algorithm of principal component analysis and linear discriminant analysis is used for obtaining low-dimensional characteristic expression;
Finally, the data is divided into training data sets and test data sets, so that the constructed feature space is suitable for model training and evaluation.
4. The compliance intelligent early warning system based on data behavior feature analysis according to claim 1, wherein the capability component module comprises a data analysis component, a database audit component and an information operation and maintenance management component; the database audit component provides SQL operation records, from which the access behavior of SQL statements to data is analyzed; the information operation and maintenance management component provides information on whether a user accesses, modifies and deletes data according to the specified work order flow; and the data analysis component is associated with the database audit component and the information operation and maintenance management component through an association analysis algorithm of the data analysis component, so that full-link tracking of data circulation is realized.
5. The compliance intelligent early warning system based on data behavior feature analysis according to claim 4, wherein the association analysis algorithm is specifically implemented as follows:
Firstly, collecting the SQL logs of the database audit component and the work order data of the operation and maintenance management component in real time through the data acquisition module;
Secondly, parsing the SQL logs of the database audit component and the work orders of the operation and maintenance management component to extract key fields, wherein the key fields comprise: operation object, time, version number;
Then, loading an association rule base, and matching and comparing the extracted key fields with the characteristic fields in the rule base;
Then, when a matching association rule is found, determining the other systems and operations associated with the operation;
And finally, recording and checking the association result, confirming whether the association is correct, feeding back the association quality to adjust the rules, and outputting the finally confirmed association analysis result to realize full-link tracking of the data flow.
6. The compliance intelligent early warning system based on data behavior feature analysis according to claim 1, wherein the behavior monitoring module comprises a data behavior monitoring component, a feature analysis component and a compliance analysis component; the data behavior monitoring component is used for monitoring data flow; the feature analysis component is associated with a behavior feature library and matches the monitored data behavior against the behavior feature library to identify potential abnormal behaviors; and the compliance analysis component is used for judging whether the data behavior violates rules, that is, judging the compliance of the data behavior, and sending alarm information to the intelligent early warning module if non-compliant behavior is monitored.
7. The compliance intelligent early warning system based on data behavior feature analysis according to claim 1, wherein the intelligent early warning module comprises an intelligent early warning analysis component and an intelligent early warning notification component, the intelligent early warning analysis component is used for summarizing and counting early warning times of different departments or units and ranking, so that an early warning aggregation area is judged, various statistical analysis reports are automatically generated at the same time, early warning distribution conditions of different dimensions are visually displayed, and the dimensions comprise early warning types, early warning objects and early warning frequencies; the intelligent early warning notification component is used for sending early warning notification to related responsible persons or departments, and the channels of the early warning notification comprise work orders, mails and short messages.
8. An early warning method of the compliance intelligent early warning system based on data behavior feature analysis according to any one of claims 1 to 7, characterized by comprising the following steps:
Step 1, acquiring original data in the data source module in real time through the data acquisition module;
Step 2, cleaning and converting, parsing and encapsulating, and storing the acquired data through the data acquisition module, and standardizing the data;
Step 3, carrying out data analysis on the stored data through the capability component module;
Step 4, identifying abnormal behavior patterns through the behavior monitoring module, that is, judging the compliance of data behaviors, and sending alarm information to the intelligent early warning module if abnormal behavior is monitored;
Step 5, receiving the alarm information through the intelligent early warning module, sending risk early warnings to the relevant responsible persons or departments by work order, short message and email, and counting the early warning information from different dimensions to generate statistical analysis reports showing the early warning distribution.
9. The early warning method of the compliance intelligent early warning system based on data behavior feature analysis according to claim 8, wherein the original data types in step 1 include service system logs, network traffic, database operation logs and data asset information.
10. The early warning method of the compliance intelligent early warning system based on data behavior feature analysis according to claim 8, wherein in step 3 the stored data are analyzed by the capability component module, specifically: the database audit component is associated with the information operation and maintenance management component to acquire database SQL operation records and work order processing logs, and the correlation among SQL statements, work order records and data assets is analyzed to track the full link of data flow and access control.
CN202410158124.6A 2024-02-04 2024-02-04 Compliance intelligent early warning system and early warning method based on data behavior feature analysis Pending CN118133339A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410158124.6A CN118133339A (en) 2024-02-04 2024-02-04 Compliance intelligent early warning system and early warning method based on data behavior feature analysis

Publications (1)

Publication Number | Publication Date
CN118133339A (en) | 2024-06-04

Family

ID=91246656

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410158124.6A Pending CN118133339A (en) 2024-02-04 2024-02-04 Compliance intelligent early warning system and early warning method based on data behavior feature analysis

Country Status (1)

Country Link
CN (1) CN118133339A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination