CN115795330A - Medical information anomaly detection method and system based on AI algorithm - Google Patents

Info

Publication number
CN115795330A
Authority
CN
China
Prior art keywords
data
network
anomaly detection
algorithm
characteristic vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111055581.5A
Other languages
Chinese (zh)
Inventor
孟亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Xuhui Dahua Hospital
Original Assignee
Shanghai Xuhui Dahua Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Xuhui Dahua Hospital
Priority to CN202111055581.5A
Publication of CN115795330A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Abstract

The invention provides a medical information anomaly detection method and system based on an AI algorithm, relating to the field of Internet medical treatment. The anomaly detection method comprises the following steps. Step 1: collecting and aggregating medical information data; step 2: screening the formats and fields of the data and extracting features by using a data cleaning technique and a filtering and noise reduction technique; step 3: establishing an anomaly detection model by using an AI algorithm based on a medical information security service scenario, and performing model training and model optimization; step 4: performing anomaly determination based on the optimized anomaly detection model, and continuously optimizing the anomaly detection model through machine learning. The invention realizes anomaly detection of APT attacks and DDoS attacks based on an AI algorithm.

Description

Medical information anomaly detection method and system based on AI algorithm
Technical Field
The invention relates to the field of Internet medical treatment, in particular to a medical information anomaly detection method and system based on an AI algorithm.
Background
In the field of medical information security, data is becoming increasingly distributed and heterogeneous, and security threats are becoming more diverse and complex. Analysis and mining of such data requires a data mining system with better distribution and higher computing and processing capacity. Artificial intelligence, first applied to malware monitoring, is now widely applied in fields such as intrusion detection, situation analysis, cloud defense, anti-fraud, Internet of Things security and mobile terminal security.
The application of artificial intelligence to network security problems in existing similar products is still at an early stage of accumulation. Apart from the performance of some security protection logging, network security protection systems based on artificial intelligence are still in the research and practice stage; that is, they cannot perform anomaly detection and early warning for APT attacks, DDoS attacks and the like.
An APT (Advanced Persistent Threat) attack is inherently a targeted attack: a form of attack in which advanced means are used to carry out long-term, continuous network attacks on a specific target. Compared with other forms of attack, the principle of an APT attack is more advanced. Its advanced nature lies mainly in the fact that, before launching the attack, the attacker must accurately collect information about the business processes and target systems of the attacked object. During this collection, the attacker actively searches for vulnerabilities in the trusted systems and applications of the attacked object, uses these vulnerabilities to build the network the attacker requires, and uses zero-day vulnerabilities to carry out the attack. An APT attack aims to steal core data; it is a network attack and intrusion launched against customers, a long-premeditated "malicious commercial espionage threat". Such behavior is often managed and planned over a long period and is highly covert. Conventional signature inspection cannot identify APT attacks, which often use social engineering, zero-day vulnerabilities, customized malware and the like; a traditional passive defense system based on a signature library cannot identify the abnormal traffic and suffers from serious lag.
A DDoS (Distributed Denial of Service) attack is essentially a special form of denial-of-service attack based on DoS: a distributed, coordinated, large-scale attack. A single DoS attack is generally one-to-one. It exploits defects in network protocols and operating systems and uses spoofing and disguise strategies to attack the network, flooding a website server with a large amount of information that requires replies and consuming network bandwidth or system resources, so that the network or system becomes overloaded, is paralyzed and stops providing normal network services. Compared with a DoS attack initiated by a single host, a DDoS attack is a group behavior initiated simultaneously by hundreds or even thousands of hosts that have been compromised and have attack processes installed. For DDoS attacks that use stateless protocol messages, each attack message is itself a normal message, so it is difficult for general detection techniques to judge from a single message whether it is an attack message.
Disclosure of Invention
In view of the above drawbacks of the prior art, the present invention provides a medical information anomaly detection method and system based on an AI algorithm, so as to solve the following problems:
1. Detecting APT attacks based on a normalization algorithm;
Key fields in information security events, such as network throughput, concurrent data access volume and server performance, are normalized and mapped to a certain value interval, so that the associations among values become more obvious and slow, persistent anomalies that conventional information security means cannot monitor can be found more easily.
2. Detecting DDoS attacks based on a Kalman filtering algorithm;
A model is established from the observed values of information security indicators. Random errors arise in the process of obtaining the observed values, so the continuous observations are filtered by a Kalman filtering algorithm to bring the filtering result close to the true value and make the model of normal behavior more accurate. The information security situation can therefore be sensed and analyzed according to the behavior of a node over a certain period of time, and prediction and early warning can be performed according to the analysis result.
The invention provides a medical information anomaly detection method based on an AI algorithm, which comprises the following steps:
step 1: collecting and aggregating medical information data, and dividing the data into historical data and observation data according to the collection time of the medical information data;
step 2: screening the formats and fields of the data and extracting features by using a data cleaning technique and a filtering and noise reduction technique, obtaining a training feature vector set and a test feature vector set from the historical data, and obtaining an observation feature vector from the observation data;
step 3: based on a medical information security service scenario, establishing an anomaly detection model by using an AI algorithm, inputting the training feature vectors into the anomaly detection model for model training, and inputting the test feature vectors into the trained anomaly detection model for model optimization;
step 4: inputting the observation feature vector into the optimized anomaly detection model for anomaly determination, and continuously optimizing the anomaly detection model through machine learning.
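The split described in steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the field name `collected_at`, the cutoff date, and the 25% test fraction are assumptions chosen for the example, since the text leaves the division to the implementer.

```python
from datetime import datetime
import random


def split_by_time(records, cutoff, test_fraction=0.25, seed=0):
    """Split records into (train, test, observation) sets by collection time.

    Records collected before `cutoff` are historical data and are divided into
    training and test sets; records at or after `cutoff` are observation data.
    """
    historical = [r for r in records if r["collected_at"] < cutoff]
    observation = [r for r in records if r["collected_at"] >= cutoff]
    shuffled = historical[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test], observation


# Hypothetical records: one per day, 1 to 10 April 2021.
records = [{"collected_at": datetime(2021, 4, d), "bytes": 100 * d} for d in range(1, 11)]
cutoff = datetime(2021, 4, 9)
train, test, observation = split_by_time(records, cutoff)
```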
In an embodiment of the present invention, the collection range of the medical information data includes a log source of a host server, a network device, a security device, and an application system.
In an embodiment of the present invention, for the APT attack, the anomaly detection method specifically includes:
step 1.1: collecting flow data, performing standard analysis on the flow data, dividing the data into historical data and observation data according to the collection time of the flow data, and marking the event result of the historical data, namely marking the normal flow behavior as 1 and marking the abnormal flow behavior as 0;
step 1.2: extracting flow characteristics according to a known service scene, obtaining a training flow characteristic vector set and a testing flow characteristic vector set from historical data, and obtaining an observation flow characteristic vector from observation data;
step 1.3: normalizing the flow characteristic vector, so that each flow characteristic after normalization is expressed as a value in the interval [0, 1];
step 1.4: constructing an anomaly detection model through a classifier algorithm, inputting the training flow characteristic vector subjected to normalization processing into the anomaly detection model for model training, and inputting the test flow characteristic vector into the anomaly detection model for model optimization;
step 1.5: inputting the observation flow characteristic vector into the optimized anomaly detection model, outputting the probability of abnormality occurrence through the anomaly detection model, comparing the probability of abnormality occurrence with a preset threshold, and determining that an APT attack has occurred when the probability of abnormality occurrence is greater than the preset threshold.
In an embodiment of the present invention, the classifier algorithm includes a decision tree algorithm, a bayesian algorithm, and an SVM algorithm.
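As a rough illustration of the Bayesian option, the sketch below fits a Gaussian naive Bayes classifier on normalized feature vectors and compares the posterior probability of the abnormal class with a preset threshold. The patent does not specify which Bayesian variant is used, so Gaussian naive Bayes is an assumption, as are the sample values and the threshold of 0.5. Labels follow the convention of step 1.1 (normal = 1, abnormal = 0).

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate the prior, per-feature means and variances for each class label."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6  # small floor avoids zero variance
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def anomaly_probability(model, x, abnormal_label=0):
    """Posterior probability that feature vector x belongs to the abnormal class."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_scores[label] = score
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    total = sum(math.exp(s - top) for s in log_scores.values())
    return math.exp(log_scores[abnormal_label] - top) / total


# Hypothetical normalized training vectors; 1 = normal, 0 = abnormal.
X_train = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.15], [0.12, 0.22],
           [0.90, 0.85], [0.85, 0.95], [0.95, 0.80]]
y_train = [1, 1, 1, 1, 0, 0, 0]
THRESHOLD = 0.5  # assumed preset threshold

model = fit_gaussian_nb(X_train, y_train)
p_suspicious = anomaly_probability(model, [0.88, 0.90])
p_ordinary = anomaly_probability(model, [0.14, 0.20])
is_apt = p_suspicious > THRESHOLD
```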
In an embodiment of the present invention, for DDoS attacks, the anomaly detection method specifically includes:
step 2.1: collecting multidimensional network data and standardizing the multidimensional network data;
step 2.2: dividing data into historical data and observation data according to the acquisition time of the multi-dimensional network data, and associating event results of the historical data with the historical data one by one;
step 2.3: extracting network characteristics according to a known service scene, obtaining a training network characteristic vector set and a testing network characteristic vector set from historical data, and obtaining an observation network characteristic vector from observation data;
step 2.4: carrying out filtering and noise reduction processing on the network characteristic vectors, namely judging whether the multidimensional network characteristic vectors obey a Gaussian distribution law or not, if so, constructing an anomaly detection model through a Kalman filtering algorithm, inputting the training network characteristic vectors subjected to filtering and noise reduction processing into the anomaly detection model for model training, and inputting the test network characteristic vectors into the anomaly detection model for model optimization;
step 2.5: inputting the observation network feature vector into the optimized anomaly detection model, outputting the probability of the occurrence of the anomaly through the anomaly detection model, comparing the probability of the occurrence of the anomaly with a preset threshold, and determining that a DDoS attack has occurred when the probability of the occurrence of the anomaly is greater than the preset threshold.
In an embodiment of the present invention, the principle of constructing the anomaly detection model based on the kalman filter algorithm in step 2.4 is as follows:
step 2.41: predicting the network eigenvector and the covariance matrix at the k moment according to the network eigenvector and the covariance matrix at the k-1 moment to obtain a predicted network eigenvector and a predicted covariance matrix at the k moment;
step 2.42: respectively comparing the predicted network characteristic vector and the predicted covariance matrix at the moment k with the actual network characteristic vector and the actual covariance matrix at the moment k, and updating the predicted network characteristic vector and the predicted covariance matrix at the moment k;
step 2.43: carrying out time updating and state updating according to the actual network eigenvector and the actual covariance matrix at the moment k +1, and repeating the steps to obtain the optimal value of the network eigenvector at the moment k + 1;
step 2.44: and drawing a predicted network curve according to the optimal value of the network characteristic vector at the k +1 moment.
In an embodiment of the present invention, the principle of determining the DDoS attack in step 2.5 is as follows:
step 2.51: based on the goodness of fit of the predicted network curve to the observed network feature vectors, continuously correcting waveform glitches and defining a suitable preset threshold;
step 2.52: when the observed curve escapes from the predicted curve in a short, concentrated, large-scale manner, or escapes slowly over a long period until saturation, determining a DDoS attack and entering the manual analysis stage.
The invention also provides a medical information anomaly detection system based on an AI algorithm, which is implemented based on the above method and comprises an acquisition layer, an analysis layer and an application layer;
the acquisition layer is used for acquiring and converging medical information data;
the analysis layer divides the data into historical data and observation data according to the acquisition time of the medical information data, screens and extracts the characteristics of the formats and the fields of the data by using a data cleaning technology and a filtering and noise reduction technology, obtains a training characteristic vector set and a testing characteristic vector set from the historical data, and obtains an observation characteristic vector from the observation data;
based on a medical information safety service scene, establishing an anomaly detection model by using an AI algorithm, inputting training characteristic vectors into the anomaly detection model for model training, and inputting test characteristic vectors into the trained anomaly detection model for model optimization;
inputting the observation feature vector into the optimized anomaly detection model for anomaly judgment, and continuously optimizing the anomaly detection model through machine learning;
and the application layer is used for visualizing the anomaly determination results and displaying early warnings.
As described above, the medical information abnormality detection method and system based on the AI algorithm according to the present invention have the following beneficial effects:
1. the safety of the basic information network and the important information system related to the privacy of the patient is guaranteed to a greater extent, and a safety foundation is laid for the information application in the future.
2. Automation is realized, reducing the valuable labor cost spent on repetitive, simple decision-making work. It avoids the situation in which most of the massive data and feature dimensions go to waste because security experts analyze them manually.
3. Security attack and defense is a long-term dynamic process. Traditional security defense mechanisms based on rules and blacklists inevitably lag behind; by applying artificial intelligence, attack behaviors never encountered before can be more easily discovered and blocked in certain scenarios, and decisions and judgments can be made on unknown samples using the results of machine learning.
Drawings
Fig. 1 shows a flow chart of the detection of the APT attack disclosed in the embodiment of the present invention.
Fig. 2 shows a flow chart of detection of a DDoS attack disclosed in an embodiment of the present invention.
Fig. 3 is a block diagram of a medical information abnormality detection system disclosed in an embodiment of the present invention.
Fig. 4 is a functional schematic diagram of a medical information abnormality detection system disclosed in the embodiment of the present invention.
Fig. 5 is a flowchart illustrating a construction of the medical information abnormality detection system disclosed in the embodiment of the present invention.
Fig. 6 is a schematic diagram illustrating detection of an APT attack disclosed in the embodiment of the present invention.
Fig. 7 shows a schematic diagram of detection of a DDoS attack disclosed in the embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
The invention provides a medical information anomaly detection method based on an AI algorithm, which comprises the following steps:
step 1: collecting and aggregating medical information data, and dividing the data into historical data and observation data according to the collection time of the medical information data;
the collection range of the medical information data comprises log sources of host servers, network devices, security devices and application systems.
step 2: screening the formats and fields of the data and extracting features by using a data cleaning technique and a filtering and noise reduction technique, obtaining a training feature vector set and a test feature vector set from the historical data, and obtaining an observation feature vector from the observation data;
step 3: based on a medical information security service scenario, establishing an anomaly detection model by using an AI algorithm, inputting the training feature vectors into the anomaly detection model for model training, and inputting the test feature vectors into the trained anomaly detection model for model optimization;
step 4: inputting the observation feature vector into the optimized anomaly detection model for anomaly determination, and continuously optimizing the anomaly detection model through machine learning.
In the detection process for APT attacks, relevant traffic information is extracted mainly through traffic detection and analysis. Linked analysis is performed on bandwidth occupation, CPU/RAM, physical paths, IP routing, flag bits, ports, protocols, frame lengths, frame counts and the like, and analysis means such as nodes, topology and time are combined to compile statistics on anomalies that may appear in traffic and traffic behavior. To eliminate the influence of differing dimensions (units and scales) among the indicators, the data must be standardized so that the indicators become comparable. After the raw data is standardized, all indicators are of the same order of magnitude and are suitable for comprehensive comparison and evaluation; the most typical such processing is normalization of the data.
Normalization converts the feature vectors to a common scale by mapping the data to the interval [0, 1] or [-1, 1]. The mapping is determined only by the extreme values of the variable, since interval scaling is one kind of normalization. The specific function of normalization is to unify the statistical distribution of the samples: normalization to between 0 and 1 gives a statistical probability distribution, and normalization to between -1 and +1 gives a statistical coordinate distribution.
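A minimal sketch of this interval scaling, assuming the common min-max form x' = low + (x - min)(high - low) / (max - min); the patent does not give the formula explicitly, and the throughput values are made up for illustration.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Map values linearly onto [low, high] using only their minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to the lower bound.
        return [low for _ in values]
    span = hi - lo
    return [low + (v - lo) * (high - low) / span for v in values]


# Hypothetical network throughput samples in Mbit/s.
throughput = [120.0, 480.0, 300.0, 840.0]
unit_interval = min_max_scale(throughput)            # mapped to [0, 1]
signed_interval = min_max_scale(throughput, -1.0, 1.0)  # mapped to [-1, 1]
```

Because the result depends only on the extreme values, a single outlier in the historical data compresses all other values toward one end of the interval, which is worth checking before training.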
Abnormal traffic analysis is further combined with machine learning and statistical techniques, so that the model can be established in a scientific and reasonable way and possible vulnerability attacks can be accurately identified from the analysis results and data. Compared with traditional network defense techniques, anomaly detection based on big data analysis can fully exploit the advantages of abnormal traffic analysis: the original system is protected through the data acquisition mechanism, abnormal behaviors are effectively tracked, historical traffic data is analyzed, and abnormal traffic points are effectively identified, ultimately achieving the purpose of defending against APT attacks.
Specifically, for the APT attack, the anomaly detection method specifically includes the following steps, please refer to fig. 1:
step 1.1: collecting flow data, performing standard analysis on the flow data, dividing the data into historical data and observation data according to the collection time of the flow data, and marking the event result of the historical data, namely marking the normal flow behavior as 1 and marking the abnormal flow behavior as 0;
the traffic data includes but is not limited to: IP, route, flag bits, port, protocol, frame length, frame count and the like;
step 1.2: extracting flow characteristics according to a known service scene, obtaining a training flow characteristic vector set and a testing flow characteristic vector set from historical data, and obtaining an observation flow characteristic vector from observation data; the division of the training flow characteristic vector set and the testing flow characteristic vector set can be set by self;
common feature vectors include, but are not limited to: network quintuple information, equipment fingerprint information, a core service port and the like, and dynamic characteristics comprise resource loss rate, access behavior, URL redirection information, data inheritance relationship and the like;
step 1.3: normalizing the flow characteristic vector, so that each flow characteristic after normalization is expressed as a value in the interval [0, 1], to eliminate the influence of different characteristic dimensions on modeling; and analyzing the correlation between each indicator and the event result by a statistical method (the Pearson, Spearman and Kendall correlation coefficients) or a tree model, removing characteristics with small correlation or constructing new characteristics as model input;
step 1.4: constructing an anomaly detection model through a classifier algorithm, inputting the training flow characteristic vector subjected to normalization processing into the anomaly detection model for model training, and inputting the test flow characteristic vector into the anomaly detection model for model optimization; the classifier algorithm comprises a decision tree algorithm, a Bayesian algorithm and an SVM algorithm;
step 1.5: inputting the observation traffic characteristic vector into the optimized anomaly detection model, outputting the probability of the occurrence of the anomaly through the anomaly detection model, comparing the probability of the occurrence of the anomaly with a preset threshold, and determining that an APT attack has occurred when the probability of the occurrence of the anomaly is greater than the preset threshold; please refer to fig. 6.
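The correlation screening mentioned in step 1.3 can be sketched with the Pearson coefficient alone. The feature names, sample values and the cutoff of 0.3 are assumptions for illustration; the patent names the coefficients but gives no cutoff.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_features(columns, labels, min_abs_corr=0.3):
    """Keep features whose absolute correlation with the event result meets the cutoff."""
    return [name for name, values in columns.items()
            if abs(pearson(values, labels)) >= min_abs_corr]


# Hypothetical normalized feature columns and event results (1 = normal, 0 = abnormal).
columns = {
    "resource_loss_rate": [0.1, 0.2, 0.1, 0.9, 0.8, 0.95],
    "frame_length": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],
    "constant_flag": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
}
labels = [1, 1, 1, 0, 0, 0]
kept = select_features(columns, labels)
```

Pearson only captures linear association; the Spearman and Kendall coefficients named in step 1.3 are rank-based and would also catch monotonic but non-linear relationships.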
In the detection process for DDoS attacks, state estimation is an important component of the Kalman filtering algorithm. Making quantitative inferences about random quantities from observation data is an estimation problem; in particular, state estimation of dynamic behavior makes it possible to estimate and predict the real-time running state. State estimation is of great significance for understanding and controlling a system, and the methods applied belong to estimation theory in statistics. The most common are least squares estimation, linear minimum variance estimation, recursive least squares estimation and the like. Other methods such as Bayesian estimation based on risk criteria, maximum likelihood estimation and stochastic approximation are also applicable.
Anomaly detection based on the Kalman filtering algorithm assumes that the current network environment is in an approximately steady state. In an early period the algorithm collects and organizes a large amount of normal network data; an initial threshold is set through statistical analysis or data transformation of the historical network data; the current network data is then evaluated and compared with the initial threshold to judge whether the current network is abnormal. If some item of the current network data exceeds the corresponding threshold, this indicates an anomaly. Common network feature vectors include byte count, packet count, flow count, audit record data, number of audit events, interval events, five-tuple (protocol, source IP address, source port, destination IP address and destination port), resource consumption events and the like.
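One simple way to derive an initial threshold from historical normal data is the mean plus a multiple of the standard deviation. This is only one possible reading of "statistical analysis"; the multiplier k = 3 and the packet counts below are assumptions.

```python
import statistics


def initial_threshold(history, k=3.0):
    """Initial threshold = mean + k * population standard deviation of normal data."""
    return statistics.fmean(history) + k * statistics.pstdev(history)


def is_abnormal(value, threshold):
    """A current measurement is flagged when it exceeds the threshold."""
    return value > threshold


# Hypothetical packets-per-minute counts collected while the network was normal.
normal_packets_per_min = [980, 1010, 1005, 995, 1020, 990, 1000, 1000]
threshold = initial_threshold(normal_packets_per_min)

quiet = is_abnormal(1015, threshold)
flood = is_abnormal(5000, threshold)
```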
Specifically, for DDoS attacks, the anomaly detection method includes the following steps; please refer to fig. 2:
step 2.1: collecting multidimensional network data and carrying out standardized processing on the multidimensional network data;
step 2.2: dividing data into historical data and observation data according to the acquisition time of the multi-dimensional network data, and associating event results of the historical data with the historical data one by one;
step 2.3: extracting network characteristics according to a known service scene, obtaining a training network characteristic vector set and a testing network characteristic vector set from historical data, and obtaining an observation network characteristic vector from observation data;
the network characteristics include, but are not limited to: byte count, packet count, flow count, audit log data, number of audit events, interval events, five-tuple (protocol, source IP address, source port, destination IP address and destination port), and resource consumption events;
step 2.4: carrying out filtering and noise reduction processing on the network characteristic vectors, namely judging whether the multidimensional network characteristic vectors obey a Gaussian distribution law or not, if so, constructing an anomaly detection model through a Kalman filtering algorithm, inputting the training network characteristic vectors subjected to filtering and noise reduction processing into the anomaly detection model for model training, and inputting the test network characteristic vectors into the anomaly detection model for model optimization;
step 2.5: inputting the observation network feature vector into the optimized anomaly detection model, outputting the probability of the occurrence of the anomaly through the anomaly detection model, comparing the probability of the occurrence of the anomaly with a preset threshold, and determining that a DDoS attack has occurred when the probability of the occurrence of the anomaly is greater than the preset threshold; see fig. 7.
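Step 2.4 requires judging whether a feature follows a Gaussian distribution but does not name a test. The sketch below uses the Jarque-Bera statistic as a stand-in (an assumption, not the patent's stated method): it combines sample skewness and excess kurtosis and is compared with 5.991, the 95% critical value of the chi-square distribution with two degrees of freedom. The sample data is synthetic and deterministic.

```python
import math
from statistics import NormalDist


def jarque_bera(samples):
    """Jarque-Bera statistic: n/6 * (skewness^2 + excess_kurtosis^2 / 4)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)


CHI2_2DF_95 = 5.991  # 95% critical value, chi-square with 2 degrees of freedom


def looks_gaussian(samples):
    """True when the Jarque-Bera test does not reject normality at the 5% level."""
    return jarque_bera(samples) < CHI2_2DF_95


# Synthetic feature values: evenly spaced quantiles of a normal and an
# exponential distribution, so the example needs no random numbers.
n = 200
bell_shaped = [NormalDist(1000, 50).inv_cdf((i + 0.5) / n) for i in range(n)]
long_tailed = [-math.log(1 - (i + 0.5) / n) * 1000 for i in range(n)]

use_kalman_for_bell = looks_gaussian(bell_shaped)
use_kalman_for_tail = looks_gaussian(long_tailed)
```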
Specifically, the principle of constructing the anomaly detection model based on the kalman filter algorithm in step 2.4 is as follows:
step 2.41: predicting the network eigenvector and the covariance matrix at the k moment according to the network eigenvector and the covariance matrix at the k-1 moment to obtain a predicted network eigenvector and a predicted covariance matrix at the k moment;
step 2.42: respectively comparing the predicted network eigenvector and the predicted covariance matrix at the moment k with the actual network eigenvector and the actual covariance matrix at the moment k, and updating the predicted network eigenvector and the predicted covariance matrix at the moment k;
step 2.43: carrying out time updating and state updating according to the actual network eigenvector and the actual covariance matrix at the moment k +1, and repeating the steps to obtain the optimal value of the network eigenvector at the moment k + 1;
step 2.44: and drawing a predicted network curve according to the optimal value of the network characteristic vector at the k +1 moment.
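The predict and update cycle of steps 2.41 to 2.43 can be sketched for a single scalar feature. This is a simplification under stated assumptions: the patent works with a feature vector and covariance matrix, whereas the sketch tracks one value with a random-walk state model, and the noise variances are made-up tuning parameters.

```python
def kalman_smooth(observations, process_var=1.0, measurement_var=25.0):
    """Scalar Kalman filter with a random-walk model: x_k = x_(k-1) + noise.

    Returns the filtered estimate at every time step, which can be plotted
    as the predicted network curve.
    """
    estimate = float(observations[0])
    error_var = measurement_var
    estimates = [estimate]
    for z in observations[1:]:
        # Prediction: carry the state forward and inflate its uncertainty.
        predicted = estimate
        predicted_var = error_var + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = predicted_var / (predicted_var + measurement_var)
        estimate = predicted + gain * (z - predicted)
        error_var = (1.0 - gain) * predicted_var
        estimates.append(estimate)
    return estimates


# Hypothetical per-minute flow counts with one spike at index 6.
observed = [100, 104, 97, 102, 99, 101, 150, 100]
curve = kalman_smooth(observed)
```

The gain stays between 0 and 1, so each estimate lies between the previous estimate and the new measurement; a single spike is damped rather than followed, which is what makes the residual between observation and curve usable for detection.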
Specifically, the principle of determining the DDoS attack in step 2.5 is as follows:
step 2.51: based on the goodness of fit of the predicted network curve to the observed network feature vectors, continuously correcting waveform glitches and defining a suitable preset threshold;
step 2.52: when the observed curve escapes from the predicted curve in a short, concentrated, large-scale manner, or escapes slowly over a long period until saturation, determining a DDoS attack and entering the manual analysis stage.
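One possible reading of step 2.52 is sketched below: a large residual at any point counts as a short, large-scale escape, while a smaller residual sustained over several consecutive points counts as a slow escape. The function name, the two limits and the run length are all hypothetical; the patent does not quantify either pattern.

```python
def classify_escape(observed, predicted, burst_limit, drift_limit, drift_len):
    """Classify how the observed series departs from the predicted curve.

    Returns "burst" for any residual above burst_limit, "slow_drift" when the
    residual stays above drift_limit for drift_len consecutive points,
    and "normal" otherwise.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    if any(r > burst_limit for r in residuals):
        return "burst"
    run = 0
    for r in residuals:
        run = run + 1 if r > drift_limit else 0
        if run >= drift_len:
            return "slow_drift"
    return "normal"


# Hypothetical predicted curve and three observed series.
predicted = [100] * 8
spike = [101, 99, 100, 400, 380, 102, 100, 99]
creep = [101, 112, 113, 115, 114, 116, 118, 117]
calm = [101, 99, 103, 98, 100, 102, 97, 100]

limits = dict(burst_limit=150, drift_limit=10, drift_len=5)
spike_result = classify_escape(spike, predicted, **limits)
creep_result = classify_escape(creep, predicted, **limits)
calm_result = classify_escape(calm, predicted, **limits)
```

Either non-normal result would hand the event to the manual analysis stage described in step 2.52.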
Embodiment 1:
Taking boundary firewall data as an example, the technical design for detecting DDoS attacks is as follows:
data sample: devid =3 dname = "secgatnsg" date = "2021-04-15-42:
Calculation fields: filter mod = flow, count by span = 1m
SPL processing logic (SPL is a programming language for structured data computation): calculating the total amount received and sent in each time window;
Figure BDA0003254487780000081
Algorithm used: Kalman filtering algorithm
ML calculation logic (logistic regression algorithm): calculating a time-series baseline of the traffic; when the traffic exceeds the confidence interval of the baseline, the output is regarded as a suspected DDoS attack, i.e. count + 1.
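The windowing and counting logic of this embodiment can be approximated in plain Python. This is a sketch rather than the SPL query itself, which is only available as an image: the event tuples, the baseline values and the use of mean + 3 standard deviations as the upper confidence bound are assumptions.

```python
from collections import defaultdict
from datetime import datetime
import statistics


def totals_per_minute(events):
    """Sum received + sent bytes in each 1-minute window (span = 1m)."""
    buckets = defaultdict(int)
    for timestamp, received, sent in events:
        window = timestamp.replace(second=0, microsecond=0)
        buckets[window] += received + sent
    return dict(sorted(buckets.items()))


def count_suspected(window_totals, baseline, z=3.0):
    """Count windows whose total exceeds the upper bound of the baseline interval."""
    upper = statistics.fmean(baseline) + z * statistics.pstdev(baseline)
    return sum(1 for total in window_totals.values() if total > upper)


# Hypothetical firewall flow events: (timestamp, bytes received, bytes sent).
events = [
    (datetime(2021, 4, 15, 10, 0, 5), 400, 600),
    (datetime(2021, 4, 15, 10, 0, 40), 500, 500),
    (datetime(2021, 4, 15, 10, 1, 10), 900, 1100),
    (datetime(2021, 4, 15, 10, 2, 3), 40000, 60000),
]
baseline = [1900, 2100, 2000, 2050, 1950]  # per-minute totals from normal operation

totals = totals_per_minute(events)
suspected = count_suspected(totals, baseline)
```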
The invention also provides a medical information anomaly detection system based on an AI algorithm, which is implemented based on the above method. The system comprises an acquisition layer, an analysis layer and an application layer; please refer to fig. 3;
the acquisition layer is used for acquiring and converging medical information data;
the analysis layer divides the data into historical data and observation data according to the acquisition time of the medical information data, screens and extracts the characteristics of formats and fields of the data by using a data cleaning technology and a filtering and noise reduction technology, obtains a training characteristic vector set and a testing characteristic vector set from the historical data, and obtains an observation characteristic vector from the observation data;
based on a medical information safety service scene, establishing an anomaly detection model by using an AI algorithm, inputting training feature vectors into the anomaly detection model for model training, and inputting test feature vectors into the trained anomaly detection model for model optimization;
inputting the observation characteristic vector into the optimized anomaly detection model for anomaly judgment, and continuously optimizing the anomaly detection model through machine learning;
and the application layer is used for visualizing the anomaly judgment results and displaying early warnings.
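The division of data by acquisition time performed in the analysis layer can be sketched as below. The cutoff time, the 80/20 train/test ratio and the function name `split_by_time` are assumptions made for illustration; the disclosure does not specify them.

```python
def split_by_time(records, cutoff, train_fraction=0.8):
    """Split (timestamp, feature_vector) records into train, test and observation sets.

    Records acquired before `cutoff` are treated as historical data and
    divided chronologically into training and test sets; the rest are
    observation data. The ratio is an assumed value.
    """
    ordered = sorted(records, key=lambda rec: rec[0])
    historical = [rec for rec in ordered if rec[0] < cutoff]
    observation = [rec for rec in ordered if rec[0] >= cutoff]
    n_train = int(len(historical) * train_fraction)
    return historical[:n_train], historical[n_train:], observation
```

Splitting chronologically rather than at random keeps the test set later in time than the training set, which suits time-series data.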
Referring to FIG. 4, the acquisition layer is mainly responsible for ingesting data from external systems or devices into the platform and storing it in a distributed manner according to configuration; the acquisition layer can also filter and desensitize the data before forwarding it to the analysis layer. The list of supported data sources differs little from that of Logstash and includes File, JDBC, UDP, TCP, syslog, Splunk forwarder and the like.
Besides these data source formats and protocols, the acquisition layer also provides an extension interface for data source acquisition, through which support for further data source protocols can be added by custom development.
The analysis layer mainly provides data parsing, data search analysis, machine learning, artificial intelligence algorithms and similar functions; the analysis layer may consist of one or more data analysis nodes; in practical application scenarios, the analysis layer may be built as a data sharing cluster composed of multiple data analysis nodes.
Data parsing is independent of data search analysis and is completed before the data enters the search analysis engine. The data parsing engine is mainly responsible for extracting unstructured logs into multiple KEY-VALUE pairs and structuring them, which improves parsing efficiency, reduces the engine's data input volume, and makes the data search analysis engine more targeted and efficient in processing data. The analysis layer provides professional-grade parsing engines for various system logs, and these engines are upgraded iteratively as the systems release new versions. The platform supports maintenance and upgrading of the parsing engine so that parsing stays consistent with log versions. In addition, the vendor-supplied parsing engine supports parsing rules defined by regular expressions, so custom regex-based parsing can be achieved simply by customizing the data engine.
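Regex-based extraction of KEY-VALUE pairs from an unstructured log line might look like the following sketch. The pattern, the function name `parse_key_values` and the sample line are illustrative assumptions; the field names `devid`, `dname` and `mod` are borrowed from the firewall embodiment.

```python
import re

# Matches key=value or key="quoted value"; spaces around '=' are tolerated.
KV_PATTERN = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|(\S+))')


def parse_key_values(line):
    """Extract KEY-VALUE pairs from one unstructured log line into a dict."""
    fields = {}
    for key, quoted, bare in KV_PATTERN.findall(line):
        # Exactly one of the two value groups is non-empty for a given match
        # (both are empty only for an empty quoted value).
        fields[key] = quoted if quoted else bare
    return fields


if __name__ == "__main__":
    sample = 'devid =3 dname = "secgatnsg" mod=flow'  # hypothetical line
    print(parse_key_values(sample))
```

All values are returned as strings; type conversion would be a separate structuring step.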
In terms of the data search analysis engine, the platform employs a search analysis engine optimized for log search. The engine supports dynamic indexing of logs and log query and analysis statements in standard SPL. More than 170 commonly used log analysis commands and functions are built into the platform, and users can call them quickly and conveniently when querying logs. In terms of search analysis performance, under ideal resource conditions the platform can reach a log analysis rate of 20 MB/s, which offers certain advantages over other log analysis platforms on the market.
In terms of artificial intelligence analysis and big data machine learning, the platform's log analysis integrates mature machine learning algorithms implemented on the basis of a Kalman filter, such as time-series prediction, anomaly detection and multi-factor correlation analysis. The data search analysis engine also provides an algorithm function interface that supports mainstream scripting languages used in artificial intelligence and machine learning, such as Python, R and Ruby, so that machine learning capabilities can be extended.
The application layer is a human-computer interaction layer built on top of the data analysis layer as a Web Service. It supports standard SPL search, conversion of unstructured log data into structured data, and visual display of search results. The data visualizations currently supported by the platform include data tables, line charts, bar charts, area charts and the like.
In addition to visual display, search analysis results and their styles can be saved to dashboards for subsequent visual display.
Referring to FIG. 5, the medical information anomaly detection system is built as follows:
(1) Collecting data
Logs are collected from sources such as host servers, network devices, security devices and application systems, using the logs generated by the devices themselves. The collection scope covers:
1) Terminal security: by analyzing and processing the collected logs, virus outbreaks, worm outbreaks, abnormal terminal behavior and the like can be discovered;
2) Server security: by analyzing and processing the collected logs, server virus outbreaks, abnormal behavior, abnormal operation and maintenance and the like can be discovered;
3) Security device operation: the operation and maintenance status of security devices is monitored by analyzing and processing the collected logs and applying statistical log analysis;
4) Security attack behavior: by analyzing and processing the collected logs, and by correlating them with vulnerability scanning results to derive real and credible security attack alarms, attack behavior on the internal and external networks can be discovered; high-risk attacks are found and handled in real time;
5) Abnormal network access behavior: network usage is monitored in real time by analyzing and processing the collected logs, and behavior that seriously affects network bandwidth or illegally uses bandwidth resources is handled in real time; this keeps the network running smoothly, guarantees effective bandwidth for services, and regulates legitimate network access within the enterprise.
(2) Cleaning transformation
Because device systems are diverse and differ greatly in log storage format, field meaning and communication protocol, the collected device logs are normalized and the complete audit record information is extracted, providing a basis for subsequent security audit and correlation analysis.
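Normalization of this kind amounts to mapping each device's field names onto one common schema. The sketch below is illustrative only: the device types, field names, mapping tables and the function name `normalize` are all hypothetical.

```python
# Hypothetical per-device field mappings; real devices need their own tables.
FIELD_MAPS = {
    "firewall": {"src": "src_ip", "dst": "dst_ip", "date": "timestamp"},
    "server": {"client_addr": "src_ip", "host": "dst_ip", "time": "timestamp"},
}

# Assumed common schema shared by all normalized records.
COMMON_FIELDS = ("timestamp", "src_ip", "dst_ip")


def normalize(device_type, record):
    """Rename device-specific fields to the common schema; missing fields become None."""
    mapping = FIELD_MAPS[device_type]
    renamed = {mapping.get(key, key): value for key, value in record.items()}
    out = {field: renamed.get(field) for field in COMMON_FIELDS}
    out["device_type"] = device_type
    return out
```

Keeping the mapping tables as data rather than code means a new log source can be supported by adding one table entry.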
(3) Exploration/visualization
Various statistical analysis reports are provided for all monitored information, in presentation formats including graphs, histograms, line and area charts, data tables, gauge charts and the like. Historical data can be queried and exported, with statistical periods in units of years, months, days, hours, minutes and so on. Statistical reports such as rankings and comparisons are provided for all managed networks, hosts, databases, middleware, service systems and other resources according to running state, key performance, alarms and so on, so that users can grasp the overall operating status of the information resources intuitively and from multiple angles.
(4) Requirements/model
Taking the logs collected from different devices and basic IT data as the basis for analysis, the logs are analyzed and modeled through system-defined association rules according to the information security management requirements of the medical industry; in addition, hidden security risks can be mined in depth, the severity of security events can be judged, events can be correlated into higher-confidence correlation analysis events, and the signal-to-noise ratio of event processing can be improved;
the following correlation analysis methods are provided: rule-based correlation analysis, statistics-based correlation analysis, correlation analysis based on subjects and objects, and KPI-based correlation analysis, specifically:
1) Correlation analysis based on multiple modes such as historical security state and cause and effect is supported;
2) Statistics-based correlation analysis, including long-period correlation analysis, is supported;
3) Various analysis and alarm functions based on security policies are supported;
4) Correlation analysis based on asset vulnerabilities and identity roles is supported;
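Correlation based on asset vulnerabilities can be illustrated by keeping only those attack alerts whose target asset is known, from vulnerability scanning, to carry the vulnerability being exploited. The record layout, the placeholder identifiers and the function name `credible_alerts` are assumptions made for this sketch.

```python
def credible_alerts(alerts, scan_results):
    """Keep alerts whose target asset carries the exploited vulnerability.

    alerts: iterable of dicts with 'asset' and 'vuln_id' keys (assumed schema).
    scan_results: dict mapping asset name -> set of vulnerability IDs
                  reported by vulnerability scanning.
    """
    return [
        alert for alert in alerts
        if alert["vuln_id"] in scan_results.get(alert["asset"], set())
    ]


if __name__ == "__main__":
    scans = {"his-server": {"VULN-A"}, "pacs-server": {"VULN-B"}}  # placeholders
    alerts = [
        {"asset": "his-server", "vuln_id": "VULN-A"},
        {"asset": "his-server", "vuln_id": "VULN-B"},
    ]
    print(credible_alerts(alerts, scans))
```

Alerts that target an asset without the matching vulnerability are dropped here; a real system might instead lower their priority.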
based on the medical information security business requirements, algorithms and a knowledge base system are used to establish visual models suited to those requirements, mainly including:
1) Automatic risk discovery and alarm model
Logs matching an alarm definition are pushed to a predefined administrator. Alarm analysis covers both log correlation on a single device and cross-device log correlation. Alarm delivery must support at least email, SMS and management interface alarms.
2) Security automation and service orchestration model
Based on the large volume of security event data provided by the data analysis tool, services are orchestrated automatically through automation scripts, and security events are responded to automatically.
(5) Algorithm evaluation
1) Algorithm definition and evaluation are carried out on a single operation or on a specific, ordered series of operations to form a model of correct or abnormal behavior, against which violations and abnormal behavior can be accurately detected.
2) Human intervention is introduced to evaluate the algorithm's output, and the accuracy of the evaluation results is continuously improved through supervised and unsupervised learning. Users exhibiting abnormal behavior are identified by learning from and analyzing large amounts of data; for precisely defined policy scenarios, in-depth analysis based on user behavior is performed, and the user's threat probability is confirmed through a comprehensive analysis model.
3) Specific application of machine learning algorithms in information security situational awareness scenarios
The machine learning algorithms supported by the platform can be applied to scenarios such as data preprocessing, feature extraction, classification, regression, clustering, time-series prediction, outlier detection and text analysis. The aim is to build an optimal model for anomaly detection, so that possible threats are anticipated and threat events are stopped before they occur.
(6) Release, deployment and application evaluation
On the basis of the platform's data search engine, application components for a particular information security management field or a specific service can be built and customized according to the hospital's actual situation and requirements. The platform also brings together information security management knowledge bases that are widely used in some industries, and offers the management experience and advanced techniques of each management field in the form of applications or plug-ins.
In conclusion, the invention safeguards the security of the medical information system, enables supervision of security technology, and promotes the implementation of regulations related to information security level protection. Therefore, the invention effectively overcomes various defects in the prior art and has high industrial utilization value.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims (8)

1. A medical information abnormality detection method based on an AI algorithm, the abnormality detection method comprising the steps of:
step 1: acquiring and converging medical information data, and dividing the data into historical data and observation data according to the acquisition time of the medical information data;
step 2: screening and feature extraction are carried out on formats and fields of data by using a data cleaning technology and a filtering and denoising technology, a training feature vector set and a testing feature vector set are obtained from historical data, and an observation feature vector is obtained from observation data;
step 3: based on a medical information safety service scene, establishing an anomaly detection model by using an AI algorithm, inputting training characteristic vectors into the anomaly detection model for model training, and inputting test characteristic vectors into the trained anomaly detection model for model optimization;
step 4: inputting the observation characteristic vector into the optimized abnormality detection model for abnormality judgment, and continuously optimizing the abnormality detection model through machine learning.
2. The AI algorithm-based medical information anomaly detection method according to claim 1, characterized in that: the acquisition range of the medical information data comprises a host server, network equipment, safety equipment and a log source of an application system.
3. The AI-algorithm-based medical information anomaly detection method according to claim 2, wherein for APT attacks, the anomaly detection method specifically comprises:
step 1.1: collecting flow data, performing standard analysis on the flow data, dividing the data into historical data and observation data according to the collection time of the flow data, and marking the event result of the historical data, namely marking the normal flow behavior as 1 and marking the abnormal flow behavior as 0;
step 1.2: extracting flow characteristics according to a known service scene, obtaining a training flow characteristic vector set and a testing flow characteristic vector set from historical data, and obtaining an observation flow characteristic vector from observation data;
step 1.3: normalizing the flow characteristic vector, so that each normalized flow characteristic is expressed as a value in the range (0, 1);
step 1.4: constructing an anomaly detection model through a classifier algorithm, inputting the training flow characteristic vector subjected to normalization processing into the anomaly detection model for model training, and inputting the test flow characteristic vector into the anomaly detection model for model optimization;
step 1.5: inputting the observation flow characteristic vector into the optimized abnormality detection model, outputting the probability of abnormality occurrence through the abnormality detection model, comparing the probability of abnormality occurrence with a preset threshold value, and determining an APT attack when the probability of abnormality occurrence is larger than the preset threshold value.
4. The AI algorithm-based medical information anomaly detection method of claim 3, wherein the classifier algorithms comprise a decision tree algorithm, a Bayesian algorithm, and an SVM algorithm.
5. The AI algorithm-based medical information anomaly detection method according to claim 2, characterized in that for DDoS attacks, the anomaly detection method specifically comprises:
step 2.1: collecting multidimensional network data, and standardizing the multidimensional network data;
step 2.2: dividing data into historical data and observation data according to the acquisition time of the multi-dimensional network data, and associating event results of the historical data with the historical data one by one;
step 2.3: extracting network characteristics according to a known service scene, obtaining a training network characteristic vector set and a testing network characteristic vector set from historical data, and obtaining an observation network characteristic vector from observation data;
step 2.4: filtering and denoising the network characteristic vector, namely judging whether the multidimensional network characteristic vector obeys a Gaussian distribution law, if so, constructing an anomaly detection model through a Kalman filtering algorithm, inputting the training network characteristic vector subjected to filtering and denoising into the anomaly detection model for model training, and inputting the testing network characteristic vector into the anomaly detection model for model optimization;
step 2.5: inputting the observation network characteristic vector into an optimized abnormality detection model, outputting the probability of abnormality occurrence through the abnormality detection model, comparing the probability of abnormality occurrence with a preset threshold value, and determining a DDoS attack when the probability of abnormality occurrence is larger than the preset threshold value.
6. The AI algorithm-based medical information anomaly detection method of claim 5, wherein the step 2.4 of constructing an anomaly detection model based on Kalman filtering algorithm is based on the following principle:
step 2.41: predicting the network feature vector and the covariance matrix at time k according to the network feature vector and the covariance matrix at time k-1, to obtain a predicted network feature vector and a predicted covariance matrix at time k;
step 2.42: comparing the predicted network feature vector and the predicted covariance matrix at time k with the actual network feature vector and the actual covariance matrix at time k, and updating the predicted network feature vector and the predicted covariance matrix at time k;
step 2.43: performing the time update and the state update according to the actual network feature vector and the actual covariance matrix at time k+1, and repeating the above steps to obtain the optimal value of the network feature vector at time k+1;
step 2.44: plotting a predicted network curve according to the optimal value of the network feature vector at time k+1.
7. The AI algorithm-based medical information anomaly detection method of claim 6, wherein the principle for determining a DDoS attack in step 2.5 is as follows:
step 2.51: continuously smoothing waveform spikes based on the goodness of fit between the predicted network curve and the observed network characteristic vector, and defining a suitable preset threshold value;
step 2.52: when the observed curve departs from the predicted curve in a brief, dense, large-scale excursion, or in a long, slow excursion toward saturation, determining a DDoS attack and entering a manual analysis stage.
8. A medical information anomaly detection system based on an AI algorithm, the system being implemented based on the method of any one of claims 1 to 7 and comprising an acquisition layer, an analysis layer and an application layer;
the acquisition layer is used for acquiring and converging medical information data;
the analysis layer divides the data into historical data and observation data according to the acquisition time of the medical information data, screens and extracts the characteristics of formats and fields of the data by using a data cleaning technology and a filtering and noise reduction technology, obtains a training characteristic vector set and a testing characteristic vector set from the historical data, and obtains an observation characteristic vector from the observation data;
based on a medical information safety service scene, establishing an anomaly detection model by using an AI algorithm, inputting training characteristic vectors into the anomaly detection model for model training, and inputting test characteristic vectors into the trained anomaly detection model for model optimization;
inputting the observation characteristic vector into the optimized anomaly detection model for anomaly judgment, and continuously optimizing the anomaly detection model through machine learning;
and the application layer is used for visualizing the anomaly judgment results and displaying early warnings.
CN202111055581.5A 2021-09-09 2021-09-09 Medical information anomaly detection method and system based on AI algorithm Pending CN115795330A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111055581.5A CN115795330A (en) 2021-09-09 2021-09-09 Medical information anomaly detection method and system based on AI algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111055581.5A CN115795330A (en) 2021-09-09 2021-09-09 Medical information anomaly detection method and system based on AI algorithm

Publications (1)

Publication Number Publication Date
CN115795330A true CN115795330A (en) 2023-03-14

Family

ID=85473196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111055581.5A Pending CN115795330A (en) 2021-09-09 2021-09-09 Medical information anomaly detection method and system based on AI algorithm

Country Status (1)

Country Link
CN (1) CN115795330A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116662375A (en) * 2023-08-02 2023-08-29 湖南远跃科技发展有限公司 HIS-based prescription data verification method and system
CN116662375B (en) * 2023-08-02 2023-10-10 湖南远跃科技发展有限公司 HIS-based prescription data verification method and system
CN117038050A (en) * 2023-10-10 2023-11-10 深圳华声医疗技术股份有限公司 Physiological parameter abnormality processing method, system and medical equipment
CN117038050B (en) * 2023-10-10 2024-01-26 深圳华声医疗技术股份有限公司 Physiological parameter abnormality processing method, system and medical equipment

Similar Documents

Publication Publication Date Title
US9256735B2 (en) Detecting emergent behavior in communications networks
Kholidy Detecting impersonation attacks in cloud computing environments using a centric user profiling approach
CN114301712B (en) Industrial internet alarm log correlation analysis method and system based on graph method
Rahal et al. A distributed architecture for DDoS prediction and bot detection
US20220360597A1 (en) Cyber security system utilizing interactions between detected and hypothesize cyber-incidents
US9961047B2 (en) Network security management
CN110896386B (en) Method, device, storage medium, processor and terminal for identifying security threat
US20230011004A1 (en) Cyber security sandbox environment
US20230132703A1 (en) Capturing Importance In A Network Using Graph Theory
CN115795330A (en) Medical information anomaly detection method and system based on AI algorithm
Aung et al. An analysis of K-means algorithm based network intrusion detection system
Al-Utaibi et al. Intrusion detection taxonomy and data preprocessing mechanisms
CN112491860A (en) Industrial control network-oriented collaborative intrusion detection method
CN113904881A (en) Intrusion detection rule false alarm processing method and device
Kamarudin et al. A new unified intrusion anomaly detection in identifying unseen web attacks
RU148692U1 (en) COMPUTER SECURITY EVENTS MONITORING SYSTEM
Brandao et al. Log Files Analysis for Network Intrusion Detection
Harbola et al. Improved intrusion detection in DDoS applying feature selection using rank & score of attributes in KDD-99 data set
RU180789U1 (en) DEVICE OF INFORMATION SECURITY AUDIT IN AUTOMATED SYSTEMS
Pan et al. Anomaly behavior analysis for building automation systems
Meng et al. An effective high threating alarm mining method for cloud security management
Sulaiman et al. Big data analytic of intrusion detection system
Azarkasb et al. A network intrusion detection approach at the edge of fog
CN113572781A (en) Method for collecting network security threat information
Sharma et al. An Analysis of Android Malware and IoT Attack Detection with Machine Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination