CN110019070A - Security log clustering method and accountability system based on Hadoop - Google Patents
- Publication number
- CN110019070A
- Authority
- CN
- China
- Prior art keywords
- log
- account
- calling
- algorithm
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Probability & Statistics with Applications (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention belongs to the fields of data mining and information security. It discloses an accountability system for massive security management-and-control logs. By studying massive dynamic control logs, it establishes a practicable accountability system that supports accountability and source tracing for anomalous events and threat situation awareness for security incidents. In particular, it relates to an improvement of the K-Means clustering algorithm which, combined with the parallel computation of Map/Reduce, implements parallel iterative K-Means computation and improves the speed and accuracy of log processing.
Description
Technical field
The invention relates to a Hadoop-based security log clustering method and accountability system. It belongs to the fields of data mining and information security and is built on a cloud computing environment, specifically the cloud computing model of the Hadoop framework. It concerns accountability based on massive security management-and-control logs and, in particular, an improvement of the K-Means log clustering algorithm.
Background art
Cloud computing carries enormous data resources, and the sustained growth of big data puts physical equipment to a severe test. How to store, process, and analyze this ever-expanding data has increasingly become a focus of computer science and related fields. The open-source Hadoop big data processing framework emerged in response and has become popular across industries. The Hadoop platform is efficient, reliable, and highly scalable. Its two main components are the Hadoop Distributed File System (HDFS) and the parallel processing model Map/Reduce: files are extracted from HDFS and passed into Map/Reduce for iterative processing, which completes the processing of massive data.
Map/Reduce, a core component of Hadoop, provides efficient processing. The whole process can be roughly divided into three stages. In the first stage, data is extracted from HDFS (the file storage system), divided into fixed-size data blocks (usually 64 MB), and fed to Map, which performs the initial processing and produces (key, value) pairs. The second, intermediate stage is Combine, which mainly sorts the key-value pairs and requires a large amount of I/O. In the last stage, the intermediate values produced by Combine enter Reduce, which processes the data according to the output requirements and outputs the result.
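The three stages described above can be illustrated with a small single-process sketch. It is not Hadoop code: the function names and the "<LEVEL> <message>" log format are assumptions made for illustration, and the Combine step here follows the description above (sorting and grouping key-value pairs), whereas in Hadoop itself sorting is done by the shuffle step and a combiner is an optional local reducer.

```python
from collections import defaultdict

def map_phase(block):
    """Map: emit one (key, value) pair per log line, keyed by its first field."""
    for line in block:
        level = line.split()[0]   # hypothetical format: "<LEVEL> <message>"
        yield level, 1

def combine_phase(pairs):
    """Combine: sort the intermediate pairs and group them by key."""
    groups = defaultdict(list)
    for key, value in sorted(pairs):
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse the values of each key into the final output."""
    return {key: sum(values) for key, values in groups.items()}

def run_job(blocks):
    """Run Map over every block, then Combine and Reduce the emitted pairs."""
    pairs = [pair for block in blocks for pair in map_phase(block)]
    return reduce_phase(combine_phase(pairs))

# Two small "blocks" stand in for the fixed-size blocks read from HDFS.
blocks = [
    ["ERROR disk full", "INFO login ok"],
    ["ERROR timeout", "WARN retry"],
]
counts = run_job(blocks)
```

In a real cluster each block would be mapped on a different node; the sketch only shows how data moves between the three stages.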
As the saying goes, birds of a feather flock together. Clustering is long established and widely used in big data mining. It belongs to unsupervised machine learning and comes in many forms, generally partition-based, grid-based, density-based, and so on. A suitable method is usually chosen according to factors such as the shape of the objects to be clustered, and hybrid approaches can also be used. Given the characteristics of the logs studied here (unstructured text), the invention chooses K-Means, a typical partition-based algorithm whose principle is simple and which is easy to implement. K-Means repeatedly computes the distance from each sample point to the cluster centers and assigns the data by distance to K classes specified in advance (K is generally chosen arbitrarily as an input). For a data set S of size n, the process is as follows:
Input a K value (generally specified according to the expected number of output classes).
Randomly select K samples from the sample set S as the initial cluster centers, compute the distance from each remaining sample to each initial cluster center, assign each sample to the nearest cluster, and iterate until convergence.
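The basic procedure just described can be sketched in a few lines of plain Python. This is a minimal illustration of standard K-Means with random initial centers, not the improved algorithm of the invention; the function names, the `max_iter` and `seed` parameters, and the sample data are assumptions.

```python
import math
import random

def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(samples, k, max_iter=100, seed=0):
    """Plain K-Means: random initial centers, then assign and update until stable."""
    rng = random.Random(seed)
    centers = rng.sample(samples, k)          # K samples drawn at random
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in samples:                     # assign each sample to its nearest center
            nearest = min(range(k), key=lambda j: euclidean(p, centers[j]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [mean_point(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:            # converged: centers no longer move
            break
        centers = new_centers
    return centers, clusters

samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(samples, k=2)
```

Because the initial centers are drawn at random, less clearly separated data can converge to different results for different seeds, which is the sensitivity discussed next.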
The iterative process of K-Means involves two main research directions. The first is the choice of K: selecting K at random easily degrades the clustering result. The second is the choice of initial cluster centers: there is a risk of selecting isolated points or noise points as initial centers, which leads to a local optimum. Research in recent years has concentrated mainly on these two defects.
Given the strong cloud computing capability of Hadoop and the simplicity and ease of implementation of K-Means, the parallel computing capability of Map/Reduce can be exploited by combining K-Means with Map/Reduce, which greatly improves computing capacity. On this basis, and with massive control logs as the research object of this project, improving the K-Means algorithm and combining it with Map/Reduce can improve log processing capacity and reduce the miss rate and false alarm rate. It further enables the discovery and reconstruction of control behaviors such as information leakage, process blocking, and network traffic interception, and the formation of accountability evidence according to a unified specification. It is therefore necessary to build a system that provides accountability and source tracing for dynamic information, in order to improve network security and the operating environment of a trusted cloud.
Summary of the invention
By studying massive dynamic control logs, the present invention establishes a practicable accountability system that supports accountability and source tracing for security events and, by tracking the security posture, makes it possible to monitor the security of each network node and take preventive measures.
The invention also concerns the K-Means algorithm. The current defect of K-Means clustering is that the choice of initial cluster centers and of the K value easily traps the algorithm in a locally optimal solution. Although many experts and scholars have done a great deal of research on improvements, shortcomings remain to some degree. In particular, research applied to control logs is far from meeting the current demand for analyzing and processing massive log data. The data used in existing studies is mostly drawn from UCI sample sets and is not well suited to high-dimensional control log sample sets. In addition, most K-Means research is not comprehensive, and its handling of isolated point removal has not kept pace with the times.
In view of the shortcomings of K-Means and the growing scope and value of big data applications, the invention makes improvements in two respects. First, the sample set is purified by removing isolated points that interfere with the clustering result. Second, a threshold is set using point density and average distance, and the initial cluster centers are selected mainly by the MMD (max-min distance) algorithm, so that similarity between classes is minimized and similarity within classes is maximized. Under the Hadoop cloud computing model, iteration through Map/Reduce produces a good clustering result and provides a good data environment for log association analysis. An association algorithm then forms an accountable evidence chain that is provided to the user: the user enters the features to be extracted, the accountability system finds the relevant evidence chain, and accountability and source tracing follow. The invention can greatly improve log processing speed and network security performance, manage network equipment better, and achieve control of and accountability for dynamic data.
The technical solution provided by the present invention is as follows.
For a Hadoop-based security log analysis and accountability system, following the flow and characteristics of the information from log collection to display, the system consists mainly of four modules: log acquisition, log storage, log analysis, and platform display. The main goal of the system is to make anomalous events, such as SQL injection, APT attacks, and DDoS attacks, accountable and traceable through log analysis, thereby improving the security of a trusted cloud environment. The system uses a B/S (browser/server) architecture. According to their needs, users enter information keywords to view the evidence chain of associated logs and then trace the information source from the log records. This achieves control of and accountability for key information, including the discovery and reconstruction of control behaviors such as information leakage, process blocking, and network traffic interception.
Logs record the operating status of the various network devices. By preprocessing, storing, and analyzing massive security logs, the network security situation can be viewed in real time on the display platform, and monitoring and dashboard views allow device operation to be managed layer by layer at the program level, network level, and system level. The system mainly involves log acquisition, storage, cluster analysis, association analysis, denoising, and related technologies. Log acquisition mainly uses collection software compatible with the Hadoop big data computing framework. Collected logs enter the HDFS file storage system through the collector for storage and are processed and analyzed by Map/Reduce, which denoises the data and completes the clustering, after which association analysis is performed. During this process, control logs can be operated on according to the control policy, including sorting and labeling, to improve the accuracy of accountability. Finally, the web interface displays the security posture trend of the target system, reveals potential threats and attacks, enables a rapid first response to anomalous events, and presents the security status of the whole system to the user.
The technical stages of an accountable system must be closely linked. Log acquisition uses concurrent multipoint collection, which improves collection efficiency, and works with HDFS for log storage and management. Through continuous Map/Reduce iteration, the K-Means cluster centers are selected iteratively until the center values no longer change, that is, until the clustering converges. The invention uses the partition-based K-Means algorithm and applies association techniques to link log data semantically, so that data items are no longer independent individuals but are associated with one another, which helps distinguish and identify them. The purpose of log stream association is to merge, according to the accountability requirement, the program-level, system-level, and network-level control event logs relevant to that requirement into a complete event sequence, and then to generate an evidence chain that makes accountability non-repudiable. By entering feature values, users can hold accountable, or trace back to, a specific network device or network event through log association. For example, it is possible to determine whether a virtual machine escape has occurred, whether a DDoS attack or a potential APT attack has occurred in a given domain, and whether sensitive data has been modified, and to monitor and manage these accordingly.
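The merging of program-level, system-level, and network-level events into one time-ordered sequence can be pictured with a small sketch. The patent does not specify a record layout or an association algorithm, so the `LogEvent` fields, the simple substring match on the feature value, and the sample events are all assumptions chosen only to illustrate the idea of an evidence chain.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LogEvent:
    timestamp: int   # hypothetical field: seconds since an arbitrary epoch
    level: str       # "program", "system", or "network"
    device: str      # hypothetical device or interface identifier
    message: str

def build_evidence_chain(events, feature):
    """Return the events that mention `feature`, merged across levels in time order."""
    matched = [e for e in events
               if feature in e.message or feature == e.device]
    return sorted(matched, key=lambda e: e.timestamp)

events = [
    LogEvent(30, "network", "fw-1", "traffic spike from vm-42"),
    LogEvent(10, "program", "api-gw", "API call by vm-42"),
    LogEvent(20, "system", "hv-3", "vm-42 escape attempt"),
    LogEvent(15, "system", "hv-3", "vm-7 started"),
]
chain = build_evidence_chain(events, "vm-42")
```

A production system would associate clustered logs with an association-rule algorithm rather than a substring match; the sketch only shows the shape of the resulting sequence.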
The successful operation of the accountability system depends on the improved K-Means clustering algorithm that is the focus of the invention. The core of the improvement is the combination of MMD (max-min distance) and density: the sample set data is first optimized, and K initial cluster centers are then found using distance and density, after which iteration continues until the sample set has been assigned to K clusters. In this stage, following the basic principle of K-Means, the sample mean is computed; using the definition of point density, the point density of the mean point and of every sample is computed and sorted; and samples whose density is far below that of the sample mean are removed, which eliminates noise points. The sample mean is then recomputed and taken as the center, the distance from each sample point to this center is computed, and the distances are sorted in descending order. The two sample points nearest to and farthest from the center are selected as initial cluster centers, and the third, fourth, and subsequent centers are found in turn according to the MMD principle until there are K initial cluster centers. Iteration then proceeds according to the K-Means algorithm to cluster the sample set, and finally a minimum-variance evaluation function is used to assess the clustering result, completing the clustering process.
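The density-based purification step can be sketched as follows. The text says only that samples whose density is "far below" the reference density are removed, so the cut-off `ratio` below is an assumed parameter, as are the function names; density is taken here as the number of samples within one average-distance radius of a point.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def density(center, samples, radius):
    """Number of samples lying within `radius` of `center`."""
    return sum(1 for p in samples if euclidean(center, p) <= radius)

def remove_isolated(samples, ratio=0.5):
    """Drop samples whose density is below `ratio` times the density of the mean point.

    `ratio` is a hypothetical stand-in for "far below the threshold".
    """
    center = mean_point(samples)
    # Radius: average distance from the samples to the sample mean.
    radius = sum(euclidean(p, center) for p in samples) / len(samples)
    threshold = density(center, samples, radius)
    return [p for p in samples if density(p, samples, radius) >= ratio * threshold]

samples = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (20, 20)]
cleaned = remove_isolated(samples)
```

Here the point (20, 20) has only itself within the radius and is dropped, while the five tightly grouped points are kept.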
Preferably, the improved K-Means algorithm proposed by the invention is combined with Map/Reduce to analyze and process massive control logs and perform association analysis, so that certain sensitive behaviors can be controlled and blocked. Accountability is achieved through feature extraction, supporting accountability and source tracing for sensitive data, potential threats (APT), DDoS attacks, information leakage, and the like, and forming a complete security protection and accountability system with monitoring beforehand and accountability afterwards.
Brief description of the drawings
Fig. 1 is an architecture diagram of the Hadoop-based security log accountability system of the present invention.
Fig. 2 is a schematic diagram of the Map/Reduce algorithm of the present invention.
Fig. 3 is a flow chart of the Map/Reduce parallelization of the improved K-Means of the present invention.
Detailed description of the embodiments
In the system of the present invention, the user enters the feature keywords for accountability on the web front-end interface. The system accesses the log association database and, following a "reverse" lookup approach, queries the data needed for accountability and source tracing. This supports accountability and source tracing for sensitive data, potential threats (APT), DDoS attacks, information leakage, and the like, and forms a complete security protection and accountability system with monitoring beforehand and accountability afterwards. The whole system is divided into four modules. The first is the data acquisition module, which mainly collects log data. The second is the log storage module, which mainly stores logs and divides them into blocks. The third is the analysis module, which mainly clusters and associates logs to form the accountability evidence chain. The fourth is the platform display module, which displays and queries key information through functional modules such as security posture, dashboard, monitoring, and query. Following the overall framework, construction proceeds layer by layer from low to high in four stages.
(1) Log collection stage
Massive control logs are first collected by a log collector. These logs come from a wide range of sources at the program, system, and network levels, and the sample sets collected at these three levels record logs corresponding to the security label policy. Control log collection covers network devices, security devices, operating systems, applications, and so on. The collector must be well compatible with Hadoop's HDFS file storage system and must vectorize the logs. This system uses the Flume log collector, which is well compatible with Hadoop; it is deployed on each node to collect log information, and the format is then unified.
(2) Log storage stage
The processed logs are stored in HDFS. This stage does two things. First, it removes logs unrelated to security, including failed-request logs, commercial bill logs, and video and audio logs. Second, the logs that pass this screening are stored in blocks of fixed size, usually 64 MB.
(3) Log analysis stage
The main task of this stage is to cluster the massive control logs and perform association analysis, forming an evidence chain that can be traced and supporting the display on the upper-layer platform. The focus of the invention is the improvement of the clustering algorithm, especially of K-Means parallelized with Map/Reduce, and the improvement of K-Means targets mainly the selection of the initial cluster centers. In this stage, Map tasks are allocated elastically according to the splits into which the stored data has been divided. The Map stage performs the assignment: it repeatedly computes the distance between each sample point and the cluster centers and assigns the point accordingly. The Reduce stage rebuilds the cluster centers: the key-value pairs from the Map stage are processed by Combine and passed to Reduce as input, and Reduce recomputes the cluster centers according to the K-Means algorithm until convergence.
The principle is as follows. A threshold is set using point density and average distance. The density of a point is computed by taking the point as the center and the average distance as the radius. The point density of each sample point is computed and compared with the threshold; a point whose density is far below the threshold is regarded as an isolated point and removed. The sample mean is then recomputed and taken as the center, the distance from each point of the new sample set to the center is computed, and the distances are sorted in descending order. The initial cluster centers are then selected according to the MMD (max-min distance) algorithm, so that similarity between classes is minimized and similarity within classes is maximized. Under the Hadoop cloud computing model, iteration through Map/Reduce produces a good clustering result. In this process, the distance between two points is usually computed as the Euclidean distance. In general, for a sample set of size n, K is selected from an interval determined by n [interval omitted in the source]. The clustering procedure is as follows:
Input: sample set S
Output: K clusters
Process:
Step 1: Input the sample set S and the value K.
Step 2: Compute the sample mean of S and the average distance, compute the point density of the mean point, and set it as the threshold.
Step 3: Compute the point density of each sample and compare it with the threshold; remove the points whose density is far below the threshold to form a new sample set X.
Step 4: Recompute the sample mean of X and take it as the center; compute the distance from each point to it and sort in descending order.
Step 5: Following MMD, choose the points at the minimum and maximum distance as the first and second initial cluster centers, then find the third, fourth, and subsequent centers according to the MMD principle until K initial cluster centers have been found.
Step 6: Given K, and following the working principle of Map/Reduce, the Map stage repeatedly computes the distance from each point to the cluster centers and forms (key, value) pairs, which are sorted by the intermediate Combine stage and passed to Reduce for computation. Finally, K clustering results are output and convergence is checked: if the result has not converged, iteration continues; otherwise the result is output. A suitable association algorithm is then selected to associate the output according to association rules, forming the accountability evidence chain.
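The Map, Combine, and Reduce roles of Step 6 can be sketched as one iteration loop in plain Python. This is a single-process illustration, not Hadoop code: the function names, the `tol` and `max_iter` parameters, and the choice to have Combine pre-aggregate per-split sums and counts (a common way to reduce the data passed to Reduce) are assumptions rather than details given in the text.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kmeans_map(split, centers):
    """Map: emit (index of nearest center, (point, 1)) for each sample in a split."""
    for p in split:
        j = min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
        yield j, (p, 1)

def merge(acc, key, vec, count):
    """Add a coordinate sum and a count into the accumulator entry for `key`."""
    if key in acc:
        s, n = acc[key]
        acc[key] = (tuple(a + b for a, b in zip(s, vec)), n + count)
    else:
        acc[key] = (tuple(vec), count)

def kmeans_combine(pairs):
    """Combine: per split, pre-aggregate sums and counts and sort by center index."""
    acc = {}
    for j, (p, n) in pairs:
        merge(acc, j, p, n)
    return sorted(acc.items())

def kmeans_reduce(partials, centers):
    """Reduce: merge the partial sums and recompute each center as a mean."""
    totals = {}
    for j, (s, n) in partials:
        merge(totals, j, s, n)
    # A center that received no points keeps its previous position.
    return [tuple(x / totals[j][1] for x in totals[j][0]) if j in totals else centers[j]
            for j in range(len(centers))]

def parallel_kmeans(splits, centers, max_iter=50, tol=1e-9):
    """Repeat Map, Combine, Reduce until the centers stop moving."""
    for _ in range(max_iter):
        partials = [item for split in splits
                    for item in kmeans_combine(kmeans_map(split, centers))]
        new_centers = kmeans_reduce(partials, centers)
        shift = max(euclidean(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers

splits = [[(0, 0), (0, 1), (10, 10)], [(1, 0), (10, 11), (11, 10)]]
final_centers = parallel_kmeans(splits, centers=[(0, 0), (10, 10)])
```

Each split would be handled by a separate Map task on a cluster; only the small per-center sums and counts need to travel to Reduce.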
The definitions used in this process are as follows:
Definition 1: The basic idea of the MMD algorithm is to select samples that are far apart as initial cluster centers. For a data set $X = (x_1, x_2, x_3, \ldots, x_n)$, the algorithm determines the initial cluster centers as follows:
Step 1: Select a sample from $X$ at random as the 1st cluster center $O_1$.
Step 2: Select from $X$ the sample farthest from $O_1$ as the 2nd cluster center $O_2$.
Step 3: Compute the distances $d_{i1}$ and $d_{i2}$ from each remaining sample $x_i$ to $O_1$ and $O_2$ respectively, and take their minimum $d_i = \min(d_{i1}, d_{i2})$, $i = 1, 2, \ldots, n$.
Step 4: If $D_t = \max\{d_i\} > \alpha \lVert O_1 - O_2 \rVert$, take the sample $x_i$ that attains this maximum as the 3rd cluster center $O_3$. The proportionality coefficient $\alpha$ controls the number of clusters.
Step 5: If $q$ cluster centers ($4 \le q < k$) have been determined, compute the distances $d_{ij}$ from the remaining samples to the existing cluster centers. If $D_r = \max\{\min(d_{i1}, d_{i2}, \ldots, d_{iq})\} > \alpha \lVert O_1 - O_2 \rVert$, take the corresponding sample $x_r$ as the $(q+1)$-th cluster center.
Step 6: Repeat the same procedure until the required number of initial cluster centers has been found.
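Definition 1 translates fairly directly into code. The sketch below is one possible reading, with assumptions stated: the first center is chosen by a `first` index instead of at random so that the result is reproducible, the default `alpha` is arbitrary, and selection stops early when no remaining sample exceeds the α-scaled distance between the first two centers.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mmd_centers(samples, k, alpha=0.5, first=0):
    """Pick up to k initial centers by the max-min distance rule.

    `first` (index of the first center) and the default `alpha` are
    illustrative assumptions; the definition picks the first center at random.
    """
    centers = [samples[first]]
    if k == 1:
        return centers
    # Second center: the sample farthest from the first one.
    centers.append(max(samples, key=lambda p: euclidean(p, centers[0])))
    base = euclidean(centers[0], centers[1])
    while len(centers) < k:
        # Candidate: the sample whose nearest existing center is farthest away.
        candidate = max(samples, key=lambda p: min(euclidean(p, c) for c in centers))
        gap = min(euclidean(candidate, c) for c in centers)
        if gap <= alpha * base:
            break   # no remaining sample is far enough from the chosen centers
        centers.append(candidate)
    return centers

samples = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 9)]
initial = mmd_centers(samples, k=3)
```

With these five points the rule picks (0, 0), then (5, 9) as the farthest sample, then (10, 0), whose nearest chosen center is 10 units away. A larger `alpha` makes the acceptance test stricter and can yield fewer than k centers.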
Definition 2: The Euclidean distance is defined as $d = \sqrt{\sum_{i=1}^{n} (x_{i1} - x_{i2})^2}$, where $i = 1, 2, \ldots, n$, $x_{i1}$ is the $i$-th coordinate of the first point, and $x_{i2}$ is the $i$-th coordinate of the second point.
(4) Platform display stage
This stage is mainly for display and query. By entering a keyword, the user can query the information related to it; for example, entering the ID of a virtual machine retrieves the security logs related to that machine. The stage also displays the security posture of the system, forming a security situation awareness system with query, monitoring, dashboard, and other functions that gives the user visual interaction.
Built step by step through the four stages above, the result is a complete, visual, interactive intelligent log analysis and accountability system. By improving the K-Means algorithm and applying it in the Map/Reduce parallel computing environment, the system increases data processing speed and can display the security posture of the whole network.
Claims (3)
1. A Hadoop-based security log clustering method and accountability system, characterized by comprising:
an accountability system capable of handling dynamic data, which can track the key information that is input, find the root cause of a problem, and discover and reconstruct control behaviors such as information leakage, process blocking, and network traffic interception; and
an improved K-Means clustering algorithm which, compared with the traditional K-Means clustering algorithm, on the one hand improves the accuracy of the algorithm and thus the efficiency and quality of the accountability system, and on the other hand improves the clustering of control logs, using the storage of HDFS, a core component of the big data cloud computing framework Hadoop, and the efficient iteration of Map/Reduce to implement parallel K-Means on Map/Reduce, which performs especially well for massive control logs under the cloud computing model of big data processing.
2. The accountability system with dynamic threat awareness according to claim 1, characterized in that:
according to the accountability feature keywords that are input, the evidence chain of associated logs is found and compared against the accountability features, and the relevant devices or program interfaces in the cloud environment are held accountable at the program, system, and network levels, involving program-level API call logs, system-level virtual machine escape logs, and network-level running-status logs of security devices and network devices, thereby forming management of and accountability for dynamic log data.
3. The improved K-Means clustering algorithm according to claim 1, characterized in that:
after the control logs are vectorized, a threshold is set using point density, the sample mean, the average distance, and related measures, and isolated points are removed; the initial cluster centers are determined by the max-min distance algorithm, which improves the log clustering result; and high-quality cluster data is output through continuous Map/Reduce iteration.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711101507.6A CN110019070A (en) | 2017-11-10 | 2017-11-10 | A kind of security log clustering method based on Hadoop and system of calling to account |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711101507.6A CN110019070A (en) | 2017-11-10 | 2017-11-10 | A kind of security log clustering method based on Hadoop and system of calling to account |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110019070A true CN110019070A (en) | 2019-07-16 |
Family
ID=67185979
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711101507.6A Pending CN110019070A (en) | 2017-11-10 | 2017-11-10 | A kind of security log clustering method based on Hadoop and system of calling to account |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110019070A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111262734A (en) * | 2020-01-13 | 2020-06-09 | 北京工业大学 | Network security event emergency processing method |
CN113449098A (en) * | 2020-03-25 | 2021-09-28 | 中移(上海)信息通信科技有限公司 | Log clustering method, device, equipment and storage medium |
CN116744305A (en) * | 2023-05-05 | 2023-09-12 | 烟台欣飞智能系统有限公司 | Communication system based on safety control of 5G data communication process |
CN117033464A (en) * | 2023-08-11 | 2023-11-10 | 上海鼎茂信息技术有限公司 | Log parallel analysis algorithm based on clustering and application |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090019019A1 (en) * | 2005-07-12 | 2009-01-15 | The Diallog Works Ltd. | Method and system for obtaining information |
CN104636494A (en) * | 2015-03-04 | 2015-05-20 | 浪潮电子信息产业股份有限公司 | Log audit checking system based on Spark big data platform |
CN106067880A (en) * | 2016-06-13 | 2016-11-02 | 国家计算机网络与信息安全管理中心 | A kind of source tracing method of IP address based on 4G network |
CN106130806A (en) * | 2016-08-30 | 2016-11-16 | 四川新环佳科技发展有限公司 | Data Layer method for real-time monitoring |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111262734A (en) * | 2020-01-13 | 2020-06-09 | 北京工业大学 | Network security event emergency processing method |
CN113449098A (en) * | 2020-03-25 | 2021-09-28 | 中移(上海)信息通信科技有限公司 | Log clustering method, device, equipment and storage medium |
CN116744305A (en) * | 2023-05-05 | 2023-09-12 | 烟台欣飞智能系统有限公司 | Communication system based on safety control of 5G data communication process |
CN116744305B (en) * | 2023-05-05 | 2024-01-26 | 烟台欣飞智能系统有限公司 | Communication system based on safety control of 5G data communication process |
CN117033464A (en) * | 2023-08-11 | 2023-11-10 | 上海鼎茂信息技术有限公司 | Log parallel analysis algorithm based on clustering and application |
CN117033464B (en) * | 2023-08-11 | 2024-04-02 | 上海鼎茂信息技术有限公司 | Log parallel analysis algorithm based on clustering and application |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhang et al. | An incremental CFS algorithm for clustering large data in industrial internet of things | |
CN103235825B (en) | A kind of magnanimity face recognition search engine design method based on Hadoop cloud computing framework | |
CN110019070A (en) | A kind of security log clustering method based on Hadoop and system of calling to account | |
CN108595539A (en) | A kind of recognition methods of trace analogical object and system based on big data | |
CN110825769A (en) | Data index abnormity query method and system | |
CN110414987A (en) | Recognition methods, device and the computer system of account aggregation | |
Shakya et al. | Feature selection based intrusion detection system using the combination of DBSCAN, K-Mean++ and SMO algorithms | |
Farid et al. | Mining complex data streams: discretization, attribute selection and classification | |
CN111401149B (en) | Lightweight video behavior identification method based on long-short-term time domain modeling algorithm | |
CN110134719A (en) | A kind of identification of structural data Sensitive Attributes and stage division of classifying | |
CN110377605A (en) | A kind of Sensitive Attributes identification of structural data and classification stage division | |
CN112685459A (en) | Attack source feature identification method based on K-means clustering algorithm | |
CN107895008A (en) | Information hotspot discovery method based on big data platform | |
CN106649527A (en) | Detection system and detection method of advertisement clicking anomaly based on Spark Streaming | |
Las-Casas et al. | A big data architecture for security data and its application to phishing characterization | |
Du et al. | Research on decision tree algorithm based on information entropy | |
Jiang | Credit scoring model based on the decision tree and the simulated annealing algorithm | |
CN105426966A (en) | Association rule digging method based on improved genetic algorithm | |
CN103186772A (en) | Face recognition system and method based on cluster framework | |
Rahman et al. | An efficient approach for selecting initial centroid and outlier detection of data clustering | |
Wang et al. | An intrusion detection system for the internet of things based on the ensemble of unsupervised techniques | |
Zhang et al. | An improved PAM clustering algorithm based on initial clustering centers | |
Liao et al. | Abnormal transaction detection of Bitcoin network based on feature fusion | |
CN114581013A (en) | Physical credible traceability warehouse management device based on unstructured block chain characteristics | |
Chen et al. | Research and application of cluster analysis algorithm |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190716 |
DD01 | Delivery of document by public notice | Addressee: Lu Xie; Document name: Deemed withdrawal notice |