CN107104978B - Network risk early warning method based on deep learning - Google Patents

Info

Publication number
CN107104978B
Authority
CN
China
Prior art keywords
risk
network
sample data
training
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710375043.1A
Other languages
Chinese (zh)
Other versions
CN107104978A (en)
Inventor
赖洪昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201710375043.1A
Publication of CN107104978A
Application granted
Publication of CN107104978B
Active legal status
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a network risk early warning method based on deep learning, which comprises the following steps: A1. collecting network space asset risk sample data for a whole network segment, and storing the sample data in a database; A2. extracting data from the database, and performing distributed training of a convolutional neural network to form an initial risk prediction model; A3. inputting production data into the risk prediction model, evaluating the risk value of the production data, and raising an alarm if an early warning threshold is reached. With this method and equipment, security risk assessment and early warning can be carried out on multiple target networks or on targets without obvious vulnerabilities, and the security state of a network can be assessed as a whole; response speed is improved and risk points are found quickly; at the same time, maintenance cost and labor are reduced.

Description

Network risk early warning method based on deep learning
Technical Field
The invention relates to network risk early warning technology, and in particular to a network risk early warning method and system that apply machine deep learning within a given area.
Background
In the current network security field, whether a target is safe is usually determined by traditional means such as vulnerability scanning and port scanning. These means are effective for a single target and for obvious vulnerabilities, but they cannot quickly and comprehensively determine the security state of batches of targets or of targets without obvious vulnerabilities.
Disclosure of Invention
To solve the above problems, the invention provides a network risk early warning method and device based on deep learning, which can quickly and comprehensively obtain the security state of batches of targets or of targets without obvious vulnerabilities.
The invention provides a network risk early warning method based on deep learning, characterized by comprising the following steps: A1. collecting network space asset risk sample data for a whole network segment, and storing the sample data in a database; A2. extracting data from the database, and performing distributed training of a Convolutional Neural Network (CNN) to form a risk prediction model; A3. inputting production data into the risk prediction model, evaluating the risk value of the production data, and raising an alarm if an early warning threshold is reached.
Preferably, step A1 includes: A11. determining risk elements, and collecting network space asset risk sample data of the whole network segment; A12. performing vulnerability scanning on the collected risk sample data, and assigning security levels.
Further preferably, the risk elements include one or more of: a target IP, an open port, a server system type and version, a server application type and version, existing vulnerabilities, a database type and version, a weak password, whether CDN acceleration is employed, and whether a firewall is employed.
Further preferably, there are four security levels: high-risk, medium-risk, low-risk and safe; the ratio of samples across the four levels is 1:1:1:1, and each level contains at least 5000 samples.
Further preferably, step A1 further includes: A13. converting the collected network space asset risk sample data into binary sample data that deep learning can recognize.
Still more preferably, step A13 includes: A131. performing picture processing on the sample, and cropping it to a uniform size; A132. whitening the cropped picture.
Preferably, the distributed training of step A2 is performed by gradient descent, and the initial gradient is 10^-4.
Preferably, step A2 includes: A21. preparing a training environment, wherein training is carried out in TensorFlow GPU mode; A22. extracting training sample data from the database, and performing model training with a convolutional neural network to obtain a risk prediction model; A23. extracting test sample data from the database, and evaluating and testing the risk prediction model.
Still more preferably, step A22 includes: A221. the model network structure uses 3 convolution layers, wherein the first convolution layer uses a 3 x 3 convolution kernel and the second convolution layer uses a 2 x 2 convolution kernel; each convolution layer is followed by a max pooling layer; these are followed by two hidden layers and an output layer; the convolution layers use 32, 64 and 128 feature maps respectively; A222. performing regression using a softmax function, wherein the final output layer does not need softmax regression; A223. training with the training sample data to obtain an initial risk prediction model.
The invention also provides a computer-readable storage medium containing a computer program which is executed by a computer to implement the method as described above.
The beneficial effects of the invention are as follows: network space asset risk samples of the whole network segment are collected, distributed training is carried out with a Convolutional Neural Network (CNN), and self-learning and adjustment are carried out by combining all local results with neural network analysis, yielding a comprehensive, integrated risk prediction model. The risk prediction model can carry out security risk assessment and early warning on multiple target networks or on targets without obvious vulnerabilities, and can assess the security state of a network as a whole; response speed is increased, risk points are found quickly, and the efficiency and accuracy of network security situation analysis and prediction are improved; at the same time, maintenance cost and labor are reduced.
Further advantages are obtained in a further preferred embodiment. The greatest obstacle to using a CNN for network security assessment and early warning is constructing learning samples for the application scenario. The invention limits the risk elements of a risk sample to: target IP, open ports, server system type and version, server application type and version, existing vulnerabilities, database type and version, weak passwords, whether CDN acceleration is employed, and whether a firewall is employed. This saves CNN distributed training time and improves the accuracy of the security assessment and early warning results.
Drawings
Fig. 1 is a schematic flow chart of a deep learning-based network risk early warning method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a convolutional neural network distributed training learning process according to an embodiment of the present invention.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments and the attached drawings. It should be emphasized that the following description is only exemplary and is not intended to limit the scope and application of the present invention.
As shown in fig. 1, the embodiment provides a network risk early warning method based on deep learning, which includes the following steps:
Step 1, collecting the network space asset risk sample data of the whole network segment.
Step 1-1, establishing a database of the network space asset risk sample data, and identifying risk points of the assets to determine risk elements, wherein the risk elements comprise: target IP, open port, server system type and version, server application type and version, existing vulnerabilities, database type and version, weak password, whether CDN acceleration is employed, and whether a firewall is employed. Network space asset risk sample data of the whole network segment is then collected according to these risk elements.
Forming the sample data from risk elements that may have serious consequences for network security helps ensure that the later deep learning predictions are realistic and reliable. Some risk elements appear harmless on their own, but can create fatal vulnerabilities when combined.
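As an illustration of how one collected record might be represented, the following minimal sketch models the nine risk elements of Step 1-1 as a Python data class. The class name, field names, and the `missing_elements` helper are hypothetical; the patent does not specify a storage schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RiskSample:
    """One asset record holding the nine risk elements named in Step 1-1."""
    target_ip: str
    open_ports: List[int] = field(default_factory=list)
    server_system: Optional[str] = None       # server system type and version
    server_application: Optional[str] = None  # server application type and version
    vulnerabilities: List[str] = field(default_factory=list)  # existing vulnerabilities
    database: Optional[str] = None            # database type and version
    weak_password: bool = False
    uses_cdn: bool = False
    has_firewall: bool = False


def missing_elements(sample: RiskSample) -> List[str]:
    """Return the names of the text-valued elements that were not collected."""
    names = ("server_system", "server_application", "database")
    return [name for name in names if getattr(sample, name) is None]


sample = RiskSample(
    target_ip="192.0.2.10",  # documentation-only address range
    open_ports=[22, 443],
    server_system="Linux 4.4",
    weak_password=True,
)
print(missing_elements(sample))
```

Keeping every element in a single record makes it straightforward to detect incomplete samples before they reach the cleaning and conversion steps.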
Network space risk sample data is collected as follows: sample collection uses techniques for detecting the service types and version information running on target network hosts, techniques for detecting information such as operating system and device type, techniques for identifying security vulnerabilities of the target host, and techniques for identifying CDNs (content delivery networks) and firewalls. Distributed technology is used to ensure that the collected samples are real-time.
Step 1-2, carrying out vulnerability scanning on the collected risk sample data, and dividing it into four security levels: high-risk, medium-risk, low-risk and safe. The ratio of samples across the four security levels is 1:1:1:1, and each level contains at least 5000 samples.
Through the division into security levels, each level is associated with specified insecurity factors and the maximum loss a vulnerability may cause. A user can gain a preliminary understanding of the classification a vulnerability belongs to and the loss it may cause, and can take specific defensive measures to reduce exposure to network risk.
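The 1:1:1:1 ratio and the 5000-sample minimum of Step 1-2 can be checked mechanically before training. The sketch below is one possible check; the function name and the label strings are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable

LEVELS = ("high-risk", "medium-risk", "low-risk", "safe")
MIN_PER_LEVEL = 5000  # lower bound stated in Step 1-2


def check_balance(labels: Iterable[str], min_per_level: int = MIN_PER_LEVEL) -> bool:
    """True if all four levels have equal counts and each reaches the minimum."""
    counts = Counter(labels)
    unknown = set(counts) - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown security level(s): {sorted(unknown)}")
    per_level = [counts.get(level, 0) for level in LEVELS]
    return len(set(per_level)) == 1 and per_level[0] >= min_per_level


balanced = [level for level in LEVELS for _ in range(5000)]
print(check_balance(balanced))       # four levels, 5000 samples each
print(check_balance(balanced[:-1]))  # one "safe" sample removed
```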
Step 1-3, converting the network space asset risk sample data into binary sample data that deep learning can recognize. Data is collected from the nodes, aggregated on a control server, cleaned, and then stored.
The task of data cleansing is to filter out unsatisfactory data, mainly incomplete data, erroneous data, duplicate data, and the like.
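A minimal sketch of such a cleaning pass is given below. It assumes records are dictionaries with hypothetical `target_ip` and `open_ports` keys, and it treats a malformed IP address or an out-of-range port as erroneous data; the patent does not define these rules.

```python
import ipaddress
from typing import Dict, List

REQUIRED = ("target_ip", "open_ports")  # hypothetical required fields


def is_valid(record: Dict) -> bool:
    """Reject incomplete records and records with malformed values."""
    if any(record.get(key) is None for key in REQUIRED):
        return False  # incomplete data
    try:
        ipaddress.ip_address(record["target_ip"])
    except ValueError:
        return False  # erroneous data: not an IP address
    return all(isinstance(p, int) and 0 < p < 65536 for p in record["open_ports"])


def clean(records: List[Dict]) -> List[Dict]:
    """Keep the first copy of each valid record, preserving input order."""
    seen = set()
    kept = []
    for record in records:
        if not is_valid(record):
            continue
        key = (record["target_ip"], tuple(sorted(record["open_ports"])))
        if key in seen:
            continue  # duplicate data
        seen.add(key)
        kept.append(record)
    return kept


raw = [
    {"target_ip": "192.0.2.1", "open_ports": [80, 443]},
    {"target_ip": "192.0.2.1", "open_ports": [443, 80]},  # duplicate
    {"target_ip": "192.0.2.999", "open_ports": [22]},     # invalid address
    {"target_ip": "192.0.2.2", "open_ports": None},       # incomplete
    {"target_ip": "192.0.2.3", "open_ports": [70000]},    # invalid port
]
print(clean(raw))
```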
Because most of the result data in the database is text or numbers with many possible combinations, quantifying the sample parameters is very difficult, and it is hard to form a learning model for deep learning directly. The sample data is therefore converted into pictures.
1) Sample picture processing: the sample pictures are uniformly cropped to 100x100 pixels, with the center region cropped for evaluation or a random region cropped for training.
2) Approximate whitening is applied to the picture, so that the model is insensitive to changes in the picture's dynamic range.
Step 2, extracting data from the database, and performing distributed training of the convolutional neural network to form a risk prediction model.
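The cropping and approximate whitening of items 1) and 2) in Step 1-3 can be sketched as follows. The sketch assumes a grayscale picture held as nested lists, and it assumes that "approximate whitening" means scaling each picture to zero mean and roughly unit variance; the patent does not give a formula.

```python
import math
import random
from typing import List, Optional

Image = List[List[float]]  # grayscale picture as rows of pixel values
SIZE = 100  # crop size given in the text


def crop(image: Image, size: int = SIZE, rng: Optional[random.Random] = None) -> Image:
    """Center crop when rng is None (evaluation), random crop otherwise (training)."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("picture is smaller than the crop size")
    if rng is None:
        top, left = (height - size) // 2, (width - size) // 2
    else:
        top, left = rng.randint(0, height - size), rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]


def approx_whiten(image: Image) -> Image:
    """Scale pixels to zero mean and roughly unit variance."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    std = max(std, 1.0 / math.sqrt(n))  # avoids division by zero on flat pictures
    return [[(p - mean) / std for p in row] for row in image]


picture = [[float((x + y) % 256) for x in range(120)] for y in range(120)]
eval_input = approx_whiten(crop(picture))                         # center crop
train_input = approx_whiten(crop(picture, rng=random.Random(0)))  # random crop
```

Because the mean and spread are removed per picture, two pictures that differ only in overall brightness or contrast produce nearly the same input, which is the insensitivity to dynamic range that item 2) describes.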
A good learning model improves both the learning speed and the accuracy of the learning result; the number of samples must also be considered. Taking these factors together, the CNN is considered the most suitable deep learning model at present. This step trains on pictures, taking advantage of the convolutional neural network's strength in picture recognition.
The samples for the learning model are divided into two types: training samples and test samples. Training samples are the sample data used in the debugging and training stages; they are used to adjust the deep learning functions and methods so that the final result moves in the correct direction. Test samples are the sample data used in the evaluation stage; they verify whether the accuracy is sufficient for network risk assessment and early warning. The independent and dependent variables are known for both types of samples.
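The division into training and test samples can be sketched as a seeded shuffle followed by a split. The 20% test fraction and the function name are assumptions; the patent does not state the proportion used.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(samples: Sequence[T], test_fraction: float = 0.2,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle labeled samples and split them into training and test sets."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Each sample pairs a picture name with its known security level index.
labeled = [(f"picture_{i}.png", i % 4) for i in range(20)]
train, test = split_samples(labeled)
print(len(train), len(test))
```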
Using the partitioned samples, a risk prediction model is formed through machine training; the process is shown in fig. 2.
Step 2-1, preparing a training environment. Training is performed in TensorFlow GPU mode; the GPU computes faster than the CPU, which reduces the time cost of the training process.
Step 2-2, model training stage. Model training is performed in the training environment using the training samples prepared in Step 2, in combination with a convolutional neural network. The training process is as follows:
1) The model network structure is defined with 3 convolutional layers: the first convolutional layer uses 3 × 3 convolution kernels and the second uses 2 × 2 convolution kernels; each convolutional layer is followed by a max-pooling layer; these are followed by two hidden layers and an output layer. The convolutional layers use 32, 64 and 128 feature maps respectively.
2) Regression is performed using the softmax function, and the final output layer does not need softmax regression.
3) Once the model network structure is defined, training is carried out to obtain an initial risk prediction model.
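To make the structure in item 1) concrete, the following sketch traces the feature map sizes through the three convolution and pooling stages for a 100x100 input. Several values are assumptions that the patent does not state: a single input channel, unpadded convolutions with stride 1, 2x2 max pooling with stride 2, and a 2x2 kernel for the third convolutional layer.

```python
from typing import List, Tuple

# (kernel size, feature maps). The third kernel size is not given in the text;
# 2 is an assumption made only so the walk-through can run.
CONV_LAYERS: List[Tuple[int, int]] = [(3, 32), (2, 64), (2, 128)]
POOL = 2            # assumed 2x2 max pooling with stride 2
INPUT_SIZE = 100    # 100x100 picture from Step 1-3
INPUT_CHANNELS = 1  # assumed single-channel picture


def trace_shapes() -> List[Tuple[str, int, int]]:
    """Return (layer name, spatial size, channels) after each conv and pool layer."""
    size, channels = INPUT_SIZE, INPUT_CHANNELS
    shapes = []
    for index, (kernel, maps) in enumerate(CONV_LAYERS, start=1):
        size = size - kernel + 1  # unpadded convolution, stride 1
        channels = maps
        shapes.append((f"conv{index}", size, channels))
        size = size // POOL       # max pooling halves each spatial side
        shapes.append((f"pool{index}", size, channels))
    return shapes


def conv_parameters() -> int:
    """Count weights plus biases across the three convolution layers."""
    total, in_channels = 0, INPUT_CHANNELS
    for kernel, maps in CONV_LAYERS:
        total += kernel * kernel * in_channels * maps + maps
        in_channels = maps
    return total


for name, size, channels in trace_shapes():
    print(f"{name}: {size}x{size}x{channels}")
final_size, final_channels = trace_shapes()[-1][1:]
print("flattened features:", final_size * final_size * final_channels)
print("conv parameters:", conv_parameters())
```

Under these assumptions the last pooling layer produces an 11x11x128 volume, which is what the two hidden layers would receive after flattening.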
Accuracy is optimized by gradient descent, with an initial gradient of 10^-4. For distributed CNN training, linear regression is first performed on the training data by gradient descent until an equilibrium state is reached, the factors with a large influence on the training result are identified, and that data is used as the CNN input for distributed training. Data-parallel distributed training keeps a copy of the model on each GPU working node, processes a different part of the data on each node, combines the results of the working nodes, and synchronizes model parameters among the nodes; this speeds up data training and model building.
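The data-parallel scheme described above (a model copy per node, a different data shard per node, combined results, synchronized parameters) can be illustrated with a toy one-parameter model. This is a sketch of synchronous gradient averaging, not the patent's implementation; treating the quoted 10^-4 as the step size is an assumption.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair for a one-parameter model y = w * x
STEP = 1e-4  # assumed to correspond to the 10^-4 value quoted in the text


def gradient(w: float, shard: Sequence[Sample]) -> float:
    """Gradient of the mean squared error (w*x - y)^2 over one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def parallel_step(w: float, shards: List[Sequence[Sample]], step: float = STEP) -> float:
    """One synchronous update across all workers.

    Every worker holds a copy of w and computes a gradient on its own shard;
    the averaged gradient is then applied so all copies stay identical.
    """
    local_gradients = [gradient(w, shard) for shard in shards]  # one per worker
    averaged = sum(local_gradients) / len(local_gradients)      # combine results
    return w - step * averaged                                  # synchronized parameter


data = [(float(x), 3.0 * x) for x in range(1, 9)]  # generated with weight 3
shards = [data[0:4], data[4:8]]                    # two equally sized shards
w = 0.0
for _ in range(2000):
    w = parallel_step(w, shards)
print(round(w, 3))
```

With equally sized shards, averaging the per-worker gradients gives the same update as a single node processing the full batch, so the parallel run follows the same trajectory while splitting the work.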
Step 2-3, evaluating the model. In the training environment, the test samples prepared in Step 2 are used to evaluate the initial risk prediction model obtained in Step 2-2 and to determine whether its accuracy is acceptable. The test method is as follows: the test samples are input into the initial risk prediction model and, once the result is output, it is judged whether the result matches the expectation. If it matches, the model is put into production for network risk early warning. If not, the process returns to Step 2-2 for algorithm optimization until the output matches the expectation. There are four types of output: safe, low-risk, medium-risk and high-risk.
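The evaluation loop of Step 2-3 can be sketched as follows: raw model outputs are converted to class probabilities with softmax, the most probable of the four classes is compared with the expected label, and the resulting accuracy decides whether the model is deployed. The 0.9 acceptance bar and the example scores are hypothetical; the patent gives no figure.

```python
import math
from typing import List, Sequence

CLASSES = ("safe", "low-risk", "medium-risk", "high-risk")  # the four outputs
MIN_ACCURACY = 0.9  # hypothetical acceptance bar; the text gives no figure


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits: Sequence[float]) -> str:
    """Return the class with the highest probability."""
    probs = softmax(logits)
    return CLASSES[probs.index(max(probs))]


def accuracy(all_logits: Sequence[Sequence[float]], expected: Sequence[str]) -> float:
    """Fraction of test samples whose predicted class matches the expected one."""
    hits = sum(predict(row) == label for row, label in zip(all_logits, expected))
    return hits / len(expected)


test_logits = [[2.0, 0.1, 0.0, -1.0], [0.0, 0.2, 3.0, 0.5], [0.1, 0.0, 0.3, 2.5]]
test_labels = ["safe", "medium-risk", "low-risk"]
score = accuracy(test_logits, test_labels)
print("deploy" if score >= MIN_ACCURACY else "return to Step 2-2")
```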
The accuracy of the risk prediction model established by this method, and the time needed to form it, depend on which risk elements make up the sample data. The results are as follows:
From the above table it can be seen that: when the risk elements comprise a target IP, an open port, a server system type and version, a server application type and version, existing vulnerabilities, a database type and version, a weak password, whether CDN acceleration is employed, and whether a firewall is employed, the accuracy of the risk prediction model is high and the learning time is short; when any one of these risk elements is missing, the result lacks accuracy.
When risk elements beyond these are included, experimental results show that the learning time is long, forming a risk prediction model takes longer, and the cost is high. By choosing the appropriate risk elements (target IP, open port, server system type and version, server application type and version, existing vulnerabilities, database type and version, weak password, whether CDN acceleration is employed, and whether a firewall is employed), the learning time is short and the resulting risk prediction model is highly accurate; that is, the method is fast and accurate.
Step 3, inputting the production data into the risk prediction model, evaluating the risk value of the production data, and raising an alarm if the early warning threshold is reached.
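A minimal sketch of the threshold check in Step 3 is shown below. The patent does not define how the risk value is computed from the model output, so the level weights, the probability-weighted average, and the 0.6 threshold are all hypothetical.

```python
from typing import Dict, Sequence

# Hypothetical weights turning class probabilities into one risk value in [0, 1].
LEVEL_WEIGHTS: Dict[str, float] = {
    "safe": 0.0, "low-risk": 1 / 3, "medium-risk": 2 / 3, "high-risk": 1.0,
}
LEVEL_ORDER = ("safe", "low-risk", "medium-risk", "high-risk")
ALARM_THRESHOLD = 0.6  # hypothetical early warning threshold


def risk_value(probabilities: Sequence[float]) -> float:
    """Expected risk: probability-weighted average of the level weights."""
    if abs(sum(probabilities) - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")
    return sum(p * LEVEL_WEIGHTS[name] for p, name in zip(probabilities, LEVEL_ORDER))


def should_alarm(probabilities: Sequence[float], threshold: float = ALARM_THRESHOLD) -> bool:
    """Alarm when the risk value reaches the threshold."""
    return risk_value(probabilities) >= threshold


print(should_alarm([0.05, 0.05, 0.20, 0.70]))  # mostly high-risk
print(should_alarm([0.80, 0.15, 0.05, 0.00]))  # mostly safe
```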
The foregoing is a more detailed description of the invention in connection with specific/preferred embodiments and is not intended to limit the practice of the invention to those descriptions. It will be apparent to those skilled in the art that various substitutions and modifications can be made to the described embodiments without departing from the spirit of the invention, and these substitutions and modifications should be considered to fall within the scope of the invention.

Claims (9)

1. A network risk early warning method based on deep learning is characterized by comprising the following steps:
A1. collecting asset risk sample data of a network space of a whole network segment, and storing the sample data into a database;
A2. extracting data from a database, and performing convolutional neural network distributed training learning to form a risk prediction model;
A3. inputting production data into a risk prediction model, evaluating the risk value of the production data, and giving an alarm if an early warning threshold value is reached;
the step A1 includes:
A11. determining risk elements, and collecting network space asset risk sample data of the whole network segment;
A12. carrying out vulnerability scanning on the collected risk sample data, and dividing the security level.
2. The method of claim 1, wherein the risk elements comprise: one or more of a target IP, an open port, a server system type and version, a server application type and version, existing vulnerabilities, a database type and version, a weak password, whether CDN acceleration is employed, and a firewall.
3. The method of claim 1, wherein the security level is divided into four security levels: high-risk, medium-risk, low-risk and safe; the ratio of the four security levels is 1:1:1:1, and the number in each security level is greater than or equal to 5000.
4. The method of claim 1, wherein said step A1 further comprises:
A13. converting the network space asset risk sample data into binary sample data which can be identified by deep learning.
5. The method of claim 4, wherein said step A13 comprises:
A131. performing picture processing on the sample, and cropping the sample to a uniform size;
A132. whitening the cropped picture.
6. The method of claim 1, wherein the distributed training learning of step A2 is performed by gradient descent, and the initial gradient is 10^-4.
7. The method of claim 1, wherein said step A2 comprises:
A21. preparing a training environment, wherein training is carried out in TensorFlow GPU mode;
A22. extracting training sample data from a database, and performing model training by combining a convolutional neural network to obtain an initial risk prediction model;
A23. extracting test sample data from the database, and performing an evaluation test on the initial risk prediction model.
8. The method of claim 7, wherein said step A22 comprises:
A221. the model network structure adopts 3 convolution layers, wherein the first convolution layer adopts a convolution kernel of 3 x 3, the second convolution layer adopts a convolution kernel of 2 x 2, each convolution layer is followed by a max pooling layer, these are followed by two hidden layers and an output layer, and the convolution layers adopt 32, 64 and 128 feature maps respectively;
A222. performing regression by using a softmax function, wherein the final output layer does not need softmax regression;
A223. training is carried out by using training sample data to obtain an initial risk prediction model.
9. A computer-readable storage medium containing a computer program, the computer program being executable by a computer to perform the method of any one of claims 1 to 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710375043.1A CN107104978B (en) 2017-05-24 2017-05-24 Network risk early warning method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710375043.1A CN107104978B (en) 2017-05-24 2017-05-24 Network risk early warning method based on deep learning

Publications (2)

Publication Number Publication Date
CN107104978A CN107104978A (en) 2017-08-29
CN107104978B CN107104978B (en) 2019-12-24

Family

ID=59669379

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710375043.1A Active CN107104978B (en) 2017-05-24 2017-05-24 Network risk early warning method based on deep learning

Country Status (1)

Country Link
CN (1) CN107104978B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107707553B (en) * 2017-10-18 2020-02-07 北京启明星辰信息安全技术有限公司 Weak password scanning method and device and computer storage medium
CN107948587B (en) * 2017-11-15 2019-12-27 中国联合网络通信集团有限公司 Risk assessment method, device and system for monitoring equipment
CN108364106A (en) * 2018-02-27 2018-08-03 平安科技(深圳)有限公司 A kind of expense report Risk Forecast Method, device, terminal device and storage medium
CN110598959A (en) * 2018-05-23 2019-12-20 中国移动通信集团浙江有限公司 Asset risk assessment method and device, electronic equipment and storage medium
CN108897614A (en) * 2018-05-25 2018-11-27 福建天晴数码有限公司 A kind of memory method for early warning and server-side based on convolutional neural networks
CN110875912A (en) * 2018-09-03 2020-03-10 中移(杭州)信息技术有限公司 Network intrusion detection method, device and storage medium based on deep learning
CN109472396B (en) * 2018-10-17 2023-06-20 成都卡普数据服务有限责任公司 Mountain fire prediction method based on deep network learning
CN109993412A (en) * 2019-03-01 2019-07-09 百融金融信息服务股份有限公司 The construction method and device of risk evaluation model, storage medium, computer equipment
CN110399252A (en) * 2019-07-19 2019-11-01 广东浪潮大数据研究有限公司 A kind of data back up method, device, equipment and computer readable storage medium
CN110543565A (en) * 2019-08-30 2019-12-06 广西电网有限责任公司南宁供电局 Auditing method, system and readable storage medium based on convolutional neural network model
CN111400572A (en) * 2020-02-28 2020-07-10 开普云信息科技股份有限公司 Content safety monitoring system and method for realizing image feature recognition based on convolutional neural network
CN112001565A (en) * 2020-09-08 2020-11-27 清华大学合肥公共安全研究院 Earthquake disaster loss prediction and evaluation method and system based on Softmax regression model
CN112565255A (en) * 2020-12-04 2021-03-26 广东电网有限责任公司珠海供电局 Electric power Internet of things equipment safety early warning method based on BP neural network
CN113361855A (en) * 2021-05-07 2021-09-07 浙江警官职业学院 Short, medium and long-term risk warning method and device
CN113065666A (en) * 2021-05-11 2021-07-02 海南善沙网络科技有限公司 Distributed computing method for training neural network machine learning model
CN113225358B (en) * 2021-07-09 2021-09-03 四川大学 Network security risk assessment system
CN113542278B (en) * 2021-07-16 2023-04-25 北京源堡科技有限公司 Network security assessment method, system and device
CN113506039A (en) * 2021-08-03 2021-10-15 杭银消费金融股份有限公司 Risk prediction method and device based on intelligent AI machine learning
CN115396242B (en) * 2022-10-31 2023-04-07 江西神舟信息安全评估中心有限公司 Data identification method and network security vulnerability detection method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104873186A (en) * 2015-04-17 2015-09-02 中国科学院苏州生物医学工程技术研究所 Wearable artery detection device and data processing method thereof
CN204708828U (en) * 2015-04-17 2015-10-21 中国科学院苏州生物医学工程技术研究所 A kind of wearable noinvasive arterial health checkout gear
CN105894372A (en) * 2016-06-13 2016-08-24 腾讯科技(深圳)有限公司 Method and device for predicting group credit

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10679140B2 (en) * 2014-10-06 2020-06-09 Seagate Technology Llc Dynamically modifying a boundary of a deep learning network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
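Malicious APP detection scheme based on deep learning technology; 张海舰 (Zhang Haijian) et al.; 《网络安全技术与应用》 (Network Security Technology & Application); 20170315; P108 *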

Also Published As

Publication number Publication date
CN107104978A (en) 2017-08-29

Similar Documents

Publication Publication Date Title
CN107104978B (en) Network risk early warning method based on deep learning
CN106776842B (en) Multimedia data detection method and device
CN109902018B (en) Method for acquiring test case of intelligent driving system
CN106548343B (en) Illegal transaction detection method and device
CN104901971B (en) The method and apparatus that safety analysis is carried out to network behavior
CN107645503A (en) A kind of detection method of the affiliated DGA families of rule-based malice domain name
CN111177714A (en) Abnormal behavior detection method and device, computer equipment and storage medium
CN105072214A (en) C&C domain name identification method based on domain name feature
CN109117634A (en) Malware detection method and system based on network flow multi-view integration
CN104123501B (en) A kind of viral online test method based on many assessor set
CN113434859A (en) Intrusion detection method, device, equipment and storage medium
CN113704082A (en) Model evaluation method and device, electronic equipment and storage medium
Lengyel et al. Assessing the relative importance of methodological decisions in classifications of vegetation data
CN110943974B (en) DDoS (distributed denial of service) anomaly detection method and cloud platform host
CN110598959A (en) Asset risk assessment method and device, electronic equipment and storage medium
CN112217650A (en) Network blocking attack effect evaluation method, device and storage medium
CN113988616A (en) Enterprise risk assessment system and method based on industry data
CN111835781B (en) Method and system for discovering host of same source attack based on lost host
KR102177998B1 (en) Learning methods, preprocessing methods, learning devices and preprocessing devices for detecting syn flood attacks based on machine learning models
CN110808947B (en) Automatic vulnerability quantitative evaluation method and system
CN115484112B (en) Payment big data safety protection method, system and cloud platform
Song et al. A comprehensive approach to detect unknown attacks via intrusion detection alerts
KR102433581B1 (en) Social advanced persistent threat prediction system and method using time-series learning-type ensemble AI techniques
CN114880637A (en) Account risk verification method and device, computer equipment and storage medium
CN111209567B (en) Method and device for judging perceptibility of improving robustness of detection model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant