CN114021149B - System for predicting industrial control network bugs based on correction parameters - Google Patents

System for predicting industrial control network bugs based on correction parameters

Info

Publication number
CN114021149B
Authority
CN
China
Prior art keywords
vulnerability
industrial control
control network
internet
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111358159.7A
Other languages
Chinese (zh)
Other versions
CN114021149A (en
Inventor
李峰
王绍密
侯绪森
时伟强
宋衍龙
胡建秋
袁晓露
王善军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yuntian Safety Technology Co ltd
Original Assignee
Shandong Yuntian Safety Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yuntian Safety Technology Co ltd filed Critical Shandong Yuntian Safety Technology Co ltd
Priority to CN202111358159.7A priority Critical patent/CN114021149B/en
Publication of CN114021149A publication Critical patent/CN114021149A/en
Application granted granted Critical
Publication of CN114021149B publication Critical patent/CN114021149B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577Assessing vulnerabilities and evaluating computer system security

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a system for predicting industrial control network vulnerabilities based on correction parameters, which implements the following steps: step S201, obtaining from a database the P_m parameter value list corresponding to each sample vulnerability id in a preset training period; step S202, determining a correction parameter P_CVE corresponding to each sample vulnerability id, determining the corresponding training parameter values PC_m, and generating a model input vector for each sample vulnerability id based on PC_m; step S203, training on the correction parameters P_CVE, the model input vectors and the industrial control network vulnerability outbreak probability truth values corresponding to all sample vulnerability ids to obtain an industrial control network vulnerability prediction model; and step S204, predicting the outbreak probability of industrial control network vulnerabilities based on the industrial control network vulnerability prediction model. The system can quickly and accurately predict the vulnerability outbreak probability of the industrial control network and improve the security of the industrial control network.

Description

System for predicting industrial control network bugs based on correction parameters
Technical Field
The invention relates to the technical field of computers, in particular to a system for predicting industrial control network bugs based on correction parameters.
Background
With the accelerating fusion of manufacturing technologies and new-generation information technologies such as cloud computing, big data, artificial intelligence and the Internet of Things, industrial control systems are moving from closed to open, from standalone to interconnected, and from automated to intelligent. While industrial enterprises gain great momentum for development, a large number of security risks have also emerged. From the Stuxnet virus at Iranian nuclear plants in 2010 to the Havex virus in Europe in 2014, attacks on the networks of industrial control systems (hereinafter referred to as industrial control networks) have become increasingly severe, and industrial control systems urgently need security protection.
System vulnerabilities of industrial control systems are an important factor affecting the security of the industrial control network. Unlike in IT systems, vulnerabilities in an industrial control network cannot be repaired promptly, and a large number of vulnerabilities persist for a long time. If vulnerability outbreaks in the industrial control network cannot be predicted in time so that corresponding defense measures can be taken, the security of the industrial control network cannot be ensured. How to predict vulnerability outbreaks in the industrial control network accurately and efficiently has therefore become a technical problem to be solved urgently.
Disclosure of Invention
The invention aims to provide a system for predicting industrial control network vulnerabilities based on correction parameters, which can quickly and accurately predict the outbreak probability of industrial control network vulnerabilities, so that corresponding defense measures can be taken based on that probability and the security of the industrial control network is improved.
According to one aspect of the invention, a system for predicting industrial control network vulnerabilities based on correction parameters comprises a processor, a database and a storage medium storing a computer program, wherein the processor is communicatively connected to the database; the database stores an internet vulnerability characteristic parameter list, an internet vulnerability actual outbreak probability list and an industrial control network vulnerability actual outbreak probability list corresponding to all internet vulnerability ids, where P_m is the m-th internet vulnerability characteristic parameter, m ranges from 1 to M, and M is the number of internet vulnerability characteristic parameters; the computer program stored in the storage medium includes a second computer program, and the processor implements the following steps when executing the second computer program:
step S201, obtaining from the database the P_m parameter value list, the internet vulnerability actual outbreak probability list, the industrial control network vulnerability actual outbreak probability list and the industrial control network vulnerability outbreak probability truth value corresponding to each sample vulnerability id in a preset training period;
step S202, determining a correction parameter P_CVE corresponding to each sample vulnerability id based on the internet vulnerability actual outbreak probability list and the industrial control network vulnerability actual outbreak probability list, determining the corresponding training parameter value PC_m based on the P_m parameter value list corresponding to each sample vulnerability id, and generating a model input vector for each sample vulnerability id based on PC_m;
step S203, training on the correction parameters P_CVE, the model input vectors and the industrial control network vulnerability outbreak probability truth values corresponding to all sample vulnerability ids to obtain an industrial control network vulnerability prediction model:
f(x) = b_0*P_CVE + b_1*x_1 + b_2*x_2 + … + b_M*x_M;
wherein x_j is the training parameter value of P_m corresponding to the sample vulnerability id, b_j is the weight coefficient of x_j, and j ranges from 1 to M;
and step S204, predicting the outbreak probability of industrial control network vulnerabilities based on the industrial control network vulnerability prediction model.
Compared with the prior art, the invention has obvious advantages and beneficial effects. By means of the technical scheme, the system for predicting the industrial control network vulnerability based on the correction parameters can achieve considerable technical progress and practicability, has industrial wide utilization value, and at least has the following advantages:
The system for predicting industrial control network vulnerabilities based on correction parameters constructs model input parameters from the internet vulnerability characteristic parameters and a correction parameter P_CVE, which represents the association between the actual internet vulnerability outbreak probability and the actual industrial control network vulnerability outbreak probability, and trains an industrial control network vulnerability prediction model on them. Based on multi-dimensional internet vulnerability characteristic parameters and the correction parameter P_CVE, the vulnerability outbreak probability of the industrial control network is predicted quickly and accurately, reasonable defense measures are set based on this probability, and the security and stability of the industrial control network are improved. The system is particularly suitable for application scenarios in which industrial control network vulnerability characteristic parameters are not easy to obtain.
The foregoing description is only an overview of the technical solutions of the present invention, and in order to make the technical means of the present invention more clearly understood, the present invention may be implemented in accordance with the content of the description, and in order to make the above and other objects, features, and advantages of the present invention more clearly understood, the following preferred embodiments are described in detail with reference to the accompanying drawings.
Drawings
Fig. 1 is a schematic diagram of a system framework for predicting industrial control network vulnerabilities according to an embodiment of the present invention;
Fig. 2 is a flowchart of predicting industrial control network vulnerabilities based on internet and industrial control network vulnerability parameters according to a first embodiment of the present invention;
Fig. 3 is a flowchart of predicting industrial control network vulnerabilities based on a correction parameter according to a second embodiment of the present invention;
Fig. 4 is a flowchart of predicting industrial control network vulnerabilities based on Summary length features according to a third embodiment of the present invention;
Fig. 5 is a flowchart of predicting industrial control network vulnerabilities based on Summary word segmentation features according to a fourth embodiment of the present invention;
Fig. 6 is a flowchart of predicting industrial control network vulnerabilities based on a bitmap according to a fifth embodiment of the present invention;
Fig. 7 is a flowchart of predicting industrial control network vulnerabilities based on an N-gram according to a sixth embodiment of the present invention.
Detailed Description
To further explain the technical means and effects of the present invention adopted to achieve the predetermined objects, the following detailed description of the embodiments and effects thereof will be made with reference to the accompanying drawings.
The industrial control network is an enterprise intranet that communicates with the Internet through a gateway; a number of single-chip microcomputers, DSPs, industrial personal computers, sensors and the like are connected within it. The same vulnerability may be exposed on the internet or on the industrial control network. Each vulnerability (CVE) has a corresponding vulnerability id (which may be a unique identifier labeled for each vulnerability by the international standards organization) and characteristic parameters. The vulnerability characteristic parameters comprise internet vulnerability characteristic parameters crawled from the internet and industrial control network vulnerability characteristic parameters crawled from an industrial control network. As an example, the internet vulnerability characteristic parameters may specifically include a CVSS value set by the Common Vulnerability Scoring System (CVSS) for each vulnerability id, a Summary (vulnerability description text) parameter, a CVSS parameter, a CWE (Common Weakness Enumeration) parameter, a product group parameter, a reference website domain name parameter, and other customized internet parameters; the Summary parameters may specifically include the Summary text length, Summary word segmentation features, and the like. The industrial control network vulnerability characteristic parameters comprise gateway parameters, internal state parameters of the industrial control network, and the like. Each internet vulnerability characteristic parameter and each industrial control network vulnerability characteristic parameter has its own update period, and these update periods may differ greatly.
The embodiment of the invention provides a system for predicting industrial control network vulnerabilities, which, as shown in fig. 1, comprises a processor, a database and a storage medium storing a computer program, the processor being communicatively connected to the database. Those skilled in the art will appreciate that the processor is disposed on a server, and that the server and the database are not limited to a particular hardware and/or software device; each may be a server cluster, a storage cluster, or the like. Any computing device or combination of computing devices capable of data processing may serve as the server, and any storage device or combination of storage devices capable of data storage may serve as the database. The server and the database may be separate devices or may share one or more devices.
The database stores, for all internet vulnerability ids, an internet vulnerability characteristic parameter list P = {P_1, P_2, …, P_M}, an internet vulnerability characteristic parameter update period list TP = {TP_1, TP_2, …, TP_M}, an industrial control network vulnerability characteristic parameter list Q = {Q_1, Q_2, …, Q_N}, and an industrial control network vulnerability characteristic parameter update period list TQ = {TQ_1, TQ_2, …, TQ_N}. P_m is the m-th internet vulnerability characteristic parameter, TP_m is the update period of P_m, m ranges from 1 to M, and M is the number of internet vulnerability characteristic parameters. Q_n is the n-th industrial control network vulnerability characteristic parameter, TQ_n is the update period of Q_n, n ranges from 1 to N, and N is the number of industrial control network vulnerability characteristic parameters.
The internet vulnerability characteristic parameter list specifically comprises at least one of a CVSS value, a CVSS parameter, a CWE parameter, a product group parameter, a reference website domain name parameter, a summary parameter and the like. And the internet vulnerability characteristic parameter list is updated correspondingly according to the updating period corresponding to each internet vulnerability characteristic parameter. The industrial control network vulnerability characteristic parameter list specifically comprises at least one of gateway parameters, internal state parameters of the industrial control network and the like.
The industrial control network vulnerability characteristic parameters are updated according to the update period of each parameter. It should be noted that the internet vulnerability characteristic parameters are updated slowly, whereas the industrial control network vulnerability characteristic parameters can be obtained by monitoring inside the industrial control network, so their update frequency is high and their time granularity can reach the hour level or even the minute level. The update period of the internet vulnerability characteristic parameters is therefore far longer than that of the industrial control network vulnerability characteristic parameters, which can be specified as max(TP_m)/min(TQ_n) > D1, where D1 is a preset first threshold set according to the specific application scenario. For example, D1 takes a value in the range [10, 100], preferably D1 = 20. Although the industrial control network and the Internet are isolated by the gateway, the trend of a vulnerability outbreak on the internet is consistent with the trend of its outbreak on the industrial control network, and more vulnerability characteristic parameters can be crawled from the internet. The probability of a vulnerability outbreak on the industrial control network can therefore be predicted by training an industrial control network vulnerability prediction model with artificial intelligence on both the industrial control network vulnerability characteristic parameters and the internet vulnerability characteristic parameters.
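The period condition above can be checked with a few lines of code. The following is a minimal sketch only: the function name `periods_satisfy_ratio` and the sample period values are illustrative assumptions, and all periods are assumed to be expressed in the same time unit.

```python
def periods_satisfy_ratio(tp_periods, tq_periods, d1=20.0):
    """Check the condition max(TP_m) / min(TQ_n) > D1 from the description.

    tp_periods: update periods of the internet parameters (same unit as tq_periods).
    tq_periods: update periods of the industrial control network parameters.
    d1: the preset first threshold (the text suggests 10 to 100, preferably 20).
    """
    if not tp_periods or not tq_periods:
        raise ValueError("both period lists must be non-empty")
    if min(tq_periods) <= 0:
        raise ValueError("update periods must be positive")
    return max(tp_periods) / min(tq_periods) > d1


# Hypothetical periods in hours: internet parameters update daily or weekly,
# industrial control network parameters update hourly or every 15 minutes.
tp_hours = [24, 168]
tq_hours = [1, 0.25]
print(periods_satisfy_ratio(tp_hours, tq_hours))
```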
As an example, the processor implements the following steps when executing the computer program:
step S1, acquiring vulnerability characteristic vectors corresponding to vulnerability ids of each sample from the Internet vulnerability characteristic parameter list and the industrial control network vulnerability characteristic parameter list based on a preset sample vulnerability id set, and constructing a training parameter set;
step S2, training according to the training parameter set to obtain an industrial control network vulnerability prediction model;
and step S3, predicting the probability of outbreak of the vulnerability id to be detected on the industrial control network based on the industrial control network vulnerability prediction model.
It can be understood that, since the probability of vulnerability outbreak is predicted by using historical data, the sample vulnerability id set may be a part of vulnerability ids in the current vulnerability id set, or may be all vulnerability ids, and is set according to specific requirements.
Due to the fact that various internet vulnerability characteristic parameters and industrial control network vulnerability characteristic parameters exist, updating periods are different, different vulnerability characteristic parameters can be selected, and different industrial control network vulnerability prediction models can be obtained through training in different processing modes. The detailed description will be described in detail below by way of a plurality of examples, and the technical contents of the respective examples may be cited as each other unless otherwise specified.
First Embodiment
The computer program stored in the storage medium includes a first computer program that, when executed by the processor, implements the steps of:
step S101, obtaining the training period T_0 = LCM(TP_m) of the training data set, where LCM is the least-common-multiple function.
Because the different P_m and Q_n have different update periods, directly selecting training parameters with a sliding window would leave many parameters unchanged over a given time, which wastes computing resources and contributes little to model training. This embodiment therefore uses the least common multiple of the TP_m as the training period. It should be noted that, because the update period of the internet vulnerability characteristic parameters is much longer than that of the industrial control network vulnerability characteristic parameters, only the update periods of the internet vulnerability characteristic parameters are considered when selecting the time window. In this way, all internet vulnerability characteristic parameters and industrial control network vulnerability characteristic parameters can be acquired for each sample vulnerability id, wasted computing power is avoided, and the efficiency of model training is improved.
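As an illustration of step S101, the training period can be computed as the least common multiple of the internet parameter update periods. This is a sketch under the assumption that all periods are positive integers in a common unit (hours here); the function names and sample values are hypothetical.

```python
from functools import reduce
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def training_period(tp_periods):
    """Return T_0 = LCM(TP_1, ..., TP_M) for integer update periods."""
    if not tp_periods or any(p <= 0 for p in tp_periods):
        raise ValueError("update periods must be a non-empty list of positive integers")
    return reduce(lcm, tp_periods)


# Hypothetical update periods in hours: daily, every 36 hours, weekly.
tp_hours = [24, 36, 168]
t0 = training_period(tp_hours)
print(t0)  # 504
```

Within a window of length T_0 every internet parameter completes a whole number of update cycles, which is why the window avoids sampling stretches in which a parameter has not changed.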
Step S102, obtaining from the database the P_m parameter value list, the Q_n parameter value list and the industrial control network vulnerability outbreak probability truth value corresponding to each sample vulnerability id within T_0 before the current moment;
The industrial control network vulnerability outbreak probability truth value is the actual outbreak probability of the industrial control network vulnerability at the sample prediction moment. The actual outbreak probability of the industrial control network vulnerability is also a parameter updated on its own update period; it is obtained by dividing the number of industrial control network host devices that report an outbreak of the vulnerability within that update period by the number of all monitored industrial control network host devices.
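The ratio described above can be written out directly. The sketch below assumes the two host counts for one update period are already available; the function name and the example counts are illustrative, not taken from the patent.

```python
def actual_outbreak_probability(reporting_hosts: int, monitored_hosts: int) -> float:
    """Fraction of monitored hosts that reported an outbreak of the vulnerability
    within one update period of the outbreak probability."""
    if monitored_hosts <= 0:
        raise ValueError("at least one host must be monitored")
    if not 0 <= reporting_hosts <= monitored_hosts:
        raise ValueError("reporting hosts must be between 0 and the monitored total")
    return reporting_hosts / monitored_hosts


# Hypothetical counts: 12 of 400 monitored industrial control hosts reported the outbreak.
print(actual_outbreak_probability(12, 400))
```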
Step S103, determining the training parameter value PC_m of P_m corresponding to each sample vulnerability id based on the P_m parameter value list of that sample vulnerability id, determining the training parameter value QC_n of Q_n corresponding to each sample vulnerability id based on the Q_n parameter value list of that sample vulnerability id, and generating a model input vector for each sample vulnerability id based on PC_m and QC_n;
step S104, training according to all sample vulnerability id model input vectors and industrial control network vulnerability outbreak probability truth values to obtain an industrial control network vulnerability prediction model:
f(x) = a_1*x_1 + a_2*x_2 + … + a_{M+N}*x_{M+N}
wherein x_i is the training parameter value of P_m or of Q_n corresponding to the sample vulnerability id, a_i is the weight coefficient of x_i, and i ranges from 1 to M+N;
and step S105, predicting the outbreak probability of industrial control network vulnerabilities based on the industrial control network vulnerability prediction model.
As an example, the step S104 includes:
step S114, inputting the model input vectors of all sample vulnerability ids into a preset industrial control network vulnerability prediction model to obtain predicted values of the sample industrial control network vulnerability outbreak probability;
step S124, adjusting the weight coefficients according to the predicted values and the truth values of the sample industrial control network vulnerability outbreak probability, and training to obtain the industrial control network vulnerability prediction model.
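Steps S114 and S124 describe a predict-then-adjust loop over the weight coefficients of the linear model f(x) = a_1*x_1 + … + a_{M+N}*x_{M+N}. The source does not state which optimisation method is used; the sketch below assumes plain batch gradient descent on the mean squared error, and the learning rate, epoch count and sample data are all hypothetical.

```python
def predict(weights, x):
    """Evaluate f(x) = sum(a_i * x_i); the model in the text has no bias term."""
    return sum(w * xi for w, xi in zip(weights, x))


def train_linear_model(samples, truths, lr=0.1, epochs=2000):
    """Fit the weight coefficients by batch gradient descent (an assumed method).

    samples: one model input vector per sample vulnerability id.
    truths: outbreak probability truth value for each sample.
    """
    n_features = len(samples[0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        grads = [0.0] * n_features
        for x, y in zip(samples, truths):
            error = predict(weights, x) - y      # step S114: predicted vs truth
            for i, xi in enumerate(x):
                grads[i] += error * xi
        # step S124: adjust the weight coefficients against the averaged gradient
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grads)]
    return weights


# Hypothetical two-feature samples whose truth values follow weights (0.2, 0.5).
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
truths = [0.2, 0.5, 0.7, 0.2]
weights = train_linear_model(samples, truths)
print([round(w, 4) for w in weights])
```

A closed-form least-squares fit would serve equally well for a model this small; gradient descent is shown only because it mirrors the "adjust the weight coefficient" wording of step S124.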
As an example, in step S105 the industrial control network vulnerability outbreak probability is preferably predicted over the same period as the training period T_0 of the training data set, using the same input vector construction strategy as in steps S102 to S104; this yields high prediction accuracy.
As another example, the time period selected when training the industrial control network vulnerability prediction model is much longer than the update periods of the industrial control network vulnerability characteristic parameters. Because those update periods are short and the update frequency is high, a prediction can be made whenever at least one industrial control network vulnerability characteristic parameter is updated; prediction sensitivity is then high, and newly emerging vulnerability outbreaks are predicted well. Specifically, step S105 includes:
step S134, obtaining the prediction period T_1 = min(TQ_n);
step S144, at every interval T_1, collecting the P_m or Q_n parameter values at the current moment corresponding to the vulnerability id to be tested, and constructing the input vector corresponding to the vulnerability id to be tested;
step S154, inputting the input vector corresponding to the vulnerability id to be tested into the industrial control network vulnerability prediction model to obtain the industrial control network vulnerability outbreak probability of the vulnerability id to be tested at T_1 after the current moment.
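Steps S134 to S154 can be sketched as follows. The weights stand in for an already trained model, and the function names, period values and input vector are hypothetical; how the current parameter values are collected every T_1 is left outside the sketch.

```python
def prediction_period(tq_periods):
    """Step S134: T_1 = min(TQ_n), the shortest industrial control parameter update period."""
    if not tq_periods:
        raise ValueError("at least one update period is required")
    return min(tq_periods)


def predict_outbreak_probability(weights, input_vector):
    """Step S154: evaluate the trained linear model on the current input vector."""
    if len(weights) != len(input_vector):
        raise ValueError("input vector length must match the number of weights")
    return sum(w * x for w, x in zip(weights, input_vector))


# Hypothetical update periods in minutes and a hypothetical trained weight vector.
t1 = prediction_period([5, 60, 15])
weights = [0.2, 0.5]
current_input = [1.0, 0.4]   # step S144: parameter values collected at the current moment
print(t1, predict_outbreak_probability(weights, current_input))
```

The linear model is not bounded to the interval [0, 1]; whether and how the output is clipped is not specified in the text.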
As a preferred example, in step S103, the P_m parameter value list within T_0 before the current moment corresponding to each sample vulnerability id is {PC_m1, PC_m2, …, PC_mA}, where PC_ma is the a-th parameter value of P_m within T_0 before the current moment, a ranges from 1 to A, and A is the total number of P_m parameter values collected within T_0 before the current moment, A = T_0/TP_m. Determining the training parameter value PC_m of P_m corresponding to the sample vulnerability id based on the P_m parameter value list of each sample vulnerability id comprises:
step S113, determining the training parameter value PC_m of P_m based on the following equation:
[Equation image not reproduced in the source text: formula for PC_m]
Because the parameter update periods differ, most internet vulnerability characteristic parameters and industrial control network vulnerability characteristic parameters yield several values within one training period, so a reasonable characteristic parameter value must be determined from those values to construct the model input. As a preferred example, in step S103, the Q_n parameter value list within T_0 before the current moment corresponding to each sample vulnerability id is {QC_n1, QC_n2, …, QC_nB}, where QC_nb is the b-th parameter value of Q_n within T_0 before the current moment, b ranges from 1 to B, and B is the total number of Q_n parameter values collected within T_0 before the current moment, B = T_0/TQ_n. Determining the training parameter value QC_n of Q_n corresponding to the sample vulnerability id based on the Q_n parameter value list of each sample vulnerability id comprises:
step S123, determining the training parameter value QC_n of Q_n based on the following equation:
[Equation image not reproduced in the source text: formula for QC_n]
As time goes on, new vulnerabilities may be added, and the vulnerability characteristic parameters and their update periods may also change. To further improve the accuracy of the industrial control network vulnerability prediction model, the processor also implements the following step when executing the first computer program:
step S100, at every interval T_0, re-executing steps S101 to S104 to update the industrial control network vulnerability prediction model.
In the first embodiment, the internet vulnerability characteristic parameters and the industrial control network vulnerability characteristic parameters are obtained by setting a reasonable training period of a training data set, corresponding parameter values are respectively determined based on the internet vulnerability characteristic parameters and the industrial control network vulnerability characteristic parameters in the training period, then the corresponding parameter values are converted into input vectors, and an industrial control network vulnerability prediction model is obtained through training. The method has the advantages that the vulnerability outbreak probability of the industrial control network can be rapidly and accurately predicted based on the internet vulnerability characteristic parameters and the industrial control network vulnerability characteristic parameters with multiple dimensions, reasonable defense measures are set based on the vulnerability outbreak probability, and the safety and the stability of the industrial control network are improved. The method and the device are particularly suitable for application scenarios capable of simultaneously obtaining the internet vulnerability characteristic parameters and the industrial control network vulnerability characteristic parameters.
Second Embodiment
It should be noted that internet vulnerability characteristic parameters are numerous and easy to obtain, but in some application scenarios, owing to factors such as the scale of the industrial control network, sufficient industrial control network vulnerability characteristic parameters may not be available to train the industrial control network vulnerability prediction model. However, because the trend of a vulnerability's outbreak in the industrial control network is consistent and correlated with the overall trend of the same vulnerability's outbreak on the internet, the vulnerability outbreak probability of the industrial control network can be predicted from the association between industrial control network and internet vulnerability outbreaks, combined with the internet vulnerability characteristic parameters.
Specifically, the computer program stored in the storage medium includes a second computer program, and when the processor executes the second computer program, the following steps are implemented:
step S201, obtaining from the database the P_m parameter value list, the internet vulnerability actual outbreak probability list, the industrial control network vulnerability actual outbreak probability list and the industrial control network vulnerability outbreak probability truth value corresponding to each sample vulnerability id in a preset training period.
The preset training period can be set according to specific training requirements, or determined directly as the least common multiple of the update periods of the internet vulnerability characteristic parameters, as in the first embodiment. The actual outbreak probability of the internet vulnerability is also a parameter updated on its own update period; it is obtained by dividing the number of internet host devices that report an outbreak of the vulnerability within that update period by the number of all monitored internet host devices. The specific algorithm for obtaining the industrial control network vulnerability outbreak probability truth value and the actual outbreak probability of the industrial control network vulnerability is described in detail in the first embodiment and is not repeated here.
Step S202, determining a correction parameter P_CVE corresponding to each sample vulnerability id based on the internet vulnerability actual outbreak probability list and the industrial control network vulnerability actual outbreak probability list, determining the corresponding training parameter value PC_m based on the P_m parameter value list corresponding to each sample vulnerability id, and generating a model input vector for each sample vulnerability id based on PC_m.
It should be noted that the correction parameter P_CVE expresses the association between the actual internet vulnerability outbreak probability and the actual industrial control network vulnerability outbreak probability of each sample vulnerability id. Each sample vulnerability id has its own correction parameter P_CVE, which is a dynamically changing value specific to that vulnerability id.
Step S203, training based on the correction parameters P_CVE, the model input vectors, and the industrial control network vulnerability outbreak probability truth values corresponding to all sample vulnerability ids, to obtain the industrial control network vulnerability prediction model:
f(x) = b_0 * P_CVE + b_1 * x_1 + b_2 * x_2 + … + b_M * x_M;
wherein x_j is the training parameter value of P_m corresponding to the sample vulnerability id, b_j is the coefficient of x_j, and j ranges from 1 to M.
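The prediction formula above can be evaluated as in the following minimal Python sketch. The function name `predict_outbreak_probability` and the layout of the coefficient list (index 0 holds b_0, the coefficient applied to P_CVE) are assumptions made for illustration; how the coefficients are fitted is not shown here.

```python
from typing import Sequence


def predict_outbreak_probability(p_cve: float,
                                 x: Sequence[float],
                                 b: Sequence[float]) -> float:
    """Evaluate f(x) = b_0 * P_CVE + b_1 * x_1 + ... + b_M * x_M.

    `b` is assumed to hold M + 1 coefficients: b[0] multiplies P_CVE and
    b[j] multiplies x[j - 1] for j = 1..M (hypothetical layout).
    """
    if len(b) != len(x) + 1:
        raise ValueError("expected one coefficient for P_CVE plus one per x_j")
    return b[0] * p_cve + sum(bj * xj for bj, xj in zip(b[1:], x))
```

For instance, with P_CVE = 2.0, two training parameter values, and three already-fitted coefficients, the call would be `predict_outbreak_probability(2.0, [1.0, 3.0], [0.5, 0.25, 0.125])`.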
It should be noted that the model constructs its input vector from the internet vulnerability characteristic parameters and also takes as model input the corresponding correction parameter P_CVE, which represents the association between the internet vulnerability actual outbreak probability and the industrial control network vulnerability actual outbreak probability, so that the industrial control network vulnerability outbreak probability is predicted from the internet vulnerability characteristic parameters.
Step S204, predicting the industrial control network vulnerability outbreak probability based on the industrial control network vulnerability prediction model.
It can be understood that after the vulnerability prediction model is trained, the probability of vulnerability outbreak in the industrial control network can be predicted based on the input parameters corresponding to the vulnerability at any time.
Because the industrial control network vulnerability outbreak probability is predicted from the correction parameter P_CVE and the internet characteristic parameters, determining a reasonably accurate correction parameter P_CVE during model training is particularly important. As an example, in step S202, determining the correction parameter P_CVE corresponding to each sample vulnerability id based on the internet vulnerability actual outbreak probability list and the industrial control network vulnerability actual outbreak probability list includes:
Step S212, obtaining an internet and industrial control network vulnerability outbreak association parameter list {R_1, R_2, …, R_C} according to the internet vulnerability actual outbreak probability list and the industrial control network vulnerability actual outbreak probability list corresponding to each sample vulnerability id within the preset training period, wherein R_c is the c-th association parameter, c ranges from 1 to C, C is the total number of association parameters obtained for the sample vulnerability id within the preset training period, R_c = CH_c1 / CH_c2, CH_c1 is the internet vulnerability actual outbreak probability at the c-th moment, and CH_c2 is the industrial control network vulnerability actual outbreak probability at the c-th moment.
Step S222, determining the correction parameter P_CVE corresponding to the sample vulnerability id according to {R_1, R_2, …, R_C}.
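As an illustration of step S212, the following Python sketch builds the association parameter list by dividing the internet actual outbreak probability by the industrial control network actual outbreak probability at each moment. The function name `association_parameters` is hypothetical, and the handling of a zero industrial control network probability is not specified in the text, so the sketch simply lets the division error propagate.

```python
from typing import List, Sequence


def association_parameters(internet_probs: Sequence[float],
                           ics_probs: Sequence[float]) -> List[float]:
    """Return [R_1, ..., R_C] with R_c = CH_c1 / CH_c2.

    internet_probs[c] plays the role of CH_c1 and ics_probs[c] of CH_c2.
    A zero CH_c2 raises ZeroDivisionError (behaviour assumed, not specified).
    """
    if len(internet_probs) != len(ics_probs):
        raise ValueError("both probability lists must cover the same moments")
    return [ch1 / ch2 for ch1, ch2 in zip(internet_probs, ics_probs)]
```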
As an example, the step S222 includes:
Step S232, obtaining the mean value R_AVG of {R_1, R_2, …, R_C}, and obtaining a first variation parameter SR1 according to R_AVG and {R_1, R_2, …, R_C}:
[Formula image: definition of SR1]
Step S242, if SR1 is greater than or equal to a preset second threshold D2, obtaining the maximum value R_max of {R_1, R_2, …, R_C} and setting P_CVE = R_max; otherwise, setting P_CVE = R_AVG.
It should be noted that if SR1 is greater than or equal to the preset second threshold D2, a sudden outbreak period exists in the current training period, so selecting the maximum value of {R_1, R_2, …, R_C} as the correction parameter P_CVE is more accurate. If SR1 is less than the second threshold D2, the vulnerability is relatively stable in the current training period, so selecting the mean value of {R_1, R_2, …, R_C} as the correction parameter P_CVE is more accurate. By reasonably selecting an accurate correction parameter P_CVE, the accuracy of the industrial control network vulnerability prediction model can be improved, and the model becomes highly sensitive to newly emerging vulnerabilities, which makes it particularly suitable for application scenarios of emerging vulnerability prediction.
To further improve the accuracy of selecting the correction parameter P_CVE, if SR1 is smaller than D2 in step S242, the following steps may be further performed:
Step S252, obtaining the minimum value R_min of {R_1, R_2, …, R_C}, and obtaining a second variation parameter SR2 according to R_min, R_AVG, and R_max:
[Formula image: definition of SR2]
Step S262, if SR2 is greater than or equal to 1, setting P_CVE = R_min; otherwise, setting P_CVE = R_max.
When SR2 is greater than or equal to 1, R_min is more influential, so P_CVE = R_min is selected. This typically applies to scenarios in which an existing vulnerability is suddenly fixed; in that case, selecting P_CVE = R_min can improve model accuracy. If SR2 is less than 1, R_max is more influential, so P_CVE = R_max is selected, which makes the model highly sensitive to newly emerging vulnerabilities and particularly suitable for application scenarios of emerging vulnerability prediction.
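The selection logic of steps S242 to S262 can be sketched in Python as follows. Because the defining formulas of SR1 and SR2 appear only as figures in the source, this sketch does not compute them; it takes them as inputs. The function name `select_p_cve` and the convention that `sr2=None` means "skip the optional refinement" are assumptions made for illustration.

```python
from typing import Optional, Sequence


def select_p_cve(r: Sequence[float],
                 sr1: float,
                 d2: float,
                 sr2: Optional[float] = None) -> float:
    """Pick the correction parameter P_CVE from {R_1, ..., R_C}.

    sr1 and sr2 are supplied by the caller (their formulas are not
    reproduced here). sr2=None applies step S242 only (hypothetical switch).
    """
    if not r:
        raise ValueError("association parameter list must not be empty")
    if sr1 >= d2:
        return max(r)              # step S242: sudden outbreak period
    if sr2 is None:
        return sum(r) / len(r)     # step S242: stable period, use R_AVG
    # steps S252 to S262: optional refinement when SR1 < D2
    return min(r) if sr2 >= 1 else max(r)
```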
As an example, in step S202, determining the corresponding training parameter value PC_m based on the P_m parameter value list corresponding to each sample vulnerability id includes:
Step S272, determining the maximum value, the minimum value, or the mean value of all parameters in the P_m parameter value list corresponding to each sample vulnerability id as the corresponding training parameter value PC_m.
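Step S272 reduces each parameter value list to a single number. A minimal Python sketch follows; the function name `training_parameter_value` and the string values of `mode` are hypothetical, and which of the three reductions to use is left to the caller, as in the text.

```python
from statistics import mean
from typing import Sequence


def training_parameter_value(values: Sequence[float], mode: str = "mean") -> float:
    """Collapse a P_m parameter value list into PC_m by max, min, or mean."""
    reducers = {"max": max, "min": min, "mean": mean}
    if mode not in reducers:
        raise ValueError("mode must be 'max', 'min', or 'mean'")
    return reducers[mode](values)
```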
In the system of the second embodiment, the model input parameters are constructed from the internet vulnerability characteristic parameters and the correction parameter P_CVE, which represents the association between the internet vulnerability actual outbreak probability and the industrial control network vulnerability actual outbreak probability, and the industrial control network vulnerability prediction model is obtained by training. The industrial control network vulnerability outbreak probability is thus predicted quickly and accurately from multi-dimensional internet vulnerability characteristic parameters and the correction parameter P_CVE, reasonable defense measures are set on that basis, and the safety and stability of the industrial control network are improved. The second embodiment is particularly suitable for application scenarios in which vulnerability characteristic parameters of the industrial control network are not easy to obtain.
The Summary parameter is the text description of a vulnerability issued by an authority and reflects the vulnerability characteristics accurately and reliably, so a characteristic parameter for predicting industrial control network vulnerabilities can be constructed from the Summary. The Summary is an unstructured parameter, so a feature parameter value has to be constructed from its text features; for example, in the prior art, a longer Summary is taken to indicate a more serious vulnerability that needs more urgent handling. However, the authority generally republishes all Summaries at regular intervals: only the Summary of a vulnerability that changed during a period is modified, while an unchanged Summary is kept identical to that of the previous period. For example, a serious vulnerability that broke out three years ago has a long text description, but if nothing else has changed within the three years, its Summary remains as it was three years ago; if the feature parameter value is constructed directly from the Summary text alone, the resulting Summary feature parameter value is very likely to be inaccurate. A corresponding feature weight therefore needs to be assigned according to how the Summary changes, which improves the accuracy of the constructed Summary feature parameter. Determining the feature weight corresponding to each Summary is thus important, and it is described in detail below through several specific embodiments.
Embodiment three,
The computer program stored in the storage medium comprises a third computer program that, when executed by the processor, performs the steps of:
Step S300, obtaining from the database the Summary text sequence {Str_1, Str_2, …} corresponding to each sample vulnerability id, where Str_e is the Summary text corresponding to the e-th update period and e ranges from 1 to infinity.
Step S301, when e = 1, determining the feature weight w_e of Str_e according to the length of Str_e.
Through step S301, a corresponding initial feature weight can be set for each Str_e.
Step S302, when e > 1, comparing Str_{e-1} and Str_e; if their text information is completely consistent, judging whether z * w_{e-1} is greater than a preset first feature weight threshold we_min; if so, setting w_e = z * w_{e-1}, where z is a preset weight adjustment coefficient and 0 < z < 1; if z * w_{e-1} is less than or equal to we_min, setting w_e = we_min; if the text information of Str_{e-1} and Str_e is inconsistent, determining the feature weight w_e of Str_e according to the length of Str_e.
It should be noted that when the text information of Str_{e-1} and Str_e is completely consistent, the Summary has not been updated, so the corresponding feature weight is reduced by multiplying by z. Preferably, z is set to 1/2. Because some Summaries go unchanged for a long time and the weight cannot be decreased without limit, the first feature weight threshold we_min is set; once w_e has decreased to a certain degree, the minimum value is used. When the text information of Str_{e-1} and Str_e is inconsistent, the Summary has changed, so the feature weight w_e of Str_e is determined directly from the length of the current Str_e.
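The weight update of steps S301 and S302 can be sketched in Python as follows. The length-based weighting is passed in as a callable because its details are given separately; the function name `summary_weights` and the default value chosen for `we_min` are assumptions for illustration only (z = 1/2 follows the stated preference).

```python
from typing import Callable, List, Sequence


def summary_weights(texts: Sequence[str],
                    length_weight: Callable[[str], float],
                    z: float = 0.5,
                    we_min: float = 0.25) -> List[float]:
    """Compute w_e for each Summary text Str_e in update-period order.

    length_weight(text) stands in for the length-based rule; we_min=0.25
    is a hypothetical default for the first feature weight threshold.
    """
    weights: List[float] = []
    for e, text in enumerate(texts):
        if e == 0 or text != texts[e - 1]:
            # first period, or Summary changed: weight from text length
            w = length_weight(text)
        else:
            # Summary unchanged: decay by z, but never below we_min
            decayed = z * weights[-1]
            w = decayed if decayed > we_min else we_min
        weights.append(w)
    return weights
```

With an unchanged Summary the weight halves each period until it reaches the floor, and it is recomputed from the text as soon as the Summary changes.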
Step S303, determining, based on the feature weight w_e of each Str_e and on Str_e itself, the Summary feature parameter value PCS_e = w_e * g(Str_e) corresponding to each Str_e, where g(Str_e) is the original feature parameter value determined from Str_e.
In addition, g(Str_e) may be obtained directly with an existing algorithm, that is, a corresponding parameter value is determined directly from the text features; the existing algorithm is not described here. Because w_e is determined from the characteristic parameters of the Summary and from the changes of the Summary over consecutive periods, the obtained Summary feature parameter value PCS_e is more accurate and reliable, which improves the accuracy of the model.
Step S304, constructing a model input vector based on the Summary feature parameter values corresponding to the sample vulnerability ids, training to obtain an industrial control network vulnerability prediction model, and predicting the industrial control network vulnerability outbreak probability based on the industrial control network vulnerability prediction model.
It can be understood that other required internet vulnerability characteristic parameters and industrial control network vulnerability characteristic parameters may be introduced when the input vector is constructed, and the specific parameter processing may be based on the methods described in the first embodiment and the second embodiment, or may adopt the existing data processing method, which is not described herein again.
As an example, in steps S301 and S302, determining the feature weight w_e of Str_e according to the length of Str_e includes:
Step S311, comparing the length L_e of Str_e with a preset first length threshold L_min and a preset second length threshold L_max, where L_min is less than L_max; if L_e < L_min, setting w_e = we_min; if L_e > L_max, setting w_e = we_max, where we_max is a preset second feature weight threshold that is greater than the first feature weight threshold; if L_e is within [L_min, L_max], setting w_e = k_1 * L_e, where k_1 is a preset first linear change coefficient.
Through step S311, an accurate and reliable initial feature weight can be determined based on the length L_e of Str_e. Preferably, k_1 is set to (we_max - we_min) / we_max.
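The piecewise rule of step S311 can be written as a short Python sketch. The function name `length_weight` is hypothetical; all thresholds and the coefficient k_1 are passed in as presets, and the sketch does not assume any particular values for them.

```python
def length_weight(length: int,
                  l_min: int,
                  l_max: int,
                  we_min: float,
                  we_max: float,
                  k1: float) -> float:
    """Feature weight from the Summary length L_e, per step S311."""
    if l_min >= l_max:
        raise ValueError("L_min must be less than L_max")
    if length < l_min:
        return we_min          # very short text: first weight threshold
    if length > l_max:
        return we_max          # very long text: second weight threshold
    return k1 * length         # in [L_min, L_max]: linear in the length
```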
As a preferred example, when e >1, before performing step S311, the method further includes:
Step S310, judging whether w_{e-1} was set based on w_{e-1} = z * w_{e-2}; if so, and the text information of Str_{e-1} and Str_e is inconsistent, setting we_min = w_{e-1}.
When w_{e-1} = z * w_{e-2}, the Summary did not change between the two preceding periods and the weight was reduced in the previous period. Since the Summary of the current period has changed relative to that of the previous period, the weight of the current period must be greater than that of the previous period; in this case, we_min for the current period can be set to w_{e-1}, which improves the accuracy of the feature weight obtained for this period. It is understood that if the condition in step S310 is not met, we_min remains the original preset value.
As an example, in step S304, a model input vector is constructed based on the Summary feature parameter value corresponding to the sample vulnerability id, and the industrial control network vulnerability prediction model is obtained through training, which includes:
step S314, determining a model input vector of each sample vulnerability id according to the Summary characteristic parameter value corresponding to the sample vulnerability id, a preset internet vulnerability characteristic parameter and a preset industrial control network vulnerability characteristic parameter;
step S324, training based on model input vectors corresponding to sample vulnerability ids and the industrial control network vulnerability outbreak probability truth value to obtain the industrial control network vulnerability prediction model.
It can be understood that, once the model sample inputs are determined, the selected artificial intelligence model can be trained after the corresponding sample truth values are obtained. The input parameters can be collected over a preset training period and can be processed in the manner described in the first and second embodiments or in an existing manner, which is not described here.
It should be noted that the algorithm that executes step S301 directly after step S300 applies to vulnerability ids that have a corresponding Summary text from e = 1 onward. Some vulnerability ids, however, are added later, and a corresponding feature weight determination policy may also be set for such vulnerability ids. As an example, step S300 further includes:
Step S311, if the first nr consecutive Str_e in the sequence {Str_1, Str_2, …} are empty and Str_{nr+1} is not empty, setting the feature weight w_{nr+1} of Str_{nr+1} to we_max, where we_max is the preset second feature weight threshold, then initializing e = nr + 2 and performing step S302.
It should be noted that if the first nr consecutive Str_e in the sequence {Str_1, Str_2, …} are empty and Str_{nr+1} is not empty, the corresponding vulnerability id is one newly added in period nr + 1. The feature weight w_{nr+1} of Str_{nr+1} can then be set directly to the maximum value, the second feature weight threshold, which reduces the amount of data processing and improves processing efficiency while maintaining accuracy.
The third embodiment adjusts the feature weight of the Summary according to the text changes and length changes of the Summary over consecutive periods. Text changes of the Summary are easy to detect and the length parameter is easy to obtain, so the accuracy and efficiency of obtaining the Summary feature parameter value are improved, which in turn improves the accuracy and efficiency of training the industrial control network vulnerability prediction model and of predicting the industrial control network vulnerability outbreak probability. Reasonable defense measures are set on this basis, and the safety and stability of the industrial control network are improved.
Embodiment four,
The fourth embodiment is better suited to application scenarios with a high Summary update frequency, that is, scenarios in which the Summary update frequency exceeds a preset update frequency threshold.
The computer program stored in the storage medium comprises a fourth computer program that, when executed by the processor, performs the steps of:
Step S400, obtaining from the database the Summary text sequence {Str_1, Str_2, …} corresponding to each sample vulnerability id, where Str_e is the Summary text corresponding to the e-th update period and e ranges from 1 to infinity.
Step S401, performing word segmentation on Str_e and removing stop words using a preset stop word library to obtain the word segment set A_e corresponding to Str_e.
It should be noted that the preset stop word library may be a stop word library constructed using existing techniques and may be updated continually according to application requirements. The industrial internet stop word library described in a subsequent embodiment may also be used, and it may be updated according to the update method described in the sixth embodiment, which is not repeated here.
Step S402, when e = 1, determining the feature weight w_e of Str_e according to the number of word segments in A_e.
Through step S402, a corresponding initial feature weight can be set for each Str_e based on the number of word segments in A_e.
Step S403, when e > 1, comparing Str_{e-1} and Str_e; if their text information is completely consistent, setting w_e = w_{e-1}; if their text information is not completely consistent, performing set difference operations on the word segment sets A_e and A_{e-1} to obtain the number of word segments |A_e - A_{e-1}| in the difference set of A_e relative to A_{e-1} and the number of word segments |A_{e-1} - A_e| in the difference set of A_{e-1} relative to A_e, and setting w_e = [|A_e - A_{e-1}| / |A_{e-1} - A_e|] * w_{e-1}.
It should be noted that when the text information of Str_{e-1} and Str_e is completely consistent, the Summary has not been updated, and because the Summary is updated frequently, w_e = w_{e-1} can be set directly. If the text information of Str_{e-1} and Str_e is not completely consistent, the feature weight change coefficient [|A_e - A_{e-1}| / |A_{e-1} - A_e|] is determined from the change between A_e and A_{e-1}, and w_e is then determined from this coefficient and the weight w_{e-1} of the previous period. Comparing |A_e - A_{e-1}| with |A_{e-1} - A_e|: if |A_e - A_{e-1}| is greater, A_e adds more words than it removes relative to A_{e-1}; if |A_e - A_{e-1}| is smaller, A_e removes more words than it adds relative to A_{e-1}. The feature weight therefore becomes larger when A_e adds more words on the basis of A_{e-1} and smaller when A_e removes more words, which improves the accuracy of the determined feature weight w_e.
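The update rule of step S403 can be illustrated with Python sets. The function name `updated_weight` is hypothetical. The text does not say what happens when the Summary changes but no word segment is removed (the denominator would be zero), so the sketch raises an error in that case rather than guessing a fallback.

```python
from typing import Set


def updated_weight(prev_text: str,
                   cur_text: str,
                   prev_tokens: Set[str],
                   cur_tokens: Set[str],
                   prev_weight: float) -> float:
    """Return w_e from w_{e-1} using word-segment set differences."""
    if cur_text == prev_text:
        return prev_weight                     # Summary unchanged: keep weight
    added = len(cur_tokens - prev_tokens)      # segments new in A_e
    removed = len(prev_tokens - cur_tokens)    # segments dropped from A_{e-1}
    if removed == 0:
        # Not covered by the described rule; handling is left to the caller.
        raise ValueError("no removed word segments: ratio is undefined")
    return (added / removed) * prev_weight
```

For example, if one segment is dropped and two are added, the previous weight is doubled.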
Step S404, determining, based on the feature weight w_e of each Str_e and on Str_e itself, the Summary feature parameter value PCS_e = w_e * g(Str_e) corresponding to each Str_e, where g(Str_e) is the original feature parameter value determined from Str_e.
In addition, g(Str_e) may be obtained directly with an existing algorithm, that is, a corresponding parameter value is determined directly from the text features; the existing algorithm is not described here. Because w_e is determined from the characteristic parameters of the Summary and from the changes of the Summary over consecutive periods, the obtained Summary feature parameter value PCS_e is more accurate and reliable, which improves the accuracy of the model.
Step S405, constructing a model input vector based on the Summary feature parameter values corresponding to the sample vulnerability ids, training to obtain an industrial control network vulnerability prediction model, and predicting the industrial control network vulnerability outbreak probability based on the industrial control network vulnerability prediction model.
It can be understood that other required internet vulnerability characteristic parameters and industrial control network vulnerability characteristic parameters may be introduced when the input vector is constructed, and the specific parameter processing may be based on the methods described in the first embodiment and the second embodiment, or may adopt the existing data processing method, which is not described herein again.
As an example, the step S402 includes:
Step S412, comparing the number of word segments SA_e in A_e with a preset first word segment number threshold SU_min and a preset second word segment number threshold SU_max, where SU_min < SU_max; if SA_e < SU_min, setting w_e = ws_min, where ws_min is a preset third feature weight threshold; if SA_e > SU_max, setting w_e = ws_max, where ws_max is a preset fourth feature weight threshold; if SA_e is within [SU_min, SU_max], setting w_e = k_2 * SA_e, where k_2 is a preset second linear change coefficient.
Preferably, ws_min is set to 0 and ws_max is set to 1, which simplifies the calculation.
Through step S412, an accurate and reliable initial feature weight can be determined based on the number of word segments SA_e in A_e. Preferably, k_2 is set to (ws_max - ws_min) / ws_max.
It should be noted that the algorithm that executes step S402 directly after step S401 applies to vulnerability ids that have a corresponding Summary text from e = 1 onward. Some vulnerability ids, however, are added later, and a corresponding feature weight determination policy may also be set for such vulnerability ids. As an example, step S401 further includes:
Step S422, if the A_e corresponding to the first ns consecutive Str_e in {Str_1, Str_2, …} are empty and A_{ns+1} is not empty, setting the feature weight w_{ns+1} of Str_{ns+1} to ws_max, where ws_max is the preset fourth feature weight threshold, then initializing e = ns + 2 and performing step S403.
It should be noted that if the A_e corresponding to the first ns consecutive Str_e in {Str_1, Str_2, …} are empty and A_{ns+1} is not empty, the corresponding vulnerability id is one newly added in period ns + 1. The feature weight w_{ns+1} of Str_{ns+1} can then be set directly to the maximum value, the fourth feature weight threshold, which reduces the amount of data processing and improves processing efficiency while maintaining accuracy.
As an example, in step S405, a model input vector is constructed based on the Summary feature parameter value corresponding to the sample vulnerability id, and an industrial control network vulnerability prediction model is obtained through training, including:
step S415, determining a model input vector of each sample vulnerability id according to the Summary characteristic parameter value corresponding to the sample vulnerability id, a preset internet vulnerability characteristic parameter and a preset industrial control network vulnerability characteristic parameter;
Step S425, training based on the model input vectors corresponding to the sample vulnerability ids and the industrial control network vulnerability outbreak probability truth value to obtain the industrial control network vulnerability prediction model.
The fourth embodiment is particularly suitable for application scenarios with high Summary update frequency, that is, application scenarios with Summary update frequency higher than a preset update frequency threshold. The method has the advantages that the feature weight of the Summary can be adjusted according to the word segmentation change relation of the Summary text in the continuous period, and the accuracy and the efficiency of obtaining the Summary feature parameter values are improved, so that the accuracy and the efficiency of training the industrial control network vulnerability prediction model are improved, and the accuracy and the prediction efficiency of predicting the industrial control network vulnerability outbreak probability are improved. Reasonable defense measures are set based on the method, and the safety and the stability of the industrial control network are improved.
Embodiment five,
The fifth embodiment is particularly suitable for application scenarios with low Summary update frequency, that is, application scenarios with Summary update frequency lower than a preset update frequency threshold.
The system also comprises a bitmap generated from the changes, across update periods, of the Summary text corresponding to each vulnerability id; storing these changes as a bitmap saves data storage space. If the Summary text of the current period is unchanged relative to the Summary text of the previous period, the bit corresponding to the current period in the bitmap is set to 0; otherwise it is set to 1. B_e is the bitmap value corresponding to the e-th update period, B_e equals 0 or 1, and e ranges from 1 to infinity.
The computer program stored in the storage medium includes a fifth computer program that, when executed by the processor, implements the steps of:
Step S501, determining a Summary period detection window TK based on a preset training period, wherein TK = a * Te, a is a positive integer greater than 1, and Te is the Summary update period;
It is understood that the period detection window TK comprises a bits, each bit corresponding to one update period. Preferably, a has a value of 8.
Step S502, taking B_e as the a-th bit of information in TK to obtain the e-th period detection window information TK_e, and determining the feature weight w_e of B_e based on B_e and the bit changes in TK_e.
In addition, taking B_e as the a-th bit of information in TK means that B_e is the last bit of information in TK. The bit changes in TK_e, that is, the changes of the a bits in TK_e, correspond to the Summary changes over a consecutive periods. For example, if the a bits are all 0, the Summary did not change within the a consecutive periods. As another example, if B_e is 1, the Summary corresponding to B_e has changed relative to the Summary of the previous period. As another example, if B_e is 0, B_{e-a1} is 1, and all values between B_{e-a1} and B_e are 0, the Summary corresponding to B_e has gone a1 consecutive periods without change. The feature weight w_e of B_e can therefore be determined based on B_e and the bit changes in TK_e.
Step S503, determining, based on the feature weight w_e of each B_e and the Summary text Str_e, the Summary feature parameter value PCS_e = w_e * g(Str_e) corresponding to each Summary text, where g(Str_e) is the original feature parameter value determined from Str_e.
In addition, g(Str_e) may be obtained directly with an existing algorithm, that is, a corresponding parameter value is determined directly from the text features; the existing algorithm is not described here. Because w_e is determined from the characteristic parameters of the Summary and from the changes of the Summary over consecutive periods, the obtained Summary feature parameter value PCS_e is more accurate and reliable, which improves the accuracy of the model.
Step S504, a model input vector is constructed based on the Summary characteristic parameter values corresponding to the sample vulnerability ids, an industrial control network vulnerability prediction model is obtained through training, and the industrial control network vulnerability outbreak probability is predicted based on the industrial control network vulnerability prediction model.
It can be understood that other required internet vulnerability characteristic parameters and industrial control network vulnerability characteristic parameters may be introduced when the input vector is constructed, and the specific parameter processing may be based on the methods described in the first embodiment and the second embodiment, or may adopt the existing data processing method, which is not described herein again.
As an example, in step S502, determining the feature weight w_e of B_e based on B_e and the bit changes in TK_e includes:
Step S512, judging whether B_e is 1; if so, setting w_e = wb_max, where wb_max is a preset fifth feature weight threshold; otherwise, executing step S522;
Step S522, obtaining the number d of bits between B_e and the bit in TK_e that takes the value 1 and is nearest to B_e, and judging whether (wb_max - wb_min)/d is less than a preset sixth feature weight threshold wb_min; if so, setting w_e = wb_min; otherwise, setting w_e = (wb_max - wb_min)/d, where wb_min < wb_max.
Since the Summary update frequency is low, a Summary that has been updated relative to the previous period should receive a high weight. Through steps S512 to S522, the w_e of a period whose Summary changed relative to the previous period can be set directly to wb_max, which maintains accuracy while reducing the amount of calculation. It is understood that, if a more accurate result is required, the calculation may instead be based on the Summary length or the word segmentation result as in the third and fourth embodiments, which is not repeated here. When the Summary has not been updated relative to the previous period, the corresponding weight is determined from the distance between the current period and the most recent update period; obtaining from the bitmap the number d of bits between B_e and the nearest bit in TK_e that takes the value 1 requires little calculation and is efficient.
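Steps S512 and S522 can be sketched in Python as follows. The function name `bitmap_weight` is hypothetical. Two points are assumptions, because the text leaves them open: d is taken as the positional distance between B_e and the nearest earlier 1 bit (so d is at least 1), and a window that contains no 1 bit at all is assigned wb_min.

```python
from typing import Sequence


def bitmap_weight(window_bits: Sequence[int],
                  wb_min: float,
                  wb_max: float) -> float:
    """Feature weight w_e from the period detection window TK_e.

    window_bits lists the a bits of TK_e, oldest first; the last element
    is B_e.
    """
    if not window_bits:
        raise ValueError("detection window must contain at least one bit")
    if window_bits[-1] == 1:
        return wb_max                              # step S512: just updated
    for d in range(1, len(window_bits)):           # step S522: nearest 1 bit
        if window_bits[-1 - d] == 1:
            candidate = (wb_max - wb_min) / d
            return wb_min if candidate < wb_min else candidate
    return wb_min                                  # assumption: no update seen
```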
As an example, to further improve the accuracy of feature weight acquisition, wb_max and wb_min may be dynamically adjusted based on the results of the most recent period detection window. After step S522, the method further includes:
Step S532, based on the maximum value max(w_e) and the minimum value min(w_e) of all w_e obtained in the current Summary period detection window TK, updating wb_max = max(w_e) and wb_min = min(w_e).
In order to further improve the efficiency of acquiring the feature weight, the bit operation may be performed directly based on bitmap, and as an example, the step S502 includes:
Step S542, taking B_e as the a-th bit of information in TK to obtain the bitmap_e in the e-th period detection window, and initializing WK as a binary number whose initial decimal value is 2^(a-1);
Step S552, judging whether the last bit of the current bitmap_e is 0; if so, executing step S562; otherwise, executing step S572;
Step S562, shifting the e-th period detection window right by one bit to update bitmap_e, shifting WK right by one bit, and returning to step S552;
Step S572, determining the current WK as the feature weight w_e of B_e.
Taking a = 8 as an example: taking B_e as the a-th bit of information in TK, the bitmap_e in the e-th period detection window is 00110000, and WK is initialized to 10000000. The last bit of the current bitmap_e is 0, so bitmap_e is shifted right by one bit to 00011000 and WK is shifted right by one bit to 01000000; this is repeated until the last bit of bitmap_e is 1, at which point the value of WK is the feature weight w_e of B_e. In this example four shifts are needed, leaving bitmap_e = 00000011 and WK = 00001000. Obtaining the feature weight w_e through the bit operations of steps S542 to S572 improves the efficiency of obtaining w_e.
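The shift loop of steps S542 to S572 maps directly onto integer bit operations, as in the following Python sketch. The function name `shift_weight` is hypothetical. The described loop would never terminate for a window with no 1 bit, so returning 0 in that case is an assumption of this sketch.

```python
def shift_weight(bitmap: int, a: int = 8) -> int:
    """Return WK after shifting until the last bit of the bitmap is 1.

    bitmap encodes the a-bit detection window with B_e as the lowest bit.
    """
    if bitmap == 0:
        return 0                   # assumption: no update inside the window
    wk = 1 << (a - 1)              # step S542: WK starts at 2^(a-1)
    while (bitmap & 1) == 0:       # step S552: is the last bit 0?
        bitmap >>= 1               # step S562: shift the window right
        wk >>= 1                   #            and shift WK right
    return wk                      # step S572: WK is the feature weight


# The worked example from the text: window 00110000 with a = 8.
print(shift_weight(0b00110000))    # prints 8 (binary 00001000)
```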
The fifth embodiment is particularly suitable for application scenarios with low Summary update frequency, that is, application scenarios with Summary update frequency lower than a preset update frequency threshold. The method and the device particularly adopt bitmap to store the periodic change rate of Summary, and greatly reduce the space occupied by data storage. Obtaining feature weights w based on bitmapseThe method has high operation speed and high accuracy, and improves the characteristic weight weAccuracy and efficiency of the process. Therefore, the accuracy and the training efficiency of training the industrial control network vulnerability prediction model are improved, and the accuracy and the prediction efficiency of predicting the industrial control network vulnerability outbreak probability are further improved. Reasonable defense measures are set based on the method, and the safety and the stability of the industrial control network are improved.
Embodiments three to five describe three methods for determining the feature weight corresponding to each Summary text. Embodiment six further describes a method for determining the original characteristic parameter value g(Str_e); the Summary characteristic parameter value PCS_e corresponding to each Summary text is then determined from g(Str_e) and the corresponding feature weight. g(Str_e) can be obtained from the Summary text features using an existing feature processing algorithm, or it can be obtained using the sub-scheme described in Embodiment six.
Embodiment six,
The system includes a preset industrial internet stop word bank, which stores stop words commonly used in the industrial internet field. The Summary text sequence corresponding to each sample vulnerability id is {Str_1, Str_2, …}, where Str_e is the Summary text corresponding to the e-th update period and e ranges from 1 to infinity.
The computer program stored in the storage medium includes a sixth computer program that, when executed by the processor, implements the steps of:
Step S601, based on the industrial internet stop word bank, remove the industrial internet stop words from Str_e and split Str_e at the positions of those stop words to generate the corresponding text segment sequence {Fr_e1, Fr_e2, …, Fr_eI}, where Fr_ei is the i-th text segment of Str_e, i ranges from 1 to I, and I is the total number of text segments of Str_e.
For example, take the text ABCDEFG, in which each letter represents a word, and assume that C and E are stop words in the industrial internet stop word bank. C and E are removed, and the remaining text is split into three text segments: AB, D and FG.
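The splitting in step S601 can be illustrated with a short Python sketch. The function name `split_on_stop_words` and the representation of a Summary text as a list of words are assumptions made for illustration only.

```python
def split_on_stop_words(words, stop_words):
    """Drop stop words and cut the word list at each stop-word position (S601)."""
    segments, current = [], []
    for word in words:
        if word in stop_words:
            # A stop word closes the segment collected so far and is discarded.
            if current:
                segments.append(current)
            current = []
        else:
            current.append(word)
    if current:
        segments.append(current)
    return segments


# The ABCDEFG example: each letter stands for a word, C and E are stop words.
print(split_on_stop_words(list("ABCDEFG"), {"C", "E"}))
# [['A', 'B'], ['D'], ['F', 'G']]
```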
Step S602, for each Fr_ei of each Str_e, perform the preset N-gram word segmentation, where N is a positive integer in the range [Kn1, Kn2]; merge and de-duplicate the word segments of all Fr_ei of each Str_e to obtain the corresponding word segmentation vector FB_e.
It should be noted that the number of Summary texts is huge. If N-gram word segmentation were performed directly on each Summary and the N-gram results of all Summary texts were one-hot encoded directly, the vector dimension would be too large, the required amount of calculation would be large, and the data processing efficiency would be low. In this embodiment, each Summary is first split at stop words in step S601, and N-gram word segmentation is then performed on the resulting text segments one by one, which greatly reduces the vector dimension and improves data processing efficiency. The specific N-gram word segmentation process is prior art and is not described here. Preferably, Kn1 is 3 and Kn2 is 6.
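A minimal sketch of step S602 is given below. It assumes word-level N-grams over the segments produced in step S601; the description does not state whether the N-grams are taken over words or characters, so this choice, the function name `ngram_vector` and the tuple representation of a word segment are all illustrative assumptions.

```python
def ngram_vector(segments, kn1=3, kn2=6):
    """Build FB_e: the merged, de-duplicated N-grams of all segments (S602)."""
    grams = set()
    for segment in segments:
        for n in range(kn1, kn2 + 1):
            # Slide a window of n words over the segment; segments shorter
            # than n contribute nothing for that n.
            for start in range(len(segment) - n + 1):
                grams.add(tuple(segment[start:start + n]))
    return sorted(grams)  # sorted only to make the result deterministic


print(ngram_vector([list("ABCD"), list("XY")]))
# [('A', 'B', 'C'), ('A', 'B', 'C', 'D'), ('B', 'C', 'D')]
```

Because N-grams never span a segment boundary, splitting at stop words first limits the number of distinct word segments that can be generated.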
Step S603, merge and de-duplicate the word segments in all FB_e to obtain a word segment set FC, and take the number of word segments in FC as the one-hot encoding dimension.
Step S604, one-hot encode each word segmentation vector FB_e based on the one-hot encoding dimension to obtain the original characteristic parameter value of each Str_e.
The specific one-hot encoding process is prior art and is not described here. It can be understood that once the word segmentation vector FB_e has been one-hot encoded, the original characteristic parameter value of the corresponding Str_e can be obtained from the encoding result.
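Steps S603 and S604 can be sketched as follows. The sketch assumes one reading of "one-hot encoding FB_e": each Str_e is mapped to a binary vector over FC with a 1 at the position of every word segment it contains. The function names `build_vocabulary` and `encode`, and the sample word segments, are hypothetical.

```python
def build_vocabulary(fb_list):
    """S603: merge and de-duplicate the word segments of every FB_e into FC."""
    return sorted({gram for fb in fb_list for gram in fb})


def encode(fb, fc):
    """S604: binary vector of length len(FC), 1 where the word segment is in FB_e."""
    present = set(fb)
    return [1 if gram in present else 0 for gram in fc]


fb_1 = ["abc", "bcd"]           # hypothetical FB_1
fb_2 = ["bcd", "cde", "def"]    # hypothetical FB_2
fc = build_vocabulary([fb_1, fb_2])
print(len(fc))           # 4, the encoding dimension
print(encode(fb_1, fc))  # [1, 1, 0, 0]
print(encode(fb_2, fc))  # [0, 1, 1, 1]
```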
Step S605, build a model input vector based on the original characteristic parameter value of Str_e corresponding to the sample vulnerability id, train to obtain the industrial control network vulnerability prediction model, and predict the industrial control network vulnerability outbreak probability based on the industrial control network vulnerability prediction model.
Step S605 can build the model input vector directly from the original characteristic parameter value of Str_e combined with other vulnerability characteristic parameter values. To further improve the accuracy of the Summary characteristic parameter value, each Str_e can also be given a corresponding weight. As an example, in step S605, building the model input vector based on the original characteristic parameter value of Str_e corresponding to the sample vulnerability id includes:
Step S615, based on the original characteristic parameter value g(Str_e) of Str_e corresponding to the sample vulnerability id and the corresponding feature weight w_e, determine the Summary characteristic parameter value PCS_e = w_e*g(Str_e) corresponding to each Str_e, and build the model input vector based on the Summary characteristic parameter values corresponding to the sample vulnerability id.
Here, the feature weight w_e corresponding to Str_e is determined based on how Str_e and the current Summary text have changed relative to the historical Summary text. Specifically, at least one of Embodiments three, four and five may be used to determine w_e, which is not described again here.
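The weighting in step S615 amounts to scaling the encoded vector by the feature weight. The sketch below assumes that g(Str_e) is a list of numbers such as the binary encoding from step S604; the function name `summary_feature_value` and the sample values are illustrative only.

```python
def summary_feature_value(w_e, g_str_e):
    """S615: PCS_e = w_e * g(Str_e), applied element by element."""
    return [w_e * value for value in g_str_e]


# Hypothetical inputs: w_e = 8 and a four-dimensional encoding of Str_e.
print(summary_feature_value(8, [1, 1, 0, 0]))  # [8, 8, 0, 0]
```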
To further improve the efficiency and accuracy of obtaining the Summary parameter value, the industrial internet stop word bank may be updated. As an example, when the processor executes the sixth computer program, it further implements an industrial internet stop word bank updating process, including the following steps:
Step S600, initialize N = Kn2 in the N-gram, where Kn denotes the current value of N;
Step S610, based on the industrial internet stop word bank, split the Summary texts corresponding to all vulnerability ids into text segments, removing the industrial internet stop words, and perform N-gram word segmentation on each text segment to obtain a list of N-gram word segment counts;
Step S620, add to the industrial internet stop word bank every N-gram word segment whose count is greater than a preset word segment count threshold D3, and judge whether Kn is greater than Kn1; if so, set Kn = Kn-1 and return to step S610; if Kn is equal to Kn1, end the industrial internet stop word bank updating process.
Through steps S600 to S620, the industrial internet stop word bank is updated by applying N-gram processing to all Summary texts, so that the stop word bank is updated in step with the Summary texts, which improves the efficiency and accuracy of obtaining the Summary parameter value.
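The updating process of steps S600 to S620 can be sketched as below. Several choices here are assumptions made for illustration: Summary texts are lists of words, stop word bank entries are tuples of one or more words, a longer entry is matched before a shorter one when splitting, and D3 is passed in as a fixed number rather than computed. The function names are hypothetical.

```python
from collections import Counter


def split_on_stop_entries(tokens, stop_entries):
    """Cut tokens at every stop entry (a tuple of words), dropping the entry."""
    lengths = sorted({len(entry) for entry in stop_entries}, reverse=True)
    segments, current, i = [], [], 0
    while i < len(tokens):
        # Length of the longest stop entry starting at position i, or 0.
        hit = next((n for n in lengths if tuple(tokens[i:i + n]) in stop_entries), 0)
        if hit:
            if current:
                segments.append(current)
            current = []
            i += hit
        else:
            current.append(tokens[i])
            i += 1
    if current:
        segments.append(current)
    return segments


def update_stop_word_bank(summaries, stop_entries, kn1, kn2, d3):
    """Steps S600 to S620 with a fixed threshold d3 (an assumption)."""
    stop_entries = set(stop_entries)
    for n in range(kn2, kn1 - 1, -1):      # S600, then Kn = Kn - 1 down to Kn1
        counts = Counter()
        for tokens in summaries:           # S610: split, then count N-grams
            for segment in split_on_stop_entries(tokens, stop_entries):
                for start in range(len(segment) - n + 1):
                    counts[tuple(segment[start:start + n])] += 1
        # S620: N-grams seen more than d3 times become stop entries.
        stop_entries |= {gram for gram, count in counts.items() if count > d3}
    return stop_entries


summaries = [list("ABCXY"), list("ABCZ"), list("QABC")]
print(sorted(update_stop_word_bank(summaries, {("X",)}, kn1=2, kn2=3, d3=2)))
# [('A', 'B', 'C'), ('X',)]
```

Starting from the largest N means that a frequent long phrase is added to the bank first, so its shorter sub-phrases are no longer counted in later passes.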
Preferably, D3 = f[(Figure 200948DEST_PATH_IMAGE005), SN, avg(Kn)], where D3 is positively correlated with (Figure DEST_PATH_IMAGE006) and with SN, and negatively correlated with avg(Kn); SN is the total number of Summary texts of all vulnerabilities, L_j …, and avg(Kn) is the average of all values of N in the N-gram.
In the sixth embodiment, the Summary texts are stripped of stop words and split into segments using the industrial internet stop word bank, which reduces the number of word segments obtained after N-gram processing of all Summary texts. This reduces the one-hot encoding dimension of the word segmentation vector FB_e and improves the efficiency and accuracy of obtaining the original characteristic parameter value of Str_e. As a result, the accuracy and efficiency of training the industrial control network vulnerability prediction model are improved, as are the accuracy and efficiency of predicting the industrial control network vulnerability outbreak probability. Reasonable defense measures can then be set on this basis, improving the security and stability of the industrial control network.
Embodiment seven,
A server comprising the system of at least one of embodiments one through six.
The server can quickly and accurately train the industrial control network vulnerability prediction model based on the internet vulnerability characteristic parameters and the industrial control network vulnerability characteristic parameters, so that the industrial control network vulnerability outbreak probability can be quickly and accurately predicted based on the industrial control network vulnerability prediction model, reasonable defense measures are set based on the method, and the safety and the stability of the industrial control network are improved.
It should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, some of the steps may be performed in parallel, concurrently or at the same time. In addition, the order of the steps may be rearranged. A process may be terminated when its operations are completed, but may have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc.
It is to be understood that the invention is not limited to the specific embodiments disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.

Claims (7)

1. A system for predicting industrial control network vulnerabilities based on correction parameters, characterized in that
the system comprises a processor, a database and a storage medium storing a computer program, wherein the processor is communicatively connected to the database; the database stores the internet vulnerability characteristic parameter list, the internet vulnerability actual outbreak probability list and the industrial control network vulnerability actual outbreak probability list corresponding to all internet vulnerability ids, wherein P_m is the m-th internet vulnerability characteristic parameter, m ranges from 1 to M, and M is the number of internet vulnerability characteristic parameters; the computer program stored in the storage medium includes a second computer program that, when executed by the processor, implements the steps of:
step S201, obtaining from the database, for each sample vulnerability id within a preset training period, the parameter value list of P_m, the internet vulnerability actual outbreak probability list, the industrial control network vulnerability actual outbreak probability list and the industrial control network vulnerability outbreak probability truth value;
step S202, determining a correction parameter P_CVE corresponding to each sample vulnerability id based on the internet vulnerability actual outbreak probability list and the industrial control network vulnerability actual outbreak probability list, determining the corresponding training parameter value PC_m based on the parameter value list of P_m corresponding to each sample vulnerability id, and generating a model input vector for each sample vulnerability id based on PC_m;
step S203, training on the correction parameters P_CVE, the model input vectors and the industrial control network vulnerability outbreak probability truth values corresponding to all sample vulnerability ids to obtain an industrial control network vulnerability prediction model:
f(x) = b_0*P_CVE + b_1*x_1 + b_2*x_2 + … + b_M*x_M
wherein x_j is the training parameter value of P_m corresponding to the sample vulnerability id, b_j is the weight coefficient of x_j, j ranges from 1 to M, and b_0 is the weight coefficient of P_CVE;
the industrial control network vulnerability prediction model constructs an input vector based on the internet vulnerability characteristic parameters, and simultaneously takes as model input the corresponding correction parameter P_CVE, which represents the association between the internet vulnerability actual outbreak probability and the industrial control network vulnerability actual outbreak probability;
and step S204, predicting the industrial control network vulnerability outbreak probability based on the industrial control network vulnerability prediction model.
2. The system of claim 1,
in the step S202, determining the correction parameter P_CVE corresponding to each sample vulnerability id based on the internet vulnerability actual outbreak probability list and the industrial control network vulnerability actual outbreak probability list comprises:
step S212, obtaining an internet and industrial control network vulnerability outbreak associated parameter list {R_1, R_2, …, R_C} according to the internet vulnerability actual outbreak probability list and the industrial control network vulnerability actual outbreak probability list corresponding to each sample vulnerability id within the preset training period, wherein R_c is the c-th associated parameter, c ranges from 1 to C, C is the total number of associated parameters obtained for the sample vulnerability id within the preset training period, R_c = CH_c1/CH_c2, CH_c1 is the internet vulnerability actual outbreak probability at the c-th moment, and CH_c2 is the industrial control network vulnerability actual outbreak probability at the c-th moment;
step S222, determining the correction parameter P_CVE corresponding to the sample vulnerability id according to {R_1, R_2, …, R_C}.
3. The system of claim 2,
the step S222 includes:
step S232, obtaining the mean value R_AVG of {R_1, R_2, …, R_C}, and obtaining a first variation parameter SR1 according to R_AVG and {R_1, R_2, …, R_C}:
Figure FDA0003585540780000021
step S242, if SR1 is greater than or equal to a preset second threshold D2, obtaining the maximum value R_max of {R_1, R_2, …, R_C} and setting P_CVE = R_max; otherwise, setting P_CVE = R_AVG.
4. The system of claim 3,
in the step S242, if SR1 is smaller than D2, the following steps are performed:
step S252, obtaining the minimum value R_min of {R_1, R_2, …, R_C}, and obtaining a second variation parameter SR2 according to R_min, R_AVG and R_max:
Figure FDA0003585540780000022
step S262, if SR2 is greater than or equal to 1, setting P_CVE = R_min; otherwise, setting P_CVE = R_max.
5. The system of claim 1,
in the step S202, determining the corresponding training parameter value PC_m based on the parameter value list of P_m corresponding to each sample vulnerability id comprises:
step S272, determining the maximum, minimum or mean of all parameter values in the parameter value list of P_m corresponding to each sample vulnerability id as the corresponding training parameter value PC_m.
6. The system of claim 1,
the industrial control network vulnerability actual outbreak probability corresponding to the sample prediction moment is the industrial control network vulnerability outbreak probability truth value;
the industrial control network vulnerability actual outbreak probability is obtained by dividing the number of industrial control network host devices that report an outbreak of the vulnerability within the update period of the industrial control network vulnerability outbreak probability by the number of all monitored industrial control network host devices;
the internet vulnerability actual outbreak probability is obtained by dividing the number of internet host devices that report an outbreak of the vulnerability within the update period of the internet vulnerability outbreak probability by the number of all monitored internet host devices.
7. A server, characterized in that it comprises a system according to any one of claims 1 to 6.
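As a non-limiting illustration of the quantities recited in claims 1, 2, 5 and 6, the Python sketch below computes an actual outbreak probability, the associated parameters R_c, a training parameter value PC_m, and the model output f(x). All function names, host counts and weight coefficients are hypothetical; the training that produces the coefficients b_0 to b_M is not shown, and the selection of P_CVE through SR1 and SR2 is omitted because those formulas appear only as figures. The example simply uses the mean R_AVG as P_CVE.

```python
from statistics import mean


def outbreak_probability(reporting_hosts, monitored_hosts):
    """Claim 6: hosts reporting an outbreak divided by all monitored hosts."""
    return reporting_hosts / monitored_hosts


def association_parameters(internet_probs, ics_probs):
    """Claim 2: R_c = CH_c1 / CH_c2 (assumes every CH_c2 is non-zero)."""
    return [ch1 / ch2 for ch1, ch2 in zip(internet_probs, ics_probs)]


def training_value(values, mode="mean"):
    """Claim 5: PC_m is the maximum, minimum or mean of the P_m value list."""
    return {"max": max, "min": min, "mean": mean}[mode](values)


def predict(b, p_cve, x):
    """Claim 1: f(x) = b_0*P_CVE + b_1*x_1 + ... + b_M*x_M."""
    return b[0] * p_cve + sum(b_j * x_j for b_j, x_j in zip(b[1:], x))


# Hypothetical monitoring data for two moments.
internet = [outbreak_probability(10, 20), outbreak_probability(10, 20)]  # 0.5, 0.5
ics = [outbreak_probability(5, 20), outbreak_probability(5, 40)]         # 0.25, 0.125
r = association_parameters(internet, ics)
print(r)        # [2.0, 4.0]
print(mean(r))  # 3.0, used here as P_CVE for illustration

# Hypothetical weight coefficients b_0..b_2 and training parameter values x_1, x_2.
print(predict([0.5, 0.25, 0.125], 2.0, [4.0, 8.0]))  # 3.0
```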
CN202111358159.7A 2021-11-17 2021-11-17 System for predicting industrial control network bugs based on correction parameters Active CN114021149B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111358159.7A CN114021149B (en) 2021-11-17 2021-11-17 System for predicting industrial control network bugs based on correction parameters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111358159.7A CN114021149B (en) 2021-11-17 2021-11-17 System for predicting industrial control network bugs based on correction parameters

Publications (2)

Publication Number Publication Date
CN114021149A CN114021149A (en) 2022-02-08
CN114021149B true CN114021149B (en) 2022-06-03

Family

ID=80064789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111358159.7A Active CN114021149B (en) 2021-11-17 2021-11-17 System for predicting industrial control network bugs based on correction parameters

Country Status (1)

Country Link
CN (1) CN114021149B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110784486A (en) * 2019-11-07 2020-02-11 广州安加互联科技有限公司 Industrial vulnerability scanning method and system
CN112016097A (en) * 2020-08-28 2020-12-01 重庆文理学院 Method for predicting time of network security vulnerability being utilized
CN112637220A (en) * 2020-12-25 2021-04-09 中能融合智慧科技有限公司 Industrial control system safety protection method and device
CN112801359A (en) * 2021-01-25 2021-05-14 海尔数字科技(青岛)有限公司 Industrial internet security situation prediction method and device, electronic equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107645533A (en) * 2016-07-22 2018-01-30 阿里巴巴集团控股有限公司 Data processing method, data transmission method for uplink, Risk Identification Method and equipment

Also Published As

Publication number Publication date
CN114021149A (en) 2022-02-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A system for predicting industrial control network vulnerabilities based on modified parameters

Effective date of registration: 20221130

Granted publication date: 20220603

Pledgee: Zhejiang Commercial Bank Co.,Ltd. Jinan Branch

Pledgor: Shandong Yuntian Safety Technology Co.,Ltd.

Registration number: Y2022980024358