CN110225019A - Network security processing method and device - Google Patents
- Publication number: CN110225019A
- Application number: CN201910479765.0A
- Authority: CN (China)
- Prior art keywords: network, network security, reward, state, probability
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/20—Network architectures or network communication protocols for network security for managing network security; network security policies in general
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The embodiments of the present application disclose a network security processing method and device. The method detects the network security state of a target network; obtains, based on a network security mapping relationship, the execution probability of executing a preset network security response in that network security state, where the network security mapping relationship comprises the mapping between network security states and the probabilities of executing preset network security responses; obtains the state reward corresponding to the network security state based on the execution probability; calculates, based on the obtained state reward, a target probability that maximizes the current state reward corresponding to the network security state; and updates the network security mapping relationship based on the target probability to obtain an updated network security mapping relationship. The scheme can improve the efficiency of network security processing.
Description
Technical field
The present application relates to the field of computer technology, and in particular to a network security processing method and device.
Background
As the network security situation grows increasingly complex, events that threaten network security, such as malicious activity and abnormal attacks, occur from time to time. Once a network is attacked, problems such as data leakage and server downtime can easily follow, so it is necessary to handle network security events promptly.
At present, network security is handled mainly by security experts. For example, a security expert detects the network security events that occur in the network and provides a corresponding solution for each event. This approach to network security processing is very inefficient.
Summary of the invention
The embodiments of the present application provide a network security processing method and device that can improve the efficiency of network security processing.
An embodiment of the present application provides a network security processing method, comprising:
detecting the network security state of a target network;
obtaining, based on a network security mapping relationship, the execution probability of executing a preset network security response in the network security state, wherein the network security mapping relationship comprises the mapping between network security states and the probabilities of executing preset network security responses;
obtaining the state reward corresponding to the network security state based on the execution probability;
calculating, based on the obtained state reward, a target probability that maximizes the current state reward corresponding to the network security state; and
updating the network security mapping relationship based on the target probability to obtain an updated network security mapping relationship.
Correspondingly, an embodiment of the present application further provides a network security processing device, comprising:
a detection module, configured to detect the network security state of a target network;
a probability obtaining module, configured to obtain, based on a network security mapping relationship, the execution probability of executing a preset network security response in the network security state, wherein the network security mapping relationship comprises the mapping between network security states and the probabilities of executing preset network security responses;
a reward obtaining module, configured to obtain the state reward corresponding to the network security state based on the execution probability;
a calculation module, configured to calculate, based on the obtained state reward, a target probability that maximizes the current state reward corresponding to the network security state; and
an update module, configured to update the network security mapping relationship based on the target probability to obtain an updated network security mapping relationship.
Correspondingly, an embodiment of the present application further provides a storage medium storing instructions which, when executed by a processor, implement the steps of any network security processing method provided by the embodiments of the present application.
Correspondingly, an embodiment of the present application further provides a computer device comprising a processor and a memory, wherein the memory stores a plurality of instructions and the processor loads the instructions from the memory to execute the steps of any network security processing method provided by the embodiments of the present application.
The embodiments of the present application detect the network security state of a target network; obtain, based on a network security mapping relationship, the execution probability of executing a preset network security response in that state, where the mapping relationship comprises the mapping between network security states and the probabilities of executing preset network security responses; obtain the state reward corresponding to the network security state based on the execution probability; calculate, based on the obtained state reward, a target probability that maximizes the current state reward corresponding to the network security state; and update the network security mapping relationship based on the target probability to obtain an updated network security mapping relationship. The scheme can improve the efficiency of network security processing.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a scenario of the network security processing system provided by the embodiments of the present application;
Fig. 2 is a first schematic flowchart of the network security processing method provided by the embodiments of the present application;
Fig. 3 is a second schematic flowchart of the network security processing method provided by the embodiments of the present application;
Fig. 4 is a framework diagram of the network security processing method provided by the embodiments of the present application;
Fig. 5 is a schematic diagram of the technical flow of the network security processing method provided by the embodiments of the present application;
Fig. 6 is a schematic flowchart of solving the reinforcement learning model provided by the embodiments of the present application;
Fig. 7 is a first schematic structural diagram of the network security processing method provided by the embodiments of the present application;
Fig. 8 is a schematic structural diagram of the computer device provided by the embodiments of the present application.
Detailed description of embodiments
Referring to the drawings, in which like reference numerals denote like components, the principles of the present application are illustrated as implemented in a suitable computing environment. The following description is based on the illustrated specific embodiments of the present application and should not be regarded as limiting other specific embodiments not detailed herein.
In the following description, specific embodiments of the present application are described with reference to steps and symbols executed by one or more computers, unless otherwise stated. These steps and operations are therefore referred to several times as being executed by a computer. Computer execution as used herein includes operations by a computer processing unit on electronic signals that represent data in a structured form. Such operations transform the data or maintain it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations in memory with specific properties defined by the data format. However, although the principles of the present application are described in the above terms, this is not meant as a limitation; those skilled in the art will appreciate that various steps and operations described below may also be implemented in hardware.
The term "module" as used herein may be regarded as a software object executed on the computing system. The different components, modules, engines and services described herein may be regarded as objects implemented on the computing system. The devices and methods described herein are preferably implemented in software, but may of course also be implemented in hardware, all of which fall within the protection scope of the present application.
The embodiments of the present application provide a network security processing method. The method may be executed by the network security processing device provided by the embodiments of the present application, or by a network device into which that device is integrated, where the network security processing device may be implemented in hardware or software. The network device may be a smartphone, tablet computer, palmtop computer, laptop computer, desktop computer, or similar equipment.
Referring to Fig. 1, Fig. 1 is a schematic diagram of an application scenario of the network security processing method provided by the embodiments of the present application. Taking the case where the network security processing device is integrated in a network device as an example, the network device can detect the network security state of a target network; obtain, based on a network security mapping relationship, the execution probability of executing a preset network security response in that state, where the mapping relationship comprises the mapping between network security states and the probabilities of executing preset network security responses; obtain the state reward corresponding to the network security state based on the execution probability; calculate, based on the obtained state reward, a target probability that maximizes the current state reward corresponding to the network security state; and update the network security mapping relationship based on the target probability to obtain an updated network security mapping relationship.
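The last two steps above (computing a target probability that maximizes the state reward, then updating the mapping) can be sketched as a greedy policy-improvement step. This is only one common reinforcement-learning reading and is not necessarily the procedure the application uses. The function name `improve_policy`, the tables `R`, `P` and `v`, the value of `gamma` and all toy numbers are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

def improve_policy(
    states: List[str],
    responses: List[str],
    R: Dict[Tuple[str, str], float],
    P: Dict[Tuple[str, str], Dict[str, float]],
    gamma: float,
    v: Dict[str, float],
) -> Dict[str, Dict[str, float]]:
    """Return an updated mapping that puts all probability on the best response."""
    new_pi = {}
    for s in states:
        # Expected reward of each response: immediate part plus discounted future part.
        q = {
            a: R[(s, a)] + gamma * sum(p * v[s2] for s2, p in P[(s, a)].items())
            for a in responses
        }
        best = max(q, key=q.get)
        new_pi[s] = {a: (1.0 if a == best else 0.0) for a in responses}
    return new_pi

# Toy data (hypothetical): two states, two responses, every response leads back to s1.
states = ["s1", "s2"]
responses = ["a1", "a5"]
R = {("s1", "a1"): 0.0, ("s1", "a5"): 1.0, ("s2", "a1"): 1.0, ("s2", "a5"): 0.0}
P = {key: {"s1": 1.0} for key in R}
v = {"s1": 0.0, "s2": 0.0}
updated_mapping = improve_policy(states, responses, R, P, 0.9, v)
```

A fully greedy update is the simplest choice; a softer update that moves the probabilities only part of the way toward the best response would fit the same description.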
Referring to Fig. 2, Fig. 2 is a schematic flowchart of the network security processing method provided by the embodiments of the present application. The specific flow of the method may be as follows:
201. Detect the network security state of the target network.
The network security state may be the network state obtained by performing security detection on the network security scenario in which the target network is located. For example, network security states may include a safe state, a network-scanned state, a vulnerability-exploited state, an under-attack state, a compromised state, and so on.
In one embodiment, the set of network security states may be denoted by S, and S may contain multiple network security states s. For example, S may include the safe state s1, the network-scanned state s2, the vulnerability-exploited state s3, the under-attack state s4, the compromised state s5, and so on.
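As a minimal sketch, the state set S described above could be represented as an enumeration. The class name `NetworkSecurityState` and the member names are illustrative assumptions; only the labels s1 to s5 come from the text.

```python
from enum import Enum

class NetworkSecurityState(Enum):
    """Network security states s1..s5 named in the description."""
    SAFE = "s1"
    SCANNED = "s2"
    VULNERABILITY_EXPLOITED = "s3"
    UNDER_ATTACK = "s4"
    COMPROMISED = "s5"

# The state set S is the collection of all enumeration members.
STATE_SET = list(NetworkSecurityState)
```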
In practical applications, the network security state of the target network can be detected. For example, the network security scenario in which the target network is located can be detected and determined to be a terminal intrusion scenario. According to the network security states that may arise in a terminal intrusion scenario, the state set S is then defined to include the safe state s1, the network-scanned state s2, the vulnerability-exploited state s3, the under-attack state s4, the compromised state s5, and so on. The target network is then detected to obtain its network security state.
In one embodiment, to improve the accuracy with which the network security state is obtained, network security parameters can also be detected by multiple network security engines. Specifically, the step "detecting the network security state of the target network" may include:
obtaining a network security state set, the set containing multiple network security states;
detecting, with multiple network security engines, the network security result of the target network corresponding to each network security engine;
determining, according to the network security results, the network security state of the target network from the multiple network security states in the network security state set.
In practical applications, a network security state set S containing multiple network security states s can be obtained. Multiple network security engines can then each detect the network security result of the target network corresponding to that engine, with each engine producing one network security result. For example, the engines can separately examine parts such as the database, the transmission network and the user terminal to obtain network security results. The network security state corresponding to the target network is then determined within the state set according to these multiple results.
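The multi-engine detection described above can be sketched as follows. The text does not say how the per-engine results are combined, so this sketch assumes the most severe reported state wins. The severity ordering, the function name `detect_state` and the three stub engines are all hypothetical.

```python
from typing import Callable, Dict, List

# Assumed severity order: later entries are treated as more severe.
STATES: List[str] = ["s1", "s2", "s3", "s4", "s5"]

def detect_state(engines: Dict[str, Callable[[], str]]) -> str:
    """Run every engine and return the most severe state any engine reports."""
    results = {name: engine() for name, engine in engines.items()}
    for name, state in results.items():
        if state not in STATES:
            raise ValueError(f"engine {name!r} returned unknown state {state!r}")
    return max(results.values(), key=STATES.index)

# Stub engines standing in for database, transmission network and terminal checks.
engines = {
    "database": lambda: "s1",
    "transmission_network": lambda: "s2",
    "user_terminal": lambda: "s4",
}
current_state = detect_state(engines)
```

Other combination rules, such as a majority vote or a weighted score, would fit the description equally well.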
In one embodiment, after the network security scenario in which the target network is located has been detected and determined to be a terminal intrusion scenario, the entity that carries out network security responses according to the network security state can be defined as the agent. For example, in a terminal intrusion scenario the agent can be defined as the root role of the terminal host.
The root user may be the only superuser in the system and may hold all permissions in the system, such as starting or stopping a process, deleting or adding users, and adding or disabling hardware.
202. Based on the network security mapping relationship, obtain the execution probability of executing a preset network security response in the network security state.
A preset network security response may be the preparation a network makes in order to cope with various abnormal network events, together with the measures taken after such an event occurs. For example, preset network security responses may include closing an abnormally connected port, blocking traffic from a certain IP, isolating a running suspicious file, deleting a virus file, closing a malicious process, and so on.
In one embodiment, after the network security scenario in which the target network is located has been detected and determined to be a terminal intrusion scenario, the set of preset network security responses can be denoted by A. Following the conventional handling procedure for a host under abnormal attack, A is defined to include multiple preset network security responses a. For example, A may include closing the scanned port (a1), blocking and deleting abnormally downloaded files (a2), closing abnormal connections (a3), locking abnormally logged-in accounts (a4), and reporting to the administrator (a5).
The execution probability of a preset network security response may be the basis on which the target network, in a given network security state and given its knowledge of that state, selects one preset network security response to execute from among the available ones. For example, the execution probabilities of the various preset network security responses can be expressed as the probability that the target network, in a given network security state, executes each preset network security response a in the response set A. The execution probability can thus be expressed as a probability distribution over the preset network security response set.
The network security mapping relationship may comprise the mapping between network security states and the probabilities of executing the various preset network security responses. For example, it may give, for each network security state of the target network, the probability of executing each preset network security response a in the response set A.
In one embodiment, the network security mapping relationship can be expressed as a policy, denoted by π. The policy represents, for the target network in a given network security state, the probability distribution over the preset network security response set. For example, the policy can be written as:
$\pi(a \mid s) = P[A_t = a \mid S_t = s]$
Here A denotes the preset network security response set, a denotes a preset network security response in A, S denotes the network security state set, s denotes a network security state in S, and t denotes the current time.
The policy π gives the probability that the agent takes each possible preset network security response a in a given network security state s. The policy depends only on the current network security state and not on the history of network security states. The policy π is also static, that is, independent of time, although the agent can still adjust it in real time.
Because the basis for executing a preset network security response is a probability distribution, the target network may, under the same policy, produce different preset network security responses for different network security states, and may also produce different preset network security responses for the same network security state.
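The stochastic policy described above can be sketched as a table of probabilities from which a response is sampled. The probabilities, the `policy` table and the function name `sample_response` are hypothetical; only the labels s2, s4 and a1 to a5 come from the text.

```python
import random
from typing import Dict

# Hypothetical policy table: policy[s][a] is the probability of response a in state s.
policy: Dict[str, Dict[str, float]] = {
    "s2": {"a1": 0.6, "a2": 0.1, "a3": 0.1, "a4": 0.1, "a5": 0.1},
    "s4": {"a1": 0.1, "a2": 0.2, "a3": 0.4, "a4": 0.2, "a5": 0.1},
}

def sample_response(pi: Dict[str, Dict[str, float]], state: str, rng: random.Random) -> str:
    """Draw one preset response for `state` according to pi(a | s)."""
    dist = pi[state]
    if abs(sum(dist.values()) - 1.0) > 1e-9:
        raise ValueError(f"probabilities for state {state!r} must sum to 1")
    responses = list(dist)
    weights = [dist[a] for a in responses]
    return rng.choices(responses, weights=weights, k=1)[0]

# Repeated draws in the same state can yield different responses.
rng = random.Random(0)
draws = [sample_response(policy, "s2", rng) for _ in range(1000)]
```

Because the choice is sampled rather than fixed, the same state can produce different responses on different draws, which is the behaviour the paragraph above describes.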
In one embodiment, the network security processing device can be integrated in a reinforcement learning model, and multiple steps of the network security processing method can be executed by the reinforcement learning model.
A reinforcement learning model uses the reward obtained by interacting with the environment to guide its responses (behaviour) and learns by trial and error, with the goal of obtaining the maximum reward. Reinforcement learning emphasizes how to respond on the basis of the state so as to maximize the expected return. For example, the reinforcement learning model may be a Markov decision model.
The Markov property means that, for a random process with a given present state and all past states, the conditional probability distribution of the future state depends only on the current state. In other words, given the present state, the future of the random process is conditionally independent of its past states. A random process with this property has the Markov property and can be called a Markov process.
Markov decision-making means that the agent periodically or continuously observes a stochastic dynamic system with the Markov property and makes decisions in sequence. At each moment, according to the state it observes, the agent selects a response (behaviour) from the set of available responses based on the policy and executes it. The future state of the system is random, and its state transition probability has the Markov property. The agent then makes a new decision according to the newly observed state and executes the corresponding response, and this process is repeated.
In practical applications, the execution probabilities of the various preset network security responses in a network security state can be obtained based on the reinforcement learning model. For example, the network security state s can be obtained and, based on the reinforcement learning model, the corresponding network security mapping relationship, namely the policy π, can be obtained. The policy π can be written as:
$\pi(a \mid s) = P[A_t = a \mid S_t = s]$
Here A denotes the preset network security response set, a denotes a preset network security response in A, S denotes the network security state set, s denotes a network security state in S, and t denotes the current time.
In one embodiment, a reinforcement learning system can also be constructed. For example, the system can be constructed as the tuple <network security state set S, preset network security response set A, reward function R, state transition function P, decay factor γ>, and the network security mapping relationship can be defined and expressed as the policy π.
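A minimal sketch of the tuple <S, A, R, P, γ> as a data structure follows. The class name `SecurityMDP`, the dictionary layouts, the default `gamma` and the toy numbers are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class SecurityMDP:
    """Container for <S, A, R, P, gamma> as listed in the description."""
    states: List[str]                                    # S
    responses: List[str]                                 # A
    reward: Dict[Tuple[str, str], float]                 # R[(s, a)]
    transition: Dict[Tuple[str, str], Dict[str, float]]  # P[(s, a)][s']
    gamma: float = 0.9  # assumed value; the text only asks for a number between 0 and 1

    def validate(self) -> None:
        """Check that gamma is in range and each transition row sums to 1."""
        if not 0.0 <= self.gamma <= 1.0:
            raise ValueError("gamma must lie between 0 and 1")
        for key, next_states in self.transition.items():
            if abs(sum(next_states.values()) - 1.0) > 1e-9:
                raise ValueError(f"transition probabilities for {key} must sum to 1")

# Toy instance (hypothetical numbers).
mdp = SecurityMDP(
    states=["s1", "s2"],
    responses=["a1", "a5"],
    reward={("s1", "a1"): 0.0, ("s1", "a5"): 0.0, ("s2", "a1"): 1.0, ("s2", "a5"): 0.0},
    transition={
        ("s1", "a1"): {"s1": 1.0},
        ("s1", "a5"): {"s1": 1.0},
        ("s2", "a1"): {"s1": 0.8, "s2": 0.2},
        ("s2", "a5"): {"s2": 1.0},
    },
)
mdp.validate()
```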
The reward function R represents the cumulative reward that the agent obtains for a preset network security response a taken at time t, when the network is in a given network security state s. The reward function can take an empirical expectation. R can be a function of the network security state s and the preset network security response a, and its formula can be as follows:
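The formula itself is not reproduced in this text. A standard form from Markov decision processes that is consistent with the symbols defined here, offered as an assumption rather than as the application's own expression, would be the following, where $R_{t+1}$ is the reward received after taking response a in state s at time t:

```latex
R_s^a = \mathbb{E}\left[ R_{t+1} \mid S_t = s,\ A_t = a \right]
```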
Here A denotes the preset network security response set, a denotes a preset network security response in A, S denotes the network security state set, s denotes a network security state in S, and t denotes the current time.
The state transition function P represents the probability that, after a preset network security response a is taken at time t with the network in network security state s, the network jumps from state s to state s' at the next moment. The state transition function P may be modeled as a Gaussian distribution, and its formula can be as follows:
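The formula itself is not reproduced in this text, and the Gaussian parameterization mentioned above cannot be recovered from it. The generic definition of a state transition probability that matches the description, given here as an assumption, is:

```latex
P_{ss'}^{a} = \mathbb{P}\left[ S_{t+1} = s' \mid S_t = s,\ A_t = a \right]
```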
Here A denotes the preset network security response set, a denotes a preset network security response in A, S denotes the network security state set, s denotes a network security state in S, and t denotes the current time.
The decay factor γ is used to adjust how much the responses at different points in time contribute to the reward. The network security states that the network goes through later are influenced by the current network security state, but this influence gradually weakens. The decay factor γ expresses this weakening and can be chosen as a value between 0 and 1. In one embodiment, the reinforcement learning system may also omit the decay factor γ.
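The effect of the decay factor can be illustrated with a small discounted-sum computation. The function name `discounted_return` and the reward sequence are hypothetical.

```python
from typing import Sequence

def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum rewards with the k-th future reward weighted by gamma ** k."""
    total = 0.0
    # Work backwards so each step applies one more factor of gamma to later rewards.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Three equal rewards: later ones count for less when gamma < 1.
example = discounted_return([1.0, 1.0, 1.0], 0.5)  # 1 + 0.5 + 0.25
```

With `gamma` equal to 0 only the first reward counts, and with `gamma` equal to 1 all rewards count equally.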
For the definitions of the network security state set S, the preset network security response set A and the policy π, refer to the description above; they are not repeated here.
203. Obtain the state reward corresponding to the network security state based on the execution probability.
The state reward may be the expected value of the cumulative reward that can be obtained in a given network security state when the probabilities determined by the network security mapping relationship are followed. For example, when the target network is in network security state s and takes several preset network security responses a according to the probabilities determined by the mapping relationship, the state reward is the expected value of the total cumulative reward the agent can obtain.
In practical applications, the state reward corresponding to the network security state can be obtained according to the execution probability. For example, the state reward $v_\pi(s)$ corresponding to network security state s can be obtained according to the execution probability, that is, from the Bellman equation.
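The Bellman equation itself is not reproduced in this text. Its standard expectation form for a policy π, offered as an assumption, is shown below, where $R_s^a$ denotes the expected immediate reward for response a in state s, $P_{ss'}^{a}$ the probability of moving from s to s' under a, and γ the decay factor:

```latex
v_\pi(s) = \sum_{a \in A} \pi(a \mid s) \left( R_s^a + \gamma \sum_{s' \in S} P_{ss'}^{a}\, v_\pi(s') \right)
```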
In one embodiment, to improve the accuracy with which the state reward is obtained, the immediate reward and the future reward can be obtained separately and then combined into the state reward. Specifically, the step "obtaining the state reward corresponding to the network security state based on the execution probability" may include:
obtaining, based on the execution probability, the immediate reward and the future reward corresponding to the preset network security responses;
fusing the immediate reward and the future reward to obtain the state reward corresponding to the network security state.
The state reward may include an immediate reward and a future reward.
The immediate reward may be the reward given for the immediate network security response, based on network security state s.
The immediate network security response may be the current preset network security response a that is determined for network security state s according to the network security mapping relationship (that is, the policy π).
The future reward may be the reward given for future network security responses, based on network security state s. After the immediate network security response a is executed in state s, the network security state changes to s'. The future reward may then be the reward for the several preset network security responses that can be carried out in the future from state s' under the network security mapping relationship (that is, the policy π).
A future network security response may be a preset network security response that can be carried out in the future, after the immediate network security response a for network security state s has been determined according to the network security mapping relationship (that is, the policy π). Future network security responses are used to calculate the future reward. After the immediate response a is executed in state s and the network security state changes to s', the future network security responses under policy π are the preset network security response a' corresponding to state s' and the responses that follow a'.
In practical applications, for example, the immediate network security response a corresponding to network security state s can be obtained according to the execution probability. The immediate reward obtained by executing a in state s is then obtained, together with the future reward obtained by continuing to follow the execution probability after a has been executed. The state reward corresponding to network security state s is obtained from the immediate reward and the future reward.
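Combining the immediate and future rewards into a state reward can be sketched as a one-step backup. The sketch assumes that "fusing" means adding the discounted expected future value to the immediate reward, which the text does not state explicitly. The function name `state_reward` and the tables `pi`, `R`, `P` and `v` with their numbers are hypothetical.

```python
from typing import Dict, Tuple

def state_reward(
    s: str,
    pi: Dict[str, Dict[str, float]],
    R: Dict[Tuple[str, str], float],
    P: Dict[Tuple[str, str], Dict[str, float]],
    gamma: float,
    v: Dict[str, float],
) -> float:
    """Expected immediate reward plus discounted future reward under policy pi."""
    total = 0.0
    for a, prob in pi[s].items():
        immediate = R[(s, a)]
        # Future part: value of the next state s', averaged over the transitions.
        future = gamma * sum(p * v[s2] for s2, p in P[(s, a)].items())
        total += prob * (immediate + future)
    return total

# Toy data (hypothetical): a1 returns the network to s1, a5 leaves it in s2.
pi = {"s2": {"a1": 0.5, "a5": 0.5}}
R = {("s2", "a1"): 1.0, ("s2", "a5"): 0.0}
P = {("s2", "a1"): {"s1": 1.0}, ("s2", "a5"): {"s2": 1.0}}
v = {"s1": 2.0, "s2": 0.0}
value = state_reward("s2", pi, R, P, 0.5, v)
```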
In one embodiment, to improve the accuracy of network security processing, the immediate reward can be obtained from the immediate network security response. Specifically, the step "obtaining, based on the execution probability, the immediate reward and the future reward corresponding to the preset network security responses" may include:
obtaining, based on the execution probability, the immediate network security response corresponding to the network security state from among multiple preset network security responses;
obtaining the immediate reward corresponding to the immediate network security response;
obtaining, based on the execution probability, the future reward corresponding to the preset network security responses.
In practical applications, the immediate network security response corresponding to the network security state can be obtained from among multiple preset network security responses based on the execution probability. The immediate reward corresponding to that response is then obtained, and the future reward corresponding to the preset network security responses is obtained based on the execution probability. For example, in network security state s, the preset network security response set A can be obtained and, following the conventional handling procedure for a host under abnormal attack, defined to include multiple preset network security responses a, such as closing the scanned port (a1), blocking and deleting abnormally downloaded files (a2), closing abnormal connections (a3), locking abnormally logged-in accounts (a4), and reporting to the administrator (a5). Then, in state s, the immediate network security response corresponding to the state is obtained from A according to the execution probability, its immediate reward is obtained, and the future reward corresponding to the preset network security responses is obtained based on the execution probability.
In one embodiment, the future reward can be obtained from the expected value of the reward that the target network may receive after executing preset network security responses. Specifically, the step "obtaining, based on the execution probability, the future reward corresponding to the preset network security responses" may include:
obtaining the expected future reward that the target network obtains by carrying out security responses according to the execution probability after executing the immediate network security response;
obtaining the future reward based on the expected future reward.
In practical applications, for example, after the target network obtains the immediate network security response according to the execution probability and executes it, the network security state of the target network changes from s to the post-response state s'. When the target network is in state s', it can again obtain the next preset network security response according to the execution probability. After that next response is executed, the network security state changes again, and so on. It follows that, by following the same execution probability, the target network can obtain a series of preset network security responses to execute, and the expected future reward is the predicted reward for all the preset network security responses that may be executed in the future. Therefore, the expected future reward that the target network obtains by carrying out security responses according to the execution probability after executing the immediate network security response can be obtained and used as the future reward.
In one embodiment, the instant network security response can also be audited so that the agent is forbidden from executing default dangerous responses, which improves the safety of network security processing. Specifically, the step "obtaining the instant reward corresponding to the instant network security response" may include:
When the instant network security response is a default dangerous response, determining that the instant reward of the instant network security response is a negative reward;
When the instant network security response is not a default dangerous response, obtaining a positive reward as the instant reward of the instant network security response.
In practical applications, when the instant network security response is a default dangerous response, its instant reward can be determined to be a negative reward; when the instant network security response is not a default dangerous response, a positive reward can be obtained as its instant reward.
For example, a default dangerous response set can be established from expert experience. The set may include multiple default dangerous responses that the agent is forbidden to execute, such as deleting a database, executing files in batches, autonomously downloading suspicious files, deleting sensitive files, closing critical service ports, or shutting down certain critical services. When the instant network security response is a default dangerous response, it can be judged to carry risk; execution of the response is stopped and a negative reward is assigned to it. When the instant network security response is not a default dangerous response, it can be judged to carry no risk; the response can be executed and a positive reward is assigned to it.
Auditing the default network security responses carried out by the agent prevents the agent from executing default dangerous responses, and assigning a certain negative reward to default dangerous responses ensures that the network security responses executed by the agent do not cause a larger security problem.
In one embodiment, the negative reward and the positive reward are not limited to negative and positive numbers; they may be negative and positive only relative to each other. The negative reward and the positive reward may both be positive numbers, or both be negative numbers. For example, when the instant network security response is a default dangerous response, a relatively small reward can be assigned to it; when the instant network security response is not a default dangerous response, a relatively large reward can be assigned to it.
In one embodiment, the positive reward can be assigned according to the network security events that occur in the target network, to improve the accuracy of network security processing. Specifically, the step "obtaining a positive reward as the instant reward of the instant network security response" may include:
Obtaining the network security event set corresponding to the network security state, the network security event set including multiple network security sub-events;
Detecting the event occurrence probability of each network security sub-event occurring in the target network after the target network executes the instant network security response;
Obtaining, based on the event occurrence probability, a positive reward as the instant reward of the instant network security response.
In practical applications, the network security event set corresponding to the network security state can be obtained, the set including multiple network security sub-events. After the target network executes the instant network security response, the event occurrence probability of each network security sub-event occurring in the target network is detected, and a positive reward is obtained as the instant reward of the instant network security response based on the event occurrence probability.
For example, the network security scenario of the target network is detected. After it is determined that the scenario is a terminal intrusion scenario, the network security event set corresponding to the network security state can be obtained. The set includes multiple network security sub-events, such as whether port scanning exists in the network, whether a server is under DDoS attack, whether system files are infected by a virus, and whether the network exhibits abnormal behavior. Then, for the target network in network security state s, the event occurrence probability of each network security sub-event after the instant network security response a is executed can be detected, and a positive reward is obtained as the instant reward of the instant network security response based on the event occurrence probability.
In one embodiment, the event occurrence probability of network security sub-events in the network can also be detected by a sub-feature detection module. The sub-feature detection module may include multiple sub-feature detection submodules, each of which detects the event occurrence probability of one kind of network security sub-event. The sub-feature detection module can be composed of multiple security detection engines.
For example, the network security scenario of the target network is detected. After it is determined that the scenario is a terminal intrusion scenario, the sub-feature detection module can be defined to include sub-feature detection submodules such as a port scan monitoring submodule, a malicious file download monitoring submodule, a root privilege theft monitoring submodule, a highly suspicious code execution monitoring submodule, a sensitive directory access monitoring submodule, a sensitive file transfer monitoring submodule, and an abnormal communication monitoring submodule. Each sub-feature detection submodule detects the event occurrence probability of one kind of network security sub-event, and the positive reward of the instant network security response is obtained based on the event occurrence probability.
R_t denotes the instant reward for carrying out instant network security response a at time t. k denotes the fixed loss of instant network security response a when it is a default dangerous response; k can be a constant, for example k = -10. A_u(a) denotes the relationship between the instant network security response and the default dangerous responses: A_u(a) = 1 when the instant network security response is a default dangerous response, and A_u(a) = 0 when it is not. o(a) denotes the event occurrence probability and can be embodied as an array. The instant reward R_t can be calculated as follows:
$$R_t = k\,A_u(a) + \bigl(1 - A_u(a)\bigr)\cdot f\bigl(o(a)\bigr)$$
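As a non-authoritative illustration, the instant reward formula above could be sketched in Python as follows. The text does not define the function f, so `mean_safety` below is purely an assumed stand-in, and the response identifiers in `DANGEROUS_RESPONSES` are hypothetical names chosen for the example.

```python
from typing import Callable, Sequence

K_DANGER_PENALTY = -10.0  # k: the example constant given in the text

# Hypothetical identifiers for default dangerous responses (assumed names).
DANGEROUS_RESPONSES = frozenset({"delete_database", "close_critical_service_port"})


def a_u(response: str) -> int:
    """A_u(a): 1 if the response is a default dangerous response, else 0."""
    return 1 if response in DANGEROUS_RESPONSES else 0


def mean_safety(event_probs: Sequence[float]) -> float:
    """Assumed stand-in for f: mean probability that each sub-event does NOT occur.

    Expects a non-empty sequence of probabilities in [0, 1].
    """
    return sum(1.0 - p for p in event_probs) / len(event_probs)


def instant_reward(
    response: str,
    event_probs: Sequence[float],
    f: Callable[[Sequence[float]], float] = mean_safety,
    k: float = K_DANGER_PENALTY,
) -> float:
    """R_t = k * A_u(a) + (1 - A_u(a)) * f(o(a))."""
    danger = a_u(response)
    return k * danger + (1 - danger) * f(event_probs)


# A dangerous response receives the fixed loss k; a safe one is scored by f.
reward_dangerous = instant_reward("delete_database", [0.1, 0.3])
reward_safe = instant_reward("close_scanned_port", [0.0, 0.5])
```

Passing f as a parameter keeps the sketch neutral about how the event occurrence probabilities are turned into a positive reward.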
In one embodiment, the instant reward and the future reward can also be calculated by value functions, to improve the accuracy of network security processing. For example, a state value function and a state-action value function can be introduced into the Markov decision process.
The state value function can be used to assess the value of network security state s. It is a value function based on the network security mapping relations (i.e. the policy π) and denotes the expected cumulative reward obtained by the agent when starting from network security state s and following the current policy π. The state value function v_π(s) can be calculated as follows:
$$v_\pi(s) = \mathbb{E}\left[G_t \mid S_t = s\right]$$
Here S denotes the network security state set, s denotes a network security state in S, t denotes the current time, and G_t denotes the return.
The return is the decayed sum of all rewards from a certain time onward. G_t denotes the decayed sum of all rewards R from network security state s until the terminating network security state. The return G_t can be calculated as follows:
$$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}$$
Here γ denotes the decay factor, R denotes the reward, and t denotes the current time. The decay factor γ reflects how much a future reward is worth at the current time t: the reward R_{t+k+1} obtained at time t+k+1 is worth γ^k R_{t+k+1} at time t.
The state-action value function can be used to assess the value of a default network security response a in network security state s. It is denoted q_π(s, a) and represents the expected reward the agent can obtain when, following policy π, a given default network security response a is executed in network security state s. The state-action value function q_π(s, a) can be calculated as follows:
$$q_\pi(s, a) = \mathbb{E}\left[G_t \mid S_t = s,\ A_t = a\right]$$
Here A denotes the default network security response set, a denotes a default network security response in A, S denotes the network security state set, s denotes a network security state in S, t denotes the current time, and G_t denotes the return (the decayed sum of rewards).
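From the above formulas, the following relations can be obtained, where R_s^a denotes the expected instant reward for executing response a in state s and P_{ss'}^a denotes the probability of moving from state s to state s' after executing response a:
$$v_\pi(s) = \sum_{a \in A} \pi(a \mid s)\, q_\pi(s, a)$$
$$q_\pi(s, a) = R_s^a + \gamma \sum_{s' \in S} P_{ss'}^a\, v_\pi(s')$$
Further, the Bellman equations for the state value function v_π(s) and the state-action value function q_π(s, a) can be obtained, expressed as follows:
$$v_\pi(s) = \sum_{a \in A} \pi(a \mid s) \left( R_s^a + \gamma \sum_{s' \in S} P_{ss'}^a\, v_\pi(s') \right)$$
$$q_\pi(s, a) = R_s^a + \gamma \sum_{s' \in S} P_{ss'}^a \sum_{a' \in A} \pi(a' \mid s')\, q_\pi(s', a')$$
The Bellman equation is a functional equation in the objective function. It expresses "the value of a decision problem at a particular time" in terms of "the payoff from the initial choice plus the value of the decision problem that results from the initial choice", thereby simplifying the dynamic optimization problem.
In the above formula, $\sum_{a \in A} \pi(a \mid s)\, R_s^a$ denotes the instant reward and $\gamma \sum_{a \in A} \pi(a \mid s) \sum_{s' \in S} P_{ss'}^a\, v_\pi(s')$ denotes the future reward; both the instant reward and the future reward depend on the policy π(a | s).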
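The following sketch illustrates how a state reward could be computed for a given execution probability, assuming the standard Bellman expectation form v_π(s) = Σ_a π(a|s) (R(s,a) + γ Σ_s' P(s'|s,a) v_π(s')). The two-state network, its rewards, and its transition probabilities are invented toy values, and iterative sweeping is one common way to solve the equation, not necessarily the one intended by the text.

```python
def evaluate_policy(states, responses, policy, reward, transition,
                    gamma=0.9, tol=1e-10, max_sweeps=10_000):
    """Iteratively solve the Bellman expectation equation for v_pi."""
    v = {s: 0.0 for s in states}
    for _ in range(max_sweeps):
        delta = 0.0
        for s in states:
            new_v = sum(
                policy[s][a]
                * (reward[s][a]
                   + gamma * sum(p * v[s2] for s2, p in transition[s][a].items()))
                for a in responses
            )
            delta = max(delta, abs(new_v - v[s]))
            v[s] = new_v
        if delta < tol:  # values have stopped changing
            break
    return v


# Toy model (all numbers are assumptions for illustration only).
STATES = ["attacked", "safe"]
RESPONSES = ["close_port", "report_admin"]
REWARD = {
    "attacked": {"close_port": 1.0, "report_admin": 0.5},
    "safe": {"close_port": 0.0, "report_admin": 0.0},
}
TRANSITION = {
    "attacked": {
        "close_port": {"safe": 0.8, "attacked": 0.2},
        "report_admin": {"safe": 0.5, "attacked": 0.5},
    },
    "safe": {"close_port": {"safe": 1.0}, "report_admin": {"safe": 1.0}},
}
UNIFORM = {s: {a: 0.5 for a in RESPONSES} for s in STATES}

values = evaluate_policy(STATES, RESPONSES, UNIFORM, REWARD, TRANSITION)
```

In this toy model the "safe" state is absorbing with zero reward, so only the "attacked" state accumulates value.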
204. Based on the obtained state reward, calculate the destination probability that maximizes the current state reward corresponding to the network security state.
In practical applications, the destination probability that maximizes the current state reward corresponding to the network security state can be calculated based on the obtained state reward. For example, the obtained state reward can be taken as a known quantity for calculating the current state reward corresponding to the network security state, where the current state reward can be expressed by the state value function v_π(s). Solving for the destination probability π(a | s) that maximizes the current state reward means that, by following the destination probability in that network security state, the maximum state reward value can be obtained, so the target network can execute more appropriate default network security responses according to the destination probability.
In one embodiment, the destination probability that maximizes the current state reward corresponding to the network security state can be calculated by maximizing the value functions. For example, the maximized state value function v*(s) and the maximized state-action value function q*(s, a) can be calculated as follows:
$$v_*(s) = \max_{\pi} v_\pi(s)$$
$$q_*(s, a) = \max_{\pi} q_\pi(s, a)$$
The maximized state value function v*(s) is the function that maximizes the value of network security state s, and the maximized state-action value function q*(s, a) is the function that maximizes the value of default network security response a in network security state s. By maximizing the state value function and the state-action value function, the destination probability that maximizes the current state reward corresponding to the network security state can be solved for. Policies are compared by value: for any network security state s, if the value of following policy π is not less than the value of following policy π', then policy π is at least as good as policy π'.
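The text does not say how the maximization is solved. One standard technique, used here only as an illustrative stand-in, is value iteration followed by a greedy choice, which yields a deterministic destination probability. The toy states, rewards, and transition probabilities below are assumptions.

```python
def value_iteration(states, responses, reward, transition,
                    gamma=0.9, tol=1e-10, max_sweeps=10_000):
    """Approximate v*(s) and return a greedy destination probability pi(a | s)."""
    v = {s: 0.0 for s in states}

    def q(s, a):
        # One-step look-ahead value of response a in state s under current v.
        return reward[s][a] + gamma * sum(
            p * v[s2] for s2, p in transition[s][a].items()
        )

    for _ in range(max_sweeps):
        delta = 0.0
        for s in states:
            best = max(q(s, a) for a in responses)
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < tol:
            break

    target = {}
    for s in states:
        best_a = max(responses, key=lambda a: q(s, a))
        # Put all probability mass on the best response (ties go to the first one).
        target[s] = {a: 1.0 if a == best_a else 0.0 for a in responses}
    return v, target


# Toy model (all numbers are assumptions for illustration only).
STATES = ["attacked", "safe"]
RESPONSES = ["close_port", "report_admin"]
REWARD = {
    "attacked": {"close_port": 1.0, "report_admin": 0.5},
    "safe": {"close_port": 0.0, "report_admin": 0.0},
}
TRANSITION = {
    "attacked": {
        "close_port": {"safe": 0.8, "attacked": 0.2},
        "report_admin": {"safe": 0.5, "attacked": 0.5},
    },
    "safe": {"close_port": {"safe": 1.0}, "report_admin": {"safe": 1.0}},
}

optimal_values, destination_probability = value_iteration(
    STATES, RESPONSES, REWARD, TRANSITION
)
```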
205. Update the network security mapping relations based on the destination probability to obtain updated network security mapping relations.
In practical applications, for example, after the destination probability is calculated, the parameters in the network security mapping relations can be adjusted according to the destination probability, such as adjusting the probabilities of executing the various default network security responses, so that the network security mapping relations are updated and the updated network security mapping relations are obtained.
In one embodiment, the destination probability can also be checked to decide whether to obtain the updated network security mapping relations through iteration. Specifically, the step "updating the network security mapping relations based on the destination probability to obtain updated network security mapping relations" may include:
When the destination probability satisfies the probability adjustment condition, adjusting the execution probability corresponding to the network security state to the destination probability;
Returning to the step of obtaining the state reward corresponding to the network security state based on the execution probability;
When the destination probability does not satisfy the probability adjustment condition, updating the network security mapping relations based on the current destination probability to obtain the updated network security mapping relations.
Iteration is the repetition of a feedback process in order to approach the required result. Each repetition of the process is called one iteration, and the result of each iteration can be used as the initial value of the next iteration.
The probability adjustment condition is the condition for deciding whether the execution probability needs to be adjusted to the destination probability and the network security mapping relations updated accordingly. For example, when the destination probability differs from the execution probability, the destination probability can be considered to satisfy the probability adjustment condition, so the execution probability needs to be adjusted to the destination probability and the network security mapping relations updated. As another example, the condition can be defined to be satisfied when a preset gap exists between the destination probability and the execution probability.
In one embodiment, for ease of implementation, the probability adjustment condition can also be defined as follows: when the number of updates made to the network security mapping relations has not reached a preset number of updates, the condition is considered satisfied, so the execution probability needs to be adjusted to the destination probability and the network security mapping relations updated.
In practical applications, for example, the probability adjustment condition can be defined as the destination probability differing from the execution probability. When the two differ, the execution probability can be adjusted to the destination probability, and the flow returns to the step of obtaining the state reward corresponding to the network security state based on the execution probability, so that the destination probability is recalculated. This repeats until the destination probability equals the execution probability, at which point the probability adjustment condition is no longer satisfied, no further update of the network security mapping relations is carried out, and the updated network security mapping relations are obtained.
In one embodiment, the probability adjustment condition can also be defined by the number of updates made to the network security mapping relations. For example, it can be defined that once the network security mapping relations have been updated a preset number of times, the probability adjustment condition is no longer satisfied, no further update is carried out, and the updated network security mapping relations are obtained.
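The iterative update described above, with both stopping rules (the destination probability no longer differs from the execution probability, or a preset number of updates has been reached), could be sketched as follows. The `compute_target` callable is hypothetical: it stands for whatever procedure derives the destination probability from the current execution probability.

```python
def differs(current, target, gap):
    """Probability adjustment condition: some probability differs by more than gap."""
    return any(
        abs(target[s][a] - current[s][a]) > gap
        for s in current
        for a in current[s]
    )


def update_mapping(policy, compute_target, gap=1e-9, max_updates=100):
    """Repeat: compute destination probability, adopt it, until no change or limit."""
    updates = 0
    while updates < max_updates:
        target = compute_target(policy)
        if not differs(policy, target, gap):
            policy = target  # condition not met: finalize with current target
            break
        policy = target      # condition met: adopt target and evaluate again
        updates += 1
    return policy, updates


# Toy stand-in for the destination probability calculation (assumption):
# it always proposes closing the scanned port with probability 1.
def toy_compute_target(_policy):
    return {"attacked": {"close_port": 1.0, "report_admin": 0.0}}


initial = {"attacked": {"close_port": 0.5, "report_admin": 0.5}}
updated_mapping, update_count = update_mapping(initial, toy_compute_target)
```

Setting `gap` above zero corresponds to the "preset gap" variant of the condition, and `max_updates` to the preset number of updates.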
In one embodiment, the detected network security state of the target network, the destination probability with which the target network executes default network security responses, the state reward corresponding to the network security state, and the event occurrence probability of each network security sub-event occurring in the target network can all be obtained and recorded in a log for later lookup.
In one embodiment, the network security processing method is not limited to the Markov decision model; network security processing can also be carried out using other reinforcement learning models.
In one embodiment, after the updated network security mapping relations are obtained, the target network can execute the corresponding default network security response according to them. Specifically, after the step "updating the network security mapping relations based on the destination probability to obtain updated network security mapping relations", the method may further include:
Detecting the current network security state of the target network;
When the current network security state is a default network security state, obtaining, based on the updated network security mapping relations, the current execution probability of executing default network security responses in the current network security state;
Determining, based on the current execution probability, the current network security response corresponding to the current network security state from the multiple default network security responses;
Executing the current network security response for the target network.
The default network security state can be a network state in which, for accidental or malicious reasons, the network has been damaged, altered, or leaked, the system cannot run continuously, reliably, and normally, network service is interrupted, and so on.
In practical applications, the current network security state of the target network can be detected. When the current network security state is a default network security state, the current execution probability of executing the various default network security responses in the current network security state can be obtained based on the updated network security mapping relations. Then, based on the current execution probability, the current network security response corresponding to the network security state is determined from the multiple default network security responses and executed for the target network.
For example, the current network security state s of the target network is detected. When the current network security state is a default network security state, the current execution probability of executing the various default network security responses in state s can be obtained based on the updated network security mapping relations, that is, the current execution probability is obtained according to the policy π. The policy π can be expressed as follows:
$$\pi(a \mid s) = P\left[A_t = a \mid S_t = s\right]$$
Here A denotes the default network security response set, a denotes a default network security response in A, S denotes the network security state set, s denotes a network security state in S, and t denotes the current time.
After the current execution probability is obtained, the current network security response corresponding to the network security state can be determined from the multiple default network security responses; for example, the current network security response a corresponding to network security state s is determined from the default network security response set A. The target network can then execute the current network security response a.
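Choosing a response according to the execution probability π(a | s) amounts to drawing from a discrete distribution. A minimal sketch using the Python standard library is shown below; the state name, response names, and probabilities are assumed example values.

```python
import random
from typing import Dict


def sample_response(policy: Dict[str, Dict[str, float]], state: str,
                    rng: random.Random) -> str:
    """Draw one default network security response a with probability pi(a | state)."""
    responses = list(policy[state])
    weights = [policy[state][a] for a in responses]
    return rng.choices(responses, weights=weights, k=1)[0]


# Hypothetical execution probabilities for one abnormal state.
POLICY = {"port_scanned": {"close_scanned_port": 0.7, "report_admin": 0.3}}

rng = random.Random(0)  # fixed seed so the example is repeatable
draws = [sample_response(POLICY, "port_scanned", rng) for _ in range(1000)]
```

Over many draws, the share of each response approaches its execution probability, so the more probable response is chosen more often.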
In one embodiment, the current network security response can also be audited to ensure that the network does not perform a misoperation that causes a more serious network security problem. Specifically, the step "executing the current network security response for the target network" may include:
When the current network security response is not a default dangerous response, executing the current network security response for the target network;
The method further includes: when the current network security response is a default dangerous response, refusing to execute the current network security response.
In practical applications, for example, when the current network security response is not a default dangerous response, the current network security response can be executed; when the current network security response is a default dangerous response, execution of the current network security response can be refused.
In one embodiment, after the current network security response is executed, the network security state can continue to be detected, to improve the accuracy of network security processing. Specifically, the network security processing method may further include:
Detecting the post-execution network security state of the target network after the current network security response is executed;
When the post-execution network security state is a default network security state, updating the current network security state to the post-execution network security state;
Returning to the step of obtaining, based on the updated network security mapping relations, the current execution probability of executing the various default network security responses in the current network security state, until a stop condition is met.
The stop condition is a condition that ends the step loop. For example, the loop can stop when the detected current network security state is not a default network security state. Alternatively, the stop condition can be met, and the loop stopped, after a preset number of cycles, and so on.
In practical applications, the post-execution network security state of the target network after the current network security response is executed can be detected. When the post-execution network security state is a default network security state, the current network security state is updated to the post-execution network security state, and the flow returns to the step of obtaining, based on the updated network security mapping relations, the current execution probability of executing the various default network security responses in the current network security state, until the stop condition is met.
For example, after the target network executes the current network security response, the network security state of the target network can be detected. When the post-execution network security state is a default network security state, the target network is still abnormal, and the current network security state can be updated to the post-execution network security state. The step of obtaining, based on the updated network security mapping relations, the current execution probability of executing the various default network security responses for the target network in the current network security state is then executed again, until the detected post-execution network security state of the target network is no longer a default network security state. At that point the target network is safe, and the step loop can stop.
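The detect, choose, audit, execute, and re-detect loop described above could be sketched as follows. The callables `detect_state`, `choose_response`, and `execute`, the `ToyNetwork` class, and all state and response names are hypothetical placeholders; in a real deployment they would be backed by the detection engines and the updated network security mapping relations.

```python
def respond_until_safe(detect_state, choose_response, execute,
                       abnormal_states, dangerous_responses, max_steps=10):
    """Loop until the network leaves the abnormal states or max_steps is reached."""
    history = []
    state = detect_state()
    for _ in range(max_steps):
        if state not in abnormal_states:   # stop condition: network is safe
            break
        response = choose_response(state)
        if response in dangerous_responses:
            # Audit step: refuse default dangerous responses.
            history.append((state, response, "refused"))
        else:
            execute(response)
            history.append((state, response, "executed"))
        state = detect_state()             # post-execution network security state
    return state, history


class ToyNetwork:
    """Stand-in target network: closing the scanned port makes it safe."""

    def __init__(self):
        self.state = "port_scanned"

    def detect(self):
        return self.state

    def execute(self, response):
        if response == "close_scanned_port":
            self.state = "safe"


network = ToyNetwork()
final_state, history = respond_until_safe(
    detect_state=network.detect,
    choose_response=lambda state: "close_scanned_port",
    execute=network.execute,
    abnormal_states={"port_scanned"},
    dangerous_responses={"delete_database"},
)
```

The `max_steps` bound corresponds to the "preset number of cycles" stop condition and prevents an endless loop if every chosen response is refused or ineffective.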
As can be seen from the above, the embodiments of the present application can detect the network security state of a target network; obtain, based on the network security mapping relations, the execution probability of executing default network security responses in that network security state, where the network security mapping relations include the mapping between network security states and the probabilities of executing default network security responses; obtain the state reward corresponding to the network security state based on the execution probability; calculate, based on the obtained state reward, the destination probability that maximizes the current state reward corresponding to the network security state; and update the network security mapping relations based on the destination probability to obtain updated network security mapping relations. In this scheme, the execution probability with which the target network executes default network security responses is obtained through the network security mapping relations, the destination probability that maximizes the current state reward is calculated, and the network security mapping relations are updated according to the destination probability and the execution probability, yielding updated network security mapping relations that can obtain the maximum current state reward. By judging whether a default network security response is a default dangerous response, it can also be ensured that the target network does not perform a misoperation, which keeps the network environment stable. In addition, by detecting the event occurrence probability of each network security sub-event after the target network executes a default network security response, the decayed cumulative reward of the default network security response can be obtained. Therefore, when a network security event occurs, network security processing can be carried out immediately, which reduces the dependence on manual work during network security processing, improves the efficiency of network security processing, and reduces loss.
The method described in the above embodiments is described in further detail below by way of example.
Referring to Fig. 3, the detailed flow of the network security processing method can be as follows:
(1) Construct a reinforcement learning model.
In practical applications, a reinforcement learning model can be constructed.
(1) In one embodiment, as shown in Fig. 4, a sub-feature detection module can be constructed. The module can detect the event occurrence probability of network security sub-events in the network. The sub-feature detection module may include multiple sub-feature detection submodules, each of which detects the event occurrence probability of one kind of network security sub-event. The sub-feature detection module can be composed of multiple security detection engines.
For example, a sub-feature detection module can be constructed that includes multiple sub-feature detection submodules, such as a port scan monitoring submodule, a malicious file download monitoring submodule, a root privilege theft monitoring submodule, a highly suspicious code execution monitoring submodule, a sensitive directory access monitoring submodule, a sensitive file transfer monitoring submodule, and an abnormal communication monitoring submodule. Each sub-feature detection submodule detects the event occurrence probability of one kind of network security sub-event, so that the event occurrence probabilities can be obtained.
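One possible way to organize such a module, shown only as a sketch, is a registry that maps each sub-event name to a detector function returning a probability. The two detectors, their heuristics, and the observation field names below are invented for illustration; real submodules would wrap actual security detection engines.

```python
from typing import Callable, Dict, Mapping

Observation = Mapping[str, float]
Detector = Callable[[Observation], float]


def port_scan_detector(obs: Observation) -> float:
    """Toy heuristic: more distinct probed ports means a higher probability."""
    return min(1.0, obs.get("distinct_ports_probed", 0) / 100.0)


def malicious_download_detector(obs: Observation) -> float:
    """Toy heuristic: any flagged download counts as a certain sub-event."""
    return 1.0 if obs.get("flagged_downloads", 0) > 0 else 0.0


# One entry per sub-feature detection submodule.
DETECTORS: Dict[str, Detector] = {
    "port_scan": port_scan_detector,
    "malicious_download": malicious_download_detector,
}


def detect_event_probabilities(detectors: Dict[str, Detector],
                               obs: Observation) -> Dict[str, float]:
    """Run every submodule and clamp each result to the range [0, 1]."""
    return {
        name: min(1.0, max(0.0, detect(obs)))
        for name, detect in detectors.items()
    }


probabilities = detect_event_probabilities(
    DETECTORS, {"distinct_ports_probed": 50, "flagged_downloads": 0}
)
```

Keeping the submodules behind a common function signature lets further detectors be added to the registry without changing the code that consumes the probabilities.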
(2) In one embodiment, a default network security response set can be defined and denoted A. Following the conventional handling process used when a host comes under abnormal attack, the set A is defined to include multiple default network security responses a, such as closing the scanned port (a1), blocking and deleting abnormally downloaded files (a2), closing abnormal connections (a3), locking abnormally logged-in accounts (a4), and reporting to the administrator (a5).
(3) In one embodiment, a reward method for rewarding default network security responses can be defined. For example, when the default network security response is a default dangerous response, a negative reward can be given to it; when the default network security response is not a default dangerous response, a positive reward can be given to it according to the event occurrence probabilities detected by the sub-feature detection module.
For example, R_t denotes the reward for carrying out default network security response a at time t. k denotes the fixed loss of default network security response a when it is a default dangerous response; k can be a constant, for example k = -10. A_u(a) denotes the relationship between the default network security response and the default dangerous responses: A_u(a) = 1 when the default network security response is a default dangerous response, and A_u(a) = 0 when it is not. o(a) denotes the event occurrence probability detected by the sub-feature detection module and can be embodied as an array. The reward can be calculated as follows:
$$R_t = k\,A_u(a) + \bigl(1 - A_u(a)\bigr)\cdot f\bigl(o(a)\bigr)$$
(4) In one embodiment, a reinforcement learning system can be constructed. For example, the reinforcement learning system can be constructed as the tuple <network security state set S, default network security response set A, reward method R, state transition function P, decay factor γ>, and the network security mapping relations that generate a default network security response from a network security state are defined; these mapping relations can be expressed as the policy π.
The reward method R denotes the reward obtained by the agent after a given default network security response a is taken at time t, when the network is in a given network security state s; an empirical expectation can be used. The reward method R is a reward function based on network security state s and default network security response a, and can be expressed as follows:
$$R_s^a = \mathbb{E}\left[R_{t+1} \mid S_t = s,\ A_t = a\right]$$
The state transition function P denotes the probability that, after a given default network security response a is taken at time t when the network is in a given network security state s, the network jumps from network security state s to network security state s' at the next time step. It can be defined as a Gaussian distribution, and can be expressed as follows:
$$P_{ss'}^a = P\left[S_{t+1} = s' \mid S_t = s,\ A_t = a\right]$$
The decay factor γ is a factor for adjusting the contribution of response rewards at different points in time. The network security states that the network goes through later are influenced by the current network security state, but this influence gradually weakens, and the weakening can be expressed by the decay factor γ, which can be chosen as a value between 0 and 1. In one embodiment, the decay factor γ can also be omitted.
The policy can be the basis on which the network, in a given network security state, chooses which preset network security response to execute, based on its knowledge of the current network security state. For example, the policy can be expressed as the probability with which each preset network security response in the preset network security response set may be executed in a given network security state, that is, as a probability distribution over the preset network security response set. The policy can be written as follows:
$\pi(a \mid s) = P[A_t = a \mid S_t = s]$
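Read as a probability distribution over responses, the policy can be stored as a nested table and sampled from. The sketch below assumes that representation; `Policy` and `sample_response` are hypothetical names, and the state and response labels in the usage note are placeholders.

```python
import random
from typing import Dict, Optional

# pi(a | s) stored as {state: {response: probability}} (an assumed representation)
Policy = Dict[str, Dict[str, float]]


def sample_response(policy: Policy, state: str, rng: Optional[random.Random] = None) -> str:
    """Draw one preset network security response according to pi(. | state)."""
    rng = rng or random.Random()
    distribution = policy[state]
    responses = list(distribution)
    weights = [distribution[a] for a in responses]
    # random.Random.choices performs weighted sampling with replacement
    return rng.choices(responses, weights=weights, k=1)[0]
```

For instance, with `{"scanned": {"close_port": 1.0, "report_admin": 0.0}}` the call `sample_response(policy, "scanned")` always returns `"close_port"`.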
(5) In one embodiment, a preset dangerous response set can be constructed. For example, the preset dangerous response set can be established from expert experience, and may include multiple preset dangerous responses that the agent is forbidden to perform, such as deleting a database, executing files in batches, autonomously downloading suspicious files, deleting sensitive files, closing critical service ports, closing certain critical services, and so on.
(6) In one embodiment, a log can be constructed. The detected network security state, the target probability with which the network executes preset network security responses, the state reward corresponding to the network security state, and the event occurrence probability of each network security sub-event occurring in the network are all obtained and recorded in the log for easy lookup and record keeping.
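One way to picture such a log entry is as a single serialized record holding the four quantities listed above. This is a sketch under assumptions: the JSON format, the function name `make_log_record`, and the key names are all illustrative and are not specified by the text.

```python
import json
from typing import Dict


def make_log_record(state: str,
                    target_probability: Dict[str, float],
                    state_reward: float,
                    event_probabilities: Dict[str, float]) -> str:
    """Serialize one log entry as a JSON line (format and key names are assumptions)."""
    record = {
        "state": state,                              # detected network security state
        "target_probability": target_probability,    # per-response target probability
        "state_reward": state_reward,                # state reward for this state
        "event_probabilities": event_probabilities,  # per-sub-event occurrence probability
    }
    return json.dumps(record, sort_keys=True)
```

A line-oriented format keeps each record independently searchable, which matches the stated goal of easy lookup.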
As shown in Fig. 5, the network device can construct the sub-feature detection module, define the preset network security response set A, define the reward function for rewarding network security responses, construct the reinforcement learning system, and construct the log. It then trains the reinforcement learning model to obtain the trained reinforcement learning model, and performs network security event response based on the reinforcement learning model.
(2) Train the reinforcement learning model to obtain the trained reinforcement learning model.
301. The network device detects the network security state of the target network.
In practical applications, the network device can detect the network security scenario in which the target network is located and determine that it is a terminal intrusion scenario. According to the network security states that may arise in a terminal intrusion scenario, the network security state set S is defined to include network security states such as the safe state s1, the network-scanned state s2, the network-vulnerability-exploited state s3, the network-under-attack state s4, and the network-compromised state s5. The target network is then detected to obtain its network security state, which can be one of the network security states in the set S.
In one embodiment, after the network security scenario of the target network is detected and determined to be a terminal intrusion scenario, the subject that performs network security responses according to the network security state can also be defined as the agent. For example, the agent can be defined as the root role of the terminal host.
302. Based on the network security mapping relationship, the network device obtains the execution probability of executing a preset network security response in the network security state.
In practical applications, the network device can obtain the network security state s and, based on the network security mapping relationship, i.e. the policy π, obtain the execution probability of executing each preset network security response in that network security state. The policy π can be expressed as follows:
$\pi(a \mid s) = P[A_t = a \mid S_t = s]$
303. The network device obtains the state reward corresponding to the network security state based on the execution probability.
In practical applications, in network security state s, the network device can obtain the preset network security response set A. Following the conventional handling procedure used when a host is under abnormal attack, the preset network security response set A is defined to include multiple preset network security responses a, for example closing the scanned port (a1), blocking and deleting abnormally downloaded files (a2), closing abnormal connections (a3), locking the abnormally logged-in account (a4), and reporting to the administrator (a5). Based on the execution probability, the network device obtains from the set A the instant network security response corresponding to the network security state, and obtains the instant reward corresponding to that instant network security response. It also obtains the future reward corresponding to the several future network security responses that follow the execution probability after the instant network security response. From the instant reward and the future reward, the state reward corresponding to the network security state can be obtained, where the state reward can be calculated as:
In one embodiment, a preset dangerous response set can also be established from expert experience. It may include multiple preset dangerous responses that the agent is forbidden to execute, such as deleting a database, executing files in batches, autonomously downloading suspicious files, deleting sensitive files, closing critical service ports, closing certain critical services, and so on. When the instant network security response is a preset dangerous response, it can be judged to be risky; its execution is stopped and a negative reward is assigned to it. When the instant network security response is not a preset dangerous response, it can be judged to carry no risk; it can be executed and a positive reward is assigned to it.
In one embodiment, after the network security scenario of the target network is detected and determined to be a terminal intrusion scenario, the network device can obtain the network security event set corresponding to the network security state. The network security event set includes multiple network security sub-events, such as whether port scanning is present in the network, whether the server is under DDoS attack, whether the system contains virus-infected files, whether the network exhibits abnormal behavior, and so on. The network device can then detect the event occurrence probability of each network security sub-event in the target network after the instant network security response a is executed in network security state s and, based on the event occurrence probabilities, obtain a positive reward as the instant reward of the instant network security response.
In one embodiment, after the network device detects the network security scenario of the target network and determines that it is a terminal intrusion scenario, the sub-feature detection module can be defined to include sub-feature detection submodules such as a port scanning monitoring submodule, a malicious file download monitoring submodule, a root privilege theft monitoring submodule, a highly suspicious code execution monitoring submodule, a sensitive directory access monitoring submodule, a sensitive file transfer monitoring submodule, and an abnormal communication monitoring submodule. Each sub-feature detection submodule detects the event occurrence probability of one kind of network security sub-event, and the positive reward of the instant network security response is obtained based on these event occurrence probabilities.
$R_t$ can denote the instant reward for performing the instant network security response a at time t. k can denote the fixed loss incurred when the instant network security response a is a preset dangerous response; k can be a constant, such as k = -10. $A_u(a)$ can indicate the relationship between the instant network security response and the preset dangerous responses: when the instant network security response is a preset dangerous response, $A_u(a) = 1$; when it is not a preset dangerous response, $A_u(a) = 0$. $o(a)$ can denote the event occurrence probabilities and may take the form of an array. The instant reward $R_t$ can be calculated as follows:
$R_t = k A_u(a) + (1 - A_u(a)) \cdot f(o(a))$
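The instant reward formula can be sketched in a few lines. The text does not define the function f, so `default_f` below (one minus the mean event occurrence probability) is purely a hypothetical choice for illustration; the function and parameter names are likewise assumptions.

```python
from typing import Callable, Collection, Sequence

K_DANGER = -10.0  # fixed loss k; the text gives k = -10 as an example value


def default_f(event_probs: Sequence[float]) -> float:
    """Hypothetical f(o(a)): one minus the mean event probability (not from the text)."""
    if not event_probs:
        return 1.0
    return 1.0 - sum(event_probs) / len(event_probs)


def instant_reward(response: str,
                   event_probs: Sequence[float],
                   dangerous: Collection[str],
                   k: float = K_DANGER,
                   f: Callable[[Sequence[float]], float] = default_f) -> float:
    """R_t = k * A_u(a) + (1 - A_u(a)) * f(o(a))."""
    a_u = 1 if response in dangerous else 0  # indicator A_u(a)
    return k * a_u + (1 - a_u) * f(event_probs)
```

With this choice of f, a non-dangerous response earns a reward between 0 and 1 that grows as the detected sub-event probabilities fall, while any dangerous response earns exactly k.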
In one embodiment, the instant reward and the future reward can also be calculated through value functions to improve the accuracy of network security processing. For example, the state value function and the state-action value function of a Markov decision process can be introduced.
The state value function $v_\pi(s)$ can be calculated as follows:
$v_\pi(s) = E[G_t \mid S_t = s]$
where S can denote the network security state set, s a network security state in the set S, t the current time, and $G_t$ the return.
The return $G_t$ can be calculated as follows:
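Assuming the return is the usual discounted sum of rewards weighted by the discount factor γ (an assumption, since the exact expression is not given here), it can be computed for a finite reward sequence as in this sketch. The function name is hypothetical.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return rewards[0] + gamma * rewards[1] + gamma**2 * rewards[2] + ..."""
    g = 0.0
    # Accumulate from the last reward backwards so each step is one multiply-add.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For example, rewards `[1, 1, 1]` with `gamma = 0.5` give 1 + 0.5 + 0.25 = 1.75, which shows how later rewards contribute less.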
The state-action value function $q_\pi(s, a)$ can be calculated as follows:
$q_\pi(s, a) = E[G_t \mid S_t = s, A_t = a]$
From the above formulas, the following conclusion can be drawn:
$v_\pi(s) = \sum_{a \in A} \pi(a \mid s)\, q_\pi(s, a)$
The relationship between the state value function $v_\pi(s)$ and the state-action value function $q_\pi(s, a)$, namely the Bellman equation, can further be obtained, expressed as follows:
In the above formula, one term can represent the instant reward and the other the future reward; both the instant reward and the future reward depend on the policy $\pi(a \mid s)$.
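For reference, in standard Markov decision process notation the Bellman expectation equation is commonly written as below. The symbols $R_s^a$ (expected instant reward for response a in state s) and $P_{ss'}^a$ (transition probability from s to s' under a) are assumed notation, not symbols defined in the surrounding text.

```latex
\begin{aligned}
q_\pi(s, a) &= R_s^a + \gamma \sum_{s' \in S} P_{ss'}^a \, v_\pi(s') \\
v_\pi(s) &= \sum_{a \in A} \pi(a \mid s) \, R_s^a
          + \gamma \sum_{a \in A} \pi(a \mid s) \sum_{s' \in S} P_{ss'}^a \, v_\pi(s')
\end{aligned}
```

In this form the first sum of the second line plays the role of the instant reward and the second sum that of the future reward, and both are weighted by $\pi(a \mid s)$.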
304. Based on the obtained state reward, the network device calculates the target probability that maximizes the current state reward corresponding to the network security state.
In practical applications, the target probability that maximizes the current state reward corresponding to the network security state can be calculated from the obtained state reward. For example, the obtained state reward can be treated as a known quantity in the formula for the current state reward corresponding to the network security state, and the target probability $\pi(a \mid s)$ that maximizes the current state reward is then solved for. Following the target probability in that network security state yields the maximum state reward value, so the target network can execute more appropriate preset network security responses according to the target probability.
In one embodiment, the target probability that maximizes the current state reward corresponding to the network security state can be calculated by maximizing the value functions. For example, the maximized state value function $v_*(s)$ and the maximized state-action value function $q_*(s, a)$ can be calculated; the formulas can be as follows:
where the maximized state value function $v_*(s)$ can be the function that maximizes the value of network security state s, and the maximized state-action value function $q_*(s, a)$ can be the function that maximizes the value of the preset network security response a in network security state s. By maximizing $v_*(s)$ and $q_*(s, a)$, the target probability that maximizes the current state reward corresponding to the network security state can be solved for. For example, if for every network security state s the value of following policy π is not less than the value of following policy π', then policy π is at least as good as policy π'.
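In standard notation, the maximized value functions and the policy derived from them are commonly written as follows. This is the textbook form, given here as an assumption about what the maximization amounts to; $\pi_*$ is notation introduced only for this illustration.

```latex
\begin{aligned}
v_*(s) &= \max_{\pi} v_\pi(s), \qquad q_*(s, a) = \max_{\pi} q_\pi(s, a) \\
\pi_*(a \mid s) &=
\begin{cases}
1 & \text{if } a = \arg\max_{a' \in A} q_*(s, a') \\
0 & \text{otherwise}
\end{cases}
\end{aligned}
```

Under this reading, the target probability concentrates on the response with the highest state-action value in each network security state.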
As shown in Fig. 6, the reward function for rewarding network security responses can be defined, the Bellman equation established, and the Bellman equation solved with an iterative algorithm to obtain the optimized probability, thereby completing the security event response process.
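The text does not name the iterative algorithm. Value iteration is one common choice, sketched below under that assumption; the function name, the callable signatures for `reward` and `transition`, and the tolerance and iteration limits are all illustrative.

```python
from typing import Callable, Dict, List, Tuple


def value_iteration(states: List[str],
                    responses: List[str],
                    reward: Callable[[str, str], float],
                    transition: Callable[[str, str], Dict[str, float]],
                    gamma: float = 0.9,
                    tol: float = 1e-8,
                    max_iter: int = 1000) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Iterate the Bellman optimality backup until the values stop changing."""
    v = {s: 0.0 for s in states}

    def q(s: str, a: str) -> float:
        # Instant reward plus discounted expected value of the next state.
        return reward(s, a) + gamma * sum(p * v[s2] for s2, p in transition(s, a).items())

    for _ in range(max_iter):
        delta = 0.0
        for s in states:
            best = max(q(s, a) for a in responses)
            delta = max(delta, abs(best - v[s]))
            v[s] = best  # in-place update
        if delta < tol:
            break
    # Greedy response per state, i.e. probability 1 on the best response.
    best_response = {s: max(responses, key=lambda a: q(s, a)) for s in states}
    return v, best_response
```

The returned mapping is deterministic; a stochastic policy could be recovered by spreading probability over tied responses, which this sketch does not attempt.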
In one embodiment, the detected network security state of the target network, the target probability with which the target network executes preset network security responses, the state reward corresponding to the network security state, and the event occurrence probability of each network security sub-event occurring in the target network can all be obtained and recorded in the log for easy lookup and record keeping.
305. When the target probability meets the probability adjustment condition, the execution probability corresponding to the network security state is adjusted to the target probability.
In practical applications, for example, the probability adjustment condition can be defined as the target probability differing from the execution probability; when the two differ, the execution probability corresponding to the network security state can be updated to the target probability.
306. Return to the step of obtaining the state reward corresponding to the network security state based on the execution probability.
In practical applications, for example, once the execution probability corresponding to the network security state has been updated to the target probability because the two differed, the process can return to the step of obtaining the state reward corresponding to the network security state based on the execution probability, and continue to calculate, from the obtained state reward, the target probability that maximizes the current state reward corresponding to the network security state, thereby obtaining a new target probability. If this target probability again differs from the execution probability, the execution probability corresponding to the network security state is updated to the target probability and the loop continues. When the target probability is the same as the execution probability, the loop can stop.
307. When the target probability does not meet the probability adjustment condition, the network security mapping relationship is updated based on the current target probability to obtain the updated network security mapping relationship.
In practical applications, for example, when the target probability does not meet the probability adjustment condition, the parameters of the network security mapping relationship can be adjusted according to the target probability, such as the probabilities of executing the various preset network security responses in the network security mapping relationship, so that the network security mapping relationship is updated and the updated network security mapping relationship is obtained.
In one embodiment, the probability adjustment condition can also be defined by the number of times the network security mapping relationship has been updated. For example, it can be defined that once the network security mapping relationship has been updated a preset number of times, the probability adjustment condition is no longer met; the network security mapping relationship is then no longer updated, and the updated network security mapping relationship is obtained.
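Steps 303 to 307 resemble policy iteration: evaluate the current mapping, compute the maximizing target, and repeat until the two agree or an update limit is reached. The sketch below makes simplifying assumptions that are not in the text: the mapping is deterministic (probability 1 on one response per state), and evaluation uses a fixed number of sweeps. All names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def q_value(s: str, a: str, v: Dict[str, float],
            reward: Callable[[str, str], float],
            transition: Callable[[str, str], Dict[str, float]],
            gamma: float) -> float:
    """Instant reward plus discounted expected future state reward."""
    return reward(s, a) + gamma * sum(p * v[s2] for s2, p in transition(s, a).items())


def update_mapping(states: List[str], responses: List[str],
                   reward: Callable[[str, str], float],
                   transition: Callable[[str, str], Dict[str, float]],
                   mapping: Dict[str, str],
                   gamma: float = 0.9, max_updates: int = 50,
                   sweeps: int = 200) -> Tuple[Dict[str, str], int]:
    """Repeat steps 303 to 306 until the target equals the current mapping (step 307)."""
    updates = 0
    while updates < max_updates:  # preset update count, as in the embodiment above
        # Step 303: state reward under the current execution probabilities.
        v = {s: 0.0 for s in states}
        for _ in range(sweeps):
            for s in states:
                v[s] = q_value(s, mapping[s], v, reward, transition, gamma)
        # Step 304: target that maximizes the current state reward.
        target = {s: max(responses,
                         key=lambda a, s=s: q_value(s, a, v, reward, transition, gamma))
                  for s in states}
        # Step 307: target equals execution probability, so stop adjusting.
        if target == mapping:
            break
        # Steps 305 and 306: adopt the target and evaluate again.
        mapping = target
        updates += 1
    return mapping, updates
```

The `max_updates` bound corresponds to the preset number of updates; without it, the loop ends only when the target and the current mapping coincide.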
In one embodiment, updating the network security mapping relationship to obtain the updated network security mapping relationship can be regarded as training the reinforcement learning model and obtaining the trained reinforcement learning model.
(3) Perform network security event response based on the reinforcement learning model.
308. The network device detects the current network security state of the target network.
In practical applications, the current network security state of the target network can be detected; the specific detection steps have been described above and are not repeated here.
309. When the current network security state is a preset network security state, the network device obtains, based on the updated network security mapping relationship, the current execution probability of executing a preset network security response in the current network security state.
In practical applications, when the current network security state s is a preset network security state, the current execution probability of executing each preset network security response in the current network security state s can be obtained based on the updated network security mapping relationship, that is, according to the policy π. The policy π can be expressed as follows:
$\pi(a \mid s) = P[A_t = a \mid S_t = s]$
310. Based on the current execution probability, the network device determines, from the multiple preset network security responses, the current network security response corresponding to the current network security state.
In practical applications, after the current execution probability is obtained, the current network security response corresponding to the network security state can be determined from the multiple preset network security responses. For example, the current network security response a corresponding to the network security state s is determined from the preset network security response set A.
311. The target network executes the current network security response.
In practical applications, the target network can execute the current network security response a.
In one embodiment, the current network security response is executed only when it is not a preset dangerous response, and the method further includes refusing to execute it when it is a preset dangerous response. For example, when the current network security response is not a preset dangerous response, it can be executed; when it is a preset dangerous response, its execution can be refused.
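The refusal rule can be sketched as a guard placed in front of whatever actually carries out a response. The identifiers below are hypothetical labels for the dangerous responses listed earlier, and `executor` stands in for an unspecified execution mechanism.

```python
from typing import Callable, Collection

# Hypothetical identifiers for preset dangerous responses drawn up from expert experience
DANGEROUS_RESPONSES = frozenset({
    "delete_database",
    "batch_execute_files",
    "download_suspicious_file",
    "delete_sensitive_file",
    "close_critical_service_port",
    "close_critical_service",
})


def execute_if_safe(response: str,
                    executor: Callable[[str], None],
                    dangerous: Collection[str] = DANGEROUS_RESPONSES) -> bool:
    """Run the response unless it is a preset dangerous response; report whether it ran."""
    if response in dangerous:
        return False  # refuse to execute
    executor(response)
    return True
```

Returning a flag rather than raising lets the caller log the refusal and assign the negative reward described for dangerous responses.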
In practical applications, for example, after the target network executes the current network security response, the network security state of the target network can be detected again. When the post-execution network security state is a preset network security state, the target network is still abnormal, and the network security state can be updated to the post-execution network security state. The step of obtaining, based on the updated network security mapping relationship, the current execution probability of executing each preset network security response for the target network in the network security state can then be performed again, until the detected post-execution network security state of the target network is no longer a preset network security state. When the post-execution network security state is not a preset network security state, the target network is safe and the loop can stop.
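The detect, respond, and re-detect loop above can be sketched as follows. The three callables stand in for detection, policy lookup, and execution, none of which are specified at this level of detail; the `max_rounds` bound is an added safety limit that is not part of the described method.

```python
from typing import Callable, Collection, List, Tuple


def respond_until_safe(detect_state: Callable[[], str],
                       choose_response: Callable[[str], str],
                       execute: Callable[[str], None],
                       abnormal_states: Collection[str],
                       max_rounds: int = 20) -> List[Tuple[str, str]]:
    """Keep responding while the detected state is one of the preset abnormal states."""
    history: List[Tuple[str, str]] = []
    for _ in range(max_rounds):        # safety bound (an assumption, not from the text)
        state = detect_state()         # steps 308 and the post-execution re-detection
        if state not in abnormal_states:
            break                      # target network is safe; stop the loop
        response = choose_response(state)  # steps 309 and 310
        execute(response)                  # step 311
        history.append((state, response))
    return history
```

The returned history pairs each abnormal state with the response taken, which is the kind of information the log is described as recording.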
As shown in Fig. 4, the agent can follow the policy, determine the preset network security response from the network security state, and judge whether that response is a preset dangerous response. Meanwhile, the preset network security response is rewarded according to the probabilities detected by the sub-feature detection module, and the network security state together with the corresponding policy, network security response, and reward can be recorded in the log.
As can be seen from the above, in the embodiments of the present application the network device can detect the network security state of the target network; obtain, based on the network security mapping relationship, the execution probability of executing a preset network security response in the network security state, where the network security mapping relationship includes the mapping between network security states and the probabilities of executing preset network security responses; obtain the state reward corresponding to the network security state based on the execution probability; calculate, based on the obtained state reward, the target probability that maximizes the current state reward corresponding to the network security state; and update the network security mapping relationship based on the target probability to obtain the updated network security mapping relationship. Through the network security mapping relationship, this scheme obtains the execution probability with which the target network executes each preset network security response, as well as the target probability that maximizes the current state reward, and updates the network security mapping relationship according to the target probability and the execution probability, thereby obtaining an updated network security mapping relationship that can achieve the maximum current state reward. By judging whether a preset network security response is a preset dangerous response, the scheme can also ensure that the target network does not perform misoperations, which keeps the network environment stable. In addition, by detecting the event occurrence probability of each network security sub-event in the target network after the target network executes a preset network security response, the discounted cumulative reward of the preset network security response can be obtained. Therefore, when a network security event occurs, network security processing can be carried out immediately, which reduces the dependence on manual work during network security processing, improves the efficiency of network security processing, and reduces losses.
To better implement the above method, an embodiment of the present application also provides a network security processing apparatus, which can be applied to a network device. As shown in Fig. 7, the network security processing apparatus may include a detection module 71, a probability acquisition module 72, a reward acquisition module 73, a calculation module 74, and an update module 75, as follows:
Detection module 71, configured to detect the network security state of the target network;
Probability acquisition module 72, configured to obtain, based on the network security mapping relationship, the execution probability of executing a preset network security response in the network security state, where the network security mapping relationship includes the mapping between network security states and the probabilities of executing preset network security responses;
Reward acquisition module 73, configured to obtain the state reward corresponding to the network security state based on the execution probability;
Calculation module 74, configured to calculate, based on the obtained state reward, the target probability that maximizes the current state reward corresponding to the network security state;
Update module 75, configured to update the network security mapping relationship based on the target probability to obtain the updated network security mapping relationship.
In one embodiment, the reward acquisition module 73 may include a reward acquisition submodule 731 and a state reward acquisition submodule 732, as follows:
Reward acquisition submodule 731, configured to obtain, based on the execution probability, the instant reward and the future reward corresponding to the preset network security response;
State reward acquisition submodule 732, configured to merge the instant reward and the future reward to obtain the state reward corresponding to the network security state.
In one embodiment, the reward acquisition submodule 731 may specifically include:
Response acquisition submodule 7311, configured to obtain, based on the execution probability, the instant network security response corresponding to the network security state from multiple preset network security responses;
Instant reward acquisition submodule 7312, configured to obtain the instant reward corresponding to the instant network security response;
Future reward acquisition submodule 7313, configured to obtain, based on the execution probability, the future reward corresponding to the preset network security response.
In one embodiment, the instant reward acquisition submodule 7312 may specifically include:
Negative reward determination submodule 73121, configured to determine that the instant reward of the instant network security response is a negative reward when the instant network security response is a preset dangerous response;
Positive reward determination submodule 73122, configured to obtain a positive reward as the instant reward of the instant network security response when the instant network security response is not a preset dangerous response.
In one embodiment, the positive reward determination submodule 73122 may be specifically configured to:
Obtain the network security event set corresponding to the network security state, where the network security event set includes multiple network security sub-events;
Detect the event occurrence probability of each network security sub-event occurring in the target network after the target network executes the instant network security response;
Obtain, based on the event occurrence probabilities, a positive reward as the instant reward of the instant network security response.
In one embodiment, the network security processing apparatus may also be specifically configured to:
Detect the current network security state of the target network;
When the current network security state is a preset network security state, obtain, based on the updated network security mapping relationship, the current execution probability of executing a preset network security response in the current network security state;
Determine, based on the current execution probability, the current network security response corresponding to the current network security state from multiple preset network security responses;
Execute the current network security response for the target network.
In one embodiment, the update module 75 may be specifically configured to:
When the target probability meets the probability adjustment condition, adjust the execution probability corresponding to the network security state to the target probability;
Return to the step of obtaining the state reward corresponding to the network security state based on the execution probability;
When the target probability does not meet the probability adjustment condition, update the network security mapping relationship based on the current target probability to obtain the updated network security mapping relationship.
As can be seen from the above, in the embodiments of the present application the detection module 71 can detect the network security state of the target network; the probability acquisition module 72 can obtain, based on the network security mapping relationship, the execution probability of executing a preset network security response in the network security state, where the network security mapping relationship includes the mapping between network security states and the probabilities of executing preset network security responses; the reward acquisition module 73 can obtain the state reward corresponding to the network security state based on the execution probability; the calculation module 74 can calculate, based on the obtained state reward, the target probability that maximizes the current state reward corresponding to the network security state; and the update module 75 can update the network security mapping relationship based on the target probability to obtain the updated network security mapping relationship. Through the network security mapping relationship, this scheme obtains the execution probability with which the target network executes each preset network security response, as well as the target probability that maximizes the current state reward, and updates the network security mapping relationship according to the target probability and the execution probability, thereby obtaining an updated network security mapping relationship that can achieve the maximum current state reward. By judging whether a preset network security response is a preset dangerous response, the scheme can also ensure that the target network does not perform misoperations, which keeps the network environment stable. In addition, by detecting the event occurrence probability of each network security sub-event in the target network after the target network executes a preset network security response, the discounted cumulative reward of the preset network security response can be obtained. Therefore, when a network security event occurs, network security processing can be carried out immediately, which reduces the dependence on manual work during network security processing, improves the efficiency of network security processing, and reduces losses.
An embodiment of the present application also provides a computer device, which can be a device such as a server or a terminal and which integrates any of the network security processing apparatuses provided by the embodiments of the present application. Fig. 8 is a schematic structural diagram of the computer device provided by an embodiment of the present application. Specifically:
The computer device may include components such as a processor 801 with one or more processing cores, a memory 802 with one or more computer-readable storage media, a power supply 803, and an input unit 804. Those skilled in the art will understand that the computer device structure shown in Fig. 8 does not limit the computer device, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently. Specifically:
The processor 801 is the control center of the computer device. It connects the various parts of the entire computer device through various interfaces and lines, and performs the various functions of the computer device and processes data by running or executing the software programs and/or modules stored in the memory 802 and calling the data stored in the memory 802, thereby monitoring the computer device as a whole. Optionally, the processor 801 may include one or more processing cores. Preferably, the processor 801 can integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 801.
The memory 802 can be used to store software programs and modules, and the processor 801 performs various functional applications and data processing by running the software programs and modules stored in the memory 802. The memory 802 can mainly include a program storage area and a data storage area. The program storage area can store the operating system, the application programs required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area can store data created through the use of the computer device, and the like. In addition, the memory 802 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices. Correspondingly, the memory 802 may also include a memory controller to provide the processor 801 with access to the memory 802.
Computer equipment further includes the power supply 803 powered to all parts, it is preferred that power supply 803 can pass through power supply pipe
Reason system and processor 801 are logically contiguous, to realize management charging, electric discharge and power managed by power-supply management system
Etc. functions.Power supply 803 can also include one or more direct current or AC power source, recharging system, power failure inspection
The random components such as slowdown monitoring circuit, power adapter or inverter, power supply status indicator.
The computer device may also include an input unit 804, which may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
Although not shown, the computer device may also include a display unit and the like, which are not described here. Specifically, in this embodiment, processor 801 in the computer device loads the executable files corresponding to the processes of one or more application programs into memory 802 according to the following instructions, and processor 801 runs the application programs stored in memory 802 to implement various functions, as follows:
Detect the network security state of a target network; based on a network security mapping relation, obtain the execution probability of executing a preset network security response in that network security state, where the network security mapping relation includes the mapping between network security states and the probabilities of executing preset network security responses; obtain the state reward corresponding to the network security state based on the execution probability; based on the obtained state reward, calculate the target probability that maximizes the current state reward corresponding to the network security state; and update the network security mapping relation based on the target probability to obtain an updated network security mapping relation.
For the specific implementation of each of the above operations, refer to the preceding embodiments; details are not repeated here.
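As a rough illustration of the steps listed above, the following Python sketch treats the network security mapping relation as a table from states to execution probabilities, evaluates the state reward under those probabilities, and then moves each state's probabilities to the target that maximizes the reward. The state names, response names, reward values, transition table `MODEL`, and discount factor `GAMMA` are all hypothetical and are not taken from this application; the greedy update is only one possible reading of "target probability".

```python
from typing import Dict

# Hypothetical toy model, not from the source: for each (state, response)
# pair, an immediate reward and the state the network moves to next.
MODEL = {
    ("under_attack", "block_ip"): (5.0, "normal"),
    ("under_attack", "ignore"): (-5.0, "under_attack"),
    ("normal", "block_ip"): (-1.0, "normal"),
    ("normal", "ignore"): (1.0, "normal"),
}
STATES = ["under_attack", "normal"]
RESPONSES = ["block_ip", "ignore"]
GAMMA = 0.9  # assumed discount factor for future rewards

Mapping = Dict[str, Dict[str, float]]  # state -> {response: execution probability}


def state_rewards(mapping: Mapping, sweeps: int = 200) -> Dict[str, float]:
    """Estimate each state's reward when responses follow the execution probabilities."""
    value = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        # Each sweep reads the previous estimates and builds a fresh table.
        value = {
            s: sum(
                p * (MODEL[(s, a)][0] + GAMMA * value[MODEL[(s, a)][1]])
                for a, p in mapping[s].items()
            )
            for s in STATES
        }
    return value


def target_probabilities(state: str, value: Dict[str, float]) -> Dict[str, float]:
    """Put all probability on the response with the highest immediate-plus-future reward."""
    score = {
        a: MODEL[(state, a)][0] + GAMMA * value[MODEL[(state, a)][1]]
        for a in RESPONSES
    }
    best = max(score, key=score.get)
    return {a: 1.0 if a == best else 0.0 for a in RESPONSES}


def update_mapping(mapping: Mapping) -> Mapping:
    """One evaluate-then-improve pass over every state in the mapping."""
    value = state_rewards(mapping)
    return {s: target_probabilities(s, value) for s in STATES}


# Start from uniform execution probabilities and apply one update.
initial = {s: {a: 0.5 for a in RESPONSES} for s in STATES}
updated = update_mapping(initial)
print(updated)
```

In this toy model a single pass already moves all probability to `block_ip` while under attack and to `ignore` in the normal state; a realistic model would typically need the evaluate-and-update pass to be repeated.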
As can be seen from the above, the embodiment of the present application can detect the network security state of a target network; based on a network security mapping relation, obtain the execution probability of executing a preset network security response in that network security state, where the network security mapping relation includes the mapping between network security states and the probabilities of executing preset network security responses; obtain the state reward corresponding to the network security state based on the execution probability; based on the obtained state reward, calculate the target probability that maximizes the current state reward corresponding to the network security state; and update the network security mapping relation based on the target probability to obtain an updated network security mapping relation. Through the network security mapping relation, the scheme obtains the execution probabilities with which the target network executes the various preset network security responses, as well as the target probability that maximizes the current state reward, and it updates the network security mapping relation according to the target probability and the execution probability, thereby obtaining an updated network security mapping relation that yields the maximum current state reward. By judging whether a preset network security response is a preset dangerous response, the scheme can also ensure that the target network does not perform erroneous operations, which keeps the network environment stable. In addition, by detecting the event occurrence probability of each kind of network security sub-event in the target network after a preset network security response is executed, the scheme can obtain the discounted cumulative reward of the preset network security response. Therefore, when a network security event occurs, network security processing can be carried out immediately, which reduces the reliance on manual work during network security processing, improves the efficiency of network security processing, and reduces losses.
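The reward logic summarized above can be sketched as follows. This section does not give the exact formulas, so the weighting scheme, the penalty value, the response names, and the discount factor below are all assumptions made for illustration: a preset dangerous response gets a fixed negative reward, any other response gets a non-negative reward that grows as the weighted sub-events become less likely after the response (assuming probabilities in [0, 1] and non-negative weights), and rewards over time are combined with a decay factor.

```python
from typing import Dict, Sequence

# Hypothetical set of preset dangerous responses and penalty value.
DANGEROUS_RESPONSES = {"shutdown_core_router"}
PENALTY = -10.0


def immediate_reward(
    response: str,
    event_probabilities: Dict[str, float],
    event_weights: Dict[str, float],
) -> float:
    """Negative reward for a dangerous response, otherwise a non-negative reward.

    event_probabilities holds, for each network security sub-event, the
    probability that it still occurs after the response is executed.
    event_weights is an assumed severity weight per sub-event.
    """
    if response in DANGEROUS_RESPONSES:
        return PENALTY
    return sum(
        event_weights[event] * (1.0 - probability)
        for event, probability in event_probabilities.items()
    )


def discounted_cumulative_reward(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Sum rewards over time, decaying each later step by a further factor of gamma."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Example with made-up sub-events and weights.
reward = immediate_reward(
    "block_ip",
    {"port_scan": 0.25, "ddos": 0.5},
    {"port_scan": 1.0, "ddos": 2.0},
)
print(reward, discounted_cumulative_reward([1.0, 1.0, 1.0], gamma=0.5))
```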
A person of ordinary skill in the art will understand that all or some of the steps in the various methods of the above embodiments may be completed by instructions, or by instructions controlling the relevant hardware; the instructions may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present application provides a storage medium storing a plurality of instructions that can be loaded by a processor to execute the steps of any network security processing method provided by the embodiments of the present application. For example, the instructions may execute the following steps:
Detect the network security state of a target network; based on a network security mapping relation, obtain the execution probability of executing a preset network security response in that network security state, where the network security mapping relation includes the mapping between network security states and the probabilities of executing preset network security responses; obtain the state reward corresponding to the network security state based on the execution probability; based on the obtained state reward, calculate the target probability that maximizes the current state reward corresponding to the network security state; and update the network security mapping relation based on the target probability to obtain an updated network security mapping relation.
For the specific implementation of each of the above operations, refer to the preceding embodiments; details are not repeated here.
The storage medium may include read-only memory (ROM, Read Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or the like.
Because the instructions stored in the storage medium can execute the steps of any network security processing method provided by the embodiments of the present application, they can achieve the beneficial effects achievable by any such method; for details, see the preceding embodiments, which are not repeated here.
The network security processing method and device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.
Claims (10)
1. A network security processing method, characterized by comprising:
detecting a network security state of a target network;
based on a network security mapping relation, obtaining an execution probability of executing a preset network security response in the network security state, the network security mapping relation including a mapping between network security states and probabilities of executing preset network security responses;
obtaining a state reward corresponding to the network security state based on the execution probability;
based on the obtained state reward, calculating a target probability that maximizes the current state reward corresponding to the network security state;
updating the network security mapping relation based on the target probability to obtain an updated network security mapping relation.
2. The network security processing method according to claim 1, characterized in that updating the network security mapping relation based on the target probability to obtain an updated network security mapping relation comprises:
when the target probability satisfies a probability adjustment condition, adjusting the execution probability corresponding to the network security state to the target probability;
returning to the step of obtaining the state reward corresponding to the network security state based on the execution probability;
when the target probability does not satisfy the probability adjustment condition, updating the network security mapping relation based on the current target probability to obtain the updated network security mapping relation.
3. The network security processing method according to claim 1, characterized in that obtaining the state reward corresponding to the network security state based on the execution probability comprises:
based on the execution probability, obtaining an immediate reward and a future reward corresponding to the preset network security response;
fusing the immediate reward and the future reward to obtain the state reward corresponding to the network security state.
4. The network security processing method according to claim 3, characterized in that obtaining, based on the execution probability, the immediate reward and the future reward corresponding to the preset network security response comprises:
based on the execution probability, obtaining, from multiple preset network security responses, the immediate network security response corresponding to the network security state;
obtaining the immediate reward corresponding to the immediate network security response;
obtaining the future reward corresponding to the preset network security response based on the execution probability.
5. The network security processing method according to claim 4, characterized in that obtaining the future reward corresponding to the preset network security response based on the execution probability comprises:
obtaining the future reward expected value that the target network obtains by performing security responses according to the execution probability after executing the immediate network security response;
obtaining the future reward based on the future reward expected value.
6. The network security processing method according to claim 4, characterized in that obtaining the immediate reward corresponding to the immediate network security response comprises:
when the immediate network security response is a preset dangerous response, determining that the immediate reward of the immediate network security response is a negative reward;
when the immediate network security response is not a preset dangerous response, obtaining that the immediate reward of the immediate network security response is a positive reward.
7. The network security processing method according to claim 6, characterized in that obtaining that the immediate reward of the immediate network security response is a positive reward comprises:
obtaining a network security event set corresponding to the network security state, the network security event set including multiple kinds of network security sub-events;
detecting the event occurrence probability with which each kind of network security sub-event occurs in the target network after the target network executes the immediate network security response;
based on the event occurrence probability, obtaining that the immediate reward of the immediate network security response is a positive reward.
8. The network security processing method according to claim 1, characterized in that, after updating the network security mapping relation based on the target probability to obtain the updated network security mapping relation, the method further comprises:
detecting a current network security state of the target network;
when the current network security state is a preset network security state, obtaining, based on the updated network security mapping relation, a current execution probability of executing a preset network security response in the current network security state;
based on the current execution probability, determining, from multiple preset network security responses, the current network security response corresponding to the current network security state;
executing the current network security response for the target network.
9. The network security processing method according to claim 8, characterized in that executing the current network security response for the target network comprises:
when the current network security response is not a preset dangerous response, executing the current network security response for the target network;
the method further comprises: when the current network security response is a preset dangerous response, refusing to execute the current network security response.
10. A network security processing device, characterized by comprising:
a detection module, configured to detect a network security state of a target network;
a probability obtaining module, configured to obtain, based on a network security mapping relation, an execution probability of executing a preset network security response in the network security state, the network security mapping relation including a mapping between network security states and probabilities of executing preset network security responses;
a reward obtaining module, configured to obtain a state reward corresponding to the network security state based on the execution probability;
a computing module, configured to calculate, based on the obtained state reward, a target probability that maximizes the current state reward corresponding to the network security state;
an update module, configured to update the network security mapping relation based on the target probability to obtain an updated network security mapping relation.
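For readers who want a concrete picture of the runtime behaviour described in claims 8 and 9, here is a minimal Python sketch under stated assumptions. The mapping contents, the state and response names, the `DANGEROUS` set, and the `execute` callback are hypothetical; the claims do not say how a response is drawn from the execution probabilities, so weighted random sampling is used here as one plausible choice.

```python
import random
from typing import Callable, Dict, Optional

# Hypothetical set of preset dangerous responses.
DANGEROUS = {"shutdown_core_router"}

Mapping = Dict[str, Dict[str, float]]  # state -> {response: execution probability}


def choose_response(mapping: Mapping, state: str, rng: random.Random) -> Optional[str]:
    """Sample a response using the execution probabilities stored for this state."""
    probabilities = mapping.get(state)
    if probabilities is None:
        return None  # the detected state is not a preset network security state
    responses = list(probabilities)
    weights = [probabilities[r] for r in responses]
    return rng.choices(responses, weights=weights, k=1)[0]


def respond(
    mapping: Mapping,
    state: str,
    execute: Callable[[str], None],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Execute the chosen response unless it is a preset dangerous response."""
    rng = rng or random.Random()
    response = choose_response(mapping, state, rng)
    if response is None or response in DANGEROUS:
        return None  # nothing to do, or execution refused
    execute(response)
    return response


# Example with a made-up updated mapping; `executed` records what was run.
executed = []
updated_mapping = {
    "under_attack": {"block_ip": 1.0, "shutdown_core_router": 0.0},
    "compromised": {"shutdown_core_router": 1.0, "block_ip": 0.0},
}
print(respond(updated_mapping, "under_attack", executed.append, random.Random(0)))
print(respond(updated_mapping, "compromised", executed.append, random.Random(0)))
```

Refusing the dangerous response at execution time, rather than only penalizing it during training, keeps the safeguard in place even if the learned probabilities are imperfect.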
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910479765.0A CN110225019B (en) | 2019-06-04 | 2019-06-04 | Network security processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110225019A (en) | 2019-09-10 |
CN110225019B (en) | 2021-08-31 |
Family
ID=67819557
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910479765.0A Active CN110225019B (en) | 2019-06-04 | 2019-06-04 | Network security processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110225019B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160012081A1 (en) * | 2008-11-07 | 2016-01-14 | Cloudlock, Inc. | Relationship Model for Modeling Relationships Between Equivalent Objects Accessible Over a Network |
CN107135224A (en) * | 2017-05-12 | 2017-09-05 | 中国人民解放军信息工程大学 | Cyber-defence strategy choosing method and its device based on Markov evolutionary Games |
CN108809979A (en) * | 2018-06-11 | 2018-11-13 | 中国人民解放军战略支援部队信息工程大学 | Automatic intrusion response decision-making technique based on Q-learning |
CN109190720A (en) * | 2018-07-28 | 2019-01-11 | 深圳市商汤科技有限公司 | Intelligent body intensified learning method, apparatus, equipment and medium |
CN109194583A (en) * | 2018-08-07 | 2019-01-11 | 中国地质大学(武汉) | Network congestion Diagnosis of Links method and system based on depth enhancing study |
CN109698836A (en) * | 2019-02-01 | 2019-04-30 | 重庆邮电大学 | A kind of method for wireless lan intrusion detection and system based on deep learning |
US20190158522A1 (en) * | 2018-01-02 | 2019-05-23 | Maryam AMIRMAZLAGHANI | Generalized likelihood ratio test (glrt) based network intrusion detection system in wavelet domain |
Non-Patent Citations (1)
Title |
---|
尹秀: "基于改进的字典学习的网络入侵检测方法研究", 《中国优秀硕士学位论文全文数据库》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115102705A (en) * | 2022-04-02 | 2022-09-23 | 中国人民解放军国防科技大学 | Automatic network security detection method based on deep reinforcement learning |
CN115102705B (en) * | 2022-04-02 | 2023-11-03 | 中国人民解放军国防科技大学 | Automatic network security detection method based on deep reinforcement learning |
Also Published As
Publication number | Publication date |
---|---|
CN110225019B (en) | 2021-08-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||