CN115208800A - Whole internet port scanning method and device based on reinforcement learning - Google Patents

Publication number
CN115208800A
Authority
CN
China
Prior art keywords
port
scanning
open
target network
ports
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211129938.4A
Other languages
Chinese (zh)
Other versions
CN115208800B (en
Inventor
杨家海
宋光磊
何林
李城龙
王之梁
张辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202211129938.4A
Publication of CN115208800A
Application granted
Publication of CN115208800B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring

Abstract

The invention discloses a reinforcement-learning-based method and device for Internet-wide port scanning. The method comprises: dividing the Internet into a plurality of target networks, performing a full port scan on a preset number of active addresses in each target network, and constructing an open port association graph from the port opening information obtained by the scan; recommending candidate ports for the undetected active addresses in each target network according to the open port association graph, and scanning the candidate ports to obtain port scanning feedback results; updating the expected rewards of the candidate ports based on the port scanning feedback results, updating the open port association graph based on the updated expected rewards, and predicting the candidate ports of the next active address to be scanned in each target network according to the updated graph; and completing the port scanning task of a target network when the number of ports probed in that target network reaches a probe-number threshold. The invention preferentially scans the ports that are more likely to be open, thereby improving probe utilization.

Description

Whole internet port scanning method and device based on reinforcement learning
Technical Field
The invention relates to the technical field of networks, in particular to a full internet port scanning method and device based on reinforcement learning.
Background
Internet-wide scanning is a common research technique in network surveys, for example for measuring service deployment and security vulnerabilities. However, existing surveys are limited to a given set of ports, so they do not fully capture true network conditions and may even lead to misleading conclusions.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, the invention provides a reinforcement-learning-based Internet-wide port scanning method and device that use the PMap port scanning tool to reduce the number of ports scanned and the intrusiveness of port scanning. The system makes up for the shortcomings of existing scanning tools and effectively supports subsequent service discovery and security research across the whole network.
In order to achieve the above object, in one aspect, the present invention provides a full internet port scanning method based on reinforcement learning, including:
dividing the internet into a plurality of target networks, and carrying out full port scanning on a preset number of active addresses in each target network so as to construct an open port association diagram according to port open information obtained by scanning;
recommending candidate ports of undetected active addresses in each target network according to the open port association diagram, and scanning the candidate ports to obtain port scanning feedback results;
updating the expected rewards of the candidate ports based on the port scanning feedback results, updating the open port association diagram based on the updated expected rewards, and predicting the candidate ports of the next active address to be scanned in each target network according to the updated open port association diagram; and
when the number of ports probed in a target network reaches a probe-number threshold, completing the port scanning task of that target network.
In addition, the reinforcement learning-based all-internet port scanning method according to the above embodiment of the present invention may further have the following additional technical features:
Further, in an embodiment of the present invention, performing the full port scan on a preset number of active addresses in each target network, so as to construct the open port association diagram according to the port opening information obtained by the scan, includes:
selecting a preset number of active addresses from a target network to perform full-port scanning, and acquiring port opening information;
calculating the port opening probability of the full port based on the port opening information to obtain an initialized port opening probability;
and constructing the open port association diagram according to the initialized port open probability and a preset weight calculation formula.
Further, in an embodiment of the present invention, the recommending, according to the open port association map, a candidate port of an undetected active address in each target network includes:
when scanning the ports of undetected active addresses in a target network, selecting the port node with the highest probability as an entry node based on the open port association diagram;
determining the open state of the port corresponding to the entry node, and updating the port opening probability of that port node according to the result; and
calculating, according to a preset probability formula, the port opening probabilities of the other port nodes pointed to by that port node to obtain the posterior probabilities of port opening, so that candidate ports are recommended according to the updated port opening probabilities.
Further, in an embodiment of the present invention, the preset number of active addresses are seed addresses, and the method further includes: acquiring the prior reward of an open port i based on a pre-scanning mechanism:
$$R_0(i) = \frac{n_i}{k}$$
where $k$ denotes the number of seed addresses and $n_i$ denotes the number of seed addresses on which port i is open.
Further, in an embodiment of the present invention, the method further includes: the reward for scanning port i on an active address of a target network is:
$$r(i) = \begin{cases} 1, & \text{if port } i \text{ is open} \\ 0, & \text{otherwise} \end{cases}$$
after the port scan of an active address is completed, the reward of open port $i$ is updated as:
$$R_n(i) = \frac{1}{n} \sum_{j=1}^{n} r_j(i)$$
where $R_n(i)$ denotes the reward after n scans of port i, and $r_j(i)$ denotes the reward of the j-th scan of port i;
and the open port association diagram is updated according to the updated reward of port i, as follows:
[update formula: image not recoverable from the extracted text]
in which the updated quantities are the edge weights of the graph.
In order to achieve the above object, another aspect of the present invention provides an all internet port scanning apparatus based on reinforcement learning, including:
the association diagram building module is used for dividing the internet into a plurality of target networks, carrying out full port scanning on a preset number of active addresses in each target network and building an open port association diagram according to port opening information obtained by scanning;
the port scanning module is used for recommending candidate ports of undetected active addresses in each target network according to the open port association diagram and scanning the candidate ports to obtain port scanning feedback results;
the reward and graph updating module is used for updating the expected rewards of the candidate ports based on the port scanning feedback results, updating the open port association diagram based on the updated expected rewards, and predicting the candidate ports of the next active address to be scanned in each target network according to the updated open port association diagram; and
the scanning completion module is used for completing the port scanning task of a target network when the number of ports probed in that target network reaches the probe-number threshold.
According to the reinforcement-learning-based Internet-wide port scanning method and device of the embodiments of the invention, ports that are more likely to be open are scanned first to improve probe utilization; this makes up for the shortcomings of existing scanning tools and effectively supports subsequent service discovery and security research across the whole network.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flowchart of a reinforcement learning-based full Internet port scanning method according to an embodiment of the present invention;
FIG. 2 is a flow diagram of a PMap operation according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an open port dependency graph model according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating dynamic recommendation for port scanning for an active address according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an all internet port scanning apparatus based on reinforcement learning according to an embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The following describes a reinforcement learning-based all-internet port scanning method and apparatus according to an embodiment of the present invention with reference to the accompanying drawings.
The invention is based on an empirical observation about open ports: open ports within the same network are similar, and open ports are correlated with one another. Fig. 2 shows the main workflow of PMap in a network. When scanning ports on the active addresses of each network, PMap first constructs an open port association graph for each target network by scanning in advance the open ports of a few active addresses in that network. Using this existing knowledge (the constructed open port association graph), PMap recommends candidate ports (target ports) for each active address in the network that needs to be scanned. PMap also defines an expected reward for port scanning, which it uses to estimate the probability that each port is open. After all candidate ports of one active address are scanned, PMap updates the expected rewards according to the port scanning feedback results and synchronously updates the open port association graph, so as to adjust the port scanning order for the next active address.
Fig. 1 is a flowchart of a reinforcement learning-based all internet port scanning method according to an embodiment of the present invention.
As shown in fig. 1, the method includes, but is not limited to, the following steps:
s1, dividing the Internet into a plurality of target networks, and carrying out full port scanning on a preset number of active addresses in each target network so as to construct an open port association diagram according to port opening information obtained by scanning.
Specifically, the Internet is first divided into different networks, namely the IP prefixes announced through the Border Gateway Protocol (BGP). PMap is then used to scan the ports of the active addresses within each network.
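The partitioning step can be illustrated with a minimal Python sketch. It assumes the announced prefixes are already available as a list of CIDR strings (how they are obtained from BGP data is outside this sketch), and the function name `group_by_prefix` is hypothetical. Each address is assigned to its longest matching prefix.

```python
import ipaddress
from collections import defaultdict

def group_by_prefix(addresses, prefixes):
    """Assign each active address to its longest matching prefix.

    addresses: iterable of IP address strings.
    prefixes:  iterable of CIDR strings (assumed to come from BGP data).
    Addresses that match no prefix are left out of the result.
    """
    # Try the most specific (longest) prefixes first.
    nets = sorted(
        (ipaddress.ip_network(p) for p in prefixes),
        key=lambda n: n.prefixlen,
        reverse=True,
    )
    groups = defaultdict(list)
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        for net in nets:
            if ip in net:
                groups[str(net)].append(addr)
                break
    return dict(groups)
```

Each resulting group then plays the role of one target network, with its own seed scan and association graph.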
Embodiments of the present invention observe that open ports are correlated, mainly in the following two respects.
First, open ports within the same network are similar. Hosts in the same network are more likely to open similar ports; for example, in the 172.120.0.0/15 (EGIHosting) network, more than 92% of the active addresses have TCP/80 open.
Second, open ports are associated with one another, and the association is unidirectional. For example, in a scan of 1 million randomly selected active IPv4 addresses, an address that responded on UDP/443 had an 86% chance of also responding on both TCP/443 and TCP/80, and an active address with TCP/443 open had an 89% chance of having TCP/80 open. However, the correlation in the opposite direction (TCP/80 → TCP/443 → UDP/443) is much less pronounced.
The embodiment of the invention can select a small number of addresses from the network to carry out full port scanning, mine the correlation of the open ports in the network and recommend ports which are more likely to be opened on other active addresses in the network to carry out scanning according to the correlation.
Specifically, to describe the associations between different ports in the same network, PMap constructs a directed graph G, called the open port association graph. As shown in FIG. 3, a node $i$ in the graph represents the port number of the corresponding open port, and all the nodes in G together represent the types of open ports in the network (corresponding to the similarity of open ports within the same network). Directed edges in the graph represent associations between ports (corresponding to the correlation between open ports), and the weight $w_{ij}$ of an edge is the conditional probability that port j is open given that port i is open (corresponding to the degree of correlation). The weight $w_{ij}$ is calculated as follows:
$$w_{ij} = P(j \mid i) = \frac{P(i, j)}{P(i)} \qquad (1)$$
where $P(i)$ denotes the probability that port i is open, $P(j)$ denotes the probability that port j is open, and $P(i, j)$ denotes the probability that both are open. The directed graph G contains only ports that are open somewhere in the network (i.e., $P(i) > 0$).
Specifically, an open port association graph is constructed for each target network by scanning in advance the open ports of a few active addresses in the network. When a target network is to be port scanned, a pre-scanning mechanism is used to build the graph: a small number of active addresses (seed addresses) are selected in the network, all of their ports are scanned, and the open probability of the port corresponding to each node is calculated over the seed addresses. The weights between the nodes of the open port association graph are then calculated according to equation 1. Because open ports within the same network are similar, scanning all ports of the seeds captures the types of open ports in the target network and avoids losing port information when the association graph is constructed. Preferably, TCP and UDP protocol attributes are added to the nodes when the graph is constructed, to distinguish the protocols of the ports.
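The graph construction described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the seed scan results are assumed to be given as one set of open `(protocol, port)` pairs per seed address, the conditional probability is estimated from co-occurrence counts, and the name `build_association_graph` is hypothetical.

```python
from collections import Counter
from itertools import permutations

def build_association_graph(seed_scans):
    """Estimate node probabilities and edge weights from seed scans.

    seed_scans: list with one set per seed address; each set holds the
                (protocol, port) pairs found open on that address.
    Returns (prob, weight) where prob[i] is the fraction of seeds with
    port i open and weight[(i, j)] estimates P(j open | i open).
    """
    k = len(seed_scans)
    open_count = Counter()   # number of seeds on which each port is open
    pair_count = Counter()   # number of seeds on which both i and j are open
    for ports in seed_scans:
        open_count.update(ports)
        pair_count.update(permutations(ports, 2))  # ordered pairs (i, j)
    prob = {port: count / k for port, count in open_count.items()}
    weight = {
        (i, j): count / open_count[i]
        for (i, j), count in pair_count.items()
    }
    return prob, weight
```

Only ports seen open on at least one seed become nodes, which matches the statement that G contains only ports that are open in the network. Keeping the protocol in the node key is one simple way to carry the TCP/UDP attribute.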
And S2, recommending candidate ports of undetected active addresses in each target network according to the open port association diagram, and scanning the candidate ports to obtain port scanning feedback results.
It can be understood that the constructed open port association graph reflects the types of open ports in the target network and their open probabilities. In the extreme case where all addresses in a network are selected as seed addresses for pre-scanning, the graph represents the true open port types and their open probabilities. After the open port association graph of the target network is constructed, PMap uses it to optimize the port scanning order for the undetected addresses in the target network, preferentially scanning the ports that are more likely to be open. For each active address to be scanned, PMap uses a greedy method to dynamically recommend the port with the highest open probability in the graph, which guides the direction of the scan. The recommendation process is as follows:
Step 1: select an entry node for port scanning. To save scanning resources, PMap preferentially scans ports that are more likely to be open. Therefore, when scanning the open ports of an active address, it selects the port node i with the highest open probability in G as the entry node of the scanning process:
$$i = \arg\max_{s \in S} P(s) \qquad (2)$$
where S denotes the set of all nodes (corresponding to the open port types).
Step 2: update the port opening probabilities. Because open ports are correlated, one open port provides additional information about its associated ports. After port i is scanned, if it is open, its open probability in G is set to 1 and the open probabilities of all nodes pointed to by node i are then updated. Let $N(i)$ denote the set of all nodes that node i points to. For each node j in $N(i)$, the probability that port j is open is taken as the maximum of the average open probability obtained from scanning the seed addresses and the posterior probability given that port i is open, as in equation 3:
$$P(j) = \max\bigl(P(j),\, P(j \mid i)\bigr), \quad j \in N(i) \qquad (3)$$
If port i is not open, only the open probability of port i in G is set to 0.
After updating the port opening probabilities, PMap repeats step 1 and selects the not-yet-scanned node with the highest open probability in G, then applies step 2 again to update the probabilities in G. This dynamic scan loops until the limit on the number of probes per active address is reached. Fig. 4 shows the dynamic recommendation process for port scanning on an active address in detail.
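The two-step loop can be sketched in Python as below. The sketch assumes the graph is given as a `prob` dictionary and a `weight` dictionary keyed by `(i, j)` pairs; the `is_open` callback is a hypothetical stand-in for sending a real probe, and the function name is likewise an assumption.

```python
def recommend_and_scan(prob, weight, is_open, budget):
    """Greedy port recommendation for a single active address.

    prob:    seed-derived open probability of each port node.
    weight:  weight[(i, j)] = P(j open | i open) for each directed edge.
    is_open: callable(port) -> bool; stands in for probing the port.
    budget:  maximum number of ports to probe on this address.
    Returns (scanned, found): probe order and the ports found open.
    """
    current = dict(prob)          # working copy; the shared graph is untouched
    scanned, found = [], []
    while len(scanned) < budget:
        candidates = [p for p in current if p not in scanned]
        if not candidates:
            break
        # Step 1: pick the unscanned node with the highest open probability.
        port = max(candidates, key=current.get)
        scanned.append(port)
        # Step 2: update probabilities from the probe result.
        if is_open(port):
            found.append(port)
            current[port] = 1.0
            for (i, j), w in weight.items():
                if i == port and j not in scanned:
                    current[j] = max(current.get(j, 0.0), w)
        else:
            current[port] = 0.0
    return scanned, found
```

Working on a per-address copy of the probabilities is a design choice of this sketch: the posterior boosts from one address then do not leak into the next address, and only the reward-based update changes the shared graph.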
S3, updating the expected rewards of the candidate ports based on the port scanning feedback results, updating the open port association diagram based on the updated expected rewards, and predicting the candidate ports of the next active address to be scanned in each target network according to the updated open port association diagram; and
S4, when the number of ports probed in a target network reaches a probe-number threshold, completing the port scanning task of that target network.
Specifically, after scanning all candidate ports of an active address (the actions), PMap updates the expected rewards of the corresponding open ports so as to better estimate the probability that each port is open in the target network. It then updates the association graph based on these rewards, which provides more accurate port recommendations for the next active address to be scanned.
PMap updates the rewards of the scanned ports so that ports with high hit rates are more likely to be scanned on the next active address. The (initial) prior reward of open port i is obtained from the pre-scanning mechanism used to construct the open port association graph:
$$R_0(i) = \frac{n_i}{k}$$
where $k$ denotes the number of seed addresses and $n_i$ denotes the number of seed addresses on which port i is open.
When port i is scanned on an active address, the reward is 1 if port i is open and 0 otherwise:
$$r(i) = \begin{cases} 1, & \text{if port } i \text{ is open} \\ 0, & \text{otherwise} \end{cases}$$
After the port scan (action) of an active address is completed, the reward of open port i is updated as follows:
$$R_n(i) = \frac{1}{n} \sum_{j=1}^{n} r_j(i)$$
where $R_n(i)$ denotes the reward after n scans of port i, and $r_j(i)$ denotes the reward of the j-th scan of port i.
The frequency of each open port is calculated as the proportion of scanned addresses on which that port is open. Because open ports within the same network are correlated, the probability that a port is open can be approximated by the frequency with which it was open on previously scanned addresses. From equation 4, the reward of port i therefore also represents the probability that port i is open, i.e.:
$$P(i) = R_n(i)$$
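A minimal Python sketch of the reward bookkeeping follows. It assumes the update is a running sample mean of the 0/1 scan rewards, computed incrementally; how the prior reward from the seed scan is combined with later observations is not specified in the extracted text, so this sketch starts from zero observations. The class name `PortReward` is hypothetical.

```python
class PortReward:
    """Running mean of the 0/1 rewards observed for one port."""

    def __init__(self):
        self.n = 0        # number of scans of this port so far
        self.value = 0.0  # mean reward after n scans

    def update(self, port_open):
        """Record one scan result and return the updated mean reward."""
        r = 1.0 if port_open else 0.0
        self.n += 1
        # Incremental form of the sample mean: no reward history is stored.
        self.value += (r - self.value) / self.n
        return self.value
```

Under this assumption the stored value is exactly the observed open frequency of the port, which is why it can double as the estimate of the port's open probability.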
Because of sampling bias in the seed addresses, the constructed open port association graph may not fully characterize the open ports of the whole target network. To address this, after PMap scans an address in the target network, it rebuilds a more reliable open port association graph from the updated rewards. The update is as follows:
[update formula: image not recoverable from the extracted text]
in which the updated quantities are the edge weights of the graph.
It will be appreciated that this dynamic scan-update-adjust loop continues until the total number of probe packets reaches the budget limit.
Further, as the algorithm iterates, the rewards of ports with high open probabilities become higher and higher, so more of the budget is allocated to these ports, and the algorithm eventually converges. However, because of sampling bias during initialization, some open ports in the target network may be missed, so that these port types are absent from the constructed open port association graph. If the algorithm converges prematurely on other ports, these ports will never be found. This is a typical exploration-exploitation dilemma, and converging too early to a local maximum is known as premature convergence. Taking the characteristics of port scanning into account, the invention uses an ϵ-greedy strategy to strengthen PMap's exploration. The ϵ-greedy strategy allows the port scanning budget to be controlled precisely in advance. More specifically, PMap explores with probability ϵ: when the exploration mechanism is triggered, PMap scans all ports of the active address, which compensates for port types lost to sampling bias. Otherwise, with probability 1 - ϵ, PMap exploits: it recommends the ports to scan according to the constructed association graph.
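The per-address choice between exploration and exploitation can be sketched in Python as follows. The function names and the `rng` parameter (added so the behavior can be made reproducible) are assumptions of this sketch, not details from the source.

```python
import random

def choose_mode(epsilon, rng=random):
    """Return 'explore' with probability epsilon, otherwise 'exploit'."""
    return "explore" if rng.random() < epsilon else "exploit"

def ports_to_scan(epsilon, recommended, all_ports, rng=random):
    """Pick the port list for one active address.

    recommended: ports suggested by the association graph (exploitation).
    all_ports:   the full port range, scanned when exploration triggers.
    """
    if choose_mode(epsilon, rng) == "explore":
        return list(all_ports)
    return list(recommended)
```

Because exploration is a full scan of a known size, the expected number of probes per address follows directly from ϵ and the per-address recommendation budget, which is consistent with the remark that the budget can be controlled in advance.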
In summary, the invention introduces PMap, a port scanning tool that can efficiently find most of the open ports among the 65K ports across the whole network. PMap uses the correlation between ports to construct an open port association graph for each network, uses a reinforcement learning framework to update the graph according to scan feedback, and dynamically adjusts the port scanning order. Compared with current port scanning methods, PMap performs better in terms of hit rate, coverage, and intrusiveness. Experiments on a real network show that PMap finds 90% of open ports by scanning only 151 ports per active address (90% @ 151), which is 311 times fewer than full port scanning (90% @ 47K) and 5 times fewer than common-port scanning (90% @ 729). PMap thus reduces both the number of ports scanned and the intrusiveness of port scanning. PMap is the first effective use of reinforcement learning for scanning open ports. The system makes up for the shortcomings of existing scanning tools and effectively supports subsequent service discovery and security research across the whole network.
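As a quick arithmetic check on the reduction factors quoted above, reading "47K" as 47,000 ports (an assumption; the exact figure is not given in the text):

```latex
\frac{47000}{151} \approx 311.3, \qquad \frac{729}{151} \approx 4.8
```

Both ratios agree with the stated factors of 311 and (rounded) 5.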
According to the reinforcement learning-based all-Internet port scanning method, the relevance of the open ports is fully utilized, and the ports which are more likely to be open are preferentially scanned to improve the detection utilization rate.
In order to implement the above embodiment, as shown in fig. 5, the embodiment further provides a reinforcement-learning-based Internet-wide port scanning apparatus 10, where the apparatus 10 includes: the association graph building module 100, the port scanning module 200, the reward and graph updating module 300, and the scan completion module 400.
The association graph building module 100 is configured to divide the internet into a plurality of target networks, perform full port scanning on a preset number of active addresses in each target network, and build an open port association graph according to port opening information obtained through scanning;
the port scanning module 200 is configured to recommend a candidate port of an undetected active address in each target network according to the open port association map, and scan the candidate port to obtain a port scanning feedback result;
the reward and graph updating module 300 is configured to update the expected rewards of the candidate ports based on the port scanning feedback results, update the open port association graph based on the updated expected rewards, and predict the candidate ports of the next active address to be scanned in each target network according to the updated open port association graph; and
the scan completion module 400 is configured to complete the port scanning task of a target network when the number of ports probed in that target network reaches the probe-number threshold.
Further, the association graph building module 100 is further configured to:
selecting a preset number of active addresses from a target network to perform full-port scanning, and acquiring port opening information;
calculating the port opening probability of all ports based on the port opening information to obtain an initialized port opening probability;
and constructing an open port association diagram according to the initialized port open probability and a preset weight calculation formula.
Further, the port scanning module 200 is further configured to:
when scanning ports of undetected active addresses in a target network, selecting a port node with the highest probability as an entry node based on an open port association diagram;
determining the open state of the port corresponding to the entry node, and updating the port opening probability of that port node according to the result; and
calculating, according to a preset probability formula, the port opening probabilities of all other port nodes pointed to by that port node to obtain the posterior probabilities of port opening, so that candidate ports are recommended according to the updated port opening probabilities.
Further, the preset number of active addresses are seed addresses, and the reward and graph updating module 300 is further configured to obtain the prior reward of an open port i based on a pre-scanning mechanism:
$$R_0(i) = \frac{n_i}{k}$$
where $k$ denotes the number of seed addresses and $n_i$ denotes the number of seed addresses on which port i is open.
Further, the reward and graph updating module 300 is further configured such that the reward for scanning port i on an active address of each target network is:
[formula: image in the original publication, not reproduced in this text]
After the port scan of an active address is completed, the reward of the open port (its symbol is an image in the original publication) is updated as:
[formula: image in the original publication, not reproduced in this text]
where the first symbol of the formula (an image in the original publication) denotes the reward after n scans of port i, and the second symbol denotes the reward for the j-th scan of port i;
and the open port association diagram is updated according to the updated reward of port i, the updating process being as follows:
[formula: image in the original publication, not reproduced in this text]
where the coefficients of the formula (images in the original publication) are weights.
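The reward bookkeeping described above can be sketched as follows. Because the formulas are not reproduced in this text, the sketch makes two loudly stated assumptions: the reward after n scans is the running mean of the per-scan rewards, and the diagram update is a weighted blend of the current node value and the updated reward. The names `PortRewardTracker` and `blend_node_value` are hypothetical.

```python
class PortRewardTracker:
    """Running reward for one port over successive scans."""

    def __init__(self) -> None:
        self.count = 0   # n: number of scans of this port so far
        self.mean = 0.0  # reward after n scans

    def update(self, reward: float) -> float:
        # ASSUMPTION: the reward after n scans is the mean of the n per-scan
        # rewards, computed incrementally to avoid storing the history.
        self.count += 1
        self.mean += (reward - self.mean) / self.count
        return self.mean


def blend_node_value(current: float, reward: float, weight: float = 0.5) -> float:
    """Blend the current node value with the updated reward.

    ASSUMPTION: the diagram update is a convex combination controlled by
    a single weight in [0, 1]; the patent's weights may differ.
    """
    return weight * current + (1.0 - weight) * reward
```

A tracker fed the per-scan rewards 1, 0, 1 ends with a mean of 2/3, which could then be blended into the corresponding node of the association diagram.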
According to the reinforcement learning-based whole-Internet port scanning device of the embodiment of the invention, the correlation among open ports is fully exploited, and ports that are more likely to be open are scanned first, which improves probe utilization.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one of the feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless explicitly specified otherwise.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are exemplary and not to be construed as limiting the present invention, and that changes, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A full Internet port scanning method based on reinforcement learning is characterized by comprising the following steps:
dividing the internet into a plurality of target networks, and carrying out full port scanning on a preset number of active addresses in each target network so as to construct an open port association diagram according to port open information obtained by scanning;
recommending candidate ports of undetected active addresses in each target network according to the open port association diagram, and scanning the candidate ports to obtain port scanning feedback results;
updating the expected rewards of the candidate ports based on the port scanning feedback result, updating the open port association diagram based on the updated expected rewards, and predicting the candidate port of the next active address needing to be scanned in each target network according to the updated open port association diagram; and
and when the number of the detection ports of each target network reaches a detection number threshold value, completing a port scanning task of the target network.
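The overall loop of claim 1 can be illustrated with the following sketch for a single target network. It is a simplified, greedy stand-in rather than the claimed method: it keeps only per-port expected rewards (no association diagram), treats the prior as one earlier observation when averaging, and stops when a probe budget is exhausted. The names `scan_network`, `probe`, `probe_budget`, and `ports_per_address` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def scan_network(
    addresses: Sequence[str],
    expected_reward: Dict[int, float],
    probe: Callable[[str, int], bool],
    probe_budget: int,
    ports_per_address: int = 2,
) -> Tuple[List[Tuple[str, int]], Dict[int, float], int]:
    """Scan candidate ports address by address until the budget is spent."""
    rewards = dict(expected_reward)
    counts = {port: 0 for port in rewards}
    found: List[Tuple[str, int]] = []
    probes_sent = 0

    for address in addresses:
        # Recommend the ports with the highest current expected reward.
        candidates = sorted(rewards, key=rewards.get, reverse=True)
        for port in candidates[:ports_per_address]:
            if probes_sent >= probe_budget:
                # Probe-count threshold reached: the task for this network ends.
                return found, rewards, probes_sent
            is_open = probe(address, port)
            probes_sent += 1
            counts[port] += 1
            feedback = 1.0 if is_open else 0.0
            # ASSUMPTION: running mean, with the prior counted as one sample.
            rewards[port] += (feedback - rewards[port]) / (counts[port] + 1)
            if is_open:
                found.append((address, port))
    return found, rewards, probes_sent
```

The feedback from each probe changes the ranking used for the next active address, which is the behaviour the claim describes in terms of updating expected rewards before predicting the next candidate ports.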
2. The method of claim 1, wherein the performing full port scanning on a preset number of active addresses in each target network to construct an open port association map according to port opening information obtained by scanning comprises:
selecting a preset number of active addresses from a target network to perform full-port scanning, and acquiring port opening information;
calculating the port opening probability of the full port based on the port opening information to obtain an initialized port opening probability;
and constructing the open port association diagram according to the initialized port open probability and a preset weight calculation formula.
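A possible construction of the open port association diagram from the seed scans is sketched below. The claim refers to a "preset weight calculation formula" without stating it here, so the sketch assumes that the node value is the observed open frequency of a port and that the directed edge weight from port a to port b is the co-occurrence ratio (seed addresses with both open divided by seed addresses with a open). The name `build_association_graph` is hypothetical.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, Iterable, Set, Tuple


def build_association_graph(
    seed_scans: Iterable[Set[int]],
) -> Tuple[Dict[int, float], Dict[int, Dict[int, float]]]:
    """Return (node open probabilities, directed edge weights)."""
    scans = [set(open_ports) for open_ports in seed_scans]
    k = len(scans)
    single: Counter = Counter()  # addresses on which each port is open
    pair: Counter = Counter()    # addresses on which both ports are open

    for open_ports in scans:
        single.update(open_ports)
        # Every ordered pair of distinct open ports on the same address.
        pair.update(permutations(sorted(open_ports), 2))

    # Initialized port open probability: share of seed addresses with the port open.
    node_prob = {port: n / k for port, n in single.items()} if k else {}

    # ASSUMPTION: edge weight a -> b approximates P(b open | a open).
    edges: Dict[int, Dict[int, float]] = {}
    for (src, dst), n in pair.items():
        edges.setdefault(src, {})[dst] = n / single[src]
    return node_prob, edges
```

Ports that never co-occur with another port, such as a host exposing only port 22, get a node but no outgoing edges in this sketch.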
3. The method of claim 2, wherein recommending candidate ports of undetected active addresses in each target network according to the open port association map comprises:
when scanning the ports of undetected active addresses in a target network, selecting the port node with the highest probability as an entry node based on the open port association diagram;
judging the opening state of the port corresponding to the entry node, and updating the port opening probability corresponding to the port node with the highest probability according to the opening state judgment result; and
and calculating the port opening probabilities corresponding to other port nodes pointed by the port node with the highest probability according to a preset probability calculation formula to obtain the posterior probability of port opening so as to recommend the candidate ports according to the updated port opening probability.
4. The method of claim 3, wherein the preset number of active addresses are seed addresses, and wherein the method further comprises: acquiring the prior reward of open port i based on a pre-scanning mechanism:
[formula: image in the original publication, not reproduced in this text]
where k denotes the number of seed addresses and n_i denotes the number of seed addresses on which port i is open.
5. The method of claim 4, wherein the reward for scanning port i on an active address of each target network is:
[formula: image in the original publication, not reproduced in this text]
after the port scan of an active address is completed, the reward of the open port (its symbol is an image in the original publication) is updated as:
[formula: image in the original publication, not reproduced in this text]
where the first symbol of the formula (an image in the original publication) denotes the reward after n scans of port i, and the second symbol denotes the reward for the j-th scan of port i;
and the open port association diagram is updated according to the updated reward of port i, the updating process being as follows:
[formula: image in the original publication, not reproduced in this text]
where the coefficients of the formula (images in the original publication) are weights.
6. An all internet port scanning device based on reinforcement learning, comprising:
the association diagram building module is used for dividing the internet into a plurality of target networks, carrying out full port scanning on a preset number of active addresses in each target network and building an open port association diagram according to port opening information obtained by scanning;
the port scanning module is used for recommending candidate ports of undetected active addresses in each target network according to the open port association diagram and scanning the candidate ports to obtain port scanning feedback results;
the reward and graph updating module is used for updating the expected reward of the candidate port based on the port scanning feedback result, updating the open port association diagram based on the updated expected reward, and predicting the candidate port of the next active address needing to be scanned of each target network according to the updated open port association diagram; and
and the scanning completion module is used for completing a port scanning task of one target network when the number of the detection ports of each target network reaches a detection number threshold.
7. The apparatus of claim 6, wherein the association diagram building module is further configured to:
selecting a preset number of active addresses from a target network to perform full-port scanning, and acquiring port opening information;
calculating the port opening probability of the full port based on the port opening information to obtain an initialized port opening probability;
and constructing the open port association diagram according to the initialized port open probability and a preset weight calculation formula.
8. The apparatus of claim 7, wherein the port scanning module is further configured to:
when scanning the ports of undetected active addresses in a target network, selecting the port node with the highest probability as an entry node based on the open port association diagram;
judging the opening state of the port corresponding to the entry node, and updating the port opening probability corresponding to the port node with the highest probability according to the judgment result of the opening state; and
and calculating the port opening probabilities corresponding to other port nodes pointed by the port node with the highest probability according to a preset probability calculation formula to obtain the posterior probability of port opening so as to recommend the candidate ports according to the updated port opening probability.
9. The apparatus of claim 8, wherein the preset number of active addresses are seed addresses, and wherein the reward and graph updating module is further configured to obtain the prior reward of open port i based on a pre-scanning mechanism:
[formula: image in the original publication, not reproduced in this text]
where k denotes the number of seed addresses and n_i denotes the number of seed addresses on which port i is open.
10. The apparatus of claim 9, wherein the reward and graph updating module is further configured such that the reward for scanning port i on an active address of each target network is:
[formula: image in the original publication, not reproduced in this text]
after the port scan of an active address is completed, the reward of the open port (its symbol is an image in the original publication) is updated as:
[formula: image in the original publication, not reproduced in this text]
where the first symbol of the formula (an image in the original publication) denotes the reward after n scans of port i, and the second symbol denotes the reward for the j-th scan of port i;
and the open port association diagram is updated according to the updated reward of port i, the updating process being as follows:
[formula: image in the original publication, not reproduced in this text]
where the coefficients of the formula (images in the original publication) are weights.
CN202211129938.4A 2022-09-16 2022-09-16 Whole internet port scanning method and device based on reinforcement learning Active CN115208800B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211129938.4A CN115208800B (en) 2022-09-16 2022-09-16 Whole internet port scanning method and device based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211129938.4A CN115208800B (en) 2022-09-16 2022-09-16 Whole internet port scanning method and device based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN115208800A true CN115208800A (en) 2022-10-18
CN115208800B CN115208800B (en) 2023-01-03

Family

ID=83571895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211129938.4A Active CN115208800B (en) 2022-09-16 2022-09-16 Whole internet port scanning method and device based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN115208800B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110768987A (en) * 2019-10-28 2020-02-07 University of Electronic Science and Technology of China SDN-based dynamic deployment method and system for virtual honey network
CN112398969A (en) * 2021-01-19 2021-02-23 National University of Defense Technology IPv6 address dynamic detection method and device and computer equipment
US20210226928A1 (en) * 2015-10-28 2021-07-22 Qomplx, Inc. Risk analysis using port scanning for multi-factor authentication
CN113746947A (en) * 2021-07-15 2021-12-03 Tsinghua University IPv6 active address detection method and device based on reinforcement learning
WO2022093697A1 (en) * 2020-10-26 2022-05-05 The Regents Of The University Of Michigan Adaptive network probing using machine learning
CN114880587A (en) * 2022-06-10 2022-08-09 State Grid Fujian Electric Power Co., Ltd. Port scanning path recommendation method for Internet of things equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
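Li Guo et al.: "IPv6 active address discovery algorithm based on multi-level classification and space modeling", Journal of Tsinghua University (Science and Technology) *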

Also Published As

Publication number Publication date
CN115208800B (en) 2023-01-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant