CN117579532A - Network service detection method, device and equipment for stateless records - Google Patents

Publication number
CN117579532A
Authority
CN
China
Prior art keywords
network
flow
traffic
samples
port
Prior art date
Legal status
Pending
Application number
CN202311501395.9A
Other languages
Chinese (zh)
Inventor
赵诗语
郝逸航
王一焯
杨海云
宁泽欣
李�根
Current Assignee
Tianyi Safety Technology Co Ltd
Original Assignee
Tianyi Safety Technology Co Ltd
Application filed by Tianyi Safety Technology Co Ltd
Priority to CN202311501395.9A
Publication of CN117579532A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/50 Testing arrangements
    • H04L43/55 Testing of service level quality, e.g. simulating service usage
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/164 Adaptation or special uses of UDP protocol

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application discloses a network service detection method, device and equipment for stateless records. Acquired network traffic data is preprocessed to obtain a first traffic sample set, which comprises network traffic samples that use UDP as the transport protocol. Network traffic samples corresponding to the server side are screened out of the first traffic sample set by cluster analysis to obtain a third traffic sample set. Based on the network traffic samples found in the third traffic sample set for an object under test, where the object under test is a port under test at a network address under test, the object is passively detected to determine its open probability. Ports under test whose open probability meets the requirement are then actively detected to determine the ports that are open at the network address under test. By combining active and passive detection on network traffic samples drawn from stateless records, the application determines the network service opening status at a specified network address, improves detection efficiency, and enables effective detection of large-scale network spaces.

Description

Network service detection method, device and equipment for stateless records
Technical Field
The application belongs to the technical field of cyberspace detection, and in particular relates to a network service detection method, device and equipment for stateless records.
Background
Two detection modes are widely used in cyberspace service detection: active detection and passive detection. Active detection sends specific data packets into the network and observes and measures the responses to complete a given network service detection task. When the network space to be detected is large, however, active detection scans ports indiscriminately and sends a large number of probe packets, which is inefficient, places a heavy load on the network, and may be misjudged by firewalls as a network attack. Passive detection is generally applied at the border gateway of a small campus network. It requires no additional operations on the network environment, but it depends on certain preconditions:
(1) The network traffic is low-speed or sampled at a low sampling ratio;
(2) The collected network traffic uses TCP (Transmission Control Protocol) as the transport layer protocol, so that the network service opening status at a specified network address can be judged from fields in the traffic statistics that reflect the TCP connection session state (for example, whether a connection is established or active), such as TCP-Flag and TCP-Window;
(3) If a plaintext detection model is used to judge the network service opening status at a specified network address, the network traffic must be fully or partially visible, so that DPI (Deep Packet Inspection) can be used to construct traffic fingerprints from which the model identifies the specific network service opening status;
(4) If rule-based judgment is used to judge the network service opening status at a specified network address, the ports carrying the network traffic must use the port numbers assigned under the rules of IANA (Internet Assigned Numbers Authority).
However, as the access rate and access scale of the high-speed Internet keep growing and the forms of network services keep multiplying, none of the above preconditions holds any longer.
For precondition (1), take an OC-768 high-speed link (transmission rate of 40 Gbps) as an example: assuming an average packet size of 40 B (320 bits), the average time available to process one packet is only 8 nanoseconds (320 bits / 40 Gbps). Measurements of large-scale networks therefore have to rely on sampling, and at a high sampling ratio. The industry currently uses NetFlow, which collects, at a configured sampling ratio, the network flow records that a specified network device generates while processing data streams.
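The per-packet time budget for the OC-768 example can be checked with a short calculation. This is a minimal sketch that assumes the budget equals the time one packet occupies the link at line rate; the function name is illustrative and does not come from the application. For a 40-byte packet on a 40 Gbps link it works out to about 8 ns.

```python
def per_packet_time_ns(link_rate_bps: float, packet_size_bytes: int) -> float:
    """Time one packet occupies the link at line rate, in nanoseconds."""
    bits = packet_size_bytes * 8
    return bits / link_rate_bps * 1e9


# Figures from the OC-768 example: 40 Gbps link, 40-byte average packet.
budget_ns = per_packet_time_ns(40e9, 40)
print(f"{budget_ns:.1f} ns per packet")
```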
For precondition (2), the share of streaming network services, represented by online video and live streaming, keeps growing. In particular, since the QUIC (Quick UDP Internet Connections) protocol became the transport standard for HTTP/3, network traffic that uses UDP (User Datagram Protocol) as the transport layer protocol accounts for a growing proportion of overall network traffic. The traffic statistics of UDP traffic contain no information about connection session state: a UDP session record collected through NetFlow has no field that represents connection session state, such as the TCP Flag field, and is therefore called a stateless record. As a result, the network service opening status at a specified network address cannot be judged from the features of stateless records.
For precondition (3), traffic encryption has become common, and after sampling at a high ratio the data packets no longer retain their original feature distribution, so content inspection is no longer effective.
Finally, for precondition (4), network services are increasingly diverse and, for security reasons, Internet services today rarely open ports according to the 1,024 well-known port assignments made by IANA, so rule-based judgment has lost almost all of its effectiveness.
Disclosure of Invention
In view of the above problems, the present application provides a network service detection method, device and equipment for stateless records, which determine the network service opening status at a specified network address based on network traffic samples drawn from stateless records.
In a first aspect, the present application provides a network service detection method for stateless records, the method comprising:
preprocessing acquired network traffic data to obtain a first traffic sample set, wherein the first traffic sample set comprises network traffic samples that use the User Datagram Protocol (UDP) as the transport protocol;
grouping the network traffic samples in the first traffic sample set that share the same source port under the same source network address into a subset, the subsets together forming a second traffic sample set;
clustering the second traffic sample set using the traffic characteristics of each subset in it, and screening out, based on the clustering result, the network traffic samples in the first traffic sample set that correspond to the server side, to obtain a third traffic sample set;
querying the third traffic sample set for the network traffic samples corresponding to an object under test, and passively detecting the object under test based on the queried network traffic samples to determine its open probability, wherein the object under test is a port under test at a network address under test;
and actively detecting the ports under test whose open probability meets the requirement, to determine the ports that are open at the network address under test.
In one possible implementation, preprocessing the acquired network traffic data to obtain the first traffic sample set includes:
acquiring network traffic data from a designated target network device;
removing duplicate data and/or abnormal data from the network traffic data to obtain preliminarily processed network traffic data;
and screening out, from the preliminarily processed network traffic data, the network traffic samples that use UDP as the transport protocol, to obtain the first traffic sample set.
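The preprocessing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the `FlowRecord` fields and the `is_abnormal` rule are hypothetical, since the text does not define the record layout or what counts as abnormal data. The only external fact relied on is that UDP has IP protocol number 17.

```python
from dataclasses import dataclass
from typing import Iterable, List

UDP_PROTOCOL_NUMBER = 17  # IANA-assigned IP protocol number for UDP


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical flow record; the field names are assumptions."""
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    protocol: int  # IP protocol number (6 = TCP, 17 = UDP)
    packets: int   # packets counted in this record
    octets: int    # total bytes counted in this record


def is_abnormal(rec: FlowRecord) -> bool:
    """Assumed sanity rule: ports must be valid and counters positive."""
    ports_ok = 0 <= rec.src_port <= 65535 and 0 <= rec.dst_port <= 65535
    return not ports_ok or rec.packets <= 0 or rec.octets <= 0


def preprocess(records: Iterable[FlowRecord]) -> List[FlowRecord]:
    """Drop duplicate and abnormal records, then keep only UDP records."""
    seen = set()
    first_sample_set = []
    for rec in records:
        if rec in seen or is_abnormal(rec):
            continue
        seen.add(rec)
        if rec.protocol == UDP_PROTOCOL_NUMBER:
            first_sample_set.append(rec)
    return first_sample_set
```

Treating two records as duplicates only when every field matches is a deliberate simplification; a real collector might deduplicate on a narrower key.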
In one possible implementation, clustering the second traffic sample set using the traffic characteristics of each subset in it includes:
obtaining a first traffic characteristic of each subset based on the number of network traffic samples in the subset;
obtaining a second traffic characteristic of each subset based on the number of network traffic samples in the subset and the message lengths of those samples;
clustering the samples in a traffic characteristic data set based on the first traffic characteristic and the second traffic characteristic, and dividing the samples in the traffic characteristic data set into samples corresponding to the server side and samples corresponding to the client side, wherein each sample comprises the source network address, source port, first traffic characteristic and second traffic characteristic of one subset, and the traffic characteristic data set comprises one sample for each subset;
screening out, based on the clustering result, the network traffic samples in the first traffic sample set that correspond to the server side to obtain the third traffic sample set includes:
determining the source network addresses corresponding to the server side based on the samples corresponding to the server side;
and screening out the network traffic samples in the first traffic sample set that correspond to the server side, based on the source network addresses corresponding to the server side, to obtain the third traffic sample set.
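The clustering and screening steps above can be sketched with a plain two-cluster k-means (the algorithm the application names in a later implementation). Several choices here are assumptions made only for illustration: the tuple layouts, the deterministic seeding, the absence of feature scaling, and in particular the rule that the cluster with the larger mean sample count is treated as the server side, which the text does not specify.

```python
from typing import List, Tuple

FeatureRow = Tuple[str, int, int, float]  # (src_addr, src_port, sample_count, avg_length)
Sample = Tuple[str, int, int]             # (src_addr, src_port, message_length)
Point = Tuple[float, float]


def _dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def two_means(points: List[Point], max_iter: int = 100) -> List[int]:
    """Plain k-means with k = 2; returns one label (0 or 1) per point."""
    if not points:
        return []
    # Assumed deterministic seeding: lexicographically smallest and largest points.
    centroids = [min(points), max(points)]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        new_labels = [
            0 if _dist2(p, centroids[0]) <= _dist2(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels


def screen_server_samples(rows: List[FeatureRow],
                          first_sample_set: List[Sample]) -> List[Sample]:
    """Keep the samples whose source address falls in the assumed server cluster."""
    if not rows:
        return []
    labels = two_means([(float(r[2]), r[3]) for r in rows])

    def mean_count(label: int) -> float:
        counts = [r[2] for r, lab in zip(rows, labels) if lab == label]
        return sum(counts) / len(counts) if counts else float("-inf")

    # Assumption: the cluster with the larger mean sample count is the server side.
    server_label = max((0, 1), key=mean_count)
    server_addrs = {r[0] for r, lab in zip(rows, labels) if lab == server_label}
    return [s for s in first_sample_set if s[0] in server_addrs]
```

Because the two features have different units, a production version would probably normalise them before clustering; that step is omitted here to keep the sketch short.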
In one possible implementation, obtaining the first traffic characteristic of each subset based on the number of network traffic samples in the subset includes:
taking the number of network traffic samples in each subset as the first traffic characteristic of the subset;
obtaining the second traffic characteristic of each subset based on the number of network traffic samples in the subset and the message lengths of those samples includes:
calculating the average message length of the network traffic samples in each subset from the number of network traffic samples in the subset and the total message length of all of them, and taking the average message length as the second traffic characteristic of the subset.
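The two traffic characteristics described above reduce to a group-by on (source address, source port) followed by a count and a mean. The sketch below assumes each sample is a `(src_addr, src_port, message_length)` tuple; that layout and the function name are illustrative only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed sample layout: (src_addr, src_port, message_length_in_bytes).
Sample = Tuple[str, int, int]
# One row per subset: (src_addr, src_port, sample_count, average_length).
FeatureRow = Tuple[str, int, int, float]


def build_feature_rows(samples: Iterable[Sample]) -> List[FeatureRow]:
    """Group samples by (source address, source port) and compute both features."""
    subsets: Dict[Tuple[str, int], List[int]] = defaultdict(list)
    for src_addr, src_port, length in samples:
        subsets[(src_addr, src_port)].append(length)

    rows: List[FeatureRow] = []
    for (src_addr, src_port), lengths in subsets.items():
        count = len(lengths)                 # first traffic characteristic
        average_length = sum(lengths) / count  # second traffic characteristic
        rows.append((src_addr, src_port, count, average_length))
    return rows
```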
In one possible implementation, clustering the samples in the traffic characteristic data set based on the first traffic characteristic and the second traffic characteristic includes:
taking the first traffic characteristic and the second traffic characteristic as feature dimensions, and clustering the samples in the traffic characteristic data set with a k-means clustering algorithm to obtain a server cluster and a client cluster;
dividing the samples in the traffic characteristic data set into samples corresponding to the server side and samples corresponding to the client side, based on the samples belonging to the server cluster and the samples belonging to the client cluster.
In one possible implementation, querying the third traffic sample set for the network traffic samples corresponding to the object under test includes:
querying the third traffic sample set for the network traffic samples corresponding to each port under test at the network address under test;
passively detecting the object under test based on the queried network traffic samples to determine its open probability includes:
performing the following operations for each port under test at the network address under test:
determining, based on preset voting items and the vote count calculation mechanism of each voting item, the number of votes that the network traffic samples corresponding to the port under test obtain on each voting item, wherein each voting item comprises a judgment rule for judging whether a port is open, and the number of votes indicates how closely the network traffic samples match the corresponding judgment rule;
determining the open probability of the port under test based on the number of votes that its network traffic samples obtain on each voting item.
In one possible implementation, the voting items include positive voting items and negative voting items, wherein a positive voting item indicates that the port is open and a negative voting item indicates that the port is not open;
determining, based on the preset voting items and the vote count calculation mechanism of each voting item, the number of votes that the network traffic samples corresponding to the port under test obtain on each voting item includes:
traversing the network traffic samples corresponding to the port under test, and calculating the positive and negative vote counts obtained by the current network traffic sample based on the parameters of that sample, the preset voting items and the vote count calculation mechanism of each voting item;
and determining the open probability of the port under test as the proportion of the total positive vote count in the total vote count, wherein the total positive vote count is the sum of the positive vote counts obtained by the network traffic samples corresponding to the port under test, and the total vote count is the sum of the total positive vote count and the total negative vote count of those samples.
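The voting scheme above amounts to summing positive and negative votes over all samples of a port and taking the positive share. The sketch below shows that calculation; the three voting items and their scoring functions are hypothetical placeholders, because this part of the text does not list the actual judgment rules, and returning 0.0 when no votes are cast is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Sequence


@dataclass(frozen=True)
class Sample:
    """Hypothetical per-flow parameters used by the voting items."""
    packets: int
    octets: int


@dataclass(frozen=True)
class VotingItem:
    name: str
    positive: bool                    # True: indicates open; False: indicates not open
    votes: Callable[[Sample], float]  # vote count calculation mechanism


# Hypothetical voting items, for illustration only.
ITEMS = (
    VotingItem("carries payload", True,
               lambda s: 1.0 if s.octets > 100 * s.packets else 0.0),
    VotingItem("multi-packet flow", True,
               lambda s: min(s.packets, 5) / 5),
    VotingItem("single tiny packet", False,
               lambda s: 1.0 if s.packets == 1 and s.octets < 64 else 0.0),
)


def open_probability(samples: Iterable[Sample],
                     items: Sequence[VotingItem] = ITEMS) -> float:
    """Share of positive votes among all votes cast for one port under test."""
    total_positive = 0.0
    total_negative = 0.0
    for sample in samples:
        for item in items:
            cast = item.votes(sample)
            if item.positive:
                total_positive += cast
            else:
                total_negative += cast
    total = total_positive + total_negative
    # Assumption: a port with no votes at all gets probability 0.0.
    return total_positive / total if total > 0 else 0.0
```

With the placeholder items, a ten-packet 5000-byte flow and a single 40-byte packet together yield 2.2 positive votes and 1.0 negative vote, so the port's open probability is 2.2 / 3.2.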
In one possible implementation, actively detecting the ports under test whose open probability meets the requirement to determine the ports that are open at the network address under test includes:
sorting all ports under test by open probability in descending order and selecting a set number of ports with the highest open probabilities; or selecting, from all ports under test, the ports whose open probability is greater than a preset threshold;
and actively detecting the selected ports under test to determine the ports that are open at the network address under test.
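Both selection rules above, and a basic active probe, can be sketched as follows. The function names, the one-byte probe payload and the one-second timeout are assumptions; the text does not say how the active probe is built. Note that for UDP a missing reply does not prove a port is closed, since many services ignore datagrams they cannot parse, so this probe only confirms ports that answer.

```python
import socket
from typing import Dict, List


def select_top_n(probabilities: Dict[int, float], n: int) -> List[int]:
    """Ports with the n highest open probabilities, in descending order."""
    ranked = sorted(probabilities, key=lambda port: probabilities[port], reverse=True)
    return ranked[:n]


def select_above_threshold(probabilities: Dict[int, float], threshold: float) -> List[int]:
    """Ports whose open probability is strictly greater than the threshold."""
    return sorted(port for port, p in probabilities.items() if p > threshold)


def udp_probe(addr: str, port: int, payload: bytes = b"\x00",
              timeout: float = 1.0) -> bool:
    """Return True if the peer answers the probe datagram before the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            # A connected UDP socket only accepts datagrams from this peer.
            sock.connect((addr, port))
            sock.send(payload)
            sock.recv(4096)
            return True
        except OSError:
            # Covers timeouts and ICMP "port unreachable" errors.
            return False
```

A caller would typically run `udp_probe` only on the ports returned by one of the two selection functions, which is where the reduction in active scanning comes from.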
In a second aspect, the present application provides a network service detection device for stateless records, the device comprising:
a preprocessing module, configured to preprocess acquired network traffic data to obtain a first traffic sample set, wherein the first traffic sample set comprises network traffic samples that use the User Datagram Protocol (UDP) as the transport protocol;
a first screening module, configured to group the network traffic samples in the first traffic sample set that share the same source port under the same source network address into a subset, the subsets together forming a second traffic sample set;
a second screening module, configured to cluster the second traffic sample set using the traffic characteristics of each subset in it, and to screen out, based on the clustering result, the network traffic samples in the first traffic sample set that correspond to the server side, to obtain a third traffic sample set;
a first detection module, configured to query the third traffic sample set for the network traffic samples corresponding to an object under test, and to passively detect the object under test based on the queried network traffic samples to determine its open probability, wherein the object under test is a port under test at a network address under test;
and a second detection module, configured to actively detect the ports under test whose open probability meets the requirement, to determine the ports that are open at the network address under test.
In one possible implementation, the preprocessing module preprocessing the acquired network traffic data to obtain the first traffic sample set specifically includes:
acquiring network traffic data from a designated target network device;
removing duplicate data and/or abnormal data from the network traffic data to obtain preliminarily processed network traffic data;
and screening out, from the preliminarily processed network traffic data, the network traffic samples that use UDP as the transport protocol, to obtain the first traffic sample set.
In one possible implementation, the second screening module clustering the second traffic sample set using the traffic characteristics of each subset in it specifically includes:
obtaining a first traffic characteristic of each subset based on the number of network traffic samples in the subset;
obtaining a second traffic characteristic of each subset based on the number of network traffic samples in the subset and the message lengths of those samples;
clustering the samples in a traffic characteristic data set based on the first traffic characteristic and the second traffic characteristic, and dividing the samples in the traffic characteristic data set into samples corresponding to the server side and samples corresponding to the client side, wherein each sample comprises the source network address, source port, first traffic characteristic and second traffic characteristic of one subset, and the traffic characteristic data set comprises one sample for each subset;
the second screening module screening out, based on the clustering result, the network traffic samples in the first traffic sample set that correspond to the server side to obtain the third traffic sample set specifically includes:
determining the source network addresses corresponding to the server side based on the samples corresponding to the server side;
and screening out the network traffic samples in the first traffic sample set that correspond to the server side, based on the source network addresses corresponding to the server side, to obtain the third traffic sample set.
In one possible implementation, the second screening module obtaining the first traffic characteristic of each subset based on the number of network traffic samples in the subset specifically includes:
taking the number of network traffic samples in each subset as the first traffic characteristic of the subset;
the second screening module obtaining the second traffic characteristic of each subset based on the number of network traffic samples in the subset and the message lengths of those samples specifically includes:
calculating the average message length of the network traffic samples in each subset from the number of network traffic samples in the subset and the total message length of all of them, and taking the average message length as the second traffic characteristic of the subset.
In one possible implementation, the second screening module clustering the samples in the traffic characteristic data set based on the first traffic characteristic and the second traffic characteristic specifically includes:
taking the first traffic characteristic and the second traffic characteristic as feature dimensions, and clustering the samples in the traffic characteristic data set with a k-means clustering algorithm to obtain a server cluster and a client cluster;
dividing the samples in the traffic characteristic data set into samples corresponding to the server side and samples corresponding to the client side, based on the samples belonging to the server cluster and the samples belonging to the client cluster.
In one possible implementation, the first detection module querying the third traffic sample set for the network traffic samples corresponding to the object under test specifically includes:
querying the third traffic sample set for the network traffic samples corresponding to each port under test at the network address under test;
the first detection module passively detecting the object under test based on the queried network traffic samples to determine its open probability specifically includes:
performing the following operations for each port under test at the network address under test:
determining, based on preset voting items and the vote count calculation mechanism of each voting item, the number of votes that the network traffic samples corresponding to the port under test obtain on each voting item, wherein each voting item comprises a judgment rule for judging whether a port is open, and the number of votes indicates how closely the network traffic samples match the corresponding judgment rule;
determining the open probability of the port under test based on the number of votes that its network traffic samples obtain on each voting item.
In one possible implementation, the voting items include positive voting items and negative voting items, wherein a positive voting item indicates that the port is open and a negative voting item indicates that the port is not open;
the first detection module determining, based on the preset voting items and the vote count calculation mechanism of each voting item, the number of votes that the network traffic samples corresponding to the port under test obtain on each voting item specifically includes:
traversing the network traffic samples corresponding to the port under test, and calculating the positive and negative vote counts obtained by the current network traffic sample based on the parameters of that sample, the preset voting items and the vote count calculation mechanism of each voting item;
and determining the open probability of the port under test as the proportion of the total positive vote count in the total vote count, wherein the total positive vote count is the sum of the positive vote counts obtained by the network traffic samples corresponding to the port under test, and the total vote count is the sum of the total positive vote count and the total negative vote count of those samples.
In one possible implementation, the second detection module actively detecting the ports under test whose open probability meets the requirement to determine the ports that are open at the network address under test specifically includes:
sorting all ports under test by open probability in descending order and selecting a set number of ports with the highest open probabilities; or selecting, from all ports under test, the ports whose open probability is greater than a preset threshold;
and actively detecting the selected ports under test to determine the ports that are open at the network address under test.
In a third aspect, embodiments of the present application provide equipment comprising at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the network service detection method for stateless records provided in any implementation of the first aspect of the present application.
In a fourth aspect, embodiments of the present application further provide a computer-readable storage medium storing instructions which, when executed by a processor of a terminal device, cause the terminal device to perform the network service detection method for stateless records provided in any implementation of the first aspect of the present application.
The technical solutions provided by the embodiments of the present application bring at least the following beneficial effects:
The embodiments of the present application provide a network service detection method, device and equipment for stateless records. The acquired network traffic data is first preprocessed, and UDP network traffic samples (stateless records), that is, samples that use UDP as the transport layer protocol, are screened out of the preprocessed data. The UDP network traffic samples are then clustered based on their traffic characteristics, and the clustering result is used to screen out the UDP network traffic samples corresponding to the server side. When the port opening status at a network address under test corresponding to a target server needs to be determined, each port under test is first passively detected: the preset voting items and the vote count calculation mechanism of each voting item determine the number of votes that the UDP network traffic samples of each port under test obtain on each voting item, the voting result is used to calculate the open probability of each port, and the ports that are likely to be open are screened out. Finally, the screened ports are actively detected. Compared with the prior art, the embodiments therefore not only support network service detection on ports that use UDP as the transport layer protocol, but also reduce unnecessary active detection by combining active and passive detection, which improves the efficiency of network service detection and provides a solution for effective detection of large-scale network spaces.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a network service detection method for stateless records according to an embodiment of the present application;
Fig. 2 is a schematic logic block diagram of a network service detection method for stateless records according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a network service detection device for stateless records according to an embodiment of the present application;
Fig. 4 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments that a person of ordinary skill in the art can obtain from the present disclosure without inventive effort fall within the scope of the present disclosure.
Also, in the description of the embodiments of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. The term "and/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate that A exists alone, that A and B exist together, or that B exists alone. In addition, in the description of the embodiments of the present application, "plural" means two or more.
The terms "first", "second" and the like are used below for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first", "second" or the like may explicitly or implicitly include one or more such features. In the description of the embodiments of the present application, unless otherwise indicated, "a plurality" means two or more.
In the field of network space service detection, two detection modes, namely active detection and passive detection, are widely used. The active detection refers to a detection mode that a specific data packet is actively sent to a network, and the response of the network is observed and measured to complete a specified network service detection task, but when the space of the network to be detected is large, the active detection method is used for indiscriminately scanning ports in the network and sending a large number of detection data packets, so that the problems of low efficiency and large network load exist, and the firewall can misjudge the network attack behavior. While the passive detection mode is generally aimed at a border gateway of a small-scale park network, although no additional operation is needed to be performed on the network environment, a certain precondition is needed for using the passive detection mode:
(1) Network traffic is low speed or low sampling rate;
(2) The collected network traffic uses TCP (Transmission Control Protocol) as the transport layer protocol, so that the network service opening condition at a specified network address can be judged from the fields in the traffic statistics that relate to the TCP connection session state (for example, whether a connection is established and whether it is active), such as TCP-Flag and TCP-Window;
(3) If a plaintext detection model is used to judge the network service opening condition at the specified network address, the network traffic must be fully or partially visible, so that DPI (Deep Packet Inspection) can be used to construct a traffic fingerprint from which the plaintext detection model identifies the specific network service opening condition;
(4) If rule-based judgment is used to judge the network service opening condition at the specified network address, the port through which the network traffic flows must use the port number assigned to the service under the rules of IANA (the Internet Assigned Numbers Authority).
However, as the access rate and access scale of the high-speed internet continue to grow and the forms of network service continue to diversify, the above premises no longer hold.
For premise (1), take a high-speed OC-768 link (transmission rate of 40 Gbps) as an example: assuming an average packet size of 40 B, the time available to process each packet is only about 8 nanoseconds (320 bits / 40 Gbps). Measurement of larger-scale networks therefore requires sampling, and at a high sampling ratio. The industry currently uses NetFlow, which collects, at a given sampling ratio, the network flow records that a specified network device generates while processing data streams.
For premise (2), the share of streaming-media network services, represented by online video and live streaming, is gradually increasing. In particular, since the QUIC (Quick UDP Internet Connections) protocol became the transport standard for HTTP/3, the proportion of network traffic that uses UDP (User Datagram Protocol) as the transport layer protocol has been rising within overall network traffic. The traffic statistics of network traffic carried over UDP contain no information about the connection session state; that is, a record of a UDP session acquired through NetFlow contains no field representing the connection session state, such as a TCP Flag field. Such a record is called a stateless record in the present application, and the network service opening condition at a specified network address cannot be judged from the features of stateless records.
For premise (3), traffic encryption has become common, and after sampling at a high ratio the data packets completely lose their original feature distribution, so content inspection is no longer effective.
Finally, for premise (4), network services are increasingly diverse, and for security reasons internet services today are rarely opened on the well-known ports (port numbers below 1024) assigned by IANA, so the rule-based judgment method has lost almost all of its effectiveness.
In view of the above problems, the embodiments of the present application provide a network service detection method, device and equipment for stateless records. First, the acquired network traffic data is preprocessed, and UDP network traffic samples (stateless records), which use UDP as the transport layer protocol, are screened out of it; the UDP network traffic samples are then clustered on their traffic features, and the clustering result is used to screen out the UDP network traffic samples corresponding to the server side. When the port opening condition under a network address to be tested that corresponds to a target server side needs to be judged, passive detection is first performed on each port to be tested: the number of votes that the UDP network traffic samples of each port obtain on each voting item is determined from the preset voting items and the vote count calculation mechanism of each voting item, the opening probability of each port is calculated from the voting result, and the ports that are most likely to be open are screened out. Finally, active detection is performed on the screened ports. Compared with the prior art, the present application therefore not only supports network service detection for ports that use UDP as the transport layer protocol, but also, by combining active and passive detection, reduces the scale of unnecessary active detection, improves the efficiency of network service detection, and provides a solution for effective detection of a large-scale network space.
Embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
As shown in fig. 1, a flowchart of a method for detecting network services for stateless recording according to an embodiment of the present application is provided, where the method includes:
in step S101, the acquired network traffic data is preprocessed to obtain a first traffic sample set.
Wherein the first traffic sample set comprises network traffic samples using user datagram protocol UDP as a transport protocol.
As a possible implementation manner, the preprocessing of the acquired network traffic data to obtain the first traffic sample set specifically includes the following steps a-1, a-2 and a-3:
step a-1, data acquisition: network traffic data is obtained from a designated target network device.
In one or more embodiments, the above-specified target network device includes a specified router, switch or other data communication device, and the specific network traffic data acquisition period and the sampling rate of the network traffic data may be set according to the actual scenario.
Optionally, in the embodiment of the present application, NetFlow v5 is used to observe the network traffic of a backbone egress router and to periodically export network traffic data. The backbone egress router usually belongs to a distributed network and is a high-speed router responsible for connecting multiple subnets and edge routers and for carrying a large amount of data traffic; it may be defined according to the actual service scenario.
Step a-2, data cleaning: and executing operations of removing repeated data and/or abnormal data on the network traffic data to obtain primarily processed network traffic data.
In this application, the network traffic data obtained in step a-1 generally includes a plurality of traffic sample records. Data cleaning may include removing duplicate traffic sample records and/or removing abnormal traffic sample records, where abnormal records are, for example, network traffic samples with an incorrect format, such as records with incomplete IP addresses.
Step a-3, data screening: and screening out the network traffic sample taking UDP as a transmission protocol from the primarily processed network traffic data to obtain a first traffic sample set.
In one or more embodiments, since the acquired network traffic data typically includes traffic sample records related to various protocols, the traffic sample records corresponding to network layer protocols that are unrelated to the transport protocol need to be deleted during data screening, for example traffic sample records related to ICMP (Internet Control Message Protocol).
Further, in the flow samples corresponding to the network flow data after preliminary processing, after deleting the flow sample records corresponding to other network layer protocols irrelevant to the transmission protocol, the network flow samples using UDP as the transmission protocol are screened out from the network flow data after preliminary processing, and a first flow sample set is obtained.
Optionally, the data cleaning step may further include removing fields of the network traffic samples that are not relevant to the problem addressed in the present application, for example the "next-hop gateway address" field; for the screened network traffic samples using UDP as the transport protocol, these fields are removed to obtain the first traffic sample set.
It should be noted that, in some embodiments, the first traffic sample set obtained in step S101 may be a stateless network traffic sample using UDP as a transport protocol, or may be a network traffic sample corresponding to a transport protocol in other connectionless states, that is, the network service detection method for stateless recording provided in the embodiments of the present application may be applicable to a network traffic sample corresponding to a transport protocol in any connectionless state, and in the embodiments of the present application, UDP is described as an example.
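Steps a-2 and a-3 can be illustrated with a minimal Python sketch. It assumes each flow record is a dictionary keyed by the field names of Table 1, that the Protocol field carries the IP protocol number (17 for UDP), and that the next-hop gateway address is stored under a hypothetical key `next_hop`; none of these representation details are specified in the application.

```python
import ipaddress

UDP = 17  # IP protocol number for UDP

def is_valid_ip(text):
    """Return True when text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False

def preprocess(records, drop_fields=("next_hop",)):
    """De-duplicate, drop malformed records, keep UDP only, drop unused fields."""
    seen = set()
    first_set = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        if not (is_valid_ip(rec.get("S_IP", "")) and is_valid_ip(rec.get("D_IP", ""))):
            continue  # abnormal record, e.g. an incomplete IP address
        if rec.get("Protocol") != UDP:
            continue  # ICMP, TCP and other protocols are out of scope
        first_set.append({k: v for k, v in rec.items() if k not in drop_fields})
    return first_set

# Illustrative records; "next_hop" is an assumed key name.
raw = [
    {"S_IP": "10.0.0.1", "D_IP": "10.0.0.9", "Protocol": 17, "S_Port": 53, "D_Port": 40000, "next_hop": "10.0.0.254"},
    {"S_IP": "10.0.0.1", "D_IP": "10.0.0.9", "Protocol": 17, "S_Port": 53, "D_Port": 40000, "next_hop": "10.0.0.254"},
    {"S_IP": "10.0.0", "D_IP": "10.0.0.9", "Protocol": 17, "S_Port": 53, "D_Port": 40001, "next_hop": "10.0.0.254"},
    {"S_IP": "10.0.0.2", "D_IP": "10.0.0.9", "Protocol": 1, "S_Port": 0, "D_Port": 0, "next_hop": "10.0.0.254"},
]
first_set = preprocess(raw)
```

In this example the duplicate, the record with the incomplete source address, and the ICMP record are all discarded, leaving a single UDP record without the next-hop field.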
In step S102, network traffic samples corresponding to the same source port under each same source network address in the first traffic sample set are screened out as a subset, so as to form a second traffic sample set.
The server program typically exists in the form of a service in a Windows service system or a Linux system and listens to the client's requests using ports in the TCP/IP protocol. After the service is operated, the access right of the corresponding port is opened in the network, so that the network service opened by the server can communicate within the network range through the port. Therefore, only the network traffic sample record of the server side needs to be screened out for analysis on the basis of data preprocessing.
Referring to table 1, the fields and meanings contained for each network traffic sample in the first traffic sample set of the present application are shown.
Table 1 Fields in network traffic samples and their meanings
Data field    Meaning
S_IP          Source IP address
sIP_Local     Source IP address home location
D_IP          Destination IP address
dIP_Local     Destination IP address home location
Timestamp     Flow arrival timestamp
Duration      Duration of the flow
Protocol      Transmission protocol
#Packet       Number of data packets in the flow
S_Port        Source port number
D_Port        Destination port number
Each network traffic sample is a unidirectional data packet stream transmitted between a certain source port of a source network address (source IP address) and a certain destination port of a destination network address (destination IP address), and the application screens out the network traffic sample of the same source port under each same source network address from a first traffic sample set by looking up a field in table 1 as a subset to form a second traffic sample set as an analysis object of a subsequent step.
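The grouping in step S102 can be sketched as follows, assuming each record is a dictionary keyed by the Table 1 field names. The function name `group_by_source` is chosen for illustration only.

```python
from collections import defaultdict

def group_by_source(first_set):
    """Group flow records into subsets keyed by (source IP, source port)."""
    subsets = defaultdict(list)
    for rec in first_set:
        subsets[(rec["S_IP"], rec["S_Port"])].append(rec)
    return dict(subsets)

first_set = [
    {"S_IP": "10.0.0.1", "S_Port": 53, "D_IP": "10.0.0.9", "D_Port": 40000},
    {"S_IP": "10.0.0.1", "S_Port": 53, "D_IP": "10.0.0.8", "D_Port": 40001},
    {"S_IP": "10.0.0.9", "S_Port": 40000, "D_IP": "10.0.0.1", "D_Port": 53},
]
second_set = group_by_source(first_set)
```

Each key of `second_set` identifies one subset of the second traffic sample set.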
In step S103, clustering the second flow sample set by using the flow characteristics of each subset in the second flow sample set, and screening the network flow samples corresponding to the server in the first flow sample set based on the clustering result to obtain a third flow sample set;
In this embodiment of the present application, the network traffic samples in each subset of the second traffic sample set come from the same port under the same source IP address, but the source IP address of a subset may belong to either a server or a client.
Therefore, once the second traffic sample set has been obtained, the sources of the network traffic samples in each of its subsets are analyzed to determine which source IP addresses correspond to servers. The network traffic samples of the ports under all source IP addresses that correspond to servers are then screened out of the first traffic sample set and used as the analysis objects for subsequently determining the port opening condition under the network address to be tested.
As a possible implementation, the clustering the second flow sample set using the flow characteristics of each subset of the second flow sample set includes:
obtaining a first traffic characteristic of each subset based on the number of network traffic samples in each subset;
obtaining a second flow characteristic of each subset based on the number of network flow samples in each subset and the message length of the network flow samples in each subset;
clustering samples in a flow characteristic data set based on a first flow characteristic and a second flow characteristic, and dividing the samples in the flow characteristic data set into samples corresponding to a server side and samples corresponding to a client side;
the same sample comprises a source network address, a source port, a first flow characteristic and a second flow characteristic corresponding to the same subset, and the flow characteristic data set comprises samples corresponding to each subset.
In order to determine the source IP address of the corresponding server, when selecting the flow characteristics for analyzing whether each subset in the second flow sample set corresponds to the server, it is considered that the server should send more messages than the client, and the number of samples of the corresponding network flow record should be higher. Therefore, the number of network traffic samples in each subset of the second traffic sample set is selected as the first traffic characteristic of each subset (herein denoted as NumOfTx, x corresponds to any subset x of the second traffic sample set), and the greater the NumOfTx index value, the more likely the network traffic samples in the corresponding subset x are sent out by the source IP address corresponding to the server.
In addition, the present application considers that the total message length sent by a server will be greater than that sent by a client. A plain sum, however, is open to interference from a Super-peer server, that is, a server that sends a large number of packets whose individual lengths are small but whose sum may still be large. Therefore, the mean of the message lengths, that is, the average message length, needs to be calculated.
Therefore, as the second traffic feature used to analyze whether each subset of the second traffic sample set corresponds to the server, the average message length of the network traffic samples in each subset is calculated from the number of network traffic samples in the subset and the total message length of all network traffic samples in the subset. It is denoted here as AvgPktLenOfTx, with AvgPktLenOfTx = ΣPktLenOfTx / NumOfTx, where x corresponds to any subset x of the second traffic sample set and ΣPktLenOfTx is the total message length of subset x. Similarly, the greater the AvgPktLenOfTx value, the more likely it is that the network traffic samples in the corresponding subset x were sent from a source IP address belonging to a server.
After the first flow characteristic and the second flow characteristic of each subset are determined, the sample { sIP, sPort, NumOfTx, AvgPktLenOfTx } corresponding to each subset is added to the flow characteristic data set. The samples in the flow characteristic data set are then clustered based on the first flow characteristic and the second flow characteristic, and the samples corresponding to the server side and the samples corresponding to the client side are determined from the clustering result.
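The two features can be computed per subset as in the sketch below. Table 1 lists no message-length field, so the sketch assumes each record carries one under the hypothetical key `PktLen`; the dictionary keys of the resulting samples are likewise chosen for illustration.

```python
def subset_features(subsets):
    """Build one feature sample per (source IP, source port) subset."""
    features = []
    for (s_ip, s_port), recs in subsets.items():
        num_of_tx = len(recs)                       # first feature: sample count
        total_len = sum(r["PktLen"] for r in recs)  # "PktLen" is an assumed field
        features.append({
            "sIP": s_ip,
            "sPort": s_port,
            "NumOfTx": num_of_tx,
            "AvgPktLenOfTx": total_len / num_of_tx,  # second feature: mean length
        })
    return features

subsets = {
    ("10.0.0.1", 53): [{"PktLen": 100}, {"PktLen": 300}],
    ("10.0.0.9", 40000): [{"PktLen": 60}],
}
features = subset_features(subsets)
```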
Optionally, when the k-means clustering algorithm is used for clustering samples in the flow characteristic data set, the number of clusters is set to be 2, the first flow characteristic and the second flow characteristic are taken as characteristic dimensions, the samples in the flow characteristic data set are clustered based on the k-means clustering algorithm to obtain two clusters, one cluster comprises samples corresponding to a server side, and the other cluster comprises samples corresponding to a client side;
based on the meanings of the first flow characteristic and the second flow characteristic described in the foregoing embodiment (larger values indicate a server), the cluster whose cluster center is farther from the origin is the server cluster, and the samples in it are the samples corresponding to the server; the other cluster is the client cluster, and the samples in it are the samples corresponding to the client.
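A minimal pure-Python sketch of the two-cluster k-means step follows. Three choices in it are assumptions rather than part of the application: the features are min-max scaled so that neither dimension dominates the distance, the initial centers are the points nearest to and farthest from the origin (to make the run deterministic), and the server cluster is taken to be the one whose center has the larger feature values, following the stated meaning of the two features (larger NumOfTx and AvgPktLenOfTx indicate a server).

```python
import math

def min_max_scale(points):
    """Scale each dimension to [0, 1] (an assumed normalization step)."""
    dims = list(zip(*points))
    lows = [min(d) for d in dims]
    spans = [(max(d) - min(d)) or 1.0 for d in dims]
    return [tuple((v - lo) / sp for v, lo, sp in zip(p, lows, spans)) for p in points]

def two_means(points, iterations=50):
    """Plain k-means with k = 2; returns (labels, centers)."""
    # Assumed deterministic start: nearest and farthest points from the origin.
    centers = [min(points, key=lambda p: math.hypot(*p)),
               max(points, key=lambda p: math.hypot(*p))]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min((0, 1), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# (NumOfTx, AvgPktLenOfTx) for five illustrative subsets.
raw_points = [(1, 60), (2, 80), (1, 70), (500, 1200), (450, 1100)]
labels, centers = two_means(min_max_scale(raw_points))
# Server cluster: the center with the larger feature values (farther from origin).
server_cluster = max((0, 1), key=lambda c: math.hypot(*centers[c]))
server_indices = [i for i, lab in enumerate(labels) if lab == server_cluster]
```

In practice a library implementation of k-means could replace `two_means`; the hand-written version is used here only to keep the sketch self-contained.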
As a possible implementation manner, the screening, based on the clustering result, the network traffic samples corresponding to the service end in the first traffic sample set to obtain a third traffic sample set includes:
determining a source network address corresponding to the server based on the sample corresponding to the server;
and screening the network flow samples corresponding to the service end in the first flow sample set based on the source network address corresponding to the service end to obtain a third flow sample set.
Specifically, the source network addresses corresponding to the server are determined from the samples belonging to the server cluster, and the network traffic samples corresponding to the server in the first traffic sample set are determined from those addresses. In other words, the clustering result obtained by cluster analysis of the traffic feature data set, which is built from the traffic features of each subset of the second traffic sample set, is used to screen out the network traffic samples corresponding to the server from the first traffic sample set, giving the third traffic sample set that serves as the analysis object of the subsequent steps.
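The screening of the third traffic sample set can be sketched as a filter on source address. The record and sample layouts (dictionaries with `S_IP` and `sIP` keys) are assumptions made for illustration.

```python
def build_third_set(first_set, server_samples):
    """Keep the records of the first set whose source IP belongs to a server."""
    server_ips = {sample["sIP"] for sample in server_samples}
    return [rec for rec in first_set if rec["S_IP"] in server_ips]

first_set = [
    {"S_IP": "10.0.0.1", "S_Port": 53},
    {"S_IP": "10.0.0.1", "S_Port": 123},
    {"S_IP": "10.0.0.9", "S_Port": 40000},
]
server_samples = [{"sIP": "10.0.0.1", "sPort": 53}]
third_set = build_third_set(first_set, server_samples)
```

Note that the filter keeps every port under a server address, not only the port of the subset that fell into the server cluster.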
It should be noted that the flow characteristics selected for clustering, which are used to analyze whether each subset corresponds to the server, are not limited to the first and second flow characteristics provided in this application; other flow characteristics that characterize the difference between the server and the client, such as the frequency of message sending, may also be selected. Likewise, clustering algorithms other than the k-means clustering algorithm may be used to classify the subsets of the second flow sample set based on the flow characteristics. The application does not limit this.
After the third flow sample set is obtained, the data support preparation process is completed. In the subsequent actual detection process of the object to be detected, the content of the third flow sample set is directly accessed to determine the network service opening condition (whether the port is opened) of the object to be detected.
It should be noted that, for each time network traffic data is obtained according to a certain network traffic data obtaining period and a sampling rate of the network traffic data, network traffic samples corresponding to the server and using UDP as a transmission protocol are screened out from the network traffic data according to the foregoing steps S101 to S103, so as to obtain a third traffic sample set, which is used for subsequent actual detection of the object to be detected.
In the foregoing embodiments, source IP addresses corresponding to the server are distinguished from source IP addresses corresponding to the client by a statistical-feature discrimination method based on the session mechanism, rather than by the static methods of conventional approaches, which rely on IP lookup tools such as whois or on a database of service IP records. The method does not depend on external knowledge, is highly reliable, and solves the problem that the server side cannot otherwise be distinguished in network traffic samples that use UDP as the transport protocol.
In step S104, inquiring a network traffic sample corresponding to the object to be tested from the third traffic sample set, and performing passive detection on the object to be tested based on the inquired network traffic sample, to determine an opening probability of the object to be tested, where the object to be tested is a port to be tested under a network address to be tested;
As can be seen from the foregoing embodiments, for the UDP network traffic sample using UDP as the transport protocol, since UDP does not have a process of establishing and disconnecting, it is a stateless record, and therefore, the opening condition of the port under the source IP address corresponding to the server cannot be determined directly through the UDP network traffic sample.
When a detection request for an object to be detected is received and network service opening conditions under a network address to be detected need to be detected, passive detection is firstly executed on a port to be detected under the network address to be detected.
Optionally, if the object to be measured is a network address to be measured, defaulting to the port to be measured under the network address to be measured as a port with a port number from 0 to 65535, traversing the port with the port number from 0 to 65535, searching the network traffic sample corresponding to the traversed port from the third traffic sample set, passively detecting the traversed port based on the searched network traffic sample, and finally determining the opening probability of each port.
Ports are used mainly at the transport layer and serve the TCP and UDP protocols. A port number is a number assigned to a port to identify a particular service connected to the network, and ranges from 0 to 65535; different services may run on the same device and are identified by different port numbers, for example port 80 for web browsing and port 21 for FTP.
Optionally, if the object to be tested is a specified subset of the ports 0 to 65535 under the network address to be tested, the specified ports are traversed, the network traffic samples corresponding to each traversed port are queried from the third traffic sample set, and passive detection is performed on the traversed port based on the queried network traffic samples to determine the opening probability of each port.
It should be noted that the network address to be tested may be specified either as a network address within a specified IP range or as a network address in a specified list.
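The lookup of samples for an object to be tested can be sketched with an index keyed by (source IP, source port). The function names are hypothetical, and expressing a specified IP range in CIDR notation is an assumption made for the example.

```python
import ipaddress
from collections import defaultdict

def index_third_set(third_set):
    """Index server-side records by (source IP, source port) for fast lookup."""
    index = defaultdict(list)
    for rec in third_set:
        index[(rec["S_IP"], rec["S_Port"])].append(rec)
    return index

def samples_for_target(index, ip, ports=None):
    """Return {port: records} for one address; default is all ports 0..65535."""
    ports = range(65536) if ports is None else ports
    return {port: index[(ip, port)] for port in ports if (ip, port) in index}

def expand_targets(cidr):
    """Expand an IP range given in CIDR notation into host addresses."""
    return [str(host) for host in ipaddress.ip_network(cidr).hosts()]

third_set = [
    {"S_IP": "192.0.2.1", "S_Port": 53},
    {"S_IP": "192.0.2.1", "S_Port": 53},
    {"S_IP": "192.0.2.1", "S_Port": 5000},
]
index = index_third_set(third_set)
all_ports = samples_for_target(index, "192.0.2.1")
some_ports = samples_for_target(index, "192.0.2.1", ports=[53, 80])
targets = expand_targets("192.0.2.0/30")
```

Ports with no recorded samples are simply absent from the result, so only ports with evidence proceed to voting.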
In one or more embodiments, the passively detecting the object to be detected based on the queried network traffic sample, and determining the opening probability of the object to be detected includes:
and executing the following operations on each port to be tested under the network address to be tested:
determining the number of votes obtained on each voting item by a network flow sample corresponding to a port to be tested based on a preset voting item and a voting number calculation mechanism corresponding to each voting item;
determining the opening probability of the port to be tested based on the number of votes obtained on each voting item by the network traffic sample corresponding to the port to be tested;
The voting item comprises a judging rule for judging whether a port is open, and the number of votes obtained on a voting item represents the degree to which the network traffic sample conforms to the corresponding judging rule.
Specifically, the voting items include positive voting items and negative voting items:
the positive voting items are voting items indicating that a port is open; for example, a network traffic sample whose sIP_Local field contains IDC may have been sent by a machine-room server (server side);
the negative voting items are voting items indicating that a port is not open, for example that the port is a high-numbered port and is not one of the defined common ports.
In this application, each network traffic sample corresponding to the port to be tested is voted on based on the preset voting items and the vote count calculation mechanism corresponding to each voting item, and the votes obtained by these samples on each voting item are then accumulated. The more the total positive vote count of a port exceeds its total negative vote count, the higher the probability that the port is open.
As a possible implementation manner, the passive detection of the object to be detected based on the queried network traffic sample, and determining the opening probability of the object to be detected, includes:
Traversing a network flow sample corresponding to a port to be tested, and calculating positive and negative votes obtained by the current network flow sample based on parameters of the traversed network flow sample, preset voting items and a vote count calculation mechanism corresponding to each voting item;
determining the opening probability of the port to be tested based on the proportion of the total positive ticket number of the network flow sample corresponding to the port to be tested in the total ticket number;
the total positive ticket number is the sum of the positive ticket numbers obtained by each network flow sample corresponding to the port to be tested, and the total ticket number is the sum of the total positive ticket number and the total negative ticket number of the network flow sample corresponding to the port to be tested.
The voting items used in the passive detection of the object to be detected in the present application and the vote count calculation mechanism corresponding to each voting item are described below with reference to table 2.
Table 2 Vote count calculation mechanism
Referring to table 2, for each network traffic sample corresponding to a port to be tested, voting is performed according to table 2.
For example, when the source port number of a network traffic sample satisfies portNum < 2^10, or the port is a common IANA service port (the port numbers counted as common are defined according to the actual situation; for example, 3389 may be defined as a common port), 3 votes are added to the positive vote count;
Correspondingly, when the port number portNum is a high-numbered port and is not a common port, the corresponding negative vote count is calculated according to a formula.
The "last known record is outdated" item means that the larger the difference in days, #passdays, between the flow arrival Timestamp of the network traffic sample and the current system time, the lower the probability that the port is open; for example, if #passdays is 365 days, 3 votes are added to the negative vote count according to the vote count calculation mechanism in Table 2.
It should be noted that the voting term and the vote count calculation mechanism used in the implementation of the present application are not limited to those shown in table 2, but may be other voting term and vote count calculation mechanisms set according to actual scenes, which is not limited in this application.
For each port to be tested, each network traffic sample corresponding to the port is voted on according to the voting items given in Table 2 and the vote count calculation mechanism corresponding to each voting item. All positive votes are accumulated to obtain the total positive vote count vote⁺ of the port, and all negative votes are accumulated to obtain the total negative vote count vote⁻. Finally, the proportion of vote⁺ in the total vote count, vote⁺ / (vote⁺ + vote⁻), is taken as the opening probability of the port to be tested.
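The evidence-accumulation calculation can be sketched as below. The body of Table 2 is not reproduced in this text, so most weights are assumptions made only for illustration: the 3 positive votes for a low or common port follow the description, while the weight for an IDC home location, the weight for a high uncommon port, and the linear staleness penalty (capped so that 365 days gives 3 negative votes) are hypothetical stand-ins.

```python
from datetime import datetime

COMMON_PORTS = {3389}  # example from the description; the real list is deployment-specific

def votes_for_sample(rec, now):
    """Return (positive, negative) votes for one flow record."""
    pos = neg = 0.0
    port = rec["S_Port"]
    if port < 2 ** 10 or port in COMMON_PORTS:
        pos += 3.0   # stated in the description
    else:
        neg += 1.0   # ASSUMED weight; the source uses a formula from Table 2
    if "IDC" in rec.get("sIP_Local", ""):
        pos += 1.0   # ASSUMED weight for a machine-room source
    pass_days = max(0, (now - rec["Timestamp"]).days)
    neg += min(3.0, 3.0 * pass_days / 365)  # ASSUMED linear staleness penalty
    return pos, neg

def open_probability(samples, now):
    """Share of positive votes among all votes cast for one port."""
    vote_pos = vote_neg = 0.0
    for rec in samples:
        pos, neg = votes_for_sample(rec, now)
        vote_pos += pos
        vote_neg += neg
    total = vote_pos + vote_neg
    return vote_pos / total if total else 0.0

now = datetime(2024, 1, 1)
fresh_dns = {"S_Port": 53, "sIP_Local": "IDC", "Timestamp": datetime(2024, 1, 1)}
stale_high = {"S_Port": 50000, "sIP_Local": "home", "Timestamp": datetime(2023, 1, 1)}
p_fresh = open_probability([fresh_dns], now)
p_stale = open_probability([stale_high], now)
p_mixed = open_probability([fresh_dns, stale_high], now)
```

Because the probability is a ratio of accumulated votes, adding or re-weighting voting items changes only `votes_for_sample`.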
The voting items can be regarded as a piece of evidence for representing whether the port is open or not, so that the problem of judging the open condition of the network service at the current moment can be finished by using historical data based on the voting mechanism based on evidence accumulation provided by the embodiment of the application.
In step S105, active detection is performed on the ports to be detected whose opening probability meets the requirement, and the ports that are opened under the network address to be detected are determined.
According to the method and the device, after the opening probability of each port to be detected under the network address to be detected is determined through the passive detection mode in the embodiment, a plurality of ports to be detected, which are most likely to be opened, are selected as ports to be detected, the opening probability of which meets the requirements, and the selected ports are actively detected.
As a possible implementation manner, the actively detecting the port to be detected with the opening probability meeting the requirement, determining the port to be opened under the network address to be detected, includes:
The opening probabilities of all ports to be tested are sorted in descending order, and a preset number of top-ranked ports to be tested are selected; or the ports to be tested whose opening probability is greater than a preset threshold are selected from all ports to be tested;
and actively detecting the selected port to be detected, and determining the port opened under the network address to be detected.
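The two selection strategies can be sketched in one helper. The function name and its parameters `top_n` and `threshold` are hypothetical.

```python
def select_ports(probabilities, top_n=None, threshold=None):
    """Pick candidate ports either by rank (top_n) or by probability threshold."""
    if top_n is not None:
        ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
        return [port for port, _ in ranked[:top_n]]
    if threshold is not None:
        return [port for port, prob in probabilities.items() if prob > threshold]
    raise ValueError("specify top_n or threshold")

probabilities = {53: 0.9, 123: 0.4, 5000: 0.7}
by_rank = select_ports(probabilities, top_n=2)
by_threshold = select_ports(probabilities, threshold=0.5)
```

Only the ports returned here are handed to active detection, which is what limits the scale of probing.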
Specifically, the selected ports to be tested may be actively detected with a detection tool such as Nmap: detection packets are sent, the ports that prove not to be open are removed from the selected ports based on the observed responses, and a list of the ports determined to be open under the network address to be tested is obtained. Relevant information about each port's service (such as service type and version number) is added, and the result is returned to the user.
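As a stand-in for a dedicated scanner, a single UDP probe can be sketched with the Python standard library. This is not how the application performs active detection, which names a detection tool; the payload, the timeout, and the result labels are assumptions. A UDP port that stays silent cannot be told apart from a filtered one, so the sketch reports that case separately.

```python
import socket

def probe_udp(ip, port, payload=b"\x00", timeout=1.0):
    """Send one UDP datagram and classify the port from the observed response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((ip, port))   # lets an ICMP port-unreachable surface as an error
            sock.send(payload)
            sock.recv(4096)
            return "open"              # a reply arrived
        except ConnectionRefusedError:
            return "closed"            # ICMP port unreachable was reported
        except socket.timeout:
            return "open|filtered"     # no reply: open and silent, or filtered
        except OSError:
            return "unknown"           # other network error
```

Real services often answer only protocol-specific payloads, so a practical probe would send a request appropriate to the expected service rather than a single zero byte.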
Fig. 2 is a schematic logic block diagram of a method for detecting network services for stateless recording according to an embodiment of the present application.
In a data preparation stage, preprocessing acquired network flow data to obtain a first flow sample set, wherein the first flow sample set comprises network flow samples taking a user datagram protocol UDP as a transmission protocol, then screening out network flow samples corresponding to the same source port under each same source network address in the first flow sample set to obtain a second flow sample set, extracting flow characteristics of each subset in the second flow sample set, clustering the second flow sample set, determining a source network address corresponding to a server by utilizing a clustering result, and obtaining a third flow sample set from network flow samples corresponding to the server in the first flow sample set based on the source network address corresponding to the server, thereby completing data preparation;
In the actual detection process, inquiring a network flow sample corresponding to the object to be detected from a third flow sample set based on the object to be detected appointed in the inquiry task, and carrying out passive detection on the object to be detected based on the inquired network flow sample to determine the opening probability of the object to be detected, wherein the object to be detected is a port to be detected under the network address to be detected, and then carrying out active detection on the port to be detected with the opening probability meeting the requirement to determine the opening port under the network address to be detected.
With the network service detection method for stateless records provided in the foregoing embodiment, the acquired network traffic data is first preprocessed, and UDP network traffic samples (stateless records), that is, samples using UDP as the transport layer protocol, are screened out of the preprocessed data. The UDP network traffic samples corresponding to servers are then screened out based on the result of clustering the traffic characteristics of the UDP network traffic samples. When the port opening condition under the network address to be tested of a target server needs to be determined, each port to be tested is first passively detected: the number of votes obtained on each voting item by the UDP network traffic samples corresponding to each port to be tested is determined using preset voting items and the vote count calculation mechanism corresponding to each voting item, the opening probability of each port is calculated from the voting result, and the ports with a high probability of being open are screened out. Finally, the screened ports are actively detected. Compared with the prior art, the present application therefore not only supports network service detection for ports that use UDP as the transport layer protocol, but also, by combining passive and active detection, reduces the scale of unnecessary active detection, improves the efficiency of network service detection, and provides a solution for effective detection of large-scale network space.
Based on the same inventive concept, the embodiment of the present application further provides a network service detection device for stateless records, as shown in fig. 3, where the device includes:
a preprocessing module 301, configured to preprocess acquired network traffic data to obtain a first traffic sample set, where the first traffic sample set includes a network traffic sample using a user datagram protocol UDP as a transmission protocol;
a first screening module 302, configured to screen out network traffic samples corresponding to the same source port under each same source network address in the first traffic sample set as a subset, to form a second traffic sample set;
a second screening module 303, configured to cluster the second traffic sample set by using the traffic characteristics of each subset in the second traffic sample set, and screen the network traffic samples corresponding to the server in the first traffic sample set based on the clustering result, so as to obtain a third traffic sample set;
the first detection module 304 is configured to query a network traffic sample corresponding to the object to be detected from the third traffic sample set, and perform passive detection on the object to be detected based on the queried network traffic sample, to determine an opening probability of the object to be detected, where the object to be detected is a port to be detected under a network address to be detected;
The second detection module 305 is configured to actively detect a port to be detected whose opening probability meets the requirement, and determine an open port under the network address to be detected.
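The subset formation performed by the first screening module 302 can be sketched as a group-by on the pair (source network address, source port). The record layout (dictionaries with `src_ip` and `src_port` keys) and the function name `group_by_source` are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Dict[str, object]
SubsetKey = Tuple[str, int]  # (source network address, source port)


def group_by_source(first_set: List[Sample]) -> Dict[SubsetKey, List[Sample]]:
    """Group samples that share a source address and source port into subsets.

    The returned mapping plays the role of the second traffic sample set.
    """
    subsets: Dict[SubsetKey, List[Sample]] = defaultdict(list)
    for sample in first_set:
        key = (str(sample["src_ip"]), int(sample["src_port"]))
        subsets[key].append(sample)
    return dict(subsets)


if __name__ == "__main__":
    demo = [
        {"src_ip": "10.0.0.1", "src_port": 53, "length": 120},
        {"src_ip": "10.0.0.1", "src_port": 53, "length": 80},
        {"src_ip": "10.0.0.2", "src_port": 40000, "length": 60},
    ]
    for key, members in group_by_source(demo).items():
        print(key, len(members))
```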
In a possible implementation manner, the preprocessing module 301 performs preprocessing on the acquired network traffic data to obtain a first traffic sample set, and specifically includes:
acquiring network traffic data from a designated target network device;
performing operations of removing repeated data and/or abnormal data on the network traffic data to obtain primarily processed network traffic data;
and screening out the network traffic sample taking UDP as a transmission protocol from the primarily processed network traffic data to obtain a first traffic sample set.
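The three preprocessing steps above can be sketched as follows. The record fields, the function names, and in particular the rules used to decide that a record is abnormal are assumptions; the application does not define what counts as abnormal data.

```python
from typing import Dict, List

Record = Dict[str, object]
# Hypothetical record layout used to detect exact duplicates.
FIELDS = ("ts", "src_ip", "src_port", "dst_ip", "dst_port", "proto", "length")


def is_abnormal(record: Record) -> bool:
    """Assumed sanity rules: ports must be 1..65535 and length must be positive."""
    for name in ("src_port", "dst_port"):
        port = record.get(name)
        if not isinstance(port, int) or not 0 < port <= 65535:
            return True
    length = record.get("length")
    return not isinstance(length, int) or length <= 0


def preprocess(records: List[Record]) -> List[Record]:
    """Drop duplicates and abnormal records, then keep only UDP samples."""
    seen = set()
    first_set: List[Record] = []
    for record in records:
        key = tuple(record.get(name) for name in FIELDS)
        if key in seen:
            continue  # repeated data
        seen.add(key)
        if is_abnormal(record):
            continue  # abnormal data
        if str(record.get("proto", "")).upper() != "UDP":
            continue  # not a UDP sample
        first_set.append(record)
    return first_set
```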
In a possible implementation manner, the second screening module 303 clusters the second traffic sample set by using the traffic characteristics of each subset in the second traffic sample set, specifically includes:
obtaining a first traffic characteristic of each subset based on the number of network traffic samples in each subset;
obtaining a second flow characteristic of each subset based on the number of network flow samples in each subset and the message length of the network flow samples in each subset;
clustering samples in a traffic characteristic data set based on the first traffic characteristic and the second traffic characteristic, and dividing the samples in the traffic characteristic data set into samples corresponding to a server and samples corresponding to a client, wherein each sample comprises the source network address, the source port, the first traffic characteristic and the second traffic characteristic corresponding to one subset, and the traffic characteristic data set comprises the samples corresponding to all subsets;
The second screening module 303 screens the network traffic samples corresponding to the server in the first traffic sample set based on the clustering result to obtain a third traffic sample set, which specifically includes:
determining a source network address corresponding to the server based on the sample corresponding to the server;
and screening the network flow samples corresponding to the service end in the first flow sample set based on the source network address corresponding to the service end to obtain a third flow sample set.
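The clustering and screening steps above can be sketched with a small two-cluster k-means written in plain Python (k-means is the algorithm the application names for this step in a later implementation manner). Several choices here are assumptions, not taken from the application: the min-max scaling of both characteristics, the deterministic choice of initial centroids, and the rule that the cluster with the higher mean sample count is treated as the server cluster. All function and field names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Point = Tuple[float, float]


def _scale(values: List[float]) -> List[float]:
    """Min-max scale to [0, 1]; a constant column maps to 0.0 (assumption)."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def _dist2(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans_two(points: List[Point], max_iter: int = 100) -> List[int]:
    """Plain k-means with k = 2; returns a label (0 or 1) for each point."""
    if not points:
        return []
    centroids = [min(points), max(points)]  # deterministic seeds (assumption)
    labels: Optional[List[int]] = None
    for _ in range(max_iter):
        new_labels = [
            0 if _dist2(p, centroids[0]) <= _dist2(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels or []


def server_addresses(features: List[Dict]) -> Set[str]:
    """features: one dict per subset with src_ip, src_port, count, avg_len."""
    if not features:
        return set()
    counts = _scale([float(f["count"]) for f in features])
    lengths = _scale([float(f["avg_len"]) for f in features])
    labels = kmeans_two(list(zip(counts, lengths)))

    def mean_count(k: int) -> float:
        vals = [f["count"] for f, lab in zip(features, labels) if lab == k]
        return sum(vals) / len(vals) if vals else float("-inf")

    # Assumption: the cluster with the higher mean sample count is the server side.
    server = 0 if mean_count(0) >= mean_count(1) else 1
    return {f["src_ip"] for f, lab in zip(features, labels) if lab == server}


def third_sample_set(first_set: List[Dict], servers: Set[str]) -> List[Dict]:
    """Keep the samples of the first set whose source address is a server address."""
    return [s for s in first_set if s["src_ip"] in servers]
```

Because the final screening is keyed on the source network address only, every UDP sample sent from a server address is kept, regardless of its source port.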
In a possible implementation manner, the second screening module 303 obtains the first traffic characteristic of each subset based on the number of network traffic samples in each subset, specifically including:
the number of network traffic samples in each subset is taken as a first traffic characteristic of each subset;
The second screening module 303 obtains a second traffic characteristic of each subset based on the number of network traffic samples in each subset and the message length of the network traffic samples in each subset, and specifically includes:
based on the number of network traffic samples in each subset and the total message length of all network traffic samples in each subset, calculating an average message length of the network traffic samples in each subset, and taking the average message length of the network traffic samples in each subset as a second traffic characteristic of each subset.
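A minimal sketch of the two traffic characteristics follows: the sample count of a subset is the first characteristic, and the total message length divided by that count is the second. The input layout (a mapping from (source address, source port) to a list of samples with a `length` field) and the names used are assumptions for illustration.

```python
from typing import Dict, List, Tuple

SubsetKey = Tuple[str, int]  # (source network address, source port)


def subset_features(subsets: Dict[SubsetKey, List[Dict]]) -> List[Dict]:
    """Return one feature sample per subset: address, port, count, average length."""
    features: List[Dict] = []
    for (src_ip, src_port), samples in subsets.items():
        count = len(samples)  # first traffic characteristic
        if count == 0:
            continue  # an empty subset has no defined average
        total_length = sum(s["length"] for s in samples)
        features.append(
            {
                "src_ip": src_ip,
                "src_port": src_port,
                "count": count,
                "avg_len": total_length / count,  # second traffic characteristic
            }
        )
    return features
```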
In a possible implementation manner, the second screening module 303 clusters samples in the flow characteristic data set based on the first flow characteristic and the second flow characteristic, and specifically includes:
taking the first flow characteristic and the second flow characteristic as characteristic dimensions, and clustering samples in the flow characteristic data set based on a k-means clustering algorithm to obtain a server cluster and a client cluster;
based on the samples belonging to the server cluster and the samples belonging to the client cluster, the samples in the flow characteristic data set are divided into samples corresponding to the server and samples corresponding to the client.
In a possible implementation manner, the first detection module 304 queries, from the third traffic sample set, a network traffic sample corresponding to the object to be tested, and specifically includes:
Inquiring a network flow sample corresponding to each port to be tested under the network address to be tested from the third flow sample set;
the first detection module 304 passively detects the object to be detected based on the queried network traffic sample, and determines the opening probability of the object to be detected, which specifically includes:
and executing the following operations on each port to be tested under the network address to be tested:
determining the number of votes obtained on each voting item by the network traffic samples corresponding to the port to be tested, based on preset voting items and the vote count calculation mechanism corresponding to each voting item, wherein each voting item comprises a judgment rule for judging whether a port is open, and the number of votes represents the degree to which a network traffic sample conforms to the corresponding judgment rule;
and determining the opening probability of the port to be tested based on the number of votes obtained on each voting item by the network traffic sample corresponding to the port to be tested.
In one possible embodiment, the voting items include positive voting items and negative voting items, the positive voting items being voting items indicating that a port is open, and the negative voting items being voting items indicating that a port is not open;
the first detection module 304 determines, based on preset voting terms and a vote count calculation mechanism corresponding to each voting term, a vote count obtained by a network traffic sample corresponding to a port to be detected on each voting term, and specifically includes:
traversing the network traffic samples corresponding to a port to be tested, and calculating the positive votes and negative votes obtained by the current network traffic sample based on the parameters of the traversed network traffic sample, the preset voting items and the vote count calculation mechanism corresponding to each voting item;
and determining the opening probability of the port to be tested based on the proportion of the total positive vote count in the total vote count of the network traffic samples corresponding to the port to be tested, wherein the total positive vote count is the sum of the positive votes obtained by the network traffic samples corresponding to the port to be tested, and the total vote count is the sum of the total positive vote count and the total negative vote count of the network traffic samples corresponding to the port to be tested.
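The voting mechanism can be sketched as follows: every voting item is a function that maps one network traffic sample to a non-negative vote count, and the opening probability is the share of positive votes in all votes. The two example voting items, the sample fields `length` and `packets`, and the choice to return 0.0 when no votes are cast are purely hypothetical; the application does not fix concrete judgment rules in this passage.

```python
from typing import Callable, Dict, List

Sample = Dict[str, float]
VotingItem = Callable[[Sample], float]  # returns a non-negative vote count


def open_probability(
    samples: List[Sample],
    positive_items: List[VotingItem],
    negative_items: List[VotingItem],
) -> float:
    """Share of positive votes among all votes cast for one port under test."""
    total_positive = 0.0
    total_negative = 0.0
    for sample in samples:  # traverse the samples of the port under test
        total_positive += sum(item(sample) for item in positive_items)
        total_negative += sum(item(sample) for item in negative_items)
    total = total_positive + total_negative
    # Assumption: with no votes at all, the port is treated as unlikely to be open.
    return total_positive / total if total > 0 else 0.0


# Hypothetical positive item: longer messages earn up to one vote.
def long_message(sample: Sample) -> float:
    return min(sample["length"] / 100.0, 1.0)


# Hypothetical negative item: an isolated single packet earns one vote.
def single_packet(sample: Sample) -> float:
    return 1.0 if sample["packets"] == 1 else 0.0


if __name__ == "__main__":
    demo = [{"length": 200, "packets": 5}, {"length": 50, "packets": 1}]
    print(open_probability(demo, [long_message], [single_packet]))
```

Graded vote counts (rather than 0 or 1) let a sample express how strongly it matches a judgment rule, which is the role the vote count plays in the text above.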
In a possible implementation manner, the second detection module 305 performs active detection on the ports to be detected whose opening probability meets the requirement, and determines the open ports under the network address to be detected, which specifically includes:
sorting the opening probabilities of all the ports to be tested in descending order, and selecting a set number of ports to be tested with the highest opening probabilities; or selecting, from all the ports to be tested, the ports to be tested whose opening probability is larger than a preset threshold value;
and actively detecting the selected port to be detected, and determining the port opened under the network address to be detected.
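The two selection strategies above (a set number of highest-probability ports, or all ports above a preset threshold) can be sketched as one function. The function name `select_ports` and its parameters are hypothetical.

```python
from typing import Dict, List, Optional


def select_ports(
    probabilities: Dict[int, float],
    top_n: Optional[int] = None,
    threshold: Optional[float] = None,
) -> List[int]:
    """Choose ports for active detection by top-N ranking or by threshold."""
    if (top_n is None) == (threshold is None):
        raise ValueError("specify exactly one of top_n or threshold")
    # Sort ports by opening probability, highest first.
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    if top_n is not None:
        return [port for port, _ in ranked[:top_n]]
    return [port for port, prob in ranked if prob > threshold]


if __name__ == "__main__":
    probs = {53: 0.9, 123: 0.4, 161: 0.7, 69: 0.1}
    print(select_ports(probs, top_n=2))
    print(select_ports(probs, threshold=0.5))
```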
Based on the same inventive concept, the present application also provides an electronic device 400, as shown in fig. 4, comprising at least one processor 402; and a memory 401 communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of:
preprocessing the acquired network traffic data to obtain a first traffic sample set, wherein the first traffic sample set comprises network traffic samples taking a User Datagram Protocol (UDP) as a transmission protocol;
screening out network traffic samples corresponding to the same source port under each same source network address in the first traffic sample set as a subset to form a second traffic sample set;
clustering the second flow sample set by utilizing the flow characteristics of each subset in the second flow sample set, and screening network flow samples corresponding to the service end in the first flow sample set based on a clustering result to obtain a third flow sample set;
inquiring a network flow sample corresponding to the object to be detected from the third flow sample set, and passively detecting the object to be detected based on the inquired network flow sample to determine the opening probability of the object to be detected, wherein the object to be detected is a port to be detected under a network address to be detected;
And actively detecting the ports to be detected with the opening probability meeting the requirement, and determining the ports which are opened under the network address to be detected.
In one possible implementation, the processor 402 is specifically configured to:
acquiring network traffic data from a designated target network device;
performing operations of removing repeated data and/or abnormal data on the network traffic data to obtain primarily processed network traffic data;
and screening out the network traffic sample taking UDP as a transmission protocol from the primarily processed network traffic data to obtain a first traffic sample set.
In one possible implementation, the processor 402 is specifically configured to:
obtaining a first traffic characteristic of each subset based on the number of network traffic samples in each subset;
obtaining a second flow characteristic of each subset based on the number of network flow samples in each subset and the message length of the network flow samples in each subset;
clustering samples in a traffic characteristic data set based on the first traffic characteristic and the second traffic characteristic, and dividing the samples in the traffic characteristic data set into samples corresponding to a server and samples corresponding to a client, wherein each sample comprises the source network address, the source port, the first traffic characteristic and the second traffic characteristic corresponding to one subset, and the traffic characteristic data set comprises the samples corresponding to all subsets;
In one possible implementation, the processor 402 is specifically configured to:
Determining a source network address corresponding to the server based on the sample corresponding to the server;
and screening the network flow samples corresponding to the service end in the first flow sample set based on the source network address corresponding to the service end to obtain a third flow sample set.
In one possible implementation, the processor 402 is specifically configured to:
the number of network traffic samples in each subset is taken as a first traffic characteristic of each subset;
based on the number of network traffic samples in each subset and the total message length of all network traffic samples in each subset, calculating an average message length of the network traffic samples in each subset, and taking the average message length of the network traffic samples in each subset as a second traffic characteristic of each subset.
In one possible implementation, the processor 402 is specifically configured to:
taking the first flow characteristic and the second flow characteristic as characteristic dimensions, and clustering samples in the flow characteristic data set based on a k-means clustering algorithm to obtain a server cluster and a client cluster;
based on the samples belonging to the server cluster and the samples belonging to the client cluster, the samples in the flow characteristic data set are divided into samples corresponding to the server and samples corresponding to the client.
In one possible implementation, the processor 402 is specifically configured to:
Inquiring a network flow sample corresponding to each port to be tested under the network address to be tested from the third flow sample set;
and executing the following operations on each port to be tested under the network address to be tested:
determining the number of votes obtained on each voting item by the network traffic samples corresponding to the port to be tested, based on preset voting items and the vote count calculation mechanism corresponding to each voting item, wherein each voting item comprises a judgment rule for judging whether a port is open, and the number of votes represents the degree to which a network traffic sample conforms to the corresponding judgment rule;
and determining the opening probability of the port to be tested based on the number of votes obtained on each voting item by the network traffic sample corresponding to the port to be tested.
In one possible embodiment, the voting items include positive voting items including voting items that characterize an opening of a port, and negative voting items including voting items that characterize an unopened port;
in one possible implementation, the processor 402 is specifically configured to:
traversing a network flow sample corresponding to a port to be tested, and calculating positive and negative votes obtained by the current network flow sample based on parameters of the traversed network flow sample, preset voting items and a vote count calculation mechanism corresponding to each voting item;
and determining the opening probability of the port to be tested based on the proportion of the total positive vote count in the total vote count of the network traffic samples corresponding to the port to be tested, wherein the total positive vote count is the sum of the positive votes obtained by the network traffic samples corresponding to the port to be tested, and the total vote count is the sum of the total positive vote count and the total negative vote count of the network traffic samples corresponding to the port to be tested.
In one possible implementation, the processor 402 is specifically configured to:
sorting the opening probabilities of all the ports to be tested in descending order, and selecting a set number of ports to be tested with the highest opening probabilities; or selecting, from all the ports to be tested, the ports to be tested whose opening probability is larger than a preset threshold value;
and actively detecting the selected port to be detected, and determining the port opened under the network address to be detected.
The memory 401 is used for storing programs. Specifically, a program may include program code, and the program code includes computer operating instructions. The memory 401 may be a volatile memory, such as a random-access memory (RAM); it may also be a non-volatile memory, such as a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or it may be any one or a combination of the above volatile and non-volatile memories.
The processor 402 may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP. It may also be a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
The embodiment of the present application also provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to execute the network service detection method for stateless records provided by the foregoing embodiments.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus and modules described above may refer to the corresponding process in the foregoing method embodiment, which is not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of the modules is merely a division by logical function, and other divisions are possible in an actual implementation; multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices or modules, and may be in electrical, mechanical, or other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part.
The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive (SSD)).
The technical solutions provided herein have been described in detail above, and specific examples have been used to illustrate the principles and embodiments of the present application; the above examples are intended only to aid understanding of the methods and core ideas of the present application. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product.
Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application.
It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. A network service detection method for stateless records, comprising:
preprocessing the acquired network traffic data to obtain a first traffic sample set, wherein the first traffic sample set comprises network traffic samples taking a User Datagram Protocol (UDP) as a transmission protocol;
screening out network traffic samples corresponding to the same source port under each same source network address in the first traffic sample set as a subset to form a second traffic sample set;
clustering the second flow sample set by utilizing the flow characteristics of each subset in the second flow sample set, and screening network flow samples corresponding to the service end in the first flow sample set based on a clustering result to obtain a third flow sample set;
inquiring a network flow sample corresponding to the object to be detected from the third flow sample set, and passively detecting the object to be detected based on the inquired network flow sample to determine the opening probability of the object to be detected, wherein the object to be detected is a port to be detected under a network address to be detected;
And actively detecting the ports to be detected with the opening probability meeting the requirement, and determining the ports which are opened under the network address to be detected.
2. The method of claim 1, wherein preprocessing the acquired network traffic data to obtain a first traffic sample set comprises:
acquiring network traffic data from a designated target network device;
performing operations of removing repeated data and/or abnormal data on the network traffic data to obtain primarily processed network traffic data;
and screening out the network traffic sample taking UDP as a transmission protocol from the primarily processed network traffic data to obtain a first traffic sample set.
3. The method of claim 1, wherein the clustering the second set of traffic samples with the traffic characteristics of each subset of the second set of traffic samples comprises:
obtaining a first traffic characteristic of each subset based on the number of network traffic samples in each subset;
obtaining a second flow characteristic of each subset based on the number of network flow samples in each subset and the message length of the network flow samples in each subset;
clustering samples in a traffic characteristic data set based on the first traffic characteristic and the second traffic characteristic, and dividing the samples in the traffic characteristic data set into samples corresponding to a server and samples corresponding to a client, wherein each sample comprises the source network address, the source port, the first traffic characteristic and the second traffic characteristic corresponding to one subset, and the traffic characteristic data set comprises the samples corresponding to all subsets;
the screening of the network traffic samples corresponding to the server in the first traffic sample set based on the clustering result to obtain a third traffic sample set comprises:
Determining a source network address corresponding to the server based on the sample corresponding to the server;
and screening the network flow samples corresponding to the service end in the first flow sample set based on the source network address corresponding to the service end to obtain a third flow sample set.
4. The method of claim 3, wherein deriving the first traffic characteristic for each subset based on the number of network traffic samples in each subset comprises:
the number of network traffic samples in each subset is taken as a first traffic characteristic of each subset;
the obtaining the second traffic characteristic of each subset based on the number of the network traffic samples in each subset and the message length of the network traffic samples in each subset includes:
based on the number of network traffic samples in each subset and the total message length of all network traffic samples in each subset, calculating an average message length of the network traffic samples in each subset, and taking the average message length of the network traffic samples in each subset as a second traffic characteristic of each subset.
5. The method of claim 3, wherein clustering samples in the flow feature dataset based on the first flow feature and the second flow feature comprises:
Taking the first flow characteristic and the second flow characteristic as characteristic dimensions, and clustering samples in the flow characteristic data set based on a k-means clustering algorithm to obtain a server cluster and a client cluster;
based on the samples belonging to the server cluster and the samples belonging to the client cluster, the samples in the flow characteristic data set are divided into samples corresponding to the server and samples corresponding to the client.
6. The method according to claim 1, wherein querying the network traffic samples corresponding to the object to be tested from the third traffic sample set includes:
inquiring a network flow sample corresponding to each port to be tested under the network address to be tested from the third flow sample set;
the passive detection is carried out on the object to be detected based on the inquired network traffic sample, and the determining of the opening probability of the object to be detected comprises the following steps:
and executing the following operations on each port to be tested under the network address to be tested:
determining the number of votes obtained on each voting item by the network traffic samples corresponding to the port to be tested, based on preset voting items and the vote count calculation mechanism corresponding to each voting item, wherein each voting item comprises a judgment rule for judging whether a port is open, and the number of votes represents the degree to which a network traffic sample conforms to the corresponding judgment rule;
And determining the opening probability of the port to be tested based on the number of votes obtained on each voting item by the network traffic sample corresponding to the port to be tested.
7. The method of claim 6, wherein the voting items comprise positive voting items and negative voting items, the positive voting items being voting items that indicate an open port, and the negative voting items being voting items that indicate a port that is not open;
determining the number of votes obtained on each voting item by the network traffic samples corresponding to the port to be tested, based on the preset voting items and the vote count calculation mechanism corresponding to each voting item, comprises:
traversing the network traffic samples corresponding to the port to be tested, and calculating the positive votes and negative votes obtained by the current network traffic sample based on the parameters of that sample, the preset voting items, and the vote count calculation mechanism corresponding to each voting item;
and determining the opening probability of the port to be tested based on the proportion of the total positive votes in the total votes of the network traffic samples corresponding to the port to be tested, wherein the total positive votes is the sum of the positive votes obtained by all network traffic samples corresponding to the port to be tested, and the total votes is the sum of the total positive votes and the total negative votes of those samples.
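The proportion described above can be sketched as below. The voting items shown (`long_payload`, `short_payload`), their vote counts, the `length` field, and the choice to return 0.0 when no votes are cast are all hypothetical placeholders; the claims do not specify concrete judgment rules or vote weights.

```python
def opening_probability(samples, voting_items):
    """Return total positive votes / (total positive + total negative votes).

    `voting_items` is a list of (polarity, count_votes) pairs, where
    polarity is "positive" or "negative" and count_votes(sample) returns
    the votes that sample earns on the item.
    """
    total_positive = 0.0
    total_negative = 0.0
    # Traverse every sample of the port and accumulate votes per polarity.
    for sample in samples:
        for polarity, count_votes in voting_items:
            votes = count_votes(sample)
            if polarity == "positive":
                total_positive += votes
            else:
                total_negative += votes
    total = total_positive + total_negative
    # Assumption: a port with no votes at all gets probability 0.0.
    return total_positive / total if total else 0.0


# Hypothetical judgment rules, for illustration only.
def long_payload(sample):
    return 2 if sample["length"] >= 100 else 0


def short_payload(sample):
    return 1 if sample["length"] < 100 else 0


items = [("positive", long_payload), ("negative", short_payload)]
port_samples = [{"length": 120}, {"length": 40}, {"length": 300}]
probability = opening_probability(port_samples, items)
```

Here the three samples earn 4 positive votes and 1 negative vote in total, so the ratio is 4 / 5.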
8. The method of claim 7, wherein actively detecting the ports to be tested whose opening probability meets the requirement, and determining the open ports under the network address to be tested, comprises:
sorting the opening probabilities of all ports to be tested in descending order and selecting a set number of ports to be tested with the highest opening probabilities; or selecting, from all ports to be tested, the ports to be tested whose opening probability is greater than a preset threshold;
and actively detecting the selected ports to be tested, and determining the open ports under the network address to be tested.
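The two selection alternatives above can be illustrated with a short sketch. The function name `select_ports` and its parameters `top_n` and `threshold` are assumptions introduced for the example, and the active probing of the selected ports is deliberately left out.

```python
def select_ports(probabilities, top_n=None, threshold=None):
    """Pick candidate ports for active detection.

    `probabilities` maps port number -> opening probability. Exactly one
    of `top_n` or `threshold` is expected; both names are illustrative.
    """
    if top_n is not None:
        # Alternative 1: sort in descending order, keep the first top_n ports.
        ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
        return [port for port, _ in ranked[:top_n]]
    if threshold is not None:
        # Alternative 2: keep ports whose probability exceeds the threshold.
        return [port for port, p in probabilities.items() if p > threshold]
    raise ValueError("either top_n or threshold must be given")


port_probabilities = {53: 0.9, 123: 0.4, 161: 0.7, 500: 0.1}
by_rank = select_ports(port_probabilities, top_n=2)
by_threshold = select_ports(port_probabilities, threshold=0.5)
```

Restricting active probes to this shortlist is what lets the passive stage reduce the number of UDP ports that need to be probed.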
9. A network service detection apparatus for stateless records, comprising:
a preprocessing module, configured to preprocess the acquired network traffic data to obtain a first traffic sample set, wherein the first traffic sample set comprises network traffic samples that use the User Datagram Protocol (UDP) as the transport protocol;
a first screening module, configured to take the network traffic samples in the first traffic sample set that correspond to the same source port under the same source network address as one subset, the subsets forming a second traffic sample set;
a second screening module, configured to cluster the second traffic sample set using the traffic characteristics of each subset in the second traffic sample set, and to screen out, based on the clustering result, the network traffic samples in the first traffic sample set that correspond to servers, to obtain a third traffic sample set;
a first detection module, configured to query the network traffic samples corresponding to an object to be tested from the third traffic sample set, passively detect the object to be tested based on the queried network traffic samples, and determine the opening probability of the object to be tested, wherein the object to be tested is a port to be tested under a network address to be tested;
and a second detection module, configured to actively detect the ports to be tested whose opening probability meets the requirement, and to determine the open ports under the network address to be tested.
10. An electronic device comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
CN202311501395.9A 2023-11-13 2023-11-13 Network service detection method, device and equipment for stateless records Pending CN117579532A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311501395.9A CN117579532A (en) 2023-11-13 2023-11-13 Network service detection method, device and equipment for stateless records

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311501395.9A CN117579532A (en) 2023-11-13 2023-11-13 Network service detection method, device and equipment for stateless records

Publications (1)

Publication Number Publication Date
CN117579532A true CN117579532A (en) 2024-02-20

Family

ID=89863581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311501395.9A Pending CN117579532A (en) 2023-11-13 2023-11-13 Network service detection method, device and equipment for stateless records

Country Status (1)

Country Link
CN (1) CN117579532A (en)

Similar Documents

Publication Publication Date Title
US10356106B2 (en) Detecting anomaly action within a computer network
CN111935170B (en) Network abnormal flow detection method, device and equipment
US9503465B2 (en) Methods and apparatus to identify malicious activity in a network
EP3223495B1 (en) Detecting an anomalous activity within a computer network
CN113328985B (en) Passive Internet of things equipment identification method, system, medium and equipment
CN111953552B (en) Data flow classification method and message forwarding equipment
CN110266726B (en) Method and device for identifying DDOS attack data stream
EP3282643A1 (en) Method and apparatus of estimating conversation in a distributed netflow environment
CN112019449B (en) Traffic identification packet capturing method and device
CN111885106A (en) Internet of things safety management and control method and system based on terminal equipment characteristic information
JP2008219127A (en) Network quality measuring instrument, network quality measuring method, and network quality measuring program
CN113206860A (en) DRDoS attack detection method based on machine learning and feature selection
EP2530873B1 (en) Method and apparatus for streaming netflow data analysis
KR20140035678A (en) Learning-based dns analyzer and analysis method
KR100901696B1 (en) Apparatus of content-based Sampling for Security events and method thereof
CN112104523B (en) Detection method, device and equipment for flow transparent transmission and storage medium
CN112449371A (en) Performance evaluation method of wireless router and electronic equipment
CN114020734A (en) Flow statistics duplication removing method and device
CN114189348A (en) Asset identification method suitable for industrial control network environment
US20150058466A1 (en) Device for server grouping
CN117579532A (en) Network service detection method, device and equipment for stateless records
CN114567501B (en) Automatic asset identification method, system and equipment based on label scoring
Pekar et al. Towards threshold‐agnostic heavy‐hitter classification
CN110995887B (en) ID association method and device
CN111106980B (en) Bandwidth binding detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination