CN116032851B - NAT (Network Address Translation) identification method and system for TCP (Transmission Control Protocol) short connections based on interval time-sequence track characteristics


Info

Publication number
CN116032851B
CN116032851B CN202211730122.7A CN202211730122A CN116032851B CN 116032851 B CN116032851 B CN 116032851B CN 202211730122 A CN202211730122 A CN 202211730122A CN 116032851 B CN116032851 B CN 116032851B
Authority
CN
China
Prior art keywords
short
time sequence
interval
aggregation
connection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211730122.7A
Other languages
Chinese (zh)
Other versions
CN116032851A (en
Inventor
支凤麟
蔡晓华
杨光辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Netis Technologies Co ltd
Original Assignee
Shanghai Netis Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Netis Technologies Co ltd filed Critical Shanghai Netis Technologies Co ltd
Priority to CN202211730122.7A priority Critical patent/CN116032851B/en
Publication of CN116032851A publication Critical patent/CN116032851A/en
Application granted granted Critical
Publication of CN116032851B publication Critical patent/CN116032851B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/50Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a NAT (network address translation) identification method and system for TCP (transmission control protocol) short connections based on interval time-sequence track characteristics. The method comprises the following steps: constructing a short-connection network flow matching bipartite graph from short-connection network traffic frames and obtaining the corresponding short-connection traffic aggregations; generating a corresponding time-sequence vector for each short-connection traffic aggregation, and from it the corresponding interval time-sequence features; and calculating the similarity between short-connection traffic aggregations according to the matching bipartite graph and the interval time-sequence features, where, if the similarity is greater than a preset threshold, the two short-connection traffic aggregations are successfully matched across the NAT, thereby completing NAT identification. The invention performs NAT identification using the timing characteristics of high-frequency short-connection transmission, and makes up for the shortcoming that payload-based NAT identification methods cannot identify traffic when the payload is missing.

Description

NAT (network Address translation) identification method and system for TCP (Transmission control protocol) short connection based on interval time sequence track characteristics
Technical Field
The invention relates to the technical field of communication, in particular to a NAT (network address translation) identification method and a NAT identification system for TCP (Transmission control protocol) short connection based on interval time sequence track characteristics.
Background
The transmission control protocol (TCP, transmission Control Protocol) is a connection-oriented, reliable, byte-stream based transport layer communication protocol. TCP is intended to accommodate a layered protocol hierarchy that supports multiple network applications. Reliable communication services are provided by means of TCP between pairs of processes in host computers connected to different but interconnected computer communication networks. TCP assumes that it can obtain simple, possibly unreliable datagram services from lower level protocols. In principle, TCP should be able to operate over a variety of communication systems from hardwired to packet-switched or circuit-switched networks.
NAT stands for Network Address Translation. NAT can be used when hosts inside a private network have been assigned local IP addresses (i.e., private addresses used only within the private network) but need to communicate with hosts on the internet (without encryption). NAT not only alleviates the shortage of IP addresses, but also effectively shields against attacks from outside the network, concealing and protecting the computers inside it. In addition, by using a small number of global (public) IP addresses to represent a larger number of private IP addresses, NAT helps slow the exhaustion of the available IP address space.
When network traffic passes through a device with NAT functionality, the NAT device modifies information such as the IP address, port, and MAC address in the traffic as it forwards it. An application that monitors traffic must therefore match and link the same network traffic as it appears, with different IP, port, and MAC values, on the two sides of the NAT device. This function is NAT identification.
Conventional NAT identification methods generally work by comparing the similarity of the payload content of the traffic. However, high-frequency short-connection traffic without payload accounts for a non-negligible proportion of network traffic (in scenarios including but not limited to electronic payment, logistics short messages, and line instructions), and in that case payload-based NAT identification methods cannot perform identification.
Patent document CN115022280A discloses a method, a client, and a system for NAT detection. The method includes: the client initiates mapping behavior detection to a STUN server and identifies the mapping type of the NAT based on the mapping behavior detection; when the mapping behavior detection identifies the mapping type as endpoint-independent mapping, the client initiates filtering behavior detection to the STUN server and identifies the filtering type of the NAT based on the filtering behavior detection; when the filtering behavior detection identifies the filtering type as address- and port-dependent filtering, the client initiates sequential behavior detection to the STUN server and identifies the sequential type of the NAT based on the sequential behavior detection.
However, patent document CN115022280A performs NAT identification through the filtering behavior and data type of the NAT, which requires many assumptions, and it is not suitable for short-connection NAT identification including load data types.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a NAT identification method and a NAT identification system for TCP short connection based on interval time sequence track characteristics.
The NAT identification method for the TCP short connection based on the interval time sequence track characteristics provided by the invention comprises the following steps:
Step S1: constructing a short-connection network flow matching bipartite graph from short-connection network traffic frames and obtaining the corresponding short-connection traffic aggregations;
Step S2: generating a corresponding timing interval vector for each short-connection traffic aggregation, and from it the corresponding interval timing features;
Step S3: calculating the similarity between short-connection traffic aggregations according to the short-connection network flow matching bipartite graph and the interval timing features; if the similarity is greater than a preset threshold, the two short-connection traffic aggregations are successfully matched across the NAT, thereby completing NAT identification.
Preferably, step S1 includes:
Step S1.1: identifying short-connection network traffic, and collecting all short-connection network traffic within a specific time window to obtain the corresponding short-connection network traffic frame;
Step S1.2: sliding the time windows corresponding to the two ends of the traffic transmission, and dividing the short-connection network traffic at each end into a sequence of consecutive, non-overlapping short-connection network traffic frames, to obtain a first bipartite graph and a second bipartite graph;
Step S1.3: aggregating the short-connection network traffic frames in the first bipartite graph and in the second bipartite graph respectively, to obtain the corresponding short-connection traffic aggregations.
Preferably, the aggregation groups flows by triplet and divides them into a plurality of sets, wherein each element of a set is the quadruple of a short-connection network flow, and each set is one short-connection traffic aggregation.
Preferably, step S2 includes:
Step S2.1: sorting the short-connection network flows in each short-connection traffic aggregation in ascending order of the occurrence time in the quadruple, and generating the timing interval vector corresponding to the short-connection traffic aggregation;
Step S2.2: sliding a window of length n over the timing interval vector, obtaining a timing interval feature from the data covered by the window at each position, and thereby obtaining the timing interval feature set corresponding to the timing interval vector.
Preferably, step S3 includes:
Step S3.1: sequentially calculating the similarity of the timing interval features in the timing interval feature sets corresponding to the two short-connection traffic aggregations, wherein the formula is as follows:
wherein series_gap_sim represents the similarity of the timing interval features, and series_gap_dist represents the track distance of the timing interval features;
Step S3.2: after the similarities of the timing interval features in the timing interval feature sets corresponding to the two short-connection traffic aggregations have been calculated, taking the maximum of all the timing interval feature similarities as the similarity between the two short-connection traffic aggregations.
The invention provides a NAT identification system of TCP short connection based on interval time sequence track characteristics, comprising:
Module M1: constructing a short-connection network flow matching bipartite graph from short-connection network traffic frames and obtaining the corresponding short-connection traffic aggregations;
Module M2: generating a corresponding timing interval vector for each short-connection traffic aggregation, and from it the corresponding interval timing features;
Module M3: calculating the similarity between short-connection traffic aggregations according to the short-connection network flow matching bipartite graph and the interval timing features; if the similarity is greater than a preset threshold, the two short-connection traffic aggregations are successfully matched across the NAT, thereby completing NAT identification.
Preferably, the module M1 comprises:
Module M1.1: identifying short-connection network traffic, and collecting all short-connection network traffic within a specific time window to obtain the corresponding short-connection network traffic frame;
Module M1.2: sliding the time windows corresponding to the two ends of the traffic transmission, and dividing the short-connection network traffic at each end into a sequence of consecutive, non-overlapping short-connection network traffic frames, to obtain a first bipartite graph and a second bipartite graph;
Module M1.3: aggregating the short-connection network traffic frames in the first bipartite graph and in the second bipartite graph respectively, to obtain the corresponding short-connection traffic aggregations.
Preferably, the aggregation groups flows by triplet and divides them into a plurality of sets, wherein each element of a set is the quadruple of a short-connection network flow, and each set is one short-connection traffic aggregation.
Preferably, the module M2 comprises:
Module M2.1: sorting the short-connection network flows in each short-connection traffic aggregation in ascending order of the occurrence time in the quadruple, and generating the timing interval vector corresponding to the short-connection traffic aggregation;
Module M2.2: sliding a window of length n over the timing interval vector, obtaining a timing interval feature from the data covered by the window at each position, and thereby obtaining the timing interval feature set corresponding to the timing interval vector.
Preferably, the module M3 comprises:
Module M3.1: sequentially calculating the similarity of the timing interval features in the timing interval feature sets corresponding to the two short-connection traffic aggregations, wherein the formula is as follows:
wherein series_gap_sim represents the similarity of the timing interval features, and series_gap_dist represents the track distance of the timing interval features;
Module M3.2: after the similarities of the timing interval features in the timing interval feature sets corresponding to the two short-connection traffic aggregations have been calculated, taking the maximum of all the timing interval feature similarities as the similarity between the two short-connection traffic aggregations.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention performs NAT identification using the timing characteristics of high-frequency short-connection transmission, and makes up for the shortcoming that payload-based NAT identification methods cannot identify traffic when the payload is missing.
2. The invention can also be used for NAT identification of traffic that carries a payload. Compared with identification methods that unpack packets and compare their content, it requires no unpacking and identifies faster.
3. The invention tolerates the long forwarding intervals and partial traffic loss caused by NAT forwarding equipment, and therefore has better fault tolerance and robustness.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, given with reference to the accompanying drawings in which:
FIG. 1 is a schematic flow chart of the present invention.
Fig. 2 is a schematic flow chart of constructing a short connection flow matching bipartite graph in the present invention.
FIG. 3 is a flow chart of the method for generating interval timing features according to the present invention.
Fig. 4 is a schematic diagram of a process of identifying and matching based on detection of a time-series track feature in the present invention.
FIG. 5 is a schematic diagram of a trace of a time interval vector plot according to the present invention.
In fig. 5, the abscissa indicates the sequence number of the time interval, and the ordinate indicates the time length of the time interval.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the present invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications could be made by those skilled in the art without departing from the inventive concept. These are all within the scope of the present invention.
In the network industry, the portion of a TCP network connection that spans the three-way handshake and the four-way termination is referred to as a session. A session typically has a five-tuple header, as follows:
(source IP, source port, destination IP, destination port, protocol type)
The logical layer above the session is called a flow, i.e. a network flow, which typically has a quadruple header, as follows:
(source IP, source port, destination IP, destination port)
Flows that do not contain a SYN header are called short-connection flows. In the present invention, the quadruple representation of a short-connection flow, i.e. a short-connection network flow, is as follows:
(Source IP, destination Port, time of occurrence)
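As a minimal sketch, the record above could be held in a small immutable structure. The class name `ShortFlow`, the field names, and the millisecond unit are illustrative assumptions; only the three listed fields come from the text.

```python
from typing import NamedTuple


class ShortFlow(NamedTuple):
    """One short-connection flow record (class and field names are hypothetical)."""
    src_ip: str      # source IP
    dst_port: int    # destination port
    time_ms: float   # time of occurrence, assumed here to be in milliseconds


# Example record with made-up values.
flow = ShortFlow(src_ip="10.0.0.5", dst_port=443, time_ms=1000.0)
```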
Example 1
According to the NAT identification method of the TCP short connection based on the interval time sequence track feature, as shown in fig. 1, the method comprises the following steps:
Step S1: Construct a short-connection network flow matching bipartite graph from short-connection network traffic frames and obtain the corresponding short-connection traffic aggregations. As shown in fig. 2, step S1 includes:
Step S1.1: Identify short-connection network traffic, and collect all short-connection network traffic within a specific time window to obtain the corresponding short-connection network traffic frame. Specifically, a short-connection network traffic frame is the set of all short-connection flows within the time window [start_time, end_time], denoted short_flow_frame. The difference between start_time and end_time is typically no more than 1500 milliseconds.
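The frame collection in step S1.1 can be sketched as a simple filter over flow records. This is an illustration under stated assumptions: `ShortFlow`, `collect_frame`, and the millisecond timestamps are hypothetical names and units, and how short-connection traffic is identified in the first place is outside the sketch.

```python
from typing import List, NamedTuple


class ShortFlow(NamedTuple):
    src_ip: str
    dst_port: int
    time_ms: float  # occurrence time, assumed to be in milliseconds


def collect_frame(flows: List[ShortFlow], start_time: float,
                  end_time: float) -> List[ShortFlow]:
    """Return the flows whose occurrence time lies in [start_time, end_time]."""
    return [f for f in flows if start_time <= f.time_ms <= end_time]


flows = [
    ShortFlow("10.0.0.5", 443, 0.0),
    ShortFlow("10.0.0.5", 443, 500.0),
    ShortFlow("10.0.0.6", 8080, 1500.0),
    ShortFlow("10.0.0.6", 8080, 1501.0),  # falls just outside the window
]
short_flow_frame = collect_frame(flows, start_time=0.0, end_time=1500.0)
```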
Step S1.2: Slide the time windows corresponding to the two ends of the traffic transmission, and divide the short-connection network traffic at each end into a sequence of consecutive, non-overlapping short-connection network traffic frames, to obtain a first bipartite graph and a second bipartite graph. Specifically, if traffic is sent from side A of the NAT to side B, the traffic frame on side A is defined as the short-connection traffic frame with start and end times [t1, t2), denoted short_flow_frame_a, where the difference between t1 and t2 is generally no more than 1000 milliseconds. Correspondingly, the traffic frame on side B is defined as the short-connection traffic frame with start and end times [t1, t2 + delay_offset), denoted short_flow_frame_b, where delay_offset is generally no more than 500 milliseconds. Sliding the time window [t1, t2) divides the traffic data on side A into a sequence of consecutive, non-overlapping short-connection traffic frames, which together form the first bipartite graph. The traffic data on side B is divided into the corresponding sequence of traffic frames in the same way, and these frames together form the second bipartite graph.
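One way to read the windowing in step S1.2 is sketched below on bare timestamps. The function name `split_frames` and its parameters are hypothetical. The sketch also assumes that each side-B frame starts at the same t1 as its side-A counterpart and is only extended at the end by delay_offset, so neighbouring side-B frames may share flows; the text does not state this explicitly.

```python
from typing import List


def split_frames(times: List[float], t_start: float, width: float,
                 delay_offset: float = 0.0) -> List[List[float]]:
    """Cut timestamps into consecutive frames.

    Frame k covers [t_start + k*width, t_start + (k+1)*width + delay_offset).
    With delay_offset == 0 the frames are non-overlapping (side A).
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not times:
        return []
    last = max(times)
    frames = []
    k = 0
    while t_start + k * width <= last:
        lo = t_start + k * width
        hi = lo + width + delay_offset
        frames.append([t for t in times if lo <= t < hi])
        k += 1
    return frames


# Side A: 1000 ms windows; side B: same windows extended by 500 ms.
side_a = split_frames([0, 400, 999, 1000, 1800], t_start=0, width=1000)
side_b = split_frames([30, 450, 1040, 1050, 1900], t_start=0, width=1000,
                      delay_offset=500)
```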
Step S1.3: Aggregate the short-connection network traffic frames in the first bipartite graph and in the second bipartite graph respectively, to obtain the corresponding short-connection traffic aggregations. The aggregation groups flows by triplet and divides them into a plurality of sets, where each element of a set is the quadruple of a short-connection network flow and each set is one short-connection traffic aggregation. Specifically, within a short-connection traffic frame of the first bipartite graph, i.e. short_flow_frame_a, the short-connection flows (short_flow) are aggregated by the (source IP, destination port) triplet to obtain a plurality of sets. Each element of a set is a short-connection flow, and each set is called a short-connection traffic aggregation, denoted short_flow_agg. The same operation is applied to the short-connection traffic frames of the second bipartite graph, i.e. short_flow_frame_b, to obtain a plurality of short-connection traffic aggregations.
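The grouping in step S1.3 amounts to bucketing the flows of one frame by a key. The sketch below uses exactly the key fields the text lists, (source IP, destination port); `ShortFlow` and `aggregate` are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class ShortFlow(NamedTuple):
    src_ip: str
    dst_port: int
    time_ms: float


def aggregate(frame: List[ShortFlow]) -> Dict[Tuple[str, int], List[ShortFlow]]:
    """Group the flows of one frame; each value is one short_flow_agg."""
    groups: Dict[Tuple[str, int], List[ShortFlow]] = defaultdict(list)
    for flow in frame:
        groups[(flow.src_ip, flow.dst_port)].append(flow)
    return dict(groups)


short_flow_frame_a = [
    ShortFlow("10.0.0.5", 443, 10.0),
    ShortFlow("10.0.0.6", 443, 12.0),
    ShortFlow("10.0.0.5", 443, 35.0),
]
aggs = aggregate(short_flow_frame_a)
```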
Step S2: Generate a timing interval vector for each short-connection traffic aggregation, and from it the corresponding interval timing features. As shown in fig. 3, step S2 includes:
Step S2.1: Sort the short-connection network flows in each short-connection traffic aggregation in ascending order of the occurrence time in the quadruple, and generate the timing interval vector corresponding to the aggregation. That is, one timing interval vector is generated for each short-connection traffic aggregation. For a specific short-connection traffic aggregation short_flow_agg, the short-connection flows it contains are sorted by the occurrence time in the quadruple, from earliest to latest. Suppose a short-connection traffic aggregation contains the following short-connection flows:
short_flow_1, short_flow_2 … short_flow_n
The corresponding occurrence times of these short-connection flows are time_1, time_2 … time_n, and the timing interval vector series_gap_vec is:
[gap_1, gap_2 … gap_(n-1)]
where gap_i = time_(i+1) - time_i. A timing interval vector series_gap_vec is generated in this way for each short_flow_agg.
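The timing interval vector is simply the list of differences between consecutive sorted occurrence times. A minimal sketch follows; the function name mirrors the series_gap_vec notation, and passing bare timestamps rather than full flow records is a simplification made here.

```python
from typing import List


def series_gap_vec(times: List[float]) -> List[float]:
    """Return [gap_1 .. gap_(n-1)] with gap_i = time_(i+1) - time_i."""
    ordered = sorted(times)  # ascending occurrence time
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


gaps = series_gap_vec([30, 10, 25, 60])
```

An aggregation with n flows yields n-1 gaps, so a single-flow aggregation produces an empty vector.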
Step S2.2: sliding a sliding window with the length n being generally greater than or equal to 20, such as 10 and 20, on the time sequence interval vector, wherein the sliding step length is step, the step length is generally 1< = step < = 5, corresponding time sequence interval characteristics are obtained according to data covered by each sliding serial port, and further a time sequence interval characteristic set corresponding to the time sequence interval vector is obtained. Specifically, a certain timing interval feature is as follows:
[(gap_1,time1),(gap_2,time2)…(gap_L,time_L)]
Where gap_x is an element of a timing interval vector, i.e., the length of a timing interval, and time_x is the start time point of the timing interval.
The window slides on each time interval vector, each time slide generates a corresponding time interval feature series_gap_feature, and a set formed by all sliding results is called a time interval feature set and is marked as series_gap_feature_set. All the time interval feature sets of each time interval vector timeseries _gap_vec, i.e. series_gap_feature_set, are calculated by the above method.
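The sliding-window feature extraction can be sketched as follows. The function name follows the series_gap_feature_set notation; the choice to return an empty list when fewer than n gaps are available is an assumption, since the text does not say how short vectors are handled.

```python
from typing import List, Tuple

Feature = List[Tuple[float, float]]  # [(gap_x, time_x), ...]


def series_gap_feature_set(times: List[float], n: int,
                           step: int = 1) -> List[Feature]:
    """Slide a window of length n over the (gap, start time) pairs."""
    if n < 1 or step < 1:
        raise ValueError("n and step must be at least 1")
    ordered = sorted(times)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    # Pair each gap with the start time of its interval; zip stops at len(gaps).
    pairs = list(zip(gaps, ordered))
    return [pairs[s:s + n] for s in range(0, len(pairs) - n + 1, step)]


features = series_gap_feature_set([0, 10, 30, 60, 100], n=2, step=1)
```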
Step S3: Calculate the similarity between short-connection traffic aggregations according to the short-connection network flow matching bipartite graph and the interval timing features; if the similarity is greater than a preset threshold, the two short-connection traffic aggregations are successfully matched across the NAT, thereby completing NAT identification. That is, a threshold nat_threshold (a real number in the interval (0, 1)) is set, and if agg_sim > nat_threshold, the two aggregations are considered a successful NAT match, i.e. the (source IP, destination port) of the first bipartite graph is mapped to the (source IP, destination port) of the second bipartite graph after passing through the NAT. As shown in fig. 4, step S3 includes:
Step S3.1: Sequentially calculate the similarity of the timing interval features in the timing interval feature sets corresponding to the two short-connection traffic aggregations, wherein the formula is as follows:
where series_gap_sim represents the similarity of the timing interval features, and series_gap_dist represents the track distance of the timing interval features.
Step S3.2: After the similarities of the timing interval features in the timing interval feature sets corresponding to the two short-connection traffic aggregations have been calculated, take the maximum of all the timing interval feature similarities as the similarity between the two short-connection traffic aggregations.
Specifically, suppose there are short-connection traffic aggregations A and B, whose timing interval feature sets are denoted fea_set_a and fea_set_b respectively. One timing interval feature is taken from each of fea_set_a and fea_set_b, denoted fea_a_1 and fea_b_1.
The similarity of the timing interval features fea_a_1 and fea_b_1 is then calculated. For a timing interval feature [(gap_1, time_1), (gap_2, time_2) … (gap_L, time_L)], the first element of each pair is taken to build the sequence of interval lengths [gap_1, gap_2 … gap_L], and the corresponding sequence of start time points is [time_1, time_2 … time_L]. Plotting the sequence of interval lengths on a two-dimensional plane forms a track, as shown in FIG. 5.
For two timing interval features, the similarity can be calculated between the tracks formed from their interval sequences. The track similarity uses a time-series track distance calculation improved from dynamic time warping (DTW), defined as follows:
The distance between two points taken from different interval vectors: dist(i, j) = (gap_i - gap_j) * log(time_j - time_i + 1)
Let D be a square matrix with side length L, where D[i, j] is the shortest distance between the track formed by the first i points of one vector and the track formed by the first j points of the other vector. D can be defined recursively as:
D[i,j]=min(D[i,j-1],D[i-1,j-1],D[i-1,j])+dist(i,j)
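The recursion above can be sketched as a standard dynamic-programming table. Several details below are assumptions rather than part of the text: absolute values are taken inside the point distance so that it stays non-negative and the logarithm is always defined, the natural logarithm is used, and the boundary follows the usual DTW convention (D[0][0] = 0, the rest of row 0 and column 0 infinite). The function names are hypothetical apart from series_gap_dist.

```python
import math
from typing import Sequence, Tuple

Feature = Sequence[Tuple[float, float]]  # (gap, start_time) pairs


def point_dist(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Point distance; abs() on both terms is an assumption made here."""
    gap_i, time_i = p
    gap_j, time_j = q
    return abs(gap_i - gap_j) * math.log(abs(time_j - time_i) + 1)


def series_gap_dist(fea_a: Feature, fea_b: Feature) -> float:
    """Track distance D[L, L] via D[i,j] = min(left, diag, up) + dist(i,j)."""
    la, lb = len(fea_a), len(fea_b)
    # Assumed boundary: D[0][0] = 0, other cells of row/column 0 are infinite.
    D = [[math.inf] * (lb + 1) for _ in range(la + 1)]
    D[0][0] = 0.0
    for i in range(1, la + 1):
        for j in range(1, lb + 1):
            best = min(D[i][j - 1], D[i - 1][j - 1], D[i - 1][j])
            D[i][j] = best + point_dist(fea_a[i - 1], fea_b[j - 1])
    return D[la][lb]


same = series_gap_dist([(10, 0), (20, 10)], [(10, 0), (20, 10)])
single = series_gap_dist([(10, 0)], [(15, 20)])
```

The sketch accepts features of different lengths, although the text only uses equal-length windows.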
By this definition, D[L, L] is the track distance between two timing interval features when the window length is L, denoted series_gap_dist, and the similarity is defined as follows:
The similarities of the feature pairs are calculated in turn, until every combination of one timing interval feature from fea_set_a with one from fea_set_b has a similarity. The maximum of all these timing interval feature similarities is taken as the similarity between the two short-connection traffic aggregations, denoted agg_sim.
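The final matching step, taking the maximum pairwise similarity and comparing it with nat_threshold, can be sketched with the per-feature similarity passed in as a function. The similarity formula itself is not reproduced in this text, so `toy_sim` below is purely a placeholder invented for the example and should not be read as the patent's definition; `nat_match` is likewise a hypothetical name.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

Feature = Sequence[Tuple[float, float]]  # (gap, start_time) pairs
SimFn = Callable[[Feature, Feature], float]


def agg_sim(fea_set_a: Sequence[Feature], fea_set_b: Sequence[Feature],
            sim: SimFn) -> float:
    """Maximum similarity over all (feature from A, feature from B) pairs."""
    scores = [sim(a, b) for a, b in product(fea_set_a, fea_set_b)]
    return max(scores, default=0.0)  # 0.0 for empty sets is an assumption


def nat_match(fea_set_a: Sequence[Feature], fea_set_b: Sequence[Feature],
              sim: SimFn, nat_threshold: float) -> bool:
    """True when agg_sim exceeds nat_threshold, a real number in (0, 1)."""
    if not 0.0 < nat_threshold < 1.0:
        raise ValueError("nat_threshold must lie in (0, 1)")
    return agg_sim(fea_set_a, fea_set_b, sim) > nat_threshold


def toy_sim(a: Feature, b: Feature) -> float:
    """Placeholder similarity, NOT the patent's formula: 1 / (1 + gap L1 distance)."""
    d = sum(abs(ga - gb) for (ga, _), (gb, _) in zip(a, b))
    return 1.0 / (1.0 + d)


set_a = [[(10, 0), (20, 10)]]
set_b = [[(10, 5), (20, 15)], [(50, 5), (90, 55)]]
set_c = [[(50, 5), (90, 55)]]
matched = nat_match(set_a, set_b, toy_sim, nat_threshold=0.8)
unmatched = nat_match(set_a, set_c, toy_sim, nat_threshold=0.8)
```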
The invention also provides a NAT identification system for TCP short connections based on interval timing track features. A person skilled in the art can realize this system by executing the steps of the NAT identification method described above; that is, the method can be understood as a preferred implementation of the system.
The invention provides a NAT identification system of TCP short connection based on interval time sequence track characteristics, comprising:
Module M1: Construct a short-connection network flow matching bipartite graph from short-connection network traffic frames and obtain the corresponding short-connection traffic aggregations. The module M1 includes:
Module M1.1: Identify short-connection network traffic, and collect all short-connection network traffic within a specific time window to obtain the corresponding short-connection network traffic frame.
Module M1.2: Slide the time windows corresponding to the two ends of the traffic transmission, and divide the short-connection network traffic at each end into a sequence of consecutive, non-overlapping short-connection network traffic frames, to obtain a first bipartite graph and a second bipartite graph.
Module M1.3: Aggregate the short-connection network traffic frames in the first bipartite graph and in the second bipartite graph respectively, to obtain the corresponding short-connection traffic aggregations. The aggregation groups flows by triplet and divides them into a plurality of sets, where each element of a set is the quadruple of a short-connection network flow and each set is one short-connection traffic aggregation.
Module M2: Generate a timing interval vector for each short-connection traffic aggregation, and from it the corresponding interval timing features. The module M2 includes:
Module M2.1: Sort the short-connection network flows in each short-connection traffic aggregation in ascending order of the occurrence time in the quadruple, and generate the timing interval vector corresponding to the aggregation.
Module M2.2: Slide a window of length n over the timing interval vector, obtain a timing interval feature from the data covered by the window at each position, and thereby obtain the timing interval feature set corresponding to the timing interval vector.
Module M3: Calculate the similarity between short-connection traffic aggregations according to the short-connection network flow matching bipartite graph and the interval timing features; if the similarity is greater than a preset threshold, the two short-connection traffic aggregations are successfully matched across the NAT, thereby completing NAT identification. The module M3 includes:
Module M3.1: Sequentially calculate the similarity of the timing interval features in the timing interval feature sets corresponding to the two short-connection traffic aggregations, wherein the formula is as follows:
where series_gap_sim represents the similarity of the timing interval features, and series_gap_dist represents the track distance of the timing interval features.
Module M3.2: After the similarities of the timing interval features in the timing interval feature sets corresponding to the two short-connection traffic aggregations have been calculated, take the maximum of all the timing interval feature similarities as the similarity between the two short-connection traffic aggregations.
Those skilled in the art will appreciate that the systems, apparatus, and their respective modules provided herein may be implemented entirely by logic programming of method steps such that the systems, apparatus, and their respective modules are implemented as logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc., in addition to the systems, apparatus, and their respective modules being implemented as pure computer readable program code. Therefore, the system, the apparatus, and the respective modules thereof provided by the present invention may be regarded as one hardware component, and the modules included therein for implementing various programs may also be regarded as structures within the hardware component; modules for implementing various functions may also be regarded as being either software programs for implementing the methods or structures within hardware components.
The foregoing describes specific embodiments of the present application. It is to be understood that the application is not limited to the particular embodiments described above, and that various changes or modifications may be made by those skilled in the art within the scope of the appended claims without affecting the spirit of the application. The embodiments of the application and the features of the embodiments may be combined with each other arbitrarily without conflict.

Claims (8)

1. A NAT identification method for TCP short connections based on interval time sequence track features, characterized by comprising the following steps:
Step S1: constructing a short-connection network flow matching bipartite graph from short-connection network traffic frames and obtaining the corresponding short-connection traffic aggregations;
Step S2: generating a corresponding time sequence vector from each short-connection traffic aggregation, and thereby obtaining the corresponding interval time sequence features;
Step S3: calculating the similarity between short-connection traffic aggregations according to the short-connection network flow matching bipartite graph and the interval time sequence features, and if the similarity is greater than a preset threshold, the NAT matching of the two short-connection traffic aggregations succeeds, thereby completing NAT identification;
The step S3 comprises the following steps:
Step S3.1: sequentially calculating the similarity of the time sequence interval features in the corresponding time sequence interval feature sets of the two short-connection traffic aggregations, wherein the formula is as follows:
Wherein series_gap_sim represents the similarity of the time sequence interval features, and series_gap_dist represents the trajectory distance of the time sequence interval features;
Step S3.2: after the similarities of the time sequence interval features in the corresponding time sequence interval feature sets of the two short-connection traffic aggregations have been calculated, taking the maximum of all the feature similarities as the similarity between the two short-connection traffic aggregations.
2. The NAT identification method for TCP short connections based on interval time sequence track features according to claim 1, wherein said step S1 comprises:
Step S1.1: identifying short-connection network traffic, and collecting all short-connection network traffic within a specific time window to obtain the corresponding short-connection network traffic frames;
Step S1.2: sliding the time window at each of the two ends of the traffic transmission, and dividing the short-connection network traffic frames at each end into a continuous, non-overlapping sequence of short-connection network traffic frames, to obtain a first bipartite graph and a second bipartite graph;
Step S1.3: aggregating the short-connection network traffic frames in the first bipartite graph and in the second bipartite graph respectively, to obtain the corresponding short-connection traffic aggregations.
3. The NAT identification method for TCP short connections based on interval time sequence track features according to claim 2, wherein the aggregating comprises aggregating into sets according to triplets, each element of a set being the quadruple of a short-connection network flow, and each set being one short-connection traffic aggregation.
4. The NAT identification method for TCP short connections based on interval time sequence track features according to claim 3, wherein step S2 comprises:
Step S2.1: sorting the short-connection network flows in each short-connection traffic aggregation in ascending order of the occurrence time of their quadruples, and generating the time sequence interval vector corresponding to the short-connection traffic aggregation;
Step S2.2: sliding a sliding window of length n over the time sequence interval vector, obtaining a corresponding time sequence interval feature from the data covered by the sliding window at each position, and thereby obtaining the time sequence interval feature set corresponding to the time sequence interval vector.
5. A NAT identification system for TCP short connections based on interval time sequence track features, comprising:
Module M1: constructing a short-connection network flow matching bipartite graph from short-connection network traffic frames and obtaining the corresponding short-connection traffic aggregations;
Module M2: generating a corresponding time sequence vector from each short-connection traffic aggregation, and thereby obtaining the corresponding interval time sequence features;
Module M3: calculating the similarity between short-connection traffic aggregations according to the short-connection network flow matching bipartite graph and the interval time sequence features, and if the similarity is greater than a preset threshold, the NAT matching of the two short-connection traffic aggregations succeeds, thereby completing NAT identification;
The module M3 includes:
Module M3.1: sequentially calculating the similarity of the time sequence interval features in the corresponding time sequence interval feature sets of the two short-connection traffic aggregations, wherein the formula is as follows:
Wherein series_gap_sim represents the similarity of the time sequence interval features, and series_gap_dist represents the trajectory distance of the time sequence interval features;
Module M3.2: after the similarities of the time sequence interval features in the corresponding time sequence interval feature sets of the two short-connection traffic aggregations have been calculated, taking the maximum of all the feature similarities as the similarity between the two short-connection traffic aggregations.
6. The NAT identification system for TCP short connections based on interval time sequence track features according to claim 5, wherein said module M1 comprises:
Module M1.1: identifying short-connection network traffic, and collecting all short-connection network traffic within a specific time window to obtain the corresponding short-connection network traffic frames;
Module M1.2: sliding the time window at each of the two ends of the traffic transmission, and dividing the short-connection network traffic frames at each end into a continuous, non-overlapping sequence of short-connection network traffic frames, to obtain a first bipartite graph and a second bipartite graph;
Module M1.3: aggregating the short-connection network traffic frames in the first bipartite graph and in the second bipartite graph respectively, to obtain the corresponding short-connection traffic aggregations.
7. The NAT identification system for TCP short connections based on interval time sequence track features according to claim 6, wherein the aggregating comprises aggregating into sets according to triplets, each element of a set being the quadruple of a short-connection network flow, and each set being one short-connection traffic aggregation.
8. The NAT identification system for TCP short connections based on interval time sequence track features according to claim 7, wherein module M2 comprises:
Module M2.1: sorting the short-connection network flows in each short-connection traffic aggregation in ascending order of the occurrence time of their quadruples, and generating the time sequence interval vector corresponding to the short-connection traffic aggregation;
Module M2.2: sliding a sliding window of length n over the time sequence interval vector, obtaining a corresponding time sequence interval feature from the data covered by the sliding window at each position, and thereby obtaining the time sequence interval feature set corresponding to the time sequence interval vector.
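As a non-limiting illustration of the aggregation recited in claims 3 and 7 and the feature extraction recited in claims 4 and 8, a minimal sketch is given below. The claims do not define the fields of the triplet or of the quadruple, so the sketch assumes a quadruple of (source IP, source port, destination IP, destination port) and a triplet of (source IP, destination IP, destination port). The sketch also assumes that the time sequence interval vector holds the gaps between consecutive occurrence times and that each feature is simply the raw content of one window. The names Flow, aggregate, interval_vector, and interval_features are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Sequence, Tuple


class Flow(NamedTuple):
    # ASSUMED quadruple of a short-connection flow, plus its occurrence time.
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    start_time: float  # seconds


Triplet = Tuple[str, str, int]


def aggregate(flows: Sequence[Flow]) -> Dict[Triplet, List[Flow]]:
    """Group flows into sets keyed by an ASSUMED triplet (src_ip, dst_ip, dst_port)."""
    groups: Dict[Triplet, List[Flow]] = defaultdict(list)
    for f in flows:
        groups[(f.src_ip, f.dst_ip, f.dst_port)].append(f)
    return dict(groups)


def interval_vector(flows: Sequence[Flow]) -> List[float]:
    """Step S2.1: sort by occurrence time, then take the gaps between neighbours."""
    times = sorted(f.start_time for f in flows)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def interval_features(gaps: Sequence[float], n: int) -> List[Tuple[float, ...]]:
    """Step S2.2: slide a length-n window over the gap vector.

    ASSUMPTION: each feature is the raw window content. A vector shorter
    than n yields an empty feature set.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return [tuple(gaps[i:i + n]) for i in range(len(gaps) - n + 1)]
```

For example, four flows of one aggregation observed at 10.0 s, 10.5 s, 12.0 s, and 12.25 s give the interval vector [0.5, 1.5, 0.25], and a window of length n = 2 gives the feature set [(0.5, 1.5), (1.5, 0.25)]. Because the gaps depend only on relative timing, they can be compared between the two ends of the transmission even when addresses and ports have been rewritten.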
CN202211730122.7A 2022-12-30 2022-12-30 NAT (network Address translation) identification method and system for TCP (Transmission control protocol) short connection based on interval time sequence track characteristics Active CN116032851B (en)

Publications (2)

Publication Number Publication Date
CN116032851A (en) 2023-04-28
CN116032851B (en) 2024-05-14

Family

ID=86079572

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant