CN112131223A - Traffic classification statistical method, device, computer equipment and storage medium - Google Patents
- Publication number
- CN112131223A, CN202011015388.4A
- Authority
- CN
- China
- Prior art keywords
- node
- target
- classification dimension
- classification
- red
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Environmental & Geological Engineering (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The application relates to a traffic classification statistical method, a traffic classification statistical device, a computer device and a storage medium. The method comprises the following steps: extracting target classification dimension information and first traffic information according to the acquired data packet, where the target classification dimension information represents at least one classification dimension corresponding to the data packet and the first traffic information represents the traffic size of the data packet; searching for a target node in a preset red-black tree based on the target classification dimension information, where the classification dimension information stored by the target node is consistent with the target classification dimension information; and, if the target node is not found, inserting a new node into the target search path of the red-black tree to obtain a new red-black tree, where the target search path is the path traversed when searching for the target node, the new node stores the target classification dimension information and the first traffic information, and the new red-black tree is used for traffic classification statistics. By adopting the method, the efficiency of traffic classification statistics can be improved.
Description
Technical Field
The present application relates to the field of data communication technologies, and in particular, to a traffic classification statistical method, an apparatus, a computer device, and a storage medium.
Background
In the field of data communications, classification statistics of network traffic is a basic and important technique. Through traffic classification statistics, a user can learn the traffic conditions under each classification dimension.
At present, a common traffic classification statistical technique uses a hash table combined with linked lists. Key values are established in advance according to the classification dimensions to be counted, and a hash table is then established to store the key values. When traffic classification statistics are carried out, a hash value is calculated from the classification dimension information of the traffic, and a key value matching the hash value is searched for in the hash table. After the matching key value is found, the memory node corresponding to the key value in the hash table is determined, and it is judged whether the classification dimension information stored by the memory node is consistent with the classification dimension information of the traffic. If they are inconsistent, the traffic must be stored in the external linked list attached to that memory node.
The hash table plus linked list data structure has the following disadvantage: when the collision rate is high, the external linked list of a given node in the hash table becomes very long, the time complexity of operations such as searching the linked list is no longer controllable because it grows with the length of the list, and statistical efficiency drops sharply.
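The drawback described above can be illustrated with a minimal sketch. The class `ChainedTable`, its bucket count and the integer keys are hypothetical and chosen only so that every key collides; they are not taken from the prior-art systems the application refers to.

```python
class ChainedTable:
    """Fixed-size hash table whose buckets are plain lists (the external chains)."""

    def __init__(self, buckets):
        self.buckets = [[] for _ in range(buckets)]

    def add(self, key, size):
        """Accumulate traffic for key; return how many chain entries were visited."""
        chain = self.buckets[hash(key) % len(self.buckets)]
        steps = 0
        for entry in chain:
            steps += 1
            if entry[0] == key:
                entry[1] += size
                return steps
        chain.append([key, size])  # not found: extend the chain
        return steps + 1


table = ChainedTable(buckets=8)
# Small non-negative integers hash to themselves in CPython, so every
# multiple of 8 lands in bucket 0 (an artificial worst case).
costs = [table.add(key, 100) for key in range(0, 8 * 1000, 8)]
longest = max(len(chain) for chain in table.buckets)
print(longest, costs[0], costs[-1])
```

With every key in one bucket, the cost of each operation grows with the number of keys already stored, which is the linear behaviour the background section criticizes.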
Disclosure of Invention
In view of the above, it is necessary to provide a traffic classification statistical method, apparatus, computer device and storage medium capable of improving statistical efficiency.
A traffic classification statistical method, the method comprising:
extracting target classification dimension information and first traffic information according to the acquired data packet; the target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first traffic information is used for representing the traffic size of the data packet;
searching for a target node in a preset red-black tree based on the target classification dimension information; the classification dimension information stored by the target node is consistent with the target classification dimension information;
if the target node is not found, inserting a new node into the target search path of the red-black tree to obtain a new red-black tree; the target search path is the path traversed when searching for the target node, the new node stores the target classification dimension information and the first traffic information, and the new red-black tree is used for carrying out traffic classification statistics.
In the above embodiment, a red-black tree is adopted, so the time complexity of operations such as searching and inserting is low, and the efficiency of traffic classification statistics can be effectively improved.
In one embodiment, after searching for the target node in the preset red-black tree based on the target classification dimension information, the method further includes:
and if the target node is found, updating the second traffic information stored in the target node by adopting the first traffic information to obtain a new red-black tree.
In this embodiment, no new node needs to be inserted into the red-black tree, so storage resources are saved.
In one embodiment, the searching for the target node in the preset red-black tree based on the target classification dimension information includes:
determining a target classification dimension value corresponding to the target classification dimension information;
for the nth node in the target search path, comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored by the nth node;
if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, comparing the magnitude of the target classification dimension value with that of the classification dimension value corresponding to the nth node;
and determining the (n + 1) th node in the target search path from the child nodes of the nth node according to the comparison result, and comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored in the (n + 1) th node.
In the above embodiment, search accuracy is ensured and waste of storage resources is avoided.
In one embodiment, the method further comprises:
and if the target classification dimension value is consistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, determining the nth node as the target node.
In one embodiment, the method further comprises:
and if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node and the nth node has no child node, determining that the target node is not found in the red-black tree.
In the above embodiment, search accuracy is ensured and waste of storage resources is avoided.
In one embodiment, the inserting a new node into the target search path of the red-black tree to obtain a new red-black tree includes:
allocating a new node, and storing the target classification dimension information and the first traffic information into the new node;
and taking the node at the tail end of the target search path as a parent node, taking the new node as a child node, and connecting the child node with the parent node to obtain a new red-black tree.
In the above embodiment, new storage is allocated only when the target node is not found in the red-black tree; that is, storage resources are allocated on demand.
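The on-demand allocation described above can be sketched as follows. This is an illustration only: the names `Node` and `insert_if_missing` are hypothetical, and the sketch attaches the new node to the last node of the search path in a plain binary search tree. The recoloring and rotations that a red-black tree performs after attachment are not described in this passage and are deliberately left out here.

```python
class Node:
    def __init__(self, key, traffic):
        self.key, self.traffic = key, traffic
        self.left = self.right = self.parent = None


def insert_if_missing(root, key, traffic):
    """Return (root, node, created). A node is allocated only on a miss."""
    parent, node = None, root
    while node is not None:
        if key == node.key:
            return root, node, False          # target found: nothing allocated
        parent, node = node, (node.left if key < node.key else node.right)
    new_node = Node(key, traffic)             # allocated only when not found
    new_node.parent = parent                  # tail of the search path is the parent
    if parent is None:
        return new_node, new_node, True       # empty tree: new node becomes the root
    if key < parent.key:
        parent.left = new_node
    else:
        parent.right = new_node
    return root, new_node, True


root = None
root, n5, made5 = insert_if_missing(root, 5, 100)
root, n3, made3 = insert_if_missing(root, 3, 40)
root, n8, made8 = insert_if_missing(root, 8, 60)
root, again, made_again = insert_if_missing(root, 3, 999)
```

The fourth call finds the existing node for key 3, so no storage is allocated and the stored traffic is left untouched.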
In one embodiment, after inserting a new node into the target search path of the red-black tree to obtain a new red-black tree, the method further includes:
and traversing each node of the new red-black tree, and carrying out traffic size statistics on the classification dimension to be counted according to the second traffic information stored by each node in the new red-black tree.
This embodiment stores the classification dimension information of the traffic in a red-black tree. When traffic size statistics are carried out for the classification dimension to be counted, the time complexity of looking up classification dimension information is lower than in the prior art, so the efficiency of traffic classification statistics can be improved.
In one embodiment, the extracting the target classification dimension information and the first traffic information according to the obtained data packet includes:
extracting at least one classification dimension corresponding to the data packet from the packet header of the data packet according to an IP layer protocol to obtain target classification dimension information; the classification dimension comprises at least one of a source IP address and an application protocol;
and extracting the data length from the packet header of the data packet according to the IP layer protocol, and calculating the traffic size of the data packet according to the data length to obtain the first traffic information.
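The extraction step can be sketched for an IPv4 packet as follows. Several points are assumptions rather than details from the application: the function name `extract` is hypothetical, the Total Length field is used directly as the traffic size, and the Protocol field of the IPv4 header (which identifies the transport protocol) stands in for the application protocol dimension.

```python
import socket
import struct


def extract(packet: bytes):
    """Return ((source_ip, protocol), total_length) from a raw IPv4 packet."""
    if len(packet) < 20:
        raise ValueError("shorter than a minimal IPv4 header")
    version_ihl, _tos, total_length = struct.unpack("!BBH", packet[:4])
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    protocol = packet[9]                          # Protocol field, e.g. 6 for TCP
    source_ip = socket.inet_ntoa(packet[12:16])   # Source Address field
    return (source_ip, protocol), total_length


# Hand-built 20-byte header: version 4, IHL 5, total length 60, TTL 64,
# protocol 6 (TCP), source 192.0.2.1, destination 198.51.100.7.
header = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 60, 0, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.1"), socket.inet_aton("198.51.100.7"),
)
key, size = extract(header)
print(key, size)
```

The returned tuple plays the role of the target classification dimension information, and the length plays the role of the first traffic information.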
A traffic classification statistic apparatus, the apparatus comprising:
the information extraction module is used for extracting target classification dimension information and first traffic information according to the acquired data packet; the target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first traffic information is used for representing the traffic size of the data packet;
the node searching module is used for searching for a target node in a preset red-black tree based on the target classification dimension information; the classification dimension information stored by the target node is consistent with the target classification dimension information;
the node insertion module is used for inserting a new node into the target search path of the red-black tree if the target node is not found, so as to obtain a new red-black tree; the target search path is the path traversed when searching for the target node, the new node stores the target classification dimension information and the first traffic information, and the new red-black tree is used for carrying out traffic classification statistics.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
extracting target classification dimension information and first traffic information according to the acquired data packet; the target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first traffic information is used for representing the traffic size of the data packet;
searching for a target node in a preset red-black tree based on the target classification dimension information; the classification dimension information stored by the target node is consistent with the target classification dimension information;
if the target node is not found, inserting a new node into the target search path of the red-black tree to obtain a new red-black tree; the target search path is the path traversed when searching for the target node, the new node stores the target classification dimension information and the first traffic information, and the new red-black tree is used for carrying out traffic classification statistics.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
extracting target classification dimension information and first traffic information according to the acquired data packet; the target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first traffic information is used for representing the traffic size of the data packet;
searching for a target node in a preset red-black tree based on the target classification dimension information; the classification dimension information stored by the target node is consistent with the target classification dimension information;
if the target node is not found, inserting a new node into the target search path of the red-black tree to obtain a new red-black tree; the target search path is the path traversed when searching for the target node, the new node stores the target classification dimension information and the first traffic information, and the new red-black tree is used for carrying out traffic classification statistics.
According to the traffic classification statistical method and device, the computer equipment and the storage medium, the server extracts the target classification dimension information and the first traffic information according to the acquired data packet; searches for a target node in a preset red-black tree based on the target classification dimension information; and, if the target node is not found, inserts a new node into the target search path of the red-black tree to obtain a new red-black tree. In the embodiment of the disclosure, a red-black tree is adopted to store the classification dimension information and the traffic information, so the time complexity of operations such as searching and inserting grows logarithmically with the data volume. In the prior art, which adopts a hash table and linked lists, the time complexity of such operations grows linearly with the data volume when the collision rate is high. Because logarithmic growth is slower than linear growth, the time complexity of the embodiment of the present disclosure is lower than that of the prior art, and the efficiency of traffic classification statistics can be effectively improved.
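The difference between logarithmic and linear growth can be made concrete with a short calculation. It relies on the standard result that a red-black tree with n nodes has height at most 2·log2(n + 1), and it compares that bound with the worst case of a hash table in which all n keys collide into a single chain. The function names are hypothetical.

```python
import math


def red_black_height_bound(n):
    """Upper bound on the height of a red-black tree holding n nodes."""
    return 2 * math.log2(n + 1)


def worst_case_chain_length(n):
    """Longest possible chain when all n keys collide in one hash bucket."""
    return n


for n in (1_000, 1_000_000):
    print(n, math.ceil(red_black_height_bound(n)), worst_case_chain_length(n))
```

For one million stored keys the tree bound is about 40 node visits, whereas a fully collided chain may require visiting up to one million entries.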
Drawings
FIG. 1 is a diagram of an exemplary traffic classification statistics application environment;
FIG. 2 is a flow chart illustrating an exemplary traffic classification statistical method;
FIG. 3 is a second flowchart of an embodiment of a traffic classification statistics method;
FIG. 4 is a flowchart illustrating a step of searching a target node in a pre-determined red-black tree according to an embodiment;
FIG. 5 is a second flowchart illustrating a step of searching a target node in a predetermined red-black tree according to an embodiment;
FIG. 6 is a flowchart illustrating the steps of inserting a new node into a target search path of a Red-Black tree in one embodiment;
FIG. 7 is a flow chart illustrating a traffic classification statistical method according to another embodiment;
FIG. 8 is a block diagram showing the structure of a traffic classification statistic device according to an embodiment;
FIG. 9 is a second block diagram of an embodiment of a traffic classification statistic device;
FIG. 10 is a third block diagram illustrating the structure of a traffic classification statistic apparatus according to an embodiment;
FIG. 11 is a fourth block diagram showing the structure of a traffic classification statistic apparatus according to an embodiment;
FIG. 12 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The traffic classification statistical method provided by the application can be applied to the application environment shown in fig. 1. The application environment includes a terminal 102 and a server 104. The terminal 102 communicates with the server 104 through a network, and the terminal 102 sends a data packet to the server 104, or the server 104 sends a data packet to the terminal 102. The server 104 then extracts target classification dimension information and first traffic information according to the acquired data packet, and searches for a target node in a preset red-black tree based on the target classification dimension information. If the target node is not found, the server inserts a new node into the target search path of the red-black tree to obtain a new red-black tree, and carries out traffic classification statistics according to the new red-black tree. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 104 may be implemented by a stand-alone server or a server cluster composed of a plurality of servers.
In one embodiment, as shown in fig. 2, a traffic classification statistical method is provided. The method is described as applied to the server in fig. 1, and includes the following steps:
Step 201, extracting target classification dimension information and first traffic information according to the acquired data packet.
The target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first traffic information is used for representing the traffic size of the data packet. The classification dimension may include at least one of a source IP (Internet Protocol) address and an application protocol. The classification dimension is not limited in the embodiment of the disclosure, and can be set according to actual conditions.
The terminal sends a data packet to the server, and the server extracts the source IP address, the application protocol and the like from the data packet sent by the terminal to obtain the target classification dimension information of the data packet. Meanwhile, the server determines the traffic size of the data packet to obtain the first traffic information of the data packet.
For example, the server extracts the target classification dimension information from the data packet, where the target classification dimension information includes a source IP address A, and the first traffic information includes a traffic size x1.
Alternatively, the terminal sends a data packet to the server, and the server processes the received data packet to obtain flow summary information; the server then extracts the source IP address and the application protocol from the flow summary information and determines the size of the flow. The embodiment of the present disclosure does not limit the manner of extracting the target classification dimension information and the first traffic information.
Step 202, searching for a target node in a preset red-black tree based on the target classification dimension information.
The classification dimension information stored by the target node is consistent with the target classification dimension information.
A red-black tree is preset on the server, and each node of the red-black tree stores classification dimension information and traffic information. After obtaining the target classification dimension information of the data packet, the server searches the red-black tree for a target node according to the target classification dimension information, where the classification dimension information stored by the target node is consistent with the target classification dimension information.
For example, the target classification dimension information includes a source IP address A, and the server judges, node by node, whether a node of the red-black tree is the target node according to the source IP address A. If the classification dimension information stored by node 1 of the red-black tree includes the source IP address A, the classification dimension information stored by node 1 is determined to be consistent with the target classification dimension information, and node 1 is the target node. If the classification dimension information stored by node 1 includes a source IP address B, it is determined to be inconsistent with the target classification dimension information, and node 1 is not the target node. The server then judges whether a child node of node 1 is the target node according to the source IP address A.
Step 203, if the target node is not found, inserting a new node into the target search path of the red-black tree to obtain a new red-black tree.
The target search path is the path traversed when searching for the target node, the new node stores the target classification dimension information and the first traffic information, and the new red-black tree is used for carrying out traffic classification statistics.
If the server examines the nodes of the red-black tree one by one along the target search path but does not find the target node, it stores the target classification dimension information and the first traffic information into a new node, and inserts the new node into the target search path of the red-black tree to obtain a new red-black tree. The server then carries out traffic size statistics on one or more classification dimensions according to the classification dimension information and the traffic information stored by each node in the red-black tree.
For example, the traffic size of a single source IP address is counted, or the traffic sizes of a plurality of source IP addresses are counted. The traffic can be further divided into uplink traffic and downlink traffic. The statistical mode is not limited in the embodiment of the disclosure, and can be selected according to actual conditions.
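The statistics step can be sketched as a traversal that groups the stored traffic by one classification dimension. The node layout below is an assumption: the key is taken to be a (source IP, application protocol) tuple, the names `Node`, `walk` and `totals_by` are hypothetical, and node colors are omitted because the traversal does not depend on them.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    key: Tuple[str, str]          # (source IP, application protocol)
    traffic: int                  # accumulated traffic size for this key
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def walk(node):
    """Yield every node in key order (in-order traversal)."""
    if node is not None:
        yield from walk(node.left)
        yield node
        yield from walk(node.right)


def totals_by(root, dimension):
    """Sum stored traffic grouped by one position of the key tuple."""
    totals = defaultdict(int)
    for node in walk(root):
        totals[node.key[dimension]] += node.traffic
    return dict(totals)


# A small hand-built tree that respects the ordering of the key tuples.
root = Node(("10.0.0.2", "dns"), 80,
            left=Node(("10.0.0.1", "http"), 1500,
                      right=Node(("10.0.0.1", "ssh"), 300)),
            right=Node(("10.0.0.2", "http"), 700))
per_ip = totals_by(root, 0)
per_protocol = totals_by(root, 1)
```

Grouping by position 0 gives the traffic per source IP address, and grouping by position 1 gives the traffic per application protocol, from the same tree.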
In the traffic classification statistical method, the server extracts target classification dimension information and first traffic information according to the acquired data packet; searches for a target node in a preset red-black tree based on the target classification dimension information; and, if the target node is not found, inserts a new node into the target search path of the red-black tree to obtain a new red-black tree. In the embodiment of the disclosure, a red-black tree is adopted to store the classification dimension information and the traffic information, so the time complexity of operations such as searching and inserting grows logarithmically with the data volume. In the prior art, which adopts a hash table and linked lists, the time complexity of such operations grows linearly with the data volume when the collision rate is high. Because logarithmic growth is slower than linear growth, the time complexity of the embodiment of the present disclosure is lower than that of the prior art, and the efficiency of traffic classification statistics can be effectively improved.
In an embodiment, as shown in fig. 3, after the step of searching for the target node in the preset red-black tree based on the target classification dimension information, the method may further include:
Step 204, if the target node is found, updating the second traffic information stored in the target node by adopting the first traffic information to obtain a new red-black tree.
If the server finds the target node in the red-black tree, it merges the first traffic information with the second traffic information stored in the target node, and updates the second traffic information stored in the target node with the merged traffic information to obtain a new red-black tree.
For example, if the first traffic information includes a traffic size of x1 and the second traffic information stored in the target node includes a traffic size of x2, the second traffic information stored in the target node is updated to a traffic size of x1 + x2. The updating mode is not limited in the embodiment of the disclosure, and can be set according to actual conditions.
It can be understood that the server searches for a target node in a preset red-black tree based on the target classification dimension information and, if the target node is found, updates the second traffic information stored in the target node by adopting the first traffic information. In this way, no new node needs to be inserted into the red-black tree, and storage resources are therefore saved.
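Steps 203 and 204 together amount to an update-or-insert operation, sketched below. The application does not describe how the tree is rebalanced, so the sketch substitutes a left-leaning red-black tree (a well-known red-black variant) as an assumption. The names `record`, `find` and `height` and the (source IP, protocol) keys are hypothetical.

```python
RED, BLACK = True, False


class Node:
    def __init__(self, key, traffic):
        self.key, self.traffic = key, traffic
        self.left = self.right = None
        self.color = RED                      # new nodes start red


def is_red(node):
    return node is not None and node.color == RED


def rotate_left(h):
    x = h.right
    h.right, x.left = x.left, h
    x.color, h.color = h.color, RED
    return x


def rotate_right(h):
    x = h.left
    h.left, x.right = x.right, h
    x.color, h.color = h.color, RED
    return x


def flip_colors(h):
    h.color = RED
    h.left.color = h.right.color = BLACK


def _put(h, key, size):
    if h is None:
        return Node(key, size)                # step 203: not found, allocate a node
    if key == h.key:
        h.traffic += size                     # step 204: found, merge x1 into x2
    elif key < h.key:
        h.left = _put(h.left, key, size)
    else:
        h.right = _put(h.right, key, size)
    # Restore the left-leaning red-black invariants on the way back up.
    if is_red(h.right) and not is_red(h.left):
        h = rotate_left(h)
    if is_red(h.left) and is_red(h.left.left):
        h = rotate_right(h)
    if is_red(h.left) and is_red(h.right):
        flip_colors(h)
    return h


def record(root, key, size):
    """Account one packet of the given size under key; return the new root."""
    root = _put(root, key, size)
    root.color = BLACK
    return root


def find(root, key):
    node = root
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


root = None
for i in range(1000):
    root = record(root, ("10.0.%d.%d" % (i // 256, i % 256), "tcp"), 100)
root = record(root, ("10.0.0.7", "tcp"), 50)  # existing key: traffic is merged
```

The last call reaches an existing node, so its stored traffic becomes 100 + 50 and the number of nodes stays at 1000.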
In an embodiment, as shown in fig. 4, the step of searching for the target node in the preset red-black tree based on the target classification dimension information may include:
The server may preset a comparison function for determining a corresponding classification dimension value according to the classification dimension information. For target classification dimension information, the comparison function may determine a corresponding target classification dimension value. For example, the target classification dimension information including the source IP address as a is input into a comparison function, which may determine that the corresponding target classification dimension value is a.
In the process of searching the target node, the classification dimension information stored by the nth node in the target search path is also input into the comparison function, and the comparison function can determine the corresponding classification dimension value. After the target classification dimension value and the classification dimension value corresponding to the nth node are obtained, the comparison function compares the target classification dimension value with the classification dimension value corresponding to the nth node.
And the comparison function predefines the judgment condition of the classification dimension value corresponding to each node in the red and black tree. And if the target classification dimension value is determined to be inconsistent with the classification dimension value corresponding to the nth node, the comparison function compares the target classification dimension value with the classification dimension value corresponding to the nth node to obtain a comparison result. The comparison result may include that the target classification dimension value is greater than the classification dimension value corresponding to the nth node, or that the target classification dimension value is less than the classification dimension value corresponding to the nth node.
And 304, determining the (n + 1) th node in the target search path from the child nodes of the nth node according to the comparison result, and comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored in the (n + 1) th node.
Each node in the red and black tree can comprise a left sub-node and a right sub-node, and if the comparison result shows that the target classification dimension value is larger than the classification dimension value corresponding to the nth node, the left node of the nth node is determined as the (n + 1) th node in the target search path; and if the comparison result is that the target classification dimension value is smaller than the classification dimension value corresponding to the nth node, determining the right node of the nth node as the (n + 1) th node in the target search path. Or if the comparison result is that the target classification dimension value is larger than the classification dimension value corresponding to the nth node, determining the right node of the nth node as the (n + 1) th node in the target search path; and if the comparison result is that the target classification dimension value is smaller than the classification dimension value corresponding to the nth node, determining the left node of the nth node as the (n + 1) th node in the target search path. The specific determination mode is set according to actual conditions, and the embodiment of the disclosure does not limit the specific determination mode in detail.
After the (n + 1)th node in the target search path is determined, the comparison function is used to compare the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored by the (n + 1)th node. If the two values are consistent, the (n + 1)th node is determined as the target node. If they are not consistent, the search for the target node continues by referring to step 303 and step 304.
Searching for the target node according to the classification dimension value of each node and the target classification dimension value ensures search accuracy, and allows a variety of classification dimensions to be set according to user requirements. The prior art adopts a structure of a hash table and linked lists: if more classification dimensions need to be set, a sufficiently large hash table must be set up, but an oversized hash table easily wastes storage resources. The embodiments of the present disclosure adopt the structure of a red-black tree, so that this waste of storage resources can be avoided.
As shown in fig. 5, after step 302, the method may further include:
Step 305, if the target classification dimension value is consistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, the nth node is determined as the target node.
If the server determines that the target classification dimension value is consistent with the classification dimension value corresponding to the nth node, this indicates that the classification dimension information stored by the nth node is consistent with the target classification dimension information, and therefore the nth node is determined as the target node.
As shown in fig. 5, after step 302, the method may further include:
If the server determines that the target classification dimension value is inconsistent with the classification dimension value corresponding to the nth node and the nth node does not have a child node, it indicates that the classification dimension information stored in each node in the red-black tree is inconsistent with the target classification dimension information, and therefore the red-black tree does not have the target node.
In the above embodiment, the server determines the target classification dimension value corresponding to the target classification dimension information; for the nth node in the target search path, comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored by the nth node; if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, comparing the target classification dimension value with the classification dimension value corresponding to the nth node; and determining the (n + 1) th node in the target search path from the child nodes of the nth node according to the comparison result, and comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored in the (n + 1) th node. According to the embodiment of the disclosure, not only can the accuracy of searching be ensured, but also the waste of storage resources can be avoided.
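The search embodiment summarized above can be sketched as a loop that records the target search path as it descends. This is a sketch under stated assumptions rather than the disclosed implementation: the key layout, the "smaller goes left" convention and the names `Node` and `find_target` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Key = Tuple[int, int]  # hypothetical: (source IP as int, protocol number)


@dataclass
class Node:
    key: Key
    traffic: int = 0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def find_target(root: Optional[Node], target: Key) -> Tuple[Optional[Node], List[Node]]:
    """Return (target node or None, target search path)."""
    path: List[Node] = []
    node = root
    while node is not None:
        path.append(node)                 # the nth node joins the search path
        if target == node.key:            # consistent: the nth node is the target
            return node, path
        # inconsistent: descend to the child selected by the comparison result
        node = node.left if target < node.key else node.right
    return None, path                     # ran out of child nodes: no target node


root = Node((10, 6), left=Node((5, 6)), right=Node((20, 17)))
found, found_path = find_target(root, (20, 17))
missing, missing_path = find_target(root, (7, 6))
```

When the target is absent, the last element of the returned path is the node at the end of the target search path, which is where a new node would later be attached.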
In an embodiment, as shown in fig. 6, the step of inserting a new node into the target search path of the red-black tree to obtain a new red-black tree may include:
If the server does not find the target node in the red-black tree, it applies for a new node in memory and then stores the target classification dimension information and the first traffic information in the new node.
Step 402, taking the node at the end of the target search path as a parent node, taking the new node as a child node, and connecting the child node to the parent node to obtain a new red-black tree.
The server obtains the target search path in the process of searching for the target node. After the target classification dimension information and the first traffic information are stored in the new node, the node at the end of the target search path is taken as the parent node, and the new node is connected to it as a child node, so that a new red-black tree is obtained. In practical applications, besides the node at the end of the target search path, other nodes in the target search path may also serve as the parent node; the choice of parent node is determined according to the actual situation.
In the step of inserting a new node into the target search path of the red-black tree to obtain a new red-black tree, if the target node is not found in the red-black tree, a new node is applied for, and the target classification dimension information and the first traffic information are stored in the new node; the node at the end of the target search path is then taken as a parent node, the new node is taken as a child node, and the child node is connected to the parent node to obtain a new red-black tree. In the embodiments of the present disclosure, a new storage resource is applied for only when the target node is not found in the red-black tree, that is, storage resources are allocated on demand; in the prior art, the size of the hash table needs to be set in advance. Therefore, compared with the prior art, the resource allocation is more efficient and reliable.
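The attachment of the new node to the end of the target search path can be sketched as follows. The names `Node` and `attach_new_node` are hypothetical, and the sketch covers only the linking step described in the text. A conventional red-black tree additionally recolors and rotates nodes after an insertion to stay balanced; the text does not describe that step, so it is deliberately left out here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Key = Tuple[int, int]  # hypothetical: (source IP as int, protocol number)


@dataclass
class Node:
    key: Key
    traffic: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)


def attach_new_node(path: List[Node], key: Key, traffic: int) -> Node:
    """Allocate a node on demand and link it under the end of the search path."""
    new_node = Node(key, traffic)         # storage is requested only on a miss
    if path:                              # an empty path means the tree was empty
        parent = path[-1]                 # node at the end of the search path
        new_node.parent = parent
        if key < parent.key:              # assumed convention: smaller goes left
            parent.left = new_node
        else:
            parent.right = new_node
    return new_node                       # rebalancing (recolor/rotate) omitted


root = Node((10, 6), 100)
child = attach_new_node([root], (5, 6), 40)
```

Because the node is created only after a failed lookup, memory grows with the number of distinct classification dimension values actually observed, which is the on-demand allocation the paragraph contrasts with a pre-sized hash table.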
In one embodiment, as shown in fig. 7, a traffic classification statistical method is provided. The method is described by taking its application to the server in fig. 1 as an example, and includes the following steps:
Wherein the classification dimension includes at least one of a source IP address and an application protocol.
After acquiring the data packet, the server strips the packet header part from the data packet, extracts a source IP address corresponding to the data packet from a first preset position of the packet header according to an IP layer protocol, and extracts an application protocol corresponding to the data packet from a second preset position of the packet header; and obtaining the target classification dimension information of the data packet according to the source IP address, the application protocol and the like corresponding to the data packet.
According to the source IP address, it can be determined whether the data packet is uplink traffic or downlink traffic.
The server extracts the data length of the data packet from a third preset position of the packet header and then calculates the traffic size of the data packet according to the data length. For example, the traffic size of the data packet is calculated to be x1 according to the data length.
The embodiments of the present disclosure do not limit the order of step 501 and step 502.
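The extraction of classification dimensions and data length from fixed header positions can be illustrated for an IPv4 header. The sketch below makes several assumptions that are not stated in the text: the "preset positions" are taken to be the standard IPv4 field offsets, the IPv4 Protocol field stands in for the "application protocol", and the Total Length field is used directly as the traffic size. The function name `extract_fields` is hypothetical.

```python
import socket
import struct
from typing import Tuple


def extract_fields(packet: bytes) -> Tuple[Tuple[str, int], int]:
    """Return ((source IP, protocol number), traffic size in bytes)."""
    # Offsets follow the fixed part of the IPv4 header.
    total_length = struct.unpack_from("!H", packet, 2)[0]  # bytes 2-3: Total Length
    protocol = packet[9]                                   # byte 9: Protocol
    source_ip = socket.inet_ntoa(packet[12:16])            # bytes 12-15: Source Address
    return (source_ip, protocol), total_length


# A hand-built 20-byte IPv4 header: version/IHL, TOS, total length 40,
# id, flags/fragment, TTL 64, protocol 6 (TCP), checksum, source, destination.
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 0, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.1"),
    socket.inet_aton("198.51.100.7"),
)
dimension_info, size = extract_fields(sample)
```

Since the two extractions read independent fields of the same header, they can run in either order, consistent with the statement that the order of the two steps is not limited.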
Executing one of steps 505, 507 and 510 according to whether the target classification dimension value is consistent with the classification dimension value corresponding to the classification dimension information stored by the nth node and whether the nth node has a child node.
Step 505, if the target classification dimension value is not consistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, comparing the target classification dimension value with the classification dimension value corresponding to the nth node.
The target classification dimension value is compared with the classification dimension value corresponding to the classification dimension information stored in the (n + 1)th node to determine whether the two values are consistent, and the search for the target node continues by referring to step 504 until the target node is found or it is determined that the target node does not exist.
Step 508, applying for a new node, and storing the target classification dimension information and the first traffic information into the new node.
Step 511, updating the second traffic information stored in the target node by using the first traffic information to obtain a new red-black tree.
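The per-packet flow of searching, then either updating the target node or adding a new node, can be sketched in one function. Two assumptions are made here: "updating the second traffic information by using the first traffic information" is read as adding the packet's traffic size to the stored total, and the tree is kept as a plain ordered binary tree without the recoloring and rotation of a full red-black tree. The names `Node` and `record_packet` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical key: (source IP, protocol number). Strings are compared
# lexicographically, which is enough to give the tree a consistent order.
Key = Tuple[str, int]


@dataclass
class Node:
    key: Key
    traffic: int                          # plays the role of the second traffic information
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def record_packet(root: Optional[Node], key: Key, size: int) -> Node:
    """Update the matching node, or add a node at the end of the search path."""
    if root is None:
        return Node(key, size)
    node = root
    while True:
        if key == node.key:
            node.traffic += size          # assumed meaning of "update": accumulate
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:                 # end of the search path: insert here
            setattr(node, side, Node(key, size))
            return root
        node = child


root = None
packets = [(("10.0.0.1", 6), 100), (("10.0.0.2", 17), 60), (("10.0.0.1", 6), 40)]
for key, size in packets:
    root = record_packet(root, key, size)
```

In this usage the third packet repeats the first key, so no new node is created and the stored total for that key grows instead.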
Step 512, traversing each node of the new red-black tree, and performing traffic size statistics on the classification dimension to be counted according to the second traffic information stored by each node in the new red-black tree.
In the process of traffic classification statistics, the server traverses each node of the new red-black tree and determines the classification dimension information and traffic information stored in each node. The server then finds the nodes to be counted according to the classification dimension to be counted; the classification dimension information stored by the nodes to be counted matches the classification dimension to be counted. Finally, the server sums the traffic sizes stored in the nodes to be counted, thereby obtaining the traffic size corresponding to the classification dimension to be counted.
For example, the classification dimension to be counted is a source IP address A. The classification dimension information stored in node 1 to be counted, which is found from the new red-black tree according to the classification dimension to be counted, includes the source IP address A, and its stored traffic information includes the traffic size x1; the classification dimension information stored in node 2 to be counted includes the source IP address A, and its stored traffic information includes the traffic size x2. It can therefore be counted that the traffic size corresponding to the source IP address A is x1 + x2.
By adopting the structure of the red-black tree, the time complexity of operations such as searching and inserting can be reduced, the efficiency of traffic classification statistics is improved, and storage resources can be allocated on demand, so that the waste of storage resources is avoided.
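The worked example above can be reproduced with a short traversal sketch. The node layout, the placeholder values chosen for x1 and x2, and the names `Node`, `walk` and `traffic_for_source` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class Node:
    key: Tuple[str, str]                  # hypothetical: (source IP, application protocol)
    traffic: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def walk(node: Optional[Node]) -> Iterator[Node]:
    """Visit every node of the tree in key order."""
    if node is not None:
        yield from walk(node.left)
        yield node
        yield from walk(node.right)


def traffic_for_source(root: Optional[Node], source_ip: str) -> int:
    """Sum the stored traffic of all nodes whose source IP matches."""
    return sum(n.traffic for n in walk(root) if n.key[0] == source_ip)


x1, x2 = 100, 50                          # placeholder traffic sizes
root = Node(("A", "HTTP"), x1,
            left=Node(("A", "DNS"), x2),
            right=Node(("B", "HTTP"), 70))
total_for_a = traffic_for_source(root, "A")
```

Here two nodes share source IP address A but differ in protocol, so the total for A is the sum of their stored sizes, while the node for B is skipped.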
It should be understood that although the steps in the flowcharts of fig. 2-7 are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least some of the steps in fig. 2-7 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 8, there is provided a traffic classification statistical apparatus, including:
an information extraction module 601, configured to extract target classification dimension information and first traffic information according to the obtained data packet; the target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first flow information is used for representing the flow size of the data packet;
a node searching module 602, configured to search a target node in a preset red-black tree based on the target classification dimension information; the classification dimension information stored by the target node is consistent with the target classification dimension information;
a node inserting module 603, configured to insert a new node in a target search path of the red-black tree if the target node is not found, so as to obtain a new red-black tree; the target search path is the path traversed when searching for the target node, the target classification dimension information and the first traffic information are stored in the new node, and the new red-black tree is used for performing traffic classification statistics.
In one embodiment, as shown in fig. 9, the apparatus further comprises:
and an information updating module 604, configured to update the second traffic information stored in the target node with the first traffic information if the target node is found, so as to obtain a new red-black tree.
In one embodiment, the node searching module 602 is specifically configured to determine a target classification dimension value corresponding to the target classification dimension information; for the nth node in the target search path, comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored by the nth node; if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, comparing the target classification dimension value with the classification dimension value corresponding to the nth node; and determining the (n + 1) th node in the target search path from the child nodes of the nth node according to the comparison result, and comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored in the (n + 1) th node.
In one embodiment, as shown in fig. 10, the apparatus further comprises:
a first node determining module 605, configured to determine the nth node as the target node if the target classification dimension value is consistent with the classification dimension value corresponding to the classification dimension information stored in the nth node.
In one embodiment, as shown in fig. 10, the apparatus further comprises:
a second node determining module 606, configured to determine that the target node is not found in the red-black tree if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node and there is no child node in the nth node.
In one embodiment, the node inserting module 603 is specifically configured to apply for a new node, and store the target classification dimension information and the first traffic information in the new node; and take the node at the end of the target search path as a parent node, take the new node as a child node, and connect the child node to the parent node to obtain a new red-black tree.
In one embodiment, as shown in fig. 11, the apparatus further comprises:
and the counting module 607 is configured to traverse each node of the new red-black tree, and perform traffic size counting on the classification dimension to be counted according to the second traffic information stored in each node of the new red-black tree.
In one embodiment, the information extraction module is specifically configured to extract at least one classification dimension corresponding to a data packet from a packet header of the data packet according to an IP layer protocol, so as to obtain the target classification dimension information, the classification dimension comprising at least one of a source IP address and an application protocol; and to extract the data length from the packet header of the data packet according to the IP layer protocol, and calculate the traffic size of the data packet according to the data length to obtain the first traffic information.
For the specific definition of the traffic classification statistical device, reference may be made to the above definition of the traffic classification statistical method, which is not described herein again. The modules in the traffic classification statistical device can be wholly or partially implemented by software, hardware, or a combination thereof. The modules can be embedded in, or be independent of, a processor in the computer device in hardware form, or can be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 12. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store traffic classification statistics. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a traffic classification statistical method.
Those skilled in the art will appreciate that the architecture shown in fig. 12 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
extracting target classification dimension information and first traffic information according to the acquired data packet; the target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first traffic information is used for representing the traffic size of the data packet;
searching a target node in a preset red-black tree based on the target classification dimension information; the classification dimension information stored by the target node is consistent with the target classification dimension information;
if the target node is not found, inserting a new node into the target search path of the red-black tree to obtain a new red-black tree; the target search path is the path traversed when searching for the target node, the target classification dimension information and the first traffic information are stored in the new node, and the new red-black tree is used for performing traffic classification statistics.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and if the target node is found, updating the second traffic information stored in the target node by adopting the first traffic information to obtain a new red-black tree.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
determining a target classification dimension value corresponding to the target classification dimension information;
for the nth node in the target search path, comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored by the nth node;
if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, comparing the target classification dimension value with the classification dimension value corresponding to the nth node;
and determining the (n + 1)th node in the target search path from the child nodes of the nth node according to the comparison result, and comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored in the (n + 1)th node.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and if the target classification dimension value is consistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, determining the nth node as the target node.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node and the nth node has no child node, determining that the target node is not found in the red-black tree.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
applying for a new node, and storing the target classification dimension information and the first traffic information into the new node;
and taking the node at the end of the target search path as a parent node, taking the new node as a child node, and connecting the child node to the parent node to obtain a new red-black tree.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and traversing each node of the new red-black tree, and carrying out traffic size statistics on the classification dimension to be counted according to the second traffic information stored by each node in the new red-black tree.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
extracting at least one classification dimension corresponding to the data packet from the packet header of the data packet according to an IP layer protocol to obtain target classification dimension information; the classification dimension comprises at least one of a source IP address and an application protocol;
and extracting the data length from the packet header of the data packet according to the IP layer protocol, and calculating the traffic size of the data packet according to the data length to obtain the first traffic information.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
extracting target classification dimension information and first traffic information according to the acquired data packet; the target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first traffic information is used for representing the traffic size of the data packet;
searching a target node in a preset red-black tree based on the target classification dimension information; the classification dimension information stored by the target node is consistent with the target classification dimension information;
if the target node is not found, inserting a new node into the target search path of the red-black tree to obtain a new red-black tree; the target search path is the path traversed when searching for the target node, the target classification dimension information and the first traffic information are stored in the new node, and the new red-black tree is used for performing traffic classification statistics.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and if the target node is found, updating the second traffic information stored in the target node by adopting the first traffic information to obtain a new red-black tree.
In one embodiment, the computer program when executed by the processor further performs the steps of:
determining a target classification dimension value corresponding to the target classification dimension information;
for the nth node in the target search path, comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored by the nth node;
if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, comparing the target classification dimension value with the classification dimension value corresponding to the nth node;
and determining the (n + 1)th node in the target search path from the child nodes of the nth node according to the comparison result, and comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored in the (n + 1)th node.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and if the target classification dimension value is consistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, determining the nth node as the target node.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node and the nth node has no child node, determining that the target node is not found in the red-black tree.
In one embodiment, the computer program when executed by the processor further performs the steps of:
applying for a new node, and storing the target classification dimension information and the first traffic information into the new node;
and taking the node at the end of the target search path as a parent node, taking the new node as a child node, and connecting the child node to the parent node to obtain a new red-black tree.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and traversing each node of the new red-black tree, and carrying out traffic size statistics on the classification dimension to be counted according to the second traffic information stored by each node in the new red-black tree.
In one embodiment, the computer program when executed by the processor further performs the steps of:
extracting at least one classification dimension corresponding to the data packet from the packet header of the data packet according to an IP layer protocol to obtain target classification dimension information; the classification dimension comprises at least one of a source IP address and an application protocol;
and extracting the data length from the packet header of the data packet according to the IP layer protocol, and calculating the traffic size of the data packet according to the data length to obtain the first traffic information.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several embodiments of the present application, and their description is relatively specific and detailed, but they are not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A traffic classification statistical method, characterized in that the method comprises:
extracting target classification dimension information and first traffic information according to the acquired data packet; the target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first traffic information is used for representing the traffic size of the data packet;
searching a target node in a preset red-black tree based on the target classification dimension information; the classification dimension information stored by the target node is consistent with the target classification dimension information;
if the target node is not found, inserting a new node into the target search path of the red-black tree to obtain a new red-black tree; the target search path is the path traversed when searching for the target node, the new node stores the target classification dimension information and the first traffic information, and the new red-black tree is used for performing traffic classification statistics.
2. The method of claim 1, wherein after the searching for the target node in the preset red-black tree based on the target classification dimension information, the method further comprises:
and if the target node is found, updating second traffic information stored in the target node by adopting the first traffic information to obtain the new red-black tree.
3. The method of claim 1, wherein the searching for the target node in the preset red-black tree based on the target classification dimension information comprises:
determining a target classification dimension value corresponding to the target classification dimension information;
for the nth node in the target search path, comparing the target classification dimension value with a classification dimension value corresponding to the classification dimension information stored by the nth node;
if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored by the nth node, comparing the target classification dimension value with the classification dimension value corresponding to the nth node;
and determining the (n + 1)th node in the target search path from the child nodes of the nth node according to the comparison result, and comparing the target classification dimension value with the classification dimension value corresponding to the classification dimension information stored in the (n + 1)th node.
4. The method of claim 3, further comprising:
and if the target classification dimension value is consistent with the classification dimension value corresponding to the classification dimension information stored in the nth node, determining the nth node as the target node.
5. The method of claim 3, further comprising:
and if the target classification dimension value is inconsistent with the classification dimension value corresponding to the classification dimension information stored in the nth node and the nth node has no child node, determining that the target node is not found in the red-black tree.
6. The method of claim 1, wherein inserting a new node into the target search path of the red-black tree to obtain a new red-black tree comprises:
allocating the new node, and storing the target classification dimension information and the first traffic information in the new node;
and taking the node at the tail end of the target search path as a parent node, taking the new node as a child node, and connecting the child node to the parent node to obtain the new red-black tree.
7. The method of claim 1, wherein after inserting a new node into the target search path of the red-black tree to obtain a new red-black tree, the method further comprises:
traversing each node of the new red-black tree, and computing traffic size statistics for the classification dimension to be counted according to the second traffic information stored in each node of the new red-black tree.
8. A traffic classification statistics device, characterized in that the device comprises:
an information extraction module, configured to extract target classification dimension information and first traffic information from an acquired data packet; the target classification dimension information is used for representing at least one classification dimension corresponding to the data packet, and the first traffic information is used for representing the traffic size of the data packet;
a node searching module, configured to search for a target node in a preset red-black tree based on the target classification dimension information; the classification dimension information stored by the target node is consistent with the target classification dimension information;
and a node inserting module, configured to insert a new node into the target search path of the red-black tree to obtain a new red-black tree if the target node is not found; the target search path is the path traversed when searching for the target node, the new node stores the target classification dimension information and the first traffic information, and the new red-black tree is used for performing traffic classification statistics.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
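The search, update, insertion and traversal steps recited in claims 1 to 7 can be illustrated with a minimal Python sketch. It is not the patented implementation: an ordinary, unbalanced binary search tree stands in for the red-black tree, so the recoloring and rotations that a red-black tree performs after an insertion are omitted. The names `Node`, `TrafficTree`, `record` and `totals`, the tuple layout of the key, and the choice to update a node by accumulating byte counts are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical key layout: one string per classification dimension,
# e.g. (source IP, protocol). Tuples compare lexicographically.
Key = Tuple[str, ...]


@dataclass
class Node:
    key: Key                       # classification dimension information
    traffic: int                   # accumulated traffic size (bytes, assumed)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


class TrafficTree:
    """Unbalanced BST used as a stand-in for the red-black tree."""

    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def search(self, key: Key) -> Tuple[Optional[Node], Optional[Node]]:
        """Walk the search path; return (target or None, last parent visited)."""
        parent, node = None, self.root
        while node is not None:
            if key == node.key:            # values consistent: target found
                return node, parent
            parent = node                  # values differ: descend to a child
            node = node.left if key < node.key else node.right
        return None, parent                # ran out of children: not found

    def record(self, key: Key, size: int) -> None:
        """Update the matching node, or insert a new one at the path's tail."""
        target, tail = self.search(key)
        if target is not None:
            target.traffic += size         # assumed update rule: accumulate
            return
        new = Node(key, size)
        if tail is None:                   # empty tree: new node becomes root
            self.root = new
        elif key < tail.key:
            tail.left = new
        else:
            tail.right = new
        # A real red-black tree would rebalance here (omitted in this sketch).

    def totals(self, dim: int) -> Dict[str, int]:
        """Traverse every node and sum traffic per value of one dimension."""
        out: Dict[str, int] = {}
        stack = [self.root]
        while stack:
            node = stack.pop()
            if node is None:
                continue
            out[node.key[dim]] = out.get(node.key[dim], 0) + node.traffic
            stack.extend((node.left, node.right))
        return out
```

In this sketch, calling `record` once per data packet keeps one node per distinct combination of dimension values, and `totals(0)` or `totals(1)` then aggregates the stored traffic by the first or second dimension.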
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011015388.4A CN112131223B (en) | 2020-09-24 | 2020-09-24 | Traffic classification statistical method, device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011015388.4A CN112131223B (en) | 2020-09-24 | 2020-09-24 | Traffic classification statistical method, device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112131223A (en) | 2020-12-25 |
CN112131223B (en) | 2024-02-02 |
Family
ID=73839635
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011015388.4A Active CN112131223B (en) | 2020-09-24 | 2020-09-24 | Traffic classification statistical method, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112131223B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104408111A (en) * | 2014-11-24 | 2015-03-11 | 浙江宇视科技有限公司 | Method and device for deleting duplicate data |
US20150286664A1 (en) * | 2014-04-07 | 2015-10-08 | Oracle International Corporation | Reducing blocking instances in parallel processing systems performing operations on trees |
CN106059957A (en) * | 2016-05-18 | 2016-10-26 | 中国科学院信息工程研究所 | Flow table rapid searching method and system under high-concurrency network environment |
WO2018058949A1 (en) * | 2016-09-30 | 2018-04-05 | 华为技术有限公司 | Data storage method, device and system |
CN110290117A (en) * | 2019-06-06 | 2019-09-27 | 新华三信息安全技术有限公司 | A kind of method and device of Match IP Address |
CN111324621A (en) * | 2020-02-19 | 2020-06-23 | 中国银联股份有限公司 | Event processing method, device, equipment and storage medium |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150286664A1 (en) * | 2014-04-07 | 2015-10-08 | Oracle International Corporation | Reducing blocking instances in parallel processing systems performing operations on trees |
CN104408111A (en) * | 2014-11-24 | 2015-03-11 | 浙江宇视科技有限公司 | Method and device for deleting duplicate data |
CN106059957A (en) * | 2016-05-18 | 2016-10-26 | 中国科学院信息工程研究所 | Flow table rapid searching method and system under high-concurrency network environment |
WO2018058949A1 (en) * | 2016-09-30 | 2018-04-05 | 华为技术有限公司 | Data storage method, device and system |
CN110290117A (en) * | 2019-06-06 | 2019-09-27 | 新华三信息安全技术有限公司 | A kind of method and device of Match IP Address |
CN111324621A (en) * | 2020-02-19 | 2020-06-23 | 中国银联股份有限公司 | Event processing method, device, equipment and storage medium |
Non-Patent Citations (3)
Title |
---|
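周彩兰; 张亚芳; 郭凤玲: "Application of the hash red-black tree algorithm in network information analysis", 软件导刊, no. 13, pages 138-139 * |
张先利: "A new pattern matching algorithm for intrusion detection", 计算机应用与软件, no. 05, pages 272-273 * |
马博韬; 孙鹏; 朱小勇: "A survey of red-black tree algorithm research", 网络新媒体技术, no. 04, pages 60-66 * |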
Also Published As
Publication number | Publication date |
---|---|
CN112131223B (en) | 2024-02-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110209348B (en) | | Data storage method and device, electronic equipment and storage medium |
CN109325118B (en) | | Unbalanced sample data preprocessing method and device and computer equipment |
CN110442623B (en) | | Big data mining method and device and data mining server |
CN109614399B (en) | | Bitmap data query method and device, computer equipment and storage medium |
CN108389124B (en) | | Data processing method, data processing device, computer equipment and storage medium |
CN112527950B (en) | | Map data deleting method and system based on MapReduce |
CN111405007B (en) | | TCP session management method, device, storage medium and electronic equipment |
CN111522873B (en) | | Block generation method, device, computer equipment and storage medium |
CN112511612A (en) | | Cloud storage data storage method, device, system, equipment and storage medium |
CN112131223B (en) | | Traffic classification statistical method, device, computer equipment and storage medium |
CN116303343A (en) | | Data slicing method, device, electronic equipment and storage medium |
CN107977381B (en) | | Data configuration method, index management method, related device and computing equipment |
CN113992364B (en) | | Network data packet blocking optimization method and system |
CN115865457A (en) | | Network attack behavior identification method, server and medium |
CN112817980B (en) | | Data index processing method, device, equipment and storage medium |
CN110555158A (en) | | Mutually exclusive data processing method and system, and computer readable storage medium |
CN114024838A (en) | | Log processing method and device and electronic equipment |
CN114039796A (en) | | Network attack determination method and device, computer equipment and storage medium |
CN116644062A (en) | | B+ tree-based people stream analysis method, double B+ tree construction method and device |
CN109542662B (en) | | Memory management method, device, server and storage medium |
CN111104528A (en) | | Picture obtaining method and device and client |
CN114938402B (en) | | Unknown protocol frame structure identification method and device based on dictionary tree |
CN114143083B (en) | | Blacklist policy matching method and device, electronic equipment and storage medium |
CN108829831B (en) | | Data processing method and device, hardware device and chip |
CN113472654B (en) | | Network traffic data forwarding method, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||