CN109428774B - Data processing method of DPI equipment and related DPI equipment - Google Patents

Info

Publication number
CN109428774B
Authority
CN
China
Prior art keywords
data stream
record
information
user
target user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710725583.8A
Other languages
Chinese (zh)
Other versions
CN109428774A (en
Inventor
程杜勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wangsu Science and Technology Co Ltd
Original Assignee
Wangsu Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wangsu Science and Technology Co Ltd
Priority to CN201710725583.8A
Publication of CN109428774A
Application granted
Publication of CN109428774B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • H04L43/02 Capturing of monitoring data
    • H04L43/026 Capturing of monitoring data using flow identification

Abstract

Embodiments of the invention relate to the field of data processing, and in particular to a data processing method of DPI equipment and related DPI equipment, which are used to reduce the amount of stored data. In the embodiments, data stream information is acquired and its target user is determined; the data stream record of the target user is determined; and if the data stream record contains a first record whose user attribute information is consistent with that of the data stream information, the statistical attribute information in the first record is updated according to the data stream information. Because an existing first record is updated in place, the acquired data stream information does not need to be stored separately, which reduces the amount of stored data. In addition, the data stream records use the user as the storage unit, and the number of users in the whole network is far smaller than the number of data streams, which reduces the amount of stored data further.

Description

Data processing method of DPI equipment and related DPI equipment
Technical Field
The embodiment of the invention relates to the field of data processing, in particular to a data processing method of DPI equipment and related DPI equipment.
Background
In recent years, network scale has kept growing, the number of network users has kept increasing, and the types of network applications and services have kept multiplying, all of which complicates network analysis. How to analyze network conditions efficiently, handle network crises, quickly perceive user behavior, and mine the value of data has become an important problem in network analysis.
At present, a commonly used method for monitoring network data traffic is Deep Packet Inspection (DPI), which performs protocol analysis across all seven layers: in addition to analyzing layer 4 and below (MAC address, IP layer, transport layer), it analyzes the application layer (application-layer protocol, payload content, etc.) and the connection state of data packets. DPI can identify various application types and thus helps operators monitor network traffic. A DPI device can store link information, data packet information, data analysis results and the like, and display them on the World Wide Web (WEB for short).
In the prior art, DPI devices often store this information in a database, inserting the information for each data stream one record at a time. However, as networks develop, the volume of network data keeps growing and the DPI device has to store more and more content, so storage efficiency drops and services cannot be handled at a fine granularity. This is especially true in large network environments that use multiple 10 Gb network cards, where data streams arrive at a rate of several million per second; if the DPI device stores them in a database one by one, the amount of data to be stored is very large and occupies a large amount of storage space.
Disclosure of Invention
The embodiment of the invention provides a data processing method of DPI equipment and related DPI equipment, which are used for reducing the data volume stored by the DPI equipment.
The embodiment of the invention provides a data processing method of DPI equipment, which comprises the following steps: acquiring data stream information and determining a target user of the data stream information; acquiring a data stream record of the target user; and if the data stream record is determined to have a first record consistent with the user attribute information of the data stream information, updating the statistical attribute information in the first record according to the data stream information.
Optionally, if it is determined that the first record does not exist in the data stream record, adding a second record in the data stream record of the target user according to the data stream information, where the statistical attribute information of the second record is determined according to the statistical attribute information in the data stream record.
Optionally, determining the target user of the data stream information includes: determining the target user according to the network card type and the corresponding preset condition. The preset conditions include: when the network card type is an uplink/downlink dual network card, the source Internet Protocol (IP) address in data stream information passing through the uplink network card is determined as the target user, and the destination IP address in data stream information passing through the downlink network card is determined as the target user; when the network card type is a single network card, if the source IP address in the data stream information is an IP address in the network segment where the target user is located, the source IP address is determined as the target user, and if the destination IP address in the data stream information is an IP address in that network segment, the destination IP address is determined as the target user, where the source IP address and the destination IP address are not in the same network segment.
Optionally, after the statistical attribute information in the first record is updated according to the data stream information, the method further includes: acquiring target data stream records within a statistical time period from the data stream records of the target user; and, for at least one piece of user attribute information in the target data stream records, determining the proportion of the statistical attribute information corresponding to that user attribute information in the sum of the statistical attribute information in the target data stream records.
Optionally, during a preset time period, the data stream records of the target user stored in memory are imported into a database; the preset time period is a period in which network traffic is below a traffic threshold.
An embodiment of the present invention provides a DPI device for data processing, including: the storage module is used for storing data stream records of all users, the data stream record of each user comprises a plurality of data stream records, and the user attribute information of all the data stream records is not completely the same; the processing module is used for acquiring data stream information and determining a target user of the data stream information; acquiring a data stream record of the target user from the storage module; and if the data stream record is determined to have a first record consistent with the user attribute information of the data stream information, updating the statistical attribute information in the first record according to the data stream information.
Optionally, the processing module is further configured to: and if the first record does not exist in the data stream record, adding a second record in the data stream record of the target user according to the data stream information, wherein the statistical attribute information of the second record is determined according to the statistical attribute information in the data stream record.
Optionally, the processing module is configured to: determine the target user of the data stream information according to the network card type and the corresponding preset condition. The preset conditions include: when the network card type is an uplink/downlink dual network card, the source Internet Protocol (IP) address in data stream information passing through the uplink network card is determined as the target user, and the destination IP address in data stream information passing through the downlink network card is determined as the target user; when the network card type is a single network card, if the source IP address in the data stream information is an IP address in the network segment where the target user is located, the source IP address is determined as the target user, and if the destination IP address in the data stream information is an IP address in that network segment, the destination IP address is determined as the target user, where the source IP address and the destination IP address are not in the same network segment.
Optionally, the processing module is further configured to: acquire target data stream records within a statistical time period from the data stream records of the target user; and, for at least one piece of user attribute information in the target data stream records, determine the proportion of the statistical attribute information corresponding to that user attribute information in the sum of the statistical attribute information in the target data stream records.
Optionally, the processing module is further configured to: import the data stream records of the target user stored in the storage module into a database during a preset time period; the preset time period is a period in which network traffic is below a traffic threshold.
In the embodiments of the invention, data stream information is acquired and its target user is determined. When the data stream record of the target user contains a first record consistent with the user attribute information of the acquired data stream information, the statistical attribute information in the first record, and thus the data stream statistics of the target user, is updated according to the data stream information. Existing user information and data stream information therefore do not need to be stored repeatedly, which reduces the amount of stored data. Moreover, the data stream records use the target user as the storage unit, and the number of users in the whole network is far smaller than the number of data streams, so indexing by user can also improve query efficiency.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings that are required to be used in the description of the embodiments will be briefly described below.
Fig. 1 is a schematic architecture diagram of a communication system according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a data processing method of a DPI device according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of a data processing method of another DPI device according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart of a data processing method of another DPI device according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a data processing device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows an architecture diagram of a communication system to which an embodiment of the invention is applied. As shown in fig. 1, the system architecture may include a client 101, a server 102, and a data processing device, where the data processing device includes a DPI device, and the embodiment of the present invention is discussed by taking the data processing device as DPI device 103 as an example. The DPI device may be located at an egress of the network, near a switch, in an attachment to a router, or in a router, etc., to facilitate obtaining a data flow for communication between the client and the server.
The client 101 may be a terminal device, such as a User Equipment (UE), an access terminal, a subscriber unit, a subscriber station, a mobile station, a remote terminal, a mobile device, a user terminal, a wireless communication device, or a user agent. The client 101 may communicate with one or more core networks through a Radio Access Network (RAN). The access terminal may be a cellular phone, a cordless phone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA), a handheld device with wireless communication capability, a computing device or other processing device connected to a wireless modem, a vehicle-mounted device, a wearable device, a terminal device in a future 5G network, and the like.
The server 102 may be any server that communicates with the client 101.
DPI device 103 is connected between client 101 and server 102. For example, DPI device 103 may be connected between the router corresponding to the client and the router corresponding to the server: a data stream travels from the client through the client's router to the DPI device, and the DPI device then forwards it to the server through the server's router. After acquiring data stream information, the DPI device analyzes it, saves the analysis result, and can display the result on the web when needed. The storage space of the DPI device comprises memory and a disk, and data on the disk is stored in the form of a database; the analysis result can therefore be saved either in memory or in the database on disk. DPI device 103 includes a recording module 103a, an analysis and detection module 103b, and a display module 103c. As a data stream passes through the DPI device, the recording module 103a records connection tracking information for the data stream, including the quintuple information: source IP address, destination IP address, source port, destination port, and transport-layer protocol. The data stream is then sent to the analysis and detection module 103b, which performs traffic analysis and protocol detection on it, that is, determines the uplink and downlink traffic and/or total traffic and the application-layer information of the data stream, and sends the result back to the recording module 103a. Through the recording module 103a and the analysis and detection module 103b, the DPI device can thus determine the target user, user attribute information, and statistical attribute information corresponding to the acquired data stream information.
When the data stream information needs to be displayed, the display module 103c obtains it from the background for display; for example, the statistical results for the user attribute information may be displayed on the web. In the embodiments of the invention, multiple independent threads can be set up for the DPI device, bound to CPUs, and executed in parallel, which improves the operating efficiency of the DPI device; the recording module, the analysis and detection module, and the display module run independently without affecting one another, which ensures stable operation of the DPI device.
Based on the system architecture shown in fig. 1, fig. 2 exemplarily shows a flow diagram of a data processing method of a DPI device provided in an embodiment of the present invention, and as shown in fig. 2, the data processing method of the DPI device includes the following steps:
step 201, acquiring data stream information and determining a target user of the data stream information;
step 202, acquiring a data stream record of the target user;
step 203, if it is determined that a first record consistent with the user attribute information of the data stream information exists in the data stream record, updating the statistical attribute information in the first record according to the data stream information.
In the embodiments of the invention, data stream information is acquired and its target user is determined. When the data stream record of the target user contains a first record consistent with the user attribute information of the acquired data stream information, the statistical attribute information in the first record, and thus the statistics of the target user, is updated according to the data stream information. Existing user information and data stream information therefore do not need to be stored repeatedly, which reduces the amount of stored data. Moreover, the data stream records use the target user as the storage unit, and the number of users in the whole network is far smaller than the number of data streams, so indexing by user can also improve query efficiency.
In the embodiments of the present invention, the user attribute information includes any one or more of: a source IP address, a destination IP address, a user application type, a user Uniform Resource Locator (URL), a destination port, and a source port. The statistical attribute information includes any one or more of: uplink and downlink traffic and/or total traffic, uplink and downlink rates, and online time.
The embodiments of the present invention provide an optional method for determining the target user of data stream information: the target user is determined according to the network card type and the corresponding preset condition. The preset conditions include: when the network card type is an uplink/downlink dual network card, the source Internet Protocol (IP) address in data stream information passing through the uplink network card is determined as the target user, and the destination IP address in data stream information passing through the downlink network card is determined as the target user; when the network card type is a single network card, if the source IP address in the data stream information is an IP address in the network segment where the target user is located, the source IP address is determined as the target user, and if the destination IP address in the data stream information is an IP address in that network segment, the destination IP address is determined as the target user, where the source IP address and the destination IP address are not in the same network segment. If neither the source IP address nor the destination IP address matches the network segment corresponding to the target user, the data stream is discarded.
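The preset conditions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function name `determine_target_user`, the `nic_mode` and `direction` parameters, and the `user_segments` list of CIDR strings are all hypothetical names introduced here. The text does not say what happens when both addresses fall inside the user segment; the sketch assumes such a flow is discarded as well.

```python
import ipaddress


def determine_target_user(src_ip, dst_ip, nic_mode, direction=None, user_segments=()):
    """Return the IP address treated as the target user, or None to discard the flow.

    nic_mode, direction and user_segments are hypothetical parameters for this sketch.
    """
    if nic_mode == "dual":
        # Uplink card: the source is the user. Downlink card: the destination is the user.
        if direction == "uplink":
            return src_ip
        if direction == "downlink":
            return dst_ip
        raise ValueError("dual mode needs direction 'uplink' or 'downlink'")

    if nic_mode == "single":
        networks = [ipaddress.ip_network(segment) for segment in user_segments]
        src_in = any(ipaddress.ip_address(src_ip) in net for net in networks)
        dst_in = any(ipaddress.ip_address(dst_ip) in net for net in networks)
        if src_in and not dst_in:
            return src_ip
        if dst_in and not src_in:
            return dst_ip
        # Neither address matches (discard, per the text), or both match
        # (assumed here to be discarded too, since the text requires that
        # source and destination are not in the same segment).
        return None

    raise ValueError("nic_mode must be 'dual' or 'single'")
```

For example, with `user_segments=["10.0.0.0/24"]` in single-card mode, a flow from `10.0.0.5` to `8.8.8.8` is attributed to `10.0.0.5`, and a flow between two outside addresses is dropped.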
In the embodiments of the invention, after the target user of the data stream information is determined, it is judged whether the target user already exists. If the target user does not exist, a data stream record corresponding to the target user is established, which includes user attribute information and statistical attribute information. If the target user exists, it is judged whether the data stream record contains a first record consistent with the user attribute information of the data stream information; if so, the statistical attribute information in the first record is updated according to the data stream information; if not, a second record is added to the data stream record of the target user according to the data stream information, where the statistical attribute information of the second record is determined according to the statistical attribute information in the data stream record.
The embodiments of the present invention provide an optional manner of determining whether the acquired data stream information is already stored in the data stream record of its target user, as follows:
the quintuple information of the acquired data stream information is determined; the user attribute information comprises the quintuple information, and whether the acquired data stream already exists in the data stream record of the target user is determined according to that quintuple information. Because quintuple information uniquely identifies a piece of data stream information, it allows an accurate judgment of whether the received data stream exists in the user's data stream record.
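One way to picture the quintuple check is to use the quintuple as a dictionary key, so that the existence test becomes a single lookup. The sketch below is an assumption about representation only: the `FiveTuple` type, its field names, and the `has_record` helper are hypothetical and do not come from the patent.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Hypothetical key type: the five fields that identify one data stream."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


def has_record(user_records, key):
    """Return True if the user's records already contain this quintuple."""
    return key in user_records


# Records of one target user, keyed by quintuple (values are illustrative statistics).
records = {FiveTuple("10.0.0.5", "93.184.216.34", 51000, 443, "TCP"): {"total": 100}}

same_flow = FiveTuple("10.0.0.5", "93.184.216.34", 51000, 443, "TCP")
other_flow = same_flow._replace(dst_port=80)
```

Because named tuples compare and hash by value, `same_flow` finds the stored record while `other_flow`, which differs only in destination port, does not.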
For ease of understanding, the data processing method of the DPI device is further described below with reference to a specific embodiment. Optionally, the statistical attribute information can be determined in multiple ways, such as summing, weighted summing, or scaling up or down by a common factor, chosen according to actual needs. Assume the stored data stream records of target user A and target user B are as shown in Table 1, where the user attribute information of target users A and B includes the user application and the destination IP address, and the statistical attribute information includes uplink traffic, downlink traffic, and total traffic.
TABLE 1 data flow records for target user A and target user B
Figure BDA0001385936990000081
Assume the acquired data stream is data stream 1. The target user of data stream 1 is determined; if the target user is A, the data stream record of the target user is determined, i.e. the records corresponding to target user A in Table 1. The user attribute information and statistical attribute information of data stream 1 are then determined.
Case one: suppose the user attribute information determined for data stream 1 is: the user application is 360 Search and the destination IP address is IP2; and the statistical attribute information is: uplink traffic 1.5M, downlink traffic 20M, total traffic 21.5M. The user attribute information of data stream 1 is consistent with the 360 Search record in the data stream record of the target user, so that record is the first record, and its statistical attribute information is updated according to the data stream information: the uplink traffic in the first record becomes 2.5M, the downlink traffic 30M, and the total traffic 32.5M. After the record is updated, data stream 1 is discarded.
Case two: suppose the user attribute information determined for data stream 1 is: the user application is QQ cyclone and the destination IP address is IP8; and the statistical attribute information is: uplink traffic 1M, downlink traffic 20M, total traffic 21M. Since the data stream record of target user A contains no first record consistent with the user attribute information of the data stream information, a second record is added to the data stream record of the target user according to the data stream information; the user attribute information of the second record is consistent with that of the data stream information, and its statistical attribute information is determined by summation according to the statistical attribute information in the data stream record. The data stream record of target user A after the second record is added is shown in Table 2.
TABLE 2 data stream record of target user A after adding a second record
Figure BDA0001385936990000091
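The two cases can be replayed as a short worked example. Tables 1 and 2 are only available as images, so the starting values of the 360 Search record below (1M up, 10M down, 11M total) are inferred from case one by subtracting the flow's values from the updated values; they are an assumption, as are the function name `apply_flow` and the dictionary layout. For case two, the sketch assumes the new record simply starts from the flow's own statistics.

```python
def apply_flow(records, user_attrs, stats):
    """Update the matching record by summation, or add a new record.

    Returns "updated" or "added" so the two cases can be told apart.
    """
    existing = records.get(user_attrs)
    if existing is None:
        records[user_attrs] = dict(stats)  # case two: second record added
        return "added"
    for name, value in stats.items():      # case one: first record updated
        existing[name] = existing.get(name, 0.0) + value
    return "updated"


# Records of target user A; starting values inferred from case one (assumption).
records_a = {("360 Search", "IP2"): {"up": 1.0, "down": 10.0, "total": 11.0}}

case_one = apply_flow(records_a, ("360 Search", "IP2"),
                      {"up": 1.5, "down": 20.0, "total": 21.5})
case_two = apply_flow(records_a, ("QQ cyclone", "IP8"),
                      {"up": 1.0, "down": 20.0, "total": 21.0})
```

After both calls, user A still has only two records: the 360 Search record holds the summed traffic from case one, and the QQ cyclone record is the newly added one.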
In the embodiments of the invention, the structure of the data stored in the DPI device is designed so that the target user is the basic storage unit, and the data stream information of each target user is stored as user attribute information plus statistical attribute information. During storage, the data stream information of the target user is deduplicated and the statistical attribute information of the target user is updated, which can greatly reduce the amount of data saved. To describe the above method flow more clearly, the cases where 1 piece and N pieces of data stream information are acquired are described separately below.
In the embodiments of the present invention, acquiring the data stream information includes acquiring it periodically or in real time; 1 or N pieces of data stream information may be acquired, where N is an integer greater than 1.
Figure 3 illustrates another data processing method for DPI devices provided by the present invention. In this embodiment, the number of the acquired data streams is 1, and as shown in fig. 3, the data storage method includes:
step 301, acquiring 1 piece of data stream information;
step 302, determining a target user of the data stream;
step 303, determining whether a target user of the data flow exists in a memory of the DPI device; if so, go to step 304; if not, go to step 308;
step 304, determining the data stream record of the target user;
step 305, judging whether a first record consistent with the user attribute of the data stream information exists in the data stream record of the target user, if so, executing step 306; if not, go to step 307;
step 306, updating the statistical attribute information in the first record according to the data stream information;
step 307, adding a second record in the data stream record of the target user according to the data stream information, where the user attribute information of the second record is consistent with the user attribute information of the data stream information, and the statistical attribute information of the second record is determined according to the statistical attribute information in the data stream record;
step 308, establishing the target user;
step 309, a corresponding data stream record is established under the established target user.
After the target user is established, memory space is requested from the DPI device for that user. Once the data stream record of the target user has been established, subsequent data streams are stored by the same method described above, which is not repeated here.
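Steps 301 to 309 can be sketched as a small in-memory store indexed first by target user and then by user attribute information. This is a minimal sketch under stated assumptions: the `FlowInfo` and `UserFlowStore` names are hypothetical, statistics are combined by plain summation (one of the options the text mentions), and a newly created user's first record is filled directly from the flow.

```python
from dataclasses import dataclass


@dataclass
class FlowInfo:
    """Hypothetical container for one piece of data stream information."""
    target_user: str   # result of step 302
    user_attrs: tuple  # e.g. (application, destination IP)
    stats: dict        # e.g. {"up": 1.0, "down": 2.0, "total": 3.0}


class UserFlowStore:
    """In-memory records: target user -> {user_attrs -> statistics}."""

    def __init__(self):
        self.users = {}

    def process(self, flow):
        records = self.users.get(flow.target_user)        # step 303
        if records is None:
            records = {}                                  # step 308: establish user
            self.users[flow.target_user] = records        # step 309: its record set
        first = records.get(flow.user_attrs)              # steps 304-305
        if first is not None:
            for name, value in flow.stats.items():        # step 306: update in place
                first[name] = first.get(name, 0.0) + value
            return "updated"
        records[flow.user_attrs] = dict(flow.stats)       # step 307: add second record
        return "added"
```

Indexing by user first keeps the outer dictionary small, which matches the observation that the number of users is far smaller than the number of data streams.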
Fig. 4 illustrates another data processing method provided by the present invention. The number of the data streams obtained in this embodiment is N, where N is an integer greater than 1; as shown in fig. 4, the data storage method includes:
step 401, acquiring N data stream information;
step 402, determining target users of the N pieces of data flow information;
for ease of description, this embodiment assumes that the N pieces of data stream information all have the same target user; if at least one of the N pieces of data stream information has a different target user, the process can be obtained by combining the procedures for acquiring one piece and N pieces of data stream information;
step 403, determining whether the target user exists in the memory of the DPI device; if yes, go to step 404; if not, go to step 409;
step 404, determining a data stream record of the target user;
step 405, clustering the N pieces of data stream information by identical user attribute information to obtain at least one cluster, and, for each cluster, aggregating the statistical attribute information of the data streams in the cluster into the statistical attribute information of that cluster;
the user attribute information includes any one or more of a source IP address, a destination IP address, a user application type, a user Uniform Resource Locator (URL), a destination port, and a source port; the statistical attribute information includes: any one or more of upstream, downstream and/or total flow;
step 406, for each cluster in at least one cluster, determining whether a first record consistent with the user attribute information of the cluster exists in the data stream record of the target user; if yes, go to step 407; if not, go to step 408;
step 407, updating the statistical attribute information in the first record according to the statistical attribute information in the data stream information corresponding to the cluster;
step 408, adding a second record in the data stream record of the target user according to the data stream information corresponding to the cluster, where the user attribute information of the second record is consistent with the user attribute information of the data stream information in the cluster, and the statistical attribute information of the second record is determined according to the statistical attribute information in the data stream record;
step 409, establishing the target user;
steps 404 to 408 are then performed again in sequence based on the target user established.
The storage process of the data stream is the same as the above method, and is not described herein again.
Optionally, when the N pieces of data stream information are acquired, it may first be determined whether each piece already exists in the data stream record of its corresponding target user, and clustering is then performed only on the remaining pieces of data stream information.
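Steps 403 to 409 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `memory` dictionary, the field names (`dst_ip`, `app`, `url`, `dst_port`, `up`, `down`), and the choice of which user attributes form the matching key are all hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory store: target user -> {user-attribute key -> statistics}.
memory = {}

def attribute_key(flow):
    # User attribute information used for matching; the field names are assumptions.
    return (flow["dst_ip"], flow["app"], flow["url"], flow["dst_port"])

def store_batch(user, flows):
    """Store N flows that all belong to the same target user (steps 403 to 409)."""
    # Step 405: cluster flows with identical user attributes and sum their statistics.
    clusters = defaultdict(lambda: {"up": 0, "down": 0})
    for flow in flows:
        stats = clusters[attribute_key(flow)]
        stats["up"] += flow["up"]
        stats["down"] += flow["down"]

    # Steps 403, 404 and 409: fetch the user's records, creating the user if absent.
    records = memory.setdefault(user, {})

    for key, stats in clusters.items():
        if key in records:
            # Steps 406 and 407: a matching first record exists, so update it.
            records[key]["up"] += stats["up"]
            records[key]["down"] += stats["down"]
        else:
            # Step 408: no matching record, so add a second record.
            records[key] = dict(stats)
    return records
```

Because the batch is aggregated before it touches the stored records, each cluster causes at most one lookup and one update, however many of the N flows fall into it.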
In the prior art, data usually has to be classified and sorted at display time. For example, to learn the traffic of accesses to a certain server, all stored data stream records must be traversed to find every data stream that reaches the server, and those data streams must then be sorted to obtain the result. Likewise, if a user wants to view the current ranking of network users by access traffic, the total traffic of each user must first be summarized and then sorted. When the volume of data streams is huge, the whole query is very slow, which results in a poor user experience.
To make the data stored in the DPI device easier to display, the embodiment of the present invention provides a specific implementation. After the statistical attribute information in the first record is updated according to the data stream information, the method further includes: determining a target data stream record of the target user corresponding to a target time granularity, where the target data stream record includes at least one piece of user attribute information, the data stream record includes the target data stream record, and the time granularity is any of minutes, hours, or days; and, for each piece of the at least one piece of user attribute information, determining the proportion that the statistical attribute information corresponding to that user attribute information represents of the statistical attribute information included in the target data stream record.
A specific example illustrates how this proportion is determined. Assume that the target time granularity is minutes and take 10 minutes as an example. The target data stream record of the target user determined within the 10 minutes is shown in Table 3; in this example, the user attribute information is the user application and the statistical attribute information is the total traffic.
TABLE 3 Target data stream record of the target user within 10 minutes
No. | User application | Number of connections | Upstream traffic | Downstream traffic | Total traffic
1 | Fast thunder | 2 | 0.5M | 2M | 2.5M
2 | 360 search | 10 | 1M | 10M | 11M
3 | HTTP | 25 | 30M | 60M | 90M
4 | Tencent | 1 | 0 | 0 | 0
Within the 10 minutes, the total traffic of the user application Fast thunder accounts for 2.5/(2.5+11+90+0) ≈ 2.4% of the total traffic included in the target data stream record, and the total traffic of HTTP accounts for 90/(2.5+11+90+0) ≈ 87.0%; the proportions of the other applications can be obtained in the same way.
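The proportion calculation for Table 3 can be reproduced with a short Python sketch. The function name `traffic_shares` and the dictionary layout are assumptions made for illustration; only the total traffic figures come from the table.

```python
def traffic_shares(total_by_app):
    """Return each application's share of the summed total traffic, in percent."""
    grand_total = sum(total_by_app.values())
    if grand_total == 0:
        # Avoid division by zero when no traffic was recorded in the period.
        return {app: 0.0 for app in total_by_app}
    return {app: 100.0 * value / grand_total for app, value in total_by_app.items()}

# Total traffic per user application from Table 3, in M.
table3 = {"Fast thunder": 2.5, "360 search": 11, "HTTP": 90, "Tencent": 0}
shares = traffic_shares(table3)
```

Since the per-application totals are already aggregated in the target user's record, the calculation only touches one row per application rather than every individual data stream.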
The embodiment of the present invention can also determine the proportions for other user attribute information, and the statistical attribute information may instead be upstream traffic, downstream traffic, total traffic, online time, or the like. The specific user attribute information and statistical attribute information are chosen according to specific requirements.
To serve the display module, the embodiment of the present invention may rank different user attribute information by total traffic, for example ranking the user application types, ranking the user URLs, ranking the servers corresponding to the destination IP addresses, and ranking the target users by their total traffic. This speeds up result display on the DPI device, allows the traffic condition of a user to be obtained quickly, and improves the user experience.
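A ranking by total traffic can be sketched as below. The nested dictionary layout (target user, then record key, then `up` and `down` counters) and both function names are hypothetical; the same `rank_by_total_traffic` helper could be applied to per-application, per-URL, or per-server totals.

```python
def total_per_user(memory):
    """Sum upstream and downstream traffic over every record of each target user."""
    return {
        user: sum(rec["up"] + rec["down"] for rec in records.values())
        for user, records in memory.items()
    }

def rank_by_total_traffic(totals, top_n=None):
    """Return (name, total) pairs sorted from highest to lowest total traffic."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]
```

The sort runs over one entry per user (or per application), which is the reason the text gives for faster display: the number of users is far smaller than the number of data streams.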
The embodiment of the present invention provides an implementation for importing the data stream records in the DPI device into a database: the data stream records of the target users stored in memory are imported into the database during a preset time period, where the preset time period is a period in which the network traffic is lower than a traffic threshold. Because the embodiment takes the target user as the basic storage unit and performs deduplication, aggregation, updating, and similar operations on the data streams of each target user, the amount of stored data is greatly reduced. The amount of data imported into the database is reduced accordingly, which saves database write time and disk overhead.
When data is imported into the database on disk, it may also be imported selectively, which further reduces the amount of data written to the database.
Optionally, the preset time period may be the end of the day, for example from 23:50 to 24:00, and the data stream records of the current day are stored during this period. Because the network traffic is then lower than the traffic threshold, that is, the traffic is small, the demand on the device is low and the probability of errors is low.
Furthermore, because the data stream records of the target users stored in the memory of the DPI device are imported into the database on disk only during one preset time period per day, frequent database operations are unnecessary, which improves the operating stability of the DPI device and avoids reducing its operating efficiency.
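The once-a-day import can be sketched as follows, using SQLite from the Python standard library purely as a stand-in database. The threshold value, the table name `flow_record`, its columns, and the in-memory record layout are all assumptions; the patent does not specify a database or schema.

```python
import sqlite3
from datetime import time

FLUSH_START = time(23, 50)      # start of the example window given in the text
TRAFFIC_THRESHOLD = 10_000_000  # hypothetical threshold, in bytes per second

def in_flush_window(now, current_rate):
    """True when the clock is in the preset period and traffic is below the threshold."""
    return now >= FLUSH_START and current_rate < TRAFFIC_THRESHOLD

def flush_to_database(conn, memory):
    """Write every aggregated record held in memory to the database in one transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flow_record "
        "(user TEXT, app TEXT, up INTEGER, down INTEGER)"
    )
    rows = [
        (user, app, stats["up"], stats["down"])
        for user, records in memory.items()
        for app, stats in records.items()
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO flow_record VALUES (?, ?, ?, ?)", rows)
    return len(rows)
```

Writing the already aggregated records in a single batch is what keeps the number of database operations low; a selective import would simply filter `rows` before the insert.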
As can be seen from the above, in the embodiment of the present invention, data stream information is acquired and its target user is determined; when a first record consistent with the user attribute information of the acquired data stream information is found in the data stream record of the target user, the statistical attribute information in the first record is updated according to the data stream information. The existing user information and data stream information therefore need not be stored repeatedly, which reduces the amount of stored data. Moreover, the data stream record of the target user serves as the storage unit, and the number of users in the whole network is far smaller than the number of data streams, so indexing by user improves query efficiency.
Based on the same concept, Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention. As shown in Fig. 5, the data processing apparatus 500 includes a storage module 501 and a processing module 502, wherein:
the storage module is used for storing data stream records of all users, the data stream record of each user comprises a plurality of data stream records, and the user attribute information of all the data stream records is not completely the same; the processing module is used for acquiring data stream information and determining a target user of the data stream information; acquiring a data stream record of the target user from the storage module; and if the data stream record is determined to have a first record consistent with the user attribute information of the data stream information, updating the statistical attribute information in the first record according to the data stream information.
Optionally, the processing module is further configured to: and if the first record does not exist in the data stream record, adding a second record in the data stream record of the target user according to the data stream information, wherein the statistical attribute information of the second record is determined according to the statistical attribute information in the data stream record.
Optionally, the processing module is configured to: determine the target user of the data stream information according to the network card type and corresponding preset conditions, where the preset conditions include: if the network card type is uplink/downlink dual network cards, the source Internet Protocol (IP) address in data stream information passing through the uplink network card is determined as the target user, and the destination IP address in data stream information passing through the downlink network card is determined as the target user; and if the network card type is a single network card, whichever of the source IP address and the destination IP address in the data stream information matches the network segment where the target user is located is determined as the target user.
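The preset conditions for identifying the target user can be sketched with the standard `ipaddress` module. The function signature, the `"dual"` and `"single"` labels, the `via_uplink` flag, and the flow field names are hypothetical and chosen only for illustration.

```python
import ipaddress

def target_user(flow, nic_type, user_segment=None, via_uplink=None):
    """Return the IP address treated as the target user, or None if none matches."""
    if nic_type == "dual":
        # Uplink card: the source is the user. Downlink card: the destination is.
        return flow["src_ip"] if via_uplink else flow["dst_ip"]
    # Single card: pick whichever address lies in the user network segment.
    segment = ipaddress.ip_network(user_segment)
    for candidate in (flow["src_ip"], flow["dst_ip"]):
        if ipaddress.ip_address(candidate) in segment:
            return candidate
    return None
```

For example, with a user segment of `192.168.1.0/24`, a request from `192.168.1.20` to `8.8.8.8` and the reply in the opposite direction both resolve to the same target user, `192.168.1.20`.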
Optionally, the processing module is further configured to: acquiring target data stream records in a statistical time period from the data stream records of the target user; for at least one user attribute information in the target data stream record, performing: and determining the proportion of the statistical attribute information corresponding to the user attribute information in the total sum of the statistical attribute information recorded in the target data stream.
Optionally, the processing module is configured to: importing the data stream record of the target user stored in a storage module into a database in a preset time period; the preset time period is a time period when the network flow is lower than a flow threshold value.
As can be seen from the above, in the embodiment of the present invention, data stream information is acquired and its target user is determined; when a first record consistent with the user attribute information of the acquired data stream information is found in the data stream record of the target user, the statistical attribute information in the first record is updated according to the data stream information. The existing user information and data stream information therefore need not be stored repeatedly, which reduces the amount of stored data. Moreover, the data stream record of the target user serves as the storage unit, and the number of users in the whole network is far smaller than the number of data streams, so indexing by user improves query efficiency.
It should be apparent to those skilled in the art that embodiments of the present invention may be provided as a method or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A data processing method of a Deep Packet Inspection (DPI) device is characterized by comprising the following steps:
acquiring data flow information;
determining a target user of the data stream information according to the type of the network card and corresponding preset conditions; wherein the preset conditions include: under the condition that the network card type is determined to be an uplink/downlink dual network card, determining a source Internet Protocol (IP) address in data stream information passing through the uplink network card as the target user, and determining a destination IP address in the data stream information passing through the downlink network card as the target user; under the condition that the network card type is determined to be a single network card: if the source IP address in the data stream information is determined to be any IP address in the network segment where the target user is located, determining that the source IP address is the target user; if the destination IP address in the data stream information is determined to be any IP address in the network segment where the target user is located, determining the destination IP address to be the target user; the source IP address and the destination IP address are not in the same network segment;
acquiring a data stream record of the target user;
if it is determined that a first record consistent with the user attribute information of the data stream information exists in the data stream record, updating the statistical attribute information in the first record according to the data stream information; the user attribute information at least comprises one item or more items of user application types and user Uniform Resource Locators (URLs), and the first record is stored by taking a target user as a storage unit.
2. The method of claim 1, further comprising:
and if the first record does not exist in the data stream record, adding a second record in the data stream record of the target user according to the data stream information, wherein the statistical attribute information of the second record is determined according to the statistical attribute information in the data stream record.
3. The method of claim 1, wherein after updating the statistical attribute information in the first record based on the data flow information, further comprising:
acquiring target data stream records in a statistical time period from the data stream records of the target user;
for at least one user attribute information in the target data stream record, performing:
and determining the proportion of the statistical attribute information corresponding to the user attribute information in the total sum of the statistical attribute information recorded in the target data stream.
4. The method of any one of claims 1-3, further comprising:
importing the data stream record of the target user stored in the memory into a database in a preset time period; the preset time period is a time period when the network flow is lower than a flow threshold value.
5. A DPI device for data processing, comprising:
the storage module is used for storing data stream records of all users, the data stream record of each user comprises a plurality of data stream records, and the user attribute information of all the data stream records is not completely the same;
the processing module is used for acquiring data stream information and determining a target user of the data stream information according to the type of the network card and corresponding preset conditions; wherein the preset conditions include: under the condition that the network card type is determined to be an uplink/downlink dual network card, determining a source Internet Protocol (IP) address in data stream information passing through the uplink network card as the target user, and determining a destination IP address in the data stream information passing through the downlink network card as the target user; under the condition that the network card type is determined to be a single network card: if the source IP address in the data stream information is determined to be any IP address in the network segment where the target user is located, determining that the source IP address is the target user; if the destination IP address in the data stream information is determined to be any IP address in the network segment where the target user is located, determining the destination IP address to be the target user; the source IP address and the destination IP address are not in the same network segment; acquiring a data stream record of the target user from the storage module; if it is determined that a first record consistent with the user attribute information of the data stream information exists in the data stream record, updating the statistical attribute information in the first record according to the data stream information, wherein the user attribute information at least comprises one or more items of a user application type and a user Uniform Resource Locator (URL), and the first record is stored by taking a target user as a storage unit.
6. The device of claim 5, wherein the processing module is further to:
and if the first record does not exist in the data stream record, adding a second record in the data stream record of the target user according to the data stream information, wherein the statistical attribute information of the second record is determined according to the statistical attribute information in the data stream record.
7. The device of claim 5, wherein the processing module is further to:
acquiring target data stream records in a statistical time period from the data stream records of the target user;
for at least one user attribute information in the target data stream record, performing:
and determining the proportion of the statistical attribute information corresponding to the user attribute information in the total sum of the statistical attribute information recorded in the target data stream.
8. The device of any one of claims 5-7, wherein the processing module is further to:
importing the data stream record of the target user stored in the storage module into a database in a preset time period; the preset time period is a time period when the network flow is lower than a flow threshold value.
9. A computer-readable storage medium having computer-executable instructions stored thereon for causing a computer to perform the method of any one of claims 1 to 4.
10. A computer device, comprising:
a storage module for storing program instructions;
a processing module for calling the program instructions stored in the storage module and executing the method according to any one of claims 1 to 4 according to the obtained program.
CN201710725583.8A 2017-08-22 2017-08-22 Data processing method of DPI equipment and related DPI equipment Active CN109428774B (en)

Publications (2)

Publication Number Publication Date
CN109428774A CN109428774A (en) 2019-03-05
CN109428774B true CN109428774B (en) 2020-12-22

Family

ID=65497376

Country Status (1)

Country Link
CN (1) CN109428774B (en)

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant