WO2021218528A1 - Traffic identification method and traffic identification device - Google Patents

Traffic identification method and traffic identification device

Info

Publication number
WO2021218528A1
Authority
WO
WIPO (PCT)
Prior art keywords
traffic
time interval
arrival time
probability
identification device
Prior art date
Application number
PCT/CN2021/083803
Other languages
English (en)
French (fr)
Inventor
刘文倩
胡新宇
吴俊�
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP21796188.7A (patent EP4131873A4)
Publication of WO2021218528A1
Priority to US18/050,775 (patent US20230079312A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/28 Flow control; Congestion control in relation to timing considerations
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/30 Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers
    • A63F13/35 Details of game servers
    • A63F13/358 Adapting the game course according to the network or server load, e.g. for reducing latency due to different connection speeds between clients
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2483 Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/30 Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers
    • A63F13/35 Details of game servers
    • A63F13/352 Details of game servers involving special game server arrangements, e.g. regional servers connected to a national server or a plurality of servers managing partitions of the game world
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H04L43/062 Generation of reports related to network traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/38 Flow based routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2441 Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2475 Traffic characterised by specific attributes, e.g. priority or QoS for supporting traffic characterised by the type of applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2416 Real-time traffic

Definitions

  • This application relates to the field of communications, and in particular to a flow identification method and flow identification device.
  • the online game business is developing rapidly on a global scale, and the number of users continues to rise.
  • the traffic identification device analyzes the features of packets extracted from game traffic and uses machine learning and statistical learning methods to infer the application type of the service, so that service assurance can be provided. Therefore, how to accurately identify online game traffic, in order to support network management and network planning and to improve network service quality, has become a research focus in the field of network management.
  • the flow identification device captures the session packets with keywords, extracts the tuple information of the communicating parties from it, and establishes a five-tuple rule base. Then, the flow identification device uses the flow classification algorithm to match the five-tuple prefix of the packets included in the flow, and completes the identification and classification of the game flow through the matching algorithm.
  • the flow identification device realizes the identification and classification of the flow by matching the five-tuple prefix of the packets included in the flow.
  • the accuracy of identifying the game traffic based on the tuple information of the communicating parties is low, and the game traffic cannot be accurately identified.
  • the embodiments of the present application provide a traffic identification method and a traffic identification device, which are used to accurately identify game traffic and improve the recognition accuracy of game traffic.
  • the first aspect of the embodiments of the present application provides a traffic identification method, which includes:
  • the traffic identification device obtains the traffic to be analyzed of the target data flow; the traffic identification device then obtains the arrival time intervals of the packets of the traffic to be analyzed, and determines the type of the traffic to be analyzed according to the distribution characteristics of the probabilities of some or all of the arrival time intervals.
  • the probability distribution of the arrival time intervals of the packets of each type of traffic follows a certain distribution law. Therefore, the traffic identification device can accurately identify the type of the traffic to be analyzed from the distribution characteristics of the probabilities of the arrival time intervals of its packets. For example, when the distribution characteristics of the probabilities of some or all of the arrival time intervals of the packets of the traffic to be analyzed are more consistent with those of the packets of game traffic, the traffic identification device can determine that the traffic to be analyzed is game traffic, so that game traffic is accurately identified and the recognition accuracy of game traffic is improved.
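As an illustration of the feature described above, the following minimal Python sketch derives arrival time intervals from packet arrival timestamps and estimates the probability of each interval by binning. The function names, the 1 ms bin width, and the sample timestamps are assumptions made for illustration; the application does not prescribe a particular binning scheme.

```python
from collections import Counter


def arrival_intervals(timestamps_ms):
    """Return the intervals (ms) between consecutive packet arrival times."""
    ordered = sorted(timestamps_ms)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def interval_probabilities(intervals_ms, bin_width_ms=1.0):
    """Map each bin's lower edge (ms) to the fraction of intervals in that bin.

    The bin width is a hypothetical choice, not a value taken from the source.
    """
    if not intervals_ms:
        return {}
    counts = Counter(int(iv // bin_width_ms) for iv in intervals_ms)
    total = len(intervals_ms)
    return {b * bin_width_ms: c / total for b, c in sorted(counts.items())}


# Illustrative arrival timestamps (ms) for one data flow.
timestamps = [1.0, 1.8, 3.1, 3.4, 9.0]
intervals = arrival_intervals(timestamps)
probs = interval_probabilities(intervals)
```

The resulting mapping from interval bin to probability is the kind of empirical distribution that could then be compared against a reference distribution for a known traffic type.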
  • the traffic identification device determining the type of the traffic to be analyzed according to the distribution characteristics of the probabilities of some or all of the arrival time intervals includes: the traffic identification device determines the type of the traffic to be analyzed according to the similarity between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the distribution characteristics of the probabilities of the arrival time intervals of packets of historical traffic of the first type.
  • a specific method by which the traffic identification device identifies the type of the traffic to be analyzed is provided: the distribution characteristics of the probabilities of the arrival time intervals of packets of the traffic to be analyzed are compared with those of a known type of traffic, and the similarity between the two determines whether the traffic to be analyzed is traffic of the first type.
  • the traffic identification device determining the type of the traffic to be analyzed according to the similarity between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the distribution characteristics of the probabilities of the arrival time intervals of packets of historical traffic of the first type includes: when the similarity is higher than the first similarity, the traffic identification device determines that the traffic to be analyzed is traffic of the first type.
  • the similarity is characterized by the degree of fit between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model, and the reference time interval probability distribution model is used to characterize the distribution characteristics of the probabilities of the arrival time intervals of packets of historical traffic of the first type; the traffic identification device determining, when the similarity is higher than the first similarity, that the traffic to be analyzed is traffic of the first type includes: when the degree of fit is higher than the first degree of fit, the traffic identification device determines that the traffic to be analyzed is traffic of the first type.
  • the degree of fit between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model is used to characterize the similarity, so that the traffic to be analyzed can be identified by evaluating the degree of fit.
  • the degree of fit is characterized by the reciprocal of the relative entropy and the Kolmogorov-Smirnov (KS) test statistic.
  • the degree of fit is the first degree of fit; when the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
  • two specific parameters characterizing the degree of fit are provided, and the ranges of these two parameters determine the degree of fit between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the distribution characteristics of the probabilities of the arrival time intervals of packets of historical traffic of the first type.
  • the traffic identification device determines that the traffic to be analyzed is the first type of traffic.
  • the traffic identification device determining the type of the traffic to be analyzed according to the similarity between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the distribution characteristics of the probabilities of the arrival time intervals of packets of historical traffic of the first type includes: the traffic identification device calculates fitting parameters according to some or all of the arrival time intervals, the probabilities of those arrival time intervals, and the reference time interval probability distribution model, where the reference time interval probability distribution model is used to characterize the distribution characteristics of the probabilities of the arrival time intervals of packets of historical traffic of the first type, and the fitting parameters indicate the degree of fit between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model; the traffic identification device then determines the type of the traffic to be analyzed according to the fitting parameters.
  • an implementation manner of characterizing the similarity through fitting parameters, and a specific manner of calculating the fitting parameters, are shown. The fitting parameters characterize the degree of fit between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model, so that the type of the traffic to be analyzed can be accurately determined from the fitting parameters and the accuracy of traffic identification is improved.
  • the fitting parameters include the reciprocal of the relative entropy and the KS test statistic; the traffic identification device determining the type of the traffic to be analyzed according to the fitting parameters includes: when the traffic identification device determines that the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, it determines that the traffic to be analyzed is traffic of the first type.
  • two specific forms of fitting parameters are provided, namely the reciprocal of the relative entropy and the KS test statistic, and the process of identifying the type of the traffic to be analyzed through these two parameters is shown, which improves the feasibility of the solution.
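The decision rule built on these two fitting parameters can be sketched as follows. This is a minimal illustration under stated assumptions: both distributions are given as probability lists over the same interval bins, the relative entropy is taken as D(observed || reference) (the source does not state the direction), the KS test statistic is computed on the binned cumulative distributions, and the threshold values in the example are hypothetical.

```python
import math


def relative_entropy(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) over aligned bins.

    eps guards against division by zero when the reference bin is empty.
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def ks_statistic(p, q):
    """Largest absolute gap between the two cumulative distributions."""
    cum_p = cum_q = gap = 0.0
    for pi, qi in zip(p, q):
        cum_p += pi
        cum_q += qi
        gap = max(gap, abs(cum_p - cum_q))
    return gap


def is_first_type(observed, reference, min_inverse_entropy, max_ks):
    """True when 1/relative entropy exceeds the first threshold and the
    KS test statistic is below the second threshold."""
    divergence = relative_entropy(observed, reference)
    inverse = math.inf if divergence <= 0 else 1.0 / divergence
    return inverse > min_inverse_entropy and ks_statistic(observed, reference) < max_ks


# Hypothetical binned probabilities and thresholds, for illustration only.
reference = [0.55, 0.28, 0.17]   # reference model for the first type
observed = [0.50, 0.30, 0.20]    # traffic that resembles the reference
other = [0.10, 0.20, 0.70]       # traffic that does not
close_match = is_first_type(observed, reference, min_inverse_entropy=10, max_ks=0.1)
poor_match = is_first_type(other, reference, min_inverse_entropy=10, max_ks=0.1)
```

Requiring both conditions at once means a flow is accepted only when the distributions agree in overall shape (small relative entropy) and nowhere diverge strongly in cumulative terms (small KS test statistic).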
  • the first type of traffic is game traffic or video traffic.
  • the traffic identification method in the embodiment of the present application is suitable for the identification of game traffic and/or the identification of video traffic, and may also be suitable for the identification of other types of traffic.
  • the method further includes: the traffic identification device obtains historical traffic of the first type; the traffic identification device obtains the arrival time intervals of the packets of the historical traffic and the probabilities of the arrival time intervals of the packets of the historical traffic.
  • the traffic identification device establishes the reference time interval probability distribution model according to the arrival time interval of the historical traffic packets and the probability of the arrival time interval of the historical traffic packets.
  • the traffic identification device can also establish the reference time interval probability distribution model based on the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals, so that the similarity between the reference time interval probability distribution model and the distribution characteristics of the probabilities of some or all of the arrival time intervals of the packets of the traffic to be analyzed can be used to determine whether the traffic to be analyzed is traffic of the first type, thereby accurately identifying the type of the traffic to be analyzed.
  • the reference time interval probability distribution model includes any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
  • multiple possible forms of the reference time interval probability distribution model are provided.
  • a model that is more consistent with the distribution characteristics of the probability distribution of the arrival time interval of the message corresponding to the traffic type should be selected in combination with the traffic type and experimental results.
  • the flow identification device can select the power-law distribution model as the reference time interval probability distribution model.
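One way to build a power-law reference model from historical traffic is sketched below. The fitting method is an assumption: the source names the power-law model but does not say how its parameters are estimated, so this sketch uses an ordinary least-squares fit in log-log space, with hypothetical function names and synthetic data.

```python
import math


def fit_power_law(intervals_ms, probabilities):
    """Fit p(t) = c * t**(-alpha) by least squares on log p = log c - alpha * log t.

    Returns (c, alpha). Points with a non-positive interval or probability
    are skipped because their logarithm is undefined.
    """
    points = [(math.log(t), math.log(p))
              for t, p in zip(intervals_ms, probabilities) if t > 0 and p > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two positive (interval, probability) points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("intervals must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def power_law_model(c, alpha, intervals_ms):
    """Evaluate the fitted model on the given intervals, normalised to sum to 1."""
    raw = [c * t ** -alpha for t in intervals_ms]
    total = sum(raw)
    return [value / total for value in raw]


# Synthetic "historical" distribution that follows 0.5 * t**-1.5 exactly.
hist_intervals = [1.0, 2.0, 4.0, 8.0]
hist_probs = [0.5 * t ** -1.5 for t in hist_intervals]
c, alpha = fit_power_law(hist_intervals, hist_probs)
reference_model = power_law_model(c, alpha, hist_intervals)
```

The normalised output of `power_law_model` can serve as the reference distribution over the same interval bins that are used for the traffic to be analyzed.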
  • when the traffic identification device determines the type of the traffic to be analyzed according to the distribution characteristics of the probabilities of some of the arrival time intervals, those arrival time intervals are the ones, among the arrival time intervals of the packets of the traffic to be analyzed, that are less than the third preset threshold.
  • the probabilities of the smaller arrival time intervals among the arrival time intervals of the packets of the traffic to be analyzed can, to a certain extent, indicate the type of the traffic to be analyzed.
  • the method further includes: the traffic identification device determines the first packets of the traffic to be analyzed, where the arrival time interval of each first packet is less than the third preset threshold; the traffic identification device determines the ratio of the number of first packets to the total number of packets included in the traffic to be analyzed; when the traffic identification device determines that the ratio is greater than the fourth preset threshold, the traffic identification device triggers execution of the step of determining the type of the traffic to be analyzed according to the distribution characteristics of the probabilities of some or all of the arrival time intervals.
  • in this way, the traffic identification device avoids performing further identification on all network traffic flowing through it, which improves the efficiency of traffic identification.
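The pre-filter described above can be sketched as a single check performed before the more expensive distribution comparison. The function name, the argument names, and the numeric thresholds are hypothetical; only the rule itself (a ratio of short-interval packets above a preset threshold) comes from the text.

```python
def should_run_distribution_check(intervals_ms, total_packets,
                                  max_interval_ms, min_ratio):
    """Return True when enough packets have a short arrival time interval.

    max_interval_ms plays the role of the third preset threshold and
    min_ratio the role of the fourth preset threshold.
    """
    if total_packets <= 0:
        return False
    short_packets = sum(1 for iv in intervals_ms if iv < max_interval_ms)
    return short_packets / total_packets > min_ratio


# Illustrative values: six packets yield five arrival time intervals.
sample_intervals = [0.8, 1.3, 0.3, 5.6, 0.4]
run_check = should_run_distribution_check(sample_intervals, total_packets=6,
                                          max_interval_ms=2.0, min_ratio=0.5)
skip_check = should_run_distribution_check(sample_intervals, total_packets=6,
                                           max_interval_ms=2.0, min_ratio=0.7)
```

Flows that fail this check are left unclassified by the distribution comparison, which is where the efficiency gain comes from.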
  • a second aspect of the embodiments of the present application provides a flow identification device, and the flow identification device includes:
  • the first obtaining unit is used to obtain the to-be-analyzed traffic of the target data stream
  • the second acquiring unit is configured to acquire the arrival time interval of the packets of the traffic to be analyzed
  • the first determining unit is configured to determine the type of traffic to be analyzed according to the distribution characteristics of the probability of part or all of the arrival time interval in the arrival time interval.
  • the first determining unit is specifically configured to:
  • the type of traffic to be analyzed is determined according to the similarity between the distribution feature of the probability of a part or all of the arrival time interval in the arrival time interval and the distribution feature of the probability of the arrival time interval of the first type of historical traffic.
  • the first determining unit is specifically configured to:
  • when the similarity is higher than the first similarity, determine that the traffic to be analyzed is traffic of the first type.
  • the similarity is characterized by the degree of fit between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model, and the reference time interval probability distribution model is used to characterize the distribution characteristics of the probabilities of the arrival time intervals of packets of historical traffic of the first type; the first determining unit is specifically configured to:
  • when the degree of fit is higher than the first degree of fit, determine that the traffic to be analyzed is traffic of the first type.
  • the degree of fit is characterized by the reciprocal of the relative entropy and the KS test value.
  • the degree of fit is the first degree of fit; when the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
  • the first determining unit is specifically configured to:
  • calculate fitting parameters according to some or all of the arrival time intervals, the probabilities of those arrival time intervals, and the reference time interval probability distribution model, where the reference time interval probability distribution model is used to characterize the distribution characteristics of the probabilities of the arrival time intervals of packets of historical traffic of the first type, and the fitting parameters indicate the degree of fit between the distribution characteristics of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model;
  • the type of the flow to be analyzed is determined according to the fitting parameters.
  • the fitting parameters include the reciprocal of the relative entropy and the Kolmogorov-Smirnov (KS) test statistic; the first determining unit is specifically configured to:
  • when the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, determine that the traffic to be analyzed is traffic of the first type.
  • the first type of traffic is game traffic or video traffic.
  • the first obtaining unit is further used for:
  • the second acquiring unit is also used for:
  • the flow identification device also includes an establishment unit;
  • the establishing unit is configured to establish the reference time interval probability distribution model according to the arrival time interval of the historical traffic packet and the probability of the arrival time interval of the historical traffic packet.
  • the reference time interval probability distribution model includes any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
  • when the first determining unit determines the type of the traffic to be analyzed according to the distribution characteristics of the probabilities of some of the arrival time intervals, those arrival time intervals are the ones, among the arrival time intervals of the packets of the traffic to be analyzed, that are less than the third preset threshold.
  • the traffic identification device further includes a second determining unit and a triggering unit;
  • the second determining unit is configured to determine the first packets of the traffic to be analyzed, where the arrival time interval of each first packet is less than the third preset threshold, and to determine the ratio of the number of first packets to the total number of packets included in the traffic to be analyzed;
  • the triggering unit is configured to, when the traffic identification device determines that the ratio is greater than the fourth preset threshold, trigger the first determining unit to perform the step of determining the type of the traffic to be analyzed according to the distribution characteristics of the probabilities of some or all of the arrival time intervals.
  • a third aspect of the embodiments of the present application provides a traffic identification device, which includes a processor, a memory, an input/output device, and a bus; the memory stores computer instructions; when the processor executes the computer instructions in the memory, the processor implements any one of the implementation manners of the first aspect.
  • the processor, the memory, and the input/output device are respectively connected to the bus.
  • the fourth aspect of the embodiments of the present application provides a chip system that includes a processor for supporting a network device in implementing the functions involved in the above-mentioned first aspect, for example, sending or processing the data and/or information involved in the above-mentioned method.
  • the chip system further includes a memory, and the memory is used to store necessary program instructions and data of the network device.
  • the chip system can be composed of chips, and can also include chips and other discrete devices.
  • the fifth aspect of the embodiments of the present application provides a computer program product including instructions, which is characterized in that when it runs on a computer, the computer is caused to execute any one of the implementation manners in the first aspect.
  • the sixth aspect of the embodiments of the present application provides a computer-readable storage medium, which is characterized by including instructions, which when run on a computer, cause the computer to execute any implementation manner as in the first aspect.
  • the traffic identification device obtains the traffic to be analyzed of the target data flow; the traffic identification device then obtains the arrival time intervals of the packets of the traffic to be analyzed, and determines the type of the traffic to be analyzed according to the distribution characteristics of the probabilities of some or all of the arrival time intervals. The probability distribution of the arrival time intervals of the packets of each type of traffic follows a certain distribution law, so the traffic identification device can accurately identify the type of the traffic to be analyzed from the distribution characteristics of the probabilities of the arrival time intervals of its packets. For example, when those distribution characteristics are more consistent with the distribution characteristics of game traffic, the traffic identification device can determine that the traffic to be analyzed is game traffic, so that game traffic is accurately identified and the recognition accuracy of game traffic is improved.
  • Figure 1 is a schematic diagram of a framework of an embodiment of the application
  • FIG. 2A is a schematic diagram of an embodiment of a traffic identification method according to an embodiment of this application.
  • FIG. 2B is a schematic diagram of a scenario of a traffic identification method according to an embodiment of this application.
  • FIG. 3A is a schematic diagram of another embodiment of a traffic identification method according to an embodiment of this application.
  • FIG. 3B is a schematic diagram of the probability distribution of the arrival time interval of packets of the game traffic of the game application "Glory of the King" according to an embodiment of the application;
  • FIG. 3C is a schematic diagram of the distribution of the reciprocal of the relative entropy and the KS test statistic corresponding to the traffic to be analyzed according to an embodiment of the application;
  • FIG. 3D is a schematic diagram of the probability distribution of the arrival time interval of packets of the video traffic of the video application "iqiyi" according to an embodiment of the application;
  • FIG. 4 is a schematic diagram of another embodiment of a traffic identification method according to an embodiment of this application.
  • FIG. 5 is a schematic structural diagram of a traffic identification device according to an embodiment of this application.
  • FIG. 6 is a schematic diagram of another structure of a traffic identification device according to an embodiment of the application.
  • the embodiments of the present application provide a traffic identification method and a traffic identification device, which are used to accurately identify game traffic and improve the recognition accuracy of game traffic.
  • FIG. 1 is a schematic diagram of a framework of an embodiment of the application.
  • the game traffic of the game service flow will flow through the network devices of each layer in the network during network transmission.
  • the game server sends the game service stream
  • the game traffic of the game service stream flows through the broadband remote access server (BRAS), the optical line terminal (OLT), and the optical network terminal (ONT) shown in Figure 1.
  • BRAS broadband remote access server
  • OLT optical line terminal
  • ONT optical network terminal
  • the technical solution of the embodiments of the present application proposes to integrate the traffic identification device into, or deploy it in bypass mode alongside, any of the network devices deployed at each layer of the network. The traffic identification device collects the network traffic of each type of terminal device (for example, mobile phones, personal computers, and televisions) and identifies the type of that network traffic, so that machine learning and statistical learning methods can be used to infer the service type corresponding to the network traffic and service assurance can be provided according to the priority of the service type.
  • the traffic identification device may be integrated, at a single point or at multiple points, into the network devices deployed at each layer, or it may be deployed in bypass mode alongside those network devices; this application does not specifically limit this.
  • FIG. 1 only shows the application scenario of identifying game traffic, and it is also applicable to scenarios that are not shown in this application and have similar requirements or the same requirements, and this application does not limit it.
  • the technical solutions of the embodiments of the present application are also applicable to application scenarios where the traffic identification device is used to identify video traffic.
  • the technical solutions of the embodiments of the present application are also applicable to application scenarios where the flow identification device is used for both the identification of game flow and the identification of video flow.
  • FIG. 2A is a schematic diagram of an embodiment of a traffic identification method according to an embodiment of this application.
  • the method includes:
  • the flow identification device obtains the flow to be analyzed of the target data flow.
  • the integrated deployment of the traffic identification equipment on the optical network terminal is taken as an example for description.
  • the traffic identification device can obtain the network traffic.
  • the flow identification device determines that the network flow belongs to the flow of the target data flow according to the five-tuple of the network flow.
  • the flow identification device uses the network flow as the flow to be analyzed; or, the flow identification device uses part of the network flow as the flow to be analyzed.
  • the part of the traffic includes the first m packets in the network traffic, and m is an integer greater than zero.
  • The traffic identification device identifies the traffic of the target data flow from the network traffic by means of the five-tuple, and then takes part or all of the traffic of the target data flow as the traffic to be analyzed.
  • network traffic includes the traffic of data stream A and the traffic of data stream B.
  • Data flow A is the target data flow.
  • The traffic identification device identifies the traffic of data flow A in the network traffic by means of the five-tuple, and then takes some or all of the identified traffic of data flow A as the traffic to be analyzed.
  • the flow identification device obtains the arrival time interval of the packet of the flow to be analyzed.
  • the message arrival time interval is the time interval for the flow identification device to continuously receive two messages of the same data flow.
  • The arrival time interval of a message can be the interval between the time when the traffic identification device receives the message and the time when it receives the next message, or it can be the interval between the time when the traffic identification device receives the message and the time when it received the previous message.
  • message 1, message 2, and message 3 are three messages of the target data flow continuously received by the flow identification device.
  • The traffic identification device receives message 1 at 1 ms (millisecond).
  • The traffic identification device receives message 2 at 1.8 ms.
  • The traffic identification device receives message 3 at 3 ms.
  • The arrival time interval of message 2 can be understood as the time difference between the time point at which the traffic identification device receives message 3 and the time point at which it receives message 2, that is, the arrival time interval of message 2 is 1.2 ms.
  • The arrival time interval of message 2 can also be understood as the time difference between the time point at which the traffic identification device receives message 2 and the time point at which it receives message 1, that is, the arrival time interval of message 2 is 0.8 ms.
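  • The two conventions can be checked numerically. The following is a minimal Python sketch, not part of the application: the function name `arrival_interval` and the 0-based packet index are illustrative assumptions, and the third receive time is taken as 3 ms, which is what the stated 1.2 ms interval implies.

```python
from typing import Sequence


def arrival_interval(timestamps_ms: Sequence[float], k: int, use_next: bool) -> float:
    """Arrival time interval of packet k (0-based) within one data flow.

    use_next=True : receive time of the next packet minus that of packet k.
    use_next=False: receive time of packet k minus that of the previous packet.
    """
    neighbour = k + 1 if use_next else k - 1
    if not (0 <= k < len(timestamps_ms) and 0 <= neighbour < len(timestamps_ms)):
        raise IndexError("packet k has no neighbour under this convention")
    return abs(timestamps_ms[neighbour] - timestamps_ms[k])


# Receive times of message 1, message 2 and message 3, in milliseconds.
times = [1.0, 1.8, 3.0]
gap_to_next = arrival_interval(times, 1, use_next=True)       # message 2 to message 3
gap_to_previous = arrival_interval(times, 1, use_next=False)  # message 1 to message 2
```

  • Whichever convention is chosen, it should be applied consistently to every message of the flow so that the resulting intervals are comparable.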
  • the flow identification device determines the type of the flow to be analyzed according to the distribution characteristics of the probability of part or all of the time intervals in the arrival time interval of the packet of the flow to be analyzed.
  • The probability of an arrival time interval refers to the probability that this arrival time interval appears among the arrival time intervals of the packets of the traffic to be analyzed. There are many ways to calculate the probability of an arrival time interval. Examples follow:
  • The probability of an arrival time interval is the proportion of the packets of the traffic to be analyzed that have this arrival time interval.
  • That is, the probability of the arrival time interval is the ratio of the number of packets with this arrival time interval to the total number of packets of the traffic to be analyzed.
  • For example, for the probability of arrival time interval A: the total number of packets of the traffic to be analyzed is M, and the M packets include N packets with arrival time interval A, so the probability of arrival time interval A is N/M.
  • A is greater than 0
  • M is an integer greater than 0
  • N is an integer greater than 0 and less than M.
  • Alternatively, the probability of an arrival time interval is the proportion of this arrival time interval among the arrival time intervals of the packets of the traffic to be analyzed.
  • That is, the probability of the arrival time interval is the ratio of the number of occurrences of this arrival time interval to the total number of arrival time intervals of the packets of the traffic to be analyzed.
  • the arrival time intervals of the packets of the traffic to be analyzed are 1 ms, 2 ms, 2 ms, 3 ms, and 5 ms, respectively.
  • the flow identification device determines that the total number of arrival time intervals of packets of the flow to be analyzed is 5.
  • If the arrival time interval A is 2 ms, the traffic identification device determines that two of the arrival time intervals of the packets of the traffic to be analyzed are 2 ms. It can be seen that the probability of the 2 ms arrival time interval is 40%.
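  • The occurrence-ratio calculation can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `interval_probabilities` is an assumption, and it assumes the intervals have already been rounded or binned to discrete values (for example, whole milliseconds), which the text does not specify.

```python
from collections import Counter
from typing import Dict, Sequence


def interval_probabilities(intervals_ms: Sequence[float]) -> Dict[float, float]:
    """Map each distinct arrival time interval to its share of all intervals."""
    if not intervals_ms:
        raise ValueError("at least one arrival time interval is required")
    counts = Counter(intervals_ms)
    total = len(intervals_ms)
    return {value: count / total for value, count in counts.items()}


# The five intervals from the example: 1 ms, 2 ms, 2 ms, 3 ms and 5 ms.
probabilities = interval_probabilities([1, 2, 2, 3, 5])
share_of_2ms = probabilities[2]  # 2 occurrences out of 5 intervals
```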
  • the type of the traffic to be analyzed is the first type of traffic or the second type of traffic.
  • the first type of traffic includes game traffic, and the second type of traffic is non-game traffic; or, the first type of traffic is video traffic, and the second type of traffic is non-video traffic.
  • the non-video traffic is download data traffic.
  • The partial arrival time intervals include those arrival time intervals of the packets of the traffic to be analyzed that are less than the third preset threshold.
  • the arrival time intervals of the packets of the traffic to be analyzed are 1ms, 2ms, 2ms, 3ms, 5ms, 9ms, 20ms, and 30ms, respectively.
  • If the third preset threshold is 10 ms, the partial arrival time intervals are 1 ms, 2 ms, 2 ms, 3 ms, 5 ms, and 9 ms, of which 1 ms, 3 ms, 5 ms, and 9 ms each appear once and 2 ms appears twice. Since the arrival time intervals of the packets of game traffic or video traffic are relatively small, the traffic identification device can determine the type of the traffic to be analyzed from the distribution characteristics of the probabilities of the smaller arrival time intervals among the arrival time intervals of the packets of the traffic to be analyzed.
  • the traffic identification device can determine that the traffic to be analyzed is game traffic, thereby accurately identifying game traffic, and improving the recognition accuracy of game traffic.
  • step 203 specifically includes step 203a.
  • Step 203a: The traffic identification device determines the type of the traffic to be analyzed according to the similarity between the distribution characteristics of the probabilities of part or all of the arrival time intervals of the packets of the traffic to be analyzed and the distribution characteristics of the probabilities of the arrival time intervals of the packets of the first type of historical traffic.
  • The traffic identification device may compare the distribution characteristics of the probabilities of some or all of the arrival time intervals of the packets of the traffic to be analyzed with the distribution characteristics of the probabilities of the arrival time intervals of the packets of the first type of historical traffic, and determine from their similarity whether the traffic to be analyzed is the first type of traffic.
  • the traffic identification device determines that the traffic to be analyzed is the traffic of the first type.
  • The similarity is characterized by the degree of fit between the distribution characteristics of the probabilities of the part or all of the arrival time intervals and the reference time interval probability distribution model, and the reference time interval probability distribution model is used to characterize the distribution characteristics of the probabilities of the arrival time intervals of packets of the first type of historical traffic.
  • the degree of fit is characterized by the reciprocal of the relative entropy and the KS test value.
  • the fit is the first degree of fit;
  • When the reciprocal of the relative entropy is greater than the first preset threshold and the KS test quantity is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
  • The traffic identification device obtains the traffic to be analyzed of the target data flow; then, the traffic identification device obtains the arrival time intervals of the packets of the traffic to be analyzed, and determines the type of the traffic to be analyzed according to the distribution characteristics of the probabilities of part or all of these arrival time intervals. Because the probability distribution of the arrival time intervals of each type of traffic follows a certain distribution law, the traffic identification device can accurately identify the type of the traffic to be analyzed from the distribution characteristics of the probabilities of the arrival time intervals of its packets.
  • The traffic identification device can determine that the traffic to be analyzed is game traffic, so that game traffic can be accurately identified and the recognition accuracy of game traffic can be improved.
  • step 202a in the embodiment shown in FIG. 2A has a specific execution process, and step 202a specifically includes step 3001 and step 3002.
  • FIG. 3A is a schematic diagram of another embodiment of a traffic identification method according to an embodiment of this application.
  • the flow identification device calculates fitting parameters according to the part or all of the arrival time interval, the probability of the part or all of the arrival time interval, and a reference time interval probability distribution model.
  • the fitting parameter is used to indicate the degree of fit between the distribution feature of the probability of the part or all of the arrival time interval and the probability distribution model of the reference time interval.
  • the reference time interval probability distribution model is used to characterize the distribution characteristics of the probability of the arrival time interval of packets of the first type of traffic.
  • the reference time interval probability distribution model is a model trained based on the first type of historical traffic.
  • The traffic identification device uses the reference time interval probability distribution model, together with the part or all of the arrival time intervals and the probabilities of the part or all of the arrival time intervals, to calculate the fitting parameters.
  • the fitting parameter includes the reciprocal r of the relative entropy and/or the KS test quantity p.
  • The distribution characteristics of the probabilities of the part or all of the arrival time intervals are expressed as a function P(i), where P(i) is the actually measured probability of the arrival time interval of the i-th packet of the traffic to be analyzed received by the traffic identification device.
  • The reference time interval probability distribution model is the function Q(i), where Q(i) is the probability, calculated by the reference time interval probability distribution model, of the arrival time interval of the i-th packet of the traffic to be analyzed received by the traffic identification device.
  • The probabilities of all the arrival time intervals of the packets of the traffic to be analyzed are shown in FIG. 3B, where the abscissa is the arrival time interval of the packets of the traffic to be analyzed and the ordinate is the probability of the arrival time interval.
  • The traffic identification device takes the arrival time intervals corresponding to n packets as input parameters, inputs them into the reference time interval probability distribution model, and calculates the reference probabilities of the arrival time intervals corresponding to the n packets.
  • The traffic identification device substitutes the actually measured probabilities P(i) and the reference probabilities Q(i) of the arrival time intervals corresponding to the n packets into the relative entropy, which in its standard form is $D_{KL}(P\|Q) = \sum_{i=1}^{n} P(i)\,\log\frac{P(i)}{Q(i)}$, and then calculates the reciprocal of $D_{KL}(P\|Q)$ to obtain $r = 1/D_{KL}(P\|Q)$.
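  • As an illustration of the calculation of r, the following minimal Python sketch evaluates the standard relative entropy (Kullback-Leibler divergence) with the natural logarithm and returns its reciprocal. The function name, the choice of logarithm base, and the handling of a zero divergence are assumptions; the text does not state them.

```python
import math
from typing import Sequence


def reciprocal_relative_entropy(p: Sequence[float], q: Sequence[float]) -> float:
    """Return r = 1 / D_KL(P || Q) for measured P(i) and reference Q(i)."""
    if len(p) != len(q):
        raise ValueError("P and Q must have the same length")
    d_kl = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # a zero-probability term contributes nothing to the sum
        if q_i <= 0:
            raise ValueError("Q(i) must be positive wherever P(i) is positive")
        d_kl += p_i * math.log(p_i / q_i)
    # Assumption: a perfect match (zero divergence) is reported as infinity.
    return math.inf if d_kl == 0 else 1.0 / d_kl


measured = [0.5, 0.5]
r_close = reciprocal_relative_entropy(measured, [0.45, 0.55])  # reference close to measured
r_far = reciprocal_relative_entropy(measured, [0.25, 0.75])    # reference far from measured
```

  • The closer the reference probabilities are to the measured ones, the smaller the relative entropy and the larger r, which is why a large r indicates a good fit.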
  • The distribution characteristics of the probabilities of the part or all of the arrival time intervals are expressed as a function $F_N(x)$, where $F_N(x)$ is the actually measured probability of occurrence of arrival time interval x among the arrival time intervals of the packets of the traffic to be analyzed.
  • The reference time interval probability distribution model is expressed as a function $F(x)$, where $F(x)$ is the probability of occurrence of arrival time interval x among the arrival time intervals of the packets of the traffic to be analyzed, calculated by the reference time interval probability distribution model.
  • The KS test quantity is $p = D_n = \max_x |F_N(x) - F(x)|$, where $|c|$ means taking the absolute value of c.
  • The traffic to be analyzed includes n packets.
  • The traffic identification device takes the arrival time intervals of the n packets as x and substitutes them into $F(x)$ to obtain the reference probability of the arrival time interval of each of the n packets.
  • The actual probability of the arrival time interval of each packet (the value obtained by actual measurement) is known, so the traffic identification device calculates, for each of the n packets, the absolute value of the difference between the reference probability and the actual probability of its arrival time interval. The traffic identification device thus obtains n absolute values corresponding to the n packets and determines the maximum of these n absolute values. This maximum is $D_n$, that is, p.
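  • The maximum-difference procedure described above can be sketched as follows. This is an illustrative Python sketch that follows the text literally, comparing measured and reference probabilities point by point; a textbook Kolmogorov-Smirnov statistic compares cumulative distribution functions instead. The function name and the toy reference model are assumptions.

```python
from typing import Callable, Sequence


def ks_test_quantity(
    intervals: Sequence[float],
    measured: Sequence[float],
    reference_model: Callable[[float], float],
) -> float:
    """Return p = max |F_N(x) - F(x)| over the supplied arrival time intervals."""
    if len(intervals) != len(measured) or not intervals:
        raise ValueError("need one measured probability per interval")
    return max(abs(m - reference_model(x)) for x, m in zip(intervals, measured))


def toy_model(x: float) -> float:
    """Hypothetical reference model used only for this illustration."""
    return 0.6 * x ** -1.5


p_value = ks_test_quantity([1.0, 2.0, 3.0], [0.5, 0.3, 0.2], toy_model)
```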
  • The reference time interval probability distribution model may be a model trained by the traffic identification device according to the historical traffic of the first type, or a model trained by another device according to the historical traffic of the first type and then configured on the traffic identification device; there is no specific limitation here.
  • the reference time interval probability distribution model includes a power-law distribution model, a Gaussian distribution model, a normal distribution model, or a Poisson distribution model, which is not specifically limited in this application.
  • a model that is more consistent with the distribution characteristics of the probability distribution of the arrival time interval of the message corresponding to the traffic type should be selected in combination with the traffic type and experimental results. For example, for game traffic, the distribution characteristics of the probability of the arrival time interval of the messages of the game traffic are more consistent with the distribution characteristics of the power law distribution model. Therefore, when the traffic recognition device recognizes the game traffic, the power law distribution model can be selected as the reference time interval probability distribution model.
  • the multiple reference time interval probability distribution models include a first time interval probability distribution model and a second time interval probability distribution model.
  • The first time interval probability distribution model is used to characterize the distribution characteristics of the probabilities of the arrival time intervals of packets of game traffic.
  • The second time interval probability distribution model is used to characterize the distribution characteristics of the probabilities of the arrival time intervals of packets of video traffic.
  • The traffic identification device may first determine whether the traffic to be analyzed is game traffic based on the similarity between the first time interval probability distribution model and the distribution characteristics of the probabilities of the part or all of the arrival time intervals. If the similarity is high, the traffic identification device determines that the traffic to be analyzed is game traffic; if the similarity is low, the traffic identification device can use the second time interval probability distribution model and the distribution characteristics of the probabilities of the part or all of the arrival time intervals to determine whether the traffic to be analyzed is video traffic. If that similarity is high, the traffic identification device determines that the traffic to be analyzed is video traffic; if it is low, the traffic identification device determines that the traffic to be analyzed is neither game traffic nor video traffic.
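  • The model-by-model check described above amounts to trying each reference model in order and stopping at the first one that fits. The following Python sketch illustrates only that control flow; the function name and the two toy similarity checks are hypothetical stand-ins for the fit against a real reference model.

```python
from typing import Callable, List, Sequence, Tuple

# A matcher returns True when the measured intervals fit its reference model well enough.
Matcher = Callable[[Sequence[float]], bool]


def classify_in_order(intervals_ms: Sequence[float], models: List[Tuple[str, Matcher]]) -> str:
    """Try each (label, matcher) pair in order and return the first matching label."""
    for label, fits in models:
        if fits(intervals_ms):
            return label
    return "neither"


# Toy stand-ins for the two similarity checks (hypothetical rules, illustration only).
def looks_like_game(intervals_ms: Sequence[float]) -> bool:
    return max(intervals_ms) < 5


def looks_like_video(intervals_ms: Sequence[float]) -> bool:
    return max(intervals_ms) < 20


models = [("game", looks_like_game), ("video", looks_like_video)]
label = classify_in_order([1, 2, 3], models)
```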
  • the flow identification device determines the type of the flow to be analyzed according to the fitting parameters.
  • For the type of the traffic to be analyzed, refer to the related description of step 203 in the embodiment shown in FIG. 2A, which is not repeated here.
  • In step 3001, the fitting parameter includes the reciprocal r of the relative entropy and/or the KS test quantity p. In that case, step 3002 includes step 3002a to step 3002c.
  • Step 3002a The flow identification device determines whether r is greater than a first preset threshold and p is less than a second preset threshold; if yes, go to step 3002b; if not, go to step 3002c.
  • For example, the first preset threshold is 1000 and the second preset threshold is 0.00001.
  • The traffic identification device determines whether the calculated r is greater than 1000 and p is less than 0.00001. If so, the traffic identification device determines that the traffic to be analyzed is the first type of traffic; if not, the traffic identification device determines that the traffic to be analyzed is the second type of traffic.
  • As shown in FIG. 3C, with p as the abscissa and r as the ordinate, the r and p of the traffic to be analyzed corresponding to each of the multiple data streams acquired by the traffic identification device are plotted as coordinate points (p, r). From FIG. 3C, the game traffic and the non-game traffic among the traffic to be analyzed corresponding to the multiple data streams can be clearly distinguished.
  • the setting sizes of the first preset threshold and the second preset threshold can be specifically determined through experimental data.
  • Step 3002b The flow identification device determines that the flow to be analyzed is the first type of flow.
  • Step 3002c The flow identification device determines that the flow to be analyzed is the second type of flow.
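  • Steps 3002a to 3002c reduce to a single two-threshold comparison. The following Python sketch uses the example threshold values from the text as defaults; the function name and the returned labels are illustrative assumptions.

```python
FIRST_PRESET_THRESHOLD = 1000.0     # example value for r given in the text
SECOND_PRESET_THRESHOLD = 0.00001   # example value for p given in the text


def classify_by_fit(
    r: float,
    p: float,
    first_threshold: float = FIRST_PRESET_THRESHOLD,
    second_threshold: float = SECOND_PRESET_THRESHOLD,
) -> str:
    """Step 3002a: both conditions must hold for the first type (step 3002b)."""
    if r > first_threshold and p < second_threshold:
        return "first type"
    return "second type"  # step 3002c


decision = classify_by_fit(r=2500.0, p=0.000002)
```

  • Both thresholds are parameters rather than constants in the logic because, as the text notes, their values are determined from experimental data.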
  • Optionally, before step 202 in the embodiment shown in FIG. 2A, the embodiment shown in FIG. 2A further includes step 202a to step 202e.
  • Step 202a: The traffic identification device determines the first packets of the traffic to be analyzed.
  • the first packet is a packet whose arrival time interval is less than the third preset threshold among the packets of the traffic to be analyzed.
  • For example, the third preset threshold is 10 ms (milliseconds), and the traffic identification device uses the packets whose arrival time interval is less than 10 ms in the traffic to be analyzed as the first packets.
  • the size of the third preset threshold may be specifically determined according to the current network transmission state. For example, when the network transmission status is good, the third preset threshold is smaller; when the network transmission status is poor, the third preset threshold is larger.
  • the network transmission status can be specifically determined by the bandwidth and delay of the network transmission.
  • Step 202b The flow identification device determines the ratio of the number of first packets to the total number of packets extracted from the flow to be analyzed.
  • For example, the total number of packets of the traffic to be analyzed is M and the number of packets with an arrival time interval of less than 10 ms is L, so the ratio of the number of packets with an arrival time interval of less than 10 ms to the total number of packets of the traffic to be analyzed is L/M.
  • L is an integer greater than 0 and less than M.
  • Step 202c The flow identification device judges whether the ratio is greater than the fourth preset threshold, if yes, execute step 202d; if not, execute step 202e.
  • For example, the fourth preset threshold is 80%, and the traffic identification device determines whether the ratio is greater than 80%. If it is, the traffic identification device preliminarily determines that the traffic to be analyzed is the first type of traffic, and it can analyze the traffic to be analyzed further to accurately identify its type; if it is not, the traffic identification device determines that the traffic to be analyzed is the second type of traffic.
  • the setting of the fourth preset threshold may be determined through multiple experimental data.
  • Step 202d The flow identification device triggers the execution of the above step 203.
  • If the traffic identification device determines that the ratio is greater than the fourth preset threshold, it executes step 203 to further identify the traffic to be analyzed. The screening in step 202c spares the traffic identification device from further identifying all network traffic flowing through it, thereby improving the efficiency of traffic identification.
  • Step 202e The flow identification device determines that the flow to be analyzed is the second type of flow.
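  • The screening in steps 202a to 202e can be sketched as a simple ratio test. This Python sketch is illustrative only: the function name is an assumption, the default thresholds are the example values from the text, and it assumes one arrival time interval is available per counted packet.

```python
from typing import Sequence


def passes_screening(
    intervals_ms: Sequence[float],
    third_threshold_ms: float = 10.0,  # example value from the text
    fourth_threshold: float = 0.8,     # example value from the text (80%)
) -> bool:
    """Return True when step 203 should run, False when the traffic is the second type."""
    if not intervals_ms:
        return False
    # Steps 202a and 202b: count the packets whose interval is below the third threshold.
    first_packets = sum(1 for x in intervals_ms if x < third_threshold_ms)
    ratio = first_packets / len(intervals_ms)
    # Step 202c: compare the ratio with the fourth preset threshold.
    return ratio > fourth_threshold


mostly_small = [1.0] * 9 + [20.0]                          # 9 of 10 below 10 ms
mixed = [1.0, 2.0, 2.0, 3.0, 5.0, 9.0, 20.0, 30.0]         # 6 of 8 below 10 ms
run_step_203 = passes_screening(mostly_small)
```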
  • Optionally, before step 202 in the embodiment shown in FIG. 2A, the embodiment shown in FIG. 2A further includes step 202f to step 202h.
  • Step 202f The traffic identification device obtains the historical traffic of the first type
  • the historical traffic is game traffic or video traffic.
  • For example, the server labels the historical traffic, and the label indicates that the historical traffic is the first type of traffic.
  • When the traffic identification device receives the historical traffic, it determines from the label that the historical traffic is the first type of traffic.
  • the size of the historical traffic is generally selected to be 1 to 2 GB.
  • Step 202g The flow identification device determines the arrival time interval of the historical traffic packet and the probability of the arrival time interval of the historical traffic packet.
  • Step 202g is similar to step 202 in the embodiment shown in FIG. 2A.
  • Step 202h The traffic identification device establishes a reference time interval probability distribution model according to the arrival time interval of the historical traffic packet and the probability of the arrival time interval of the historical traffic packet.
  • The traffic identification device uses the arrival time intervals of the historical traffic packets as the abscissa and the probabilities of those arrival time intervals as the ordinate to obtain a probability distribution diagram of the arrival time intervals of the historical traffic packets. Then, the traffic identification device determines a candidate time interval probability distribution model according to this probability distribution diagram, and calculates the parameter values of the candidate time interval probability distribution model. The traffic identification device substitutes the parameter values into the candidate time interval probability distribution model to obtain the reference time interval probability distribution model. Optionally, the traffic identification device calculates the parameter values of the candidate time interval probability distribution model by using a maximum likelihood estimation method.
  • For example, the historical traffic is the game traffic of the game application "Honor of Kings". The traffic identification device uses the arrival time intervals of the historical traffic packets as the abscissa and the probabilities of those arrival time intervals as the ordinate to obtain the probability distribution diagram of the arrival time intervals of the historical traffic packets.
  • FIG. 3B is a schematic diagram of the probability distribution of the arrival time intervals of packets of the historical traffic of the game application "Honor of Kings". From the time interval probability distribution diagram shown in FIG. 3B, it can be seen that the distribution characteristics of the probabilities of the arrival time intervals of the historical traffic packets are more consistent with the distribution characteristics of the power law distribution model.
  • The traffic identification device therefore determines that the candidate time interval probability distribution model is a power law distribution model; it then calculates the parameter values of the power law distribution model by the maximum likelihood estimation method and substitutes the parameter values into the power law distribution model to obtain the reference time interval probability distribution model.
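  • As an illustration of fitting a power law by maximum likelihood, the following Python sketch uses the closed-form estimator for a continuous power law with density proportional to x^(-alpha) for x >= x_min. The text does not state which form of the power law or which lower cutoff is used, so the continuous form, the parameter `x_min`, and the function names are all assumptions.

```python
import math
from typing import Sequence


def fit_power_law_alpha(intervals_ms: Sequence[float], x_min: float) -> float:
    """Maximum likelihood exponent: alpha = 1 + n / sum(ln(x_i / x_min))."""
    if x_min <= 0:
        raise ValueError("x_min must be positive")
    tail = [x for x in intervals_ms if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one interval strictly greater than x_min")
    return 1.0 + len(tail) / log_sum


def power_law_density(x: float, alpha: float, x_min: float) -> float:
    """Reference density of the fitted continuous power law at interval x >= x_min."""
    return (alpha - 1.0) / x_min * (x / x_min) ** (-alpha)


# Toy historical intervals chosen so the logarithms are 0, 1 and 1.
alpha_hat = fit_power_law_alpha([1.0, math.e, math.e], x_min=1.0)
density_at_cutoff = power_law_density(1.0, alpha_hat, x_min=1.0)
```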
  • For example, the historical traffic is the video traffic of the video application "iqiyi". The traffic identification device uses the arrival time intervals of the packets of the historical traffic of "iqiyi" as the abscissa and the probabilities of those arrival time intervals as the ordinate to obtain the probability distribution diagram of the arrival time intervals of the packets of the historical traffic.
  • FIG. 3D is a schematic diagram of the probability distribution of the arrival time intervals of packets of the historical traffic of the video application "iqiyi". It can be seen from FIG. 3D that the distribution characteristics of the probabilities of the arrival time intervals of the historical traffic are more consistent with the distribution characteristics of the Gaussian distribution model.
  • The traffic identification device calculates the parameter values of the Gaussian distribution model by the maximum likelihood estimation method, and then substitutes the parameter values into the Gaussian distribution model to obtain the reference time interval probability distribution model.
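  • For a Gaussian model, the maximum likelihood estimates are the sample mean and the standard deviation computed with divisor n. The following Python sketch illustrates this; the function names and the toy data are assumptions and are not taken from the application.

```python
import math
from typing import Sequence, Tuple


def fit_gaussian(intervals_ms: Sequence[float]) -> Tuple[float, float]:
    """Return the maximum likelihood estimates (mu, sigma) of a Gaussian model."""
    n = len(intervals_ms)
    if n == 0:
        raise ValueError("at least one arrival time interval is required")
    mu = sum(intervals_ms) / n
    variance = sum((x - mu) ** 2 for x in intervals_ms) / n  # divisor n, not n - 1
    return mu, math.sqrt(variance)


def gaussian_density(x: float, mu: float, sigma: float) -> float:
    """Reference density of the fitted Gaussian model at interval x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Toy historical intervals in milliseconds.
mu_hat, sigma_hat = fit_gaussian([2, 4, 4, 4, 5, 5, 7, 9])
peak = gaussian_density(mu_hat, mu_hat, sigma_hat)
```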
  • The distribution characteristics of the probabilities of the arrival time intervals of the packets of game traffic are more consistent with the distribution characteristics of the power law distribution model. Therefore, when the traffic identification device identifies game traffic, it can accurately determine whether the traffic to be analyzed is game traffic by comparing the distribution characteristics of the probabilities of the arrival time intervals of the packets of the traffic to be analyzed with the distribution characteristics of the power law distribution model. Likewise, the distribution characteristics of the probabilities of the arrival time intervals of the packets of video traffic are more consistent with the distribution characteristics of the Gaussian distribution model, so by comparing the distribution characteristics of the probabilities of the arrival time intervals of the packets of the traffic to be analyzed with the distribution characteristics of the Gaussian distribution model, the traffic identification device can accurately determine whether the traffic to be analyzed is video traffic.
  • Step 202 in the embodiment shown in FIG. 2A specifically includes step 4001 to step 4004. Please refer to FIG. 4 for details.
  • FIG. 4 is a schematic diagram of another embodiment of a traffic identification method in an embodiment of this application. The method includes:
  • the traffic identification device determines a first quintuple of a second packet and a second quintuple of a third packet.
  • the first quintuple includes the source IP address, destination IP address, source port number, destination port number, and transmission protocol type of the second message.
  • The second quintuple includes the source IP address, destination IP address, source port number, destination port number, and transmission protocol type of the third message.
  • The traffic identification device obtains the first quintuple from the packet header of the second packet and the second quintuple from the packet header of the third packet.
  • the flow identification device determines, according to the first quintuple and the second quintuple, that the second packet and the third packet are two packets of the target data flow.
  • the flow identification device determines, according to the first moment and the second moment, that the second packet and the third packet are two packets of the target data flow continuously received by the flow identification device.
  • the first moment is the moment when the flow identification device receives the second packet
  • the second moment is the moment when the flow identification device receives the third packet
  • The traffic identification device determines the timestamp of each packet according to the order in which the packets arrive at the traffic identification device. From the timestamps, it can be determined that the traffic identification device receives the second packet at the first moment and receives the third packet at the second moment. Then, the traffic identification device determines according to the timestamps that the second packet and the third packet are two packets of the target data flow consecutively received by the traffic identification device.
  • the third message is the next message of the second message, or the previous message of the second message, and the details are not limited here.
  • the flow identification device uses the time difference between the first time and the second time as the arrival time interval of the second packet.
  • For step 4003 and step 4004, refer to the example of step 202 in the embodiment shown in FIG. 2A; details are not described herein again.
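  • Steps 4001 to 4004 can be sketched as grouping packets by five-tuple and differencing consecutive receive times within each group. The following Python sketch is illustrative only: the `Packet` record, its field names, and the documentation-range IP addresses are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    arrival_ms: float


FiveTuple = Tuple[str, str, int, int, str]


def intervals_per_flow(packets: List[Packet]) -> Dict[FiveTuple, List[float]]:
    """Group packets by five-tuple, then take gaps between consecutive arrivals."""
    times_by_flow: Dict[FiveTuple, List[float]] = defaultdict(list)
    for packet in sorted(packets, key=lambda pkt: pkt.arrival_ms):
        times_by_flow[packet[:5]].append(packet.arrival_ms)  # first five fields
    return {
        flow: [later - earlier for earlier, later in zip(times, times[1:])]
        for flow, times in times_by_flow.items()
    }


flow_a = ("192.0.2.10", "198.51.100.7", 5000, 443, "UDP")
flow_b = ("192.0.2.10", "198.51.100.9", 5001, 80, "TCP")
captured = [
    Packet(*flow_a, 1.0),
    Packet(*flow_b, 1.5),
    Packet(*flow_a, 2.0),
    Packet(*flow_b, 3.5),
    Packet(*flow_a, 4.0),
]
intervals = intervals_per_flow(captured)
```

  • Grouping before differencing matters: packets of different flows interleave on the wire, so gaps taken over the raw capture would not be arrival time intervals of any single data flow.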
  • FIG. 5 is a schematic structural diagram of a traffic identification device according to an embodiment of this application.
  • the flow identification device can be used to execute the steps performed by the flow identification device in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4.
  • The traffic identification device includes a first acquiring unit 501, a second acquiring unit 502, and a first determining unit 503.
  • The traffic identification device further includes an establishing unit 504, a second determining unit 505, and a trigger unit 506.
  • The first acquiring unit 501 is configured to acquire the traffic to be analyzed of a target data flow.
  • The second acquiring unit 502 is configured to acquire the arrival time intervals of the packets of the traffic to be analyzed.
  • The first determining unit 503 is configured to determine the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals.
  • The first determining unit 503 is specifically configured to:
  • determine the type of the traffic to be analyzed according to the similarity between the distribution feature of the probabilities of some or all of the arrival time intervals and the distribution feature of the probabilities of the arrival time intervals of packets of historical traffic of a first type.
  • The first determining unit 503 is specifically configured to:
  • when the similarity is higher than a first similarity, determine that the traffic to be analyzed is traffic of the first type.
  • The similarity is characterized by the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and a reference time interval probability distribution model, and the reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of the historical traffic of the first type; the first determining unit 503 is specifically configured to:
  • when the degree of fit is higher than a first degree of fit, determine that the traffic to be analyzed is traffic of the first type.
  • The degree of fit is characterized by the reciprocal of the relative entropy and the KS test statistic. When the reciprocal of the relative entropy is equal to the first preset threshold and the KS test statistic is equal to the second preset threshold, the degree of fit is the first degree of fit; when the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
  • The first determining unit 503 is specifically configured to:
  • calculate fitting parameters according to the some or all arrival time intervals, the probabilities of the some or all arrival time intervals, and a reference time interval probability distribution model, where the reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of the historical traffic of the first type, and the fitting parameters indicate the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and the reference time interval probability distribution model;
  • determine the type of the traffic to be analyzed according to the fitting parameters.
  • The fitting parameters include the reciprocal of the relative entropy and the Kolmogorov-Smirnov (KS) test statistic; the first determining unit 503 is specifically configured to:
  • when the traffic identification device determines that the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, determine that the traffic to be analyzed is traffic of the first type.
  • The traffic of the first type is game traffic or video traffic.
  • The first acquiring unit 501 is further configured to: acquire historical traffic of the first type.
  • The second acquiring unit 502 is further configured to: acquire the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
  • The establishing unit 504 is configured to establish the reference time interval probability distribution model according to the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
  • The reference time interval probability distribution model includes any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
  • When the first determining unit 503 determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of only some of the arrival time intervals, those arrival time intervals are the arrival time intervals of the packets of the traffic to be analyzed that are less than the third preset threshold.
  • The second determining unit 505 is configured to:
  • determine first packets of the traffic to be analyzed, where the arrival time interval of a first packet is less than the third preset threshold; and determine the ratio of the number of first packets to the total number of packets included in the traffic to be analyzed.
  • The trigger unit 506 is configured to, when the traffic identification device determines that the ratio is greater than the fourth preset threshold, trigger the first determining unit 503 to execute the step of determining the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals.
  • The first acquiring unit 501 acquires the traffic to be analyzed of the target data flow; the second acquiring unit 502 then acquires the arrival time intervals of the packets of the traffic to be analyzed; and the first determining unit 503 determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals. Because the probabilities of the packet arrival time intervals of each type of traffic follow a characteristic distribution, the first determining unit 503 can accurately identify the type of the traffic to be analyzed from the distribution feature of the probabilities of the arrival time intervals of its packets.
  • For example, when the distribution feature of the probabilities of all or some of the arrival time intervals of the packets of the traffic to be analyzed closely matches that of game traffic, the first determining unit 503 may determine that the traffic to be analyzed is game traffic, so as to accurately identify game traffic and improve the identification accuracy of game traffic.
  • the embodiment of the present application also provides a traffic identification device 600.
  • Referring to FIG. 6, FIG. 6 is another schematic structural diagram of the traffic identification device according to an embodiment of this application.
  • The traffic identification device is used to execute the steps performed by the traffic identification device in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4.
  • The traffic identification device 600 includes: a processor 601, a memory 602, an input/output device 603, and a bus 604.
  • The processor 601, the memory 602, and the input/output device 603 are each connected to the bus 604, and computer instructions are stored in the memory.
  • The input/output device 603 is configured to acquire the traffic to be analyzed of a target data flow.
  • The processor 601 is configured to acquire the arrival time intervals of the packets of the traffic to be analyzed, and to determine the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals.
  • The processor 601 is specifically configured to:
  • determine the type of the traffic to be analyzed according to the similarity between the distribution feature of the probabilities of some or all of the arrival time intervals and the distribution feature of the probabilities of the arrival time intervals of packets of historical traffic of a first type.
  • The processor 601 is specifically configured to:
  • when the similarity is higher than a first similarity, determine that the traffic to be analyzed is traffic of the first type.
  • The similarity is characterized by the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and a reference time interval probability distribution model, and the reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of the historical traffic of the first type; the processor 601 is specifically configured to:
  • when the degree of fit is higher than a first degree of fit, determine that the traffic to be analyzed is traffic of the first type.
  • The degree of fit is characterized by the reciprocal of the relative entropy and the KS test statistic. When the reciprocal of the relative entropy is equal to the first preset threshold and the KS test statistic is equal to the second preset threshold, the degree of fit is the first degree of fit; when the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
  • The processor 601 is specifically configured to:
  • calculate fitting parameters according to the some or all arrival time intervals, the probabilities of the some or all arrival time intervals, and a reference time interval probability distribution model, where the reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of the historical traffic of the first type, and the fitting parameters indicate the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and the reference time interval probability distribution model;
  • determine the type of the traffic to be analyzed according to the fitting parameters.
  • The fitting parameters include the reciprocal of the relative entropy and the Kolmogorov-Smirnov (KS) test statistic; the processor 601 is specifically configured to:
  • when the traffic identification device determines that the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, determine that the traffic to be analyzed is traffic of the first type.
  • The traffic of the first type is game traffic or video traffic.
  • The input/output device 603 is further configured to: acquire historical traffic of the first type.
  • The processor 601 is further configured to: acquire the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals; and
  • establish the reference time interval probability distribution model according to the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
  • The reference time interval probability distribution model includes any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
  • When the processor 601 determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of only some of the arrival time intervals, those arrival time intervals are the arrival time intervals of the packets of the traffic to be analyzed that are less than the third preset threshold.
  • The input/output device 603 is further configured to:
  • determine first packets of the traffic to be analyzed, where the arrival time interval of a first packet is less than the third preset threshold; and determine the ratio of the number of first packets to the total number of packets included in the traffic to be analyzed.
  • The processor 601 is further configured to:
  • when the traffic identification device determines that the ratio is greater than the fourth preset threshold, trigger execution of the step of determining the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals.
  • The input/output device 603 acquires the traffic to be analyzed of the target data flow; the processor 601 then acquires the arrival time intervals of the packets of the traffic to be analyzed and determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals. Because the probabilities of the packet arrival time intervals of each type of traffic follow a characteristic distribution, the processor 601 can accurately identify the type of the traffic to be analyzed from the distribution feature of the probabilities of the arrival time intervals of its packets.
  • For example, when the distribution feature of the probabilities of all or some of the arrival time intervals of the packets of the traffic to be analyzed closely matches that of game traffic, the processor 601 may determine that the traffic to be analyzed is game traffic, so that game traffic can be accurately identified and the identification accuracy of game traffic can be improved.
  • the embodiment of the present application also provides a computer program product including instructions, which when run on a computer, causes the computer to execute the traffic identification method of the embodiment shown in FIG. 2A, FIG. 3A, and FIG. 4.
  • An embodiment of the present application also provides a computer-readable storage medium, including instructions, which when run on a computer, cause the computer to execute the traffic identification method in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4.
  • When the traffic identification device is a chip in a terminal, the chip includes a processing unit and a communication unit. The processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit.
  • The processing unit can execute the computer-executable instructions stored in the storage unit, so that the chip in the terminal executes the traffic identification method in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4.
  • The storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit in the terminal located outside the chip, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
  • The processor mentioned anywhere above may be a general-purpose central processing unit, a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits used to control execution of the program of the traffic identification method in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • The division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The technical solution of the present application essentially, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
  • The words "first", "second", and the like are used to distinguish between identical or similar items that have basically the same function. It should be understood that there is no logical or temporal dependency among "first", "second", and "nth", and that these words impose no restriction on quantity or execution order. It should also be understood that although the description uses the terms first, second, and so on to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another.
  • For example, a first image may be referred to as a second image, and similarly, a second image may be referred to as a first image. Both the first image and the second image may be images, and in some cases may be separate and different images.
  • The size of the sequence number of each process does not indicate the order of execution.
  • The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • determining B according to A does not mean that B is determined only according to A, and B can also be determined according to A and/or other information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

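Embodiments of this application disclose a traffic identification method and a traffic identification device, which are used to accurately identify game traffic and improve the accuracy of game traffic identification. The method in the embodiments of this application includes: a traffic identification device acquires traffic to be analyzed of a target data flow; the traffic identification device acquires the arrival time intervals of the packets of the traffic to be analyzed; and the traffic identification device determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals.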

Description

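Traffic identification method and traffic identification device
This application claims priority to Chinese Patent Application No. 202010362612.0, filed with the Chinese Patent Office on April 30, 2020 and entitled "Traffic identification method and traffic identification device", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the communications field, and in particular, to a traffic identification method and a traffic identification device.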
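Background
Online gaming services are developing rapidly worldwide, and the number of users keeps rising. A traffic identification device performs feature analysis on packets extracted from game traffic and uses machine learning and statistical learning methods to infer the application type of a service, so as to provide service assurance. Therefore, how to accurately identify online game service traffic, in order to support network management and network planning and to improve network service quality, has become a research focus in the network management field.
Currently, a traffic identification device captures session packets that carry keywords, extracts the tuple information of the two communicating parties from them, and builds a 5-tuple rule base. The traffic identification device then uses a flow classification algorithm to perform 5-tuple prefix matching on the packets included in the traffic, and identifies and classifies game traffic through the matching algorithm.
As can be seen from the foregoing solution, the traffic identification device identifies and classifies traffic by performing 5-tuple prefix matching on the packets included in the traffic. However, when the addresses of the communicating parties change, the accuracy of identifying game traffic from the tuple information of the two parties is low, and game traffic cannot be identified accurately.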
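Summary
Embodiments of this application provide a traffic identification method and a traffic identification device, which are used to accurately identify game traffic and improve the accuracy of game traffic identification.
A first aspect of the embodiments of this application provides a traffic identification method. The method includes:
A traffic identification device acquires traffic to be analyzed of a target data flow. The traffic identification device then acquires the arrival time intervals of the packets of the traffic to be analyzed, and determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals.
In this embodiment, the probabilities of the packet arrival time intervals of each type of traffic follow a characteristic distribution. Therefore, the traffic identification device can accurately identify the type of the traffic to be analyzed from the distribution feature of the probabilities of the arrival time intervals of its packets. For example, when the distribution feature of the probabilities of all or some of the arrival time intervals of the packets of the traffic to be analyzed closely matches that of the packets of game traffic, the traffic identification device may determine that the traffic to be analyzed is game traffic, thereby accurately identifying game traffic and improving the identification accuracy of game traffic.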
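In a possible implementation, that the traffic identification device determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals includes: the traffic identification device determines the type of the traffic to be analyzed according to the similarity between the distribution feature of the probabilities of some or all of the arrival time intervals and the distribution feature of the probabilities of the arrival time intervals of packets of historical traffic of a first type.
This possible implementation provides a specific way for the traffic identification device to identify the type of the traffic to be analyzed: whether the traffic to be analyzed is traffic of the first type is judged from the similarity between the distribution feature of the probabilities of the arrival time intervals of its packets and that of traffic of a known type.
In another possible implementation, that the traffic identification device determines the type of the traffic to be analyzed according to the similarity includes: when the similarity is higher than a first similarity, the traffic identification device determines that the traffic to be analyzed is traffic of the first type.
This possible implementation provides a specific way to determine the type of the traffic to be analyzed by judging the magnitude of the similarity, which improves the implementability of the solution.
In another possible implementation, the similarity is characterized by the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and a reference time interval probability distribution model, where the reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of the historical traffic of the first type. That the traffic identification device determines that the traffic to be analyzed is traffic of the first type when the similarity is higher than the first similarity includes: when the degree of fit is higher than a first degree of fit, the traffic identification device determines that the traffic to be analyzed is traffic of the first type.
In this possible implementation, the similarity is characterized by the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and the reference time interval probability distribution model, so that the traffic to be analyzed can be identified by judging the magnitude of the degree of fit.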
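In another possible implementation, the degree of fit is characterized by the reciprocal of the relative entropy and the Kolmogorov-Smirnov (KS) test statistic. When the reciprocal of the relative entropy is equal to a first preset threshold and the KS test statistic is equal to a second preset threshold, the degree of fit is the first degree of fit; when the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
This possible implementation provides two specific parameters that characterize the degree of fit. The ranges in which the two parameters fall determine the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and that of the arrival time intervals of packets of the historical traffic of the first type. When the similarity is higher than the first similarity, the traffic identification device determines that the traffic to be analyzed is traffic of the first type.
In another possible implementation, that the traffic identification device determines the type of the traffic to be analyzed according to the similarity includes: the traffic identification device calculates fitting parameters according to the some or all arrival time intervals, the probabilities of the some or all arrival time intervals, and a reference time interval probability distribution model, where the reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of the historical traffic of the first type, and the fitting parameters indicate the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and the reference time interval probability distribution model; and the traffic identification device determines the type of the traffic to be analyzed according to the fitting parameters.
This possible implementation shows how the similarity is characterized by fitting parameters and how the fitting parameters are calculated. The fitting parameters characterize the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and the reference time interval probability distribution model, so that the type of the traffic to be analyzed can be judged accurately from the fitting parameters and the accuracy of traffic identification is improved.
In another possible implementation, the fitting parameters include the reciprocal of the relative entropy and the KS test statistic. That the traffic identification device determines the type of the traffic to be analyzed according to the fitting parameters includes: when the traffic identification device determines that the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, it determines that the traffic to be analyzed is traffic of the first type.
This possible implementation provides two specific forms of fitting parameter, namely the reciprocal of the relative entropy and the KS test statistic, and shows the process of identifying the type of the traffic to be analyzed from them, which improves the implementability of the solution.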
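In another possible implementation, the traffic of the first type is game traffic or video traffic.
In this possible implementation, the traffic identification method in the embodiments of this application is applicable to identifying game traffic and/or video traffic, and may also be applicable to identifying other types of traffic.
In another possible implementation, the method further includes: the traffic identification device acquires historical traffic of the first type; the traffic identification device acquires the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals; and the traffic identification device establishes the reference time interval probability distribution model according to the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
In this possible implementation, the traffic identification device may also establish the reference time interval probability distribution model from the arrival time intervals of the packets of the historical traffic and their probabilities. Whether the traffic to be analyzed is traffic of the first type can then be judged from the similarity between the reference time interval probability distribution model and the distribution feature of the probabilities of some or all of the arrival time intervals of the packets of the traffic to be analyzed, so that the type of the traffic to be analyzed is identified accurately.
In another possible implementation, the reference time interval probability distribution model includes any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
This possible implementation provides several possible forms of the reference time interval probability distribution model. When selecting the reference time interval probability distribution model, the traffic type and experimental results should be combined to select the model that best matches the distribution feature of the probabilities of the arrival time intervals of the packets of that traffic type. For example, for game traffic, the distribution feature of the probabilities of the packet arrival time intervals closely matches that of a power-law distribution model. Therefore, the traffic identification device can select the power-law distribution model as the reference time interval probability distribution model.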
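In another possible implementation, when the traffic identification device determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of only some of the arrival time intervals, those arrival time intervals are the arrival time intervals of the packets of the traffic to be analyzed that are less than a third preset threshold.
In this possible implementation, because the packet arrival time intervals of game traffic or video traffic are relatively small, the distribution feature of the probabilities of the smaller arrival time intervals of the packets of the traffic to be analyzed can, to a certain extent, characterize the type of the traffic to be analyzed.
In another possible implementation, the method further includes: the traffic identification device determines first packets of the traffic to be analyzed, where the arrival time interval of a first packet is less than the third preset threshold; the traffic identification device determines the ratio of the number of first packets to the total number of packets included in the traffic to be analyzed; and when the traffic identification device determines that the ratio is greater than a fourth preset threshold, the traffic identification device triggers execution of the step of determining the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals.
In this possible implementation, through the preliminary screening of traffic described above, the traffic identification device can avoid performing further identification on all network traffic that flows through it, which improves the efficiency of traffic identification.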
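A second aspect of the embodiments of this application provides a traffic identification device. The traffic identification device includes:
a first acquiring unit, configured to acquire traffic to be analyzed of a target data flow;
a second acquiring unit, configured to acquire the arrival time intervals of the packets of the traffic to be analyzed; and
a first determining unit, configured to determine the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals.
In a possible implementation, the first determining unit is specifically configured to:
determine the type of the traffic to be analyzed according to the similarity between the distribution feature of the probabilities of some or all of the arrival time intervals and the distribution feature of the probabilities of the arrival time intervals of packets of historical traffic of a first type.
In another possible implementation, the first determining unit is specifically configured to:
when the similarity is higher than a first similarity, determine that the traffic to be analyzed is traffic of the first type.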
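In another possible implementation, the similarity is characterized by the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and a reference time interval probability distribution model, where the reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of the historical traffic of the first type; the first determining unit is specifically configured to:
when the degree of fit is higher than a first degree of fit, determine that the traffic to be analyzed is traffic of the first type.
In another possible implementation, the degree of fit is characterized by the reciprocal of the relative entropy and the KS test statistic. When the reciprocal of the relative entropy is equal to the first preset threshold and the KS test statistic is equal to the second preset threshold, the degree of fit is the first degree of fit; when the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
In another possible implementation, the first determining unit is specifically configured to:
calculate fitting parameters according to the some or all arrival time intervals, the probabilities of the some or all arrival time intervals, and a reference time interval probability distribution model, where the reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of the historical traffic of the first type, and the fitting parameters indicate the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and the reference time interval probability distribution model; and
determine the type of the traffic to be analyzed according to the fitting parameters.
In another possible implementation, the fitting parameters include the reciprocal of the relative entropy and the Kolmogorov-Smirnov (KS) test statistic; the first determining unit is specifically configured to:
when the traffic identification device determines that the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, determine that the traffic to be analyzed is traffic of the first type.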
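In another possible implementation, the traffic of the first type is game traffic or video traffic.
In another possible implementation, the first acquiring unit is further configured to:
acquire historical traffic of the first type;
the second acquiring unit is further configured to:
acquire the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals;
the traffic identification device further includes an establishing unit; and
the establishing unit is configured to establish the reference time interval probability distribution model according to the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
In another possible implementation, the reference time interval probability distribution model includes any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
In another possible implementation, when the first determining unit determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of only some of the arrival time intervals, those arrival time intervals are the arrival time intervals of the packets of the traffic to be analyzed that are less than the third preset threshold.
In another possible implementation, the traffic identification device further includes a second determining unit and a trigger unit;
the second determining unit is configured to determine first packets of the traffic to be analyzed, where the arrival time interval of a first packet is less than the third preset threshold, and to determine the ratio of the number of first packets to the total number of packets included in the traffic to be analyzed; and
the trigger unit is configured to, when the traffic identification device determines that the ratio is greater than the fourth preset threshold, trigger the first determining unit to execute the step of determining the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals.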
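A third aspect of the embodiments of this application provides a traffic identification device. The traffic identification device includes a processor, a memory, an input/output device, and a bus. The memory stores computer instructions. When executing the computer instructions in the memory, the processor is configured to implement any implementation of the first aspect.
In a possible implementation of the third aspect, the processor, the memory, and the input/output device are each connected to the bus.
A fourth aspect of the embodiments of this application provides a chip system. The chip system includes a processor configured to support a network device in implementing the functions involved in the first aspect, for example, sending or processing the data and/or information involved in the foregoing method. In a possible design, the chip system further includes a memory, and the memory is configured to store the program instructions and data necessary for the network device. The chip system may consist of a chip, or may include a chip and other discrete components.
A fifth aspect of the embodiments of this application provides a computer program product including instructions, characterized in that, when it runs on a computer, it causes the computer to execute any implementation of the first aspect.
A sixth aspect of the embodiments of this application provides a computer-readable storage medium, characterized by including instructions that, when run on a computer, cause the computer to execute any implementation of the first aspect.
As can be seen from the foregoing technical solutions, the embodiments of this application have the following advantages:
According to the foregoing technical solutions, the traffic identification device acquires traffic to be analyzed of a target data flow; the traffic identification device then acquires the arrival time intervals of the packets of the traffic to be analyzed, and determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals. Because the probabilities of the packet arrival time intervals of each type of traffic follow a characteristic distribution, the traffic identification device can accurately identify the type of the traffic to be analyzed from the distribution feature of the probabilities of the arrival time intervals of its packets. For example, when the distribution feature of the probabilities of all or some of the arrival time intervals of the packets of the traffic to be analyzed closely matches that of the packets of game traffic, the traffic identification device may determine that the traffic to be analyzed is game traffic, thereby accurately identifying game traffic and improving the identification accuracy of game traffic.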
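Brief Description of Drawings
FIG. 1 is a schematic diagram of a framework according to an embodiment of this application;
FIG. 2A is a schematic diagram of an embodiment of the traffic identification method according to an embodiment of this application;
FIG. 2B is a schematic diagram of a scenario of the traffic identification method according to an embodiment of this application;
FIG. 3A is a schematic diagram of another embodiment of the traffic identification method according to an embodiment of this application;
FIG. 3B is a schematic diagram of the probability distribution of the arrival time intervals of packets of game traffic of the game application "王者荣耀" (Honor of Kings) according to an embodiment of this application;
FIG. 3C is a schematic diagram of the distribution of the reciprocal of the relative entropy and the KS test statistic corresponding to traffic to be analyzed according to an embodiment of this application;
FIG. 3D is a schematic diagram of the probability distribution of the arrival time intervals of packets of video traffic of the video application "爱奇艺" (iQIYI) according to an embodiment of this application;
FIG. 4 is a schematic diagram of another embodiment of the traffic identification method according to an embodiment of this application;
FIG. 5 is a schematic structural diagram of the traffic identification device according to an embodiment of this application;
FIG. 6 is another schematic structural diagram of the traffic identification device according to an embodiment of this application.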
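Detailed Description
Embodiments of this application provide a traffic identification method and a traffic identification device, which are used to accurately identify game traffic and improve the accuracy of game traffic identification.
Refer to FIG. 1, which is a schematic diagram of a framework according to an embodiment of this application. As shown in FIG. 1, during network transmission, the game traffic of a game service flow passes through the network devices at each layer of the network. For example, when a game server sends the game service flow, the game traffic of that flow passes through the broadband remote access server (BRAS), the optical line terminal (OLT), and the optical network terminal (ONT) in FIG. 1. It follows that in this network the game traffic of a transmitted game service flow passes through the network devices deployed at each layer, so game traffic can be obtained at any of those network devices. Therefore, the technical solution of the embodiments of this application proposes deploying a traffic identification device at a single point or at multiple points in, or attaching it alongside, any one of the network devices deployed at each layer of the network. The traffic identification device collects the network traffic of various types of terminal devices (for example, mobile phones, personal computers, and televisions) and identifies the network traffic to determine its type, which makes it convenient to use machine learning and statistical learning methods to infer the service type corresponding to the network traffic and to provide service assurance according to the priority of the service type.
It should be noted that the traffic identification device may be integrated at a single point or at multiple points in the network devices deployed at each layer, or may be attached alongside those network devices; this is not limited in this application.
FIG. 1 shows only an application scenario of identifying game traffic. The solution is equally applicable to scenarios that are not shown in this application but have similar or identical requirements, and this is not limited in this application. For example, the technical solution of the embodiments of this application is also applicable to a scenario in which the traffic identification device is used to identify video traffic, and to a scenario in which the traffic identification device is used to identify both game traffic and video traffic.
The technical solution of the embodiments of this application is described below through specific embodiments.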
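Refer to FIG. 2A, which is a schematic diagram of an embodiment of the traffic identification method according to an embodiment of this application. In FIG. 2A, the method includes:
201. The traffic identification device acquires traffic to be analyzed of a target data flow.
For example, as shown in FIG. 1, the description takes as an example a traffic identification device that is integrated in the optical network terminal. When network traffic passes through the optical network terminal during transmission, the traffic identification device can acquire the network traffic. The traffic identification device then determines, from the 5-tuple of the network traffic, that the network traffic belongs to the target data flow. The traffic identification device then uses the network traffic as the traffic to be analyzed, or uses part of the network traffic as the traffic to be analyzed. For example, the part includes the first m packets of the network traffic, where m is an integer greater than 0.
It should be noted that when the network traffic includes the traffic of multiple data flows, the traffic identification device identifies the traffic of the target data flow within the network traffic by its 5-tuple, and then uses part or all of the traffic of the target data flow as the traffic to be analyzed.
For example, the network traffic includes the traffic of data flow A and the traffic of data flow B, and data flow A is the target data flow. The traffic identification device identifies the traffic of data flow A in the network traffic by its 5-tuple, and then uses part or all of the identified traffic of data flow A as the traffic to be analyzed.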
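The selection of traffic to be analyzed in step 201 can be sketched as follows. This is a minimal illustration only: the names `FiveTuple`, `Packet`, and `traffic_to_analyze`, the field layout, and the sample addresses are assumptions made for the example and do not come from the application.

```python
from dataclasses import dataclass
from typing import List, NamedTuple, Optional, Sequence


class FiveTuple(NamedTuple):
    """Hypothetical 5-tuple layout used to tell data flows apart."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


@dataclass(frozen=True)
class Packet:
    flow: FiveTuple
    arrival_ms: float  # time at which the device received the packet


def traffic_to_analyze(packets: Sequence[Packet], target: FiveTuple,
                       first_m: Optional[int] = None) -> List[Packet]:
    """Keep the packets of the target flow, optionally only the first m of them."""
    selected = [p for p in packets if p.flow == target]
    return selected if first_m is None else selected[:first_m]


# Mixed network traffic carrying data flow A (the target) and data flow B.
flow_a = FiveTuple("10.0.0.1", "10.0.0.2", 5000, 6000, "UDP")
flow_b = FiveTuple("10.0.0.3", "10.0.0.2", 5001, 6000, "TCP")
mixed = [Packet(flow_a, 1.0), Packet(flow_b, 1.2),
         Packet(flow_a, 1.8), Packet(flow_a, 3.0)]
selected = traffic_to_analyze(mixed, flow_a)
```

Passing `first_m` corresponds to using only the first m packets of the flow as the traffic to be analyzed.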
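202. The traffic identification device acquires the arrival time intervals of the packets of the traffic to be analyzed.
The arrival time interval of a packet is the time interval between two packets of the same data flow that the traffic identification device receives consecutively. Specifically, the arrival time interval of a packet may be the interval between the time at which the traffic identification device receives the packet and the time at which it receives the next packet, or the interval between the time at which it receives the packet and the time at which it received the previous packet.
For example, as shown in FIG. 2B, packet 1, packet 2, and packet 3 are three packets of the target data flow that the traffic identification device receives consecutively. The traffic identification device receives packet 1 at the 1 ms (millisecond) mark, packet 2 at the 1.8 ms mark, and packet 3 at the 3 ms mark. In one possible implementation, the arrival time interval of packet 2 is the time difference between the time at which the traffic identification device receives packet 3 and the time at which it receives packet 2, that is, 1.2 ms. In another possible implementation, the arrival time interval of packet 2 is the time difference between the time at which the traffic identification device receives packet 2 and the time at which it receives packet 1, that is, 0.8 ms.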
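The two conventions for the arrival time interval can be sketched with the FIG. 2B numbers. The function name `arrival_intervals` and the `relative_to` switch are hypothetical names chosen for this illustration; timestamps are assumed to be given in arrival order.

```python
from typing import Dict, Sequence


def arrival_intervals(timestamps_ms: Sequence[float],
                      relative_to: str = "previous") -> Dict[int, float]:
    """Map a 0-based packet index to its arrival time interval in ms.

    relative_to="previous": interval between a packet and the one before it.
    relative_to="next":     interval between a packet and the one after it.
    """
    if relative_to not in ("previous", "next"):
        raise ValueError("relative_to must be 'previous' or 'next'")
    ts = list(timestamps_ms)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    # With "previous" the first packet has no interval; with "next" the last has none.
    offset = 1 if relative_to == "previous" else 0
    return {index + offset: gap for index, gap in enumerate(gaps)}


# Packets 1, 2 and 3 arrive at 1 ms, 1.8 ms and 3 ms (indices 0, 1, 2).
to_previous = arrival_intervals([1.0, 1.8, 3.0], "previous")
to_next = arrival_intervals([1.0, 1.8, 3.0], "next")
```

For packet 2 (index 1), `to_previous[1]` is about 0.8 ms and `to_next[1]` is about 1.2 ms, up to floating-point rounding.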
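203. The traffic identification device determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals of the packets of the traffic to be analyzed.
The probability of an arrival time interval is the probability with which that arrival time interval occurs among the arrival time intervals of the packets of the traffic to be analyzed. There are several ways to calculate it, as illustrated below:
1. The probability of an arrival time interval is the ratio of the number of packets of the traffic to be analyzed whose arrival time interval equals that interval to the total number of packets of the traffic to be analyzed.
For example, for the probability of arrival time interval A: the traffic to be analyzed contains M packets in total, of which N packets have arrival time interval A, so the probability of arrival time interval A is N/M, where A is greater than 0, M is an integer greater than 0, and N is an integer greater than 0 and less than M.
2. The probability of an arrival time interval is the ratio of the number of arrival time intervals of the packets of the traffic to be analyzed that equal that interval to the total number of arrival time intervals of the packets of the traffic to be analyzed. That is, the probability of an arrival time interval is the ratio of the number of times that interval occurs to the total number of occurrences of all arrival time intervals of the packets of the traffic to be analyzed.
For example, for the probability of arrival time interval A: the arrival time intervals of the packets of the traffic to be analyzed are 1 ms, 2 ms, 2 ms, 3 ms, and 5 ms. The traffic identification device determines that there are 5 arrival time intervals in total. Arrival time interval A is 2 ms, so the traffic identification device determines that 2 of the arrival time intervals are 2 ms. The probability of the 2 ms arrival time interval is therefore 40%.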
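The second calculation method (occurrences of an interval divided by the total number of intervals) can be sketched as below. The function name `interval_probabilities` is hypothetical, and the sketch assumes the intervals have already been quantized to comparable values such as whole milliseconds.

```python
from collections import Counter
from typing import Dict, Sequence


def interval_probabilities(intervals_ms: Sequence[float]) -> Dict[float, float]:
    """Probability of each distinct arrival time interval: occurrences / total."""
    if not intervals_ms:
        raise ValueError("at least one arrival time interval is required")
    counts = Counter(intervals_ms)
    total = len(intervals_ms)
    return {value: count / total for value, count in counts.items()}


# The example above: intervals of 1, 2, 2, 3 and 5 ms.
probs = interval_probabilities([1, 2, 2, 3, 5])
```

Here `probs[2]` is 0.4, matching the 40% in the example, and the probabilities over all distinct intervals sum to 1.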
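In this embodiment, the type of the traffic to be analyzed is traffic of a first type or traffic of a second type. The traffic of the first type is game traffic and the traffic of the second type is non-game traffic; or the traffic of the first type is video traffic and the traffic of the second type is non-video traffic. For example, the non-video traffic is download data traffic.
The "some" arrival time intervals of the packets of the traffic to be analyzed are the arrival time intervals that are less than the third preset threshold.
For example, the arrival time intervals of the packets of the traffic to be analyzed are 1 ms, 2 ms, 2 ms, 3 ms, 5 ms, 9 ms, 20 ms, and 30 ms. If the third preset threshold is 10 ms, the selected arrival time intervals are 1 ms, 2 ms, 2 ms, 3 ms, 5 ms, and 9 ms, where 1 ms, 3 ms, 5 ms, and 9 ms each occur once and 2 ms occurs twice. Because the packet arrival time intervals of game traffic or video traffic are relatively small, the traffic identification device can judge the type of the traffic to be analyzed from the distribution feature of the probabilities of the smaller arrival time intervals of its packets.
For example, because the distribution feature of the probabilities of the packet arrival time intervals of game traffic closely matches that of a probability distribution model, when the traffic identification device determines that the similarity between the distribution feature of the probabilities of some or all of the arrival time intervals and that of the probability distribution model is high, the traffic identification device may determine that the traffic to be analyzed is game traffic, thereby accurately identifying game traffic and improving the identification accuracy of game traffic.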
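In this embodiment, optionally, step 203 specifically includes step 203a.
Step 203a: The traffic identification device determines the type of the traffic to be analyzed according to the similarity between the distribution feature of the probabilities of some or all of the arrival time intervals of the packets of the traffic to be analyzed and the distribution feature of the probabilities of the arrival time intervals of packets of historical traffic of the first type.
Specifically, the traffic identification device can judge whether the traffic to be analyzed is traffic of the first type from the similarity between the distribution feature of the probabilities of some or all of the arrival time intervals of its packets and that of the packets of the historical traffic of the first type.
Optionally, when the similarity is higher than a first similarity, the traffic identification device determines that the traffic to be analyzed is traffic of the first type.
In a possible implementation, the similarity is characterized by the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and a reference time interval probability distribution model, where the reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of the historical traffic of the first type, and the first similarity corresponds to a first degree of fit. In that case, when the degree of fit is higher than the first degree of fit, the traffic identification device determines that the traffic to be analyzed is traffic of the first type.
The degree of fit is characterized by the reciprocal of the relative entropy and the KS test statistic. When the reciprocal of the relative entropy is equal to the first preset threshold and the KS test statistic is equal to the second preset threshold, the degree of fit is the first degree of fit; correspondingly, when the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
For the calculation of the reciprocal of the relative entropy and the KS test statistic, refer to the description of step 3001 below; details are not described here.
In the embodiments of this application, the traffic identification device acquires traffic to be analyzed of a target data flow; the traffic identification device then acquires the arrival time intervals of the packets of the traffic to be analyzed, and determines the type of the traffic to be analyzed according to the distribution feature of the probabilities of some or all of the arrival time intervals. Because the probabilities of the packet arrival time intervals of each type of traffic follow a characteristic distribution, the traffic identification device can accurately identify the type of the traffic to be analyzed from the distribution feature of the probabilities of the arrival time intervals of its packets. For example, when the distribution feature of the probabilities of all or some of the arrival time intervals of the packets of the traffic to be analyzed closely matches that of the packets of game traffic, the traffic identification device may determine that the traffic to be analyzed is game traffic, thereby accurately identifying game traffic and improving the identification accuracy of game traffic.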
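In the embodiments of this application, optionally, step 203a in the embodiment shown in FIG. 2A has a specific execution process: step 203a specifically includes step 3001 and step 3002. This is described in detail below with reference to the embodiment shown in FIG. 3A. Refer to FIG. 3A, which is a schematic diagram of another embodiment of the traffic identification method according to an embodiment of this application.
3001. The traffic identification device calculates fitting parameters according to the some or all arrival time intervals, the probabilities of the some or all arrival time intervals, and a reference time interval probability distribution model.
In this embodiment, the fitting parameters indicate the degree of fit between the distribution feature of the probabilities of the some or all arrival time intervals and the reference time interval probability distribution model. The reference time interval probability distribution model is used to characterize the distribution feature of the probabilities of the arrival time intervals of packets of traffic of the first type.
The reference time interval probability distribution model is a model trained on historical traffic of the first type. The traffic identification device uses it, together with the some or all arrival time intervals and their probabilities, to calculate the fitting parameters.
Optionally, the fitting parameters include the reciprocal r of the relative entropy and/or the KS test statistic p.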
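First, the calculation of the reciprocal r of the relative entropy is described. Here the distribution feature of the probabilities of the some or all arrival time intervals is expressed as a function P(i), where P(i) is the actually measured probability of the arrival time interval of the i-th packet of the traffic to be analyzed received by the traffic identification device. The reference time interval probability distribution model is a function Q(i), where Q(i) is the probability, calculated with the reference time interval probability distribution model, of the arrival time interval of the i-th packet of the traffic to be analyzed received by the traffic identification device. The relative entropy is
$$D_{KL}(P\|Q)=\sum_{i=1}^{n}P(i)\ln\frac{P(i)}{Q(i)}$$
where $\sum_{i=1}^{n}x$ denotes summing the values of x corresponding to 1 through n, and ln(a) denotes the logarithm of a to base e. The reciprocal of the relative entropy is then $r=1/D_{KL}(P\|Q)$.
For example, the arrival time intervals of the packets of the traffic to be analyzed and the probabilities of all of the arrival time intervals are as shown in the distribution diagram in FIG. 3B, where the horizontal axis is the arrival time interval of the packets of the traffic to be analyzed and the vertical axis is the probability of each arrival time interval. In the order in which the packets arrive at the traffic identification device, the traffic identification device inputs the arrival time intervals of n packets into the reference time interval probability distribution model and calculates the reference probabilities of the arrival time intervals of the n packets. The traffic identification device then substitutes the arrival time intervals of the n packets, the actually measured probabilities of those arrival time intervals, and the reference probabilities of those arrival time intervals into
$$D_{KL}(P\|Q)=\sum_{i=1}^{n}P(i)\ln\frac{P(i)}{Q(i)}$$
and then calculates the reciprocal r of $D_{KL}(P\|Q)$, where n is an integer greater than 1.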
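The calculation of the KS test statistic p is described next. The distribution feature of the probabilities of the some or all arrival time intervals is expressed as a function $F_N(x)$, where $F_N(x)$ is the actually measured probability with which arrival time interval x occurs among the arrival time intervals of the packets of the traffic to be analyzed. The reference time interval probability distribution model is expressed as a function F(x), where F(x) is the probability, calculated with the reference time interval probability distribution model, with which arrival time interval x occurs among the arrival time intervals of the packets of the traffic to be analyzed. The traffic identification device determines $p = D_n = \sup_x |F_N(x) - F(x)|$, where sup(b) is a function that takes the maximum value of b, and |c| denotes the absolute value of c.
The traffic to be analyzed includes n packets. The traffic identification device substitutes the arrival time interval of each of the n packets as x into F(x) to obtain the reference probability of the arrival time interval of each of the n packets. The actual probability of the arrival time interval of each packet is already known (here the actual probability is the actually measured value). The traffic identification device calculates, for each of the n packets, the absolute value of the difference between the reference probability and the actual probability of its arrival time interval, which gives n absolute values for the n packets, and then determines the maximum of the n absolute values. That maximum is $D_n$, that is, p.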
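The two fitting parameters described in step 3001 can be sketched together as follows. This is an illustration under stated assumptions: the function name `fitting_parameters` is hypothetical, the measured and reference probabilities are passed as two aligned lists (one entry per packet), every reference probability is assumed to be strictly positive, and returning infinity when the relative entropy is exactly zero is a choice made for this sketch rather than something the application specifies.

```python
import math
from typing import Sequence, Tuple


def fitting_parameters(measured: Sequence[float],
                       reference: Sequence[float]) -> Tuple[float, float]:
    """Return (r, p): reciprocal of the relative entropy and the KS test statistic.

    measured[i]  plays the role of P(i) and F_N(x) for the i-th packet.
    reference[i] plays the role of Q(i) and F(x), computed from the reference model.
    """
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    # D_KL(P||Q) = sum_i P(i) * ln(P(i) / Q(i)); terms with P(i) = 0 contribute 0.
    d_kl = sum(pi * math.log(pi / qi)
               for pi, qi in zip(measured, reference) if pi > 0)
    # Assumption of this sketch: a perfect match (D_KL = 0) maps to r = infinity.
    r = math.inf if d_kl == 0 else 1.0 / d_kl
    # p = D_n = largest absolute gap between measured and reference probabilities.
    p = max(abs(pi - qi) for pi, qi in zip(measured, reference))
    return r, p


r, p = fitting_parameters([0.5, 0.5], [0.25, 0.75])
perfect = fitting_parameters([0.2, 0.8], [0.2, 0.8])
```

A closer fit gives a larger r and a smaller p, which is the direction the threshold comparison in step 3002 relies on.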
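In this embodiment, the reference time interval probability distribution model may be a model that the traffic identification device trains on historical traffic of the first type, or a model that another device trains on historical traffic of the first type and that is then configured on the traffic identification device; this is not limited here.
In this embodiment, the reference time interval probability distribution model includes a power-law distribution model, a Gaussian distribution model, a normal distribution model, or a Poisson distribution model; this is not limited in this application.
In the embodiments of this application, when selecting the reference time interval probability distribution model, the traffic type and experimental results should be combined to select the model that best matches the distribution feature of the probabilities of the arrival time intervals of the packets of that traffic type. For example, for game traffic, the distribution feature of the probabilities of the packet arrival time intervals closely matches that of a power-law distribution model. Therefore, when identifying game traffic, the traffic identification device can select the power-law distribution model as the reference time interval probability distribution model.
It should be noted that when the traffic identification device is used to identify both game traffic and video traffic, multiple reference time interval probability distribution models are configured in the traffic identification device. For example, the multiple models include a first time interval probability distribution model and a second time interval probability distribution model, where the first model is used to characterize the distribution feature of the probabilities of the packet arrival time intervals of game traffic, and the second model is used to characterize the distribution feature of the probabilities of the packet arrival time intervals of video traffic.
When identifying traffic to be analyzed, the traffic identification device may first judge whether the traffic to be analyzed is game traffic from the similarity between the first time interval probability distribution model and the distribution feature of the probabilities of the some or all arrival time intervals. If the similarity is high, the traffic identification device determines that the traffic to be analyzed is game traffic. If the similarity is low, the traffic identification device may judge whether the traffic to be analyzed is video traffic from the similarity between the second time interval probability distribution model and the distribution feature of the probabilities of the some or all arrival time intervals. If that similarity is high, the traffic identification device determines that the traffic to be analyzed is video traffic; if it is low, the traffic identification device determines that the traffic to be analyzed is neither game traffic nor video traffic.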
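The try-one-model-then-the-next flow described above might look like the sketch below. Everything here is illustrative: `identify_type`, `close_enough`, the toy lookup-table models, and the 0.05 tolerance are assumptions for the example, and the similarity test is a deliberately simple stand-in rather than the fitting-parameter comparison used in the application.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Model = Callable[[float], float]  # maps an arrival time interval to a reference probability


def close_enough(measured: Sequence[float], reference: Sequence[float],
                 tolerance: float = 0.05) -> bool:
    """Toy similarity test: largest absolute probability gap below a tolerance."""
    return max(abs(p - q) for p, q in zip(measured, reference)) < tolerance


def identify_type(intervals: Sequence[float], measured: Sequence[float],
                  candidates: List[Tuple[str, Model]],
                  is_similar: Callable[[Sequence[float], Sequence[float]], bool]
                  ) -> Optional[str]:
    """Check the reference models in order; return the first matching label, else None."""
    for label, model in candidates:
        reference = [model(x) for x in intervals]
        if is_similar(measured, reference):
            return label
    return None


# Toy reference models given as lookup tables over two interval values (in ms).
game_model = {1.0: 0.6, 2.0: 0.4}.get
video_model = {1.0: 0.1, 2.0: 0.9}.get
candidates = [("game", game_model), ("video", video_model)]

first = identify_type([1.0, 2.0], [0.6, 0.4], candidates, close_enough)
second = identify_type([1.0, 2.0], [0.1, 0.9], candidates, close_enough)
neither = identify_type([1.0, 2.0], [0.5, 0.5], candidates, close_enough)
```

Ordering the candidate list encodes which traffic type is checked first; returning `None` corresponds to traffic that is neither game nor video traffic.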
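3002. The traffic identification device determines the type of the traffic to be analyzed according to the fitting parameters.
For the type of the traffic to be analyzed, refer to the description of step 203 in the embodiment shown in FIG. 2A; details are not described here again.
Optionally, as described in step 3001, the fitting parameters include the reciprocal r of the relative entropy and/or the KS test statistic p. In that case, step 3002 includes steps 3002a to 3002c.
Step 3002a: The traffic identification device judges whether r is greater than the first preset threshold and whether p is less than the second preset threshold; if so, step 3002b is performed; if not, step 3002c is performed.
For example, the first preset threshold is 1000 and the second preset threshold is 0.00001. The traffic identification device judges whether the calculated r is greater than 1000 and whether p is less than 0.00001. If so, the traffic identification device determines that the traffic to be analyzed is traffic of the first type; if not, the traffic identification device determines that the traffic to be analyzed is traffic of the second type. For the identification results, refer to the schematic diagram shown in FIG. 3C, in which the horizontal axis is p and the vertical axis is r, and the r and p corresponding to the traffic to be analyzed of each of the multiple data flows acquired by the traffic identification device are plotted as coordinate points (p, r). From FIG. 3C, the game traffic and non-game traffic among the traffic to be analyzed of the multiple data flows can be clearly distinguished.
It should be noted that the values of the first preset threshold and the second preset threshold can be determined from experimental data.
Step 3002b: The traffic identification device determines that the traffic to be analyzed is traffic of the first type.
Step 3002c: The traffic identification device determines that the traffic to be analyzed is traffic of the second type.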
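Steps 3002a to 3002c reduce to a two-condition comparison. The sketch below uses the example thresholds from the text (1000 and 0.00001); the function name `traffic_type`, the constant names, and the string labels are hypothetical.

```python
FIRST_PRESET_THRESHOLD = 1000.0      # example value from the text, tuned experimentally
SECOND_PRESET_THRESHOLD = 0.00001    # example value from the text, tuned experimentally


def traffic_type(r: float, p: float,
                 first_threshold: float = FIRST_PRESET_THRESHOLD,
                 second_threshold: float = SECOND_PRESET_THRESHOLD) -> str:
    """Step 3002a: first type only if r exceeds its threshold AND p is below its threshold."""
    if r > first_threshold and p < second_threshold:
        return "first type"   # step 3002b
    return "second type"      # step 3002c


good_fit = traffic_type(r=2500.0, p=0.000001)
boundary = traffic_type(r=1000.0, p=0.000001)
poor_fit = traffic_type(r=2500.0, p=0.01)
```

Both comparisons are strict, so a value of r exactly at the first preset threshold is classified as the second type.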
本申请实施例中,可选的,在上述图2A所示的实施例中步骤202之前,上述图2A所示的实施例还包括步骤202a至步骤202e。
步骤202a:流量识别设备待分析流量的第一报文。
其中,第一报文为该待分析流量的报文中到达时间间隔小于第三预设阈值的报文。例如,第三预设阈值为10ms(毫秒),流量识别设备将待分析流量中的到达时间间隔为10ms的报文作为第一报文。
需要说明的是,第三预设阈值的大小具体可以根据当前网络传输状态确定。例如,当网络传输状态较好时,第三预设阈值较小;当网络传输状态较差时,第三预设阈值较大。而网络传输状态具体可以通过网络传输的带宽和时延来确定。
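Step 202b: The traffic identification device determines the ratio of the number of first packets to the total number of packets extracted from the to-be-analyzed traffic.
For example, the total number of packets of the to-be-analyzed traffic is M, and the number of packets whose arrival time interval is less than 10 ms is L. The ratio of the number of packets whose arrival time interval is less than 10 ms to the total number of packets of the to-be-analyzed traffic is therefore L/M, where L is an integer greater than 0 and less than M.
Step 202c: The traffic identification device determines whether the ratio is greater than a fourth preset threshold. If yes, step 202d is performed; if no, step 202e is performed.
For example, the fourth preset threshold is 80%. The traffic identification device determines whether the ratio is greater than 80%. If so, the device preliminarily determines that the to-be-analyzed traffic is traffic of the first type, and may further analyze the to-be-analyzed traffic to identify its type accurately. If not, the device determines that the to-be-analyzed traffic is traffic of the second type.
It should be noted that the fourth preset threshold may be set based on data from multiple experiments.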
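Step 202d: The traffic identification device triggers execution of step 203.
When the traffic identification device determines that the ratio is greater than the fourth preset threshold, the device performs step 203 to further identify the to-be-analyzed traffic. The screening in step 202c spares the device from further analyzing all network traffic that passes through it, which improves identification efficiency.
Step 202e: The traffic identification device determines that the to-be-analyzed traffic is traffic of the second type.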
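The screening in steps 202a to 202e can be sketched as follows. The function name `passes_prefilter` and its parameter names are hypothetical; the defaults are the example values from the text (a third preset threshold of 10 ms and a fourth preset threshold of 80%), and treating an empty sample as not passing is an assumption made here.

```python
from typing import Sequence

def passes_prefilter(intervals_ms: Sequence[float],
                     interval_max_ms: float = 10.0,
                     ratio_min: float = 0.8) -> bool:
    """Return True when the share of short-interval packets exceeds ratio_min."""
    if not intervals_ms:
        return False  # assumption: no packets means nothing to analyze further
    # Steps 202a/202b: count first packets (interval below the third threshold)
    # and divide by the total number of packets.
    short = sum(1 for t in intervals_ms if t < interval_max_ms)
    # Step 202c: strict comparison against the fourth threshold.
    return short / len(intervals_ms) > ratio_min
```

A return value of True corresponds to step 202d (continue with step 203); False corresponds to step 202e (second type).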
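In the embodiments of this application, optionally, before step 202 in the embodiment shown in FIG. 2A, the embodiment shown in FIG. 2A further includes steps 202f to 202h.
Step 202f: The traffic identification device obtains historical traffic of the first type.
The historical traffic is game traffic or video traffic.
Specifically, a server labels the historical traffic, and the label indicates that the historical traffic is traffic of the first type. In this way, when the traffic identification device receives the historical traffic, it determines from the label that the historical traffic is traffic of the first type. The size of the historical traffic is generally chosen to be 1 GB to 2 GB.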
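Step 202g: The traffic identification device determines the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
Step 202g is similar to step 202 in the embodiment shown in FIG. 2A. For details, refer to the related description of step 202 in the embodiment shown in FIG. 2A. Details are not repeated here.
Step 202h: The traffic identification device establishes the reference time interval probability distribution model based on the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
Specifically, the traffic identification device uses the arrival time intervals of the packets of the historical traffic as the horizontal axis and the probabilities of those arrival time intervals as the vertical axis to obtain a probability distribution diagram of the arrival time intervals of the packets of the historical traffic. The device then determines, from this diagram, a candidate time interval probability distribution model to be fitted, and calculates the parameter values of the candidate model. The device substitutes the parameter values into the candidate model to obtain the reference time interval probability distribution model. Optionally, the device calculates the parameter values of the candidate model by maximum likelihood estimation.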
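For example, the historical traffic is game traffic of the game application "王者荣耀" (Honor of Kings). The traffic identification device uses the arrival time intervals of the packets of the historical traffic as the horizontal axis and the probabilities of those arrival time intervals as the vertical axis to obtain a probability distribution diagram of the arrival time intervals of the packets of the historical traffic, as shown in FIG. 3B. FIG. 3B is a schematic diagram of the probability distribution of the arrival time intervals of the packets of historical traffic of the game application "王者荣耀". As can be seen from the diagram in FIG. 3B, the distribution characteristic of the probabilities of the arrival time intervals of the packets of the historical traffic closely matches that of a power-law distribution model. The device therefore determines that the candidate model is a power-law distribution model. The device then calculates the parameter values of the power-law distribution model by maximum likelihood estimation and substitutes them into the power-law distribution model to obtain the reference time interval probability distribution model.
As another example, the historical traffic is video traffic of the video application "爱奇艺" (iQIYI). The traffic identification device uses the arrival time intervals of the packets of the "爱奇艺" historical traffic as the horizontal axis and the probabilities of those arrival time intervals as the vertical axis to obtain a probability distribution diagram of the arrival time intervals of the packets of the historical traffic, as shown in FIG. 3D. FIG. 3D is a schematic diagram of the probability distribution of the arrival time intervals of the packets of historical traffic of the video application "爱奇艺". As can be seen from FIG. 3D, the distribution characteristic of the probabilities of the arrival time intervals of the packets of the historical traffic closely matches that of a Gaussian distribution model. The device therefore calculates the parameter values of the Gaussian distribution model by maximum likelihood estimation and substitutes them into the Gaussian distribution model to obtain the reference time interval probability distribution model.
As these examples show, the distribution characteristic of the probabilities of the arrival time intervals of packets of game traffic closely matches that of a power-law distribution model. Therefore, when identifying game traffic, the traffic identification device can accurately determine whether the to-be-analyzed traffic is game traffic by comparing the distribution characteristic of the probabilities of the arrival time intervals of its packets with that of the power-law distribution model. Likewise, the distribution characteristic of the probabilities of the arrival time intervals of packets of video traffic closely matches that of a Gaussian distribution model, so the device can accurately determine whether the to-be-analyzed traffic is video traffic by comparing the distribution characteristic of the probabilities of the arrival time intervals of its packets with that of the Gaussian distribution model.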
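The maximum likelihood fitting mentioned above can be sketched as follows. The text does not give the exact parameterisation of either model, so this sketch makes assumptions: the power law is taken to be the continuous form p(x) ∝ x^(-alpha) for x ≥ x_min with a known lower cut-off x_min, for which the standard closed-form estimate is alpha = 1 + n / Σ ln(x_i / x_min), and the Gaussian is fitted by its sample mean and the variance that divides by n. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

def fit_power_law(intervals: Sequence[float], x_min: float) -> float:
    """MLE of alpha for a continuous power law p(x) ~ x**(-alpha), x >= x_min."""
    tail = [x for x in intervals if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("need at least one interval strictly above x_min")
    return 1.0 + len(tail) / log_sum

def fit_gaussian(intervals: Sequence[float]) -> Tuple[float, float]:
    """MLE of (mean, variance); the variance divides by n, not n - 1."""
    n = len(intervals)
    mean = sum(intervals) / n
    variance = sum((x - mean) ** 2 for x in intervals) / n
    return mean, variance
```

Substituting the returned parameters into the chosen model then yields the reference model that later serves as F(x) when the fitting parameters are computed.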
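In the embodiments of this application, optionally, step 202 in the embodiment shown in FIG. 2A specifically includes steps 4001 to 4004. Refer to FIG. 4, which is a schematic diagram of another embodiment of the traffic identification method in the embodiments of this application. The method includes:
4001. The traffic identification device determines a first 5-tuple of a second packet and a second 5-tuple of a third packet.
The first 5-tuple includes the source IP address, destination IP address, source port number, destination port number, and transport protocol type of the second packet. The second 5-tuple includes the source IP address, destination IP address, source port number, destination port number, and transport protocol type of the third packet.
Specifically, the traffic identification device obtains the first 5-tuple from the header of the second packet and obtains the second 5-tuple from the header of the third packet.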
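4002. The traffic identification device determines, based on the first 5-tuple and the second 5-tuple, that the second packet and the third packet are two packets of the target data flow.
4003. The traffic identification device determines, based on a first moment and a second moment, that the second packet and the third packet are two packets of the target data flow that the traffic identification device receives consecutively.
The first moment is the moment at which the traffic identification device receives the second packet, and the second moment is the moment at which the traffic identification device receives the third packet.
Specifically, the traffic identification device records a timestamp for each packet in the order in which packets arrive at the device. From the timestamps it can be determined that the device receives the second packet at the first moment and the third packet at the second moment. The device then determines, based on the timestamps, that the second packet and the third packet are two packets of the target data flow that the device receives consecutively.
Optionally, the third packet is the packet immediately after the second packet, or the packet immediately before the second packet. This is not specifically limited herein.
4004. The traffic identification device uses the time difference between the first moment and the second moment as the arrival time interval of the second packet.
Specifically, steps 4003 and 4004 can be understood with reference to the example of step 202 in the embodiment shown in FIG. 2A. Details are not repeated here.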
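Steps 4001 to 4004 can be sketched as follows: packets are grouped by 5-tuple, and within each flow the gap between consecutively received packets is taken as the arrival time interval. The `Packet` type, its field names, and the function `inter_arrival_times` are hypothetical, and the sketch assumes that packet headers have already been parsed and that each gap is attributed to the later packet of the pair.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple

class Packet(NamedTuple):
    timestamp: float   # receive time at the identification device
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str

FiveTuple = Tuple[str, str, int, int, str]

def inter_arrival_times(packets: List[Packet]) -> Dict[FiveTuple, List[float]]:
    """Return, per 5-tuple, the time gaps between consecutively received packets."""
    last_seen: Dict[FiveTuple, float] = {}
    gaps: Dict[FiveTuple, List[float]] = defaultdict(list)
    # Step 4003: order packets by their receive timestamp.
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        key = pkt[1:]  # steps 4001/4002: the 5-tuple identifies the data flow
        if key in last_seen:
            # Step 4004: gap between two consecutive packets of the same flow.
            gaps[key].append(pkt.timestamp - last_seen[key])
        last_seen[key] = pkt.timestamp
    return dict(gaps)
```

A flow with only one packet yields no interval and therefore does not appear in the result.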
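The traffic identification device provided in the embodiments of this application is described below. Refer to FIG. 5, which is a schematic structural diagram of the traffic identification device in the embodiments of this application. The traffic identification device may be used to perform the steps performed by the traffic identification device in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4. For details, refer to the related descriptions in the foregoing method embodiments.
The traffic identification device includes a first obtaining unit 501, a second obtaining unit 502, and a first determining unit 503. Optionally, the traffic identification device further includes an establishing unit 504, a second determining unit 505, and a triggering unit 506.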
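The first obtaining unit 501 is configured to obtain to-be-analyzed traffic of a target data flow.
The second obtaining unit 502 is configured to obtain arrival time intervals of packets of the to-be-analyzed traffic.
The first determining unit 503 is configured to determine the type of the to-be-analyzed traffic based on a distribution characteristic of the probabilities of some or all of the arrival time intervals.
In a possible implementation, the first determining unit 503 is specifically configured to:
determine the type of the to-be-analyzed traffic based on a similarity between the distribution characteristic of the probabilities of some or all of the arrival time intervals and a distribution characteristic of the probabilities of arrival time intervals of packets of historical traffic of a first type.
In another possible implementation, the first determining unit 503 is specifically configured to:
when the similarity is higher than a first similarity, determine that the to-be-analyzed traffic is traffic of the first type.
In another possible implementation, the similarity is represented by a degree of fit between the distribution characteristic of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model, where the reference time interval probability distribution model represents the distribution characteristic of the probabilities of the arrival time intervals of the packets of the historical traffic of the first type. The first determining unit 503 is specifically configured to:
when the degree of fit is higher than a first degree of fit, determine that the to-be-analyzed traffic is traffic of the first type.
In another possible implementation, the degree of fit is represented by the reciprocal of the relative entropy and the KS test statistic. When the reciprocal of the relative entropy equals a first preset threshold and the KS test statistic equals a second preset threshold, the degree of fit is the first degree of fit. When the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
In another possible implementation, the first determining unit 503 is specifically configured to:
calculate a fitting parameter based on some or all of the arrival time intervals, the probabilities of some or all of the arrival time intervals, and the reference time interval probability distribution model, where the reference time interval probability distribution model represents the distribution characteristic of the probabilities of the arrival time intervals of the packets of the historical traffic of the first type, and the fitting parameter indicates the degree of fit between the distribution characteristic of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model; and
determine the type of the to-be-analyzed traffic based on the fitting parameter.
In another possible implementation, the fitting parameter includes the reciprocal of the relative entropy and the Kolmogorov-Smirnov (KS) test statistic. The first determining unit 503 is specifically configured to:
when the traffic identification device determines that the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, determine that the to-be-analyzed traffic is traffic of the first type.
In another possible implementation, the traffic of the first type is game traffic or video traffic.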
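In another possible implementation, the first obtaining unit 501 is further configured to:
obtain historical traffic of the first type.
The second obtaining unit 502 is further configured to:
obtain the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
The establishing unit 504 is configured to establish the reference time interval probability distribution model based on the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
In another possible implementation, the reference time interval probability distribution model includes any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
In another possible implementation, when the first determining unit 503 determines the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some of the arrival time intervals, those arrival time intervals are the arrival time intervals of the packets of the to-be-analyzed traffic that are less than a third preset threshold.
In another possible implementation, the second determining unit 505 is configured to:
determine first packets of the to-be-analyzed traffic, where the arrival time interval of each first packet is less than the third preset threshold; and determine the ratio of the number of first packets to the total number of packets included in the to-be-analyzed traffic.
The triggering unit 506 is configured to: when the traffic identification device determines that the ratio is greater than a fourth preset threshold, trigger the first determining unit 503 to perform the step of determining the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some or all of the arrival time intervals.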
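In the embodiments of this application, the first obtaining unit 501 obtains the to-be-analyzed traffic of the target data flow. The second obtaining unit 502 then obtains the arrival time intervals of the packets of the to-be-analyzed traffic, and the first determining unit 503 determines the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some or all of the arrival time intervals. The probability distribution of the arrival time intervals of packets of each traffic type follows a certain pattern. The first determining unit 503 can therefore accurately identify the type of the to-be-analyzed traffic from the distribution characteristic of the probabilities of the arrival time intervals of its packets. For example, when the distribution characteristic of the probabilities of all or some of the arrival time intervals of the packets of the to-be-analyzed traffic closely matches the distribution characteristic of the probabilities of the arrival time intervals of packets of game traffic, the first determining unit 503 may determine that the to-be-analyzed traffic is game traffic, which allows game traffic to be identified accurately and improves the identification accuracy for game traffic.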
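An embodiment of this application further provides a traffic identification device 600. Refer to FIG. 6, which is another schematic structural diagram of the traffic identification device in the embodiments of this application. The traffic identification device is configured to perform the steps performed by the traffic identification device in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4. For details, refer to the related descriptions in the foregoing method embodiments.
The traffic identification device 600 includes a processor 601, a memory 602, an input/output device 603, and a bus 604.
In a possible implementation, the processor 601, the memory 602, and the input/output device 603 are each connected to the bus 604, and the memory stores computer instructions.
The input/output device 603 is configured to obtain to-be-analyzed traffic of a target data flow.
The processor 601 is configured to obtain arrival time intervals of packets of the to-be-analyzed traffic, and determine the type of the to-be-analyzed traffic based on a distribution characteristic of the probabilities of some or all of the arrival time intervals.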
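In a possible implementation, the processor 601 is specifically configured to:
determine the type of the to-be-analyzed traffic based on a similarity between the distribution characteristic of the probabilities of some or all of the arrival time intervals and a distribution characteristic of the probabilities of arrival time intervals of packets of historical traffic of a first type.
In another possible implementation, the processor 601 is specifically configured to:
when the similarity is higher than a first similarity, determine that the to-be-analyzed traffic is traffic of the first type.
In another possible implementation, the similarity is represented by a degree of fit between the distribution characteristic of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model, where the reference time interval probability distribution model represents the distribution characteristic of the probabilities of the arrival time intervals of the packets of the historical traffic of the first type. The processor 601 is specifically configured to:
when the degree of fit is higher than a first degree of fit, determine that the to-be-analyzed traffic is traffic of the first type.
In another possible implementation, the degree of fit is represented by the reciprocal of the relative entropy and the KS test statistic. When the reciprocal of the relative entropy equals a first preset threshold and the KS test statistic equals a second preset threshold, the degree of fit is the first degree of fit. When the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, the degree of fit is higher than the first degree of fit.
In another possible implementation, the processor 601 is specifically configured to:
calculate a fitting parameter based on some or all of the arrival time intervals, the probabilities of some or all of the arrival time intervals, and the reference time interval probability distribution model, where the reference time interval probability distribution model represents the distribution characteristic of the probabilities of the arrival time intervals of the packets of the historical traffic of the first type, and the fitting parameter indicates the degree of fit between the distribution characteristic of the probabilities of some or all of the arrival time intervals and the reference time interval probability distribution model; and
determine the type of the to-be-analyzed traffic based on the fitting parameter.
In another possible implementation, the fitting parameter includes the reciprocal of the relative entropy and the Kolmogorov-Smirnov (KS) test statistic. The processor 601 is specifically configured to:
when the traffic identification device determines that the reciprocal of the relative entropy is greater than the first preset threshold and the KS test statistic is less than the second preset threshold, determine that the to-be-analyzed traffic is traffic of the first type.
In another possible implementation, the traffic of the first type is game traffic or video traffic.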
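In another possible implementation, the input/output device 603 is further configured to:
obtain historical traffic of the first type.
The processor 601 is further configured to:
obtain the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals; and
establish the reference time interval probability distribution model based on the arrival time intervals of the packets of the historical traffic and the probabilities of those arrival time intervals.
In another possible implementation, the reference time interval probability distribution model includes any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
In another possible implementation, when the processor 601 determines the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some of the arrival time intervals, those arrival time intervals are the arrival time intervals of the packets of the to-be-analyzed traffic that are less than a third preset threshold.
In another possible implementation, the input/output device 603 is further configured to:
determine first packets of the to-be-analyzed traffic, where the arrival time interval of each first packet is less than the third preset threshold; and determine the ratio of the number of first packets to the total number of packets included in the to-be-analyzed traffic.
The processor 601 is further configured to:
when the traffic identification device determines that the ratio is greater than a fourth preset threshold, trigger execution of the step of determining the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some or all of the arrival time intervals.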
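In the embodiments of this application, the input/output device 603 obtains the to-be-analyzed traffic of the target data flow. The processor 601 then obtains the arrival time intervals of the packets of the to-be-analyzed traffic and determines the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some or all of the arrival time intervals. The probability distribution of the arrival time intervals of packets of each traffic type follows a certain pattern, so the processor 601 can accurately identify the type of the to-be-analyzed traffic from the distribution characteristic of the probabilities of the arrival time intervals of its packets. For example, when the distribution characteristic of the probabilities of all or some of the arrival time intervals of the packets of the to-be-analyzed traffic closely matches the distribution characteristic of the probabilities of the arrival time intervals of packets of game traffic, the processor 601 may determine that the to-be-analyzed traffic is game traffic, which allows game traffic to be identified accurately and improves the identification accuracy for game traffic.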
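An embodiment of this application further provides a computer program product including instructions. When the computer program product runs on a computer, the computer is enabled to perform the traffic identification method in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4.
An embodiment of this application further provides a computer-readable storage medium including instructions. When the instructions run on a computer, the computer is enabled to perform the traffic identification method in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4.
In another possible design, when the traffic identification device is a chip in a terminal, the chip includes a processing unit and a communication unit. The processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit. The processing unit can execute computer-executable instructions stored in a storage unit, so that the chip in the terminal performs the traffic identification method in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4. Optionally, the storage unit is a storage unit inside the chip, such as a register or a cache. The storage unit may alternatively be a storage unit in the terminal that is located outside the chip, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
The processor mentioned anywhere above may be a general-purpose central processing unit, a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of a program of the traffic identification method in the embodiments shown in FIG. 2A, FIG. 3A, and FIG. 4.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments. Details are not repeated here.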
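In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.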
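In this application, terms such as "first" and "second" are used to distinguish between identical or similar items whose roles and functions are basically the same. It should be understood that "first", "second", and "n-th" have no logical or temporal dependency on one another and do not limit quantity or execution order. It should also be understood that although the following description uses the terms first, second, and so on to describe various elements, these elements should not be limited by the terms. The terms are only used to distinguish one element from another. For example, without departing from the scope of the various described examples, a first image may be referred to as a second image, and similarly, a second image may be referred to as a first image. Both the first image and the second image may be images, and in some cases may be separate and different images.
In this application, the term "at least one" means one or more, and the term "multiple" means two or more. For example, multiple second packets means two or more second packets. The terms "system" and "network" are often used interchangeably herein.
It should be understood that the terms used in the description of the various examples herein are only for describing particular examples and are not intended to be limiting. As used in the description of the various examples and in the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used herein refers to and covers any and all possible combinations of one or more of the associated listed items. The term "and/or" describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three cases: A alone, both A and B, and B alone. In addition, the character "/" in this application generally indicates an "or" relationship between the associated objects before and after it.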
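It should also be understood that, in the embodiments of this application, the sequence numbers of the processes do not imply an execution order. The execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.
It should be understood that determining B based on A does not mean that B is determined based on A alone; B may also be determined based on A and/or other information.
It should also be understood that the term "include" (also "includes", "including", "comprises", and/or "comprising"), when used in this specification, specifies the presence of the stated features, integers, steps, operations, elements, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "if" may be interpreted to mean "when" or "upon", or "in response to determining" or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined..." or "if [a stated condition or event] is detected" may be interpreted to mean "upon determining..." or "in response to determining..." or "upon detecting [the stated condition or event]" or "in response to detecting [the stated condition or event]".
It should be understood that references throughout the specification to "one embodiment", "an embodiment", or "a possible implementation" mean that a particular feature, structure, or characteristic related to the embodiment or implementation is included in at least one embodiment of this application. Therefore, appearances of "in one embodiment", "in an embodiment", or "a possible implementation" in various places throughout the specification do not necessarily refer to the same embodiment. In addition, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In conclusion, the foregoing embodiments are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.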

Claims (18)

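  1. A traffic identification method, wherein the method comprises:
    obtaining, by a traffic identification device, to-be-analyzed traffic of a target data flow;
    obtaining, by the traffic identification device, arrival time intervals of packets of the to-be-analyzed traffic; and
    determining, by the traffic identification device, a type of the to-be-analyzed traffic based on a distribution characteristic of probabilities of some or all of the arrival time intervals.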
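  2. The method according to claim 1, wherein the determining, by the traffic identification device, a type of the to-be-analyzed traffic based on a distribution characteristic of probabilities of some or all of the arrival time intervals comprises:
    determining, by the traffic identification device, the type of the to-be-analyzed traffic based on a similarity between the distribution characteristic of the probabilities of some or all of the arrival time intervals and a distribution characteristic of probabilities of arrival time intervals of packets of historical traffic of a first type.
  3. The method according to claim 2, wherein the determining, by the traffic identification device, the type of the to-be-analyzed traffic based on a similarity between the distribution characteristic of the probabilities of some or all of the arrival time intervals and a distribution characteristic of probabilities of arrival time intervals of packets of historical traffic of a first type comprises:
    calculating, by the traffic identification device, a fitting parameter based on the some or all of the arrival time intervals, the probabilities of the some or all of the arrival time intervals, and a reference time interval probability distribution model, wherein the reference time interval probability distribution model represents the distribution characteristic of the probabilities of the arrival time intervals of the packets of the historical traffic of the first type, and the fitting parameter indicates a degree of fit between the distribution characteristic of the probabilities of the some or all of the arrival time intervals and the reference time interval probability distribution model; and
    determining, by the traffic identification device, the type of the to-be-analyzed traffic based on the fitting parameter.
  4. The method according to claim 3, wherein the fitting parameter comprises a reciprocal of a relative entropy and a Kolmogorov-Smirnov (KS) test statistic; and the determining, by the traffic identification device, the type of the to-be-analyzed traffic based on the fitting parameter comprises:
    when the traffic identification device determines that the reciprocal of the relative entropy is greater than a first preset threshold and the KS test statistic is less than a second preset threshold, determining, by the traffic identification device, that the to-be-analyzed traffic is traffic of the first type.
  5. The method according to claim 4, wherein the traffic of the first type is game traffic or video traffic.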
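  6. The method according to any one of claims 3 to 5, wherein the method further comprises:
    obtaining, by the traffic identification device, the historical traffic of the first type;
    obtaining, by the traffic identification device, arrival time intervals of packets of the historical traffic and probabilities of the arrival time intervals of the packets of the historical traffic; and
    establishing, by the traffic identification device, the reference time interval probability distribution model based on the arrival time intervals of the packets of the historical traffic and the probabilities of the arrival time intervals of the packets of the historical traffic.
  7. The method according to any one of claims 3 to 6, wherein the reference time interval probability distribution model comprises any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
  8. The method according to any one of claims 1 to 7, wherein when the traffic identification device determines the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some of the arrival time intervals, the some of the arrival time intervals are the arrival time intervals, among the arrival time intervals of the packets of the to-be-analyzed traffic, that are less than a third preset threshold.
  9. The method according to any one of claims 1 to 8, wherein the method further comprises:
    determining, by the traffic identification device, first packets of the to-be-analyzed traffic, wherein an arrival time interval of each first packet is less than a third preset threshold;
    determining, by the traffic identification device, a ratio of the number of first packets to the total number of packets comprised in the to-be-analyzed traffic; and
    when the traffic identification device determines that the ratio is greater than a fourth preset threshold, triggering execution of the step of determining, by the traffic identification device, the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some or all of the arrival time intervals.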
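  10. A traffic identification device, wherein the traffic identification device comprises:
    a first obtaining unit, configured to obtain to-be-analyzed traffic of a target data flow;
    a second obtaining unit, configured to obtain arrival time intervals of packets of the to-be-analyzed traffic; and
    a first determining unit, configured to determine a type of the to-be-analyzed traffic based on a distribution characteristic of probabilities of some or all of the arrival time intervals.
  11. The traffic identification device according to claim 10, wherein the first determining unit is specifically configured to:
    determine the type of the to-be-analyzed traffic based on a similarity between the distribution characteristic of the probabilities of some or all of the arrival time intervals and a distribution characteristic of probabilities of arrival time intervals of packets of historical traffic of a first type.
  12. The traffic identification device according to claim 11, wherein the first determining unit is specifically configured to:
    calculate a fitting parameter based on the some or all of the arrival time intervals, the probabilities of the some or all of the arrival time intervals, and a reference time interval probability distribution model, wherein the reference time interval probability distribution model represents the distribution characteristic of the probabilities of the arrival time intervals of the packets of the historical traffic of the first type, and the fitting parameter indicates a degree of fit between the distribution characteristic of the probabilities of the some or all of the arrival time intervals and the reference time interval probability distribution model; and
    determine the type of the to-be-analyzed traffic based on the fitting parameter.
  13. The traffic identification device according to claim 12, wherein the fitting parameter comprises a reciprocal of a relative entropy and a Kolmogorov-Smirnov (KS) test statistic; and the first determining unit is specifically configured to:
    when the traffic identification device determines that the reciprocal of the relative entropy is greater than a first preset threshold and the KS test statistic is less than a second preset threshold, determine that the to-be-analyzed traffic is traffic of the first type.
  14. The traffic identification device according to claim 13, wherein the traffic of the first type is game traffic or video traffic.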
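  15. The traffic identification device according to any one of claims 12 to 14, wherein the first obtaining unit is further configured to:
    obtain the historical traffic of the first type;
    the second obtaining unit is further configured to:
    obtain arrival time intervals of packets of the historical traffic and probabilities of the arrival time intervals of the packets of the historical traffic;
    the traffic identification device further comprises an establishing unit; and
    the establishing unit is configured to establish the reference time interval probability distribution model based on the arrival time intervals of the packets of the historical traffic and the probabilities of the arrival time intervals of the packets of the historical traffic.
  16. The traffic identification device according to any one of claims 12 to 15, wherein the reference time interval probability distribution model comprises any one of the following: a power-law distribution model, a Gaussian distribution model, a normal distribution model, and a Poisson distribution model.
  17. The traffic identification device according to any one of claims 10 to 16, wherein when the first determining unit determines the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some of the arrival time intervals, the some of the arrival time intervals are the arrival time intervals, among the arrival time intervals of the packets of the to-be-analyzed traffic, that are less than a third preset threshold.
  18. The traffic identification device according to any one of claims 10 to 17, wherein the traffic identification device further comprises a second determining unit and a triggering unit;
    the second determining unit is configured to determine first packets of the to-be-analyzed traffic, wherein an arrival time interval of each first packet is less than a third preset threshold, and determine a ratio of the number of first packets to the total number of packets comprised in the to-be-analyzed traffic; and
    the triggering unit is configured to: when the traffic identification device determines that the ratio is greater than a fourth preset threshold, trigger the first determining unit to perform the step of determining the type of the to-be-analyzed traffic based on the distribution characteristic of the probabilities of some or all of the arrival time intervals.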
PCT/CN2021/083803 2020-04-30 2021-03-30 Traffic identification method and traffic identification device WO2021218528A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21796188.7A EP4131873A4 (en) 2020-04-30 2021-03-30 METHOD AND DEVICE FOR TRAFFIC IDENTIFICATION
US18/050,775 US20230079312A1 (en) 2020-04-30 2022-10-28 Traffic Identification Method and Traffic Identification Device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010362612.0 2020-04-30
CN202010362612.0A CN113595930A (zh) 2020-04-30 2020-04-30 Traffic identification method and traffic identification device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/050,775 Continuation US20230079312A1 (en) 2020-04-30 2022-10-28 Traffic Identification Method and Traffic Identification Device

Publications (1)

Publication Number Publication Date
WO2021218528A1 true WO2021218528A1 (zh) 2021-11-04

Family

ID=78237177

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/083803 WO2021218528A1 (zh) 2020-04-30 2021-03-30 Traffic identification method and traffic identification device

Country Status (4)

Country Link
US (1) US20230079312A1 (zh)
EP (1) EP4131873A4 (zh)
CN (1) CN113595930A (zh)
WO (1) WO2021218528A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114297419A (zh) * 2021-12-31 2022-04-08 北京卓越乐享网络科技有限公司 Multimedia object prediction method, apparatus, device, medium, and program product

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110019581A1 (en) * 2009-07-23 2011-01-27 Ralink Technology Corporation Method for identifying packets and apparatus using the same
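CN107360032A (zh) * 2017-07-20 2017-11-17 中国南方电网有限责任公司 Network flow identification method and electronic device
CN107431663A (zh) * 2015-03-25 2017-12-01 思科技术公司 Network traffic classification
CN109862392A (zh) * 2019-03-20 2019-06-07 济南大学 Method, system, device, and medium for identifying Internet game video traffic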

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
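CN101814977B (zh) * 2010-04-22 2012-11-21 北京邮电大学 Method and apparatus for online identification of TCP traffic using data flow header features
CN110519177B (zh) * 2018-05-22 2022-01-21 华为技术有限公司 Network traffic identification method and related device
CN109412900B (zh) * 2018-12-04 2020-08-21 腾讯科技(深圳)有限公司 Network status identification method, model training method, and apparatus
CN110443657B (zh) * 2019-08-19 2022-03-18 泰康保险集团股份有限公司 Customer traffic data processing method, apparatus, electronic device, and readable medium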

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUANG ZHIGEN, CHEN JIAN, WANG SHAN: "One Kind of Traffic Classification Method based on Packet Length and Packet Inter-Arrival Time", ELECTRONIC MEASUREMENT TECHNOLOGY, vol. 34, no. 11, 30 November 2011 (2011-11-30), XP055861041, DOI: 10.19651/j.cnki.emt.2011.11.031 *
ZENG FANCHAO: "Research on Characteristic-Based Identification of VoIP and Application thereof", CHINA NEW TELECOMMUNICATIONS, no. 9, 30 September 2015 (2015-09-30), pages 66 - 66, XP055861039 *

Also Published As

Publication number Publication date
EP4131873A1 (en) 2023-02-08
US20230079312A1 (en) 2023-03-16
CN113595930A (zh) 2021-11-02
EP4131873A4 (en) 2023-10-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21796188

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021796188

Country of ref document: EP

Effective date: 20221103

NENP Non-entry into the national phase

Ref country code: DE