CN115484073A - Abnormal flow identification method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN115484073A (application CN202211033370.6A)
- Authority
- CN
- China
- Prior art keywords
- flow
- flow characteristic
- abnormal
- time
- characteristic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The disclosure provides an abnormal traffic identification method and device, an electronic device, and a storage medium, and relates to the technical field of computers, in particular to the technical field of network security. The specific implementation scheme is as follows: extracting at least one flow characteristic from a real-time traffic data stream; comparing the at least one flow characteristic with historical flow characteristics under different monitoring durations to obtain a comparison result; and identifying abnormal flow characteristics from the comparison result, wherein an abnormal flow characteristic is a flow characteristic inconsistent with the change trend of the historical flow characteristics. The method and the device improve the security of internet information and the stability of internet services.
Description
Technical Field
The present disclosure relates to the field of computer technology, and more particularly, to the field of network security technology.
Background
In internet access, a large amount of non-user access exists; for example, crawler technology is used to automatically capture internet information. Abnormal access that does not comply with access restrictions and behavior specifications threatens the security of internet information, and because abnormal access occupies excessive traffic, it also threatens the stability of internet services. Abnormal traffic therefore needs to be identified.
Disclosure of Invention
The disclosure provides an abnormal traffic identification method, an abnormal traffic identification device, an electronic device and a storage medium.
According to an aspect of the present disclosure, there is provided an abnormal traffic identification method, including:
extracting at least one flow characteristic from the real-time flow data stream;
comparing the at least one flow characteristic with historical flow characteristics under different monitoring durations to obtain a comparison result;
and identifying abnormal flow characteristics from the comparison result, wherein the abnormal flow characteristics are used for representing the flow characteristics inconsistent with the change trend of the historical flow characteristics.
According to another aspect of the present disclosure, there is provided an abnormal traffic identification apparatus including:
a feature extraction module for extracting at least one flow feature from a real-time flow data stream;
the characteristic comparison module is used for comparing the at least one flow characteristic with historical flow characteristics under different monitoring time lengths to obtain a comparison result;
and the characteristic identification module is used for identifying abnormal flow characteristics from the comparison result, and the abnormal flow characteristics are used for representing the flow characteristics which are inconsistent with the change trend of the historical flow characteristics.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above-described abnormal traffic identification method.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to execute the above-described abnormal traffic identification method.
According to another aspect of the present disclosure, there is provided a computer program product comprising computer instructions stored in a computer readable storage medium, the computer instructions when executed by a processor implement the above-mentioned abnormal traffic identification method.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
By adopting the method and the device, the abnormal flow characteristics can be identified in real time, and the safety of the internet information and the stability of the internet service are improved.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart diagram of a method for identifying abnormal traffic in accordance with an embodiment of the present disclosure;
FIG. 2 is a flow chart diagram of a method of identifying abnormal traffic in accordance with an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart of a preconfigured database in an abnormal traffic identification method according to an embodiment of the present disclosure;
FIG. 4 is a schematic flow chart illustrating comparison of flow characteristics in an abnormal flow identification method according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an abnormal traffic identification framework in accordance with an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of an abnormal traffic recognition apparatus according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of an electronic device for implementing an abnormal traffic identification method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The embodiment of the present disclosure provides an abnormal traffic identification method, as shown in fig. 1, including:
S101, extracting at least one flow characteristic from the real-time flow data stream.
S102, comparing at least one flow characteristic with historical flow characteristics under a preset monitoring duration to obtain a comparison result.
In some examples, the preset monitoring durations include a long monitoring duration (for example, a cycle of several days) and a short monitoring duration (for example, a cycle of several hours). The at least one flow characteristic is compared separately with the historical flow characteristics under each of these monitoring durations, so as to obtain one or more types of comparison results.
S103, identifying abnormal flow characteristics from the comparison result, wherein the abnormal flow characteristics are used for representing the flow characteristics inconsistent with the change trend of the historical flow characteristics.
In an example of S101-S103, a Flink feature engine supporting real-time data stream computation may be loaded. Flink is a distributed open source computing framework oriented to data stream processing and batch data processing, and based on the Flink feature engine at least one flow characteristic may be extracted from the real-time traffic data stream, where the type of the at least one flow characteristic includes at least one of a number of Internet Protocol (IP) addresses and a number of user identifiers (UIDs). The at least one flow characteristic is compared with historical flow characteristics under different monitoring durations (such as a monitoring duration that reflects the long-term historical change trend of the flow characteristic, and a monitoring duration that reflects a sudden increase or sudden decrease of the flow characteristic within a short time) to obtain a comparison result. If the change trend of the extracted at least one flow characteristic is inconsistent with the change trend of the historical flow characteristics, the extracted flow characteristic is identified from the comparison result as an abnormal flow characteristic.
By adopting the embodiment of the disclosure, at least one flow characteristic can be extracted from the real-time flow data stream, so that the at least one flow characteristic is compared with the historical flow characteristics under different monitoring durations to obtain a comparison result. After the comparison result is obtained, an abnormal flow characteristic can be identified from the comparison result, and the abnormal flow characteristic is used for representing the flow characteristic inconsistent with the change trend of the historical flow characteristic. The abnormal traffic characteristics can be identified in real time, so that non-user access behaviors such as adopting a crawler technology are avoided, the safety of internet information is improved, the abnormal access can be shielded after the abnormal access is identified, the traffic occupied by the abnormal access is avoided, and the stability of internet service is also improved.
In one embodiment, extracting at least one flow characteristic from a real-time traffic data stream comprises: dividing the real-time traffic data stream according to a time window to obtain data stream segments matching the time window, and extracting at least one flow characteristic from each data stream segment. With this embodiment, the time window may be 5 minutes, that is, the real-time traffic data stream is divided every 5 minutes. If an abnormal flow characteristic is found in a divided data stream segment, abnormal alarm information can be sent directly. The 5 minutes is only an example: the shorter the time window, the faster abnormal flow characteristics can be identified, and the better the real-time anomaly monitoring effect.
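As an illustration of the time-window division described above, the following is a minimal Python sketch, not the patented implementation. It assumes each log record is a hypothetical `(timestamp_seconds, payload)` pair and uses fixed, non-overlapping windows; the names `window_start` and `split_into_segments` are invented for this example.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60  # the 5-minute example window from the text


def window_start(ts: float, size: int = WINDOW_SECONDS) -> int:
    """Return the start (in epoch seconds) of the fixed window containing ts."""
    return int(ts // size) * size


def split_into_segments(records, size: int = WINDOW_SECONDS):
    """Group (timestamp, payload) records into per-window data stream segments."""
    segments = defaultdict(list)
    for ts, payload in records:
        segments[window_start(ts, size)].append(payload)
    return dict(segments)


# Hypothetical records: two fall in the first window, one in each of two later windows.
records = [(0, "a"), (299, "b"), (300, "c"), (901, "d")]
segments = split_into_segments(records)
```

Each key of `segments` is a window start time, and each value is the data stream segment from which flow characteristics would then be extracted.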
In one embodiment, the method further comprises: sending abnormal alarm information after the abnormal flow characteristic is identified. With this embodiment, the abnormal alarm information can be sent immediately after the abnormal flow characteristic is identified, so that the abnormal condition can be located in time; alternatively, the abnormal alarm information can be sent after the abnormal flow characteristic is identified and a preset time has elapsed, so that normal service traffic is not falsely reported.
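The two alarm timings can be sketched as follows. This is one possible reading only: it assumes that, when a preset wait is configured, the alarm is sent only if the anomaly is still present once the wait has elapsed, which the source does not specify. The class `DelayedAlarm` and its parameters are hypothetical.

```python
class DelayedAlarm:
    """Decide when to send an alarm: immediately (wait_seconds=0) or after a preset wait."""

    def __init__(self, wait_seconds: float = 0):
        self.wait_seconds = wait_seconds
        self._first_seen = None  # time at which the current anomaly was first observed

    def observe(self, now: float, abnormal: bool) -> bool:
        """Return True if abnormal alarm information should be sent for this observation."""
        if not abnormal:
            # Assumption: a return to normal cancels the pending alarm.
            self._first_seen = None
            return False
        if self._first_seen is None:
            self._first_seen = now
        return now - self._first_seen >= self.wait_seconds


immediate = DelayedAlarm(wait_seconds=0)
delayed = DelayedAlarm(wait_seconds=600)
immediate_result = immediate.observe(0, True)
delayed_results = [delayed.observe(t, True) for t in (0, 300, 600)]
```

With a zero wait the first abnormal observation triggers the alarm; with a 600-second wait the alarm is held until the anomaly has persisted for that long.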
In one embodiment, comparing the at least one flow characteristic with historical flow characteristics under different monitoring durations to obtain a comparison result includes: comparing the at least one flow characteristic with the historical flow characteristic under a first monitoring duration to obtain a first processing result (the first processing result represents the historical change trend of the flow characteristic); comparing the at least one flow characteristic with the historical flow characteristic under a second monitoring duration to obtain a second processing result (the second processing result represents a sudden increase or sudden decrease of the flow characteristic within a preset short time); and obtaining the comparison result according to the first processing result and the second processing result. With this embodiment, the at least one flow characteristic is evaluated against multiple comparison standards with different monitoring durations, so as to accurately judge whether it is an abnormal flow characteristic.
In one embodiment, the method further comprises: and configuring the types of the flow characteristics to be compared, comparing at least one flow characteristic with the type (such as at least one of an IP number and a UID number) of the flow characteristic to obtain a third processing result, and updating the comparison result according to the third processing result to obtain an updated comparison result. An abnormal flow characteristic is identified from the updated comparison. By adopting the embodiment, not only can the extracted at least one flow characteristic be compared with the historical flow characteristic, but also the extracted at least one flow characteristic can be compared with the preconfigured flow characteristic and the type thereof, so that whether the at least one flow characteristic is the abnormal flow characteristic or not can be accurately judged through the comprehensive comparison. Moreover, the initial comparison result can be updated and put in a database, so that the database of the flow characteristics can be continuously improved.
In some examples, the following method for identifying abnormal traffic is provided, as shown in fig. 2, and includes:
S201, acquiring a real-time traffic data stream, and dividing a time window for the real-time traffic data stream.
For example, the execution subject of this embodiment may be a server (such as a local server, a cloud server, a server cluster, and the like), a computer, a terminal device, a processor, a chip, and the like, which is not limited in this embodiment.
Wherein the real-time traffic data stream is from a real-time access log of a website; while the time window may be a shorter time interval, e.g. 5 minutes or 10 minutes, preferably the time window is 5 minutes. The time duration of the current time window can be flexibly set according to the actual situation (for example, the time duration can be set by comprehensively considering real-time monitoring and load balancing).
It should be noted that, in the identification of the abnormal traffic characteristics, if only the traffic data in a short time (for example, a second level) in the acquired log data is compared, false alarm is likely to occur in a service with large traffic disturbance, and setting a time window of a minute level, such as 5 minutes or 10 minutes, takes into account both the real-time property of data and the richness of data, and avoids false alarm.
S202, configuring the required flow characteristics in advance, and calculating at least one flow characteristic of the real-time flow data stream in the time window in real time.
As shown in fig. 3, step S202 may further include:
S2021, pre-configuring, by a staff member according to historical experience, the flow characteristics to be used, wherein the pre-configuration comprises addition, modification and/or deletion of flow characteristics.
The staff can complete the addition, modification and/or deletion of the flow characteristics through the configuration operation; if the statistic value of a certain parameter in the time window is increased rapidly, the parameter can be used as a newly added flow characteristic to represent a currently unknown new attack and/or cheating type, so that the new attack and/or cheating type can be perceived in real time; if a new attack and/or cheating type represented by a certain flow characteristic does not appear any more, a worker can delete or modify the flow characteristic into other required flow characteristics, wherein the modification of the flow characteristic comprises the following steps: modification of flow characteristic types and/or values.
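The add, modify, and delete operations on configured flow characteristics could be modeled with a small registry, sketched below. The class `FeatureConfig`, its method names, and the `kind` strings are illustrative assumptions and do not come from the source.

```python
class FeatureConfig:
    """Registry of the flow characteristics that should be computed per time window."""

    def __init__(self):
        self._features = {}  # feature name -> feature type (hypothetical representation)

    def add(self, name: str, kind: str) -> None:
        """Add a new flow characteristic, e.g. one that suggests a new attack type."""
        if name in self._features:
            raise ValueError(f"feature already configured: {name}")
        self._features[name] = kind

    def modify(self, name: str, kind: str) -> None:
        """Change the type of an existing flow characteristic."""
        if name not in self._features:
            raise KeyError(name)
        self._features[name] = kind

    def delete(self, name: str) -> None:
        """Remove a flow characteristic that is no longer needed."""
        del self._features[name]

    def names(self):
        return sorted(self._features)


cfg = FeatureConfig()
cfg.add("pv", "count")
cfg.add("ip_count", "distinct")
cfg.modify("ip_count", "distinct_per_window")
cfg.delete("pv")
configured = cfg.names()
```

After these operations only `ip_count` remains configured, mirroring how a staff member might retire a characteristic whose associated attack type no longer appears.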
S2022, calculating in real time at least one flow characteristic of the real time flow data stream within the time window. The at least one traffic characteristic may be one or more of a Page View (PV), a cheat PV, an IP address number, a UID number, and other parameters; the flow characteristics obtained by each calculation are used in the subsequent step S203, and at the same time, the flow characteristics are stored in the database to be called as historical data.
It should be noted that, if a competitor maliciously crawls user data, one or more of the parameters used as flow characteristics (PV, cheating PV, number of IP addresses, and number of UIDs) will inevitably show a sudden change in value within a short time. For example, with a 5-minute time window, statistics of multiple dimensions are collected, such as the PV count, the cheating PV count, the number of IP addresses, and the number of UIDs within the 5 minutes. Introducing multiple flow characteristics provides multi-dimensional statistical information, which greatly improves the accuracy of traffic risk perception.
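The per-window statistics named above can be sketched in Python as follows. The record schema `AccessRecord` is hypothetical, and in particular the `flagged_as_cheat` field is an assumption: the source does not say how a page view is marked as cheating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRecord:
    """Hypothetical access-log record for one page view."""
    ip: str
    uid: str
    flagged_as_cheat: bool  # assumed to be set by an upstream anti-cheating policy


def extract_features(segment):
    """Compute the flow characteristics of one time-window segment."""
    return {
        "pv": len(segment),                                          # page views
        "cheat_pv": sum(1 for r in segment if r.flagged_as_cheat),   # cheating page views
        "ip_count": len({r.ip for r in segment}),                    # distinct IP addresses
        "uid_count": len({r.uid for r in segment}),                  # distinct user identifiers
    }


segment = [
    AccessRecord("1.1.1.1", "u1", False),
    AccessRecord("1.1.1.1", "u2", True),
    AccessRecord("2.2.2.2", "u2", False),
]
features = extract_features(segment)
```

The resulting dictionary holds one value per dimension, which is the form in which the characteristics would be compared and stored as historical data.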
S203, comparing the extracted at least one flow characteristic with at least one historical flow characteristic threshold value.
Wherein, at least one historical flow characteristic threshold value can be obtained by carrying out statistical calculation on the historical flow characteristic.
Specifically, as shown in fig. 4, step S203 may further include:
S2031, comparing the extracted at least one flow characteristic with at least one flow characteristic threshold value of historical synchronization.
For example, the historical contemporaneous flow characteristic threshold may be obtained by calculating the corresponding flow characteristic threshold α over the time period 24 ± 2 hours before the current time, where the value range of the flow characteristic threshold α is: $\mu_1 - c \times \delta_1 < \alpha < \mu_1 + c \times \delta_1$, where $\mu_1$ is the mean value of the corresponding flow characteristic over the 24 ± 2 hour period, $\delta_1$ is the standard deviation of the corresponding flow characteristic over the 24 ± 2 hour period, and $c$ is a constant defined according to different services.
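The threshold range above can be computed as in the following sketch. It assumes the history is a list of hypothetical `(timestamp_seconds, value)` pairs and uses the population standard deviation; the source does not state whether the population or sample form is intended. The function names are invented for this example.

```python
from statistics import mean, pstdev

HOUR = 3600


def samples_in_period(history, now, center_hours=24, half_width_hours=2):
    """Select values whose timestamps fall center_hours ± half_width_hours before now."""
    lo = now - (center_hours + half_width_hours) * HOUR
    hi = now - (center_hours - half_width_hours) * HOUR
    return [value for ts, value in history if lo <= ts <= hi]


def threshold_band(samples, c):
    """Return (mu - c*delta, mu + c*delta) for the given samples."""
    mu = mean(samples)
    delta = pstdev(samples)  # assumption: population standard deviation
    return mu - c * delta, mu + c * delta


now = 100 * HOUR
history = [
    (now - 25 * HOUR, 80),
    (now - 24 * HOUR, 100),
    (now - 23 * HOUR, 120),
    (now - 10 * HOUR, 1000),  # outside the 24 ± 2 hour period, so ignored
]
samples = samples_in_period(history, now)
lower, upper = threshold_band(samples, c=2)
```

Here the three selected samples have mean 100, so the band is centered on 100 and widened by twice their standard deviation.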
S2032, comparing the extracted at least one flow characteristic with at least one flow characteristic threshold value of an adjacent time period.
For example, the flow characteristic threshold of the adjacent time period may be obtained by calculating a flow characteristic threshold β over the time period 0 to 6 hours before the current time, where the value range of the flow characteristic threshold β is: $\mu_2 - c \times \delta_2 < \beta < \mu_2 + c \times \delta_2$, where $\mu_2$ is the mean value of the corresponding flow characteristic over the period 0 to 6 hours before the current time, $\delta_2$ is the standard deviation of the corresponding flow characteristic over that period, and $c$ is a constant defined according to different services.
As shown in S2031-S2032, a method of two comparisons, or dual-standard comparison, is adopted, which can not only capture the sudden increase and sudden decrease of short-time traffic characteristics, but also prevent the false alarm caused by the sudden increase and sudden decrease of the normal traffic of the service itself; for example, in a period of 8 m. By reasonably selecting the historical flow characteristic threshold calculation method and the calculation time period, the calculation amount of the historical flow characteristic threshold calculation is reduced, and the accuracy of flow risk perception is improved.
And S204, if the comparison result is abnormal, sending abnormal alarm information.
For example, if at least one flow characteristic of the real-time flow data stream in the extracted time window is not within at least one flow characteristic threshold range of the historical synchronization period, continuously determining whether at least one flow characteristic of the real-time flow data stream is within at least one flow characteristic threshold range of an adjacent period, and if at least one flow characteristic of the real-time flow data stream is not within at least one flow characteristic threshold range of an adjacent period, sending abnormality warning information for the currently identified abnormal condition.
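The two-stage check described here can be sketched as below: the adjacent-period band is consulted only when the value already falls outside the historical same-period band, and an alarm is raised only if it falls outside both. The function names and the example bands are hypothetical.

```python
def outside(value, band):
    """True if value is not strictly inside the (lower, upper) threshold band."""
    lower, upper = band
    return not (lower < value < upper)


def is_abnormal(value, same_period_band, adjacent_band):
    """Dual-standard comparison: abnormal only if outside both threshold bands."""
    if not outside(value, same_period_band):
        return False  # consistent with the historical same-period trend
    # Outside the same-period band: continue with the adjacent-period band.
    return outside(value, adjacent_band)


same_period_band = (67.0, 133.0)   # hypothetical band from 24 ± 2 hours before
adjacent_band = (90.0, 200.0)      # hypothetical band from the previous 0 to 6 hours
results = [is_abnormal(v, same_period_band, adjacent_band) for v in (100, 150, 500)]
```

The value 150 lies outside the same-period band but inside the adjacent-period band, so it is treated as a normal fluctuation of the service rather than an anomaly.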
By adopting the embodiment of the disclosure, multi-dimensional, dual-standard, real-time online alarming is realized; new, previously unknown attack and/or cheating types can be identified; the timeliness and accuracy of traffic risk perception are improved; false alarms are avoided; and computing resources are saved.
In internet access, take access realized by crawler technology as an example of non-user access. Crawler technology can automatically capture World Wide Web information; abnormal access to network data that does not follow access restrictions and behavior specifications can cause the loss of core content assets, and the excessive traffic occupied by abnormal access threatens the stability of website services. To maintain the security of web information, anomalous network traffic often needs to be identified.
In view of this, crawler traffic needs to be blocked in order to reduce the crawling of internet core content assets (such as encyclopedia content, entries, and other high-quality information content) and improve security, to avoid wasting server resources and improve service stability, and to obtain more, and more accurate, identifiers of malicious risk-control subjects (including IP addresses, UIDs, and the like) through a continuously enriched anti-crawler policy.
The original data is, for example, the log data of a website, which records the time of each access, the accessed object, the identity information left by the visitor, and the like. If a large relational database is used for batch processing in order to block crawler traffic and better identify abnormal risk-control subjects (or abnormal flow characteristics), the following challenges arise.
1) Incomplete policy system: a policy system built from rules based on manual experience can never fully capture every form of cheating traffic. A system is needed to sense possible traffic risk in real time, so that omissions or false hits in the existing policy system can be uncovered through manual analysis.
2) Adversarial nature of black-market attacks: black-market attackers continuously try to bypass the existing anti-cheating policies. A system is needed to sense new attack types in real time, so that the policy system can be reinforced through manual intervention and the new attack types blocked.
3) Risk of human operation: in the daily process of policy development and debugging, a research and development engineer of the traffic anti-cheating platform may still cause errors in, or even failure of, the policy system through misoperation or coding errors. A system is needed to sense abnormal real-time traffic.
4) System bugs: the policy engine and blacklist library behind the traffic anti-cheating platform, and every link of the real-time computation, depend on a large number of different cloud resources. Cloud resource anomalies and crashes occur frequently and require timely manual intervention by engineers, so a system is needed to sense, at the first moment, real-time traffic anomalies caused by system crashes.
In summary, a complete flow anti-cheating system needs a real-time risk sensing system to timely identify abnormal conditions of flow characteristics, so as to send out abnormal alarm information to relevant research and development personnel in time.
The current flow characteristic identification technology only selects data flows within a short time for comparison; the data flows are not real-time, and false alarms occur easily in services with large traffic fluctuations. In this application example, a real-time traffic data stream is identified. After the real-time traffic data stream enters the traffic anti-cheating system shown in FIG. 5, the various flow characteristics of a 5-minute time window are extracted (for example, PV, cheating PV, number of IPs, number of UIDs, and the like), and the flow characteristics are stored in the database after each extraction. They are then compared with the historically calculated flow characteristic values for a first monitoring duration (such as a same-period comparison, for example against the same hour of yesterday: if the current time is 10 o'clock, the average value at 10 o'clock yesterday can be checked) and for a second monitoring duration (such as a period-over-period comparison, for example the average value over the 6 hours before the current time). If there is an anomaly, abnormal alarm information is sent.
Specifically, the system for preventing traffic cheating shown in fig. 5 includes: the system comprises an acquisition module 501, a feature engine module 502, a historical information recording module 503, a monitoring calculation module 504 and an alarm module 505.
The obtaining module 501 is configured to obtain a real-time traffic data stream, and divide a time window for the real-time traffic data stream, for example, the obtaining module 501 of this embodiment may be a server (such as a local server, a cloud server, a server cluster, and the like), a computer, a terminal device, a processor, a chip, and the like, which is not limited in this embodiment. Wherein the real-time traffic data stream is from a real-time access log of a website; while the time window may be a short time interval, e.g. 5 minutes or 10 minutes, preferably the time window is 5 minutes.
The feature engine module 502 is configured to pre-configure a desired flow feature and extract at least one flow feature of the real-time flow data stream within the time window in real time. The flow characteristics required to be used can be preconfigured by the staff according to historical experience, and the preconfiguration comprises addition, modification and/or deletion of the flow characteristics. The staff can also complete the addition, modification and/or deletion of the flow characteristics through the configuration operation; if the statistic value of a certain parameter in the time window is increased rapidly, the parameter can be used as a newly added flow characteristic to represent a currently unknown new attack and/or cheating type, so that the new attack and/or cheating type can be perceived in real time; if a new attack and/or cheating type represented by a certain flow characteristic does not appear any more, a worker can delete or modify the flow characteristic into other required flow characteristics, wherein the modification of the flow characteristic comprises the following steps: modification of flow characteristic types and/or values. 
It should be noted that the feature engine module may be a Flink feature engine module or another engine module that can meet the real-time requirement; preferably, the feature engine module is a Flink feature engine module. The scenario of this example focuses on abnormal traffic risk perception and traffic feature identification. This scenario has a very high real-time requirement: abnormal traffic risks must be discovered and alarmed within 5 minutes. The applicant therefore introduces a Flink feature engine module. Through research, the applicant found that Flink, as a framework and distributed processing engine, maintains high throughput and low latency and is well suited to real-time statistics, whereas the related art does not disclose that Flink can be used for traffic risk perception. To meet the real-time requirement of traffic risk perception, the applicant applies Flink to the configuration and statistics of traffic features and obtains a better technical effect than the related technologies.
The historical information recording module 503 is configured to receive the at least one flow characteristic extracted by the characteristic engine module 502, and send the historical flow characteristic to the monitoring calculation module 504. At least one flow characteristic calculated by the characteristic engine module 502 is sent to the historical information recording module 503, and may be sent to the monitoring calculation module 504 as a historical flow characteristic in the future, for statistical calculation of a historical contemporaneous flow characteristic threshold value and/or a flow characteristic threshold value in an adjacent time period, and the historical information recording module 503 may be a database.
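The role of the historical information recording module can be sketched in Python as a small in-memory store that records one value per characteristic per window and returns the values falling in a requested period. The text only says the module may be a database, so the in-memory layout, the class name `FeatureHistory`, and the half-open interval convention are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


class FeatureHistory:
    """In-memory stand-in for the historical information recording module."""

    def __init__(self) -> None:
        self._rows: Dict[str, List[Tuple[datetime, float]]] = {}

    def record(self, name: str, window_start: datetime, value: float) -> None:
        self._rows.setdefault(name, []).append((window_start, value))

    def values_between(self, name: str, start: datetime, end: datetime) -> List[float]:
        """Values whose window start lies in the half-open interval [start, end)."""
        return [v for ts, v in self._rows.get(name, []) if start <= ts < end]

    def same_period(self, name: str, now: datetime) -> List[float]:
        # Historical same period: 24 +/- 2 hours before the current time.
        return self.values_between(
            name, now - timedelta(hours=26), now - timedelta(hours=22)
        )

    def adjacent(self, name: str, now: datetime) -> List[float]:
        # Adjacent time period: 0 to 6 hours before the current time.
        return self.values_between(name, now - timedelta(hours=6), now)
```

The two query helpers return the raw historical values from which the monitoring calculation module could derive its thresholds.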
The monitoring calculation module 504 is configured to compare the at least one calculated flow characteristic with at least one historical flow characteristic threshold.
The at least one historical flow characteristic threshold is obtained by performing statistical calculation on historical flow characteristics, and the comparison specifically includes the following:
Comparing the at least one calculated flow characteristic with at least one flow characteristic threshold of the historical synchronization period. For example, the flow characteristic threshold of the historical synchronization period may be obtained by the monitoring calculation module 504 calculating the corresponding flow characteristic threshold α over the period 24 ± 2 hours before the current time, where the value range of the flow characteristic threshold α is $\mu_1 - c \times \delta_1 < \alpha < \mu_1 + c \times \delta_1$, where $\mu_1$ is the mean value of the corresponding flow characteristic over the 24 ± 2 hour period, $\delta_1$ is the standard deviation of the corresponding flow characteristic over the 24 ± 2 hour period, and $c$ is a constant defined according to different services. Comparing the at least one calculated flow characteristic with at least one flow characteristic threshold of an adjacent time period. For example, the flow characteristic threshold of the adjacent time period may be obtained by the monitoring calculation module 504 calculating the flow characteristic threshold β over the period 0 to 6 hours before the current time, where the value range of the flow characteristic threshold β is $\mu_2 - c \times \delta_2 < \beta < \mu_2 + c \times \delta_2$, where $\mu_2$ is the mean value of the corresponding flow characteristic over the period 0 to 6 hours before the current time, $\delta_2$ is the standard deviation of the corresponding flow characteristic over that period, and $c$ is a constant defined according to different services.
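The threshold range of mean plus or minus c standard deviations can be computed as in the following Python sketch. The text does not say whether the population or the sample standard deviation is intended; the sketch assumes the population form (`statistics.pstdev`), and the function name `threshold_band` and the example value c = 3 are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def threshold_band(history: Sequence[float], c: float) -> Tuple[float, float]:
    """Return (mu - c*delta, mu + c*delta) for one characteristic's history.

    `history` holds the values observed in the reference period, for example
    the period 24 +/- 2 hours ago or the period 0 to 6 hours ago.
    """
    if not history:
        raise ValueError("history must contain at least one value")
    mu = mean(history)
    delta = pstdev(history)  # assumption: population standard deviation
    return mu - c * delta, mu + c * delta


# Example: four historical windows of a distinct-IP count, with c = 3.
low, high = threshold_band([100, 110, 90, 100], c=3)
```

The same function serves both reference periods; only the slice of history passed in differs.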
If at least one flow characteristic of the real-time flow data stream in the time window, as calculated by the feature engine module 502, is not within the corresponding flow characteristic threshold range of the historical synchronization period, it is further judged whether that flow characteristic is within the corresponding flow characteristic threshold range of the adjacent time period; if it is also not within the threshold range of the adjacent time period, the comparison result is determined to be abnormal and a flow risk is perceived.
The alarm module 505 is configured to send alarm information if the comparison result is abnormal.
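The two-stage judgment described above can be sketched in Python as follows. The threshold ranges are passed in as (low, high) pairs that are assumed to have been computed beforehand; `in_band` and `is_abnormal` are hypothetical names, and the strict inequalities mirror the value ranges given in the text.

```python
from typing import Tuple

Band = Tuple[float, float]


def in_band(value: float, band: Band) -> bool:
    """True when value lies strictly inside the (low, high) threshold range."""
    low, high = band
    return low < value < high


def is_abnormal(value: float, same_period_band: Band, adjacent_band: Band) -> bool:
    """Dual-standard comparison for one flow characteristic.

    The value is abnormal only if it falls outside the historical
    same-period range AND outside the adjacent-period range.
    """
    if in_band(value, same_period_band):
        return False  # consistent with the same period one day earlier
    return not in_band(value, adjacent_band)
```

For instance, a value of 500 against a same-period range of (80, 120) is not reported when the adjacent-period range is (400, 600), which is how a normal service surge that has already been building over recent hours avoids a false alarm.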
With the flow anti-cheating system, two comparisons (also called dual-standard comparison) make it possible to capture sudden increases and sudden decreases of short-time flow characteristics while preventing false alarms caused by sudden increases and decreases of normal service traffic; for example, in the period of 8 m. By reasonably selecting the flow characteristic threshold calculation method and the calculation time period, the amount of computation for the flow characteristic thresholds is reduced and the accuracy of flow risk perception is improved. After abnormal flow characteristics are identified, abnormal alarm information can be sent, so that multidimensional, dual-standard, real-time online alarming is realized and previously unknown attack and/or cheating types can also be identified, which improves the timeliness and accuracy of flow risk perception, avoids false alarms, and saves computing resources.
Fig. 6 shows an abnormal traffic recognition apparatus according to an embodiment of the present disclosure, which includes: a feature extraction module 601, configured to extract at least one traffic feature from a real-time traffic data stream; a feature comparison module 602, configured to compare the at least one flow feature with historical flow features for different monitoring durations to obtain a comparison result; a feature identification module 603, configured to identify an abnormal flow feature from the comparison result, where the abnormal flow feature is used to characterize a flow feature inconsistent with a change trend of the historical flow feature.
In one embodiment, the feature extraction module 601 is configured to divide the real-time traffic data stream according to a time window to obtain a data stream segment matching the time window, and to extract the at least one traffic feature from the data stream segment.
In an embodiment, the system further comprises an alarm module, configured to send an abnormal alarm message after identifying the abnormal flow characteristic.
In one embodiment, the feature comparison module 602 is configured to: compare the at least one flow characteristic with the historical flow characteristic for a first monitoring duration to obtain a first processing result, where the first processing result is used for representing the historical change trend of the flow characteristic; compare the at least one flow characteristic with the historical flow characteristic for a second monitoring duration to obtain a second processing result, where the second processing result is used for representing a sudden increase or sudden decrease of the flow characteristic within a preset short time; and obtain the comparison result according to the first processing result and the second processing result.
In one embodiment, the system further comprises an updating module, configured to configure the type of the flow characteristics to be compared; comparing the at least one flow characteristic with the type of the flow characteristic to obtain a third processing result; updating the comparison result according to the third processing result to obtain an updated comparison result; and identifying abnormal flow characteristics from the updated comparison result.
In one embodiment, the at least one flow characteristic comprises: at least one of an IP number and a UID number.
In the technical solution of the present disclosure, the acquisition, storage, application and other handling of the personal information of the users involved all comply with relevant laws and regulations and do not violate public order and good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 shows a schematic block diagram of an example electronic device 700 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/acts specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that the various forms of flows shown above may be used with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, which is not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Claims (15)
1. An abnormal traffic identification method comprises the following steps:
extracting at least one flow characteristic from the real-time flow data stream;
comparing the at least one flow characteristic with historical flow characteristics under different monitoring durations to obtain a comparison result;
and identifying abnormal flow characteristics from the comparison result, wherein the abnormal flow characteristics are used for representing the flow characteristics inconsistent with the change trend of the historical flow characteristics.
2. The method of claim 1, wherein said extracting at least one traffic feature from a real-time traffic data stream comprises:
carrying out flow division on the real-time flow data stream according to a time window to obtain a data stream segment matched with the time window;
extracting the at least one traffic feature from the data stream segment.
3. The method of claim 1 or 2, further comprising:
and after the abnormal flow characteristics are identified, sending abnormal alarm information.
4. The method of claim 1 or 2, wherein the comparing the at least one flow characteristic with historical flow characteristics for different monitoring periods of time to obtain a comparison result comprises:
comparing the at least one flow characteristic with the historical flow characteristic under the first monitoring duration to obtain a first processing result; the first processing result is used for representing the historical change trend of the flow characteristic;
comparing the at least one flow characteristic with the historical flow characteristic under the second monitoring duration to obtain a second processing result; the second processing result is used for representing the change of sudden increase or sudden decrease of the flow characteristic in a preset short time;
and obtaining the comparison result according to the first processing result and the second processing result.
5. The method of claim 4, further comprising:
configuring the types of the flow characteristics to be compared;
comparing the at least one flow characteristic with the type of the flow characteristic to obtain a third processing result;
updating the comparison result according to the third processing result to obtain an updated comparison result;
and identifying abnormal flow characteristics from the updated comparison result.
6. The method of claim 5, wherein the at least one flow characteristic comprises: at least one of an Internet Protocol (IP) number and a User Identification (UID) number.
7. An abnormal traffic identification apparatus comprising:
a feature extraction module for extracting at least one flow feature from a real-time flow data stream;
the characteristic comparison module is used for comparing the at least one flow characteristic with historical flow characteristics under different monitoring durations to obtain a comparison result;
and the characteristic identification module is used for identifying abnormal flow characteristics from the comparison result, and the abnormal flow characteristics are used for representing the flow characteristics which are inconsistent with the change trend of the historical flow characteristics.
8. The apparatus of claim 7, wherein the feature extraction module is to:
carrying out flow division on the real-time flow data flow according to a time window to obtain a data flow segment matched with the time window;
extracting the at least one traffic feature from the data stream segment.
9. The apparatus of claim 7 or 8, further comprising an alarm module to:
and after the abnormal flow characteristics are identified, sending abnormal alarm information.
10. The apparatus of claim 7 or 8, wherein the characteristic comparison module is configured to:
comparing the at least one flow characteristic with the historical flow characteristic under the first monitoring duration to obtain a first processing result; the first processing result is used for representing the historical change trend of the flow characteristic;
comparing the at least one flow characteristic with the historical flow characteristic under the second monitoring duration to obtain a second processing result; the second processing result is used for representing the change of sudden increase or sudden decrease of the flow characteristic in a preset short time;
and obtaining the comparison result according to the first processing result and the second processing result.
11. The apparatus of claim 10, further comprising an update module to:
configuring the types of the flow characteristics to be compared;
comparing the at least one flow characteristic with the type of the flow characteristic to obtain a third processing result;
updating the comparison result according to the third processing result to obtain an updated comparison result;
and identifying abnormal flow characteristics from the updated comparison result.
12. The apparatus of claim 11, wherein the at least one flow characteristic comprises: at least one of an internet protocol number and a user identification number.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211033370.6A CN115484073A (en) | 2022-08-26 | 2022-08-26 | Abnormal flow identification method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211033370.6A CN115484073A (en) | 2022-08-26 | 2022-08-26 | Abnormal flow identification method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115484073A (en) | 2022-12-16 |
Family
ID=84422130
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211033370.6A Pending CN115484073A (en) | 2022-08-26 | 2022-08-26 | Abnormal flow identification method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115484073A (en) |
- 2022-08-26: CN application CN202211033370.6A, published as CN115484073A (en), status: active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107809331B (en) | Method and device for identifying abnormal flow | |
CN110417778B (en) | Access request processing method and device | |
US11444861B2 (en) | Method and apparatus for detecting traffic | |
CN111694718A (en) | Method and device for identifying abnormal behavior of intranet user, computer equipment and readable storage medium | |
CN108306846B (en) | Network access abnormity detection method and system | |
CN111626498A (en) | Equipment operation state prediction method, device, equipment and storage medium | |
CN112738094A (en) | Expandable network security vulnerability monitoring method, system, terminal and storage medium | |
CN113452700B (en) | Method, device, equipment and storage medium for processing safety information | |
CN113656252B (en) | Fault positioning method, device, electronic equipment and storage medium | |
CN111416857A (en) | Client crash processing method, device, system, equipment and storage medium | |
CN112769595A (en) | Abnormality detection method, abnormality detection device, electronic device, and readable storage medium | |
CN117093627A (en) | Information mining method, device, electronic equipment and storage medium | |
CN115484073A (en) | Abnormal flow identification method and device, electronic equipment and storage medium | |
CN115426287B (en) | System monitoring and optimizing method and device, electronic equipment and medium | |
CN116015811A (en) | Method, device, storage medium and electronic equipment for evaluating network security | |
CN115827379A (en) | Abnormal process detection method, device, equipment and medium | |
CN114238069A (en) | Web application firewall testing method and device, electronic equipment, medium and product | |
CN109327433B (en) | Threat perception method and system based on operation scene analysis | |
CN113791897A (en) | Method and system for displaying server baseline detection report of rural telecommunication system | |
CN113259322A (en) | Method, system and medium for preventing Web service abnormity | |
CN113779098B (en) | Data processing method, device, electronic equipment and storage medium | |
CN115378746B (en) | Network intrusion detection rule generation method, device, equipment and storage medium | |
CN115129697A (en) | Data noise reduction method and device | |
CN117689379A (en) | Real-time monitoring method and device for high-concurrency payment scene | |
CN118260176A (en) | Business behavior data processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||