CN111917730A - HTTP bypass flow-based machine behavior analysis method - Google Patents
HTTP bypass flow-based machine behavior analysis method
- Publication number
- CN111917730A (application CN202010662522.3A)
- Authority
- CN
- China
- Prior art keywords
- http
- machine behavior
- bypass
- traffic
- flow
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H04L63/0263—Rule management
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
Abstract
The invention discloses a machine behavior analysis method based on HTTP bypass traffic, comprising the following steps: capturing bypass traffic by listening on a network port and filtering it with regular expressions; splitting the traffic by user and channel so that each user's traffic on each channel is distinguished, parsing it into probe data, and pushing the probe data to Kafka middleware; and having a rule engine read the probe data from Kafka, compute metrics (indexes) in real time, and decide according to rules whether the behavior is network machine behavior. The method identifies network machine access behavior accurately and in real time without intruding on the service system, and it is simple, easy to deploy, and easy to extend. Users can flexibly define statistical metrics and adjust rules dynamically; both are loaded dynamically and take effect in real time. By combining multiple factors in its judgment, the method achieves a recognition rate above 90% in machine behavior analysis.
Description
Technical Field
The invention relates to the technical field of networks, and in particular to a machine behavior analysis method based on HTTP bypass traffic.
Background
With the development of computer and network technologies, automated network machine behaviors (such as search engine crawling) have greatly improved people's working efficiency and brought great convenience to daily life. However, the malicious machine behaviors that accompany them pose challenges to service resources, information security, and more. For example, malicious traffic consumes service resources; credential stuffing (database collision) and crawlers cause accounts to leak in bulk; automated grabbing of appointment slots and tickets undermines social fairness; and bulk scraping of business information damages enterprise interests.
At present, automated network machine behavior is analyzed mainly through traffic logs. This approach has three drawbacks. First, its complexity is high and grows linearly with the complexity of the service system's architecture, high-availability cluster deployment, and so on. Second, it is highly intrusive: log-based analysis requires instrumentation points to be embedded in the service system. Third, its accuracy is low, because the logs recorded by the service system discard most of the information in the HTTP protocol.
Therefore, machine behavior analysis based on HTTP bypass traffic is becoming a research hotspot.
Disclosure of Invention
The invention aims to provide a machine behavior analysis method based on HTTP bypass traffic that addresses the defects of existing analysis methods.
This purpose is achieved by the following technical scheme: the invention provides a machine behavior analysis method based on HTTP bypass traffic, which comprises the following steps:
S1: capturing bypass traffic by listening on a network port, and filtering the bypass traffic with regular expressions;
S2: splitting the traffic by user and channel so that each user's traffic on each channel is distinguished, parsing the traffic into probe data, and pushing the probe data to Kafka middleware;
S3: the rule engine reading the probe data from Kafka, computing metrics (indexes) in real time, and then judging, in combination with the rules, whether the behavior is network machine behavior.
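The hand-off between S2 and S3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the topic name, the JSON encoding, and the user-and-channel message key are all assumptions, and the producer is replaced by an in-memory callback so the sketch runs without a broker.

```python
import json
from typing import Callable, List, Tuple

# Hypothetical topic name; the source does not name one.
PROBE_TOPIC = "http-probe-data"


def publish_probe(probe: dict, send: Callable[[str, bytes, bytes], None]) -> None:
    """Serialize one probe record and hand it to a producer callback."""
    # Keying by user and channel is an assumption made for this sketch;
    # the source only says traffic is split by user and channel.
    key = f"{probe['user']}|{probe['channel']}".encode("utf-8")
    value = json.dumps(probe, sort_keys=True).encode("utf-8")
    send(PROBE_TOPIC, key, value)


# In-memory stand-in for a Kafka producer.
sent: List[Tuple[str, bytes, bytes]] = []
publish_probe(
    {"user": "User1", "channel": "WEB", "method": "GET", "uri": "/user"},
    lambda topic, key, value: sent.append((topic, key, value)),
)
```

With a real client such as kafka-python, the callback could wrap `KafkaProducer.send(topic, key=key, value=value)`; the consumer side would decode the same JSON before handing the record to the rule engine.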
Further, capturing the bypass traffic in step S1 means capturing packets on the specified network port with libpcap, with the IP address and port specified through a BPF (BSD Packet Filter) expression.
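As a small illustration of the BPF step, the sketch below builds a filter expression in standard pcap-filter syntax for one IP address and port. The helper name and the example address are hypothetical, and the sketch only produces the string; passing it to libpcap (for example through `pcap_compile` and `pcap_setfilter`) is left out.

```python
def build_bpf_filter(ip: str, port: int) -> str:
    """Return a pcap-filter expression selecting TCP traffic for ip:port."""
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"tcp and host {ip} and port {port}"


# Hypothetical service address, used only for illustration.
bpf = build_bpf_filter("192.0.2.10", 80)
print(bpf)  # prints: tcp and host 192.0.2.10 and port 80
```

Filtering in the kernel this way means only packets for the monitored service reach the capture component, which keeps the later regular-expression stage small.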
Further, step S2 distinguishes different users and distinguishes channels such as iOS, ANDROID, and WEB.
Further, the probe data is a data object, obtained by parsing and encapsulating the HTTP bypass traffic, that the rule engine can recognize.
Further, the probe data includes, but is not limited to, attributes such as the user, channel, request method, resource URI, User-Agent, IP, and Cookie.
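One possible shape for such a probe object is sketched below. The class and function names are hypothetical, and the user and channel are passed in as arguments because the source does not say how they are derived from the traffic; only the request line and headers are parsed here.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ProbeData:
    user: str
    channel: str
    method: str
    uri: str
    ip: str
    headers: Dict[str, str] = field(default_factory=dict)

    @property
    def user_agent(self) -> str:
        # An absent header is reported as an empty string.
        return self.headers.get("user-agent", "")


def parse_probe(raw: str, ip: str, user: str, channel: str) -> ProbeData:
    """Parse the head of an HTTP/1.x request into a ProbeData object."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, uri, _version = lines[0].split(" ", 2)
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return ProbeData(user, channel, method, uri, ip, headers)


raw = "POST /login HTTP/1.1\r\nHost: example.com\r\nUser-Agent: demo/1.0\r\n\r\nname=a"
probe = parse_probe(raw, ip="203.0.113.7", user="User1", channel="iOS")
```

Header names are lower-cased so that later rules can look up `user-agent` or `cookie` without caring how the client capitalized them.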
Further, the rule engine is a business decision component that implements predefined or dynamically defined semantics: it accepts data, matches rules, and makes decisions according to the rules.
Further, the rules in the rule engine are user-defined and can be set according to user requirements.
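A minimal sketch of such a component is given below, assuming rules are expressed as plain data (field, operator, value) so that they can be replaced at runtime. The rule format, the operator names, and the fact keys are assumptions for illustration; the two example rules mirror the ones the detailed description mentions (an empty User-Agent, and more than 15 different User-Agent values in 3 hours).

```python
import operator
from typing import Any, Callable, Dict, List

# Supported comparison operators (an illustrative subset).
OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
}


class RuleEngine:
    def __init__(self) -> None:
        self._rules: List[dict] = []

    def load(self, rules: List[dict]) -> None:
        """Replace the active rule set; may be called again at runtime."""
        for rule in rules:
            if rule["op"] not in OPS:
                raise ValueError(f"unknown operator: {rule['op']}")
        self._rules = list(rules)

    def decide(self, facts: Dict[str, Any]) -> List[str]:
        """Return the names of all rules matched by the given facts."""
        hits = []
        for rule in self._rules:
            name = rule["field"]
            if name in facts and OPS[rule["op"]](facts[name], rule["value"]):
                hits.append(rule["name"])
        return hits


EMPTY_UA = {"name": "empty-user-agent", "field": "user_agent", "op": "eq", "value": ""}
MANY_UA = {"name": "many-user-agents", "field": "distinct_user_agents_3h", "op": "gt", "value": 15}

engine = RuleEngine()
engine.load([EMPTY_UA])
facts = {"user_agent": "", "distinct_user_agents_3h": 16}
first = engine.decide(facts)

# A rule added at runtime takes effect on the next decision.
engine.load([EMPTY_UA, MANY_UA])
second = engine.decide(facts)
```

Keeping rules as data rather than code is one way to get the dynamic loading the method calls for: a new rule set can be validated and swapped in without restarting the engine.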
Further, the regular-expression filtering directly discards bypassed traffic of other protocols, such as HTTPS and TPS, and classifies the remaining bypassed traffic as dynamic or static.
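The filtering stage might look like the sketch below. The two patterns are assumptions chosen for illustration, not the patent's expressions: one recognizes a plaintext HTTP/1.x request line, and the other treats a short list of file extensions as static resources.

```python
import re
from typing import Optional

# Matches the request line of a plaintext HTTP/1.0 or HTTP/1.1 request.
REQUEST_LINE = re.compile(
    rb"^(GET|POST|PUT|DELETE|HEAD|OPTIONS|PATCH) (\S+) HTTP/1\.[01]\r\n"
)
# Illustrative list of static-resource extensions.
STATIC_SUFFIX = re.compile(r"\.(css|js|png|jpe?g|gif|ico|svg|woff2?)$", re.IGNORECASE)


def classify(payload: bytes) -> Optional[str]:
    """Return 'static' or 'dynamic' for plaintext HTTP requests, else None."""
    match = REQUEST_LINE.match(payload)
    if match is None:
        return None  # not plaintext HTTP (e.g. a TLS record); discard
    # Ignore the query string when looking at the file extension.
    path = match.group(2).decode("ascii", "replace").split("?", 1)[0]
    return "static" if STATIC_SUFFIX.search(path) else "dynamic"
```

For example, `classify(b"GET /img/logo.png?v=2 HTTP/1.1\r\n\r\n")` yields `"static"`, a login POST yields `"dynamic"`, and a TLS handshake record yields `None` and would be dropped.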
Further, the rule engine comprises a metric (index) calculation container that stores user-defined metrics and computes them in real time.
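To make the idea of a real-time calculation container concrete, here is a sketch of a sliding-window store keyed by (channel, IP). It is one possible design under stated assumptions: the class name and API are hypothetical, events are assumed to arrive in timestamp order, and timestamps are passed in explicitly so the behavior is deterministic.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Hashable, Tuple


class WindowMetrics:
    """Keeps (timestamp, value) events per key and answers windowed queries."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self._events: Dict[Hashable, Deque[Tuple[float, str]]] = defaultdict(deque)

    def add(self, key: Hashable, value: str, now: float) -> None:
        self._events[key].append((now, value))
        self._evict(key, now)

    def _evict(self, key: Hashable, now: float) -> None:
        # Drop events that have fallen out of the window.
        events = self._events[key]
        while events and events[0][0] <= now - self.window:
            events.popleft()

    def count(self, key: Hashable, now: float) -> int:
        """Number of events for key inside the window ending at now."""
        self._evict(key, now)
        return len(self._events[key])

    def distinct(self, key: Hashable, now: float) -> int:
        """Number of distinct values for key inside the window ending at now."""
        self._evict(key, now)
        return len({value for _, value in self._events[key]})


HOUR = 3600.0
metrics = WindowMetrics(window_seconds=3 * HOUR)
key = ("WEB", "198.51.100.4")  # hypothetical channel and IP
metrics.add(key, "agent-a", now=0.0)
metrics.add(key, "agent-b", now=1 * HOUR)
metrics.add(key, "agent-b", now=2 * HOUR)
```

Here `count` corresponds to an access-count metric and `distinct` to a "number of different User-Agent values" metric; the intermediate results would then be compared against rule thresholds by the rule engine.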
The invention has the beneficial effects that:
(1) The invention reduces intrusion into the service system, and is simple, easy to deploy, and easy to extend.
(2) The invention lets users flexibly define statistical metrics and adjust rules dynamically; both are loaded dynamically and take effect in real time.
(3) The invention combines multiple factors in its judgment, which improves the recognition rate of machine behavior analysis.
(4) The method has high identification accuracy and can effectively identify more than 90% of network machine behaviors.
Drawings
FIG. 1 is a diagram of the bypass traffic collection model of the present invention.
FIG. 2 is a diagram of the rule-engine-based identification model of the present invention.
Detailed Description
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1 and FIG. 2, the machine behavior analysis method based on HTTP bypass traffic provided by the present invention includes the following steps:
(1) The bypass traffic component is configured with validation and filtering regular expressions, such as /(http):\/\/([\w.]+\/…
(2) The bypass traffic component starts up, loads the configuration, and generates a regular-expression model.
(3) The bypass traffic component listens on the specified network port with libpcap, and specifies the IP address and port through a BPF (BSD Packet Filter) expression to capture the bypass traffic.
(4) The bypassed traffic passes through the regular-expression model from step (2), which filters out non-HTTP requests (other protocols such as HTTPS and TPS) and requests for static resources.
(5) The traffic is parsed into probe data by user and channel, as shown in the following table (ao1 … aon are probe data records, and Attribute 1 … n are the attributes parsed from the HTTP request).
Probe data | User | Channel | METHOD | URI | Attribute 1 … n |
---|---|---|---|---|---|
ao1 | User1 | WEB | GET | …/user | Cookie… |
ao2 | User1 | iOS | POST | …/login | User-Agent… |
ao3 | User1 | ANDROID | GET | …/query | Referer… |
ao4 | User2 | WEB | POST | …/add | Cookie… |
aon | … | … | … | … | … |
(6) The probe data is pushed to Kafka middleware.
(7) The user defines metrics (indexes). One or more metrics can be predefined or defined dynamically, such as the list of User-Agent values seen for a given IP on the WEB channel, or the number of accesses from a given IP on the iOS channel in the past 24 hours.
(8) The user defines rules. One or more rules can be predefined or defined dynamically, such as 'User-Agent is empty' or 'the number of different User-Agent values used by one IP on the WEB channel in the past 3 hours is greater than 15'.
(9) The rule engine starts up and loads the user-defined metrics and rules to generate a metric model and a rule model.
(10) The rule engine reads the probe data from Kafka and pushes it to the metric model and the rule model.
(11) The rule model judges whether the traffic is machine behavior from the intermediate results that the metric model computes in real time, together with the rules. As an example of the recognition result, the address 183.226.234.xxx was found using multiple User-Agent values:
Mozilla/4.0(compatible;MSIE 8.0;Windows NT 6.1;Trident/4.0;SLCC2;.NET CLR2.0.50727;.NET CLR 3.5.30729;.NET CLR 3.0.30729;Media Center PC6.0;.NET4.0C;.NET4.0E;{D9D54F49-E51C-445e-92F2-1EE3C2313240}),
Mozilla/4.0(compatible;MSIE 7.0;Windows NT 6.1;Trident/7.0;SLCC2;.NET CLR2.0.50727;.NET CLR 3.5.30729;.NET CLR 3.0.30729;Media Center PC6.0;.NET4.0C;.NET4.0E;{D9D54F49-E51C-445e-92F2-1EE3C2313240}),
Mozilla/4.0(compatible;MSIE 7.0;Windows NT 6.1;Trident/4.0;SLCC2;.NET CLR2.0.50727;.NET CLR 3.5.30729;.NET CLR 3.0.30729;Media Center PC6.0;.NET4.0C;.NET4.0E;{D9D54F49-E51C-445e-92F2-1EE3C2313240}),
Mozilla/5.0(Windows NT 6.1;Trident/7.0;rv:11.0)like Gecko
In a machine behavior analysis test on millions of bypass traffic records, the recognition rate exceeded 90 percent.
The above-described embodiments are intended to illustrate rather than to limit the invention; any modifications and variations made within the spirit of the invention and the scope of the appended claims fall within the protection scope of the invention.
Claims (9)
1. A machine behavior analysis method based on HTTP bypass traffic, characterized by comprising the following steps:
S1: capturing bypass traffic by listening on a network port, and filtering the bypass traffic with regular expressions;
S2: splitting the traffic by user and channel so that each user's traffic on each channel is distinguished, parsing the traffic into probe data, and pushing the probe data to Kafka middleware;
S3: the rule engine reading the probe data from Kafka, computing metrics (indexes) in real time, and then judging, in combination with the rules, whether the behavior is network machine behavior.
2. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein capturing the bypass traffic in step S1 means capturing packets on the specified network port with libpcap, with the IP address and port specified through a BPF (BSD Packet Filter) expression.
3. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein step S2 distinguishes different users and distinguishes channels such as iOS, ANDROID, and WEB.
4. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the probe data is a data object, obtained by parsing and encapsulating the HTTP bypass traffic, that the rule engine can recognize.
5. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the probe data includes, but is not limited to, attributes such as the user, channel, request method, resource URI, User-Agent, IP, and Cookie.
6. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the rule engine is a business decision component that implements predefined or dynamically defined semantics: it accepts data, matches rules, and makes decisions according to the rules.
7. The machine behavior analysis method based on HTTP bypass traffic according to claim 6, wherein the rules in the rule engine are user-defined and can be set according to user requirements.
8. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the regular-expression filtering directly discards bypassed traffic of other protocols, such as HTTPS and TPS, and classifies the remaining bypassed traffic as dynamic or static.
9. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the rule engine comprises a metric (index) calculation container that stores user-defined metrics and computes them in real time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010662522.3A CN111917730A (en) | 2020-07-10 | 2020-07-10 | HTTP bypass flow-based machine behavior analysis method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010662522.3A CN111917730A (en) | 2020-07-10 | 2020-07-10 | HTTP bypass flow-based machine behavior analysis method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111917730A (en) | 2020-11-10 |
Family
ID=73227920
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010662522.3A Pending CN111917730A (en) | 2020-07-10 | 2020-07-10 | HTTP bypass flow-based machine behavior analysis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111917730A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115567503A (en) * | 2022-12-07 | 2023-01-03 | 华信咨询设计研究院有限公司 | HTTPS protocol analysis method based on flow analysis |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7849502B1 (en) * | 2006-04-29 | 2010-12-07 | Ironport Systems, Inc. | Apparatus for monitoring network traffic |
CN102833111A (en) * | 2012-08-30 | 2012-12-19 | 北京锐安科技有限公司 | Visual hyper text transfer protocol (HTTP) data supervising method and device |
CN104601570A (en) * | 2015-01-13 | 2015-05-06 | 国家电网公司 | Network security monitoring method based on bypass monitoring and software packet capturing technology |
CN105591833A (en) * | 2014-11-26 | 2016-05-18 | 中国银联股份有限公司 | Flow-acquiring method based on rule engine |
CN106789242A (en) * | 2016-12-22 | 2017-05-31 | 广东华仝九方科技有限公司 | A kind of identification application intellectual analysis engine based on mobile phone client software behavioral characteristics storehouse |
CN107026821A (en) * | 2016-02-01 | 2017-08-08 | 阿里巴巴集团控股有限公司 | The processing method and processing device of message |
CN107426017A (en) * | 2017-06-26 | 2017-12-01 | 杭州沃趣科技股份有限公司 | A kind of method for carrying out data analysis by gathering switch network flow |
CN109347817A (en) * | 2018-10-12 | 2019-02-15 | 厦门安胜网络科技有限公司 | A kind of method and device that network security redirects |
CN109861995A (en) * | 2019-01-17 | 2019-06-07 | 安徽谛听信息科技有限公司 | A kind of safe big data intelligent analysis method of cyberspace, computer-readable medium |
CN110798342A (en) * | 2019-10-14 | 2020-02-14 | 杭州迪普科技股份有限公司 | Method and device for realizing bypass mode based on software |
Non-Patent Citations (1)
Title |
---|
李若鹏: "基于大数据的网络异常行为检测平台的设计与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑 2018年》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: Room ABCD, 17th floor, building D, Paradise Software Park, No.3 xidoumen Road, Xihu District, Hangzhou City, Zhejiang Province, 310012. Applicant after: Zhejiang Bangsheng Technology Co.,Ltd. Address before: Room ABCD, 17th floor, building D, Paradise Software Park, No.3 xidoumen Road, Xihu District, Hangzhou City, Zhejiang Province, 310012. Applicant before: ZHEJIANG BANGSUN TECHNOLOGY Co.,Ltd. |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20201110 |