CN111917730A - HTTP bypass flow-based machine behavior analysis method - Google Patents

Publication number
CN111917730A
Authority
CN
China
Prior art keywords
http
machine behavior
bypass
traffic
flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010662522.3A
Other languages
Chinese (zh)
Inventor
李白
陈伟
金路
董永川
鲁萍
黄滔
蒋琦杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Bangsun Technology Co ltd
Original Assignee
Zhejiang Bangsun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Bangsun Technology Co ltd filed Critical Zhejiang Bangsun Technology Co ltd
Priority to CN202010662522.3A priority Critical patent/CN111917730A/en
Publication of CN111917730A publication Critical patent/CN111917730A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • H04L63/0263 Rule management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Abstract

The invention discloses a machine behavior analysis method based on HTTP bypass traffic, comprising the following steps: capturing bypass traffic by monitoring a network port and applying regular-expression filtering to it; splitting the traffic by user and channel so that the traffic of different channels of different users is distinguished, parsing it into probe data, and pushing the probe data to Kafka middleware; and having a rule engine read the probe data from Kafka, compute indexes in real time, and then judge according to rules whether the behavior is network machine behavior. The method identifies network machine access behavior accurately and in real time without intruding on the business system, and it is simple, easy to deploy, and easy to extend. Users can flexibly define statistical indexes and adjust rules dynamically; both are loaded dynamically and take effect in real time. By combining multiple factors in its judgment, the invention achieves a machine behavior recognition rate of more than 90%.

Description

HTTP bypass flow-based machine behavior analysis method
Technical Field
The invention relates to the technical field of networks, in particular to a machine behavior analysis method based on HTTP bypass flow.
Background
With the development of computer and network technologies, automated network machine behavior (such as search engines) has greatly improved work efficiency and made daily life more convenient. However, the accompanying malicious machine behavior poses challenges to service resources, information security, and more. For example, malicious traffic consumes service resources; credential stuffing and crawling cause accounts to be leaked in batches; automated grabbing of appointment slots and tickets undermines social fairness; and bulk scraping of business information damages enterprise interests.
At present, automated network machine behavior is analyzed mainly through traffic logs. This approach has three drawbacks. First, its complexity is high and grows linearly with the complexity of the business system architecture, high-availability cluster deployment, and so on. Second, it is highly intrusive: log-based analysis requires instrumentation points to be embedded in the business system. Third, its accuracy is low, because the logs recorded by the business system discard most of the information in the HTTP protocol.
Therefore, machine behavior analysis based on HTTP bypass traffic is becoming a research hotspot.
Disclosure of Invention
The invention aims to provide a machine behavior analysis method based on HTTP bypass flow aiming at the defects of the existing analysis method.
The purpose of the invention is realized by the following technical scheme: the invention provides a machine behavior analysis method based on HTTP bypass flow, which comprises the following steps:
S1: bypass traffic is captured by monitoring a network port, and regular-expression filtering is applied to the bypass traffic;
S2: the traffic is split by user and channel so that the traffic of different channels of different users is distinguished, and it is parsed into probe data, which is pushed to Kafka middleware;
S3: a rule engine reads the probe data from Kafka, computes indexes in real time, and then judges, in combination with the rules, whether the behavior is network machine behavior.
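The hand-off to Kafka in S2 can be pictured with a short sketch. It is illustrative only and is not taken from the patent: the record layout (a JSON value keyed by `user|channel`) and the `InMemoryTopic` stand-in are assumptions, and a real deployment would hand the same key and value bytes to a Kafka producer client instead.

```python
import json
from typing import Dict, List, Tuple


def to_record(probe: Dict[str, str]) -> Tuple[bytes, bytes]:
    """Serialize one probe into a (key, value) pair.

    Assumption: keying by user and channel keeps each user's traffic on a
    given channel together, mirroring the per-user, per-channel split in S2.
    """
    key = f"{probe['user']}|{probe['channel']}".encode("utf-8")
    value = json.dumps(probe, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return key, value


class InMemoryTopic:
    """Hypothetical stand-in for a Kafka topic, used only for illustration."""

    def __init__(self) -> None:
        self.records: List[Tuple[bytes, bytes]] = []

    def send(self, key: bytes, value: bytes) -> None:
        self.records.append((key, value))


topic = InMemoryTopic()
probe = {"user": "User1", "channel": "WEB", "method": "GET", "uri": "/user"}
topic.send(*to_record(probe))
```

On the consuming side, the rule engine would decode each value with `json.loads` before computing indexes.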
Further, the bypass traffic in step S1 is obtained by capturing packets on a specified network port with libpcap, with the IP address and port specified through a BPF (BSD Packet Filter) expression.
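As a small illustration of the BPF step, the sketch below builds a filter expression in standard pcap filter syntax (`host ... and tcp port ...`). The helper name `build_bpf_filter` and the example address are assumptions; the resulting string would be handed to whichever libpcap binding performs the capture, which is not shown here.

```python
import ipaddress


def build_bpf_filter(ip: str, port: int) -> str:
    """Return a pcap filter expression selecting TCP traffic for one host and port."""
    ipaddress.ip_address(ip)  # raises ValueError if the address is malformed
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"host {ip} and tcp port {port}"


# Example: mirror only plaintext HTTP traffic of one (hypothetical) web server.
http_filter = build_bpf_filter("192.0.2.10", 80)
```

Validating the inputs before formatting keeps a malformed configuration value from producing a filter that fails to compile at capture time.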
Further, step S2 distinguishes different users and distinguishes channels such as iOS, ANDROID, and WEB.
Further, the probe data is a data object that is obtained by parsing and encapsulating the HTTP bypass traffic and that can be recognized by the rule engine.
Further, the probe data includes, but is not limited to, attributes such as the user, channel, request Method, resource URI, User-Agent, IP, and Cookie.
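One possible shape for such a probe object is sketched below. The field names follow the attributes listed above, but the `ProbeData` class, the `parse_request` helper, and in particular the User-Agent-based channel guess are assumptions made for illustration; the patent does not say how the channel is determined.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ProbeData:
    user: Optional[str]
    channel: str
    method: str
    uri: str
    ip: str
    headers: Dict[str, str] = field(default_factory=dict)

    @property
    def user_agent(self) -> str:
        return self.headers.get("user-agent", "")


def parse_request(raw: bytes, ip: str, user: Optional[str] = None) -> ProbeData:
    """Parse the head of a plaintext HTTP/1.x request into a ProbeData object."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, uri, _version = lines[0].split(" ", 2)
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    # Assumption: guess the channel from the User-Agent string.
    ua = headers.get("user-agent", "").lower()
    if "iphone" in ua or "ipad" in ua:
        channel = "iOS"
    elif "android" in ua:
        channel = "ANDROID"
    else:
        channel = "WEB"
    return ProbeData(user, channel, method, uri, ip, headers)
```

Header names are lower-cased on the way in so that later rules can look up `user-agent` or `cookie` without worrying about the sender's capitalization.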
Further, the rule engine is a business decision component that implements predefined or dynamically defined semantics: it accepts data, matches rules, and makes decisions based on the rules.
Further, the rules in the rule engine are user-defined rules, and can be set according to user requirements.
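A minimal sketch of a rule engine with user-defined rules follows. It assumes rules are plain predicates over a probe represented as a dictionary; the `RuleEngine` class and its method names are hypothetical and only illustrate how rules might be added or removed while the engine is running.

```python
from typing import Callable, Dict, List

# A rule is a predicate over one probe; True means the rule matched.
Rule = Callable[[Dict[str, object]], bool]


class RuleEngine:
    def __init__(self) -> None:
        self._rules: Dict[str, Rule] = {}

    def define(self, name: str, rule: Rule) -> None:
        """Add or replace a rule; takes effect on the next evaluation."""
        self._rules[name] = rule

    def remove(self, name: str) -> None:
        self._rules.pop(name, None)

    def evaluate(self, probe: Dict[str, object]) -> List[str]:
        """Return the names of all rules the probe matches."""
        return [name for name, rule in self._rules.items() if rule(probe)]


engine = RuleEngine()
engine.define("empty_user_agent", lambda p: not p.get("user_agent"))
hits = engine.evaluate({"ip": "203.0.113.7", "user_agent": ""})
```

Returning the matched rule names rather than a single boolean leaves room for a multi-factor decision on top, for example requiring more than one rule to match.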
Further, the regular-expression filtering directly discards bypassed traffic of other protocols, such as HTTPS and TPS, and classifies the remaining bypassed traffic as dynamic or static.
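The filtering step might look like the sketch below. Both patterns are assumptions chosen for illustration (the patent's own expressions are not reproduced here): one recognizes a plaintext HTTP/1.x request line, and the other treats a handful of common file suffixes as static resources.

```python
import re
from typing import Optional

# Assumed pattern: a plaintext HTTP/1.x request line at the start of the payload.
REQUEST_LINE = re.compile(
    rb"^(GET|POST|PUT|DELETE|HEAD|OPTIONS|PATCH) (\S+) HTTP/1\.[01]\r\n"
)
# Assumed list of suffixes that mark static resources.
STATIC_SUFFIX = re.compile(r"\.(css|js|png|jpe?g|gif|ico|svg|woff2?)$", re.IGNORECASE)


def classify_payload(payload: bytes) -> Optional[str]:
    """Return 'static' or 'dynamic' for an HTTP request, or None to discard."""
    match = REQUEST_LINE.match(payload)
    if match is None:
        return None  # not plaintext HTTP (for example a TLS record)
    path = match.group(2).decode("ascii", "replace").split("?", 1)[0]
    return "static" if STATIC_SUFFIX.search(path) else "dynamic"
```

Encrypted traffic never matches the request-line pattern, so it is dropped without any protocol-specific handling.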
Further, the rule engine comprises an index calculation container used for storing user-defined indexes and performing real-time index calculation.
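An index calculation container could be sketched as a registry of named indexes that are updated as probes arrive. The classes below are hypothetical; the sliding-window counter assumes timestamps (in seconds) arrive in non-decreasing order, and it models an index such as "number of accesses by an IP on the iOS channel in the past 24 hours".

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Hashable


class SlidingCountIndex:
    """Counts events per key inside a trailing time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self._events: Dict[Hashable, Deque[float]] = defaultdict(deque)

    def add(self, key: Hashable, timestamp: float) -> None:
        self._events[key].append(timestamp)

    def value(self, key: Hashable, now: float) -> int:
        events = self._events[key]
        # Drop events that have fallen out of the window.
        while events and events[0] <= now - self.window:
            events.popleft()
        return len(events)


class IndexContainer:
    """Holds user-defined indexes by name; indexes can be registered at runtime."""

    def __init__(self) -> None:
        self._indexes: Dict[str, SlidingCountIndex] = {}

    def register(self, name: str, index: SlidingCountIndex) -> None:
        self._indexes[name] = index

    def update(self, name: str, key: Hashable, timestamp: float) -> None:
        self._indexes[name].add(key, timestamp)

    def query(self, name: str, key: Hashable, now: float) -> int:
        return self._indexes[name].value(key, now)


container = IndexContainer()
container.register("ios_ip_access_24h", SlidingCountIndex(24 * 3600))
container.update("ios_ip_access_24h", ("iOS", "203.0.113.7"), timestamp=0)
```

Rules can then compare `container.query(...)` against a threshold instead of recomputing the count from raw probes.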
The invention has the beneficial effects that:
(1) The invention reduces intrusion into the business system and is simple, easy to deploy, and easy to extend.
(2) The invention lets users flexibly define statistical indexes and adjust rules dynamically; both are loaded dynamically and take effect in real time.
(3) The invention combines multiple factors in its judgment, which improves the recognition rate of machine behavior analysis.
(4) The method has high identification accuracy and can effectively identify more than 90% of network machine behaviors.
Drawings
FIG. 1 is a diagram of the bypass traffic collection model of the present invention.
FIG. 2 is a diagram of the rule engine identification model of the present invention.
Detailed Description
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1 and FIG. 2, the machine behavior analysis method based on HTTP bypass traffic provided by the present invention includes the following steps:
(1) The bypass traffic component is configured with validation and filtering regular expressions, such as /(http):\/\/([\w.]+\/…
(2) On startup, the bypass traffic component loads the configuration and generates a regular-expression model.
(3) Using libpcap, the bypass traffic component monitors the specified network port and captures bypass traffic, with the IP address and port specified through a BPF (BSD Packet Filter) expression.
(4) The bypassed traffic is passed through the regular-expression model from step (2), which filters out non-HTTP requests (other protocols such as HTTPS and TPS) and static resources.
(5) The traffic is parsed into probe data by user and channel, as shown in the following table (ao1 … aon are probe data; Attribute 1 … n are the attributes parsed from the HTTP request).
Probe data   User    Channel   METHOD   URI       Attribute 1 … n
ao1          User1   WEB       GET      …/user    Cookie…
ao2          User1   iOS       POST     …/login   User-Agent…
ao3          User1   ANDROID   GET      …/query   Referer…
ao4          User2   WEB       POST     …/add     Cookie…
aon
(6) The probe data is pushed to Kafka middleware.
(7) The user defines indexes: one or more indexes can be predefined or defined dynamically, such as the list of User-Agent values used by an IP on the WEB channel, or the number of accesses by an IP on the iOS channel in the past 24 hours.
(8) The user defines rules: one or more rules can be predefined or defined dynamically, such as "User-Agent is empty" or "the number of different User-Agent values used by an IP on the WEB channel in the past 3 hours is greater than 15".
(9) On startup, the rule engine loads the user-defined indexes and rules and generates an index model and a rule model.
(10) The rule engine reads the probe data from Kafka and pushes it to the index model and the rule model.
(11) The rule model judges whether the traffic is machine behavior according to the intermediate results of the real-time index calculation and the rules. As an example of the recognition results, the IP 183.226.234.xxx was found using multiple User-Agent values:
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; {D9D54F49-E51C-445e-92F2-1EE3C2313240}),
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; {D9D54F49-E51C-445e-92F2-1EE3C2313240}),
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; {D9D54F49-E51C-445e-92F2-1EE3C2313240}),
Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko
In machine behavior analysis tests on bypass traffic comprising millions of transaction records, the recognition rate exceeded 90%.
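The two example rules from step (8) can be combined in a short sketch. It is an illustration under stated assumptions rather than the patented implementation: the `DistinctUserAgentIndex` class and `is_machine` function are hypothetical names, timestamps are in seconds, and the 3-hour window and threshold of 15 are taken from the example rule.

```python
from collections import defaultdict
from typing import Dict, Tuple


class DistinctUserAgentIndex:
    """Tracks, per (channel, IP), when each User-Agent value was last seen."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self._seen: Dict[Tuple[str, str], Dict[str, float]] = defaultdict(dict)

    def add(self, channel: str, ip: str, user_agent: str, timestamp: float) -> None:
        self._seen[(channel, ip)][user_agent] = timestamp

    def value(self, channel: str, ip: str, now: float) -> int:
        """Number of distinct User-Agent values seen inside the window."""
        cutoff = now - self.window
        return sum(1 for ts in self._seen[(channel, ip)].values() if ts > cutoff)


def is_machine(index: DistinctUserAgentIndex, channel: str, ip: str,
               user_agent: str, now: float, threshold: int = 15) -> bool:
    # Rule "User-Agent is empty".
    if not user_agent:
        return True
    # Rule "more than `threshold` different User-Agent values in the window".
    return index.value(channel, ip, now) > threshold


index = DistinctUserAgentIndex(window_seconds=3 * 3600)
for i in range(16):
    index.add("WEB", "198.51.100.9", f"agent-{i}", timestamp=float(i))
flagged = is_machine(index, "WEB", "198.51.100.9", "agent-0", now=16.0)
```

Here the address has used 16 different User-Agent values within the window, so the second rule matches; once those observations age out of the 3-hour window, the same address is no longer flagged.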
The above-described embodiments are intended to illustrate rather than to limit the invention; any modification or variation made to the invention within its spirit and the scope of the appended claims falls within the scope of protection of the invention.

Claims (9)

1. A machine behavior analysis method based on HTTP bypass traffic, characterized by comprising the following steps:
S1: bypass traffic is captured by monitoring a network port, and regular-expression filtering is applied to the bypass traffic;
S2: the traffic is split by user and channel so that the traffic of different channels of different users is distinguished, and it is parsed into probe data, which is pushed to Kafka middleware;
S3: a rule engine reads the probe data from Kafka, computes indexes in real time, and then judges, in combination with the rules, whether the behavior is network machine behavior.
2. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the bypass traffic in step S1 is obtained by capturing packets on a specified network port with libpcap, with the IP address and port specified through a BPF (BSD Packet Filter) expression.
3. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein step S2 distinguishes different users and distinguishes channels such as iOS, ANDROID, and WEB.
4. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the probe data is a data object that is obtained by parsing and encapsulating the HTTP bypass traffic and that can be recognized by the rule engine.
5. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the probe data includes, but is not limited to, attributes such as the user, channel, request Method, resource URI, User-Agent, IP, and Cookie.
6. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the rule engine is a business decision component that implements predefined or dynamically defined semantics: it accepts data, matches rules, and makes decisions based on the rules.
7. The machine behavior analysis method based on HTTP bypass traffic according to claim 6, wherein the rules in the rule engine are user-defined rules that can be set according to user requirements.
8. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the regular-expression filtering directly discards bypassed traffic of other protocols, such as HTTPS and TPS, and classifies the remaining bypassed traffic as dynamic or static.
9. The machine behavior analysis method based on HTTP bypass traffic according to claim 1, wherein the rule engine includes an index calculation container for storing user-defined indexes and performing real-time index calculation.
CN202010662522.3A 2020-07-10 2020-07-10 HTTP bypass flow-based machine behavior analysis method Pending CN111917730A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010662522.3A CN111917730A (en) 2020-07-10 2020-07-10 HTTP bypass flow-based machine behavior analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010662522.3A CN111917730A (en) 2020-07-10 2020-07-10 HTTP bypass flow-based machine behavior analysis method

Publications (1)

Publication Number Publication Date
CN111917730A true CN111917730A (en) 2020-11-10

Family

ID=73227920

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010662522.3A Pending CN111917730A (en) 2020-07-10 2020-07-10 HTTP bypass flow-based machine behavior analysis method

Country Status (1)

Country Link
CN (1) CN111917730A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115567503A (en) * 2022-12-07 2023-01-03 华信咨询设计研究院有限公司 HTTPS protocol analysis method based on flow analysis

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7849502B1 (en) * 2006-04-29 2010-12-07 Ironport Systems, Inc. Apparatus for monitoring network traffic
CN102833111A (en) * 2012-08-30 2012-12-19 北京锐安科技有限公司 Visual hyper text transfer protocol (HTTP) data supervising method and device
CN104601570A (en) * 2015-01-13 2015-05-06 国家电网公司 Network security monitoring method based on bypass monitoring and software packet capturing technology
CN105591833A (en) * 2014-11-26 2016-05-18 中国银联股份有限公司 Flow-acquiring method based on rule engine
CN106789242A (en) * 2016-12-22 2017-05-31 广东华仝九方科技有限公司 A kind of identification application intellectual analysis engine based on mobile phone client software behavioral characteristics storehouse
CN107026821A (en) * 2016-02-01 2017-08-08 阿里巴巴集团控股有限公司 The processing method and processing device of message
CN107426017A (en) * 2017-06-26 2017-12-01 杭州沃趣科技股份有限公司 A kind of method for carrying out data analysis by gathering switch network flow
CN109347817A (en) * 2018-10-12 2019-02-15 厦门安胜网络科技有限公司 A kind of method and device that network security redirects
CN109861995A (en) * 2019-01-17 2019-06-07 安徽谛听信息科技有限公司 A kind of safe big data intelligent analysis method of cyberspace, computer-readable medium
CN110798342A (en) * 2019-10-14 2020-02-14 杭州迪普科技股份有限公司 Method and device for realizing bypass mode based on software

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李若鹏: "基于大数据的网络异常行为检测平台的设计与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑 2018年》 *

Similar Documents

Publication Publication Date Title
CN106815112B (en) Massive data monitoring system and method based on deep packet inspection
Callado et al. A survey on internet traffic identification
CN101924757B (en) Method and system for reviewing Botnet
KR101010302B1 (en) Security management system and method of irc and http botnet
CN106464577B (en) Network system, control device, communication device and communication control method
CN102420701B (en) Method for extracting internet service flow characteristics
US20160191549A1 (en) Rich metadata-based network security monitoring and analysis
WO2022083226A1 (en) Anomaly identification method and system, storage medium and electronic device
US8938534B2 (en) Automatic provisioning of new users of interest for capture on a communication network
US20080162397A1 (en) Method for Analyzing Activities Over Information Networks
US20070076606A1 (en) Statistical trace-based methods for real-time traffic classification
CN106921637A (en) The recognition methods of the application message in network traffics and device
CN111092852A (en) Network security monitoring method, device, equipment and storage medium based on big data
US10148698B2 (en) Selective enforcement of event record purging in a high volume log system
CN102739457A (en) Network flow recognition system and method based on DPI (Deep Packet Inspection) and SVM (Support Vector Machine) technology
CN106656616A (en) Whole network flow analysis method of computer network
CN110958231A (en) Industrial control safety event monitoring platform and method based on Internet
CN101741608A (en) Traffic characteristic-based P2P application identification system and method
US11343143B2 (en) Using a flow database to automatically configure network traffic visibility systems
Fiadino et al. HTTPTag: A flexible on-line HTTP classification system for operational 3G networks
CN113283498A (en) VPN flow rapid identification method facing high-speed network
CN110011860A (en) Android application and identification method based on network traffic analysis
CN111917730A (en) HTTP bypass flow-based machine behavior analysis method
US20090252041A1 (en) Optimized statistics processing in integrated DPI service-oriented router deployments
Abdelkefi et al. SENATUS: an approach to joint traffic anomaly detection and root cause analysis

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room ABCD, 17th floor, building D, Paradise Software Park, No.3 xidoumen Road, Xihu District, Hangzhou City, Zhejiang Province, 310012

Applicant after: Zhejiang Bangsheng Technology Co.,Ltd.

Address before: Room ABCD, 17th floor, building D, Paradise Software Park, No.3 xidoumen Road, Xihu District, Hangzhou City, Zhejiang Province, 310012

Applicant before: ZHEJIANG BANGSUN TECHNOLOGY Co.,Ltd.

CB02 Change of applicant information
RJ01 Rejection of invention patent application after publication

Application publication date: 20201110