CN109922081A - A method for analyzing long-connection data in TCP streams - Google Patents

A method for analyzing long-connection data in TCP streams

Info

Publication number
CN109922081A
Authority
CN
China
Prior art keywords
data
connection
protocol
tcp
port
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910261785.0A
Other languages
Chinese (zh)
Other versions
CN109922081B (en)
Inventor
梁永喜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
All-Knowledgeable Science And Technology (hangzhou) Co Ltd
Original Assignee
All-Knowledgeable Science And Technology (hangzhou) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by All-Knowledgeable Science And Technology (hangzhou) Co Ltd
Priority to CN201910261785.0A
Publication of CN109922081A
Application granted
Publication of CN109922081B
Legal status: Active
Anticipated expiration

Landscapes

  • Computer And Data Communications (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a method for analyzing long-connection data in TCP streams. For the data stream of an already established TCP connection, a session is created from the four-tuple information and a small amount of TCP data is cached. The cached content is probed using packet boundaries and port number information in order to match the features of known protocols. The direction of the connection is then determined from the parsed protocol content, the IP addresses, and the port information; protocol messages are parsed in full and the boundaries of message bodies are identified. Simulated handshake connection information is added to the existing communication analysis process, which stays compatible with the standard TCP protocol, so that subsequent valid data can be parsed continuously. The invention thereby addresses protocol identification, parsing, and restoration of valid data for connections that were already established in the traffic. Restoring the complete session behavior of application-layer protocols from network traffic data provides an information source for behavior auditing and risk discovery; for data flowing through services of unknown systems, it provides an information source for discovering data-flow associations.

Description

A method for analyzing long-connection data in TCP streams
Technical field
The invention belongs to the field of network connection analysis, and specifically relates to a method for analyzing long-connection data in TCP streams.
Background
In the classical network communication process, a TCP client initiates a network connection to a server, exchanges data over a specific protocol, and closes the connection when finished. A terminal-facing server therefore sees a large number of such connection setup and teardown operations. For services with heavy traffic, a single server usually cannot complete the task, so multiple servers provide the service together behind a proxy, that is, a load-balancing server is placed in front of them. TCP long connections are established between the load-balancing server and the servers that actually process the business, and client requests are answered and served over these connections. Such long connections stay in use for as little as a few minutes or as long as several hours or days. Most current protocol-analysis techniques for mirrored traffic, however, require a complete TCP session. Data transmitted on an already established connection is generally ignored and not processed further, so the traffic data of currently established connections is lost.
Summary of the invention
To solve the above technical problems in the prior art, the present invention provides a method for analyzing long-connection data in TCP streams, comprising the following steps:
One, for the data stream of an established TCP connection, create a session and cache data;
Two, determine the protocol identification order according to the port number information;
Three, according to the packet boundaries and the cached content, retain a certain amount of data content;
Four, according to the protocol format rules, first pattern-match the set of protocols that the cached data may fit, then analyze at fine granularity whether the relational constraints between the protocol's internal fields are satisfied;
Five, parse the message content using the protocol format obtained from the analysis, extract the field contents of the protocol, identify the field features, and verify that they fall within the valid range defined by the protocol;
Six, identify the direction of the network connection, add the current connection to the parsing process flow, and continue to analyze protocol messages.
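Step two can be read as deciding which protocol detectors to try first. The following is a minimal sketch under stated assumptions: the hint table `PORT_HINTS`, the default order, and the protocol names are hypothetical and are not taken from the patent.

```python
from typing import List

# Hypothetical hint table: ports mapped to the protocol whose detector runs first.
PORT_HINTS = {80: "http", 8080: "http", 3306: "mysql", 6379: "redis"}

# Assumed default order, used when neither port gives a hint.
DEFAULT_ORDER = ["http", "mysql", "redis"]


def probe_order(src_port: int, dst_port: int) -> List[str]:
    """Return protocol names in the order their detectors should be tried."""
    hinted = []
    for port in (dst_port, src_port):
        proto = PORT_HINTS.get(port)
        if proto is not None and proto not in hinted:
            hinted.append(proto)
    # Hinted protocols come first, the remaining defaults keep their usual order.
    return hinted + [p for p in DEFAULT_ORDER if p not in hinted]
```

Both ports are consulted because, for a connection picked up mid-stream, it is not yet known which side is the server.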
Further, identifying the direction of the network connection in step six specifically includes:
1) distinguishing the direction of the network connection according to the parsed field features;
2) if the direction cannot be identified directly, analyzing the IP address information of all active connection sessions; if the IP and port of the current connection already appear as the server-side IP and port of an existing connection session, applying the direction of that existing active session;
3) if no such session exists, calculating a connection direction probability from the records of currently active sessions in which either of the two current IP address and port pairs once acted as the server, the range of the current connection's ports, the probability that each current IP address is a server address, and the probability that each current port is a server port, and thereby distinguishing the direction of the connection.
Further, the relational constraints between fields in step four include dependencies on structure size, field offsets, and/or the set of protocol boundary characters.
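As an illustration of such field constraints, the sketch below checks a size dependency and a field offset on cached bytes. The frame layout (a 4-byte big-endian body length followed by a 1-byte type code), the set of valid type codes, and the size limit are all hypothetical; the patent does not specify a binary format.

```python
import struct

HEADER = struct.Struct(">IB")  # hypothetical layout: 4-byte body length, 1-byte type
KNOWN_TYPES = {1, 2, 3}        # hypothetical set of valid type codes
MAX_BODY = 1 << 20             # assumed sanity limit on the declared body length


def satisfies_constraints(buf: bytes) -> bool:
    """Check size-dependency and field-offset constraints on a cached buffer."""
    if len(buf) < HEADER.size:
        return False
    body_len, msg_type = HEADER.unpack_from(buf, 0)
    if msg_type not in KNOWN_TYPES or body_len > MAX_BODY:
        return False
    end = HEADER.size + body_len
    if len(buf) < end:
        return False  # declared size is not yet covered by the cache
    # If more data follows, the next header must sit exactly at offset `end`.
    if len(buf) >= end + HEADER.size:
        _, next_type = HEADER.unpack_from(buf, end)
        return next_type in KNOWN_TYPES
    return True
```

Checking that a second plausible header starts exactly where the first message ends is what makes the test stricter than a simple pattern match on the first bytes.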
For the data stream of an established TCP connection, the present invention creates a session from the four-tuple information (source IP, source port, destination IP, destination port) and caches a small amount of TCP data. The cached content is probed using packet boundaries and port number information in order to match the features of known protocols. The direction of the connection is then determined from the parsed protocol content, the IP addresses, and the port information; protocol messages are parsed in full and the boundaries of message bodies are identified. Simulated handshake connection information is added to the existing communication analysis process, which stays compatible with the standard TCP protocol, so that subsequent valid data can be parsed continuously. The invention thereby addresses protocol identification, parsing, and restoration of valid data for connections that were already established in the traffic. Restoring the complete session behavior of application-layer protocols from network traffic data provides an information source for behavior auditing and risk discovery; for data flowing through services of unknown systems, it provides an information source for discovering data-flow associations.
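Session creation from a mid-stream segment might look like the sketch below. It is an illustration only: the `Session` class, the `CACHE_LIMIT` value, and the choice of `seq - 1` as the simulated initial sequence number are assumptions, and reordering and retransmission are ignored.

```python
from dataclasses import dataclass, field

CACHE_LIMIT = 4096  # assumed cap on the bytes cached per sender


@dataclass
class Session:
    key: tuple
    cache: dict = field(default_factory=dict)     # sender (ip, port) -> bytearray
    base_seq: dict = field(default_factory=dict)  # sender -> simulated initial sequence number


sessions = {}


def session_key(src_ip, src_port, dst_ip, dst_port):
    """Order the two endpoints so that both directions map to one session."""
    return tuple(sorted([(src_ip, src_port), (dst_ip, dst_port)]))


def on_segment(src_ip, src_port, dst_ip, dst_port, seq, payload):
    """Attach a mid-stream segment to its session, creating the session if needed."""
    key = session_key(src_ip, src_port, dst_ip, dst_port)
    sess = sessions.setdefault(key, Session(key))
    sender = (src_ip, src_port)
    if sender not in sess.base_seq:
        # No SYN was seen: act as if the handshake had used seq - 1 as the ISN,
        # so the first observed byte counts as the first byte of the stream.
        sess.base_seq[sender] = (seq - 1) % 2**32
    buf = sess.cache.setdefault(sender, bytearray())
    room = CACHE_LIMIT - len(buf)
    if room > 0:
        buf.extend(payload[:room])
    return sess
```

Keying on the sorted endpoint pair avoids committing to a client or server role before the direction has been identified.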
Specific embodiments
The invention is further described below.
Every application-layer protocol has some message format, and within a long connection there is generally no obvious marker between consecutive messages. Taking HTTP as an example, the message format has three parts: the first line, the headers (Header), and the body (Body). The three parts are separated by CRLF (carriage return and line feed); the header part itself contains multiple rows separated by CRLF, each in Key-Value form.
The client and the server use the same message format and differ only in the first line, which is an HTTP request first line or an HTTP response first line respectively. The HTTP request first line consists of method, path, and version, separated by spaces; the HTTP response first line consists of version number, status code, and status text, separated by spaces.
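The two first-line formats can be told apart with a pattern match, which also reveals the direction: the sender of a request line is the client. The sketch below is a simplified illustration; the method list and the regular expressions are assumptions and do not cover every form allowed by the HTTP specification.

```python
import re

# Assumed subset of HTTP methods to recognize.
METHODS = (b"GET", b"POST", b"PUT", b"DELETE", b"HEAD", b"OPTIONS", b"PATCH")

# Request first line: method, path, version, separated by single spaces.
REQUEST_RE = re.compile(rb"^(?:" + b"|".join(METHODS) + rb") \S+ HTTP/\d\.\d$")
# Response first line: version, status code, optional status text.
RESPONSE_RE = re.compile(rb"^HTTP/\d\.\d \d{3}(?: .*)?$")


def classify_first_line(line: bytes):
    """Return "request", "response", or None for a first line without its CRLF."""
    if REQUEST_RE.match(line):
        return "request"
    if RESPONSE_RE.match(line):
        return "response"
    return None
```

A result of `None` means the cached data does not start at an HTTP message boundary, or the connection carries a different protocol.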
According to the positions and features of the CRLF separators, the first line, headers, and body are identified, and the identified protocol format is checked against the standard definition. If it conforms to the standard definition, the connection is added to the connection management of established sessions.
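Splitting cached bytes on CRLF and checking the Key-Value shape of each header row can be sketched as follows. The function name and the decision to return `None` for an incomplete header block are assumptions made for this illustration.

```python
def split_http_message(buf: bytes):
    """Split cached bytes into (first_line, headers, body), or None if malformed."""
    head, sep, body = buf.partition(b"\r\n\r\n")
    if not sep:
        return None  # the header block is not complete in the cache yet
    lines = head.split(b"\r\n")
    first_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, colon, value = line.partition(b":")
        if not colon or not name.strip():
            return None  # every header row must look like Key: Value
        headers[name.strip().lower()] = value.strip()
    return first_line, headers, body
```

The returned body may hold only the start of the message body; a header such as Content-Length would be needed to locate the boundary of the next message.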
The present invention specifically includes the following steps:
One, for the data stream of an established TCP connection, create a session and cache data;
Two, determine the protocol identification order according to the port number information;
Three, according to the packet boundaries and the cached content, retain a certain amount of data content;
Four, according to the protocol format rules, first pattern-match the set of protocols that the cached data may fit, then analyze at fine granularity whether the relational constraints between the protocol's internal fields are satisfied, such as dependencies on structure size, field offsets, and the set of protocol boundary characters;
Five, parse the message content using the protocol format obtained from the analysis, extract the field contents of the protocol, identify the field features, and verify that they fall within the valid range defined by the protocol;
Six, distinguish the direction of the network connection according to the parsed field features; if the direction (client and server) can be identified, go to step nine;
Seven, if the direction cannot be identified directly, analyze the IP address information of all active connection sessions; if the IP and port of the current connection already appear as the server-side IP and port of an existing connection session, apply the direction (client and server) of that existing active session and go to step nine;
Eight, if no such session exists, calculate a connection direction probability from the records of currently active sessions in which either of the two current IP address and port pairs once acted as the server, the range of the current connection's ports, the probability that each current IP address is a server address, and the probability that each current port is a server port, and thereby distinguish the direction (client and server) of the connection;
Nine, add the current connection to the parsing process flow and continue to analyze protocol messages.
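Steps six to eight form a fallback chain, which the sketch below illustrates. The patent does not give a scoring formula, so the weights, the port thresholds (1024 and 32768), and all function and parameter names here are assumptions.

```python
def _server_score(endpoint, ip_hits, port_hits):
    """Heuristic evidence that an (ip, port) endpoint is the server side."""
    ip, port = endpoint
    score = float(ip_hits.get(ip, 0) + port_hits.get(port, 0))
    if port < 1024:
        score += 2.0  # assumed weight for well-known ports
    elif port < 32768:
        score += 1.0  # assumed weight for ports below a typical ephemeral range
    return score


def infer_direction(a, b, role_of_a=None, known_servers=frozenset(),
                    ip_hits=None, port_hits=None):
    """Return (server, client, confidence) for the endpoints a and b."""
    ip_hits = ip_hits or {}
    port_hits = port_hits or {}
    # Step six: the parsed fields already identified the role of endpoint a.
    if role_of_a == "server":
        return a, b, 1.0
    if role_of_a == "client":
        return b, a, 1.0
    # Step seven: one endpoint is already the server side of an active session.
    if a in known_servers and b not in known_servers:
        return a, b, 1.0
    if b in known_servers and a not in known_servers:
        return b, a, 1.0
    # Step eight: fall back to a probability built from history and port range.
    score_a = _server_score(a, ip_hits, port_hits)
    score_b = _server_score(b, ip_hits, port_hits)
    total = score_a + score_b
    p_a = score_a / total if total else 0.5
    return (a, b, p_a) if p_a >= 0.5 else (b, a, 1.0 - p_a)
```

Here `ip_hits` and `port_hits` stand for counts of how often an IP address or port was seen on the server side of other active sessions; returning the confidence lets a caller postpone the decision when it is close to 0.5.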

Claims (3)

1. A method for analyzing long-connection data in TCP streams, comprising the following steps:
One, for the data stream of an established TCP connection, create a session and cache data;
Two, determine the protocol identification order according to the port number information;
Three, according to the packet boundaries and the cached content, retain a certain amount of data content;
Four, according to the protocol format rules, first pattern-match the set of protocols that the cached data may fit, then analyze at fine granularity whether the relational constraints between the protocol's internal fields are satisfied;
Five, parse the message content using the protocol format obtained from the analysis, extract the field contents of the protocol, identify the field features, and verify that they fall within the valid range defined by the protocol;
Six, identify the direction of the network connection, add the current connection to the parsing process flow, and continue to analyze protocol messages.
2. The method for analyzing long-connection data in TCP streams according to claim 1, characterized in that identifying the direction of the network connection in step six specifically includes:
1) distinguishing the direction of the network connection according to the parsed field features;
2) if the direction cannot be identified directly, analyzing the IP address information of all active connection sessions; if the IP and port of the current connection already appear as the server-side IP and port of an existing connection session, applying the direction of that existing active session;
3) if no such session exists, calculating a connection direction probability from the records of currently active sessions in which either of the two current IP address and port pairs once acted as the server, the range of the current connection's ports, the probability that each current IP address is a server address, and the probability that each current port is a server port, and thereby distinguishing the direction of the connection.
3. The method for analyzing long-connection data in TCP streams according to claim 1 or 2, characterized in that:
the relational constraints between fields in step four include dependencies on structure size, field offsets, and/or the set of protocol boundary characters.
CN201910261785.0A 2019-04-02 2019-04-02 TCP stream long-connection data analysis method Active CN109922081B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910261785.0A CN109922081B (en) 2019-04-02 2019-04-02 TCP stream long-connection data analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910261785.0A CN109922081B (en) 2019-04-02 2019-04-02 TCP stream long-connection data analysis method

Publications (2)

Publication Number Publication Date
CN109922081A true CN109922081A (en) 2019-06-21
CN109922081B CN109922081B (en) 2021-06-25

Family

ID=66968225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910261785.0A Active CN109922081B (en) 2019-04-02 2019-04-02 TCP stream long-connection data analysis method

Country Status (1)

Country Link
CN (1) CN109922081B (en)

Also Published As

Publication number Publication date
CN109922081B (en) 2021-06-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant