CN104702622B - Many-one type intranet and extranet big data one-way transmission communication means - Google Patents

Info

Publication number
CN104702622B
CN104702622B CN201510141826.4A CN201510141826A CN104702622B CN 104702622 B CN104702622 B CN 104702622B CN 201510141826 A CN201510141826 A CN 201510141826A CN 104702622 B CN104702622 B CN 104702622B
Authority
CN
China
Prior art keywords
data
packet
information
intranet
collector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510141826.4A
Other languages
Chinese (zh)
Other versions
CN104702622A (en)
Inventor
陈博俊
杨蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HONGXU INFORMATION TECHNOLOGY Co Ltd WUHAN
Original Assignee
HONGXU INFORMATION TECHNOLOGY Co Ltd WUHAN
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HONGXU INFORMATION TECHNOLOGY Co Ltd WUHAN
Priority to CN201510141826.4A
Publication of CN104702622A
Application granted
Publication of CN104702622B
Legal status: Active
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a many-to-one, one-way big-data transmission communication system and method between an external network and an intranet, and relates to computer network information processing technology. The system comprises a collector unit (100), an external network server unit (200) and an intranet server unit (300) connected in sequence. The external network server unit (200) consists of a network port packet capture module (201), a data processing and reassembly module (202) and a network port sending module (203), which interact in sequence along a one-way data flow. The intranet server unit (300) is provided with an intranet processing module (301). With the data processing server moved into the intranet, the invention parses the unpacked data and reassembles it into the original protocol without changing the collection program or the data processing and analysis program, thereby achieving efficient, reliable and secure information transmission.

Description

Many-one type intranet and extranet big data one-way transmission communication means
Technical field
The present invention relates to computer network information processing technology, and in particular to a many-to-one, one-way big-data transmission communication system and method between an external network and an intranet, with the advantages of scalability, low cost and high security.
Background art
With the rapid development of computer networks and intensifying competition in the mobile Internet market, information collection at many vendors faces enormous challenges. To collect their own data, many vendors expose their data processing servers to the external network, which poses a serious hidden risk to users' personal information and to the vendors' trade secrets.
Against this background, many vendors build communication systems comprising an external network collector, a third-party network server and an intranet server connected in sequence; the third-party network server, placed between the intranet server that processes the information and the external network collector, acts as a firewall.
This, however, brings another serious problem. Because an additional third-party network server that forwards data sits between the intranet server and the external network collectors, the structure of the data messages it forwards to the intranet server can change considerably, so the information processing program on the intranet server may be unable to handle the data sent by the collectors correctly. Moreover, the existing programs with which many vendors receive and process information are complex and have very large code bases. Changing the data protocol is not only a matter for the information processing server side: the sending protocol on tens of thousands of data collectors would also need to be modified. This is a great inconvenience to enterprises and users, and a large financial outlay for the enterprise.
Summary of the invention
The object of the present invention is to overcome the problems of the prior art by providing a many-to-one, one-way big-data transmission communication system and method between an external network and an intranet that leave the original information processing program and sending program unchanged. An efficient and reliable program that processes the data and reassembles it into the original protocol runs on the external network isolation server. Collector data transmission is therefore unaffected, and the data reaches the data processing end in the collector's original communication protocol format. This solves the problem that the original transmission system would otherwise require extensive code changes, and also provides high reliability and high security for data transmission.
The technical scheme that achieves the object of the invention is as follows:
Design a program for processing massive computer network data that has high recognition capability and high data-processing capacity, can be extended at any time and is easy to manage, in place of unnecessary code changes and a redefinition of the data protocol. It not only solves the current problem of keeping data secure when collecting and processing massive computer network data; most importantly, it ensures that data is processed correctly without changing any code or structure of the original system.
I. Many-to-one, one-way big-data transmission communication system between an external network and an intranet
The system comprises a collector unit, an external network server unit and an intranet server unit connected in sequence.
II. Many-to-one, one-way big-data transmission communication method between an external network and an intranet
Specifically, the method comprises the following steps:
1. Set the configuration file of each collector in the collector unit (for example, the destination port and destination IP);
2. Send the data collected by the collector unit to the external network server unit;
3. The network port packet capture module of the external network server unit captures the packets arriving on the network port connected to the collector unit;
4. The data processing and reassembly module in the external network server unit runs the protocol reassembly algorithm program, which performs data reception and data reassembly on the packets captured by the network port packet capture module, identifies the data sent by the collector unit, and carries out protocol reassembly;
The program ensures that, even under a large burst of simultaneous data accesses, it can efficiently capture the data on the network card of the external network server unit and analyze data at any protocol layer. It discards useless data, reassembles the data sent by the collectors, and separates and reassembles the packets of different streams, preserving the integrity and original form of the information. Once reassembly is complete, the network port sending module transmits the data one-way to the intranet server unit;
5. The intranet server unit receives the data transmitted by the external network server unit and processes it within the intranet.
The present invention has the following features:
1. The system requires no change to the sending code or to the data transmission format and protocol at the collector end;
2. At peak times, when massive data pours in, the system captures the data messages sent by the collectors from the network port and processes them with multiple threads, ensuring high-speed processing without data loss;
3. When reassembling data, a map container and a singly linked list are used. A hash operation is performed over the destination address, source address, destination port, source port and transport protocol of each TCP stream, giving a hash value that identifies each distinct TCP stream. The data within each stream is then reassembled according to its seq numbers, so that every TCP stream is reassembled.
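Feature 2 mentions multi-threaded processing without describing a design. One common arrangement, shown here purely as an assumption and not as the patented implementation, is a capture side that only enqueues packets and a pool of worker threads that drain the queue. The names `process_packets`, `handler` and `workers` are hypothetical.

```python
import queue
import threading


def process_packets(packets, handler, workers=4):
    """Drain `packets` with a pool of worker threads and return the results."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            pkt = q.get()
            if pkt is None:            # sentinel: no more work for this thread
                q.task_done()
                return
            out = handler(pkt)
            with lock:                 # results list is shared between threads
                results.append(out)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for pkt in packets:                # the capture side only enqueues
        q.put(pkt)
    for _ in threads:                  # one sentinel per worker
        q.put(None)
    q.join()
    for t in threads:
        t.join()
    return results
```

Results come back in completion order, not arrival order, so any per-stream state shared between workers would need its own locking or a stream-to-thread assignment.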
Owing to these features, the invention achieves the following positive effects in use:
1. The TCP five-tuple (destination address, source address, destination port, source port and transport protocol) determines the hash value that identifies each TCP stream, and a linked list is attached to each hash value; packet data is reassembled by seq number. Reassembly is therefore unaffected even if network interference during transmission causes packets to arrive out of order, which gives higher reliability;
2. Because the packet analysis and reassembly program runs only on the external network server, neither the massive number of collector systems nor the large-scale data analysis servers need any modification, which is reliable and secure.
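The stream identification described above can be sketched as follows. The text does not name a hash function, so SHA-256 truncated to 64 bits is an assumption made for illustration, as are the function name `flow_hash` and its parameters.

```python
import hashlib
import ipaddress
import struct


def flow_hash(src_ip, dst_ip, src_port, dst_port, proto=6):
    """Return a 64-bit integer identifying one direction of one TCP stream.

    The five-tuple is packed into a fixed byte layout before hashing so that
    the same stream always yields the same value (proto 6 is TCP).
    """
    packed = (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!HHB", src_port, dst_port, proto)
    )
    digest = hashlib.sha256(packed).digest()
    return int.from_bytes(digest[:8], "big")


# The hash value serves as the map key; each value holds that stream's nodes.
streams = {}
key = flow_hash("10.0.0.1", "10.0.0.9", 40000, 9001)
streams.setdefault(key, [])
```

Any hash can collide in principle, so an implementation that must never mix two streams could store the five-tuple alongside the hash value and compare it on lookup.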
In short, with the data processing server moved into the intranet, the invention parses the unpacked data and reassembles it into the original protocol without changing the collection program or the data processing and analysis program, thereby achieving efficient, reliable and secure information transmission.
Brief description of the drawings
Fig. 1 is the block diagram of the system;
Wherein:
100 - collector unit,
101 - 1st collector, 102 - 2nd collector, ...
10N - Nth collector, where N is a natural number, 1 ≤ N < 65535;
200 - external network server unit,
201 - network port packet capture module,
202 - data processing and reassembly module,
203 - network port sending module;
300 - intranet server unit,
301 - intranet processing module;
Fig. 2 is the data reception flow chart;
Fig. 3 is the data reassembly flow chart.
Embodiment
The invention is described in detail below with reference to the drawings and an embodiment:
I. System
1. Overview
As shown in Fig. 1, the system comprises a collector unit 100, an external network server unit 200 and an intranet server unit 300 connected in sequence.
2. Functional blocks
1) Collector unit 100
The collector unit 100 comprises a 1st collector 101, a 2nd collector 102, ... and an Nth collector 10N,
where N is a natural number, 1 ≤ N < 65535;
The 1st collector 101 through the Nth collector 10N collect data and send it to the external network server unit 200 according to the destination port and destination IP configured beforehand.
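The text does not specify the format of the collector configuration file. As a minimal sketch, assuming an INI-style file with a hypothetical `[collector]` section and hypothetical keys `dest_ip` and `dest_port`, a collector could read its destination like this (the address is a documentation-range placeholder):

```python
import configparser

# Hypothetical configuration content; the real file format is not given.
SAMPLE = """
[collector]
dest_ip = 203.0.113.10
dest_port = 9001
"""


def load_destination(text):
    """Return (destination IP, destination port) from INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["collector"]
    port = section.getint("dest_port")
    if not 0 < port < 65536:           # reject values outside the port range
        raise ValueError(f"invalid destination port: {port}")
    return section["dest_ip"], port
```

Because only these two values point a collector at the external network server unit, redirecting a collector needs no change to its sending code.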
2) External network server unit 200
The hardware of the external network server unit 200 is a server host;
The software of the external network server unit 200 consists of the network port packet capture module 201, the data processing and reassembly module 202 and the network port sending module 203, which interact in sequence along a one-way data flow.
(1) Network port packet capture module 201
The hardware of the network port packet capture module 201 is a PCI-Express network card;
The software of the network port packet capture module 201 is mainly a network card driver adapted to the server; the data processing and reassembly module 202 can call system functions to capture the network packets received by the PCI-Express hardware of the network port packet capture module 201.
(2) Data processing and reassembly module 202
The hardware of the data processing and reassembly module 202 is the server host;
The software of the data processing and reassembly module 202 is mainly the protocol reassembly algorithm program.
(3) Network port sending module 203
The hardware of the network port sending module 203 is a PCI-Express network card;
The software of the network port sending module 203 is mainly a network card driver adapted to the server; the data processing and reassembly module 202 can call system functions to send the reassembled data through this network card to the intranet server unit 300.
3) Intranet server unit 300
The hardware of the intranet server unit 300 is a server host.
The software of the intranet server unit 300 is the intranet processing module 301;
The intranet server unit 300 receives the data sent by the external network server unit 200 and processes it as required.
The intranet server unit 300 is provided with the intranet processing module 301;
The hardware of the intranet processing module 301 is the server host;
The software of the intranet processing module 301 is a network data analysis and processing program designed according to the intranet's data processing requirements.
3. Working principle:
1. The 1st collector 101 through the Nth collector 10N each send the data they collect to the external network server unit 200 according to the configured destination port and destination IP;
2. The network port packet capture module 201 of the external network server unit 200 captures the data sent by the collector unit 100 from the network port. The data processing and reassembly module 202 in the external network server unit 200 runs the protocol reassembly algorithm program, which mainly performs data reception processing and data reassembly, as follows.
First, the data processing and reassembly module 202 performs data reception processing on the data captured by the network port packet capture module 201: the data is classified by protocol, the received IP packets are unpacked and their application-layer data analyzed, useful data is kept and useless data is discarded.
Next, the data processing and reassembly module 202 performs data reassembly on the received data, reassembling it according to the original TCP streams so that each complete data message is recovered. For TCP reassembly, the method restores stream data using a map together with linked lists. A hash operation is performed over the five-tuple of each TCP stream (source address, destination address, source port, destination port and communication protocol). The resulting hash value is stored as the first element of the map container and distinguishes the TCP streams; the second element holds the linked-list nodes of the stream, each node representing one application-layer packet of that stream. Each node holds the length of the packet, its seq range, and a memory block whose size in bytes equals the packet size. When the memory block has been completely written, reassembly of that packet is complete; the packet is sent and the node deleted, which completes the stream reassembly of one packet.
Finally, after reassembling the data, the data processing and reassembly module 202 passes it to the network port sending module 203, which sends the reassembled data to the intranet server unit 300;
3. The intranet server unit 300 receives the data sent by the external network server unit 200 through the network card of the server host, and the intranet processing module 301 in the intranet server unit 300 processes the data as required.
II. Method
The main workflow of the protocol reassembly algorithm program running in the data processing and reassembly module 202 is as follows:
1. Data reception workflow
As shown in Fig. 2, the data reception workflow of the data processing and reassembly module 202 comprises the following steps:
A. Capture data from the network port - 21
Initialize the network port packet capture module and start capturing packets on the specified network card. The collectors send packets to the network card of the external network server, the port used by each collector is different, and the packets are stored in a specified memory area;
B. Parse the packet - 22
Unpack and analyze each packet captured from the network port, parsing out the destination address, source address, destination port, source port, transport protocol, TCP-layer seq number and IP ID; also determine the type of the packet and read the application-layer data;
C. Determine whether the protocol is TCP - 23
Examine the parsed packet content to determine whether the protocol is TCP; if so, go to step D, otherwise drop the packet - 27;
D. Compute the hash value - 24
For a packet carried over TCP, perform a hash operation on the TCP five-tuple (destination address, source address, destination port, source port and transport protocol) to compute a hash value that uniquely identifies the TCP stream;
E. Extract the seq number - 25
Extract the seq number of the TCP packet and store it in the data structure together with the hash value;
F. Extract the application-layer data - 26
Extract the data content of the TCP packet.
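The parsing part of the reception workflow (steps B, C, E and F) can be sketched with the standard `struct` module. This is an illustration under stated assumptions rather than the patented program: it assumes untagged Ethernet frames carrying IPv4, and the function name `parse_frame` and the returned dictionary keys are hypothetical. Header offsets follow the standard IPv4 and TCP header layouts.

```python
import socket
import struct


def parse_frame(frame):
    """Parse an Ethernet/IPv4/TCP frame; return None for anything else."""
    if len(frame) < 14 + 20:
        return None
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:                     # not IPv4
        return None
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4                    # IP header length in bytes
    total_len, ip_id = struct.unpack("!HH", ip[2:6])
    proto = ip[9]
    if proto != 6:                              # step C: keep TCP only
        return None
    tcp = ip[ihl:total_len]
    if len(tcp) < 20:                           # truncated TCP header
        return None
    src_port, dst_port, seq = struct.unpack("!HHI", tcp[:8])
    data_off = (tcp[12] >> 4) * 4               # TCP header length in bytes
    return {
        "five_tuple": (socket.inet_ntoa(ip[12:16]), socket.inet_ntoa(ip[16:20]),
                       src_port, dst_port, proto),
        "seq": seq,                             # step E
        "ip_id": ip_id,
        "payload": bytes(tcp[data_off:]),       # step F: application-layer data
    }
```

The returned five-tuple is what step D would feed into the hash operation; frames that are not TCP yield `None`, which corresponds to the drop branch (27).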
2. Data reassembly workflow
As shown in Fig. 3, the data reassembly workflow of the data processing and reassembly module 202 comprises the following steps:
a. Data flows in - 3001
Data that has passed through the data reception workflow of the protocol reassembly algorithm program in the data processing and reassembly module 202 flows into this workflow;
b. Determine whether the TCP stream is already recorded - 3002
Using the computed hash value, determine whether the TCP stream has already been recorded; if so, go to step c, otherwise go to step m;
c. Determine whether there is an information identifier - 3003
From the application-layer data of the packet, determine whether it carries the identifier of useful information; if so, go to step d, otherwise go to step i;
d. Add an information node - 3004
In the existing hash map container, find the head pointer of the linked list that stores the data of this TCP stream, and insert a new information node into the linked list the pointer refers to;
e. Subtract the added information length from the packet length - 3005
After inserting the information content, subtract the length of the information content carried by this packet from the length;
f. Traverse the linked list and check for a node of length 0 - 3006
Traverse the linked list whose information length was reduced in step e and check whether any node has length 0; if so, go to step g, otherwise exit the traversal - 3008;
g. Send the data and delete the node - 3007
A node length of 0 indicates that all packets of that information node have been reassembled; send the entire information content of the node and delete the node;
i. Traverse the linked list and search the seq ranges - 3009
The data structure of each linked-list node holds a starting seq value; subtract that starting value from the seq number of the new data;
j. Determine whether the seq falls within the range of an existing node - 3010
Determine whether the difference computed in step i lies within the information length of the node; if so, go to step k, otherwise drop the packet - 3015;
k. Add the information - 3011
Add the data to the information node identified in step j; the difference between the seq number of the new data and the starting seq value of the node determines where it is stored in the data block;
m. Determine whether there is an information identifier - 3012
Determine whether the packet information carries an identifier; if so, go to step n, otherwise drop the packet - 3015;
n. Add a new TCP stream to the map - 3013
Insert into the map container the new hash value computed from the TCP five-tuple of the new packet, indicating that a new TCP stream is to be reassembled;
o. Create a new linked list - 3014
Once the hash value created in step n has been set up in the map container, set up its corresponding content, which is the head node of a new linked list;
Then proceed to step d.
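The reassembly workflow can be sketched as follows. The text does not define the information identifier, so this sketch assumes a hypothetical 2-byte magic value followed by a 4-byte total message length; `MAGIC`, `HEADER`, `MAX_MESSAGE`, `Node` and `Reassembler` are illustrative names. A Python list stands in for the singly linked list, and a dict for the map container.

```python
import struct
from dataclasses import dataclass

MAGIC = b"\xab\xcd"               # hypothetical "information identifier"
HEADER = struct.Struct("!2sI")    # magic + total message length in bytes
MAX_MESSAGE = 1 << 20             # assumed cap on the memory block size


@dataclass(eq=False)
class Node:
    start_seq: int                # seq number of the first byte of the message
    buf: bytearray                # memory block sized to the whole message
    remaining: int                # bytes still missing


class Reassembler:
    def __init__(self):
        self.flows = {}           # stream hash -> list of pending nodes
        self.done = []            # completed messages, ready to forward

    def feed(self, flow, seq, payload):
        """Handle one TCP payload; return False if the packet is dropped."""
        starts = len(payload) >= HEADER.size and payload[:2] == MAGIC
        nodes = self.flows.get(flow)                        # step b
        if starts:                                          # steps c / m
            total = HEADER.unpack_from(payload)[1]
            if not HEADER.size <= total <= MAX_MESSAGE:
                return False
            if nodes is None:                               # steps n, o
                nodes = self.flows[flow] = []
            node = Node(seq, bytearray(total), total)       # step d
            nodes.append(node)
        else:
            if nodes is None:
                return False                                # drop (3015)
            node = next((n for n in nodes                   # steps i, j
                         if (seq - n.start_seq) & 0xFFFFFFFF < len(n.buf)), None)
            if node is None:
                return False                                # drop (3015)
        offset = (seq - node.start_seq) & 0xFFFFFFFF        # handles seq wrap
        chunk = payload[:len(node.buf) - offset]
        node.buf[offset:offset + len(chunk)] = chunk        # step k
        node.remaining -= len(chunk)                        # step e
        if node.remaining == 0:                             # steps f, g
            self.done.append(bytes(node.buf))
            nodes.remove(node)
        return True
```

Segments may arrive in any order after the one carrying the identifier. The sketch does not handle retransmitted segments (they would be counted twice) or payloads that merely happen to begin with the magic bytes; a fuller implementation would need to track which byte ranges have been written.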

Claims (1)

1. A many-to-one, one-way big-data transmission communication method between an external network and an intranet,
wherein the system comprises a collector unit (100), an external network server unit (200) and an intranet server unit (300) connected in sequence;
the collector unit (100) comprises a 1st collector (101), a 2nd collector (102), ... and an Nth collector (10N), where N is a natural number, 1 ≤ N < 65535;
the external network server unit (200) consists of a network port packet capture module (201), a data processing and reassembly module (202) and a network port sending module (203), which interact in sequence along a one-way data flow;
the intranet server unit (300) is provided with an intranet processing module (301);
the method comprises the following steps:
1. Set the configuration file of each collector in the collector unit (100), including the destination port and destination IP;
2. Send the data collected by the collector unit (100) to the external network server unit (200);
3. The network port packet capture module (201) of the external network server unit (200) captures the packets arriving on the network port connected to the collector unit (100);
4. The data processing and reassembly module in the external network server unit (200) runs the protocol reassembly algorithm program, which performs data reception and data reassembly on the packets captured by the network port packet capture module (201) of the external network server unit (200), identifies the data sent by the collector unit (100), and carries out protocol reassembly;
5. The intranet server unit (300) receives the data transmitted by the external network server unit (200) and processes it within the intranet;
the data reception workflow of the protocol reassembly algorithm program comprises the following steps:
A. Capture data from the network port (21)
Initialize the network port packet capture module and start capturing packets on the specified network card; the collectors send packets to the network card of the external network server, the port used by each collector is different, and the packets are stored in a specified memory area;
B. Parse the packet (22)
Unpack and analyze each packet captured from the network port, parsing out the destination address, source address, destination port, source port, transport protocol, TCP-layer seq number and IP ID; also determine the type of the packet and read the application-layer data;
C. Determine whether the protocol is TCP (23)
Examine the parsed packet content to determine whether the protocol is TCP; if so, go to step D, otherwise drop the packet (27);
D. Compute the hash value (24)
For a packet carried over TCP, perform a hash operation on the TCP five-tuple (destination address, source address, destination port, source port and transport protocol) to compute a hash value that uniquely identifies the TCP stream;
E. Extract the seq number (25)
Extract the seq number of the TCP packet and store it in the data structure together with the hash value;
F. Extract the application-layer data (26)
Extract the data content of the TCP packet;
characterized in that the data reassembly workflow of the protocol reassembly algorithm program comprises the following steps:
a. Data flows in (3001)
Data that has passed through the data reception workflow of the protocol reassembly algorithm program in the data processing and reassembly module (202) flows into this workflow;
b. Determine whether the TCP stream is already recorded (3002)
Using the computed hash value, determine whether the TCP stream has already been recorded; if so, go to step c, otherwise go to step m;
c. Determine whether there is an information identifier (3003)
From the application-layer data of the packet, determine whether it carries the identifier of useful information; if so, go to step d, otherwise go to step i;
d. Add an information node (3004)
In the existing hash map container, find the head pointer of the linked list that stores the data of this TCP stream, and insert a new information node into the linked list the pointer refers to;
e. Subtract the added information length from the packet length (3005)
After inserting the information content, subtract the length of the information content carried by this packet from the length;
f. Traverse the linked list and check for a node of length 0 (3006)
Traverse the linked list whose information length was reduced in step e and check whether any node has length 0; if so, go to step g, otherwise exit the traversal (3008);
g. Send the data and delete the node (3007)
A node length of 0 indicates that all packets of that information node have been reassembled; send the entire information content of the node and delete the node;
i. Traverse the linked list and search the seq ranges (3009)
The data structure of each linked-list node holds a starting seq value; subtract that starting value from the seq number of the new data;
j. Determine whether the seq falls within the range of an existing node (3010)
Determine whether the difference computed in step i lies within the information length of the node; if so, go to step k, otherwise drop the packet (3015);
k. Add the information (3011)
Add the data to the information node identified in step j; the difference between the seq number of the new data and the starting seq value of the node determines where it is stored in the data block;
m. Determine whether there is an information identifier (3012)
Determine whether the packet information carries an identifier; if so, go to step n, otherwise drop the packet (3015);
n. Add a new TCP stream to the map (3013)
Insert into the map container the new hash value computed from the TCP five-tuple of the new packet, indicating that a new TCP stream is to be reassembled;
o. Create a new linked list (3014)
Once the hash value created in step n has been set up in the map container, set up its corresponding content, which is the head node of a new linked list;
Then proceed to step d.
CN201510141826.4A 2015-03-30 2015-03-30 Many-one type intranet and extranet big data one-way transmission communication means Active CN104702622B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510141826.4A CN104702622B (en) 2015-03-30 2015-03-30 Many-one type intranet and extranet big data one-way transmission communication means

Publications (2)

Publication Number Publication Date
CN104702622A CN104702622A (en) 2015-06-10
CN104702622B true CN104702622B (en) 2017-09-15

Family

ID=53349390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510141826.4A Active CN104702622B (en) 2015-03-30 2015-03-30 Many-one type intranet and extranet big data one-way transmission communication means

Country Status (1)

Country Link
CN (1) CN104702622B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant