CN102045378A - Method for realizing full distribution of protocol stack process and distributed system

Info

Publication number
CN102045378A
Authority
CN
China
Prior art keywords
subsystem
inpcb
port
data
well
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2009102357417A
Other languages
Chinese (zh)
Other versions
CN102045378B (en)
Inventor
郭显志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Technologies Co Ltd
Original Assignee
Hangzhou H3C Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou H3C Technologies Co Ltd filed Critical Hangzhou H3C Technologies Co Ltd
Priority to CN 200910235741 priority Critical patent/CN102045378B/en
Publication of CN102045378A publication Critical patent/CN102045378A/en
Application granted granted Critical
Publication of CN102045378B publication Critical patent/CN102045378B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses a method and a system for realizing fully distributed protocol stack processing. In the method, full distribution of the protocol stack is achieved by synchronizing only a small amount of Internet Protocol Control Block (INPCB) data, chosen according to the characteristics of the Transmission Control Protocol (TCP), the User Datagram Protocol (UDP) and raw IP (RawIP). In particular, for TCP, the INPCB data of the mother SOCKET is synchronized; for UDP, the INPCB data of server-side connections is synchronized; and for RawIP, the INPCB data of all connections is synchronized. With the method and the system, fully distributed protocol stack processing can be realized without synchronizing all INPCB data.

Description

Method for fully distributed protocol stack processing, and distributed system
Technical field
The present invention relates to processing techniques for distributed systems, and in particular to a method for fully distributed protocol stack processing in a distributed system and to such a distributed system.
Background art
To improve processing capacity, distributed systems have been developed. A distributed system comprises a plurality of subsystems; each subsystem has its own CPU processing capability and can run a protocol stack independently, and the subsystems cooperate to provide the functions of a single system.
A distributed system needs to support distributed protocol stack processing. Here, the protocol stack is the Transmission Control Protocol/Internet Protocol (TCP/IP) stack, which includes three protocols of the INET protocol family: the Transmission Control Protocol (TCP), the User Datagram Protocol (UDP), and raw IP (RawIP). In the INET protocol family (both INET4 and INET6), each socket (SOCKET) connection corresponds to a protocol control block. The INPCB (Internet Protocol Control Block) is the protocol control block of the INET protocol family; it records the connection information of a SOCKET connection, such as the local address (LIP), local port (LP), remote address (DIP), and remote port (DP).
Any subsystem in the distributed system may create INPCB data; because the SOCKET connections established on different subsystems differ, the INPCB data they create differ as well. After receiving an IP packet, a subsystem matches the packet's connection information against the INPCB data; on a successful match, it delivers the received IP packet to the upper-layer protocol SOCKET application that corresponds to the INPCB.
Fig. 1 shows an ideal scheme for fully distributed protocol stack processing. The scheme achieves full distribution by synchronizing all INPCB data; protocol stack processing comprises INPCB data synchronization and packet processing.
As shown in Fig. 1, the distributed system comprises three subsystems, each of which creates one SOCKET connection. After INPCB data synchronization, the INPCB data in the three subsystems are identical, and each subsystem knows which subsystem each INPCB belongs to. When System1 receives packet2, which belongs to SOCKET2, it matches its local INPCB data and hits INPCB2; it thereby determines that the connection for packet2 resides on System2 and forwards packet2 directly to System2. System2 matches its local INPCB data again, finds that the data belongs to a local application, and delivers it directly to the upper-layer protocol; if no match is found, the packet is discarded. When packet2 enters through System3, the processing is similar: System3 forwards packet2 to System2, and System2 delivers it to the local upper-layer protocol.
The fully distributed scheme above is an idealized one and is suitable only when data sources are fixed and the data volume is small. In practice, data sources are usually not fixed and the data volume is large, so synchronizing all INPCB data affects system performance. Specifically, because upper-layer SOCKET applications can run on any subsystem, the INPCB data of each SOCKET may be created on any subsystem. To make protocol stack processing fully distributed, the INPCB data of every SOCKET must then be synchronized to every subsystem so that each subsystem can find it, which ensures that a packet entering through any subsystem is processed correctly rather than discarded. This scheme therefore requires a large amount of INPCB data to be synchronized between subsystems, which increases the system load and reduces the system's traffic processing capacity.
To address this problem, a partially distributed scheme based on port characteristics is currently used to achieve distributed protocol stack processing indirectly. Specifically:
1. A master subsystem (System Master) is designated. System Master acts as the server and listens on the well-known ports, establishing a connection when it receives a client connection request. The other subsystems do not listen on well-known ports; they may only establish connections on non-well-known ports as clients.
2. For connections on non-well-known ports, port ranges can be divided in advance so that each subsystem handles connections within a fixed port range. On receiving a packet that carries port information, a subsystem therefore knows exactly which subsystem should process it, and INPCB data need not be synchronized.
3. RawIP has no port attribute and so cannot take part in the division of port ranges. Such connections can only be established on the master subsystem; a subsystem that receives a packet for a RawIP connection sends it to System Master for centralized processing.
Fig. 2 shows the packet processing flow under the partially distributed scheme. As in Fig. 1, the distributed system comprises three subsystems, each of which creates one SOCKET connection. Suppose it is agreed system-wide that System Master handles port segments 1~4000 (including well-known ports 1~1024), 5000~6000, and 10000~20000; System II handles port segment 4001~4999; and System III handles port segment 6001~6999. Each subsystem knows the port segments handled by itself and by the other subsystems. Each subsystem maintains the INPCB data for its own port segments, and no INPCB data synchronization is needed.
After System II receives Packet1, it finds that Packet1's destination port 3301 falls within System Master's range and forwards the packet transparently to System Master. System Master receives Packet1, finds that the destination port is within its own range, and goes on to match its own INPCB data; if the match succeeds, the packet is delivered to the upper-layer application, and if it fails, Packet1 is discarded. Packet2 is handled similarly: System III receives Packet2, finds that destination port 4100 falls within System II's range, and forwards it transparently to System II, which performs matching and subsequent processing.
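The prior-art dispatch just described can be sketched as a simple range lookup. This is only an illustrative sketch: the port ranges are those assumed for Fig. 2, while the names `PORT_RANGES`, `owner_of_port` and `dispatch`, and the choice to drop a packet whose port falls in no range, are assumptions made for the example and do not come from the patent.

```python
# Illustrative sketch of prior-art port-range dispatch (ranges as assumed for Fig. 2).
# All identifiers are hypothetical.
PORT_RANGES = {
    "System Master": [(1, 4000), (5000, 6000), (10000, 20000)],
    "System II": [(4001, 4999)],
    "System III": [(6001, 6999)],
}


def owner_of_port(port):
    """Return the subsystem whose pre-divided range contains the port, or None."""
    for subsystem, ranges in PORT_RANGES.items():
        if any(low <= port <= high for low, high in ranges):
            return subsystem
    return None


def dispatch(receiver, dst_port):
    """Decide what the receiving subsystem does with a packet for dst_port."""
    owner = owner_of_port(dst_port)
    if owner is None:
        # Assumption: the text does not say what happens outside every range.
        return "drop"
    if owner == receiver:
        return "match local INPCB"
    return "forward to " + owner
```

Because ownership is fixed by the static table, no INPCB data has to be exchanged, but the table can never be rebalanced at run time.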
However, this port-based distributed processing scheme has the following drawbacks:
1. Only partial distribution is achieved.
Only System Master may act as a server listening on well-known ports, and port-based distribution supports only protocols identified by port number, such as TCP and UDP; protocols without port information, such as RawIP, are processed centrally on System Master. In effect only partial distribution is achieved, which increases the load on System Master and lowers its performance.
2. The system is vulnerable to packet attacks.
Because well-known-port packets can only be processed centrally by System Master, a subsystem that receives such a packet does not check whether the corresponding service is open; it simply forwards the packet to System Master. If a user constructs a large number of well-known-port packets and injects them through the subsystems, every subsystem sees a well-known destination port and sends the packets to System Master, even though System Master may not have the corresponding service open. This adds to System Master's processing load and affects normal service processing.
3. Scalability is poor.
When acting as a client, each subsystem can only allocate local port numbers within its pre-agreed non-well-known port range, which cannot be extended. The local dynamic ports available to each subsystem for connections are therefore limited, which easily creates a system bottleneck. Port resources also cannot be fully used: for example, if a subsystem has not established any connections, the ports reserved for it cannot be used by other subsystems and are wasted.
Summary of the invention
In view of this, the present invention provides a method for fully distributed protocol stack processing that achieves full distribution without synchronizing all INPCB data.
In the method provided by the present invention, each subsystem in the distributed system performs the following synchronization operations and packet processing operations.
The synchronization operations comprise:
For TCP: record the non-well-known port segments allocated to each subsystem in the distributed system; when establishing a client TCP connection as a client, allocate a port for the connection from the locally allocated non-well-known port segment, and do not synchronize the INPCB data of the client TCP connection; when acting as a server, create a mother socket (SOCKET) to listen on a well-known port, and synchronize only the INPCB data of the mother SOCKET, together with its correspondence to the subsystem, to all subsystems;
For UDP: record the non-well-known port segments allocated to each subsystem in the distributed system; when establishing a client UDP connection as a client, allocate a port for the connection from the locally allocated non-well-known port segment, and do not synchronize the INPCB data of the client UDP connection; when acting as a server, synchronize the INPCB data of all server-side UDP connections, together with their correspondence to the subsystem, to all subsystems;
For RawIP: synchronize the INPCB data of all RawIP connections, together with their correspondence to the subsystem, to all subsystems.
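The three synchronization rules above can be condensed into a single predicate. The following is a minimal sketch, assuming that each connection is described by a protocol name, a role and a flag marking a listening mother SOCKET; the function name `must_sync` and its parameters are hypothetical and chosen only for illustration.

```python
def must_sync(protocol, role=None, is_mother_socket=False):
    """Return True if a connection's INPCB data should be synchronized
    to all subsystems under the rules described in the text.

    protocol: "TCP", "UDP" or "RawIP"
    role: "client" or "server" (ignored for RawIP, which has no roles)
    is_mother_socket: True for a listening TCP mother SOCKET
    """
    if protocol == "RawIP":
        return True  # INPCB data of every RawIP connection is synchronized
    if protocol not in ("TCP", "UDP"):
        raise ValueError("unsupported protocol: " + str(protocol))
    if role == "client":
        return False  # client TCP/UDP connections are never synchronized
    if role != "server":
        raise ValueError("role must be 'client' or 'server' for TCP/UDP")
    if protocol == "UDP":
        return True  # every server-side UDP connection is synchronized
    return is_mother_socket  # TCP server side: only the mother SOCKET
```

Under these rules the sub-SOCKETs created by accepting TCP connections, which the text identifies as the largest group, are the ones left out of synchronization.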
After receiving a packet, each subsystem in the distributed system performs the following packet processing operations:
Port matching: match the packet's destination port number against each locally recorded non-well-known port segment. If the port matches and the matched non-well-known port segment belongs to this subsystem, continue with INPCB data matching. If the port matches but the matched non-well-known port segment does not belong to this subsystem, forward the packet to the subsystem that owns the matched segment. If the port matches no segment, continue with INPCB data matching.
INPCB data matching: perform a longest match of the packet's connection information against the locally recorded INPCB data. If the match succeeds and the matched INPCB data belongs to this subsystem, deliver the packet to the application layer of this subsystem. If the match succeeds but the matched INPCB data does not belong to this subsystem, forward the packet to the subsystem that owns the matched INPCB data. If the match fails, process the packet as one that the distributed system cannot recognize.
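The two-stage decision above (port matching, then longest INPCB match) can be sketched as follows. This is a simplified illustration restricted to TCP/UDP packets that carry ports; the `Inpcb` and `Packet` classes, the scoring of a "longest match" as the number of specified fields that agree, the returned strings, and dropping unrecognized packets (the preferred handling in the text) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Inpcb:
    owner: str                         # subsystem that created this INPCB
    local_port: int
    local_ip: Optional[str] = None     # None acts as a wildcard
    remote_ip: Optional[str] = None
    remote_port: Optional[int] = None


@dataclass(frozen=True)
class Packet:
    dst_ip: str
    dst_port: int
    src_ip: str
    src_port: int


def match_score(pcb, pkt):
    """Count matching specified fields; return -1 if a specified field differs."""
    pairs = [(pcb.local_port, pkt.dst_port), (pcb.local_ip, pkt.dst_ip),
             (pcb.remote_ip, pkt.src_ip), (pcb.remote_port, pkt.src_port)]
    score = 0
    for wanted, actual in pairs:
        if wanted is None:
            continue
        if wanted != actual:
            return -1
        score += 1
    return score


def handle(me, pkt, segments, inpcbs):
    """segments maps (low, high) -> owning subsystem; inpcbs is the local table
    (local INPCBs plus those synchronized from other subsystems)."""
    # Stage 1: port matching against the recorded non-well-known port segments.
    for (low, high), owner in segments.items():
        if low <= pkt.dst_port <= high:
            if owner != me:
                return "forward to " + owner
            break  # segment is ours: continue with INPCB matching
    # Stage 2: longest match against the locally recorded INPCB data.
    best, best_score = None, -1
    for pcb in inpcbs:
        score = match_score(pcb, pkt)
        if score > best_score:
            best, best_score = pcb, score
    if best is None:
        return "drop"  # unrecognized packet; dropping is the preferred handling
    if best.owner == me:
        return "deliver to local application"
    return "forward to " + best.owner
```

A packet for a well-known port on which no service listens matches no segment and no INPCB, so it is dropped by whichever subsystem receives it instead of being relayed to a central subsystem.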
Preferably, the non-well-known port segments allocated to each subsystem in the distributed system are obtained as follows: the master subsystem, which is responsible for port resource management in the distributed system, manages the allocation of non-well-known port segments; a subsystem requests a non-well-known port segment from the master subsystem and, once the requested segment is exhausted, requests another; the master subsystem synchronizes the correspondence between each allocated non-well-known port segment and the subsystem it was allocated to, to every subsystem.
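A master-side allocator for this request-on-demand scheme might look like the sketch below. The class name, the sequential allocation order, the default start at port 1025 (just above the well-known range 1~1024 stated in the text) and the segment size of 64 (the size used in the patent's later example) are assumptions for illustration; the patent does not prescribe how the master picks segments.

```python
class SegmentAllocator:
    """Hypothetical master-side allocator of non-well-known port segments."""

    def __init__(self, first_port=1025, last_port=65535, segment_size=64):
        self.next_start = first_port
        self.last_port = last_port
        self.segment_size = segment_size
        # (low, high) -> subsystem; this map is what the master would
        # synchronize to every subsystem after each allocation.
        self.assignments = {}

    def request(self, subsystem):
        """Allocate the next free segment to the requesting subsystem."""
        low = self.next_start
        high = low + self.segment_size - 1
        if high > self.last_port:
            raise RuntimeError("no free non-well-known port segment left")
        self.next_start = high + 1
        self.assignments[(low, high)] = subsystem
        return (low, high)
```

A subsystem would call `request` again only after its current segment is used up, so idle subsystems hold no ports and busy ones can keep growing.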
Preferably, when a sub-SOCKET connection is created for the mother SOCKET, the correspondence between the mother SOCKET and the sub-SOCKET, together with their INPCB data, is saved in a mother-child relationship table; the mother SOCKET may be deleted only when it has no corresponding sub-SOCKET.
Preferably, processing a packet as one that the distributed system cannot recognize means discarding the packet.
The present invention further provides a distributed system that achieves fully distributed protocol stack processing without synchronizing all INPCB data.
The distributed system provided by the present invention comprises a plurality of subsystems.
Each subsystem comprises a port number maintenance unit, a port number allocation unit, an INPCB data synchronization unit, an INPCB data storage unit, and a packet processing unit.
The port number maintenance unit records the non-well-known port segments allocated to each subsystem in the distributed system.
The port number allocation unit, when the subsystem establishes a client TCP connection or a client UDP connection as a client, allocates a port for that connection from the locally allocated non-well-known port segment recorded by the port number maintenance unit.
The INPCB data synchronization unit operates as follows. For TCP, when the subsystem establishes a client TCP connection as a client, it does not synchronize the INPCB data of the client TCP connection; when the subsystem acts as a server, it creates a mother SOCKET to listen on a well-known port and synchronizes only the INPCB data of the mother SOCKET, together with its correspondence to the subsystem, to all subsystems. For UDP, when the subsystem establishes a client UDP connection as a client, it does not synchronize the INPCB data of the client UDP connection; when the subsystem acts as a server, it synchronizes the INPCB data of all server-side UDP connections, together with their correspondence to the subsystem, to all subsystems. For RawIP, it synchronizes the INPCB data of all RawIP connections, together with their correspondence to the subsystem, to all subsystems.
The INPCB data storage unit stores the local INPCB data created by the subsystem, and receives and stores the INPCB data synchronized from other subsystems together with their correspondence to subsystems.
The packet processing unit comprises a port matching module and an INPCB data matching module.
The port matching module, after receiving a packet from outside the subsystem, matches the packet's destination port number against each non-well-known port segment recorded by the port number maintenance unit. If the port matches and the matched non-well-known port segment belongs to this subsystem, it passes the packet to the INPCB data matching module. If the port matches but the matched non-well-known port segment does not belong to this subsystem, it forwards the packet to the subsystem that owns the matched segment. If the port matches no segment, it passes the packet to the INPCB data matching module.
The INPCB data matching module, after receiving a packet from the port matching module, performs a longest match of the packet's connection information against the INPCB data recorded in the INPCB data storage unit. If the match succeeds and the matched INPCB data belongs to this subsystem, it delivers the packet to the application layer of this subsystem. If the match succeeds but the matched INPCB data does not belong to this subsystem, it forwards the packet to the subsystem that owns the matched INPCB data. If the match fails, it processes the packet as one that the distributed system cannot recognize.
Preferably, the port number maintenance unit comprises a port application module and a port recording module.
The port application module requests a non-well-known port segment from the master subsystem, which is responsible for port resource management in the distributed system, and requests another segment once the requested segment is exhausted; it also receives the correspondence, synchronized by the master subsystem, between non-well-known port segments and the subsystems that use them, and passes it to the port recording module.
The port recording module records the received correspondence between non-well-known port segments and subsystems.
Preferably, when the subsystem is the master subsystem, the port number maintenance unit further comprises an application response module which, on receiving another subsystem's request for a non-well-known port segment, allocates a segment from the unallocated non-well-known port resources and synchronizes the correspondence between the allocated segment and the subsystem to the port recording module and to every other subsystem.
Preferably, the INPCB data matching module discards the packet when the match fails.
As the above technical solution shows, the present invention achieves the following beneficial effects:
1. Every subsystem can act as a server and listen on well-known ports, and every subsystem can handle RawIP connections. This avoids the overload of the master subsystem caused in the prior art by processing well-known-port connections and RawIP connections only on the master subsystem, and achieves distributed processing of both kinds of connection.
For server-side TCP connections, which are numerous, the invention exploits the association between a mother SOCKET and its sub-SOCKETs in TCP and synchronizes only the INPCB data of the mother SOCKET. As long as a packet matches the INPCB data of the mother SOCKET, it can be forwarded to the subsystem that owns the connection for INPCB matching and subsequent processing. For RawIP connections and well-known-port UDP connections, which are few, all INPCB data is synchronized, so a packet that matches the INPCB data can likewise be forwarded to the owning subsystem for INPCB matching and subsequent processing. Differentiating the INPCB synchronization mode by protocol characteristics in this way avoids synchronizing a large volume of INPCB data while still achieving full distribution.
2. Well-known-port packets can be processed in a distributed way by every subsystem. After a subsystem receives a well-known-port packet, the INPCB matching step filters out packets addressed to well-known ports on which no service is open: since no INPCB data exists for such packets, the receiving subsystem can discard them, which protects against attacks that use large numbers of maliciously constructed well-known-port packets.
3. A subsystem requests a non-well-known port segment from System Master and requests another when that segment is exhausted and more local ports are needed. This makes port allocation more flexible, lets port resources be fully used, and avoids wasting them.
Brief description of the drawings
Fig. 1 shows an ideal fully distributed scheme for protocol stack processing.
Fig. 2 shows the packet processing flow under the partially distributed scheme of the prior art.
Fig. 3 is a schematic diagram of the state after port segment allocation and INPCB data synchronization under the fully distributed scheme of the present invention.
Fig. 4 is a flowchart of packet forwarding under the fully distributed scheme of the present invention.
Fig. 5 is a flowchart of packet forwarding based on the port segment allocation and INPCB data synchronization result shown in Fig. 3.
Fig. 6 is a schematic structural diagram of a subsystem in the distributed system of the present invention.
Detailed description of the embodiments
The present invention is described below with reference to the drawings and embodiments.
Based on the characteristics of TCP, UDP and RawIP, the present invention achieves fully distributed protocol stack processing by synchronizing only a small amount of INPCB data.
First, the characteristics of TCP, UDP and RawIP are analyzed:
TCP: the main information of a TCP connection comprises the local address, local port, peer address and peer port. The local port range is 1~65535. The well-known ports are 1~1024, and connections on well-known ports are numerous.
UDP: the main information of a UDP connection comprises the local address, local port, peer address and peer port. The local port range is 1~65535. The well-known ports are 1~1024, and some well-known ports also lie above 1024. The overall number of connections is small.
RawIP: the main information of a RawIP connection comprises the local address and the peer address. There is no port information. Connections of this kind are few.
Next, the roles within the distributed system are analyzed:
For TCP and UDP connections, one end of a connection is the server and the other is the client; each subsystem can act as either. When a subsystem acts as a server, it uses well-known ports, and the connections it establishes are called server-side connections; when a subsystem acts as a client, it uses non-well-known ports, and the connections it establishes are called client connections. RawIP, however, has no notion of role. The role analysis below therefore applies only to TCP and UDP connections.
Server role:
When a subsystem acts as a server, the port is specified statically when the connection is created, and a local well-known port is used. For a connection-oriented application such as TCP, a connection is created when a client request is accepted; the number of connections is determined by the number of clients and is usually large (for example, in applications such as SSL VPN). Typically, a subsystem uses one mother SOCKET to listen on a well-known port; when a client initiates a connection to the server, the server accepts it and creates a sub-SOCKET connection to communicate with the client. For a connectionless application such as UDP, the number of connections created in the server role is generally small.
Client role:
When a subsystem acts as a client, the port is allocated dynamically when the connection is created, and a local non-well-known port is used. A distributed system seldom acts as a client, so connections of this kind are few.
The analysis shows that connections on well-known ports, that is, server-side connections, are numerous, and that among them TCP connections account for the great majority while UDP connections are few. Correspondingly, connections on non-well-known ports, that is, client connections, are few. RawIP connections are also few. Accordingly, the present invention synchronizes as little data as possible while still achieving full distribution. The data that each subsystem needs to synchronize for each protocol is as follows:
● TCP
As a client: record the non-well-known port segments allocated to each subsystem in the distributed system, that is, record the correspondence between the non-well-known port numbers each subsystem is allowed to use and that subsystem. When a subsystem establishes a client TCP connection as a client, it allocates a port for the connection from its locally allocated non-well-known port segment and does not synchronize the INPCB data of the client TCP connection.
The correspondence between non-well-known port segments and subsystems could be divided in advance, but this is inflexible and port resources cannot be fully used. Preferably, therefore, the System Master of the distributed system centrally manages the allocation of non-well-known port segments. When a subsystem needs to establish a client TCP connection, it requests a non-well-known port segment from System Master and then dynamically allocates ports for its connections from the requested segment, rather than from a pre-divided local port segment as in the prior art; when the requested segment is exhausted, the subsystem can request another. System Master synchronizes the correspondence between each allocated non-well-known port segment and its subsystem to every subsystem, so that the port segments each subsystem has requested are known to the others.
As a server: well-known ports are no longer handled only by System Master; every subsystem can listen on well-known ports. Specifically, each subsystem creates a mother SOCKET to listen on a well-known port and synchronizes the INPCB data of the mother SOCKET, together with its correspondence to the subsystem, to all subsystems. After accepting a TCP connection initiated by a client, the subsystem creates a sub-SOCKET connection for the mother SOCKET to communicate with the client, but does not synchronize the INPCB data of the sub-SOCKET. Thus, in view of the large number of server-side TCP connections, the present invention synchronizes only the INPCB data of the mother SOCKET and avoids a huge synchronization volume.
To synchronize only the mother SOCKET, when a sub-SOCKET connection is established for a mother SOCKET, the correspondence between the mother SOCKET and the sub-SOCKET is recorded in a mother-child relationship table. The table contains not only the correspondence between mother SOCKETs and sub-SOCKETs but also the INPCB data of each mother SOCKET and sub-SOCKET. When synchronizing INPCB data, the table is used to determine which SOCKETs need not be synchronized. Note that a mother SOCKET may not be deleted while the table still contains a sub-SOCKET for it. Only when a mother SOCKET has no corresponding sub-SOCKET can it be deleted, at which point the mother-SOCKET INPCB data that was synchronized to each subsystem is reclaimed. This prevents the situation in which a sub-SOCKET cannot be found because the INPCB data of its mother SOCKET can no longer be matched. Because of the association between a mother SOCKET and its sub-SOCKETs, synchronizing the INPCB data of the mother SOCKET is equivalent to synchronizing the INPCB data of the sub-SOCKETs.
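The bookkeeping described for the mother-child relationship table can be sketched as below. This is a minimal illustration in which INPCBs are represented only by names; the class and method names are hypothetical, and raising an error on a premature delete is one possible way to enforce the rule that a mother SOCKET with live sub-SOCKETs must not be removed.

```python
class MotherChildTable:
    """Hypothetical per-subsystem table of mother SOCKETs and their sub-SOCKETs."""

    def __init__(self):
        self.children = {}  # mother INPCB name -> set of sub-SOCKET INPCB names

    def add_mother(self, mother):
        self.children.setdefault(mother, set())

    def add_child(self, mother, child):
        self.children[mother].add(child)

    def remove_child(self, mother, child):
        self.children[mother].discard(child)

    def to_sync(self):
        """Only mother INPCBs are synchronized; sub-SOCKET INPCBs stay local."""
        return sorted(self.children)

    def delete_mother(self, mother):
        """Delete a mother SOCKET only if no sub-SOCKET still depends on it."""
        if self.children[mother]:
            raise RuntimeError("mother SOCKET still has sub-SOCKETs")
        del self.children[mother]
        # At this point the synchronized copies on other subsystems
        # would also be reclaimed (not modeled here).
```

Keeping the mother entry alive for as long as any sub-SOCKET exists is what lets other subsystems keep forwarding packets for established connections to the right owner.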
● UDP
As a client: handled in the same way as TCP. That is, record the non-well-known port segments allocated to each subsystem in the distributed system; when establishing a client UDP connection as a client, allocate a port for the connection from the locally allocated non-well-known port segment, and do not synchronize the INPCB data of the client UDP connection.
As a server: because the overall number of UDP connections is small, the INPCB data of all server-side UDP connections, together with their correspondence to the subsystem, can be synchronized to all subsystems.
● RawIP
RawIP has no notion of server and client, and RawIP connections are few, so the present invention synchronizes the INPCB data of all RawIP connections, together with their correspondence to the subsystem, to all subsystems.
A practical example is given below to illustrate the allocation of non-well-known port sections and INPCB data synchronization. Taking a TCP application as an example, assume that each non-well-known port section contains 64 available port numbers; see Fig. 3.
System II:
System II applies for non-well-known port section 1 (2000~2063) and creates INPCB22 locally; INPCB22 is the INPCB data of a port in non-well-known port section 1.
System II uses a mother SOCKET to listen on well-known port 200 and creates INPCB2 for well-known port 200. After accepting TCP connections initiated by clients, it creates two sub-SOCKETs, which correspond to INPCB222 and INPCB2222 respectively, and maintains the mother-child relationship table shown in Table 1.
Figure B2009102357417D0000121
Table 1
System III:
System III applies for non-well-known port section 2 (3000~3063) and creates INPCB33 locally; INPCB33 is the INPCB data of a port in non-well-known port section 2, for example port 3002.
System III uses a mother SOCKET to listen on well-known port 300 and creates INPCB3 for well-known port 300. After accepting TCP connections initiated by clients, it creates two sub-SOCKETs, which correspond to INPCB333 and INPCB3333 respectively, and maintains the mother-child relationship table shown in Table 2.
Figure B2009102357417D0000122
Table 2
According to the aforementioned allocation rule for non-well-known port sections and the INPCB data synchronization rules, the subsystems perform the following operations:
System Master globally synchronizes the correspondence between non-well-known port section 1 and System II and the correspondence between non-well-known port section 2 and System III, that is, it synchronizes them to System II and System III and, of course, also records them in System Master itself;
System II globally synchronizes the INPCB data of its local mother SOCKET (INPCB2), together with the correspondence to System II, that is, it synchronizes them to System Master and System III and, of course, also keeps them in System II itself;
System III globally synchronizes the INPCB data of its local mother SOCKET (INPCB3), together with the correspondence to System III, that is, it synchronizes them to System Master and System II and, of course, also keeps them in System III itself;
After synchronization has stabilized, the state of each subsystem is as shown in Fig. 3. For clarity, Fig. 3 shows synchronized INPCB data and local INPCB data separately.
After data synchronization has stabilized, the distributed processing flow of data packets on each subsystem is shown in Fig. 4. Referring to Fig. 4, each subsystem performs the following steps after receiving a packet:
Step 401: after the subsystem receives a packet, match the destination port number of the packet against each locally recorded non-well-known port section. If the port match succeeds and the matched non-well-known port section belongs to this subsystem, go to step 402; if the port match succeeds but the matched non-well-known port section does not belong to this subsystem, go to step 404; if the port match fails, go to step 402.
Step 402: match INPCB data.
In this step, matching INPCB data comprises: performing a longest match in the locally recorded INPCB data according to the connection information of the packet. These INPCB data include the INPCB data of local connections and the INPCB data synchronized to this subsystem by other subsystems. If the match succeeds and the matched INPCB data belong to this subsystem, go to step 403; if the match succeeds but the matched INPCB data do not belong to this subsystem, go to step 404; if the match fails, go to step 405.
Here, the connection information comprises the local address (LIP), local port (LP), peer address (DIP) and peer port (DP). Longest match means taking the INPCB data with the highest degree of match as the matching result. Longest match is used instead of exact match for the following reason: the INPCB data of a mother SOCKET and of its sub-SOCKETs have the same LIP and LP (for a received packet, these are the packet's destination address and destination port), whereas their DIP and DP usually differ, because a mother SOCKET generally does not specify DIP and DP (unspecified values are recorded as 0). If the subsystem receiving the packet does not hold the INPCB data of the corresponding sub-SOCKET, the longest match hits the INPCB data of the mother SOCKET, and the packet is transparently forwarded; if the subsystem receiving the packet does hold the matching sub-SOCKET, the packet is delivered up to the application layer for processing. An example illustrating this point is given below.
Step 403: deliver the received packet directly up to the application layer of this subsystem for processing. The flow ends.
Step 404: transparently forward the received packet to the subsystem to which it belongs. The flow ends. After the owning subsystem receives the packet, it likewise processes it according to the flow shown in Fig. 4.
Step 405: handle the received packet as a packet that the distributed system cannot process, for example by dropping it directly or by applying a predetermined policy.
At this point, the flow ends.
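Steps 401 to 405 can be sketched as a single dispatch function. This is a minimal sketch under stated assumptions, not the patented implementation: the types `Inpcb` and `Packet`, the functions `match_score`, `longest_match` and `dispatch`, the encoding of unspecified fields as `"0.0.0.0"` and `0`, and the scoring rule (count of specified fields that match) are all hypothetical.

```python
from typing import NamedTuple, Optional

ANY_IP, ANY_PORT = "0.0.0.0", 0  # assumed encoding of "not specified"


class Inpcb(NamedTuple):
    name: str
    owner: str            # subsystem that created this INPCB
    lip: str              # local address
    lp: int               # local port
    dip: str = ANY_IP     # peer address (unspecified for a mother SOCKET)
    dp: int = ANY_PORT    # peer port (unspecified for a mother SOCKET)


class Packet(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def match_score(pcb: Inpcb, pkt: Packet) -> int:
    """Count the specified INPCB fields that match; -1 means no match."""
    if pcb.lp != pkt.dst_port:
        return -1
    score = 1
    for want, got, unspecified in (
        (pcb.lip, pkt.dst_ip, ANY_IP),
        (pcb.dip, pkt.src_ip, ANY_IP),
        (pcb.dp, pkt.src_port, ANY_PORT),
    ):
        if want == unspecified:
            continue
        if want != got:
            return -1
        score += 1
    return score


def longest_match(pcbs, pkt: Packet) -> Optional[Inpcb]:
    """Return the INPCB with the highest match score, or None."""
    best, best_score = None, -1
    for pcb in pcbs:
        score = match_score(pcb, pkt)
        if score > best_score:
            best, best_score = pcb, score
    return best


def dispatch(me: str, sections: dict, pcbs, pkt: Packet):
    """Return ("deliver", inpcb), ("forward", subsystem) or ("drop", None)."""
    # Step 401: match the destination port against the recorded sections.
    for (first, last), owner in sections.items():
        if first <= pkt.dst_port <= last and owner != me:
            return ("forward", owner)              # step 404
    # Step 402: own section, or no section matched.
    pcb = longest_match(pcbs, pkt)
    if pcb is None:
        return ("drop", None)                      # step 405
    if pcb.owner == me:
        return ("deliver", pcb.name)               # step 403
    return ("forward", pcb.owner)                  # step 404
```

Here `pcbs` is expected to contain both the local INPCB data and the INPCB data synchronized from other subsystems, and `sections` maps each `(first, last)` port section to its owning subsystem.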
From the flow of Fig. 4 it can be seen that, when a received packet is processed by this flow, if the received packet is a TCP packet and its port number is a non-well-known port, the subsystem to which the packet belongs is determined in the port matching step, and the packet is then handled by that subsystem;
If the received packet is a TCP packet whose port number is a well-known port, but the corresponding sub-SOCKET is not maintained locally, then in the INPCB data matching process the longest match hits the INPCB data of the mother SOCKET; the subsystem to which the packet belongs is then determined from the correspondence between the mother SOCKET and the subsystem, and the packet is handled by that subsystem;
If the received packet is a TCP packet whose port number is a well-known port, and the corresponding sub-SOCKET is maintained locally, then in the INPCB data matching process the longest match directly hits the INPCB data of the corresponding sub-SOCKET, and the packet is handled by the application layer of this subsystem;
If the received packet is a UDP packet and its port number is a non-well-known port, it is handled in the same way as a TCP packet; if its port number is a well-known port, then because the INPCB data of server-side UDP connections are all synchronized, the subsystem to which the packet belongs can be found in the INPCB data matching process, and the packet is handled by that subsystem;
If the received packet is a RawIP packet, then because the INPCB data of this protocol are all synchronized, the subsystem to which the packet belongs can be found in the INPCB data matching process, and the packet is handled by that subsystem.
Thus, no matter through which subsystem a packet enters the distributed system, and no matter what protocol type the packet has, it is handled appropriately, so that full distribution of protocol stack processing is achieved. Moreover, if the received packet is an attack packet aimed at a well-known port whose service is not enabled, no INPCB data are created or synchronized for that port, so the subsystem receiving the packet finds that the INPCB data match fails and discards the packet, thereby preventing the attack.
The packet processing flow is described below, taking the subsystems shown in Fig. 3 as an example; see Fig. 5.
When Packet1 enters through System II, System II identifies that destination port 3002 of Packet1 corresponds to port resource Res2, finds that Res2 corresponds to System III, and transparently forwards the packet to System III. System III identifies that Packet1 falls within this subsystem's own port section, then matches the INPCB data and, after successfully matching INPCB33, delivers the packet up to the local application.
When Packet2 enters through System III, System III identifies that destination port 200 of Packet2 does not belong to any port section. It then performs the longest match on the INPCB data and, after matching INPCB2, transparently forwards the packet to System II. System II, through the longest match, matches INPCB222 (see the content of INPCB222 in Table 1) and delivers the packet directly up to the local upper-layer application.
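The two traces above can be replayed with a small simulation. The sketch is illustrative only: the peer addresses and ports (taken from the 192.0.2.0/24 documentation range), the local port assumed for INPCB22, and the names `SECTIONS`, `SYNCED`, `LOCAL`, `longest_match` and `route` are all assumptions, since Tables 1 and 2 are available only as images.

```python
# Port sections and their owners, as synchronized by System Master.
SECTIONS = {"Res1": (range(2000, 2064), "II"), "Res2": (range(3000, 3064), "III")}

# Each INPCB: (name, owner, local port, peer ip, peer port).
# A mother SOCKET leaves the peer fields unspecified (None here).
SYNCED = [("INPCB2", "II", 200, None, None), ("INPCB3", "III", 300, None, None)]
LOCAL = {
    "Master": [],
    "II": [("INPCB22", "II", 2000, "192.0.2.22", 80),      # assumed values
           ("INPCB222", "II", 200, "192.0.2.10", 5000),    # assumed values
           ("INPCB2222", "II", 200, "192.0.2.11", 5001)],  # assumed values
    "III": [("INPCB33", "III", 3002, "192.0.2.33", 80)],   # port 3002 per the text
}


def longest_match(pcbs, dst_port, src_ip, src_port):
    """Prefer an INPCB whose peer fields also match over a mother SOCKET."""
    best, best_len = None, -1
    for name, owner, lp, dip, dp in pcbs:
        if lp != dst_port:
            continue
        if dip is not None and (dip, dp) != (src_ip, src_port):
            continue
        length = 1 if dip is None else 3
        if length > best_len:
            best, best_len = (name, owner), length
    return best


def route(entry, dst_port, src_ip, src_port):
    """Return (subsystems visited, matched INPCB name or "drop")."""
    trace, here = [], entry
    while True:
        trace.append(here)
        owner = next((o for ports, o in SECTIONS.values() if dst_port in ports), None)
        if owner is not None and owner != here:
            here = owner                      # forward by port section
            continue
        hit = longest_match(LOCAL[here] + SYNCED, dst_port, src_ip, src_port)
        if hit is None:
            return trace, "drop"
        name, pcb_owner = hit
        if pcb_owner == here:
            return trace, name                # deliver to the local application
        here = pcb_owner                      # forward to the mother SOCKET's owner


packet1 = route("II", 3002, "192.0.2.33", 80)
packet2 = route("III", 200, "192.0.2.10", 5000)
```

Under these assumptions, `packet1` visits System II and then System III before matching INPCB33, and `packet2` visits System III and then System II before matching INPCB222, which mirrors the description of Fig. 5.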
As can be seen from the above:
1. Each subsystem can act as a server listening on well-known ports, and each subsystem can handle RawIP connections. This avoids the overload of the master subsystem that arises in the prior art, where only the master subsystem centrally handles well-known-port connections and RawIP connections, and achieves distributed processing of well-known-port connections and RawIP connections. For server-side TCP connections, which are numerous, the present invention exploits the association between the mother SOCKET and its sub-SOCKETs in TCP and synchronizes only the INPCB data of the mother SOCKET; as long as a packet matches the INPCB data of the mother SOCKET, it can be transparently forwarded to the subsystem to which it belongs for INPCB matching and subsequent processing. For RawIP connections and well-known-port UDP connections, which are fewer, all INPCB data are synchronized; as long as a packet matches INPCB data, it can likewise be transparently forwarded to the subsystem to which it belongs for INPCB matching and subsequent processing. This method of differentiating the INPCB data synchronization mode according to protocol characteristics avoids synchronizing a large amount of INPCB data while still achieving full distribution.
2. Well-known-port packets can be processed by the subsystems in a distributed manner. After a subsystem receives a well-known-port packet, the INPCB data matching process filters out packets for well-known ports whose service is not enabled: because no INPCB data exist for such packets, the subsystem that receives them discards them, thereby avoiding attacks by large numbers of maliciously constructed well-known-port packets.
3. In the present invention, a subsystem applies to System Master for a non-well-known port section and applies again when its local port resources are exhausted. This increases the flexibility of port allocation, markedly improves scalability, and allows port resources to be fully used without waste. The number of client connections each subsystem can establish is not limited; in theory, a client connection can be created as long as a dynamic port can be allocated.
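The on-demand port section allocation in point 3 can be sketched as follows. This is a hypothetical illustration: the class names `PortMaster` and `Subsystem`, the contiguous allocation starting at port 1024, and the method names are assumptions; only the section size of 64 ports comes from the example in the description.

```python
SECTION_SIZE = 64  # ports per non-well-known port section, as in the example


class PortMaster:
    """Hypothetical stand-in for the port management role of System Master."""

    def __init__(self, first_port: int = 1024, last_port: int = 65535):
        self.next_start = first_port      # assumed start of the unallocated range
        self.last_port = last_port
        self.owner_of = {}                # (first, last) -> subsystem name

    def apply(self, subsystem: str):
        """Allocate the next unallocated section and record its owner."""
        first = self.next_start
        last = first + SECTION_SIZE - 1
        if last > self.last_port:
            raise RuntimeError("no unallocated non-well-known ports left")
        self.next_start = last + 1
        # In the described system this mapping is synchronized to all subsystems.
        self.owner_of[(first, last)] = subsystem
        return first, last


class Subsystem:
    def __init__(self, name: str, master: PortMaster):
        self.name = name
        self.master = master
        self.free_ports = []

    def allocate_client_port(self) -> int:
        """Take a port from the local section, applying for a new one if empty."""
        if not self.free_ports:
            first, last = self.master.apply(self.name)
            self.free_ports = list(range(first, last + 1))
        return self.free_ports.pop(0)
```

A subsystem that opens a 65th client connection exhausts its first section and transparently obtains a second one, so no fixed per-subsystem limit is configured in advance.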
To implement the above full-distribution method and packet forwarding method, the present invention further provides a distributed system. Fig. 6 shows the structure of each subsystem in this distributed system. As shown in Fig. 6, the subsystem comprises a port number maintenance unit 61, a port number allocation unit 62, an INPCB data synchronization unit 63 and an INPCB data storage unit 64.
The port number maintenance unit 61 is configured to record the non-well-known port sections allocated to each subsystem in the distributed system.
Preferably, the port number maintenance unit 61 comprises a port application module 611 and a port recording module 612, wherein:
The port application module 611 is configured to apply to the master subsystem, which is responsible for port resource management in the distributed system, for a non-well-known port section, and to apply for another non-well-known port section after the applied section is used up; and to receive the correspondence, synchronized to the local subsystem by the master subsystem, between non-well-known port sections and the subsystems using them, and send it to the port recording module 612.
The port recording module 612 is configured to record the received correspondence between non-well-known port sections and subsystems.
When this subsystem is the master subsystem responsible for port resource management, the port number maintenance unit 61 further comprises an application response module 613, configured to, on receiving an application for a non-well-known port section from another subsystem, allocate a non-well-known port section from the non-well-known port resources not yet allocated, and synchronize the correspondence between the allocated non-well-known port section and the subsystem to the port recording module 612 and to every other subsystem.
The port number allocation unit 62 is configured to, when the subsystem establishes a client TCP connection or a client UDP connection as a client, allocate a port for that connection from the locally obtained non-well-known port section recorded by the port number maintenance unit 61.
The INPCB data synchronization unit 63 is configured as follows. For the TCP protocol: when the subsystem establishes a client TCP connection as a client, the INPCB data of the client TCP connection are not synchronized; when the subsystem acts as a server, it creates a mother SOCKET to listen on a well-known port, and the INPCB data of the mother SOCKET, together with the correspondence to the subsystem, are synchronized to all subsystems; after a TCP connection initiated by a client is accepted and a sub-SOCKET connection is created for the mother SOCKET, the INPCB data of the sub-SOCKET are not synchronized. For the UDP protocol: when the subsystem establishes a client UDP connection as a client, the INPCB data of the client UDP connection are not synchronized; when the subsystem acts as a server, the INPCB data of all server-side UDP connections, together with their correspondence to subsystems, are synchronized to all subsystems. For the RawIP protocol: the INPCB data of all RawIP connections, together with their correspondence to subsystems, are synchronized to all subsystems.
The INPCB data storage unit 64 is configured to store the local INPCB data created by the subsystem, and to receive and store the INPCB data synchronized from other subsystems together with their correspondence to subsystems.
The cooperation of the above modules achieves full distribution of protocol stack processing. To implement packet forwarding, each subsystem further needs to comprise a packet processing unit 65. In the present invention, the packet processing unit 65 comprises a port matching module 651 and an INPCB data matching module 652.
The port matching module 651 is configured to, after receiving a packet from outside the subsystem (including packets from outside the distributed system and packets from other subsystems), match the destination port number of the packet against each non-well-known port section recorded by the port number maintenance unit 61 (in the port recording module 612); if the port match succeeds and the matched non-well-known port section belongs to this subsystem, send the packet to the INPCB data matching module 652; if the port match succeeds but the matched non-well-known port section does not belong to this subsystem, transparently forward the packet to the subsystem corresponding to the matched non-well-known port section; if the port match fails, send the packet to the INPCB data matching module 652.
The INPCB data matching module 652 is configured to, after receiving a packet from the port matching module 651, perform a longest match in the INPCB data recorded by the INPCB data storage unit 64 according to the connection information of the packet; if the match succeeds and the matched INPCB data belong to this subsystem, deliver the packet up to the application layer of this subsystem for processing; if the match succeeds but the matched INPCB data do not belong to this subsystem, transparently forward the packet to the subsystem corresponding to the matched INPCB data; if the match fails, handle the packet as a packet that the distributed system cannot identify, for example by discarding it.
In summary, the above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (8)

1. A method for realizing full distribution of protocol stack processing, characterized in that each subsystem in a distributed system performs the following synchronization operations and packet processing operations;
Said synchronization operations comprise:
For the TCP protocol: recording the non-well-known port sections allocated to each subsystem in the distributed system; when establishing a client TCP connection as a client, allocating a port for the client TCP connection from the locally obtained non-well-known port section, and not synchronizing the INPCB data of the client TCP connection; when acting as a server, creating a mother socket (SOCKET) to listen on a well-known port, and synchronizing only the INPCB data of the mother SOCKET, together with the correspondence to the subsystem, to all subsystems;
For the UDP protocol: recording the non-well-known port sections allocated to each subsystem in the distributed system; when establishing a client UDP connection as a client, allocating a port for the client UDP connection from the locally obtained non-well-known port section, and not synchronizing the INPCB data of the client UDP connection; when acting as a server, synchronizing the INPCB data of all server-side UDP connections, together with their correspondence to subsystems, to all subsystems;
For the RawIP protocol: synchronizing the INPCB data of all RawIP connections, together with their correspondence to subsystems, to all subsystems;
After receiving a packet, each subsystem in the distributed system performs the following packet processing operations:
Matching port resources, comprising: matching the destination port number of the packet against each locally recorded non-well-known port section; if the port match succeeds and the matched non-well-known port section belongs to this subsystem, continuing to match INPCB data; if the port match succeeds but the matched non-well-known port section does not belong to this subsystem, transparently forwarding the packet to the subsystem corresponding to the matched non-well-known port section; if the port match fails, continuing to match INPCB data;
Said matching INPCB data comprises: performing a longest match in the locally recorded INPCB data according to the connection information of the packet; if the match succeeds and the matched INPCB data belong to this subsystem, delivering the packet up to the application layer of this subsystem for processing; if the match succeeds but the matched INPCB data do not belong to this subsystem, transparently forwarding the packet to the subsystem corresponding to the matched INPCB data; if the match fails, handling the packet as a packet that the distributed system cannot identify.
2. The method of claim 1, characterized in that the non-well-known port sections allocated to each subsystem in the distributed system are obtained as follows: the master subsystem responsible for port resource management in the distributed system manages the allocation of non-well-known port sections; a subsystem in the distributed system applies to said master subsystem for a non-well-known port section and, after the applied non-well-known port section is used up, applies for another non-well-known port section; said master subsystem synchronizes the correspondence between the non-well-known port section allocated to a subsystem and that subsystem to each subsystem.
3. The method of claim 1, characterized in that, when a sub-SOCKET connection is created for said mother SOCKET, the correspondence between the mother SOCKET and the sub-SOCKET and their INPCB data are saved in a mother-child relationship table; when the mother SOCKET has no corresponding sub-SOCKET, deletion of the mother SOCKET is allowed.
4. The method of claim 1, characterized in that said handling the packet as a packet that the distributed system cannot identify is: discarding the packet.
5. A distributed system comprising a plurality of subsystems, characterized in that each subsystem comprises a port number maintenance unit, a port number allocation unit, an INPCB data synchronization unit, an INPCB data storage unit and a packet processing unit;
Said port number maintenance unit is configured to record the non-well-known port sections allocated to each subsystem in the distributed system;
Said port number allocation unit is configured to, when the subsystem establishes a client TCP connection or a client UDP connection as a client, allocate a port for that connection from the locally obtained non-well-known port section recorded by said port number maintenance unit;
Said INPCB data synchronization unit is configured as follows: for the TCP protocol, when the subsystem establishes a client TCP connection as a client, the INPCB data of the client TCP connection are not synchronized; when the subsystem acts as a server, it creates a mother SOCKET to listen on a well-known port, and only the INPCB data of the mother SOCKET, together with the correspondence to the subsystem, are synchronized to all subsystems; for the UDP protocol, when the subsystem establishes a client UDP connection as a client, the INPCB data of the client UDP connection are not synchronized; when the subsystem acts as a server, the INPCB data of all server-side UDP connections, together with their correspondence to subsystems, are synchronized to all subsystems; for the RawIP protocol, the INPCB data of all RawIP connections, together with their correspondence to subsystems, are synchronized to all subsystems;
Said INPCB data storage unit is configured to store the local INPCB data created by the subsystem, and to receive and store the INPCB data synchronized from other subsystems together with their correspondence to subsystems;
Said packet processing unit comprises a port matching module and an INPCB data matching module;
Said port matching module is configured to, after receiving a packet from outside the subsystem, match the destination port number of the packet against each non-well-known port section recorded by said port number maintenance unit; if the port match succeeds and the matched non-well-known port section belongs to this subsystem, send the packet to said INPCB data matching module; if the port match succeeds but the matched non-well-known port section does not belong to this subsystem, transparently forward the packet to the subsystem corresponding to the matched non-well-known port section; if the port match fails, send the packet to said INPCB data matching module;
Said INPCB data matching module is configured to, after receiving a packet from the port matching module, perform a longest match in the INPCB data recorded by said INPCB data storage unit according to the connection information of the packet; if the match succeeds and the matched INPCB data belong to this subsystem, deliver the packet up to the application layer of this subsystem for processing; if the match succeeds but the matched INPCB data do not belong to this subsystem, transparently forward the packet to the subsystem corresponding to the matched INPCB data; if the match fails, handle the packet as a packet that the distributed system cannot identify.
6. The distributed system of claim 5, characterized in that the port number maintenance unit comprises a port application module and a port recording module;
Said port application module is configured to apply to the master subsystem, which is responsible for port resource management in the distributed system, for a non-well-known port section, and to apply for another non-well-known port section after the applied non-well-known port section is used up; and to receive the correspondence, synchronized to the local subsystem by said master subsystem, between non-well-known port sections and the subsystems using them, and send it to the port recording module;
Said port recording module is configured to record the received correspondence between non-well-known port sections and subsystems.
7. The distributed system of claim 5, characterized in that, when the subsystem is said master subsystem, said port number maintenance unit further comprises an application response module, configured to, on receiving an application for a non-well-known port section from another subsystem, allocate a non-well-known port section from the non-well-known port resources not yet allocated, and synchronize the correspondence between the allocated non-well-known port section and the subsystem to the port recording module and to every other subsystem.
8. The distributed system of claim 5, characterized in that said INPCB data matching module discards the packet when the match fails.
CN 200910235741 2009-10-13 2009-10-13 Method for realizing full distribution of protocol stack process and distributed system Active CN102045378B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200910235741 CN102045378B (en) 2009-10-13 2009-10-13 Method for realizing full distribution of protocol stack process and distributed system

Publications (2)

Publication Number Publication Date
CN102045378A true CN102045378A (en) 2011-05-04
CN102045378B CN102045378B (en) 2013-02-13

Family

ID=43911144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200910235741 Active CN102045378B (en) 2009-10-13 2009-10-13 Method for realizing full distribution of protocol stack process and distributed system

Country Status (1)

Country Link
CN (1) CN102045378B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107315605A (en) * 2017-06-14 2017-11-03 上海青橙实业有限公司 The method and apparatus of Jack Server ports Dynamic Matching
CN109309663A (en) * 2018-08-13 2019-02-05 厦门集微科技有限公司 Realize that docker network penetrates the method and device of two layers of protocol stack under cloud computing environment
WO2020005621A1 (en) * 2018-06-26 2020-01-02 Microsoft Technology Licensing, Llc Scalable sockets for quic

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1567114A (en) * 2003-07-09 2005-01-19 中国科学院沈阳自动化研究所 Wireless local area network on-site bus network control station
CN101430674A (en) * 2008-12-23 2009-05-13 北京航空航天大学 Intraconnection communication method of distributed virtual machine monitoring apparatus

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107315605A (en) * 2017-06-14 2017-11-03 上海青橙实业有限公司 The method and apparatus of Jack Server ports Dynamic Matching
WO2020005621A1 (en) * 2018-06-26 2020-01-02 Microsoft Technology Licensing, Llc Scalable sockets for quic
US11115504B2 (en) 2018-06-26 2021-09-07 Microsoft Technology Licensing, Llc Batch processing for QUIC
US11223708B2 (en) 2018-06-26 2022-01-11 Microsoft Technology Licensing, Llc Scalable sockets for QUIC
CN109309663A (en) * 2018-08-13 2019-02-05 厦门集微科技有限公司 Realize that docker network penetrates the method and device of two layers of protocol stack under cloud computing environment
CN109309663B (en) * 2018-08-13 2021-03-19 厦门集微科技有限公司 Method and device for realizing penetration of two-layer protocol stack by docker network in cloud computing environment

Also Published As

Publication number Publication date
CN102045378B (en) 2013-02-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address
CP03 Change of name, title or address

Address after: 310052 Binjiang District Changhe Road, Zhejiang, China, No. 466, No.

Patentee after: New H3C Technologies Co., Ltd.

Address before: 310053 Hangzhou hi tech Industrial Development Zone, Zhejiang province science and Technology Industrial Park, No. 310 and No. six road, HUAWEI, Hangzhou production base

Patentee before: Hangzhou H3C Technologies Co., Ltd.