CN1599319A - Method, system, and program for managing data transmission through a network - Google Patents

Method, system, and program for managing data transmission through a network

Info

Publication number
CN1599319A
Authority
CN
China
Prior art keywords
message
dma
memory access
direct memory
destination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200410056658.0A
Other languages
Chinese (zh)
Other versions
CN100438403C (en
Inventor
H·T·贝弗利
A·楚巴尔
G·Y·曹
A·L·阿里兹佩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corp
Publication of CN1599319A
Application granted
Publication of CN100438403C
Anticipated expiration
Expired - Fee Related

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/19 Flow control; Congestion control at layers above the network layer
    • H04L47/193 Flow control; Congestion control at layers above the network layer at the transport layer, e.g. TCP related
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/27 Evaluation or update of window size, e.g. using information derived from acknowledged [ACK] packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/163 In-band adaptation of TCP data exchange; In-band control procedures

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Provided are a method, system, and program for managing data transmission from a source to a destination through a network. The destination imposes a window value on the source which limits the quantity of data packets which can be sent from the source to the destination without receiving an acknowledgment of being received by the destination. In one embodiment, the source imposes a second window value, smaller than the destination window value, which limits even further the quantity of data packets which can be sent from the source to the destination without receiving an acknowledgment of being received by the destination. In another embodiment, a plurality of direct memory access connections are established between the source and a plurality of specified memory locations of a plurality of destinations. The source imposes a plurality of message limits, each message limit imposing a separate limit for each direct memory access connection on the quantity of messages sent from the source to the specified memory location of the direct memory access connection associated with the message limit and lacking a message acknowledgment of being received by the destination of the direct memory access connection associated with the message limit.

Description

Method, system, and program for managing data transmission through a network
Technical field
The present invention relates to a method, system, and program for managing data transmission through a network.
Background art
In a network environment, a network adapter on a host computer, such as an Ethernet controller or Fibre Channel controller, receives input/output (I/O) requests or responses to I/O requests initiated from the host. Often, the host computer operating system includes a device driver that communicates with the network adapter hardware to manage I/O requests transmitted over the network. The host computer also includes a transport protocol driver, which packages the data to be transmitted over the network into packets, each of which contains a destination address and a portion of the data to be transmitted. Data packets received at the network adapter are usually stored in an available allocated packet buffer in host memory. The transport protocol driver processes the packets that the network adapter has received and stored in the packet buffer, and accesses any I/O commands or data embedded in the packets.
For example, the transport protocol driver may implement the Transmission Control Protocol (TCP) and Internet Protocol (IP) to encode and address data for transmission, and to decode and access the payload data in TCP/IP packets received at the network adapter. IP specifies the format of packets, also called datagrams, and the addressing scheme. TCP is a higher-level protocol that establishes a connection between a destination and a source. A still higher-level protocol, Remote Direct Memory Access (RDMA), establishes a higher-level connection and, among other operations, permits direct placement of data at a specified memory location at the destination.
A device driver can use a significant amount of host processor resources to handle network transmission requests to the network adapter. One technique for reducing the load on the host processor is a TCP/IP Offload Engine (TOE), in which TCP/IP protocol-related operations are performed in the network adapter hardware rather than in the device driver, relieving the host processor of some or all of those operations. These transport protocol operations include packaging data in a TCP/IP packet with a checksum and other information, and unpacking a TCP/IP packet received over the network to access the payload or data.
Fig. 1 illustrates a stream 10 of TCP/IP packets sent from a source host to a destination host over a TCP connection. Under the TCP protocol, as specified in the industry-accepted TCP RFC (Request for Comments), each packet is assigned a unique sequence number. As each packet is successfully delivered to the destination host, the destination host sends an acknowledgment to the source host, using the packet sequence number to notify the source host that the packet was successfully received. Accordingly, the stream 10 includes a portion 12 of packets that have been sent and acknowledged as received by the destination host. The stream 10 further includes a portion 14 of packets that have been sent by the source host but not yet acknowledged as received by the destination host. The source host maintains a TCP unacknowledged data pointer 16, which points to the sequence number of the first unacknowledged sent packet. The TCP unacknowledged data pointer 16 is stored in a field 17a, 17b, ..., 17n of a protocol control block 18a, 18b, ..., 18n (Fig. 2), each of which is used to initiate and maintain one of a plurality of associated TCP connections between the source host and one or more destination hosts.
The capacity of the packet buffer used to store data packets received at the destination host is generally limited. In accordance with the TCP protocol, the destination host advertises how much buffer space it has available by sending a value referred to herein as a TCP window, indicated at 20 in Fig. 1. The source host uses the TCP window to limit the number of outstanding packets sent to the destination host, that is, the number of sent packets for which the source host has not yet received an acknowledgment. The TCP window value for each TCP connection is stored in a field 21a, 21b, ..., 21n of the protocol control block 18a, 18b, ..., 18n that controls the associated TCP connection.
For example, if the destination host sends a TCP window value of 128 KB (kilobytes) for a particular TCP connection, the source host will, in accordance with the TCP protocol, limit the amount of data it sends over that TCP connection to 128 KB until it receives an acknowledgment from the destination host that some or all of the data has been received. If the destination host acknowledges that it has received the entire 128 KB, the source host can send another 128 KB. If, on the other hand, the destination host acknowledges receiving only 96 KB, the source host will send only an additional 96 KB over the TCP connection until it receives further acknowledgments.
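The arithmetic of this example can be sketched as follows. This is only an illustration of the window rule described above, not code from the patent; the function name `sendable_bytes` and the byte-based accounting are assumptions made for the sketch.

```python
KB = 1024

def sendable_bytes(tcp_window: int, in_flight: int) -> int:
    """Bytes the source may still send: advertised window minus unacknowledged bytes."""
    return max(tcp_window - in_flight, 0)

window = 128 * KB
in_flight = 128 * KB                      # the source has filled the whole window
assert sendable_bytes(window, in_flight) == 0

in_flight -= 96 * KB                      # the destination acknowledges 96 KB
print(sendable_bytes(window, in_flight) // KB)   # 96
```

With 32 KB still unacknowledged, only 96 KB of the 128 KB window is free, which matches the figure given in the text.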
A TCP next data pointer 22, stored in a field 23a, 23b, ..., 23n of the associated protocol control block 18a, 18b, ..., 18n, points to the sequence number of the next packet to be sent to the destination host. A portion 24 of the data stream 10, between the TCP next data pointer 22 and the end 28 of the TCP window 20, represents packets that have not yet been sent but that the TCP protocol permits to be sent without waiting for any additional acknowledgments, because these packets are still within the TCP window 20 shown in Fig. 1. A portion 26 of the data stream 10, beyond the end boundary 28 of the TCP window 20, is not permitted under the TCP protocol to be sent until additional acknowledgments are received.
As the destination host sends acknowledgments to the source host, the TCP unacknowledged data pointer 16 moves to indicate the acknowledgment of additional packets for that connection. The beginning boundary 30 of the TCP window 20 moves together with the TCP unacknowledged data pointer 16, so the end boundary 28 of the TCP window moves as well and additional packets of the connection may be sent.
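The relationship between the two pointers, the window, and the four portions of the stream can be sketched as below. The sketch assumes sequence numbers are simple packet counts starting at zero with no wraparound, and that the window is counted in packets; `partition_stream` is a hypothetical helper, and the reference numerals appear only as labels.

```python
def partition_stream(total: int, unacked: int, next_seq: int, window: int) -> dict:
    """Split a stream of `total` packets into the four portions of Fig. 1.

    unacked  -- TCP unacknowledged data pointer 16 (first unacknowledged packet)
    next_seq -- TCP next data pointer 22 (next packet to send)
    window   -- TCP window 20, counted in packets for simplicity
    """
    window_end = min(unacked + window, total)        # end boundary 28
    return {
        "acknowledged (12)": unacked,
        "sent, unacknowledged (14)": next_seq - unacked,
        "may be sent (24)": max(window_end - next_seq, 0),
        "must wait (26)": total - max(window_end, next_seq),
    }

before = partition_stream(total=100, unacked=20, next_seq=50, window=60)
# An acknowledgment advances pointer 16 from 20 to 40; the window slides with it.
after = partition_stream(total=100, unacked=40, next_seq=50, window=60)
print(before["may be sent (24)"], after["may be sent (24)"])   # 30 50
```

Advancing the unacknowledged pointer by 20 packets moves the end boundary by the same amount, so 20 packets that previously had to wait become sendable.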
Fig. 3 illustrates a plurality of RDMA connections 50a, 50b, ..., 50n, through a network, between various software applications of a source host and various memory locations of one or more destination hosts. Each RDMA connection 50a, 50b, ..., 50n runs over a TCP connection. In the RDMA protocol, as defined in the industry-accepted RDMA RFC (Request for Comments), each RDMA connection 50a, 50b, ..., 50n includes a queue pair 51a, 51b, ..., 51n, comprising a queue 52a, 52b, ..., 52n created by a software application that intends to send messages to be stored at a specified memory location of a destination host. Each application queue 52a, 52b, ..., 52n stores the messages to be sent by the associated software application. The size of each queue 52a, 52b, ..., 52n may be quite large or relatively small, depending on the number of messages to be sent by the associated application.
Each queue pair 51a, 51b, ..., 51n of each RDMA connection 50a, 50b, ..., 50n further includes a network interface queue 60a, 60b, ..., 60n, which is paired with the associated application queue 52a, 52b, ..., 52n of the software application. A network interface 62 includes various hardware, typically a network interface card, and various software, including drivers executed by the host. The network interface may also include an offload engine to perform protocol operations.
In response to a request from an application to send a message to be stored at a specified destination host memory address at the other end of one of the RDMA connections 50a, 50b, ..., 50n, the network interface 62 obtains a message credit, designated an "empty message", from a common pool 64 of empty messages. The size of the pool 64, that is, the number of messages the network interface 62 can handle, is typically a function of the hardware capabilities of the network interface 62. If an empty message is available from the pool 64, a message is taken from the application queue 52a, 52b, ..., 52n of the requesting application and queued in the corresponding network interface queue 60a, 60b, ..., 60n of the queue pair 51a, 51b, ..., 51n of the particular RDMA connection 50a, 50b, ..., 50n. The messages queued in the network interface queues 60a, 60b, ..., 60n are sent over the network to the specified memory addresses of the destination hosts, which acknowledge each message that is successfully received and stored at its specified memory address. Messages that have been sent but not yet acknowledged are referred to herein as "outstanding sent messages". Once a message is acknowledged by the destination host as successfully received and stored, an empty message is restored, or replenished, in the pool 64 of empty messages.
In accordance with the RDMA protocol, the total number of messages queued in all of the network interface queues 60a, 60b, ..., 60n, plus the total number of outstanding messages sent over all of the RDMA connections 50a, 50b, ..., 50n, is typically not permitted to exceed the size of the pool 64 of empty messages. Once one of the RDMA connections 50a, 50b, ..., 50n reaches the limit imposed by the pool 64, no more RDMA messages from any of the connections 50a, 50b, ..., 50n may be queued for sending until additional acknowledgments are received.
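A minimal sketch of this shared-credit scheme follows, to make the limit concrete. It is a simplified model, not the patent's implementation: the class names `EmptyMessagePool` and `RdmaConnection` are hypothetical, and one interface queue stands in for both queued and outstanding messages, since both count against the pool.

```python
from collections import deque

class EmptyMessagePool:
    """Common pool 64: one credit per message the network interface can hold."""
    def __init__(self, size: int):
        self.available = size

    def acquire(self) -> bool:
        if self.available == 0:
            return False
        self.available -= 1
        return True

    def release(self) -> None:
        self.available += 1

class RdmaConnection:
    def __init__(self, messages):
        self.app_queue = deque(messages)   # application queue 52
        self.nic_queue = deque()           # network interface queue 60

    def enqueue(self, pool: EmptyMessagePool) -> int:
        """Move messages to the interface queue while credits last."""
        moved = 0
        while self.app_queue and pool.acquire():
            self.nic_queue.append(self.app_queue.popleft())
            moved += 1
        return moved

    def complete(self, pool: EmptyMessagePool) -> None:
        """The destination acknowledged the oldest message: return its credit."""
        self.nic_queue.popleft()
        pool.release()

pool = EmptyMessagePool(size=4)
big = RdmaConnection(range(10))
small = RdmaConnection(range(2))
print(big.enqueue(pool), small.enqueue(pool))   # 4 0
big.complete(pool)
print(small.enqueue(pool))                      # 1
```

In this run the connection with ten messages takes every credit, so the second connection cannot queue anything until an acknowledgment returns a credit to the pool.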
Nevertheless, there remains a need in the art to improve the performance of connections.
Brief description of the drawings
Referring now to the drawings, in which like reference numbers represent corresponding parts throughout:
Fig. 1 illustrates a data stream sent in accordance with the prior art TCP protocol;
Fig. 2 illustrates prior art protocol control blocks in accordance with the TCP protocol;
Fig. 3 illustrates a prior art network interface and message queues for RDMA connections;
Fig. 4 illustrates one embodiment of a computing environment in which aspects of the invention are implemented;
Fig. 5 illustrates a prior art packet architecture;
Fig. 6 illustrates one embodiment of a data stream sent in accordance with aspects of the invention;
Fig. 7 illustrates one embodiment of operations for managing data transmission in accordance with aspects of the invention;
Fig. 8 illustrates one embodiment of a data structure for storing information used to manage data transmission in accordance with aspects of the invention;
Fig. 9 illustrates one mode of the operations of Fig. 7;
Fig. 10 illustrates one embodiment of a network interface and message queues for RDMA connections in accordance with aspects of the invention;
Fig. 11 illustrates another embodiment of operations for managing data transmission in accordance with aspects of the invention; and
Fig. 12 illustrates an architecture that may be used with the described embodiments.
Detailed description of the embodiments
In the following description, reference is made to the accompanying drawings, which form a part of this specification and illustrate several embodiments of the present invention. It is understood that other embodiments may be used, and that structural and operational changes may be made, without departing from the scope of the present invention.
Fig. 4 illustrates a computing environment in which aspects of the invention may be implemented. A computer 102 includes one or more central processing units (CPU) 104 (only one is shown), a volatile memory 106, non-volatile storage 108, an operating system 110, and a network adapter 112. An application program 114 further executes in the memory 106 and is capable of sending packets to, and receiving packets from, a remote computer. The computer 102 may comprise any computing device known in the art, such as a mainframe, server, personal computer, workstation, laptop, handheld computer, telephony device, network appliance, virtualization device, storage controller, etc. Any CPU 104 and operating system 110 known in the art may be used. Programs and data in the memory 106 may be swapped into the storage 108 as part of memory management operations.
The network adapter 112 includes a network protocol layer 116, which implements the physical communication layer for sending network packets to, and receiving network packets from, remote devices over a network 118. The network 118 may comprise a Local Area Network (LAN), the Internet, a Wide Area Network (WAN), a Storage Area Network (SAN), etc. The embodiments may be configured to transmit data over a wireless network or connection, such as wireless LAN, Bluetooth, etc. In certain embodiments, the network adapter 112 and network protocol layer 116 may implement the Ethernet protocol, token ring protocol, Fibre Channel protocol, InfiniBand, Serial Advanced Technology Attachment (SATA), parallel SCSI, serial attached SCSI cable, etc., or any other network communication protocol known in the art.
A device driver 120 executes in the memory 106 and includes commands specific to the network adapter 112 for communicating with the network adapter 112 and for interfacing between the operating system 110 and the network adapter 112. In certain implementations, the network adapter 112 includes a transport protocol layer 121 as well as the network protocol layer 116. For example, the network adapter 112 can implement a TCP/IP offload engine (TOE), in which transport layer operations are performed within the offload engine of the transport protocol layer 121 implemented in the network adapter 112 hardware, as opposed to the device driver 120.
The network protocol layer 116 handles network communication and provides received TCP/IP packets to the transport protocol layer 121, which decrypts the packets if they are encrypted. The transport protocol layer 121 interfaces with the device driver 120 and performs additional transport protocol layer operations, such as processing the decrypted content of messages included in the packets received at the network adapter 112 that are wrapped in a transport layer, such as TCP and/or IP, the Internet Small Computer System Interface (iSCSI), Fibre Channel SCSI, parallel SCSI transport, or any other transport layer protocol known in the art. The transport offload engine 121 can unpack the payload from a received TCP/IP packet and transfer the data to the device driver 120 to return to the application 114.
In certain implementations, the network adapter 112 can further include an RDMA protocol layer 122 as well as the transport protocol layer 121. For example, the network adapter 112 can implement an RDMA offload engine, in which RDMA layer operations are performed within the offload engine of the RDMA protocol layer 122 implemented in the network adapter 112 hardware, as opposed to the device driver 120.
Thus, an application 114 sending messages over an RDMA connection can send the messages through the device driver 120 and the RDMA protocol layer 122 of the network adapter 112. The data of a message can be sent to the transport protocol layer 121 to be packaged in a TCP/IP packet. The transport protocol layer 121 can also encrypt the packet before the network protocol layer 116 sends it out over the network 118.
The memory 106 further includes file objects 124, which may also be referred to as socket objects, containing information on a connection to a remote computer over the network 118. The application 114 uses the information in the file object 124 to identify the connection, and uses the file object 124 to communicate with a remote system. The file object 124 may indicate the local port or socket to be used for communicating with the remote system, the local network (IP) address of the computer 102 on which the application 114 executes, how much data has been sent and received by the application 114, and the remote port and network address, e.g. IP address, with which the application 114 communicates. Context information 126 comprises a data structure holding information that the device driver 120 maintains to manage requests sent to the network adapter 112, as described below.
Fig. 5 illustrates the format of a network packet 150 received at or transmitted by the network adapter 112. An RDMA message includes one or more such packets 150. The network packet 150 is implemented in a format understood by the network protocol layer 116, for example encapsulated in an Ethernet frame that would include additional Ethernet components, such as a header and an error-checking code (not shown). A transport packet 152 is included in the network packet 150. The transport packet 152 is capable of being processed by the transport protocol layer 121 in accordance with a transport layer protocol such as TCP and/or IP, the Internet Small Computer System Interface (iSCSI) protocol, Fibre Channel SCSI, parallel SCSI transport, etc. The transport packet 152 includes payload data 154 as well as other transport layer fields, such as a header and an error-checking code. The payload data 154 includes the underlying content being transmitted, e.g. commands, status, and/or data. The operating system 110 may include a device layer, such as a SCSI driver (not shown), to process the content of the payload data 154 and access any status, commands, and/or data therein.
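The nesting described for Fig. 5 can be pictured with a small data-structure sketch. The field names, the integer error-checking codes, and the helper `message_payload` are assumptions chosen for illustration; real Ethernet and TCP headers carry many more fields.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class TransportPacket:          # transport packet 152
    header: bytes               # e.g. a TCP header
    payload: bytes              # payload data 154: commands, status, and/or data
    checksum: int               # transport-layer error-checking code

@dataclass
class NetworkPacket:            # network packet 150
    header: bytes               # e.g. an Ethernet header
    transport: TransportPacket
    frame_check: int            # network-layer error-checking code

def message_payload(packets: Iterable[NetworkPacket]) -> bytes:
    """Reassemble a message that spans one or more network packets 150."""
    return b"".join(p.transport.payload for p in packets)

packets = [
    NetworkPacket(b"eth", TransportPacket(b"tcp", b"hello ", 0), 0),
    NetworkPacket(b"eth", TransportPacket(b"tcp", b"world", 0), 0),
]
print(message_payload(packets))   # b'hello world'
```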
If a particular TCP connection of the source host is accorded a relatively large TCP window 20 (Fig. 1) by the destination host when sending data over the TCP connection, it is appreciated that the TCP connection with the large TCP window can continue sending data in a manner that consumes the resources of the source host to the exclusion of other TCP connections of the source host. As a consequence, the other TCP connections of the source host may be hindered from sending data. In one implementation, shown in Fig. 6, the computer 102 uses a virtual window 200 when sending data over a TCP connection, and this virtual window can be substantially smaller than the TCP window provided by the destination host of the TCP connection. When the TCP next data pointer 22 reaches the end boundary 202 of the virtual window 200, the source host 102 stops sending data over that TCP connection, even though the TCP next data pointer 22 has not yet reached the end boundary 28 of the TCP window 20. As a result, other connections are given the opportunity to use the resources of the computer 102, so the resources are shared more fairly.
Fig. 7 illustrates operations of the device driver 120 and the network adapter 112 in using a virtual window 200 to allocate the data transmission resources of the computer 102. In response to a request from a software application 114, a TCP connection is established (block 210) between the computer 102 and a destination host. In establishing the TCP connection, a protocol control block, such as one of the protocol control blocks 222a, 222b, ..., 222n (Fig. 8), is populated in a manner similar to the protocol control blocks 18a, 18b, ..., 18n of Fig. 2. In accordance with the TCP RFC, each protocol control block 222a, 222b, ..., 222n has a field 17a, 17b, ..., 17n for storing the TCP unacknowledged data pointer 16, a field 21a, 21b, ..., 21n for storing the TCP window, and a field 23a, 23b, ..., 23n for storing the TCP next data pointer of the associated TCP connection.
In this implementation, the virtual window 200 for the TCP connection has a maximum value, referred to herein as the virtual window maximum, which is stored in a field 224a, 224b, ..., 224n of the associated protocol control block 222a, 222b, ..., 222n. The size of the TCP window 20 received from the destination host of the TCP connection is compared (block 230) with the virtual window maximum. If the TCP window 20 is not smaller than the virtual window maximum, the size of the virtual window 200 is set to the virtual window maximum stored in the field 224a, 224b, ..., 224n of the protocol control block 222a, 222b, ..., 222n that controls the TCP connection. The size of the virtual window 200 is stored in a field 233a, 233b, ..., 233n of the protocol control block 222a, 222b, ..., 222n that controls the TCP connection.
Before any data is sent, the TCP unacknowledged data pointer 16 of the protocol control block 222a, 222b, ..., 222n that controls the TCP connection is set to the sequence number of the first data packet to be sent. The computer 102 begins sending data packets to the destination host (block 234). The amount of data sent in this step may vary depending on the particular application. In many applications, however, the amount of data sent in this step will be a relatively small proportion of the size of the virtual window 200. The TCP next data pointer 22 of the protocol control block 222a, 222b, ..., 222n that controls the TCP connection is set to the sequence number of the next data packet to be sent.
After these data packets are sent, a check is made (block 236) to determine whether the destination host has acknowledged receiving any of the sent data packets. If so, the virtual window 200 (Fig. 6) is moved (block 240) by advancing the TCP unacknowledged data pointer 16 of the protocol control block 222a, 222b, ..., 222n that controls the TCP connection. The TCP unacknowledged data pointer 16 is advanced to mark the beginning of the sequence numbers of the data packets that have been sent but not yet acknowledged. Those packets that have been sent but not yet acknowledged are shown in Fig. 6 as a portion 14 of the data stream 250. Those packets that have been sent and acknowledged are shown in Fig. 6 as a portion 12. The TCP unacknowledged data pointer 16 marks the boundary 252 between the sent and acknowledged packets 12 and the sent but unacknowledged packets 14 of the data stream 250. The TCP unacknowledged data pointer 16 also marks the beginning boundary 254 of the virtual window 200 and of the TCP window 20.
Thus, as the TCP unacknowledged data pointer 16 is moved (block 240), the virtual window 200 and the TCP window 20 are moved as well whenever the destination host acknowledges (block 236) receipt of data packets over this particular TCP connection with the computer 102. If, on the other hand, no packets have been acknowledged (block 236), the virtual window 200, the TCP window 20, and the TCP unacknowledged data pointer 16 remain unmoved.
Those data packets that have not yet been sent but are permitted to be sent without receiving any further acknowledgments are shown in Fig. 6 as a portion 256. The TCP next data pointer 22 marks the beginning boundary 258 of the portion 256, and the end boundary 202 of the virtual window 200 marks the end boundary 260 of the portion 256 of the data stream that is permitted to be sent but has not yet been sent. If the TCP next data pointer 22 has reached the end boundary 202 of the virtual window 200, indicating that the stream of sent data has reached (block 262) the end of the virtual window 200, the size of the portion 256 is zero and no further data packets can be sent until additional acknowledgments are received from the destination host. Once additional acknowledgments are received, the virtual window 200 is moved (block 240), forming a new portion 256 and permitting additional packets to be sent.
Note that the stream of sent but unacknowledged data packets (portion 14) is paused upon reaching the end boundary 202 of the virtual window 200 rather than the end boundary 28 of the larger TCP window 20. Thus, a portion 264 of the data stream 250, lying between the end boundary 202 of the virtual window 200 and the end boundary 28 of the TCP window 20, which the TCP window alone would have permitted to be sent without waiting for additional acknowledgments, is withheld until additional acknowledgments are received. Consequently, other connections can use the resources of the computer 102 to send data packets while the connection that has reached the end boundary 202 of its virtual window 200 awaits additional acknowledgments.
If the TCP next data pointer 22 has not yet reached the end boundary 202 of the virtual window 200, indicating that the stream of sent data has not yet reached (block 262) the end of the virtual window 200, the size of the portion 256 is nonzero. Accordingly, additional packets can be sent (block 234) until the end of the virtual window is reached (block 262).
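The send loop of Fig. 7 might be sketched as follows. This is an illustrative model under stated assumptions rather than the patented implementation: the `Connection` class and its method names are hypothetical, sequence numbers are plain integers without wraparound, and the reference numerals in the comments only tie the fields back to the description.

```python
from dataclasses import dataclass

@dataclass
class Connection:
    tcp_window: int          # field 21: window advertised by the destination
    virtual_window_max: int  # field 224: virtual window maximum
    unacked: int = 0         # field 17: TCP unacknowledged data pointer 16
    next_seq: int = 0        # field 23: TCP next data pointer 22

    @property
    def virtual_window(self) -> int:
        """Field 233: the smaller of the TCP window and the virtual window maximum."""
        return min(self.tcp_window, self.virtual_window_max)

    def sendable(self) -> int:
        """Size of portion 256: room left before end boundary 202."""
        return max(self.unacked + self.virtual_window - self.next_seq, 0)

    def send(self, amount: int) -> int:
        """Block 234: send up to `amount`, never past the virtual window."""
        sent = min(amount, self.sendable())
        self.next_seq += sent
        return sent

    def acknowledge(self, ack_seq: int) -> None:
        """Blocks 236 and 240: an acknowledgment slides both windows forward."""
        if self.unacked < ack_seq <= self.next_seq:
            self.unacked = ack_seq

conn = Connection(tcp_window=128, virtual_window_max=16)
total_sent = 0
while conn.sendable() > 0:           # block 262: stop at the end of the virtual window
    total_sent += conn.send(4)
print(total_sent)                    # 16, although the TCP window would allow 128

conn.acknowledge(8)
print(conn.sendable())               # 8
```

The connection pauses after 16 units even though the destination advertised 128, which is the behaviour that leaves source resources free for other connections.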
It is appreciated that the destination host may begin to run out of buffer space and therefore advertise a smaller TCP window 20. In that case, the value of the TCP window stored in the field 21a, 21b, ..., 21n of the associated protocol control block 222a, 222b, ..., 222n can be reset to this smaller value. If the TCP window 20 of the destination host should become smaller than (block 230) the virtual window maximum, the size of the virtual window 200 can be reset (block 270) to the size of the TCP window, as shown in Fig. 9. The end boundary 202 of the virtual window 200 then coincides with the end boundary 28 of the TCP window 20. As a result, the virtual window 200 will not exceed the capacity of the destination host as represented by the TCP window 20 that the destination host advertises.
In the implementation of Fig. 7, the virtual window 200 is set to the smaller of the TCP window 20 and the virtual window maximum. It is appreciated that the virtual window can be set in a variety of ways. For example, the virtual window maximum can be a fixed size, such as 16 KB. Alternatively, the size of the virtual window can vary as a function of the size of the TCP window advertised by the destination host. For example, the virtual window can be set to a fraction of the TCP window, and this fraction can be, for example, a fixed fraction. In addition, the virtual window can be set as a function of the number of active connections of the driver or network adapter. Thus, for example, the virtual window can be set to 1/N of the TCP window, where N is the number of active connections of the driver or network adapter.
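The alternative sizing policies listed above might be expressed as follows. The function names and the default fraction of one half are assumptions made for illustration only; the text names only the 16 KB fixed size and the 1/N share as examples.

```python
def fixed_virtual_window(size_bytes=16 * 1024):
    """Fixed-size policy; 16 KB is the example size given in the text."""
    return size_bytes


def fractional_virtual_window(tcp_window, fraction=0.5):
    """Fixed fraction of the advertised TCP window (the default of 0.5 is an assumed value)."""
    return int(tcp_window * fraction)


def per_connection_virtual_window(tcp_window, active_connections):
    """1/N of the advertised TCP window, where N is the number of active connections."""
    return tcp_window // max(1, active_connections)
```

With a 64 KB advertised window and four active connections, the 1/N policy gives each connection a 16 KB virtual window.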
Further, each virtual window maximum 224a, 224b ..., 224n can be programmed so that one connection obtains a higher quality of service (QoS) than another, by providing a larger virtual window maximum to those connections for which a higher QoS is desired. Moreover, the virtual window maximum can be changed at any time during a particular connection. Thus, the ability to change the virtual window maximum associated with a particular connection at will can be used to change the QoS at any time during the life of that connection.
It is further appreciated that, for some applications, one of the applications 114 of the computer 102 (Fig. 4) may generate, for an RDMA connection 50a, 50b ..., 50n, an application queue 52a, 52b ..., 52n (Fig. 3) that is much larger than the entire pool 64 of empty messages. This can occur, for example, if the application has a large number of messages to be stored in specified memory locations of the destination host over the RDMA connection. An RDMA connection with a large application queue can dominate the use of the empty messages of the pool 64 and continue sending messages, consuming the resources of the source host to the exclusion of the other RDMA connections of the source host. The other RDMA connections of the source host may consequently be prevented from sending messages.
In one implementation, shown in Figure 10, the computer 102, when sending messages over the RDMA connections 350a, 350b ..., 350n, limits the number of empty messages in the pool 364 of the network interface 366 that any one of the RDMA connections 350a, 350b ..., 350n can consume. More specifically, the pool 364 of empty messages is divided into a plurality of limited pools 372a, 372b ..., 372n of empty messages, each of which can be much smaller than the entire pool 364. When one of the RDMA connections 350a, 350b ..., 350n reaches the limit imposed by its associated limited pool 372a, 372b ..., 372n of empty messages, the source host 102 stops pulling messages from the associated application queue 52a, 52b ..., 52n of that particular RDMA connection, even if other empty messages in the pool 364 remain unused. When additional empty messages are replenished in the associated limited pool 372a, 372b ..., 372n, the source host 102 resumes pulling messages from the associated application queue 52a, 52b ..., 52n of that particular RDMA connection. The other RDMA connections are thereby given an opportunity to use the resources of the computer 102, so that the resources are shared more fairly.
Figure 11 illustrates the operations of the device driver 120 and the network adapter 122 in using the limited pools 372a, 372b ..., 372n of empty messages to allocate the message transmission resources of the computer 102. In this example, the RDMA connection 350a is discussed. The other RDMA connections 350b ..., 350n operate in a similar fashion.
In response to a request from the software application 114, the RDMA connection 350a is established (block 410) between the computer 102 and a destination host. In one implementation, the RDMA connection 350a runs over a TCP connection. Thus, when the RDMA connection 350a is established, the protocol control block 222a (Fig. 8) of the protocol control blocks 222a, 222b ..., 222n is populated with the field 17a for storing the TCP unacknowledged data pointer 16, the field 21a for storing the TCP window, and the field 23a for storing the TCP next pointer of the associated TCP connection, in accordance with the TCP RFC. In addition, as discussed above, the TCP connection can use the virtual window 200 to provide fairness among the TCP connections of the computer 102. Accordingly, a virtual window maximum can be stored in the field 224a of the associated protocol control block 222a.
The size of the limited pool 372a of empty messages of the particular RDMA connection 350a is set to the message limit value stored in the field 424a of the associated protocol control block 222a. In the illustrated embodiment, the RDMA connection parameters are stored in the same protocol control block as the TCP connection parameters. It is appreciated that the RDMA connection parameters could instead be stored in an entirely separate control block or in other data structures.
The computer 102 begins sending messages to specified memory locations of the destination host by pulling messages from the application queue 52a of the queue pair 51a of the particular RDMA connection 350a and queuing them (block 434) in the network interface queue 60a, which is paired with the associated application queue 52a of the software application. Each message queued in the network interface queue 60a consumes one empty message from the limited pool 372a of empty messages associated with the RDMA connection 350a. The number of messages queued in this step can vary depending on the particular application. In many applications, however, the amount of data queued will be a relatively small proportion of the message limit size.
After the message or messages have been queued, a check (block 436) is made to determine whether the destination host has acknowledged receipt of any of the messages sent over the RDMA connection 350a. If so, one empty message is replenished in the limited pool 372a of empty messages associated with the RDMA connection 350a for each message so acknowledged.
A check (block 450) is made to determine whether the limited pool 372a of empty messages associated with the RDMA connection 350a is empty. If so, no further RDMA messages of the particular RDMA connection 350a can be queued in the network interface queue 60a for sending to the destination host until additional acknowledgments are received. Control therefore returns to block 436 to await further acknowledgments for the RDMA connection 350a. If, on the other hand, the limited pool 372a of empty messages of the RDMA connection 350a is not empty (block 450), additional RDMA messages from the application queue 52a of the RDMA connection 350a can be queued in the network interface queue 60a for sending to the destination host. In this manner, the total number of messages queued in the network interface queue 60a plus the total number of unacknowledged messages sent over the RDMA connection 350a cannot exceed the size set for the limited pool 372a of empty messages of the RDMA connection 350a.
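The queue, acknowledge, and replenish cycle of blocks 434, 436 and 450 can be illustrated with a short Python sketch. The `RdmaConnection` class and its method names are hypothetical, and the sketch makes a simplifying assumption: a message counts against the limited pool from the moment it is queued in the network interface queue until it is acknowledged, so queued and sent but unacknowledged messages are tracked together.

```python
from collections import deque


class RdmaConnection:
    """Hypothetical model of one RDMA connection with a limited pool of empty messages."""

    def __init__(self, message_limit):
        self.message_limit = message_limit   # e.g. the value held in field 424a
        self.free_messages = message_limit   # empty messages left in the limited pool
        self.application_queue = deque()     # messages posted by the application
        self.outstanding = deque()           # queued in the network interface queue or awaiting acknowledgment

    def post(self, message):
        """The application places a message on its application queue."""
        self.application_queue.append(message)

    def pump(self):
        """Pull messages from the application queue while the limited pool is not empty."""
        queued = 0
        while self.application_queue and self.free_messages > 0:
            self.outstanding.append(self.application_queue.popleft())
            self.free_messages -= 1          # each queued message consumes one empty message
            queued += 1
        return queued

    def on_acknowledged(self, count=1):
        """Replenish one empty message for each message the destination acknowledges."""
        for _ in range(min(count, len(self.outstanding))):
            self.outstanding.popleft()
            self.free_messages += 1
```

With a message limit of two, a third posted message stays on the application queue until one of the first two is acknowledged, which is the behavior that leaves network interface resources available to other connections.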
The RDMA connections 350b ..., 350n operate in a similar fashion. Accordingly, a message limit is stored in the field 424b ..., 424n of the protocol control block 222b ..., 222n associated with each RDMA connection 350b ..., 350n.
In the implementation of Figure 11, the size of each limited pool 372a, 372b ..., 372n of empty messages of the RDMA connections 350a, 350b ..., 350n is set to the message limit stored in the protocol control block 222a, 222b ..., 222n associated with the respective RDMA connection 350a, 350b ..., 350n. It is appreciated that the size of each limited pool 372a, 372b ..., 372n can be set in a variety of ways. For example, the message limit can be a fixed size, such as a fixed number of messages. The size of the message limit for each RDMA connection can also be set to a fixed fraction of the size of the pool 364 of empty messages. Alternatively, the size of the message limit for each RDMA connection can vary as a function of the size of the associated application queue 52a, 52b ..., 52n. For example, the size of the message limit for each RDMA connection can be set to a fraction of the size of the pool 364, where the size of each fraction is proportional to the size of the associated application queue 52a, 52b ..., 52n. In addition, the message limit for each RDMA connection can be set as a function of the number of active connections of the driver or network adapter. For example, the size of the message limit for each RDMA connection can be set to 1/N of the size of the pool 364, where N is the number of active connections of the driver or network adapter. In this example, the sum of all the limited pools 372a, 372b ..., 372n remains equal to or less than the entire pool 364 of empty messages. Also, each queue pair 51a, 51b ..., 51n can have an independent pool of empty messages for the respective RDMA connection 350a, 350b ..., 350n.
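Two of the sizing options above, the 1/N share and the share proportional to application queue size, might be computed as in the following sketch. The function names are hypothetical; integer floor division is used so that the limits never sum to more than the shared pool.

```python
def equal_message_limits(pool_size, active_connections):
    """Give each of N active connections 1/N of the shared pool of empty messages."""
    if active_connections <= 0:
        return []
    return [pool_size // active_connections] * active_connections


def proportional_message_limits(pool_size, application_queue_sizes):
    """Size each limit in proportion to the size of its application queue."""
    total = sum(application_queue_sizes)
    if total == 0:
        # No queued work anywhere: fall back to an equal split.
        return equal_message_limits(pool_size, len(application_queue_sizes))
    return [pool_size * size // total for size in application_queue_sizes]
```

Rounding down can leave a few empty messages unassigned; a real driver could hand those out by any tie-breaking rule, which is outside the scope of this sketch.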
Further, each limited pool 372a, 372b ..., 372n of empty messages can be programmed so that one connection obtains a higher quality of service (QoS) than another, by providing a larger limited pool of empty messages to those connections for which a higher QoS is desired. Moreover, the limited pool of empty messages can be changed at any time during a particular connection. Thus, the ability to change the limited pool of empty messages associated with a particular connection at will can be used to change the QoS at any time during the life of that connection.
Thus, it is seen that the virtual window maximums 224a, 224b ..., 224n and the limited pools 372a, 372b ..., 372n of empty messages can be programmed for each connection to provide different levels of quality of service (QoS) according to the needs of the application. Moreover, the virtual window maximums 224a, 224b ..., 224n and the limited pools 372a, 372b ..., 372n of empty messages can be changed at any time during the life of a connection to change the QoS.
Additional embodiment details
The described techniques for processing requests directed to a network card may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof. The term "article of manufacture" as used herein refers to code or logic implemented in hardware logic (e.g., an integrated circuit chip, programmable gate array (PGA), application-specific integrated circuit (ASIC), etc.) or in a computer-readable medium, such as magnetic storage media (e.g., hard disk drives, floppy disks, tape, etc.), optical storage (CD-ROMs, optical disks, etc.), and volatile and non-volatile memory devices (e.g., EEPROMs, ROMs, PROMs, RAMs, DRAMs, SRAMs, firmware, programmable logic, etc.). Code in the computer-readable medium is accessed and executed by a processor. The code in which preferred embodiments are implemented may also be accessible through a transmission medium or from a file server over a network. In such cases, the article of manufacture in which the code is implemented may comprise a transmission medium, such as a network transmission line, wireless transmission media, signals propagating through space, radio waves, infrared signals, etc. Thus, the "article of manufacture" may comprise the medium in which the code is embodied. Additionally, the "article of manufacture" may comprise a combination of hardware and software components in which the code is embodied, processed, and executed. Of course, those skilled in the art will recognize that many modifications may be made to this configuration without departing from the scope of the present invention, and that the article of manufacture may comprise any information-bearing medium known in the art.
In the described embodiments, certain operations were described as being performed by the device driver 120 or by protocol layers of the network adapter 112. In alternative embodiments, operations described as performed by the device driver 120 may be performed by the network adapter 112, and vice versa.
In the described embodiments, packets are transmitted from a network adapter card to a remote computer over a network. In alternative embodiments, the transmitted and received packets processed by the protocol layers or device driver may be transmitted to a separate process executing in the same computer in which the device driver and transport protocol driver execute. In such embodiments, the network card is not used, because the packets are passed between processes within the same computer and/or operating system.
In certain embodiments, the device driver and network adapter embodiments may be included in a computer system that includes a storage controller, such as a SCSI, Integrated Drive Electronics (IDE), or Redundant Array of Independent Disks (RAID) controller, that manages access to a non-volatile storage device such as a magnetic disk drive, tape media, optical disk, etc. In alternative embodiments, the network adapter embodiments may be included in a system that does not include a storage controller, such as certain hubs and switches.
In certain embodiments, the device driver and network adapter embodiments may be implemented in a computer system that includes a video controller to render information for display on a monitor coupled to the computer system, where the computer system, which includes the device driver and network adapter, comprises, for example, a desktop computer, workstation, server, mainframe, laptop, or handheld computer. Alternatively, the network adapter and device driver embodiments may be implemented in a computing device that does not include a video controller, such as a switch or router.
In certain embodiments, the network adapter may be configured to transmit data across a cable connected to a port on the network adapter. Alternatively, the network adapter embodiments may be configured to transmit data over a wireless network or connection, such as wireless LAN, Bluetooth, etc.
Fig. 8 illustrates information used to populate the protocol control blocks. In alternative embodiments, these data structures may include additional or different information than that shown in the figures.
The logic illustrated in Fig. 7 and Figure 11 shows certain events occurring in a certain order. In alternative embodiments, certain operations may be performed in a different order, modified, or removed. Moreover, steps may be added to the above-described logic and still conform to the described embodiments. Further, operations described herein may occur sequentially, or certain operations may be processed in parallel. In addition, operations may be performed by a single processing unit or by distributed processing units.
Figure 12 illustrates one implementation of a computer architecture 500 of network components such as the hosts and storage devices shown in Fig. 4 and Figure 10. The architecture 500 may include a processor 502 (e.g., a microprocessor), a memory 504 (e.g., a volatile memory device), and storage 506 (e.g., non-volatile storage, such as magnetic disk drives, optical disk drives, a tape drive, etc.). The storage 506 may comprise an internal storage device or an attached or network-accessible storage device. Programs in the storage 506 are loaded into the memory 504 and executed by the processor 502 in a manner known in the art. The architecture further includes a network card 508 to enable communication with a network, such as Ethernet or Fibre Channel Arbitrated Loop (FC-AL). Further, in certain embodiments, the architecture may include a video controller 509 to render information on a display monitor, where the video controller 509 may be implemented on a video card or integrated on integrated circuit components mounted on the motherboard. As discussed, certain network devices may have multiple network cards. An input device 510 is used to provide user input to the processor 502 and may include a keyboard, mouse, stylus, microphone, touch-sensitive display screen, or any other activation or input mechanism known in the art. An output device 512 is capable of rendering information transmitted from the processor 502 or another component, and may be, for example, a display monitor, printer, or storage device.
The network adapter 508 may be implemented on a network card, such as a Peripheral Component Interconnect (PCI) card or some other I/O card, on integrated circuit components mounted on the motherboard, or in software.
The foregoing description of various embodiments of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description, but rather by the appended claims. The above specification, examples, and data provide a complete description of the manufacture and use of the composition of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.

Claims (39)

1. A method for sending data, comprising:
Establishing an active connection adapted to send data packets between a host and a destination;
Receiving from the destination a first window value representing a first quantity of data packets;
Sending data packets from said host to said destination;
Receiving from said destination an acknowledgment for each data packet received by said destination, wherein said first window value represents a limit imposed on said host by said destination on the quantity of data packets sent from said host to said destination and lacking an acknowledgment of being received by the destination; and
Limiting the number of packets sent by said host but not acknowledged as received by said destination to a second quantity of data packets less than said first window value.
2. the process of claim 1 wherein that this connection is that transmission control protocol between main frame and the destination connects and wherein said first window value is a transmission control protocol send window value.
3. the method for claim 1 also comprises:
Between main frame and a plurality of destination, set up a plurality of effective connections;
Receive first window value that representative is used for first number of data packets of this connection from each destination;
Send packet from described main frame to each destination;
Receive the affirmation of each packet that receives by each destination from each destination, wherein the restriction that on described main frame, applies by this purpose of connecting ground of first window value of each connection representative to the quantity that sends to packet this purpose of connecting ground and that lack the affirmation that receives by this purpose of connecting ground from described main frame; With
Will by described main frame to each connect to send but not being confirmed to be the number of packet that is received by each purpose of connecting ground is restricted to second number of data packets less than the window value of this connection;
Wherein less than second quantity of each connection of the window value that the connects quantity of Host Based effective connection at least in part.
4. The method of claim 2, wherein said host has a plurality of Transmission Control Protocol connections, each Transmission Control Protocol connection having a protocol control block that stores a Transmission Control Protocol send window value and a virtual window value less than said Transmission Control Protocol send window value, and wherein each virtual window value limits the number of packets sent by said host but not acknowledged as received by the destination of the respective Transmission Control Protocol connection to the second quantity of data packets defined by the virtual window value of that Transmission Control Protocol connection.
5. The method of claim 3, further comprising:
In response to the destination reducing the size of the Transmission Control Protocol send window value to a third quantity less than the second quantity, limiting the number of packets sent by said host but not acknowledged as received by said destination to a fourth quantity of data packets no greater than the reduced size of the Transmission Control Protocol send window value.
6. The method of claim 1, further comprising:
Establishing a plurality of active direct memory access connections between said host and a plurality of specified memory locations of a plurality of destinations;
Sending a plurality of messages to the specified memory locations of the destinations of the direct memory access connections, wherein each message comprises a plurality of data packets;
Receiving message acknowledgments, each message acknowledgment being sent by a destination for each message received by that destination; and
Establishing a plurality of message limits, each message limit imposing a separate limit, for the respective direct memory access connection, on the quantity of messages sent from said host to the specified memory location of the direct memory access connection associated with that message limit and lacking a message acknowledgment of being received by the destination of that direct memory access connection.
7. The method of claim 6, wherein each direct memory access connection is between an application of said host and a network interface that connects said host to a network of the plurality of destinations; wherein said network interface includes, for each direct memory access connection, a queue adapted to queue messages to be sent over the direct memory access connection associated with that queue; wherein each sending of a message to the specified memory location of the destination of a direct memory access connection includes queuing the message in the network interface queue associated with that direct memory access connection; and wherein the queuing of messages in the network interface queue associated with a direct memory access connection is suspended when the quantity of messages sent from said host to the specified memory location of the associated direct memory access connection and lacking a message acknowledgment of being received by the destination of the associated direct memory access connection reaches the separate message limit imposed on the direct memory access connection associated with that network interface queue.
8. The method of claim 7, wherein the queuing of messages in the network interface queue associated with a direct memory access connection is resumed when the quantity of messages sent from said host to the specified memory location of the associated direct memory access connection and lacking a message acknowledgment of being received by the destination of the associated direct memory access connection is less than the separate message limit imposed on the direct memory access connection associated with that network interface queue.
9. The method of claim 8, wherein the packet sending connection is a Transmission Control Protocol connection between the host and the destination, and wherein each direct memory access connection is a remote direct memory access connection between the host and the destination of the direct memory access connection.
10. The method of claim 9, wherein said network interface has a pool of empty messages that imposes a limit on the total quantity of messages sent from said host to all the specified memory locations of all the direct memory access connections and lacking a message acknowledgment of being received by the destination of the associated direct memory access connection, and wherein each message limit is less than the network interface pool of empty messages.
11. The method of claim 6, wherein each message limit is based at least in part on the number of active direct memory access connections of the host.
12. The method of claim 6, further comprising changing the size of the message limit of an active direct memory access connection before sending certain messages over the associated direct memory access connection.
13. The method of claim 6, wherein each message limit is based at least in part on the number of active direct memory access connections of the host.
14. The method of claim 1, further comprising changing, before sending certain packets, the size of the second quantity of packets that limits the number of packets sent by the host but not acknowledged as received by the destination.
15. A system adapted to communicate with a data storage device and with a destination having memory locations, comprising:
A system memory;
A processor coupled to the system memory;
A network adapter;
A data storage controller for managing input/output (I/O) access to the data storage device;
A device driver in the memory and executable by the processor, wherein at least one of the device driver and the network adapter is adapted to:
(i) establish an active connection adapted to send data packets between a host and a destination;
(ii) receive from the destination a first window value representing a first quantity of data packets;
(iii) send data packets from the system to said destination;
(iv) receive from said destination an acknowledgment for each data packet received by said destination, wherein said first window value represents a limit imposed on said system by said destination on the quantity of data packets sent from said system to said destination and lacking an acknowledgment of being received by the destination; and
(v) limit the number of packets sent by said system but not acknowledged as received by said destination to a second quantity of data packets less than said first quantity.
16. The system of claim 15, wherein the data storage device comprises a magnetic storage medium.
17. The system of claim 15, wherein said connection is a Transmission Control Protocol connection between the host and the destination, and wherein said first window value is a Transmission Control Protocol send window value.
18. The system of claim 15, wherein at least one of the device driver and the network adapter is further adapted to:
Establish a plurality of active connections between the system and a plurality of destinations;
Receive from each destination a first window value representing a first quantity of data packets for that connection;
Send data packets from said system to each destination;
Receive from each destination an acknowledgment for each data packet received by that destination, wherein the first window value of each connection represents a limit imposed on said system by the destination of that connection on the quantity of data packets sent from said system to the destination of that connection and lacking an acknowledgment of being received by the destination of that connection; and
Limit the number of packets sent by said system over each connection but not acknowledged as received by the destination of that connection to a second quantity of data packets less than the window value of that connection;
Wherein the second quantity of each connection, which is less than the window value of that connection, is based at least in part on the number of active connections of the system.
19. The system of claim 17, wherein at least one of the device driver and the network adapter is adapted to establish a plurality of Transmission Control Protocol connections, each Transmission Control Protocol connection having a protocol control block data structure that stores a Transmission Control Protocol send window value and a virtual window value less than said Transmission Control Protocol send window value, and wherein each virtual window value limits the number of packets sent by said system but not acknowledged as received by the destination of the respective Transmission Control Protocol connection to the second quantity of data packets defined by the virtual window value of that Transmission Control Protocol connection.
20. The system of claim 19, wherein at least one of the device driver and the network adapter is adapted to:
In response to the destination reducing the size of the Transmission Control Protocol send window value to a third quantity less than the second quantity, limit the number of packets sent by said system but not acknowledged as received by said destination to a fourth quantity of data packets no greater than the reduced size of the Transmission Control Protocol send window value.
21. The system of claim 15, wherein at least one of the device driver and the network adapter is adapted to:
Establish a plurality of active direct memory access connections between said system and a plurality of specified memory locations of a plurality of destinations;
Send a plurality of messages to the specified memory locations of the destinations of the direct memory access connections, wherein each message comprises a plurality of data packets;
Receive message acknowledgments, each message acknowledgment being sent by a destination for each message received by that destination; and
Establish a plurality of message limits, each message limit imposing a separate limit, for the respective direct memory access connection, on the quantity of messages sent from said system to the specified memory location of the direct memory access connection associated with that message limit and lacking a message acknowledgment of being received by the destination of that direct memory access connection.
22. the system of claim 21, wherein at least one among device driver and the network adapter is suitable for being provided for each direct memory access (DMA) and is connected and is suitable for lining up and will connects the formation of the message that sends by the direct memory access (DMA) relevant with each formation, wherein when message being sent at every turn the designated memory cell on direct memory access (DMA) purpose of connecting ground, being suitable for one of at least among device driver and the network adapter: with the formation of direct memory access (DMA) join dependency in queuing message; Reach when the direct memory access (DMA) relevant with this formation connected the independent message restriction that applies the queuing of message in the formation of time-out and direct memory access (DMA) join dependency with the quantity that sends when the designated memory cell that is connected to relevant direct memory access (DMA) from described system and lack the message of the message authentication that receives by relevant direct memory access (DMA) purpose of connecting ground.
23. The system of claim 22, wherein at least one of the device driver and the network adapter is adapted to resume the queuing of messages in the queue associated with the direct memory access connection when the quantity of messages sent from said system to the specified memory location of the associated direct memory access connection and lacking a message acknowledgment of being received by the destination of the associated direct memory access connection is less than the separate message limit imposed on the direct memory access connection associated with the network interface queue.
24. The system of claim 23, wherein the packet sending connection is a Transmission Control Protocol connection between the system and the destination of the packet sending connection, and wherein each direct memory access connection is a Remote Direct Memory Access connection between the system and the destination of that direct memory access connection.
25. The system of claim 24, wherein at least one of the device driver and the network adapter is adapted to provide a pool of empty messages which imposes a limit on the total quantity of messages sent from said system to all the specified memory locations of all the direct memory access connections and lacking a message acknowledgment of being received by the destination of the associated direct memory access connection, and wherein each message limit is less than the pool of empty messages.
26. An article of manufacture for managing data transmission through a network, wherein the article of manufacture causes operations to be performed, the operations comprising:
Establishing an active connection adapted to send data packets between a host and a destination;
Receiving from the destination a first window value representing a first quantity of data packets;
Sending data packets from said host to said destination;
Receiving an acknowledgment from said destination for each data packet received by said destination, wherein said first window value represents a limit imposed on said host by said destination on the quantity of data packets sent from said host to said destination and lacking an acknowledgment of being received by the destination; and
Limiting the number of packets sent by said host but not yet acknowledged as received by said destination to a second quantity of data packets less than said first quantity.
27. The article of manufacture of claim 26, wherein said connection is a Transmission Control Protocol connection between the host and the destination, and wherein said first window value is a Transmission Control Protocol send window value.
28. The article of manufacture of claim 27, wherein said operations further comprise:
Establishing a plurality of active connections between the host and a plurality of destinations;
Receiving from each destination a first window value representing a first quantity of data packets for that connection;
Sending data packets from said host to each destination;
Receiving an acknowledgment from each destination for each data packet received by that destination, wherein the first window value of each connection represents a limit imposed on said host by the destination of the connection on the quantity of data packets sent from said host to the destination of the connection and lacking an acknowledgment of being received by the destination of the connection; and
Limiting the number of packets sent by said host over each connection but not acknowledged as received by the destination of that connection to a second quantity of data packets less than the window value of that connection;
Wherein the second quantity of each connection, which is less than the window value of that connection, is based at least in part on the number of active connections of the host.
29. The article of manufacture of claim 28, wherein said host has a plurality of Transmission Control Protocol connections, each Transmission Control Protocol connection having a protocol control block which stores a Transmission Control Protocol send window value and a virtual window value less than said Transmission Control Protocol send window value, wherein each virtual window value limits the number of packets sent by said host but not acknowledged as received by the destination of the respective Transmission Control Protocol connection to the second quantity of data packets defined by the virtual window value of that Transmission Control Protocol connection.
30. The article of manufacture of claim 28, wherein said operations further comprise:
In response to the destination reducing the size of the Transmission Control Protocol send window value to a third quantity less than the second quantity, limiting the number of packets sent by said host but not acknowledged as received by said destination to a fourth quantity of data packets no greater than the reduced size of the Transmission Control Protocol send window value.
31. The article of manufacture of claim 26, wherein said operations further comprise:
Establishing a plurality of active direct memory access connections between said host and a plurality of specified memory locations of a plurality of destinations;
Sending a plurality of messages to the specified memory locations of the destinations of the direct memory access connections, wherein each message comprises a plurality of data packets;
Receiving message acknowledgments, each message acknowledgment being sent by a destination for each message received by that destination; and
Establishing a plurality of message limits, each message limit imposing a separate limit, for each direct memory access connection, on the quantity of messages sent from said host to the specified memory location of the direct memory access connection associated with that message limit and lacking a message acknowledgment of being received by the destination of the direct memory access connection associated with that message limit.
32. The article of manufacture of claim 31, wherein each direct memory access connection includes a network interface between an application of said host and a network connecting said host to the plurality of destinations; wherein said network interface includes, for each direct memory access connection, a queue adapted to queue messages to be sent over the direct memory access connection associated with that queue; wherein each sending of a message to the specified memory location of the destination of a direct memory access connection includes queuing the message in the network interface queue associated with the direct memory access connection; and wherein the queuing of messages in the network interface queue associated with the direct memory access connection is suspended when the quantity of messages sent from said host to the specified memory location of the associated direct memory access connection and lacking a message acknowledgment of being received by the destination of the associated direct memory access connection reaches the separate message limit imposed on the direct memory access connection associated with that network interface queue.
33. The article of manufacture of claim 32, wherein the queuing of messages in the network interface queue associated with the direct memory access connection is resumed when the quantity of messages sent from said host to the specified memory location of the associated direct memory access connection and lacking a message acknowledgment of being received by the destination of the associated direct memory access connection is less than the separate message limit imposed on the direct memory access connection associated with that network interface queue.
34. The article of manufacture of claim 33, wherein the packet sending connection is a Transmission Control Protocol connection between the host and the destination, and wherein each direct memory access connection is a Remote Direct Memory Access connection between the host and the destination of that direct memory access connection.
35. The article of manufacture of claim 34, wherein said network interface has a pool of empty messages which imposes a limit on the total quantity of messages sent from said host to all the specified memory locations of all the direct memory access connections and lacking a message acknowledgment of being received by the destination of the associated direct memory access connection, and wherein each message limit is less than the network interface pool of empty messages.
36. The article of manufacture of claim 31, wherein each message limit is based at least in part on the number of active direct memory access connections of the host.
37. The article of manufacture of claim 31, wherein said operations further comprise changing the size of the message limit of an active direct memory access connection before sending some messages over the associated direct memory access connection.
38. The article of manufacture of claim 31, wherein each message limit is based at least in part on the number of active direct memory access connections of the host.
39. The article of manufacture of claim 26, wherein said operations further comprise changing, before sending some packets, the size of the second quantity of packets that limits the number of packets sent by the host but not acknowledged as received by the destination.
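
The limits recited in the claims above can be illustrated with a short sketch. This is not the patented implementation; it is a minimal Python model written under stated assumptions. The names Connection, share_virtual_windows, MessagePool, and total_budget are hypothetical, and the policy of dividing a fixed budget evenly among active connections is only one possible way of making the second quantity depend on the number of active connections. Windows and limits are counted in whole packets or messages, and every advertised window is assumed to be at least 2 so that a strictly smaller virtual window can still allow one packet.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """One packet connection with a host-chosen virtual window (hypothetical model)."""
    advertised_window: int  # "first quantity": limit imposed by the destination
    virtual_window: int     # "second quantity": smaller limit chosen by the host
    unacked: int = 0        # packets sent but not yet acknowledged

    def effective_limit(self) -> int:
        # If the destination shrinks its window below the virtual window,
        # the smaller of the two values governs.
        return min(self.advertised_window, self.virtual_window)

    def try_send(self) -> bool:
        """Send one packet only if the outstanding count is under the limit."""
        if self.unacked >= self.effective_limit():
            return False
        self.unacked += 1
        return True

    def on_ack(self) -> None:
        """An acknowledgment frees one slot."""
        if self.unacked > 0:
            self.unacked -= 1

    def on_window_update(self, new_window: int) -> None:
        """The destination advertises a new window value."""
        self.advertised_window = new_window


def share_virtual_windows(advertised, total_budget):
    """Assumed policy: split a packet budget evenly over the active connections,
    keeping each virtual window strictly below the advertised window."""
    per_connection = max(1, total_budget // len(advertised))
    return [Connection(w, min(per_connection, w - 1)) for w in advertised]


class MessagePool:
    """Per-connection message limits drawn from a shared pool (hypothetical model)."""

    def __init__(self, pool_size, limits):
        if any(limit >= pool_size for limit in limits):
            raise ValueError("each message limit must be less than the pool size")
        self.free = pool_size                 # messages still available in the pool
        self.limits = list(limits)            # separate limit per connection
        self.outstanding = [0] * len(limits)  # unacknowledged messages per connection

    def try_queue(self, conn: int) -> bool:
        """Queue a message unless the connection limit or the pool is exhausted."""
        if self.free == 0 or self.outstanding[conn] >= self.limits[conn]:
            return False  # queuing is suspended for this connection
        self.free -= 1
        self.outstanding[conn] += 1
        return True

    def on_message_ack(self, conn: int) -> None:
        """A message acknowledgment returns one message to the pool,
        which lets queuing resume on a suspended connection."""
        if self.outstanding[conn] > 0:
            self.outstanding[conn] -= 1
            self.free += 1
```

In this sketch a connection whose try_send or try_queue call returns False simply retries after the next acknowledgment, which corresponds to the suspend and resume behavior recited for the per-connection queues. Because no connection can hold more than its own limit, a single busy connection cannot consume the whole pool.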
CNB2004100566580A 2003-09-15 2004-08-13 Method, system, and program for managing data transmission through a network Expired - Fee Related CN100438403C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/663,026 US7870268B2 (en) 2003-09-15 2003-09-15 Method, system, and program for managing data transmission through a network
US10/663026 2003-09-15

Publications (2)

Publication Number Publication Date
CN1599319A true CN1599319A (en) 2005-03-23
CN100438403C CN100438403C (en) 2008-11-26

Family

ID=34274265

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2004100566580A Expired - Fee Related CN100438403C (en) 2003-09-15 2004-08-13 Method, system, and program for managing data transmission through a network

Country Status (2)

Country Link
US (1) US7870268B2 (en)
CN (1) CN100438403C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106953797A (en) * 2017-04-05 2017-07-14 广东浪潮大数据研究有限公司 RDMA data transmission method and device based on dynamic connection

Families Citing this family (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7529778B1 (en) 2001-12-12 2009-05-05 Microsoft Corporation System and method for providing access to consistent point-in-time file versions
US7761529B2 (en) * 2004-06-30 2010-07-20 Intel Corporation Method, system, and program for managing memory requests by devices
US7617256B2 (en) * 2004-07-19 2009-11-10 Microsoft Corporation Remote file updates through remote protocol
US20060168274A1 (en) * 2004-11-08 2006-07-27 Eliezer Aloni Method and system for high availability when utilizing a multi-stream tunneled marker-based protocol data unit aligned protocol
WO2006124357A2 (en) 2005-05-11 2006-11-23 Bigfoot Networks, Inc. Distributed processing system and method
US8332526B2 (en) 2005-05-25 2012-12-11 Microsoft Corporation Data communication protocol including negotiation and command compounding
EP1727055B1 (en) * 2005-05-25 2016-09-07 Microsoft Technology Licensing, LLC Data communication coordination with sequence numbers
US9455844B2 (en) * 2005-09-30 2016-09-27 Qualcomm Incorporated Distributed processing system and method
US7990861B1 (en) * 2006-04-03 2011-08-02 Juniper Networks, Inc. Session-based sequence checking
WO2007139426A1 (en) * 2006-05-31 2007-12-06 Intel Corporation Multiple phase buffer enlargement for rdma data transfer
US20080085970A1 (en) * 2006-10-02 2008-04-10 The Yokohama Rubber Co., Ltd. Rubber composition for tire inner liner and pneumatic tire using the same
EP2143000A4 (en) * 2007-03-26 2011-04-27 Bigfoot Networks Inc Method and system for communication between nodes
US7921177B2 (en) * 2007-07-18 2011-04-05 International Business Machines Corporation Method and computer system for providing remote direct memory access
TWI423032B (en) * 2009-04-30 2014-01-11 Ralink Technology Corp Method for enhancing data transmission efficiency
US8631277B2 (en) 2010-12-10 2014-01-14 Microsoft Corporation Providing transparent failover in a file system
US9331955B2 (en) 2011-06-29 2016-05-03 Microsoft Technology Licensing, Llc Transporting operations of arbitrary size over remote direct memory access
US8856582B2 (en) 2011-06-30 2014-10-07 Microsoft Corporation Transparent failover
US20130067095A1 (en) 2011-09-09 2013-03-14 Microsoft Corporation Smb2 scaleout
US8788579B2 (en) 2011-09-09 2014-07-22 Microsoft Corporation Clustered client failover
CN104883335B (en) * 2014-02-27 2017-12-01 王磊 System implementing a full-hardware TCP protocol stack
US20180032471A1 (en) * 2016-07-26 2018-02-01 Samsung Electronics Co., Ltd. Self-configuring ssd multi-protocol support in host-less environment
US11983138B2 (en) 2015-07-26 2024-05-14 Samsung Electronics Co., Ltd. Self-configuring SSD multi-protocol support in host-less environment
CN105138410A (en) * 2015-08-31 2015-12-09 北京锐安科技有限公司 Message queue implementation method and device based on disk buffer
US10623341B2 (en) * 2015-09-30 2020-04-14 International Business Machines Corporation Configuration of a set of queues for multi-protocol operations in a target driver
US10498654B2 (en) 2015-12-28 2019-12-03 Amazon Technologies, Inc. Multi-path transport design
US9985904B2 (en) * 2015-12-29 2018-05-29 Amazon Technolgies, Inc. Reliable, out-of-order transmission of packets
US9985903B2 (en) * 2015-12-29 2018-05-29 Amazon Technologies, Inc. Reliable, out-of-order receipt of packets
US10148570B2 (en) 2015-12-29 2018-12-04 Amazon Technologies, Inc. Connectionless reliable transport
US10210123B2 (en) 2016-07-26 2019-02-19 Samsung Electronics Co., Ltd. System and method for supporting multi-path and/or multi-mode NMVe over fabrics devices
US10346041B2 (en) 2016-09-14 2019-07-09 Samsung Electronics Co., Ltd. Method for using BMC as proxy NVMeoF discovery controller to provide NVM subsystems to host
US10372659B2 (en) 2016-07-26 2019-08-06 Samsung Electronics Co., Ltd. Multi-mode NMVE over fabrics devices
US11461258B2 (en) 2016-09-14 2022-10-04 Samsung Electronics Co., Ltd. Self-configuring baseboard management controller (BMC)
US11144496B2 (en) 2016-07-26 2021-10-12 Samsung Electronics Co., Ltd. Self-configuring SSD multi-protocol support in host-less environment
US10891253B2 (en) * 2016-09-08 2021-01-12 Microsoft Technology Licensing, Llc Multicast apparatuses and methods for distributing data to multiple receivers in high-performance computing and cloud-based networks
SE540244C2 (en) 2016-09-26 2018-05-08 Scania Cv Ab Method in a bus gateway for load balancing traffic for non-cyclic messages between different segments of the bus
US20180088978A1 (en) * 2016-09-29 2018-03-29 Intel Corporation Techniques for Input/Output Access to Memory or Storage by a Virtual Machine or Container
US11068412B2 (en) * 2019-02-22 2021-07-20 Microsoft Technology Licensing, Llc RDMA transport with hardware integration
US11025564B2 (en) 2019-02-22 2021-06-01 Microsoft Technology Licensing, Llc RDMA transport with hardware integration and out of order placement
US11467873B2 (en) * 2019-07-29 2022-10-11 Intel Corporation Technologies for RDMA queue pair QOS management

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6473793B1 (en) * 1994-06-08 2002-10-29 Hughes Electronics Corporation Method and apparatus for selectively allocating and enforcing bandwidth usage requirements on network users
FI98174C (en) * 1995-05-09 1997-04-25 Nokia Telecommunications Oy Data transmission system with sliding window based data flow control
US6219713B1 (en) * 1998-07-07 2001-04-17 Nokia Telecommunications, Oy Method and apparatus for adjustment of TCP sliding window with information about network conditions
US6862622B2 (en) * 1998-07-10 2005-03-01 Van Drebbel Mariner Llc Transmission control protocol/internet protocol (TCP/IP) packet-centric wireless point to multi-point (PTMP) transmission system architecture
US6742021B1 (en) * 1999-01-05 2004-05-25 Sri International, Inc. Navigating network-based electronic information using spoken input with multimodal error feedback
US6711137B1 (en) * 1999-03-12 2004-03-23 International Business Machines Corporation System and method for analyzing and tuning a communications network
US6560243B1 (en) * 1999-04-30 2003-05-06 Hewlett-Packard Development Company System and method for receiver based allocation of network bandwidth
US6674717B1 (en) * 2000-03-30 2004-01-06 Network Physics, Inc. Method for reducing packet loss and increasing internet flow by feedback control
US7380006B2 (en) * 2000-12-14 2008-05-27 Microsoft Corporation Method for automatic tuning of TCP receive window based on a determined bandwidth
WO2003040735A1 (en) * 2001-11-07 2003-05-15 Cyneta Networks Inc. Resource aware session adaptation system and method for enhancing network throughput
US20040049580A1 (en) * 2002-09-05 2004-03-11 International Business Machines Corporation Receive queue device with efficient queue flow control, segment placement and virtualization mechanisms
US7385923B2 (en) * 2003-08-14 2008-06-10 International Business Machines Corporation Method, system and article for improved TCP performance during packet reordering
US7103683B2 (en) * 2003-10-27 2006-09-05 Intel Corporation Method, apparatus, system, and article of manufacture for processing control data by an offload adapter
US20050141425A1 (en) 2003-12-24 2005-06-30 Foulds Christopher T. Method, system, and program for managing message transmission through a network
US7562158B2 (en) * 2004-03-24 2009-07-14 Intel Corporation Message context based TCP transmission

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106953797A (en) * 2017-04-05 2017-07-14 广东浪潮大数据研究有限公司 RDMA data transmission method and device based on dynamic connection
CN106953797B (en) * 2017-04-05 2020-05-26 苏州浪潮智能科技有限公司 RDMA data transmission method and device based on dynamic connection

Also Published As

Publication number Publication date
CN100438403C (en) 2008-11-26
US20050060442A1 (en) 2005-03-17
US7870268B2 (en) 2011-01-11

Similar Documents

Publication Publication Date Title
CN100438403C (en) Method, system, and program for managing data transmission through a network
US20220255884A1 (en) System and method for facilitating efficient utilization of an output buffer in a network interface controller (nic)
CN1864376B (en) Method, system, and product for utilizing host memory from an offload adapter
CN1606290A (en) Method, system, and program for managing memory for data transmission through a network
CN1458590A (en) Method to synchronize and upload an offloaded network stack connection with a network stack
US7929442B2 (en) Method, system, and program for managing congestion in a network controller
US20050141425A1 (en) Method, system, and program for managing message transmission through a network
US7664892B2 (en) Method, system, and program for managing data read operations on network controller with offloading functions
US9813283B2 (en) Efficient data transfer between servers and remote peripherals
US8010707B2 (en) System and method for network interfacing
US8392565B2 (en) Network memory pools for packet destinations and virtual machines
US9021142B2 (en) Reflecting bandwidth and priority in network attached storage I/O
CN1315077C (en) System and method for efficient handling of network data
US7613132B2 (en) Method and system for controlling virtual machine bandwidth
WO2013095654A1 (en) Shared send queue
CN1647054A (en) Network device driving system structure
CN1520556A (en) End node partitioning using local identifiers
CN1905524A (en) Processor load based dynamic segmentation method and system
CN103176780A (en) Binding system and method of multiple network interfaces
US7627899B1 (en) Method and apparatus for improving user experience for legitimate traffic of a service impacted by denial of service attack
US7761529B2 (en) Method, system, and program for managing memory requests by devices
US7404040B2 (en) Packet data placement in a processor cache
CN1581853A (en) Method, system, and program for processing packets to be transmitted on a network
CN102843435A (en) Access and response method and access and response system of storage medium in cluster system
CN1949203A (en) Architecture of interface target machine for miniature computer system and data transmitting method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20081126

Termination date: 20100813