US20060034275A1 - Data transfer, synchronising applications, and low latency networks - Google Patents
- Publication number
- US20060034275A1 (application US11/198,043)
- Authority
- US
- United States
- Prior art keywords
- data
- address
- application
- burst
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/382—Information transfer, e.g. on bus using universal interface adapter
- G06F13/385—Information transfer, e.g. on bus using universal interface adapter for adaptation of a particular data processing system to different peripheral devices
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
- H04L49/901—Buffering arrangements using storage descriptor, e.g. read or write pointers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/565—Conversion or adaptation of application format or content
Definitions
- This invention in its various aspects, relates to the field of asynchronous networking, and specifically to: a memory mapped network interface; a method of synchronising between a sending application, running on a first computer, and a receiving application, running on a second computer, the computers each having a memory mapped network interface; a communication protocol; and a computer network.
- This invention also relates to data transfer and to synchronising applications.
- A traditional network is shown in FIG. 1.
- the Central Processing Unit (CPU) 202 writes data from memory 204 through its system controller 206 to its Network Interface Card (NIC) 210 .
- the NIC 210 takes the data and forms network packets 216 , which contain enough information to allow them to be routed across the network 218 to computer system 201 .
- When a network packet arrives at the NIC 211, it must be demultiplexed to determine where the data needs to be placed. In traditional networks this must be done by the operating system.
- the incoming packet therefore generates an interrupt 207 , which causes software, a device driver in operating system 209 , to run.
- the device driver examines the header information of each incoming network packet 216 and determines the correct location in memory 205 , for data contained within the network packet.
- The data is transferred into memory using the CPU 203 or Direct Memory Access (DMA) hardware (not shown).
- the driver may then request that operating system 209 reschedule any application process that is blocked waiting for this data to arrive.
- These networks therefore provide implicit synchronisation between sending and receiving applications and are called synchronous networks.
- In asynchronous networks, the final memory location within the receiving computer for received data can be computed by the receiving NIC from the header information of a received network packet. This computation can be done without the aid of the operating system.
- Memory-mapped networks are one example of asynchronous networks.
- An early computer network using memory mapping is described in U.S. Pat. No. 4,393,443.
- a memory-mapped network is shown in FIG. 2 .
- Application 222 running on Computer 220 would like to communicate with application 223 running on Computer 221 using network 224 .
- A portion of the application 222's memory address space is mapped, using the computer 220's virtual memory system, onto a memory aperture of the NIC 226, as shown by the application's page-tables 228 (these page-tables and their use are well known in the art).
- a portion of application 223 's memory address space is mapped using computer 221 's virtual memory system onto a memory aperture of the NIC 229 using the application 223 's page-tables 231 .
- Software is usually required to create these mappings, but once they have been made, data transfer to and from a remote machine can be achieved using a CPU read or write instruction to a mapped virtual memory address.
- NIC 226 determines the address of the destination computer 221 and the address of the remote memory aperture 225 within that computer. Some combination of this address information can be regarded as the network address, which is the target of the write.
- All the aperture mappings and network address translations are calculated at the time that the connection between the address spaces of computers 220 and 221 is made.
- the process of address lookups and translations at each stage in the system can be carried out using hardware.
- After receiving a write, NIC 226 creates network packets using its packetisation engine 230. These packets are forwarded to the destination computer 221. At the destination, the memory aperture addresses of the incoming packets are remapped by the packet handler onto physical memory locations 227. The destination NIC 229 then writes the incoming data to these physical memory locations 227. This physical memory has also been mapped at connection set-up time into the address space of application 223. Hence application 223 is able, using page-tables 231 and the virtual memory system, to access the data using processor read and write operations.
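The aperture translation chain described above can be sketched in a few lines of Python. This is purely an illustration: all names, base addresses and the translation function are invented, and a real NIC performs these steps in hardware at connection set-up and on every bus cycle.

```python
# Sketch of FIG. 2's data path: a CPU write landing in a mapped aperture is
# packetised with a network address, and the receiving NIC remaps the
# aperture address onto physical memory. All addresses here are invented.

APERTURE_BASE = 0x4000_0000      # assumed local aperture base on NIC 226
APERTURE_SIZE = 0x1000

def outgoing_write(virt_addr, data, dest_node, remote_aperture_base):
    """Packetise a CPU write that falls within the mapped aperture."""
    assert APERTURE_BASE <= virt_addr < APERTURE_BASE + APERTURE_SIZE
    offset = virt_addr - APERTURE_BASE
    # network address = destination node + remote memory aperture address
    return {"node": dest_node, "addr": remote_aperture_base + offset, "data": data}

def incoming_write(packet, aperture_to_phys):
    """Receiving NIC: remap the aperture address onto physical memory."""
    phys = aperture_to_phys(packet["addr"])
    memory[phys] = packet["data"]

memory = {}
pkt = outgoing_write(0x4000_0010, 0xCAFE, dest_node=2,
                     remote_aperture_base=0x8000_0000)
incoming_write(pkt, aperture_to_phys=lambda a: a - 0x8000_0000 + 0x1_0000)
# the data now sits at physical address 0x1_0010 in the receiver's memory
```

Because both mappings are fixed when the connection is made, no software runs on either side during the transfer itself.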
- Scalable Coherent Interface (SCI) is an example of an asynchronous network standard which provides poor facilities for synchronisation at the time of data reception.
- a network using SCI is disclosed in U.S. Pat. No. 5,819,075.
- FIG. 3 shows an example of an SCI-like network, where application 242 on computer 240 would like to communicate with application 243 on computer 241 . Let us suppose that application 243 has blocked waiting for the data. Application 242 transmits data using the methods described above. After sending the data, application 242 must then construct a synchronisation packet in local memory, and program the event generator 244 , in NIC 246 , to send the synchronisation packet 248 , to the destination node.
- On receiving synchronisation packet 248, the NIC 245 on computer 241 invokes its event handler 247, which generates an interrupt 249, allowing the operating system 248 to determine that application 243 is blocked and should be woken up.
- This is called out-of-band synchronisation since the synchronisation packet must be treated as a separate and distinct entity and not as part of the data stream.
- Out-of-band synchronisation greatly reduces the potential of memory-mapped networks to provide high bandwidth and low latency.
- In other existing asynchronous networks, such as the newly emerging Virtual Interface Architecture (VIA) standard and the forthcoming Next Generation Input/Output (NGIO) standard, some support is provided for synchronisation.
- A NIC will raise a hardware interrupt when some data has arrived. However, the interrupt does not identify the recipient of the data; it only indicates that some data has arrived for some communicating end-point.
- A first aspect of the invention provides a method of synchronising between a sending application on a first computer and a receiving application on a second computer, each computer having a main memory, and at least one of the computers having an asynchronous network interface, comprising the steps of:
- Another aspect of the invention provides an asynchronous network interface for use in a host computer having a main memory and connected to a network, the interface comprising:
- A further aspect of the invention provides a method of passing data between an application on a first computer and remote hardware within a second computer or on a passive backplane, the first computer having a main memory and an asynchronous network interface, the method comprising the steps of:
- A further aspect of the invention provides a method of arranging data transfers from one or more applications on a computer, the computer having a main memory, an asynchronous network interface, and a Direct Memory Access (DMA) engine having a request queue address common to all the applications, comprising the steps of:
- A yet further aspect of the invention provides a method of transferring data from a sending application on a first computer to a receiving application on a second computer, each computer having a main memory, and a memory mapped network interface, the method comprising the steps of:
- Another aspect of the invention provides a computer network comprising two computers, the first computer running a sending application and the second computer running a receiving application, each computer having a main memory and a memory mapped network interface, the main memory of the second computer having: a buffer for storing data being transferred between computers as well as data identifying one or more pointer memory location(s);
- A further aspect of the invention provides a method of sending a request from a client application on a first computer to a server application on a second computer, and sending a response from the server application to the client application, both computers having a main memory and a memory mapped network interface, the method comprising the steps of:
- Another aspect of the invention provides a method of arranging data for transfer as a data burst over a computer network comprising the steps of: providing a header comprising the destination address of a certain data word in the data burst, and a signal at the beginning or end of the data burst for indicating the start or end of the burst, the destination addresses of other words in the data burst being inferable from the address in the header.
- A further aspect of the invention provides a method of processing a data burst received over a computer network comprising the steps of:
- Another aspect of the invention provides a method of interrupting transfer of a data burst over a computer network comprising the steps of:
- A further aspect of the invention provides a method of restarting the transfer of a data burst, after the transfer of that data burst has been interrupted, the method comprising the steps of:
- The first aspect of the present invention addresses the synchronisation problem for memory-mapped network interfaces.
- The present invention uses a network interface containing snooping hardware which can be programmed with triggering values comprising addresses, address ranges, or other data which are to be matched. These data are termed 'Tripwires'.
- The interface monitors the data stream passing through it, including address data, for addresses and data which match the Tripwires that have been set.
- On a match, the snooping hardware can generate interrupts, increment event counters, or perform some other application-specified action.
- This snooping hardware is preferably based upon Content Addressable Memory (CAM). References herein to the “data stream” refer to the stream of data words being transferred and to the address data accompanying them.
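The Tripwire idea can be illustrated with a small software model. This is a sketch only (class and method names are invented); in the invention the comparison is performed in parallel by CAM hardware, not by a loop:

```python
# Model of Tripwire snooping: programmed triggering values are compared
# against every address in the data stream, and a hit fires an
# application-specified action. A mask lets one entry cover a range.

class Tripwires:
    def __init__(self):
        self.entries = []          # (base, mask, action) — CAM-like entries

    def set(self, base, mask, action):
        self.entries.append((base, mask, action))

    def snoop(self, addr):
        """Compare one stream address against all programmed Tripwires."""
        for base, mask, action in self.entries:
            if (addr & mask) == (base & mask):   # masked compare, as in a CAM
                action(addr)

hits = []
tw = Tripwires()
tw.set(0x1000, 0xFFF0, hits.append)     # matches the range 0x1000-0x100F
for addr in (0x0FF0, 0x1004, 0x100C, 0x2000):
    tw.snoop(addr)
# only the two addresses inside the programmed range are recorded
```

In hardware the `action` would be raising an interrupt, incrementing an event counter, or writing back to host memory, as described below.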
- The invention thus provides in-band synchronisation by using synchronisation primitives which are programmable by user-level applications, while still delivering high bandwidth and low latency.
- The programming of the synchronisation primitives can be performed by the sending and receiving applications independently of each other, and no synchronisation information is required to traverse the network.
- A number of different interfaces between the network interface and an application can be supported. These interfaces include VIA and the forthcoming Next Generation Input/Output (NGIO) standard.
- An interface can be chosen to best match an application's requirements, and changed as its requirements change.
- The network interface of the present invention can support a number of such interfaces simultaneously.
- The Tripwire facility supports the monitoring of outgoing as well as incoming data streams. These Tripwires can be used to inform a sending application that its DMA send operations have completed or are about to complete.
- Memory-Mapped network interfaces also have the potential to be used for communication between hardware entities. This is because memory mapped network interfaces are able to pass arbitrary memory bus cycles over the network. As shown in FIG. 4 , it is possible to set up a memory aperture 254 , in the NIC 252 of Computer 250 , which is directly mapped via NIC 259 , onto an address region 257 of the I/O bus 253 of passive backplane 251 .
- The interface of the present invention can be programmed to present the same hardware interface as the remote hardware device 255, and so appear at the hardware level in computer 250 to be an instance of the remote hardware device. If the network card 252 were an interface according to the present invention, so programmed, the remote hardware device 255 would appear as physically located within computer 250, in a manner transparent to all software.
- The hardware device 255 can be physically located either at the remote end of a dedicated link or across a general network. The invention will support both general networking activity and remote hardware communication simultaneously on a single network card.
- Another aspect of the invention relates to a link-level communication protocol which can be used to support cut-through routing and forwarding. There is no need for an entire packet to arrive at a NIC, or any other network entity supporting the communication protocol, before data transmission can be started on an outgoing link.
- The invention also allows large bursts of data to be handled effectively without the need for a small physical network packet size such as that employed by an ATM network, it being possible to dynamically stop and restart a burst and to regenerate all address information using hardware.
- FIG. 1 shows two computers connected by a traditional network
- FIG. 2 shows two computers connected by a traditional memory-mapped network
- FIG. 3 shows a traditional SCI-like network
- FIG. 4 shows a traditional memory-mapped network between hardware entities
- FIG. 5 shows two or more computers connected by an embodiment of the present invention, using Network Interface Cards (NICs);
- FIG. 6 shows in detail the various functional blocks comprising the NICs of FIG. 5 ;
- FIG. 7 shows the functional blocks of the NIC implemented within a Field Programmable Gate Array (FPGA);
- FIGS. 8A to 8E show the communication protocol used in one embodiment of the invention;
- FIG. 9 shows schematically hardware communication according to an embodiment of the invention.
- FIG. 10 shows schematically a circular buffer abstraction according to one embodiment of the invention.
- FIG. 11 shows schematically the system support for discrete message communication using circular buffers
- FIG. 12 shows a client-server interaction according to an embodiment of the invention
- FIG. 13 shows how the system of the present invention can support VIA
- FIG. 14 shows outgoing stream synchronisation according to an embodiment of the present invention
- FIG. 15 shows a client-server interaction according to an embodiment of the invention using a hardware data source
- FIG. 16 shows an apparatus for synchronising an end-point application and constituting an embodiment of the invention
- FIG. 17 shows another apparatus for synchronising an end-point application and constituting an embodiment of the invention
- FIGS. 18 to 23 show examples of actions which may be performed by the apparatuses of FIGS. 16 and 17 ;
- FIG. 24 illustrates the format of a data burst with implied addresses
- FIG. 25 illustrates an interruption in forwarding a burst of the type shown in FIG. 24 ;
- FIG. 26 illustrates forwarding of the rest of the burst
- FIG. 27 illustrates coalescing of two data bursts
- FIG. 28 illustrates “transparent” communication over a network between an application running on a computer and remote hardware
- FIG. 29 illustrates applications of various tripwires at different locations in a computer.
- Computers 1, 2 use the present invention to exchange data.
- A plurality of other computers, such as 3, may participate in the data exchange if connected via optional network switch 4.
- Each computer 1, 2 is composed of a microprocessor central processing unit 5, 57, memory 6, 60, local cache memory 7, 57, and system controller 8, 58.
- The system controller 8, 58 interacts with its microprocessor 5, 57 to allow the microprocessor to exchange data with devices attached to I/O bus 9.
- Attached to I/O bus 9, 59 are standard peripherals, such as a video adapter 10.
- Also attached to I/O bus 9, 59 are one or more network interfaces, in the form of NICs 11, 56, which represent an embodiment of this invention.
- The I/O bus is a standard PCI bus conforming to PCI Local Bus Specification, Rev. 2.1, although any other bus capable of supporting bus master operations can be used, with suitable modification of the system controller, peripherals such as video card 10, and the interface to NICs 11, 56.
- Each NIC comprises a memory 18, 19, 20 for storing triggering values, a receiver 15 for receiving a data stream, a comparator for comparing part of the data stream with the triggering values, and a memory 23 for storing information identifying matched triggering values.
- Each NIC 56, 11 is composed of a PCI to Local Bus bridge 12, a control Field Programmable Gate Array (FPGA) 13, transmit (Tx) serialiser 14, fibre-optic transceiver 15, receive (Rx) de-serialiser 16, address multiplexer and latch 17, CAM array 18, 19, 20, boot ROMs 21 and 22, static RAM 23, FLASH ROM 24, and clock generator and buffer 25, 26.
- FIG. 6 also shows examples of known chips which could be used for each component, for example boot ROM 21 could be an Altera EPC 1 chip.
- FPGA 13 comprises functional blocks 27-62. The working of the blocks will be explained by reference to typical data flows.
- Operation of NIC 11 begins when computer 1 is started or reset. This causes the contents of boot ROM 21 to be loaded into FPGA 13, thereby programming the FPGA and, in turn, causing state machines 28, 37, 40, 43, 45, 46 and 47 to be reset.
- Clock generator 25 begins running and provides a stable clock for the Tx serialiser 14 .
- Clock buffer/divider 26 provides suitable clocks for the rest of the system.
- Serialiser 14 and de-serialiser 16 are reset and remain in a reset condition until communication with another node is established and a satisfactory receive clock is regenerated by de-serialiser 16 .
- PCI bridge 12 is also reset and loaded with the contents of boot ROM 22 .
- Bridge 12 can convert (and re-convert at the target end) memory access cycles into I/O cycles and support legacy memory apertures. Since the rest of the NIC supports byte-enabled (byte-wide as well as word-wide) transfers, ROM 22 can be loaded with any PCI configuration space information, and can thus emulate any desired PCI card transparently to microprocessor 5.
- FLASH control state machine 47 runs and executes a simple microcode sequence stored in FLASH memory 24 . Typically this allows the configuration space of another card such as 69 in FIG. 9 to be read, and additional information to be programmed into bridge 12 . Programming of the FLASH memory is also handled by state machine 47 in conjunction with bridge 12 .
- Microprocessor 5 writes one or more words to an address location defined by system controller 8 to lie within NIC 11's address space.
- PCI to local bus bridge 12 captures these writes and turns them into local bus protocol (discussed elsewhere in this document). If the writes are within the portion of the address space determined by register decode 48 to lie within the local control aperture of the NIC, then the writes take place locally, to the appropriate register, Content Addressable Memory (CAM), Static RAM (SRAM) or FLASH memory area. Otherwise target state machine 28 claims the cycles and forwards them to protocol encoder 29.
- Byte-enable, parity data and control information are added first to an address and then to each word to be transferred in a burst, with a control bit marking the beginning of the burst and possibly also a control bit marking the end of the burst.
- The control bit marking the beginning of the burst indicates that address data forming the header of the data burst comprises the first “data” word of the burst.
- Xon/Xoff-style management bits from block 31 are also added here. This protocol, specific to the serialiser 14 and de-serialiser 16, is also discussed elsewhere in this document.
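The framing described above can be sketched as follows. The field layout is invented for illustration (the real encoding packs byte-enables, parity and Xon/Xoff bits into the 23-bit serialiser words); the point is that the address travels as the first word of the burst, flagged by a start-of-burst control bit:

```python
# Sketch of the burst framing: an address word flagged "start of burst"
# (sob), followed by data words, the last optionally flagged "end of
# burst" (eob). Field names are invented for this illustration.

def frame_burst(addr, words):
    frames = [{"payload": addr, "sob": True, "eob": False}]
    for i, w in enumerate(words):
        frames.append({"payload": w,
                       "sob": False,
                       "eob": i == len(words) - 1})
    return frames

frames = frame_burst(0x2000, [0xAA, 0xBB, 0xCC])
# frames[0] carries the address with the start-of-burst bit set;
# frames[3] carries the last data word with the end-of-burst bit set
```

A receiver that sees the start-of-burst bit knows the payload of that word is an address, not data, which is what allows cut-through forwarding to begin before the burst ends.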
- Serialiser 14 converts a 23-bit parallel data stream at 62 MHz to a 1-bit data stream at approximately 1.5 Gbit/s; this is converted to an optical signal by transceiver 15 and carried over a fibre-optic link to a corresponding transceiver 15 in NIC 56 , part of computer 2 . It should be noted that other physical layers and protocols are possible and do not limit the scope of the invention.
- At NIC 56, the reconstructed digital signal is clock-recovered and de-serialised to 62 MHz by block 16.
- Block 32 expands the recovered 23 bits to 46 bits, reversing the action of block 30 .
- Protocol decoder 33 checks that the incoming words have suitable sequences of control bits. If so, it passes address/data streams into command FIFO 34 . If the streams have errors, they are passed into error FIFO 35 ; master state machine 37 is stopped; and an interrupt is raised on microprocessor 57 by block 53 . Software is then used to decipher the incoming stream until a correct sequence is found, whereupon state machine 37 is restarted.
- When a stream arrives at the head of FIFO 34, master state machine 37 requests access to local bus 55 from arbiter 40. When granted, it passes first the address, then the following data, onto local bus 55. Bridge 12 reacts to this address/data stream by requesting access to I/O bus 59 from system controller 58. When granted, it writes the required data into memory 60.
- Reads of computer 2 's memory 60 initiated by computer 1 take place in a similar manner. However, state machine 28 after sending the address word sends no other words, rather it waits for return data. Data is returned because master state machine 37 in NIC 56 reacts to the arrival of a read address by requesting a read of memory 60 via I/O bus 59 and corresponding local bus bridge 12 . This data is returned as if it were write data flowing from NIC 56 to NIC 11 , but without an initial address. Protocol decoder 33 reacts to this addressless data by routing it to read return FIFO 36 , whereupon state machine 28 is released from its wait and the microprocessor 5 's read cycle is allowed to complete.
- remote interrupt generator 54 causes state machine 28 to send a word from NIC 56 to a mailbox register in NIC 11 's bridge 12 . This will have been configured by software to raise an interrupt on microprocessor 5 .
- master state machine 37 uses pipeline delay 38 to anticipate the draining of FIFO 34 and to terminate the data burst on local bus 55 . It then uses the CAM address latch/counter 41 to restart the burst when more data arrives in FIFO 34 .
- Tripwires are triggering values, such as addresses, address ranges or other data, that are programmed into the NIC to be matched.
- The triggering values used as Tripwires are addresses.
- Three CAM devices are pipelined to reduce the match cycle time from around 70 nanoseconds to less than 30 nanoseconds.
- Programming of Tripwires takes place by microprocessor 5 writing to PCI bridge 12 via system controller 8 and I/O bus 9.
- CAM array 18 , 19 , 20 appears like conventional RAM to microprocessor 5 .
- This involves CAM controller 43 generating suitable control signals to enable all three CAMs 18, 19, 20 for write access.
- Address latch 44 passes data to the CAMs unmodified.
- Address multiplexer 41 is arranged to pass local bus data out on the CAM address bus where it is latched at the moment addresses are valid on the local bus by latch 17 .
- For reads, the process is similar, except that only CAM 18 is arranged to be enabled for read access, and address latch/counter 44 has its data flow direction reversed.
- Microprocessor 5 sees the expected data returned, since the memory arrays in CAMs 18, 19, 20 either contain the same data, or internal flags indicating that particular segments of the memory array have not yet been written and should not participate in match cycles.
- A burst starts with the address of the first data word, followed by an arbitrary number of data words.
- The address of each subsequent data word is implicit, incrementing from the start address.
- Address latch/counter 44 is loaded with the address of each new data burst, and incremented each time a valid data item is presented on internal local bus 55.
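The address regeneration performed by latch/counter 44 amounts to the following expansion (a sketch with invented names; word-sized, 4-byte increments are assumed here, whereas the hardware increments per valid data item on the local bus):

```python
# Regenerating the implied per-word addresses of a burst: the receiver
# loads the burst's start address and increments it for each data word,
# mirroring the role of address latch/counter 44.

def expand_burst(start_addr, words, word_bytes=4):
    """Return the (address, word) pairs implied by the burst header."""
    return [(start_addr + i * word_bytes, w) for i, w in enumerate(words)]

pairs = expand_burst(0x1000, [0x11, 0x22, 0x33])
# → [(0x1000, 0x11), (0x1004, 0x22), (0x1008, 0x33)]
```

Each regenerated address is also what gets presented to the CAM array for Tripwire comparison, so a Tripwire can fire on any word of a burst, not just its header.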
- CAM control state machine 43 is arranged to enable each CAM 18 , 19 , 20 in sequence for a compare operation as each new address is output by latch/counter 44 .
- This sequential enabling of the CAMs combined with their latching properties permits the access time for a comparison operation to be reduced by a factor of three (there being three CAMs in this implementation, other implementations being possible) from 70 ns to less than 30 ns.
- The CAM op-code for each comparison operation is output from one of the internal registers 49 via address multiplexers 41 and 17.
- The op-code is actually latched by address multiplexer 17 at the end of a read/write cycle, freeing the CAM address bus to return the index of matched Tripwires after comparison operations.
- The Tripwire data (i.e. the addresses to be monitored) is written to sequential addresses in the CAM array.
- During the comparison operation (cycle), all valid Tripwires are compared in parallel with the address of the current data, be it inbound or outbound.
- Masking operations may be performed, depending on the type of CAM used, allowing certain bits of the address to be ignored during the comparison. In this way, a Tripwire may represent a range of addresses rather than one particular address.
- When the CAM array signals that a match has been found (i.e. a Tripwire has been hit), it returns the address of the Tripwire (its offset in the CAM array) via the CAM address bus to the Tripwire FIFO 42. Two courses of action are then possible, depending on how internal registers 49 have been programmed.
- One course of action is for state machine 45 to request that an interrupt be generated by management logic 53.
- An interrupt is then received by microprocessor 5, and software is run which services it. Normally this involves microprocessor 5 reading the Tripwire address from FIFO 42, matching the address against a device-driver table, signalling the appropriate process, marking it runnable and rescheduling.
- The other course of action is for state machine 45 to cause records to be read from SRAM 23 using state machine 46.
- A record comprises a number of data words: an address and two data words. These words are programmed by the software just before the Tripwire information is stored in the CAM. When a Tripwire match is made, the address in latch 44 is left-shifted by two to form an address index for SRAM 23.
- The first word is then read by state machine 46 and placed on local bus 55 as an address in memory 6.
- A fetch-and-increment operation is then performed by state machine 45, using the second and third words of the SRAM record to first AND and then OR, or else to INCREMENT, the data referred to in memory 6.
- A bit in the first word read by the state machine indicates which operation should be performed. In the case of an INCREMENT, the first data word also indicates the amount to increment by.
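The effect of an SRAM action record can be modelled as below. The record layout and mode encoding are assumptions drawn from the description above (an address in host memory plus two data words, with a mode bit selecting AND-then-OR or INCREMENT):

```python
# Model of the Tripwire action records held in SRAM 23: on a match, the
# NIC itself updates a host memory location, with no interrupt needed.

AND_OR, INCREMENT = 0, 1    # invented encoding of the record's mode bit

def run_record(memory, record):
    addr, w1, w2 = record["addr"], record["w1"], record["w2"]
    if record["mode"] == AND_OR:
        memory[addr] = (memory[addr] & w1) | w2   # first AND, then OR
    else:                                         # INCREMENT by w1
        memory[addr] = memory[addr] + w1

mem = {0x100: 0b1010, 0x104: 5}
run_record(mem, {"addr": 0x100, "mode": AND_OR, "w1": 0b0110, "w2": 0b0001})
run_record(mem, {"addr": 0x104, "mode": INCREMENT, "w1": 3, "w2": 0})
# mem[0x100]: (0b1010 & 0b0110) | 0b0001 == 0b0011; mem[0x104]: 5 + 3 == 8
```

An application can then synchronise simply by polling or blocking on the updated memory location, which is what makes the mechanism in-band.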
- In the case of the interrupt followed by a Tripwire FIFO read, the device driver is presented with a list of endpoints which require attention. This list improves system performance, as the device driver is not required to scan a large number of memory locations looking for such endpoints.
- The device driver is not required to know where the memory locations used for synchronisation are, nor to have any knowledge of, or take part in, the application-level communication protocol. All communication protocol processing can be performed by the application; different applications are free to use differing protocols for their own purposes, and one device driver instance may support a number of such applications.
- the invention addresses this problem by using hardware FIFO 50 to queue DMA requests from applications.
- Each application wanting to request DMA transfers sets up a descriptor, containing the start address and the length of the data to be transferred, in its local memory and posts the address of the descriptor to the DMA queue, whose address is common to all applications. This can be arranged by mapping a single page containing the physical address of the DMA queue as a write-only page into the address space of all user applications as they are initialised.
- When the DMA process is complete, bridge 12 notifies state machine 51 of the completion. The state machine then uses data from descriptor block 52 to write back a completion descriptor in memory 6.
- An interrupt can also be raised on microprocessor 5, although a Tripwire may already have been crossed to provide this notification early, in order to minimise the delay in bringing the relevant application back onto microprocessor 5's run queue. This is shown later in this document.
- If the request queue is full, state machine 51 writes a failure code back into the completion field of the descriptor that the application has just attempted to place on the queue.
- The application therefore does not need to read the status of the NIC in order to safely post a DMA request. All applications can safely share the same hardware posting address, and no time-consuming virtualisation or system device driver process is necessary.
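The shared DMA request queue can be sketched as follows. All names and the "FAILED"/"DONE" codes are invented for illustration; the hardware equivalents are FIFO 50, state machine 51 and the completion write-back via descriptor block 52:

```python
# Model of the shared DMA request queue: every application posts the
# address of a descriptor in its own memory to one common queue address;
# overflow is reported by writing a failure code into the descriptor.

from collections import deque

class DMAQueue:
    def __init__(self, depth):
        self.fifo, self.depth = deque(), depth

    def post(self, descriptor):
        """Post a descriptor; on overflow the NIC writes a failure code
        back into the descriptor's completion field instead of raising."""
        if len(self.fifo) >= self.depth:
            descriptor["completion"] = "FAILED"   # assumed failure code
            return False
        self.fifo.append(descriptor)
        return True

    def run_one(self, transfer):
        """Perform one queued DMA and write back the completion."""
        d = self.fifo.popleft()
        transfer(d["start"], d["length"])
        d["completion"] = "DONE"                  # completion write-back

q = DMAQueue(depth=1)
d1 = {"start": 0x1000, "length": 64, "completion": None}
d2 = {"start": 0x2000, "length": 32, "completion": None}
q.post(d1)
q.post(d2)                    # queue full: d2 receives the failure code
q.run_one(lambda start, length: None)
```

Because failure is signalled through the application's own descriptor, no status read of the NIC is needed before posting, which is the property the text emphasises.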
- timeout logic 61 is activated to terminate the current cycle and return an interrupt through block 53 .
- Another aspect of the invention relates to the protocol which is preferably used by the NIC.
- This protocol uses an address and some additional bits in its header. This allows the transfer of variable length packets with simple routines for Segmentation and Reassembly (SAR) that are transparent to the sending or receiving codes. This is also done without the need to have an entire packet arrive before segmentation, reassembly or forwarding can occur, allowing the data to be put out on the ongoing link immediately. This enables data to traverse many links without significantly adding to the overall latency.
- the packets may be fragmented and coalesced on each link, for example between the NIC and a host I/O bus bridge, or between the NIC and another NIC. We term this cut-through routing and forwarding.
- cut-through forwarding and routing enables small packets to pass through the network without any delays caused by large packets of other streams. While other network physical layers such as ATM also provide the ability to perform cut-through forwarding and routing, they do so at the cost of requiring all packets to be of a fixed small size.
- FIG. 8 shows an example of how this protocol has been implemented using the 23-bit data transfer capability of HP's GLINK chipset (serialiser 14 and de-serialiser 16 ).
- PCI to local bus bridge 12 provides a bus of 32 address/data bits, 4 parity bits and 4 byte-enable bits. It also provides an address valid signal (ADS) which signifies that a burst is beginning, and that the address is present on the address/data bus. The burst continues until a burst last signal (BLAST) is set active, signifying the end of a burst. It provides a read/write signal, and some other control signals that need not be transferred to a remote computer.
- FIG. 8A shows how this protocol is used to transfer an n data word burst 63 . The data traffic closely mirrors that used on the PCI bus, but uses fewer signals.
- the destination address always precedes each data burst. Therefore, the bursts can be of variable size, can be split or coalesced, by generating fresh address words, or by removing address words where applicable.
- sequential data words are destined for sequentially incrementing addresses.
- data words having sequentially decrementing addresses might also be used, or any other pattern of addresses may be used so long as it remains easy to calculate. So far as the endpoints are concerned, exactly the same data is transferred to exactly the same locations.
- the benefits are that packets can be of any size at all, reducing the overhead of sending an address; packets can be split (and addresses regenerated to continue) by network switches to provide quality of service, and receivers need not wait for a complete packet to arrive to begin decoding work.
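The splitting and coalescing of address-headed bursts can be sketched as follows, modelling a burst as an `(address, words)` pair in which each word's destination address increments sequentially from the header. This is a simplified illustration of the cut-through behaviour, not the hardware implementation:

```python
def split_burst(burst, max_words):
    """Split one burst into smaller bursts, regenerating a fresh
    header address for each continuation (addresses increment per word)."""
    addr, words = burst
    out = []
    for i in range(0, len(words), max_words):
        out.append((addr + i, words[i:i + max_words]))
    return out

def coalesce(b1, b2):
    """Merge two bursts when the second's header address follows on
    directly from the end of the first, removing the redundant header."""
    (a1, w1), (a2, w2) = b1, b2
    if a2 == a1 + len(w1):
        return [(a1, w1 + w2)]
    return [b1, b2]
```

Because the endpoints see the same data at the same addresses either way, a switch is free to apply either transformation mid-stream, for example to interleave a small urgent packet into a long burst.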
- the destination address given in the header may be for the ‘nth’ data word in the burst, rather than for the first, although using the first data word address is preferred.
- FIG. 8 b shows how the protocol of FIG. 8 a is transcribed onto the G-LINK physical layer.
- the first word in any packet contains an 18-bit network address.
- Each word of 63 is split into two words in 64 ; the lower 16 bits carry high and low addresses or data, corresponding to the address/data bus; the next 4 bits carry either byte enables or parity data.
- the byte enable field (only 2 bits of which are available, owing to the limitations of G-LINK) is used to carry a 2-bit code indicating read, write or escape packet use. Escape packets are normally used to carry diagnostic or error information between nodes, or as a means of carrying the Xon/Xoff-style protocol when no other data is in transit.
- the G-LINK nCAV signal corresponds to the ADS signal of 63 ; nDAV is active throughout the rest of the burst and the combination of nDAV inactive and nCAV inactive signals the end of a burst, or nCAV active indicates the immediate beginning of another burst.
- FIG. 8 c shows a read data burst 65 ; this is the same as a write burst 64 , except data bit 16 is set to 0.
- the data field contains the network address for the read data to be returned to.
- When the data for a read returns 66 , it travels like a write burst, but is signified by there being only one nCAV active (signifying the network address) along with the first word.
- An additional bit, denoted FLAG in FIG. 8 , is used to carry Xon/Xoff style information when a burst is in progress. It is not therefore necessary to break up a burst in order to send an Escape packet containing the Xon/Xoff information.
- the FLAG bit also serves as an additional end of packet indicator.
- FIG. 8 c ( 67 , 68 ) shows an escape packet; after the network address, this travels with ( 68 ) or without ( 67 ) a payload, as defined by data bit 16 in the first word of the burst.
- an extra network address word may precede each of these packets.
- Other physical layer or network layer solutions are possible, without compromise to this patent application, including fibre channel parts (using 8B/10B encoding) and conventional networks such as ATM or even Ethernet.
- the physical layer only needs to provide some means of identifying data from non-data and the start of one burst from the end of a previous one.
- a further aspect of the invention relates to the distribution of hardware around a network.
- One use of a network is to enable one computer to access a hardware device whose location is physically distant.
- the NIC 73 is programmed from Boot ROM 22 to present the same hardware interface as that of the frame-grabber card 69 .
- Computer 72 can be running the standard application program as provided by a third party vendor, which is unaware that the system has been distributed over a network.
- All control reads and writes to the frame-grabber 69 are transparently forwarded by the NIC 73 , and there is no requirement for an extra process to be placed in the data path to interface between the application running on CPU 74 and the NIC 73 .
- Passive PCI I/O back-plane 71 requires simply a PCI bus clock and arbiter, i.e. no processor, memory or cache. These functions can be implemented at very low cost.
- the I/O buses are conformant to PCI Local Bus Specification 2.1.
- This PCI standard supports the concept of a bridge between two PCI buses. It is possible to program the NIC 73 to present the same hardware interface as a PCI bridge between Computer 72 and passive back-plane 71 . Such programming would enable a plurality of hardware devices to be connected to back-plane 71 and controlled by computer 72 without the requirement for additional interfacing software. Again, it should be clear that the invention will support both general networking activity and this remote hardware communication, simultaneously using a single network card.
- FIG. 10 shows a system comprising two software processes, applications 102 and 103 , on different computers 100 , 101 .
- Application 102 is producing some data.
- Application 103 is awaiting the production of data and then consuming it.
- the circular buffer 107 is composed of a region of memory on Computer 101 which holds the data and two memory locations—RDP 106 and WRP 109 .
- WRP 109 contains the pointer to the next byte of data to be written into the buffer, while RDP 106 contains the pointer to the last byte of data to be read from the buffer.
- WRP 108 and RDP 111 are also private values in the caches of computer 100 and computer 101 respectively.
- Each computer 100 , 101 may use the values of WRP and RDP held in its own local cache memory to compute how much data can be written to or read from the buffer at any point in time, without the requirement for communication over the network.
- the producer sets up a Tripwire 110 , which will match on a write to the RDP pointer 106
- the consumer sets up a Tripwire 113 , which will match on a write to the WRP pointer 109 .
- When consumer application 103 attempts to read data from the circular buffer 107 , it first checks to see if the circular buffer is empty. If so, application 103 must wait until the buffer is not empty, determined when WRP 109 has been seen to be incremented. During this waiting period, application 103 may either block, requesting an operating system reschedule, or poll the WRP 109 pointer.
- When producer application 102 decides to write to the circular buffer 107 , it may do so while the buffer is not full. After writing some data, application 102 updates its local cached value of WRP 108 and writes the updated value to the memory location 109 in computer 101 . When the value of WRP 109 is updated, the Tripwire 113 will match, as has been previously described.
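The pointer arithmetic described above can be modelled in Python. This sketch keeps both pointers in one object for brevity; in the arrangement of FIG. 10 each side holds its own cached copy and pushes pointer updates over the network. The convention that RDP indexes the next unread byte (rather than the last byte read) is an assumption of this sketch:

```python
class CircularBuffer:
    """Single-producer/single-consumer ring: each side computes space
    or fill from its cached copy of the peer's pointer, so no network
    read is ever needed in the data path (one slot is left unused to
    distinguish full from empty)."""
    def __init__(self, size):
        self.size = size
        self.buf = bytearray(size)
        self.wrp = 0   # advanced only by the producer (WRP 108/109)
        self.rdp = 0   # advanced only by the consumer (RDP 106/111)

    def space(self):        # producer side: how much may be written
        return (self.rdp - self.wrp - 1) % self.size

    def available(self):    # consumer side: how much may be read
        return (self.wrp - self.rdp) % self.size

    def write(self, data):
        assert len(data) <= self.space()   # never over-run the consumer
        for b in data:
            self.buf[self.wrp] = b
            self.wrp = (self.wrp + 1) % self.size

    def read(self, n):
        n = min(n, self.available())
        out = bytes(self.buf[(self.rdp + i) % self.size] for i in range(n))
        self.rdp = (self.rdp + n) % self.size
        return out
```

Because each pointer has exactly one writer, a transiently stale cached value can only under-estimate the space or data available, which is safe, as the text below explains.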
- NIC 115 will raise a hardware interrupt 114 .
- This interrupt causes CPU 118 to run device driver software contained within operating system 118 .
- the device driver will service the interrupt by reading the Tripwire FIFO 42 on NIC 115 and determining, from the value read, the system identifier for application 103 .
- the device driver can then request that operating system 118 , reschedule application 103 .
- the device driver would then indicate that the tripwire 113 should not generate a hardware interrupt until application 103 has been next descheduled and subsequently another Tripwire match has occurred.
- The system identifier for each running application is loaded into internal registers 49 each time the operating system reschedules. This enables the NIC to determine the currently running application, and so make the decision whether or not to raise a hardware interrupt for a particular application given a Tripwire match.
- application 102 and application 103 could be operating on different parts of the circular buffer simultaneously without the need for mutual exclusion mechanisms or Tripwire.
- the most important properties of the data structure are that the producer and the consumer are able to process data without hindrance from each other and that flow control is explicit within the software abstraction. Data is streamed through the system. The consumer can remove data from the buffer at the same time as the producer is adding more data. There is no danger of buffer over-run, since a producer will never transmit more data than can fit in the buffer.
- the producer only ever increments WRP 108 , 109 and reads RDP 106
- the consumer only ever increments RDP 106 , 111 , and reads WRP 109 .
- Inconsistencies in the values of WRP and RDP seen by either the producer or consumer either cause the consumer to not process some valid data (when RDP 106 is inconsistent with 111 ), or the producer to not write some more data (when WRP 109 is inconsistent with 108 ), until the inconsistency has been resolved. Neither of these occurrences cause incorrect operation or performance degradation so long as they are transient.
- computer 100 can store the value of the RDP 106 pointer in its processor cache, since the producer application 102 only reads the pointer 106 . Any remote writes to the memory location of the RDP pointer 106 will automatically invalidate the copy in the cache causing the new value to be fetched from memory. This process is automatically carried out and managed by the system controller 8 .
- computer 101 keeps a private copy of the RDP pointer 111 in its own cache, there is no need for any remote reads of RDP pointer values during operation of the circular buffer.
- a further enhancement to the above arrangement can be used to provide support for applications which would like to exchange data in discrete units.
- the system maintains a second circular buffer 127 , of updated WRP 129 values corresponding to buffer 125 .
- This second buffer 127 is used to indicate to a consumer how much data to consume, in order that data be consumed in the same discrete units as it was produced.
- circular buffer 125 contains the data to be exchanged between the applications 122 and 123 .
- the producer, application 122 writes data into buffer 125 , updating the pointer WRP 129 , as previously described. Once data has been placed in buffer 125 , application 122 then writes the new value of the WRP 129 pointer into buffer 127 . At the same time it also manipulates the pointer WRP 131 . If either of these write operations does not complete then the application level write operation is blocked until some data is read by the consumer application 123 .
- the Tripwire mechanism can be used as previously described, for either application to block on either a full or empty buffer pair.
- the consumer application 123 is able to read from both buffers 125 and 127 , in the process updating the RDP pointers 133 , 135 in its local cache and RDP pointers 124 , 126 over the network in the manner previously described.
- a data value read from buffer 127 indicates an amount of data which had been written into buffer 125 . This value may be used by application level or library software 123 , to consume data from buffer 125 in the same order and by the same discrete amounts as it was produced by application 122 .
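The two-buffer scheme can be sketched as follows, with a Python list standing in for each circular buffer. The second ring records the WRP value after each unit, so the consumer pops data in exactly the units it was produced. This is an illustrative model only; the real buffers are the circular buffers 125 and 127 described above:

```python
class MessageRing:
    """Data ring plus a ring of WRP snapshots, preserving the discrete
    units in which the producer pushed the data (illustrative model)."""
    def __init__(self):
        self.data = []    # stands in for circular buffer 125
        self.marks = []   # stands in for buffer 127 of updated WRP values

    def produce(self, payload):
        self.data.extend(payload)
        self.marks.append(len(self.data))   # new WRP value after this unit

    def consume(self):
        if not self.marks:
            return None
        wrp = self.marks.pop(0)
        # Read everything up to the recorded WRP: exactly one unit.
        unit, self.data = self.data[:wrp], self.data[wrp:]
        # Remaining marks shift down because data was drained.
        self.marks = [m - wrp for m in self.marks]
        return unit
```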
- the NIC can also be used to directly support a low latency Request/Response style of communication, as seen in client/server environments such as Common Object Request Broker Architecture (CORBA) and Network File System (NFS) as well as transactional systems such as databases.
- Such an arrangement is shown in FIG. 12 , where application 142 on computer 140 acts as a client requesting service from application 143 on computer 141 , which acts as a server.
- the applications interact via memory mappings using two circular buffers 144 and 145 , one contained in the main memory of each computer.
- the circular buffers operate as previously described, and also can be configured to transfer data in discrete units as previously described.
- Application 142 , the client, writes a request 147 directly into the circular buffer 145 , via the memory mapped connection(s), and waits for a reply by waiting on data to arrive in circular buffer 144 .
- Most Request/Response systems use a process known as marshalling to construct the request and use an intermediate buffer in memory of the client application to do the marshalling.
- marshalling is used to construct a response, with an intermediate buffer being required in the memory of the server application.
- marshalling can take place directly into the circular buffer 145 of the server as shown. No intermediate storage of the request is necessary at either the client or server computers 140 , 141 .
- the server application 143 notices the request (possibly using the Tripwire mechanism) and is able to begin unmarshalling the request as soon as it starts to arrive in the buffer 145 . It is possible that the server may have started to process the request 149 while the client is still marshalling and transmitting, thus reducing latency in the communication.
- the server After processing the request, the server writes the reply 146 directly into buffer 144 , unblocking application 142 (using the Tripwire mechanism), which then unmarshalls and processes the reply 148 . Again, there is no need for intermediate storage, and unmarshalling by the client may be overlapped with marshalling and transmission by the server.
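The direct-marshalling idea can be illustrated in Python: each marshalled field is written straight through a `ring_write` callable standing in for the memory-mapped circular buffer, and the peer decodes incrementally through a `ring_read` callable as bytes arrive. The wire format here (length-prefixed method name, 32-bit little-endian integer arguments) is purely an assumption of the sketch:

```python
import struct

def marshal_into(ring_write, method, args):
    """Marshal a request field by field, writing each fragment straight
    into the peer's circular buffer; no intermediate staging buffer."""
    ring_write(struct.pack("<H", len(method)))
    ring_write(method.encode())
    ring_write(struct.pack("<H", len(args)))
    for a in args:
        ring_write(struct.pack("<i", a))

def unmarshal_from(ring_read):
    """Peer side: decode incrementally as bytes become available,
    so unmarshalling can overlap marshalling and transmission."""
    (mlen,) = struct.unpack("<H", ring_read(2))
    method = ring_read(mlen).decode()
    (argc,) = struct.unpack("<H", ring_read(2))
    args = [struct.unpack("<i", ring_read(4))[0] for _ in range(argc)]
    return method, args
```

Because `unmarshal_from` pulls only as many bytes as it needs for the next field, the server can begin decoding while the client is still marshalling, which is the latency reduction described above.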
- FIG. 15 shows a Request/Response system which is a file serving application.
- the client application 262 writes a request 267 for some data held on disks controlled by 271 .
- the server application 263 reads 269 and decodes the request from its circular buffer 265 in the manner previously described. It then performs authentication and authorisation on the request according to the particular application.
- the server application 263 uses a two-part approach to send its reply. Firstly, it writes, into the circular buffer 264 , the software generated header part of the reply 266 . The server application 263 then requests 273 that the disk controller 271 send the required data part of the reply 272 over the network to circular buffer 264 .
- This request to the disk controller takes the form of a DMA request, with the target address being an address on I/O bus 270 which has been mapped onto circular buffer 264 . Note that the correct offset is applied to the address such that reply data 272 from the disk is placed immediately following the header data 266 .
- Before initiating the request 273 , the server application 263 can ensure that sufficient space is available in the buffer 264 to accept the reply data. Further, it is not necessary for the server application 263 to await the completion of request 273 . It is possible for the client application 262 to have set a Tripwire 274 to match once the reply data 272 has been received into buffer 264 . This match can be programmed to increment the WRP pointer associated with buffer 264 , rather than requiring application 263 to increment the pointer as previously described. If a request fails, an application-level timeout mechanism in the client application 262 would detect the failure and retry the operation.
- Reply data 272 may alternatively be placed in some other data structure (such as a kernel buffer-cache page), through manipulation of 169 and 167 as described later. This is useful when 264 is not the final destination of the reply data, preventing a final memory copy operation by the client.
- Server application 263 would be unaware of this client side optimisation.
- FIG. 13 shows two applications communicating using VIA.
- Application 152 sends data to application 153 , by first writing the data to be sent into a region of its memory, shown as block 154 .
- Application 152 then builds a transmit descriptor 156 , which describes the location of block 154 and the action required by the NIC (in this case data transmission). This descriptor is then placed onto the TxQueue 158 , which has been mapped into the user-level address-space of application 152 .
- Application 152 then finally writes to the doorbell register 160 in the NIC 162 to notify the NIC that work has been placed on the TxQueue 158 .
- the NIC 162 can determine, from the value written, the address in physical memory of the activated TxQueue 158 .
- the NIC 162 reads and removes the descriptor 156 from the TxQueue 158 , determines from the descriptor 156 the address of data block 154 , and invokes DMA engine 164 to transmit the data contained in block 154 .
- the NIC 162 places the descriptor 156 on a completion queue 166 , which is also mapped into the address space of application 152 , and optionally generates a hardware interrupt.
- the application 152 can determine when data has been successfully sent by examining queue 166 .
- When application 153 is to receive data, it builds a receive descriptor 157 describing where the incoming data should be placed, in this case block 155 . Application 153 then places descriptor 157 onto RxQueue 159 , which is mapped into its user-level address-space. Application 153 then writes to the doorbell register 161 to indicate that its RxQueue 159 has been activated. It may choose either to poll its completion queue 163 , waiting for data to arrive, or to block until data has arrived and a hardware interrupt has been generated.
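The VIA send path described above, i.e. descriptor onto the TxQueue, doorbell write, DMA, then descriptor onto the completion queue, can be condensed into a small Python model. The class and field names are illustrative, and a plain list stands in for the DMA engine and network:

```python
class ViaNic:
    """Condensed sketch of the VIA send path: the doorbell value
    identifies the activated TxQueue, the NIC pops the descriptor,
    'transmits' the block, then posts it to the completion queue."""
    def __init__(self):
        self.tx_queues = {}     # doorbell value -> TxQueue (list)
        self.completions = {}   # doorbell value -> completion queue
        self.wire = []          # stands in for DMA engine + network

    def register(self, doorbell):
        self.tx_queues[doorbell] = []
        self.completions[doorbell] = []

    def doorbell_write(self, doorbell):
        # NIC determines the activated TxQueue from the value written.
        desc = self.tx_queues[doorbell].pop(0)
        self.wire.append(desc["data"])            # DMA transmits the block
        self.completions[doorbell].append(desc)   # then posts the descriptor

def send(nic, doorbell, data):
    nic.tx_queues[doorbell].append({"data": data})   # transmit descriptor
    nic.doorbell_write(doorbell)                     # notify the NIC
```

The application learns of completion exactly as in the text: by examining its completion queue after the doorbell-triggered service.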
- the NIC 165 in computer 151 services the doorbell register 161 write by first removing the descriptor 157 from the RxQueue 159 .
- the NIC 165 locates the physical pages of memory corresponding to block 155 and described by the receive descriptor 157 .
- the VIA standard allows these physical pages to have been previously locked by application 153 (preventing the virtual memory system moving or removing the pages from physical memory).
- the NIC is also capable of traversing the page-table structures held in physical memory and itself locking the pages.
- the NIC 165 continues to service the doorbell register write and constructs a Translation Look-aside Buffer (TLB) entry 167 located in SRAM 23 .
- the TLB translation having been previously set up, occurs with little overhead and the data is written 175 to appropriate memory block 155 .
- a Tripwire 171 will have been arranged (when the TLB 167 entry was constructed) to match when the address range corresponding to block 155 is written to. This Tripwire match causes the firmware 173 (implemented in state machine 51 ) to place the receive descriptor 157 onto completion queue 163 to invalidate the TLB mapping 167 and optionally generate an interrupt. If the RxQueue 159 has been loaded with other receive descriptors, then the next descriptor is taken and loaded into the TLB as previously described.
- the NIC could also respond to Tripwire match 171 by placing an index on Tripwire FIFO 42 , which could enable the device driver to identify the active VIA endpoint without searching all completion queues in the system.
- This method can be extended to provide support for I2O and the forthcoming Next Generation I/O (NGIO) standard.
- the transmit, receive and completion queues are located on the NIC rather than in the physical memory of the computer, as is currently the case for the VIA standard.
- FIG. 14 shows a Direct Memory Access (DMA) engine 182 on the NIC 183 , which has been programmed in the manner previously described by a number of user-level applications 184 . These applications have requested that the NIC 183 transfer their respective data blocks 181 through the NIC 183 , local bus 189 , fibre-optic transceiver 190 and onto network 200 . After each application has placed its data transfer request onto the DMA request queue 185 , it blocks, awaiting a re-schedule, initiated by device driver 187 . It can be important that the system maintains fair access between a large number of such applications, especially under circumstances where an application requires a strict periodic access to the queue, such as an application generating a video stream.
- Data transferred over the network by the DMA engine 182 traverses local bus 189 , and is monitored by the Tripwire unit 186 . This takes place in the same manner as for received data, (both transmitted and received data pass through the NIC using the same local bus 55 ).
- Each application when programming the DMA engine 182 to transmit a data block, also constructs a Tripwire which is set to match on an address in the data block.
- the address to match could indicate that all or a certain portion of the data has been transmitted.
- the device driver 187 can quickly determine which application should be made runnable. By causing a system reschedule, the application can be run on the CPU at the appropriate moment to generate more DMA requests. Because the device driver can execute at the same time that the DMA engine is transferring data, this decision can be made in parallel to data transfer operations. Hence, by the time that a particular application's data transfer requests have been satisfied, the system can ensure that the application be running on the CPU and able to generate more requests.
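This transmit-side use of Tripwires can be sketched as follows, assuming each pending block is tagged with its owning application and each Tripwire is an (address, owner) pair; `make_runnable` stands in for the driver's reschedule request, and all names are illustrative:

```python
def transmit_with_tripwires(blocks, tripwires, make_runnable):
    """As the DMA engine streams each (app, start, length) block past
    the Tripwire unit, any tripwire address falling inside that app's
    block fires, and the owning application is made runnable in
    parallel with the remaining transfers."""
    woken = []
    for app, start, length in blocks:
        for addr, owner in tripwires:
            if owner == app and start <= addr < start + length:
                make_runnable(owner)   # reschedule before DMA completes
                woken.append(owner)
    return woken
```

Setting the tripwire address part-way through a block (rather than at its end) models the "all or a certain portion of the data" choice above, letting the wake-up race ahead of the final word of the transfer.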
- FIG. 16 illustrates a generalised apparatus or arrangement for synchronising an end-point application using a tripwire.
- An end-point is a final destination for an information stream and is the point at which processing of the information takes place.
- Examples of end-points include a web, file or database server, and hardware devices such as a disk or graphics controller.
- An end-point may be running an operating system and a number of data processing applications and these are referred to as end-point applications.
- examples of end-point applications include an operating system or a component thereof, a network protocol stack, and any application-level processing. Arrangements such as network switches and routers do not constitute end-points or end-point applications because their purpose is to ensure that the information is delivered elsewhere.
- the arrangement comprises a computer 300 which is optionally connected to other computers 301 and 302 via a network 303 .
- the computer 300 comprises a program memory (illustrated by way of example only as a read only memory (ROM) 305 ) which contains a program for controlling the computer to synchronise the end-point application in accordance with an address-based event in an information stream on an information pathway 307 , such as a bus, within the computer.
- the information stream may be wholly within the computer, for example from another application performed by the computer 300 , or may be from a remote source, such as from the network 303 .
- the bus 307 is connected to a memory 308 in the end-point application 306 , which also comprises a code generator 309 and an action generator 310 .
- the code generator 309 supplies codes to a comparator which is illustrated as a content addressable memory (CAM) 311 .
- the CAM 311 has another input connected to the bus 307 and is arranged to perform a comparison between each entry in the CAM and the information stream on the bus 307 . When a match is found, the CAM sends a signal to the action generator 310 which performs an action which is associated with an address-based event in the information stream.
- the end-point application 306 sets a tripwire, for example to be triggered when data relating to an end-point address or range of end-point addresses in the memory 308 are present on the bus 307 .
- the code generator 309 supplies a code which is written into the CAM 311 and which comprises the destination memory address of the data or possibly part of this address, such as the most significant bits when a range of addresses is to be monitored. It is also possible to enter a code which represents not only the address or range of addresses but also part or all of one or more items of data which are expected in the information stream.
- the CAM 311 compares the address of each data burst on the bus 307 , and possibly also at least some of the data of each burst, with each code stored in the CAM 311 and supplies a signal to the action generator 310 when a match is found.
- the action generator 310 then causes the appropriate action to be taken within the end-point application 306 . This may be a single action, several actions, or one or more specific actions which are determined not only by the triggering of the tripwire but also by the data within the information stream, for example arriving at the appropriate location or locations in the memory 308 .
- the information stream 307 may be wholly internal to the computer 300 and an example of this is an application-to-application stream of information where both applications are running, for example alternately, on the computer 300 .
- the information stream may be partly or wholly from outside the computer 300 , as illustrated by the broken line connection from the bus 307 to the network 303 .
- the information stream may be from a switch fabric, a network, or a plurality of sources.
- a switch fabric is a device which has a plurality of inputs and outputs and which is capable of forwarding data from each input to the appropriate output according to routing information contained within the data.
- a switch fabric may alternatively be wholly contained within the computer.
- the information stream preferably has a data burst arrangement as described hereinafter and, in the case of a plurality of sources, the data bursts may arrive from any of the sources at any time, which amounts to multiplexing.
- FIG. 17 shows an arrangement which illustrates two possible modifications to the arrangement shown in FIG. 16 .
- the bus 307 is connected to an input/output bus 312 of the end-point application 306 within the computer 300 .
- An example of an active controller is a disk controller.
- the arrangement shown in FIG. 17 also differs from that shown in FIG. 16 in that the tripwire may be triggered by an address-based event in the information stream on the bus 307 which does not exactly match any of the codes stored in the CAM 311 . Instead, the information from the information stream on the bus 307 first passes through a process 313 before being supplied to the CAM for comparison with each of the stored codes.
- the information stream comprises packets or bursts of data starting with an address, for example corresponding to an address in the memory 308 to which the first item of data after the address in the packet or burst is allocated. Subsequent items of data are to be allocated to consecutive addresses, for example such that each item of data in the burst is to be allocated to the next highest address location after the preceding data item.
- the address at the start of each burst relates to the first data item and the following data item addresses can be inferred by incrementing the address upon the arrival of the second and each subsequent item of data.
- the application 306 can cause the code generator 309 to store in the CAM 311 a code which corresponds to an implied address in the actual information stream appearing on the bus 307 .
- the process 313 detects the address at the start of each data burst and supplies this to the CAM 311 with the arrival of the first data item. As each subsequent data item of the same burst arrives, the process 313 increments the address and supplies this to the CAM 311 . This allows a tripwire to be triggered when, for example a data item having an implied address is present on the bus 307 because the CAM can match the corresponding stored code with the address supplied by the process 313 .
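The incrementing comparison performed by process 313 can be sketched in Python, with a burst modelled as an `(address, items)` pair and the CAM as a set of tripwire addresses (a simplified illustration; the real comparator is CAM 311 fed by process 313):

```python
def tripwire_matches(burst, cam):
    """Process 313 in miniature: the header address arrives with the
    first data item and is incremented for each later item, so the CAM
    can match tripwires set on implied (non-header) addresses too."""
    addr, items = burst
    hits = []
    for offset, item in enumerate(items):
        implied = addr + offset          # implied address of this item
        if implied in cam:               # CAM comparison against stored codes
            hits.append((implied, item))
    return hits
```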
- the action generator 310 can cause any one or more of various different actions to be triggered by the tripwire.
- the resulting action may be determined by which tripwire has been triggered, i.e. which code stored in the CAM 311 has been matched. It is also possible for the action to be at least partly determined by the data item which effectively triggered the tripwire. Any action may be targeted at the computer containing the tripwire or at a different computer.
- Various possible actions are described hereinafter as typical examples; they may be performed singly or in any appropriate combination for the specific application, and may be targeted at the computer containing the tripwire or at a different computer.
- FIG. 18 illustrates the action generator 310 raising an interrupt request IRQ and supplying this to the interrupt line of a central processing unit (CPU) 320 of the computer 300 .
- FIG. 19 illustrates the action generator 310 setting a bit in a bitmap 321 , for example in the memory 308 .
- the action generator may raise an interrupt request if an application which requires data corresponding to the tripwire is not currently running but is runnable; for example it has not exhausted its time-slice. Otherwise, for example if the application is awaiting rescheduling, the relevant bit in the bitmap 321 may be set.
- the operating system may periodically check the bitmap 321 for changes and, as a result of the arrival of the relevant data for an application which is presently not running, may decide to reschedule or wakeup the application.
- FIG. 20 illustrates another type of action which may be performed as a result of detection of the address-based event.
- a counter 322 , whose count is for example stored within the memory 308 , is incremented in response to triggering of the tripwire. Incrementing may take place as a result of any tripwire being triggered, or only of one or more specific tripwires, depending on the specific application.
- FIG. 21 illustrates another action which is such that, when the or the appropriate tripwire is triggered, a predetermined value “N” is written to a location “X” shown at 323 as being in the memory 308 (or being mapped thereto).
- FIG. 22 illustrates another combination of actions which may be used to indicate that an application should be awakened or rescheduled.
- When a tripwire is triggered, an interrupt request is supplied to the CPU 320 and a “runnable bit” for a specific application is set at location 324 in the memory 308.
- the operating system of the computer 300 responds to the interrupt request by waking up or rescheduling the application whose runnable bit has been set.
- FIG. 23 illustrates an action which modifies entries in the CAM 311 in response to triggering of a tripwire.
- the code which triggers the tripwire may be deleted if no further tripwires are required for the same address-based event.
- the code may be modified so as effectively to set a different but related tripwire.
- a further possibility is to generate a completely new code and supply this to the CAM 311 in order to set a new unrelated tripwire.
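The tripwire mechanism and the actions of FIGS. 19 to 23 can be modelled in software for illustration. The sketch below is not the patented hardware: the class and function names are invented, and a Python dictionary stands in for the CAM 311. It shows a per-tripwire action being dispatched when a code in the stream matches, covering the bitmap, counter, value-write and CAM-modification actions.

```python
# Software model of the tripwire mechanism: a "CAM" maps trigger
# codes (addresses) to actions performed when the code is matched.

class TripwireModel:
    def __init__(self):
        self.cam = {}          # trigger address -> action callable (models CAM 311)
        self.bitmap = set()    # bits set for applications to be woken (FIG. 19)
        self.counters = {}     # per-tripwire event counters (FIG. 20)
        self.memory = {}       # models writes of value N to location X (FIG. 21)

    def set_tripwire(self, address, action):
        self.cam[address] = action

    def snoop(self, address, data):
        """Called for every (address, data) item in the information stream."""
        action = self.cam.get(address)
        if action is not None:
            action(self, address, data)

# Example actions; a real implementation could combine several per tripwire.
def set_bit(app_id):
    return lambda tw, addr, data: tw.bitmap.add(app_id)

def increment(counter_id):
    def act(tw, addr, data):
        tw.counters[counter_id] = tw.counters.get(counter_id, 0) + 1
    return act

def write_value(location, value):
    return lambda tw, addr, data: tw.memory.__setitem__(location, value)

def delete_self(tw, addr, data):
    del tw.cam[addr]           # one-shot tripwire: modify the CAM (FIG. 23)
```

A one-shot tripwire, for instance, removes its own code from the CAM after the first match, mirroring the CAM-entry deletion of FIG. 23.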
- FIG. 24 illustrates the format of a data burst, a sequence of which forms the information stream on the bus 307 .
- the data burst comprises a plurality of items which arrive one after the other in sequence on the bus.
- the first item is an address A(n) which is or corresponds to the end-point address, for example in the memory 308 , for receiving the subsequent data items.
- This address is the actual address n of the first data item D1 of the burst, which immediately follows the address A(n).
- the subsequent data items D2, D3, . . . , Dp arrive in sequence and their destination addresses are implied by their position within the burst relative to the first data item D1 and its address n.
- the second data item D2 has an implied address n+1
- the third data item D3 has an implied address n+2 and so on.
- Each data item is written or supplied to the implied address as its destination address.
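The implied-address scheme of FIG. 24 can be expressed in a few lines. This is an illustrative sketch (the function name and the representation of a burst as a base address plus an ordered list of items are invented here): given the header address A(n), each item's destination address follows from its position.

```python
# A burst is a header address followed by data items; each item's
# destination address is implied by its position in the burst.

def burst_addresses(base, items):
    """Return (implied_address, item) pairs for a burst A(n), D1..Dp."""
    return [(base + i, item) for i, item in enumerate(items)]
```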
- This data burst format may be used to fragment and coalesce bursts as the data stream passes through a forwarding unit 330 , such as a network interface card or a switch, of an information pathway.
- the forwarding unit can start to transmit a burst as soon as the first data item has arrived and does not have to wait until the whole data burst has arrived.
- FIG. 25 illustrates an example of this in which an interruption in the data burst occurs.
- the forwarding unit 330 has already started transmission of the burst and the first r data items 331 together with the burst address have already been forwarded.
- the remainder 332 of the burst has not yet arrived and the forwarding unit 330 terminates forwarding or transmission of that burst.
- the forwarding unit 330 recalculates the destination address A(r+1) for the remainder of the burst and inserts this in front of the data item Dr+1. This is transmitted as a further burst 333 as illustrated in FIG. 26.
- This technique may be used even when the whole burst is available for forwarding by the forwarding unit 330 .
- the forwarding unit 330 may terminate transmission of a particular burst before completion of transmission for reasons of arbitration between a number of competing bursts or for flow control reasons.
- individual data bursts can be forwarded intact or can be sent in two or more fragments as necessary or convenient, and all such bursts are treated as valid bursts by any subsequent forwarding units.
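Fragmentation as described for FIGS. 25 and 26 amounts to recomputing a header address for the untransmitted remainder. A hedged sketch (the function name and the representation of a burst as a base address paired with a list of items are invented here):

```python
# Fragmenting a burst after r items have been forwarded: the remainder
# becomes a new, self-describing burst whose header address is
# recalculated from the original base address.

def fragment(base, items, r):
    """Split a burst (base, items) after the first r items.

    Returns two valid bursts: the part already forwarded and the
    remainder with its recomputed header address base + r.
    """
    first = (base, items[:r])
    rest = (base + r, items[r:])
    return first, rest
```

Because both halves carry a valid header address, any subsequent forwarding unit can treat each as an ordinary burst.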
- FIG. 27 illustrates an alternative situation in which the forwarding unit has an internal buffer 335 which contains first and second bursts 336 and 337 .
- the implied address of the first data item Dn+1 of the second burst 337 immediately follows the implied address of the last data item Dn of the first burst 336.
- the forwarding unit checks for such situations and, when they are found, coalesces the first and second bursts into a coalesced burst 338 as shown in the lower part of FIG. 27 .
- the forwarding unit then transmits a single contiguous burst, which saves the overhead of the excess address information (which is deleted from the second burst).
- Any subsequent forwarding units then treat the coalesced burst 338 as a single burst.
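Coalescing is the inverse operation: if the implied address following the last item of one buffered burst equals the header address of the next, the two can be merged and the second header discarded. An illustrative sketch (the function name and the representation of a burst as a base address paired with a list of items are invented here):

```python
# Coalescing adjacent bursts in a forwarding unit's buffer: if the
# implied address after the first burst equals the header address of
# the second, the two can be sent as one contiguous burst.

def coalesce(first, second):
    """Merge bursts (base1, items1) and (base2, items2) if contiguous."""
    base1, items1 = first
    base2, items2 = second
    if base1 + len(items1) == base2:
        return (base1, items1 + items2)   # drop the second header
    return None                           # not contiguous; keep separate
```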
- the format of the data burst allows such fragmentation or merging of bursts to take place. This in turn allows forwarding units to transmit data as soon as it arrives so as to reduce or minimise latency. Also, bursts of any length or number of data items can be handled, which improves the flexibility of transmission of data.
- FIG. 28 illustrates an example of communication between an application, whose address space is shown at 340 , and remote hardware 341 via a network 303 such that the network 303 is “transparent” or “invisible” to each of the application and the remote hardware 341 .
- the address space 340 contains mapped configuration data and registers of the remote hardware as indicated at 342 . This is mapped onto the system input/output bus 343 to which a network interface card 344 is connected.
- the network interface card 344 is loaded with configuration and register data corresponding to the remote hardware 341 . All application requests are forwarded over the network 303 transparently to the remote hardware 341 so that the remote hardware appears as though it is local to the application and the network 303 is invisible.
- the remote hardware 341 is connected to a passive input/output bus 345 which is provided with a network interface card 346 for interfacing to the network 303 .
- the configuration and registers of the remote hardware are illustrated at 347 and are mapped ultimately to the region 342 of the address space 340 of the application. Again, the network is invisible to the remote hardware 341 and the remote application appears to be local to it.
- when the application sends a request to the remote hardware 341, for example requesting that the remote hardware supply data to be used in or processed by the application, the request is written in the space 342 which is mapped to the system input/output bus 343.
- the network interface card 344 sends read/write requests over the network 303 to the card 346 , which supplies these via the passive input/output bus 345 to the remote hardware 341 .
- the bus 345 appears equivalent to the bus 343 .
- the remote hardware 341 may supply an interrupt and/or data for the application to the bus 345 . Again, the network interface card 346 sends this via the network 303 to the card 344 . The network interface card 344 supplies an interrupt request to the computer running the application and writes the data on behalf of the remote hardware to the space 342 in the address space 340 of the application. Thus, to the application, the remote hardware 341 appears to be connected directly to the bus 343 .
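The transparency described for FIG. 28 can be modelled as an address-space object whose reads and writes are forwarded to a remote register file. This is only a software analogy of the hardware mapping; the class names, and the direct method calls standing in for NICs 344 and 346 and the network 303, are invented for illustration.

```python
# Sketch of the transparent mapping: accesses to the mapped region of
# the application's address space are forwarded by the local NIC over
# the network and replayed on the remote passive I/O bus.

class RemoteBus:
    """Models the passive I/O bus with the remote hardware's registers."""
    def __init__(self):
        self.registers = {}

    def write(self, offset, value):
        self.registers[offset] = value

    def read(self, offset):
        return self.registers.get(offset, 0)

class MappedAperture:
    """Models the mapped region: forwards accesses over the 'network'
    (here a direct reference) so the hardware appears local."""
    def __init__(self, remote_bus):
        self.remote = remote_bus

    def __setitem__(self, offset, value):
        self.remote.write(offset, value)   # local NIC -> network -> remote NIC

    def __getitem__(self, offset):
        return self.remote.read(offset)
```

From the application's side, register accesses through the aperture are indistinguishable from accesses to locally attached hardware.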
- tripwires may be implemented at other points in a system as illustrated by tripwire units 2 to 5 in FIG. 29 .
- the system comprises a disk controller 351 connected to an input/output bus 307b and the tripwire unit 2 is implemented as part of the disk controller 351.
- Such an arrangement allows tripwire operations to inform applications of any characteristic data transfer to or from the disk controller 351 .
- Such an arrangement is particularly useful where the controller 351 is able to transfer data to and from a non-contiguous memory region corresponding to user-level buffers of an application. This allows data transfer and application level notification to be achieved without requiring hardware interrupts or kernel intervention.
- the tripwire unit 3 is associated with a system controller 352 connected to a host bus 307a and the input/output bus 307b.
- Such an arrangement allows tripwire operations to inform applications of any characteristic data transfer to or from any device in the computer system.
- This includes hardware devices, such as the disk controller 351 and the network interface card 350 , and, in the case of a system employing several CPUs, enables an application running on one of the CPUs to synchronise on a data transfer to or from an application running on another of the CPUs.
- a tripwire may be used for synchronisation between applications running on the same CPU. This reduces the need for other mechanisms such as spin locks where both applications are required to operate in lock-step with the data transfer.
- Tripwire units 4 and 5 are implemented in the CPU 320 or the memory 308. This is generally equivalent to the tripwire unit 3, where all data transfers in the system can be monitored. However, the tripwire unit 4 may monitor data written by an application to cache, which may not appear on the host bus 307a.
Abstract
Asynchronous network interface and method of synchronisation between two applications on different computers is provided. The network interface contains snooping hardware which can be programmed to contain triggering values comprising either addresses, address ranges or other data which are to be matched. These data are termed “trip wires”. Once programmed, the interface monitors the data stream, including address data, passing through the interface for addresses and data which match the trip wires which have been set. On a match, the snooping hardware can generate interrupts, increment event counters, or perform some other application-specified action. This snooping hardware is preferably based upon Content-Addressable Memory. The invention thus provides in-band synchronisation by using synchronisation primitives which are programmable by user level applications, while still delivering high bandwidth and low latency. The programming of the synchronisation primitives can be made by the sending and receiving applications independently of each other and no synchronisation information is required to traverse the network.
Description
- This application is a divisional of U.S. patent application Ser. No. 09/980,539 filed on Oct. 23, 2001, which is hereby incorporated herein by reference in its entirety.
- This invention, in its various aspects, relates to the field of asynchronous networking, and specifically to: a memory mapped network interface; a method of synchronising between a sending application, running on a first computer, and a receiving application, running on a second computer, the computers each having a memory mapped network interface; a communication protocol; and a computer network. This invention also relates to data transfer and to synchronising applications.
- Due to a number of reasons, traditional networks, such as Gigabit Ethernet, ATM, etc., have not been able to deliver high bandwidth and low latency to applications that require them. A traditional network is shown in
FIG. 1. To move data from computer 200 to another computer 201 over a network, the Central Processing Unit (CPU) 202 writes data from memory 204 through its system controller 206 to its Network Interface Card (NIC) 210. Alternatively, data may be transferred to the NIC 210 using Direct Memory Access (DMA) hardware. The NIC 210 forms network packets 216, which contain enough information to allow them to be routed across the network 218 to computer system 201.
- When a network packet arrives at the NIC 211, it must be demultiplexed to determine where the data needs to be placed. In traditional networks this must be done by the operating system. The incoming packet therefore generates an interrupt 207, which causes software, a device driver in operating system 209, to run. The device driver examines the header information of each incoming network packet 216 and determines the correct location in memory 205 for the data contained within the network packet. The data is transferred into memory using the CPU 203 or DMA hardware (not shown). The driver may then request that operating system 209 reschedule any application process that is blocked waiting for this data to arrive. Thus there is a direct sequence from the arrival of incoming packets to the scheduling of the receiving application. These networks therefore provide implicit synchronisation between sending and receiving applications and are called synchronous networks.
- It is difficult to achieve optimum performance using modern synchronous network hardware. One reason is that the number of interrupts that have to be processed increases as packets are transmitted at a higher rate. Each interrupt requires that the operating system is invoked and software is executed for each packet. Such overheads increase both the latency and the data transfer size threshold at which the maximum network bandwidth is achieved.
- These observations have led to the development of asynchronous networks. In asynchronous networks, the final memory location within the receiving computer for received data can be computed by the receiving NIC from the header information of a received network packet. This computation can be done without the aid of the operating system.
- Hence, in asynchronous networks there is no need to generate a system interrupt on the arrival of incoming data packets. Asynchronous networks therefore have the potential of delivering high bandwidth and low latency; much greater than synchronous networks. The Virtual Interface Architecture (VIA) is emerging as a standard for asynchronous networking.
- Memory-mapped networks are one example of asynchronous networks. An early computer network using memory mapping is described in U.S. Pat. No. 4,393,443.
- A memory-mapped network is shown in
FIG. 2. Application 222 running on Computer 220 would like to communicate with application 223 running on Computer 221 using network 224. A portion of application 222's memory address space is mapped using computer 220's virtual memory system onto a memory aperture of the NIC 226 as shown by the application's page-tables 228 (these page-tables and their use are well known in the art). Likewise, a portion of application 223's memory address space is mapped using computer 221's virtual memory system onto a memory aperture of the NIC 229 using application 223's page-tables 231. Software is usually required to create these mappings, but once they have been made, data transfer to and from a remote machine can be achieved using a CPU read or write instruction to a mapped virtual memory address.
- If application 222 were to issue a number of processor write instructions to this part of its address space, the virtual memory and I/O controllers of computer 220 will ensure that these write instructions are captured by the memory aperture of the NIC 226. NIC 226 determines the address of the destination computer 221 and the address of the remote memory aperture 225 within that computer. Some combination of this address information can be regarded as the network address, which is the target of the write.
computers - After receiving a write, NIC 226 creates network packets using its packetisation engine 230. These packets are forwarded to the
destination computer 221. At the destination, the memory aperture addresses of the incoming packets are remapped by the packet handler ontophysical memory locations 227. The destination NIC 229 then writes the incoming data to thesephysical memory locations 227. This physical memory has also been mapped at connection set-up time into the address space ofapplication 223. Henceapplication 223 is able, using page-tables 231 and the virtual memory system, to access the data using processor read and write operations. - Commercial equipment for building memory-mapped networks is available from a number of vendors, including Dolphin Interconnect Solutions. Industry standards, such as Scalable Coherent Interface (SCI) (IEEE Standard 1596-1992), have been defined for building memory mapped networks, and implementations to the standards are currently available.
- SCI is an example of an asynchronous network standard, which provides poor facilities for synchronisation at the time of data reception. A network using SCI is disclosed in U.S. Pat. No. 5,819,075.
FIG. 3 shows an example of an SCI-like network, where application 242 on computer 240 would like to communicate with application 243 on computer 241. Let us suppose that application 243 has blocked waiting for the data. Application 242 transmits data using the methods described above. After sending the data, application 242 must then construct a synchronisation packet in local memory, and program the event generator 244, in NIC 246, to send the synchronisation packet 248 to the destination node.
- On receiving synchronisation packet 248, the NIC 245 on computer 241 invokes its event handler 247, which generates an interrupt 249 allowing the operating system 248 to determine that application 243 is blocked and should be woken up. This is called out-of-band synchronisation since the synchronisation packet must be treated as a separate and distinct entity and not as part of the data stream. Out-of-band synchronisation greatly reduces the potential of memory-mapped networks to provide high bandwidth and low latency.
- In other existing asynchronous networks, such as the newly emerging Virtual Interface Architecture (VIA) standard and the forthcoming Next Generation Input/Output (NGIO) standard, some support is provided for synchronisation. A NIC will raise a hardware interrupt when some data has arrived. However, the interrupt does not identify the recipient of the data; instead it only indicates that some data has arrived for some communicating end-point.
- While delivery of data can be achieved solely by hardware, the software task of scheduling between a large number of applications, each handling received data, becomes difficult to achieve. Software, known as a device driver, is required to examine a large number of memory locations to determine which applications have received data. It must then notify such applications that data has been delivered to them. This might include a reschedule request to the operating system for the relevant applications.
- Other known data transfer techniques are disclosed in
EP 0 600 683, EP 0 359 137, EP 0 029 800, U.S. Pat. No. 5,768,259, U.S. Pat. No. 5,550,808 and JP 600211559.
- The present invention, in its various aspects, is defined in more detail in the appended claims to which reference should now be made.
- A first aspect of the invention provides a method of synchronising between a sending application on a first computer and a receiving application on a second computer, each computer having a main memory, and at least one of the computers having an asynchronous network interface, comprising the steps of:
-
- providing the asynchronous network interface with a set of rules for directing incoming data to memory locations in the main memory of the second computer;
- storing in the network interface one or more triggering value(s), each triggering value representing a state of a data transfer between the applications;
- receiving, at the network interface, a data stream being transferred between the applications;
- comparing at least part of the data stream received with the stored triggering values;
- if the compared part of the data stream matches any stored triggering value, indicating that the triggering value has been matched; and
- storing the data received in the main memory of the second computer at one or more memory location(s) in accordance with the said rules.
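The steps above can be sketched as a software model of the receive path. This is illustrative only (the function name and data structures are invented): a dictionary stands in for the set of rules, and a set of addresses stands in for the stored triggering values.

```python
# Model of the receive path: incoming (address, data) items are
# delivered to main memory according to a set of rules, while a
# parallel comparison against stored triggering values records which
# values matched.

def receive_stream(stream, rules, triggers):
    """stream: iterable of (address, data) items.
    rules: maps an incoming address to a main-memory location.
    triggers: stored triggering values (addresses to match).
    Returns (memory contents, set of matched triggering values)."""
    memory = {}
    matched = set()
    for address, data in stream:
        if address in triggers:
            matched.add(address)          # indicate that the value matched
        location = rules.get(address, address)
        memory[location] = data           # store per the rules
    return memory, matched
```

Note that matching and storing proceed together over the same stream: synchronisation information travels in-band with the data.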
- Another aspect of the invention provides an asynchronous network interface for use in a host computer having a main memory and connected to a network, the interface comprising:
-
- means for storing a set of rules for directing incoming data to memory locations in the main memory of the host computer;
- a memory for storing one or more triggering value(s), each value representing a state of a data transfer between two or more applications in the computer network;
- a receiver for receiving a data stream being transferred between two or more applications in the computer network; comparison means for comparing at least part of the data stream received by the network interface with the stored triggering values; and
- a memory for storing information identifying any matched triggering values.
- A further aspect of the invention provides a method of passing data between an application on a first computer and remote hardware within a second computer or on a passive backplane, the first computer having a main memory and an asynchronous network interface, the method comprising the steps of:
-
- providing the asynchronous network interface with a set of rules for directing incoming data to memory or I/O location(s) of the remote hardware;
- storing in the network interface one or more triggering value(s), each triggering value representing a state of a data transfer between the application and the hardware;
- receiving, at the network interface, a data stream being transferred between the application and the hardware;
- comparing at least part of the data stream received with the stored triggering value(s);
- indicating that a triggering value has been matched, if any compared part of the data stream matches a triggering value;
- storing data transmitted in memory or I/O location(s) of the remote hardware in accordance with the said rules; and
- storing the data received in the main memory of the computer at one or more memory location(s) in accordance with the said rules.
- A further aspect of the invention provides a method of arranging data transfers from one or more applications on a computer, the computer having a main memory, an asynchronous network interface, and a Direct Memory Access (DMA) engine having a request queue address common to all the applications, comprising the steps of:
-
- the application requesting the network interface to store a triggering value corresponding to a property of the data block to be transferred;
- an application requesting the DMA engine to transfer a block of data;
- the network interface storing a triggering value corresponding to a property of the data block to be transferred, along with an identification of the application which requested the DMA transfer;
- the network interface monitoring the data stream being sent by the applications and comparing at least part of the data stream with the triggering value(s) stored in its memory; and
- if any triggering value matches, indicating that that triggering value has matched.
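The method above can be modelled as follows. The sketch is illustrative (the class and attribute names are invented), and it assumes, purely for concreteness, that the triggering value chosen for each block is the address of the block's last word, so that a match indicates completion of that application's transfer.

```python
# Sketch of the DMA-completion method: applications share one DMA
# request queue; for each queued block the interface stores a
# triggering value together with the requesting application's id.

class DmaModel:
    def __init__(self):
        self.queue = []        # shared DMA request queue
        self.triggers = {}     # triggering value -> requesting app id
        self.completed = []    # (app id, matched value) indications

    def request_transfer(self, app_id, base, length):
        self.queue.append((base, length))
        self.triggers[base + length - 1] = app_id   # last word of block

    def monitor(self, address):
        """Called for each address in the outgoing data stream."""
        app = self.triggers.get(address)
        if app is not None:
            self.completed.append((app, address))   # indicate the match
```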
- A yet further aspect of the invention provides a method of transferring data from a sending application on a first computer to a receiving application on a second computer, each computer having a main memory, and a memory mapped network interface, the method comprising the steps of:
-
- creating a buffer in the main memory of the second computer for storing data being transferred as well as data identifying one or more pointer memory location(s);
- storing at said pointer memory location(s) at least one write pointer (WRP) and at least one read pointer (RDP) for indicating those areas of the buffer available for writes and for reads;
- in dependence on the values of the WRP(s) and RDP(s), the sender application writing to the buffer;
- updating the value of the WRP(s), after a write has taken place, to update the indication of the areas of the buffer available for reads and writes;
- in dependence on the values of WRP(s) and RDP(s), the receiver application reading from the buffer; and
- updating the value of the RDP(s), after a read has taken place, to update the indication of the areas of the buffer available for reads and writes.
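The circular-buffer scheme above can be sketched as a minimal software model. This is illustrative only: the class name and the one-slot-reserved full/empty convention are choices made here, not mandated by the text.

```python
# Minimal model of the circular-buffer scheme: a fixed-size buffer with
# a write pointer (WRP) and a read pointer (RDP) stored alongside it;
# the writer and reader each consult both pointers before acting.

class CircularBuffer:
    def __init__(self, size):
        self.buf = [None] * size
        self.size = size
        self.wrp = 0    # next slot to write
        self.rdp = 0    # next slot to read

    def space(self):
        # One slot is kept empty to distinguish "full" from "empty".
        return self.size - 1 - (self.wrp - self.rdp) % self.size

    def write(self, item):
        if self.space() == 0:
            return False                          # buffer full; sender waits
        self.buf[self.wrp] = item
        self.wrp = (self.wrp + 1) % self.size     # update WRP after the write
        return True

    def read(self):
        if self.rdp == self.wrp:
            return None                           # buffer empty; receiver waits
        item = self.buf[self.rdp]
        self.rdp = (self.rdp + 1) % self.size     # update RDP after the read
        return item
```

Because each side only ever updates its own pointer, writer and reader can run concurrently without further locking in this single-writer, single-reader arrangement.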
- Another aspect of the invention provides a computer network comprising two computers, the first computer running a sending application and the second computer running a receiving application, each computer having a main memory and a memory mapped network interface, the main memory of the second computer having: a buffer for storing data being transferred between computers as well as data identifying one or more pointer memory location(s);
-
- means for reading at least one write pointer (WRP) and at least one read pointer (RDP) stored at (a) pointer memory location(s), for indicating those areas of the buffer available for writes and those areas available for reads;
- the network interface of the second computer comprising:
- a memory mapping;
- means for reading data from the buffer in accordance with the contents of the WRP(s) and RDP(s); and
- means for updating the value of the RDP(s), after a read has taken place, to update the indication of the areas of the buffer available for reads and writes.
- A further aspect of the invention provides a method of sending a request from a client application on a first computer to a server application on a second computer, and sending a response from the server application to the client application, both computers having a main memory and a memory mapped network interface, the method comprising the steps of:
-
- (A) providing a buffer in the main memory of each computer;
- (B) the client application providing software stubs which produce a marshalled stream of data representing the request;
- (C) the client application sending the marshalled stream of data to the server's buffer;
- (D) the server application unmarshalling the stream of data by providing software stubs which convert the marshalled stream of data into a representation of the request in the server's main memory;
- (E) the server application processing the request and generating a response;
- (F) the server application providing software stubs which produce a marshalled stream of data representing the response;
- (G) the server application sending the marshalled stream of data to the client's buffer; and
- (H) the client application unmarshalling the received stream of data by providing software stubs which convert the received marshalled stream of data into a representation of the response in the client's main memory.
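Steps (A) to (H) can be condensed into a small software model. This is illustrative only: JSON stands in for the stub-generated marshalled stream, Python lists stand in for the client and server buffers, and the function names are invented.

```python
import json

# Model of the request/response exchange: marshalling flattens an
# in-memory request into a byte stream written into the peer's buffer.

def marshal(obj):
    """Client/server stubs: produce a marshalled stream of data."""
    return json.dumps(obj).encode()

def unmarshal(stream):
    """Convert a marshalled stream back into an in-memory representation."""
    return json.loads(stream.decode())

def rpc(request, server_fn, server_buffer, client_buffer):
    server_buffer.append(marshal(request))          # steps (B)-(C)
    req = unmarshal(server_buffer.pop())            # step (D)
    response = server_fn(req)                       # step (E)
    client_buffer.append(marshal(response))         # steps (F)-(G)
    return unmarshal(client_buffer.pop())           # step (H)
```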
- Another aspect of the invention provides a method of arranging data for transfer as a data burst over a computer network comprising the steps of: providing a header comprising the destination address of a certain data word in the data burst, and a signal at the beginning or end of the data burst for indicating the start or end of the burst, the destination addresses of other words in the data burst being inferable from the address in the header.
- A further aspect of the invention provides a method of processing a data burst received over a computer network comprising the steps of:
-
- reading a reference address from the header of the data burst, and
- calculating the addresses of each data word in the burst from the position of that data word in the burst in relation to the position of the data word to which the address in the header corresponds, and from the reference address read from the header.
- Another aspect of the invention provides a method of interrupting transfer of a data burst over a computer network comprising the steps of:
-
- halting transfer of a portion of the data burst which has not yet been transferred, thereby splitting the data burst into two burst sections, one which is transferred, and one waiting to be transferred.
- A further aspect of the invention provides a method of restarting the transfer of a data burst, after the transfer of that data burst has been interrupted, the method comprising the steps of:
-
- calculating a new reference address for the untransferred data burst section from the address contained in the header of the whole data burst, and from the position in the whole data burst of the first data word of the untransferred data burst section in relation to the position of the data word to which the address in the header corresponds;
- providing a new header for the untransferred data burst section comprising the new reference address; and
- transmitting the new header along with the untransferred data burst section.
- The first aspect of the present invention addresses the synchronisation problem for memory mapped network interfaces. The present invention uses a network interface containing snooping hardware which can be programmed to contain triggering values comprising either addresses, address ranges, or other data which are to be matched. These data are termed ‘Tripwires’. Once programmed, the interface monitors the data stream, including address data, passing through the interface for addresses and data which match the Tripwires which have been set. On a match, the snooping hardware can generate interrupts or increment event counters, or perform some other application-specified action. This snooping hardware is preferably based upon Content Addressable Memory (CAM). References herein to the “data stream” refer to the stream of data words being transferred and to the address data accompanying them.
- The invention thus provides in-band synchronisation by using synchronisation primitives which are programmable by user level applications, while still delivering high bandwidth and low latency. The programming of the synchronisation primitives can be made by the sending and receiving applications independently of each other and no synchronisation information is required to traverse the network.
- A number of different interfaces between the network interface and an application can be supported. These interfaces include VIA and the forthcoming Next Generation Input/Output (NGIO) standard. An interface can be chosen to best match an application's requirements, and changed as its requirements change. The network interface of the present invention can support a number of such interfaces simultaneously.
- The Tripwire facility supports the monitoring of outgoing as well as incoming data streams. These Tripwires can be used to inform a sending application that its DMA send operations have completed or are about to complete.
- Memory-Mapped network interfaces also have the potential to be used for communication between hardware entities. This is because memory mapped network interfaces are able to pass arbitrary memory bus cycles over the network. As shown in
FIG. 4, it is possible to set up a memory aperture 254, in the NIC 252 of Computer 250, which is directly mapped via NIC 259 onto an address region 257 of the I/O bus 253 of passive backplane 251. - Using existing memory mapped interfaces, such as DEC Memory Channel or Dolphin SCI, an application running on
Computer 250, which requires use of the hardware device 255, would require a (usually software) process to interface between itself and the Network Interface Card (NIC) 252. This is because the NIC 252 would not appear at the hardware level in computer 250 as an instance of the remote hardware device 255, but instead as a network card which has a memory aperture 254 mapped onto the hardware device. - In a further aspect of the invention, we have appreciated that the interface of the present invention can be programmed to present the same hardware interface as the
remote hardware device 255, and so appear at the hardware level in computer 250 to be an instance of the remote hardware device. If the network card 252 were an interface according to the present invention, so programmed, the remote hardware device 255 would appear as physically located within computer 250, in a manner transparent to all software. The hardware device 255 is able to be physically located either at the remote end of a dedicated link or across a general network. The invention will support both general networking activity and remote hardware communication simultaneously on a single network card. - Another aspect of the invention relates to a link-level communication protocol which can be used to support cut-through routing and forwarding. There is no need for an entire packet to arrive at a NIC, or any other network entity supporting the communication protocol, before data transmission can be started on an outgoing link. The invention also allows large bursts of data to be handled effectively without the need for a small physical network packet size such as that employed by an ATM network, it being possible to dynamically stop and restart a burst and regenerate all address information using hardware.
- A preferred embodiment of the various aspects of the invention will now be described with reference to the drawings in which:
-
FIG. 1 shows two computers connected by a traditional network; -
FIG. 2 shows two computers connected by a traditional memory-mapped network; -
FIG. 3 shows a traditional SCI-like network; -
FIG. 4 shows a traditional memory-mapped network between hardware entities; -
FIG. 5 shows two or more computers connected by an embodiment of the present invention, using Network Interface Cards (NICs); -
FIG. 6 shows in detail the various functional blocks comprising the NICs of FIG. 5; -
FIG. 7 shows the functional blocks of the NIC deployed within a Field Programmable Gate Array (FPGA); -
FIGS. 8a to 8e show the communication protocol used in one embodiment of the invention; -
FIG. 9 shows schematically hardware communication according to an embodiment of the invention; -
FIG. 10 shows schematically a circular buffer abstraction according to one embodiment of the invention; -
FIG. 11 shows schematically the system support for discrete message communication using circular buffers; -
FIG. 12 shows a client-server interaction according to an embodiment of the invention; -
FIG. 13 shows how the system of the present invention can support VIA; -
FIG. 14 shows outgoing stream synchronisation according to an embodiment of the present invention; -
FIG. 15 shows a client-server interaction according to an embodiment of the invention using a hardware data source; -
FIG. 16 shows an apparatus for synchronising an end-point application and constituting an embodiment of the invention; -
FIG. 17 shows another apparatus for synchronising an end-point application and constituting an embodiment of the invention; - FIGS. 18 to 23 show examples of actions which may be performed by the apparatuses of
FIGS. 16 and 17 ; -
FIG. 24 illustrates the format of a data burst with implied addresses; -
FIG. 25 illustrates an interruption in forwarding a burst of the type shown in FIG. 24; -
FIG. 26 illustrates forwarding of the rest of the burst; -
FIG. 27 illustrates coalescing of two data bursts; -
FIG. 28 illustrates “transparent” communication over a network between an application running on a computer and remote hardware; and -
FIG. 29 illustrates applications of various tripwires at different locations in a computer. - Referring to
FIG. 5, computers 1, 2 are connected via NICs 11, 56 and an optional network switch 4. - Each
computer 1, 2 comprises a central processing unit 5, 57, memory 6, 60, local cache memory, a system controller 8, 58 and a video adapter 10; the system controller 8, 58 connects the microprocessor 5, 57 and memory 6, 60 to an I/O bus 9, 59. Also attached to I/O bus 9, 59 is one or more network interfaces, in the form of NICs 11, 56. In computers 1, 2 the I/O bus 9, 59 also connects the video card 10 and the NIC 11, 56. - Referring to
FIG. 6, each NIC comprises a receiver 15 for receiving a data stream, a comparator for comparing part of the data stream with stored triggering values, and a memory 23 for storing information which will identify matched triggering values. More specifically, in the preferred embodiment each NIC 11, 56 comprises a PCI to Local Bus bridge 12, a control Field Programmable Gate Array (FPGA) 13, a transmit (Tx) serialiser 14, a fibre-optic transceiver 15, a receive (Rx) de-serialiser 16, an address multiplexer and latch 17, a CAM array 18, 19, 20, boot ROMs 21, 22, static RAM 23, FLASH ROM 24, and a clock generator and buffer 25, 26. FIG. 6 also shows examples of known chips which could be used for each component; for example, boot ROM 21 could be an Altera EPC1 chip. - Referring to
FIG. 7, FPGA 13 comprises functional blocks 27-62. The working of the blocks will be explained by reference to typical data flows. - Operation of
NIC 11 begins by computer 1 being started or reset. This operation causes the contents of boot ROM 21 to be loaded into FPGA 13, thereby programming the FPGA and, in turn, causing its state machines to be reset. -
Clock generator 25 begins running and provides a stable clock for the Tx serialiser 14. Clock buffer/divider 26 provides suitable clocks for the rest of the system. Serialiser 14 and de-serialiser 16 are reset and remain in a reset condition until communication with another node is established and a satisfactory receive clock is regenerated by de-serialiser 16. -
PCI bridge 12 is also reset and loaded with the contents of boot ROM 22. Bridge 12 can convert memory access cycles into I/O cycles (and re-convert them at the target end) and can support legacy memory apertures. Since the rest of the NIC supports byte-enabled (byte-wide as well as word-wide) transfers, ROM 22 can be loaded with any PCI configuration space information, and the NIC can thus emulate any desired PCI card transparently to microprocessor 5. - Immediately after reset, FLASH
control state machine 47 runs and executes a simple microcode sequence stored in FLASH memory 24. Typically this allows the configuration space of another card, such as 69 in FIG. 9, to be read, and additional information to be programmed into bridge 12. Programming of the FLASH memory is also handled by state machine 47 in conjunction with bridge 12. - Data transfer could in principle commence at this point, but
arbiter 40 is barred from granting bus access to Master state machine 37 until a status bit has been set in one of the internal registers 49. This allows software to set up the Tripwires during the initialisation stage. - Writes from
computer 1 to computer 2 take place in the following manner. Microprocessor 5 writes one or more words to an address location defined by system controller 8 to lie within NIC 11's address space. PCI to local bus bridge 12 captures these writes and turns them into local bus protocol (discussed elsewhere in this document). If the writes are within the portion of the address space determined by register decode 48 to be within the local control aperture of the NIC, then the writes take place locally, to the appropriate register, Content Addressable Memory (CAM), Static RAM (SRAM) or FLASH memory area. Otherwise target state machine 28 claims the cycles and forwards them to protocol encoder 29. - At the protocol encoder, byte-enable, parity data and control information are added first to an address and then to each word to be transferred in a burst, with a control bit marking the beginning of the burst and possibly also a control bit marking the end of the burst. The control bit marking the beginning of the burst indicates that address data forming the header of the data burst comprises the first “data” word of the burst. Xon/Xoff-style management bits from
block 31 are also added here. This protocol, specific to the serialiser 14 and de-serialiser 16, is also discussed elsewhere in this document. - Data is fed on from
encoder 29 to output multiplexer 30, reducing the pin count for FPGA 13 and matching the bus width provided by serialiser 14. Serialiser 14 converts a 23-bit parallel data stream at 62 MHz to a 1-bit data stream at approximately 1.5 Gbit/s; this is converted to an optical signal by transceiver 15 and carried over a fibre-optic link to a corresponding transceiver 15 in NIC 56, part of computer 2. It should be noted that other physical layers and protocols are possible and do not limit the scope of the invention. - In
NIC 56, the reconstructed digital signal is clock-recovered and de-serialised to 62 MHz by block 16. Block 32 expands the recovered 23 bits to 46 bits, reversing the action of block 30. Protocol decoder 33 checks that the incoming words have suitable sequences of control bits. If so, it passes address/data streams into command FIFO 34. If the streams have errors, they are passed into error FIFO 35; master state machine 37 is stopped; and an interrupt is raised on microprocessor 57 by block 53. Software is then used to decipher the incoming stream until a correct sequence is found, whereupon state machine 37 is restarted. - When a stream arrives at the head of
FIFO 34, master state machine 37 requests access to local bus 55 from arbiter 40. When granted, it passes first the address, then the following data, onto local bus 55. Bridge 12 reacts to this address/data stream by requesting access to I/O bus 59 from system controller 58. When granted, it writes the required data into memory 60. - Reads of
computer 2's memory 60 initiated by computer 1 take place in a similar manner. However, state machine 28, after sending the address word, sends no other words; rather, it waits for return data. Data is returned because master state machine 37 in NIC 56 reacts to the arrival of a read address by requesting a read of memory 60 via I/O bus 59 and the corresponding local bus bridge 12. This data is returned as if it were write data flowing from NIC 56 to NIC 11, but without an initial address. Protocol decoder 33 reacts to this addressless data by routing it to read return FIFO 36, whereupon state machine 28 is released from its wait and microprocessor 5's read cycle is allowed to complete. Should the address region be marked in NIC 56's bridge 12 as read-prefetchable, then a number of words are returned; if state machine 28 continues requesting data as if from a local bus burst read, then subsequent words are fulfilled directly from read return FIFO 36. - Should
NIC 56 need to raise an interrupt on microprocessor 5, remote interrupt generator 54 causes state machine 28 to send a word from NIC 56 to a mailbox register in NIC 11's bridge 12. This will have been configured by software to raise an interrupt on microprocessor 5. - Inevitably, since the
clocks 25 in NICs 11, 56 run at slightly different frequencies, flow control is required: if the amount of data held in command FIFO 34 exceeds a pre-programmed threshold value, an Xoff bit is sent to the corresponding protocol encoder 29. This bit causes the encoder to request that the sending state machine 28 stops, if necessary in mid burst. Logic in bridge 12 takes care of restarting the data burst when the corresponding Xon is received some time later. This logic calculates a new reference address for the unsent part of the data burst, using the reference address in the header of the whole data burst and a count of the number of data words sent before the transfer was stopped. As, in this embodiment, successive data words in a burst have successively incrementing destination addresses, the destination address of the first data word in the unsent part of the data burst can easily be calculated. - It is also possible that data may be read out of
FIFO 34 faster than it is written in. In the event of this happening, master state machine 37 uses pipeline delay 38 to anticipate the draining of FIFO 34 and to terminate the data burst on local bus 55. It then uses the CAM address latch/counter 41 to restart the burst when more data arrives in FIFO 34. - ‘Tripwires’ are triggering values, such as addresses, address ranges or other data, that are programmed into the NIC to be matched. Preferably, the triggering values used as tripwires are addresses. To meet timing requirements during address match cycles (as data flows through the NIC), three CAM devices are pipelined to reduce the match cycle time from around 70 nanoseconds to less than 30 nanoseconds.
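The burst-restart calculation described above, which derives the destination address of the first unsent word from the burst's header address and a count of words already sent, can be sketched as follows. The 4-byte word stride is an assumption for illustration; the text states only that successive words have successively incrementing addresses.

```python
def resume_address(header_address, words_sent, word_bytes=4):
    """Destination address for the first word of the unsent remainder
    of a burst stopped mid-transfer by Xoff. Only the burst header
    address and a count of words already sent are needed, because
    successive words in a burst have successively incrementing
    destination addresses."""
    return header_address + words_sent * word_bytes
```

Because this address is regenerated in hardware, the remainder of a stopped burst can travel later as an ordinary burst with a fresh address word.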
- The programming of Tripwires takes place by
microprocessor 5 writing to PCI bridge 12 via system controller 8 and I/O bus 9. For the purpose of writing the Tripwire data, the CAM array 18, 19, 20 appears as ordinary memory to microprocessor 5. For write cycles, this is done by CAM controller 43 generating suitable control signals to enable all three CAMs 18, 19, 20; address latch 44 passes data to the CAMs unmodified. Address multiplexer 41 is arranged to pass local bus data out on the CAM address bus, where it is latched, at the moment addresses are valid on the local bus, by latch 17. For read cycles, the process is similar, except that only CAM 18 is arranged to be enabled for read access, and address latch/counter 44 has its data flow direction reversed. So far as microprocessor 5 is concerned, it sees the expected data returned, since the memory arrays in the CAMs hold identical contents. - Owing to the nature of the address/data bus being comprised of bursts of data, according to the preferred local protocol, the actual data stream cannot be used for monitoring address changes. A burst starts with the address of the first data word followed by an arbitrary number of data words. The address of the data words is implicit and increments from the start address. For normal inbound or outbound data transfer operations, address latch/counter 44 is loaded with the address of each new data burst, and incremented each time a valid data item is presented on internal local bus 55. CAM control state machine 43 is arranged to enable each CAM in turn, using the address held in counter 44. This sequential enabling of the CAMs, combined with their latching properties, permits the access time for a comparison operation to be reduced by a factor of three (there being three CAMs in this implementation, other implementations being possible), from 70 ns to less than 30 ns. The CAM op-code for each comparison operation is output from one of the internal registers 49 via address multiplexer 17 at the end of a read/write cycle, freeing the CAM address bus to return the index of matched Tripwires after comparison operations.
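A software model of this comparison flow (load the counter with the burst's start address, then compare each implied address against the stored tripwires) might look like the sketch below. The (value, mask) entry form anticipates the masking behaviour described in the next paragraph; the data layout is illustrative, not the hardware's.

```python
def matching_tripwires(tripwires, addr):
    """Return the indices (CAM offsets) of tripwires hit by one address.
    tripwires is a list of (value, mask) pairs; bits cleared in the
    mask are ignored, so one entry can cover a range of addresses."""
    return [i for i, (value, mask) in enumerate(tripwires)
            if (addr & mask) == (value & mask)]

def scan_burst(tripwires, start_addr, n_words, word_bytes=4):
    """Compare every implied address of a burst, mimicking the address
    latch/counter that increments once per valid data word."""
    hits = []
    for k in range(n_words):
        addr = start_addr + k * word_bytes
        hits.extend((i, addr) for i in matching_tripwires(tripwires, addr))
    return hits
```

A burst whose implied addresses sweep across a monitored location is thus detected without the data stream itself ever carrying those addresses.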
- When the CAM array signals a match found (i.e. a Tripwire has been hit), it returns the address of the Tripwire (its offset in the CAM array) via the CAM address bus to the
tripwire FIFO 42. Two courses of action are then possible, depending on how internal registers 49 have been programmed. - One course of action is for
state machine 45 to request that an interrupt be generated by management logic 53. In this case, an interrupt is received by microprocessor 5, and software is run which services the interrupt. Normally this would involve microprocessor 5 reading the Tripwire address from FIFO 42, matching the address with a device-driver table, signalling the appropriate process, marking it runnable and rescheduling. - An alternative course of action is for
state machine 45 to cause records to be read from SRAM 23 using state machine 46. A record comprises a number of data words: an address and two data words. These words are programmed by the software just before the Tripwire information is stored in the CAM. When a Tripwire match is made, the address in latch 44 is left-shifted by two to form an address index for SRAM 23. The first word is then read by state machine 46 and placed on local bus 55 as an address in memory 6. A fetch-and-increment operation is then performed by state machine 45, using the second and third words of the SRAM record either to first AND and then OR, or else to INCREMENT, the data referred to in memory 6. A bit in the first word read by the state machine indicates which operation it should take. In the case of an INCREMENT, the first data word also indicates the amount to increment by. - These alternatives enable the implementation of such primitives as an event counter incremented on tripwire matches, or the setting of a system reschedule flag. This mechanism enables multiple applications to process data without the requirement for hardware interrupts to be generated after receipt of each network packet.
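The record-driven fetch-and-modify step can be modelled as below. The record layout, with an op-select bit assumed here to be bit 31 of the address word, is an illustrative guess; the text says only that "a bit in the first word" selects between INCREMENT and AND-then-OR.

```python
INCREMENT = 1 << 31   # assumed position of the op-select bit

def apply_record(memory, record):
    """Apply one three-word SRAM record (address word plus two data
    words) to a word of host memory, as state machines 45/46 would:
    either increment it by data1, or AND it with data1 then OR in
    data2."""
    addr_word, data1, data2 = record
    addr = addr_word & ~INCREMENT
    if addr_word & INCREMENT:
        memory[addr] += data1                          # increment by data1
    else:
        memory[addr] = (memory[addr] & data1) | data2  # AND, then OR
    return memory[addr]
```

An event counter is then just a record with the INCREMENT bit set, and a reschedule flag a record whose AND/OR words set a single bit.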
- In the case of the interrupt followed by a Tripwire FIFO read, the device driver is presented with a list of endpoints which require attention. This list improves system performance, as the device driver is not required to scan a large number of memory locations looking for such endpoints.
- The device driver is not required to know where the memory locations which have been used for synchronisation are, nor to have any knowledge of, or take part in, the application-level communication protocol. All communication protocol processing can be performed by the application; different applications are free to use differing protocols for their own purposes, and one device driver instance may support a number of such applications.
- There is also a problem connected with programming a DMA engine that is addressed by an aspect of the invention. Conventional access to DMA engines is moderated either by a single system device driver, which requires (slow) context switches to access, or by virtualisation of the registers by system page fault, also requiring (multiple) context switches. The problem is that it is not safe for a user level application to directly modify the DMA engine registers or a linked list DMA queue, because this must be done atomically. In most systems, user applications cannot atomically update the DMA queue as they can be descheduled at any moment.
- The invention addresses this problem by using
hardware FIFO 50 to queue DMA requests from applications. Each application wanting to request DMA transfers sets up a descriptor, containing the start address and the length of the data to be transferred, in its local memory and posts the address of the descriptor to the DMA queue, whose address is common to all applications. This can be arranged by mapping a single page containing the physical address of the DMA queue as a write-only page into the address space of all user applications as they are initialised. - As soon as DMA
work queue FIFO 50 is not empty, local bus 55 is not busy and the DMA engine in bridge 12 is also not busy, Master/Target/DMA arbiter 40 grants DMA state machine 51 access to local bus 55. Using the address posted by the application in FIFO 50, state machine 51 then uses bridge 12 to read the descriptor in memory 6 into the descriptor block 52. State machine 51 then posts the start address and length information held in block 52 into the DMA engine in bridge 12. - When the DMA process is complete,
bridge 12 notifies state machine 51 of the completion. The state machine then uses data from descriptor block 52 to write back a completion descriptor in memory 6. Optionally, an interrupt can also be raised on microprocessor 5, although a Tripwire may already have been crossed to provide this notification early, in order to minimise the delay in bringing the relevant application back onto microprocessor 5's run queue. This is shown later in this document. - Should queue 50 be full, then
state machine 51 writes a failure code back into the completion field of the descriptor that the application has just attempted to place on the queue. Thus the application does not need to read the status of the NIC in order to post a DMA request safely. All applications can safely share the same hardware posting address, and no time-consuming virtualisation or system device driver process is necessary. - Should any operation take longer than a preset number of PCI cycles,
timeout logic 61 is activated to terminate the current cycle and return an interrupt through block 53. - Another aspect of the invention relates to the protocol which is preferably used by the NIC. This protocol uses an address and some additional bits in its header. This allows the transfer of variable-length packets with simple routines for Segmentation and Reassembly (SAR) that are transparent to the sending or receiving codes. This is also done without the need to have an entire packet arrive before segmentation, reassembly or forwarding can occur, allowing the data to be put out on the ongoing link immediately. This enables data to traverse many links without significantly adding to the overall latency. The packets may be fragmented and coalesced on each link, for example between the NIC and a host I/O bus bridge, or between the NIC and another NIC. We term this cut-through routing and forwarding. In a network carrying a large number of streams, cut-through forwarding and routing enables small packets to pass through the network without any delays caused by large packets of other streams. While other network physical layers such as ATM also provide the ability to perform cut-through forwarding and routing, they do so at the cost of requiring all packets to be of a fixed small size.
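The DMA posting scheme described above (a descriptor held in application memory, its address pushed to a shared hardware FIFO, and a failure code written straight back into the descriptor's completion field when the FIFO is full) can be modelled in software as follows. The queue depth and the code values are assumptions for illustration only.

```python
from collections import deque

QUEUE_DEPTH = 4            # assumed depth of hardware FIFO 50
FAIL_CODE = 0xFFFFFFFF     # hypothetical "queue full" completion code
DONE_CODE = 0x1            # hypothetical "transfer done" completion code

class DmaQueueModel:
    def __init__(self, memory):
        self.memory = memory      # descriptor address -> descriptor dict
        self.fifo = deque()       # stands in for hardware FIFO 50

    def post(self, desc_addr):
        """Application side: post a descriptor address. On a full
        queue the failure code is written back into the descriptor,
        so the application never needs to read NIC status."""
        if len(self.fifo) >= QUEUE_DEPTH:
            self.memory[desc_addr]['completion'] = FAIL_CODE
            return False
        self.fifo.append(desc_addr)
        return True

    def run_one(self):
        """NIC side: take one request, perform the transfer (elided
        here), then write a completion descriptor back to memory."""
        desc = self.memory[self.fifo.popleft()]
        # ... DMA of desc['start'] / desc['length'] would happen here ...
        desc['completion'] = DONE_CODE
```

Because every application posts to the same address and learns the outcome from its own descriptor, no per-application virtualisation of the DMA registers is needed.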
-
FIG. 8 shows an example of how this protocol has been implemented using the 23-bit data transfer capability of HP's G-LINK chipset (serialiser 14 and de-serialiser 16). PCI to local bus bridge 12 provides a bus of 32 address/data bits, 4 parity bits and 4 byte-enable bits. It also provides an address valid signal (ADS), which signifies that a burst is beginning and that the address is present on the address/data bus. The burst continues until a burst last signal (BLAST) is set active, signifying the end of a burst. It provides a read/write signal, and some other control signals that need not be transferred to a remote computer. FIG. 8A shows how this protocol is used to transfer a burst 63 of n data words. The data traffic closely mirrors that used on the PCI bus, but uses fewer signals. - The destination address always precedes each data burst. Therefore, the bursts can be of variable size, and can be split or coalesced, by generating fresh address words or by removing address words where applicable. In the preferred embodiment, sequential data words are destined for sequentially incrementing addresses. However, data words having sequentially decrementing addresses might also be used, or any other pattern of addresses may be used, so long as it remains easy to calculate. So far as the endpoints are concerned, exactly the same data is transferred to exactly the same locations. The benefits are that packets can be of any size at all, reducing the overhead of sending an address; packets can be split (and addresses regenerated to continue) by network switches to provide quality of service; and receivers need not wait for a complete packet to arrive to begin decoding work.
- Also, the destination address given in the header may be for the ‘nth’ data word in the burst, rather than for the first, although using the first data word address is preferred.
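Because each burst is self-describing (an address word followed by words destined for sequentially incrementing addresses), splitting and coalescing reduce to regenerating or removing address words. A minimal sketch, assuming a 4-byte word stride:

```python
def split_burst(burst, n, word_bytes=4):
    """Split an (address, words) burst after n words, regenerating a
    fresh address word for the continuation."""
    addr, words = burst
    return (addr, words[:n]), (addr + n * word_bytes, words[n:])

def coalesce_bursts(first, second, word_bytes=4):
    """Merge two bursts by removing the second address word; valid
    only when the second burst continues exactly where the first
    one ends."""
    addr1, w1 = first
    addr2, w2 = second
    if addr2 != addr1 + len(w1) * word_bytes:
        raise ValueError("bursts are not address-contiguous")
    return (addr1, w1 + w2)
```

Whichever path the data takes, the endpoints see the same words delivered to the same locations.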
-
FIG. 8b shows how the protocol of FIG. 8a is transcribed onto the G-LINK physical layer. The first word in any packet contains an 18-bit network address. Each word of 63 is split into two words in 64; the lower 16 bits carry high and low addresses or data, corresponding to the address/data bus; the next 4 bits carry either byte enables or parity data. During the address phase, the byte-enable field (only 2 bits of which are available, owing to the limitations of G-LINK) is used to carry a 2-bit code indicating read, write or escape packet use. Escape packets are normally used to carry diagnostic or error information between nodes, or as a means of carrying the Xon/Xoff-style protocol when no other data is in transit. The G-LINK nCAV signal corresponds to the ADS signal of 63; nDAV is active throughout the rest of the burst, and the combination of nDAV inactive and nCAV inactive signals the end of a burst, while nCAV active indicates the immediate beginning of another burst.
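The word-splitting step can be illustrated as below. The exact placement of the 4-bit byte-enable/parity field within each G-LINK word is an assumption made for the sketch, not taken from the text.

```python
def to_glink_words(word32, ctrl_hi, ctrl_lo):
    """Split one 32-bit address/data word into two G-LINK payloads of
    16 data bits plus a 4-bit byte-enable/parity field each."""
    hi = ((ctrl_hi & 0xF) << 16) | ((word32 >> 16) & 0xFFFF)
    lo = ((ctrl_lo & 0xF) << 16) | (word32 & 0xFFFF)
    return hi, lo

def from_glink_words(hi, lo):
    """Reassemble the 32-bit word and the two 4-bit control fields,
    reversing to_glink_words."""
    word32 = ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)
    return word32, (hi >> 16) & 0xF, (lo >> 16) & 0xF
```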
FIG. 8c shows a read data burst 65; this is the same as a write burst 64, except that data bit 16 is set to 0. On the outbound request, the data field contains the network address for the read data to be returned to. When the data for a read returns 66, it travels like a write burst, but is signified by there being only one nCAV active (signifying the network address) along with the first word. An additional bit, denoted FLAG in FIG. 8, is used to carry Xon/Xoff-style information while a burst is in progress. It is not necessary, therefore, to break up a burst in order to send an Escape packet containing the Xon/Xoff information. The FLAG bit also serves as an additional end-of-packet indicator. - In
FIG. 8c, 67 and 68 show an escape packet; after the network address, this travels with (68) or without (67) a payload, as defined by data bit 16 in the first word of the burst. - In a full networked implementation, an extra network address word may precede each of these packets. Other physical layer or network layer solutions are possible, without compromise to this patent application, including fibre channel parts (using 8B/10B encoding) and conventional networks such as ATM or even Ethernet. The physical layer only needs to provide some means of distinguishing data from non-data, and the start of one burst from the end of a previous one.
- A further aspect of the invention relates to the distribution of hardware around a network. One use of a network is to enable one computer to access a hardware device whose location is physically distant. As an example, consider the situation shown in
FIG. 9, where it is required to display the images viewed by the camera 70 (connected to a frame-grabber card 69) on the monitor which is, in turn, connected to computer 72. The NIC 73 is programmed from Boot ROM 22 to present the same hardware interface as that of the frame-grabber card 69. Computer 72 can be running the standard application program as provided by a third-party vendor, which is unaware that the system has been distributed over a network. All control reads and writes to the frame-grabber 69 are transparently forwarded by the NIC 73, and there is no requirement for an extra process to be placed in the data path to interface between the application running on CPU 74 and the NIC 73. Passive PCI I/O back-plane 71 requires simply a PCI bus clock and arbiter, i.e. no processor, memory or cache. These functions can be implemented at very low cost. - The I/O buses are conformant to PCI Local Bus Specification 2.1. This PCI standard supports the concept of a bridge between two PCI buses. It is possible to program the
NIC 73 to present the same hardware interface as a PCI bridge between Computer 72 and passive back-plane 71. Such programming would enable a plurality of hardware devices to be connected to back-plane 71 and controlled by computer 72 without the requirement for additional interfacing software. Again, it should be clear that the invention will support both general networking activity and this remote hardware communication simultaneously, using a single network card. - A circular buffer abstraction will now be discussed as an example of the use of the NIC by an application. The circular buffer abstraction is designed for applications which require a producer/consumer software stream abstraction, with the properties of low latency and high-bandwidth data transmission. It also has the properties of responsive flow control and low buffer space requirements.
FIG. 10 shows a system comprising two software processes, applications 102 and 103, running on different computers 100, 101. Application 102 is producing some data. Application 103 is awaiting the production of data and then consuming it. The circular buffer 107 is composed of a region of memory on Computer 101 which holds the data and two memory locations: RDP 106 and WRP 109. WRP 109 contains the pointer to the next byte of data to be written into the buffer, while RDP 106 contains the pointer to the last byte of data to be read from the buffer. When the circular buffer is empty, WRP is equal to RDP+1, modulo wrap-around of the buffer. Similarly, the buffer is full when WRP is equal to RDP−1. There are also private values of WRP 108 and RDP 111 in the caches of computer 100 and computer 101 respectively. Each computer 100, 101 can access these locations via the memory-mapped network. - When the
circular buffer 107 is created, the producer sets up a Tripwire 110, which will match on a write to the RDP pointer 106, and the consumer sets up a Tripwire 113, which will match on a write to the WRP pointer 109. - If
consumer application 103 attempts to read data from the circular buffer 107, it first checks to see if the circular buffer is empty. If so, application 103 must wait until the buffer is not empty, determined when WRP 109 has been seen to be incremented. During this waiting period, application 103 may either block, requesting an operating system reschedule, or poll the WRP 109 pointer. - If
producer application 102 decides to write to the circular buffer 107, it may do so while the buffer is not full. After writing some data, application 102 updates its local cached value of WRP 108, and writes the updated value to the memory location 109 in computer 101. When the value of WRP 109 is updated, the Tripwire 113 will match, as has been previously described. - If
consumer application 103 is not running on CPU 118 when some data is written into the buffer and Tripwire 113 matches, NIC 115 will raise a hardware interrupt 114. This interrupt causes CPU 118 to run device driver software contained within operating system 118. The device driver will service the interrupt by reading the tripwire FIFO 42 on NIC 115 and determine, from the value read, the system identifier for application 103. The device driver can then request that operating system 118 reschedule application 103. The device driver would then indicate that the tripwire 113 should not generate a hardware interrupt until application 103 has next been descheduled and subsequently another Tripwire match has occurred. - Note that the system identifier for each running application is loaded into
internal registers 49 each time the operating system reschedules. This enables the NIC to determine the currently running application, and so make the decision whether or not to raise a hardware interrupt for a particular application given a Tripwire match. - Hence, once
consumer application 103 is again running on the processor, further writes to the circular buffer 107 by application 102 may occur without triggering further hardware interrupts. Application 103 now reads data from the circular buffer 107. It can read data until the buffer becomes empty (detected by comparing the values of RDP and WRP 111, 109). After reading, application 103 will update its local value of RDP 111 and finally writes the updated value of RDP to memory location 106 over the network. - If
producer application 102 had been blocked on a full buffer, this update of RDP 106 would generate a Tripwire match 110, resulting in application 102 being unblocked and able to write more data into the buffer 107. - In normal operation,
application 102 and application 103 could be operating on different parts of the circular buffer simultaneously, without the need for mutual exclusion mechanisms or Tripwires. - The most important properties of the data structure are that the producer and the consumer are able to process data without hindrance from each other, and that flow control is explicit within the software abstraction. Data is streamed through the system. The consumer can remove data from the buffer at the same time as the producer is adding more data. There is no danger of buffer over-run, since a producer will never transmit more data than can fit in the buffer.
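A software model of the buffer, using exactly the pointer convention stated above (WRP is the next byte to write, RDP the last byte read; empty when WRP equals RDP+1 and full when WRP equals RDP−1, both modulo the buffer size):

```python
class CircularBuffer:
    def __init__(self, size):
        self.size = size
        self.buf = [None] * size
        self.rdp = 0            # last byte read
        self.wrp = 1            # next byte to write (buffer starts empty)

    def is_empty(self):
        return self.wrp == (self.rdp + 1) % self.size

    def is_full(self):
        return self.wrp == (self.rdp - 1) % self.size

    def write(self, byte):      # producer: advances WRP only
        assert not self.is_full()
        self.buf[self.wrp] = byte
        self.wrp = (self.wrp + 1) % self.size

    def read(self):             # consumer: advances RDP only
        assert not self.is_empty()
        self.rdp = (self.rdp + 1) % self.size
        return self.buf[self.rdp]
```

Note that with this empty/full convention a buffer of size N holds at most N−2 bytes, which is the price of distinguishing the two states from the pointers alone.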
- The producer only ever
increments WRP, and the consumer only ever increments RDP. Inconsistencies in the values of WRP and RDP seen by either the producer or consumer either cause the consumer not to process some valid data (when RDP 106 is inconsistent with 111), or the producer not to write some more data (when WRP 109 is inconsistent with 108), until the inconsistency has been resolved. Neither of these occurrences causes incorrect operation or performance degradation, so long as they are transient. - It should also be noted that on most computer architectures, including the Alpha AXP and Intel Pentium ranges,
computer 100 can store the value of the RDP 106 pointer in its processor cache, since the producer application 102 only reads the pointer 106. Any remote writes to the memory location of the RDP pointer 106 will automatically invalidate the copy in the cache, causing the new value to be fetched from memory. This process is automatically carried out and managed by the system controller 8. In addition, since computer 101 keeps a private copy of the RDP pointer 111 in its own cache, there is no need for any remote reads of RDP pointer values during operation of the circular buffer. Similar observations can also be made for the WRP pointer 109 in the memory of computer 101 and the WRP pointer 108 in the cache of computer 100. This feature of the buffer abstraction ensures that high performance and low latency are maintained. Responsive application-level flow control is possible because the cached pointer values can be exposed to the user-level applications. - A further enhancement to the above arrangement can be used to provide support for applications which would like to exchange data in discrete units. As shown in
FIG. 11, and in addition to the system described in FIG. 10, the system maintains a second circular buffer 127 of updated WRP 129 values corresponding to buffer 125. This second buffer 127 is used to indicate to a consumer how much data to consume in order that data be consumed in the same discrete units as it was produced. Note that circular buffer 125 contains the data to be exchanged between the applications. - The producer,
application 122, writes data into buffer 125, updating the pointer WRP 129, as previously described. Once data has been placed in buffer 125, application 122 then writes the new value of the WRP pointer 129 into buffer 127. At the same time it also manipulates the pointer WRP 131. If either of these write operations does not complete, then the application level write operation is blocked until some data is read by the consumer application 123. The Tripwire mechanism can be used as previously described, for either application to block on either a full or empty buffer pair.
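The two-buffer scheme just described can be modelled in a few lines of Python (a sketch under simplifying assumptions: unbounded lists stand in for the circular buffers, and the function names are hypothetical). The producer publishes the new WRP value into a second buffer after each record, and the consumer uses those published values as record boundaries:

```python
def produce_record(data_buf, wrp_buf, record):
    # write the payload, then publish the new WRP as a message boundary
    data_buf.extend(record)
    wrp_buf.append(len(data_buf))

def consume_record(data_buf, wrp_buf, rdp):
    # read up to the next published WRP, so records are consumed in the
    # same discrete units as they were produced
    if not wrp_buf:
        return None, rdp
    end = wrp_buf.pop(0)
    return bytes(data_buf[rdp:end]), end
```

The data buffer itself carries no framing; the boundaries live entirely in the second buffer of WRP values, exactly as in the arrangement above.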
- The consumer application 123 is able to read from both buffers 125 and 127, updating its RDP pointers as it does so. Each value read from buffer 127 indicates an amount of data which had been written into buffer 125. This value may be used by application level or library software 123 to consume data from buffer 125 in the same order and in the same discrete amounts as it was produced by application 122. - The NIC can also be used to directly support a low latency Request/Response style of communication, as seen in client/server environments such as Common Object Request Broker Architecture (CORBA) and Network File System (NFS), as well as transactional systems such as databases. Such an arrangement is shown in
FIG. 12, where application 142 on computer 140 acts as a client requesting service from application 143 on computer 141, which acts as a server. The applications interact via memory mappings using two circular buffers 144 and 145. - Application 142, the client, writes a
request 147 directly into the circular buffer 145, via the memory mapped connection(s), and waits for a reply by waiting on data to arrive in circular buffer 144. Most Request/Response systems use a process known as marshalling to construct the request and use an intermediate buffer in memory of the client application to do the marshalling. Likewise marshalling is used to construct a response, with an intermediate buffer being required in the memory of the server application. Using the present invention, marshalling can take place directly into the circular buffer 145 of the server as shown. No intermediate storage of the request is necessary at either the client or server computers. - The server application 143 notices the request (possibly using the Tripwire mechanism) and is able to begin unmarshalling the request as soon as it starts to arrive in the
buffer 145. It is possible that the server may have started to process the request 149 while the client is still marshalling and transmitting, thus reducing latency in the communication. - After processing the request, the server writes the
reply 146 directly into buffer 144, unblocking application 142 (using the Tripwire mechanism), which then unmarshalls and processes the reply 148. Again, there is no need for intermediate storage, and unmarshalling by the client may be overlapped with marshalling and transmission by the server. - A further useful and novel property of a Request/Response system built using the present invention is that data may be written into the buffer either from software running on a CPU or from any hardware device contained in the computer system.
FIG. 15 shows a Request/Response system which is a file serving application. The client application 262 writes a request 267 for some data held on disks controlled by disk controller 271. The server application 263 reads 269 and decodes the request from its circular buffer 265 in the manner previously described. It then performs authentication and authorisation on the request according to the particular application. - If the request for data is accepted, the server application 263 uses a two-part approach to send its reply. Firstly, it writes, into the
circular buffer 264, the software-generated header part of the reply 266. The server application 263 then requests 273 that the disk controller 271 send the required data part of the reply 272 over the network to circular buffer 264. This request to the disk controller takes the form of a DMA request, with the target address being an address on I/O bus 270 which has been mapped onto circular buffer 264. Note that the correct offset is applied to the address such that reply data 272 from the disk is placed immediately following the header data 266. - Before initiating the request 273, the server application 263 can ensure that sufficient space is available in the
buffer 264 to accept the reply data. Further, it is not necessary for the server application 263 to await the completion of request 273. It is possible for the client application 262 to have set a Tripwire 274 to match once the reply data 272 has been received into buffer 264. This match can be programmed to increment the WRP pointer associated with buffer 264, rather than requiring application 263 to increment the pointer as previously described. If a request fails, then an application level timeout mechanism in the client application 262 would detect this and retry the operation. - It is also possible for the
client application 262 to arrange that reply data 272 be placed in some other data structure (such as a kernel buffer-cache page), through manipulation of the aperture 169 and TLB entry 167 as described later. This is useful when buffer 264 is not the final destination of the reply data, so preventing a final memory copy operation by the client. Server application 263 would be unaware of this client side optimisation. - By use of this mechanism, the processing load on the server is reduced. The requirement for the server application to wait for completion of its disk requests is removed. The requirement for high bandwidth streams of reply data to pass through the server's system controller, memory, cache or CPU is also removed.
- As previously stated, the NIC of the present invention could be used to support the Virtual Interface Architecture (VIA) Standard.
FIG. 13 shows two applications communicating using VIA. Application 152 sends data to application 153 by first writing the data to be sent into a region of its memory, shown as block 154. Application 152 then builds a transmit descriptor 156, which describes the location of block 154 and the action required by the NIC (in this case data transmission). This descriptor is then placed onto the TxQueue 158, which has been mapped into the user-level address-space of application 152. Application 152 then finally writes to the doorbell register 160 in the NIC 162 to notify the NIC that work has been placed on the TxQueue 158. - Once the
doorbell register 160 has been written, the NIC 162 can determine, from the value written, the address in physical memory of the activated TxQueue 158. The NIC 162 reads and removes the descriptor 156 from the TxQueue 158, determines from the descriptor 156 the address of data block 154, and invokes a DMA engine 164 to transmit the data contained in block 154. When the data is transmitted 168, the NIC 162 places the descriptor 156 on a completion queue 166, which is also mapped into the address space of application 152, and optionally generates a hardware interrupt. The application 152 can determine when data has been successfully sent by examining queue 166. - When application 153 is to receive data, it builds a receive
descriptor 157 describing where the incoming data should be placed, in this case block 155. Application 153 then places descriptor 157 onto RxQueue 159, which is mapped into its user-level address-space. Application 153 then writes to the doorbell register 161 to indicate that its RxQueue 159 has been activated. It may choose either to poll its completion queue 163, waiting for data to arrive, or to block until data has arrived and a hardware interrupt is generated. - The NIC 165 in
computer 151 services the doorbell register 161 write by first removing the descriptor 157 from the RxQueue 159. The NIC 165 then locates the physical pages of memory corresponding to block 155 and described by the receive descriptor 157. The VIA standard allows these physical pages to have been previously locked by application 153 (preventing the virtual memory system moving or removing the pages from physical memory). However, the NIC is also capable of traversing the page-table structures held in physical memory and itself locking the pages. - The NIC 165 continues to service the doorbell register write and constructs a Translation Look-aside Buffer (TLB)
entry 167 located in SRAM 23. When data arrives corresponding to a particular VIA endpoint, the incoming address matches an aperture 169 in the NIC, which has been marked as requiring a TLB translation. This translation is carried out by state machine 46 and determines the physical memory address of block 155. - The TLB translation, having been previously set up, occurs with little overhead and the data is written 175 to the appropriate memory block 155. A
Tripwire 171 will have been arranged (when the TLB entry 167 was constructed) to match when the address range corresponding to block 155 is written to. This Tripwire match causes the firmware 173 (implemented in state machine 51) to place the receive descriptor 157 onto completion queue 163, to invalidate the TLB mapping 167, and optionally to generate an interrupt. If the RxQueue 159 has been loaded with other receive descriptors, then the next descriptor is taken and loaded into the TLB as previously described. If application 153 is blocked waiting for data to arrive, the interrupt generated will result (after a device driver has performed a search of all the completion queues in the system) in application 153 being re-scheduled. If there is no TLB mapping for the VIA Aperture addresses, or the mapping is invalid, an error is raised using an interrupt. If the NIC 165 is in the process of reloading the TLB 167 when new data arrives, then hardware flow control mechanism 31 is used to control the flow of data until a path to the memory block in computer 151 has been completed. - As an optional extension to the VIA standard, the NIC could also respond to
Tripwire match 171 by placing an index on Tripwire FIFO 42, which could enable the device driver to identify the active VIA endpoint without searching all completion queues in the system.
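The descriptor/doorbell sequence described above can be modelled as a toy Python sketch (illustrative only; the class names, queues and 'wire' list are assumptions of this sketch, not the VIA specification). The application posts a descriptor pointing at its data block, rings the doorbell, and the NIC moves the descriptor to the completion queue after transmitting:

```python
from dataclasses import dataclass

@dataclass
class Descriptor:
    # minimal model of a transmit descriptor: where the data block
    # lives in host memory and how long it is
    addr: int
    length: int

class ModelNic:
    """Toy model of the doorbell flow: descriptors queued by the
    application are consumed by the NIC when the doorbell is rung."""

    def __init__(self, memory):
        self.memory = memory        # stands in for host physical memory
        self.tx_queue = []
        self.completion_queue = []
        self.wire = []              # what was 'transmitted'

    def post(self, desc):
        self.tx_queue.append(desc)

    def doorbell(self):
        # NIC services the doorbell: DMA each block, then complete
        # the descriptor so the application can observe completion
        while self.tx_queue:
            d = self.tx_queue.pop(0)
            self.wire.append(bytes(self.memory[d.addr:d.addr + d.length]))
            self.completion_queue.append(d)
```

The key structural point is that the application never hands data to the NIC directly; it hands over a descriptor, and learns of completion only by examining the completion queue, mirroring the FIG. 13 flow.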
- As mentioned previously, another aspect of this invention is its use in providing support for the outbound streaming of data through the NIC. This setup is described in
FIG. 14. It shows a Direct Memory Access (DMA) engine 182 on the NIC 183, which has been programmed in the manner previously described by a number of user-level applications 184. These applications have requested that the NIC 183 transfer their respective data blocks 181 through the NIC 183, local bus 189, fibre-optic transceiver 190 and onto network 200. After each application has placed its data transfer request onto the DMA request queue 185, it blocks, awaiting a re-schedule initiated by device driver 187. It can be important that the system maintains fair access between a large number of such applications, especially under circumstances where an application requires strict periodic access to the queue, such as an application generating a video stream. - Data transferred over the network by the
DMA engine 182 traverses local bus 189 and is monitored by the Tripwire unit 186. This takes place in the same manner as for received data (both transmitted and received data pass through the NIC using the same local bus 55). - Each application, when programming the
DMA engine 182 to transmit a data block, also constructs a Tripwire which is set to match on an address in the data block. The address to match could indicate that all or a certain portion of the data has been transmitted. When this Tripwire fires and causes a hardware interrupt 188, the device driver 187 can quickly determine which application should be made runnable. By causing a system reschedule, the application can be run on the CPU at the appropriate moment to generate more DMA requests. Because the device driver can execute at the same time that the DMA engine is transferring data, this decision can be made in parallel with data transfer operations. Hence, by the time that a particular application's data transfer requests have been satisfied, the system can ensure that the application is running on the CPU and able to generate more requests. -
FIG. 16 illustrates a generalised apparatus or arrangement for synchronising an end-point application using a tripwire. An end-point is a final destination for an information stream and is the point at which processing of the information takes place. Examples of end-points include a web server, a file server, a database server, and hardware devices such as a disk or graphics controller. An end-point may be running an operating system and a number of data processing applications, and these are referred to as end-point applications. Thus, examples of end-point applications include an operating system or a component thereof, a network protocol stack, and any application-level processing. Arrangements such as network switches and routers do not constitute end-points or end-point applications because their purpose is to ensure that the information is delivered elsewhere. - The arrangement comprises a
computer 300 which is optionally connected to other computers via a network 303. The computer 300 comprises a program memory (illustrated by way of example only as a read only memory (ROM) 305) which contains a program for controlling the computer to synchronise the end-point application in accordance with an address-based event in an information stream on an information pathway 307, such as a bus, within the computer. The information stream may be wholly within the computer, for example from another application performed by the computer 300, or may be from a remote source, such as from the network 303. - The
bus 307 is connected to a memory 308 in the end-point application 306, which also comprises a code generator 309 and an action generator 310. The code generator 309 supplies codes to a comparator which is illustrated as a content addressable memory (CAM) 311. The CAM 311 has another input connected to the bus 307 and is arranged to perform a comparison between each entry in the CAM and the information stream on the bus 307. When a match is found, the CAM sends a signal to the action generator 310, which performs an action associated with an address-based event in the information stream. - In a typical example of use of the synchronising arrangement, the end-
point application 306 sets a tripwire, for example to be triggered when data relating to an end-point address or range of end-point addresses in the memory 308 are present on the bus 307. The code generator 309 supplies a code which is written into the CAM 311 and which comprises the destination memory address of the data, or possibly part of this address, such as the most significant bits when a range of addresses is to be monitored. It is also possible to enter a code which represents not only the address or range of addresses but also part or all of one or more items of data which are expected in the information stream. The CAM 311 compares the address of each data burst on the bus 307, and possibly also at least some of the data of each burst, with each code stored in the CAM 311 and supplies a signal to the action generator 310 when a match is found. The action generator 310 then causes the appropriate action to be taken within the end-point application 306. This may be a single action, several actions, or one or more specific actions which are determined not only by the triggering of the tripwire but also by the data within the information stream, for example arriving at the appropriate location or locations in the memory 308. - As mentioned hereinbefore, the
information stream 307 may be wholly internal to the computer 300; an example of this is an application-to-application stream of information where both applications are running, for example alternately, on the computer 300. However, the information stream may be partly or wholly from outside the computer 300, as illustrated by the broken line connection from the bus 307 to the network 303. Thus, the information stream may be from a switch fabric, a network, or a plurality of sources. A switch fabric is a device which has a plurality of inputs and outputs and which is capable of forwarding data from each input to the appropriate output according to routing information contained within the data. A switch fabric may alternatively be wholly contained within the computer. The information stream preferably has a data burst arrangement as described hereinafter and, in the case of a plurality of sources, the data bursts may arrive from any of the sources at any time, which amounts to multiplexing. -
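The address-matching behaviour of the CAM 311 described above can be sketched as a simple software model (illustrative only; the (base, mask) encoding of each code is an assumption of this sketch, chosen so that a single entry can watch either one address or an aligned range):

```python
class Tripwire:
    """Toy model of the CAM-based tripwire: each armed code is a
    (base, mask) pair compared against every address on the bus."""

    def __init__(self):
        self.codes = []

    def arm(self, base, mask=0xFFFFFFFF):
        # mask selects which address bits must match; arming with the
        # top bits only watches a whole range, as described above
        self.codes.append((base & mask, mask))

    def check(self, addr):
        # True if any armed code matches this bus address
        return any((addr & mask) == base for base, mask in self.codes)
```

In hardware all stored codes are compared in parallel by the CAM; the sequential `any()` here is only a functional stand-in for that parallel comparison.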
FIG. 17 shows an arrangement which illustrates two possible modifications to the arrangement shown in FIG. 16. In this case, the bus 307 is connected to an input/output bus 312 of the end-point application 306 within the computer 300. This represents an example of a hardware end-point for the information stream, but other types of hardware end-points are possible, such as active controllers, and may be located “outside” the application 306. An example of an active controller is a disk controller. - The arrangement shown in
FIG. 17 also differs from that shown in FIG. 16 in that the tripwire may be triggered by an address-based event in the information stream on the bus 307 which does not exactly match any of the codes stored in the CAM 311. Instead, the information from the information stream on the bus 307 first passes through a process 313 before being supplied to the CAM for comparison with each of the stored codes. - One application of this is for the case where the information stream comprises packets or bursts of data starting with an address, for example corresponding to an address in the
memory 308 to which the first item of data after the address in the packet or burst is allocated. Subsequent items of data are to be allocated to consecutive addresses, for example such that each item of data in the burst is to be allocated to the next highest address location after the preceding data item. Thus, the address at the start of each burst relates to the first data item, and the following data item addresses can be inferred by incrementing the address upon the arrival of the second and each subsequent item of data. - The
application 306 can cause the code generator 309 to store in the CAM 311 a code which corresponds to an implied address in the actual information stream appearing on the bus 307. The process 313 detects the address at the start of each data burst and supplies this to the CAM 311 with the arrival of the first data item. As each subsequent data item of the same burst arrives, the process 313 increments the address and supplies this to the CAM 311. This allows a tripwire to be triggered when, for example, a data item having an implied address is present on the bus 307, because the CAM can match the corresponding stored code with the address supplied by the process 313. - As mentioned hereinbefore, the
action generator 310 can cause any one or more of various different actions to be triggered by the tripwire. The resulting action may be determined by which tripwire has been triggered, i.e., which code stored in the CAM 311 has been matched. It is also possible for the action to be at least partly determined by the data item which effectively triggered the tripwire. Any action may be targeted at the computer containing the tripwire or at a different computer. Various possible actions are described hereinafter as typical examples and may be performed singly or in any appropriate combination for the specific application. -
FIG. 18 illustrates the action generator 310 raising an interrupt request IRQ and supplying this to the interrupt line of a central processing unit (CPU) 320 of the computer 300. FIG. 19 illustrates the action generator 310 setting a bit in a bitmap 321, for example in the memory 308. These two actions may be used independently of each other or together. For example, the action generator may raise an interrupt request if an application which requires data corresponding to the tripwire is not currently running but is runnable, for example because it has not exhausted its time-slice. Otherwise, for example if the application is awaiting rescheduling, the relevant bit in the bitmap 321 may be set. The operating system may periodically check the bitmap 321 for changes and, as a result of the arrival of the relevant data for an application which is presently not running, may decide to reschedule or wake up the application. -
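The interrupt-versus-bitmap policy just described can be sketched as follows (a minimal Python model; the function name and the dictionary representation of application state are assumptions of this sketch):

```python
def on_tripwire_match(app, bitmap, raise_interrupt):
    """Raise a hardware interrupt if the waiting application is runnable
    but not currently running; otherwise just set its bit in the bitmap
    for the operating system to notice on its periodic check."""
    if app["runnable"] and not app["running"]:
        raise_interrupt()
    else:
        bitmap[app["index"]] = 1
```

This captures the trade-off above: an interrupt gives a prompt reschedule for a runnable application, while the bitmap defers the cost of notification for applications that cannot make progress immediately.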
FIG. 20 illustrates another type of action which may be performed as a result of detection of the address-based event. In this example, a counter 322, for example whose count is stored within the memory 308, is incremented in response to triggering of the tripwire. Incrementing may take place as a result of any tripwire being triggered, or only by one or more specific tripwires, depending on the specific application. FIG. 21 illustrates another action which is such that, when the or the appropriate tripwire is triggered, a predetermined value "N" is written to a location "X", shown at 323 as being in the memory 308 (or being mapped thereto). -
FIG. 22 illustrates another combination of actions which may be used to indicate that an application should be awakened or rescheduled. When a tripwire is triggered, an interrupt request is supplied to the CPU 320 and a "runnable bit" for a specific application is set at location 324 in the memory 308. The operating system of the computer 300 responds to the interrupt request by waking up or rescheduling the application whose runnable bit has been set. -
FIG. 23 illustrates an action which modifies entries in the CAM 311 in response to triggering of a tripwire. Any form of modification is possible. For example, the code which triggers the tripwire may be deleted if no further tripwires are required for the same address-based event. As an alternative, the code may be modified so as effectively to set a different but related tripwire. A further possibility is to generate a completely new code and supply this to the CAM 311 in order to set a new unrelated tripwire. -
FIG. 24 illustrates the format of a data burst, a sequence of which forms the information stream on the bus 307. The data burst comprises a plurality of items which arrive one after the other in sequence on the bus. The first item is an address A(n) which is or corresponds to the end-point address, for example in the memory 308, for receiving the subsequent data items. This address is the actual address n of the first data item D1 of the burst, which immediately follows the address A(n). The subsequent data items D2, D3, . . . , Dp arrive in sequence and their destination addresses are implied by their position within the burst relative to the first data item D1 and its address n. Thus, the second data item D2 has an implied address n+1, the third data item D3 has an implied address n+2, and so on. Each data item is written or supplied to the implied address as its destination address. - This data burst format may be used to fragment and coalesce bursts as the data stream passes through a
forwarding unit 330, such as a network interface card or a switch, of an information pathway. For example, the forwarding unit can start to transmit a burst as soon as the first data item has arrived and does not have to wait until the whole data burst has arrived. -
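The implied-address rule of the burst format described above amounts to the following (a minimal Python sketch; the function name and list representation of a burst are assumptions of this sketch):

```python
def item_addresses(burst):
    """Expand a burst [A(n), D1, ..., Dp] into (address, item) pairs.
    Only the first item's address is explicit in the burst; the rest
    are implied by incrementing (D2 at n+1, D3 at n+2, ...)."""
    base, items = burst[0], burst[1:]
    return [(base + i, item) for i, item in enumerate(items)]
```

This is exactly the computation performed by process 313: one explicit address per burst, with every later destination address derived by incrementing as each item arrives.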
FIG. 25 illustrates an example of this in which an interruption in the data burst occurs. The forwarding unit 330 has already started transmission of the burst, and the first r data items 331, together with the burst address, have already been forwarded. The remainder 332 of the burst has not yet arrived, and the forwarding unit 330 terminates forwarding or transmission of that burst. - When the
remainder 332 of the burst starts to arrive, the forwarding unit 330 recalculates the destination address A(r+1) for the remainder of the burst and inserts this in front of the data item Dr+1. This is transmitted as a further burst 333, as illustrated in FIG. 26. - This technique may be used even when the whole burst is available for forwarding by the
forwarding unit 330. For example, the forwarding unit 330 may terminate transmission of a particular burst before completion of transmission for reasons of arbitration between a number of competing bursts or for flow control reasons. Thus, individual data bursts can be forwarded intact or can be sent in two or more fragments as necessary or convenient, and all such bursts are treated as valid bursts by any subsequent forwarding units. -
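Fragmentation as just described can be sketched in a few lines (a Python sketch; the function name and list representation of a burst are assumptions). The second fragment receives a recomputed address A(r+1) = n + r, so both fragments remain valid self-describing bursts:

```python
def split_burst(burst, r):
    """Split a burst [A(n), D1, ..., Dp] after the first r data items,
    giving two bursts. The second fragment's header address is the
    implied address of its first data item, n + r."""
    addr, data = burst[0], burst[1:]
    first = [addr] + data[:r]
    second = [addr + r] + data[r:]
    return first, second
```

Because each fragment carries a correct header address, any downstream forwarding unit can treat it as an ordinary burst, which is what allows transmission to begin before a whole burst has arrived.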
FIG. 27 illustrates an alternative situation in which the forwarding unit has an internal buffer 335 which contains first and second bursts 336, 337, where the address of the second burst 337 immediately follows the implied address of the last data item Dn of the first burst 336. The forwarding unit checks for such situations and, when they are found, coalesces the first and second bursts into a coalesced burst 338, as shown in the lower part of FIG. 27. The forwarding unit then transmits a single contiguous burst, which saves the overhead of the excess address information (which is deleted from the second burst). Any subsequent forwarding units then treat the coalesced burst 338 as a single burst. The format of the data burst allows such fragmentation or merging of bursts to take place. This in turn allows forwarding units to transmit data as soon as it arrives so as to reduce or minimise latency. Also, bursts of any length or number of data items can be handled, which improves the flexibility of transmission of data. -
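The coalescing check described above reduces to comparing the second burst's header address with the implied address following the first burst's last data item (a Python sketch; names and the list representation are assumptions of this sketch):

```python
def coalesce(first, second):
    """Merge two bursts [A, D1..] when the second starts exactly where
    the first ends; otherwise return both unchanged. The next implied
    address after burst 'first' is first[0] + len(first) - 1, since the
    list holds one address plus len(first) - 1 data items."""
    if second[0] == first[0] + len(first) - 1:
        return [first + second[1:]]   # drop the now-redundant address
    return [first, second]
```

Dropping the second header address is safe precisely because it was redundant: it equalled the address the receiver would have inferred anyway by incrementing.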
FIG. 28 illustrates an example of communication between an application, whose address space is shown at 340, and remote hardware 341 via a network 303, such that the network 303 is “transparent” or “invisible” to each of the application and the remote hardware 341. The address space 340 contains mapped configuration data and registers of the remote hardware, as indicated at 342. This is mapped onto the system input/output bus 343, to which a network interface card 344 is connected. The network interface card 344 is loaded with configuration and register data corresponding to the remote hardware 341. All application requests are forwarded over the network 303 transparently to the remote hardware 341, so that the remote hardware appears as though it is local to the application and the network 303 is invisible. - The
remote hardware 341 is connected to a passive input/output bus 345 which is provided with a network interface card 346 for interfacing to the network 303. The configuration and registers of the remote hardware are illustrated at 347 and are mapped ultimately to the region 342 of the address space 340 of the application. Again, the network is invisible to the remote hardware 341, and the remote application appears to be local to it. - When the application sends a request to the
remote hardware 341, for example requesting that the remote hardware supply data to be used in or processed by the application, this is written in the space 342 which is mapped to the system input/output bus 343. The network interface card 344 sends read/write requests over the network 303 to the card 346, which supplies these via the passive input/output bus 345 to the remote hardware 341. Viewed from the remote hardware 341, the bus 345 appears equivalent to the bus 343. - The
remote hardware 341 may supply an interrupt and/or data for the application to the bus 345. Again, the network interface card 346 sends this via the network 303 to the card 344. The network interface card 344 supplies an interrupt request to the computer running the application and writes the data on behalf of the remote hardware to the space 342 in the address space 340 of the application. Thus, to the application, the remote hardware 341 appears to be connected directly to the bus 343. - Although implementations of tripwires have been described in detail hereinbefore with reference to the
tripwire unit 1 shown in FIG. 29, associated with the network interface card 350, tripwires may be implemented at other points in a system, as illustrated by tripwire units 2 to 5 in FIG. 29. The system comprises a disk controller 351 connected to an input/output bus 307 b, and the tripwire unit 2 is implemented as part of the disk controller 351. Such an arrangement allows tripwire operations to inform applications of any characteristic data transfer to or from the disk controller 351. Such an arrangement is particularly useful where the controller 351 is able to transfer data to and from a non-contiguous memory region corresponding to user-level buffers of an application. This allows data transfer and application level notification to be achieved without requiring hardware interrupts or kernel intervention. - The
tripwire unit 3 is associated with a system controller 352 connected to a host bus 307 a and the input/output bus 307 b. Such an arrangement allows tripwire operations to inform applications of any characteristic data transfer to or from any device in the computer system. This includes hardware devices, such as the disk controller 351 and the network interface card 350, and, in the case of a system employing several CPUs, enables an application running on one of the CPUs to synchronise on a data transfer to or from an application running on another of the CPUs. Similarly, a tripwire may be used for synchronisation between applications running on the same CPU. This reduces the need for other mechanisms such as spin locks where both applications are required to operate in lock-step with the data transfer. -
Tripwire units 4 and 5 are associated with the CPU 320 and the memory 308, respectively. This is generally equivalent to the tripwire unit 3, where all data transfers in the system can be monitored. However, the tripwire unit 4 may monitor data written by an application to cache, which may not appear on the host bus 307 a.
Claims (6)
1. A method of arranging data for transfer as a data burst over a computer network comprising the steps of: providing a header comprising the destination address of a certain data word in the data burst, and a signal at the beginning or end of the data burst for indicating the start or end of the burst, the destination addresses of other words in the data burst being inferrable from the address in the header.
2. A method according to claim 1, in which the signal identifying the end of a burst comprises a null signal.
3. A method of processing a data burst received over a computer network comprising the steps of:
reading a reference address from the header of the data burst, and
calculating the addresses of each data word in the burst from the position of that data word in the burst in relation to the position of the data word to which the address in the header corresponds, and from the reference address read from the header.
4. A method of interrupting transfer of a data burst over a computer network comprising the steps of:
halting transfer of a portion of the data burst which has not yet been transferred, thereby splitting the data burst into two burst sections, one which is transferred, and one waiting to be transferred.
5. A method of restarting transfer of a data burst that has been interrupted according to the method of claim 4, comprising the steps of:
calculating a new reference address for the untransferred data burst section from the address contained in the header of the whole data burst, and from the position in the whole data burst of the first data word of the untransferred data burst section in relation to the position of the data word to which the address in the header corresponds;
providing a new header for the untransferred data burst section comprising the new reference address; and
transmitting the new header along with the untransferred data burst section.
6. A method according to claim 5, comprising calculating the new reference address for the untransferred data burst section from the reference address contained in the header of the whole data burst and from the number of data words in the transferred data burst section.
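The interrupt-and-restart scheme of claims 4 through 6 can be sketched as one helper. The claim-6 arithmetic (new reference = old reference + words already transferred, scaled by word size) is shown directly; the 4-byte word width, the assumption that the original header addressed the first word, and all names are illustrative.

```python
# Sketch of claims 4-6: halt a burst mid-transfer, splitting it into a
# transferred section and an untransferred section, and give the
# untransferred section a new header whose reference address is computed
# from the old reference address and the number of words already sent.

WORD_SIZE = 4  # bytes; an assumed word width

def split_burst(reference_address, words, num_transferred):
    """Return (old_header, sent_words) and (new_header, remaining_words)."""
    sent = words[:num_transferred]
    remaining = words[num_transferred:]
    # Claim 6: advance the reference address by the transferred word count.
    new_reference = reference_address + num_transferred * WORD_SIZE
    return (reference_address, sent), (new_reference, remaining)

# Interrupt a 6-word burst (header address 0x2000) after 2 words:
(_, sent), (new_ref, rest) = split_burst(0x2000, list(range(6)), 2)
# new_ref == 0x2008; rest carries the four untransferred words
```

The untransferred section is then transmitted as an ordinary burst under its new header, so the receiver applies the same claim-3 address inference to both sections without knowing a split occurred.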
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/198,043 US20060034275A1 (en) | 2000-05-03 | 2005-08-05 | Data transfer, synchronising applications, and low latency networks |
US12/105,412 US8423675B2 (en) | 1999-05-04 | 2008-04-18 | Data transfer, synchronising applications, and low latency networks |
US13/802,400 US8843655B2 (en) | 1999-05-04 | 2013-03-13 | Data transfer, synchronising applications, and low latency networks |
US14/492,800 US9769274B2 (en) | 1999-05-04 | 2014-09-22 | Data transfer, synchronising applications, and low latency networks |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
WOPCT/GB00/01691 | 2000-05-03 | ||
PCT/GB2000/001691 WO2000067131A2 (en) | 1999-05-04 | 2000-05-03 | Data transfer, synchronising applications, and low latency networks |
US98053902A | 2002-05-13 | 2002-05-13 | |
US11/198,043 US20060034275A1 (en) | 2000-05-03 | 2005-08-05 | Data transfer, synchronising applications, and low latency networks |
Related Parent Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GB2000/001691 Division WO2000067131A2 (en) | 1999-05-04 | 2000-05-03 | Data transfer, synchronising applications, and low latency networks |
US09980539 Division | 2000-05-03 | ||
US98053902A Division | 1999-05-04 | 2002-05-13 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/105,412 Division US8423675B2 (en) | 1999-05-04 | 2008-04-18 | Data transfer, synchronising applications, and low latency networks |
Publications (1)
Publication Number | Publication Date |
---|---|
US20060034275A1 true US20060034275A1 (en) | 2006-02-16 |
Family
ID=35507393
Family Applications (7)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/198,260 Expired - Lifetime US8346971B2 (en) | 1999-05-04 | 2005-08-05 | Data transfer, synchronising applications, and low latency networks |
US11/198,252 Expired - Fee Related US8073994B2 (en) | 2000-05-03 | 2005-08-05 | Data transfer, synchronising applications, and low latency networks |
US11/198,043 Abandoned US20060034275A1 (en) | 1999-05-04 | 2005-08-05 | Data transfer, synchronising applications, and low latency networks |
US12/105,412 Expired - Fee Related US8423675B2 (en) | 1999-05-04 | 2008-04-18 | Data transfer, synchronising applications, and low latency networks |
US13/654,876 Expired - Fee Related US8725903B2 (en) | 1999-05-04 | 2012-10-18 | Data transfer, synchronising applications, and low latency networks |
US13/802,400 Expired - Fee Related US8843655B2 (en) | 1999-05-04 | 2013-03-13 | Data transfer, synchronising applications, and low latency networks |
US14/492,800 Expired - Fee Related US9769274B2 (en) | 1999-05-04 | 2014-09-22 | Data transfer, synchronising applications, and low latency networks |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/198,260 Expired - Lifetime US8346971B2 (en) | 1999-05-04 | 2005-08-05 | Data transfer, synchronising applications, and low latency networks |
US11/198,252 Expired - Fee Related US8073994B2 (en) | 2000-05-03 | 2005-08-05 | Data transfer, synchronising applications, and low latency networks |
Family Applications After (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/105,412 Expired - Fee Related US8423675B2 (en) | 1999-05-04 | 2008-04-18 | Data transfer, synchronising applications, and low latency networks |
US13/654,876 Expired - Fee Related US8725903B2 (en) | 1999-05-04 | 2012-10-18 | Data transfer, synchronising applications, and low latency networks |
US13/802,400 Expired - Fee Related US8843655B2 (en) | 1999-05-04 | 2013-03-13 | Data transfer, synchronising applications, and low latency networks |
US14/492,800 Expired - Fee Related US9769274B2 (en) | 1999-05-04 | 2014-09-22 | Data transfer, synchronising applications, and low latency networks |
Country Status (1)
Country | Link |
---|---|
US (7) | US8346971B2 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040215442A1 (en) * | 2003-04-24 | 2004-10-28 | International Business Machines Corporation | Method and apparatus to use clock bursting to minimize command latency in a logic simulation hardware emulator / accelerator |
US20080028103A1 (en) * | 2006-07-26 | 2008-01-31 | Michael Steven Schlansker | Memory-mapped buffers for network interface controllers |
US20080239954A1 (en) * | 2007-03-26 | 2008-10-02 | Bigfoot Networks, Inc. | Method and system for communication between nodes |
US7725556B1 (en) | 2006-10-27 | 2010-05-25 | Hewlett-Packard Development Company, L.P. | Computer system with concurrent direct memory access |
US8700771B1 (en) * | 2006-06-26 | 2014-04-15 | Cisco Technology, Inc. | System and method for caching access rights |
US20180150123A1 (en) * | 2016-11-28 | 2018-05-31 | Qualcomm Incorporated | Wifi memory power minimization |
Families Citing this family (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8346971B2 (en) | 1999-05-04 | 2013-01-01 | At&T Intellectual Property I, Lp | Data transfer, synchronising applications, and low latency networks |
EP3386128B1 (en) | 2001-10-05 | 2019-07-24 | TQ Delta, LLC | Systems and methods for multi-pair atm over dsl |
WO2005089241A2 (en) | 2004-03-13 | 2005-09-29 | Cluster Resources, Inc. | System and method for providing object triggers |
US8782654B2 (en) | 2004-03-13 | 2014-07-15 | Adaptive Computing Enterprises, Inc. | Co-allocating a reservation spanning different compute resources types |
US20070266388A1 (en) | 2004-06-18 | 2007-11-15 | Cluster Resources, Inc. | System and method for providing advanced reservations in a compute environment |
US8176490B1 (en) | 2004-08-20 | 2012-05-08 | Adaptive Computing Enterprises, Inc. | System and method of interfacing a workload manager and scheduler with an identity manager |
US8271980B2 (en) | 2004-11-08 | 2012-09-18 | Adaptive Computing Enterprises, Inc. | System and method of providing system jobs within a compute environment |
US8631130B2 (en) | 2005-03-16 | 2014-01-14 | Adaptive Computing Enterprises, Inc. | Reserving resources in an on-demand compute environment from a local compute environment |
US8863143B2 (en) | 2006-03-16 | 2014-10-14 | Adaptive Computing Enterprises, Inc. | System and method for managing a hybrid compute environment |
US9231886B2 (en) | 2005-03-16 | 2016-01-05 | Adaptive Computing Enterprises, Inc. | Simple integration of an on-demand compute environment |
CA2603577A1 (en) | 2005-04-07 | 2006-10-12 | Cluster Resources, Inc. | On-demand access to compute resources |
US20090265485A1 (en) * | 2005-11-30 | 2009-10-22 | Broadcom Corporation | Ring-based cache coherent bus |
US8554943B1 (en) * | 2006-03-31 | 2013-10-08 | Emc Corporation | Method and system for reducing packet latency in networks with both low latency and high bandwidths requirements |
US8041773B2 (en) | 2007-09-24 | 2011-10-18 | The Research Foundation Of State University Of New York | Automatic clustering for self-organizing grids |
US9256560B2 (en) * | 2009-07-29 | 2016-02-09 | Solarflare Communications, Inc. | Controller integration |
US8988800B1 (en) | 2009-09-15 | 2015-03-24 | Marvell International Ltd. | Error correction for storage devices |
US9054990B2 (en) | 2009-10-30 | 2015-06-09 | Iii Holdings 2, Llc | System and method for data center security enhancements leveraging server SOCs or server fabrics |
US20110103391A1 (en) | 2009-10-30 | 2011-05-05 | Smooth-Stone, Inc. C/O Barry Evans | System and method for high-performance, low-power data center interconnect fabric |
US8599863B2 (en) | 2009-10-30 | 2013-12-03 | Calxeda, Inc. | System and method for using a multi-protocol fabric module across a distributed server interconnect fabric |
US20130107444A1 (en) | 2011-10-28 | 2013-05-02 | Calxeda, Inc. | System and method for flexible storage and networking provisioning in large scalable processor installations |
US9876735B2 (en) | 2009-10-30 | 2018-01-23 | Iii Holdings 2, Llc | Performance and power optimized computer system architectures and methods leveraging power optimized tree fabric interconnect |
US9465771B2 (en) | 2009-09-24 | 2016-10-11 | Iii Holdings 2, Llc | Server on a chip and node cards comprising one or more of same |
US9069929B2 (en) | 2011-10-31 | 2015-06-30 | Iii Holdings 2, Llc | Arbitrating usage of serial port in node card of scalable and modular servers |
US9077654B2 (en) | 2009-10-30 | 2015-07-07 | Iii Holdings 2, Llc | System and method for data center security enhancements leveraging managed server SOCs |
US9680770B2 (en) * | 2009-10-30 | 2017-06-13 | Iii Holdings 2, Llc | System and method for using a multi-protocol fabric module across a distributed server interconnect fabric |
US9648102B1 (en) | 2012-12-27 | 2017-05-09 | Iii Holdings 2, Llc | Memcached server functionality in a cluster of data processing nodes |
US10877695B2 (en) | 2009-10-30 | 2020-12-29 | Iii Holdings 2, Llc | Memcached server functionality in a cluster of data processing nodes |
US11720290B2 (en) | 2009-10-30 | 2023-08-08 | Iii Holdings 2, Llc | Memcached server functionality in a cluster of data processing nodes |
US9311269B2 (en) | 2009-10-30 | 2016-04-12 | Iii Holdings 2, Llc | Network proxy for high-performance, low-power data center interconnect fabric |
US9043509B2 (en) * | 2011-01-14 | 2015-05-26 | Broadcom Corporation | Method and system for low-latency networking |
US9565137B2 (en) * | 2012-04-26 | 2017-02-07 | Nxp Usa, Inc. | Cut-through forwarding module and a method of receiving and transmitting data frames in a cut-through forwarding mode |
US9330036B2 (en) | 2013-11-25 | 2016-05-03 | Red Hat Israel, Ltd. | Interrupt reduction by dynamic application buffering |
US9361231B2 (en) | 2014-01-15 | 2016-06-07 | International Business Machines Corporation | Implicit I/O send on cache operations |
US9431052B2 (en) | 2014-06-26 | 2016-08-30 | Marvell World Trade Ltd. | Two dimensional magnetic recording systems, devices and methods |
KR20160132169A (en) * | 2015-05-06 | 2016-11-17 | 에스케이하이닉스 주식회사 | Memory system including semiconductor memory device and program method thereof |
KR102450972B1 (en) * | 2015-12-07 | 2022-10-05 | 삼성전자주식회사 | Device and method for transmitting a packet to application |
US10162775B2 (en) * | 2015-12-22 | 2018-12-25 | Futurewei Technologies, Inc. | System and method for efficient cross-controller request handling in active/active storage systems |
US11050682B2 (en) * | 2017-09-28 | 2021-06-29 | Intel Corporation | Reordering of data for parallel processing |
US11227035B2 (en) * | 2018-11-15 | 2022-01-18 | International Business Machines Corporation | Intelligent pattern based application grouping and activating |
US11681625B2 (en) * | 2018-12-20 | 2023-06-20 | Intel Corporation | Receive buffer management |
EP3694166B1 (en) * | 2019-02-06 | 2022-09-21 | Hitachi Energy Switzerland AG | Cyclic time-slotted operation in a wireless industrial network |
US20190188111A1 (en) * | 2019-02-26 | 2019-06-20 | Intel Corporation | Methods and apparatus to improve performance data collection of a high performance computing application |
US11249986B2 (en) | 2019-12-17 | 2022-02-15 | Paypal, Inc. | Managing stale connections in a distributed system |
US11561912B2 (en) * | 2020-06-01 | 2023-01-24 | Samsung Electronics Co., Ltd. | Host controller interface using multiple circular queue, and operating method thereof |
Citations (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3701972A (en) * | 1969-12-16 | 1972-10-31 | Computer Retrieval Systems Inc | Data processing system |
US4429387A (en) * | 1982-02-05 | 1984-01-31 | Siemens Corporation | Special character sequence detection circuit arrangement |
US4644529A (en) * | 1985-08-02 | 1987-02-17 | Gte Laboratories Incorporated | High-speed switching processor for a burst-switching communications system |
US4866664A (en) * | 1987-03-09 | 1989-09-12 | Unisys Corporation | Intercomputer communication control apparatus & method |
US4977582A (en) * | 1988-03-31 | 1990-12-11 | At&T Bell Laboratories | Synchronization of non-continuous digital bit streams |
US4993022A (en) * | 1988-02-12 | 1991-02-12 | Hitachi, Ltd. | System for and method of multiplexing speech signals utilizing priority signals for individual transmission channels |
US5179556A (en) * | 1991-08-02 | 1993-01-12 | Washington University | Bandwidth management and congestion control scheme for multicast ATM networks |
US5296936A (en) * | 1991-07-22 | 1994-03-22 | International Business Machines Corporation | Communication apparatus and method for transferring image data from a source to one or more receivers |
US5363484A (en) * | 1990-11-13 | 1994-11-08 | International Business Machines Corporation | Multiple computer system with combiner/memory interconnection system employing separate direct access link for transferring information packets |
US5438640A (en) * | 1993-07-16 | 1995-08-01 | Sumitomo Electric Industries, Ltd. | Optical waveguide device for receiving functional component |
US5487152A (en) * | 1990-04-27 | 1996-01-23 | National Semiconductor Corporation | Method and apparatus for frame header splitting in a media access control/host system interface unit |
US5488724A (en) * | 1990-05-29 | 1996-01-30 | Advanced Micro Devices, Inc. | Network controller with memory request and acknowledgement signals and a network adapter therewith |
US5513177A (en) * | 1986-09-16 | 1996-04-30 | Hitachi, Ltd. | Distributed switching system having at least one module |
US5586273A (en) * | 1994-08-18 | 1996-12-17 | International Business Machines Corporation | HDLC asynchronous to synchronous converter |
US5727154A (en) * | 1995-04-28 | 1998-03-10 | Fry; Shawn C. | Program synchronization on first and second computers by determining whether information transmitted by first computer is an acceptable or unacceptable input to second computer program |
US5740372A (en) * | 1994-09-26 | 1998-04-14 | Oki Electric Industry Co., Ltd. | Circuit which detects a signal in successive frames by feeding back look up data with one frame delay as addresses to look up memory |
US5764895A (en) * | 1995-01-11 | 1998-06-09 | Sony Corporation | Method and apparatus for directing data packets in a local area network device having a plurality of ports interconnected by a high-speed communication bus |
US5778175A (en) * | 1995-12-22 | 1998-07-07 | Digital Equipment Corporation | Method implemented by a computer network adapter for autonomously adjusting a transmit commencement threshold valve upon concurrence of an underflow condition |
US5787251A (en) * | 1992-12-21 | 1998-07-28 | Sun Microsystems, Inc. | Method and apparatus for subcontracts in distributed processing systems |
US5862346A (en) * | 1996-06-28 | 1999-01-19 | Metadigm | Distributed group activity data network system and corresponding method |
US6260073B1 (en) * | 1996-12-30 | 2001-07-10 | Compaq Computer Corporation | Network switch including a switch manager for periodically polling the network ports to determine their status and controlling the flow of data between ports |
US6262976B1 (en) * | 1998-09-17 | 2001-07-17 | Ordered Networks, Inc. | System and method for network flow optimization using traffic classes |
US6356962B1 (en) * | 1998-09-30 | 2002-03-12 | Stmicroelectronics, Inc. | Network device and method of controlling flow of data arranged in frames in a data-based network |
US6470398B1 (en) * | 1996-08-21 | 2002-10-22 | Compaq Computer Corporation | Method and apparatus for supporting a select () system call and interprocess communication in a fault-tolerant, scalable distributed computer environment |
US6757390B2 (en) * | 2000-03-28 | 2004-06-29 | Seiko Instruments Inc. | Wristwatch type wireless telephone |
US6792085B1 (en) * | 1999-09-10 | 2004-09-14 | Comdial Corporation | System and method for unified messaging with message replication and synchronization |
US6907473B2 (en) * | 1998-10-30 | 2005-06-14 | Science Applications International Corp. | Agile network protocol for secure communications with assured system availability |
US6920507B1 (en) * | 1996-06-28 | 2005-07-19 | Metadigm Llc | System and corresponding method for providing redundant storage of a data file over a computer network |
US6954923B1 (en) * | 1999-01-28 | 2005-10-11 | Ati International Srl | Recording classification of instructions executed by a computer |
Family Cites Families (74)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4377852A (en) * | 1980-03-31 | 1983-03-22 | Texas Instruments Incorporated | Terminal emulator |
US4467411A (en) * | 1981-03-06 | 1984-08-21 | International Business Machines Corporation | Scheduling device operations in a buffered peripheral subsystem |
JPS60211559A (en) | 1984-04-06 | 1985-10-23 | Nec Corp | Common memory control system |
CA1253971A (en) | 1986-06-26 | 1989-05-09 | Pierre Goyer | Synchronization service for a distributed operating system or the like |
US5140679A (en) | 1988-09-14 | 1992-08-18 | National Semiconductor Corporation | Universal asynchronous receiver/transmitter |
JP2585757B2 (en) | 1988-11-02 | 1997-02-26 | 株式会社日立製作所 | Information signal recording / reproducing method and recording / reproducing apparatus |
US5165031A (en) | 1990-05-16 | 1992-11-17 | International Business Machines Corporation | Coordinated handling of error codes and information describing errors in a commit procedure |
US5765011A (en) | 1990-11-13 | 1998-06-09 | International Business Machines Corporation | Parallel processing system having a synchronous SIMD processing with processing elements emulating SIMD operation using individual instruction streams |
JP2519860B2 (en) | 1991-09-16 | 1996-07-31 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Burst data transfer apparatus and method |
US5388237A (en) | 1991-12-30 | 1995-02-07 | Sun Microsystems, Inc. | Method of and apparatus for interleaving multiple-channel DMA operations |
US5371870A (en) | 1992-04-24 | 1994-12-06 | Digital Equipment Corporation | Stream buffer memory having a multiple-entry address history buffer for detecting sequential reads to initiate prefetching |
US5555390A (en) | 1992-10-19 | 1996-09-10 | International Business Machines Corporation | Data storage method and subsystem including a device controller for respecifying an amended start address |
EP0669020B1 (en) * | 1992-11-13 | 1997-04-02 | Microsoft Corporation | Methods for marshalling interface pointers for remote procedure calls |
US5566302A (en) * | 1992-12-21 | 1996-10-15 | Sun Microsystems, Inc. | Method for executing operation call from client application using shared memory region and establishing shared memory region when the shared memory region does not exist |
US6157961A (en) * | 1992-12-21 | 2000-12-05 | Sun Microsystems, Inc. | Client-side stub interpreter |
JPH0773252B2 (en) | 1993-04-21 | 1995-08-02 | 日本電気株式会社 | Transmission frame generation circuit |
US5442390A (en) * | 1993-07-07 | 1995-08-15 | Digital Equipment Corporation | Video on demand with memory accessing and or like functions |
CA2107299C (en) * | 1993-09-29 | 1997-02-25 | Mehrad Yasrebi | High performance machine for switched communications in a heterogenous data processing network gateway |
US5436640A (en) * | 1993-10-29 | 1995-07-25 | Thrustmaster, Inc. | Video game and simulator joystick controller with geared potentiometer actuation |
JPH08180001A (en) | 1994-04-12 | 1996-07-12 | Mitsubishi Electric Corp | Communication system, communication method and network interface |
US7190284B1 (en) | 1994-11-16 | 2007-03-13 | Dye Thomas A | Selective lossless, lossy, or no compression of data based on address range, data type, and/or requesting agent |
US6029205A (en) | 1994-12-22 | 2000-02-22 | Unisys Corporation | System architecture for improved message passing and process synchronization between concurrently executing processes |
US5737607A (en) * | 1995-09-28 | 1998-04-07 | Sun Microsystems, Inc. | Method and apparatus for allowing generic stubs to marshal and unmarshal data in object reference specific data formats |
US6047323A (en) | 1995-10-19 | 2000-04-04 | Hewlett-Packard Company | Creation and migration of distributed streams in clusters of networked computers |
US5758089A (en) | 1995-11-02 | 1998-05-26 | Sun Microsystems, Inc. | Method and apparatus for burst transferring ATM packet header and data to a host computer system |
US5887172A (en) | 1996-01-10 | 1999-03-23 | Sun Microsystems, Inc. | Remote procedure call system and method for RPC mechanism independent client and server interfaces interoperable with any of a plurality of remote procedure call backends |
JP3634379B2 (en) | 1996-01-24 | 2005-03-30 | サン・マイクロシステムズ・インコーポレイテッド | Method and apparatus for stack caching |
US6038643A (en) | 1996-01-24 | 2000-03-14 | Sun Microsystems, Inc. | Stack management unit and method for a processor having a stack |
US5797043A (en) | 1996-03-13 | 1998-08-18 | Diamond Multimedia Systems, Inc. | System for managing the transfer of data between FIFOs within pool memory and peripherals being programmable with identifications of the FIFOs |
US6044409A (en) | 1996-06-26 | 2000-03-28 | Sun Microsystems, Inc. | Framework for marshaling and unmarshaling argument object references |
GB2315638B (en) | 1996-07-19 | 2000-09-13 | Ericsson Telefon Ab L M | Synchronisation checking |
US5905870A (en) | 1996-09-11 | 1999-05-18 | Advanced Micro Devices, Inc | Arrangement for initiating and maintaining flow control in shared-medium, full-duplex, and switched networks |
US5995488A (en) | 1996-10-08 | 1999-11-30 | Advanced Micro Devices, Inc. | Method and apparatus for regulating data flow in networks |
US6072781A (en) * | 1996-10-22 | 2000-06-06 | International Business Machines Corporation | Multi-tasking adapter for parallel network applications |
JPH10133997A (en) | 1996-11-01 | 1998-05-22 | Fuji Xerox Co Ltd | Dma controller |
US6208655B1 (en) | 1996-11-27 | 2001-03-27 | Sony Europa, B.V., | Method and apparatus for serving data |
US6405264B1 (en) | 1997-12-18 | 2002-06-11 | Sun Microsystems, Inc. | Marshaling and unmarshaling framework for supporting filters in a distributed object system |
US6519686B2 (en) * | 1998-01-05 | 2003-02-11 | Intel Corporation | Information streaming in a multi-process system using shared memory |
US6434161B1 (en) * | 1998-02-25 | 2002-08-13 | 3Com Corporation | UART with direct memory access buffering of data and method therefor |
US6134607A (en) * | 1998-04-03 | 2000-10-17 | Avid Technology, Inc. | Method and apparatus for controlling data flow between devices connected by a memory |
US6101533A (en) * | 1998-04-15 | 2000-08-08 | Unisys Corporation | Multiple interface data communication system and method utilizing multiple connection library interfaces with buffer and lock pool sharing |
US6201817B1 (en) | 1998-05-28 | 2001-03-13 | 3Com Corporation | Memory based buffering for a UART or a parallel UART like interface |
US6321252B1 (en) | 1998-07-17 | 2001-11-20 | International Business Machines Corporation | System and method for data streaming and synchronization in multimedia groupware applications |
US6425017B1 (en) | 1998-08-17 | 2002-07-23 | Microsoft Corporation | Queued method invocations on distributed component applications |
US6161160A (en) | 1998-09-03 | 2000-12-12 | Advanced Micro Devices, Inc. | Network interface device architecture for storing transmit and receive data in a random access buffer memory across independent clock domains |
US7046625B1 (en) | 1998-09-30 | 2006-05-16 | Stmicroelectronics, Inc. | Method and system for routing network-based data using frame address notification |
US6717910B1 (en) | 1998-09-30 | 2004-04-06 | Stmicroelectronics, Inc. | Method and apparatus for controlling network data congestion |
US6269413B1 (en) | 1998-10-30 | 2001-07-31 | Hewlett Packard Company | System with multiple dynamically-sized logical FIFOs sharing single memory and with read/write pointers independently selectable and simultaneously responsive to respective read/write FIFO selections |
US6637020B1 (en) * | 1998-12-03 | 2003-10-21 | International Business Machines Corporation | Creating applications within data processing systems by combining program components dynamically |
US6405276B1 (en) | 1998-12-10 | 2002-06-11 | International Business Machines Corporation | Selectively flushing buffered transactions in a bus bridge |
US7233977B2 (en) | 1998-12-18 | 2007-06-19 | Emc Corporation | Messaging mechanism employing mailboxes for inter processor communications |
US6279050B1 (en) | 1998-12-18 | 2001-08-21 | Emc Corporation | Data transfer apparatus having upper, lower, middle state machines, with middle state machine arbitrating among lower state machine side requesters including selective assembly/disassembly requests |
US6314478B1 (en) * | 1998-12-29 | 2001-11-06 | Nec America, Inc. | System for accessing a space appended to a circular queue after traversing an end of the queue and upon completion copying data back to the queue |
US6549934B1 (en) | 1999-03-01 | 2003-04-15 | Microsoft Corporation | Method and system for remote access to computer devices via client managed server buffers exclusively allocated to the client |
US6757698B2 (en) | 1999-04-14 | 2004-06-29 | Iomega Corporation | Method and apparatus for automatically synchronizing data from a host computer to two or more backup data storage locations |
US6321225B1 (en) | 1999-04-23 | 2001-11-20 | Microsoft Corporation | Abstracting cooked variables from raw variables |
US7007099B1 (en) | 1999-05-03 | 2006-02-28 | Lucent Technologies Inc. | High speed multi-port serial-to-PCI bus interface |
GB2349717A (en) | 1999-05-04 | 2000-11-08 | At & T Lab Cambridge Ltd | Low latency network |
US8346971B2 (en) | 1999-05-04 | 2013-01-01 | At&T Intellectual Property I, Lp | Data transfer, synchronising applications, and low latency networks |
US6757744B1 (en) * | 1999-05-12 | 2004-06-29 | Unisys Corporation | Distributed transport communications manager with messaging subsystem for high-speed communications between heterogeneous computer systems |
WO2001029653A1 (en) | 1999-10-15 | 2001-04-26 | Iona Technologies, Inc. | A system and method for dynamically demarshaling a data stream in a distributed object environment |
ES2320724T3 (en) | 2009-05-28 | Systems and methods for dynamic bandwidth management on a per-subscriber basis in a communications network |
US6567953B1 (en) | 2000-03-29 | 2003-05-20 | Intel Corporation | Method and apparatus for host-based validating of data transferred between a device and a host |
US6757398B2 (en) | 2001-01-19 | 2004-06-29 | Conexant Systems, Inc. | System and method for sensing a sound system |
GB0221464D0 (en) | 2002-09-16 | 2002-10-23 | Cambridge Internetworking Ltd | Network interface and protocol |
KR100985036B1 (en) | 2002-12-20 | 2010-10-04 | 코닌클리케 필립스 일렉트로닉스 엔.브이. | More user friendly time-shift buffer |
US7689738B1 (en) | 2003-10-01 | 2010-03-30 | Advanced Micro Devices, Inc. | Peripheral devices and methods for transferring incoming data status entries from a peripheral to a host |
US6963946B1 (en) | 2003-10-01 | 2005-11-08 | Advanced Micro Devices, Inc. | Descriptor management systems and methods for transferring data between a host and a peripheral |
US7325122B2 (en) * | 2004-02-20 | 2008-01-29 | International Business Machines Corporation | Facilitating inter-DSP data communications |
GB0404696D0 (en) * | 2004-03-02 | 2004-04-07 | Level 5 Networks Ltd | Dual driver interface |
US7337248B1 (en) | 2004-03-19 | 2008-02-26 | Sun Microsystems, Inc. | Adaptive synchronization method for communication in storage systems |
JP2006099853A (en) | 2004-09-29 | 2006-04-13 | Hitachi Global Storage Technologies Netherlands Bv | Recording and reproducing device |
US7802031B2 (en) * | 2005-05-18 | 2010-09-21 | Qlogic, Corporation | Method and system for high speed network application |
CN101072227A (en) | 2006-05-11 | 2007-11-14 | 华为技术有限公司 | Sending system, method and receiving system for video broadcasting system |
-
2005
- 2005-08-05 US US11/198,260 patent/US8346971B2/en not_active Expired - Lifetime
- 2005-08-05 US US11/198,252 patent/US8073994B2/en not_active Expired - Fee Related
- 2005-08-05 US US11/198,043 patent/US20060034275A1/en not_active Abandoned
-
2008
- 2008-04-18 US US12/105,412 patent/US8423675B2/en not_active Expired - Fee Related
-
2012
- 2012-10-18 US US13/654,876 patent/US8725903B2/en not_active Expired - Fee Related
-
2013
- 2013-03-13 US US13/802,400 patent/US8843655B2/en not_active Expired - Fee Related
-
2014
- 2014-09-22 US US14/492,800 patent/US9769274B2/en not_active Expired - Fee Related
Patent Citations (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3701972A (en) * | 1969-12-16 | 1972-10-31 | Computer Retrieval Systems Inc | Data processing system |
US4429387A (en) * | 1982-02-05 | 1984-01-31 | Siemens Corporation | Special character sequence detection circuit arrangement |
US4644529A (en) * | 1985-08-02 | 1987-02-17 | Gte Laboratories Incorporated | High-speed switching processor for a burst-switching communications system |
US5513177A (en) * | 1986-09-16 | 1996-04-30 | Hitachi, Ltd. | Distributed switching system having at least one module |
US4866664A (en) * | 1987-03-09 | 1989-09-12 | Unisys Corporation | Intercomputer communication control apparatus & method |
US4993022A (en) * | 1988-02-12 | 1991-02-12 | Hitachi, Ltd. | System for and method of multiplexing speech signals utilizing priority signals for individual transmission channels |
US4977582A (en) * | 1988-03-31 | 1990-12-11 | At&T Bell Laboratories | Synchronization of non-continuous digital bit streams |
US5513320A (en) * | 1990-04-27 | 1996-04-30 | National Semiconductor Corporation | Transmit data descriptor structure in a media access control/host system interface that implements flexible transmit data descriptor structure unit |
US5487152A (en) * | 1990-04-27 | 1996-01-23 | National Semiconductor Corporation | Method and apparatus for frame header splitting in a media access control/host system interface unit |
US5488724A (en) * | 1990-05-29 | 1996-01-30 | Advanced Micro Devices, Inc. | Network controller with memory request and acknowledgement signals and a network adapter therewith |
US5363484A (en) * | 1990-11-13 | 1994-11-08 | International Business Machines Corporation | Multiple computer system with combiner/memory interconnection system employing separate direct access link for transferring information packets |
US5296936A (en) * | 1991-07-22 | 1994-03-22 | International Business Machines Corporation | Communication apparatus and method for transferring image data from a source to one or more receivers |
US5179556A (en) * | 1991-08-02 | 1993-01-12 | Washington University | Bandwidth management and congestion control scheme for multicast ATM networks |
US5787251A (en) * | 1992-12-21 | 1998-07-28 | Sun Microsystems, Inc. | Method and apparatus for subcontracts in distributed processing systems |
US5438640A (en) * | 1993-07-16 | 1995-08-01 | Sumitomo Electric Industries, Ltd. | Optical waveguide device for receiving functional component |
US5586273A (en) * | 1994-08-18 | 1996-12-17 | International Business Machines Corporation | HDLC asynchronous to synchronous converter |
US5740372A (en) * | 1994-09-26 | 1998-04-14 | Oki Electric Industry Co., Ltd. | Circuit which detects a signal in successive frames by feeding back look up data with one frame delay as addresses to look up memory |
US5764895A (en) * | 1995-01-11 | 1998-06-09 | Sony Corporation | Method and apparatus for directing data packets in a local area network device having a plurality of ports interconnected by a high-speed communication bus |
US5727154A (en) * | 1995-04-28 | 1998-03-10 | Fry; Shawn C. | Program synchronization on first and second computers by determining whether information transmitted by first computer is an acceptable or unacceptable input to second computer program |
US5778175A (en) * | 1995-12-22 | 1998-07-07 | Digital Equipment Corporation | Method implemented by a computer network adapter for autonomously adjusting a transmit commencement threshold valve upon concurrence of an underflow condition |
US5862346A (en) * | 1996-06-28 | 1999-01-19 | Metadigm | Distributed group activity data network system and corresponding method |
US6920507B1 (en) * | 1996-06-28 | 2005-07-19 | Metadigm Llc | System and corresponding method for providing redundant storage of a data file over a computer network |
US6470398B1 (en) * | 1996-08-21 | 2002-10-22 | Compaq Computer Corporation | Method and apparatus for supporting a select () system call and interprocess communication in a fault-tolerant, scalable distributed computer environment |
US6260073B1 (en) * | 1996-12-30 | 2001-07-10 | Compaq Computer Corporation | Network switch including a switch manager for periodically polling the network ports to determine their status and controlling the flow of data between ports |
US6262976B1 (en) * | 1998-09-17 | 2001-07-17 | Ordered Networks, Inc. | System and method for network flow optimization using traffic classes |
US6356962B1 (en) * | 1998-09-30 | 2002-03-12 | Stmicroelectronics, Inc. | Network device and method of controlling flow of data arranged in frames in a data-based network |
US6907473B2 (en) * | 1998-10-30 | 2005-06-14 | Science Applications International Corp. | Agile network protocol for secure communications with assured system availability |
US6954923B1 (en) * | 1999-01-28 | 2005-10-11 | Ati International Srl | Recording classification of instructions executed by a computer |
US6792085B1 (en) * | 1999-09-10 | 2004-09-14 | Comdial Corporation | System and method for unified messaging with message replication and synchronization |
US6757390B2 (en) * | 2000-03-28 | 2004-06-29 | Seiko Instruments Inc. | Wristwatch type wireless telephone |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040215442A1 (en) * | 2003-04-24 | 2004-10-28 | International Business Machines Corporation | Method and apparatus to use clock bursting to minimize command latency in a logic simulation hardware emulator / accelerator |
US7716036B2 (en) * | 2003-04-24 | 2010-05-11 | International Business Machines Corporation | Method and apparatus to use clock bursting to minimize command latency in a logic simulation hardware emulator / accelerator |
US8700771B1 (en) * | 2006-06-26 | 2014-04-15 | Cisco Technology, Inc. | System and method for caching access rights |
US20080028103A1 (en) * | 2006-07-26 | 2008-01-31 | Michael Steven Schlansker | Memory-mapped buffers for network interface controllers |
US9137179B2 (en) * | 2006-07-26 | 2015-09-15 | Hewlett-Packard Development Company, L.P. | Memory-mapped buffers for network interface controllers |
US7725556B1 (en) | 2006-10-27 | 2010-05-25 | Hewlett-Packard Development Company, L.P. | Computer system with concurrent direct memory access |
US20080239954A1 (en) * | 2007-03-26 | 2008-10-02 | Bigfoot Networks, Inc. | Method and system for communication between nodes |
US8687487B2 (en) * | 2007-03-26 | 2014-04-01 | Qualcomm Incorporated | Method and system for communication between nodes |
US20180150123A1 (en) * | 2016-11-28 | 2018-05-31 | Qualcomm Incorporated | Wifi memory power minimization |
US10539996B2 (en) * | 2016-11-28 | 2020-01-21 | Qualcomm Incorporated | WiFi memory power minimization |
Also Published As
Publication number | Publication date |
---|---|
US8346971B2 (en) | 2013-01-01 |
US8073994B2 (en) | 2011-12-06 |
US20130290558A1 (en) | 2013-10-31 |
US9769274B2 (en) | 2017-09-19 |
US8843655B2 (en) | 2014-09-23 |
US20150081925A1 (en) | 2015-03-19 |
US8423675B2 (en) | 2013-04-16 |
US20050289238A1 (en) | 2005-12-29 |
US20130041930A1 (en) | 2013-02-14 |
US8725903B2 (en) | 2014-05-13 |
US20080228946A1 (en) | 2008-09-18 |
US20060029053A1 (en) | 2006-02-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9769274B2 (en) | | Data transfer, synchronising applications, and low latency networks |
EP1302854B1 (en) | | Asynchronous data transfer |
EP1358562B1 (en) | | Method and apparatus for controlling flow of data between data processing systems via a memory |
US7281030B1 (en) | | Method of reading a remote memory |
US6747949B1 (en) | | Register based remote data flow control |
US7409468B2 (en) | | Controlling flow of data between data processing systems via a memory |
JP4317365B2 (en) | | Method and apparatus for transferring interrupts from a peripheral device to a host computer system |
US20040034718A1 (en) | | Prefetching of receive queue descriptors |
EP2383658B1 (en) | | Queue depth management for communication between host and peripheral device |
US6880047B2 (en) | | Local emulation of data RAM utilizing write-through cache hardware within a CPU module |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |