WO2015078219A1 - Information caching method, apparatus and communication device - Google Patents


Publication number
WO2015078219A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2014/086497
Other languages
English (en)
French (fr)
Inventor
彭胜勇
程子明
石仔良
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2015078219A1 publication Critical patent/WO2015078219A1/zh

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00: Network arrangements or protocols for supporting network services or applications
    • H04L 67/50: Network services
    • H04L 67/56: Provisioning of proxy services
    • H04L 67/568: Storing data temporarily at an intermediate stage, e.g. caching

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to an information caching method, apparatus, and communication device.
  • RDMA: Remote Direct Memory Access
  • CPU: central processing unit
  • A server in an RDMA system includes a CPU, a storage module such as a dual in-line memory module (DIMM), and a host channel adapter (HCA); communication between servers is achieved through cables interconnecting the HCAs.
  • DIMM: dual in-line memory module
  • HCA: host channel adapter
  • The HCA in one server can obtain, through the CPU, the data in the storage module and send it to the HCA of another server, and the HCA of the other server stores the received data into its storage module through its CPU.
  • The CPU is only responsible for writing data to the storage module and writing the transmission task to the send queue; the control processing of the data transmission protocol, such as parsing data messages and encapsulating data messages and response messages, is executed by the HCA without CPU participation, which saves a large amount of CPU processing power and reduces the CPU load.
  • Embodiments of the present invention provide an information caching method, apparatus, and communication device, which reduce frequent operations between a processor in a communication device and a module having an RDMA function.
  • A first aspect of the embodiments of the present invention provides an information caching method, which is applied to a remote direct memory access (RDMA) module included in a communication device, where the method includes:
  • the associated data of the queue pair and the priority information of the associated data are correspondingly stored in a cache unit of the RDMA module.
  • the determining the priority information of the associated data of the queue pair includes:
  • the priority information of the associated data is determined in a service level field or a custom field of a queue pair context of the queue pair.
  • Storing the associated data of the queue pair and the priority information of the associated data correspondingly in a cache unit of the RDMA module includes:
  • if the priority of the non-idle cache units is the same as the priority of the associated data of the queue pair, a cache unit is selected from the non-idle cache units as the third cache unit according to the preset policy, and the associated data of the queue pair and the priority information of the associated data replace the information in the third cache unit;
  • the priority of the cache unit is consistent with the priority of the associated data stored in the cache unit.
  • the method further includes:
  • when the queue pair is logged off, setting the cache unit in the RDMA module as an idle cache unit.
  • the cache unit includes a label domain and a content domain, the label domain is configured to store an identifier of the queue pair and priority information of the associated data, and the content domain is configured to store associated data of the queue pair.
  • the associated data includes any one or more of the following: a queue context of the queue pair, a memory translation protection table of the transmitted data, and a completion queue context of the queue pair;
  • the queue context, the memory translation protection table, and the completion queue context are stored in a preset order.
  • the cache unit stores one or more pieces of associated data, and the method further includes: updating any one or more pieces of the associated data in the cache unit.
  • a second aspect of the embodiments of the present invention provides an information cache apparatus, including:
  • An association data acquiring unit configured to acquire associated data of a queue pair that the communication device transmits data
  • a priority determining unit configured to determine priority information of the associated data of the queue pair
  • a storage unit configured to store, in the cache unit of the information cache device, the associated data of the queue pair acquired by the associated data acquiring unit and the priority information of the associated data determined by the priority determining unit.
  • the priority determining unit is specifically configured to determine the association in a service level field or a custom field of a queue pair context of the queue pair Priority information for the data.
  • the storage unit includes:
  • a first storage unit configured to select an idle cache unit as the first cache unit in the information cache device, and store the associated data of the queue pair and the priority information of the associated data to the selected first In a cache unit;
  • a second storage unit configured to: if there is no free cache unit in the information cache device, select from the non-idle cache units a cache unit having a lower priority than the associated data of the queue pair as the second cache unit, and replace the information in the second cache unit with the associated data of the queue pair and the priority information of the associated data;
  • a third storage unit configured to select a cache unit according to a preset policy in the non-idle cache unit if the priority of the non-idle cache unit is the same as the priority of the associated data of the queue pair a third cache unit, and replacing the information in the third cache unit with the associated data of the queue pair and the priority information of the associated data;
  • the priority of the cache unit is consistent with the priority of the associated data stored in the cache unit.
  • the third possible implementation manner of the second aspect of the embodiments of the present invention further includes:
  • a logout unit configured to: when the queue pair is logged off, set a cache unit in the information cache device to an idle cache unit.
  • the cache unit includes a label domain and a content domain, the label domain is configured to store an identifier of the queue pair and priority information of the associated data, and the content domain is configured to store associated data of the queue pair.
  • the associated data includes at least one of: a queue context of the queue pair, a memory translation protection table of the transmission data, and a completion queue context of the queue pair;
  • the queue context, the memory translation protection table, and the completion queue context are stored in a preset order.
  • the associated data of the cache unit includes one or more, and the device further includes:
  • an update unit configured to update any one or more of the associated data in the cache unit.
  • A third aspect of the embodiments of the present invention further provides a communications device, including a processor, a remote direct memory access (RDMA) module, and a storage module;
  • the RDMA module is connected to the processor, and is an information cache device according to the second aspect of the embodiment of the present invention, or any possible implementation manner of the first to sixth possible implementation manners of the second aspect.
  • When the communication device including the RDMA module transmits data in the form of a queue pair, and the associated data of the queue pair is needed for the first time, the RDMA module acquires the associated data of the queue pair and stores it, together with the priority information of the associated data, in the cache unit of the RDMA module. In this way, if any associated data is needed later, the RDMA module does not need to obtain it from the storage module through the processor interface connected to it, but obtains it directly from the cache unit, thereby avoiding frequent interaction between the RDMA module and the storage module.
  • Further, the RDMA module caches the associated data of the queue pair according to its priority level, so that when the buffer space in the RDMA module is limited, the associated data of high-priority queue pairs is cached preferentially.
  • FIG. 1 is a schematic structural diagram of a communication device according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of an information caching method according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a buffer unit in an RDMA module included in a communication device according to an embodiment of the present invention
  • FIG. 5 is a schematic structural diagram of an HCA card in a communication device in an application embodiment of the present invention.
  • FIG. 6 is a flowchart of an operation of a request cache unit executed by a cache management module included in an HCA card in a communication device according to an embodiment of the present invention
  • FIG. 7 is a flowchart of a read operation performed by a cache management module included in an HCA card in a communication device according to an embodiment of the present invention
  • FIG. 8 is a flowchart of a write operation performed by a cache management module included in an HCA card in a communication device according to an embodiment of the present invention;
  • FIG. 9 is a schematic structural diagram of an information cache apparatus according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of another information cache apparatus according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of a communication device according to an embodiment of the present invention.
  • the embodiment of the invention provides an information caching method, which is mainly for information caching in the process of transmitting data by the communication device as shown in FIG. 1 .
  • the communication device includes a processor (such as a CPU), a storage module (such as a DIMM), and an RDMA module (such as an HCA card), wherein the processor is mainly responsible for writing data to the storage module and writing a task of transmitting data to the sending queue.
  • the RDMA module can be connected to the processor through a Peripheral Component Interconnect Express (PCIE), which is mainly responsible for the control processing of the data transmission protocol, such as parsing data packets, encapsulating data packets, and responding data packets.
  • PCIE Peripheral Component Interconnect Express
  • Communication devices can be interconnected by cables between RDMA modules, which can be Ethernet cables or Infiniband network cables, depending on the port type of the RDMA module.
  • the method in this embodiment is a method performed by an RDMA module in a communication device, and the flowchart is as shown in FIG. 2, and includes:
  • Step 101: Acquire the associated data of a queue pair (QP) used by the communication device to transmit data, where one queue pair may include a send queue (SQ) and a receive queue (RQ), and the associated data of the queue pair is the configuration information of the queue pair, which can include information needed in the process of transmitting data, such as the queue pair context (QPC), the completion queue context (CQC), and the memory translation protection table (MTPT) of the data.
  • the associated data may further include information such as a shared receive queue (SRQ).
  • The processor in the communication device first creates a queue pair and sets the associated data of the queue pair, such as the queue pair context, the MTPT table of the data, and the completion queue context; the MTPT table stores the correspondence between the physical addresses and the logical addresses of the data stored in the storage module of the communication device. Then the processor writes the data to be transferred to the storage module. Finally, the processor writes content such as the type of the data transfer (such as a write operation), the start address of the transmitted data, and the length of the transmitted data to the send queue of the queue pair.
  • When setting the associated data of the queue pair, the processor also needs to set, in the queue pair context, the priority information indicating the priority of the queue pair.
  • For this, the service level field of the queue pair context may be used.
  • the priority information can be set by the user according to requirements, such as setting a higher priority for the queue pair of the task that the user cares about.
  • The associated data of a queue pair created by the processor of the communication device does not vary with the transmitted data; that is, after the queue pair is created for the first time and its associated data is initialized, the associated data does not change during the data transmission process, although the associated data of different queue pairs differs.
  • After the processor creates the queue pair, it notifies the RDMA module that a task needs to be executed and provides the identifier, such as the sequence number, of the queue pair corresponding to the task. In this way, the RDMA module can obtain the associated data of the queue pair from the storage module through the PCIE interface and the processor according to the identifier of the queue pair, and then obtain the physical address of the data to be transmitted from the associated data; after obtaining the data to be transmitted from the storage module through the processor according to the physical address, the RDMA module encapsulates the data into an RDMA message and sends it through the interface between the RDMA module and other communication devices.
  • the RDMA module can obtain all the associated data of the queue pair at one time, and can also obtain partial associated data, and can store the associated data in the cache unit of the RDMA module according to the following steps 102 and 103.
  • Step 102: Determine the priority information of the associated data of the queue pair, where the priority information may be determined from the service level field or a custom field included in the queue pair context of the acquired associated data.
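As a concrete illustration of Step 102, the sketch below derives a priority from a queue pair context. The dictionary representation and the field names (`custom_priority`, `service_level`) are illustrative assumptions, not the patent's concrete encoding.

```python
def determine_priority(qpc: dict) -> int:
    """Return the priority of the associated data for one queue pair.

    Prefer an explicit custom field if the user set one; otherwise fall
    back to the service level (SL) field of the queue pair context.
    """
    if "custom_priority" in qpc:          # user-defined field, if present
        return qpc["custom_priority"]
    return qpc.get("service_level", 0)    # SL field; 0 = lowest priority
```

A queue pair whose task the user cares about would simply carry a larger value in whichever field is used.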
  • Step 103: Store the associated data of the queue pair and the priority information of the associated data correspondingly in the cache unit of the RDMA module.
  • The RDMA module may first allocate a buffer space sufficient to store all associated data of the queue pair as a cache unit, and can set the cache unit to include a tag domain and a content domain.
  • The tag domain is used to store the identifier of the queue pair and the priority information of the associated data, and may also store an identifier of the tag domain and a valid bit, where the identifier of the tag domain can uniquely determine a cache unit, and the valid bit indicates whether the cache unit is in an idle state.
  • A cache unit is idle if it has not been used or its content has been invalidated after use; otherwise the cache unit is in a non-idle state. The content domain is used to store the associated data of the queue pair; the RDMA module can number the pieces of associated data so that each can be referred to, and the size of the content domain is determined by the size of the associated data, where one type of associated data can be referred to as a member of the cache unit.
  • The RDMA module may specify a storage order for each piece of associated data in the content domain, so that the associated data of the queue pair, such as the queue pair context, the memory translation protection table, and the completion queue context, are stored in a preset order. For example, in FIG. 3, the queue pair context, the completion queue context, and the memory translation protection table are stored in that order.
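The tag-domain/content-domain layout and the preset member order described above can be sketched as follows. The Python representation, field names, and `write` helper are illustrative assumptions, not the patent's concrete hardware layout.

```python
from dataclasses import dataclass, field

MEMBER_ORDER = ("QPC", "CQC", "MTPT")   # preset member order, as in FIG. 3

@dataclass
class CacheUnit:
    qp_id: int = -1          # tag domain: identifier of the queue pair
    priority: int = 0        # tag domain: priority of the associated data
    valid: bool = False      # tag domain: False means the unit is idle
    content: dict = field(default_factory=dict)  # content domain, by member

    def write(self, qp_id, priority, associated_data):
        """Store a queue pair's associated data, members in preset order."""
        self.qp_id, self.priority, self.valid = qp_id, priority, True
        self.content = {m: associated_data[m] for m in MEMBER_ORDER
                        if m in associated_data}
```

Keeping the members in a fixed order is what later lets a reader compute a member's offset inside the content domain instead of searching for it.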
  • When the communication device including the RDMA module transmits data in the form of a queue pair, and the associated data of the queue pair is needed for the first time, the RDMA module acquires the associated data of the queue pair and stores it, together with the priority information of the associated data, in the cache unit of the RDMA module. In this way, if any associated data is needed later, the RDMA module does not need to obtain it from the storage module through the processor interface connected to it, but obtains it directly from the cache unit, thereby avoiding frequent interaction between the RDMA module and the storage module.
  • Further, the RDMA module caches the associated data of the queue pair according to its priority level, so that when the buffer space in the RDMA module is limited, the associated data of high-priority queue pairs is cached preferentially.
  • When the RDMA module performs the foregoing step 103, it may specifically do so through the following steps:
  • Step 201: Determine whether there is an idle cache unit in the RDMA module. If yes, perform step 202; if not, continue with the determination in step 203.
  • When the RDMA module determines whether there is a free cache unit, it can specifically check the valid bit in the tag domain of each cache unit, which indicates whether the cache unit is idle.
  • Step 202: Select an idle cache unit in the RDMA module as the first cache unit, and store the associated data of the queue pair and the priority information of the associated data in the first cache unit.
  • Step 203: Determine whether, among the non-idle cache units, there is a cache unit having a lower priority than the associated data of the queue pair acquired in the above step 101. If yes, perform step 204; if not, the priority of every non-idle cache unit is the same as or higher than the priority of the associated data of the queue pair.
  • If the priority of a non-idle cache unit is the same as the priority of the associated data of the queue pair, the RDMA module performs step 205. If the priority of every non-idle cache unit is higher than the priority of the associated data of the queue pair, the information in the cache units cannot be replaced, and in this case the RDMA module does not cache the acquired associated data.
  • the priority of the cache unit is the same as the priority of the associated data stored in the cache unit.
  • Step 204: Select a cache unit having a lower priority than the associated data of the queue pair as the second cache unit, and replace the information in the second cache unit with the associated data of the queue pair and the priority information of the associated data.
  • Step 205: Select a cache unit among the non-idle cache units according to a preset policy as the third cache unit, and replace the information in the third cache unit with the associated data of the queue pair and the priority information of the associated data.
  • the preset policies may include algorithms such as Least Recently Used (LRU).
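Steps 201 through 205 amount to a priority-aware replacement policy: take a free unit if one exists, otherwise evict a strictly lower-priority unit, otherwise evict among equal-priority units by LRU, and refuse to cache when every unit outranks the new data. A minimal sketch follows; the cache-unit attributes (`valid`, `priority`, `last_used`) are assumptions for illustration.

```python
def select_cache_unit(units, new_priority):
    """Pick a victim cache unit for new associated data, or None."""
    idle = [u for u in units if not u.valid]
    if idle:                                   # step 202: any idle unit
        return idle[0]
    lower = [u for u in units if u.priority < new_priority]
    if lower:                                  # step 204: lower-priority victim
        return lower[0]
    same = [u for u in units if u.priority == new_priority]
    if same:                                   # step 205: LRU among equals
        return min(same, key=lambda u: u.last_used)
    return None                                # all higher priority: don't cache
```

Returning `None` in the last branch corresponds to the RDMA module declining to cache the acquired associated data.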
  • In this way, the RDMA module can ensure that the associated data of higher-priority queue pairs is cached.
  • Since the RDMA module allocates the resources of a cache unit, when the unit is first used, according to the size of the associated data of the queue pair to be stored, the granularity of a non-idle cache unit in the RDMA module is the same as the size of the associated data it currently stores. Therefore, in the above steps 204 and 205, the RDMA module needs to select a cache unit whose storage space is large enough for, yet not wasted on, the information acquired in the above steps 101 and 102; that is, the size of the selected cache unit equals the overall size of the associated data to be stored and the priority information of the associated data, so that when the information is replaced, the associated data of the queue pair and the priority information of the associated data can be completely stored in the selected cache unit.
  • After the associated data of the queue pair and its priority information are first stored in the cache unit, some of the associated data may change during subsequent data transmission, and the RDMA module can update the associated data stored in the cache unit, where one or more pieces of associated data can be modified. When the queue pair is logged out, in order to improve the utilization of the cache units in the RDMA module, the RDMA module can set the above cache unit as an idle cache unit, invalidating the data stored in it, so that the cache unit can store the associated data of a queue pair of any priority; that is, the data in the cache unit can be replaced by any associated data.
  • The logout of the queue pair may be initiated by a logout command sent by the driver of the RDMA module included in the communication device. When the RDMA module receives the logout command, it can log out the corresponding queue pair; the logout command initiated by the driver module can be triggered by the user calling the driver of the RDMA module to perform the queue-pair logout function.
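The update and logout handling described above might look like the following sketch; the dict-based cache-unit representation is an assumption for illustration.

```python
def update_member(unit, member, value):
    """Overwrite one piece of associated data in a non-idle cache unit."""
    if unit["valid"] and member in unit["content"]:
        unit["content"][member] = value
        return True
    return False

def release_unit(unit):
    """On queue-pair logout: mark the unit idle so any priority may reuse it."""
    unit["valid"] = False
```

Clearing only the valid bit is enough: once a unit is idle, its stale contents can be replaced by the associated data of any queue pair, regardless of priority.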
  • The following describes the information caching method provided by the embodiment of the present invention in a specific embodiment.
  • the method is mainly applied to the communication device shown in FIG. 1.
  • the processor in the communication device is a CPU
  • the RDMA module is an HCA card.
  • The HCA card may be implemented as a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC).
  • the HCA card and the CPU are connected through a PCIE interface.
  • the structure of the HCA card in this embodiment may be as shown in FIG. 5, and includes a PCIE interface, a protocol engine, a transceiver interface, a queue management module, and a cache management module, where:
  • the PCIE interface is an interface between the HCA card and the CPU.
  • Through the PCIE interface, the HCA card can read the associated data of the queue pair from the storage module, and can read and write the data to be transmitted.
  • the protocol engine is used to process the tasks delivered by the PCIE interface and the RDMA packets received from the cable, and terminate the RDMA protocol.
  • When the protocol engine receives a task delivered by the PCIE interface, it analyzes the task and reads the associated data of the corresponding queue pair from the queue management module, requests the data to be transmitted through the PCIE interface according to the read associated data, constructs a header according to the RDMA protocol, and encapsulates the data to be transmitted into a complete RDMA packet, which is passed to the transceiver interface and sent onto the cable.
  • When the protocol engine receives an RDMA packet from the transceiver interface, it analyzes the packet header, obtains the corresponding associated data from the queue management module according to the sequence number of the destination queue pair identified in the header, and writes the transmitted data to the storage module of the communication device through the PCIE interface; at the same time, the protocol engine sends a response message to the transceiver interface, or reads the target data from the storage module through the PCIE interface and then constructs a read response message, which is passed to the transceiver interface and sent onto the cable.
  • the transceiver interface is a connection interface with an HCA card in other communication devices.
  • The transceiver interface is used to implement conversion between logical messages on the protocol engine and physical signals conforming to the electrical rules of the line, thereby enabling communication with other communication devices.
  • The queue management module is configured to obtain the associated data of the required queue pair from the cache management module. If the associated data does not exist in the HCA card, the queue management module needs to obtain the associated data from the storage module through the protocol engine and the PCIE interface, and request the cache management module to store the associated data in a cache unit.
  • the cache management module is configured to respond to commands issued by the queue management module for requesting, searching, releasing, reading, and writing to the cache unit.
  • A command requesting a cache unit may be sent to the cache management module.
  • the cache management module determines whether to allow caching of the associated data according to an internal algorithm, and if so, may return the identifier of the cache unit, such as a sequence number, to the queue management module; otherwise, returning an illegal value indicates that the associated data is not allowed to be cached.
  • When the queue management module needs to provide associated data to the protocol engine, it first sends a search command to the cache management module to confirm whether corresponding associated data exists. In another case, when the queue management module receives a queue-pair logout operation sent by the PCIE interface, it also issues a search command to the cache management module to confirm whether there is a cache unit corresponding to the queue pair. If one exists, the cache management module returns the sequence number of the cache unit that stores the associated data; if not, it returns an invalid sequence number.
  • When the queue management module receives the queue-pair logout operation and has found the cache unit corresponding to the queue pair, it sends a command to release the cache unit to the cache management module. When a cache unit holds the associated data of a queue pair that the queue management module needs to use, the queue management module may send a command to read the information in the cache unit to the cache management module to obtain the corresponding associated data. When the cache management module allows the associated data of a certain queue pair to be stored and gives the sequence number of a cache unit, the queue management module can write the associated data, through the cache management module, to the cache unit with the given sequence number.
  • the CPU in the communication device A writes the data to be transmitted to the storage module, and writes the transferred task to the transmission queue of the storage module, and notifies the HCA card to perform the corresponding task through the doorbell.
  • the doorbell may be a notification message, and the sequence number of the queue pair may be included in the notification message.
  • The PCIE interface in the HCA card passes the received doorbell to the protocol engine, and the protocol engine parses the doorbell to obtain the sequence number of the queue pair included in it, and obtains, through the PCIE interface, the associated data of the queue pair from the storage module of communication device A, which may include at least one of the following: the queue pair context of the queue pair, the MTPT table of the data to be transmitted, and the completion queue context of the queue pair.
  • In step B1, the protocol engine in the HCA card can trigger the queue management module to send a request for a cache unit to the cache management module, carrying the sequence number and priority information of the queue pair in the command. If the associated data is stored in a cache unit of the HCA card according to the structure of FIG. 3, after the cache management module receives the command, the operation of requesting a cache unit may be performed according to the following steps.
  • the flowchart is as shown in FIG. 6, and includes:
  • Step C11: The cache management module confirms, according to the sequence number of the queue pair in the received command, whether the associated data of the queue pair is already stored in a cache unit; if yes, it returns the existing information to the queue management module, and the process ends; if not, step C12 is performed.
  • Step C12: Determine whether there is an idle cache unit, mainly by traversing the valid bit in each cache unit, which indicates whether the cache unit is idle; if yes, return the sequence number of one of the idle cache units to the queue management module; if not, perform step C13.
  • Step C13: Determine whether there is a cache unit whose priority is lower than the priority indicated by the priority information in the command received in the above step C11, specifically by traversing the priority bits in the cache units and comparing them with the priority information in the received command; if yes, perform step C14; if not, return information that caching is not allowed to the queue management module.
  • Step C14: Select a non-idle cache unit with a lower priority, return the sequence number of the cache unit to the queue management module, and allow the associated data of the queue pair to be stored in the cache unit.
  • When the queue management module receives the sequence number of the cache unit and the information that the associated data of the queue pair is allowed to be cached, it initiates a request to write to the cache unit, which may also carry the sequence number of the queue pair, the priority information, the sequence number of the cache unit, and the associated data to be stored, so that when the cache management module receives the request, it saves the priority information, the sequence number of the queue pair, and the associated data to the corresponding cache unit according to the structure shown in FIG. 3 above.
  • The protocol engine in the HCA card analyzes the queue pair context in the associated data to obtain the physical address of the send queue; after the protocol engine reads a Work Queue Element (WQE) of the send queue from the storage module of communication device A through the PCIE interface according to the physical address, it analyzes the WQE to determine the source virtual address and length of the data to be sent.
  • WQE: Work Queue Element
  • The protocol engine triggers the queue management module to initiate a read request to the cache management module, carrying in the read request the sequence number of the queue pair and information indicating the member of the associated data that needs to be read, such as the MTPT table.
  • the read operation may be performed as follows. The flowchart is as shown in FIG. 7, and includes:
Step F11: After receiving the read request, the cache management module determines, according to the sequence number of the queue pair, whether a cache unit corresponding to the queue pair exists. If not, it returns non-existence information to the queue management module; if so, it performs step F12.

Step F12: The cache management module finds the corresponding cache unit according to the sequence number of the queue pair, and calculates the offset and read length of the member within the cache unit according to the member information carried in the read request, thereby reading the member.

Step F13: The cache management module returns the member read from the cache unit, that is, the MTPT table, to the protocol engine through the queue management module.
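The offset calculation in step F12 follows directly from the preset member order in the content domain. The sketch below assumes illustrative member names and sizes (the patent does not specify byte sizes); only the mechanism — summing the sizes of earlier members to find the offset — reflects the text.

```python
# Minimal sketch of step F12: locating one member of the associated data
# inside a cache unit's content domain. Member order follows FIG. 3;
# the byte sizes are assumptions for illustration only.
MEMBER_ORDER = ["QPC", "CQC", "MTPT"]               # preset storage order
MEMBER_SIZE = {"QPC": 64, "CQC": 32, "MTPT": 128}   # bytes, assumed

def member_offset(member):
    """Return (offset, length) of `member` inside the content domain."""
    off = 0
    for name in MEMBER_ORDER:
        if name == member:
            return off, MEMBER_SIZE[name]
        off += MEMBER_SIZE[name]
    raise KeyError(member)

def read_member(content, member):
    """Read one member (e.g. the MTPT table) out of the content domain."""
    off, length = member_offset(member)
    return content[off:off + length]
```

Because members are stored in a fixed order, a single member such as the MTPT table can be read without touching the rest of the associated data.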
Step G1: After the protocol engine obtains the MTPT table, because the mapping between the physical addresses and virtual addresses of the data to be transmitted is recorded in the MTPT table, the protocol engine determines the physical address of the data to be transmitted according to the source virtual address obtained in step E1 and the MTPT table, and then obtains the data to be transmitted from the storage module of communication device A through the PCIE interface according to that physical address and the length of the data to be transmitted obtained in step E1.
Step H1: The protocol engine encapsulates the data to be transmitted into an RDMA packet and transmits the RDMA packet from the transceiver interface onto the cable connected to communication device B. The RDMA packet may include the following information: the transmitted data, the operation type (a write operation in this embodiment), the sequence number of the destination queue pair to be operated on, and the destination virtual address.
(2) Communication device B receives the data to be transmitted

Step A2: After the transceiver interface included in the HCA card of communication device B receives the RDMA packet, it passes the packet to the protocol engine, which analyzes it to obtain the transmitted data, the operation type (a write operation in this embodiment), the sequence number of the destination queue pair to be operated on, and the destination virtual address. The destination queue pair to be operated on corresponds to the queue pair used when communication device A transmits the data: when communication device A sets up the queue pair for transmitting data, it can set the send queue through which communication device A transmits data and the receive queue through which communication device B receives data, so the information of the destination queue pair to be operated on is the information of the receive queue.
Step B2: According to the sequence number of the destination queue pair, the protocol engine triggers the queue management module to initiate, to the cache management module, reading of a member of the associated data of the queue pair, such as the MTPT table, or of multiple members, and the cache management module may return the read MTPT table to the protocol engine following the method shown in FIG. 7. If the associated data of the destination queue pair is not stored in any cache unit, the protocol engine obtains the associated data corresponding to the destination queue pair, including the MTPT table, from the storage module of communication device B through the PCIE interface.

It should be noted that the associated data of the queue pair in communication device B may be set before data is transmitted with communication device A.
Step C2: The protocol engine obtains the corresponding physical address according to the obtained MTPT table and the destination virtual address, and then writes the data carried in the RDMA packet into the storage module at that physical address through the PCIE interface.

Step D2: The protocol engine prepares a response packet and transmits it through the transceiver interface onto the cable connected to communication device A. The response packet may include the following information: the sequence number of the queue pair.
(3) Communication device A receives the response packet

Step A3: After the transceiver interface included in the HCA card of communication device A receives the packet, it passes it to the protocol engine, which analyzes the response packet to obtain the sequence number of the queue pair.
Step B3: The protocol engine may trigger the queue management module to initiate, to the cache management module, acquisition of the associated data corresponding to the queue pair, including the completion queue context and other information. After receiving the request to read the associated data of the queue pair, the cache management module may return the read associated data to the protocol engine following the method shown in FIG. 7.
Step C3: The protocol engine generates a completion queue element (CQE) according to the completion queue context, and optionally generates a complete event queue element (CEQE) and reports an interrupt. By polling the CQE or responding to the interrupt (when the CEQE and the interrupt are generated), the CPU learns that a completion queue entry has been generated, and the CPU terminates this data transmission task.
Step D3: After multiple data transmissions have been performed between communication devices A and B, the protocol engine may trigger the queue management module to initiate a release request to the cache management module, carrying the sequence number of the queue pair in the release request. The cache management module then finds the cache unit corresponding to the queue pair and deregisters the queue pair, that is, sets the valid bit in the cache unit to invalid, thereby setting the cache unit as an idle cache unit.
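The release in step D3 amounts to clearing the valid bit of the cache unit tagged with the queue pair's sequence number. A minimal sketch, assuming cache units are modeled as dictionaries with `valid` and `qp_seq` fields (illustrative names, not from the patent):

```python
# Hypothetical sketch of step D3: releasing the cache unit of a deregistered
# queue pair by clearing its valid bit, so the unit becomes idle and may
# later hold associated data of any priority.
def release_unit(units, qp_seq):
    """Find the unit tagged with qp_seq and mark it idle; return success."""
    for u in units:
        if u["valid"] and u["qp_seq"] == qp_seq:
            u["valid"] = False     # stored data is now invalid; unit is idle
            return True
    return False                   # no cache unit exists for this queue pair
```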
It should be noted that, during data transmission between the above two communication devices, if some associated data of a queue pair in a cache unit changes, the queue management module included in the HCA card of the communication device may initiate a write request to the cache management module, including in the write request the sequence number of the queue pair and the information of the associated data to be written. When the cache management module receives the write request, the write operation may be performed according to the following steps; the flowchart is shown in FIG. 8 and includes:

Step A4: According to the sequence number of the queue pair in the received request, the cache management module searches for a cache unit corresponding to that sequence number. If none exists, it returns non-existence information to the queue management module; if one exists, it performs step B4.

Step B4: Find the corresponding cache unit according to the sequence number of the queue pair, and calculate the offset and length of the associated data within the cache unit according to the information of the associated data to be written in the write request.

Step C4: According to the offset calculated in step B4, write the associated data to be written into the corresponding position in the cache unit.
An embodiment of the present invention further provides an information caching apparatus that supports RDMA operations; its schematic structural diagram is shown in FIG. 9 and includes:

an associated data acquiring unit 10, configured to acquire associated data of a queue pair used by the communication device to transmit data;

a priority determining unit 11, configured to determine priority information of the associated data of the queue pair, where the priority determining unit 11 may specifically determine the priority information in a service level field or a custom field of the queue pair context of the queue pair; and

a storing unit 12, configured to store the associated data of the queue pair acquired by the associated data acquiring unit 10 and the priority information of the associated data determined by the priority determining unit 11 correspondingly into the cache unit 13 of the information caching apparatus.
In this embodiment, the cache unit may include a tag domain and a content domain. The tag domain is used to store the identifier of the queue pair and the priority information of the associated data, and may also store the cache unit identifier and a valid bit; the content domain is used to store the associated data of the queue pair. The associated data includes information such as the queue context of the queue pair, the memory translate protect table of the transmitted data, and the completion queue context of the queue pair, and in the content domain the queue context, memory translate protect table, and completion queue context are stored in a preset order, for example the order shown in FIG. 3.
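The tag-domain/content-domain split described above can be rendered as a simple data structure. This is a sketch of the concepts only; the class and field names are illustrative, not an API defined by the patent.

```python
# Illustrative layout of a cache unit with a tag domain and a content domain.
from dataclasses import dataclass

@dataclass
class TagDomain:
    unit_id: int        # cache unit identifier
    qp_id: int          # identifier of the queue pair
    priority: int       # priority information of the associated data
    valid: bool         # valid bit: True means the unit is non-idle

@dataclass
class ContentDomain:
    qpc: bytes          # queue context, first in the preset order
    cqc: bytes          # completion queue context
    mtpt: bytes         # memory translate protect table

@dataclass
class CacheUnit:
    tag: TagDomain
    content: ContentDomain
```

Keeping the queue-pair identifier, priority, and valid bit in the tag domain lets the cache management module search and compare units without touching the (larger) content domain.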
It can be seen that in the information caching apparatus of this embodiment, during data transmission, when some associated data of a queue pair is needed for the first time, the associated data acquiring unit 10 acquires the associated data of the queue pair used by the communication device to transmit data, and the storing unit 12 stores it, together with the priority information of the associated data, correspondingly into the cache unit 13 of the information caching apparatus. If the associated data is needed again later, the information caching apparatus does not need to obtain it from the storage module through the interface with the processor in the communication device, but obtains it directly from the cache unit, avoiding frequent read and write operations between the information caching apparatus and the storage module. At the same time, because the associated data of one queue pair can correspond to one priority level, the information caching apparatus caches the associated data of queue pairs according to priority level, and when the cache space in the information caching apparatus is limited, the associated data of high-priority queue pairs is cached preferentially.
Referring to FIG. 10, in a specific embodiment, in addition to the structure shown in FIG. 9, the information caching apparatus may further include a deregistering unit 14 and an updating unit 15, and the storing unit 12 may be implemented by a first storing unit 120, a second storing unit 121, and a third storing unit 122, specifically:

the first storing unit 120 is configured to select an idle cache unit among the cache units 13 of the information caching apparatus as a first cache unit, and store the associated data of the queue pair acquired by the associated data acquiring unit 10 and the priority information of the associated data determined by the priority determining unit 11 into the first cache unit, where the priority of a cache unit is the same as the priority of the associated data stored in the cache unit;
the second storing unit 121 is configured to: if there is no idle cache unit among the cache units 13 of the information caching apparatus, select, among the non-idle cache units, a cache unit whose priority is lower than the priority of the associated data of the queue pair as a second cache unit, and replace the information in the second cache unit with the associated data of the queue pair acquired by the associated data acquiring unit 10 and the priority information of the associated data determined by the priority determining unit 11; and
the third storing unit 122 is configured to: if there is a non-idle cache unit whose priority is the same as the priority of the associated data of the queue pair, select a cache unit among the non-idle cache units according to a preset policy as a third cache unit, and replace the information in the third cache unit with the associated data of the queue pair acquired by the associated data acquiring unit 10 and the priority information of the associated data determined by the priority determining unit 11.
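The "preset policy" used by the third storing unit when priorities tie can be, per the method embodiment, an algorithm such as least recently used (LRU). The sketch below assumes a `last_used` timestamp field as bookkeeping; that field is an assumption of this sketch, not something the patent specifies.

```python
# Sketch of a preset policy for the third storing unit: among non-idle
# cache units whose priority equals the incoming data's priority, pick
# the least recently used one as the replacement victim (LRU).
def pick_third_unit(units, priority):
    """units: list of dicts with 'valid', 'priority', 'last_used' keys."""
    candidates = [u for u in units
                  if u["valid"] and u["priority"] == priority]
    if not candidates:
        return None                      # no same-priority unit to replace
    return min(candidates, key=lambda u: u["last_used"])
```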
The deregistering unit 14 is configured to: when a queue pair of the communication device is deregistered, set the cache unit in the information caching apparatus that stores the associated data of the corresponding queue pair as an idle cache unit, which can then store the associated data of a queue pair of any priority. Specifically, the deregistering unit 14 may modify the valid bit in the cache unit to indicate that the cache unit is an idle cache unit.

The updating unit 15 is configured to update any one or more kinds of associated data in the cache unit 13.

It should be noted that, because the information caching apparatus allocates the resources of a cache unit for the first time according to the size of the associated data of the queue pair to be stored, the granularity of a non-idle cache unit is the same as the size of the associated data it currently stores. Therefore, when replacing the information of a cache unit, the second storing unit 121 and the third storing unit 122 need to select, for the associated data acquired by the associated data acquiring unit 10, a cache unit with sufficient storage space, that is, the size of the selected cache unit is equal to the overall size of the associated data and the priority information of the associated data that need to be stored, so that when the information is replaced, the associated data of the queue pair and the priority information of the associated data can be stored completely in the selected cache unit.
An embodiment of the present invention further provides a communication device, whose schematic structural diagram is shown in FIG. 11, including a memory 20, an input apparatus 23, an output apparatus 24, and an RDMA module 22, each connected to a processor 21, where:

the memory 20 is used to store data input from the input apparatus 23, and may also store information such as the files necessary for the processor 21 to process the data;

the input apparatus 23 and the output apparatus 24 are ports through which the communication device communicates with other devices, and may also include devices external to the communication device such as a display, a keyboard, a mouse, and a printer;

the processor 21 may be configured to write the data to be transmitted into the memory 20 and to write the transmission task into the send queue; and
the RDMA module 22 may be connected to the output apparatus 24 and thereby to other communication devices, and is configured to acquire associated data of a queue pair used by the communication device to transmit data, determine priority information of the associated data of the queue pair, and store the associated data of the queue pair and the priority information of the associated data correspondingly into a cache unit of the RDMA module 22, where the RDMA module 22 may determine the priority information of the associated data in a service level field or a custom field of the queue pair context of the queue pair. Thus, if any associated data is needed later, the RDMA module 22 does not need to obtain it from the memory 20 through the interface with the processor 21 in the communication device, but obtains it directly from the cache unit, avoiding frequent read and write operations between the RDMA module 22 and the memory 20. At the same time, the RDMA module 22 caches the associated data of queue pairs according to priority level, and when the cache space in the RDMA module 22 is limited, the associated data of high-priority queue pairs is cached preferentially.
The cache unit in the RDMA module 22 may include a tag domain and a content domain. The tag domain is used to store the identifier of the queue pair and the priority information of the associated data, and may also store the cache unit identifier and a valid bit; the content domain is used to store the associated data of the queue pair. The associated data includes information such as the queue context of the queue pair, the memory translate protect table of the transmitted data, and the completion queue context of the queue pair, and in the content domain the queue context, memory translate protect table, and completion queue context are stored in a preset order, for example the order shown in FIG. 3.
Specifically, the RDMA module 22 may select an idle cache unit among its cache units as a first cache unit, and store the associated data of the queue pair and the priority information of the associated data into the first cache unit, where the priority of a cache unit is the same as the priority of the associated data stored in it. If there is no idle cache unit among the cache units of the RDMA module 22, a cache unit whose priority is lower than the priority of the associated data of the queue pair is selected among the non-idle cache units as a second cache unit, and the information in the second cache unit is replaced with the associated data of the queue pair and the priority information of the associated data. If there is a non-idle cache unit whose priority is the same as the priority of the associated data of the queue pair, a cache unit is selected among the non-idle cache units according to a preset policy as a third cache unit, and the information in the third cache unit is replaced with the associated data of the queue pair and the priority information of the associated data.
It should be noted that, because the RDMA module 22 allocates the resources of a cache unit for the first time according to the size of the associated data of the queue pair to be stored, the granularity of a non-idle cache unit is the same as the size of the associated data it currently stores. Therefore, when replacing the information of a cache unit, the RDMA module 22 needs to select a cache unit with sufficient storage space for the associated data, that is, the size of the selected cache unit is greater than or equal to the overall size of the associated data and the priority information of the associated data to be stored, so that when the information is replaced, the associated data of the queue pair and the priority information of the associated data can be stored completely in the selected cache unit.
When the queue pair is deregistered, the RDMA module 22 may set the cache unit storing the associated data of the queue pair as an idle cache unit, so that the cache unit can store the associated data of a queue pair of any priority. Specifically, the RDMA module 22 may modify the valid bit in the cache unit to indicate that the cache unit is an idle cache unit.
The structure of the RDMA module 22 may be as in the information caching apparatus described above or as the structure of the HCA card shown in FIG. 5, and details are not described herein again.
In this way, the associated data of high-priority queue pairs is preferentially stored in the cache units of the RDMA module, so that high-priority tasks can always hit in the cache units, improving their performance. The user is therefore provided with a mechanism by which the queue pairs of the tasks the user cares about can be given higher priority so that they are replaced as little as possible, improving the performance of the tasks (or processes) the user cares about. Moreover, the hit rate of the information in the cache units is relatively stable and does not differ greatly across transmission scenarios.

Because a cache unit can be set idle when the associated data in it is no longer needed, so that it can store the associated data of a queue pair of any priority, the resources of the cache space can be utilized to the maximum extent. The size of a cache unit is the size of all the associated data of the corresponding queue pair, so that only the information necessary in the data transmission process is stored and other information is not, which matches the characteristics of RDMA transmission and is more efficient. Furthermore, the associated data stored in a cache unit can be read and written separately at member granularity, for example reading one or more members of all the associated data, which also satisfies the characteristics of RDMA transmission.
A person of ordinary skill in the art may understand that all or part of the steps of the methods in the above embodiments may be completed by a program instructing the related RDMA hardware. The program may be stored in a computer-readable storage medium, which may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like; and the related RDMA hardware may be implemented by a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Communication Control (AREA)

Abstract

An information caching method, apparatus, and communication device, applied in the field of communications technology. When a communication device that includes an RDMA module transmits data in the form of queue pairs, the first time any associated data of a queue pair is needed, the RDMA module acquires that queue pair's associated data and stores it, together with the priority information of the associated data, correspondingly in a cache unit of the RDMA module. If the associated data is needed again later, the RDMA module does not need to obtain it from the storage module through the processor interface to which it is connected, but reads it directly from the cache unit, avoiding frequent read and write operations between the RDMA module and the memory. At the same time, the RDMA module caches the associated data of queue pairs according to priority level, so that when the cache space in the RDMA module is limited, the associated data of high-priority queue pairs is cached preferentially.

Description

Information caching method, apparatus, and communication device
This application claims priority to Chinese Patent Application No. 201310617002.0, filed with the Chinese Patent Office on November 27, 2013 and entitled "Information caching method, apparatus, and communication device", which is incorporated herein by reference in its entirety.
Technical Field

The present invention relates to the field of communications technologies, and in particular to an information caching method, apparatus, and communication device.
Background

Remote Direct Memory Access (RDMA) technology can reduce the latency of data processing between servers and can reduce the load that data transmission places on the central processing unit (CPU) of a server. Specifically, a server in an RDMA system includes a CPU, a storage module such as dual in-line memory modules (DIMM), and a host channel adapter (HCA), and the servers are interconnected by cables between their HCAs to implement inter-server communication.

An HCA in one server can obtain the data to be sent from the storage module through the CPU and send it to the HCA of another server, and the HCA of the other server stores the received data into its storage module through the CPU. In this way, during data transmission the CPU is only responsible for writing data into the storage module and writing the transmission task into the send queue, while the control processing of the data transmission protocol, such as parsing data packets, encapsulating data packets, and acknowledging data packets, is performed by the HCA without CPU involvement, so that a large amount of CPU processing capacity is not needed and the CPU load is reduced.

However, in the above data transmission process, when the HCA in a server sends data, information associated with the sent data, such as the memory translate protect table (MTPT), has to be obtained from the storage module through the CPU, so that the HCA and the CPU read and write the storage module frequently.
Summary of the Invention

Embodiments of the present invention provide an information caching method, apparatus, and communication device to reduce frequent operations between a processor and an RDMA-capable module in a communication device.

A first aspect of the embodiments of the present invention provides an information caching method, applied in a remote direct memory access (RDMA) module included in a communication device, the method including:

acquiring associated data of a queue pair used by the communication device to transmit data;

determining priority information of the associated data of the queue pair; and

storing the associated data of the queue pair and the priority information of the associated data correspondingly into a cache unit of the RDMA module.
In a first possible implementation of the first aspect of the embodiments of the present invention, determining the priority information of the associated data of the queue pair specifically includes:

determining the priority information of the associated data in a service level field or a custom field of the queue pair context of the queue pair.

In a second possible implementation of the first aspect of the embodiments of the present invention, storing the associated data of the queue pair and the priority information of the associated data correspondingly into the cache unit of the RDMA module specifically includes:

selecting an idle cache unit in the RDMA module as a first cache unit, and storing the associated data of the queue pair and the priority information of the associated data into the first cache unit;

if there is no idle cache unit in the RDMA module, selecting, among the non-idle cache units, a cache unit whose priority is lower than the priority of the associated data of the queue pair as a second cache unit, and replacing the information in the second cache unit with the associated data of the queue pair and the priority information of the associated data; and

if there is a non-idle cache unit whose priority is the same as the priority of the associated data of the queue pair, selecting a cache unit among the non-idle cache units according to a preset policy as a third cache unit, and replacing the information in the third cache unit with the associated data of the queue pair and the priority information of the associated data;

where the priority of a cache unit is the same as the priority of the associated data stored in the cache unit.
With reference to the first aspect of the embodiments of the present invention, or the first or second possible implementation of the first aspect, in a third possible implementation of the first aspect, the method further includes:

when the queue pair is deregistered, setting the cache unit in the RDMA module as an idle cache unit.

With reference to the first aspect of the embodiments of the present invention, or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect:

the cache unit includes a tag domain and a content domain, where the tag domain is used to store the identifier of the queue pair and the priority information of the associated data, and the content domain is used to store the associated data of the queue pair.

With reference to the fourth possible implementation of the first aspect of the embodiments of the present invention, in a fifth possible implementation of the first aspect:

the associated data includes any one or more of the following information: the queue context of the queue pair, the memory translate protect table of the transmitted data, and the completion queue context of the queue pair; and

in the content domain, the queue context, the memory translate protect table, and the completion queue context are stored in a preset order.

With reference to the first aspect of the embodiments of the present invention, or any one of the first to fifth possible implementations of the first aspect, in a sixth possible implementation of the first aspect, the associated data of the cache unit includes one or more kinds, and the method further includes:

updating any one or more kinds of the associated data in the cache unit.
A second aspect of the embodiments of the present invention provides an information caching apparatus, including:

an associated data acquiring unit, configured to acquire associated data of a queue pair used by the communication device to transmit data;

a priority determining unit, configured to determine priority information of the associated data of the queue pair; and

a storing unit, configured to store the associated data of the queue pair acquired by the associated data acquiring unit and the priority information of the associated data determined by the priority determining unit correspondingly into a cache unit of the information caching apparatus.

In a first possible implementation of the second aspect of the embodiments of the present invention, the priority determining unit is specifically configured to determine the priority information of the associated data in a service level field or a custom field of the queue pair context of the queue pair.

In a second possible implementation of the second aspect of the embodiments of the present invention, the storing unit includes:

a first storing unit, configured to select an idle cache unit in the information caching apparatus as a first cache unit, and store the associated data of the queue pair and the priority information of the associated data into the selected first cache unit;

a second storing unit, configured to: if there is no idle cache unit in the information caching apparatus, select, among the non-idle cache units, a cache unit whose priority is lower than the priority of the associated data of the queue pair as a second cache unit, and replace the information in the second cache unit with the associated data of the queue pair and the priority information of the associated data; and

a third storing unit, configured to: if there is a non-idle cache unit whose priority is the same as the priority of the associated data of the queue pair, select a cache unit among the non-idle cache units according to a preset policy as a third cache unit, and replace the information in the third cache unit with the associated data of the queue pair and the priority information of the associated data;

where the priority of a cache unit is the same as the priority of the associated data stored in the cache unit.
With reference to the second aspect of the embodiments of the present invention, or the first or second possible implementation of the second aspect, in a third possible implementation of the second aspect, the apparatus further includes:

a deregistering unit, configured to set the cache unit in the information caching apparatus as an idle cache unit when the queue pair is deregistered.

With reference to the second aspect of the embodiments of the present invention, or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect:

the cache unit includes a tag domain and a content domain, where the tag domain is used to store the identifier of the queue pair and the priority information of the associated data, and the content domain is used to store the associated data of the queue pair.

With reference to the fourth possible implementation of the second aspect of the embodiments of the present invention, in a fifth possible implementation of the second aspect:

the associated data includes at least one of the following information: the queue context of the queue pair, the memory translate protect table of the transmitted data, and the completion queue context of the queue pair; and

in the content domain, the queue context, the memory translate protect table, and the completion queue context are stored in a preset order.

With reference to the second aspect of the embodiments of the present invention, or any one of the first to fifth possible implementations of the second aspect, in a sixth possible implementation of the second aspect, the associated data of the cache unit includes one or more kinds, and the apparatus further includes:

an updating unit, configured to update any one or more kinds of the associated data in the cache unit.

A third aspect of the embodiments of the present invention further provides a communication device, including a processor, a remote direct memory access (RDMA) module, and a storage module, where the RDMA module is connected to the processor and is the information caching apparatus according to the second aspect of the embodiments of the present invention or any one of the possible implementations of the second aspect.

It can be seen that in these embodiments, when a communication device including an RDMA module transmits data in the form of queue pairs, the first time some associated data of a queue pair is needed, the RDMA module acquires the associated data of the queue pair and stores it, together with the priority information of the associated data, correspondingly into a cache unit of the RDMA module. If any associated data is needed again later, the RDMA module does not need to obtain it from the storage module through the processor interface to which it is connected, but obtains it directly from the cache unit, avoiding frequent read and write operations between the RDMA module and the storage module. At the same time, because the associated data of one queue pair can correspond to one priority level, the RDMA module caches the associated data of queue pairs according to priority level, and when the cache space in the RDMA module is limited, the associated data of high-priority queue pairs is cached preferentially.
Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention or the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings in the following description are merely some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these drawings without creative effort.

FIG. 1 is a schematic structural diagram of a communication device according to an embodiment of the present invention;

FIG. 2 is a flowchart of an information caching method according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a cache unit in an RDMA module included in a communication device according to an embodiment of the present invention;

FIG. 4 is a flowchart of another information caching method according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an HCA card in a communication device according to an application embodiment of the present invention;

FIG. 6 is a flowchart of a cache unit application operation performed by a cache management module included in the HCA card of a communication device according to an embodiment of the present invention;

FIG. 7 is a flowchart of a read operation performed by the cache management module included in the HCA card of a communication device according to an embodiment of the present invention;

FIG. 8 is a flowchart of a write operation performed by the cache management module included in the HCA card of a communication device according to an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of an information caching apparatus according to an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of another information caching apparatus according to an embodiment of the present invention;

FIG. 11 is a schematic structural diagram of a communication device according to an embodiment of the present invention.
Detailed Description of the Embodiments

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

An embodiment of the present invention provides an information caching method, mainly directed to information caching during data transmission by the communication device shown in FIG. 1. The communication device includes a processor (such as a CPU), a storage module (such as DIMMs), and an RDMA module (such as an HCA card). The processor is mainly responsible for writing data into the storage module and writing data transmission tasks into the send queue; the RDMA module may be connected to the processor through Peripheral Component Interconnect Express (PCIE) and is mainly responsible for the control processing of the data transmission protocol, such as parsing data packets, encapsulating data packets, and acknowledging data packets. Communication devices may be interconnected by cables between their RDMA modules; the cable may be an Ethernet cable or an Infiniband cable, depending on the port type of the RDMA module.
The method of this embodiment is performed by the RDMA module in the communication device; the flowchart is shown in FIG. 2 and includes:

Step 101: Acquire associated data of a queue pair (QP) used by the communication device to transmit data, where a queue pair may include a send queue (SQ) and a receive queue (RQ). The associated data of a queue pair is the configuration information of the queue pair and may include information needed during data transmission, such as the queue pair context (QPC), the completion queue context (CQC), and the MTPT table of the data; in some embodiments the associated data may also include information such as a shared receive queue (SRQ).

It can be understood that if data needs to be transmitted, the processor in the communication device first creates a queue pair and sets the associated data of the queue pair, such as the queue pair context, the MTPT table of the data, and the completion queue context, where the MTPT table stores the correspondence between the physical addresses and logical addresses at which data is stored in the storage module of the communication device. The processor then writes the data to be transmitted into the storage module, and then writes the type of the data transmission (for example, a write operation), the start address of the transmitted data, the length of the transmitted data, and other content into the send queue of the queue pair. In this embodiment, when setting the associated data of the queue pair, the processor also needs to set, in the queue context, priority information indicating the priority of the queue pair; specifically, the service level field of the queue context or another custom field may be used to indicate the priority information. The priority information may be set by the user as needed, for example setting a higher priority for the queue pairs of the tasks the user cares about. It should be noted that the associated data of a queue pair created by the processor of the communication device does not vary with the transmitted data; that is, once the queue pair has been created and its associated data initialized, the associated data of the queue pair does not change during data transmission, although the associated data of different queue pairs does differ.

After the processor has created the queue pair, it notifies the RDMA module that a task needs to be executed and tells the RDMA module the identifier, such as the sequence number, of the queue pair corresponding to the task. The RDMA module can then, according to the identifier of the queue pair, obtain the associated data of the queue pair from the storage module via the processor over PCIE, and then obtain from the associated data the physical address of the data to be transmitted. After obtaining the data to be transmitted from the storage module via the processor according to that physical address, it encapsulates the data into an RDMA packet and sends the RDMA packet through the interface between the RDMA module and the other communication device.

The RDMA module may obtain all the associated data of the queue pair at once, or may obtain part of the associated data, and may store this associated data into a cache unit of the RDMA module according to steps 102 and 103 below.

Step 102: Determine the priority information of the associated data of the queue pair; the priority information of the associated data may be determined from the service level field or custom field included in the queue context among the acquired associated data.

Step 103: Store the associated data of the queue pair and the priority information of the associated data correspondingly into a cache unit of the RDMA module. Specifically, the RDMA module may first allocate a cache space large enough to store all the associated data of the queue pair as the cache unit, and may arrange the cache unit to include a tag domain and a content domain.

Referring to FIG. 3, the tag domain is used to store the identifier of the queue pair and the priority information of the associated data, and may also store the identifier of the tag domain, a valid bit, and the like, where the identifier of the tag domain can uniquely identify a cache unit, and the valid bit can indicate whether the cache unit is idle: if no data is stored in the cache unit or the stored data is in an invalid state, the cache unit is idle, that is, the cache unit is unused or has been invalidated after use; otherwise the cache unit is non-idle. The content domain is used to store the associated data of the queue pair, and the RDMA module may number these pieces of associated data to refer to specific ones; the size of the content domain is determined by the size of the associated data, and one type of associated data may be called a member of the cache unit. Further, the RDMA module may prescribe the storage order of the pieces of associated data in the content domain, so that in the content domain the associated data of the queue pair, such as the queue context, the memory translate protect table, and the completion queue context, is stored in a preset order, for example in FIG. 3 in the order of queue context, completion queue context, and memory translate protect table.

It can be seen that in this embodiment, when a communication device including an RDMA module transmits data in the form of queue pairs, the first time some associated data of a queue pair is needed, the RDMA module acquires the associated data of the queue pair and stores it, together with the priority information of the associated data, correspondingly into a cache unit of the RDMA module. If any associated data is needed again later, the RDMA module does not need to obtain it from the storage module through the processor interface to which it is connected, but obtains it directly from the cache unit, avoiding frequent read and write operations between the RDMA module and the storage module. At the same time, because the associated data of one queue pair can correspond to one priority level, the RDMA module caches the associated data of queue pairs according to priority level, and when the cache space in the RDMA module is limited, the associated data of high-priority queue pairs is cached preferentially.
Referring to FIG. 4, in a specific embodiment, when performing step 103 above, the RDMA module may specifically do so through the following steps:

Step 201: Determine whether there is an idle cache unit in the RDMA module; if so, perform step 202; if not, continue with the determination in step 203. When determining whether there is an idle cache unit, the RDMA module may specifically check whether the valid bit in the tag domain of a cache unit indicates that the cache unit is idle.

Step 202: Select an idle cache unit in the RDMA module as the first cache unit, and store the associated data of the queue pair and the priority information of the associated data into the first cache unit.

Step 203: Determine whether, among the non-idle cache units, there is a cache unit whose priority is lower than the priority of the associated data of the queue pair acquired in step 101; if so, perform step 204. If not, then in the case where the priority of a non-idle cache unit is the same as the priority of the associated data of the queue pair, the RDMA module performs step 205, whereas in the case where the priorities of the non-idle cache units are all higher than the priority of the associated data of the queue pair, the information in the cache units cannot be displaced and the RDMA module does not cache the acquired associated data. The priority of a cache unit is the same as the priority of the associated data stored in the cache unit.

Step 204: Select a cache unit whose priority is lower than the priority of the associated data of the queue pair as the second cache unit, and replace the information in the second cache unit with the associated data of the queue pair and the priority information of the associated data.

Step 205: Select a cache unit among the non-idle cache units according to a preset policy as the third cache unit, and replace the information in the third cache unit with the associated data of the queue pair and the priority information of the associated data. The preset policy may include algorithms such as least recently used (LRU).

It can be seen that through steps 201 to 205, the RDMA module can guarantee the caching of the associated data of higher-priority queue pairs.

It should be noted that, because the RDMA module allocates the resources of a cache unit for the first time according to the size of the associated data of the queue pair to be stored, the granularity of a non-idle cache unit in the RDMA module is the same as the size of the associated data it currently stores. Therefore, in steps 204 and 205 the RDMA module needs to select, for the information acquired in steps 101 and 102, a cache unit whose storage space is large enough without being wasted, that is, the size of the selected cache unit must equal the overall size of the associated data and the priority information of the associated data to be stored, so that when the information is replaced, the associated data of the queue pair and the priority information of the associated data can be stored completely in the selected cache unit.

It should also be noted that after the associated data of the queue pair and the priority information of the associated data have been stored in a cache unit for the first time, some of the associated data may change during subsequent data transmission, in which case the RDMA module may also update the associated data stored in the cache unit, modifying one or more pieces of associated data. And after the queue pair is deregistered, to improve the utilization of the cache units in the RDMA module, the RDMA module may set the above cache unit as an idle cache unit, making the data stored in the cache unit invalid, so that the cache unit can store the associated data of a queue pair of any priority, that is, the data in the cache unit may be replaced by any associated data. The deregistration of a queue pair may be issued as a deregistration command by the driver of the RDMA module included in the communication device; when the RDMA module receives the deregistration command it can deregister the corresponding queue pair, and the deregistration command initiated by the driver may be triggered by the user invoking the driver of the RDMA module to perform an operation of deregistering the queue pair.
The information caching method provided by the embodiments of the present invention is described below with a specific embodiment, mainly applied in the communication device shown in FIG. 1, where the processor in the communication device is a CPU and the RDMA module is an HCA card. The HCA card may be a field programmable gate array (FPGA) or a hardened application-specific integrated circuit (ASIC), and the HCA card is connected to the CPU through a PCIE interface. The structure of the HCA card in this embodiment may be as shown in FIG. 5, including a PCIE interface, a protocol engine, a transceiver interface, a queue management module, and a cache management module, where:

(1) The PCIE interface is the interface between the HCA card and the CPU; through the CPU it can read the associated data of queue pairs from the storage module, and it can read and write the data to be transmitted.

(2) The protocol engine is used to process the tasks issued through the PCIE interface and the RDMA packets received from the cable, and to terminate the RDMA protocol.

Specifically, when the protocol engine receives a task issued through the PCIE interface, it analyzes the task and reads the associated data of the corresponding queue pair from the queue management module, requests the data to be transmitted from the PCIE interface according to the read associated data, constructs a header according to the RDMA protocol, encapsulates the data to be transmitted into a complete RDMA packet, and passes it to the transceiver interface to be sent onto the line.

On the other hand, when the protocol engine receives an RDMA packet from the transceiver interface, it analyzes the packet header, obtains the corresponding associated data from the queue management module according to the sequence number of the destination queue pair identified in the header, and writes the transmitted data into the storage module of the communication device through the PCIE interface; at the same time the protocol engine returns a response packet to the transceiver interface, or reads the target data from the storage module through the PCIE interface and then constructs a read response packet to be passed to the transceiver interface and sent onto the cable.

(3) The transceiver interface is the connection interface to the HCA cards in other communication devices; it converts between the logical packets on the protocol engine and physical signals complying with the electrical rules of the line, thereby enabling communication with other communication devices.

(4) The queue management module is used to obtain the needed associated data of queue pairs from the cache management module; if the associated data does not exist in the HCA card, it needs to obtain the associated data from the storage module through the protocol engine and the PCIE interface, and apply to the cache module for a cache unit in which to store the obtained associated data.

(5) The cache management module is used to respond to the commands issued by the queue management module to apply for, search, release, read from, and write to cache units.

Specifically, when the queue management module fails to read the needed associated data of a queue pair from the cache management module, then after obtaining the associated data from the storage module of the communication device through the PCIE interface, it may issue a command to the cache management module to apply for a cache unit. The cache management module, according to its internal algorithm, determines whether caching the associated data is allowed; if so, it may return the identifier, such as the sequence number, of a cache unit to the queue management module; otherwise, it returns an illegal value indicating that caching the associated data is not allowed.

When the queue management module needs to provide associated data to the protocol engine, it first issues a search command to the cache management module to confirm whether the corresponding associated data exists. In another case, when the queue management module receives an operation issued through the PCIE interface to deregister a queue pair, it issues a search command to the cache management module to confirm whether a cache unit corresponding to the queue pair exists. If so, the cache management module returns the sequence number of the cache unit storing the associated data; if not, it returns an invalid sequence number.

When the queue management module receives an operation to deregister a queue pair and has already found that a cache unit corresponding to the queue pair exists, the queue management module issues a command to the cache management module to release the cache unit. When a cache unit contains associated data of a queue pair that the queue management module needs to use, the queue management module may issue a command to the cache management module to read the information in the cache unit so as to obtain the corresponding associated data. When the cache management module allows the associated data corresponding to a queue pair to be stored, after the sequence number of a cache unit has been given, the queue management module may, through the cache management module, write the associated data into the cache unit with the given sequence number.
当两个上述的通信设备(比如通信设备A和B)在传输数据时,具体可以通过如下方法来实现:
(1)通信设备A发送待传输数据
A1:通信设备A中的CPU将待传输数据写入到存储模块中,且将传输的任务写入到存储模块的发送队列中,并通过门铃通知HCA卡执行相应的任务。其中门铃可以为一个通知消息,在该通知消息中可以包括队列对的序号。
B1:HCA卡中的PCIE接口接收到该门铃传输给协议引擎,由协议引擎解析该门铃,得到其中包含的队列对的序号后,通过PCIE接口向通信设备A的存储模块获取该队列对的关联数据,可以包括如下至少一个信息:队列对的队列 上下文,待传输数据的MTPT表和队列对的完成队列上下文等数据。
C1:在执行步骤B1的同时,HCA卡中的协议引擎可以触发队列管理模块向缓存管理模块下发申请缓存单元的命令,并在命令中携带队列对的序号和优先级信息。如果在HCA卡的缓存单元中按照上述图3的结构储存关联数据,当缓存管理模块接收该命令后,可以按照如下的步骤来执行申请缓存单元的操作,流程图如图6所示,包括:
C11:缓存管理模块根据接收命令中的队列对的序号再次确认在缓存单元中是否储存该队列对对应的关联数据,如果储存有,则向队列管理模块返回已存在的信息,并结束流程;如果未储存有,则执行步骤C12。
C12:判断是否有空闲的缓存单元,主要是遍历缓存单元中的有效位,该有效位用来指示缓存单元是否空闲,如果有,则向队列管理模块返回其中一个空闲的缓存单元的序号;如果没有,则执行步骤C13。
C13:判断是否存在优先级比上述步骤C11中接收的命令中优先级信息所指示的优先级低的缓存单元,具体可以遍历缓存单元中的优先级位并与接收命令中的优先级信息进行比较,如果存在,则执行步骤C14;如果不存在,则向队列管理模块返回不允许缓存的信息。
C14:选择一个优先级较低的非空闲缓存单元,并向队列管理模块返回该缓存单元的序号,并允许将队列对的关联数据储存到该缓存单元中。
D1:当队列管理模块接收到缓存单元的序号和允许将队列对的关联数据储存到该缓存的信息后,会向缓存管理单元发起写入缓存单元的请求,在请求中还可以携带队列对的序号、优先级信息、缓存单元的序号和需要储存的关联数据,这样当缓存模块单元接收到该请求后,就会将优先级信息、队列对的序号和关联数据按照上述图3所示的结构储存到相应的缓存单元中。
E1:HCA卡中的协议引擎分析关联数据中的队列上下文,得到队列对中发送队列的物理地址,则协议引擎会根据该物理地址通过PCIE接口从通信设备A的存储模块中读取该发送队列的工作队列元素(Work Queue Element,WQE)后,分析该WQE,确定需要发送的待传输数据的源虚拟地址和长度。
F1:协议引擎触发队列管理模块向缓存管理模块发起读取请求,并在读取请求中携带队列对的序号,和需要读取关联数据某一成员比如MTPT表的信 息,当缓存管理模块接收该读取请求后,可以按照如下步骤执行读取操作,流程图如图7所示,包括:
F11:缓存管理模块接收到该读取请求后,根据其中队列对的序号确定是否存在该队列对对应的缓存单元,如果不存在,则向队列管理模块返回不存在的信息,如果存在,则执行步骤F12。
F12:缓存管理模块根据队列对的序号找到对应的缓存单元,并根据读取请求中需要读取的成员信息,计算该成员在缓存单元中的偏移和读取长度,从而读取该成员。
F13:缓存管理模块将从缓存单元中读取的成员即MTPT表通过队列管理模块返回给协议引擎。
G1:当协议引擎获取到MTPT表后,由于在MTPT表中记载着待传输数据的物理地址和虚拟地址的对应关系,则协议引擎会根据步骤E1中得到的源虚拟地址及该MTPT表,确定待传输数据的物理地址;然后根据该物理地址和上述步骤E1中得到的待传输数据的长度,通过PCIE接口从通信设备A的存储模块中获取到待传输数据。
H1:协议引擎会将待传输数据封装成RDMA报文,并将该RDMA报文从收发接口传输到与通信设备B连接的线缆上,其中,在RDMA报文中可以包括如下的信息:传输的数据、操作类型(本实施例中为写操作)、待操作的目的队列对的序号以及目的虚拟地址。
(2)通信设备B接收待传输数据
A2:通信设备B中HCA卡所包括的收发接口接收到RDMA报文后,传输给协议引擎分析该RDMA报文,得到传输的数据、操作类型(本实施例中为写操作)、待操作的目的队列对的序号以及目的虚拟地址。其中待操作的目的队列对与通信设备A中传输数据时的队列对相对应,通信设备A在传输数据时设置队列对时,可以设置通信设备A传输数据的发送队列和通信设备B接收数据的接收队列,则该待操作的目的队列对的信息即为接收队列的信息。
B2:协议引擎根据目的队列对的序号,触发队列管理模块向缓存管理模块发起读取该队列对的关联数据中某一成员比如MTPT表或多个成员,则缓存管理模块可以按照上述图7所示的方法将读取的MTPT表返回给协议引擎;如 果在缓存单元中没有储存有该目的队列对的关联数据,则协议引擎会通过PCIE接口向通信设备B的存储模块获取目的队列对对应的关联数据,包括MTPT表。
需要说明的是,通信设备B中队列对的关联数据可以是在与通信设备A传输数据前设置的。
C2:协议引擎根据获取的MTPT表及目的虚拟地址,得到对应的物理地址,则通过PCIE接口将RDMA报文中传输的数据写入到该物理地址对应的存储模块中。
D2:协议引擎准备应答报文,并将该应答报文通过收发接口传输到与通信设备A连接的线缆上,在应答报文中可以包括如下信息:队列对的序号。
(3)通信设备A接收应答报文
A3:通信设备A中HCA卡所包括的收发接口接收到RDMA报文后,传输给协议引擎分析该应答报文,得到队列对的序号。
B3:协议引擎可以触发队列管理模块向缓存管理模块发起获取该队列对对应的关联数据包括完成队列上下文等信息,当缓存管理模块在接收到读取该队列对的关联数据的请求后,可以按照上述图7所示的方法将读取的关联数据返回到协议引擎。
C3:协议引擎根据该完成队列上下文生成完成队列元素(CQE),同时可选地生成完成事件队列元素(Complete Event Queue Element,CEQE)和上报中断等,CPU通过轮训CQE或响应中断(当生成CEQE和中断时),得知有完成队列生成,由CPU来终结本次传输数据的任务。
D3,同时,在通信设备A和B之间进行多次数据传输后,协议引擎可以触发队列管理模块向缓存管理模块发起释放请求,在释放请求中携带队列对的序号,则缓存管理模块找到该队列对对应的缓存单元,进行队列对的注销,即将缓存单元中的有效位设置为无效,即将该缓存单元置为空闲的缓存单元。
需要说明的是,在上述两个通信设备在传输数据的过程中,如果在缓存单元中队列对的某些关联数据发生改变时,通信设备中HCA卡所包括的队列管理模块可以向缓存管理模块发起写入请求,并在写入请求中包括队列对的序号和需要写入的关联数据的信息;当缓存管理模块接收到该写入请求后,可以按照 如下的步骤来执行写入操作,流程图如图8所示,包括:
A4: According to the queue pair number in the received request, the cache management module looks up whether a cache unit corresponding to that queue pair number exists. If not, it returns a not-found indication to the queue management module; if it does, step B4 is performed.
B4: The cache management module locates the corresponding cache unit according to the queue pair number and, according to the information on the associated data to be written in the write request, computes the offset and the length of that associated data within the cache unit.
C4: According to the offset computed in step B4, the associated data to be written is written into the corresponding position in the cache unit.
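The write-back flow of steps A4 to C4 can be sketched in the same style. Again this is an illustrative assumption, not the patented implementation; the function name and the fixed member layout are invented for the sketch:

```python
# Sketch of the A4-C4 write path: locate the cache unit by queue pair
# number (A4), compute the member's offset and length (B4), and
# overwrite that member in place (C4). Names/layout are assumptions.
MEMBER_LAYOUT = {"qpc": (0, 64), "mtpt": (64, 32), "cqc": (96, 32)}

def write_member(units, qp_number, member, data):
    unit = units.get(qp_number)              # A4: existence check
    if unit is None:
        return False                         # report "not present"
    offset, length = MEMBER_LAYOUT[member]   # B4: offset and length
    if len(data) != length:
        raise ValueError("member size mismatch")
    unit[offset:offset + length] = data      # C4: in-place write
    return True

units = {3: bytearray(128)}                  # QP 3 has a cache unit
ok = write_member(units, 3, "cqc", b"\x01" * 32)   # update one member
missing = write_member(units, 5, "qpc", b"\x00" * 64)  # QP 5 not cached
```

Only the changed member is rewritten, which matches the per-member read/write granularity emphasized later in the description.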
An embodiment of the present invention further provides an information caching apparatus that supports RDMA operations; its structural diagram is shown in Figure 9, and it includes:
an associated data obtaining unit 10, configured to obtain the associated data of a queue pair through which the communication device transmits data;
a priority determining unit 11, configured to determine the priority information of the associated data of the queue pair; specifically, the priority determining unit 11 may determine the priority information in the service level field or a custom field of the queue pair context of the queue pair; and
a storage unit 12, configured to store, into a cache unit 13 of the information caching apparatus, the associated data of the queue pair obtained by the associated data obtaining unit 10 in correspondence with the priority information of the associated data determined by the priority determining unit 11.
In this embodiment, the cache unit above may include a tag field and a content field. The tag field is used to store the identifier of the queue pair and the priority information of the associated data, and may also store a cache unit identifier and a valid bit; the content field is used to store the associated data of the queue pair. The associated data includes information such as the queue context of the queue pair, the memory translation and protection table of the transmitted data, and the completion queue context of the queue pair; in the content field, the queue context, the memory translation and protection table and the completion queue context are stored in a preset order, for example the order shown in Figure 3.
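The tag-field/content-field layout described above can be written down as a simple record. The field names below are assumptions derived from the description, not the patented register layout:

```python
# Illustrative layout of one cache unit: a tag field (identifiers,
# priority, valid bit) plus a content field holding the associated
# data in the preset order of Figure 3. Field names are assumptions.
from dataclasses import dataclass

@dataclass
class CacheUnit:
    # --- tag field ---
    unit_id: int      # cache unit identifier
    qp_number: int    # identifier of the queue pair
    priority: int     # priority of the stored associated data
    valid: bool       # valid bit; False marks the unit as free
    # --- content field, in preset order ---
    qpc: bytes        # queue context of the queue pair
    mtpt: bytes       # memory translation and protection table
    cqc: bytes        # completion queue context

unit = CacheUnit(unit_id=0, qp_number=7, priority=3, valid=True,
                 qpc=b"", mtpt=b"", cqc=b"")
unit.valid = False    # deregistration: clearing the valid bit frees the unit
```

Clearing the valid bit is exactly the deregistration mechanism described for unit 14 below: the unit becomes free without touching its contents.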
It can be seen that, in the information caching apparatus of this embodiment, when some associated data of a queue pair is needed for the first time during data transfer, the associated data obtaining unit 10 obtains the associated data of the queue pair through which the communication device transmits data, and the storage unit 12 stores it, together with the priority information of the associated data, into the cache unit 13 of the information caching apparatus. In this way, if the associated data is needed later, the information caching apparatus does not need to fetch it from the storage module through the interface to the processor of the communication device, but obtains it directly from the cache unit, avoiding frequent read and write operations between the information caching apparatus and the storage module. Moreover, since the associated data of one queue pair can correspond to one priority level, the information caching apparatus caches the associated data of queue pairs by priority; when the cache space in the information caching apparatus is limited, the associated data of higher-priority queue pairs is cached preferentially.
Referring to Figure 10, in a specific embodiment, in addition to the structure shown in Figure 9, the information caching apparatus may further include a deregistration unit 14 and an update unit 15, and the storage unit 12 may be implemented by a first storage unit 120, a second storage unit 121 and a third storage unit 122. Specifically:
the first storage unit 120 is configured to select a free cache unit among the cache units 13 of the information caching apparatus as a first cache unit, and to store into the first cache unit the associated data of the queue pair obtained by the associated data obtaining unit 10 and the priority information determined by the priority determining unit 11, where the priority of a cache unit is the same as the priority of the associated data stored in it;
the second storage unit 121 is configured to, if there is no free cache unit among the cache units 13 of the information caching apparatus, select among the non-free cache units a cache unit whose priority is lower than the priority of the associated data of the queue pair as a second cache unit, and to replace the information in the second cache unit with the associated data of the queue pair obtained by the associated data obtaining unit 10 and the priority information determined by the priority determining unit 11; and
the third storage unit 122 is configured to, if there is a non-free cache unit whose priority is the same as the priority of the associated data of the queue pair, select a cache unit among the non-free cache units according to a preset policy as a third cache unit, and to replace the information in the third cache unit with the associated data of the queue pair obtained by the associated data obtaining unit 10 and the priority information determined by the priority determining unit 11.
The deregistration unit 14 is configured to, when a queue pair of the communication device is deregistered, mark the cache unit of the information caching apparatus that stores the associated data of that queue pair as a free cache unit, which can then store the associated data of a queue pair of any priority. Specifically, the deregistration unit 14 may modify the valid bit in the cache unit to indicate that the cache unit is free.
The update unit 15 is configured to update any one or more kinds of associated data in the cache unit 13.
It should be noted that, since the information caching apparatus allocates the resources of a cache unit for the first time according to the size of the associated data of the queue pair to be stored, the granularity of a non-free cache unit matches the size of the associated data it currently stores. Therefore, when the second storage unit 121 and the third storage unit 122 replace the information in a cache unit, they need to select, for the associated data obtained by the associated data obtaining unit 10, a cache unit with sufficient storage space, that is, a cache unit whose size equals the overall size of the associated data to be stored plus the priority information of the associated data; only then, when the information is replaced, can the associated data of the queue pair and its priority information be stored completely in the selected cache unit.
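The selection logic of storage units 120 to 122, together with the size constraint just described, can be sketched as a small policy function. This is a hedged illustration under assumed data structures; the dict-based representation and the "first match" tie-break for equal priority stand in for the unspecified preset policy:

```python
# Sketch of the allocation/replacement policy: prefer a free unit
# (first storage unit 120); otherwise evict a lower-priority unit
# (second storage unit 121); otherwise evict an equal-priority unit
# chosen by a preset policy, here simply the first match (third
# storage unit 122). The size filter reflects the granularity
# constraint: a victim must be able to hold the new data entirely.

def pick_unit(units, new_priority, new_size):
    """units: list of dicts with 'free', 'priority' and 'size' keys."""
    for u in units:                       # storage unit 120: free unit
        if u["free"]:
            return u
    fitting = [u for u in units if u["size"] == new_size]
    for u in fitting:                     # storage unit 121: lower priority
        if u["priority"] < new_priority:
            return u
    for u in fitting:                     # storage unit 122: equal priority
        if u["priority"] == new_priority:
            return u
    return None                           # nothing evictable: do not cache

units = [
    {"free": False, "priority": 2, "size": 128},
    {"free": False, "priority": 1, "size": 128},
]
victim = pick_unit(units, new_priority=3, new_size=128)  # a lower-priority unit
```

Because the cache is fully occupied and both units have lower priority than the new data, the first lower-priority unit found is chosen as the victim; only units whose size fits the new associated data are ever considered.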
An embodiment of the present invention further provides a communication device; its structural diagram is shown in Figure 11. The device includes a memory 20, an input apparatus 23, an output apparatus 24 and an RDMA module 22, each attached to a processor 21, where:
the memory 20 is used to store the data input from the input apparatus 23, and may also store information such as the files the processor 21 needs to process data; the input apparatus 23 and the output apparatus 24 are the ports through which the communication device communicates with other devices, and may also include devices external to the communication device, such as a display, a keyboard, a mouse and a printer;
the processor 21 may be used to write the data to be transmitted into the memory 20 and to write the transmission task into a send queue; and
the RDMA module 22 may be connected to the output apparatus 24 and to other communication devices, and is configured to obtain the associated data of a queue pair through which the communication device transmits data, determine the priority information of the associated data of the queue pair, and store the associated data of the queue pair in correspondence with the priority information of the associated data into a cache unit of the RDMA module 22; the RDMA module 22 may determine the priority information of the associated data in the service level field or a custom field of the queue pair context of the queue pair. In this way, whenever any of the associated data is needed later, the RDMA module 22 does not need to fetch it from the memory 20 through the interface to the processor 21 of the communication device, but obtains it directly from the cache unit, avoiding frequent read and write operations between the RDMA module 22 and the memory 20. Moreover, since the associated data of one queue pair can correspond to one priority level, the RDMA module 22 caches the associated data of queue pairs by priority; when the cache space in the RDMA module 22 is limited, the associated data of higher-priority queue pairs is cached preferentially.
In this embodiment, a cache unit in the RDMA module 22 may include a tag field and a content field. The tag field is used to store the identifier of the queue pair and the priority information of the associated data, and may also store a cache unit identifier and a valid bit; the content field is used to store the associated data of the queue pair. The associated data includes information such as the queue context of the queue pair, the memory translation and protection table of the transmitted data, and the completion queue context of the queue pair; in the content field, the queue context, the memory translation and protection table and the completion queue context are stored in a preset order, for example the order shown in Figure 3.
Further, when storing the associated data and its priority information, the RDMA module 22 may select a free cache unit among its cache units as a first cache unit and store the associated data of the queue pair and the priority information of the associated data into the first cache unit, where the priority of a cache unit is the same as the priority of the associated data stored in it. If there is no free cache unit in the RDMA module 22, it selects among the non-free cache units a cache unit whose priority is lower than the priority of the associated data of the queue pair as a second cache unit, and replaces the information in the second cache unit with the associated data of the queue pair and the priority information of the associated data. If the priority of a non-free cache unit is the same as the priority of the associated data of the queue pair, it selects a cache unit among the non-free cache units according to a preset policy as a third cache unit, and replaces the information in the third cache unit with the associated data of the queue pair and the priority information of the associated data.
It should be noted that, since the RDMA module 22 allocates the resources of a cache unit for the first time according to the size of the associated data of the queue pair to be stored, the size of a non-free cache unit matches the size of the associated data it currently stores. Therefore, when replacing the information in a cache unit, the RDMA module 22 needs to select for the associated data a cache unit with sufficient storage space, that is, a cache unit whose size is greater than or equal to the overall size of the associated data to be stored plus the priority information of the associated data; only then, when the information is replaced, can the associated data of the queue pair and its priority information be stored completely in the selected cache unit.
It should also be noted that after the associated data of a queue pair and its priority information are first stored into a cache unit, some of the associated data may change during subsequent data transfers, and the RDMA module 22 may then update the associated data stored in the cache unit. When the queue pair is deregistered, to improve the utilization of the cache units in the RDMA module 22, the RDMA module 22 may mark the cache unit storing the associated data of that queue pair as a free cache unit, so that the cache unit can then store the associated data of a queue pair of any priority. Specifically, the RDMA module 22 may modify the valid bit in the cache unit to indicate that the cache unit is free.
In a specific embodiment, the structure of the RDMA module 22 may be as shown for the information caching apparatus above or for the HCA card shown in Figure 5, and is not described again here.
In summary, the information caching method above brings the following beneficial effects:
1. The associated data of high-priority queue pairs is stored preferentially in the cache units of the RDMA module, so that high-priority tasks can always hit in the cache units, improving their performance. This gives users a mechanism to assign a higher priority to the queue pairs of the tasks they care about, so that these are evicted as rarely as possible and the performance of the tasks (or processes) the users care about improves.
2. The hit rate of the information in the cache units is relatively stable and does not vary greatly across different transmission scenarios.
3. Since a cache unit can be marked as free when the associated data in it is no longer needed, and can then store associated data of any other priority level, the cache space resources can be utilized to the greatest extent.
4. The size of a cache unit is the size of all the associated data of the corresponding queue pair, so only the information necessary during data transfer is stored, and no other information; this matches the characteristics of RDMA transfer and yields higher efficiency.
5. The associated data stored in a cache unit can be read and written individually by member, for example reading one or more members of all the associated data, which matches the characteristics of RDMA transfer.
Persons of ordinary skill in the art may understand that all or part of the steps in the methods of the embodiments above may be completed by a program instructing the relevant RDMA hardware; the program may be stored in a computer-readable storage medium, and the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc; the relevant RDMA hardware may be implemented by a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC) or the like.
The information caching method, apparatus and communication device provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the embodiments above is only intended to help understand the method of the present invention and its core idea. Meanwhile, persons of ordinary skill in the art may make changes to the specific implementations and the application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as a limitation on the present invention.

Claims (15)

  1. An information caching method, applied in a remote direct memory access (RDMA) module included in a communication device, the method comprising:
    obtaining associated data of a queue pair through which the communication device transmits data;
    determining priority information of the associated data of the queue pair; and
    storing the associated data of the queue pair in correspondence with the priority information of the associated data into a cache unit of the RDMA module.
  2. The method according to claim 1, wherein the determining priority information of the associated data of the queue pair specifically comprises:
    determining the priority information of the associated data in a service level field or a custom field of a queue pair context of the queue pair.
  3. The method according to claim 1, wherein the storing the associated data of the queue pair in correspondence with the priority information of the associated data into a cache unit of the RDMA module specifically comprises:
    selecting a free cache unit in the RDMA module as a first cache unit, and storing the associated data of the queue pair and the priority information of the associated data into the first cache unit;
    if there is no free cache unit in the RDMA module, selecting among the non-free cache units a cache unit whose priority is lower than the priority of the associated data of the queue pair as a second cache unit, and replacing the information in the second cache unit with the associated data of the queue pair and the priority information of the associated data; and
    if there is a non-free cache unit whose priority is the same as the priority of the associated data of the queue pair, selecting a cache unit among the non-free cache units according to a preset policy as a third cache unit, and replacing the information in the third cache unit with the associated data of the queue pair and the priority information of the associated data;
    wherein the priority of a cache unit is the same as the priority of the associated data stored in the cache unit.
  4. The method according to any one of claims 1 to 3, further comprising:
    when the queue pair is deregistered, marking the cache unit in the RDMA module as a free cache unit.
  5. The method according to any one of claims 1 to 4, wherein
    the cache unit comprises a tag field and a content field, the tag field is used to store an identifier of the queue pair and the priority information of the associated data, and the content field is used to store the associated data of the queue pair.
  6. The method according to claim 5, wherein
    the associated data comprises any one or more of the following information: a queue context of the queue pair, a memory translation and protection table of the transmitted data, and a completion queue context of the queue pair; and
    in the content field, the queue context, the memory translation and protection table and the completion queue context are stored in a preset order.
  7. The method according to any one of claims 1 to 6, wherein the associated data of the cache unit comprises one or more kinds, and the method further comprises:
    updating any one or more kinds of the associated data in the cache unit.
  8. An information caching apparatus, comprising:
    an associated data obtaining unit, configured to obtain associated data of a queue pair through which a communication device transmits data;
    a priority determining unit, configured to determine priority information of the associated data of the queue pair; and
    a storage unit, configured to store the associated data of the queue pair obtained by the associated data obtaining unit in correspondence with the priority information of the associated data determined by the priority determining unit into a cache unit of the information caching apparatus.
  9. The apparatus according to claim 8, wherein the priority determining unit is specifically configured to determine the priority information of the associated data in a service level field or a custom field of a queue pair context of the queue pair.
  10. The apparatus according to claim 8, wherein the storage unit comprises:
    a first storage unit, configured to select a free cache unit in the information caching apparatus as a first cache unit, and to store the associated data of the queue pair and the priority information of the associated data into the first cache unit;
    a second storage unit, configured to, if there is no free cache unit in the information caching apparatus, select among the non-free cache units a cache unit whose priority is lower than the priority of the associated data of the queue pair as a second cache unit, and to replace the information in the second cache unit with the associated data of the queue pair and the priority information of the associated data; and
    a third storage unit, configured to, if there is a non-free cache unit whose priority is the same as the priority of the associated data of the queue pair, select a cache unit among the non-free cache units according to a preset policy as a third cache unit, and to replace the information in the third cache unit with the associated data of the queue pair and the priority information of the associated data;
    wherein the priority of a cache unit is the same as the priority of the associated data stored in the cache unit.
  11. The apparatus according to any one of claims 8 to 10, further comprising:
    a deregistration unit, configured to, when the queue pair is deregistered, mark the cache unit in the information caching apparatus as a free cache unit.
  12. The apparatus according to any one of claims 8 to 11, wherein
    the cache unit comprises a tag field and a content field, the tag field is used to store an identifier of the queue pair and the priority information of the associated data, and the content field is used to store the associated data of the queue pair.
  13. The apparatus according to claim 12, wherein
    the associated data comprises at least one of the following information: a queue context of the queue pair, a memory translation and protection table of the transmitted data, and a completion queue context of the queue pair; and
    in the content field, the queue context, the memory translation and protection table and the completion queue context are stored in a preset order.
  14. The apparatus according to any one of claims 8 to 13, wherein the associated data of the cache unit comprises one or more kinds, and the apparatus further comprises:
    an update unit, configured to update any one or more kinds of the associated data in the cache unit.
  15. A communication device, comprising a processor, a remote direct memory access (RDMA) module and a storage module;
    wherein the RDMA module is connected to the processor and is the information caching apparatus according to any one of claims 8 to 14.
PCT/CN2014/086497 2013-11-27 2014-09-15 Information caching method, apparatus and communication device WO2015078219A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310617002.0A CN103647807B (zh) 2013-11-27 2013-11-27 Information caching method, apparatus and communication device
CN201310617002.0 2013-11-27

Publications (1)

Publication Number Publication Date
WO2015078219A1 (zh)

Family

ID=50252960

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/086497 WO2015078219A1 (zh) 2013-11-27 2014-09-15 Information caching method, apparatus and communication device

Country Status (2)

Country Link
CN (1) CN103647807B (zh)
WO (1) WO2015078219A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111277616A (zh) * 2018-12-04 2020-06-12 中兴通讯股份有限公司 RDMA-based data transmission method and distributed shared memory system
CN113301285A (zh) * 2021-05-11 2021-08-24 深圳市度信科技有限公司 Multi-channel data transmission method, apparatus and system
CN113709057A (zh) * 2017-08-11 2021-11-26 华为技术有限公司 Network congestion notification method, proxy node, network node and computer device
EP4160425A4 (en) * 2020-07-06 2023-11-01 Huawei Technologies Co., Ltd. DATA TRANSMISSION METHOD, CHIP AND DEVICE

Families Citing this family (20)

Publication number Priority date Publication date Assignee Title
CN103647807B (zh) 2013-11-27 2017-12-15 华为技术有限公司 Information caching method, apparatus and communication device
CN104077198B (zh) * 2014-06-19 2017-06-06 华为技术有限公司 Doorbell (DB) recovery method and apparatus, and input/output (I/O) device having the apparatus
CN105808477A (zh) * 2014-12-29 2016-07-27 杭州华为数字技术有限公司 Data access method and related apparatus
CN106484314B (zh) * 2015-09-01 2020-01-03 阿里巴巴集团控股有限公司 Cached data control method and apparatus
US9880772B2 (en) * 2015-09-21 2018-01-30 Micron Technology, Inc. Systems and methods for providing file information in a memory system protocol
CN105446936B (zh) * 2015-11-16 2018-07-03 上海交通大学 Distributed hash table method based on HTM and one-way RDMA operations
CN106940682B (zh) * 2017-03-07 2020-06-09 武汉科技大学 Embedded system optimization method based on on-chip programmable memory
CN108304214B (zh) * 2017-12-13 2022-05-13 超聚变数字技术有限公司 Method and apparatus for verifying the integrity of immediate data
CN108366111B (zh) * 2018-02-06 2020-04-07 西安电子科技大学 Low-latency data packet buffering apparatus and method for a switching device
CN108663971A (zh) * 2018-06-01 2018-10-16 北京汉能光伏投资有限公司 Command forwarding method and apparatus, solar energy system and central controller
CN109359069A (zh) * 2018-09-25 2019-02-19 济南浪潮高新科技投资发展有限公司 Data transmission method and apparatus
CN112463654A (zh) * 2019-09-06 2021-03-09 华为技术有限公司 Cache implementation method with a prediction mechanism
CN113055131B (zh) * 2019-12-26 2023-06-30 阿里巴巴集团控股有限公司 Data processing method, data partitioning method, computing device and medium
CN111262917A (zh) 2020-01-13 2020-06-09 苏州浪潮智能科技有限公司 Remote data moving apparatus and method based on an FPGA cloud platform
CN113381939B (зh) * 2020-03-10 2022-04-29 阿里巴巴集团控股有限公司 Data transmission method and apparatus, electronic device and computer-readable storage medium
CN111737176B (zh) * 2020-05-11 2022-07-15 瑞芯微电子股份有限公司 PCIe-data-based synchronization apparatus and driving method
CN113300979A (zh) * 2021-02-05 2021-08-24 阿里巴巴集团控股有限公司 Method and apparatus for creating network interface card queues in an RDMA network
CN115633104B (zh) * 2022-09-13 2024-02-13 江苏为是科技有限公司 Data sending method, data receiving method, apparatus and data transceiving system
CN115858160B (zh) * 2022-12-07 2023-12-05 江苏为是科技有限公司 Remote direct memory access virtualization resource allocation method and apparatus, and storage medium
CN116303173B (zh) * 2023-05-19 2023-08-08 深圳云豹智能有限公司 Method, apparatus, system and chip for reducing RDMA engine on-chip cache

Citations (3)

Publication number Priority date Publication date Assignee Title
US20060029088A1 (en) * 2004-07-13 2006-02-09 International Business Machines Corporation Reducing latency in a channel adapter by accelerated I/O control block processing
CN101165663A (zh) * 2006-10-17 2008-04-23 国际商业机器公司 Apparatus and method for communication with an I/O adapter using cached address translations
CN103647807A (zh) * 2013-11-27 2014-03-19 华为技术有限公司 Information caching method, apparatus and communication device

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US7941799B2 (en) * 2004-05-27 2011-05-10 International Business Machines Corporation Interpreting I/O operation requests from pageable guests without host intervention
CN102609378B (zh) * 2012-01-18 2016-03-30 中国科学院计算技术研究所 Message-based memory access apparatus and access method thereof
CN103019962B (zh) * 2012-12-21 2016-03-30 华为技术有限公司 Data cache processing method, apparatus and system

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN113709057A (zh) * 2017-08-11 2021-11-26 华为技术有限公司 Network congestion notification method, proxy node, network node and computer device
CN111277616A (zh) * 2018-12-04 2020-06-12 中兴通讯股份有限公司 RDMA-based data transmission method and distributed shared memory system
CN111277616B (zh) * 2018-12-04 2023-11-03 中兴通讯股份有限公司 RDMA-based data transmission method and distributed shared memory system
EP4160425A4 (en) * 2020-07-06 2023-11-01 Huawei Technologies Co., Ltd. DATA TRANSMISSION METHOD, CHIP AND DEVICE
CN113301285A (зh) * 2021-05-11 2021-08-24 深圳市度信科技有限公司 Multi-channel data transmission method, apparatus and system

Also Published As

Publication number Publication date
CN103647807B (zh) 2017-12-15
CN103647807A (zh) 2014-03-19


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 14866212; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 14866212; Country of ref document: EP; Kind code of ref document: A1)