WO2014176775A1 - Computer system, method for accessing a Peripheral Component Interconnect Express endpoint device, and apparatus

Info

Publication number
WO2014176775A1
WO2014176775A1 PCT/CN2013/075088 CN2013075088W WO2014176775A1 WO 2014176775 A1 WO2014176775 A1 WO 2014176775A1 CN 2013075088 W CN2013075088 W CN 2013075088W WO 2014176775 A1 WO2014176775 A1 WO 2014176775A1
Authority
WO
WIPO (PCT)
Prior art keywords
access
endpoint device
pcie endpoint
pcie
processor
Prior art date
Application number
PCT/CN2013/075088
Inventor
杜阁
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to KR1020137032327A priority Critical patent/KR101539878B1/ko
Priority to ES16180277.2T priority patent/ES2687609T3/es
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to CN201380000957.XA priority patent/CN104335194B/zh
Priority to EP18155911.3A priority patent/EP3385854B1/en
Priority to JP2015514331A priority patent/JP5953573B2/ja
Priority to EP13792568.1A priority patent/EP2811413B1/en
Priority to PCT/CN2013/075088 priority patent/WO2014176775A1/zh
Priority to CA2833940A priority patent/CA2833940C/en
Priority to ES18155911T priority patent/ES2866156T3/es
Priority to AU2013263866A priority patent/AU2013263866B2/en
Priority to EP16180277.2A priority patent/EP3173936B1/en
Priority to ES13792568.1T priority patent/ES2610978T3/es
Priority to BR112013033792-3A priority patent/BR112013033792B1/pt
Priority to ZA2013/08948A priority patent/ZA201308948B/en
Priority to US14/143,460 priority patent/US8782317B1/en
Priority to US14/297,959 priority patent/US10025745B2/en
Publication of WO2014176775A1 publication Critical patent/WO2014176775A1/zh
Priority to US14/703,328 priority patent/US9477632B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/42Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4204Bus transfer protocol, e.g. handshake; Synchronisation on a parallel bus
    • G06F13/4221Bus transfer protocol, e.g. handshake; Synchronisation on a parallel bus being an input/output bus, e.g. ISA bus, EISA bus, PCI bus, SCSI bus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/40Bus structure
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/20Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/382Information transfer, e.g. on bus using universal interface adapter
    • G06F13/385Information transfer, e.g. on bus using universal interface adapter for adaptation of a particular data processing system to different peripheral devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/40Bus structure
    • G06F13/4004Coupling between buses
    • G06F13/4027Coupling between buses using bus bridges
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/40Bus structure
    • G06F13/4063Device-to-bus coupling
    • G06F13/4068Electrical coupling
    • G06F13/4081Live connection to bus, e.g. hot-plugging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/4401Bootstrapping
    • G06F9/4411Configuring for operating with peripheral devices; Loading of device drivers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026PCI express
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/28DMA
    • G06F2213/2802DMA using DMA transfer descriptors

Definitions

  • Embodiments of the present invention relate to computer technology, and in particular to a method for accessing a Peripheral Component Interconnect Express (PCIe) endpoint device, a computer system, and an apparatus.

Background

  • Peripheral Component Interconnect Express (PCIe) is a high-speed bus that interconnects central processing units (CPUs) and peripherals, and serves as a core service channel in computing and storage devices.
  • Many kinds of peripheral devices can be interconnected with the CPU through the PCIe bus, such as network interface card (NIC) devices or solid state disks (SSDs). These devices are collectively referred to as PCIe endpoint devices in this document.
  • The PCIe bus is widely used as the bus interface of servers and storage systems.
  • Online expansion and maintenance require PCIe endpoint devices to be added or removed while the system is running, that is, they require hot swapping.
  • Existing PCIe hot plugging follows this process: the operator initiates a hot-plug request by pressing a button. Once the hot-plug controller learns of the hot-plug event, all drivers in the system that may access the PCIe endpoint device are notified to stop accessing it, and the resources of the PCIe endpoint device to be hot-plugged are unloaded. The PCIe endpoint device is then powered off, and the operator pulls it out.
  • Hot swapping of PCIe endpoint devices therefore currently requires advance notice so that the system keeps operating normally.
  • The PCIe bus has gradually evolved from interconnection within a system to interconnection between systems.
  • Applications such as external cabling have increased. Cables can easily come loose, so PCIe endpoint devices may go offline abnormally without any prior notice.
  • Scenarios in which the user connects an SSD directly to the system are also increasingly common. Out of habit, the user may insert or remove the SSD directly, without prior notice.
  • When a PCIe endpoint device suddenly goes offline abnormally as described above, any read/write instruction that the CPU has already issued to the PCIe endpoint device remains in the pending state. When the CPU's access instructions to the PCIe endpoint device accumulate to a certain extent, the CPU considers the entire system abnormal and reports a machine check exception (MCE) error, which causes the entire system to reset.
  • Embodiments of the present invention provide a method for accessing a PCIe endpoint device, a computer system, and an apparatus, which prevent the processor from triggering a reset after a PCIe endpoint device goes offline abnormally.
  • an embodiment of the present invention provides a computer system, where the computer system includes: a processor;
  • the computer system further includes an access proxy, the access proxy being connected to the processor and to the PCIe endpoint device;
  • the processor is configured to obtain an operation instruction, the operation instruction instructing the processor to access the PCIe endpoint device through the access proxy, and to send an access request to the access proxy according to the operation instruction, where the access request instructs the access proxy to access the PCIe endpoint device;
  • the access proxy is configured to send a response message for the access request to the processor after receiving the access request sent by the processor.
  • the computer system further includes: a driver module of the PCIe endpoint device, configured to generate the operation instruction according to a pre-configured access interface of the PCIe endpoint device, where the pre-configured access interface points to the access proxy;
  • the processor is specifically configured to acquire the operation instruction generated by the driver module of the PCIe endpoint device.
  • the computer system further includes:
  • a driver module of the PCIe endpoint device and a host operating system (HOS);
  • the driver module of the PCIe endpoint device is configured to invoke the host operating system to access the PCIe endpoint device;
  • the host operating system receives the call from the driver module of the PCIe endpoint device and generates the operation instruction according to the pre-configured access interface of the PCIe endpoint device, where the pre-configured access interface points to the access proxy;
  • the processor is specifically configured to acquire the operation instruction generated by the host operating system.
  • the access proxy is further configured to access the PCIe endpoint device according to the access request.
  • the PCIe endpoint device is connected to the processor of the computer system through the PCIe bus, the method comprising: the processor acquiring an operation instruction, the operation instruction instructing the processor to access the PCIe endpoint device by using an access proxy;
  • the processor sends an access request to the access proxy according to the operation instruction, where the access request instructs the access proxy to access the PCIe endpoint device;
  • the processor receives a response message of the access request sent by the access proxy.
  • the processor acquires the operation instruction generated by a driver module of the PCIe endpoint device according to a pre-configured access interface of the PCIe endpoint device, where the pre-configured access interface points to the access proxy; or the processor acquires the operation instruction generated by the host operating system according to the pre-configured access interface of the PCIe endpoint device, where the pre-configured access interface points to the access proxy.
  • the operation instruction specifically instructs the processor to access the PCIe endpoint device through a direct memory access (DMA) engine;
  • the processor sends a data transfer request to the DMA engine according to the operation instruction,
  • the data transfer request instructs the DMA engine to move specified data in the memory of the PCIe endpoint device to the memory of the computer system, or to move specified data in the memory of the computer system to the memory of the PCIe endpoint device.
  • the method further includes:
  • the processor obtains the access result according to the first notification message.
  • the method further includes:
  • the processor performs subsequent processing of the access failure according to the second notification message.
  • the subsequent processing of the access failure includes:
  • the processor suspends access to the PCIe endpoint device.
  • a computer comprising:
  • a memory for storing computer-executable instructions
  • the processor executes the computer-executable instructions stored in the memory and communicates with a device external to the computer via a communication interface, so that, when the computer is running, the computer performs the PCIe endpoint device access method of the second aspect.
  • a computer-readable medium comprising computer-executable instructions; when a processor of a computer executes the computer-executable instructions, the computer performs the PCIe endpoint device access method of the second aspect.
  • a high-speed peripheral component interconnect PCIe endpoint device access method is provided.
  • the PCIe endpoint device is connected to a processor of a computer system through a PCIe bus, and the method includes: receiving a call instruction, where the call instruction instructs access to the PCIe endpoint device;
  • generating an operation instruction according to the pre-configured access interface of the PCIe endpoint device, where the pre-configured access interface points to the access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.
  • a high-speed peripheral component interconnect PCIe endpoint device access apparatus, including:
  • a receiving module, configured to receive a call instruction, where the call instruction instructs access to the PCIe endpoint device
  • a generating module, configured to generate, according to the pre-configured access interface of the PCIe endpoint device, an operation instruction for accessing the PCIe endpoint device, where the pre-configured access interface points to the access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.
  • a computer comprising:
  • a memory for storing computer-executable instructions
  • when the computer is running, the processor executes the computer-executable instructions stored in the memory to cause the computer to perform the following method:
  • the eighth aspect provides a computer-readable medium comprising computer-executable instructions; when a processor of a computer executes the computer-executable instructions, the computer performs the following method:
  • an access proxy is provided, the access proxy being applied to a computer system, the computer system comprising a processor and a high speed peripheral component interconnect PCIe bus, wherein the PCIe bus is connected to at least one PCIe endpoint device;
  • the access agent connects the processor and the PCIe endpoint device respectively;
  • the access proxy is configured to isolate direct access between the processor and a PCIe endpoint device, receive an access request by the processor to the PCIe endpoint device, and return a response message of the access request to the processor.
  • a PCIe switch is provided, where the PCIe switch is applied to a computer system, the computer system includes a processor and a high-speed peripheral component interconnect PCIe bus, and the PCIe bus is connected to at least one PCIe endpoint device.
  • An upstream port of the PCIe switch is connected to the processor through the PCIe bus, and a downstream port of the PCIe switch is connected to the PCIe endpoint device through the PCIe bus; the access proxy described in the ninth aspect is built into the PCIe switch.
  • a method for allocating resources to a PCIe endpoint device that connects to a computer system, comprising:
  • the specified share is the resource requirement of the type of PCIe endpoint device that requires the most resources.
  • the PCIe endpoint device and the processor that are connected to the computer system form a PCIe domain.
  • the method also includes: recording, in the PCIe tree, the specified share of resources allocated for an access port of each PCIe endpoint device.
  • the method further includes:
  • a computer system including: a processor;
  • a basic input/output system (BIOS), configured to reserve a specified share of resources for the access port of each of the PCIe endpoint devices, where the specified share is greater than or equal to the resource requirement of each of the PCIe endpoint devices;
  • a PCIe management module, configured to allocate resources to the access port of each PCIe endpoint device according to the specified share of resources reserved by the BIOS.
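The reservation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reserve_port_resources`, the port names, and the requirement figures are hypothetical, and the dictionary stands in for the record kept in the PCIe tree.

```python
def reserve_port_resources(access_ports, requirement_by_type):
    """Reserve the same specified share for every access port.

    The share equals the largest requirement among the device types, so it
    is greater than or equal to what any single endpoint device needs.
    """
    specified_share = max(requirement_by_type.values())
    return {port: specified_share for port in access_ports}


# Hypothetical requirements, in MiB of address space per device type.
requirements = {"nic": 16, "ssd": 64, "gpu": 256}

# The resulting mapping plays the role of the per-port record in the PCIe tree.
pcie_tree = reserve_port_resources(["port0", "port1", "port2"], requirements)
```

Because every port already holds a share large enough for any supported device type, a device that is plugged in later fits into its port's reservation without a new allocation pass.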
  • the processor no longer accesses the PCIe endpoint device directly, but completes the access through the access proxy. The access proxy isolates the impact of a PCIe endpoint device going offline abnormally and returns a response message for the access request to the processor, so that tasks cached by the processor do not accumulate due to timeouts and the processor avoids an MCE reset.
  • the system reserves and allocates a specified share of resources for the access port of each PCIe endpoint device, so that the processor does not need to scan the PCIe endpoint device when it connects to the system. This avoids the system-wide reset caused by MCE errors that may occur when a PCIe endpoint device connects to the computer system.
  • FIG. 1 is a composition diagram of a computer system according to an embodiment of the present invention.
  • FIG. 2 is a block diagram of a program module included in a memory according to an embodiment of the present invention
  • FIG. 3 is a block diagram of still another computer system according to an embodiment of the present invention
  • FIG. 5 is a flowchart of a method according to an embodiment of the present invention.
  • FIG. 6 is a flow chart of still another method according to an embodiment of the present invention.
  • FIG. 7 is a flowchart of still another method according to an embodiment of the present invention.
  • FIG. 8 is a flowchart of still another method according to an embodiment of the present invention.
  • FIG. 9 is a flowchart of still another method according to an embodiment of the present invention.
  • FIG. 10 is a flowchart of still another method according to an embodiment of the present invention.
  • FIG. 11 is a structural diagram of an access device of a PCIe endpoint device according to an embodiment of the present invention.
  • FIG. 12 is a structural diagram of a computer according to an embodiment of the present invention.

Detailed description

  • Embodiments of the invention provide a method, a computer system, and an apparatus for accessing a high-speed peripheral component interconnect (PCIe) endpoint device.
  • When a PCIe endpoint device needs to be hot-swapped, it can be disconnected directly, without notifying the system in advance for pre-processing.
  • The cases in which a PCIe endpoint device is pulled directly out of the system or drops offline because of a fault are collectively referred to as abnormal offline of the PCIe endpoint device.
  • FIG. 1 is a block diagram of a computer system according to an embodiment of the present invention.
  • the computer system shown in FIG. 1 includes a CPU 110, a memory 120, and a PCIe endpoint device 130.
  • the PCIe endpoint device 130 is connected to the CPU 110 through the PCIe bus 140, and can be plugged into and unplugged from the computer system.
  • the PCIe endpoint device 130 includes various types, such as a graphics processing unit 131, a network adapter 132, a solid state disk (SSD) 133, and a video acceleration component 134.
  • the memory 120 is used to store data; the stored data may be data that the CPU acquires from an external device, or program data that the CPU runs.
  • one or more program modules may be stored in the memory, and the CPU 110 performs related operations according to the computer-executable instructions of the program modules.
  • the PCIe endpoint device 130 and the CPU 110 in this computer system form a PCIe domain; all devices in the PCIe domain are connected to the CPU 110 via the PCIe bus 140 and are controlled by the CPU 110.
  • the program modules in the memory 120 may specifically include an application module 121, a driver module 122, and a host operating system (HOS) 123. The application module 121 generates an access requirement for a PCIe endpoint device. The driver module 122 invokes the corresponding interface of the HOS 123 according to that access requirement (if the access interface is provided by the HOS). The HOS 123 generates an operation instruction according to the call from the driver module, so that the CPU accesses or controls the corresponding PCIe endpoint device according to the operation instruction.
  • one PCIe endpoint device corresponds to one driver module (one driver module may also correspond to multiple PCIe endpoint devices, as long as each PCIe endpoint device has a corresponding driver module). For example, for this system architecture, the driver modules of the PCIe endpoint devices in the memory 120 may include a driver module 122-1 of the graphics processing unit, a driver module 122-2 of the network adapter (NIC), a driver module 122-3 of the solid state disk (SSD), and a driver module 122-4 of the video acceleration component.
  • the driver module 122-3 of the SSD receives the call from the application module 121 and then calls the HOS 123; the HOS 123 issues an operation instruction to the CPU 110.
  • the operation instruction includes an indication of the device to be accessed, the SSD 133, and the related operation requirements.
  • the CPU 110 issues an access request to the SSD 133 according to the operation instruction of the SSD driver module 122-3, requesting access to a register of the SSD 133. If the SSD 133 is abnormally offline, the CPU 110 receives no response message for its access request and therefore considers the access task unfinished. If such unfinished tasks accumulate to a certain extent in the CPU, the CPU considers the entire system abnormal and reports an MCE error, which triggers a reset.
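The failure mode described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the behavior of any real CPU: the class name `DirectAccessCpu` and the numeric `pending_limit` are hypothetical, since the text only says that unfinished tasks accumulate "to a certain extent".

```python
class DirectAccessCpu:
    """Toy model of a CPU that accesses a PCIe endpoint device directly."""

    def __init__(self, pending_limit=4):
        # pending_limit is a hypothetical threshold; the source gives no number.
        self.pending_limit = pending_limit
        self.pending = 0
        self.mce_reported = False

    def access(self, device_online):
        """Issue one access request; return True once an MCE is reported."""
        self.pending += 1
        if device_online:
            self.pending -= 1  # the response arrived, so the task completes
        if self.pending >= self.pending_limit:
            self.mce_reported = True
        return self.mce_reported


cpu = DirectAccessCpu(pending_limit=3)
online_results = [cpu.access(device_online=True) for _ in range(5)]
offline_results = [cpu.access(device_online=False) for _ in range(3)]
```

While the device answers, the pending count stays at zero. Once the device is offline, every request stays pending, and the third unanswered request reaches the assumed limit.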
  • the embodiment of the present invention changes how the CPU accesses the PCIe endpoint device: the CPU does not access the PCIe endpoint device directly, but through a third party. As shown in FIG. 1, the embodiment adds an access proxy 160 to the system.
  • the access proxy 160 accesses the PCIe endpoint device in place of the CPU 110 and isolates the CPU 110 from the impact of the PCIe endpoint device going offline abnormally.
  • the CPU 110 no longer accesses the SSD 133 over line 1, but over line 2 and line 3 (shown in the figure as Line1, Line2, and Line3; the dotted lines Line1 to Line3 are not physical connections and only illustrate the signal flow between the component modules).
  • the CPU 110 first acquires an operation instruction that instructs the CPU to access the SSD 133 through the access proxy 160. The CPU 110 sends an access request to the access proxy 160 over line 2, and the access proxy 160 returns a response message for the access request to the CPU 110 over line 2. The access proxy then accesses the PCIe endpoint device according to the access request, that is, it reads and writes the registers of the SSD 133 over line 3.
  • the access proxy 160 returns a response message to the CPU 110 after receiving an access request from the CPU 110, so every access request sent by the CPU 110 receives a corresponding response message. The access tasks of the CPU 110 therefore do not accumulate as unfinished, no MCE error occurs, and a system reset initiated by the CPU is avoided.
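The respond-first, access-later ordering described above can be sketched as follows. This is a minimal illustration, assuming a dictionary stands in for the SSD and its registers; the class `AccessProxy` and its method names are hypothetical and are not taken from the patent.

```python
from collections import deque


class AccessProxy:
    """Toy access proxy: respond to the CPU first, touch the device afterwards."""

    def __init__(self, device):
        self.device = device  # dict with an "online" flag and a "registers" dict
        self.queue = deque()

    def submit(self, register, value):
        """Called by the CPU (line 2): queue the request and respond at once."""
        self.queue.append((register, value))
        return "response"

    def drain(self):
        """Perform the queued device accesses (line 3) and report each outcome."""
        outcomes = []
        while self.queue:
            register, value = self.queue.popleft()
            if self.device["online"]:
                self.device["registers"][register] = value
                outcomes.append("completed")
            else:
                outcomes.append("failed")
        return outcomes


ssd = {"online": True, "registers": {}}
proxy = AccessProxy(ssd)
first_response = proxy.submit("ctrl", 1)
first_outcomes = proxy.drain()

ssd["online"] = False  # the SSD is pulled out without notice
second_response = proxy.submit("ctrl", 2)
second_outcomes = proxy.drain()
```

The CPU side sees the same response whether or not the device is present; only the proxy observes the failed access.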
  • the change in how the CPU accesses the PCIe endpoint device can be implemented by upgrading or improving the driver module corresponding to the PCIe endpoint device.
  • In one implementation, the access interface is pre-configured in the driver module corresponding to the PCIe endpoint device, and the pre-configured access interface points to the access proxy. When the driver module determines that the PCIe endpoint device needs to be accessed, it generates an operation instruction for the CPU according to the pre-configured access interface, and the operation instruction instructs the CPU to access the PCIe endpoint device through the access proxy.
  • In another implementation, the access interface is pre-configured in the HOS, and the pre-configured access interface points to the access proxy.
  • When the driver module of the PCIe endpoint device determines that the PCIe endpoint device needs to be accessed, it still calls the HOS to perform the access, and the HOS receives the call instruction sent by the driver module. Because the access interface of the PCIe endpoint device has been pre-configured to point to the access proxy, the operation instruction generated by the HOS instructs the CPU to access the PCIe endpoint device through the access proxy.
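One way to picture a pre-configured access interface is as a lookup table that the driver module or the HOS consults when it builds an operation instruction. The sketch below assumes that representation; the identifiers `ssd-133` and `access-proxy-160`, the table `ACCESS_INTERFACES`, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationInstruction:
    device: str     # the PCIe endpoint device to be accessed
    operation: str  # the related operation requirement
    via: str        # the component the CPU must send the access request to


# Pre-configured access interfaces (hypothetical identifiers). Each entry
# points to the access proxy rather than to the endpoint device itself.
ACCESS_INTERFACES = {"ssd-133": "access-proxy-160"}


def generate_operation_instruction(device, operation, interfaces=ACCESS_INTERFACES):
    """Build the instruction a driver module or the HOS would hand to the CPU."""
    return OperationInstruction(device, operation, via=interfaces[device])


instruction = generate_operation_instruction("ssd-133", "read-register")
```

Because the redirection lives in the table, the same function serves both variants: the table can belong to the driver module or to the HOS without changing what the CPU receives.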
  • the access proxy of the embodiment of the present invention provides an isolation function and an access proxy function.
  • As an isolation module, the access proxy must be independent of both the PCIe endpoint device and the CPU. To stay independent of the PCIe endpoint device, the access proxy must not be pulled out together with the PCIe endpoint device.
  • The access proxy and the PCIe endpoint device therefore need to be physically separate devices. Independence from the CPU mainly means that the access proxy has its own processor, separate from the system CPU, so that even if the PCIe endpoint device is pulled out directly, the effect on the access proxy does not propagate to the CPU.
  • As an access proxy, it must access the PCIe endpoint device on behalf of the CPU and return a response message for each access request received from the CPU. The response message may be a confirmation response, a rejection response, or a failure response. Whichever kind it is, the response message tells the CPU that its access request has been received. After receiving the response message, the CPU determines that the current task is complete and can stop the timer started for that task. The CPU's own task-timeout mechanism thus keeps working normally, other messages cached by the CPU do not accumulate due to timeouts, and the CPU avoids an MCE reset.
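The point that any kind of response stops the CPU's task timer can be shown with a small sketch. The names `Response` and `TaskTimers` are hypothetical, and a set of task identifiers stands in for real timers.

```python
import enum


class Response(enum.Enum):
    CONFIRMATION = "confirmation"
    REJECTION = "rejection"
    FAILURE = "failure"


class TaskTimers:
    """Tracks which CPU access tasks still have a running timeout timer."""

    def __init__(self):
        self.running = set()

    def start(self, task_id):
        self.running.add(task_id)

    def on_response(self, task_id, response):
        """Any response kind stops the timer; return True only on confirmation."""
        self.running.discard(task_id)
        return response is Response.CONFIRMATION


timers = TaskTimers()
for task_id in (1, 2, 3):
    timers.start(task_id)

# Answer the three tasks with one response of each kind, in definition order.
accepted = [timers.on_response(task_id, response)
            for task_id, response in zip((1, 2, 3), Response)]
```

Only the first task is accepted, yet no timer is left running, which is the property that keeps tasks from piling up.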
  • In one implementation, the access proxy 160 is provided as a separate, newly added device in the computer system, connected to the CPU and to the PCIe endpoint device through the PCIe bus.
  • the access proxy 160 can also be implemented with an existing device in the PCIe domain, for example a direct memory access (DMA) engine packaged with the CPU as firmware. The access proxy can also be implemented in newly added hardware, for example by installing a software module that provides the access proxy function on a hardware device with a separate processor.
  • The connection between the access proxy and the CPU is always maintained, that is, the connection is never broken and the access proxy is not hot-swappable relative to the CPU. For example, the hardware device that carries or implements the access proxy is soldered onto the printed circuit board (PCB) to which the CPU is connected, or the interface through which that hardware device connects to the processor is secured with a fastening device.
  • the computer system shown in FIG. 3 includes a PCIe switch 150 in addition to the CPU, PCIe bus, and PCIe endpoint device shown in FIG. 1.
  • the upstream port of the PCIe switch 150 is connected to the CPU 110 through the PCIe bus 140, and its downstream ports provide a PCIe port for each PCIe endpoint device; each PCIe port is connected to its PCIe endpoint device through the PCIe bus 140.
  • the PCIe switch 150 is used to route data downstream to the corresponding PCIe port, and to route data upstream to the CPU 110 from each individual PCIe port.
  • the newly added access proxy 160 is disposed inside the PCIe switch 150, and in this embodiment the access proxy 160 is implemented by a DMA engine.
  • the PCIe endpoint device 130 is connected to the PCIe switch 150 through the PCIe bus 140. Because the PCIe switch 150 and the PCIe endpoint device 130 are different devices, pulling out any one PCIe endpoint device does not remove the PCIe switch 150 from the system. The access proxy 160 is therefore not pulled out together with the PCIe endpoint device, which makes the access proxy 160 independent of the PCIe endpoint device 130. In addition, because the DMA engine in this embodiment has a separate processor, if any PCIe endpoint device is pulled out directly, the DMA engine isolates the impact even if its own access to the PCIe endpoint device is affected. Whether or not the PCIe endpoint device is accessed successfully, the DMA engine ensures that a response message for the access request issued by the CPU 110 is returned, thereby avoiding a CPU-initiated MCE reset.
  • The application module generates an access requirement for the SSD 133.
  • The CPU 110 acquires an operation instruction generated by the driver module 122-3 of the SSD; the operation instruction instructs the CPU 110 to access the SSD 133 through the DMA engine.
  • According to the operation instruction of the driver module 122-3, the CPU 110 sends a data transfer request to the DMA engine. The data transfer request instructs the DMA engine to move designated data from the memory of the PCIe endpoint device to the memory of the computer system, or to move designated data from the memory of the computer system to the memory of the PCIe endpoint device.
  • After receiving the data transfer request from the CPU 110, the DMA engine returns a response message for the data transfer request to the CPU 110 and performs the data movement on the SSD 133. After the data movement is complete, the DMA engine returns an access-completion notification message to the CPU 110 so that the CPU 110 can obtain the result of the access.
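  • The exchange between the CPU and the DMA engine in the switch can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the patented implementation: the names `SwitchDmaEngine`, `DataTransferRequest` and `Direction`, the strings `"ack"`, `"access_complete"` and `"access_failed"`, and the use of `None` to model a pulled-out endpoint are all introduced here for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Direction(Enum):
    DEVICE_TO_HOST = 1  # endpoint memory -> computer system memory
    HOST_TO_DEVICE = 2  # computer system memory -> endpoint memory


@dataclass
class DataTransferRequest:
    direction: Direction
    offset: int   # for brevity the same offset is used in both memories
    length: int


@dataclass
class SwitchDmaEngine:
    """Hypothetical DMA engine built into the PCIe switch."""
    host_memory: bytearray
    device_memory: Optional[bytearray]  # None models an endpoint that was pulled out
    pending: List[DataTransferRequest] = field(default_factory=list)
    notifications: List[str] = field(default_factory=list)

    def submit(self, request: DataTransferRequest) -> str:
        # The response comes from the engine itself, so it is returned
        # whether or not the endpoint is still present.
        self.pending.append(request)
        return "ack"

    def run(self) -> None:
        # Perform the queued moves and record one notification per request.
        while self.pending:
            req = self.pending.pop(0)
            if self.device_memory is None:
                self.notifications.append("access_failed")
                continue
            end = req.offset + req.length
            if req.direction is Direction.DEVICE_TO_HOST:
                self.host_memory[req.offset:end] = self.device_memory[req.offset:end]
            else:
                self.device_memory[req.offset:end] = self.host_memory[req.offset:end]
            self.notifications.append("access_complete")
```

  • In this sketch the acknowledgment is decoupled from the data movement, which is the property the embodiment relies on: `submit` answers even when `device_memory` is `None`.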
  • The PCIe switch 150 may also be soldered onto the printed circuit board (PCB) to which the CPU 110 is connected, or the interface through which the PCIe switch is connected to the CPU 110 may be fixed with a connecting device, which ensures that the DMA engine built into the PCIe switch 150 is not pulled out of the system. This guarantees that the DMA engine can always return a response message for the access request to the CPU.
  • FIG. 4 shows a computer system provided by another embodiment of the present invention.
  • This embodiment adds an access proxy 160 to the CPU 110; the access proxy can be implemented by a DMA engine.
  • Because the access proxy 160 is disposed inside the CPU 110, it is not pulled out when a PCIe endpoint device is unplugged, so the access proxy 160 is independent of the PCIe endpoint device 130. In addition, in this embodiment the DMA engine has its own processor, so if any PCIe endpoint device is directly pulled out, the DMA engine isolates the impact and does not pass it on to the CPU 110, even if its own access to that PCIe endpoint device is affected. Whether or not the PCIe endpoint device is accessed successfully, the DMA engine is guaranteed to return a response message for the access request issued by the CPU 110, which avoids the MCE reset of the CPU.
  • The specific access procedure in this embodiment is the same as in the embodiments of FIG. 1 and FIG. 3, and is not described again here.
  • The access method for a PCIe endpoint device in the embodiments of the present invention may be implemented in the computer system shown in FIG. 1, FIG. 3, or FIG. 4. However, FIG. 1, FIG. 3, and FIG. 4 are only examples to which the embodiments apply and do not specifically limit the application of the present invention; other system embodiments or application scenarios are not described further in this application.
  • The placements of the access proxy in the systems described in FIG. 1, FIG. 3, and FIG. 4 are only two examples. Those skilled in the art may also place the access proxy added in the embodiments of the present invention at another location in the system, or use other technical means to implement the technical principle of the embodiments of the present invention.
  • The CPU 110 depicted in FIG. 1, FIG. 3, and FIG. 4 is also only an example; it may instead be, for example, a specific integrated circuit. In either form, it implements the functions of the processor in a computer system.
  • The computer system in the embodiments of the present invention may be a computing server, or a server that manages routing, such as a switch.
  • The specific implementation form of the computer system is not limited.
  • The following describes, for a computer system to which an access proxy has been added, how an embodiment of the present invention accesses a PCIe endpoint device.
  • The process of accessing the PCIe endpoint device provided by the embodiment of the present invention includes:
  • S501: The CPU acquires an operation instruction, where the operation instruction instructs the CPU to access the PCIe endpoint device through an access proxy in the computer system.
  • The operation instruction may be generated by the driver module of the PCIe endpoint device. The driver module has preconfigured the access interface of the PCIe endpoint device as the access proxy, so when an upper-layer application module generates an access requirement for a PCIe endpoint device, the driver module of that PCIe endpoint device generates an operation instruction for accessing it. The operation instruction instructs the CPU to access the target PCIe endpoint device through the access proxy in the computer system.
  • Alternatively, the operation instruction may be generated by the HOS in the computer system. The HOS preconfigures the access interface of the PCIe endpoint device as the access proxy. When the upper-layer application module generates an access requirement for a PCIe endpoint device, the driver module of the PCIe endpoint device invokes the HOS, and the HOS generates an operation instruction according to the preconfigured access interface. The operation instruction instructs the CPU to access the target PCIe endpoint device through the access proxy in the computer system.
  • S502: The CPU sends an access request to the access proxy according to the operation instruction, where the access request instructs the access proxy to access the PCIe endpoint device.
  • After receiving the access request sent by the CPU, the access proxy returns a response message for the access request to the CPU.
  • The response message may be an acknowledgment, a rejection, or a failure response, but any response message tells the CPU that the access request it sent has been received. After receiving the response message, the CPU determines that the task is complete and can stop the timer started for the task, so the CPU's own task-timeout mechanism continues to operate normally.
  • The CPU no longer accesses the target PCIe endpoint device directly, but completes the access through the access proxy.
  • The access proxy can isolate the impact of a PCIe endpoint device going offline abnormally. Because the access proxy returns a response message for the access request to the CPU, the tasks cached by the CPU do not accumulate due to timeouts, and the CPU avoids an MCE reset.
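  • The effect of the response message on the CPU's outstanding tasks can be illustrated with a toy model. The sketch below is only an assumption-laden illustration: the `Cpu` class, the two target functions, and the constant `MCE_THRESHOLD` are hypothetical, and the text does not state how many stuck tasks trigger an MCE.

```python
MCE_THRESHOLD = 3  # assumed number of stuck tasks that triggers an MCE


class Cpu:
    def __init__(self):
        self.outstanding = set()  # tasks whose timers are still running
        self.mce_raised = False

    def issue(self, task_id, target):
        """Send one request to `target` (a callable) and track its timer."""
        self.outstanding.add(task_id)        # start the timer for the task
        response = target(task_id)
        if response is not None:             # any response closes the timer
            self.outstanding.discard(task_id)
        if len(self.outstanding) >= MCE_THRESHOLD:
            self.mce_raised = True
        return response


def offline_endpoint(task_id):
    # Direct access to an endpoint that was pulled out: no completion arrives.
    return None


def access_proxy(task_id):
    # The proxy always answers; here it returns a failure response because
    # the endpoint behind it is assumed to be offline.
    return "failure"
```

  • With direct access the outstanding set grows until the assumed threshold is reached; through the proxy it stays empty even though every access fails.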
  • The process by which the access proxy accesses the PCIe endpoint device includes:
  • The access proxy initiates an access operation to the PCIe endpoint device according to the access request of the CPU.
  • S605: The access proxy determines whether the access operation initiated to the PCIe endpoint device was performed successfully. If it succeeded, step 606 is performed; if it failed, step 608 is performed.
  • S606: The access proxy sends a first notification message, indicating access completion, to the CPU.
  • S607: After receiving the first notification message, the CPU obtains the result of the access; the CPU may further notify the upper-layer module, according to the access result, that the access is complete.
  • S608: The access proxy sends a second notification message, indicating access failure, to the CPU.
  • S609: After receiving the second notification message, the CPU performs the subsequent processing for the access failure.
  • The subsequent processing for the access failure includes determining why the access proxy failed to access the PCIe endpoint device. If the access failed because the target PCIe endpoint device went offline abnormally, the CPU suspends access to the PCIe endpoint device. If the access failed because the access proxy itself is faulty, the CPU resets the access proxy or issues a notification of the access proxy fault so that the fault can be repaired.
  • The upper-layer module may further be notified to stop accessing the PCIe endpoint device.
  • The access proxy accesses the PCIe endpoint device in place of the CPU and returns a response message for the access request to the CPU, which avoids the whole-system reset caused by an MCE error in the CPU. Further, when the access proxy fails to access the PCIe endpoint device, it notifies the CPU of the access failure, and the CPU performs fault diagnosis. When the CPU determines that the access failed because the target PCIe endpoint device went offline abnormally, it suspends access to that PCIe endpoint device, which avoids the waste of resources caused by the system repeatedly attempting an access that cannot succeed.
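  • The notification and failure-handling branch of this flow can be written down compactly. The following Python sketch is illustrative only; the enum `Notification`, the functions `proxy_access` and `cpu_handle`, and the returned action strings are hypothetical names, and the two boolean inputs stand in for diagnosis steps whose details the text does not specify.

```python
from enum import Enum


class Notification(Enum):
    ACCESS_COMPLETE = 1  # first notification message
    ACCESS_FAILED = 2    # second notification message


def proxy_access(endpoint_online: bool, proxy_healthy: bool) -> Notification:
    """Proxy side: attempt the access, then report the outcome."""
    if endpoint_online and proxy_healthy:
        return Notification.ACCESS_COMPLETE
    return Notification.ACCESS_FAILED


def cpu_handle(notification: Notification, proxy_healthy: bool) -> str:
    """CPU side: choose the follow-up action for a notification."""
    if notification is Notification.ACCESS_COMPLETE:
        return "obtain_result"
    if not proxy_healthy:
        return "reset_access_proxy"      # or report the fault for repair
    return "suspend_endpoint_access"     # endpoint went offline abnormally
```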
  • The specific access process is shown in FIG. 7 and includes:
  • The CPU in the computer system acquires an operation instruction, where the operation instruction carries an access interface and access content. The access interface points to a DMA engine. The access content indicates that the access object is the SSD, that the access is a read operation, and the source address of the read operation. The access content may further indicate the length of the read operation, although in general the length of the read operation may be determined by the system default length.
  • The driver module of the SSD device receives a call from the upper-layer module and generates an operation instruction for accessing the PCIe endpoint device according to the preconfigured access interface.
  • The operation instruction sent by the driver module to the CPU may also take other forms. For example, the operation instruction carries an indication that the access object is an SSD, that the access is a read operation, and the start address of the read operation; in addition, an instruction is added to the operation instruction to indicate that access to the SSD is implemented by operating the DMA engine.
  • S702: The CPU sends a data transfer request to the DMA engine according to the operation instruction, where the data transfer request instructs the DMA engine to move specified data from the memory of the PCIe endpoint device to the memory of the computer system.
  • After acquiring the operation instruction from the driver module of the SSD, the CPU requests a destination address for the read operation from the memory of the computer system. After obtaining the destination address, the CPU sends a data transfer request to the DMA engine. The data transfer request indicates the source address, destination address, and length of the read operation, and instructs the DMA engine to move data of that length from the source address of the read operation to the destination address of the read operation.
  • S703: After receiving the data transfer request from the CPU, the DMA engine returns a response message for the data transfer request to the CPU. After receiving the response message, the CPU stops the timeout timer for the data transfer request, which ensures that other messages buffered by the CPU do not accumulate and cause the CPU to generate an MCE reset.
  • The DMA engine initiates a read request to the SSD device. The read request carries the source address of the read operation and requests that the value of the register corresponding to that source address be read into the cache of the DMA engine.
  • The DMA engine determines whether the read request was executed successfully. If it succeeded, step 706 is performed; if it failed, step 709 is performed.
  • The DMA engine sends a first notification message to the CPU to notify the CPU that the access is complete; specifically, the first notification message may be a first Message Signaled Interrupt (MSI).
  • After receiving the first MSI, the CPU reads the data at the destination address of the read operation, and may notify the driver module of the SSD device that the access is complete.
  • The DMA engine sends a second notification message to the CPU to notify the CPU that the access failed; specifically, the second notification message may be a second MSI.
  • The subsequent processing for the access failure may include initiating a diagnosis of the DMA engine to determine whether the DMA engine is faulty.
  • If the DMA engine is faulty, the CPU resets the DMA engine or issues a notification of the DMA engine fault so that the fault can be repaired.
  • The CPU may further notify the driver module of the SSD device to stop accessing the SSD device.
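  • The read flow of FIG. 7 can be sketched in a few lines of Python. This is a simplified model under assumptions, not the actual hardware behavior: memories are modeled as `bytearray` objects, an offline SSD is modeled as `None`, the interrupts are plain strings, and `DEFAULT_READ_LENGTH`, `ReadInstruction`, `dma_read` and `cpu_read` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_READ_LENGTH = 4    # assumed system default length, in bytes
FIRST_MSI = "first_msi"    # access complete
SECOND_MSI = "second_msi"  # access failed


@dataclass
class ReadInstruction:
    source_address: int
    length: Optional[int] = None  # None means "use the system default"


def dma_read(ssd, system_memory, source, destination, length):
    """DMA engine side: stage the data in a cache, then place it at the destination."""
    if ssd is None:  # SSD went offline abnormally
        return SECOND_MSI
    cache = bytes(ssd[source:source + length])
    system_memory[destination:destination + length] = cache
    return FIRST_MSI


def cpu_read(instruction, ssd, system_memory, destination):
    """CPU side: build the transfer request and react to the interrupt."""
    length = instruction.length if instruction.length is not None else DEFAULT_READ_LENGTH
    msi = dma_read(ssd, system_memory, instruction.source_address, destination, length)
    if msi == FIRST_MSI:
        return bytes(system_memory[destination:destination + length])
    return None  # failure handling would start here
```

  • The `destination` argument stands for the address the CPU obtained from system memory before sending the request.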
  • The CPU acquires an operation instruction generated by the driver module of the SSD device, where the operation instruction carries an access interface and access content. The access interface points to a DMA engine. The access content indicates that the access object is the SSD, that the access is a write operation, and the source address and destination address of the write operation.
  • The operation instruction sent by the driver module to the CPU may also take other forms. For example, the operation instruction carries an indication that the access object is an SSD, that the access is a write operation, and the source address and destination address of the write operation; in addition, an instruction is added to the operation instruction to indicate that access to the SSD is implemented by operating the DMA engine.
  • The CPU sends a data transfer request to the DMA engine according to the operation instruction of the SSD driver module, where the data transfer request instructs the DMA engine to move specified data from the memory of the computer system to the memory of the PCIe endpoint device.
  • After acquiring the operation instruction from the driver module of the SSD, the CPU sends a data transfer request to the DMA engine. The data transfer request indicates the source address, destination address, and length of the write operation, and instructs the DMA engine to move data of that length from the source address of the write operation to the destination address of the write operation.
  • S804: The DMA engine initiates a read request to the source address of the write operation to read the data at the source address into the cache of the DMA engine.
  • S805: After the data at the source address has been read into its own cache, the DMA engine initiates a write request to the SSD device. The write request carries the destination address of the write operation and requests that the data in the cache of the DMA engine be written into the register corresponding to the destination address.
  • The DMA engine determines whether the write request was executed successfully. If it succeeded, step 807 is performed; if it failed, step 809 is performed.
  • The DMA engine sends a first Message Signaled Interrupt (MSI) to the CPU to notify the CPU that the access is complete.
  • The DMA engine sends a second MSI to the CPU to notify the CPU that the access failed.
  • The subsequent processing for the access failure may include initiating a diagnosis of the DMA engine to determine whether the DMA engine is faulty.
  • If the DMA engine is faulty, the CPU resets the DMA engine or issues a notification of the DMA engine fault so that the fault can be repaired.
  • If the DMA engine is not faulty, it is determined that the access failed because the SSD device went offline abnormally, and the CPU suspends access to the SSD device.
  • The CPU may further notify the driver module of the SSD device to stop accessing the SSD device.
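  • The write flow of FIG. 8, including the diagnosis that follows a second MSI, might look like the sketch below. It is a hedged illustration rather than the patented design: the class `DmaEngine`, its `faulty` flag, the function `diagnose_failure`, and the string return values are hypothetical, and an offline SSD is again modeled as `None`.

```python
class DmaEngine:
    """Hypothetical DMA engine with a staging cache."""

    def __init__(self, faulty=False):
        self.faulty = faulty
        self.cache = b""

    def write(self, system_memory, ssd, source, destination, length):
        if self.faulty:
            return "second_msi"
        # Read the source data into the engine's own cache.
        self.cache = bytes(system_memory[source:source + length])
        # Write the cached data to the SSD, if it is still present.
        if ssd is None:
            return "second_msi"
        ssd[destination:destination + length] = self.cache
        return "first_msi"


def diagnose_failure(engine):
    """CPU side: decide what a second MSI means."""
    if engine.faulty:
        return "reset_dma_engine"
    return "suspend_ssd_access"  # engine is fine, so the SSD is taken to be offline
```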
  • The flows shown in FIG. 7 and FIG. 8 describe the methods, provided by the embodiments of the present invention, by which the DMA engine reads from or writes to the SSD device.
  • The DMA engine accesses the PCIe endpoint device in place of the CPU and returns a response message for the access request to the CPU, so the CPU does not generate an MCE error and a reset of the entire system is avoided.
  • When the DMA engine fails to move the data of the SSD device, it notifies the CPU of the access failure, and the CPU performs fault diagnosis. When the CPU determines that the access failed because the SSD device went offline abnormally, it suspends access to the SSD device, which avoids the waste of resources caused by the system repeatedly attempting an access that cannot succeed.
  • The embodiments of the present invention change the way the CPU accesses the PCIe endpoint device; this can be implemented by upgrading or improving either the driver module corresponding to the PCIe endpoint device or the host operating system. If the driver module corresponding to the PCIe endpoint device is used to change the access mode, the process may include the following:
  • The driver module of the PCIe endpoint device receives a call instruction from an upper-layer application module, where the call instruction indicates that the PCIe endpoint device is to be accessed.
  • The driver module corresponding to the PCIe endpoint device generates an operation instruction according to the preconfigured access interface of the PCIe endpoint device, where the preconfigured access interface points to the access proxy and the operation instruction instructs the CPU to access the PCIe endpoint device through the access proxy.
  • Alternatively, if the host operating system is used to change the access mode, the process may include the following:
  • The driver module corresponding to the PCIe endpoint device receives a call instruction from an upper-layer application module, where the call instruction indicates that the PCIe endpoint device is to be accessed.
  • The driver module corresponding to the PCIe endpoint device invokes the host operating system, where the invocation indicates that the PCIe endpoint device is to be accessed.
  • S1003: The host operating system generates an operation instruction according to the preconfigured access interface of the PCIe endpoint device, where the preconfigured access interface points to the access proxy and the operation instruction instructs the CPU to access the PCIe endpoint device through the access proxy.
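  • Both paths end in the same operation instruction, which the sketch below tries to make concrete. Everything in it is an assumption made for illustration: the table `PRECONFIGURED_ACCESS_INTERFACE`, the dictionary layout of the instruction, and the function names are hypothetical and do not come from the source.

```python
# Hypothetical preconfigured table: device name -> where its access interface points.
PRECONFIGURED_ACCESS_INTERFACE = {"ssd": "access_proxy"}


def generate_operation_instruction(device, operation):
    """Build the operation instruction that tells the CPU to go through the proxy."""
    return {
        "access_interface": PRECONFIGURED_ACCESS_INTERFACE[device],
        "device": device,
        "operation": operation,
    }


def host_os_generate(device, operation):
    # Host-OS path: the operating system owns the preconfigured interface.
    return generate_operation_instruction(device, operation)


def driver_handle_call(device, operation, via_host_os=False):
    """Driver module entry point, called by the upper-layer application module."""
    if via_host_os:
        return host_os_generate(device, operation)  # driver invokes the host OS
    return generate_operation_instruction(device, operation)
```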
  • The apparatus for accessing a Peripheral Component Interconnect Express (PCIe) endpoint device includes:
  • A receiving module 1101, configured to receive a call instruction, where the call instruction indicates that the PCIe endpoint device is to be accessed.
  • A generating module 1102, configured to generate, according to the preconfigured access interface of the PCIe endpoint device, an operation instruction for accessing the PCIe endpoint device, where the preconfigured access interface points to the access proxy and the operation instruction instructs the CPU to access the PCIe endpoint device through the access proxy.
  • The access apparatus may be a driver module of the PCIe endpoint device or the host operating system of the computer system.
  • FIG. 12 is a schematic structural diagram of a computer according to an embodiment of the present invention.
  • The computer of the embodiment of the present invention may include:
  • A processor 1201, a memory 1202, and a communication interface 1205, which are connected by a system bus 1204 and communicate with one another.
  • The processor 1201 may be a single-core or multi-core central processing unit, a specific integrated circuit, or one or more integrated circuits configured to implement embodiments of the present invention.
  • The memory 1202 may be a high-speed RAM or a non-volatile memory, such as at least one disk memory.
  • The memory 1202 is used to store computer-executable instructions 1203. Specifically, the computer-executable instructions 1203 may include program code.
  • When the computer runs, the processor 1201 executes the computer-executable instructions 1203 and can perform the method flow described in any one of FIGS. 5-10.
  • An embodiment of the present invention further proposes a new resource allocation scheme for PCIe endpoint devices, under which the CPU does not need to scan a newly powered-on PCIe endpoint device and allocate resources to it.
  • The Basic Input/Output System (BIOS) needs to reserve resources for each device in the system.
  • The BIOS scans the access port of each PCIe endpoint device. When it finds a PCIe endpoint device, the BIOS reads the corresponding registers of the PCIe endpoint device and reserves resources according to the requirements of the PCIe endpoint device, such as bus resources and memory address resources.
  • The access port of a PCIe endpoint device in the embodiments of the present invention may specifically be a downstream port of the PCIe switch or a downstream port of the north bridge in the system.
  • The resource allocation scheme for PCIe endpoint devices provided by the embodiments of the present invention differs from the prior art in that, when the BIOS of the computer system reserves resources, it no longer reserves them according to the actual requirements of the PCIe endpoint devices actually found by scanning. Instead, it reserves a specified share of resources for the access port of each PCIe endpoint device, where the specified share is greater than or equal to the resource requirement of the PCIe endpoint device.
  • The specified share may be the resource requirement of the type of PCIe endpoint device with the largest resource requirement.
  • When the BIOS scans the access port of each PCIe endpoint device in the computer system, regardless of whether a PCIe endpoint device is found and regardless of which type of PCIe endpoint device is found, it reserves resources on the assumption that the access port may later be connected to a PCIe endpoint device of the type with the largest resource requirement. For example, suppose the current system may use 10 PCIe endpoint devices, and the type with the largest resource requirement is the SSD device, which requires 10M of non-prefetchable memory resources and 3 PCIe buses. The BIOS then reserves 3 PCIe bus resources and 10M of non-prefetchable memory resources on the access port of each PCIe endpoint device.
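  • The reservation rule can be worked through in code. In the sketch below only the SSD figures (3 PCIe buses, 10M of non-prefetchable memory) come from the text; the NIC figures, the port names, and the names `ResourceNeed`, `specified_share` and `reserve` are made up for illustration. Taking the maximum per resource is one assumed way to obtain a share that is at least as large as every device type's requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceNeed:
    pcie_buses: int
    non_prefetchable_m: int  # same "M" unit as in the text


def specified_share(needs_by_type):
    """Largest requirement per resource across all device types."""
    needs = list(needs_by_type.values())
    return ResourceNeed(
        pcie_buses=max(n.pcie_buses for n in needs),
        non_prefetchable_m=max(n.non_prefetchable_m for n in needs),
    )


def reserve(access_ports, share):
    """Reserve the same share on every access port, occupied or not."""
    return {port: share for port in access_ports}


# SSD figures are from the text; the NIC figures and port names are made up.
needs = {"ssd": ResourceNeed(3, 10), "nic": ResourceNeed(1, 2)}
reservation = reserve(["port0", "port1", "port2"], specified_share(needs))
```

  • Every port in `reservation` receives the SSD-sized share, so any supported device type can later be attached to any port without a new reservation.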
  • The PCIe management module of the computer system groups all PCIe endpoint devices and PCIe switches managed by one CPU in the computer system into one PCIe domain and configures a corresponding PCIe tree for that PCIe domain. The PCIe tree describes the connection relationships, layer by layer, from each PCIe endpoint device in the PCIe domain to the CPU, and the resource configuration of each PCIe endpoint device. Because the BIOS has already reserved a specified share of resources for the access port of each PCIe endpoint device, the PCIe management module, when loading the access port of each PCIe endpoint device, does not scan the actual resource requirement of the PCIe endpoint device on that port. Instead, it allocates resources according to the BIOS's earlier reservation: it allocates the specified share reserved by the BIOS to the access port of each PCIe endpoint device and records that allocation in the PCIe tree.
  • When the PCIe management module determines that a PCIe endpoint device has gone offline, it does not release the specified share of resources allocated for the powered-down PCIe endpoint device, and it keeps the structure of the PCIe tree unchanged; that is, the connection relationship and resource configuration of the offline PCIe endpoint device are retained in the PCIe tree. Because the resources and connection relationship of the PCIe endpoint device remain configured in the PCIe domain, when a PCIe endpoint device is powered on in the PCIe domain, the PCIe management module only notifies the corresponding driver module that the PCIe endpoint device has been powered on, and the PCIe endpoint device completes its connection to the PCIe domain in the computer system.
  • The CPU does not need to scan the PCIe endpoint device, which further avoids the whole-system reset caused by an MCE error that may occur when a PCIe endpoint device is connected to the computer system.
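  • The retention behavior of the PCIe tree can be modeled as follows. This is a deliberately flat, hypothetical model: the class `PcieManager`, its dictionary-based "tree", and the `driver_events` list are assumptions for illustration, and a real PCIe tree would also record the layered connection relationships.

```python
class PcieManager:
    """Hypothetical PCIe management module that keeps a flat 'PCIe tree'."""

    def __init__(self, reservation):
        # Allocate exactly what was reserved; no device is scanned.
        self.tree = {
            port: {"resources": share, "online": False}
            for port, share in reservation.items()
        }
        self.driver_events = []

    def on_offline(self, port):
        # Keep the node and its resources; only the state changes.
        self.tree[port]["online"] = False

    def on_power_on(self, port):
        # Resources are already in place, so just notify the driver.
        self.tree[port]["online"] = True
        self.driver_events.append(("powered_on", port))
        return self.tree[port]["resources"]
```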
  • Aspects of the invention, or possible implementations of the aspects, may be embodied as a system, a method, or a computer program product.
  • Aspects of the invention, or possible implementations of the aspects, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, and so on), or an embodiment combining software and hardware aspects, which are collectively referred to herein as "circuits," "modules," or "systems."
  • Aspects of the invention, or possible implementations of the aspects, may also take the form of a computer program product, which refers to computer-readable program code stored in a computer-readable medium.
  • The computer-readable medium can be a computer-readable signal medium or a computer-readable storage medium.
  • The computer-readable storage medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, or portable compact disc read-only memory (CD-ROM).
  • The processor in the computer reads the computer-readable program code stored in the computer-readable medium, so that the processor can perform the functional actions specified in each step or combination of steps in the flowcharts, and so that an apparatus is produced that implements the functional actions specified in each block or combination of blocks in the block diagrams.
  • The computer-readable program code can be executed entirely on the user's computer, partly on the user's computer, as a separate software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • The functions noted in the steps of the flowcharts or in the blocks of the block diagrams may not occur in the order noted.
  • For example, two steps or two blocks shown in succession may in fact be executed substantially simultaneously, or the blocks may sometimes be executed in the reverse order.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Bus Control (AREA)
  • Debugging And Monitoring (AREA)
  • Information Transfer Systems (AREA)
  • Hardware Redundancy (AREA)
  • Multi Processors (AREA)
  • Power Sources (AREA)

Abstract

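Embodiments of the present invention provide a computer system and a method for accessing a Peripheral Component Interconnect Express (PCIe) endpoint device. The computer system includes a processor, a PCIe bus, and an access proxy, where the access proxy is connected to both the processor and the PCIe endpoint device. The processor is configured to acquire an operation instruction that instructs the processor to access the PCIe endpoint device through the access proxy, and to send an access request to the access proxy according to the operation instruction, where the access request instructs the access proxy to access the PCIe endpoint device. The access proxy is configured to send a response message for the access request to the processor after receiving the access request sent by the processor. Because the processor no longer accesses the target PCIe endpoint device directly but completes the access through the access proxy, and the access proxy can return a response message for the access request to the processor, the processor avoids an MCE reset.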

Description

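A Computer System, and a Method and Apparatus for Accessing a Peripheral Component Interconnect Express Endpoint Device

Technical Field
Embodiments of the present invention relate to computer technology, and in particular to a method for accessing a Peripheral Component Interconnect Express endpoint device, a computer system, and an apparatus.

Background
Peripheral Component Interconnect Express (PCIe) is a high-performance system bus used on computing and communication platforms. The PCIe bus is widely used in systems that interconnect a central processing unit (CPU) with peripheral devices, and serves as a core service channel in computing and storage devices. Many kinds of peripheral devices can be interconnected with the CPU through the PCIe bus, for example network interface cards or solid state disks (SSDs); this document refers to such devices collectively as PCIe endpoint devices.
The PCIe bus is widely used as the bus interface of servers and storage systems. While the system is running normally, online capacity expansion and maintenance require PCIe endpoint devices to be added or removed without powering off, that is, they require hot plugging. Existing PCIe hot plugging follows this procedure: the operator initiates a hot-plug request by pressing a button; after learning of the hot-plug event, the hot-plug controller notifies all drivers in the system that might access the PCIe endpoint device to stop accessing it, and unloads the resources of the PCIe endpoint device to be hot-plugged; the PCIe endpoint device is then powered off, and the operator pulls it out.
Existing hot plugging of PCIe endpoint devices requires advance notification to keep the system running normally. In recent years, however, the PCIe bus has evolved from an interconnect inside a system to an interconnect between systems, and applications such as external cables have increased. Cables easily fall off abnormally, so a PCIe endpoint device may go offline abnormally without advance notification. In addition, scenarios in which users connect SSDs directly to the system are increasingly common, and, out of habit, users may insert or remove an SSD directly without advance notification. When a PCIe endpoint device suddenly goes offline abnormally in this way, if the CPU has already issued read or write instructions to that PCIe endpoint device, those instructions remain pending indefinitely. Once the CPU's pending access instructions to the PCIe endpoint device accumulate to a certain level, the CPU concludes that the whole system is abnormal and reports a machine check exception (MCE) error, which causes the entire system to reset.

Summary of the Invention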
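Embodiments of the present invention provide a method for accessing a Peripheral Component Interconnect Express endpoint device, a computer system, and an apparatus, which prevent the processor from resetting after a PCIe endpoint device goes offline abnormally.
In a first aspect, an embodiment of the present invention provides a computer system, where the computer system includes:
a processor;
a Peripheral Component Interconnect Express (PCIe) bus, configured to connect a PCIe endpoint device;
the computer system further includes an access proxy, where the access proxy is connected to both the processor and the PCIe endpoint device;
the processor is configured to acquire an operation instruction, where the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy, and to send an access request to the access proxy according to the operation instruction, where the access request instructs the access proxy to access the PCIe endpoint device;
the access proxy is configured to send a response message for the access request to the processor after receiving the access request sent by the processor.
With reference to the first aspect, in a first possible implementation, the computer system further includes: a driver module of the PCIe endpoint device, configured to generate the operation instruction according to a preconfigured access interface of the PCIe endpoint device, where the preconfigured access interface of the PCIe endpoint device points to the access proxy;
the processor is specifically configured to acquire the operation instruction generated by the driver module of the PCIe endpoint device.
With reference to the first aspect, in a second possible implementation, the computer system further includes:
a driver module of the PCIe endpoint device and a host operating system, where the driver module of the PCIe endpoint device is configured to invoke the host operating system in order to access the PCIe endpoint device;
the host operating system receives the invocation from the driver module of the PCIe endpoint device and generates the operation instruction according to a preconfigured access interface of the PCIe endpoint device, where the preconfigured access interface of the PCIe endpoint device points to the access proxy;
the processor is specifically configured to acquire the operation instruction generated by the host operating system.
With reference to the first aspect, or the first or second possible implementation of the first aspect, in a third possible implementation, the access proxy is further configured to perform the access to the PCIe endpoint device according to the access request.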
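In a second aspect, a method for accessing a Peripheral Component Interconnect Express (PCIe) endpoint device is further provided, where the PCIe endpoint device is connected to a processor of a computer system through a PCIe bus, and the method includes:
the processor acquires an operation instruction, where the operation instruction instructs the processor to access the PCIe endpoint device through an access proxy;
the processor sends an access request to the access proxy according to the operation instruction, where the access request instructs the access proxy to access the PCIe endpoint device;
the processor receives a response message for the access request sent by the access proxy.
With reference to the second aspect, in a first possible implementation, the processor acquires the operation instruction generated by the driver module of the PCIe endpoint device according to a preconfigured access interface of the PCIe endpoint device, where the preconfigured access interface of the PCIe endpoint device points to the access proxy; or the processor acquires the operation instruction generated by the host operating system according to a preconfigured access interface of the PCIe endpoint device, where the preconfigured access interface of the PCIe endpoint device points to the access proxy.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation, the operation instruction specifically instructs the processor to access the PCIe endpoint device through a direct memory access (DMA) engine;
the processor sends a data transfer request to the DMA engine according to the operation instruction, where the data transfer request instructs the DMA engine to move specified data from the memory of the PCIe endpoint device to the memory of the computer system, or to move specified data from the memory of the computer system to the memory of the PCIe endpoint device.
With reference to the second aspect, or the first or second possible implementation of the second aspect, in a third possible implementation, the method further includes:
the processor receives a first notification message sent by the access proxy, where the first notification message indicates that the access proxy accessed the PCIe endpoint device successfully;
the processor obtains the access result according to the first notification message.
With reference to the second aspect, or the first or second possible implementation of the second aspect, in a fourth possible implementation, the method further includes:
the processor receives a second notification message sent by the access proxy, where the second notification message indicates that the access proxy failed to access the PCIe endpoint device;
the processor performs subsequent processing for the access failure according to the second notification message.
With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation, the subsequent processing for the access failure includes:
the processor determines why the access proxy failed to access the PCIe endpoint device, and if the access failed because the target PCIe endpoint device went offline abnormally, the processor suspends access to the PCIe endpoint device.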
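In a third aspect, a computer is provided, including:
a processor;
a memory, configured to store computer-executable instructions;
when the computer runs, the processor executes the computer-executable instructions stored in the memory and communicates with devices external to the computer through a communication interface, so that the computer performs the method for accessing a PCIe endpoint device according to the second aspect.
In a fourth aspect, a computer-readable medium is provided, including computer-executable instructions, where, when a processor of a computer executes the computer-executable instructions, the computer performs the method for accessing a PCIe endpoint device according to the second aspect.
In a fifth aspect, a method for accessing a PCIe endpoint device is provided, where the PCIe endpoint device is connected to a processor of a computer system through a PCIe bus, and the method includes:
receiving a call instruction, where the call instruction indicates that the PCIe endpoint device is to be accessed;
generating an operation instruction according to a preconfigured access interface of the PCIe endpoint device, where the preconfigured access interface of the PCIe endpoint device points to an access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.
In a sixth aspect, an apparatus for accessing a PCIe endpoint device is provided, including:
a receiving module, configured to receive a call instruction, where the call instruction indicates that the PCIe endpoint device is to be accessed;
a generating module, configured to generate, according to a preconfigured access interface of the PCIe endpoint device, an operation instruction for accessing the PCIe endpoint device, where the preconfigured access interface of the PCIe endpoint device points to the access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.
In a seventh aspect, a computer is provided, including:
a processor;
a memory, configured to store computer-executable instructions;
when the computer runs, the processor executes the computer-executable instructions stored in the memory, so that the computer performs the following method:
receiving a call instruction, where the call instruction indicates that the PCIe endpoint device is to be accessed;
generating, according to a preconfigured access interface of the PCIe endpoint device, an operation instruction for accessing the PCIe endpoint device, where the preconfigured access interface of the PCIe endpoint device points to the access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.
In an eighth aspect, a computer-readable medium is provided, including computer-executable instructions, where, when a processor of a computer executes the computer-executable instructions, the computer performs the following method:
receiving a call instruction, where the call instruction indicates that the PCIe endpoint device is to be accessed;
generating, according to a preconfigured access interface of the PCIe endpoint device, an operation instruction for accessing the PCIe endpoint device, where the preconfigured access interface of the PCIe endpoint device points to the access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.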
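In a ninth aspect, an access proxy is provided, where the access proxy is applied in a computer system, the computer system includes a processor and a Peripheral Component Interconnect Express (PCIe) bus, and the PCIe bus connects at least one PCIe endpoint device;
the access proxy is connected to both the processor and the PCIe endpoint device;
the access proxy is configured to isolate direct access between the processor and the PCIe endpoint device, receive the processor's access request for the PCIe endpoint device, and return a response message for the access request to the processor.
In a tenth aspect, a PCIe switch is provided, characterized in that the PCIe switch is applied in a computer system, the computer system includes a processor and a PCIe bus, and the PCIe bus connects at least one PCIe endpoint device;
the upstream port of the PCIe switch is connected to the processor through the PCIe bus, and the downstream port of the PCIe switch is connected to the PCIe endpoint device through the PCIe bus; the PCIe switch has the access proxy according to the ninth aspect built in.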
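According to an eleventh aspect, a method for allocating resources to a PCIe endpoint device that is connected to a computer system is provided, including:
reserving a specified share of resources for the access port of each PCIe endpoint device, where the specified share is greater than or equal to the resource requirement of the PCIe endpoint device;
allocating, according to the reserved specified share of resources, the reserved specified share of resources to the access port of each PCIe endpoint device.
With reference to the eleventh aspect, in a first possible implementation, the specified share is the resource requirement of the type of PCIe endpoint device that requires the most resources.
With reference to the eleventh aspect or its first possible implementation, in a second possible implementation, the PCIe endpoint devices connected to the computer system and the processor form a PCIe domain, and a corresponding PCIe tree is configured for the PCIe domain;
the method further includes: recording, in the PCIe tree, the specified share of resources allocated to the access port of each PCIe endpoint device.
With reference to the second possible implementation of the eleventh aspect, in a third possible implementation, the method further includes:
after a PCIe endpoint device goes offline from the computer system, retaining the specified share of resources recorded in the PCIe tree for the access port of that PCIe endpoint device.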
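According to a twelfth aspect, a computer system is provided, including:
a processor;
a PCIe bus, configured to connect PCIe endpoint devices;
a basic input/output system (BIOS), configured to reserve a specified share of resources for the access port of each PCIe endpoint device, where the specified share is greater than or equal to the resource requirement of each PCIe endpoint device;
a PCIe management module, configured to allocate, according to the specified share of resources reserved by the BIOS, the reserved specified share of resources to the access port of each PCIe endpoint device.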
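In the embodiments of the present invention, the processor no longer accesses the target PCIe endpoint device directly but does so through an access proxy. The access proxy isolates the effects of a PCIe endpoint device going abnormally offline and returns a response message for each access request to the processor, so that tasks buffered by the processor do not accumulate because of timeouts and the processor avoids an MCE reset.
In the embodiments of the present invention, the system reserves and allocates a specified share of resources for the access port of each PCIe endpoint device, so that the processor no longer needs to scan a PCIe endpoint device when it joins the system. This avoids the whole-system reset that an MCE error could otherwise trigger while a PCIe endpoint device is being connected to the computer system.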
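Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the prior art or the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Figure 1 is a composition diagram of a computer system according to an embodiment of the present invention;
Figure 2 is a composition diagram of the program modules included in the memory according to an embodiment of the present invention;
Figure 3 is a composition diagram of another computer system according to an embodiment of the present invention;
Figure 4 is a composition diagram of another computer system according to an embodiment of the present invention;
Figure 5 is a flowchart of a method according to an embodiment of the present invention;
Figure 6 is a flowchart of another method according to an embodiment of the present invention;
Figure 7 is a flowchart of another method according to an embodiment of the present invention;
Figure 8 is a flowchart of another method according to an embodiment of the present invention;
Figure 9 is a flowchart of another method according to an embodiment of the present invention;
Figure 10 is a flowchart of another method according to an embodiment of the present invention;
Figure 11 is a composition diagram of an apparatus for accessing a PCIe endpoint device according to an embodiment of the present invention;
Figure 12 is a composition diagram of a computer according to an embodiment of the present invention.
Detailed Description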
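The embodiments of the present invention provide a method for accessing a PCIe endpoint device, a computer system, and an apparatus. When a PCIe endpoint device needs to be hot-plugged, the connection between the PCIe endpoint device and the processor can be broken directly, without notifying the system in advance for preprocessing, and the processor is still not at risk of an MCE reset. In the embodiments, the cases in which a PCIe endpoint device is pulled out of the system directly or drops offline because of a fault are collectively referred to as the PCIe endpoint device going abnormally offline.
System Architecture of the Embodiments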
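Figure 1 depicts the composition of a computer system provided in an embodiment of the present invention. The computer system shown in Figure 1 includes a CPU 110, a memory 120, and PCIe endpoint devices 130. The PCIe endpoint devices 130 are connected to the CPU 110 through a PCIe bus 140 and can be inserted into or removed from the computer system. The PCIe endpoint devices 130 are of various types, for example a graphics processing unit 131, a network adapter 132, a solid-state drive 133, and a video acceleration component 134. The memory 120 stores data, which may be data that the CPU obtains from external devices or program data that the CPU runs. Specifically, the memory may store one or more program modules, and the CPU 110 performs the related operations according to the computer-executable instructions of those modules. The PCIe endpoint devices 130 and the CPU 110 in Figure 1 form a PCIe domain; all devices in the PCIe domain are connected to the CPU 110 through the PCIe bus 140 and are controlled by the CPU 110.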
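In the system architecture of Figure 1, as shown in Figure 2, the program modules in the memory 120 may include an application module 121, driver modules 122, and a host operating system (HOS) 123. The application module 121 generates access requirements for PCIe endpoint devices. In response to such a requirement, a driver module 122 may call the corresponding interface of the HOS 123 (if the access interface is provided by the HOS), and the HOS 123 generates an operation instruction in response to the call, so that the CPU accesses or controls the corresponding PCIe endpoint device according to the operation instruction. Generally, one PCIe endpoint device corresponds to one driver module (one driver module may also serve several PCIe endpoint devices, as long as every PCIe endpoint device has a corresponding driver module). For example, in the architecture of Figure 1, the driver modules in the memory 120 may include a driver module 122-1 for the graphics processing unit, a driver module 122-2 for the network adapter (NIC), a driver module 122-3 for the solid-state drive (SSD), and a driver module 122-4 for the video acceleration component.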
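For example, suppose the application module generates an access requirement for the SSD. Under the existing way in which a CPU accesses a PCIe endpoint device, the SSD driver module 122-3 receives the call from the application module 121 and in turn calls the HOS 123. The HOS 123 generates an operation instruction for the CPU 110 according to the default access interface; the operation instruction contains an indication of the device to be accessed, SSD 133, and the related operation requirements. According to the operation instruction of the SSD driver module 122-3, the CPU 110 sends an access request to the SSD 133, requesting access to a register of the SSD 133. If the SSD 133 goes abnormally offline, the CPU 110 receives no response to its access request and regards the access task as uncompleted. If such uncompleted tasks accumulate in the CPU to a certain extent, the CPU concludes that the whole system is abnormal, reports an MCE error, and resets.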
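The embodiments of the present invention change the way the CPU accesses a PCIe endpoint device: the CPU no longer accesses the PCIe endpoint device directly but does so through a third party. As shown in Figure 1, an access proxy 160 is added to the system. The access proxy 160 accesses the PCIe endpoint device on behalf of the CPU 110 and isolates the CPU 110 from the effects of a PCIe endpoint device going abnormally offline. As shown in Figure 1, the CPU 110 no longer accesses the SSD 133 over line 1 but over line 2 and line 3 (lines 1, 2, and 3 are labeled Line1, Line2, and Line3 in the figure; the dashed lines are not actual connections and only illustrate the signal flow between the modules). The CPU 110 first obtains an operation instruction that instructs the CPU to access the SSD 133 through the access proxy 160. The CPU 110 then sends an access request to the access proxy 160 over line 2, and the access proxy 160 returns a response message for the access request to the CPU 110 over line 2. Subsequently, the access proxy performs the access to the PCIe endpoint device according to the access request, that is, it reads or writes the registers of the SSD 133 over line 3. On the one hand, because the CPU 110 has no direct signal contact with the PCIe endpoint device 130, whether the PCIe endpoint device 130 is offline is invisible to the CPU 110, and the PCIe endpoint device does not affect the service processing of the CPU 110. On the other hand, after receiving an access request from the CPU 110, the access proxy 160 is able to return a response message, so every access request sent by the CPU 110 receives a corresponding response. The CPU's access tasks therefore do not accumulate as uncompleted, no MCE error is generated, and a system reset initiated by the CPU is avoided.
Changing the way the CPU accesses a PCIe endpoint device can be implemented by upgrading or modifying the driver module that corresponds to the PCIe endpoint device. In that case, an access interface that points to the access proxy is preconfigured in the driver module. When the driver module determines that the PCIe endpoint device needs to be accessed, it generates an operation instruction for the CPU according to the preconfigured access interface, and the operation instruction instructs the CPU to access the PCIe endpoint device through the access proxy.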
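Other implementations are also possible, for example modifying the HOS: an access interface that points to the access proxy is preconfigured in the HOS. When the driver module of the PCIe endpoint device determines that the device needs to be accessed, it still calls the HOS to access the PCIe endpoint device. After the HOS receives the invocation instruction sent by the driver module, because the access interface configured for the PCIe endpoint device has been preset to the access proxy, the HOS generates an operation instruction that instructs the CPU to access the PCIe endpoint device through the access proxy.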
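The functions and concrete implementation forms of the access proxy are discussed below. The access proxy provides an isolation function and a proxy function. As an isolation module, it must be independent of both the PCIe endpoint device and the CPU. Independence from the PCIe endpoint device means that the access proxy must not be pulled out together with the PCIe endpoint device, so the access proxy and the PCIe endpoint device must be physically separate devices. Independence from the CPU mainly means that the access proxy has its own processor, independent of the system's CPU, so that even when a PCIe endpoint device is pulled out directly, the effect on the access proxy does not propagate to the CPU. As a proxy module, the access proxy must access the PCIe endpoint device and return a response message for each access request received from the CPU. The response message may be an acknowledgment, a rejection, or a failure response; whichever it is, it tells the CPU that its access request has been received. After receiving the response message, the CPU determines that the task is complete and can stop the timer started for it. The CPU's own task-timeout mechanism thus keeps working normally, other messages buffered by the CPU do not accumulate because of timeouts, and an MCE reset of the CPU is avoided.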
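The accumulation argument above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Cpu` class, the `mce_threshold` value of 3, and the function names are hypothetical and do not come from the source, which gives no concrete threshold. It shows that requests sent directly to an offline endpoint stay outstanding, whereas requests sent to a proxy that always answers do not.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    """Toy CPU that tracks requests still waiting for a response."""
    mce_threshold: int = 3                      # hypothetical limit, not from the source
    outstanding: set = field(default_factory=set)

    def send(self, request_id, target):
        self.outstanding.add(request_id)        # start "timing" this task
        response = target(request_id)           # None models "no response ever arrives"
        if response is not None:
            self.outstanding.discard(request_id)  # task closed, timer stopped
        return response

    def would_raise_mce(self):
        return len(self.outstanding) >= self.mce_threshold


def offline_endpoint(request_id):
    """An endpoint that was pulled out: it never answers."""
    return None


def access_proxy(request_id):
    """The proxy always answers, even if only with a failure response."""
    return ("failure", request_id)


direct = Cpu()
proxied = Cpu()
for i in range(3):
    direct.send(i, offline_endpoint)
    proxied.send(i, access_proxy)
```

In this model `direct` ends up with three outstanding requests and reaches the assumed threshold, while `proxied` has none, regardless of what kind of response the proxy returned.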
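Given these functional considerations, the access proxy can be placed in the system in several ways. In the system architecture shown in Figure 1, the access proxy 160 is a standalone, newly added device in the computer system, connected to the CPU and to the PCIe endpoint device through the PCIe bus. Alternatively, the access proxy 160 may be packaged together with an existing device in the PCIe domain, for example with the CPU as firmware. As for the concrete implementation, the access proxy may be implemented by a direct memory access (DMA) engine, or by new hardware, for example by installing a software module with the access proxy function on a hardware device that has its own processor.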
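Because the access proxy must return a response message for each access request to the CPU, this function can be ensured in different ways. One way is to keep the connection between the access proxy and the CPU permanently established, that is, the connection is never broken, or the access proxy is not hot-pluggable with respect to the CPU. For example, the hardware device that hosts or implements the access proxy is soldered onto the printed circuit board (PCB) to which the CPU is connected, or its interface to the processor is fixed with a connecting piece.
A computer system provided in another embodiment of the present invention is shown in Figure 3.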
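In addition to the CPU, PCIe bus, and PCIe endpoint devices shown in Figure 1, the computer system shown in Figure 3 includes a PCIe switch 150. The upstream port of the PCIe switch 150 is connected to the CPU 110 through the PCIe bus 140, and its downstream side provides one PCIe port for each PCIe endpoint device, each port being connected to its PCIe endpoint device through the PCIe bus 140. The PCIe switch 150 routes data downstream to the corresponding PCIe port and routes data upstream from each individual PCIe port to the CPU 110. In the embodiment of Figure 3, the added access proxy 160 is placed inside the PCIe switch 150 and is implemented by a DMA engine. The PCIe endpoint devices 130 are connected to the PCIe switch 150 through the PCIe bus 140. Because the PCIe switch 150 and the PCIe endpoint devices 130 are separate devices, pulling out any PCIe endpoint device does not remove the PCIe switch 150 from the system. The access proxy 160 is therefore not pulled out with the PCIe endpoint device, which makes it independent of the PCIe endpoint devices 130. In addition, because the DMA engine has its own processor, if any PCIe endpoint device is pulled out directly, the DMA engine isolates the effect even if its own access to that device is affected. Whether or not the access to the PCIe endpoint device succeeds, the DMA engine guarantees that a response message for the access request is returned to the CPU 110, which avoids an MCE reset initiated by the CPU.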
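Again taking an access requirement for the SSD 133 generated by the application module as an example, the CPU 110 obtains the operation instruction generated by the SSD driver module 122-3, which instructs the CPU 110 to access the SSD 133 through the DMA engine. According to the operation instruction, the CPU 110 sends the DMA engine a data migration request, which instructs the DMA engine to move specified data from the memory of the PCIe endpoint device to the memory of the computer system, or from the memory of the computer system to the memory of the PCIe endpoint device. After receiving the data migration request, the DMA engine returns a response message for the request to the CPU 110 and performs the data migration on the SSD 133. When the migration is finished, the DMA engine sends the CPU 110 a notification that the access is complete, so that the CPU 110 can obtain the result of the access.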
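Further, because the DMA engine in this embodiment is built into the PCIe switch 150, the PCIe switch 150 may also be soldered onto the PCB to which the CPU 110 is connected, or its interface to the CPU 110 may be fixed with a connecting piece. This ensures that the DMA engine built into the PCIe switch 150 cannot be pulled out of the system and can therefore always return a response message for an access request to the CPU.
Figure 4 shows a computer system provided in another embodiment of the present invention.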
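The embodiment of Figure 4 differs from that of Figure 3 in that the access proxy 160 is added inside the CPU 110; the access proxy 160 may be implemented by a DMA engine. Because the access proxy 160 is inside the CPU 110, it is not pulled out with a PCIe endpoint device, which makes it independent of the PCIe endpoint devices 130. In addition, because the DMA engine has its own processor, if any PCIe endpoint device is pulled out directly, the DMA engine isolates the effect and does not let it propagate to the CPU 110, even if its own access to that device is affected. Whether or not the access to the PCIe endpoint device succeeds, the DMA engine guarantees that a response message for the access request is returned to the CPU 110, which avoids an MCE reset initiated by the CPU. The specific access procedure in this embodiment is the same as in the embodiments of Figure 1 and Figure 3 and is not repeated here.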
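The method for accessing a PCIe endpoint device in the embodiments of the present invention may be implemented in the computer system shown in Figure 1, Figure 3, or Figure 4. These systems are only examples to which the embodiments apply and do not limit the application of the present invention; other system embodiments and application scenarios are not described one by one in this document. Likewise, the placements of the access proxy described in Figure 1, Figure 3, and Figure 4 are only examples. A person skilled in the art may place the added access proxy elsewhere in the system or implement it by other technical means in accordance with the technical principle of the embodiments.
The CPU 110 described in Figure 1, Figure 3, and Figure 4 is also only an example; it may, for instance, be an application-specific integrated circuit. Whatever its form, it performs the function of a processor in the computer system. The computer system in the embodiments may be a computing server or a server that manages routing, such as a switch; the present invention does not limit its specific form.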
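Access Procedure for a PCIe Endpoint Device
The following describes how a PCIe endpoint device is accessed through the access proxy added to the computer system. As shown in Figure 5, the procedure for accessing a PCIe endpoint device provided in an embodiment of the present invention includes: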
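S501: The CPU obtains an operation instruction, which instructs the CPU to access the PCIe endpoint device through the access proxy in the computer system;
Specifically, the operation instruction may be generated by the driver module of the PCIe endpoint device. Because the driver module has been preconfigured with the access proxy as the access interface of the PCIe endpoint device, when the upper-layer application module generates an access requirement for a PCIe endpoint device, the driver module generates an operation instruction that instructs the CPU to access the target PCIe endpoint device through the access proxy in the computer system. Alternatively, the operation instruction may be generated by the HOS of the computer system, which has been preconfigured with the access proxy as the access interface of the PCIe endpoint device. In that case, when the upper-layer application module generates an access requirement for a PCIe endpoint device, the driver module calls the HOS, and the HOS generates, according to the preconfigured access interface, an operation instruction that instructs the CPU to access the target PCIe endpoint device through the access proxy.
S502: According to the operation instruction, the CPU sends an access request to the access proxy; the access request instructs the access proxy to access the PCIe endpoint device;
S503: After receiving the access request sent by the CPU, the access proxy returns a response message for the access request to the CPU;
The response message may be an acknowledgment, a rejection, or a failure response; whichever it is, it tells the CPU that its access request has been received. After receiving the response message, the CPU determines that the task is complete and can stop the timer started for it, so the CPU's own task-timeout mechanism keeps working normally.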
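In the above procedure, the CPU no longer accesses the target PCIe endpoint device directly but does so through the access proxy. The access proxy isolates the effects of a PCIe endpoint device going abnormally offline and returns a response message for each access request to the CPU, so that tasks buffered by the CPU do not accumulate because of timeouts and the CPU avoids an MCE reset.
Further, as shown in Figure 6, in another procedure embodiment of the present invention, the access proxy's access to the PCIe endpoint device includes: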
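S601-S603: The same as steps S501-S503 above; not repeated here;
S604: The access proxy initiates an access operation on the PCIe endpoint device according to the CPU's access request;
S605: The access proxy determines whether the access operation on the PCIe endpoint device was executed successfully; if it succeeded, go to step S606; if it failed, go to step S608;
S606: The access proxy sends the CPU a first notification message indicating that the access is complete;
S607: After receiving the first notification message, the CPU obtains the result of the access; the CPU may also notify the upper-layer module, based on the access result, that the access is complete;
S608: The access proxy sends the CPU a second notification message indicating that the access failed;
S609: After receiving the second notification message, the CPU performs follow-up processing for the failed access;
Specifically, the follow-up processing includes determining why the access proxy failed to access the PCIe endpoint device. If the cause is that the target PCIe endpoint device is abnormally offline, the CPU stops accessing the PCIe endpoint device. If the cause is a fault in the access proxy itself, the CPU resets the access proxy or issues a notification of the access proxy fault so that the fault can be repaired.
After the CPU stops accessing the PCIe endpoint device, it may further notify the upper-layer module to stop accessing the PCIe endpoint device.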
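The above procedure describes the method for accessing a PCIe endpoint device provided in the embodiments of the present invention. In this method, the access proxy accesses the PCIe endpoint device on behalf of the CPU and returns a response message for each access request to the CPU, which prevents the CPU from generating an MCE error and resetting the whole system. Further, when the access proxy fails to access the PCIe endpoint device, it notifies the CPU of the failure. The CPU then diagnoses the fault and, on determining that the failure occurred because the target PCIe endpoint device is abnormally offline, stops accessing that device, which avoids wasting resources on repeated accesses that cannot succeed.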
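The flow of Figures 5 and 6 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `AccessProxy` class, the string values, and the `self_test` diagnostic hook are hypothetical names introduced here; the source only states that the CPU determines the cause of a failure, not how.

```python
class AccessProxy:
    """Hypothetical stand-in for the access proxy of Figure 6."""

    def __init__(self, endpoint_online=True, proxy_healthy=True):
        self.endpoint_online = endpoint_online
        self.proxy_healthy = proxy_healthy

    def receive(self, request):
        # S603: acknowledge receipt before touching the endpoint.
        return "response"

    def perform(self, request):
        # S604-S605: try the access and report the outcome.
        if self.proxy_healthy and self.endpoint_online:
            return "first_notification"    # S606: access completed
        return "second_notification"       # S608: access failed

    def self_test(self):
        # Assumed diagnostic hook used by the CPU to find the failure cause.
        return self.proxy_healthy


def cpu_access(proxy, request):
    """Return the actions the CPU takes for one access (S601-S609)."""
    actions = [proxy.receive(request)]               # S602-S603
    notification = proxy.perform(request)
    if notification == "first_notification":
        actions.append("fetch_result")               # S607
    elif proxy.self_test():
        actions.append("stop_access_to_endpoint")    # endpoint abnormally offline
    else:
        actions.append("reset_or_report_proxy")      # proxy itself is faulty
    return actions


ok = cpu_access(AccessProxy(), "read")
offline = cpu_access(AccessProxy(endpoint_online=False), "read")
faulty = cpu_access(AccessProxy(proxy_healthy=False), "read")
```

In all three cases the first action is the proxy's response, which is the property the description relies on: the CPU's request is always answered, whatever happens to the endpoint afterwards.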
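With reference to the computer system embodiments shown in Figure 3 or Figure 4, when the access proxy is implemented by a DMA engine and the upper-layer application module generates a read requirement for the SSD, the specific access procedure is shown in Figure 7 and includes:
S701: The CPU in the computer system obtains an operation instruction that carries an access interface and access content. The access interface points to the DMA engine; the access content indicates that the access object is the SSD, that the access is a read operation, and the source address of the read operation. The access content may further indicate the length of the read operation, although in general the system default length can be used;
When an upstream endpoint generates a read requirement for the SSD device, the driver module of the SSD device receives the call from the upstream endpoint and generates, according to the preconfigured access interface, the operation instruction for accessing the PCIe endpoint device.
The operation instruction that the driver module sends to the CPU may also take other forms. For example, the operation instruction carries indications that the access object is the SSD, that the access is a read operation, and of the start address of the read operation, plus an additional indication that the SSD is to be accessed by operating the DMA engine.
S702: According to the operation instruction, the CPU sends a data migration request to the DMA engine; the request instructs the DMA engine to move specified data from the memory of the PCIe endpoint device to the memory of the computer system;
Specifically, after obtaining the operation instruction of the SSD driver module, the CPU requests a destination address for the read operation from the memory of the computer system. After obtaining the destination address, the CPU sends the DMA engine a data migration request that indicates the source address, destination address, and length of the read operation, instructing the DMA engine to move data of that length from the source address to the destination address;
S703: After receiving the data migration request from the CPU, the DMA engine returns a response message for the request to the CPU. After receiving the response message, the CPU stops timing the data migration request, which ensures that other messages buffered by the CPU do not accumulate and cause an MCE reset of the CPU;
S704: The DMA engine sends a read request to the SSD device. The read request carries the source address of the read operation and requests that the value of the register corresponding to that source address be read into the buffer of the DMA engine;
S705: The DMA engine determines whether the read request was executed successfully; if it succeeded, go to step S706; if it failed, go to step S709;
S706: The DMA engine writes the data in its buffer to the destination address of the read operation by means of a write request;
S707: The DMA engine sends a first notification message to the CPU to notify the CPU that the access is complete; the first notification message may specifically be a first message signaled interrupt (MSI);
S708: After receiving the first MSI, the CPU reads the data at the destination address of the read operation and may notify the driver module of the SSD device that the access is complete;
S709: The DMA engine sends a second notification message to the CPU to notify the CPU that the access failed; the second notification message may specifically be a second MSI;
S710: After receiving the second MSI, the CPU performs follow-up processing for the failed access;
Specifically, the follow-up processing may include initiating a diagnosis of the DMA engine to determine whether the DMA engine is faulty;
If the DMA engine is faulty, the CPU resets the DMA engine or issues a notification of the DMA engine fault so that the fault can be repaired;
If the DMA engine is not faulty, the CPU determines that the access failed because the SSD device is abnormally offline and stops accessing the SSD device.
Further, the CPU may notify the driver module of the SSD device to stop accessing the SSD device.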
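The read procedure S702-S709 can be sketched with dictionaries standing in for the SSD registers and the system memory. This is an illustrative model only: the class names, the `"MSI-1"`/`"MSI-2"` strings, the addresses, and the use of `OSError` to model an offline device are all assumptions made for the example, not details from the source.

```python
class Ssd:
    """Toy SSD whose registers are a dict; offline reads raise OSError."""

    def __init__(self, registers, online=True):
        self.registers = dict(registers)
        self.online = online

    def read(self, addr, length):
        if not self.online:
            raise OSError("endpoint abnormally offline")
        return [self.registers[addr + i] for i in range(length)]


class DmaEngine:
    """Toy DMA engine acting as the access proxy for read operations."""

    def __init__(self, ssd, host_memory):
        self.ssd = ssd
        self.host_memory = host_memory
        self.pending = None
        self.buffer = []

    def accept(self, source, destination, length):
        # S703: respond to the data migration request immediately.
        self.pending = (source, destination, length)
        return "response"

    def run(self):
        source, destination, length = self.pending
        try:
            self.buffer = self.ssd.read(source, length)   # S704
        except OSError:
            return "MSI-2"                                # S709: access failed
        for offset, value in enumerate(self.buffer):      # S706
            self.host_memory[destination + offset] = value
        return "MSI-1"                                    # S707: access complete


host_memory = {}
dma = DmaEngine(Ssd({0x10: 7, 0x11: 8}), host_memory)
ack = dma.accept(source=0x10, destination=0x100, length=2)
notification = dma.run()

offline_dma = DmaEngine(Ssd({0x10: 7}, online=False), {})
offline_ack = offline_dma.accept(source=0x10, destination=0x100, length=1)
offline_notification = offline_dma.run()
```

Note that `accept` returns before any device access takes place, so the CPU in this model gets its response even when the SSD is offline; only the later notification differs.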
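With reference to the computer system embodiments shown in Figure 3 or Figure 4, when the access proxy is implemented by a DMA engine and the upper-layer application module generates a write requirement for the SSD, the specific access procedure is shown in Figure 8 and includes:
S801: The CPU obtains the operation instruction generated by the driver module of the SSD device. The operation instruction carries an access interface and access content. The access interface points to the DMA engine; the access content indicates that the access object is the SSD, that the access is a write operation, and the source address and destination address of the write operation;
The operation instruction that the driver module sends to the CPU may also take other forms. For example, the operation instruction carries indications that the access object is the SSD, that the access is a write operation, and of the source address and destination address of the write operation, plus an additional indication that the SSD is to be accessed by operating the DMA engine.
S802: According to the operation instruction of the SSD driver module, the CPU sends a data migration request to the DMA engine; the request instructs the DMA engine to move specified data from the memory of the computer system to the memory of the PCIe endpoint device;
Specifically, after obtaining the operation instruction of the SSD driver module, the CPU sends the DMA engine a data migration request that indicates the source address, destination address, and length of the write operation, instructing the DMA engine to move data of that length from the source address to the destination address;
S803: After receiving the data migration request from the CPU, the DMA engine returns a response message for the request to the CPU;
S804: The DMA engine sends a read request to the source address of the write operation to read the data at the source address into the buffer of the DMA engine;
S805: After the data at the source address has been read into its buffer, the DMA engine sends a write request to the SSD device. The write request carries the destination address of the write operation and requests that the data in the buffer of the DMA engine be written to the register corresponding to the destination address;
S806: The DMA engine determines whether the write request was executed successfully; if it succeeded, go to step S807; if it failed, go to step S809;
S807: The DMA engine sends a first message signaled interrupt (MSI) to the CPU to notify the CPU that the access is complete;
S808: After receiving the first MSI, the CPU knows that the write operation is complete and may further notify the driver module of the SSD device that the access is complete;
S809: The DMA engine sends a second MSI to the CPU to notify the CPU that the access failed;
S810: After receiving the second MSI, the CPU performs follow-up processing for the failed access;
Specifically, the follow-up processing may include initiating a diagnosis of the DMA engine to determine whether the DMA engine is faulty;
If the DMA engine is faulty, the CPU resets the DMA engine or issues a notification of the DMA engine fault so that the fault can be repaired;
If the DMA engine is not faulty, the CPU determines that the access failed because the SSD device is abnormally offline and stops accessing the SSD device.
Further, the CPU may notify the driver module of the SSD device to stop accessing the SSD device.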
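The write path and the failure handling of S810 can be sketched in the same spirit. Everything named here (`dma_write`, `handle_failure`, the return strings, the addresses) is hypothetical and chosen for illustration; the source describes the steps but not any concrete interface.

```python
def dma_write(host_memory, ssd_registers, source, destination, length, ssd_online=True):
    """S804-S809: return the notification the DMA engine would raise."""
    # S804: read the data at the source address into the engine's buffer.
    buffer = [host_memory[source + i] for i in range(length)]
    if not ssd_online:
        return "second_msi"                      # S809: write request failed
    # S805: write the buffered data to the destination registers on the SSD.
    for offset, value in enumerate(buffer):
        ssd_registers[destination + offset] = value
    return "first_msi"                           # S807: access complete


def handle_failure(dma_engine_faulty):
    """S810: follow-up processing after the second MSI."""
    if dma_engine_faulty:
        return "reset_dma_engine"
    # Engine is healthy, so the SSD is taken to be abnormally offline.
    return "stop_accessing_ssd"


memory = {0x200: 1, 0x201: 2, 0x202: 3}
registers = {}
success = dma_write(memory, registers, source=0x200, destination=0x40, length=3)

untouched = {}
failure = dma_write(memory, untouched, source=0x200, destination=0x40, length=3,
                    ssd_online=False)
follow_up = handle_failure(dma_engine_faulty=False)
```

The diagnosis order mirrors the text: the CPU first rules out a fault in the DMA engine and only then attributes the failure to the SSD device.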
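The procedures shown in Figure 7 and Figure 8 describe how the DMA engine provided in the embodiments of the present invention reads from or writes to the SSD device. In these procedures, the DMA engine accesses the PCIe endpoint device on behalf of the CPU and returns a response message for each access request to the CPU, so that the CPU does not generate an MCE error and a reset of the whole system is avoided. Further, when the DMA engine fails to migrate data to or from the SSD device, it notifies the CPU of the failure. The CPU then diagnoses the fault and, on determining that the failure occurred because the SSD device was pulled out of the system directly or is faulty, stops accessing the SSD device, which avoids wasting resources on repeated accesses that cannot succeed.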
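In addition, changing the way the CPU accesses a PCIe endpoint device can be implemented by upgrading or modifying either the driver module that corresponds to the PCIe endpoint device or the host operating system. If it is implemented through the driver module, the procedure may include:
S901: The driver module of the PCIe endpoint device receives an invocation instruction from the upper-layer application module; the invocation instruction indicates that the PCIe endpoint device is to be accessed;
S902: The driver module generates an operation instruction according to the preconfigured access interface of the PCIe endpoint device, where the preconfigured access interface points to the access proxy, and the operation instruction instructs the CPU to access the PCIe endpoint device through the access proxy.
If it is implemented through the host operating system, the procedure may include:
S1001: The driver module of the PCIe endpoint device receives an invocation instruction from the upper-layer application module; the invocation instruction indicates that the PCIe endpoint device is to be accessed;
S1002: The driver module calls the host operating system with an invocation instruction that indicates that the PCIe endpoint device is to be accessed;
S1003: The host operating system generates an operation instruction according to the preconfigured access interface of the PCIe endpoint device, where the preconfigured access interface points to the access proxy, and the operation instruction instructs the CPU to access the PCIe endpoint device through the access proxy.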
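Both variants above reduce to the same step: whoever holds the preconfigured access interface (driver module or host operating system) stamps it onto the operation instruction. The following sketch shows that step under stated assumptions; `OperationInstruction`, its fields, and the `"dma_engine"` identifier are hypothetical, since the source does not define a data format for the instruction.

```python
from dataclasses import dataclass

# Preconfigured access interface: it names the access proxy, not the device.
# "dma_engine" is a hypothetical identifier for that proxy.
PRECONFIGURED_ACCESS_INTERFACE = "dma_engine"


@dataclass(frozen=True)
class OperationInstruction:
    access_interface: str   # where the CPU must send its access request
    target_device: str      # the PCIe endpoint device to be accessed
    operation: str          # for example "read" or "write"


def generate_operation_instruction(invocation):
    """S902 / S1003: turn an invocation instruction into an operation instruction."""
    return OperationInstruction(
        access_interface=PRECONFIGURED_ACCESS_INTERFACE,
        target_device=invocation["device"],
        operation=invocation["operation"],
    )


instruction = generate_operation_instruction({"device": "ssd", "operation": "read"})
```

The caller's invocation never mentions the proxy; the redirection comes entirely from the preconfigured interface, which is why upper-layer modules need no change in this model.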
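Apparatus of the Embodiments
As shown in Figure 11, the apparatus for accessing a PCIe endpoint device provided in an embodiment of the present invention includes:
a receiving module 1101, configured to receive an invocation instruction, where the invocation instruction indicates that the PCIe endpoint device is to be accessed;
a generating module 1102, configured to generate, according to the preconfigured access interface of the PCIe endpoint device, an operation instruction for accessing the PCIe endpoint device, where the preconfigured access interface points to the access proxy, and the operation instruction instructs the CPU to access the PCIe endpoint device through the access proxy.
Specifically, the access apparatus may be the driver module of the PCIe endpoint device or the host operating system of the computer system.
Figure 12 is a schematic structural diagram of a computer according to an embodiment of the present invention. The computer may include:
a processor 1201, a memory 1202, a system bus 1204, and a communication interface 1205. The processor 1201, the memory 1202, and the communication interface 1205 are connected through the system bus 1204 and communicate with one another over it.
The processor 1201 may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory 1202 may be a high-speed RAM or a non-volatile memory, for example at least one disk memory.
The memory 1202 is configured to store computer-executable instructions 1203. Specifically, the computer-executable instructions 1203 may include program code.
When the computer runs, the processor 1201 runs the computer-executable instructions 1203 and can perform the method procedure of any one of Figures 5 to 10.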
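Connecting a PCIe Endpoint Device to the Computer System
After a PCIe endpoint device has been pulled out of the computer system, it may later be inserted again. A new PCIe endpoint device may also need to be connected to a running computer system; for example, as SSD devices become more common, users will plug and unplug them directly more and more often. In the prior art, whenever a PCIe endpoint device is powered on and joins the system, the CPU starts scanning that device and allocating resources to it. If the device is pulled out of the system during this scan, the CPU may also report an MCE error and trigger a system reset. To avoid this problem, the embodiments of the present invention provide a new resource allocation scheme for PCIe endpoint devices, so that the CPU no longer needs to scan a newly powered-on PCIe endpoint device or allocate resources to it when it joins the system.
When the computer system starts, its basic input/output system (BIOS) needs to reserve resources for every device in the system. For PCIe endpoint devices, the BIOS scans the access port of each PCIe endpoint device; when it finds a PCIe endpoint device, it reads the device's registers and reserves resources according to the device's requirements, for example bus resources and memory address resources. The access port of a PCIe endpoint device referred to in the embodiments may specifically be a downstream port of a PCIe switch or a downstream port of the north bridge in the system.
In the resource allocation scheme provided in the embodiments, the BIOS reserves resources differently from the prior art. When the computer system starts, the BIOS no longer reserves resources according to the actual requirements of the PCIe endpoint devices it finds. Instead, it reserves a specified share of resources for the access port of each PCIe endpoint device, where the specified share is greater than or equal to the resource requirement of a PCIe endpoint device. Preferably, the specified share is the resource requirement of the type of PCIe endpoint device that requires the most resources. For example, the BIOS scans the access port of each PCIe endpoint device in the computer system and, regardless of whether a device is found and of what type it is, assumes that each access port may later receive a PCIe endpoint device of the most demanding type. If 10 types of PCIe endpoint devices may be used in the current system and the most demanding is the SSD device, which needs 10 MB of non-prefetchable memory and 3 PCIe buses, the BIOS reserves 3 PCIe buses and 10 MB of non-prefetchable memory on the access port of every PCIe endpoint device.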
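The reservation rule can be written down as a short calculation. This is a sketch, not BIOS code: the function names and the `nic` entry are hypothetical, and taking the maximum per resource is an assumption that coincides with the text whenever one device type (here the SSD, with the figures from the example above) is the most demanding in every resource.

```python
def specified_share(requirements_by_type):
    """Largest requirement per resource across all device types that may be used."""
    resources = {name for need in requirements_by_type.values() for name in need}
    return {
        name: max(need.get(name, 0) for need in requirements_by_type.values())
        for name in sorted(resources)
    }


def reserve_for_ports(ports, share):
    """Reserve the same share on every access port, occupied or not."""
    return {port: dict(share) for port in ports}


requirements = {
    "ssd": {"pcie_buses": 3, "non_prefetchable_mb": 10},   # figures from the text
    "nic": {"pcie_buses": 1, "non_prefetchable_mb": 2},    # hypothetical values
}
share = specified_share(requirements)
reserved = reserve_for_ports(["port0", "port1", "port2"], share)
```

Each port receives its own copy of the share, so a later change to one port's record would not affect the others.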
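Next, after the BIOS has reserved resources, the PCIe management module of the computer system groups all PCIe endpoint devices and PCIe switches managed by one CPU of the computer system into a PCIe domain and configures a corresponding PCIe tree for the domain. The PCIe tree describes the connection relationship at every level between each PCIe endpoint device in the domain and the CPU, as well as the resource configuration of each PCIe endpoint device. Because the BIOS has already reserved a specified share of resources for the access port of each PCIe endpoint device, the PCIe management module, when loading the access port of each PCIe endpoint device, no longer scans the port to find the actual resource requirement of the device. Instead, it allocates resources according to the BIOS's earlier reservation, that is, it allocates to the access port of each PCIe endpoint device the specified share reserved by the BIOS, and records the allocation in the PCIe tree.
Then, when a PCIe endpoint device goes offline from the computer system, whether because of a fault or because it is pulled out, the PCIe management module, on determining that the device is offline, does not release the specified share of resources allocated to the powered-off device and keeps the structure of the PCIe tree unchanged; that is, the connection relationship and resource configuration of the offline device are retained in the PCIe tree. Because the resources and connection relationship of the PCIe endpoint device are already configured in the PCIe domain, when the device is powered on and joins the PCIe domain, the PCIe management module only notifies the corresponding driver module that the device has been powered on, and the device has then joined the PCIe domain of the computer system. With this scheme, the CPU no longer needs to scan a PCIe endpoint device when it is powered on, which further avoids the whole-system reset that an MCE error could otherwise trigger while a PCIe endpoint device is being connected to the computer system.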
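The retention behaviour of the PCIe tree can be modelled as a small record per access port. The `PcieTree` class and its method names are hypothetical and only illustrate the bookkeeping described above: going offline flips a flag but leaves the allocation in place, and powering on again involves no scan.

```python
class PcieTree:
    """Toy PCIe tree: one record per access port, kept across hot-plug events."""

    def __init__(self, reserved_by_port):
        self.nodes = {
            port: {"resources": dict(resources), "device": None, "online": False}
            for port, resources in reserved_by_port.items()
        }

    def power_on(self, port, device):
        # No scan of the device: the pre-allocated share is simply reused.
        node = self.nodes[port]
        node["device"] = device
        node["online"] = True
        return node["resources"]

    def offline(self, port):
        # Keep the connection record and the allocated share; only mark offline.
        self.nodes[port]["online"] = False


tree = PcieTree({"port0": {"pcie_buses": 3, "non_prefetchable_mb": 10}})
first = tree.power_on("port0", "ssd")
tree.offline("port0")
kept = dict(tree.nodes["port0"]["resources"])
second = tree.power_on("port0", "ssd")
```

After the offline event the record for `port0` still holds the same resources, and the second power-on returns exactly the share that was allocated the first time.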
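A person of ordinary skill in the art will understand that the aspects of the present invention, or possible implementations of those aspects, may be embodied as a system, a method, or a computer program product. Accordingly, they may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, and so on), or an embodiment combining software and hardware, all of which are referred to here as a "circuit", "module", or "system". In addition, the aspects of the present invention, or possible implementations of those aspects, may take the form of a computer program product, meaning computer-readable program code stored in a computer-readable medium.
The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or apparatus, or any suitable combination of these, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, and portable read-only memory (CD-ROM).
A processor in a computer reads the computer-readable program code stored in the computer-readable medium, so that the processor can perform the functional actions specified in each step, or combination of steps, of the flowcharts, and an apparatus is produced that implements the functional actions specified in each block, or combination of blocks, of the block diagrams.
The computer-readable program code may execute entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. It should also be noted that, in some alternative implementations, the functions noted in the steps of the flowcharts or the blocks of the block diagrams may occur out of the order shown. For example, depending on the functions involved, two steps or two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether the functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of the present invention.
The foregoing is merely a description of specific embodiments of the present invention, and the protection scope of the present invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be subject to the protection scope of the claims.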

Claims

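1. A computer system, comprising:
a processor;
a Peripheral Component Interconnect Express (PCIe) bus, configured to connect a PCIe endpoint device;
wherein the computer system further comprises an access proxy, and the access proxy is connected to the processor and to the PCIe endpoint device;
the processor is configured to obtain an operation instruction, wherein the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy, and to send an access request to the access proxy according to the operation instruction, wherein the access request instructs the access proxy to access the PCIe endpoint device; and
the access proxy is configured to send a response message for the access request to the processor after receiving the access request sent by the processor.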
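2. The computer system according to claim 1, wherein the computer system further comprises:
a driver module of the PCIe endpoint device, configured to receive an invocation instruction for accessing the PCIe endpoint device and to generate the operation instruction according to a preconfigured access interface of the PCIe endpoint device, wherein the preconfigured access interface points to the access proxy;
the processor is specifically configured to obtain the operation instruction generated by the driver module of the PCIe endpoint device.
3. The computer system according to claim 1, wherein the computer system further comprises a driver module of the PCIe endpoint device and a host operating system;
the driver module of the PCIe endpoint device is configured to receive an invocation instruction for accessing the PCIe endpoint device and to call the host operating system to access the PCIe endpoint device;
the host operating system is configured to respond to the call from the driver module of the PCIe endpoint device and to generate the operation instruction according to a preconfigured access interface of the PCIe endpoint device, wherein the preconfigured access interface points to the access proxy;
the processor is specifically configured to obtain the operation instruction generated by the host operating system.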
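4. The computer system according to any one of claims 1 to 3, wherein the access proxy is further configured to access the PCIe endpoint device according to the access request.
5. The computer system according to claim 4, wherein the access proxy is implemented by a direct memory access (DMA) engine;
the processor being configured to send an access request to the access proxy according to the operation instruction means that:
the processor is specifically configured to send a data migration request to the DMA engine according to the operation instruction;
the access proxy being configured to access the PCIe endpoint device according to the access request means that:
the DMA engine is specifically configured to move, according to the data migration request, specified data from the memory of the PCIe endpoint device to the memory of the computer system, or specified data from the memory of the computer system to the memory of the PCIe endpoint device.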
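6. The computer system according to claim 4 or 5, wherein
the access proxy is further configured to send a first notification message to the processor, wherein the first notification message indicates that the PCIe endpoint device was accessed successfully;
the processor is further configured to obtain an access result after receiving the first notification message.
7. The computer system according to claim 4 or 5, wherein
the access proxy is further configured to send a second notification message to the processor, wherein the second notification message indicates that the access to the PCIe endpoint device failed;
the processor is further configured to perform follow-up processing for the failed access after receiving the second notification message.
8. The computer system according to claim 7, wherein the processor being configured to perform follow-up processing for the failed access after receiving the second notification message means that:
the processor is configured to determine, after receiving the second notification message, the cause of the access proxy's failure to access the PCIe endpoint device;
if the cause is that the PCIe endpoint device is abnormally offline, the processor is configured to stop accessing the PCIe endpoint device.
9. The computer system according to claim 8, wherein the computer system further comprises a PCIe management module;
the PCIe management module is configured to obtain a notification that the PCIe endpoint device is abnormally offline and to retain the resources allocated to the PCIe endpoint device.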
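10. The computer system according to any one of claims 1 to 9, wherein the access proxy is packaged together with the processor.
11. The computer system according to any one of claims 1 to 9, wherein the access proxy is fixedly connected to the processor;
the access proxy is configured to send the response message for the access request to the processor through the fixed connection to the processor.
12. The computer system according to claim 11, wherein the access proxy being fixedly connected to the processor comprises: the access proxy is soldered onto the printed circuit board to which the processor is connected, or the access proxy is fixedly connected to the processor through a connecting piece.
13. The computer system according to any one of claims 1 to 9, wherein the computer system further comprises a PCIe switch, an upstream port of the PCIe switch is connected to the processor through the PCIe bus, and a downstream port of the PCIe switch is connected to the PCIe endpoint device through the PCIe bus.
14. The computer system according to claim 13, wherein the access proxy is packaged inside the PCIe switch.
15. The computer system according to claim 14, wherein the PCIe switch is soldered onto the printed circuit board to which the processor is connected, or the PCIe switch is fixedly connected to the processor through a connecting piece.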
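16. A method for accessing a Peripheral Component Interconnect Express (PCIe) endpoint device, wherein a processor of a computer system is connected to the PCIe endpoint device through a PCIe bus, and the method comprises:
obtaining, by the processor, an operation instruction, wherein the operation instruction instructs the processor to access the PCIe endpoint device through an access proxy;
sending, by the processor, an access request to the access proxy according to the operation instruction, wherein the access request instructs the access proxy to access the PCIe endpoint device;
receiving, by the processor, a response message for the access request sent by the access proxy.
17. The method according to claim 16, wherein the obtaining, by the processor, of an operation instruction comprises:
obtaining, by the processor, the operation instruction generated by a driver module of the PCIe endpoint device according to a preconfigured access interface of the PCIe endpoint device, wherein the preconfigured access interface points to the access proxy; or
obtaining, by the processor, the operation instruction generated by a host operating system according to a preconfigured access interface of the PCIe endpoint device, wherein the preconfigured access interface points to the access proxy.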
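18. The method according to claim 16 or 17, wherein the access proxy is implemented by a direct memory access (DMA) engine, and the operation instruction specifically instructs the processor to access the PCIe endpoint device through the DMA engine;
the sending, by the processor, of the access request to the access proxy according to the operation instruction comprises:
sending, by the processor, a data migration request to the DMA engine according to the operation instruction, wherein the data migration request instructs the DMA engine to move specified data from the memory of the PCIe endpoint device to the memory of the computer system, or specified data from the memory of the computer system to the memory of the PCIe endpoint device.
19. The method according to claim 18, wherein the operation instruction further indicates that the access type is a read operation, the source address of the read operation, and the length of the read operation;
the sending, by the processor, of the data migration request to the DMA engine according to the operation instruction comprises:
obtaining, by the processor, a destination address of the read operation allocated by the memory of the computer system;
sending, by the processor, the data migration request to the DMA engine, wherein the data migration request carries the source address, the destination address, and the length of the read operation, to instruct the DMA engine to move data of the length of the read operation from the source address of the read operation to the destination address of the read operation.
20. The method according to claim 18, wherein the operation instruction further indicates that the access type is a write operation, the source address of the write operation, the destination address of the write operation, and the length of the write operation;
the sending, by the processor, of the data migration request to the DMA engine according to the operation instruction comprises:
sending, by the processor, the data migration request to the DMA engine, wherein the data migration request carries the source address, the destination address, and the length of the write operation, to instruct the DMA engine to move data of the length of the write operation from the source address of the write operation to the destination address of the write operation.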
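21. The method according to any one of claims 16 to 20, further comprising:
receiving, by the processor, a first notification message sent by the access proxy, wherein the first notification message indicates that the access proxy accessed the PCIe endpoint device successfully;
obtaining, by the processor, an access result according to the first notification message.
22. The method according to any one of claims 16 to 20, further comprising:
receiving, by the processor, a second notification message sent by the access proxy, wherein the second notification message indicates that the access proxy failed to access the PCIe endpoint device;
performing, by the processor, follow-up processing for the failed access according to the second notification message.
23. The method according to claim 22, wherein performing the follow-up processing for the failed access comprises:
determining, by the processor, the cause of the access proxy's failure to access the PCIe endpoint device, and if the cause is that the PCIe endpoint device is abnormally offline, stopping, by the processor, access to the PCIe endpoint device.
24. The method according to claim 23, further comprising:
obtaining a notification that the PCIe endpoint device is abnormally offline, and retaining the resources allocated to the PCIe endpoint device.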
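25. A computer, comprising a processor, a memory, a bus, and a communication interface;
wherein the memory is configured to store computer-executable instructions, the processor is connected to the memory through the bus, and when the computer runs, the processor executes the computer-executable instructions stored in the memory, so that the computer performs the access method according to any one of claims 16 to 24.
26. A computer-readable medium, comprising computer-executable instructions, wherein when a processor of a computer executes the computer-executable instructions, the computer performs the access method according to any one of claims 16 to 24.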
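27. A method for accessing a Peripheral Component Interconnect Express (PCIe) endpoint device, wherein the PCIe endpoint device is connected to a processor of a computer system through a PCIe bus, and the method comprises:
receiving an invocation instruction, wherein the invocation instruction indicates that the PCIe endpoint device is to be accessed;
generating an operation instruction according to a preconfigured access interface of the PCIe endpoint device, wherein the preconfigured access interface points to an access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.
28. The method according to claim 27, wherein the receiving of an invocation instruction comprises:
receiving, by a driver module of the PCIe endpoint device, the invocation instruction;
correspondingly, the generating of the operation instruction according to the preconfigured access interface of the PCIe endpoint device comprises:
generating, by the driver module of the PCIe endpoint device according to the preconfigured access interface of the PCIe endpoint device, the operation instruction for accessing the PCIe endpoint device.
29. The method according to claim 27, wherein the receiving of an invocation instruction comprises:
receiving, by a host operating system of the computer system, the invocation instruction from a driver module of the PCIe endpoint device;
correspondingly, the generating of the operation instruction according to the preconfigured access interface of the PCIe endpoint device comprises:
generating, by the host operating system according to the preconfigured access interface of the PCIe endpoint device, the operation instruction for accessing the PCIe endpoint device.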
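30. An apparatus for accessing a Peripheral Component Interconnect Express (PCIe) endpoint device, comprising:
a receiving module, configured to receive an invocation instruction, wherein the invocation instruction indicates that the PCIe endpoint device is to be accessed;
a generating module, configured to generate an operation instruction according to a preconfigured access interface of the PCIe endpoint device, wherein the preconfigured access interface points to the access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.
31. The apparatus according to claim 30, wherein the access apparatus is a driver module of the PCIe endpoint device or a host operating system of a computer system.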
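32. A computer, comprising a processor, a memory, a bus, and a communication interface;
wherein the memory is configured to store computer-executable instructions, the processor is connected to the memory through the bus, and when the computer runs, the processor executes the computer-executable instructions stored in the memory, so that the computer performs the following method:
receiving an invocation instruction, wherein the invocation instruction indicates that the PCIe endpoint device is to be accessed;
generating, according to a preconfigured access interface of the PCIe endpoint device, an operation instruction for accessing the PCIe endpoint device, wherein the preconfigured access interface points to the access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.
33. A computer-readable medium, comprising computer-executable instructions, wherein when a processor of a computer executes the computer-executable instructions, the computer performs the following method:
receiving an invocation instruction, wherein the invocation instruction indicates that the PCIe endpoint device is to be accessed;
generating, according to a preconfigured access interface of the PCIe endpoint device, an operation instruction for accessing the PCIe endpoint device, wherein the preconfigured access interface points to the access proxy, and the operation instruction instructs the processor to access the PCIe endpoint device through the access proxy.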
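34. An access proxy, wherein the access proxy is applied in a computer system, the computer system comprises a processor and a Peripheral Component Interconnect Express (PCIe) bus, and the PCIe bus connects at least one PCIe endpoint device;
the access proxy is connected to the processor and to the PCIe endpoint device;
the access proxy is configured to receive an access request from the processor for the PCIe endpoint device and to return a response message for the access request to the processor, so as to isolate access between the processor and the PCIe endpoint device.
35. The access proxy according to claim 34, wherein the access proxy is further configured to access the PCIe endpoint device according to the access request.
36. The access proxy according to claim 34 or 35, wherein the access proxy is implemented by a direct memory access (DMA) engine;
the DMA engine is specifically configured to receive a data migration request sent by the processor and, according to the data migration request, to move specified data from the memory of the PCIe endpoint device to the memory of the computer system, or specified data from the memory of the computer system to the memory of the PCIe endpoint device.
37. The access proxy according to claim 34, 35, or 36, wherein the access proxy is further configured to send a first notification message to the processor, wherein the first notification message indicates that the PCIe endpoint device was accessed successfully, or to send a second notification message to the processor, wherein the second notification message indicates that the access to the PCIe endpoint device failed.
38. A PCIe switch, wherein the PCIe switch is applied in a computer system, the computer system comprises a processor and a PCIe bus, and the PCIe bus connects at least one PCIe endpoint device;
an upstream port of the PCIe switch is connected to the processor through the PCIe bus, and a downstream port of the PCIe switch is connected to the PCIe endpoint device through the PCIe bus;
the access proxy according to any one of claims 34 to 37 is built into the PCIe switch.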
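39. A method for allocating resources to a Peripheral Component Interconnect Express (PCIe) endpoint device connected to a computer system, comprising:
reserving a specified share of resources for an access port of the PCIe endpoint device, wherein the specified share of resources is greater than or equal to the resource requirement of the PCIe endpoint device;
allocating, according to the reserved specified share of resources, the reserved specified share of resources to the access port of the PCIe endpoint device.
40. The method according to claim 39, wherein the specified share of resources is the resource requirement of the type of PCIe endpoint device that requires the most resources in the computer system.
41. The method according to claim 39 or 40, wherein the PCIe endpoint devices connected to the computer system and the processor form a PCIe domain, and a corresponding PCIe tree is configured for the PCIe domain;
the method further comprises: recording, in the PCIe tree, the specified share of resources allocated to the access port of the PCIe endpoint device.
42. The method according to claim 41, wherein the method further comprises:
after the PCIe endpoint device goes offline from the computer system, retaining the specified share of resources recorded in the PCIe tree for the access port of the PCIe endpoint device.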
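43. A computer system, comprising:
a processor;
a Peripheral Component Interconnect Express (PCIe) bus, configured to connect a PCIe endpoint device;
a basic input/output system (BIOS), configured to reserve a specified share of resources for an access port of the PCIe endpoint device, wherein the specified share is greater than or equal to the resource requirement of the PCIe endpoint device;
a PCIe management module, configured to allocate, according to the specified share of resources reserved by the BIOS, the reserved specified share of resources to the access port of the PCIe endpoint device.
44. The computer system according to claim 43, wherein the specified share is the resource requirement of the type of PCIe endpoint device that requires the most resources in the computer system.
45. The computer system according to claim 43 or 44, wherein the PCIe endpoint devices connected to the computer system and the processor form a PCIe domain, and a corresponding PCIe tree is configured for the PCIe domain;
the PCIe management module is further configured to record, in the PCIe tree, the specified share of resources allocated to the access port of the PCIe endpoint device.
46. The computer system according to claim 45, wherein the PCIe management module is further configured to:
after the PCIe endpoint device goes offline from the computer system, retain the specified share of resources recorded in the PCIe tree for the access port of the PCIe endpoint device.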
PCT/CN2013/075088 2013-05-02 2013-05-02 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus WO2014176775A1 (zh)

Priority Applications (17)

Application Number Priority Date Filing Date Title
BR112013033792-3A BR112013033792B1 (pt) 2013-05-02 2013-05-02 sistema de computador e método para acessar um dispositivo de ponto de extremidade de expresso de interconexão de componentes periféricos, computador, servidor de acesso e trocador de pcie
CA2833940A CA2833940C (en) 2013-05-02 2013-05-02 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus
CN201380000957.XA CN104335194B (zh) 2013-05-02 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus
EP18155911.3A EP3385854B1 (en) 2013-05-02 2013-05-02 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus
JP2015514331A JP5953573B2 (ja) 2013-05-02 2013-05-02 ペリフェラル・コンポーネント・インターコネクト・エクスプレス・エンドポイントデバイスにアクセスするためのコンピュータシステム、方法、および装置
EP13792568.1A EP2811413B1 (en) 2013-05-02 2013-05-02 Computer system, access method and apparatus for peripheral component interconnect express endpoint device
PCT/CN2013/075088 WO2014176775A1 (zh) 2013-05-02 2013-05-02 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus
KR1020137032327A KR101539878B1 (ko) 2013-05-02 2013-05-02 컴퓨터 시스템, pci 익스프레스 엔드포인트 디바이스에 액세스하는 방법 및 장치
ES18155911T ES2866156T3 (es) 2013-05-02 2013-05-02 Sistema informático, método para acceder a un terminal de interconexión de componentes periféricos exprés, y equipo
EP16180277.2A EP3173936B1 (en) 2013-05-02 2013-05-02 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus
AU2013263866A AU2013263866B2 (en) 2013-05-02 2013-05-02 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus
ES13792568.1T ES2610978T3 (es) 2013-05-02 2013-05-02 Sistema informático, método de acceso y aparato para un dispositivo de punto final de interconexión de componentes periféricos exprés
ES16180277.2T ES2687609T3 (es) 2013-05-02 2013-05-02 Sistema informático, método para acceder a un terminal de interconexión de componentes periféricos exprés y equipo
ZA2013/08948A ZA201308948B (en) 2013-05-02 2013-11-27 Computer system, method and apparatus for peripheral component interconnect express endpoint device
US14/143,460 US8782317B1 (en) 2013-05-02 2013-12-30 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus
US14/297,959 US10025745B2 (en) 2013-05-02 2014-06-06 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus
US14/703,328 US9477632B2 (en) 2013-05-02 2015-05-04 Access proxy for accessing peripheral component interconnect express endpoint device, PCIe exchanger and computer system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2013/075088 WO2014176775A1 (zh) 2013-05-02 2013-05-02 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/143,460 Continuation US8782317B1 (en) 2013-05-02 2013-12-30 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus

Publications (1)

Publication Number Publication Date
WO2014176775A1 true WO2014176775A1 (zh) 2014-11-06

Family

ID=51135815

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/075088 WO2014176775A1 (zh) 2013-05-02 2013-05-02 Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus

Country Status (10)

Country Link
US (3) US8782317B1 (zh)
EP (3) EP3173936B1 (zh)
JP (1) JP5953573B2 (zh)
KR (1) KR101539878B1 (zh)
AU (1) AU2013263866B2 (zh)
BR (1) BR112013033792B1 (zh)
CA (1) CA2833940C (zh)
ES (3) ES2866156T3 (zh)
WO (1) WO2014176775A1 (zh)
ZA (1) ZA201308948B (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105938461A (zh) * 2015-07-31 2016-09-14 杭州迪普科技有限公司 DMA data transmission method and apparatus, and network device

Families Citing this family (154)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12008266B2 (en) 2010-09-15 2024-06-11 Pure Storage, Inc. Efficient read by reconstruction
US11614893B2 (en) 2010-09-15 2023-03-28 Pure Storage, Inc. Optimizing storage device access based on latency
US8589640B2 (en) 2011-10-14 2013-11-19 Pure Storage, Inc. Method for maintaining multiple fingerprint tables in a deduplicating storage system
US20140229659A1 (en) * 2011-12-30 2014-08-14 Marc T. Jones Thin translation for system access of non volatile semicondcutor storage as random access memory
KR101539878B1 (ko) * 2013-05-02 2015-07-27 후아웨이 테크놀러지 컴퍼니 리미티드 컴퓨터 시스템, pci 익스프레스 엔드포인트 디바이스에 액세스하는 방법 및 장치
US9552323B1 (en) * 2013-07-05 2017-01-24 Altera Corporation High-speed peripheral component interconnect (PCIe) input-output devices with receive buffer management circuitry
US9367243B1 (en) 2014-06-04 2016-06-14 Pure Storage, Inc. Scalable non-uniform storage sizes
US9003144B1 (en) 2014-06-04 2015-04-07 Pure Storage, Inc. Mechanism for persisting messages in a storage system
US11068363B1 (en) 2014-06-04 2021-07-20 Pure Storage, Inc. Proactively rebuilding data in a storage cluster
US11652884B2 (en) 2014-06-04 2023-05-16 Pure Storage, Inc. Customized hash algorithms
US11399063B2 (en) 2014-06-04 2022-07-26 Pure Storage, Inc. Network authentication for a storage system
US11960371B2 (en) 2014-06-04 2024-04-16 Pure Storage, Inc. Message persistence in a zoned system
US10574754B1 (en) 2014-06-04 2020-02-25 Pure Storage, Inc. Multi-chassis array with multi-level load balancing
US9213485B1 (en) 2014-06-04 2015-12-15 Pure Storage, Inc. Storage system architecture
US9218244B1 (en) 2014-06-04 2015-12-22 Pure Storage, Inc. Rebuilding data across storage nodes
US9836234B2 (en) 2014-06-04 2017-12-05 Pure Storage, Inc. Storage cluster
US8850108B1 (en) 2014-06-04 2014-09-30 Pure Storage, Inc. Storage cluster
US8868825B1 (en) 2014-07-02 2014-10-21 Pure Storage, Inc. Nonrepeating identifiers in an address space of a non-volatile solid-state storage
US11604598B2 (en) 2014-07-02 2023-03-14 Pure Storage, Inc. Storage cluster with zoned drives
US11886308B2 (en) 2014-07-02 2024-01-30 Pure Storage, Inc. Dual class of service for unified file and object messaging
US10114757B2 (en) 2014-07-02 2018-10-30 Pure Storage, Inc. Nonrepeating identifiers in an address space of a non-volatile solid-state storage
US9836245B2 (en) 2014-07-02 2017-12-05 Pure Storage, Inc. Non-volatile RAM and flash memory in a non-volatile solid-state storage
US9021297B1 (en) 2014-07-02 2015-04-28 Pure Storage, Inc. Redundant, fault-tolerant, distributed remote procedure call cache in a storage system
US9811677B2 (en) 2014-07-03 2017-11-07 Pure Storage, Inc. Secure data replication in a storage grid
US10853311B1 (en) 2014-07-03 2020-12-01 Pure Storage, Inc. Administration through files in a storage system
US8874836B1 (en) 2014-07-03 2014-10-28 Pure Storage, Inc. Scheduling policy for queues in a non-volatile solid-state storage
US9747229B1 (en) 2014-07-03 2017-08-29 Pure Storage, Inc. Self-describing data format for DMA in a non-volatile solid-state storage
US9082512B1 (en) 2014-08-07 2015-07-14 Pure Storage, Inc. Die-level monitoring in a storage cluster
US9483346B2 (en) 2014-08-07 2016-11-01 Pure Storage, Inc. Data rebuild on feedback from a queue in a non-volatile solid-state storage
US10983859B2 (en) 2014-08-07 2021-04-20 Pure Storage, Inc. Adjustable error correction based on memory health in a storage unit
US9495255B2 (en) 2014-08-07 2016-11-15 Pure Storage, Inc. Error recovery in a storage cluster
US9766972B2 (en) 2014-08-07 2017-09-19 Pure Storage, Inc. Masking defective bits in a storage array
US9558069B2 (en) 2014-08-07 2017-01-31 Pure Storage, Inc. Failure mapping in a storage array
US10079711B1 (en) 2014-08-20 2018-09-18 Pure Storage, Inc. Virtual file server with preserved MAC address
US10229085B2 (en) 2015-01-23 2019-03-12 Hewlett Packard Enterprise Development Lp Fibre channel hardware card port assignment and management method for port names
US9948615B1 (en) 2015-03-16 2018-04-17 Pure Storage, Inc. Increased storage unit encryption based on loss of trust
US11294893B2 (en) 2015-03-20 2022-04-05 Pure Storage, Inc. Aggregation of queries
US9940234B2 (en) 2015-03-26 2018-04-10 Pure Storage, Inc. Aggressive data deduplication using lazy garbage collection
US10082985B2 (en) 2015-03-27 2018-09-25 Pure Storage, Inc. Data striping across storage nodes that are assigned to multiple logical arrays
US10178169B2 (en) 2015-04-09 2019-01-08 Pure Storage, Inc. Point to point based backend communication layer for storage processing
US9672125B2 (en) 2015-04-10 2017-06-06 Pure Storage, Inc. Ability to partition an array into two or more logical arrays with independently running software
US10140149B1 (en) 2015-05-19 2018-11-27 Pure Storage, Inc. Transactional commits with hardware assists in remote memory
US9817576B2 (en) 2015-05-27 2017-11-14 Pure Storage, Inc. Parallel update to NVRAM
US10846275B2 (en) 2015-06-26 2020-11-24 Pure Storage, Inc. Key management in a storage device
US10983732B2 (en) 2015-07-13 2021-04-20 Pure Storage, Inc. Method and system for accessing a file
US11232079B2 (en) 2015-07-16 2022-01-25 Pure Storage, Inc. Efficient distribution of large directories
US10108355B2 (en) 2015-09-01 2018-10-23 Pure Storage, Inc. Erase block state detection
US11341136B2 (en) 2015-09-04 2022-05-24 Pure Storage, Inc. Dynamically resizable structures for approximate membership queries
WO2017049433A1 (zh) * 2015-09-21 2017-03-30 华为技术有限公司 Computer system and method for accessing an endpoint device in a computer system
US10762069B2 (en) 2015-09-30 2020-09-01 Pure Storage, Inc. Mechanism for a system where data and metadata are located closely together
US9768953B2 (en) 2015-09-30 2017-09-19 Pure Storage, Inc. Resharing of a split secret
US10853266B2 (en) 2015-09-30 2020-12-01 Pure Storage, Inc. Hardware assisted data lookup methods
US9843453B2 (en) 2015-10-23 2017-12-12 Pure Storage, Inc. Authorizing I/O commands with I/O tokens
US10007457B2 (en) 2015-12-22 2018-06-26 Pure Storage, Inc. Distributed transactions with token-associated execution
CN105824622B (zh) * 2016-03-11 2020-04-24 联想(北京)有限公司 Data processing method and electronic device
US10261690B1 (en) 2016-05-03 2019-04-16 Pure Storage, Inc. Systems and methods for operating a storage system
US11861188B2 (en) 2016-07-19 2024-01-02 Pure Storage, Inc. System having modular accelerators
US11449232B1 (en) 2016-07-22 2022-09-20 Pure Storage, Inc. Optimal scheduling of flash operations
US9672905B1 (en) 2016-07-22 2017-06-06 Pure Storage, Inc. Optimize data protection layouts based on distributed flash wear leveling
US10768819B2 (en) 2016-07-22 2020-09-08 Pure Storage, Inc. Hardware support for non-disruptive upgrades
US11080155B2 (en) 2016-07-24 2021-08-03 Pure Storage, Inc. Identifying error types among flash memory
US10216420B1 (en) 2016-07-24 2019-02-26 Pure Storage, Inc. Calibration of flash channels in SSD
US11604690B2 (en) 2016-07-24 2023-03-14 Pure Storage, Inc. Online failure span determination
US10203903B2 (en) 2016-07-26 2019-02-12 Pure Storage, Inc. Geometry based, space aware shelf/writegroup evacuation
US11797212B2 (en) 2016-07-26 2023-10-24 Pure Storage, Inc. Data migration for zoned drives
US10366004B2 (en) 2016-07-26 2019-07-30 Pure Storage, Inc. Storage system with elective garbage collection to reduce flash contention
US11886334B2 (en) 2016-07-26 2024-01-30 Pure Storage, Inc. Optimizing spool and memory space management
US11734169B2 (en) 2016-07-26 2023-08-22 Pure Storage, Inc. Optimizing spool and memory space management
US11422719B2 (en) 2016-09-15 2022-08-23 Pure Storage, Inc. Distributed file deletion and truncation
US10756816B1 (en) 2016-10-04 2020-08-25 Pure Storage, Inc. Optimized fibre channel and non-volatile memory express access
US9747039B1 (en) 2016-10-04 2017-08-29 Pure Storage, Inc. Reservations over multiple paths on NVMe over fabrics
US10481798B2 (en) 2016-10-28 2019-11-19 Pure Storage, Inc. Efficient flash management for multiple controllers
US11550481B2 (en) 2016-12-19 2023-01-10 Pure Storage, Inc. Efficiently writing data in a zoned drive storage system
US11307998B2 (en) 2017-01-09 2022-04-19 Pure Storage, Inc. Storage efficiency of encrypted host system data
US9747158B1 (en) 2017-01-13 2017-08-29 Pure Storage, Inc. Intelligent refresh of 3D NAND
US11955187B2 (en) 2017-01-13 2024-04-09 Pure Storage, Inc. Refresh of differing capacity NAND
US10979223B2 (en) 2017-01-31 2021-04-13 Pure Storage, Inc. Separate encryption for a solid-state drive
US10528488B1 (en) 2017-03-30 2020-01-07 Pure Storage, Inc. Efficient name coding
US11016667B1 (en) 2017-04-05 2021-05-25 Pure Storage, Inc. Efficient mapping for LUNs in storage memory with holes in address space
CN108733479B (zh) * 2017-04-24 2021-11-02 上海宝存信息科技有限公司 Method for unloading a solid-state drive card and apparatus using the method
US10516645B1 (en) 2017-04-27 2019-12-24 Pure Storage, Inc. Address resolution broadcasting in a networked device
US10944671B2 (en) 2017-04-27 2021-03-09 Pure Storage, Inc. Efficient data forwarding in a networked device
US10141050B1 (en) 2017-04-27 2018-11-27 Pure Storage, Inc. Page writes for triple level cell flash memory
US10223318B2 (en) 2017-05-31 2019-03-05 Hewlett Packard Enterprise Development Lp Hot plugging peripheral connected interface express (PCIe) cards
US11467913B1 (en) 2017-06-07 2022-10-11 Pure Storage, Inc. Snapshots with crash consistency in a storage system
US11947814B2 (en) 2017-06-11 2024-04-02 Pure Storage, Inc. Optimizing resiliency group formation stability
US11782625B2 (en) 2017-06-11 2023-10-10 Pure Storage, Inc. Heterogeneity supportive resiliency groups
US11138103B1 (en) 2017-06-11 2021-10-05 Pure Storage, Inc. Resiliency groups
US10425473B1 (en) 2017-07-03 2019-09-24 Pure Storage, Inc. Stateful connection reset in a storage cluster with a stateless load balancer
US10402266B1 (en) 2017-07-31 2019-09-03 Pure Storage, Inc. Redundant array of independent disks in a direct-mapped flash storage system
US10210926B1 (en) 2017-09-15 2019-02-19 Pure Storage, Inc. Tracking of optimum read voltage thresholds in nand flash devices
US10877827B2 (en) 2017-09-15 2020-12-29 Pure Storage, Inc. Read voltage optimization
US11024390B1 (en) 2017-10-31 2021-06-01 Pure Storage, Inc. Overlapping RAID groups
US10496330B1 (en) 2017-10-31 2019-12-03 Pure Storage, Inc. Using flash storage devices with different sized erase blocks
US10515701B1 (en) 2017-10-31 2019-12-24 Pure Storage, Inc. Overlapping raid groups
US10884919B2 (en) 2017-10-31 2021-01-05 Pure Storage, Inc. Memory management in a storage system
US10545687B1 (en) 2017-10-31 2020-01-28 Pure Storage, Inc. Data rebuild when changing erase block sizes during drive replacement
US10860475B1 (en) 2017-11-17 2020-12-08 Pure Storage, Inc. Hybrid flash translation layer
US10990566B1 (en) 2017-11-20 2021-04-27 Pure Storage, Inc. Persistent file locks in a storage system
CN107957885B (zh) * 2017-12-01 2021-02-26 麒麟软件有限公司 Standby and recovery method for PCIe link devices based on the 飞腾 (Phytium) platform
US10719265B1 (en) 2017-12-08 2020-07-21 Pure Storage, Inc. Centralized, quorum-aware handling of device reservation requests in a storage system
US10929053B2 (en) 2017-12-08 2021-02-23 Pure Storage, Inc. Safe destructive actions on drives
US10929031B2 (en) 2017-12-21 2021-02-23 Pure Storage, Inc. Maximizing data reduction in a partially encrypted volume
US10467527B1 (en) 2018-01-31 2019-11-05 Pure Storage, Inc. Method and apparatus for artificial intelligence acceleration
US10733053B1 (en) 2018-01-31 2020-08-04 Pure Storage, Inc. Disaster recovery for high-bandwidth distributed archives
US10976948B1 (en) 2018-01-31 2021-04-13 Pure Storage, Inc. Cluster expansion mechanism
US11036596B1 (en) 2018-02-18 2021-06-15 Pure Storage, Inc. System for delaying acknowledgements on open NAND locations until durability has been confirmed
US11494109B1 (en) 2018-02-22 2022-11-08 Pure Storage, Inc. Erase block trimming for heterogenous flash memory storage devices
CN108509155B (zh) * 2018-03-31 2021-07-13 深圳忆联信息系统有限公司 Method and apparatus for remotely accessing a disk
US11995336B2 (en) 2018-04-25 2024-05-28 Pure Storage, Inc. Bucket views
US12001688B2 (en) 2019-04-29 2024-06-04 Pure Storage, Inc. Utilizing data views to optimize secure data access in a storage system
US10931450B1 (en) 2018-04-27 2021-02-23 Pure Storage, Inc. Distributed, lock-free 2-phase commit of secret shares using multiple stateless controllers
US10853146B1 (en) 2018-04-27 2020-12-01 Pure Storage, Inc. Efficient data forwarding in a networked device
US11385792B2 (en) 2018-04-27 2022-07-12 Pure Storage, Inc. High availability controller pair transitioning
US11436023B2 (en) 2018-05-31 2022-09-06 Pure Storage, Inc. Mechanism for updating host file system and flash translation layer based on underlying NAND technology
US11438279B2 (en) 2018-07-23 2022-09-06 Pure Storage, Inc. Non-disruptive conversion of a clustered service from single-chassis to multi-chassis
US11017071B2 (en) * 2018-08-02 2021-05-25 Dell Products L.P. Apparatus and method to protect an information handling system against other devices
US11500570B2 (en) 2018-09-06 2022-11-15 Pure Storage, Inc. Efficient relocation of data utilizing different programming modes
US11520514B2 (en) 2018-09-06 2022-12-06 Pure Storage, Inc. Optimized relocation of data based on data characteristics
US11868309B2 (en) 2018-09-06 2024-01-09 Pure Storage, Inc. Queue management for data relocation
US11354058B2 (en) 2018-09-06 2022-06-07 Pure Storage, Inc. Local relocation of data stored at a storage device of a storage system
US10846155B2 (en) * 2018-10-16 2020-11-24 Samsung Electronics Co., Ltd. Method for NVMe SSD based storage service using RPC and gRPC tunneling over PCIe +
US10454498B1 (en) 2018-10-18 2019-10-22 Pure Storage, Inc. Fully pipelined hardware engine design for fast and efficient inline lossless data compression
US10976947B2 (en) 2018-10-26 2021-04-13 Pure Storage, Inc. Dynamically selecting segment heights in a heterogeneous RAID group
CN109684084A (zh) * 2018-12-12 2019-04-26 浪潮(北京)电子信息产业有限公司 Bus resource allocation method, system, and related components
US11334254B2 (en) 2019-03-29 2022-05-17 Pure Storage, Inc. Reliability based flash page sizing
US11775189B2 (en) 2019-04-03 2023-10-03 Pure Storage, Inc. Segment level heterogeneity
US11099986B2 (en) 2019-04-12 2021-08-24 Pure Storage, Inc. Efficient transfer of memory contents
JP7326863B2 (ja) * 2019-05-17 2023-08-16 オムロン株式会社 Transfer device, information processing device, and data transfer method
US11714572B2 (en) 2019-06-19 2023-08-01 Pure Storage, Inc. Optimized data resiliency in a modular storage system
US11281394B2 (en) 2019-06-24 2022-03-22 Pure Storage, Inc. Replication across partitioning schemes in a distributed storage system
US11893126B2 (en) 2019-10-14 2024-02-06 Pure Storage, Inc. Data deletion for a multi-tenant environment
US12001684B2 (en) 2019-12-12 2024-06-04 Pure Storage, Inc. Optimizing dynamic power loss protection adjustment in a storage system
US11704192B2 (en) 2019-12-12 2023-07-18 Pure Storage, Inc. Budgeting open blocks based on power loss protection
US11847331B2 (en) 2019-12-12 2023-12-19 Pure Storage, Inc. Budgeting open blocks of a storage unit based on power loss prevention
US11416144B2 (en) 2019-12-12 2022-08-16 Pure Storage, Inc. Dynamic use of segment or zone power loss protection in a flash device
US11188432B2 (en) 2020-02-28 2021-11-30 Pure Storage, Inc. Data resiliency by partially deallocating data blocks of a storage device
US11507297B2 (en) 2020-04-15 2022-11-22 Pure Storage, Inc. Efficient management of optimal read levels for flash storage systems
US11256587B2 (en) 2020-04-17 2022-02-22 Pure Storage, Inc. Intelligent access to a storage device
US11474986B2 (en) 2020-04-24 2022-10-18 Pure Storage, Inc. Utilizing machine learning to streamline telemetry processing of storage media
US11416338B2 (en) 2020-04-24 2022-08-16 Pure Storage, Inc. Resiliency scheme to enhance storage performance
CN111767242B (zh) * 2020-05-28 2022-04-15 西安广和通无线软件有限公司 PCIe device control method and apparatus, computer device, and storage medium
US11768763B2 (en) 2020-07-08 2023-09-26 Pure Storage, Inc. Flash secure erase
US11681448B2 (en) 2020-09-08 2023-06-20 Pure Storage, Inc. Multiple device IDs in a multi-fabric module storage system
US11513974B2 (en) 2020-09-08 2022-11-29 Pure Storage, Inc. Using nonce to control erasure of data blocks of a multi-controller storage system
US11487455B2 (en) 2020-12-17 2022-11-01 Pure Storage, Inc. Dynamic block allocation to optimize storage system performance
US11847324B2 (en) 2020-12-31 2023-12-19 Pure Storage, Inc. Optimizing resiliency groups for data regions of a storage system
US11614880B2 (en) 2020-12-31 2023-03-28 Pure Storage, Inc. Storage system with selectable write paths
US11630593B2 (en) 2021-03-12 2023-04-18 Pure Storage, Inc. Inline flash memory qualification in a storage system
US11507597B2 (en) 2021-03-31 2022-11-22 Pure Storage, Inc. Data replication to meet a recovery point objective
US11832410B2 (en) 2021-09-14 2023-11-28 Pure Storage, Inc. Mechanical energy absorbing bracket apparatus
CN113868181B (zh) * 2021-09-30 2023-07-21 苏州浪潮智能科技有限公司 PCIe link negotiation method, system, device, and medium for a storage device
EP4213007A3 (en) * 2021-12-24 2023-09-27 Samsung Electronics Co., Ltd. Storage device having deduplication manager, method of operating the same, and method of operating storage system including the same
US11994723B2 (en) 2021-12-30 2024-05-28 Pure Storage, Inc. Ribbon cable alignment apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
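CN101206629A (zh) * 2006-12-19 2008-06-25 国际商业机器公司 System and method for hot-plugging/unplugging a new component in a running PCIe fabric
CN101206621A (zh) * 2006-12-19 2008-06-25 国际商业机器公司 System and method for migrating stateless virtual functions
CN101631083A (zh) * 2009-08-07 2010-01-20 成都市华为赛门铁克科技有限公司 Device takeover method and apparatus, and dual-controller system
CN101763221A (zh) * 2008-12-24 2010-06-30 成都市华为赛门铁克科技有限公司 Storage method, storage system, and controller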

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5630076A (en) * 1995-05-05 1997-05-13 Apple Computer, Inc. Dynamic device matching using driver candidate lists
KR100244778B1 (ko) * 1997-07-19 2000-02-15 윤종용 Circuit for mounting or removing a board in a normally operating system
US7152167B2 (en) 2002-12-11 2006-12-19 Intel Corporation Apparatus and method for data bus power control
US7404047B2 (en) 2003-05-27 2008-07-22 Intel Corporation Method and apparatus to improve multi-CPU system performance for accesses to memory
US7484045B2 (en) 2004-03-30 2009-01-27 Intel Corporation Store performance in strongly-ordered microprocessor architecture
US7484016B2 (en) 2004-06-30 2009-01-27 Intel Corporation Apparatus and method for high performance volatile disk drive memory access using an integrated DMA engine
US7543094B2 (en) 2005-07-05 2009-06-02 Via Technologies, Inc. Target readiness protocol for contiguous write
US7546487B2 (en) * 2005-09-15 2009-06-09 Intel Corporation OS and firmware coordinated error handling using transparent firmware intercept and firmware services
JP4809166B2 (ja) * 2006-09-06 2011-11-09 株式会社日立製作所 Computer system constituting remote I/O and I/O data transfer method
US7835391B2 (en) 2007-03-07 2010-11-16 Texas Instruments Incorporated Protocol DMA engine
US8141094B2 (en) * 2007-12-03 2012-03-20 International Business Machines Corporation Distribution of resources for I/O virtualized (IOV) adapters and management of the adapters through an IOV management partition via user selection of compatible virtual functions
US7934033B2 (en) * 2008-03-25 2011-04-26 Aprius, Inc. PCI-express function proxy
US8521915B2 (en) * 2009-08-18 2013-08-27 Fusion-Io, Inc. Communicating between host computers and peripheral resources in an input/output (I/O) virtualization system
KR101539878B1 (ko) * 2013-05-02 2015-07-27 후아웨이 테크놀러지 컴퍼니 리미티드 Computer system, and method and apparatus for accessing a PCI Express endpoint device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105938461A (zh) * 2015-07-31 2016-09-14 杭州迪普科技有限公司 DMA data transmission method and apparatus, and network device
CN105938461B (zh) * 2015-07-31 2019-02-19 杭州迪普科技股份有限公司 DMA data transmission method and apparatus, and network device

Also Published As

Publication number Publication date
JP2015519665A (ja) 2015-07-09
US20140331000A1 (en) 2014-11-06
ES2687609T3 (es) 2018-10-26
CN104335194A (zh) 2015-02-04
AU2013263866B2 (en) 2016-02-18
EP3385854A1 (en) 2018-10-10
US8782317B1 (en) 2014-07-15
US10025745B2 (en) 2018-07-17
KR101539878B1 (ko) 2015-07-27
EP3385854B1 (en) 2021-01-27
CA2833940A1 (en) 2014-11-02
AU2013263866A1 (en) 2014-12-04
US20150234772A1 (en) 2015-08-20
EP2811413A4 (en) 2014-12-10
ES2610978T3 (es) 2017-05-04
ES2866156T3 (es) 2021-10-19
BR112013033792A2 (pt) 2017-02-07
EP3173936A1 (en) 2017-05-31
KR20150005854A (ko) 2015-01-15
ZA201308948B (en) 2016-01-27
EP3173936B1 (en) 2018-07-18
BR112013033792B1 (pt) 2018-12-04
CA2833940C (en) 2018-12-04
EP2811413A1 (en) 2014-12-10
US9477632B2 (en) 2016-10-25
JP5953573B2 (ja) 2016-07-20
EP2811413B1 (en) 2016-10-19

Similar Documents

Publication Publication Date Title
WO2014176775A1 (zh) Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus
US10838665B2 (en) Method, device, and system for buffering data for read/write commands in NVME over fabric architecture
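JP6140303B2 (ja) Virtual machine live migration method, virtual machine memory data processing method, server, and virtual machine system
JP4585463B2 (ja) Program for operating a virtual computer system
JP2012133405A (ja) Storage apparatus and data transfer control method therefor
CN103797469A (zh) Computer system, method for accessing peripheral component interconnect express endpoint device, and apparatus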
US20100106869A1 (en) USB Storage Device and Interface Circuit Thereof
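JP6842480B2 (ja) Distributed storage system
JP2011076174A (ja) Endpoint sharing system, proxy access method, and proxy access program
JP6245370B2 (ja) Computer system and method for bidirectionally transmitting and receiving data
JP2015201008A (ja) Information processing apparatus, information processing program, and information processing method
JP2022039501A (ja) Storage control device, delivery status determination program, and storage system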

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase

Ref document number: 2833940

Country of ref document: CA

REEP Request for entry into the european phase

Ref document number: 2013792568

Country of ref document: EP

WWE WIPO information: entry into national phase

Ref document number: 2013792568

Country of ref document: EP

WWE WIPO information: entry into national phase

Ref document number: 1020137032327

Country of ref document: KR

ENP Entry into the national phase

Ref document number: 2015514331

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2013263866

Country of ref document: AU

Date of ref document: 20130502

Kind code of ref document: A

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112013033792

Country of ref document: BR

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 112013033792

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20131227