CN115905036A - Data access system, method and related equipment - Google Patents

Data access system, method and related equipment

Info

Publication number
CN115905036A
CN115905036A (application CN202111160189.7A)
Authority
CN
China
Prior art keywords
node
address
chip
data access
port
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111160189.7A
Other languages
Chinese (zh)
Inventor
陈天翔
黄江乐
胡天驰
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202111160189.7A
Priority to PCT/CN2022/118756 (published as WO2023051248A1)
Publication of CN115905036A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/10 Address translation
    • G06F 12/1081 Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F 13/38 Information transfer, e.g. on bus
    • G06F 13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Small-Scale Networks (AREA)
  • Information Transfer Systems (AREA)

Abstract

The system does not need to wait for the preparation time of a network card queue unit, so the first node accesses the memory of the second node with high efficiency and low time delay, which improves the data processing efficiency of the first node.

Description

Data access system, method and related equipment
Technical Field
The present application relates to the field of storage, and in particular, to a data access system, method and related device.
Background
With the continuous development of science and technology, the massive data generated in the information-explosion era has penetrated into every industry and business function, and big data and artificial intelligence (AI) have accordingly become two very popular research directions.
When a computing node executes data processing (for example, a big data or AI task), a large memory capacity is often needed to store the data. The data can therefore be placed in the memories of multiple storage nodes in a distributed manner, and the computing node can read the data in the memories of the storage nodes through the Remote Direct Memory Access (RDMA) protocol, thereby expanding the memory capacity.
However, under the RDMA protocol, communication between the computing node and the storage node is implemented by network cards, and data transmission between the two network cards is performed through network card queues, so every time the computing node reads data, the data reading request must first be placed into a network card queue, which causes a large amount of time to be spent on queue-unit preparation during the read.
Disclosure of Invention
The application provides a data access system, a data access method and related equipment, to solve the problems of low access efficiency and high network delay when a computing node accesses the memory of a storage node.
In a first aspect, a data access system is provided, which includes a first node and a second node, the first node and the second node being connected by a cable; the first node is used for generating a data access request, wherein the data access request is used for requesting data in a memory of the second node; the first node is used for sending a data access request to the second node through the cable; the second node is configured to convert the first destination address in the data access request into a local physical address corresponding to the first destination address, and access data in a memory of the second node according to the local physical address.
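The request flow described in this aspect can be sketched as a minimal software model (all class, field, and value names here are hypothetical illustrations; the patent describes a hardware system connected by a cable, not software objects):

```python
# Hypothetical software model of the first-aspect flow: the first node issues a
# data access request carrying a destination address; the second node converts
# it to a local physical address and serves the access from its own memory.

class SecondNode:
    def __init__(self, base_address, memory_size):
        self.base = base_address           # start of the exported address segment
        self.memory = bytearray(memory_size)

    def handle(self, request):
        dest, length = request["dest"], request["len"]
        local = dest - self.base           # destination -> local physical address
        return bytes(self.memory[local:local + length])

class FirstNode:
    def __init__(self, peer):
        self.peer = peer                   # stands in for the cable connection

    def read(self, dest, length):
        request = {"dest": dest, "len": length}   # generate the data access request
        return self.peer.handle(request)          # "send" it over the cable

node2 = SecondNode(base_address=0x4000_0000, memory_size=4096)
node2.memory[0x10:0x14] = b"\xde\xad\xbe\xef"
node1 = FirstNode(node2)
assert node1.read(0x4000_0010, 4) == b"\xde\xad\xbe\xef"
```

The point of the sketch is the division of labor: the first node only carries a destination address, and only the second node knows how that address maps onto its local memory.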
In the system described in the first aspect, the first node and the second node are connected by a cable, and communication interaction between the first node and the second node does not need to pass through a network card or a routing device, so that the first node does not need to wait for extra preparation time of a network card queue unit when accessing the memory of the second node, thereby improving the efficiency of accessing the memory of the second node by the first node and reducing the access delay.
In a possible implementation manner, the first node includes a computing chip and an interconnection chip, where a first high-speed interconnection port of the interconnection chip is connected to a second high-speed interconnection port of a processor in the second node through a cable, the computing chip is connected to the interconnection chip through a port, the computing chip is configured to generate a data access request and send the data access request to the interconnection chip, and the interconnection chip is configured to send the data access request to the second node through the cable.
The computing chip may consist of at least one general-purpose processor, such as a CPU or an NPU, or a combination of a CPU and hardware chips. The hardware chip may be an Application-Specific Integrated Circuit (ASIC), a Programmable Logic Device (PLD), or a combination thereof. The PLD may be a Complex Programmable Logic Device (CPLD), a Field-Programmable Gate Array (FPGA), Generic Array Logic (GAL), or any combination thereof. The number of computing chips in the first node may be one or more, which is not specifically limited in this application.
The interconnection chip may be an ASIC, a PLD, or a combination thereof, and the PLD may be a CPLD, an FPGA, a GAL, or any combination thereof, which is not specifically limited in this application. The number of the interconnection chips in the first node may be multiple, and the application is not particularly limited. The interconnection chip is provided with high-speed interconnection ports, the interconnection chip can perform data communication with the second node through the high-speed interconnection ports, the first high-speed interconnection ports of the interconnection chip are connected with the second high-speed interconnection ports on the second node through cables, the number of the first high-speed interconnection ports on each interconnection chip can be one or more, and each first high-speed interconnection port is in one-to-one correspondence with the second high-speed interconnection ports on the second node.
The high-speed interconnection port may be a high-speed serial bus port, such as a SERDES bus port, and the cable may be a cable, an optical fiber, a twisted pair, or the like, which may transmit data, and the cable is not specifically limited in this application. The number of the high-speed interconnection ports on the first node can be one or more, and the first high-speed interconnection ports on the first node are in one-to-one correspondence with the second high-speed interconnection ports on the second node.
The ports of the computing chip may be high-speed serial bus ports, such as SERDES bus ports. The computing chip may be connected to the interconnection chip through a bus, such as a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus; the computing chip, the interconnection chip, and the ports and bus between them may be printed on the same circuit board during manufacturing. In a specific implementation, the number of ports of the computing chip may be one or more, which is not limited in this application.
By implementing this implementation manner, the interconnection chip deployed in the first node enables the first node to communicate with more second nodes: the more interconnection chips there are, the more high-speed interconnection ports can be deployed in the first node and the more second nodes can be connected to it. This increases the memory expansion capability of the first node and makes it suitable for more application scenarios.
In another possible implementation, the data communication between the computing chip of the first node, the interconnect chip, and the second node may implement an addressing function through an address decoder.
Optionally, a first address decoder is deployed in the computing chip. The computing chip is specifically configured to generate a data access request, determine a first port according to a first destination address in the data access request and the first address decoder, and send the data access request to the interconnection chip through the first port, where the first address decoder records the correspondence between destination addresses and the ports of the computing chip.
Optionally, a second address decoder is disposed in the interconnect chip, and the interconnect chip is specifically configured to determine a first high-speed interconnect port according to the first destination address and the second address decoder, and send a data access request to the second node through the first high-speed interconnect port, where the second address decoder is configured to record a correspondence between the destination address and the high-speed interconnect port.
Optionally, a third address decoder is deployed in the second node, and the second node is specifically configured to determine, according to the first destination address and the third address decoder, the local physical address corresponding to the first destination address, where the third address decoder records the correspondence between destination addresses and local physical addresses. The correspondence recorded by the third address decoder may be: local physical address = destination address − base address, where the base address is the starting address of an address segment (also called the head address or segment address), and destination addresses belonging to the same address segment share the same base address.
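A minimal sketch of that correspondence rule, assuming a software model with hypothetical names (the real third address decoder is hardware in the second node):

```python
# Hypothetical model of the third address decoder: it records (base, length)
# address segments and applies the rule
#   local physical address = destination address - base address.

class ThirdAddressDecoder:
    def __init__(self, segments):
        # segments: list of (base_address, length) tuples, one per address segment
        self.segments = segments

    def translate(self, destination_address):
        """Return the local physical address for a destination address."""
        for base, length in self.segments:
            if base <= destination_address < base + length:
                return destination_address - base
        raise ValueError("destination address not mapped")

# Example: one 1 GiB segment whose destination addresses start at 0x4000_0000.
decoder = ThirdAddressDecoder([(0x4000_0000, 1 << 30)])
assert decoder.translate(0x4000_1000) == 0x1000
```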
Optionally, the data access system may further include a configuration node, and the configuration node may configure the first address decoder, the second address decoder, and the third address decoder. Specifically, the configuration node is configured to obtain at least one local physical address of a memory of the second node from the second node, and the configuration node is configured to determine, according to the at least one local physical address, at least one corresponding destination address and configure the third address decoder; the configuration node is also used for configuring a second address decoder by combining a high-speed interconnection port between the second node and an interconnection chip according to at least one destination address; the configuration node is further configured to configure the first address decoder according to at least one destination address in combination with a chip port between the interconnect chip and the computing chip.
In a specific implementation, when the configuration node obtains at least one local physical address of the memory of the second node from the second node, the local physical address of the expanded memory, which is divided by the second node and used by the first node, may be determined according to the size of the memory of the second node and in combination with the service requirement. Optionally, the extended memory used by the first node may be a part of the memory of the second node, and the part of the extended memory may be processed by a memory isolation technology, so that the second node cannot access the part of the extended memory, and the security of data stored in the extended memory is improved.
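The configuration flow above can be sketched as follows; the function and parameter names are hypothetical, and the destination base address is an arbitrary assumed value:

```python
# Hypothetical sketch of the configuration node's job: map each local physical
# address range exported by the second node to a destination-address range, then
# derive the three decoder tables (third: dest -> local address; second: dest ->
# high-speed interconnect port; first: dest -> compute-chip port).

def configure_decoders(exported_ranges, hs_port_of_range, chip_port_of_range,
                       dest_base=0x1000_0000_0000):
    """exported_ranges: list of (local_base, length) obtained from the second node.
    hs_port_of_range / chip_port_of_range: hypothetical maps from range index to
    the high-speed interconnect port and compute-chip port serving that range."""
    first, second, third = [], [], []
    next_dest = dest_base
    for i, (local_base, length) in enumerate(exported_ranges):
        dest = next_dest
        third.append((dest, length, local_base))             # third address decoder
        second.append((dest, length, hs_port_of_range[i]))   # second address decoder
        first.append((dest, length, chip_port_of_range[i]))  # first address decoder
        next_dest += length
    return first, second, third

first, second, third = configure_decoders(
    [(0x0, 1 << 30)], {0: "hs_port_0"}, {0: "chip_port_0"})
assert third[0] == (0x1000_0000_0000, 1 << 30, 0x0)
assert second[0][2] == "hs_port_0"
```

The sketch mirrors the order stated in the text: the local physical addresses come first, the third decoder is configured from them, and the second and first decoders are then configured from the same destination addresses plus the port topology.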
By implementing this implementation mode, the first address decoder, the second address decoder and the third address decoder configured by the configuration node ensure that a data access request generated by the computing chip is addressed through the address decoders and transmitted to the CPU of the second node corresponding to the destination address for memory reading and writing. This avoids the waiting time for network card queue preparation and improves the efficiency with which the first node reads and writes the expanded memory: the time delay can even reach the microsecond level (Ethernet time delay is at the millisecond level), and the bandwidth can reach 400GB; compared with an RDMA network card whose bandwidth is only 100GB, this provides higher bandwidth and lower time delay.
In another possible implementation manner, when the first destination address is matched with the first address decoder and the second address decoder, the complete or partial first destination address may be matched with the address in the decoders, so as to improve the matching efficiency, and further improve the efficiency of data access.
Alternatively, the first port may be determined based on the base address and the length of the first destination address in the data access request. Specifically, the computing chip is specifically configured to match the base addresses and lengths of the destination addresses recorded in the first address decoder against the base address and length of the first destination address, and determine the first port corresponding to the matched destination address. Similarly, the interconnection chip is specifically configured to match the base addresses and lengths of the destination addresses recorded in the second address decoder against the base address and length of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address. This is not described in detail here.
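A sketch of base-and-length matching, with hypothetical entry and port names (the same lookup shape applies to the first and second address decoders alike):

```python
# Hypothetical model of matching a destination address against the (base, length)
# entries recorded in an address decoder to pick the outgoing port.

def match_port(decoder_entries, destination_address):
    """decoder_entries: list of (base, length, port) tuples."""
    for base, length, port in decoder_entries:
        if base <= destination_address < base + length:
            return port
    return None  # no decoder entry covers this destination address

entries = [(0x0000_0000, 1 << 30, "port_0"),
           (0x4000_0000, 1 << 30, "port_1")]
assert match_port(entries, 0x4000_0123) == "port_1"
```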
Alternatively, the first port may be determined based on the high-order address of the first destination address. The computing chip is specifically configured to match the high-order addresses of the destination addresses recorded in the first address decoder against the high-order address of the first destination address, and determine the first port corresponding to the matched destination address, where the number of bits of the high-order address is determined according to the memory size of the second node. Similarly, the interconnection chip is specifically configured to match the high-order addresses of the destination addresses recorded in the second address decoder against the high-order address of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address. This is not described in detail here.
For example, assuming that the total length of the destination address is 64 bits, if the memory of the second node 120 corresponding to one high-speed interconnection port is 1T, the last 30 bits of the destination addresses within that memory differ, so the number of bits of the high-order address may be 64 − 30 = 34 bits. In short, the first 34 bits of the destination addresses in the same memory are the same and the last 30 bits differ, so the number of bits of the high-order address may be determined according to the expanded memory size of the second node 120 connected to the high-speed interconnection port.
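The bit arithmetic in this example can be checked with a short sketch (names are hypothetical; the 30-bit offset width follows the numbers used in the example above):

```python
# Hypothetical check of the high-order-address scheme: if one high-speed
# interconnect port serves a memory whose offsets span the low 30 bits of a
# 64-bit destination address, the remaining high-order bits identify the port.

def high_order_bits(total_bits, low_offset_bits):
    """Width of the high-order address, given the in-memory offset width."""
    return total_bits - low_offset_bits

def port_key(destination_address, low_offset_bits=30):
    """Addresses in the same memory share the same high-order key."""
    return destination_address >> low_offset_bits

assert high_order_bits(64, 30) == 34
base = 72 << 30                 # some memory whose high-order key is 72
assert port_key(base + 0x100) == port_key(base + 0x2000) == 72
assert port_key(73 << 30) == 73  # the neighboring memory has a different key
```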
It should be understood that, since the expanded memory provided by the second node corresponds to multiple physical addresses, some destination addresses recorded in the first address decoder and the second address decoder may correspond to the same port. Destination addresses corresponding to the same port are located in the same memory and share the same base address and length, or the same high-order address, so the port corresponding to the first destination address can be determined by matching the base address and length, or by matching the high-order address.
By implementing the implementation mode, partial first destination addresses are matched with the addresses in the decoder, so that the matching efficiency can be improved, the determining efficiency of the first port and the first high-speed interconnection port can be improved, and the data access efficiency can be further improved.
It should be noted that, if the data access request reads data from the memory of the second node, after the second node processes the data access request, the read data may be returned to the first node along the original path according to the source address in the data access request, in combination with the first address decoder, the second address decoder, and the third address decoder, which is not repeated here.
It should be noted that, in some embodiments, the first node may not include an interconnection chip: a high-speed interconnection port on the computing chip is connected to a high-speed interconnection port on the second node through a cable, and the computing chip itself implements routing and addressing of the data access request through an address decoder. Specifically, the computing chip may be deployed with the second address decoder and the second node with the third address decoder; for a data access request generated by the computing chip, the first high-speed interconnection port corresponding to the first destination address is determined according to the correspondence between high-speed interconnection ports and destination addresses recorded in the second address decoder, and the data access request is then sent to the second node through the first high-speed interconnection port, which is not described in detail here.
In a second aspect, a data access method is provided, which is applied to a data access system, the data access system includes a first node and a second node, the first node and the second node are connected by a cable, the method includes the following steps: the first node generates a data access request, wherein the data access request is used for requesting data in a memory of the second node, the first node sends the data access request to the second node through a cable, the second node converts a first destination address in the data access request into a local physical address corresponding to the first destination address, and accesses the data in the memory of the second node according to the local physical address.
In the method described in the second aspect, the first node and the second node are connected by a cable, and communication interaction between the first node and the second node does not need to pass through a network card or a routing device, so that the first node does not need to wait for extra preparation time of a network card queue unit when accessing the memory of the second node, thereby improving the efficiency of accessing the memory of the second node by the first node and reducing the access delay.
In a possible implementation manner, the first node includes a computing chip and an interconnection chip, where a first high-speed interconnection port of the interconnection chip is connected to a second high-speed interconnection port of a processor in the second node through a cable, the computing chip may generate a data access request and send the data access request to the interconnection chip, and the interconnection chip sends the data access request to the second node through the cable.
In a possible implementation manner, the computing chip is connected to the interconnect chip through a port, the computing chip includes a first address decoder, the computing chip can generate a data access request, the first port is determined according to a first destination address and the first address decoder in the data access request, and the data access request is sent to the interconnect chip through the first port, where the first address decoder is used to record a corresponding relationship between the destination address and the port of the computing chip.
In a possible implementation manner, the interconnection chip includes a second address decoder, the interconnection chip determines the first high-speed interconnection port according to the first destination address and the second address decoder, and sends a data access request to the second node through the first high-speed interconnection port, where the second address decoder is configured to record a corresponding relationship between the destination address and the high-speed interconnection port.
In a possible implementation manner, the second node includes a third address decoder, and the second node determines, according to the first destination address and the third address decoder, a local physical address corresponding to the first destination address, where the third address decoder is configured to record a correspondence between the destination address and the local physical address.
In a possible implementation manner, the data access system further includes a configuration node, and the method further includes the following steps: the configuration node acquires at least one local physical address of a memory of the second node from the second node, determines at least one corresponding destination address according to the at least one local physical address, configures the third address decoder, configures the second address decoder by combining a high-speed interconnection port between the second node and the interconnection chip according to the at least one destination address, and configures the first address decoder by combining a chip port between the interconnection chip and the computing chip according to the at least one destination address.
In a possible implementation manner, the computing chip matches the base address and the length of the destination address recorded in the first address decoder with the base address and the length of the first destination address, determines a first port corresponding to the matched destination address, and the interconnection chip matches the base address and the length of the destination address recorded in the second address decoder with the base address and the length of the first destination address, and determines a first high-speed interconnection port corresponding to the matched destination address.
In a possible implementation manner, the computing chip matches the high-order address of the destination address recorded in the first address decoder with the high-order address of the first destination address, and determines the first port corresponding to the matched destination address, where the number of bits of the high-order address is determined according to the memory size of the second node; the interconnection chip matches the high-order address of the destination address recorded in the second address decoder with the high-order address of the first destination address, and determines the first high-speed interconnection port corresponding to the matched destination address.
In a possible implementation manner, the first high-speed interconnection port and the second high-speed interconnection port are high-speed serial bus ports, and the first port is a high-speed serial bus port.
In a third aspect, there is provided a computing node, which may be the first node described in the first and second aspects, the computing node being applied in a data access system, the data access system further comprising a storage node, the computing node comprising: the system comprises a computing chip and an interconnection chip, wherein the computing chip is connected with the interconnection chip through a high-speed interconnection port, and the interconnection chip is connected with other nodes through high-speed interconnection ports and cables; the computing chip is used for generating a data access request and sending the data access request to the interconnection chip, wherein the data access request comprises a first destination address, and the first destination address indicates the position of a memory in the storage node; the interconnection chip is used for sending the data access request to the storage node according to the first destination address.
In a fourth aspect, a storage node is provided, which may be the second node described in the first aspect and the second aspect, where the storage node is applied in a data access system, the data access system further includes a computing node, the storage node includes a processor and a memory, and the storage node is connected to the computing node through a high-speed interconnect port of the processor and a cable; the processor is configured to receive a data access request sent by the compute node through the high-speed interconnect port, convert a first destination address carried in the data access request into a local physical address of the storage node corresponding to the first destination address, and access data in the memory according to the local physical address.
In a fifth aspect, a computing device is provided, which comprises a processor and a memory, the memory storing code, the processor comprising functionality to perform the respective modules of the first aspect or any one of its possible implementations, as implemented by the first node or the second node.
In a sixth aspect, a computer-readable storage medium is provided, having stored therein instructions, which, when run on a computer, cause the computer to perform the method of the above aspects.
The present application can further combine to provide more implementations on the basis of the implementations provided by the above aspects.
Drawings
FIG. 1 is a schematic diagram of a data access system provided herein;
FIG. 2 is a schematic deployment diagram of a first node and a second node in an application scenario provided in the present application;
FIG. 3 is a schematic diagram of another data access system provided herein;
FIG. 4 is a diagram of an example of a first address decoder provided herein;
FIG. 5 is a flow chart illustrating steps of a data access method provided herein;
FIG. 6 is a schematic structural diagram of a compute node provided herein;
FIG. 7 is a schematic structural diagram of a storage node provided in the present application;
FIG. 8 is a schematic structural diagram of a computing device provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
First, the present application will be described with reference to an application scenario.
When executing some big data or AI tasks, a computing node needs a large memory capacity to store data and sparsely accesses the data at small granularity. For example, in a recommendation system architecture, 10TB to 20TB of memory is needed to store the data, the access granularity may be only 64 bytes or 128 bytes, and the access positions are highly random. This makes the data access efficiency of big data or AI tasks low and the network delay high, affecting the processing efficiency of these tasks.
In general, in order to increase the memory capacity, data may be placed in memories of multiple storage nodes in a distributed manner, and a computing node may read data in memories of the storage nodes through a Remote Direct Memory Access (RDMA) protocol, so as to implement expansion of the memory capacity.
However, under the RDMA protocol, communication between the computing node and the storage node is implemented by network cards, and data transmission between the two network cards is performed through network card queues, so every time the computing node reads data, the data reading request must first be placed into a network card queue, which causes a large amount of time to be spent on queue-unit preparation during the read; in some cases the queue-unit preparation time is even longer than the data transmission time. This makes the data access efficiency of the computing node low and the network delay high, affecting data processing efficiency.
To improve the speed of accessing memory, PCIe memory devices can be added on the memory bus of the computing device, but the number of PCIe devices that can be added is limited, and buses such as QPI and UPI have even weaker expansion capabilities, so memory can generally be expanded to at most about 1 TB. The expanded memory therefore still cannot reach the magnitude required by big data or AI tasks.
In summary, when a computing node executes a data processing task, it needs a large memory capacity to store data. The commonly used RDMA method can expand memory to the required magnitude, but the data access efficiency of the computing node is low; expanding memory with PCIe devices improves access efficiency, but the expansion capability is weak, and the expanded memory still cannot meet the requirement. Consequently, the data access efficiency of big data or AI tasks is low, the network delay is high, and the processing efficiency of these tasks is affected.
Fig. 1 is a schematic structural diagram of a data access system provided in the present application. The data access system comprises a first node 110 and a second node 120, wherein the first node 110 and the second node 120 are connected through a cable 140, and specifically, a high-speed interconnection port 130 of the first node 110 is connected with a high-speed interconnection port 130 of a processor in the second node 120 through a cable. It should be understood that the number of the second nodes 120 in fig. 1 is for illustration, and the number of the second nodes 120 is not limited in this application. For the sake of convenience of distinction, the high-speed interconnect port 130 in the first node 110 will be referred to as a first high-speed interconnect port, and the high-speed interconnect port 130 in the second node 120 will be referred to as a second high-speed interconnect port.
The first node 110 and the second node 120 may be physical servers, such as X86 servers or ARM servers; or Virtual Machines (VMs) implemented based on general physical servers and Network Functions Virtualization (NFV) technology, where a VM refers to a complete computer system that has complete hardware system functions and runs in a fully isolated environment, such as a virtual device in cloud computing; this application is not specifically limited in this respect. The first node 110 and the second node 120 may also each be a server cluster composed of multiple servers, which may be the aforementioned physical servers or virtual machines.
High-speed interconnect port 130 may be a high-speed serial bus port, such as a SERDES bus port, and cable 140 may be a cable, an optical fiber, a twisted pair cable, etc. capable of transmitting data, and this application does not specifically limit cable 140. The number of the high-speed interconnect ports 130 on the first node 110 may be one or more, and the first high-speed interconnect ports on the first node 110 and the second high-speed interconnect ports on the second node are in a one-to-one correspondence relationship, and fig. 1 exemplifies 3 ports, which is not limited in this application.
It should be noted that the first high-speed interconnect port of the first node 110 and the second high-speed interconnect port of the processor of the second node 120 are connected by a cable. The number of processors in the second node 120 may be one or more; when the second node 120 has multiple processors, the first node 110 may be connected to different processors of the second node through different high-speed interconnect ports. For example, when the second node 4 includes a processor 4, a processor 5, and a processor 6, the high-speed interconnect port 1 of the first node 110 may be connected to the processor 4, the high-speed interconnect port 2 of the first node 110 may be connected to the processor 5, and data in the memories corresponding to the different processors is read through the different high-speed interconnect ports. It should be appreciated that the second node 120 may keep at least one processor unconnected to the first node 110, thereby ensuring that, after the second node 120 provides part of its memory to the first node 110, the processing of other services by the second node 120 is not affected.
The first node 110 is used to process data tasks, such as big data or AI tasks as described above. The second node 120 is configured to store data, the second node 120 may divide a part of the memory into an extended memory of the first node 110, and the extended memory is used by the first node 110, and the first node 110 may read data from the extended memory divided by the second node 120 through the data access system shown in fig. 1, and perform processing on big data or an AI task, thereby implementing memory extension of the first node 110.
In an application scenario, as shown in fig. 2, the first node 110 and the second node 120 may be deployed in the same cabinet, and a first high-speed interconnect port of the first node 110 is directly connected to a second high-speed interconnect port of the second node 120. All servers in the whole cabinet can communicate without a switch or a network card, so that the purpose that the first node 110 reads data from the memory of the second node 120 is achieved.
The first node 110 may be an AI server, and the second node 120 may be a 2P server, that is, a server with two CPUs. Each 2P server has 16 channels, and each channel can mount 2 64 GB memory banks, so each 2P server can extend 64 GB × 2 × 16 = 2 TB of memory for the first node 110; about 10 such 2P servers can therefore meet the memory extension requirement of 10 TB to 20 TB. Moreover, because the AI server and the 2P servers are of standard rack height, one AI server and 8 to 10 2P servers can fit exactly in one cabinet, so a rack server formed by one cabinet can have 10 TB to 20 TB of memory, meeting the memory requirement of the AI server for data processing in most application scenarios. It should be understood that fig. 2 is for illustration purposes and is not intended to be limiting in this application.
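The capacity arithmetic in this example can be checked with a short calculation (a sketch; the DIMM size, channel count, and server counts come from the example above and are not fixed by the design):

```python
# Capacity check for the rack example above.
DIMM_GB = 64           # one memory bank (DIMM)
DIMMS_PER_CHANNEL = 2  # each channel can mount 2 DIMMs
CHANNELS = 16          # channels per 2P server

per_server_gb = DIMM_GB * DIMMS_PER_CHANNEL * CHANNELS
per_server_tb = per_server_gb / 1024   # GB -> TB
print(per_server_tb)                   # 2.0 TB of extended memory per 2P server

# 8 to 10 such servers fit in one cabinet alongside the AI server:
print(8 * per_server_tb, 10 * per_server_tb)   # 16.0 to 20.0 TB in total
```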
In a specific implementation, the first node 110 may generate a data access request, where the data access request is used to request data in a memory of the second node, the first node 110 may send the data access request to the second node 120 through the cable 140, and the second node may convert a first destination address in the data access request into a local physical address corresponding to the destination address, and access the data in the memory of the second node according to the local physical address.
Exemplarily, as shown in fig. 3, fig. 3 is a schematic structural diagram of another data access system provided in the present application. The first node 110 may include a computing chip 111, an interconnect chip 112, a port 113 of the computing chip, and a bus 114, where the port 113 of the computing chip 111 and the interconnect chip 112 communicate through the bus 114. For clarity of the connections in the drawing, fig. 3 does not depict the ports on the interconnect chip 112, but in a specific implementation the interconnect chip 112 may also have corresponding ports. It should be understood that fig. 3 shows only an exemplary division manner, and each module unit may be combined or split into more or fewer module units, which is not limited in the present application.
The port 113 of the computing chip may be a high-speed serial bus port, such as a SERDES bus port; the bus 114 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus; and the computing chip 111, the interconnect chip 112, the port 113 of the computing chip, and the bus 114 may be printed on a circuit board during manufacturing. In a specific implementation, the number of ports 113 of the computing chip may be one or more; fig. 3 illustrates two ports (port 0 and port 1) as an example, which is not limited in this application.
The computing chip 111 may be constituted by at least one general-purpose processor, such as a CPU or an NPU, or a combination of a CPU and hardware chips. The hardware chip may be an Application-Specific Integrated Circuit (ASIC), a Programmable Logic Device (PLD), or a combination thereof. The PLD may be a Complex Programmable Logic Device (CPLD), a Field-Programmable Gate Array (FPGA), Generic Array Logic (GAL), or any combination thereof. The computing chip 111 executes various types of digital storage instructions that enable the first node 110 to provide a wide variety of services. The number of computing chips 111 in the first node 110 may be one or more; fig. 3 illustrates one computing chip 111 as an example, which is not limited in this application.
The interconnection chip 112 may be an ASIC, a PLD, or a combination thereof, and the PLD may be a CPLD, an FPGA, a GAL, or any combination thereof, which is not specifically limited in this application. The number of the interconnect chips 112 in the first node 110 may be multiple, and fig. 3 illustrates 2 interconnect chips 112 (interconnect chip 1 and interconnect chip 2) as an example, which is not limited in this application.
A high-speed interconnect port 130 is disposed on the interconnect chip 112, the interconnect chip 112 may perform data communication with the second node 120 through the high-speed interconnect port 130, and a first high-speed interconnect port of the interconnect chip 112 is connected to a second high-speed interconnect port on the second node 120 through a cable 140, where the description of the high-speed interconnect port 130 and the cable 140 may refer to the foregoing embodiment of fig. 1 and fig. 2, and repeated description is omitted here. It should be noted that the number of the first high-speed interconnect ports on each interconnect chip 112 may be one or more, each first high-speed interconnect port is in a one-to-one correspondence with the second high-speed interconnect port on the second node, and fig. 3 exemplifies that the number of the interconnect chips 112 is 2, that is, the interconnect chip 1 includes a high-speed interconnect port 2 and a high-speed interconnect port 3, and the interconnect chip 2 includes a high-speed interconnect port 4 and a high-speed interconnect port 5, which is not specifically limited in this application.
The description of the second node 120 may refer to the embodiments in fig. 1 and fig. 2, which are not repeated herein, where the number of the second nodes 120 may be one or more, and fig. 3 exemplifies 4 second nodes (second nodes 1 to 4), which is not specifically limited in this application.
In this embodiment of the application, the computing chip 111 is configured to generate the data access request and send it to the interconnect chip 112; in a specific implementation, the computing chip 111 may send the data access request to the interconnect chip 112 through the port 113 of the computing chip. The interconnect chip 112 is configured to send the data access request to the second node 120 through the cable 140; in a specific implementation, the interconnect chip 112 sends the data access request to a second high-speed interconnect port of the second node 120 through the first high-speed interconnect port 130. It can be understood that deploying the interconnect chip 112 in the first node 110 enables the first node 110 to communicate with more second nodes 120: the greater the number of interconnect chips 112, the greater the number of high-speed interconnect ports 130 that can be deployed in the first node 110, and the greater the number of second nodes 120 that can be connected to the first node 110, thereby increasing the memory expansion capability of the first node 110 and making it suitable for more application scenarios.
In the embodiment of the present application, the data communication between the computing chip 111, the interconnect chip 112 and the second node 120 may implement an addressing function through an address decoder. The address decoders in the compute chip 111, interconnect chip 112, and second node 120 are described in detail below in conjunction with FIG. 3.
In an embodiment, as shown in fig. 3, a first address decoder 210 is disposed in the computing chip 111, and the computing chip 111 is specifically configured to generate a data access request, determine a first port according to a first destination address in the data access request and the first address decoder 210, and send the data access request to the interconnect chip 112 through the first port, where the first address decoder 210 may record a corresponding relationship between the destination address and a port of the computing chip.
In an embodiment, as shown in fig. 3, a second address decoder 220 is disposed in the interconnect chip 112, and the interconnect chip 112 is specifically configured to determine a first high-speed interconnect port according to the first destination address and the second address decoder 220, and send a data access request to the second node 120 through the first high-speed interconnect port, where the second address decoder 220 is configured to record a corresponding relationship between the destination address and the high-speed interconnect port.
In an embodiment, as shown in fig. 3, a third address decoder 230 is disposed in the second node 120, and the second node 120 is specifically configured to determine a local physical address corresponding to the first destination address according to the first destination address and the third address decoder, where the third address decoder 230 is configured to record a correspondence between the destination address and the local physical address.
In an embodiment, as shown in fig. 3, the data access system may further include a configuration node 150, and the configuration node 150 may configure the first address decoder 210, the second address decoder 220, and the third address decoder 230. Specifically, the configuration node 150 is configured to obtain at least one local physical address of the memory of the second node from the second node 120, and the configuration node is configured to determine, according to the at least one local physical address, at least one corresponding destination address and configure the third address decoder; the configuration node is also used for configuring a second address decoder by combining a high-speed interconnection port between the second node and an interconnection chip according to at least one destination address; the configuration node is further configured to configure the first address decoder according to at least one destination address in combination with a chip port between the interconnect chip and the computing chip.
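The configuration flow just described — the configuration node obtains the exported local physical addresses, determines corresponding destination addresses, and programs the three decoders — can be sketched as follows. All names, the tuple layout, and the destination window base address are hypothetical, not taken from the patent; the sketch simply lays the exported regions out contiguously in one global destination window:

```python
# Illustrative sketch of the configuration node building the three decoder tables.
def configure_decoders(exported_regions, dest_window_base=0x1000_0000_0000):
    """exported_regions: list of (local_phys_base, length, hs_port, chip_port),
    one entry per extended-memory region provided by a second node."""
    first, second, third = [], [], []   # tables for the three address decoders
    dest = dest_window_base
    for local_base, length, hs_port, chip_port in exported_regions:
        third.append((dest, length, local_base))   # dest -> local physical address
        second.append((dest, length, hs_port))     # dest -> high-speed interconnect port
        first.append((dest, length, chip_port))    # dest -> port of the computing chip
        dest += length                             # regions laid out back to back
    return first, second, third

# Two hypothetical exported regions: (local base, length, HS port, chip port).
first, second, third = configure_decoders([(0x0000, 0x100, 5, 1),
                                           (0x8000, 0x200, 4, 0)])
print(second[0])   # first region is reachable through high-speed interconnect port 5
```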
It can be understood that the first, second, and third address decoders configured by the configuration node 150 ensure that the data access request generated by the computing chip is routed through the address decoders and transmitted to the CPU of the second node corresponding to the destination address for memory read/write. This avoids the waiting time for network card queue preparation and improves the efficiency with which the first node 110 reads from and writes to the expanded memory: the latency can even reach the microsecond level (Ethernet latency is typically at the millisecond level), and the bandwidth can reach 400 GB, giving higher bandwidth and lower latency than an RDMA network card whose bandwidth is only 100 GB.
In an embodiment, when the configuration node 150 obtains at least one local physical address of the memory of the second node from the second node 120, the local physical address of the expanded memory, which is divided by the second node 120 and used by the first node 110, may be determined according to the size of the memory of the second node 120 and by combining with the service requirement.
Optionally, the extended memory used by the first node 110 may be a part of the memory of the second node 120, and the part of the extended memory may be processed by a memory isolation technology, so that the second node 120 cannot access the part of the extended memory, thereby improving the security of data stored in the extended memory.
Alternatively, the correspondence relationship recorded by the third address decoder 230 may be: local physical address = destination address − base address, where the base address refers to the starting address of an address segment, also called the head address or segment address; destination addresses belonging to the same address segment have the same base address.
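Applied directly, this rule can be sketched in code (illustrative; the segment table and segment size are hypothetical):

```python
# "local physical address = destination address - base address", applied to
# whichever address segment contains the destination address.
def to_local_physical(dest_addr, segments):
    """segments: list of (base, length) address segments exported by this node."""
    for base, length in segments:
        if base <= dest_addr < base + length:
            return dest_addr - base   # the rule recorded by the third decoder
    raise LookupError("destination address not exported by this node")

segments = [(0x2000_0000, 0x1000_0000)]   # one hypothetical 256 MB segment
print(hex(to_local_physical(0x2000_0040, segments)))   # 0x40
```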
In an embodiment, when the first destination address is matched with the first address decoder 210 and the second address decoder 220, the whole or part of the first destination address may be matched with the addresses in the decoders, so as to improve the matching efficiency, and further improve the efficiency of data access.
Alternatively, the first port may be determined based on the base address and length of the first destination address in the data access request. Specifically, the computing chip 111 is configured to match the base address and length of each destination address recorded in the first address decoder 210 with the base address and length of the first destination address, and determine the first port corresponding to the matched destination address. Similarly, the interconnect chip 112 is configured to match the base address and length of each destination address recorded in the second address decoder 220 with the base address and length of the first destination address, and determine the first high-speed interconnect port corresponding to the matched destination address. Details are not repeated here.
Alternatively, the first port may be determined based on the upper address of the first destination address. The computing chip 111 is specifically configured to match a high-order address of the destination address recorded in the first address decoder with a high-order address of the first destination address, and determine a first port corresponding to the matched destination address, where a bit number of the high-order address is determined according to a memory size of the second node. Similarly, the interconnect chip 112 is specifically configured to match the upper address of the destination address recorded in the second address decoder with the upper address of the first destination address, and determine the first high-speed interconnect port corresponding to the matched destination address. And will not be described in detail herein.
For example, assume that the total length of the destination address is 64 bits. If the memory of the second node 120 corresponding to one high-speed interconnect port is 1 TB, i.e., 2^40 bytes, then the last 40 bits of the destination addresses within that 1 TB memory differ, and the number of bits of the high-order address may be 64 − 40 = 24 bits. In short, the first 24 bits of the destination addresses within the same memory are the same and the last 40 bits differ, so the number of bits of the high-order address can be determined according to the size of the expanded memory of the second node 120 connected to the high-speed interconnect port.
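Assuming byte addressing, the relationship between the expanded-memory window size and the number of high-order address bits can be computed with a small helper (a sketch; under this assumption a 1 TB window occupies 40 offset bits and a 1 GB window 30):

```python
# High-order bit count for a byte-addressed memory window within a
# fixed-width destination address.
def high_order_bits(total_addr_bits, window_bytes):
    offset_bits = (window_bytes - 1).bit_length()  # low bits that vary in-window
    return total_addr_bits - offset_bits

print(high_order_bits(64, 1 << 40))   # 1 TB window -> 24 high-order bits
print(high_order_bits(64, 1 << 30))   # 1 GB window -> 34 high-order bits
```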
It should be understood that, since the expanded memory provided by the second node 120 corresponds to multiple physical addresses, some of the destination addresses recorded in the first and second decoders may correspond to the same port. Destination addresses corresponding to the same port are located in the same memory and therefore have the same base address and length, or the same high-order address, so the port corresponding to the first destination address can be determined either by matching the base address and length or by matching the high-order address.
Still taking the data access system shown in fig. 3 as an example, assuming that the number of physical addresses corresponding to the extended memory provided by the second node 4 is 4, and the third address decoder records destination addresses 1 to 10, the second address decoder may record the destination addresses 1 to 10 corresponding to the high-speed interconnect port 5, if the first destination address is any one of the destination addresses 1 to 10, the corresponding high-speed interconnect ports are all the high-speed interconnect ports 5, and the destination addresses corresponding to the same high-speed interconnect ports are the addresses of the extended memory for the second node 4, so that the base addresses and the lengths of the destination addresses are the same, or the high-order addresses are the same. When the high-speed interconnection port corresponding to the first destination address is determined, the partial address of the first destination address can be matched with the partial address of the destination address in the second address decoder, so that the matching efficiency is improved.
For example, fig. 4 is an exemplary diagram of the first address decoder 210. As shown in fig. 4, the first address decoder 210 may include a plurality of destination addresses, and destination addresses with the same base address and length correspond to the same port of the computing chip 111. Assuming the base address and length of the first destination address are as shown in fig. 4, they can be matched against the base address and length of each destination address in the first address decoder 210 to determine that the first port corresponding to the matched destination address is port 2, and the data access request can then be transmitted to the interconnect chip 112 through port 2. Similarly, the high-speed interconnect port for transmitting the data access request is determined according to the base address and length of the destination addresses recorded by the second address decoder 220, and the description is not repeated here. It should be understood that fig. 4 matches the first destination address based on the base address and length; matching based on the high-order address is similar and is not illustrated here.
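The base-address-and-length match described above can be sketched as follows (illustrative; the decoder table contents and addresses are hypothetical, not taken from fig. 4):

```python
# A hypothetical first-address-decoder table: (base address, length, chip port).
FIRST_DECODER = [
    (0x1000_0000_0000, 0x200_0000_0000, 0),
    (0x3000_0000_0000, 0x200_0000_0000, 2),
]

def match_port(dest_addr, decoder=FIRST_DECODER):
    # Match the destination address against each recorded (base, length) range
    # and return the port of the computing chip recorded for the matched range.
    for base, length, port in decoder:
        if base <= dest_addr < base + length:
            return port
    raise LookupError("no decoder entry covers this destination address")

print(match_port(0x3000_0000_0040))   # falls in the second entry -> port 2
```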
It should be noted that the data access system shown in fig. 1 can also implement routing of data access requests through the address decoder. Specifically, the first node 110 may be deployed with a second address decoder 220, the second node 120 is deployed with a third address decoder 230, and the data access request generated by the first node 110 may determine, according to the correspondence between the high-speed interconnection port recorded in the second address decoder 220 and the destination address, a first high-speed interconnection port corresponding to the first destination address, and then send the data access request to the second node 120 through the first high-speed interconnection port, which is not described herein again. It should be understood that in the data access system shown in fig. 1, the high-speed interconnect port may be disposed on a processor in the first node 110, that is, the processor of the first node 110 is directly connected to the processor of the second node 120 through a cable.
It should be noted that, if the data access request is to read data from the memory of the second node 120, then after the second node 120 processes the data access request, the read data may be returned to the first node 110 along the original path, using the first, second, and third address decoders together with the source address in the data access request, which is not repeated here.
To sum up, according to the data access system provided by the application, the high-speed interconnection port of the first node is connected with the high-speed interconnection port of the second node through a cable, the first node can be combined with an address decoder to realize an addressing function, so that a data access request is sent to the memory of the second node corresponding to the first destination address, and the memory expansion of the first node is realized.
The data access method provided by the present application is explained below with reference to fig. 5. Fig. 5 is a data access method provided in the present application, which may be applied to the data access systems shown in fig. 1 to fig. 4, and the method may include the following steps:
step S510: the first node generates a data access request, wherein the data access request is used for requesting data in the memory of the second node. The description of the first node may refer to the embodiments of fig. 1 to fig. 4, and is not repeated here.
Step S520: the first node sends a data access request to the second node over the cable. It should be understood that the first high-speed interconnect port of the first node is connected to the second high-speed interconnect port of the second node through a cable, where the description of the high-speed interconnect port and the cable may refer to the embodiments in fig. 1 to fig. 4, and repeated description is omitted here.
In an embodiment, the first node may include a second address decoder, where the second address decoder is configured to record a correspondence between a destination address and a high-speed interconnect port, and the first node may determine, according to a first destination address in the data access request and the second address decoder, a first high-speed interconnect port corresponding to the first destination address, and then send the data access request to the second node through the first high-speed interconnect port. The specific description of the second address decoder may refer to the embodiments in fig. 1 to fig. 4, and is not repeated here.
Step S530: and the second node converts the first destination address in the data access request into a local physical address corresponding to the first destination address, and accesses the data in the memory of the second node according to the local physical address.
In an embodiment, the second node may include a third address decoder, where the third address decoder is configured to record a correspondence between the destination address and the local physical address, and the second node may determine, according to the first destination address and the third decoder in the data access request, the local physical address corresponding to the first destination address, and then access the data in the memory of the second node according to the local physical address. For the detailed description of the third address decoder, reference may be made to the embodiments in fig. 1 to 4, which is not repeated herein.
In an embodiment, the first node may include a computing chip and an interconnection chip, the computing chip is connected to the interconnection chip through a port, and the specific connection manner may be a bus in the foregoing, where the description of the first node, the second node, the computing chip, the interconnection chip, the port, and the bus may refer to the embodiments in fig. 1 to 4, and details are not repeated here.
In a specific implementation, the computing chip may execute step S510 to generate the data access request, and send the data access request to the interconnect chip, and the computing chip may send the data access request to the interconnect chip through a port of the computing chip. And the interconnection chip sends the data access request to the second node through the first high-speed interconnection port.
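The two routing stages plus the final address translation (steps S510 to S530) can be sketched end to end as follows. All decoder tables, port names, and addresses are illustrative, and the third decoder is shown here as mapping a destination range to a local physical base address:

```python
def lookup(table, addr):
    """table: list of (base, length, value); range-match addr to its value."""
    for base, length, value in table:
        if base <= addr < base + length:
            return value
    raise LookupError("address not mapped")

def translate(table, addr):
    """table: list of (base, length, local_base); dest -> local physical address."""
    for base, length, local_base in table:
        if base <= addr < base + length:
            return local_base + (addr - base)
    raise LookupError("address not mapped")

# Hypothetical single-region tables for the three decoders.
first_decoder  = [(0x1000, 0x1000, "chip port 0")]   # computing chip stage
second_decoder = [(0x1000, 0x1000, "HS port 2")]     # interconnect chip stage
third_decoder  = [(0x1000, 0x1000, 0x9000)]          # second node stage

def access(dest_addr):
    chip_port = lookup(first_decoder, dest_addr)     # S510: pick chip port
    hs_port = lookup(second_decoder, dest_addr)      # S520: pick HS interconnect port
    local = translate(third_decoder, dest_addr)      # S530: local physical address
    return chip_port, hs_port, local

print(access(0x1040))   # ('chip port 0', 'HS port 2', 36928)
```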
It can be understood that, by deploying interconnection chips in a first node, the first node can communicate with more second nodes, and as the number of interconnection chips is larger, the number of high-speed interconnection ports deployable in the first node is larger, so that the number of second nodes connected to the first node is larger, thereby expanding the memory expansion capability of the first node, and enabling the first node to be applicable to more application scenarios.
In an embodiment, a first address decoder is deployed in a computing chip, after the computing chip generates a data access request, a first port is determined according to a first destination address and the first address decoder in the data access request, and the data access request is sent to an interconnection chip through the first port, where the first address decoder may record a correspondence between the destination address and a port of the computing chip.
In an embodiment, a second address decoder is disposed in an interconnect chip, the interconnect chip may determine a first high-speed interconnect port according to a first destination address and the second address decoder, and send a data access request to a second node through the first high-speed interconnect port, where the second address decoder is configured to record a correspondence between the destination address and the high-speed interconnect port.
In an embodiment, the data access system may further include a configuration node, and the configuration node may configure the first address decoder, the second address decoder, and the third address decoder before the first node generates the data access request. Specifically, the configuration node acquires at least one local physical address of a memory of the second node from the second node, determines at least one corresponding destination address according to the at least one local physical address, and configures the third address decoder; according to at least one destination address, configuring a second address decoder by combining a high-speed interconnection port between a second node and an interconnection chip; and configuring the first address decoder by combining a chip port between the interconnection chip and the calculation chip according to at least one destination address.
It can be understood that the first, second, and third address decoders configured by the configuration node ensure that the data access request generated by the computing chip is routed by the address decoders and transmitted to the CPU of the second node corresponding to the destination address for memory read/write. This avoids the waiting time for network card queue preparation and improves the efficiency with which the first node reads from and writes to the expanded memory: the latency can even reach the microsecond level (Ethernet latency is typically at the millisecond level), and the bandwidth can reach 400 GB, giving higher bandwidth and lower latency than an RDMA network card whose bandwidth is only 100 GB.
In an embodiment, when the configuration node obtains at least one local physical address of the memory of the second node from the second node 120, the local physical address of the expanded memory, which is divided by the second node and is used by the first node 110, may be determined according to the size of the memory of the second node and by combining with the service requirement.
Optionally, the extended memory used by the first node may be a part of the memory of the second node, and the part of the extended memory may be processed by a memory isolation technology, so that the second node cannot access the part of the extended memory, and the security of data stored in the extended memory is improved.
Alternatively, the correspondence recorded by the third address decoder may be: local physical address = destination address − base address, where the base address refers to the starting address of an address segment, also called the head address or segment address; destination addresses belonging to the same address segment have the same base address.
In an embodiment, when the first destination address is matched against the first address decoder and the second address decoder, either the whole or a part of the first destination address may be matched against the addresses in the decoders, which improves matching efficiency and thus the efficiency of data access.
Alternatively, the first port may be determined based on the base address and length of the first destination address in the data access request. Specifically, the computing chip may match the base addresses and lengths of the destination addresses recorded in the first address decoder against the base address and length of the first destination address, and determine the first port corresponding to the matched destination address. Similarly, the interconnection chip may match the base addresses and lengths of the destination addresses recorded in the second address decoder against the base address and length of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
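Base-and-length matching can be sketched as follows (the decoder table and port names are hypothetical examples, not part of this application): a destination address matches an entry when it lies in the half-open range [base, base + length), and the matched entry yields the port toward the interconnection chip.

```python
# Hypothetical first address decoder: (base address, length, port) entries.
FIRST_DECODER = [
    (0x8000_0000, 0x1000_0000, "port0"),
    (0x9000_0000, 0x1000_0000, "port1"),
]

def match_port(dest_addr, decoder):
    """Return the port whose [base, base + length) range covers dest_addr."""
    for base, length, port in decoder:
        if base <= dest_addr < base + length:
            return port
    return None  # no entry matched
```

The second address decoder in the interconnection chip works the same way, except that its entries map destination address ranges to high-speed interconnection ports instead of chip ports.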
Alternatively, the first port may be determined based on the high-order address of the first destination address. The computing chip may match the high-order addresses of the destination addresses recorded in the first address decoder against the high-order address of the first destination address, and determine the first port corresponding to the matched destination address, where the number of bits of the high-order address is determined according to the memory size of the second node. Similarly, the interconnection chip may match the high-order addresses of the destination addresses recorded in the second address decoder against the high-order address of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
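High-order matching can be sketched as follows (the bit width, table, and port names are hypothetical examples, not part of this application): if the second node exposes, say, 256 MiB (2**28 bytes) of expanded memory, the low 28 bits index into that memory, and the remaining high-order bits alone select the port.

```python
# Hypothetical high-order-address decoder.
MEM_BITS = 28  # bit width of the low part, set by the second node's memory size

HIGH_DECODER = {
    0x8000_0000 >> MEM_BITS: "hs_port0",  # high-order bits -> port
    0x9000_0000 >> MEM_BITS: "hs_port1",
}

def match_port_high(dest_addr):
    """Select a port by comparing only the high-order address bits."""
    return HIGH_DECODER.get(dest_addr >> MEM_BITS)
```

Compared with base-and-length matching, this reduces each lookup to a single shift and compare, at the cost of requiring the segments to be aligned to the memory size.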
It should be understood that, because the expanded memory provided by the second node corresponds to multiple physical addresses, some destination addresses recorded in the first decoder and the second decoder may correspond to the same port. Destination addresses corresponding to the same port are located in the same memory and share the same base address and length, or share the same high-order address; therefore, the port corresponding to the first destination address can be determined by matching the base address and length, or by matching the high-order address, which improves matching efficiency.
It should be noted that, for a detailed description of the manner of matching by the base address and the length, reference may be made to the foregoing illustration of the embodiment in fig. 4, and details are not repeated here.
It should be noted that, if the data access request is to read data from the memory of the second node, then after the second node processes the request, the read data may be returned to the first node along the original path, using the first, second, and third address decoders together with the source address in the data access request; details are not repeated here.
To sum up, in the data access method provided by this application, the high-speed interconnection port of the first node is connected to the high-speed interconnection port of the second node through a cable, and the first node implements an addressing function with the aid of the address decoders, so that the data access request is sent to the memory of the second node corresponding to the first destination address, thereby expanding the memory of the first node.
Fig. 6 is a schematic structural diagram of a computing node 600 provided in the present application, where the computing node 600 may be the first node 110 in the foregoing, and the computing node 600 may include a computing chip 111 and an interconnect chip 112, where the computing chip 111 may include a generating unit 1111, a first matching unit 1112, and a second sending unit 1113, and the interconnect chip 112 may include a first sending unit 1121 and a second matching unit 1122.
A generating unit 1111, configured to generate a data access request, where the data access request is used to request data in a memory of a second node, and specifically step S510 in the embodiment in fig. 5 may be executed;
the first sending unit 1121 is configured to send the data access request to the second node through the cable, so that the second node converts the first destination address in the data access request into a local physical address corresponding to the first destination address, and accesses data in the memory of the second node according to the local physical address, and specifically may perform step S520 in the embodiment of fig. 5.
In an embodiment, a first high-speed interconnect port of the interconnect chip 112 is connected to a second high-speed interconnect port of a processor in a second node through a cable, and the generating unit 1111 is configured to generate a data access request through the computing chip 111; a second sending unit 1113, configured to send the data access request to the interconnect chip 112 through the computing chip 111; a first sending unit 1121 configured to send a data access request to the second node through the interconnection chip 112 through a cable.
In an embodiment, the computing chip 111 is connected to the interconnect chip 112 through a port, the computing chip 111 includes a first address decoder, and the first matching unit 1112 is configured to determine, through the computing chip 111, a first port according to a first destination address in the data access request and the first address decoder, where the first address decoder is configured to record a corresponding relationship between the destination address and the port of the computing chip; a second sending unit 1113, configured to send, through the computing chip, a data access request to the interconnect chip 112 through the first port.
In an embodiment, the interconnect chip 112 includes a second address decoder, and the second matching unit 1122 is configured to determine, through the interconnect chip 112, a first high-speed interconnect port according to a first destination address and the second address decoder, where the second address decoder is configured to record a corresponding relationship between the destination address and the high-speed interconnect port; a first sending unit 1121, configured to send a data access request to the second node through the first high-speed interconnect port through the interconnect chip 112.
In an embodiment, the first matching unit 1112 is configured to match, by the computing chip 111, the base address and the length of the destination address recorded in the first address decoder with the base address and the length of the first destination address, and determine a first port corresponding to the matched destination address; the second matching unit 1122 is configured to match the base address and the length of the destination address recorded in the second address decoder with the base address and the length of the first destination address through the interconnect chip 112, and determine a first high-speed interconnect port corresponding to the matched destination address.
In an embodiment, the first matching unit 1112 is configured to match, by the computing chip 111, the high-order address of the destination address recorded in the first address decoder with the high-order address of the first destination address, and determine the first port corresponding to the matched destination address, where the number of bits of the high-order address is determined according to the memory size of the second node; the second matching unit 1122 is configured to match, by the interconnect chip 112, the high-order address of the destination address recorded in the second address decoder with the high-order address of the first destination address, and determine the first high-speed interconnect port corresponding to the matched destination address.
In one embodiment, the first high-speed interconnect port and the second high-speed interconnect port are high-speed serial bus ports, and the first port is a high-speed serial bus port.
The first node 110 may be a physical server, such as an X86 server or an ARM server; or a virtual machine (VM) implemented on a general-purpose physical server using network functions virtualization (NFV) technology, where a VM is a complete computer system that has complete hardware system functions and runs in a completely isolated environment, such as a virtual device in cloud computing (this application is not specifically limited in this regard); or a server cluster consisting of multiple physical servers or virtual machines.
Fig. 7 is a schematic structural diagram of a storage node 700 provided in the present application, where the storage node 700 may be the second node 120 in the embodiments of fig. 1 to fig. 6, and the storage node 700 may include a receiving unit 121 and a converting unit 122.
A receiving unit 121, configured to receive a data access request, where the data access request is generated by a first node, and the data access request is sent by the first node through a cable;
the converting unit 122 is configured to convert the first destination address in the data access request into a local physical address corresponding to the first destination address, and access data in the memory of the second node according to the local physical address.
In one embodiment, the second node 120 includes a third address decoder; a converting unit 122, configured to determine a local physical address corresponding to the first destination address according to the first destination address and a third address decoder, where the third address decoder is configured to record a corresponding relationship between the destination address and the local physical address.
In one embodiment, a first high-speed interconnect port of a first node is connected with a second high-speed interconnect port of a processor in a second node through a cable, and the first high-speed interconnect port and the second high-speed interconnect port are high-speed serial bus ports.
The second node 120 may be a physical server, such as an X86 server or an ARM server; or a virtual machine (VM) implemented on a general-purpose physical server using network functions virtualization (NFV) technology, where a VM is a complete computer system that has complete hardware system functions and runs in a completely isolated environment, such as a virtual device in cloud computing (this application is not specifically limited in this regard); or a server cluster consisting of multiple physical servers or virtual machines.
To sum up, in the first node and the second node provided by the present application, the high-speed interconnection port of the first node is connected to the high-speed interconnection port of the second node through a cable, and the first node can implement an addressing function in combination with an address decoder, so as to send a data access request to the memory of the second node corresponding to the first destination address, thereby implementing memory expansion of the first node.
Fig. 8 is a schematic structural diagram of a computing device provided in this application, where the computing device 800 may be the first node 110 or the second node 120 in the embodiments of fig. 1 to fig. 7, and the computing device may be a physical server, a virtual machine, or a server cluster, or may be a chip (system) or other component or assembly that may be disposed in the physical server or the virtual machine, which is not limited in this application.
Further, the computing device 800 includes a processor 801, a memory 802, and a communication interface 803, wherein the processor 801, the memory 802, and the communication interface 803 communicate via a bus 805, or may communicate via other means, such as wireless transmission.
The processor 801 may be composed of at least one general-purpose processor, such as a CPU or an NPU, or a combination of a CPU and a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The processor 801 executes various types of digitally stored instructions, such as software or firmware programs stored in the memory 802, which enable the computing device 800 to provide a wide variety of services.
In a specific implementation, the processor 801 may be a computing chip or an interconnect chip in the first node in the foregoing description, or may be a processor chip in the second node, which is not specifically limited in this application. In a particular implementation, as an embodiment, the processor 801 may include one or more CPUs, such as CPU0 and CPU1 shown in fig. 8.
In a particular implementation, as an embodiment, the computing device 800 may also include multiple processors, such as the processor 801 and the processor 804 shown in fig. 8. Each of these processors may be a single-core processor or a multi-core processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
The memory 802 is used for storing program code, whose execution is controlled by the processor 801 so as to perform the processing steps of the data access system in any of the embodiments of fig. 1 to fig. 7. The program code may include one or more software modules. When the computing device is the first node 110, the one or more software modules may be the generating unit 1111, the first matching unit 1112, the second sending unit 1113, the second matching unit 1122, and the first sending unit 1121 in the embodiment of fig. 6, and for the specific implementation, reference may be made to the embodiment of fig. 6, which is not described again here; when the computing device is the second node 120, the one or more software modules may be the receiving unit 121 and the converting unit 122 in the embodiment of fig. 7, and for the specific implementation, reference may be made to the embodiment of fig. 7, which is not described again here.
The memory 802 may include both read-only memory and random access memory, and provides instructions and data to the processor 801. The memory 802 may also include non-volatile random access memory. For example, the memory 802 may also store device type information.
The memory 802 can be volatile memory or non-volatile memory, or can include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory can be random access memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM). The hard disk may be a hard disk drive (i.e., a mechanical hard disk, HDD), a solid state disk (SSD), or the like, which is not specifically limited in this application.
The communication interface 803 may be a wired interface (e.g., an Ethernet interface), an internal interface (e.g., a peripheral component interconnect express (PCIe) bus interface), or a wireless interface (e.g., a cellular network interface or a wireless LAN interface), and is used for communicating with other servers or modules. In a specific implementation, the communication interface 803 may be used to receive a message for the processor 801 or the processor 804 to process.
The bus 805 may be a peripheral component interconnect express (PCIe) bus, an extended industry standard architecture (EISA) bus, a unified bus (UB), a compute express link (CXL) bus, a cache coherent interconnect for accelerators (CCIX) bus, or the like. The bus 805 may be divided into an address bus, a data bus, a control bus, and so on.
The bus 805 may include a power bus, a control bus, a status signal bus, and the like, in addition to a data bus. But for clarity of illustration the various buses are labeled as bus 805 in the figures.
It should be noted that fig. 8 is only one possible implementation manner of the embodiment of the present application, and in practical applications, the computing device 800 may further include more or less components, which is not limited herein. For contents that are not shown or described in the embodiments of the present application, reference may be made to the related explanations in the embodiments of fig. 1 to fig. 7, and details are not repeated here.
It should be understood that the computing device 800 shown in fig. 8 may also be a computer cluster formed by at least one physical server, and specific descriptions about a specific form of the data access system may specifically refer to the embodiments in fig. 1 to fig. 7, and are not described here again to avoid repetition.
The present embodiment provides a chip, which may be used in a server with an X86-architecture processor (also referred to as an X86 server), a server with an ARM-architecture processor (also referred to as an ARM server), and the like. The chip may include the above apparatus or logic circuit, and when the chip runs on the server, it causes the server to perform the data access method described in the foregoing method embodiments.
In a specific implementation, the chip may be a computing chip or an interconnect chip in the first node in the foregoing description, and may also be a processor chip in the second node.
Embodiments of the present application provide a motherboard, which may be referred to as a Printed Circuit Board (PCB), including a processor, where the processor is configured to execute program codes to implement the data access method described in the foregoing method embodiments. Optionally, the motherboard may further include a memory for storing the program code for execution by the processor.
An embodiment of the present application provides a computer-readable storage medium having computer instructions stored therein; when the computer instructions are run on a computer, the computer is caused to perform the data access method described in the above method embodiments.
Embodiments of the present application provide a computer program product including a computer program or instructions which, when run on a computer, cause the computer to perform the data access method described in the above method embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes at least one computer instruction. The procedures or functions according to the embodiments of the invention are wholly or partly generated when the computer program instructions are loaded or executed on a computer. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via a wired connection (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless connection (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage node, such as a server or a data center, that integrates at least one available medium. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (20)

1. A data access system, characterized in that the data access system comprises a first node and a second node, the first node and the second node are connected through a cable;
the first node is used for generating a data access request, wherein the data access request is used for requesting data in the memory of the second node;
the first node is used for sending the data access request to the second node through the cable;
the second node is configured to convert the first destination address in the data access request into a local physical address corresponding to the first destination address, and access the data in the memory of the second node according to the local physical address.
2. The system of claim 1, wherein the first node comprises a compute chip and an interconnect chip, wherein a first high-speed interconnect port of the interconnect chip is connected to a second high-speed interconnect port of a processor in the second node by the cable;
the computing chip is used for generating the data access request and sending the data access request to the interconnection chip;
the interconnection chip is used for sending the data access request to the second node through the cable.
3. The system of claim 2, wherein the computing chip is coupled to the interconnect chip via a port, the computing chip including a first address decoder;
the computing chip is specifically configured to: and generating the data access request, determining the first port according to a first destination address in the data access request and the first address decoder, and sending the data access request to the interconnection chip through the first port, wherein the first address decoder is used for recording the corresponding relation between the destination address and the port of the computing chip.
4. The system of claim 3, wherein the interconnect chip includes a second address decoder therein,
the interconnection chip is specifically configured to: and determining the first high-speed interconnection port according to the first destination address and the second address decoder, and sending the data access request to the second node through the first high-speed interconnection port, wherein the second address decoder is used for recording the corresponding relation between the destination address and the high-speed interconnection port.
5. The system of claim 4, wherein the second node comprises a third address decoder;
the second node is specifically configured to: and determining a local physical address corresponding to the first destination address according to the first destination address and the third address decoder, wherein the third address decoder is used for recording the corresponding relation between the destination address and the local physical address.
6. The system of claim 5, wherein the data access system further comprises a configuration node,
the configuration node is used for acquiring at least one local physical address of a memory of the second node from the second node;
the configuration node is configured to determine, according to the at least one local physical address, at least one corresponding destination address, and configure the third address decoder;
the configuration node is further configured to configure the second address decoder according to the at least one destination address in combination with the high-speed interconnect port between the second node and the interconnect chip;
the configuration node is further configured to configure the first address decoder according to the at least one destination address in combination with a chip port between the interconnect chip and the compute chip.
7. The system according to any one of claims 4 to 6,
the computing chip is specifically configured to: matching the base address and the length of the destination address recorded in the first address decoder with the base address and the length of the first destination address, and determining the first port corresponding to the matched destination address;
the interconnection chip is specifically configured to: and matching the base address and the length of the destination address recorded in the second address decoder with the base address and the length of the first destination address, and determining the first high-speed interconnection port corresponding to the matched destination address.
8. The system according to any one of claims 4 to 6,
the computing chip is specifically configured to: matching the high-order address of the destination address recorded in the first address decoder with the high-order address of the first destination address, and determining the first port corresponding to the matched destination address, wherein the number of bits of the high-order address is determined according to the memory size of the second node;
the interconnection chip is specifically configured to: match the high-order address of the destination address recorded in the second address decoder with the high-order address of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address.
9. The system of any of claims 2 to 8, wherein the first high speed interconnect port and the second high speed interconnect port are high speed serial bus ports and the first port is a high speed serial bus port.
10. A data access method applied to a data access system including a first node and a second node, the first node and the second node being connected by a cable, the method comprising:
the first node generates a data access request, wherein the data access request is used for requesting data in a memory of the second node;
the first node sends the data access request to the second node through the cable;
and the second node converts a first destination address in the data access request into a local physical address corresponding to the first destination address, and accesses the data in the memory of the second node according to the local physical address.
11. The method of claim 10, wherein the first node comprises a compute chip and an interconnect chip, wherein a first high-speed interconnect port of the interconnect chip is connected to a second high-speed interconnect port of a processor in the second node by a cable;
the first node generating a data access request comprises:
the computing chip generates the data access request and sends the data access request to the interconnection chip;
the first node sending the data access request to the second node over the cable comprises:
and the interconnection chip sends the data access request to the second node through the cable.
12. The method of claim 11, wherein the compute chip is coupled to the interconnect chip through a port, the compute chip including a first address decoder;
the computing chip generating the data access request and sending the data access request to the interconnection chip comprises:
the computing chip generates the data access request, determines the first port according to a first destination address in the data access request and the first address decoder, and sends the data access request to the interconnection chip through the first port, wherein the first address decoder is used for recording a corresponding relation between the destination address and the port of the computing chip.
13. The method of claim 12, wherein the interconnect chip includes a second address decoder therein,
the sending, by the interconnect chip, the data access request to the second node via the cable includes:
and the interconnection chip determines the first high-speed interconnection port according to the first destination address and the second address decoder, and sends the data access request to the second node through the first high-speed interconnection port, wherein the second address decoder is used for recording the corresponding relation between the destination address and the high-speed interconnection port.
14. The method of claim 13, wherein the second node comprises a third address decoder;
the second node converting the first destination address in the data access request into the local physical address corresponding to the first destination address comprises:
and the second node determines a local physical address corresponding to the first destination address according to the first destination address and the third address decoder, wherein the third address decoder is used for recording the corresponding relation between the destination address and the local physical address.
15. The method of claim 14, wherein the data access system further comprises a configuration node, the method further comprising:
the configuration node acquires at least one local physical address of a memory of the second node from the second node;
the configuration node determines at least one corresponding destination address according to the at least one local physical address, and configures the third address decoder;
the configuration node configures the second address decoder by combining the high-speed interconnection port between the second node and the interconnection chip according to the at least one destination address;
and the configuration node configures the first address decoder by combining a chip port between the interconnection chip and the computing chip according to the at least one destination address.
16. The method of any of claims 13 to 15, wherein the determining, by the computing chip, the first port based on the first destination address in the data access request and the first address decoder comprises:
the computing chip matches the base address and the length of the destination address recorded in the first address decoder with the base address and the length of the first destination address, and determines the first port corresponding to the matched destination address;
the determining, by the interconnect chip, the first high-speed interconnect port according to the first destination address and the second address decoder includes:
and the interconnection chip matches the base address and the length of the destination address recorded in the second address decoder with the base address and the length of the first destination address, and determines the first high-speed interconnection port corresponding to the matched destination address.
17. The method of any of claims 13 to 15, wherein the determining, by the computing chip, the first port based on the first destination address in the data access request and the first address decoder comprises:
the computing chip matches a high-order address of a destination address recorded in the first address decoder with a high-order address of the first destination address, and determines the first port corresponding to the matched destination address, wherein the number of bits of the high-order address is determined according to the size of the memory of the second node;
the determining, by the interconnect chip, the first high-speed interconnect port according to the first destination address and the second address decoder includes:
the interconnection chip matches the base address and the length of the destination address recorded in the second address decoder with the base address and the length of the first destination address, and determines the first high-speed interconnection port corresponding to the matched destination address.
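Claim 17's compute-chip variant compares only the high-order bits, with the split point derived from the second node's memory size. A hedged sketch, assuming the memory size is a power of two and the decoder entries are `(base, port)` pairs:

```python
def high_bits(addr, mem_size):
    """Drop the low-order bits that index within one node's memory.
    The number of dropped bits is derived from mem_size (assumed a power of two)."""
    return addr >> (mem_size.bit_length() - 1)

def match_high_order(decoder, dest_addr, mem_size):
    """Return the port whose recorded high-order address equals that of dest_addr."""
    for base, port in decoder:
        if high_bits(base, mem_size) == high_bits(dest_addr, mem_size):
            return port
    return None
```

Comparing only high-order bits trades decoder flexibility for a cheaper comparator: every node must own an aligned, equal-sized slice of the destination space.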
18. The method of any of claims 10 to 17, wherein the first and second high speed interconnect ports are high speed serial bus ports and the first port is a high speed serial bus port.
19. A computing node for use in a data access system, the data access system further comprising a storage node, the computing node comprising: a computing chip and an interconnection chip, wherein the computing chip is connected with the interconnection chip through a high-speed interconnection port, and the interconnection chip is connected with other nodes through high-speed interconnection ports and cables;
the computing chip is configured to generate a data access request and send the data access request to the interconnection chip, wherein the data access request comprises a first destination address, and the first destination address indicates a location in a memory of the storage node;
the interconnection chip is configured to send the data access request to the storage node according to the first destination address.
20. A storage node for use in a data access system, the data access system further comprising a computing node, the storage node comprising a processor and a memory, wherein the storage node is connected with the computing node through a high-speed interconnection port of the processor and a cable;
the processor is configured to receive a data access request sent by the compute node through the high-speed interconnect port, convert a first destination address carried in the data access request into a local physical address of the storage node corresponding to the first destination address, and access data in the memory according to the local physical address.
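The storage-node translation in claim 20, from the first destination address to a local physical address, can be sketched as a window lookup plus offset. The mapping-table layout `(dest_base, length, local_base)` is an assumption for illustration:

```python
def to_local_physical(dest_addr, mapping):
    """mapping: iterable of (dest_base, length, local_base) windows configured
    on the storage node. Translate by window containment plus in-window offset."""
    for dest_base, length, local_base in mapping:
        if dest_base <= dest_addr < dest_base + length:
            return local_base + (dest_addr - dest_base)
    raise ValueError("destination address %#x is not mapped" % dest_addr)
```

The returned local physical address is what the processor then uses to access the memory directly, so the compute node never needs to know the storage node's internal memory layout.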
CN202111160189.7A 2021-09-30 2021-09-30 Data access system, method and related equipment Pending CN115905036A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111160189.7A CN115905036A (en) 2021-09-30 2021-09-30 Data access system, method and related equipment
PCT/CN2022/118756 WO2023051248A1 (en) 2021-09-30 2022-09-14 Data access system and method, and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111160189.7A CN115905036A (en) 2021-09-30 2021-09-30 Data access system, method and related equipment

Publications (1)

Publication Number Publication Date
CN115905036A true CN115905036A (en) 2023-04-04

Family

ID=85727930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111160189.7A Pending CN115905036A (en) 2021-09-30 2021-09-30 Data access system, method and related equipment

Country Status (2)

Country Link
CN (1) CN115905036A (en)
WO (1) WO2023051248A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU3936693A (en) * 1992-03-25 1993-10-21 Encore Computer U.S., Inc. Fiber optic memory coupling system
TWI614670B (en) * 2013-02-12 2018-02-11 LSI Corporation Chained, scalable storage system and method of accessing data in a chained, scalable storage system
CN103957155B (en) * 2014-05-06 2018-01-23 华为技术有限公司 Message transmitting method, device and interconnecting interface
US10560765B2 (en) * 2018-04-25 2020-02-11 Western Digital Technologies, Inc. Node with combined optical and electrical switching
CN111490946B (en) * 2019-01-28 2023-08-11 阿里巴巴集团控股有限公司 FPGA connection realization method and device based on OpenCL framework

Also Published As

Publication number Publication date
WO2023051248A1 (en) 2023-04-06

Similar Documents

Publication Publication Date Title
US20180024957A1 (en) Techniques to enable disaggregation of physical memory resources in a compute system
US11907139B2 (en) Memory system design using buffer(s) on a mother board
US10725957B1 (en) Uniform memory access architecture
CN114546913B (en) Method and device for high-speed data interaction between multiple hosts based on PCIE interface
US11003607B2 (en) NVMF storage to NIC card coupling over a dedicated bus
US20210334143A1 (en) System for cooperation of disaggregated computing resources interconnected through optical circuit, and method for cooperation of disaggregated resources
CN115374046B (en) Multiprocessor data interaction method, device, equipment and storage medium
JP2017537404A (en) Memory access method, switch, and multiprocessor system
WO2023125524A1 (en) Data storage method and system, storage access configuration method and related device
EP4123649A1 (en) Memory module, system including the same, and operation method of memory module
CN115840620A (en) Data path construction method, device and medium
US20230325277A1 (en) Memory controller performing selective and parallel error correction, system including the same and operating method of memory device
US11962675B2 (en) Interface circuit for providing extension packet and processor including the same
US10853255B2 (en) Apparatus and method of optimizing memory transactions to persistent memory using an architectural data mover
US10909044B2 (en) Access control device, access control method, and recording medium containing access control program
US20220342835A1 (en) Method and apparatus for disaggregation of computing resources
US20220147470A1 (en) System, device, and method for accessing memory based on multi-protocol
CN115905036A (en) Data access system, method and related equipment
CN112513824A (en) Memory interleaving method and device
KR20050080704A (en) Apparatus and method of inter processor communication
CN116932451A (en) Data processing method, host and related equipment
CN113722110B (en) Computer system, memory access method and device
CN115037783A (en) Data transmission method and device
CN116795742A (en) Storage device, information storage method and system
KR20220070951A (en) Memory device, system including the same and operating method of memory device

Legal Events

Date Code Title Description
PB01 Publication