WO2023051248A1 - Data access system and method, and related device - Google Patents

Data access system and method, and related device Download PDF

Info

Publication number
WO2023051248A1
WO2023051248A1 (application PCT/CN2022/118756; priority CN2022118756W)
Authority
WO
WIPO (PCT)
Prior art keywords
node
address
chip
destination address
port
Prior art date
Application number
PCT/CN2022/118756
Other languages
French (fr)
Chinese (zh)
Inventor
Chen Tianxiang (陈天翔)
Huang Jiangle (黄江乐)
Hu Tianchi (胡天驰)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2023051248A1 publication Critical patent/WO2023051248A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation
    • G06F 12/06: Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation
    • G06F 12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/10: Address translation
    • G06F 12/1081: Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 13/00: Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F 13/38: Information transfer, e.g. on bus
    • G06F 13/42: Bus transfer protocol, e.g. handshake; Synchronisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present application relates to the field of storage, and in particular to a data access system, a data access method, and related devices.
  • When computing nodes perform data processing (such as big data or AI tasks), they often need a large memory capacity to store data. The data can be distributed across the memory of multiple storage nodes, and a computing node can read the data in a storage node's memory through the remote direct memory access (RDMA) protocol, thereby expanding its memory capacity.
  • Communication between the computing node and the storage node goes through network cards, and data is transferred between the two network cards through network card queues. Each time the computing node reads data, it must place the read request into a network card queue, so the read path spends considerable time preparing queue units; in some cases the queue-unit preparation time even exceeds the data transmission time. The result is low data access efficiency for computing nodes, high network delay, and a significant impact on the processing efficiency of big data or AI tasks.
  • The present application provides a data access system, method, and related devices, which address the problems of low access efficiency and high network delay when a computing node accesses the memory of a storage node.
  • In a first aspect, a data access system includes a first node and a second node connected through a cable. The first node generates a data access request that requests data in the memory of the second node, and sends the request to the second node through the cable. The second node converts the first destination address in the data access request into the local physical address corresponding to that destination address, and accesses the data in its memory according to the local physical address.
  • Because the first node and the second node are connected by a cable, communication between them does not pass through a network card or routing device. The first node therefore does not wait for network card queue-unit preparation when accessing the memory of the second node, which improves access efficiency and reduces access delay.
  • The first node includes a computing chip and an interconnection chip. The first high-speed interconnection port of the interconnection chip is connected by a cable to the second high-speed interconnection port of the processor in the second node, and the computing chip is connected to the interconnection chip through a port. The computing chip generates a data access request and sends it to the interconnection chip, which forwards it to the second node through the cable.
  • A computing chip may consist of at least one processor, such as a CPU or NPU, or a combination of a CPU and a hardware chip.
  • The aforementioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • The above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the number of computing chips in the first node may be one or more, which is not specifically limited in this application.
  • the interconnection chip may be ASIC, PLD or a combination thereof, and the aforementioned PLD may be CPLD, FPGA, GAL or any combination thereof, which is not specifically limited in this application.
  • the number of interconnected chips in the first node may be multiple, which is not specifically limited in this application.
  • the interconnection chip is provided with a high-speed interconnection port, and the interconnection chip can perform data communication with the second node through the high-speed interconnection port.
  • the first high-speed interconnection port of the interconnection chip is connected to the second high-speed interconnection port on the second node through a cable.
  • the number of first high-speed interconnection ports on each interconnection chip may be one or more, and each first high-speed interconnection port is in a one-to-one correspondence with the second high-speed interconnection ports on the second node.
  • The high-speed interconnection port may be a high-speed serial bus port, such as a SerDes bus port, and the cable may be a copper cable, optical fiber, twisted pair, or any other medium that can transmit data; this application does not specifically limit the cable.
  • the number of high-speed interconnection ports on the first node may be one or more, and the first high-speed interconnection ports on the first node are in a one-to-one correspondence with the second high-speed interconnection ports on the second node.
  • The port of the computing chip may be a high-speed serial bus port, such as a SerDes bus port. The computing chip may be connected to the interconnection chip through a bus, such as a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The computing chip, the interconnection chip, the ports of the computing chip, and the bus may be printed together on the circuit board during manufacturing.
  • the number of ports of the computing chip may be one or more, which is not limited in this application.
  • deploying interconnection chips in the first node can enable the first node to communicate with more second nodes.
  • The greater the number of interconnection chips, the greater the number of high-speed interconnection ports that can be deployed in the first node.
  • the data communication between the computing chip, the interconnection chip and the second node of the first node may implement an addressing function through an address decoder.
  • The computing chip includes a first address decoder. The computing chip generates a data access request, determines the first port according to the first destination address in the request and the first address decoder, and sends the request to the interconnection chip through the first port. The first address decoder records the correspondence between destination addresses and ports of the computing chip.
  • A second address decoder is deployed in the interconnection chip. The interconnection chip determines the first high-speed interconnection port according to the first destination address and the second address decoder, and sends the data access request to the second node through the first high-speed interconnection port. The second address decoder records the correspondence between destination addresses and high-speed interconnection ports.
  • A third address decoder is deployed in the second node. The second node determines the local physical address corresponding to the first destination address according to the first destination address and the third address decoder. The third address decoder records the correspondence between destination addresses and local physical addresses.
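As an illustrative sketch (not the application's implementation), the three address decoders above can be modeled as lookup tables keyed by destination-address windows; all addresses, window sizes, and port names below are assumed values:

```python
# Hypothetical three-stage lookup: first decoder (computing chip) picks a chip
# port, second decoder (interconnection chip) picks a high-speed interconnection
# port, third decoder (second node) translates to a local physical address.
FIRST_DECODER = {(0x1000_0000, 0x1000): "port0"}
SECOND_DECODER = {(0x1000_0000, 0x1000): "hsi0"}
THIRD_DECODER = {(0x1000_0000, 0x1000): 0x8000_0000}  # window -> local phys base

def lookup(decoder, dest):
    """Return (value, window_base) for the window containing dest."""
    for (base, length), value in decoder.items():
        if base <= dest < base + length:
            return value, base
    raise KeyError(hex(dest))

def route(dest):
    port, _ = lookup(FIRST_DECODER, dest)            # step 1: chip port
    hsi, _ = lookup(SECOND_DECODER, dest)            # step 2: high-speed port
    phys_base, win_base = lookup(THIRD_DECODER, dest)  # step 3: translate
    return port, hsi, phys_base + (dest - win_base)
```

A request to destination 0x1000_0040 would, under these assumed tables, leave through `port0` and `hsi0` and land at local physical address 0x8000_0040.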
  • the data access system may further include a configuration node, and the configuration node may configure the first address decoder, the second address decoder, and the third address decoder.
  • The configuration node acquires at least one local physical address of the second node's memory from the second node, determines at least one corresponding destination address from the at least one local physical address, and performs the configuration. The configuration node also configures the second address decoder according to the at least one destination address combined with the high-speed interconnection port between the second node and the interconnection chip, and configures the first address decoder according to the at least one destination address combined with the chip port between the interconnection chip and the computing chip.
  • When the configuration node acquires at least one local physical address of the second node's memory, it can determine, according to the size of the second node's memory and the business requirements, the extended memory that the second node allocates for use by the first node. This extended memory can be part of the second node's memory, and memory isolation technology can be applied so that the second node itself cannot access it, thereby improving the security of the data stored in the extended memory.
  • The data access request generated by the computing chip is routed and addressed by the address decoders and delivered to the CPU of the second node corresponding to the destination address, which performs the memory read or write. This avoids waiting for network card queue preparation and improves the efficiency with which the first node reads and writes the expanded memory: the delay can reach the microsecond level (Ethernet delay is at the millisecond level), and the bandwidth can reach 400GB, higher than an RDMA network card whose bandwidth is only 100GB.
  • The whole or a part of the first destination address can be matched against the addresses in the decoder, which improves matching efficiency and, in turn, the efficiency of data access.
  • the first port may be determined according to the base address and length of the first destination address in the data access request.
  • the computing chip is specifically configured to match the base address and length of the destination address recorded in the first address decoder with the base address and length of the first destination address, and determine the first port corresponding to the matched destination address.
  • The interconnection chip matches the base address and length of the destination addresses recorded in the second address decoder with the base address and length of the first destination address, and determines the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
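As a hedged sketch of base-address-and-length matching, a decoder entry can be selected when its recorded window fully contains the span described by the request's base address and length; the entries and addresses below are assumptions for illustration:

```python
def match_window(entries, req_base, req_len):
    """entries: list of (window_base, window_len, port) recorded in a decoder.
    The request span [req_base, req_base + req_len) must fall entirely
    inside a recorded window for its port to be chosen."""
    for w_base, w_len, port in entries:
        if w_base <= req_base and req_base + req_len <= w_base + w_len:
            return port
    return None  # no recorded destination address matches
```

For example, with a single assumed entry `(0x1000_0000, 0x1000, "port0")`, a 0x100-byte request at base 0x1000_0F00 matches, while the same request with length 0x200 overruns the window and does not.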
  • the first port may be determined according to the upper address of the first destination address.
  • The computing chip matches the high-order address of the destination addresses recorded in the first address decoder with the high-order address of the first destination address, and determines the first port corresponding to the matched destination address. The number of bits of the high-order address is determined according to the memory size of the second node.
  • The interconnection chip matches the high-order address of the destination addresses recorded in the second address decoder with the high-order address of the first destination address, and determines the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
  • The number of bits of the high-order address can be determined according to the expanded memory size of the second node 120 connected to the high-speed interconnection port.
  • The ports corresponding to some of the destination addresses recorded in the first and second address decoders may be the same: destination addresses corresponding to the same port are located in the same memory and share the same base address and length, or the same high-order address. The port corresponding to the first destination address can therefore be determined either by matching the base address and length or by matching the high-order address.
  • Matching only part of the first destination address against the addresses in the decoder improves matching efficiency, speeds up the determination of the first port and the first high-speed interconnection port, and further improves the efficiency of data access.
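High-order matching can be sketched as comparing only the top bits of the destination address; the overall address width and the number of high-order bits below are assumptions for illustration, not values from this application:

```python
ADDR_WIDTH = 32  # assumed total address width
HIGH_BITS = 16   # assumed: chosen from the expanded-memory size per port

def match_high(entries, dest):
    """entries maps a high-order-bits tag -> high-speed interconnection port.
    Only the top HIGH_BITS of the destination address are compared, so one
    table entry covers an entire memory region."""
    tag = dest >> (ADDR_WIDTH - HIGH_BITS)
    return entries.get(tag)
```

With the assumed table `{0x1000: "hsi0"}`, any destination in the 0x1000_xxxx region matches in a single dictionary probe, illustrating why partial matching is cheaper than comparing full base/length pairs.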
  • When the data access request reads data from the second node's memory, after processing the request the second node can return the read data to the first node along the original route, according to the source address in the request and the first, second, and third address decoders. Details are not repeated here.
  • The first node may also omit the interconnection chip, with the high-speed interconnection port on the computing chip connected by cable to the high-speed interconnection port on the second node. The computing chip can still implement routing and addressing for data access requests through the above-mentioned address decoders.
  • the computing chip can be equipped with a second address decoder, and the second node can be equipped with a third address decoder 230.
  • For the data access request generated by the computing chip, the first high-speed interconnection port corresponding to the first destination address can be determined according to the correspondence between high-speed interconnection ports and destination addresses recorded in the second address decoder, and the request is then sent to the second node through that port. Details are not repeated here.
  • In a second aspect, a data access method is provided. The method is applied to a data access system that includes a first node and a second node connected by a cable, and includes the following steps:
  • The first node generates a data access request that requests data in the memory of the second node and sends it to the second node through the cable. The second node converts the first destination address in the data access request into the local physical address corresponding to the first destination address, and accesses the data in its memory according to that local physical address.
  • Because the first node and the second node are connected by a cable, communication between them does not pass through a network card or routing device. The first node therefore does not wait for network card queue-unit preparation when accessing the memory of the second node, which improves access efficiency and reduces access delay.
  • the first node includes a computing chip and an interconnection chip, wherein the first high-speed interconnection port of the interconnection chip is connected to the second high-speed interconnection port of the processor in the second node through a cable, and the computing chip can A data access request is generated and sent to the interconnection chip, and the interconnection chip sends the data access request to the second node through the cable.
  • The computing chip is connected to the interconnection chip through a port and includes a first address decoder. The computing chip generates a data access request, determines the first port according to the first destination address in the request and the first address decoder, and sends the request to the interconnection chip through the first port. The first address decoder records the correspondence between destination addresses and ports of the computing chip.
  • the interconnection chip includes a second address decoder, the interconnection chip determines the first high-speed interconnection port according to the first destination address and the second address decoder, and communicates to the second node through the first high-speed interconnection port A data access request is sent, wherein the second address decoder is used to record the corresponding relationship between the destination address and the high-speed interconnection port.
  • the second node includes a third address decoder, and the second node determines the local physical address corresponding to the first destination address according to the first destination address and the third address decoder, wherein the third The address decoder is used to record the correspondence between the destination address and the local physical address.
  • the data access system further includes a configuration node
  • The above method further includes the following steps: the configuration node acquires at least one local physical address of the second node's memory from the second node; the configuration node determines at least one corresponding destination address from the at least one local physical address and configures the third address decoder; the configuration node configures the second address decoder according to the at least one destination address combined with the high-speed interconnection port between the second node and the interconnection chip; and the configuration node configures the first address decoder according to the at least one destination address combined with the chip port between the interconnection chip and the computing chip.
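The configuration-node steps above can be sketched as building all three decoder tables from the local physical addresses reported by the second node; the destination-address scheme, window size, and port names are assumptions for illustration only:

```python
def configure(local_phys_addrs, hsi_port, chip_port, window=0x1000):
    """Hypothetical configuration flow: derive one destination address per
    reported local physical address, then fill the third, second, and first
    decoder tables, keyed by (destination_base, window_length)."""
    first, second, third = {}, {}, {}
    for i, phys in enumerate(local_phys_addrs):
        dest = 0x1000_0000 + i * window   # assumed global address scheme
        third[(dest, window)] = phys      # third: destination -> local phys
        second[(dest, window)] = hsi_port # second: destination -> high-speed port
        first[(dest, window)] = chip_port # first: destination -> chip port
    return first, second, third
```

After this setup, the same (destination, window) key resolves consistently at every hop, which is what lets the request reach the correct memory without any network card queue.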
  • The computing chip matches the base address and length of the destination addresses recorded in the first address decoder with the base address and length of the first destination address, and determines the first port corresponding to the matched destination address.
  • the interconnection chip matches the base address and length of the destination address recorded in the second address decoder with the base address and length of the first destination address, and determines the first high-speed interconnection port corresponding to the matched destination address.
  • the computing chip matches the high-order address of the destination address recorded in the first address decoder with the high-order address of the first destination address, and determines the first port corresponding to the matched destination address, wherein, The number of bits in the high address is determined according to the memory size of the second node.
  • The interconnection chip matches the high-order address of the destination addresses recorded in the second address decoder with the high-order address of the first destination address, and determines the first high-speed interconnection port corresponding to the matched destination address.
  • the first high-speed interconnection port and the second high-speed interconnection port are high-speed serial bus ports, and the first port is a high-speed serial bus port.
  • In a third aspect, a computing node is provided, which may be the first node described in the first and second aspects. The computing node is applied to a data access system that further includes a storage node, and includes a computing chip and an interconnection chip, where the computing chip is connected to the interconnection chip through a high-speed interconnection port and the interconnection chip is connected to other nodes through a high-speed interconnection port and a cable. The computing chip generates a data access request and sends it to the interconnection chip; the request includes a first destination address indicating a location in the memory of the storage node. The interconnection chip sends the data access request to the storage node according to the first destination address.
  • In a fourth aspect, a storage node is provided, which may be the second node described in the first and second aspects. The storage node is applied to a data access system that further includes a computing node, and includes a processor. The storage node is connected to the computing node through the high-speed interconnection port of the processor and a cable. The processor receives, through the high-speed interconnection port, the data access request sent by the computing node, converts the first destination address carried in the request into the local physical address of the storage node corresponding to that destination address, and accesses the data in the memory according to the local physical address.
  • In a fifth aspect, a computing device is provided, including a processor and a memory. The memory stores code, and the processor executes the code to implement the functions of the modules implemented by the first node or the second node in the first aspect or any possible implementation of the first aspect.
  • A computer-readable storage medium is provided, in which instructions are stored; when the instructions are run on a computer, the computer is caused to execute the methods described in the above aspects.
  • FIG. 1 is a schematic structural diagram of a data access system provided by the present application;
  • FIG. 2 is a schematic diagram of deployment of a first node and a second node in an application scenario provided by the present application;
  • FIG. 3 is a schematic structural diagram of another data access system provided by the present application.
  • FIG. 4 is an example diagram of a first address decoder provided by the present application.
  • FIG. 5 is a schematic flow chart of the steps of a data access method provided by the present application;
  • FIG. 6 is a schematic structural diagram of a computing node provided by the present application.
  • FIG. 7 is a schematic structural diagram of a storage node provided by the present application.
  • FIG. 8 is a schematic structural diagram of a computing device provided by the present application.
  • When computing nodes perform some big data or AI tasks, they need a large memory capacity to store data, and they access the data at small granularity and sparsely.
  • the access granularity may be only 64 bytes or 128 bytes, and the location of each access is highly random, resulting in low data access efficiency and high network latency for big data or AI tasks, affecting the processing efficiency of big data or AI tasks.
  • To expand memory capacity, the data can be distributed across the memory of multiple storage nodes, and the computing nodes can read the data in the storage nodes' memory through the remote direct memory access (RDMA) protocol.
  • Communication between the computing node and the storage node goes through network cards, and data is transferred between the two network cards through network card queues. Each time the computing node reads data, it must place the read request into a network card queue, so the read path spends considerable time preparing queue units; in some cases the queue-unit preparation time even exceeds the data transmission time. The result is low data access efficiency for computing nodes, high network delay, and reduced data processing efficiency.
  • PCIe memory devices can be added on the memory bus of a computing device, but the number of PCIe devices that can be added is limited, and the expansion capability of buses such as QPI and UPI is weaker still. Memory is usually expanded to at most about 1T, so the expanded memory still cannot reach the magnitude required by big data or AI tasks.
  • FIG. 1 is a schematic structural diagram of a data access system provided by the present application.
  • The data access system includes a first node 110 and a second node 120 connected through a cable 140. Specifically, the high-speed interconnection port 130 of the first node 110 is connected by cable to the high-speed interconnection port 130 of a processor in the second node 120. It should be understood that the number of second nodes 120 in FIG. 1 is for illustration only; the application does not limit the number of second nodes 120.
  • The high-speed interconnection port 130 in the first node 110 will be collectively referred to as the first high-speed interconnection port, and the high-speed interconnection port 130 in the second node 120 as the second high-speed interconnection port.
  • The first node 110 and the second node 120 can be physical servers, such as X86 or ARM servers; they can also be virtual machines (VMs) built on common physical servers using network functions virtualization (NFV) technology. A virtual machine is a complete, software-simulated computer system with full hardware functionality that runs in a completely isolated environment, such as a virtual device in cloud computing; this application does not specifically limit this. The first node 110 and the second node 120 may also each be a server cluster composed of multiple servers, where the servers may be the physical servers or virtual machines described above.
  • The high-speed interconnection port 130 may be a high-speed serial bus port, such as a SerDes bus port, and the cable 140 may be a copper cable, optical fiber, twisted pair, or other medium that can transmit data. This application does not specifically limit the cable 140.
  • the number of high-speed interconnection ports 130 on the first node 110 may be one or more, and the first high-speed interconnection ports on the first node 110 are in a one-to-one correspondence with the second high-speed interconnection ports on the second node, FIG. 1 uses three ports as an example for illustration, which is not limited in this application.
  • the first high-speed interconnection port of the first node 110 is connected to the second high-speed interconnection port of the processor of the second node 120 through a cable, and there may be one or more processors of the second node 120.
  • The first node 110 can be connected to different processors of the second node through different high-speed interconnection ports. For example, when second node 4 includes processor 4, processor 5, and processor 6, high-speed interconnection port 1 of the first node 110 can be connected to processor 4 and high-speed interconnection port 2 to processor 5, so that the data in the memory corresponding to different processors can be read through different high-speed interconnection ports.
  • the second node 120 may reserve at least one processor not connected to the first node 110, so as to ensure that after the second node 120 provides some memory to the first node, it will not affect the second node 120 to process other services.
  • the first node 110 is used to process data tasks, such as big data or AI tasks in the aforementioned content.
  • the second node 120 is used to store data, and the second node 120 can divide a part of the memory as the expanded memory of the first node 110 for use by the first node 110.
  • the first node 110 can use the data access system shown above to read data from the expanded memory allocated by the second node 120 to process big data or AI tasks, thereby implementing memory expansion of the first node 110.
  • the first node 110 and the second node 120 can be deployed in the same cabinet, and the first high-speed interconnection port of the first node 110 and the second high-speed interconnection port of the second node 120 are directly connected.
  • the servers in the entire cabinet can communicate with each other without going through a switch or a network card, so that the first node 110 can read data from the memory of the second node 120 .
  • the first node 110 can be an AI server
  • the second node 120 can be a 2P server.
  • a 2P server refers to a server with two CPUs.
  • one AI server and 8 to 10 2P servers can be placed inside a cabinet, so that a rack of servers composed of one cabinet can have 10TB to 20TB of memory, which meets the memory requirements of the AI server for data processing in most application scenarios.
  • FIG. 2 is used for illustration, and the present application does not limit it.
  • the first node 110 may generate a data access request, wherein the data access request is used to request data in the memory of the second node, and the first node 110 may send the data access request to the second node 120 through the cable 140 , the second node may convert the first destination address in the data access request into a local physical address corresponding to the destination address, and access the data in the memory of the second node according to the local physical address.
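  • purely as an illustrative sketch (the class names, addresses, and mapping table below are assumptions for illustration, not part of this application), the flow above, in which the first node generates a data access request carrying a first destination address and the second node converts that address into a local physical address before accessing its memory, can be modeled as:

```python
# Illustrative model of the request flow: the addr_map dictionary stands in
# for the address conversion performed inside the second node; a real system
# performs this translation in hardware.

class DataAccessRequest:
    def __init__(self, dest_addr, op, payload=None):
        self.dest_addr = dest_addr   # first destination address
        self.op = op                 # "read" or "write"
        self.payload = payload

class SecondNode:
    def __init__(self, memory_size, addr_map):
        self.memory = bytearray(memory_size)
        # destination address -> local physical address
        self.addr_map = addr_map

    def handle(self, req, length=1):
        local = self.addr_map[req.dest_addr]   # address conversion
        if req.op == "read":
            return bytes(self.memory[local:local + length])
        self.memory[local:local + len(req.payload)] = req.payload

node2 = SecondNode(1024, addr_map={0x8000: 0x10})
node2.handle(DataAccessRequest(0x8000, "write", b"\x2a"))
print(node2.handle(DataAccessRequest(0x8000, "read")))  # b'*'
```

  • in this sketch the first node never sees the second node's local physical addresses; it only addresses the agreed-upon destination address space, which matches the division of labor between the nodes described above.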
  • FIG. 3 is a schematic structural diagram of another data access system provided by the present application, wherein the first node 110 may include a computing chip 111, an interconnection chip 112, a port 113 of the computing chip and a bus 114, wherein, the port 113 of the computing chip of the computing chip 111 communicates with the interconnection chip 112 through the bus 114.
  • the port on the interconnection chip 112 is not drawn in FIG. 3, but in a specific implementation, the interconnection chip 112 may also have corresponding ports.
  • FIG. 1 shows only an exemplary division method, and each module unit may be merged or split into more or fewer module units, which is not specifically limited in this application.
  • the port 113 of the computing chip can be a high-speed serial bus port, such as a SERDES bus port, and the bus 114 can be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the computing chip 111, the interconnection chip 112, the port 113 of the computing chip, and the bus 114 can be uniformly printed on the circuit board during processing.
  • the number of ports 113 of the computing chip may be one or more.
  • FIG. 3 uses two ports (port 0 and port 1) as an example for illustration, which is not limited in this application.
  • the computing chip 111 may be composed of at least one general-purpose processor, such as a CPU or an NPU, or a combination of a CPU and a hardware chip.
  • the aforementioned hardware chip may be an application-specific integrated circuit (Application-Specific Integrated Circuit, ASIC), a programmable logic device (Programmable Logic Device, PLD) or a combination thereof.
  • the above-mentioned PLD can be a complex programmable logic device (Complex Programmable Logic Device, CPLD), a field-programmable gate array (Field-Programmable Gate Array, FPGA), a generic array logic (Generic Array Logic, GAL), or any combination thereof.
  • the computing chip 111 executes various types of digital storage instructions, which enable the first node 110 to provide a wide variety of services. Wherein, the number of computing chips 111 in the first node 110 may be one or more. FIG. 3 uses one computing chip 111 as an example for illustration, which is not specifically limited in this application.
  • the interconnect chip 112 may be an ASIC, a PLD or a combination thereof, and the PLD may be a CPLD, FPGA, GAL or any combination thereof, which is not specifically limited in this application.
  • the number of interconnection chips 112 in the first node 110 may be multiple.
  • FIG. 3 takes two interconnection chips 112 as an example (interconnection chip 1 and interconnection chip 2) for illustration, which is not specifically limited in this application.
  • the interconnection chip 112 is provided with a high-speed interconnection port 130, and the interconnection chip 112 can perform data communication with the second node 120 through the high-speed interconnection port 130.
  • the first high-speed interconnection port of the interconnection chip 112 and the second high-speed interconnection port on the second node 120 are connected through a cable 140, wherein the description of the high-speed interconnection port 130 and the cable 140 can refer to the foregoing embodiments in FIG. 1 and FIG. 2, and details are not repeated here.
  • the number of high-speed interconnection ports on each interconnection chip 112 may be one or more, and each first high-speed interconnection port is in a one-to-one correspondence with a second high-speed interconnection port on the second node, as shown in FIG. 3:
  • interconnection chip 1 includes high-speed interconnection port 2 and high-speed interconnection port 3
  • interconnection chip 2 includes high-speed interconnection port 4 and high-speed interconnection port 5.
  • the description of the second node 120 can refer to the foregoing embodiments in FIG. 1 and FIG. 2; an example is used for illustration, and this application does not make specific limitations.
  • the computing chip 111 is used to generate the above-mentioned data access request and send it to the interconnection chip 112.
  • the computing chip 111 can send the data access request to the interconnection chip 112 through the port 113 of the computing chip.
  • the interconnection chip 112 is used to send the data access request to the second node 120 through the above-mentioned cable 140.
  • deploying the interconnection chip 112 in the first node 110 enables the first node 110 to communicate with more second nodes 120: the greater the number of interconnection chips 112, the greater the number of high-speed interconnection ports 130, and the greater the number of second nodes 120 that can be connected to the first node 110, thereby expanding the memory expansion capability of the first node 110 and making the first node 110 applicable to more application scenarios.
  • the data communication between the computing chip 111 , the interconnection chip 112 and the second node 120 can implement an addressing function through an address decoder.
  • the address decoders in the computing chip 111 , the interconnection chip 112 and the second node 120 will be described in detail below with reference to FIG. 3 .
  • a first address decoder 210 is deployed in the computing chip 111, and the computing chip 111 is specifically used to generate a data access request, and according to the first destination address and the first The address decoder 210 determines the first port, and sends a data access request to the interconnection chip 112 through the first port, wherein the first address decoder 210 can record the correspondence between the destination address and the port of the computing chip.
  • a second address decoder 220 is deployed in the interconnection chip 112, and the interconnection chip 112 is specifically used to determine the first high-speed interconnection port according to the first destination address and the second address decoder 220, and send the data access request to the second node 120 through the first high-speed interconnection port, wherein the second address decoder 220 is used to record the correspondence between the destination address and the high-speed interconnection port.
  • a third address decoder 230 is deployed in the second node 120, and the second node 120 is specifically configured to determine the local physical address corresponding to the first destination address according to the first destination address and the third address decoder 230, wherein the third address decoder 230 is used to record the correspondence between the destination address and the local physical address.
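  • the three-stage addressing above can be summarized with a minimal sketch (the table contents, port names, and addresses are assumptions for illustration): the first address decoder maps a destination address to a computing-chip port, the second maps it to a high-speed interconnection port, and the third maps it to a local physical address on the second node.

```python
# Each dictionary stands in for one hardware address decoder; a real decoder
# matches address ranges rather than exact keys.
first_decoder  = {0x4000: "port0", 0x8000: "port1"}   # dest -> computing-chip port
second_decoder = {0x4000: "hsi2",  0x8000: "hsi4"}    # dest -> high-speed interconnection port
third_decoder  = {0x4000: 0x0000,  0x8000: 0x1000}    # dest -> local physical address

def route(dest_addr):
    chip_port = first_decoder[dest_addr]    # hop 1: computing chip -> interconnection chip
    hsi_port  = second_decoder[dest_addr]   # hop 2: interconnection chip -> second node
    local     = third_decoder[dest_addr]    # hop 3: second node converts the address
    return chip_port, hsi_port, local

print(route(0x8000))  # ('port1', 'hsi4', 4096)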
  • the data access system may further include a configuration node 150, and the configuration node 150 may configure the first address decoder 210, the second address decoder 220, and the third address decoder 230.
  • the configuration node 150 is used to obtain at least one local physical address of the memory of the second node from the second node 120, determine at least one corresponding destination address according to the at least one local physical address, and configure the third address decoder 230; the configuration node is also used to configure the second address decoder 220 according to the at least one destination address in combination with the high-speed interconnection port between the second node and the interconnection chip; the configuration node is also used to configure the first address decoder 210 according to the at least one destination address in combination with the chip port between the interconnection chip and the computing chip.
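  • the configuration flow above can be sketched as follows (function name, destination-address layout, and port identifiers are assumptions for illustration): the configuration node takes the local physical addresses exported by the second node, assigns a destination address to each, and programs all three decoders from that assignment.

```python
# Sketch of the configuration node: builds the three decoder tables from the
# local physical addresses provided by the second node.

def configure(local_addrs, hsi_port, chip_port, dest_base=0x8000, step=0x1000):
    third, second, first = {}, {}, {}
    for i, local in enumerate(local_addrs):
        dest = dest_base + i * step      # assign a destination address
        third[dest]  = local             # third decoder: dest -> local physical address
        second[dest] = hsi_port          # second decoder: dest -> high-speed port
        first[dest]  = chip_port         # first decoder: dest -> computing-chip port
    return first, second, third

first, second, third = configure([0x0, 0x1000], hsi_port=5, chip_port=2)
```

  • note that all three tables are keyed by the same destination addresses, which is what lets a single first destination address steer the request through the computing-chip port, the high-speed interconnection port, and finally the local physical address.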
  • the data access request generated by the computing chip is routed and addressed by the address decoders, and the data access request is transmitted to the CPU of the second node corresponding to the destination address for memory reading and writing, thereby avoiding the waiting time for network card queue preparation and improving the efficiency with which the first node 110 reads and writes the expanded memory; the delay can even reach the microsecond level (while Ethernet delay can reach the millisecond level), and the bandwidth can reach 400GB, providing higher bandwidth and lower delay than an RDMA network card whose bandwidth is only 100GB.
  • when the configuration node 150 obtains at least one local physical address of the memory of the second node from the second node 120, it may determine, according to the size of the memory of the second node 120 and in combination with business requirements, the local physical address of the extended memory that the second node 120 allocates for use by the first node 110.
  • the extended memory used by the first node 110 may be a part of the memory of the second node 120, and this part of the extended memory may be processed through a memory isolation technology so that the second node 120 cannot access it, improving the security of the data stored in the extended memory.
  • the whole or part of the first destination address can be matched with the addresses in the decoder, so as to improve the matching efficiency and thereby improve the efficiency of data access.
  • the first port may be determined according to the base address and length of the first destination address in the data access request.
  • the computing chip 111 is specifically configured to match the base address and length of the destination addresses recorded in the first address decoder 210 with the base address and length of the first destination address, and determine the first port corresponding to the matched destination address.
  • the interconnection chip 112 is specifically used to match the base address and length of the destination addresses recorded in the second address decoder 220 with the base address and length of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address, which is not detailed here.
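  • base-address-and-length matching can be sketched as follows (the entry layout and values are assumptions for illustration): each decoder entry records a base address, a length, and a port, and a destination address matches an entry when it falls inside the half-open range starting at the base address.

```python
# Sketch of base-address-and-length matching: a destination address dest_addr
# matches an entry (base, length, port) when base <= dest_addr < base + length.

def match_port(entries, dest_addr):
    for base, length, port in entries:
        if base <= dest_addr < base + length:
            return port
    return None  # no entry covers this destination address

entries = [(0x0000, 0x4000, 0),   # port 0 serves [0x0000, 0x4000)
           (0x4000, 0x4000, 1),   # port 1 serves [0x4000, 0x8000)
           (0x8000, 0x8000, 2)]   # port 2 serves [0x8000, 0x10000)
print(match_port(entries, 0x9000))  # 2
```

  • the same routine applies at both hops: the first address decoder returns a computing-chip port and the second address decoder returns a high-speed interconnection port, using their own entry tables.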
  • the first port may be determined according to the upper address of the first destination address.
  • the computing chip 111 is specifically used to match the high-order address of the destination addresses recorded in the first address decoder with the high-order address of the first destination address, and determine the first port corresponding to the matched destination address, wherein the number of bits of the high-order address is determined according to the memory size of the second node.
  • the interconnection chip 112 is specifically used to match the high-order address of the destination addresses recorded in the second address decoder with the high-order address of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address, which is not detailed here.
  • the number of digits of the high-order address can be determined according to the expanded memory size of the second node 120 connected to the high-speed interconnection port.
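  • high-order-address matching can be sketched as follows (the bit width and addresses are assumptions for illustration): if each port serves an aligned region of, say, 1 GiB (2**30 bytes) of expanded memory, everything above bit 30 of the destination address is the high-order address, and one shift plus one table lookup picks the port.

```python
# Sketch of high-order-address matching. REGION_BITS is an assumed value:
# one 1 GiB aligned region of expanded memory per high-speed port.

REGION_BITS = 30

def high_bits(addr):
    return addr >> REGION_BITS   # keep only the high-order address

decoder = {high_bits(0x40000000): 5,   # high-order address -> high-speed port 5
           high_bits(0x80000000): 6}   # high-order address -> high-speed port 6

def match_high(dest_addr):
    return decoder.get(high_bits(dest_addr))

print(match_high(0x40001234))  # 5
```

  • compared with the base-address-and-length method, only part of the destination address participates in the comparison here, which is the matching-efficiency gain the text describes.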
  • since the expanded memory provided by the second node 120 has multiple physical addresses, the ports corresponding to some of the destination addresses recorded in the first and second address decoders may be the same, and the destination addresses corresponding to the same port are located in the same memory. These destination addresses have the same base address and length, or the same high-order address, so the first port corresponding to the destination address can be determined by matching the base address and length, or by matching the high-order address.
  • for example, the second address decoder can record that destination addresses 1-10 all correspond to high-speed interconnection port 5; if the first destination address is any one of destination addresses 1-10, the corresponding high-speed interconnection port is high-speed interconnection port 5. The destination addresses corresponding to the same high-speed interconnection port are also addresses of the extended memory of the second node 4, so these destination addresses have the same base address and length, or the same high-order address.
  • the partial address of the first destination address can be matched with the partial address of the destination address in the second address decoder, thereby improving the matching efficiency.
  • FIG. 4 is an example diagram of the first address decoder 210.
  • the first address decoder 210 may include a plurality of destination addresses, and destination addresses with the same base address and length correspond to the same port of the computing chip 111. Assuming that the base address and length of the first destination address are as shown in FIG. 4, the base address and length of the first destination address can be matched with the base address and length of each destination address in the first address decoder 210 to determine that the first port corresponding to the matched destination address is port 2, and the data access request can then be transmitted to the interconnection chip 112 through port 2.
  • similarly, the high-speed interconnection port for transmitting the data access request is determined according to the base address and length of the destination addresses recorded by the second address decoder 220, which is not repeated here. It should be understood that FIG. 4 matches the first destination address based on the base address and length; the above-mentioned matching method based on the high-order address is similar and is not further illustrated here.
  • the data access system shown in FIG. 1 can also implement routing addressing for data access requests through the above address decoder.
  • the first node 110 may be equipped with a second address decoder 220, and the second node 120 may be equipped with a third address decoder 230. For the data access request generated by the first node 110, the first high-speed interconnection port corresponding to the first destination address may be determined according to the correspondence between the high-speed interconnection ports and the destination addresses recorded in the second address decoder 220, and the data access request is then sent to the second node 120 through the first high-speed interconnection port, which is not described in detail here.
  • in this case, the high-speed interconnection port can be deployed on the processor in the first node 110; in simple terms, the processor of the first node 110 and the processor of the second node 120 are directly connected through a cable.
  • if the data access request is to read data in the memory of the second node 120, after the second node 120 processes the data access request, it can combine the first, second, and third address decoders according to the source address in the data access request and return the read data to the first node 110 through the original path, which is not repeated here.
  • in summary, the high-speed interconnection port of the first node is connected to the high-speed interconnection port of the second node through a cable, and the first node can implement the addressing function in combination with the address decoders, so that the data access request is sent to the memory of the second node corresponding to the first destination address, realizing the memory expansion of the first node. This method does not need to deploy additional network cards or routers and does not need to wait for the preparation time of the network card queue unit, so that the first node can access the memory of the second node with high efficiency and low latency.
  • the number of second nodes can be increased to increase the expanded memory capacity of the first node, so that the scalable memory capacity of the first node is large and more services can be handled in application scenarios.
  • Fig. 5 is a data access method provided by the present application, which can be applied to the data access system shown in Fig. 1 to Fig. 4, and the method may include the following steps:
  • Step S510: the first node generates a data access request, where the data access request is used to request data in the memory of the second node.
  • Step S520: the first node sends the data access request to the second node through a cable.
  • the first high-speed interconnection port of the first node is connected to the second high-speed interconnection port of the second node through a cable; the description of the high-speed interconnection port and the cable can refer to the embodiments in FIGS. 1 to 4 and is not repeated here.
  • the first node may include a second address decoder, which is used to record the correspondence between the destination address and the high-speed interconnection port, and the first node may determine the first high-speed interconnection port corresponding to the first destination address according to the first destination address in the data access request and the second address decoder, and then send the data access request to the second node through the first high-speed interconnection port.
  • the specific description of the second address decoder can refer to the embodiments shown in FIG. 1 to FIG. 4 , which will not be repeated here.
  • Step S530: the second node converts the first destination address in the data access request into a local physical address corresponding to the first destination address, and accesses the data in the memory of the second node according to the local physical address.
  • the second node may include a third address decoder, which is used to record the correspondence between the destination address and the local physical address, and the second node may determine the local physical address corresponding to the first destination address according to the first destination address in the data access request and the third address decoder, and then access the data in the memory of the second node according to the local physical address.
  • the specific description of the third address decoder can refer to the embodiments shown in FIG. 1 to FIG. 4 , and details are not repeated here.
  • the first node may include a computing chip and an interconnection chip, and the computing chip is connected to the interconnection chip through a port.
  • the descriptions of the interconnection chip, the ports, and the bus may refer to the embodiments shown in FIG. 1 to FIG. 4 and are not repeated here.
  • the computing chip may perform step S510 to generate the above data access request, and send the data access request to the interconnection chip, and the computing chip may send the data access request to the interconnection chip through a port of the computing chip.
  • the interconnection chip sends the data access request to the second node through the first high-speed interconnection port.
  • deploying interconnection chips in the first node enables the first node to communicate with more second nodes: the greater the number of interconnection chips, the greater the number of high-speed interconnection ports that can be deployed in the first node, so that the number of second nodes connected to the first node increases, thereby expanding the memory expansion capability of the first node and making the first node applicable to more application scenarios.
  • the computing chip is equipped with a first address decoder. After the computing chip generates a data access request, it determines the first port according to the first destination address in the data access request and the first address decoder, and sends the data access request to the interconnection chip through the first port, wherein the first address decoder can record the correspondence between the destination address and the port of the computing chip.
  • a second address decoder is deployed in the interconnection chip, and the interconnection chip can determine the first high-speed interconnection port according to the first destination address and the second address decoder, and communicate to the second node through the first high-speed interconnection port A data access request is sent, wherein the second address decoder is used to record the corresponding relationship between the destination address and the high-speed interconnection port.
  • the data access system may further include a configuration node, and the configuration node may configure the first address decoder, the second address decoder, and the third address decoder before the first node generates a data access request.
  • the configuration node obtains at least one local physical address of the memory of the second node from the second node, determines at least one corresponding destination address according to the at least one local physical address, and configures the third address decoder; it configures the second address decoder according to the at least one destination address in combination with the high-speed interconnection port between the second node and the interconnection chip; and it configures the first address decoder according to the at least one destination address in combination with the chip port between the interconnection chip and the computing chip.
  • the data access request generated by the computing chip is routed and addressed by the address decoders, and the data access request is transmitted to the CPU of the second node corresponding to the destination address for memory reading and writing, thereby avoiding the waiting time for network card queue preparation and improving the efficiency with which the first node reads and writes the expanded memory; the delay can even reach the microsecond level (while Ethernet delay can reach the millisecond level), and the bandwidth can reach 400GB, providing higher bandwidth and lower delay than an RDMA network card whose bandwidth is only 100GB.
  • when the configuration node acquires at least one local physical address of the memory of the second node from the second node, it may determine, according to the size of the memory of the second node and in combination with business requirements, the local physical address of the extended memory that the second node allocates for use by the first node.
  • the extended memory used by the first node can be part of the memory of the second node, and this part of the extended memory can be processed through memory isolation technology so that the second node cannot access it, thereby improving the security of the data stored in the extended memory.
  • the whole or part of the first destination address can be matched with the addresses in the decoder, thereby improving the matching efficiency and thus the efficiency of data access.
  • the first port may be determined according to the base address and length of the first destination address in the data access request.
  • the computing chip can match the base address and length of the destination addresses recorded in the first address decoder with the base address and length of the first destination address, and determine the first port corresponding to the matched destination address; the interconnection chip can match the base address and length of the destination addresses recorded in the second address decoder with the base address and length of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address, which is not detailed here.
  • the first port may be determined according to the upper address of the first destination address.
  • the computing chip can match the high-order address of the destination addresses recorded in the first address decoder with the high-order address of the first destination address, and determine the first port corresponding to the matched destination address, wherein the number of digits of the high-order address is determined according to the memory size of the second node; the interconnection chip can match the high-order address of the destination addresses recorded in the second address decoder with the high-order address of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address, which is not detailed here.
  • the ports corresponding to some of the destination addresses recorded in the first and second address decoders may be the same, and the destination addresses corresponding to the same port are located in the same memory; these destination addresses have the same base address and length, or the same high-order address, so the first port corresponding to the destination address can be determined by matching the base address and length, or by matching the high-order address, thus improving the matching efficiency.
  • if the data access request is to read data in the memory of the second node, after the second node processes the data access request, it can combine the first, second, and third address decoders according to the source address in the data access request and return the read data to the first node through the original route, which is not repeated here.
  • in summary, the high-speed interconnection port of the first node is connected to the high-speed interconnection port of the second node through a cable, and the first node can implement the addressing function in combination with the address decoders, so that the data access request is sent to the memory of the second node corresponding to the first destination address, realizing the memory expansion of the first node. This method does not need to deploy additional network cards or routers and does not need to wait for the preparation time of the network card queue unit, so that the first node can access the memory of the second node with high efficiency and low latency.
  • the number of second nodes can be increased to increase the expanded memory capacity of the first node, so that the scalable memory capacity of the first node is large and more services can be handled in application scenarios.
  • FIG. 6 is a schematic structural diagram of a computing node 600 provided in the present application.
  • the computing node 600 may be the first node 110 in the aforementioned content.
  • the computing node 600 may include a computing chip 111 and an interconnection chip 112, wherein the computing chip 111 It may include a generating unit 1111 , a first matching unit 1112 and a second sending unit 1113 , and the interconnection chip 112 may include a first sending unit 1121 and a second matching unit 1122 .
  • the generating unit 1111 is configured to generate a data access request, wherein the data access request is used to request data in the memory of the second node, and specifically step S510 in the embodiment of FIG. 5 can be executed;
  • the first sending unit 1121 is configured to send the data access request to the second node through the cable, so that the second node converts the first destination address in the data access request into a local physical address corresponding to the first destination address, and according to the local The physical address accesses the data in the memory of the second node, and specifically step S520 in the embodiment of FIG. 5 can be executed.
  • the first high-speed interconnection port of the interconnection chip 112 is connected to the second high-speed interconnection port of the processor in the second node through a cable, and the generating unit 1111 is configured to generate a data access request through the computing chip 111;
  • the second sending unit 1113 is configured to send the data access request to the interconnection chip 112 through the computing chip 111 ;
  • the first sending unit 1121 is configured to send the data access request to the second node through the interconnection chip 112 through a cable.
  • the computing chip 111 is connected to the interconnection chip 112 through a port, and the computing chip 111 includes a first address decoder. The first matching unit 1112 is used to determine, through the computing chip 111, the first port according to the first destination address in the data access request and the first address decoder, wherein the first address decoder is used to record the correspondence between the destination address and the port of the computing chip; the second sending unit 1113 is used to send, through the computing chip 111, the data access request to the interconnection chip 112 through the first port.
  • the interconnection chip 112 includes a second address decoder, and the second matching unit 1122 is configured to determine, through the interconnection chip 112, the first high-speed interconnection port according to the first destination address and the second address decoder, wherein the second address decoder is used to record the correspondence between the destination address and the high-speed interconnection port; the first sending unit 1121 is used to send, through the interconnection chip 112, the data access request to the second node through the first high-speed interconnection port.
  • the first matching unit 1112 is configured to match, through the computing chip 111, the base address and length of the destination addresses recorded in the first address decoder with the base address and length of the first destination address, and determine the first port corresponding to the matched destination address; the second matching unit 1122 is configured to match, through the interconnection chip 112, the base address and length of the destination addresses recorded in the second address decoder with the base address and length of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address.
  • the first matching unit 1112 is configured to match, through the computing chip 111, the upper address of the destination addresses recorded in the first address decoder with the upper address of the first destination address, and determine the first port corresponding to the matched destination address, where the number of bits of the upper address is determined according to the memory size of the second node; the second matching unit 1122 is configured to match, through the interconnection chip 112, the base address and length of the destination addresses recorded in the second address decoder with the base address and length of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address.
  • the first high-speed interconnect port and the second high-speed interconnect port are high-speed serial bus ports, and the first port is a high-speed serial bus port.
  • the first node 110 may be a physical server, such as an X86 server or an ARM server; it may also be a virtual machine (VM) implemented based on a general-purpose physical server combined with network functions virtualization (NFV) technology. A virtual machine is a complete computer system that is simulated by software, has complete hardware system functions, and runs in a completely isolated environment, such as a virtual device in cloud computing, which is not specifically limited in this application. The first node 110 may also be a server cluster composed of multiple physical servers or virtual machines.
  • FIG. 7 is a schematic structural diagram of a storage node 700 provided in the present application.
  • the storage node 700 may be the second node 120 in the embodiments of FIGS.
  • the receiving unit 121 is configured to receive a data access request, the data access request is generated by the first node, and the data access request is sent by the first node through a cable;
  • the conversion unit 122 is configured to convert the first destination address in the data access request into a local physical address corresponding to the first destination address, and access data in the memory of the second node according to the local physical address.
  • the second node 120 includes a third address decoder; the conversion unit 122 is configured to determine the local physical address corresponding to the first destination address according to the first destination address and the third address decoder, where the third address decoder is used to record the correspondence between destination addresses and local physical addresses.
  • the first high-speed interconnection port of the first node is connected to the second high-speed interconnection port of the processor in the second node through a cable, and the first high-speed interconnection port and the second high-speed interconnection port are high-speed serial bus ports.
  • the second node 120 may be a physical server, such as an X86 server or an ARM server; it may also be a virtual machine (VM) implemented based on a general-purpose physical server combined with network functions virtualization (NFV) technology. A virtual machine is a complete computer system that is simulated by software, has complete hardware system functions, and runs in a completely isolated environment, such as a virtual device in cloud computing, which is not specifically limited in this application. The second node 120 may also be a server cluster composed of multiple physical servers or virtual machines.
  • the high-speed interconnection port of the first node is connected to the high-speed interconnection port of the second node through a cable, and the first node can implement an addressing function in combination with address decoders, so as to send the data access request to the memory of the second node corresponding to the first destination address and realize memory expansion of the first node.
  • The first node accesses the memory of the second node with high efficiency and low delay.
  • By adding high-speed interconnection ports, the number of second nodes can be increased, which increases the memory capacity that the first node can expand to; the scalable memory capacity of the first node is therefore very large, enabling it to handle services in more application scenarios.
  • FIG. 8 is a schematic structural diagram of a computing device provided by the present application.
  • the computing device 800 may be the first node 110 or the second node 120 in the embodiments of FIG. 1 to FIG. 7; the computing device may be a physical server, a virtual machine, or a server cluster, and may also be a chip (system) or another component that can be installed in a physical server or a virtual machine, which is not limited in this application.
  • the computing device 800 includes a processor 801, a memory 802, and a communication interface 803, where the processor 801, the memory 802, and the communication interface 803 communicate through a bus 805, or other means such as wireless transmission.
  • the processor 801 may be composed of at least one general-purpose processor, such as a CPU, an NPU, or a combination of a CPU and a hardware chip.
  • the aforementioned hardware chip may be an application-specific integrated circuit (Application-Specific Integrated Circuit, ASIC), a programmable logic device (Programmable Logic Device, PLD) or a combination thereof.
  • the above-mentioned PLD can be a complex programmable logic device (Complex Programmable Logic Device, CPLD), a field programmable logic gate array (Field-Programmable Gate Array, FPGA), a general array logic (Generic Array Logic, GAL) or any combination thereof.
  • Processor 801 executes various types of digitally stored instructions, such as software or firmware programs stored in memory 802, which enable computing device 800 to provide a wide variety of services.
  • the above-mentioned processor 801 may be a computing chip or an interconnection chip in the first node mentioned above, or may be a processor chip in the second node, which is not specifically limited in this application.
  • the processor 801 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 8 .
  • the computing device 800 may also include multiple processors, such as the processor 801 and the processor 804 shown in FIG. 8 .
  • processors can be a single-core processor (single-CPU) or a multi-core processor (multi-CPU).
  • a processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (eg, computer program instructions).
  • the memory 802 is used to store program code, which is executed under the control of the processor 801, so as to execute the processing steps in any of the above embodiments in FIGS. 1-7.
  • One or more software modules may be included in the program code.
  • When the computing device 800 is the first node 110, the above one or more software modules may be the generating unit 1111, the first matching unit 1112, the second sending unit 1113, the second matching unit 1122, and the first sending unit 1121 in the foregoing embodiment; for the specific implementation, reference may be made to the corresponding method embodiment, and details are not repeated here.
  • When the computing device 800 is the second node 120, the above software modules may be the receiving unit 121 and the conversion unit 122; for the specific implementation, reference may be made to the method embodiment in FIG. 6, and details are not repeated here.
  • the memory 802 may include read-only memory and random-access memory, and provides instructions and data to the processor 801 .
  • Memory 802 may also include non-volatile random access memory.
  • memory 802 may also store device type information.
  • Memory 802 can be volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • Volatile memory can be random access memory (RAM), which acts as external cache memory.
  • By way of example rather than limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct rambus random access memory (DR RAM).
  • It may also be a hard disk, a USB flash drive (universal serial bus, USB), a flash memory, a secure digital memory card (SD card), a memory stick, or the like.
  • The hard disk may be a hard disk drive (HDD), a solid state disk (SSD), or a mechanical hard disk.
  • the communication interface 803 may be a wired interface (such as an Ethernet interface), an internal interface (such as a Peripheral Component Interconnect express (PCIe) bus interface), or a wireless interface (such as a cellular network interface or a wireless local area network interface), and is used to communicate with other servers or modules.
  • the communication interface 803 may be used to receive a message, so that the processor 801 or the processor 804 can process the message.
  • the bus 805 may be a Peripheral Component Interconnect Express (PCIe) bus, an extended industry standard architecture (EISA) bus, a unified bus (Ubus or UB), a compute express link (CXL) bus, a cache coherent interconnect for accelerators (CCIX) bus, or the like.
  • bus 805 may also include a power bus, a control bus, and a status signal bus. However, for clarity of illustration, the various buses are labeled as bus 805 in the figure.
  • FIG. 8 is only a possible implementation manner of the embodiment of the present application.
  • the computing device 800 may include more or fewer components, which is not limited here.
  • computing device 800 shown in FIG. 8 may also be a computer cluster composed of at least one physical server.
  • An embodiment of the present application provides a chip, which can be used in a server with a processor of the X86 architecture (also called an X86 server), a server with a processor of the ARM architecture (also called an ARM server), and the like.
  • the chip may include the above-mentioned device or logic circuit, and when the chip runs on the server, the server is made to execute the data access method described in the above method embodiment.
  • the chip may be a computing chip or an interconnection chip in the first node in the foregoing content, or may be a processor chip in the second node.
  • An embodiment of the present application provides a main board, which may also be called a printed circuit board (PCB).
  • the main board includes a processor, and the processor is used to execute program codes to implement the data access method described in the above method embodiments.
  • the mainboard may further include a memory, which is used to store the above program codes for execution by the processor.
  • An embodiment of the present application provides a computer-readable storage medium, where computer instructions are stored in the computer-readable storage medium; when the computer instructions are run on a computer, the computer is made to perform the data access method described in the foregoing method embodiments.
  • An embodiment of the present application provides a computer program product containing instructions, including a computer program or instructions; when the computer program or instructions are run on a computer, the computer is made to execute the data access method described in the above method embodiments.
  • the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware or other arbitrary combinations.
  • the above-described embodiments may be implemented in whole or in part in the form of computer program products.
  • a computer program product comprises at least one computer instruction.
  • When the computer program instructions are loaded or executed on a computer, the processes or functions according to the embodiments of the present invention are generated in whole or in part.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • A computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage node such as a server or a data center that includes at least one set of available media. The available media may be magnetic media (e.g., floppy disks, hard disks, tapes), optical media (e.g., high-density digital video discs (DVD)), or semiconductor media.
  • the semiconductor media may be SSDs.

Abstract

A data access system and method, and a related device. The system comprises a first node (110) and a second node (120); the first node (110) is connected to the second node (120) by means of a cable (140), and is used for generating a data access request, wherein the data access request is used for requesting data in a memory of the second node (120); the first node (110) is used for sending the data access request to the second node (120) by means of the cable (140); and the second node (120) is used for converting a first destination address in the data access request into a local physical address corresponding to the first destination address, and accessing data in the memory of the second node (120) according to the local physical address. The system does not need to wait for the preparation time of a network card queue unit, such that the first node (110) has high efficiency and low time delay in accessing the memory of the second node (120), thereby improving the data processing efficiency of the first node (110).

Description

A data access system, method, and related device

This application claims priority to the Chinese patent application No. 202111160189.7, filed with the China Patent Office on September 30, 2021 and entitled "A data access system, method and related equipment", which is incorporated herein by reference in its entirety.
Technical Field

The present application relates to the field of storage, and in particular, to a data access system, method, and related device.
Background

With the continuous development of science and technology, the massive data generated in the era of information explosion has penetrated into every industry and business function of today, and the fields of big data and artificial intelligence (AI) have developed accordingly, becoming two very popular research directions.

When a computing node performs data processing (for example, big data or AI tasks), it often needs a large memory capacity to store data. Usually, the data can be placed in a distributed manner in the memory of multiple storage nodes, and the computing node can read the data in the memory of the storage nodes through the remote direct memory access (RDMA) protocol, thereby expanding its memory capacity.

However, under the RDMA protocol, communication between the computing node and the storage node is implemented through network cards, and data is transmitted between the two network cards through network card queues, so each time the computing node reads data, it must put the data read request into a network card queue. As a result, the data reading process consumes a large amount of time on queue unit preparation, and in some cases the queue unit preparation time is even longer than the data transmission time, leading to low data access efficiency and high network delay for the computing node and affecting the processing efficiency of big data or AI tasks.
Summary

The present application provides a data access system, method, and related device, which are used to solve the problems of low access efficiency and high network delay when a computing node accesses the memory of a storage node.

In a first aspect, a data access system is provided. The data access system includes a first node and a second node, and the first node is connected to the second node through a cable. The first node is configured to generate a data access request, where the data access request is used to request data in the memory of the second node; the first node is configured to send the data access request to the second node through the cable; and the second node is configured to convert a first destination address in the data access request into a local physical address corresponding to the first destination address, and access the data in the memory of the second node according to the local physical address.

In the system described in the first aspect, the first node and the second node are connected through a cable, and the communication between them does not pass through a network card or a routing device, so the first node does not need to additionally wait for the preparation time of a network card queue unit when accessing the memory of the second node, which improves the efficiency and reduces the delay of the first node accessing the memory of the second node.
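As an illustration only, the interaction of the first aspect can be sketched in a few lines of code (the class and method names, the address window, and the in-process "cable" are all hypothetical; the patent describes hardware, not any particular software realization):

```python
# Hypothetical end-to-end sketch of the first aspect: the first node issues a
# data access request for a destination address over a direct cable; the
# second node converts that address into a local physical address and serves
# the access from its own memory. No network card queue is involved.

class SecondNode:
    def __init__(self, base_address, memory_size):
        self.base_address = base_address      # start of its destination window
        self.memory = bytearray(memory_size)  # the second node's local memory

    def handle(self, destination_address, length):
        # Convert the first destination address to a local physical address.
        local_physical = destination_address - self.base_address
        return bytes(self.memory[local_physical:local_physical + length])

class FirstNode:
    def __init__(self, cable):
        self.cable = cable  # stands in for the direct cable connection

    def read(self, destination_address, length):
        # Send the data access request straight to the second node.
        return self.cable.handle(destination_address, length)

second = SecondNode(base_address=0x4000_0000, memory_size=16)
second.memory[0:4] = b"data"
first = FirstNode(cable=second)
```

Here `first.read(0x4000_0000, 4)` returns the bytes stored at local physical address 0 of the second node's memory.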
In a possible implementation, the first node includes a computing chip and an interconnection chip, where the first high-speed interconnection port of the interconnection chip is connected to the second high-speed interconnection port of the processor in the second node through a cable, and the computing chip is connected to the interconnection chip through a port. The computing chip is configured to generate a data access request and send the data access request to the interconnection chip, and the interconnection chip is configured to send the data access request to the second node through the cable.

The computing chip may be composed of at least one general-purpose processor, for example, a CPU, an NPU, or a combination of a CPU and a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof. The number of computing chips in the first node may be one or more, which is not specifically limited in this application.

The interconnection chip may be an ASIC, a PLD, or a combination thereof, and the PLD may be a CPLD, an FPGA, a GAL, or any combination thereof, which is not specifically limited in this application. The number of interconnection chips in the first node may be more than one, which is not specifically limited in this application. The interconnection chip is provided with high-speed interconnection ports, through which it can perform data communication with the second node; the first high-speed interconnection port of the interconnection chip is connected to the second high-speed interconnection port on the second node through a cable. It should be noted that the number of first high-speed interconnection ports on each interconnection chip may be one or more, and each first high-speed interconnection port is in one-to-one correspondence with a second high-speed interconnection port on the second node.

The high-speed interconnection port may be a high-speed serial bus port, such as a SERDES bus port, and the cable may be an electrical cable, an optical fiber, a twisted pair, or any other cable capable of transmitting data; this application does not specifically limit the cable. The number of high-speed interconnection ports on the first node may be one or more, and the first high-speed interconnection ports on the first node are in one-to-one correspondence with the second high-speed interconnection ports on the second node.

The port of the computing chip may be a high-speed serial bus port, such as a SERDES bus port. The computing chip may be connected to the interconnection chip through a bus, and the bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The computing chip, the interconnection chip, the port of the computing chip, and the bus may be printed together on a circuit board during manufacturing. In a specific implementation, the number of ports of the computing chip may be one or more, which is not limited in this application.
By implementing the above implementation and deploying interconnection chips in the first node, the first node can communicate with more second nodes: the more interconnection chips there are, the more high-speed interconnection ports can be deployed in the first node, and the more second nodes can be connected to the first node. This expands the memory expansion capability of the first node, so that the first node can be applied to more application scenarios.

In another possible implementation, the data communication among the computing chip and the interconnection chip of the first node and the second node may implement an addressing function through address decoders.
Optionally, the computing chip includes a first address decoder, and the computing chip is specifically configured to: generate a data access request, determine the first port according to the first destination address in the data access request and the first address decoder, and send the data access request to the interconnection chip through the first port, where the first address decoder is used to record the correspondence between destination addresses and ports of the computing chip.

Optionally, a first address decoder is deployed in the computing chip, and the computing chip is specifically configured to generate a data access request, determine the first port according to the first destination address in the data access request and the first address decoder, and send the data access request to the interconnection chip through the first port, where the first address decoder can record the correspondence between destination addresses and ports of the computing chip.
Optionally, a second address decoder is deployed in the interconnection chip, and the interconnection chip is specifically configured to determine the first high-speed interconnection port according to the first destination address and the second address decoder, and send the data access request to the second node through the first high-speed interconnection port, where the second address decoder is used to record the correspondence between destination addresses and high-speed interconnection ports.
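The two routing stages described in the optional implementations above — the first address decoder selecting a port of the computing chip, and the second address decoder selecting a high-speed interconnection port — can be sketched as two table lookups. All table contents and names below are invented for illustration:

```python
# Hypothetical two-stage lookup: the first address decoder (computing chip)
# picks the port toward the interconnection chip, and the second address
# decoder (interconnection chip) picks the high-speed interconnection port.
# Every table entry is (base, length, target).

def lookup(decoder, address):
    for base, length, target in decoder:
        if base <= address < base + length:
            return target
    return None  # the address is not covered by this decoder

first_decoder = [(0x4000_0000, 0x2000_0000, "chip_port_0")]
second_decoder = [(0x4000_0000, 0x1000_0000, "hs_port_0"),
                  (0x5000_0000, 0x1000_0000, "hs_port_1")]

def route(destination_address):
    chip_port = lookup(first_decoder, destination_address)
    hs_port = lookup(second_decoder, destination_address)
    return chip_port, hs_port
```

For example, a destination address of `0x5800_0000` falls in the first decoder's single window and in the second decoder's second window, so it is routed via `chip_port_0` and then `hs_port_1`.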
Optionally, a third address decoder is deployed in the second node, and the second node is specifically configured to determine, according to the first destination address and the third address decoder, the local physical address corresponding to the first destination address, where the third address decoder is used to record the correspondence between destination addresses and local physical addresses. The correspondence recorded by the third address decoder may be: local physical address = destination address - base address, where the base address is the starting address of an address segment (also called the first address or segment address), and destination addresses belonging to the same address segment share the same base address.
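The relation local physical address = destination address - base address can be checked with a small worked example (the numeric values are chosen arbitrarily for illustration):

```python
# Worked example of: local physical address = destination address - base address
base_address = 0x4000_0000          # starting address of the address segment
destination_address = 0x4000_2A00   # a destination address inside that segment
local_physical_address = destination_address - base_address
print(hex(local_physical_address))  # prints 0x2a00
```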
Optionally, the data access system may further include a configuration node, and the configuration node may configure the first address decoder, the second address decoder, and the third address decoder. Specifically, the configuration node is configured to obtain at least one local physical address of the memory of the second node from the second node, determine at least one corresponding destination address according to the at least one local physical address, and configure the third address decoder. The configuration node is further configured to configure the second address decoder according to the at least one destination address in combination with the high-speed interconnection port between the second node and the interconnection chip, and to configure the first address decoder according to the at least one destination address in combination with the chip port between the interconnection chip and the computing chip.
In a specific implementation, when the configuration node obtains at least one local physical address of the memory of the second node from the second node, it may determine, according to the size of the memory of the second node and in combination with service requirements, the local physical addresses of the extended memory that the second node carves out for use by the first node. Optionally, the extended memory used by the first node may be part of the memory of the second node, and that part of the extended memory may be processed through memory isolation technology so that the second node cannot access it, improving the security of the data stored in the extended memory.
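One way a configuration node might derive the three decoder tables from the address ranges contributed by second nodes is sketched below. The entry layouts, the fixed remote window base, and all names are assumptions made for this sketch; the patent does not prescribe a concrete algorithm:

```python
# Hypothetical configuration flow: the configuration node collects, for each
# second node, the (local_base, length) of the extended memory it carves out,
# assigns that range a destination-address window, and derives entries for
# the first, second, and third address decoders.

REMOTE_WINDOW_BASE = 0x4000_0000  # assumed start of remote-memory addresses

def configure(contributions):
    """contributions: list of (node_id, local_base, length, hs_port, chip_port)."""
    first_dec, second_dec, third_dec = [], [], []
    next_base = REMOTE_WINDOW_BASE
    for node_id, local_base, length, hs_port, chip_port in contributions:
        dest_base = next_base
        first_dec.append((dest_base, length, chip_port))   # destination -> chip port
        second_dec.append((dest_base, length, hs_port))    # destination -> high-speed port
        third_dec.append((node_id, dest_base, length, local_base))  # destination -> local physical
        next_base += length
    return first_dec, second_dec, third_dec
```

With two nodes each contributing 256 MiB, the second node's window would begin immediately after the first node's window in the destination address space.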
By implementing the above implementation, with the first, second, and third address decoders configured by the configuration node, the data access request generated by the computing chip is routed and addressed by the address decoders and transmitted to the CPU of the second node corresponding to the destination address for memory reads and writes. This avoids the waiting time of network card queue preparation and improves the efficiency with which the first node reads and writes the extended memory: the delay can even reach the microsecond level (Ethernet delay is on the millisecond level), and the bandwidth can reach 400GB, offering higher bandwidth and lower delay than an RDMA network card whose bandwidth is only 100GB.
In another possible implementation, when the first destination address is matched against the first address decoder and the second address decoder, the complete first destination address or part of it may be matched with the addresses in the decoders, thereby improving the matching efficiency and, in turn, the efficiency of data access.
可选地,可根据数据访问请求中第一目的地址的基地址和长度确定第一端口。具体地,计算芯片具体用于将第一地址译码器中记录的目的地址的基地址和长度与第一目的地址的基地址和长度进行匹配,确定匹配后的目的地址对应的第一端口。同理,互联芯片具体用于将第二地址译码器中记录的目的地址的基地址和长度与第二目的地址的基地址和长度进行匹配,确定匹配后的目的地址对应的第一高速互联端口。这里不再展开赘述。Optionally, the first port may be determined according to the base address and length of the first destination address in the data access request. Specifically, the computing chip is configured to match the base address and length of the destination addresses recorded in the first address decoder against the base address and length of the first destination address, and to determine the first port corresponding to the matched destination address. Similarly, the interconnection chip is configured to match the base address and length of the destination addresses recorded in the second address decoder against the base address and length of the second destination address, and to determine the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
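The base-address-and-length matching described above can be sketched as a simple range lookup (a hedged illustration only; the table layout, window sizes, and names below are assumptions, not part of the application):

```python
# Hypothetical sketch of base-address/length matching in an address decoder.
# Entry layout and names are illustrative assumptions, not the application's design.

DECODER = [
    # (base address, length in bytes, port id)
    (0x0000_0000_0000, 0x100_0000_0000, "port0"),  # first 1 TB window
    (0x100_0000_0000, 0x100_0000_0000, "port1"),   # second 1 TB window
]

def match_port(dest_addr: int) -> str:
    """Return the port whose [base, base + length) window contains dest_addr."""
    for base, length, port in DECODER:
        if base <= dest_addr < base + length:
            return port
    raise LookupError(f"no decoder entry for address {dest_addr:#x}")
```

Each decoder entry here models one expanded-memory window; a real decoder would perform this comparison in hardware rather than by iterating over a list.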
可选地,可根据第一目的地址的高位地址确定第一端口。计算芯片具体用于将第一地址译码器中记录的目的地址的高位地址与第一目的地址的高位地址进行匹配,确定匹配后的目的地址对应的第一端口,其中,高位地址的位数是根据第二节点的内存大小确定的。同理,互联芯片具体用于将第二地址译码器中记录的目的地址的高位地址与第一目的地址的高位地址进行匹配,确定匹配后的目的地址对应的第一高速互联端口。这里不再展开赘述。Optionally, the first port may be determined according to the high-order address of the first destination address. The computing chip is configured to match the high-order addresses of the destination addresses recorded in the first address decoder against the high-order address of the first destination address, and to determine the first port corresponding to the matched destination address, where the number of bits of the high-order address is determined according to the memory size of the second node. Similarly, the interconnection chip is configured to match the high-order addresses of the destination addresses recorded in the second address decoder against the high-order address of the first destination address, and to determine the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
举例来说,假设目的地址总长度为64bit,若一个高速互联端口对应的第二节点120的内存为1T,那么这1T内存的目的地址中,后30bit的地址不同,那么高位地址的位数可以是64-30=34bit,简单来说,位于同一个内存的目的地址前34bit是相同的,后面30bit不同,因此,根据高速互联端口所连接的第二节点120的拓展内存大小,可以确定高位地址的位数。For example, assume the total length of a destination address is 64 bits. If the memory of the second node 120 corresponding to one high-speed interconnection port is 1T, then among the destination addresses of that 1T of memory the last 30 bits differ, so the number of high-order address bits can be 64-30=34 bits. Simply put, destination addresses located in the same memory share the same first 34 bits and differ in the last 30 bits. Therefore, the number of high-order address bits can be determined according to the expanded memory size of the second node 120 connected to the high-speed interconnection port.
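The bit-width arithmetic in this example can be sketched as follows (an illustrative sketch; the function names are assumptions). With a 64-bit destination address and 30 varying low-order bits, as in the example, the high-order prefix is 64-30=34 bits, and two addresses belong to the same memory exactly when their 34-bit prefixes match:

```python
# Illustrative sketch: how many high-order bits identify one expanded-memory window,
# and how a prefix comparison decides whether two addresses share that window.

def high_order_bits(total_bits: int, low_order_bits: int) -> int:
    """High-order prefix width = total address width minus the varying low bits."""
    if not 0 <= low_order_bits <= total_bits:
        raise ValueError("low-order bit count out of range")
    return total_bits - low_order_bits

def same_window(a: int, b: int, high_bits: int, total_bits: int = 64) -> bool:
    """Two destination addresses share a window iff their high-order prefixes match."""
    shift = total_bits - high_bits
    return (a >> shift) == (b >> shift)
```

With the example's figures, `high_order_bits(64, 30)` gives 34, and addresses that differ only in their last 30 bits compare equal under `same_window`.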
应理解,由于第二节点提供的拓展内存对应的物理地址数量为多个,因此第一、第二译码器中记录的部分目的地址对应的端口可能会是同一个,对应相同端口的目的地址位于同一个内存中,这些对应相同端口的目的地址,其基地址和长度是相同的,或者,其高位地址是相同的,因此可以通过匹配基地址和长度,或者匹配高位地址来确定第一目的地址对应的端口。It should be understood that, since the expanded memory provided by the second node corresponds to multiple physical addresses, some of the destination addresses recorded in the first and second decoders may correspond to the same port. Destination addresses corresponding to the same port are located in the same memory and share the same base address and length, or the same high-order address; therefore, the port corresponding to the first destination address can be determined by matching the base address and length, or by matching the high-order address.
实施上述实现方式,将部分第一目的地址与译码器中的地址进行匹配,可以提高匹配效率,提高第一端口和第一高速互联端口的确定效率,进而提高数据访问的效率。By implementing the above implementation, matching a part of the first destination address against the addresses in the decoders can improve matching efficiency, improve the efficiency of determining the first port and the first high-speed interconnection port, and thereby improve the efficiency of data access.
需要说明的,若数据访问请求是向第二节点读取内存中的数据,第二节点对数据访问请求进行处理后,可以根据数据访问请求中的源地址,结合第一、第二和第三地址译码器,将读取到的数据原路返回至第一节点中,这里不再重复赘述。It should be noted that, if the data access request is to read data from the memory of the second node, then after the second node processes the request, it may return the read data to the first node along the original path according to the source address in the data access request, in combination with the first, second, and third address decoders. Details are not repeated here.
需要说明的,在一些实施例中,第一节点也可以不包括互联芯片,计算芯片上的高速互联端口与第二节点上的高速互联端口通过线缆连接,计算芯片也可通过上述地址译码器实现数据访问请求的路由寻址。具体地,计算芯片可部署有第二地址译码器,第二节点部署有第三地址译码器230,计算芯片生成的数据访问请求可根据第二地址译码器中记录的高速互联端口和目的地址之间的对应关系,确定第一目的地址对应的第一高速互联端口,然后通过第一高速互联端口向第二节点发送该数据访问请求,这里不展开赘述。It should be noted that, in some embodiments, the first node may not include an interconnection chip: the high-speed interconnection port on the computing chip is connected to the high-speed interconnection port on the second node through a cable, and the computing chip may likewise implement routing and addressing of data access requests through the above address decoders. Specifically, the computing chip may be deployed with the second address decoder and the second node with the third address decoder 230; for a data access request generated by the computing chip, the first high-speed interconnection port corresponding to the first destination address may be determined according to the correspondence between high-speed interconnection ports and destination addresses recorded in the second address decoder, and the request is then sent to the second node through the first high-speed interconnection port. Details are not repeated here.
第二方面,提供了一种数据访问方法,该方法应用于数据访问系统,该数据访问系统包括第一节点和第二节点,第一节点与第二节点通过线缆连接,该方法包括以下步骤:第一节点生成数据访问请求,其中,数据访问请求用于请求第二节点的内存中的数据,第一节点通过线缆发送数据访问请求至第二节点,第二节点将数据访问请求中的第一目的地址转换为第一目的地址对应的本地物理地址,并根据本地物理地址访问第二节点的内存中的数据。In a second aspect, a data access method is provided. The method is applied to a data access system that includes a first node and a second node connected by a cable, and includes the following steps: the first node generates a data access request, where the data access request is used to request data in the memory of the second node; the first node sends the data access request to the second node through the cable; and the second node converts the first destination address in the data access request into the local physical address corresponding to the first destination address, and accesses the data in the memory of the second node according to that local physical address.
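The steps of this method can be sketched end to end as follows (a hedged sketch: the address values, the translation table, and the byte-array "memory" are illustrative assumptions, not the application's design):

```python
# Hedged end-to-end sketch of the second-aspect flow: the first node issues a
# request carrying a destination address; the second node translates it to a
# local physical address and reads its memory. All names/values are assumptions.

SECOND_NODE_MEMORY = bytearray(1024)
SECOND_NODE_MEMORY[512:516] = b"DATA"

# Third address decoder: destination address -> local physical address.
THIRD_DECODER = {0x8000_0200: 512}

def make_request(dest_addr: int, length: int) -> dict:
    """First node: build a (read) data access request."""
    return {"dest": dest_addr, "len": length}

def handle_request(req: dict) -> bytes:
    """Second node: translate the destination address, then access local memory."""
    local = THIRD_DECODER[req["dest"]]
    return bytes(SECOND_NODE_MEMORY[local:local + req["len"]])
```

A request for destination address `0x8000_0200` is translated to local offset 512 and served from the second node's memory directly, with no network card queue in the path.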
实施第二方面描述的方法,第一节点和第二节点通过线缆连接,二者之间的通信交互无需通过网卡或路由设备,使得第一节点访问第二节点内存时无需额外等待网卡队列单元的准备时间,从而提高第一节点访问第二节点内存的效率,降低访问时延。By implementing the method described in the second aspect, the first node and the second node are connected by a cable, and communication between them does not pass through a network card or routing device, so the first node does not need to wait additionally for network card queue unit preparation time when accessing the second node's memory, thereby improving the efficiency of the first node's access to the second node's memory and reducing access latency.
在一可能的实现方式中,第一节点包括计算芯片和互联芯片,其中,互联芯片的第一高速互联端口与第二节点中的处理器的第二高速互联端口通过线缆连接,计算芯片可以生成数据访问请求,并将数据访问请求发送至互联芯片,互联芯片通过线缆发送数据访问请求至第二节点。In a possible implementation, the first node includes a computing chip and an interconnection chip, where the first high-speed interconnection port of the interconnection chip is connected to the second high-speed interconnection port of the processor in the second node through a cable; the computing chip may generate a data access request and send it to the interconnection chip, and the interconnection chip sends the data access request to the second node through the cable.
在一可能的实现方式中,计算芯片通过端口与互联芯片相连,计算芯片中包括第一地址译码器,计算芯片可生成数据访问请求,根据数据访问请求中的第一目的地址和第一地址译码器确定第一端口,通过第一端口向互联芯片发送数据访问请求,其中,第一地址译码器用于记录目的地址与计算芯片的端口之间的对应关系。In a possible implementation, the computing chip is connected to the interconnection chip through a port, and the computing chip includes a first address decoder. The computing chip may generate a data access request, determine the first port according to the first destination address in the data access request and the first address decoder, and send the data access request to the interconnection chip through the first port, where the first address decoder is used to record the correspondence between destination addresses and the ports of the computing chip.
在一可能的实现方式中,互联芯片中包括第二地址译码器,互联芯片根据第一目的地址和第二地址译码器确定第一高速互联端口,通过第一高速互联端口向第二节点发送数据访问请求,其中,第二地址译码器用于记录目的地址与高速互联端口之间的对应关系。In a possible implementation, the interconnection chip includes a second address decoder. The interconnection chip determines the first high-speed interconnection port according to the first destination address and the second address decoder, and sends the data access request to the second node through the first high-speed interconnection port, where the second address decoder is used to record the correspondence between destination addresses and high-speed interconnection ports.
在一可能的实现方式中,第二节点包括第三地址译码器,第二节点根据第一目的地址和第三地址译码器,确定第一目的地址对应的本地物理地址,其中,第三地址译码器用于记录目的地址和本地物理地址之间的对应关系。In a possible implementation, the second node includes a third address decoder. The second node determines the local physical address corresponding to the first destination address according to the first destination address and the third address decoder, where the third address decoder is used to record the correspondence between destination addresses and local physical addresses.
在一可能的实现方式中,数据访问系统还包括配置节点,上述方法还包括以下步骤:配置节点向第二节点获取第二节点的内存的至少一个本地物理地址,配置节点根据至少一个本地物理地址确定对应的至少一个目的地址,对第三地址译码器进行配置,配置节点根据至少一个目的地址,结合第二节点与互联芯片之间的高速互联端口,对第二地址译码器进行配置,配置节点根据至少一个目的地址,结合互联芯片与计算芯片之间的芯片端口,对第一地址译码器进行配置。In a possible implementation, the data access system further includes a configuration node, and the method further includes the following steps: the configuration node obtains at least one local physical address of the second node's memory from the second node; the configuration node determines at least one corresponding destination address according to the at least one local physical address and configures the third address decoder; the configuration node configures the second address decoder according to the at least one destination address in combination with the high-speed interconnection port between the second node and the interconnection chip; and the configuration node configures the first address decoder according to the at least one destination address in combination with the chip port between the interconnection chip and the computing chip.
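The configuration steps above can be sketched as a single pass that derives all three decoder tables from the local physical addresses reported by the second node (a hedged sketch; the fixed-offset destination-address scheme and all names are assumptions, not the application's scheme):

```python
# Illustrative sketch of the configuration node deriving the three decoder
# tables. Mapping each local physical address to a destination address by a
# fixed offset is an assumption made for the example.

DEST_BASE = 0x8000_0000  # assumed global window where expanded memory is mapped

def configure(local_phys_addrs, interconnect_port, chip_port):
    """Return (first, second, third) decoder tables for the given ports."""
    third = {}   # destination address -> local physical address (second node)
    second = {}  # destination address -> high-speed interconnection port
    first = {}   # destination address -> computing-chip port
    for pa in local_phys_addrs:
        dest = DEST_BASE + pa
        third[dest] = pa
        second[dest] = interconnect_port
        first[dest] = chip_port
    return first, second, third
```

A request for any configured destination address can then be routed chip port → interconnection port → local physical address by consulting the three tables in order.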
在一可能的实现方式中,计算芯片将第一地址译码器中记录的目的地址的基地址和长度与第一目的地址的基地址和长度进行匹配,确定匹配后的目的地址对应的第一端口,互联芯片将第二地址译码器中记录的目的地址的基地址和长度与第一目的地址的基地址和长度进行匹配,确定匹配后的目的地址对应的第一高速互联端口。In a possible implementation, the computing chip matches the base address and length of the destination addresses recorded in the first address decoder against the base address and length of the first destination address and determines the first port corresponding to the matched destination address; the interconnection chip matches the base address and length of the destination addresses recorded in the second address decoder against the base address and length of the first destination address and determines the first high-speed interconnection port corresponding to the matched destination address.
在一可能的实现方式中,计算芯片将第一地址译码器中记录的目的地址的高位地址与第一目的地址的高位地址进行匹配,确定匹配后的目的地址对应的第一端口,其中,高位地址的位数是根据第二节点的内存大小确定的,互联芯片将第二地址译码器中记录的目的地址的基地址和长度与第一目的地址的基地址和长度进行匹配,确定匹配后的目的地址对应的第一高速互联端口。In a possible implementation, the computing chip matches the high-order addresses of the destination addresses recorded in the first address decoder against the high-order address of the first destination address and determines the first port corresponding to the matched destination address, where the number of bits of the high-order address is determined according to the memory size of the second node; the interconnection chip matches the base address and length of the destination addresses recorded in the second address decoder against the base address and length of the first destination address and determines the first high-speed interconnection port corresponding to the matched destination address.
在一可能的实现方式中,第一高速互联端口和第二高速互联端口为高速串行总线端口,第一端口为高速串行总线端口。In a possible implementation manner, the first high-speed interconnection port and the second high-speed interconnection port are high-speed serial bus ports, and the first port is a high-speed serial bus port.
第三方面,提供了一种计算节点,该计算节点可以是第一方面和第二方面描述的第一节点,该计算节点应用于数据访问系统,数据访问系统还包括存储节点,计算节点包括:计算芯片和互联芯片,其中,计算芯片通过高速互联端口与互联芯片连接,互联芯片通过高速互联端口和线缆与其他节点连接;计算芯片用于生成数据访问请求,并将数据访问请求发送至互联芯片,其中,数据访问请求包括第一目的地址,第一目的地址指示存储节点中的内存的位置;互联芯片用于根据第一目的地址将数据访问请求发送至存储节点。In a third aspect, a computing node is provided. The computing node may be the first node described in the first and second aspects, and is applied to a data access system that further includes a storage node. The computing node includes a computing chip and an interconnection chip, where the computing chip is connected to the interconnection chip through a high-speed interconnection port, and the interconnection chip is connected to other nodes through high-speed interconnection ports and cables; the computing chip is configured to generate a data access request and send it to the interconnection chip, where the data access request includes a first destination address indicating a memory location in the storage node; and the interconnection chip is configured to send the data access request to the storage node according to the first destination address.
第四方面,提供了一种存储节点,该存储节点可以是第一方面和第二方面描述的第二节点,该存储节点应用于数据访问系统,数据访问系统还包括计算节点,存储节点包括处理器和内存,存储节点通过处理器的高速互联端口和线缆与计算节点连接;处理器用于通过高速互联端口接收计算节点发送的数据访问请求,将数据访问请求中携带的第一目的地址转换为第一目的地址对应的存储节点的本地物理地址,并根据本地物理地址访问内存中的数据。In a fourth aspect, a storage node is provided. The storage node may be the second node described in the first and second aspects, and is applied to a data access system that further includes a computing node. The storage node includes a processor and memory, and is connected to the computing node through a high-speed interconnection port of the processor and a cable; the processor is configured to receive, through the high-speed interconnection port, a data access request sent by the computing node, convert the first destination address carried in the data access request into the local physical address of the storage node corresponding to the first destination address, and access the data in the memory according to that local physical address.
第五方面,提供了一种计算设备,该计算设备包括处理器和存储器,存储器存储有代码,处理器用于执行第一方面或第一方面任一种可能实现方式中第一节点或第二节点实现的各个模块的功能。In a fifth aspect, a computing device is provided. The computing device includes a processor and a memory, the memory stores code, and the processor is configured to perform the functions of the modules implemented by the first node or the second node in the first aspect or any possible implementation of the first aspect.
第六方面,提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行上述各方面所述的方法。In a sixth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the methods described in the above aspects.
本申请在上述各方面提供的实现方式的基础上,还可以进行进一步组合以提供更多实现方式。On the basis of the implementations provided in the above aspects, this application may further combine them to provide more implementations.
附图说明Description of drawings
图1是本申请提供的一种数据访问系统的结构示意图;Fig. 1 is a schematic structural diagram of a data access system provided by the present application;
图2是本申请提供的一种应用场景下第一节点和第二节点的部署示意图;FIG. 2 is a schematic diagram of deployment of a first node and a second node in an application scenario provided by the present application;
图3是本申请提供的另一种数据访问系统的结构示意图;FIG. 3 is a schematic structural diagram of another data access system provided by the present application;
图4是本申请提供的一种第一地址译码器的示例图;FIG. 4 is an example diagram of a first address decoder provided by the present application;
图5是本申请提供的一种数据访问方法的步骤流程示意图;Fig. 5 is a schematic flow chart of steps of a data access method provided by the present application;
图6是本申请提供的一种计算节点的结构示意图;FIG. 6 is a schematic structural diagram of a computing node provided by the present application;
图7是本申请提供的一种存储节点的结构示意图;FIG. 7 is a schematic structural diagram of a storage node provided by the present application;
图8是本申请提供的一种计算设备的结构示意图。FIG. 8 is a schematic structural diagram of a computing device provided by the present application.
具体实施方式Detailed description of embodiments
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The technical solutions in the embodiments of the present invention will be described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.
首先,对本申请涉及应用场景进行说明。First, the application scenarios involved in this application are described.
计算节点在执行一些大数据或者AI任务时,需要较大的内存容量来存储数据,且会小粒度稀疏地访问数据,比如推荐系统架构中,需要10TB~20TB量级的内存来存储数据,而访问粒度可能只有64byte或者128byte,且每次访问的位置随机性高,导致大数据或者AI任务的数据访问效率低,网络延迟高,影响大数据或者AI任务的处理效率。When performing some big data or AI tasks, a computing node needs a large memory capacity to store data and accesses the data sparsely at small granularity. For example, in a recommendation system architecture, memory on the order of 10TB~20TB is needed to store data, while the access granularity may be only 64 bytes or 128 bytes, and the location of each access is highly random. This leads to low data access efficiency and high network latency for big data or AI tasks, affecting their processing efficiency.
通常情况下,为了提高内存容量,可将数据分布式地放在多台存储节点的内存中,计算节点可通过远程直接内存访问(remote direct memory access,RDMA)协议读取存储节点内存中的数据,实现内存容量的扩展。Usually, in order to increase the memory capacity, the data can be distributed in the memory of multiple storage nodes, and the computing nodes can read the data in the memory of the storage nodes through the remote direct memory access (RDMA) protocol , to achieve the expansion of memory capacity.
但是,RDMA协议下,计算节点与存储节点之间的通信通过网卡实现,二者的网卡之间通过网卡队列进行数据传输,使得计算节点每次读取数据都需要将数据读取请求放入网卡队列,导致数据读取过程消耗大量时间在队列单元的准备上,甚至一些情况下,队列单元的准备时间相比数据传输的时间更长,导致计算节点数据访问效率低,网络延迟高,影响数据处理效率。However, under the RDMA protocol, communication between a computing node and a storage node is implemented through network cards, and data is transferred between the two network cards through network card queues, so that every time the computing node reads data it must place the data read request into a network card queue. As a result, the data reading process spends a great deal of time on queue unit preparation; in some cases the queue unit preparation time is even longer than the data transfer time, leading to low data access efficiency for the computing node, high network latency, and reduced data processing efficiency.
为了提高访问内存的速度,可以在计算设备的内存总线上拓展PCI内存设备,但是PCIe设备的拓展数量有限,QPI、UPI等总线的拓展能力更弱,通常最大也就拓展1T内存容量,使得拓展后的内存仍然无法达到大数据或者AI任务所需求的量级。To increase memory access speed, PCI memory devices can be added on the memory bus of a computing device, but the number of PCIe devices that can be added is limited, and buses such as QPI and UPI have even weaker expansion capability, typically expanding memory capacity by at most about 1T, so the expanded memory still cannot reach the magnitude required by big data or AI tasks.
综上可知,计算节点在执行数据处理任务时,需要较大的内存容量来存储数据,但是当前常用的RDMA方法虽然可以拓展内存至需求的量级,但是会使得计算节点数据访问效率低,使用PCI设备拓展内存虽然可以提高访问效率,但是拓展内存的能力很弱,拓展后的内存仍然无法达到需求,导致大数据或者AI任务的数据访问效率低,网络延迟高,影响大数据或者AI任务的处理效率。In summary, a computing node needs a large memory capacity to store data when performing data processing tasks. Although the currently common RDMA method can expand memory to the required magnitude, it makes the computing node's data access inefficient; and although expanding memory with PCI devices can improve access efficiency, the expansion capability is very weak and the expanded memory still cannot meet the demand. This leads to low data access efficiency and high network latency for big data or AI tasks, affecting their processing efficiency.
图1是本申请提供的一种数据访问系统的结构示意图。该数据访问系统包括第一节点110和第二节点120,其中,第一节点110和第二节点120之间通过线缆140进行连接,具体的,第一节点110的高速互联端口130与第二节点120中的处理器的高速互联端口130通过线缆连接。应理解,图1中的第二节点120的数量用于举例说明,本申请不对第二节点120的数量进行限定。为了便于区分,下文将统一称第一节点110中的高速互联端口130为第一高速互联端口,第二节点120中的高速互联端口130为第二高速互联端口。FIG. 1 is a schematic structural diagram of a data access system provided by this application. The data access system includes a first node 110 and a second node 120, where the first node 110 and the second node 120 are connected through a cable 140; specifically, the high-speed interconnection port 130 of the first node 110 and the high-speed interconnection port 130 of the processor in the second node 120 are connected through the cable. It should be understood that the number of second nodes 120 in FIG. 1 is for illustration, and this application does not limit the number of second nodes 120. For ease of distinction, the high-speed interconnection port 130 in the first node 110 is hereinafter referred to as the first high-speed interconnection port, and the high-speed interconnection port 130 in the second node 120 as the second high-speed interconnection port.
第一节点110和第二节点120可以是物理服务器,比如X86服务器、ARM服务器等;也可以是基于通用的物理服务器结合网络功能虚拟化(network functions virtualization,NFV)技术实现的虚拟机(virtual machine,VM),虚拟机指通过软件模拟的具有完整硬件系统功能的、运行在一个完全隔离环境中的完整计算机系统,比如云计算中的虚拟设备,本申请不作具体限定;第一节点110和第二节点120还可以是多个服务器组成的服务器集群,该服务器可以是前述内容中的物理服务器或者虚拟机。The first node 110 and the second node 120 may be physical servers, such as X86 servers or ARM servers; they may also be virtual machines (virtual machine, VM) implemented on general-purpose physical servers in combination with network functions virtualization (NFV) technology, where a virtual machine is a complete computer system simulated by software, with full hardware system functions, running in a fully isolated environment, such as a virtual device in cloud computing, which is not specifically limited in this application; the first node 110 and the second node 120 may also each be a server cluster composed of multiple servers, where each server may be a physical server or a virtual machine as described above.
高速互联端口130可以是高速串行总线端口,比如SERDES总线端口,线缆140可以是电缆、光纤、双绞线等可以传输数据的线缆,本申请不对线缆140进行具体限定。其中,第一节点110上高速互联端口130的数量可以是一个或者多个,且第一节点110上的第一高速互联端口与第二节点上的第二高速互联端口呈一一对应的关系,图1以3个端口为例进行举例说明,本申请不对此进行限定。The high-speed interconnection port 130 may be a high-speed serial bus port, such as a SERDES bus port, and the cable 140 may be an electrical cable, optical fiber, twisted pair, or any other cable capable of transmitting data; this application does not specifically limit the cable 140. The number of high-speed interconnection ports 130 on the first node 110 may be one or more, and the first high-speed interconnection ports on the first node 110 are in one-to-one correspondence with the second high-speed interconnection ports on the second node. FIG. 1 takes three ports as an example for illustration, which is not limited in this application.
需要说明的,第一节点110的第一高速互联端口与第二节点120的处理器的第二高速互联端口通过线缆相连,第二节点120的处理器可以是一个或者多个,在第二节点120的处理器数量为多个时,第一节点110可以通过不同的高速互联端口与第二节点的不同处理器相连,比如第二节点4包括处理器4、处理器5和处理器6时,第一节点110的高速互联端口1可以与处理器4相连,第一节点110的高速互联端口2可以与处理器5相连,通过不同的高速互联端口读取不同处理器对应的内存中的数据。应理解,第二节点120可以保留至少一个处理器不与第一节点110进行连接,从而确保第二节点120将一些内存提供给第一节点后,不会影响第二节点120处理其他业务。It should be noted that the first high-speed interconnection port of the first node 110 is connected through a cable to the second high-speed interconnection port of a processor of the second node 120, and the second node 120 may have one or more processors. When the second node 120 has multiple processors, the first node 110 may be connected to different processors of the second node through different high-speed interconnection ports. For example, when second node 4 includes processor 4, processor 5, and processor 6, high-speed interconnection port 1 of the first node 110 may be connected to processor 4 and high-speed interconnection port 2 of the first node 110 to processor 5, so that data in the memory corresponding to different processors is read through different high-speed interconnection ports. It should be understood that the second node 120 may keep at least one processor unconnected to the first node 110, so as to ensure that after the second node 120 provides some memory to the first node, the second node 120's processing of other services is not affected.
第一节点110用于处理数据任务,比如前述内容中的大数据或者AI任务。第二节点120用于存储数据,第二节点120可以将内存划分一部分作为第一节点110的拓展内存,供第一节点110使用,第一节点110可通过图1所示的数据访问系统,从第二节点120划分出的拓展内存中读取数据,进行大数据或者AI任务的处理,从而实现第一节点110的内存拓展。The first node 110 is used to process data tasks, such as the big data or AI tasks mentioned above. The second node 120 is used to store data; the second node 120 may set aside a portion of its memory as expanded memory for the first node 110 to use, and the first node 110 may, through the data access system shown in FIG. 1, read data from the expanded memory set aside by the second node 120 and process big data or AI tasks, thereby implementing memory expansion for the first node 110.
在一应用场景下,如图2所示,第一节点110和第二节点120可以部署于同一个机柜中,第一节点110的第一高速互联端口与第二节点120的第二高速互联端口直连。整个机柜内各个服务器之间不需要经过交换机或者网卡即可进行通信,实现第一节点110从第二节点120的内存中读取数据的目的。In one application scenario, as shown in FIG. 2, the first node 110 and the second node 120 may be deployed in the same cabinet, with the first high-speed interconnection port of the first node 110 directly connected to the second high-speed interconnection port of the second node 120. The servers in the cabinet can communicate with each other without going through a switch or a network card, achieving the purpose of the first node 110 reading data from the memory of the second node 120.
其中,第一节点110可以是AI服务器,第二节点120可以是2P服务器,2P服务器指的是有两个CPU的服务器,每个2P服务器拥有16个通道(channel),每个channel可以挂载2个64GB的内存条,也就是说,每个2P服务器可以为第一节点110拓展64GB×2×16=2TB内存,因此10台左右的2P服务器就可以满足10TB~20TB的内存拓展需求。并且,由于AI服务器和2P服务器在机柜中的高度,一个AI服务器与8~10台2P服务器恰好可放入一个机柜内部,使得一个机柜组成的机架式服务器可以拥有10TB~20TB的内存,符合大部分应用场景下AI服务器进行数据处理时的内存需求。应理解,图2用于举例说明,本申请不对此进行限定。The first node 110 may be an AI server and the second node 120 a 2P server, where a 2P server is a server with two CPUs. Each 2P server has 16 channels, and each channel can hold two 64GB memory modules; that is, each 2P server can expand 64GB×2×16=2TB of memory for the first node 110, so about ten 2P servers can meet a 10TB~20TB memory expansion requirement. Moreover, given the heights of the AI server and the 2P servers in a cabinet, one AI server and 8~10 2P servers fit exactly inside one cabinet, so that the rack of servers in a single cabinet can have 10TB~20TB of memory, meeting the memory requirements of an AI server for data processing in most application scenarios. It should be understood that FIG. 2 is for illustration, and this application does not limit it.
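The capacity arithmetic in this scenario can be restated as a quick check (figures taken directly from the text):

```python
# Restating the capacity math from the example: each 2P server has 16 memory
# channels, each holding two 64 GB modules; about ten such servers reach the
# 10-20 TB range described in the text.

DIMM_GB = 64            # one memory module
DIMMS_PER_CHANNEL = 2
CHANNELS = 16

per_server_gb = DIMM_GB * DIMMS_PER_CHANNEL * CHANNELS   # 2048 GB = 2 TB
ten_servers_tb = 10 * per_server_gb // 1024              # 20 TB for 10 servers
```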
具体实现中,第一节点110可以生成数据访问请求,其中,该数据访问请求用于请求第二节点内存中的数据,第一节点110可以通过线缆140发送该数据访问请求至第二节点120,第二节点可以将数据访问请求中的第一目的地址转换为目的地址对应的本地物理地址,并根据本地物理地址访问该第二节点的内存中的数据。In a specific implementation, the first node 110 may generate a data access request, where the data access request is used to request data in the second node's memory; the first node 110 may send the data access request to the second node 120 through the cable 140; and the second node may convert the first destination address in the data access request into the local physical address corresponding to the destination address and access the data in the second node's memory according to that local physical address.
示例性地,如图3所示,图3是本申请提供的另一种数据访问系统的结构示意图,其中,第一节点110可包括计算芯片111、互联芯片112、计算芯片的端口113和总线114,其中,计算芯片111的端口113与互联芯片112通过总线114进行通信,图3为了使得图中连接关系更清楚,没有将互联芯片112上的端口绘出,但是具体实现中,互联芯片112上也可拥有对应的端口,应理解,图1仅为一种示例性的划分方式,各个模块单元之间可以合并或者拆分为更多或更少的模块单元,本申请不作具体限定。For example, as shown in FIG. 3, FIG. 3 is a schematic structural diagram of another data access system provided by this application, in which the first node 110 may include a computing chip 111, an interconnection chip 112, a computing chip port 113, and a bus 114, where the port 113 of the computing chip 111 communicates with the interconnection chip 112 through the bus 114. To make the connection relationships clearer, FIG. 3 does not draw the ports on the interconnection chip 112, but in a specific implementation the interconnection chip 112 may also have corresponding ports. It should be understood that FIG. 1 is only an exemplary division; the module units may be merged or split into more or fewer module units, which is not specifically limited in this application.
计算芯片的端口113可以是高速串行总线端口,比如SERDES总线端口,总线114可以是外设部件互联标准(Peripheral Component Interconnect,PCI)总线或扩展工业标准结构(Extended Industry Standard Architecture,EISA)总线等,计算芯片111、互联芯片112、计算芯片的端口113和总线114可以在加工时统一印制在电路板上。具体实现中,计算芯片的端口113的数量可以是一个或者多个,图3以两个端口(端口0和端口1)为例进行举例说明,本申请不对此进行限定。The computing chip port 113 may be a high-speed serial bus port, such as a SERDES bus port, and the bus 114 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The computing chip 111, interconnection chip 112, computing chip port 113, and bus 114 may be printed together on the circuit board during manufacturing. In a specific implementation, the number of computing chip ports 113 may be one or more; FIG. 3 takes two ports (port 0 and port 1) as an example for illustration, which is not limited in this application.
计算芯片111可以由至少一个通用处理器构成,例如CPU、NPU或者CPU和硬件芯片的组合。上述硬件芯片可以是专用集成电路(Application-Specific Integrated Circuit,ASIC)、可编程逻辑器件(Programmable Logic Device,PLD)或其组合。上述PLD可以是复杂可编程逻辑器件(Complex Programmable Logic Device,CPLD)、现场可编程逻辑门阵列(Field-Programmable Gate Array,FPGA)、通用阵列逻辑(Generic Array Logic,GAL)或其任意组合。计算芯片111执行各种类型的数字存储指令,它能使第一节点110提供较宽的多种服务。其中,第一节点110中计算芯片111的数量可以是一个或者多个,图3以一个计算芯片111为例进行说明,本申请不作具体限定。The computing chip 111 may be composed of at least one general-purpose processor, such as a CPU, an NPU, or a combination of a CPU and a hardware chip. The aforementioned hardware chip may be an application-specific integrated circuit (Application-Specific Integrated Circuit, ASIC), a programmable logic device (Programmable Logic Device, PLD) or a combination thereof. The above-mentioned PLD can be a complex programmable logic device (Complex Programmable Logic Device, CPLD), a field programmable logic gate array (Field-Programmable Gate Array, FPGA), a general array logic (Generic Array Logic, GAL) or any combination thereof. The computing chip 111 executes various types of digital storage instructions, which enable the first node 110 to provide a wide variety of services. Wherein, the number of computing chips 111 in the first node 110 may be one or more. FIG. 3 uses one computing chip 111 as an example for illustration, which is not specifically limited in this application.
互联芯片112可以是ASIC、PLD或其组合,上述PLD可以是CPLD、FPGA、GAL或其任意组合,本申请不作具体限定。其中,第一节点110中互联芯片112的数量可以是多个,图3以2个互联芯片112为例(互联芯片1和互联芯片2)进行说明,本申请不作具体限定。The interconnect chip 112 may be an ASIC, a PLD or a combination thereof, and the PLD may be a CPLD, FPGA, GAL or any combination thereof, which is not specifically limited in this application. Wherein, the number of interconnection chips 112 in the first node 110 may be multiple. FIG. 3 takes two interconnection chips 112 as an example (interconnection chip 1 and interconnection chip 2) for illustration, which is not specifically limited in this application.
The interconnection chip 112 is provided with a high-speed interconnection port 130, through which the interconnection chip 112 can perform data communication with the second node 120. The first high-speed interconnection port of the interconnection chip 112 is connected to the second high-speed interconnection port on the second node 120 through a cable 140; for descriptions of the high-speed interconnection port 130 and the cable 140, refer to the foregoing embodiments of FIG. 1 and FIG. 2, which are not repeated here. It should be noted that the number of first high-speed interconnection ports on each interconnection chip 112 may be one or more, and each first high-speed interconnection port corresponds one-to-one to a second high-speed interconnection port on a second node. FIG. 3 uses two first high-speed interconnection ports per interconnection chip 112 as an example, that is, interconnection chip 1 includes high-speed interconnection port 2 and high-speed interconnection port 3, and interconnection chip 2 includes high-speed interconnection port 4 and high-speed interconnection port 5, which is not specifically limited in this application.
For the description of the second node 120, refer to the embodiments of FIG. 1 and FIG. 2, which are not repeated here. The number of second nodes 120 may be one or more; FIG. 3 uses four second nodes (second nodes 1 to 4) as an example for illustration, which is not specifically limited in this application.
In this embodiment of the application, the computing chip 111 is configured to generate the foregoing data access request and send it to the interconnection chip 112; in a specific implementation, the computing chip 111 may send the data access request to the interconnection chip 112 through the port 113 of the computing chip. The interconnection chip 112 is configured to send the data access request to the second node 120 through the cable 140; in a specific implementation, the interconnection chip 112 sends the data access request to the second high-speed interconnection port of the second node 120 through the first high-speed interconnection port 130. It can be understood that deploying the interconnection chip 112 in the first node 110 enables the first node 110 to communicate with more second nodes 120: the more interconnection chips 112 there are, the more high-speed interconnection ports 130 can be deployed in the first node 110, and therefore the more second nodes 120 can be connected to the first node 110. This expands the memory expansion capability of the first node 110 and makes the first node 110 applicable to more application scenarios.
In this embodiment of the application, the data communication among the computing chip 111, the interconnection chip 112, and the second node 120 can implement an addressing function through address decoders. The address decoders in the computing chip 111, the interconnection chip 112, and the second node 120 are described in detail below with reference to FIG. 3.
In an embodiment, as shown in FIG. 3, a first address decoder 210 is deployed in the computing chip 111. The computing chip 111 is specifically configured to generate a data access request, determine a first port according to the first destination address in the data access request and the first address decoder 210, and send the data access request to the interconnection chip 112 through the first port, where the first address decoder 210 records the correspondence between destination addresses and the ports of the computing chip.
In an embodiment, as shown in FIG. 3, a second address decoder 220 is deployed in the interconnection chip 112. The interconnection chip 112 is specifically configured to determine a first high-speed interconnection port according to the first destination address and the second address decoder 220, and send the data access request to the second node 120 through the first high-speed interconnection port, where the second address decoder 220 records the correspondence between destination addresses and high-speed interconnection ports.
In an embodiment, as shown in FIG. 3, a third address decoder 230 is deployed in the second node 120. The second node 120 is specifically configured to determine, according to the first destination address and the third address decoder, the local physical address corresponding to the first destination address, where the third address decoder 230 records the correspondence between destination addresses and local physical addresses.
In an embodiment, as shown in FIG. 3, the data access system may further include a configuration node 150, which may configure the first address decoder 210, the second address decoder 220, and the third address decoder 230. Specifically, the configuration node 150 obtains at least one local physical address of the memory of the second node from the second node 120, determines at least one corresponding destination address according to the at least one local physical address, and configures the third address decoder accordingly; the configuration node further configures the second address decoder according to the at least one destination address in combination with the high-speed interconnection port between the second node and the interconnection chip; and the configuration node further configures the first address decoder according to the at least one destination address in combination with the chip port between the interconnection chip and the computing chip.
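The configuration flow above can be sketched as follows. This is an illustrative model only, not the patent's implementation: it assumes a flat 64-bit destination-address space carved into consecutive segments, and the field names, the `GLOBAL_BASE` value, and the `configure` helper are all hypothetical.

```python
# Hypothetical sketch of the configuration node's work: assign each second
# node's exported memory a segment of a global destination-address space,
# then derive entries for the three address decoders described above.

GLOBAL_BASE = 0x8000_0000_0000  # assumed start of the expanded-memory window

def configure(exports):
    """exports: list of dicts with keys
       'local_base' (physical base address on the second node),
       'size'       (bytes of expanded memory exported),
       'hs_port'    (high-speed interconnection port toward that node),
       'chip_port'  (computing-chip port toward the interconnection chip)."""
    first_dec, second_dec, third_dec = [], [], []
    dest = GLOBAL_BASE
    for e in exports:
        seg = {"base": dest, "length": e["size"]}
        # Third decoder (in the second node): destination segment -> local physical base.
        third_dec.append({**seg, "local_base": e["local_base"]})
        # Second decoder (in the interconnection chip): destination segment -> high-speed port.
        second_dec.append({**seg, "hs_port": e["hs_port"]})
        # First decoder (in the computing chip): destination segment -> chip port.
        first_dec.append({**seg, "chip_port": e["chip_port"]})
        dest += e["size"]  # next exported memory gets the following segment
    return first_dec, second_dec, third_dec
```

Under this sketch, all three decoders agree on the segment boundaries, which is what lets a request fall through the chain of lookups consistently.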
It can be understood that the first, second, and third address decoders configured by the configuration node 150 ensure that a data access request generated by the computing chip is routed and addressed through the address decoders and delivered to the CPU of the second node corresponding to its destination address for memory reads and writes. This avoids the waiting time for network card queue preparation and improves the efficiency with which the first node 110 reads and writes the expanded memory: the latency can even reach the microsecond level (Ethernet latency is at the millisecond level), and the bandwidth can reach 400 GB, offering higher bandwidth and lower latency than an RDMA network card whose bandwidth is only 100 GB.
In an embodiment, when the configuration node 150 obtains at least one local physical address of the memory of the second node from the second node 120, it may determine, according to the size of the memory of the second node 120 and in combination with service requirements, the local physical addresses of the expanded memory that the second node 120 sets aside for use by the first node 110.
Optionally, the expanded memory used by the first node 110 may be part of the memory of the second node 120. This part of the expanded memory may be processed through a memory isolation technology so that the second node 120 itself cannot access it, improving the security of the data stored in the expanded memory.
Optionally, the correspondence recorded by the third address decoder 230 may be: local physical address = destination address - base address, where the base address is the starting address of an address segment, also called the first address or segment address; destination addresses belonging to the same address segment share the same base address.
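The rule "local physical address = destination address - base address" can be illustrated with a minimal sketch; the segment table below is hypothetical, not taken from the patent:

```python
# Minimal sketch of the third decoder's translation rule, assuming the
# decoder stores one (base, length) entry per configured address segment.
SEGMENTS = [
    {"base": 0x8000_0000_0000, "length": 1 << 30},  # hypothetical 1 GB segment
]

def translate(dest_addr):
    for seg in SEGMENTS:
        if seg["base"] <= dest_addr < seg["base"] + seg["length"]:
            # local physical address = destination address - base address
            return dest_addr - seg["base"]
    raise ValueError("destination address not in any configured segment")
```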
In an embodiment, when matching the first destination address against the first address decoder 210 and the second address decoder 220, either the complete first destination address or only part of it may be matched against the addresses in the decoder, thereby improving matching efficiency and hence the efficiency of data access.
Optionally, the first port may be determined according to the base address and length of the first destination address in the data access request. Specifically, the computing chip 111 is configured to match the base address and length of each destination address recorded in the first address decoder 210 against the base address and length of the first destination address, and determine the first port corresponding to the matched destination address. Similarly, the interconnection chip 112 is configured to match the base address and length of each destination address recorded in the second address decoder 220 against the base address and length of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
Optionally, the first port may be determined according to the high-order bits of the first destination address. The computing chip 111 is specifically configured to match the high-order bits of each destination address recorded in the first address decoder against the high-order bits of the first destination address and determine the first port corresponding to the matched destination address, where the number of high-order bits is determined according to the memory size of the second node. Similarly, the interconnection chip 112 is specifically configured to match the high-order bits of each destination address recorded in the second address decoder against the high-order bits of the first destination address and determine the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
For example, assume that the total length of a destination address is 64 bits. If the memory of the second node 120 corresponding to one high-speed interconnection port is 1 TB, then among the destination addresses of that 1 TB of memory only the low-order 40 bits differ, so the number of high-order bits may be 64 - 40 = 24 bits. Simply put, destination addresses located in the same memory share the same first 24 bits and differ only in the last 40 bits. Therefore, the number of high-order bits can be determined according to the size of the expanded memory of the second node 120 connected to the high-speed interconnection port.
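The relationship between memory size and the number of high-order bits can be sketched as below, assuming byte addressing, power-of-two memory sizes, and 64-bit destination addresses; the helper names are illustrative only:

```python
def high_bits(size_bytes, addr_width=64):
    """Number of high-order address bits shared by all destination
    addresses inside one memory of `size_bytes` bytes."""
    low = (size_bytes - 1).bit_length()  # bits that vary inside the memory
    return addr_width - low

def high_part(dest_addr, size_bytes, addr_width=64):
    """High-order part of a destination address, used for port matching."""
    return dest_addr >> (addr_width - high_bits(size_bytes, addr_width))
```

Two destination addresses that fall inside the same expanded memory thus yield the same high-order part, so matching only that part is sufficient to select the port.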
It should be understood that, because the expanded memory provided by the second node 120 corresponds to multiple physical addresses, some of the destination addresses recorded in the first and second decoders may map to the same port. Destination addresses corresponding to the same port are located in the same memory; such addresses share the same base address and length, or share the same high-order bits. Therefore, the port corresponding to the first destination address can be determined by matching the base address and length, or by matching the high-order bits.
Still taking the data access system shown in FIG. 3 as an example, assume that the expanded memory provided by second node 4 corresponds to multiple physical addresses and that the third address decoder records destination addresses 1 to 10. The second address decoder may then record that destination addresses 1 to 10 correspond to high-speed interconnection port 5. If the first destination address is any one of destination addresses 1 to 10, its corresponding high-speed interconnection port is high-speed interconnection port 5. Because these destination addresses corresponding to the same high-speed interconnection port are all addresses of the expanded memory of second node 4, they share the same base address and length, or the same high-order bits. When determining the high-speed interconnection port corresponding to the first destination address, a partial address of the first destination address can therefore be matched against the partial addresses of the destination addresses in the second address decoder, thereby improving matching efficiency.
For example, FIG. 4 is an example diagram of an address decoder 210. As shown in FIG. 4, the first address decoder 210 may include multiple destination addresses, and destination addresses with the same base address and length correspond to the same port of the computing chip 111. Assuming that the base address and length of the first destination address are as shown in FIG. 4, the base address and length of the first destination address can be matched against the base address and length of each destination address in the first address decoder 210, determining that the first port corresponding to the matched destination address is port 2; the data access request can then be transmitted to the interconnection chip 112 through port 2. Similarly, the high-speed interconnection port for transmitting the data access request is determined according to the base address and length of the destination addresses recorded in the second address decoder 220; the description is not repeated here. It should be understood that FIG. 4 matches the first destination address based on base address and length; the matching method based on high-order bits described above is similar and is not illustrated separately here.
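The lookup illustrated by FIG. 4 can be sketched as a simple table scan. The table contents, port numbers, and segment sizes below are hypothetical, chosen only to show the base-address/length matching:

```python
# Hypothetical first-decoder table: each entry maps a (base, length)
# destination segment to one port of the computing chip.
FIRST_DECODER = [
    {"base": 0x8000_0000_0000, "length": 1 << 30, "port": 2},
    {"base": 0x8000_4000_0000, "length": 1 << 30, "port": 3},
]

def lookup_port(dest_base, dest_length):
    """Return the chip port whose recorded base address and length
    match those of the first destination address, or None."""
    for entry in FIRST_DECODER:
        if entry["base"] == dest_base and entry["length"] == dest_length:
            return entry["port"]
    return None
```

The second decoder in the interconnection chip would perform the same scan over its own table, returning a high-speed interconnection port instead of a chip port.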
It should be noted that the data access system shown in FIG. 1 can also implement routing and addressing of data access requests through the foregoing address decoders. Specifically, the first node 110 may be equipped with the second address decoder 220, and the second node 120 with the third address decoder 230. For a data access request generated by the first node 110, the first high-speed interconnection port corresponding to the first destination address can be determined according to the correspondence between high-speed interconnection ports and destination addresses recorded in the second address decoder 220, and the data access request is then sent to the second node 120 through the first high-speed interconnection port; details are not elaborated here. It should be understood that in the data access system shown in FIG. 1, the high-speed interconnection port may be deployed on the processor in the first node 110; simply put, the processor of the first node 110 and the processor of the second node 120 are directly connected through a cable.
It should be noted that if the data access request is for reading data in the memory of the second node 120, then after processing the data access request, the second node 120 can return the read data to the first node 110 along the original path, according to the source address in the data access request and in combination with the first, second, and third address decoders; details are not repeated here.
In summary, in the data access system provided by this application, the high-speed interconnection port of the first node is connected to the high-speed interconnection port of the second node through a cable, and the first node can implement the addressing function with the address decoders, thereby sending a data access request to the memory of the second node corresponding to the first destination address and realizing memory expansion for the first node. This approach requires no additional network cards or routers and no waiting for the preparation time of a network card queue unit, so the first node accesses the memory of the second node with high efficiency and low latency. Meanwhile, adding high-speed interconnection ports increases the number of second nodes and thus the expanded memory capacity of the first node, giving the first node a large scalable memory capacity capable of handling services in more application scenarios.
The data access method provided by this application is explained below with reference to FIG. 5. FIG. 5 shows a data access method provided by this application, which can be applied to the data access systems shown in FIG. 1 to FIG. 4 and may include the following steps:
Step S510: The first node generates a data access request, where the data access request is used to request data in the memory of the second node. For the description of the first node, refer to the embodiments of FIG. 1 to FIG. 4, which are not repeated here.
Step S520: The first node sends the data access request to the second node through a cable. It should be understood that the first high-speed interconnection port of the first node is connected to the second high-speed interconnection port of the second node through a cable; for descriptions of the high-speed interconnection port and the cable, refer to the embodiments of FIG. 1 to FIG. 4, which are not repeated here.
In an embodiment, the first node may include a second address decoder, which records the correspondence between destination addresses and high-speed interconnection ports. The first node may determine, according to the first destination address in the data access request and the second address decoder, the first high-speed interconnection port corresponding to the first destination address, and then send the data access request to the second node through the first high-speed interconnection port. For the specific description of the second address decoder, refer to the embodiments of FIG. 1 to FIG. 4, which are not repeated here.
Step S530: The second node converts the first destination address in the data access request into the local physical address corresponding to the first destination address, and accesses the data in the memory of the second node according to the local physical address.
In an embodiment, the second node may include a third address decoder, which records the correspondence between destination addresses and local physical addresses. The second node may determine, according to the first destination address in the data access request and the third address decoder, the local physical address corresponding to the first destination address, and then access the data in the memory of the second node according to the local physical address. For the specific description of the third address decoder, refer to the embodiments of FIG. 1 to FIG. 4, which are not repeated here.
In an embodiment, the first node may include a computing chip and an interconnection chip, and the computing chip is connected to the interconnection chip through a port; the specific connection may be the bus described above. For descriptions of the first node, the second node, the computing chip, the interconnection chip, the port, and the bus, refer to the embodiments of FIG. 1 to FIG. 4, which are not repeated here.
In a specific implementation, the computing chip may perform step S510 to generate the data access request, and send the data access request to the interconnection chip through a port of the computing chip. The interconnection chip sends the data access request to the second node through the first high-speed interconnection port.
It can be understood that deploying interconnection chips in the first node enables the first node to communicate with more second nodes: the more interconnection chips there are, the more high-speed interconnection ports can be deployed in the first node, and the more second nodes can be connected to the first node. This expands the memory expansion capability of the first node and makes the first node applicable to more application scenarios.
In an embodiment, a first address decoder is deployed in the computing chip. After generating the data access request, the computing chip determines the first port according to the first destination address in the data access request and the first address decoder, and sends the data access request to the interconnection chip through the first port, where the first address decoder records the correspondence between destination addresses and the ports of the computing chip.
In an embodiment, a second address decoder is deployed in the interconnection chip. The interconnection chip may determine the first high-speed interconnection port according to the first destination address and the second address decoder, and send the data access request to the second node through the first high-speed interconnection port, where the second address decoder records the correspondence between destination addresses and high-speed interconnection ports.
In an embodiment, the data access system may further include a configuration node, which may configure the first address decoder, the second address decoder, and the third address decoder before the first node generates a data access request. Specifically, the configuration node obtains at least one local physical address of the memory of the second node from the second node, determines at least one corresponding destination address according to the at least one local physical address, and configures the third address decoder accordingly; it configures the second address decoder according to the at least one destination address in combination with the high-speed interconnection port between the second node and the interconnection chip; and it configures the first address decoder according to the at least one destination address in combination with the chip port between the interconnection chip and the computing chip.
It can be understood that the first, second, and third address decoders configured by the configuration node ensure that a data access request generated by the computing chip is routed and addressed through the address decoders and delivered to the CPU of the second node corresponding to its destination address for memory reads and writes. This avoids the waiting time for network card queue preparation and improves the efficiency with which the first node reads and writes the expanded memory: the latency can even reach the microsecond level (Ethernet latency is at the millisecond level), and the bandwidth can reach 400 GB, offering higher bandwidth and lower latency than an RDMA network card whose bandwidth is only 100 GB.
In an embodiment, when the configuration node obtains at least one local physical address of the memory of the second node from the second node 120, it may determine, according to the size of the memory of the second node and in combination with service requirements, the local physical addresses of the expanded memory that the second node sets aside for use by the first node 110.
Optionally, the expanded memory used by the first node may be part of the memory of the second node. This part of the expanded memory may be processed through a memory isolation technology so that the second node cannot access it, improving the security of the data stored in the expanded memory.
Optionally, the correspondence recorded by the third address decoder may be: local physical address = destination address - base address, where the base address is the starting address of an address segment, also called the first address or segment address; destination addresses belonging to the same address segment share the same base address.
In an embodiment, when matching the first destination address against the first address decoder and the second address decoder, either the complete first destination address or only part of it may be matched against the addresses in the decoder, thereby improving matching efficiency and hence the efficiency of data access.
Optionally, the first port may be determined according to the base address and length of the first destination address in the data access request. Specifically, the computing chip may match the base address and length of each destination address recorded in the first address decoder against the base address and length of the first destination address, and determine the first port corresponding to the matched destination address. Similarly, the interconnection chip may match the base address and length of each destination address recorded in the second address decoder against the base address and length of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
Optionally, the first port may be determined according to the high-order bits of the first destination address. The computing chip may match the high-order bits of each destination address recorded in the first address decoder against the high-order bits of the first destination address and determine the first port corresponding to the matched destination address, where the number of high-order bits is determined according to the memory size of the second node. Similarly, the interconnection chip may match the high-order bits of each destination address recorded in the second address decoder against the high-order bits of the first destination address and determine the first high-speed interconnection port corresponding to the matched destination address. Details are not repeated here.
应理解,由于第二节点提供的拓展内存对应的物理地址数量为多个,因此第一、第二译码器中记录的部分目的地址对应的端口可能会是同一个,对应相同端口的目的地址位于同一个内存中,这些对应相同端口的目的地址,其基地址和长度是相同的,或者,其高位地址是相同的,因此可以通过匹配基地址和长度,或者匹配高位地址来确定第一目的地址对应的端口,从而提高匹配效率。It should be understood that since the expanded memory provided by the second node corresponds to multiple physical addresses, the ports corresponding to part of the destination addresses recorded in the first and second decoders may be the same, corresponding to the destination addresses of the same port Located in the same memory, these destination addresses corresponding to the same port have the same base address and length, or the high address is the same, so the first purpose can be determined by matching the base address and length, or matching the high address The port corresponding to the address, thus improving the matching efficiency.
It should be noted that, for a detailed description of matching by base address and length, reference may be made to the example in the embodiment of FIG. 4; details are not repeated here.

It should be noted that, if the data access request reads data from the memory of the second node, then after processing the request the second node may return the read data to the first node along the original route according to the source address in the data access request, with reference to the first, second, and third address decoders; details are not repeated here.
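The return path described above can be sketched as a symmetric table lookup: the read request is routed toward the destination address, and the completion carrying the data is routed back using the request's source address through the same kind of decoder tables. All table contents, port names, and addresses below are invented for illustration and are not taken from the application.

```python
def lookup(table, addr):
    """Return the egress port whose address window contains addr."""
    for base, length, port in table:
        if base <= addr < base + length:
            return port
    return None


def route(hop_tables, addr):
    """Resolve the egress port at each hop for the given address."""
    return [lookup(table, addr) for table in hop_tables]


# Forward direction: the request is routed toward the destination address
# (first decoder in the computing chip, then second decoder in the
# interconnection chip).
fwd_tables = [
    [(0x1_0000_0000, 1 << 30, "compute_chip_port0")],
    [(0x1_0000_0000, 1 << 30, "hs_port0")],
]
# Return direction: the read data is routed toward the source address,
# traversing equivalent tables in reverse order.
rev_tables = [
    [(0x0_4000_0000, 1 << 30, "hs_port_back")],
    [(0x0_4000_0000, 1 << 30, "compute_chip_port_back")],
]

assert route(fwd_tables, 0x1_0000_2000) == ["compute_chip_port0", "hs_port0"]
assert route(rev_tables, 0x0_4000_2000) == ["hs_port_back", "compute_chip_port_back"]
```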
In summary, in the data access method provided in this application, the high-speed interconnection port of the first node is connected to the high-speed interconnection port of the second node through a cable, and the first node implements addressing with the aid of the address decoders, so that the data access request is sent to the memory of the second node corresponding to the first destination address, thereby expanding the memory of the first node. This approach requires no additional network adapter or router and avoids the preparation time of a network adapter queue unit, so the first node accesses the memory of the second node with high efficiency and low latency. In addition, the number of second nodes can be increased by adding high-speed interconnection ports, which raises the expandable memory capacity of the first node, so that the first node can support services in more application scenarios.
FIG. 6 is a schematic structural diagram of a computing node 600 provided in this application. The computing node 600 may be the first node 110 described above, and may include a computing chip 111 and an interconnection chip 112. The computing chip 111 may include a generating unit 1111, a first matching unit 1112, and a second sending unit 1113; the interconnection chip 112 may include a first sending unit 1121 and a second matching unit 1122.

The generating unit 1111 is configured to generate a data access request, where the data access request is used to request data in the memory of the second node; specifically, step S510 in the embodiment of FIG. 5 may be performed.

The first sending unit 1121 is configured to send the data access request to the second node through the cable, so that the second node converts the first destination address in the data access request into a local physical address corresponding to the first destination address and accesses the data in the memory of the second node according to the local physical address; specifically, step S520 in the embodiment of FIG. 5 may be performed.
In an embodiment, the first high-speed interconnection port of the interconnection chip 112 is connected to the second high-speed interconnection port of the processor in the second node through a cable. The generating unit 1111 is configured to generate the data access request through the computing chip 111; the second sending unit 1113 is configured to send the data access request to the interconnection chip 112 through the computing chip 111; and the first sending unit 1121 is configured to send the data access request to the second node through the interconnection chip 112 over the cable.

In an embodiment, the computing chip 111 is connected to the interconnection chip 112 through a port, and the computing chip 111 includes a first address decoder. The first matching unit 1112 is configured to determine, through the computing chip 111, the first port according to the first destination address in the data access request and the first address decoder, where the first address decoder is used to record the correspondence between destination addresses and the ports of the computing chip; the second sending unit 1113 is configured to send, through the computing chip, the data access request to the interconnection chip 112 through the first port.

In an embodiment, the interconnection chip 112 includes a second address decoder. The second matching unit 1122 is configured to determine, through the interconnection chip 112, the first high-speed interconnection port according to the first destination address and the second address decoder, where the second address decoder is used to record the correspondence between destination addresses and high-speed interconnection ports; the first sending unit 1121 is configured to send, through the interconnection chip 112, the data access request to the second node through the first high-speed interconnection port.

In an embodiment, the first matching unit 1112 is configured to match, through the computing chip 111, the base address and length of each destination address recorded in the first address decoder against the base address and length of the first destination address, and determine the first port corresponding to the matched destination address; the second matching unit 1122 is configured to match, through the interconnection chip 112, the base address and length of each destination address recorded in the second address decoder against the base address and length of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address.

In an embodiment, the first matching unit 1112 is configured to match, through the computing chip 111, the high-order address of each destination address recorded in the first address decoder against the high-order address of the first destination address, and determine the first port corresponding to the matched destination address, where the number of bits of the high-order address is determined according to the memory size of the second node; the second matching unit 1122 is configured to match, through the interconnection chip 112, the high-order address of each destination address recorded in the second address decoder against the high-order address of the first destination address, and determine the first high-speed interconnection port corresponding to the matched destination address.

In an embodiment, the first high-speed interconnection port and the second high-speed interconnection port are high-speed serial bus ports, and the first port is a high-speed serial bus port.
The first node 110 may be a physical server, for example an X86 server or an ARM server; it may also be a virtual machine (VM) implemented on a general-purpose physical server using network functions virtualization (NFV) technology, where a virtual machine is a complete, software-simulated computer system with full hardware functionality that runs in a fully isolated environment, such as a virtual device in cloud computing, which is not specifically limited in this application; it may also be a server cluster composed of multiple physical servers or virtual machines.
FIG. 7 is a schematic structural diagram of a storage node 700 provided in this application. The storage node 700 may be the second node 120 in the embodiments of FIG. 1 to FIG. 6, and may include a receiving unit 121 and a conversion unit 122.

The receiving unit 121 is configured to receive a data access request, where the data access request is generated by the first node and sent by the first node through a cable.

The conversion unit 122 is configured to convert the first destination address in the data access request into a local physical address corresponding to the first destination address, and access the data in the memory of the second node according to the local physical address.

In an embodiment, the second node 120 includes a third address decoder. The conversion unit 122 is configured to determine, according to the first destination address and the third address decoder, the local physical address corresponding to the first destination address, where the third address decoder is used to record the correspondence between destination addresses and local physical addresses.
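The translation performed by the third address decoder described above can be illustrated with a small sketch. The window layout, addresses, and function names below are assumptions for illustration, not the patented implementation: each decoder entry maps a window of destination addresses onto a window of the second node's local physical memory, preserving the offset within the window.

```python
def to_local_physical(decoder, dest_addr):
    """Translate a destination address into a local physical address.

    Each decoder entry is (dest_base, length, local_base): destination
    addresses in [dest_base, dest_base + length) map onto local physical
    addresses starting at local_base, keeping the in-window offset.
    """
    for dest_base, length, local_base in decoder:
        if dest_base <= dest_addr < dest_base + length:
            return local_base + (dest_addr - dest_base)
    raise ValueError("destination address not mapped by this node")


# Assumed example mapping: a 256 MiB window of the first node's address
# space backed by local physical memory starting at 0x8000_0000.
third_decoder = [(0x2_0000_0000, 256 << 20, 0x8000_0000)]

assert to_local_physical(third_decoder, 0x2_0000_1000) == 0x8000_1000
```

The data is then read or written at the resulting local physical address, exactly as for a local memory access on the second node.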
In an embodiment, the first high-speed interconnection port of the first node is connected to the second high-speed interconnection port of the processor in the second node through a cable, and the first high-speed interconnection port and the second high-speed interconnection port are high-speed serial bus ports.

The second node 120 may be a physical server, for example an X86 server or an ARM server; it may also be a virtual machine (VM) implemented on a general-purpose physical server using network functions virtualization (NFV) technology, where a virtual machine is a complete, software-simulated computer system with full hardware functionality that runs in a fully isolated environment, such as a virtual device in cloud computing, which is not specifically limited in this application; it may also be a server cluster composed of multiple physical servers or virtual machines.
In summary, in the first node and the second node provided in this application, the high-speed interconnection port of the first node is connected to the high-speed interconnection port of the second node through a cable, and the first node implements addressing with the aid of the address decoders, so that the data access request is sent to the memory of the second node corresponding to the first destination address, thereby expanding the memory of the first node. This approach requires no additional network adapter or router and avoids the preparation time of a network adapter queue unit, so the first node accesses the memory of the second node with high efficiency and low latency. In addition, the number of second nodes can be increased by adding high-speed interconnection ports, which raises the expandable memory capacity of the first node, so that the first node can support services in more application scenarios.
FIG. 8 is a schematic structural diagram of a computing device provided in this application. The computing device 800 may be the first node 110 or the second node 120 in the embodiments of FIG. 1 to FIG. 7. The computing device may be a physical server, a virtual machine, or a server cluster, or may be a chip (system) or another component that can be disposed in a physical server or a virtual machine, which is not limited in this application.

Further, the computing device 800 includes a processor 801, a memory 802, and a communication interface 803, where the processor 801, the memory 802, and the communication interface 803 communicate through a bus 805, or by other means such as wireless transmission.

The processor 801 may be composed of at least one general-purpose processor, for example a CPU, an NPU, or a combination of a CPU and a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The processor 801 executes various types of digitally stored instructions, for example software or firmware programs stored in the memory 802, which enable the computing device 800 to provide a wide variety of services.

In specific implementation, the processor 801 may be the computing chip or the interconnection chip in the first node described above, or may be the processor chip in the second node, which is not specifically limited in this application. In specific implementation, as an embodiment, the processor 801 may include one or more CPUs, for example CPU0 and CPU1 shown in FIG. 8.

In specific implementation, as an embodiment, the computing device 800 may also include multiple processors, for example the processor 801 and the processor 804 shown in FIG. 8. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).

The memory 802 is configured to store program code, and the processor 801 controls execution of the program code to perform the processing steps in any one of the foregoing embodiments of FIG. 1 to FIG. 7. The program code may include one or more software modules. When the computing device is the first node 110, the one or more software modules may be the generating unit 1111, the first matching unit 1112, the second sending unit 1113, the second matching unit 1122, and the first sending unit 1121 in the embodiment of FIG. 6; for the specific implementation, reference may be made to the method embodiment of FIG. 6, and details are not repeated here. When the computing device is the second node 120, the one or more software modules may be the receiving unit 121 and the conversion unit 122 in the embodiment of FIG. 7; for the specific implementation, reference may be made to the method embodiment of FIG. 7, and details are not repeated here.

The memory 802 may include a read-only memory and a random access memory, and provides instructions and data to the processor 801. The memory 802 may also include a non-volatile random access memory. For example, the memory 802 may also store information about device types.
The memory 802 may be a volatile memory or a non-volatile memory, or may include both a volatile memory and a non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), used as an external cache. By way of example rather than limitation, many forms of RAM are available, such as a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchlink dynamic random access memory (SLDRAM), and a direct rambus random access memory (DR RAM). The memory may alternatively be a hard disk, a USB flash drive, a flash memory, a secure digital memory card (SD card), a memory stick, or the like; the hard disk may be a hard disk drive (HDD), a solid state disk (SSD), a mechanical hard disk, or the like, which is not specifically limited in this application.
The communication interface 803 may be a wired interface (for example an Ethernet interface), an internal interface (for example a peripheral component interconnect express (PCIe) bus interface), or a wireless interface (for example a cellular network interface or a wireless local area network interface), and is used to communicate with other servers or modules. In specific implementation, the communication interface 803 may be used to receive a packet, so that the processor 801 or the processor 804 processes the packet.

The bus 805 may be a peripheral component interconnect express (PCIe) bus, an extended industry standard architecture (EISA) bus, a unified bus (Ubus or UB), a compute express link (CXL), a cache coherent interconnect for accelerators (CCIX) bus, or the like. The bus 805 may be divided into an address bus, a data bus, a control bus, and the like.

In addition to a data bus, the bus 805 may also include a power bus, a control bus, a status signal bus, and the like. However, for clarity of illustration, the various buses are all labeled as the bus 805 in the figure.
It should be noted that FIG. 8 is only one possible implementation of the embodiments of this application. In practical applications, the computing device 800 may include more or fewer components, which is not limited here. For content not shown or not described in the embodiments of this application, reference may be made to the related descriptions in the foregoing embodiments of FIG. 1 to FIG. 7, and details are not repeated here.

It should be understood that the computing device 800 shown in FIG. 8 may also be a computer cluster composed of at least one physical server. For details, reference may be made to the descriptions of the specific forms of the data access system in the embodiments of FIG. 1 to FIG. 7; to avoid repetition, details are not repeated here.
An embodiment of this application provides a chip. The chip may be used in a server with an X86-architecture processor (also referred to as an X86 server), a server with an ARM-architecture processor (also referred to as an ARM server), or the like. The chip may include the foregoing devices or logic circuits. When the chip runs on a server, the server is enabled to perform the data access method described in the foregoing method embodiments.

In specific implementation, the chip may be the computing chip or the interconnection chip in the first node described above, or may be the processor chip in the second node.

An embodiment of this application provides a mainboard, also referred to as a printed circuit board (PCB). The mainboard includes a processor, and the processor is configured to execute program code to implement the data access method described in the foregoing method embodiments. Optionally, the mainboard may further include a memory, configured to store the program code for execution by the processor.
An embodiment of this application provides a computer-readable storage medium. The computer-readable storage medium stores computer instructions; when the computer instructions are run on a computer, the computer is enabled to perform the data access method described in the foregoing method embodiments.

An embodiment of this application provides a computer program product containing instructions, including a computer program or instructions; when the computer program or instructions are run on a computer, the computer is enabled to perform the data access method described in the foregoing method embodiments.
The foregoing embodiments may be implemented completely or partially by software, hardware, firmware, or any combination thereof. When software is used, the foregoing embodiments may be implemented completely or partially in the form of a computer program product. The computer program product includes at least one computer instruction. When the computer program instructions are loaded or executed on a computer, the procedures or functions according to the embodiments of the present invention are generated completely or partially. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage node such as a server or a data center that includes at least one set of available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium. The semiconductor medium may be an SSD.
The foregoing descriptions are merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent modification or replacement readily figured out by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (20)

1. A data access system, wherein the data access system comprises a first node and a second node, and the first node is connected to the second node through a cable;

    the first node is configured to generate a data access request, wherein the data access request is used to request data in a memory of the second node;

    the first node is configured to send the data access request to the second node through the cable; and

    the second node is configured to convert a first destination address in the data access request into a local physical address corresponding to the first destination address, and access the data in the memory of the second node according to the local physical address.
2. The system according to claim 1, wherein the first node comprises a computing chip and an interconnection chip, and a first high-speed interconnection port of the interconnection chip is connected to a second high-speed interconnection port of a processor in the second node through the cable;

    the computing chip is configured to generate the data access request and send the data access request to the interconnection chip; and

    the interconnection chip is configured to send the data access request to the second node through the cable.
3. The system according to claim 2, wherein the computing chip is connected to the interconnection chip through a port, and the computing chip comprises a first address decoder; and

    the computing chip is specifically configured to: generate the data access request, determine a first port according to the first destination address in the data access request and the first address decoder, and send the data access request to the interconnection chip through the first port, wherein the first address decoder is used to record a correspondence between destination addresses and ports of the computing chip.
4. The system according to claim 3, wherein the interconnection chip comprises a second address decoder; and

    the interconnection chip is specifically configured to: determine the first high-speed interconnection port according to the first destination address and the second address decoder, and send the data access request to the second node through the first high-speed interconnection port, wherein the second address decoder is used to record a correspondence between destination addresses and high-speed interconnection ports.
5. The system according to claim 4, wherein the second node comprises a third address decoder; and

    the second node is specifically configured to: determine, according to the first destination address and the third address decoder, the local physical address corresponding to the first destination address, wherein the third address decoder is used to record a correspondence between destination addresses and local physical addresses.
6. The system according to claim 5, wherein the data access system further comprises a configuration node;

    the configuration node is configured to obtain, from the second node, at least one local physical address of the memory of the second node;

    the configuration node is configured to determine at least one corresponding destination address according to the at least one local physical address, and configure the third address decoder;

    the configuration node is further configured to configure the second address decoder according to the at least one destination address in combination with the high-speed interconnection port between the second node and the interconnection chip; and

    the configuration node is further configured to configure the first address decoder according to the at least one destination address in combination with the chip port between the interconnection chip and the computing chip.
  7. The system according to any one of claims 4 to 6, wherein:
    the computing chip is specifically configured to match the base address and length of each destination address recorded in the first address decoder against the base address and length of the first destination address, and to determine the first port corresponding to the matched destination address;
    the interconnection chip is specifically configured to match the base address and length of each destination address recorded in the second address decoder against the base address and length of the first destination address, and to determine the first high-speed interconnection port corresponding to the matched destination address.
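The base-address/length matching of claim 7 is the classic address-window lookup. A minimal sketch, assuming each decoder entry is a (base, length, port) tuple (a representation chosen for the example, not specified by the patent):

```python
# Minimal sketch of claim-7 matching: a destination address selects the
# decoder entry whose window [base, base + length) contains it, and the
# lookup yields that entry's port. Entry values below are illustrative.

def match_port(entries, dest_addr):
    """Return the port of the first entry whose address window covers dest_addr."""
    for base, length, port in entries:
        if base <= dest_addr < base + length:
            return port
    raise LookupError(f"no decoder entry covers {dest_addr:#x}")
```

The same routine serves both the first address decoder (ports of the computing chip) and the second address decoder (high-speed interconnection ports); only the entry tables differ.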
  8. The system according to any one of claims 4 to 6, wherein:
    the computing chip is specifically configured to match the upper address bits of each destination address recorded in the first address decoder against the upper address bits of the first destination address, and to determine the first port corresponding to the matched destination address, wherein the number of upper address bits is determined according to the memory size of the second node;
    the interconnection chip is specifically configured to match the upper address bits of each destination address recorded in the second address decoder against the upper address bits of the first destination address, and to determine the first high-speed interconnection port corresponding to the matched destination address.
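The upper-address-bit matching of claim 8 can be sketched as follows. The key assumption (made here for illustration, and consistent with the claim's statement that the bit count depends on the second node's memory size) is that a power-of-two memory size reserves the low log2(size) bits as the in-node offset, so the remaining upper bits form the decoder key:

```python
# Sketch of claim-8 matching: strip the in-node offset bits from the
# destination address; the remaining upper bits index the decoder table.
# Assumes mem_size is a power of two; all concrete values are illustrative.

def upper_bits(addr: int, mem_size: int) -> int:
    """Return the upper bits of addr once the in-node offset is removed."""
    offset_bits = mem_size.bit_length() - 1  # log2(mem_size) for powers of two
    return addr >> offset_bits

def match_port_by_upper(decoder: dict, dest_addr: int, mem_size: int):
    """Look up the port keyed by the upper bits of the destination address."""
    return decoder[upper_bits(dest_addr, mem_size)]
```

Compared with the base/length scheme of claim 7, this reduces the match to a single shift and exact compare, at the cost of fixing all windows to the same power-of-two size.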
  9. The system according to any one of claims 2 to 8, wherein the first high-speed interconnection port and the second high-speed interconnection port are high-speed serial bus ports, and the first port is a high-speed serial bus port.
  10. A data access method, wherein the method is applied to a data access system, the data access system comprises a first node and a second node, and the first node is connected to the second node through a cable, the method comprising:
    generating, by the first node, a data access request, wherein the data access request is used to request data in the memory of the second node;
    sending, by the first node, the data access request to the second node through the cable;
    converting, by the second node, a first destination address in the data access request into a local physical address corresponding to the first destination address, and accessing the data in the memory of the second node according to the local physical address.
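The receiving side of the method of claim 10 reduces to a translate-then-access step. A minimal sketch, modelling the second node's memory and its third address decoder as dictionaries (an illustrative simplification, not the patent's hardware structure):

```python
# End-to-end sketch of the second node's role in claim 10: translate the
# incoming first destination address to a local physical address via the
# decoder, then access local memory at that address. Illustrative only.

def handle_request(third_decoder: dict, memory: dict, dest_addr: int):
    local_pa = third_decoder[dest_addr]  # destination address -> local physical address
    return memory[local_pa]              # read the data at the local address
```

The point of the scheme is that the first node addresses remote memory directly over the cable, and only this final translation is node-local.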
  11. The method according to claim 10, wherein the first node comprises a computing chip and an interconnection chip, and a first high-speed interconnection port of the interconnection chip is connected through a cable to a second high-speed interconnection port of a processor in the second node;
    the generating, by the first node, of the data access request comprises:
    generating, by the computing chip, the data access request, and sending the data access request to the interconnection chip;
    the sending, by the first node, of the data access request to the second node through the cable comprises:
    sending, by the interconnection chip, the data access request to the second node through the cable.
  12. The method according to claim 11, wherein the computing chip is connected to the interconnection chip through a port, and the computing chip comprises a first address decoder;
    the generating, by the computing chip, of the data access request and the sending of the data access request to the interconnection chip comprise:
    generating, by the computing chip, the data access request, determining the first port according to the first destination address in the data access request and the first address decoder, and sending the data access request to the interconnection chip through the first port, wherein the first address decoder is used to record correspondences between destination addresses and ports of the computing chip.
  13. The method according to claim 12, wherein the interconnection chip comprises a second address decoder;
    the sending, by the interconnection chip, of the data access request to the second node through the cable comprises:
    determining, by the interconnection chip, the first high-speed interconnection port according to the first destination address and the second address decoder, and sending the data access request to the second node through the first high-speed interconnection port, wherein the second address decoder is used to record correspondences between destination addresses and high-speed interconnection ports.
  14. The method according to claim 13, wherein the second node comprises a third address decoder;
    the converting, by the second node, of the first destination address in the data access request into the local physical address corresponding to the first destination address comprises:
    determining, by the second node, the local physical address corresponding to the first destination address according to the first destination address and the third address decoder, wherein the third address decoder is used to record correspondences between destination addresses and local physical addresses.
  15. The method according to claim 14, wherein the data access system further comprises a configuration node, and the method further comprises:
    obtaining, by the configuration node, at least one local physical address of the memory of the second node from the second node;
    determining, by the configuration node, at least one corresponding destination address according to the at least one local physical address, and configuring the third address decoder;
    configuring, by the configuration node, the second address decoder according to the at least one destination address in combination with the high-speed interconnection port between the second node and the interconnection chip;
    configuring, by the configuration node, the first address decoder according to the at least one destination address in combination with the chip port between the interconnection chip and the computing chip.
  16. The method according to any one of claims 13 to 15, wherein the determining, by the computing chip, of the first port according to the first destination address in the data access request and the first address decoder comprises:
    matching, by the computing chip, the base address and length of each destination address recorded in the first address decoder against the base address and length of the first destination address, and determining the first port corresponding to the matched destination address;
    and the determining, by the interconnection chip, of the first high-speed interconnection port according to the first destination address and the second address decoder comprises:
    matching, by the interconnection chip, the base address and length of each destination address recorded in the second address decoder against the base address and length of the first destination address, and determining the first high-speed interconnection port corresponding to the matched destination address.
  17. The method according to any one of claims 13 to 15, wherein the determining, by the computing chip, of the first port according to the first destination address in the data access request and the first address decoder comprises:
    matching, by the computing chip, the upper address bits of each destination address recorded in the first address decoder against the upper address bits of the first destination address, and determining the first port corresponding to the matched destination address, wherein the number of upper address bits is determined according to the memory size of the second node;
    and the determining, by the interconnection chip, of the first high-speed interconnection port according to the first destination address and the second address decoder comprises:
    matching, by the interconnection chip, the upper address bits of each destination address recorded in the second address decoder against the upper address bits of the first destination address, and determining the first high-speed interconnection port corresponding to the matched destination address.
  18. The method according to any one of claims 10 to 17, wherein the first high-speed interconnection port and the second high-speed interconnection port are high-speed serial bus ports, and the first port is a high-speed serial bus port.
  19. A computing node, applied to a data access system, wherein the data access system further comprises a storage node, and the computing node comprises a computing chip and an interconnection chip, wherein the computing chip is connected to the interconnection chip through a chip port, and the interconnection chip is connected to the storage node through a high-speed interconnection port and a cable;
    the computing chip is configured to generate a data access request and send the data access request to the interconnection chip, wherein the data access request comprises a first destination address, and the first destination address indicates a location in the memory of the storage node;
    the interconnection chip is configured to send the data access request to the storage node according to the first destination address.
  20. A storage node, applied to a data access system, wherein the data access system further comprises a computing node, the storage node comprises a processor and a memory, and the storage node is connected to the computing node through a high-speed interconnection port of the processor and a cable;
    the processor is configured to receive, through the high-speed interconnection port, a data access request sent by the computing node, convert a first destination address carried in the data access request into a local physical address of the storage node corresponding to the first destination address, and access the data in the memory according to the local physical address.
PCT/CN2022/118756 2021-09-30 2022-09-14 Data access system and method, and related device WO2023051248A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111160189.7 2021-09-30
CN202111160189.7A CN115905036A (en) 2021-09-30 2021-09-30 Data access system, method and related equipment

Publications (1)

Publication Number Publication Date
WO2023051248A1 true WO2023051248A1 (en) 2023-04-06

Family

ID=85727930

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/118756 WO2023051248A1 (en) 2021-09-30 2022-09-14 Data access system and method, and related device

Country Status (2)

Country Link
CN (1) CN115905036A (en)
WO (1) WO2023051248A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5544319A (en) * 1992-03-25 1996-08-06 Encore Computer U.S., Inc. Fiber optic memory coupling system with converter transmitting and receiving bus data in parallel fashion and diagnostic data in serial fashion
CN103957155A (en) * 2014-05-06 2014-07-30 华为技术有限公司 Message transmission method and device and interconnection interface
CN103984638A (en) * 2013-02-12 2014-08-13 Lsi股份有限公司 Chained, scalable storage devices
CN111344964A (en) * 2018-04-25 2020-06-26 西部数据技术公司 Node configuration in an optical network
CN111490946A (en) * 2019-01-28 2020-08-04 阿里巴巴集团控股有限公司 FPGA connection implementation method and device based on OpenC L framework


Also Published As

Publication number Publication date
CN115905036A (en) 2023-04-04

Similar Documents

Publication Publication Date Title
US10095645B2 (en) Presenting multiple endpoints from an enhanced PCI express endpoint device
US20210255915A1 (en) Cloud-based scale-up system composition
US20180024957A1 (en) Techniques to enable disaggregation of physical memory resources in a compute system
US10725957B1 (en) Uniform memory access architecture
CN109445905B (en) Virtual machine data communication method and system and virtual machine configuration method and device
WO2020247042A1 (en) Network interface for data transport in heterogeneous computing environments
WO2019233322A1 (en) Resource pool management method and apparatus, resource pool control unit, and communication device
US10198373B2 (en) Uniform memory access architecture
US11829309B2 (en) Data forwarding chip and server
US11741034B2 (en) Memory device including direct memory access engine, system including the memory device, and method of operating the memory device
WO2023125524A1 (en) Data storage method and system, storage access configuration method and related device
US10437747B2 (en) Memory appliance couplings and operations
EP4002139A2 (en) Memory expander, host device using memory expander, and operation method of server system including memory expander
Shim et al. Design and implementation of initial OpenSHMEM on PCIe NTB based cloud computing
US11962675B2 (en) Interface circuit for providing extension packet and processor including the same
WO2023186143A1 (en) Data processing method, host, and related device
CN115840620B (en) Data path construction method, device and medium
US20200125494A1 (en) Cache sharing in virtual clusters
WO2023051248A1 (en) Data access system and method, and related device
US10909044B2 (en) Access control device, access control method, and recording medium containing access control program
WO2022271327A1 (en) Memory inclusivity management in computing systems
US20200387396A1 (en) Information processing apparatus and information processing system
CN116185553A (en) Data migration method and device and electronic equipment
KR20220067992A (en) Memory controller performing selective and parallel error correction, system having the same and operating method of memory device
CN113722110B (en) Computer system, memory access method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22874635

Country of ref document: EP

Kind code of ref document: A1