CN111382099A - RDMA (Remote Direct Memory Access) technology-based distributed high-performance computing method - Google Patents

RDMA (Remote Direct Memory Access) technology-based distributed high-performance computing method

Info

Publication number
CN111382099A
Authority
CN
China
Prior art keywords
rdma
computing
memory
distributed high
technology
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811637858.3A
Other languages
Chinese (zh)
Inventor
于大鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Taihong Information Technology Co ltd
Original Assignee
Wuxi Taihong Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Taihong Information Technology Co ltd filed Critical Wuxi Taihong Information Technology Co ltd
Priority to CN201811637858.3A priority Critical patent/CN111382099A/en
Publication of CN111382099A publication Critical patent/CN111382099A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28 Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal

Abstract

The invention discloses an RDMA technology-based distributed high-performance computing method comprising the following steps: sending a request to jointly build an RDMA computing system and selecting the specified number of nodes that meet the requirements; locking the corresponding memory space, storage space, and computing resources as the contract requires; and establishing the RDMA computing system. Shared-memory consistency is maintained by directory control, with four directory states: no copy, shared read, dirty, and shared dirty. The method effectively utilizes and integrates idle computing resources (compute, storage, and network bandwidth) across society, maximizes the performance obtained from computing resources, provides maximum elastic headroom for hotspot compute/storage/network resources, prevents the severe congestion or data explosion that hotspots cause, and, together with a trusted computing chip, provides users with highly reliable and highly trusted compute/storage/network resources.

Description

RDMA (Remote Direct Memory Access) technology-based distributed high-performance computing method
Technical Field
The invention relates to the technical field of data storage and computing, and in particular to a distributed high-performance computing method based on RDMA (Remote Direct Memory Access) technology.
Background
The rapid development of the Internet has ushered in the era of big data, and with it a variety of cloud computing services. Existing cloud computing and virtualization technologies maximize the utilization efficiency of computing resources (compute, storage, and network) but do not maximize the performance obtained from them, in particular from the idle resources distributed across individual computing devices.
Remote Direct Memory Access (RDMA) means accessing remote memory directly, without the direct involvement of the host operating system on either side, which provides high bandwidth and low latency.
The invention provides a distributed high-performance computing method based on RDMA technology that maximizes the performance obtained from computing resources, provides maximum elastic headroom for hotspot compute/storage/network resources, and prevents the severe congestion or data explosion that hotspots cause.
Disclosure of Invention
The main purpose of the invention is to provide an RDMA technology-based distributed high-performance computing method that effectively solves the problems described in the background.
To achieve this purpose, the invention provides the following technical solution: an RDMA technology-based distributed high-performance computing method comprising the following steps:
step S1: through a network interface card that supports RDMA, sending a broadcast request to build an RDMA technology-based distributed high-performance computing system to network interface devices that likewise use RDMA-capable network interface cards, and, from the responses received, selecting the specified number of nodes that meet the computing requirements;
step S2: issuing electronic contracts to the selected nodes; after a selected node accepts the contract, it locks the corresponding memory space, storage space, and computing resources as the contract requires, for use by the system;
step S3: deciding, according to the security requirements, whether to build a shared-memory resource pool; memory sharing is the most efficient communication mode and also a comparatively efficient mode of parallel computing;
step S4: maintaining shared-memory consistency by directory control; each node (virtual machine) is assigned a number, such as 0, 1, ..., n, where n is an integer, and the two high-order bits are defined as the directory state; the directory state is one of four states: no copy (NC), shared read (SH), dirty (D), and shared dirty (SD); after the directory is modified, the state is written back locally.
Preferably, in step S4, the directory state is managed as follows: after the host starts, a one-time memory zeroing operation may be performed to prevent invalid data from being read during execution; after zeroing, the directory state of every storage unit is no copy (NC). In any state, a memory write sets the next directory state to dirty (D, meaning the content in the Cache differs from the local value in main memory). In states NC and SH, a memory read sets the next directory state to shared read (SH). In states D and SD, a memory read sets the next directory state to shared dirty (SD, meaning the content of every Cache holding a copy differs from the local value in main memory).
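The transition rules in this paragraph can be sketched as a small state machine. This is a minimal illustration, not the patented implementation: the names `DirState`, `next_state`, and `replay`, the numeric values given to the four states, and the string labels "read" and "write" are assumptions introduced here.

```python
from enum import IntEnum


class DirState(IntEnum):
    """The four directory states named in the text (numeric values are assumed)."""
    NC = 0  # no copy: the state of every storage unit after zeroing
    SH = 1  # shared read
    D = 2   # dirty: Cache content differs from the local main-memory value
    SD = 3  # shared dirty: every Cache copy differs from main memory


def next_state(state, op):
    """Return the next directory state for a memory 'read' or 'write'."""
    if op == "write":
        # A write leads to D from any state.
        return DirState.D
    if op == "read":
        # NC and SH go to SH on a read; D and SD go to SD.
        if state in (DirState.NC, DirState.SH):
            return DirState.SH
        return DirState.SD
    raise ValueError(f"unknown operation: {op!r}")


def replay(ops, start=DirState.NC):
    """Apply a sequence of operations and return every state visited."""
    state = start
    trace = [state]
    for op in ops:
        state = next_state(state, op)
        trace.append(state)
    return trace
```

Under these assumptions, replaying read, write, read, read from the initial NC state visits SH, D, SD, and SD in turn.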
Preferably, in step S4, in addition to the directory state, the directory content further includes copy-presence flag bits for the shared state and/or a dirty target ID.
Preferably, in step S1, the network interface card uses an intelligent network interface chip that supports RDMA.
Preferably, in step S1, the required number of nodes is selected using computing power, storage space, Cache size, and network bandwidth as the selection criteria.
Preferably, in step S1, both the sending end and the receiving end of the request to build the RDMA technology-based distributed high-performance computing system are fitted with a trusted computing card, or have an encryption/decryption chip on the motherboard.
Preferably, in step S2, 1 to n nodes (virtual machines) are defined; after a selected node accepts the contract, the computing power, space, Cache, and network resources that the algorithm requires are virtualized for use by the system.
Preferably, in step S2, the consideration specified in the electronic contract is intended to be a virtual digital currency or another medium of exchange accepted by both parties to the transaction.
Preferably, in step S3, memory sharing is implemented through the network card and the switch, and the memory is uniformly addressed.
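One way to picture uniform addressing is a single global address range formed by concatenating the regions that each node contributes. The sketch below is an illustration under stated assumptions only: the class `UnifiedAddressMap`, the concatenation layout, and byte-granular addresses are hypothetical, and the actual transfer through the RDMA network card and switch is not modeled.

```python
from bisect import bisect_right
from itertools import accumulate


class UnifiedAddressMap:
    """Maps a global address to (node ID, local offset). Layout is assumed."""

    def __init__(self, region_sizes):
        # region_sizes: dict of node ID -> number of bytes the node contributes.
        self.node_ids = list(region_sizes)
        sizes = [region_sizes[n] for n in self.node_ids]
        if any(size <= 0 for size in sizes):
            raise ValueError("every region must have a positive size")
        # bases[i] is the first global address owned by node_ids[i].
        self.bases = [0] + list(accumulate(sizes))[:-1]
        self.total = sum(sizes)

    def translate(self, global_addr):
        """Return the owning node and the offset inside its region."""
        if not 0 <= global_addr < self.total:
            raise IndexError("address outside the shared pool")
        # Find the last region whose base is <= global_addr.
        i = bisect_right(self.bases, global_addr) - 1
        return self.node_ids[i], global_addr - self.bases[i]
```

With three nodes contributing 4096, 8192, and 4096 bytes, global address 4096 resolves to offset 0 on the second node, and the pool ends at address 16383.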
Compared with the prior art, the invention has the following beneficial effects: by establishing RDMA links and a distributed computing system, working with a trusted computing chip, and adopting directory-state management for shared-memory consistency across multiple Caches, the method has the following advantages:
1) it effectively utilizes and integrates idle computing resources (compute, storage, and network bandwidth) across society;
2) it maximizes the performance obtained from computing resources;
3) it provides maximum elastic headroom for hotspot compute/storage/network resources and prevents the severe congestion or data explosion that hotspots cause;
4) it provides users with highly reliable and highly trusted compute/storage/network resources.
Drawings
FIG. 1 is a flowchart of directory state management in the RDMA technology-based distributed high-performance computing method of the invention;
FIG. 2 is a flowchart of RDMA link establishment and distributed computing system construction in the RDMA technology-based distributed high-performance computing method of the invention.
Detailed Description
To make the technical means, inventive features, objectives, and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.
Example 1
As shown in FIGS. 1 and 2, an RDMA technology-based distributed high-performance computing method comprises the following steps:
step S1: through a network interface card that supports RDMA, sending a broadcast request to build an RDMA technology-based distributed high-performance computing system to network interface devices that likewise use RDMA-capable network interface cards, and, from the responses received, selecting the specified number of nodes that meet the computing requirements;
step S2: issuing electronic contracts to the selected nodes; after a selected node accepts the contract, it locks the corresponding memory space, storage space, and computing resources as the contract requires, for use by the system;
step S3: deciding, according to the security requirements, whether to build a shared-memory resource pool; memory sharing is the most efficient communication mode and also a comparatively efficient mode of parallel computing;
step S4: maintaining shared-memory consistency by directory control; each node (virtual machine) is assigned a number, such as 0, 1, ..., n, where n is an integer, and the two high-order bits are defined as the directory state; the directory state is one of four states: no copy (NC), shared read (SH), dirty (D), and shared dirty (SD); after the directory is modified, the state is written back locally.
In step S4, the directory state is managed as follows: after the host starts, a one-time memory zeroing operation may be performed to prevent invalid data from being read during execution; after zeroing, the directory state of every storage unit is no copy (NC). In any state, a memory write sets the next directory state to dirty (D, meaning the content in the Cache differs from the local value in main memory). In states NC and SH, a memory read sets the next directory state to shared read (SH). In states D and SD, a memory read sets the next directory state to shared dirty (SD, meaning the content of every Cache holding a copy differs from the local value in main memory).
In step S4, in addition to the directory state, the directory content includes copy-presence flag bits for the shared state and/or a dirty target ID.
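A directory entry that carries the state in its two high-order bits, followed by either copy-presence flags or a dirty target ID, could be packed into one integer as sketched below. The 32-bit width, the numeric state codes, the choice to reuse the low 30 bits as a presence bitmap for SH/SD and as a node ID for D, and all function names are assumptions made for illustration; the text does not fix any of them.

```python
STATE_SHIFT = 30                      # assumed 32-bit entry: bits 31..30 = state
PAYLOAD_MASK = (1 << STATE_SHIFT) - 1  # low 30 bits carry flags or an ID

NC, SH, D, SD = 0, 1, 2, 3            # assumed numeric codes for the four states


def pack_entry(state, payload):
    """Combine a two-bit state and a 30-bit payload into one entry."""
    if not 0 <= state <= 3:
        raise ValueError("state must fit in two bits")
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload must fit in 30 bits")
    return (state << STATE_SHIFT) | payload


def unpack_entry(entry):
    """Split an entry back into (state, payload)."""
    return entry >> STATE_SHIFT, entry & PAYLOAD_MASK


def presence_bitmap(node_ids):
    """Build copy-presence flags: bit i is set when node i holds a copy."""
    bitmap = 0
    for node_id in node_ids:
        if not 0 <= node_id < STATE_SHIFT:
            raise ValueError("node ID does not fit in the bitmap")
        bitmap |= 1 << node_id
    return bitmap


def sharers(bitmap):
    """List the node IDs whose presence flag is set."""
    return [i for i in range(STATE_SHIFT) if (bitmap >> i) & 1]
```

For example, `pack_entry(SH, presence_bitmap([0, 3, 5]))` records that nodes 0, 3, and 5 hold read copies, while `pack_entry(D, 7)` records node 7 as the dirty target. A bitmap in the low bits limits this layout to 30 nodes; a larger system would need a wider entry or a different sharer encoding.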
In step S1, the network interface card uses an intelligent network interface chip that supports RDMA.
In step S1, the required number of nodes is selected using computing power, storage space, Cache size, and network bandwidth as the selection criteria.
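Selecting nodes on these four criteria can be sketched as a filter followed by a ranking. Everything concrete here is an assumption: the `NodeOffer` fields and their units, the per-criterion minimum thresholds, and the tie-breaking rule (prefer higher bandwidth) are hypothetical, since the text names the criteria but not how they are weighed.

```python
from dataclasses import dataclass

# The four selection criteria named in the text; field names and units are assumed.
CRITERIA = ("compute_gflops", "storage_gb", "cache_mb", "bandwidth_gbps")


@dataclass(frozen=True)
class NodeOffer:
    """What one responding node reports about itself."""
    node_id: int
    compute_gflops: float
    storage_gb: float
    cache_mb: float
    bandwidth_gbps: float


def meets(offer, minimum):
    """True when the offer reaches the minimum on every criterion."""
    return all(getattr(offer, c) >= minimum[c] for c in CRITERIA)


def select_nodes(offers, minimum, count):
    """Return `count` qualifying offers, highest bandwidth first (assumed rule)."""
    eligible = sorted(
        (o for o in offers if meets(o, minimum)),
        key=lambda o: o.bandwidth_gbps,
        reverse=True,
    )
    if len(eligible) < count:
        raise LookupError("not enough nodes satisfy the requirements")
    return eligible[:count]
```

A requester would call `select_nodes` with the responses gathered from the broadcast, a dictionary of minimums keyed by the names in `CRITERIA`, and the number of nodes it needs.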
In step S1, both the sending end and the receiving end of the request to build the RDMA technology-based distributed high-performance computing system are fitted with a trusted computing card, or have an encryption/decryption chip on the motherboard.
In step S2, 1 to n nodes (virtual machines) are defined; after a selected node accepts the contract, the computing power, space, Cache, and network resources that the algorithm requires are virtualized for use by the system.
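The accept-then-lock behavior of step S2 can be sketched as an all-or-nothing reservation on each node. This is only an illustration: the `Node` class, the three resource names in `RESOURCES`, the dictionary form of a contract, and the rule that a node declines when any resource falls short are assumptions, and the electronic contract's payment terms are not modeled.

```python
from dataclasses import dataclass, field

# Resource kinds a contract may reserve; names and units are assumed.
RESOURCES = ("memory_mb", "storage_gb", "cores")


@dataclass
class Node:
    """A selected node with free and locked resource counters."""
    node_id: int
    free: dict
    locked: dict = field(default_factory=lambda: dict.fromkeys(RESOURCES, 0))


def accept_contract(node, contract):
    """Lock the contracted resources on the node, or change nothing and decline."""
    # Check every resource first so a partial lock can never happen.
    if any(node.free[r] < contract[r] for r in RESOURCES):
        return False
    for r in RESOURCES:
        node.free[r] -= contract[r]
        node.locked[r] += contract[r]
    return True
```

Checking all resources before moving any of them keeps a declined contract from leaving the node in a half-locked state.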
In step S2, the consideration specified in the electronic contract is intended to be a virtual digital currency or another medium of exchange accepted by both parties to the transaction.
In step S3, memory sharing is implemented through the network card and the switch, and the memory is uniformly addressed.
With the above technical solution, after the selected nodes accept the contracts, RDMA links are established and a distributed computing system is built. The four directory states, no copy (NC), shared read (SH), dirty (D), and shared dirty (SD), provide directory-state management for shared-memory consistency across multiple Caches. As a result, idle computing resources (compute, storage, and network bandwidth) across society are effectively utilized and integrated; the performance obtained from computing resources is maximized; maximum elastic headroom is provided for hotspot compute/storage/network resources, preventing the severe congestion or data explosion that hotspots cause; and, together with the trusted computing chip, users are provided with highly reliable and highly trusted compute/storage/network resources.
The foregoing shows and describes the general principles, main features, and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments above; the embodiments and the description only illustrate the principle of the invention, and various changes and modifications may be made without departing from its spirit and scope, all of which fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (9)

1. An RDMA technology-based distributed high-performance computing method, characterized by comprising the following steps:
step S1: through a network interface card that supports RDMA, sending a broadcast request to build an RDMA technology-based distributed high-performance computing system to network interface devices that likewise use RDMA-capable network interface cards, and, from the responses received, selecting the specified number of nodes that meet the computing requirements;
step S2: issuing electronic contracts to the selected nodes; after a selected node accepts the contract, it locks the corresponding memory space, storage space, and computing resources as the contract requires, for use by the system;
step S3: deciding, according to the security requirements, whether to build a shared-memory resource pool; memory sharing is the most efficient communication mode and also a comparatively efficient mode of parallel computing;
step S4: maintaining shared-memory consistency by directory control; each node (virtual machine) is assigned a number, such as 0, 1, ..., n, where n is an integer, and the two high-order bits are defined as the directory state; the directory state is one of four states: no copy (NC), shared read (SH), dirty (D), and shared dirty (SD); after the directory is modified, the state is written back locally.
2. The RDMA technology-based distributed high-performance computing method of claim 1, wherein in step S4 the directory state is managed as follows: after the host starts, a one-time memory zeroing operation may be performed to prevent invalid data from being read during execution; after zeroing, the directory state of every storage unit is no copy (NC); in any state, a memory write sets the next directory state to dirty (D, meaning the content in the Cache differs from the local value in main memory); in states NC and SH, a memory read sets the next directory state to shared read (SH); in states D and SD, a memory read sets the next directory state to shared dirty (SD, meaning the content of every Cache holding a copy differs from the local value in main memory).
3. The RDMA technology-based distributed high-performance computing method of claim 1, wherein in step S4, in addition to the directory state, the directory content includes copy-presence flag bits for the shared state and/or a dirty target ID.
4. The RDMA technology-based distributed high-performance computing method of claim 1, wherein in step S1 the network interface card uses an intelligent network interface chip that supports RDMA.
5. The RDMA technology-based distributed high-performance computing method of claim 1, wherein in step S1 the required number of nodes is selected using computing power, storage space, Cache size, and network bandwidth as the selection criteria.
6. The RDMA technology-based distributed high-performance computing method of claim 1, wherein in step S1 both the sending end and the receiving end of the request to build the RDMA technology-based distributed high-performance computing system are fitted with a trusted computing card, or have an encryption/decryption chip on the motherboard.
7. The RDMA technology-based distributed high-performance computing method of claim 1, wherein in step S2, 1 to n nodes (virtual machines) are defined, and after a selected node accepts the contract, the computing power, space, Cache, and network resources that the algorithm requires are virtualized for use by the system.
8. The RDMA technology-based distributed high-performance computing method of claim 1, wherein in step S2 the consideration specified in the electronic contract is intended to be a virtual digital currency or another medium of exchange accepted by both parties to the transaction.
9. The RDMA technology-based distributed high-performance computing method of claim 1, wherein in step S3 memory sharing and uniform memory addressing are implemented through the network card and the switch.
CN201811637858.3A 2018-12-29 2018-12-29 RDMA (Remote Direct Memory Access) technology-based distributed high-performance computing method Pending CN111382099A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811637858.3A CN111382099A (en) 2018-12-29 2018-12-29 RDMA (Remote Direct Memory Access) technology-based distributed high-performance computing method

Publications (1)

Publication Number Publication Date
CN111382099A (en) 2020-07-07

Family

ID=71218082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811637858.3A Pending CN111382099A (en) 2018-12-29 2018-12-29 RDMA (Remote Direct Memory Access) technology-based distributed high-performance computing method

Country Status (1)

Country Link
CN (1) CN111382099A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022183518A1 (en) * 2021-03-02 2022-09-09 山东大学 Cloud-computing-oriented high-performance blockchain architecture method

Similar Documents

Publication Publication Date Title
CN112422615B (en) Communication method and device
EP3028162B1 (en) Direct access to persistent memory of shared storage
US9213609B2 (en) Persistent memory device for backup process checkpoint states
KR102520039B1 (en) System and method for supporting energy and time efficient content distribution and delivery
US20150205819A1 (en) Techniques for optimizing data flows in hybrid cloud storage systems
CN109542814A (en) The method and system that data are transmitted between storage equipment is connected to by the P2P of PCI-EXPRESS
CN103858111B (en) A kind of realization is polymerized the shared method, apparatus and system of virtual middle internal memory
CN104462225A (en) Data reading method, device and system
WO2023125524A1 (en) Data storage method and system, storage access configuration method and related device
US9910808B2 (en) Reflective memory bridge for external computing nodes
Tatebe et al. Gfarm/bb—gfarm file system for node-local burst buffer
CN103595720A (en) Offloaded data transferring method, device and client
CN110609708A (en) Method, apparatus and computer program product for data processing
EP3959611A1 (en) Intra-device notational data movement system
CN101377788B (en) Method and system of caching management in cluster file system
CN111382099A (en) RDMA (Remote Direct Memory Access) technology-based distributed high-performance computing method
KR20170107061A (en) Method and apparatus for accessing a data visitor directory in a multicore system
US7114031B2 (en) Structure and method of cache memory data update
US10936219B2 (en) Controller-based inter-device notational data movement system
Ahson et al. Research issues in mobile computing
US11334245B1 (en) Native memory semantic remote memory access system
US11288008B1 (en) Reflective memory system
US10853293B2 (en) Switch-based inter-device notational data movement system
US10762011B2 (en) Reflective memory bridge for external computing nodes
CN117270757A (en) Management method of distributed storage system, storage server and storage controller

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200707