TWI734895B - Method of aggregating storage, method of NVMe-oF SSD capacity aggregation and aggregated Ethernet SSD group
- Publication number: TWI734895B
- Application number: TW107107134A
- Authority: TW (Taiwan)
Classifications
- G06F3/0665: Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
- G06F3/0605: Improving or facilitating administration, e.g. storage management by facilitating the interaction with a user or administrator
- G06F3/0604: Improving or facilitating administration, e.g. storage management
- G06F12/0246: Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
- G06F13/28: Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
- G06F13/4022: Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network
- G06F13/4282: Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
- G06F3/0631: Configuration or reconfiguration of storage systems by allocating resources to storage systems
- G06F3/0635: Configuration or reconfiguration of storage systems by changing the path, e.g. traffic rerouting, path reconfiguration
- G06F3/0644: Management of space entities, e.g. partitions, extents, pools
- G06F3/0659: Command handling arrangements, e.g. command buffers, queues, command scheduling
- G06F3/0661: Format or protocol conversion arrangements
- G06F3/067: Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
- G06F3/0688: Non-volatile semiconductor memory arrays
- H04L49/351: Switches specially adapted for specific applications for local area network [LAN], e.g. Ethernet switches
- H04L49/356: Switches specially adapted for specific applications for storage area networks
- G06F2213/0026: PCI express
Description
This application claims priority to U.S. Provisional Patent Application No. 62/480,113, filed with the United States Patent and Trademark Office on March 31, 2017, and U.S. Provisional Patent Application No. 62/483,913, filed with the United States Patent and Trademark Office on April 10, 2017. The entire disclosures of both provisional applications are incorporated herein by reference.
Some embodiments of the present disclosure relate generally to a system and method for aggregating multiple memory drives (e.g., Ethernet solid state drives) so that a host sees them as a single large logical capacity.
Solid state drives (SSDs) have developed rapidly and are now the preferred storage element for modern information technology infrastructure, replacing traditional hard disk drives (HDDs). SSDs offer very low latency, high data read/write throughput, and reliable data storage.
Non-Volatile Memory express over Fabrics (NVMe-oF) is an emerging technology that allows hundreds or thousands of NVMe-oF devices (e.g., NVMe-oF SSDs) to be connected over a network fabric (e.g., InfiniBand (IB), Fibre Channel (FC), or Ethernet). The NVMe-oF protocol can be used to implement remote direct-attached storage (rDAS), which allows a large number of SSDs to be connected to a remote host. The NVMe-oF protocol uses the Remote Direct Memory Access (RDMA) protocol to deliver NVMe commands, data, and responses reliably. Transport protocols that provide RDMA services include iWARP, RoCE v1, and RoCE v2.
The NVMe-oF interface allows a large number of SSDs to be connected to a remote host. Conventionally, a driver instance runs on the remote host for each NVMe-oF SSD. For some applications, the storage capacity provided by a single SSD may be insufficient.
Some embodiments of the present disclosure provide a method of aggregating multiple SSDs so that a host sees them as a single large-capacity logical volume, together with a network structure for carrying out the method.
According to some embodiments, a method of storage aggregation for NVMe-oF devices includes: identifying an aggregation group as an aggregated Ethernet SSD including a plurality of NVMe-oF SSDs; selecting one of the NVMe-oF SSDs in the aggregation group as a primary NVMe-oF SSD; selecting the other NVMe-oF SSDs in the aggregation group as secondary NVMe-oF SSDs; and initializing, with a processor, a mapping allocation table in the primary NVMe-oF SSD for managing the NVMe-oF SSDs.
According to some example embodiments, the processor initializes the mapping allocation table under the direction of a storage administrator connected to the aggregation group.
According to some example embodiments, for each NVMe-oF SSD of the aggregation group, the mapping allocation table includes the capacity of the NVMe-oF SSD, the address of the NVMe-oF SSD, and the remaining capacity of the NVMe-oF SSD.
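The per-drive contents of the mapping table described above can be pictured as a small record per drive. The following Python sketch is illustrative only: the names `DriveEntry` and `init_mapping_table`, the address strings, and the use of logical blocks as the capacity unit are assumptions and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class DriveEntry:
    """One row of the (hypothetical) mapping table."""
    address: str     # network address of the drive; the format is an assumption
    capacity: int    # total capacity, counted here in logical blocks
    remaining: int   # capacity not yet allocated


def init_mapping_table(drives, primary_address):
    """Build the table for an aggregation group.

    drives maps each drive address to its capacity; primary_address names
    the drive chosen as primary. Every other drive is treated as secondary.
    """
    if primary_address not in drives:
        raise ValueError("primary must be a member of the aggregation group")
    entries = {
        address: DriveEntry(address, capacity, capacity)
        for address, capacity in drives.items()
    }
    secondaries = [a for a in drives if a != primary_address]
    return {"primary": primary_address, "secondaries": secondaries, "entries": entries}


table = init_mapping_table(
    {"10.0.0.1": 1000, "10.0.0.2": 2000, "10.0.0.3": 2000}, "10.0.0.1"
)
total = sum(entry.capacity for entry in table["entries"].values())
```

In this sketch the caller decides which drive is primary, which loosely mirrors the text's statement that initialization happens under the direction of a storage administrator.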
According to some example embodiments, the method further includes providing the address of the primary NVMe-oF SSD to a user application to enable data transfer between the aggregation group and the user application.
According to some example embodiments, the method further includes: receiving, at the primary NVMe-oF SSD, an Admin command from a host connected to the aggregation group; determining whether data corresponding to the Admin command is stored only on the primary NVMe-oF SSD or on one or more of the secondary NVMe-oF SSDs; when the data is stored on the one or more secondary NVMe-oF SSDs, splitting the Admin command into one or more Admin subcommands respectively corresponding to the one or more secondary NVMe-oF SSDs; transferring the data to the host; receiving subcommand completion entries from the one or more secondary NVMe-oF SSDs; and creating a completion entry and sending it from the primary NVMe-oF SSD to the host.
According to some example embodiments, the method further includes: receiving, at a corresponding one of the one or more secondary NVMe-oF SSDs, one of the one or more Admin subcommands; determining, according to the Admin subcommand, whether to transfer the data from the corresponding secondary NVMe-oF SSD to the primary NVMe-oF SSD; creating a completion entry; and sending the completion entry to the primary NVMe-oF SSD.
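The Admin command flow above (decide whether the primary can answer alone, fan out subcommands otherwise, then fold the results into one completion) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the dictionary layout of commands and completion entries, and the drive labels are all hypothetical.

```python
def handle_admin_command(command_id, drives_with_data, primary):
    """Decide whether the primary can answer alone or must fan out.

    drives_with_data lists the drives holding data relevant to the command.
    Returns the subcommands to send to secondary drives (possibly none).
    """
    targets = [d for d in drives_with_data if d != primary]
    return [
        {"parent": command_id, "sub_id": index, "drive": drive}
        for index, drive in enumerate(targets)
    ]


def build_host_completion(command_id, sub_completions):
    """Fold the secondaries' completion entries into one entry for the host."""
    ok = all(entry["status"] == "success" for entry in sub_completions)
    return {"command_id": command_id, "status": "success" if ok else "error"}


# Data spread over the primary and two secondaries: two subcommands result.
subs = handle_admin_command(7, ["primary", "sec1", "sec2"], "primary")
done = [{"sub_id": s["sub_id"], "status": "success"} for s in subs]
completion = build_host_completion(7, done)

# Data held only by the primary: nothing to fan out.
local_only = handle_admin_command(8, ["primary"], "primary")
```

The host only ever sees the single completion built by the primary, which is consistent with the text's point that the primary creates and sends the completion entry.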
According to some example embodiments, the method further includes: receiving, at the primary NVMe-oF SSD, a command to create a namespace or a command to delete a namespace; consulting the mapping allocation table of the primary NVMe-oF SSD; when the command is to create a namespace, allocating capacity in the primary NVMe-oF SSD and/or in one or more of the secondary NVMe-oF SSDs, or, when the command is to delete a namespace, reclaiming the corresponding one of the primary NVMe-oF SSD and/or the one or more secondary NVMe-oF SSDs; and updating the mapping allocation table.
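Namespace creation and deletion against the mapping table can be sketched as below. The text does not say how capacity is chosen across drives, so the first-fit policy here is an assumption, as are the names `create_namespace` and `delete_namespace` and the drive labels.

```python
def create_namespace(remaining, namespaces, ns_id, size):
    """Allocate `size` blocks across drives, first-fit (policy is an assumption).

    remaining maps drive -> unallocated blocks; namespaces maps
    namespace id -> list of (drive, blocks) extents. Both are updated in place.
    """
    if size > sum(remaining.values()):
        raise ValueError("not enough aggregated capacity")
    extents = []
    needed = size
    for drive in remaining:            # dict order; primary listed first here
        if needed == 0:
            break
        take = min(remaining[drive], needed)
        if take:
            extents.append((drive, take))
            remaining[drive] -= take
            needed -= take
    namespaces[ns_id] = extents
    return extents


def delete_namespace(remaining, namespaces, ns_id):
    """Return a namespace's blocks to the drives they came from."""
    for drive, blocks in namespaces.pop(ns_id):
        remaining[drive] += blocks


remaining = {"primary": 100, "sec1": 200}
namespaces = {}
extents = create_namespace(remaining, namespaces, ns_id=1, size=150)
after_create = dict(remaining)
delete_namespace(remaining, namespaces, ns_id=1)
```

Both operations finish by leaving `remaining` and `namespaces` up to date, which corresponds to the "updating the mapping allocation table" step in the text.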
According to some embodiments, the method further includes: receiving, at the primary NVMe-oF SSD, a Read/Write command; looking up the mapping allocation table of the primary NVMe-oF SSD; creating one or more Read/Write subcommands; sending the one or more Read/Write subcommands respectively to one or more of the secondary NVMe-oF SSDs; transferring data between the host and the primary NVMe-oF SSD and/or the one or more secondary NVMe-oF SSDs according to the Read/Write command; and sending a completion to the host after the data is transferred.
According to some embodiments, the method further includes: receiving, at a corresponding one of the one or more secondary NVMe-oF SSDs, one of the one or more Read/Write subcommands; extracting transport information corresponding to the Read/Write subcommand; issuing a read/write request from the corresponding secondary NVMe-oF SSD to the host; and, after the data transfer corresponding to the Read/Write subcommand is complete, sending a completion entry to the primary NVMe-oF SSD.
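The step of turning one host Read/Write command into per-drive subcommands can be sketched as a range intersection against the extents recorded in the mapping table. This is a minimal sketch under stated assumptions: the extent tuple layout, the `split_io` name, and the drive labels are hypothetical, and the sketch covers only the split, not the data transfer between the drives and the host.

```python
def split_io(extents, lba, count):
    """Split a host request into per-drive subcommands.

    extents is an ordered list of (drive, drive_start_lba, length) that
    together cover the namespace's logical LBA space contiguously from 0.
    """
    subs = []
    logical_start = 0
    end = lba + count
    for drive, drive_start, length in extents:
        logical_end = logical_start + length
        lo = max(lba, logical_start)   # overlap of the request with this extent
        hi = min(end, logical_end)
        if lo < hi:
            subs.append({
                "drive": drive,
                "lba": drive_start + (lo - logical_start),
                "count": hi - lo,
            })
        logical_start = logical_end
    if sum(s["count"] for s in subs) != count:
        raise ValueError("request extends past the end of the namespace")
    return subs


# Logical blocks 0-99 live on the primary, 100-149 on a secondary.
extents = [("primary", 0, 100), ("sec1", 500, 50)]
subs = split_io(extents, lba=90, count=20)
```

A request that falls entirely inside the primary's extent produces a single subcommand for the primary, so no secondary needs to be contacted in that case.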
According to some example embodiments, a method of capacity aggregation for NVMe-oF SSDs in an NVMe-oF Ethernet SSD group includes: identifying a plurality of NVMe-oF SSDs of an aggregation group; assigning one of the NVMe-oF SSDs as a primary NVMe-oF SSD; and assigning the remaining NVMe-oF SSDs as secondary NVMe-oF SSDs, wherein the primary NVMe-oF SSD is the only NVMe-oF SSD visible to a host driver of a host.
According to some embodiments, the method further includes maintaining a mapping allocation table in the primary NVMe-oF SSD according to commands the primary NVMe-oF SSD receives from the host, wherein the mapping allocation table indicates the logical block address (LBA) space divided among the primary NVMe-oF SSD and one or more of the secondary NVMe-oF SSDs of the aggregation group.
According to some embodiments, the method further includes initializing the mapping allocation table with a processor, and configuring the aggregation group according to which one of the NVMe-oF SSDs is assigned as the primary NVMe-oF SSD.
According to some embodiments, the method further includes aggregating the capacities of the secondary NVMe-oF SSDs and the primary NVMe-oF SSD according to commands the primary NVMe-oF SSD receives from the host, so that the plurality of NVMe-oF SSDs of the aggregation group appear to the host as a single aggregated logical capacity.
According to some embodiments, the method further includes: using the primary NVMe-oF SSD to allocate capacity to one or more of the secondary NVMe-oF SSDs; and using the primary NVMe-oF SSD to record the allocated capacity and the associated mapped LBA ranges in the mapping allocation table.
According to some embodiments, the method further includes using the primary NVMe-oF SSD to overprovision the aggregated capacity of the secondary NVMe-oF SSDs and the primary NVMe-oF SSD.
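The text does not spell out what overprovisioning the aggregated capacity involves. One possible reading, assumed here for illustration, is thin provisioning: the primary advertises more capacity than physically exists and consumes physical blocks only when they are first written. The class and method names below are hypothetical.

```python
class OverprovisionedPool:
    """Advertises more capacity than is physically present (an assumed reading)."""

    def __init__(self, physical_blocks, advertised_blocks):
        self.physical_free = physical_blocks
        self.advertised_free = advertised_blocks

    def create_namespace(self, size):
        """Reserve advertised capacity only; no physical blocks are used yet."""
        if size > self.advertised_free:
            raise ValueError("exceeds advertised capacity")
        self.advertised_free -= size

    def write_new_blocks(self, count):
        """Consume physical blocks the first time they are written."""
        if count > self.physical_free:
            raise RuntimeError("physical capacity exhausted")
        self.physical_free -= count


pool = OverprovisionedPool(physical_blocks=1000, advertised_blocks=1500)
pool.create_namespace(1200)   # accepted although only 1000 blocks exist
pool.write_new_blocks(400)
```

Under this reading the trade-off is that a write can fail even though the namespace was created successfully, so the physical free count has to be monitored.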
根據部分實施例,上述方法更包括:在上述主要跨結構高速非揮發性記憶體固態硬碟上接收來自上述主機的命令;將上述命令分割成多個子命令且每一子命令對應至對應的次要跨結構高速非揮發性記憶體固態硬碟的各別一者;以及將子命令從上述主要跨結構高速非揮發性記憶體固態硬碟傳送到對應的次要跨結構高速非揮發性記憶體固態硬碟。 According to some embodiments, the above method further includes: receiving a command from the host on the main cross-structure high-speed non-volatile memory solid state drive; dividing the command into a plurality of sub-commands, and each sub-command corresponds to a corresponding sub-command. Each of the cross-structure high-speed non-volatile memory solid state drives; and the sub-commands from the above-mentioned main cross-structure high-speed non-volatile memory solid state drive to the corresponding secondary cross-structure high-speed non-volatile memory Solid state drive.
根據部分實施例,上述方法更包括基於個別的子命令,從對應的次要跨結構高速非揮發性記憶體固態硬碟直接傳送資料到主機。 According to some embodiments, the above method further includes directly transmitting data from the corresponding secondary cross-structure high-speed non-volatile memory solid state drive to the host based on individual sub-commands.
According to some embodiments, the method further includes: receiving, at the corresponding secondary NVMe-oF SSDs, the respective sub-commands from the primary NVMe-oF SSD; performing the task corresponding to each respective sub-command; and, upon completing the task, sending a respective sub-command completion entry from the corresponding secondary NVMe-oF SSD to the primary NVMe-oF SSD.
According to some embodiments, the method further includes: using the primary NVMe-oF SSD to maintain a sub-command context table; receiving, at the primary NVMe-oF SSD, sub-command completion entries from the secondary NVMe-oF SSDs; and tracking execution of the sub-commands with the primary NVMe-oF SSD according to the received sub-command completion entries.
According to some example embodiments, an aggregated Ethernet SSD group is provided, including: an Ethernet SSD chassis; an Ethernet switch on the Ethernet SSD chassis for communicating with a host driver; a processor coupled to the Ethernet switch; a Peripheral Component Interconnect Express (PCIe) switch coupled to a baseboard management controller; and a plurality of NVMe-oF SSDs, including a primary NVMe-oF SSD and a plurality of secondary NVMe-oF SSDs connected to the primary NVMe-oF SSD through a private communication channel that includes the Ethernet switch and the PCIe switch. Only the primary NVMe-oF SSD is visible to the host, and the baseboard management controller is configured to initially determine which of the NVMe-oF SSDs is the primary NVMe-oF SSD.
Based on the above, because only a single primary Ethernet SSD performs all NVMe-oF protocol processing, while tracking the completion of all related sub-commands by the secondary Ethernet SSDs and keeping the secondary Ethernet SSDs invisible to the host, the aggregated group of Ethernet SSDs is presented to the host as a single, large logical capacity.
100, 500: system architecture
110: Ethernet SSD (eSSD)
110s: secondary Ethernet SSD (S-eSSD)
110p: primary Ethernet SSD (P-eSSD)
120: Ethernet SSD chassis
130: private communication channel
132: sub-command
134: completion entry
135: control plane
140: PCIe bus
150: baseboard management controller (BMC)
160: Ethernet switch
170: host NVMe-oF driver
180: command
182: completion entry
185: data
190: host
230: Ethernet rack
240: top-of-rack (TOR) switch
300: Map Allocation Table (MAT)
311: capacity (TB)
312: transport address (MAC/IP)
313: Ethernet SSD index
314: mapped address, namespace LBA range
400, 700, 800, 900, 1000, 1100, 1200: flowcharts
530: Ethernet communication channel
600: table
610: command tag
620: # of sub-commands
630: accumulated error status
640: command ID
S410, S420, S430, S440: steps
S710, S720, S730, S740, S750, S760, S770, S780, S790: steps
S810, S820, S830, S840, S811, S821, S831, S841: steps
S910, S920, S930, S940, S950, S960, S970, S980, S990: steps
S1010, S1020, S1030, S1040, S1050: steps
S1110, S1120, S1130, S1140, S1150: steps
S1210, S1220, S1230, S1240, S1250, S1260, S1270: steps
To make the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram of a system architecture 100 for NVMe-oF Ethernet SSD (eSSD) storage that includes multiple aggregated eSSDs in a single eSSD chassis, according to an embodiment of the present disclosure.
FIG. 2 is a block diagram of multiple eSSD chassis of FIG. 1 connected together in an eSSD rack, according to an embodiment of the present disclosure.
FIG. 3 is an example of a Map Allocation Table maintained by a primary eSSD, according to an embodiment of the present disclosure.
FIG. 4 is a flowchart depicting initialization of the Map Allocation Table, according to an embodiment of the present disclosure.
FIG. 5 is a block diagram of a system architecture for NVMe-oF eSSD storage, including data flows to and from multiple aggregated eSSDs in multiple eSSD chassis respectively located in multiple racks, according to an embodiment of the present disclosure.
FIG. 6 is a table depicting an example command context.
FIG. 7 is a flowchart depicting handling of Admin commands by the primary eSSD, according to an embodiment of the present disclosure.
FIG. 8 is a flowchart depicting execution of Namespace Create and Delete commands, according to an embodiment of the present disclosure.
FIG. 9 is a flowchart depicting execution of Read/Write commands under the control of the primary eSSD, according to an embodiment of the present disclosure.
FIG. 10 is a flowchart depicting execution of Admin sub-commands by a secondary eSSD, according to an embodiment of the present disclosure.
FIG. 11 is a flowchart depicting execution of Read/Write sub-commands by a secondary eSSD, according to an embodiment of the present disclosure.
FIG. 12 is a flowchart depicting synchronization of data transfer and sub-command completion in the secondary eSSDs, according to an embodiment of the present disclosure.
Features of the inventive concept and methods of accomplishing them may be understood more readily by reference to the following detailed description of embodiments and the accompanying drawings. Embodiments are described below with reference to the accompanying drawings, in which like reference numerals refer to like elements throughout. The present invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided as examples so that this disclosure will be thorough and complete, and will fully convey the aspects and features of the present disclosure to those of ordinary skill in the art. Accordingly, processes, elements, and techniques that are not necessary for those of ordinary skill in the art to fully understand the aspects and features of the present invention may not be described. Unless otherwise noted, like reference numerals denote like elements throughout the drawings and the written description, so their descriptions are not repeated. In the drawings, the relative sizes of elements, layers, and regions may be exaggerated for clarity.
In the following description, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of the various embodiments. It is apparent, however, that the embodiments may be practiced without these specific details or with one or more equivalent arrangements. In other instances, well-known structures and devices are shown in block diagram form to avoid unnecessarily obscuring the embodiments.
It will be understood that, although the terms "first," "second," "third," etc. may be used herein to describe various elements, components, regions, layers, and/or sections, these elements, components, regions, layers, and/or sections should not be limited by these terms. These terms are used only to distinguish one element, component, region, layer, or section from another. Thus, a first element, component, region, layer, or section discussed below could be termed a second element, component, region, layer, or section without departing from the spirit and scope of the present invention.
Spatially relative terms, such as "beneath," "below," "lower," "under," "above," and "upper," may be used herein for ease of explanation to describe the relationship of one element or feature to another element or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or in operation, in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "below," "beneath," or "under" other elements or features would then be oriented "above" the other elements or features. Thus, the terms "below" and "under" can encompass both an orientation of above and an orientation of below. The device may be otherwise oriented (e.g., rotated 90 degrees or at other orientations), and the spatially relative descriptors used herein should be interpreted accordingly.
It will be understood that when an element, layer, region, or component is referred to as being "on," "connected to," or "coupled to" another element, layer, region, or component, it can be directly on, directly connected to, or directly coupled to the other element, layer, region, or component, or one or more intervening elements, layers, regions, or components may be present. In addition, it will also be understood that when an element or layer is referred to as being "between" two elements or layers, it can be the only element or layer between the two elements or layers, or one or more intervening elements or layers may also be present.
For the purposes of this disclosure, "at least one of X, Y, and Z" and "at least one selected from the group consisting of X, Y, and Z" may be construed as X only, Y only, Z only, or any combination of two or more of X, Y, and Z, such as XYZ, XYY, YZ, and ZZ. Like numbers refer to like elements throughout. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
In the following examples, the x-axis, the y-axis, and the z-axis are not limited to the three axes of a rectangular coordinate system and may be interpreted in a broader sense. For example, the x-axis, the y-axis, and the z-axis may be perpendicular to one another, or may represent different directions that are not perpendicular to one another.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the present invention. As used herein, the singular forms "a" and "an" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes," and "including," when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, and/or components. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. Expressions such as "at least one of," when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list.
As used herein, the terms "substantially," "about," and similar terms are used as terms of approximation and not as terms of degree, and are intended to account for the inherent deviations in measured or calculated values that would be recognized by those of ordinary skill in the art. Further, the use of "may" when describing embodiments of the present invention refers to "one or more embodiments of the present invention." As used herein, the terms "use," "using," and "used" may be considered synonymous with the terms "utilize," "utilizing," and "utilized," respectively. Also, the term "exemplary" is intended to refer to an example or illustration.
When a certain embodiment may be implemented differently, a specific process order may be performed differently from the described order. For example, two consecutively described processes may be performed substantially at the same time or may be performed in an order opposite to the described order.
Also, any numerical range disclosed and/or recited herein is intended to include all sub-ranges of the same numerical precision subsumed within the recited range. For example, a range of "1.0 to 10.0" is intended to include all sub-ranges between (and including) the recited minimum value of 1.0 and the recited maximum value of 10.0, that is, having a minimum value equal to or greater than 1.0 and a maximum value equal to or less than 10.0, such as 2.4 to 7.6. Any maximum numerical limitation recited herein is intended to include all lower numerical limitations subsumed therein, and any minimum numerical limitation recited in this specification is intended to include all higher numerical limitations subsumed therein. Accordingly, the applicant reserves the right to amend this specification, including the claims, to expressly recite any sub-range subsumed within the ranges expressly recited herein.
Various embodiments are described herein with reference to sectional illustrations that are schematic illustrations of embodiments and/or intermediate structures. As such, variations from the shapes of the illustrations as a result of, for example, manufacturing techniques and/or tolerances are to be expected. Thus, the embodiments disclosed herein should not be construed as limited to the particular shapes illustrated herein but are to include deviations in shapes that result from, for instance, manufacturing. For example, an implanted region illustrated as a rectangle will typically have rounded or curved features and/or a gradient of implant concentration at its edges rather than a binary change from the implanted region to the non-implanted region. Likewise, a buried region formed by implantation may result in some implantation in the region between the buried region and the surface through which the implantation takes place. Thus, the regions illustrated in the drawings are schematic in nature, and their shapes are not intended to illustrate the actual shape of a region of a device and are not intended to be limiting.
The electronic devices and/or any other relevant devices or components according to embodiments of the present invention described herein may be implemented utilizing any suitable hardware, firmware (e.g., an application-specific integrated circuit), software, or a combination of software, firmware, and hardware. For example, the various components of these devices may be formed on one integrated circuit (IC) chip or on separate IC chips. Further, the various components of these devices may be implemented on a flexible printed circuit film, a tape carrier package (TCP), or a printed circuit board (PCB), or formed on one substrate. Further, the various components of these devices may be a process or thread, running on one or more processors in one or more computing devices, executing computer program instructions and interacting with other system components to perform the various functionalities described herein. The computer program instructions are stored in a memory that may be implemented in a computing device using a standard memory device, such as a random access memory (RAM). The computer program instructions may also be stored in other non-transitory computer-readable media, such as an optical disc drive, a flash drive, or the like. Also, a person of ordinary skill in the art should recognize that the functionality of various computing devices may be combined or integrated into a single computing device, or the functionality of a particular computing device may be distributed across one or more other computing devices, without departing from the spirit and scope of the exemplary embodiments of the present invention.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and/or the present specification, and should not be interpreted in an idealized or overly formal sense, unless expressly so defined herein.
FIG. 1 is a block diagram of a system architecture for NVMe-oF Ethernet SSD (eSSD) storage that includes multiple aggregated eSSDs 110 in a single eSSD chassis 120, according to an embodiment of the present disclosure. FIG. 2 is a block diagram of multiple eSSD chassis 120 of FIG. 1 connected together in an eSSD rack, according to an embodiment of the present disclosure.
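As described above, the NVMe-oF interface allows a large number of SSDs 110 to be connected to a remote host 190. For each NVMe-oF SSD 110, a driver instance runs on the remote host 190. For some applications, however, the storage capacity offered by a single SSD 110 is insufficient. Such applications are better served by a single logical volume with a capacity of hundreds of terabytes (TB). Accordingly, such applications may benefit from the embodiments of the present disclosure, which provide a large number of individual SSDs 110 that are aggregated together into an "Aggregation Group" and presented to the application as a single logical volume.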
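For example, 24 SSDs of 16 TB each can be presented as a single logical 384 TB drive. Some examples of applications that need a large number of aggregated SSDs 110 include big-data mining and analytics, petrochemicals, gas and energy exploration, experimental particle physics, and pharmaceutical drug development. These examples require High Performance Computing (HPC), which needs both large storage capacity and high performance.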
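The capacity figure in this example is simple addition over the members of the group. As a minimal sketch (the helper name `aggregate_capacity_tb` is hypothetical and does not come from the disclosure), the arithmetic can be checked as follows:

```python
def aggregate_capacity_tb(drive_capacities_tb):
    """Return the logical capacity, in TB, of a group of drives.

    Hypothetical helper for illustration only: it simply sums the
    per-drive capacities and ignores any overprovisioning.
    """
    return sum(drive_capacities_tb)


# 24 drives of 16 TB each, as in the example in the text.
group = [16] * 24
total_tb = aggregate_capacity_tb(group)
print(total_tb)  # 384
```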
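Although a system software layer could aggregate the underlying SSDs 110 and provide a single, expandable logical volume, such system software is generally highly complex and sophisticated. Such software may require a large number of NVMe-oF drivers running on, for example, the host 190, thereby consuming system resources such as memory, central processing unit (CPU) cycles, and power. Target-side solutions could use an x86 server or a Redundant Array of Independent Disks on Chip (ROC) system to provide the large capacity as a single logical volume. However, such solutions are generally complex and expensive, and they penalize performance and energy use. For example, receiving and transmitting data with a CPU can consume several times the energy consumed by a direct memory access (DMA) engine, an application-specific integrated circuit (ASIC), or the like, as used in embodiments of the present invention.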
Accordingly, embodiments of the present invention provide a method and an architecture that can aggregate multiple Ethernet NVMe-oF SSDs (eSSDs 110) in an efficient and cost-effective manner.
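Referring to FIG. 1, each eSSD 110 is assigned one of two roles (e.g., by a storage administrator), so that each eSSD 110 acts either as a primary eSSD (P-eSSD) 110p or as a secondary eSSD (S-eSSD) 110s. A single P-eSSD 110p and a set of S-eSSDs 110s in a single chassis 120 (or in multiple chassis 120 in a given rack, or in multiple chassis 120 in multiple racks 230 distributed over a wide area) collectively provide the flash capacity required by the remote host 190 as a single logical drive. The eSSD chassis 120 includes the eSSDs 110, a processor (e.g., a baseboard management controller (BMC) device 150), and an Ethernet switch 160 for connecting to external devices. Although the eSSD chassis 120 is used in the embodiments below to refer to a group of NVMe-oF devices, other embodiments of the present invention apply equally to any other plurality of NVMe-oF devices, regardless of their physical enclosure (e.g., a chassis, a rack, or a container-based enclosure). Furthermore, although eSSDs 110 are used to describe the NVMe-oF devices of the embodiments below, other NVMe-oF devices are equally applicable to other embodiments of the present invention.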
Accordingly, through the corresponding Ethernet switches 160, eSSDs 110 can be aggregated across multiple chassis 120 in a rack 230, as well as across multiple racks 230 that each contain multiple chassis 120.
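The P-eSSD 110p is the only eSSD 110 visible to the remote host NVMe-oF driver 170, and it therefore terminates the NVMe-oF protocol. The P-eSSD 110p represents itself and all the remaining S-eSSDs 110s in the same Aggregation Group, and presents them to the remote host 190 as a single, large aggregated logical capacity. The P-eSSD 110p receives all input/output (I/O) commands 180 from the remote host NVMe-oF driver 170 and provides command responses (e.g., completion entries) 182 to the remote host 190.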
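The P-eSSD 110p also maintains a Map Allocation Table (MAT), which indicates how the logical block address (LBA) space is divided among the P-eSSD 110p and some or all of the S-eSSDs 110s in the same Aggregation Group of eSSDs 110. When the P-eSSD 110p receives an I/O command 180, it first consults the MAT (e.g., MAT 300 of FIG. 3, described below) to determine which eSSD or eSSDs 110 (the P-eSSD 110p, one or more S-eSSDs 110s, or a combination of both) can satisfy the I/O command 180. Based on the MAT, the P-eSSD 110p then sends appropriately modified NVMe-oF I/O sub-commands 132 to the appropriate set of S-eSSDs 110s.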
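The MAT lookup described above can be sketched as an interval-overlap query. This is only an illustration under stated assumptions: the `MatEntry` fields, the contiguous inclusive LBA range per eSSD, the placeholder addresses, and the function name `essds_for_range` are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatEntry:
    """One hypothetical MAT row (field names are assumptions)."""
    essd_index: int   # 0 denotes the P-eSSD in this sketch
    address: str      # transport address (MAC/IP); placeholder values
    lba_start: int    # first namespace LBA mapped to this eSSD (inclusive)
    lba_end: int      # last namespace LBA mapped to this eSSD (inclusive)


def essds_for_range(mat, slba, nlb):
    """Return the indices of eSSDs whose mapped range overlaps the
    request [slba, slba + nlb - 1]."""
    last = slba + nlb - 1
    return [e.essd_index for e in mat
            if e.lba_start <= last and slba <= e.lba_end]


# Assumed layout: three eSSDs, 1000 blocks each, mapped back to back.
MAT = [
    MatEntry(0, "10.0.0.10", 0, 999),
    MatEntry(1, "10.0.0.11", 1000, 1999),
    MatEntry(2, "10.0.0.12", 2000, 2999),
]

print(essds_for_range(MAT, 10, 5))     # served by the P-eSSD alone
print(essds_for_range(MAT, 900, 200))  # spans the P-eSSD and one S-eSSD
```

A request that resolves to index 0 only needs no sub-commands; any other result tells the primary which secondaries must receive one.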
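To send the sub-commands 132, the P-eSSD 110p also establishes, after power-on, a private Ethernet-RDMA connection (or a suitable proprietary communication channel) 130 with each S-eSSD 110s over the PCIe bus 140 and the control plane 135. The P-eSSD 110p uses this private queue-pair (QP) communication channel 130 to send I/O commands (e.g., sub-commands 132) to the S-eSSDs 110s and to receive completion entries 134 from the S-eSSDs 110s. The private communication channel 130 may be Ethernet, in which case data is carried through the Ethernet switch 160. However, the private communication channel 130 may also be PCIe-based, in which case data is carried through a PCIe switch. That is, all of the eSSDs 110 may use two or more modes of communication to communicate with each other. For example, an Ethernet channel may generally be used for data transfer and a PCIe channel for management, while either channel may serve as the private communication channel 130.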
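The S-eSSDs 110s are normal NVMe-oF SSDs 110 that use the NVMe-oF protocol only to transfer data to and from the remote host 190. These data transfers are performed directly with the remote host 190 using RDMA READ and RDMA WRITE services. The S-eSSDs 110s receive commands (e.g., sub-commands 132) from the P-eSSD 110p, but do not receive commands directly from the remote host NVMe-oF driver 170. The S-eSSDs 110s send sub-command completion entries 134 to the P-eSSD 110p, rather than to the remote host 190, to indicate that the sub-commands 132 are complete.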
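The P-eSSD 110p handles all NVMe-oF protocol termination, handles all host command and completion queuing (e.g., Submission Queues/Completion Queues (SQ/CQ)), and is visible to the remote host NVMe-oF driver 170 running on the remote host initiator. When the remote host driver 170 issues an NVMe-oF Admin command 180 or an I/O command 180, the command 180 is issued to the P-eSSD 110p, and all Admin commands 180 are executed by the P-eSSD 110p. The I/O commands 180, however, may be spread across multiple eSSDs 110.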
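The P-eSSD 110p may also perform its own share of the data transfers according to the I/O command 180. The P-eSSD 110p then waits until all of the sub-command completion entries 134 have arrived (e.g., from the set of S-eSSDs 110s) over the private communication channel 130 before sending the command completion entry 182 corresponding to the original command 180 to the remote host 190.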
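The P-eSSD 110p also maintains a "command context" for each command in execution in a command context table (e.g., see FIG. 6). The P-eSSD 110p uses this command context to track the execution status of the sub-commands 132, the data transfer status, and any error status. When all of the sub-commands 132 are complete, the command response/completion entry 182 is provided to the remote host 190, and the command context table is deallocated.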
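Referring to FIG. 2, multiple eSSD chassis 120 may be connected together in an eSSD rack 230, where a Top-of-Rack (TOR) switch 240 provides connectivity among the multiple chassis 120 in a common rack 230. Similarly, multiple racks 230 located at different geographical locations can be connected to each other through their respective TOR switches 240, which are either connected to each other directly or connected through external switches. The Ethernet racks 230 may be located within a single datacenter building or may be distributed across different geographical regions.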
As described above, embodiments of the present disclosure provide a mechanism for aggregating multiple Ethernet NVMe-oF SSDs (eSSDs) 110 so that they are presented as a single, large-capacity NVMe-oF SSD. The eSSDs 110 may be located in a single chassis 120, in multiple chassis 120 in a single rack 230, or even spread across multiple Ethernet racks 230 that each have multiple chassis 120. One of the eSSDs 110 is assigned the role of primary eSSD (P-eSSD) 110p. The other eSSDs 110 are assigned the role of secondary eSSD (S-eSSD) 110s. Although the S-eSSDs 110s transfer data directly to and from the remote host initiator, they receive sub-commands 132 from the P-eSSD 110p, complete the sub-commands 132, and return completion entries 134 for the sub-commands 132 to the P-eSSD 110p. Accordingly, the present embodiments allow capacity to be aggregated so that the group effectively acts as a single eSSD without sacrificing any storage bandwidth.
FIG. 3 is an example of a "Map Allocation Table" 300 maintained by a P-eSSD 110p, according to an embodiment of the present disclosure. FIG. 4 is a flowchart 400 depicting initialization of the Map Allocation Table 300, according to an embodiment of the present disclosure.
Referring to FIGS. 3 and 4, as described above, the present embodiment uses two types of eSSDs 110 (e.g., the P-eSSD 110p and the S-eSSDs 110s). Both the P-eSSD 110p and the S-eSSDs 110s use the NVMe-oF protocol to provide storage service to the host 190. The P-eSSD 110p maintains a table (e.g., the Map Allocation Table (MAT) 300) containing details of the S-eSSDs 110s in the Aggregation Group of eSSDs 110, which is perceived by the host 190 as a single logical volume.
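The MAT 300 may be initialized by the BMC 150 in the same chassis 120 as the P-eSSD 110p. The BMC 150 may manage the Ethernet chassis 120 and its components, such as the Ethernet switch 160 and the eSSDs 110. The BMC 150 has a PCIe interface and a System Management Bus (SMBus) interface for system management purposes. The BMC 150 also determines which eSSDs 110 will be aggregated (S410) (e.g., under the direction of the storage administrator). Once the eSSDs 110 to be aggregated are determined, the BMC 150 may configure the Ethernet switch 160.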
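The BMC 150 initializes the three columns on the left side of the MAT 300 under the guidance of a storage administrator. The BMC 150 and the storage administrator have visibility and knowledge of all of the eSSDs 110 present in the Aggregation Group/storage system. This knowledge includes the capacity 311 and address location 312 of each eSSD 110. The storage administrator may decide which S-eSSDs 110s are needed to form an "Aggregated Ethernet SSD" (e.g., the Aggregation Group). The BMC 150 and the storage administrator may advertise or provide the network address of the P-eSSD 110p to users, so that a user application corresponding to the remote host NVMe-oF driver 170 knows where to find the Aggregated Ethernet SSD. The BMC 150 and the storage administrator may also select or designate one of the eSSDs 110 as the P-eSSD 110p (S420), and may change which eSSD 110 is designated as the P-eSSD 110p after the initial designation, for one or more of a variety of reasons. Thereafter, the BMC 150 may program the primary and secondary modes of the Aggregation Group (S430).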
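The initialization steps S410 to S430 can be sketched in a few lines. This is a hedged illustration, not the BMC firmware: the function name `init_mat_left_columns`, the slot identifiers, the addresses, and the assumption that the three left-hand columns are eSSD index, transport address, and capacity are all hypothetical.

```python
def init_mat_left_columns(discovered, selected_ids, primary_id):
    """Sketch of S410-S430 under stated assumptions.

    discovered:   dict mapping an eSSD identifier to (capacity_tb, address)
    selected_ids: identifiers chosen for the Aggregation Group (S410)
    primary_id:   identifier designated as the P-eSSD (S420)
    Returns (rows, modes): the left-hand MAT columns and the role that
    would be programmed into each eSSD (S430).
    """
    if primary_id not in selected_ids:
        raise ValueError("the primary must be one of the aggregated eSSDs")
    # Index 0 is reserved for the primary, matching the convention in the text.
    ordered = [primary_id] + [i for i in selected_ids if i != primary_id]
    rows = []
    for index, essd_id in enumerate(ordered):
        capacity_tb, address = discovered[essd_id]
        rows.append({"essd_index": index,
                     "address": address,
                     "capacity_tb": capacity_tb})
    modes = {essd_id: ("primary" if essd_id == primary_id else "secondary")
             for essd_id in ordered}
    return rows, modes


# Placeholder inventory; identifiers and addresses are made up.
discovered = {
    "slot0": (16, "10.0.0.10"),
    "slot1": (16, "10.0.0.11"),
    "slot2": (16, "10.0.0.12"),
}
rows, modes = init_mat_left_columns(
    discovered, ["slot0", "slot1", "slot2"], primary_id="slot1")
print(rows[0])
print(modes)
```

Changing the designated primary later amounts to calling the same routine with a different `primary_id`, which re-issues index 0 to the new P-eSSD.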
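The P-eSSD 110p may keep a copy of the MAT 300 on the BMC 150 and may periodically update the copy stored there. In some embodiments, only the P-eSSD 110p contains the official MAT 300; an eSSD index 313 of "0" denotes the P-eSSD 110p, and the remaining eSSD index values correspond to respective S-eSSDs 110s.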
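The P-eSSD 110p terminates the NVMe-oF protocol for the host driver 170 and executes all commands 180 issued by the host driver 170. When a command 180 from the host 190 is complete, the P-eSSD 110p returns a "Completion Entry" 182 to the host driver 170. With respect to the host commands 180, the remote host NVMe-oF driver 170 is completely unaware of the existence of the S-eSSDs 110s. The P-eSSD 110p also maintains the Submission Queues (SQ) and posts command completions to the Completion Queues (CQ).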
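The P-eSSD 110p updates and maintains the three columns on the right side of the MAT 300. When the remote host NVMe-oF driver 170 creates a "Namespace," some flash capacity is allocated to that namespace. The namespace LBA range 314 is mapped to a set of eSSDs 110 and is recorded in the MAT 300 maintained by the P-eSSD 110p. This process is described in further detail below with reference to FIG. 8.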
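One way to picture how a namespace's LBA range ends up mapped onto a set of eSSDs is a first-fit allocation over the free capacity of each drive. The text here does not state an allocation policy, so first-fit, the function name `allocate_namespace`, and the block-count units are assumptions made for this sketch only.

```python
def allocate_namespace(free_blocks, requested_blocks):
    """Map a new namespace onto eSSDs using first-fit (an assumed policy).

    free_blocks:      free capacity, in blocks, per eSSD index (index 0
                      stands for the P-eSSD in this sketch)
    requested_blocks: size of the namespace to create, in blocks
    Returns a list of (essd_index, ns_lba_start, ns_lba_end) tuples with
    inclusive namespace LBA ranges, as would be recorded in the MAT.
    """
    if requested_blocks > sum(free_blocks):
        raise ValueError("insufficient aggregate capacity")
    mapping = []
    next_lba = 0
    remaining = requested_blocks
    for index, free in enumerate(free_blocks):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take == 0:
            continue  # this eSSD has no free blocks to contribute
        mapping.append((index, next_lba, next_lba + take - 1))
        next_lba += take
        remaining -= take
    return mapping


# A 250-block namespace over eSSDs with 100, 200, and 300 free blocks.
print(allocate_namespace([100, 200, 300], 250))
```

Other policies (striping, balancing by wear or load) would produce different ranges; the recorded tuple shape would stay the same.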
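The P-eSSD 110p may also perform certain initialization. Once the MAT 300 has been initialized in the P-eSSD 110p (S440), the P-eSSD 110p knows which eSSDs 110 are its corresponding S-eSSDs 110s. The P-eSSD 110p then sets up a communication channel 130 with each S-eSSD 110s in the Aggregation Group. The communication channel 130 may run over the Ethernet interface through the Ethernet switch 160 in the chassis 120, or over the PCIe interface through the PCIe switch in the chassis 120. If one of the S-eSSDs 110s is located in a different chassis 120 in the same rack 230, the communication channel 130 is established through the TOR switch 240. Within a given chassis 120, the communication channel 130 may also run over the PCIe bus 140.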
FIG. 5 is a block diagram of a system architecture 500 for NVMe-oF eSSD storage, including data flows to and from multiple aggregated eSSDs 110 in multiple eSSD chassis 120 respectively located in multiple racks 230, according to an embodiment of the present disclosure.
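Referring to FIG. 5, the P-eSSD 110p may establish an Ethernet communication channel 530 with S-eSSDs 110s located across a wide area network (WAN) through external network switches and routers. This private Ethernet communication channel 530 may be an RDMA queue pair (QP) or may use a proprietary method. The Ethernet communication channel 530 is used to exchange the sub-commands 132 and the associated completions.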
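The choice of path for the private channel depends on where the secondary sits relative to the primary. The following sketch summarizes the cases described for FIGS. 1, 2, and 5; the `(rack, chassis)` location tuples, the `prefer_pcie` flag, and the returned labels are hypothetical names chosen for illustration.

```python
def channel_path(primary_loc, secondary_loc, prefer_pcie=False):
    """Pick the path for the private channel between the P-eSSD and one
    S-eSSD, given (rack, chassis) locations (an assumed representation).
    """
    p_rack, p_chassis = primary_loc
    s_rack, s_chassis = secondary_loc
    if p_rack == s_rack and p_chassis == s_chassis:
        # Same chassis: either the PCIe switch or the chassis Ethernet switch.
        return "PCIe switch" if prefer_pcie else "chassis Ethernet switch"
    if p_rack == s_rack:
        # Different chassis in the same rack: go through the TOR switch.
        return "TOR switch"
    # Different racks, possibly across a WAN: external switches and routers.
    return "external switches/routers"


print(channel_path(("rack1", "chassisA"), ("rack1", "chassisA")))
print(channel_path(("rack1", "chassisA"), ("rack1", "chassisB")))
print(channel_path(("rack1", "chassisA"), ("rack2", "chassisA")))
```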
FIG. 6 is a table 600 depicting an example command context.
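Referring to FIG. 6, each sub-command 132 has a command ID 640 and carries a "Command Tag" 610, so that when the P-eSSD 110p receives completion entries 134, each completion entry 134 can be traced back to the original command 180. As the completion entries 134 of the sub-commands 132 are traced back, the "# of Sub-commands" field is decremented, and any received error status is latched with the current status. When the "# of Sub-commands" field reaches zero, the corresponding command 180 (i.e., the parent command of the sub-commands 132) is complete, and the P-eSSD 110p may send a completion entry 182 back to the remote host 190. At that point, the P-eSSD 110p generates the completion entry using the accumulated error status 630 and places it in the associated Completion Queue.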
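The bookkeeping described for the command context can be sketched as a small class. The class and method names are hypothetical, status 0 is assumed to mean success, and latching the first non-zero status is only one plausible reading of how the accumulated error status is formed; the text does not fix that rule.

```python
class CommandContext:
    """Hypothetical per-command context kept by the primary eSSD."""

    def __init__(self, command_id, command_tag, num_subcommands):
        self.command_id = command_id
        self.command_tag = command_tag
        self.remaining = num_subcommands   # the "# of Sub-commands" field
        self.error_status = 0              # accumulated error status

    def on_subcommand_completion(self, status):
        """Record one sub-command completion entry.

        Returns True when the parent command is complete, i.e. the
        completion entry for the host can now be generated.
        """
        if self.remaining == 0:
            raise RuntimeError("unexpected completion for a finished command")
        self.remaining -= 1
        if status != 0 and self.error_status == 0:
            self.error_status = status     # latch the first error seen
        return self.remaining == 0


# The table is keyed by command tag so completions can be traced back.
contexts = {0x2A: CommandContext(command_id=7, command_tag=0x2A,
                                 num_subcommands=3)}
results = [contexts[0x2A].on_subcommand_completion(s) for s in (0, 5, 0)]
print(results, contexts[0x2A].error_status)
```

Once the last call returns True, the primary would build the host completion entry from `error_status` and release the context.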
FIG. 7 is a flowchart 700 depicting handling of Admin commands 180 by the P-eSSD 110p, according to an embodiment of the present disclosure.
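Referring to FIGS. 1 and 7, as described above, each primary Ethernet SSD 110p maintains command submission queues. When the primary Ethernet SSD 110p receives an executable command 180 (S710), it arbitrates among the submission queues and then selects a command 180 to execute. The primary Ethernet SSD 110p executes all NVMe commands 180 (e.g., Admin commands and I/O commands). That is, although the secondary Ethernet SSDs 110s may transfer data directly to the host 190, they neither receive commands directly from the host 190 nor send completions directly to the host 190. An Admin command executed by the primary Ethernet SSD 110p may not require any communication with the secondary Ethernet SSDs 110s.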
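The text does not say which arbitration policy the primary Ethernet SSD uses. As a sketch only, the example below assumes simple round-robin over the submission queues; the function name `round_robin_select` and the string commands are hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


def round_robin_select(
    queues: List[Deque[str]], start: int
) -> Optional[Tuple[int, str]]:
    """Pick the next command, scanning queues in round-robin order from `start`.

    Returns (queue_index, command), or None if every queue is empty.
    """
    n = len(queues)
    for step in range(n):
        qid = (start + step) % n
        if queues[qid]:
            return qid, queues[qid].popleft()
    return None


# Three submission queues; the middle one is empty.
sqs = [deque(["admin-identify"]), deque(), deque(["read-A", "read-B"])]
order = []
nxt = 0
while (picked := round_robin_select(sqs, nxt)) is not None:
    qid, cmd = picked
    order.append(cmd)
    nxt = (qid + 1) % len(sqs)  # resume after the queue that was just served
```

Other policies, such as weighted arbitration, would change only the selection function; the rest of the command flow is unaffected.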
After receiving the command 180, the primary Ethernet SSD 110p determines where the data is located and whether it has access to all of the data (S720). If the primary Ethernet SSD 110p has all of the data, it transfers the data to the host 190 (S770).
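If the primary Ethernet SSD 110p determines that it does not have all of the data (S720), it next consults the mapping allocation table 300 to determine where the requested data is located (S730). Once the relevant set of Ethernet SSDs 110 has been identified, the primary Ethernet SSD 110p proceeds to execute the command 180. When the primary Ethernet SSD 110p itself holds all of the requested data, it transfers the data 185. When the requested data is instead spread across a set of the primary Ethernet SSD 110p and/or secondary Ethernet SSDs 110s, the primary Ethernet SSD 110p splits the original command 180 into an appropriate number of sub-commands 132 (S740). The number of sub-commands 132 corresponds to the number of Ethernet SSDs 110 across which the requested data is spread. Each sub-command 132 corresponds to the portion of the requested data held by one Ethernet SSD 110.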
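The splitting step (S740) can be sketched as cutting the requested LBA range along the boundaries of the drives that hold it. The example below is a simplified illustration, not the disclosed implementation: the `Extent` and `SubCommand` types are hypothetical, each drive is assumed to contribute one contiguous extent, and LBAs are kept in namespace terms rather than translated to per-drive addresses.

```python
from typing import List, NamedTuple


class Extent(NamedTuple):
    essd_index: int    # which Ethernet SSD holds this piece
    ns_lba_start: int  # first namespace LBA of the piece
    count: int         # number of blocks in the piece


class SubCommand(NamedTuple):
    essd_index: int
    slba: int          # start LBA (namespace terms, for simplicity)
    nlb: int           # number of blocks


def split_command(extents: List[Extent], slba: int, nlb: int) -> List[SubCommand]:
    """Cut the host range [slba, slba + nlb) along extent boundaries."""
    end = slba + nlb
    subs = []
    for ext in extents:
        lo = max(slba, ext.ns_lba_start)
        hi = min(end, ext.ns_lba_start + ext.count)
        if lo < hi:  # this extent overlaps the requested range
            subs.append(SubCommand(ext.essd_index, lo, hi - lo))
    if sum(s.nlb for s in subs) != nlb:
        raise ValueError("requested range is not fully mapped")
    return subs


extents = [Extent(0, 0, 100), Extent(2, 100, 200), Extent(3, 300, 100)]
subs = split_command(extents, slba=80, nlb=250)
```

Here a 250-block request starting at LBA 80 touches three drives, so it yields three sub-commands, matching the statement that the number of sub-commands follows the number of drives holding the data.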
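The primary Ethernet SSD 110p places the appropriate start LBA (SLBA), number of blocks (NLB), and remote scatter/gather lists (SGLs) into each sub-command 132. A remote SGL contains the address, key, and size of the transfer buffer on the remote host 190. The primary Ethernet SSD 110p then sends those sub-commands 132 to the respective secondary Ethernet SSDs 110s over the private queue-pair communication channel 130 as part of the command-splitting procedure (S750), and waits to receive a completion entry 134 from each of those secondary Ethernet SSDs 110s (S760). Because the original command 180 is split into sub-commands 132, the appropriate Ethernet SSDs 110 can perform their data transfers in parallel, so that the data is delivered to the host 190 (S770).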
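The primary Ethernet SSD 110p creates a command context for each command 180 in execution, as described with reference to FIG. 6. The command context is used to track the execution status of the sub-commands 132 and any intermediate error status of the sub-commands 132 (S780). Once the primary Ethernet SSD 110p confirms that the Admin command is complete, it sends a completion entry to the host 190 (S790).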
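Accordingly, the primary Ethernet SSD 110p receives (S710) and executes all Admin commands 180 issued by the remote host NVMe-oF driver 170. The primary Ethernet SSD 110p may complete many or all Admin commands 180 on its own (e.g., when it determines (S720) that it alone has all of the information needed to complete the Admin command). In some cases, the primary Ethernet SSD 110p may retrieve certain information from the secondary Ethernet SSDs 110s before completing an Admin command 180. If the primary Ethernet SSD 110p needs certain non-user data from the secondary Ethernet SSDs 110s, it creates Admin sub-commands 132 and sends them to the respective secondary Ethernet SSDs 110s. The secondary Ethernet SSDs 110s may return any required data and sub-command completion entries 134 to the primary Ethernet SSD 110p over the private communication channel 130 between them.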
FIG. 8 is a flowchart 800 showing execution of Namespace Create and Delete commands according to an embodiment of the present disclosure.
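Referring to FIG. 8, the primary Ethernet SSD 110p may receive and execute a Namespace Create command (S810) and/or a Delete Admin command (S811). When the primary Ethernet SSD 110p receives a Namespace Create command (S810), it may look up the mapping allocation table 300 (S820) and allocate an appropriate amount of capacity from the total available pool (S830). The newly created namespace may have flash capacity entirely from the primary Ethernet SSD 110p, entirely from certain secondary Ethernet SSDs 110s, or from any combination of the primary Ethernet SSD 110p and secondary Ethernet SSDs 110s. The primary Ethernet SSD 110p may then record the allocated capacity and the associated LBA ranges in the mapping allocation table 300 (S840).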
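When the primary Ethernet SSD 110p receives a Namespace Delete command for deleting a namespace (S811), it looks up the mapping allocation table 300 (S821), retrieves the corresponding Ethernet SSDs (S831), deallocates the associated capacity (S841), and updates the mapping allocation table 300 accordingly.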
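The Namespace Create and Namespace Delete flows can be sketched as allocation from, and return to, a pool of per-drive free capacity. This is a hedged illustration: the `CapacityPool` class and its first-fit allocation order are assumptions, because the text does not specify how capacity is chosen among the primary and secondary drives.

```python
from typing import Dict, List, Tuple


class CapacityPool:
    """Free capacity per Ethernet SSD index, in blocks (index 0 = primary)."""

    def __init__(self, free_blocks: Dict[int, int]):
        self.free = dict(free_blocks)
        # nsid -> list of (essd_index, blocks); stands in for the mapping table
        self.namespaces: Dict[int, List[Tuple[int, int]]] = {}

    def create_namespace(self, nsid: int, blocks: int) -> List[Tuple[int, int]]:
        if nsid in self.namespaces:
            raise ValueError("namespace already exists")
        if blocks <= 0 or blocks > sum(self.free.values()):
            raise ValueError("invalid size or not enough pooled capacity")
        allocation = []
        remaining = blocks
        # Assumed policy: first-fit across drives in index order.
        for essd in sorted(self.free):
            take = min(self.free[essd], remaining)
            if take:
                self.free[essd] -= take
                allocation.append((essd, take))
                remaining -= take
            if remaining == 0:
                break
        self.namespaces[nsid] = allocation  # record the allocation (S840)
        return allocation

    def delete_namespace(self, nsid: int) -> None:
        # Look up the owning drives, then return their capacity to the pool.
        for essd, blocks in self.namespaces.pop(nsid):
            self.free[essd] += blocks
```

For a pool of `{0: 100, 1: 200, 2: 200}` blocks, creating a 250-block namespace under this policy draws 100 blocks from the primary and 150 from the first secondary, which is one instance of the "any combination" case described above.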
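As for Namespace Create/Delete commands executed by the secondary Ethernet SSDs 110s, the secondary Ethernet SSDs 110s do not receive Namespace Create/Delete commands directly. Under normal conditions, a secondary Ethernet SSD 110s contains a single namespace representing its total capacity. When appropriate, the primary Ethernet SSD 110p may send a Namespace Create command or Namespace Delete command to a secondary Ethernet SSD 110s as a sub-command 132. The secondary Ethernet SSD 110s then executes the command and returns the corresponding completion entry 134 to the primary Ethernet SSD 110p. This flow is the same as that for any other Admin sub-command received from the primary Ethernet SSD 110p.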
FIG. 9 is a flowchart 900 showing execution of Read/Write commands under the control of the primary Ethernet SSD 110p according to an embodiment of the present disclosure.
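Referring to FIG. 9, the primary Ethernet SSD 110p may receive and execute all I/O commands 180, including Read and Write commands 180. When the primary Ethernet SSD 110p receives a Read/Write command 180 (S910), it first looks up the mapping allocation table 300 (S920). From the mapping allocation table 300, the primary Ethernet SSD 110p identifies the set of Ethernet SSDs 110 on which the relevant user data resides.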
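As described with reference to FIG. 6, the primary Ethernet SSD 110p next creates a command context for the original command (S930), so that it can track the execution status of the sub-commands 132. The primary Ethernet SSD 110p then creates the corresponding Read/Write sub-commands 132 (S940) and sends the appropriate sub-commands 132 to the appropriate secondary Ethernet SSDs 110s (S950). The primary Ethernet SSD 110p also provides the secondary Ethernet SSDs 110s with all required transport-network information (e.g., addresses). As part of a sub-command 132, a secondary Ethernet SSD 110s receives the scatter/gather list of the remote host 190, containing the remote buffer address, size, and security key.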
The data transfer fields in each sub-command 132 are adjusted to the correct offsets. A secondary Ethernet SSD 110s transfers data directly to the buffers of the remote host 190 (S960) and, when done, sends its completion to the primary Ethernet SSD 110p (rather than directly to the host 190). If necessary, the primary Ethernet SSD 110p performs its own share of the data transfer to the remote host 190 (S960).
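Adjusting the data transfer fields amounts to giving each sub-command its own window into the host buffer. The sketch below assumes, for simplicity, that the host buffer is one contiguous region described by a single address, length, and key; real scatter/gather lists may contain several descriptors. The names `RemoteBuffer` and `buffers_for_subcommands` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class RemoteBuffer(NamedTuple):
    address: int  # host buffer address for this piece
    length: int   # size in bytes
    key: int      # security key, passed through unchanged


def buffers_for_subcommands(
    base: RemoteBuffer,
    cmd_slba: int,
    pieces: List[Tuple[int, int]],
    block_size: int,
) -> List[RemoteBuffer]:
    """Return one host-buffer window per (slba, nlb) piece of the command."""
    out = []
    for slba, nlb in pieces:
        offset = (slba - cmd_slba) * block_size  # byte offset inside the buffer
        length = nlb * block_size
        if offset < 0 or offset + length > base.length:
            raise ValueError("piece falls outside the host buffer")
        out.append(RemoteBuffer(base.address + offset, length, base.key))
    return out


# A 250-block command starting at LBA 80, split into three pieces.
base = RemoteBuffer(address=0x1000, length=250 * 512, key=0xABCD)
windows = buffers_for_subcommands(
    base, cmd_slba=80, pieces=[(80, 20), (100, 200), (300, 30)], block_size=512
)
```

Because the windows do not overlap, each drive can move its piece independently without coordinating with the others during the transfer itself.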
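More specifically, each secondary Ethernet SSD 110s receives enough information from the primary Ethernet SSD 110p to perform any direct data transfer to the remote host 190 (S960). In the NVMe-oF protocol, the RDMA transport services (RDMA Read and RDMA Write) are used to transfer data from the secondary Ethernet SSDs 110s to the remote host 190. The remote host 190 may need to support the Shared Receive Queue (SRQ) feature of the RDMA protocol so that multiple secondary Ethernet SSDs 110s can transfer data to the remote host 190 (S960). The RDMA protocol may run over Ethernet/IP/TCP (iWARP), Ethernet/InfiniBand (RoCE v1), or Ethernet/IP/UDP (RoCE v2). As for communication between the secondary Ethernet SSDs 110s and the remote host 190, the secondary Ethernet SSDs 110s strictly perform only data transfers with the remote host 190 (S960). That is, only RDMA Read and RDMA Write operations take place between the secondary Ethernet SSDs 110s and the remote host 190 (S960). Sub-command completion entries 134 and any non-user-data transfers are carried out with the primary Ethernet SSD 110p using RDMA Send operations or another proprietary protocol.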
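When all sub-commands are complete (S970), as indicated by the primary Ethernet SSD 110p having received all of the sub-command completion entries, the primary Ethernet SSD 110p creates the completion entry 182 for the original command 180 and sends it to the appropriate completion queue on the host 190 (S980). The primary Ethernet SSD 110p then deallocates the command context (S990).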
FIG. 10 is a flowchart 1000 showing a secondary Ethernet SSD 110s executing an Admin sub-command according to an embodiment of the present disclosure.
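Referring to FIG. 10, in this embodiment no secondary Ethernet SSD 110s ever directly receives an Admin command 180, or any other command, from the NVMe-oF driver 170 of the host 190. Instead, the primary Ethernet SSD 110p sends Admin sub-commands 132 to the secondary Ethernet SSDs 110s only when necessary (S1010). The secondary Ethernet SSD 110s then determines whether any data transfer is needed (S1020) and performs any required data transfer with the primary Ethernet SSD 110p over the private communication channel 130 (S1030). The secondary Ethernet SSD 110s then creates (S1040) and sends a completion entry 134 to the primary Ethernet SSD 110p (S1050). In other embodiments, the secondary Ethernet SSD 110s may use RDMA Send operations to send the data and the completion entry 134 to the primary Ethernet SSD 110p.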
FIG. 11 is a flowchart 1100 showing a secondary Ethernet SSD 110s executing a Read/Write command according to an embodiment of the present disclosure.
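Referring to FIG. 11, the main job of a secondary Ethernet SSD 110s is to execute Read or Write sub-commands 132 (as opposed to, e.g., Admin sub-commands 132). That is, a secondary Ethernet SSD 110s mainly moves data to or from the remote host 190 and does not handle other aspects of protocol processing. When a secondary Ethernet SSD 110s receives a Read/Write sub-command (S1110), it uses the received transport-network information (S1120) to issue RDMA Read or RDMA Write requests to the remote host 190 (S1130). As part of the sub-command 132, the secondary Ethernet SSD 110s receives the details of the remote buffer address/offset, size, and security key. Once the necessary data transfer is complete (S1140), the secondary Ethernet SSD 110s sends a completion entry 134 with the appropriate error status to the primary Ethernet SSD 110p (S1150).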
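The secondary drive's side of a Read sub-command can be modeled in miniature. In the sketch below, everything is a stand-in chosen for illustration: the host buffer is a local `bytearray`, the "RDMA Write" is a slice assignment, the key check is a plain comparison, and the type names, tiny block size, and status codes are hypothetical.

```python
from typing import Dict, NamedTuple

BLOCK = 4  # tiny block size in bytes, chosen only to keep the example readable


class ReadSubCommand(NamedTuple):
    command_tag: int    # identifies the parent host command
    slba: int           # first local block to read
    nlb: int            # number of blocks
    remote_offset: int  # where this piece lands in the host buffer
    key: int            # security key for the host buffer


class Completion(NamedTuple):
    command_tag: int
    status: int         # 0 = success, 1 = rejected (hypothetical encoding)


def execute_read(
    sub: ReadSubCommand,
    flash: Dict[int, bytes],
    host_buffer: bytearray,
    host_key: int,
) -> Completion:
    """Move the requested blocks into the host buffer, then report to the primary."""
    if sub.key != host_key:
        return Completion(sub.command_tag, status=1)  # transfer rejected
    data = b"".join(flash[lba] for lba in range(sub.slba, sub.slba + sub.nlb))
    # Stand-in for an RDMA Write into the remote host buffer.
    host_buffer[sub.remote_offset:sub.remote_offset + len(data)] = data
    # The returned completion goes to the primary, never directly to the host.
    return Completion(sub.command_tag, status=0)
```

The point of the model is the ordering: the completion is produced only after the data has been placed in the host buffer, and it is addressed to the primary drive.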
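FIG. 12 is a flowchart 1200 showing synchronization of data transfer and sub-command completion in the secondary Ethernet SSDs 110s according to an embodiment of the present disclosure.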
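Referring to FIG. 12, for a given host command 180, although the primary Ethernet SSD 110p sends the completion entry 182 to the host 190, a set of secondary Ethernet SSDs 110s may perform the data transfer to the host 190. The completion entry 182 for a given command 180 should be provided to the host only after all of the data has been transferred to the remote host 190, because undefined behavior or errors may result if the completion entry 182 for the command 180 reaches the host 190 before the associated data.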
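As described above, the primary Ethernet SSD 110p receives a Read/Write command 180 from the host 190 (S1210) and splits the command 180 into several sub-commands 132 (S1220). Next, each secondary Ethernet SSD 110s that receives a sub-command 132 from the primary Ethernet SSD 110p performs its data transfer to the host 190 (S1250). After determining that the data transfer is complete (S1260), the secondary Ethernet SSD 110s sends a completion entry 134 to the primary Ethernet SSD 110p (S1270). Then, after receiving all of the sub-command completion entries 134 from every relevant secondary Ethernet SSD 110s (S1230), the primary Ethernet SSD 110p sends the completion entry 182 to the host 190 (S1240).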
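Because the data transfers and the completion posting of an aggregated Ethernet SSD are spread across a set of Ethernet SSDs 110, data and completion must be synchronized. In general, these problems do not arise when a single Ethernet SSD 110 performs both phases of command execution (data transfer plus completion posting). An aggregated Ethernet SSD, however, does not operate that way. Therefore, the primary Ethernet SSD 110p must wait for all of the sub-command completion entries 134 from the respective secondary Ethernet SSDs 110s before posting the completion entry 182 for the command 180. Furthermore, a secondary Ethernet SSD 110s must ensure that all of its data transfers have fully and reliably completed before sending the completion entry 134 for the sub-command 132 to the primary Ethernet SSD 110p. This two-stage synchronization procedure ensures that the integrity of the NVMe-oF protocol is maintained in the aggregated Ethernet SSD at all times.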
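The two-stage ordering can be demonstrated with a small concurrency model. This is a sketch under stated assumptions, not the disclosed mechanism: worker threads stand in for secondary drives, a shared list stands in for the observable order of events, and the event names and status value 0 are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Event = Tuple[str, int]


def run_command(ssd_ids: List[int]) -> List[Event]:
    """Simulate one host command served by several secondary drives."""
    log: List[Event] = []
    lock = threading.Lock()

    def secondary(ssd_id: int) -> int:
        with lock:
            # Stage 1: the data transfer to the host finishes first...
            log.append(("data_done", ssd_id))
            # ...and only then is the sub-command completion reported.
            log.append(("sub_completion", ssd_id))
        return 0  # hypothetical status, 0 = success

    with ThreadPoolExecutor(max_workers=len(ssd_ids)) as pool:
        # Stage 2: the primary blocks until every sub-command has completed.
        statuses = list(pool.map(secondary, ssd_ids))

    # Only now may the completion for the original command go to the host.
    log.append(("host_completion", max(statuses)))
    return log
```

Whatever order the workers run in, the log always ends with the host completion, and each drive's `data_done` event precedes its own `sub_completion` event.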
In summary, because only a single primary Ethernet SSD is visible to the host, and that primary Ethernet SSD performs all NVMe-oF protocol processing while tracking whether all related sub-commands handled by the secondary Ethernet SSDs have completed, the aggregation group of Ethernet SSDs appears to the host as a single large logical capacity.
100: System architecture
110s: Secondary Ethernet SSD
110p: Primary Ethernet SSD
120: Ethernet SSD chassis
130: Private communication channel
132: Sub-command
134: Completion entry
135: Control plane
140: PCIe bus
150: Baseboard management controller (BMC)
160: Ethernet switch
170: Host NVMe-oF driver
180: Command
182: Completion entry
185: Data
190: Host
Claims (20)
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762480113P | 2017-03-31 | 2017-03-31 | |
US62/480,113 | 2017-03-31 | ||
US201762483913P | 2017-04-10 | 2017-04-10 | |
US62/483,913 | 2017-04-10 | ||
US15/618,081 | 2017-06-08 | ||
US15/618,081 US10282094B2 (en) | 2017-03-31 | 2017-06-08 | Method for aggregated NVME-over-fabrics ESSD |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201843596A TW201843596A (en) | 2018-12-16 |
TWI734895B true TWI734895B (en) | 2021-08-01 |
Family
ID=63670692
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW107107134A TWI734895B (en) | 2017-03-31 | 2018-03-02 | Method of aggregating storage, method of nvme-of ssd capacity aggregation and aggregated ethernet ssd group |
Country Status (5)
Country | Link |
---|---|
US (1) | US10282094B2 (en) |
JP (1) | JP7032207B2 (en) |
KR (1) | KR102506394B1 (en) |
CN (1) | CN108776576B (en) |
TW (1) | TWI734895B (en) |
Families Citing this family (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10620855B2 (en) * | 2016-09-06 | 2020-04-14 | Samsung Electronics Co., Ltd. | System and method for authenticating critical operations on solid-state drives |
US10733137B2 (en) * | 2017-04-25 | 2020-08-04 | Samsung Electronics Co., Ltd. | Low latency direct access block storage in NVME-of ethernet SSD |
US10949361B1 (en) * | 2017-08-04 | 2021-03-16 | Nimbus Data, Inc. | Multiprocessor software-defined solid-state storage drive |
US11188496B2 (en) * | 2018-09-13 | 2021-11-30 | Toshiba Memory Corporation | System and method for storing data using ethernet drives and ethernet open-channel drives |
US11231764B2 (en) * | 2018-10-17 | 2022-01-25 | Samsung Electronics Co., Ltd. | System and method for supporting chassis level keep alive in NVME-of based system |
US11868284B2 (en) | 2018-12-05 | 2024-01-09 | Rongming Microelectronics (Jinan) Co., Ltd. | Peripheral device with embedded video codec functionality |
US10860504B2 (en) | 2018-12-05 | 2020-12-08 | New Century Technologies Ltd. | Peripheral device with embedded video codec functionality |
KR102348154B1 (en) * | 2018-12-14 | 2022-01-07 | 론밍 마이크로일렉트로닉스 (지난) 엘티디. | Peripheral device with embedded video codec functionality |
US11366610B2 (en) * | 2018-12-20 | 2022-06-21 | Marvell Asia Pte Ltd | Solid-state drive with initiator mode |
KR102691053B1 (en) * | 2019-01-10 | 2024-07-31 | 삼성전자주식회사 | Systems and methods for managing communication between NVMe-SSD storage device and NVMe-oF host unit |
WO2020183246A2 (en) * | 2019-03-14 | 2020-09-17 | Marvell Asia Pte, Ltd. | Termination of non-volatile memory networking messages at the drive level |
EP3939237B1 (en) | 2019-03-14 | 2024-05-15 | Marvell Asia Pte, Ltd. | Transferring data between solid state drives (ssds) via a connection between the ssds |
EP3938880A1 (en) | 2019-03-14 | 2022-01-19 | Marvell Asia Pte, Ltd. | Ethernet enabled solid state drive (ssd) |
CN109992420B (en) * | 2019-04-08 | 2021-10-22 | 苏州浪潮智能科技有限公司 | Parallel PCIE-SSD performance optimization method and system |
JP2020177501A (en) * | 2019-04-19 | 2020-10-29 | 株式会社日立製作所 | Storage system, drive housing thereof, and parity operation method |
JP6942163B2 (en) * | 2019-08-06 | 2021-09-29 | 株式会社日立製作所 | Drive box, storage system and data transfer method |
US11113001B2 (en) * | 2019-08-30 | 2021-09-07 | Hewlett Packard Enterprise Development Lp | Fabric driven non-volatile memory express subsystem zoning |
CN112988623B (en) * | 2019-12-17 | 2021-12-21 | 北京忆芯科技有限公司 | Method and storage device for accelerating SGL (secure gateway) processing |
KR20210077329A (en) | 2019-12-17 | 2021-06-25 | 에스케이하이닉스 주식회사 | Storage System, Storage Device and Operating Method Therefor |
US11704059B2 (en) * | 2020-02-07 | 2023-07-18 | Samsung Electronics Co., Ltd. | Remote direct attached multiple storage function storage device |
US11899550B2 (en) | 2020-03-31 | 2024-02-13 | Advantest Corporation | Enhanced auxiliary memory mapped interface test systems and methods |
KR20210124687A (en) | 2020-04-07 | 2021-10-15 | 에스케이하이닉스 주식회사 | Storage System, Storage Device, and Operating Method Therefor |
CN113051206B (en) * | 2020-05-04 | 2024-10-18 | 威盛电子股份有限公司 | Bridge circuit and computer system |
US11720413B2 (en) | 2020-06-08 | 2023-08-08 | Samsung Electronics Co., Ltd. | Systems and methods for virtualizing fabric-attached storage devices |
US20210389909A1 (en) * | 2020-06-16 | 2021-12-16 | Samsung Electronics Co., Ltd. | Edge solid state drive (ssd) device and edge data system |
US11789634B2 (en) | 2020-07-28 | 2023-10-17 | Samsung Electronics Co., Ltd. | Systems and methods for processing copy commands |
US11733918B2 (en) * | 2020-07-28 | 2023-08-22 | Samsung Electronics Co., Ltd. | Systems and methods for processing commands for storage devices |
CN114490106A (en) * | 2020-11-13 | 2022-05-13 | 瑞昱半导体股份有限公司 | Information exchange system and method |
CN115904210A (en) * | 2021-08-09 | 2023-04-04 | 华为技术有限公司 | Data sending method, network card and computing device |
CN117795466A (en) * | 2021-08-26 | 2024-03-29 | 美光科技公司 | Access request management using subcommands |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070214316A1 (en) * | 2006-03-07 | 2007-09-13 | Samsung Electronics Co., Ltd. | RAID system and method in mobile terminal |
US20110208922A1 (en) * | 2010-02-22 | 2011-08-25 | International Business Machines Corporation | Pool of devices providing operating system redundancy |
TW201212033A (en) * | 2010-09-07 | 2012-03-16 | Phison Electronics Corp | Hybrid storage apparatus and hybrid storage medium controller and addressing method thereof |
TW201510725A (en) * | 2009-01-23 | 2015-03-16 | Infortrend Technology Inc | Storage subsystem and storage system architecture performing storage virtualization and method thereof |
Family Cites Families (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7877569B2 (en) * | 2004-04-28 | 2011-01-25 | Panasonic Corporation | Reduction of fragmentation in nonvolatile memory using alternate address mapping |
US7646636B2 (en) * | 2007-02-16 | 2010-01-12 | Mosaid Technologies Incorporated | Non-volatile memory with dynamic multi-mode operation |
US9189385B2 (en) | 2010-03-22 | 2015-11-17 | Seagate Technology Llc | Scalable data structures for control and management of non-volatile storage |
CN101956936B (en) * | 2010-07-30 | 2014-09-24 | 深圳市华星光电技术有限公司 | Side-view backlight module and liquid crystal displayer using same |
WO2013109640A1 (en) * | 2012-01-17 | 2013-07-25 | Intel Corporation | Techniques for command validation for access to a storage device by a remote client |
JP2015532985A (en) | 2012-09-06 | 2015-11-16 | ピーアイ−コーラル、インク. | Large-scale data storage and delivery system |
US9229854B1 (en) | 2013-01-28 | 2016-01-05 | Radian Memory Systems, LLC | Multi-array operation support and related devices, systems and software |
US9483431B2 (en) | 2013-04-17 | 2016-11-01 | Apeiron Data Systems | Method and apparatus for accessing multiple storage devices from multiple hosts without use of remote direct memory access (RDMA) |
US9785356B2 (en) * | 2013-06-26 | 2017-10-10 | Cnex Labs, Inc. | NVM express controller for remote access of memory and I/O over ethernet-type networks |
US9430412B2 (en) * | 2013-06-26 | 2016-08-30 | Cnex Labs, Inc. | NVM express controller for remote access of memory and I/O over Ethernet-type networks |
US9986028B2 (en) | 2013-07-08 | 2018-05-29 | Intel Corporation | Techniques to replicate data between storage servers |
CN104346287B (en) * | 2013-08-09 | 2019-04-16 | Lsi公司 | The finishing mechanism of multi-level mapping is used in solid state medium |
US9111598B2 (en) | 2013-09-16 | 2015-08-18 | Netapp, Inc. | Increased I/O rate for solid state storage |
US20160259568A1 (en) * | 2013-11-26 | 2016-09-08 | Knut S. Grimsrud | Method and apparatus for storing data |
WO2016196766A2 (en) | 2015-06-03 | 2016-12-08 | Diamanti, Inc. | Enabling use of non-volatile media - express (nvme) over a network |
US9887008B2 (en) | 2014-03-10 | 2018-02-06 | Futurewei Technologies, Inc. | DDR4-SSD dual-port DIMM device |
US9696942B2 (en) | 2014-03-17 | 2017-07-04 | Mellanox Technologies, Ltd. | Accessing remote storage devices using a local bus protocol |
US9430268B2 (en) | 2014-05-02 | 2016-08-30 | Cavium, Inc. | Systems and methods for supporting migration of virtual machines accessing remote storage devices over network via NVMe controllers |
US9507722B2 (en) * | 2014-06-05 | 2016-11-29 | Sandisk Technologies Llc | Methods, systems, and computer readable media for solid state drive caching across a host bus |
US9990313B2 (en) * | 2014-06-19 | 2018-06-05 | Hitachi, Ltd. | Storage apparatus and interface apparatus |
KR102238652B1 (en) * | 2014-11-12 | 2021-04-09 | 삼성전자주식회사 | Data storage devce, method thereof, and method for operating data processing system having the same |
US10025747B2 (en) * | 2015-05-07 | 2018-07-17 | Samsung Electronics Co., Ltd. | I/O channel scrambling/ECC disassociated communication protocol |
KR102430187B1 (en) * | 2015-07-08 | 2022-08-05 | 삼성전자주식회사 | METHOD FOR IMPLEMENTING RDMA NVMe DEVICE |
KR20170013697A (en) * | 2015-07-28 | 2017-02-07 | 삼성전자주식회사 | Data storage device and data processing system including same |
CN105912275A (en) | 2016-04-27 | 2016-08-31 | 华为技术有限公司 | Method and device for establishing connection in nonvolatile memory system |
CN106020723B (en) * | 2016-05-19 | 2019-10-25 | 记忆科技(深圳)有限公司 | A kind of method of simplified NVMe solid state hard disk |
US20180032249A1 (en) * | 2016-07-26 | 2018-02-01 | Microsoft Technology Licensing, Llc | Hardware to make remote storage access appear as local in a virtualized environment |
US10372346B2 (en) * | 2016-07-29 | 2019-08-06 | Western Digital Technologies, Inc. | Extensible storage system controller |
US10474396B2 (en) * | 2016-10-25 | 2019-11-12 | Sandisk Technologies Llc | System and method for managing multiple file systems in a memory |
-
2017
- 2017-06-08 US US15/618,081 patent/US10282094B2/en active Active
-
2018
- 2018-01-23 KR KR1020180008126A patent/KR102506394B1/en active IP Right Grant
- 2018-03-02 TW TW107107134A patent/TWI734895B/en active
- 2018-03-28 CN CN201810263102.0A patent/CN108776576B/en active Active
- 2018-03-30 JP JP2018068067A patent/JP7032207B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
TW201843596A (en) | 2018-12-16 |
JP7032207B2 (en) | 2022-03-08 |
US10282094B2 (en) | 2019-05-07 |
KR20180111492A (en) | 2018-10-11 |
CN108776576B (en) | 2023-08-15 |
US20180284990A1 (en) | 2018-10-04 |
JP2018173959A (en) | 2018-11-08 |
CN108776576A (en) | 2018-11-09 |
KR102506394B1 (en) | 2023-03-06 |