CN106886368A - Block device write IO shaping and multi-controller synchronization system and synchronization method - Google Patents

Block device write IO shaping and multi-controller synchronization system and synchronization method

Info

Publication number
CN106886368A
Authority
CN
China
Prior art keywords
block device
write
request
controller
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710022292.2A
Other languages
Chinese (zh)
Other versions
CN106886368B (en)
Inventor
王道邦
王成武
周泽湘
李艳国
段舒文
于召鑫
潘兴旺
张恒
马赵军
王爽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING TOYOU FEIJI ELECTRONICS Co Ltd
Original Assignee
BEIJING TOYOU FEIJI ELECTRONICS Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING TOYOU FEIJI ELECTRONICS Co Ltd
Publication of CN106886368A
Application granted
Publication of CN106886368B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD

Abstract

The present invention relates to a method for shaping and synchronizing block device write IO across multiple controllers, and belongs to the technical field of mass data storage. Each controller implements a virtual block device driver that is bound to an underlying block device; the driver shapes write IO requests from the server side and synchronizes them across the multi-controller cluster. The system includes a block device filter driver, an IO shaping module, a gap-read module, a cache pool, a forwarding module, an IO flush module, and so on. The back-end physical block devices are shared devices, for example shared disks mapped over FC, or disks attached via SAS or other means. Compared with the prior art, the invention synchronizes data across multiple controllers without occupying double the storage space. It addresses write IO performance optimization and cache consistency in multi-controller cluster storage, makes full use of the bandwidth of the block device, avoids the time wasted on repeated seeks, and reduces write jitter and response latency on the underlying disks. A write returns as soon as it has been written to the cache, which solves the problems of disk write jitter and IO response latency.

Description

Block device write IO shaping and multi-controller synchronization system and synchronization method
Technical field
The present invention relates to a block device write IO shaping and multi-controller synchronization system and synchronization method. It is applicable to multi-controller cluster storage systems and belongs to the technical field of mass data storage.
Background art
In modern storage, many data storage schemes have been devised to improve the reliability of stored data and the input/output performance of storage systems. These schemes are typically various types of RAID (Redundant Arrays of Independent Disks). Using dedicated hardware or software, RAID combines multiple physical storage devices, such as disks, into a single unified logical storage device. Dual-controller storage is the current mainstream technology. Some vendors are also implementing multi-controller technology, but it is still largely based on dual controllers: a cluster is formed from multiple dual-controller pairs rather than from a fully switched fabric. Write IO is synchronized only within each dual-controller pair, and that synchronization merely copies the IO into the cache of the other controller in the pair, that is, cache mirroring. IO is neither optimized nor synchronized between different pairs, and disks can only be shared within a local dual-controller pair. Data safety across multiple controllers therefore requires volume-level replication, which also occupies the local disk space of every dual-controller group, as shown in Fig. 1. The present invention shapes and synchronizes write IO across multiple controllers, so that controllers sharing back-end disks do not need double the storage space and so that write jitter and response latency on the underlying disks are reduced. In addition, when any controller in the cluster fails, the cluster elects one of the remaining controllers to take over; a cluster of N controllers can thus tolerate the failure of N-1 controllers.
Technical terms commonly used in storage systems are explained below:
Cache mirroring: In traditional dual-controller storage, each controller sets aside its own cache space. When data is written, it is written to the peer controller at the same time, so that if the master controller fails, the system can switch to the slave controller without losing data.
Master controller: In dual-controller storage, one controller is designated as the master according to some policy. This controller is responsible for the actual reads and writes of disk data.
Slave controller: In dual-controller storage, in active-active mode every request received by the slave controller is passed to the master controller for execution, and the slave controller returns the result to the server side after it receives the master controller's response. In active-standby mode the slave controller does not receive requests from the server side; server requests are sent only to the master controller, and responses are returned only from the master controller.
IO request: A read or write that the server issues to storage space after that space has been mapped to the server side.
LUN mapping: A traditional disk array maps a logical partition to the host side over an IP or FC network.
SAN: Storage area network.
Summary of the invention
The object of the present invention is to shape and optimize write IO requests from the server side and to distribute the shaped and optimized IO requests to multiple controllers. Caches are fully used to optimize write IO, and merged requests are forwarded to the other controllers, which ensures IO consistency across the controllers and ensures that no data is lost. The approach does not depend on the physical implementation of the underlying layer and is a general-purpose framework.
The object of the present invention is achieved through the following technical solutions:
A block device write IO shaping and multi-controller synchronization method: a block device filter driver is set up on each controller and bound to any underlying block device. The driver intercepts IO requests from the server side and merges them with the IO already in the local cache. While writing to the local cache, it forwards the IO to the other controller nodes, and after it has received IO-complete responses from all controllers, it returns a write-success message to the server side.
A block device write IO shaping and multi-controller synchronization system comprises a block device driver filter module, a block device binding module, a write IO shaping module, an IO forwarding module, a gap-read module, an IO flush module, and a memory pool. The block device driver filter module is connected to the block device binding module, the write IO shaping module, and the memory pool. The write IO shaping module is connected to the memory pool, the IO forwarding module, and the gap-read module. The memory pool is connected to the IO flush module.
Block device driver filter module: mainly used to initialize the whole system, including memory resource allocation, device registration, and interaction with the user management program. Specifically, the interaction with the user management program works as follows: for the underlying block device entered by the user, the module creates a filter block device, allocates a memory pool, starts the IO flush module, and calls the block device binding module to map the filter block device to the underlying block device. The filter block device is used by the user, through the user management program, for SAN export and for receiving IO requests; the module calls the IO shaping module to process each IO request received.
Block device binding module: mainly used to map the filter block device created by the block device driver filter module to the underlying block device, thereby specifying the underlying block device to which the data in that block device's memory will finally be written.
Write IO shaping module: mainly used to merge write IO that is contiguous with, or separated by a gap from, cached data. It writes the merged write IO requests, together with write IO requests that have no merge relationship with the data currently in memory, into the memory corresponding to the block device. It then calls the interface of the IO forwarding module to forward the merged data, and calls the IO flush module to flush the data.
Preferably, having no merge relationship means that the request is not adjacent to the cached data and that the gap between them exceeds a set condition.
Preferably, the gap merge of write IO is completed as follows: the gap-read module is first called to read the gap data from the bound underlying block device, and that data is then merged with the current write IO request and the IO request in the cache.
IO forwarding module: mainly used to forward the IO sent by the write IO shaping module to the other controllers in the cluster. It registers a feedback callback function and waits for feedback from the other controllers; when a controller responds, the callback function is called to process the result. A successful result needs no extra processing; on failure, the error is handled according to the error information returned.
Gap-read module: mainly used to read data from the specified block device at the start position and length sent by the write IO shaping module, and to return the data to the write IO shaping module.
IO flush module: mainly used to write the data in memory to the underlying block device when the memory pool watermark is triggered. After a successful write, it calls the callback function to process the successfully flushed IO requests and notifies the other controllers, which mark the corresponding IO requests in their caches as Clean when they receive this feedback. The successfully flushed IO requests, and the data marked Clean, are then cleared from the cache pool.
Preferably, the memory pool watermark is checked periodically, or whenever IO enters the memory pool, or both.
Preferably, before the data in memory is written to the underlying block device, the module determines whether the current controller is the master controller; if so, it flushes the data of the IO requests that meet the memory pool watermark condition to the corresponding block device.
Preferably, the underlying block device may be physical or logical, provided that it exposes the standard block device interface and can be used together with the filter block device. Examples include shared disks mapped over an FC or IP SAN, and physical disks connected to all controllers through a fully switched SAS or PCI-E fabric.
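The callback-based forwarding described for the IO forwarding module can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `IOForwarder`, `collect_failures`, the `SendFn` transport signature, and the error strings are all hypothetical names, and the sketch calls peers one after another where a real driver would send asynchronously.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical transport: delivers one write to one peer controller and
# returns None on success or an error string (for example "timeout").
SendFn = Callable[[int, bytes], Optional[str]]
# Feedback callback: receives the peer name and the error (None on success).
FeedbackFn = Callable[[str, Optional[str]], None]


class IOForwarder:
    """Forwards a shaped write to every other controller in the cluster."""

    def __init__(self, peers: Dict[str, SendFn]) -> None:
        self.peers = peers

    def forward(self, offset: int, data: bytes, on_feedback: FeedbackFn) -> None:
        # A real driver would send asynchronously; this sketch calls each
        # peer in turn and invokes the registered callback with the result.
        for name, send in self.peers.items():
            on_feedback(name, send(offset, data))


def collect_failures(forwarder: IOForwarder, offset: int, data: bytes) -> List[Tuple[str, str]]:
    """Forward one write and return the (peer, error) pairs that need handling."""
    failures: List[Tuple[str, str]] = []

    def on_feedback(peer: str, error: Optional[str]) -> None:
        if error is not None:  # a successful return needs no extra processing
            failures.append((peer, error))

    forwarder.forward(offset, data, on_feedback)
    return failures
```

The caller decides what to do with each returned error, for example retransmitting after a timeout.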
A block device write IO shaping and multi-controller synchronization method, based on the above system, comprises the following steps:
Step 1: Start the block device driver filter module. For the underlying block device specified by the user, create a filter block device, which is used for front-end export and for receiving read and write IO from the server side, and allocate a memory pool for it. Then call the block device binding module to bind the underlying block device to the filter block device. The underlying block device bound to the filter block device may be a single disk, a RAID composed of multiple disks, a logical disk mapped from SAN storage, and so on. The IO flush module is called periodically to check whether the data in the memory pool has reached the watermark and, if so, to flush the data to the underlying block device;
Step 2: The filter block device waits for write IO requests from the server side. When a write IO arrives, it is compared with the IO already in the cache. Write IO requests whose addresses are contiguous with a cached IO request, and write IO requests that meet the address gap condition with respect to a cached IO request, are passed to the write IO shaping module for shaping. Write IO requests that meet neither the contiguity condition nor the gap condition are written directly to the cache;
Step 3: For a write IO request whose address is contiguous with a cached IO request, the write IO shaping module merges the two and removes the merged cached IO request from the memory pool. For a write IO request that meets the address gap condition with respect to a cached IO request, it calls the gap-read module to read the gap data between the write IO request and the write IO request in the memory pool, merges them, and removes the merged cached IO request from the memory pool;
Step 4: The write IO shaping module writes the merged write IO requests, and the write IO requests that cannot be merged, into the memory pool, and calls the IO flush module to check whether the data in the memory pool has reached the watermark and, if so, to flush it to the underlying block device. At the same time, it calls the forwarding module to forward the requests to the other controllers in the cluster and waits for their feedback. Once all have succeeded, the success message is returned to the filter block device, which returns it to the server side. If a controller times out, it is marked as failed after the configured number of retransmissions, all controllers are notified, the failed controller is removed from the cluster, and a new controller is elected to take over the failed controller's work.
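The comparison and merging in steps two and three can be illustrated with a short sketch. Everything named here (`WriteIO`, `shape_write`, `read_gap`, `max_gap`) is a hypothetical name chosen for illustration; the text does not specify the gap condition, so a simple byte threshold is assumed, and overlapping writes are left unmerged because the text does not cover them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WriteIO:
    offset: int  # start address in bytes
    data: bytes

    @property
    def end(self) -> int:
        return self.offset + len(self.data)


def shape_write(cache: List[WriteIO], new_io: WriteIO,
                read_gap: Callable[[int, int], bytes],
                max_gap: int) -> List[WriteIO]:
    """Merge new_io with cached writes and return the new cache contents.

    read_gap(start, length) stands in for the gap-read step and must
    return `length` bytes read from the underlying block device.
    """
    merged = new_io
    remaining: List[WriteIO] = []
    for cached in sorted(cache, key=lambda io: io.offset):
        first, second = sorted((merged, cached), key=lambda io: io.offset)
        gap = second.offset - first.end
        if gap == 0:
            # Contiguous addresses: concatenate the two requests.
            merged = WriteIO(first.offset, first.data + second.data)
        elif 0 < gap <= max_gap:
            # Small gap: fill it with data read from the bound device.
            filler = read_gap(first.end, gap)
            merged = WriteIO(first.offset, first.data + filler + second.data)
        else:
            # Too far apart (or overlapping): leave the cached entry alone.
            remaining.append(cached)
    remaining.append(merged)
    return sorted(remaining, key=lambda io: io.offset)
```

The single pass is enough as long as the cache never holds two entries that could already have been merged with each other, which holds if every write enters the cache through this function.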
Beneficial effects
Compared with the prior art, the present invention shapes and synchronizes write IO across multiple controllers, so that controllers sharing back-end disks can be synchronized without occupying double the storage space. When any controller in the cluster fails, the cluster elects one of the remaining controllers to take over, so a cluster of N controllers can tolerate the failure of N-1 controllers. The invention addresses write IO performance optimization and write IO cache consistency in multi-controller cluster storage. By optimizing and merging IO, it makes full use of the bandwidth of the underlying block device, avoids the time wasted on repeated seeks, and reduces write jitter and response latency on the underlying disks. A write returns as soon as it is saved in the cache, which solves the problems of write jitter on the underlying disks and of response latency for client write IO requests.
Brief description of the drawings
Fig. 1 is a schematic diagram of the operation of the traditional multi-controller arrangement referred to in the embodiment of the present invention;
Fig. 2 is a schematic diagram of the operation of the software modules within a single controller in the embodiment of the present invention;
Fig. 3 is a schematic diagram of the overall architecture of the multi-controller cluster storage in the embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to a specific implementation.
Fig. 3 shows a typical multi-controller cluster storage architecture. Each service host submits IO read and write requests through the service switching equipment to the multi-controller cluster, that is, to the individual controllers. Each controller operates the back-end storage devices through the back-end switching equipment to read and write the physical data. The controllers use the cluster switching equipment to synchronize IO and to arbitrate a master controller when a controller stops working.
In the method of the invention, a block device filter driver is set up on each controller and bound to any underlying block device. The driver intercepts IO requests from the server side and merges them with the IO already in the local cache. While writing to the local cache, it forwards the IO to the other controller nodes, and after it has received IO-complete responses from all controllers, it returns a write-success message to the server side.
A block device write IO shaping and multi-controller synchronization system implementing the above method is applied to the multi-controller IO synchronization in Fig. 3. As shown in Fig. 2, the system comprises the following modules:
A block device write IO shaping and multi-controller synchronization system comprises a block device driver filter module, a block device binding module, a write IO shaping module, an IO forwarding module, a gap-read module, an IO flush module, and a memory pool. The block device driver filter module is connected to the block device binding module, the write IO shaping module, and the memory pool. The write IO shaping module is connected to the memory pool, the IO forwarding module, and the gap-read module. The memory pool is connected to the IO flush module.
Block device driver filter module: mainly used to initialize the whole system, including memory resource allocation, device registration, and interaction with the user management program. Specifically, the interaction with the user management program works as follows: for the underlying block device entered by the user, the module creates a filter block device, allocates a memory pool, starts the IO flush module, and calls the block device binding module to map the filter block device to the underlying block device. The filter block device is used by the user, through the user management program, for export over protocols such as FC and IP and for receiving IO requests; the module calls the IO shaping interface to process each IO request received.
The block device in this embodiment may be physical or logical, provided that it exposes the standard block device interface and can be used together with the filter block device. Examples include shared disks mapped over an FC or IP SAN, and physical disks connected to all controllers through a fully switched SAS or PCI-E fabric.
Block device binding module: mainly used to map the filter block device created by the block device driver filter module to the underlying block device, thereby specifying the underlying block device to which the data in that block device's memory will finally be written.
Write IO shaping module: mainly used to merge write IO that is contiguous with, or separated by a gap from, cached data. After merging, it writes the result into the memory corresponding to the block device and calls the interface of the IO forwarding module to forward the merged data.
Preferably, the gap merge of write IO is completed as follows: the gap-read module is first called to read the gap data from the bound underlying block device, and that data is then merged with the current write IO request and the IO request in the cache.
IO forwarding module: mainly used to forward the IO sent by the write IO shaping module to the other controllers in the cluster. It registers a feedback callback function and waits for feedback from the other controllers; when a controller responds, the callback function is called to process the result. A successful result needs no extra processing; on failure, the error is handled according to the error code returned, for example by retransmitting after a timeout.
Gap-read module: mainly used to read data from the specified block device at the start position and length sent by the write IO shaping module, and to return the data to the write IO shaping module.
IO flush module: the memory pool of each block device holds the write IO requests of the corresponding filter block device. The flush module is mainly used to write the data in memory to the underlying block device when the memory pool watermark is triggered. As a policy, the watermark may be checked periodically and whenever IO enters the memory pool. After a successful flush, the module calls the callback function to process the successfully flushed IO requests, clears the corresponding data from the cache pool, and notifies the other controllers, which mark the corresponding IO requests in their caches as Clean when they receive this feedback.
Preferably, the memory pool watermark is checked periodically, or whenever IO enters the memory pool, or both.
Preferably, the IO flush module works as follows: it checks whether the cached data in the memory pool has reached the watermark; if so, it determines whether the current controller is the master controller. If it is not, it releases the write IO requests marked Clean in the cache and cleans up the cache. If it is the master controller, it flushes the data of the IO requests that meet the memory pool watermark condition to the corresponding block device, cleans up the cache, and notifies all controllers of the successfully flushed IO requests. When the IO flush module of each controller receives the notification, it marks the corresponding write IO requests in its local memory pool as Clean.
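The flush workflow above can be sketched as a single pass over a memory pool. This is an illustration only: the pool layout (a dict keyed by offset), the function names, and the callbacks are assumptions, and the sketch flushes every pooled entry once the watermark is reached because the text does not say which entries "meet the watermark condition".

```python
from typing import Callable, Dict, List

# Hypothetical pool layout: offset -> {"data": bytes, "clean": bool}
Pool = Dict[int, dict]


def flush_pass(pool: Pool, watermark: int, is_master: bool,
               write_to_disk: Callable[[int, bytes], None],
               notify_peers: Callable[[List[int]], None]) -> List[int]:
    """Run one flush pass and return the offsets written to the block device."""
    used = sum(len(entry["data"]) for entry in pool.values())
    if used < watermark:
        return []  # watermark not reached, nothing to do
    if not is_master:
        # Non-master: only release entries the master reported as flushed.
        for off in [o for o, e in pool.items() if e["clean"]]:
            del pool[off]
        return []
    flushed: List[int] = []
    for off in sorted(pool):  # ascending offsets keep the disk writes sequential
        write_to_disk(off, pool[off]["data"])
        flushed.append(off)
    for off in flushed:
        del pool[off]  # clean up the local cache
    notify_peers(flushed)  # tell every other controller what was flushed
    return flushed


def on_flush_notice(pool: Pool, flushed_offsets: List[int]) -> None:
    """Peer side: mark the entries the master has flushed as Clean."""
    for off in flushed_offsets:
        if off in pool:
            pool[off]["clean"] = True
```

A non-master controller keeps unflushed data until the master confirms the flush, so the data is still available if the master fails before writing it to disk.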
A block device write IO shaping and multi-controller synchronization method based on the above system comprises the following steps:
Step 1: Start the block device driver filter module. For the underlying block device specified by the user, create a filter block device, which is used for front-end export and for receiving read and write IO from the server side, and allocate a memory pool for it. Then call the block device binding module to bind the underlying block device to the filter block device. The underlying block device bound to the filter block device may be a single disk, a RAID composed of multiple disks, a logical disk mapped from SAN storage, and so on. The IO flush module is called periodically to check whether the data in the memory pool has reached the watermark and, if so, to flush the data to the underlying block device;
Step 2: The filter block device waits for write IO requests from the server side. When a write IO arrives, it is compared with the IO already in the cache. Write IO requests whose addresses are contiguous with a cached IO request, and write IO requests that meet the address gap condition with respect to a cached IO request, are passed to the write IO shaping module for shaping. Write IO requests that meet neither the contiguity condition nor the gap condition are written directly to the cache;
Step 3: For a write IO request whose address is contiguous with a cached IO request, the write IO shaping module merges the two and removes the cached IO request from the memory pool. For a write IO request that meets the address gap condition with respect to a cached IO request, it calls the gap-read module to read the gap data between the write IO request and the write IO request in the memory pool, merges them, and removes the cached IO request from the memory pool;
Step 4: The write IO shaping module writes the merged write IO requests, and the write IO requests that cannot be merged, into the memory pool, and calls the IO flush module to check whether the data in the memory pool has reached the watermark and, if so, to flush it to the underlying block device. At the same time, it calls the forwarding module to forward the requests to the other controllers in the cluster and waits for their feedback. Once all have succeeded, the success message is returned to the filter block device, which returns it to the server side. If a controller times out, it is marked as failed after the configured number of retransmissions, all controllers are notified, the failed controller is removed from the cluster, and a new controller is elected to take over the failed controller's work.
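The acknowledgement and timeout handling in step four can be sketched as below. The names `replicate_write`, `send`, `on_failed`, and `max_retries` are hypothetical, the sketch sends to peers one after another although the text describes forwarding concurrently with the local cache write, and the notification and election that follow a failure are reduced to a single `on_failed` hook because the text leaves the election strategy open.

```python
from typing import Callable, List


def replicate_write(io: bytes, local_cache: List[bytes], peers: List[str],
                    send: Callable[[str, bytes], bool],
                    on_failed: Callable[[str], None],
                    max_retries: int = 3) -> bool:
    """Cache a write locally, forward it to every peer, then acknowledge.

    send(peer, io) returns True when the peer confirms the write and
    False when the attempt times out.
    """
    local_cache.append(io)
    for peer in list(peers):  # iterate over a copy because peers may be removed
        for _attempt in range(1 + max_retries):  # first try plus retransmissions
            if send(peer, io):
                break
        else:
            # Every attempt timed out: mark the peer failed and evict it.
            peers.remove(peer)
            on_failed(peer)  # hook for notifying the cluster and electing a successor
    return True  # success is reported once all remaining peers have confirmed
```

With this structure the server sees a successful write as long as the local controller and every surviving peer hold the data.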
Experimental results
In this experiment, a cluster of 4 controllers with 24 shared disks was built. The back end uses SAS switching so that all 4 controller nodes can read and write the 24 disks, and the test uses these 24 disks. Two RAID5 storage pools were created, each with 11 disks; the remaining 2 disks serve as hot spares, one for each storage pool. One volume was created on each storage pool. Every node loads the block device IO filter driver, that is, the block device driver filter module, binds it to each underlying volume block device, and presents two block devices in the filter driver, that is, filter block devices, whose attributes are those of the underlying volumes. If the client needs SAN access, the volume is exported as a target over one or more of Fibre Channel or iSCSI, so that the server side sees a raw device. If the client needs NAS access, the volume is formatted with a file system and served over CIFS, NFS, HTTP, HTTPS, FTP, or a similar protocol.
When a write IO from the server side reaches the filter driver, it is passed to the shaping module and compared with the IO already in the cache. If its address is contiguous with a cached IO request, the write IO is merged with it. If it meets the address gap condition, the gap data is first read from the bound underlying block device and then merged with the current write IO request and the IO request in the cache. Both the merged IO requests and the IO requests that cannot be merged are written to the cache pool (that is, the memory pool) corresponding to the bound underlying block device. The forwarding module is then called to forward the IO requests, merged or not, to the other controllers in the cluster, and the feedback of the other controllers is awaited; once all have succeeded, the success message is returned to the server side. If a controller times out, it is marked as failed after the configured number of retransmissions, all controllers are notified, the failed controller is removed from the cluster, and one of the remaining controllers is elected to take over its work. The specific election and takeover strategy is outside the scope of the present invention, and the corresponding strategy of an existing cluster can be used directly. When the cache watermark condition is triggered, the IO flush module determines whether the current controller is the master controller. If it is not, it releases the write IO requests marked Clean in the cache and cleans up the cache. If it is the master controller, it flushes the data of the IO requests in the cache pool that meet the watermark condition to the corresponding block device and notifies all controllers of the successfully flushed IO requests so that they are marked Clean.
In the actual SAN and NAS export tests of this implementation, when write IO requests from the server side reach the filter block device, the filter driver takes over all write IO destined for the RAID layer and then writes the data to disk as contiguously and sequentially as possible. This reduces disk write jitter and makes full use of the disks' write bandwidth. At the same time, the IO forwarding and consistency handling across controllers prevents data loss when a controller fails and ensures the consistency of user data.
The above is only a preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art may make improvements, or make equivalent replacements of some of the technical features, without departing from the principle of the present invention, and such improvements and replacements shall also be regarded as falling within the scope of protection of the present invention.

Claims (8)

1. a kind of block device writes IO shapings and multi-controller synchronous method, it is characterised in that:A kind of block device writes IO shapings and many Controller synchronous method, by setting a block device filtration drive on each controller, it is set with any block of bottom For being bound, the I/O Request of server end is intercepted, and IO is carried out with local caching IO and merged, in the same of write-in local cache When be forwarded to other controller nodes, while after receiving the response that all controller IO are completed, returning to server end and writing successfully Information.
2. A block device write IO shaping and multi-controller synchronization system, characterized in that it comprises a block device driver filter module, a block device binding module, a write IO shaping module, an IO forwarding module, a gap read module, an IO flush module, and a memory pool; the block device driver filter module is connected to the block device binding module, the write IO shaping module, and the memory pool respectively; the write IO shaping module is connected to the memory pool, the IO forwarding module, and the gap read module respectively; and the memory pool is connected to the IO flush module;
The block device driver filter module is mainly used to initialize the whole system, including memory resource allocation, device registration, and interaction with the user management program. Specifically, through interaction with the user management program, it creates a filter block device for the underlying block device entered by the user, allocates the memory pool, starts the IO flush module, and calls the block device binding module to map the filter block device to the underlying block device. The filter block device is used by the user for SAN export and for receiving IO requests, and the write IO shaping module is called to process the IO requests received;
The block device binding module is mainly used to map the filter block device created by the block device driver filter module to the underlying block device, specifying the underlying block device to which the data in the memory corresponding to the block device is finally written;
The write IO shaping module is mainly used to merge write IOs that are contiguous or separated by a data gap, to write the merged write IO requests, together with write IO requests that have no merge relation with the data currently in memory, into the memory corresponding to the block device, to call the interface of the IO forwarding module to forward the merged data, and to call the IO flush module to flush the data;
The IO forwarding module is mainly used to forward the IO sent by the write IO shaping module to the other controllers in the cluster, registering a feedback callback function and waiting for feedback from the other controllers. When another controller returns feedback, the callback function is called to process the result: a successful return needs no extra processing; on failure, the corresponding handling is carried out according to the returned error information;
The gap read module is mainly used to read data from the specified block device at the start position and length sent by the write IO shaping module, and to return the data to the write IO shaping module;
The IO flush module is mainly used to write the data in memory to the underlying block device when the memory pool watermark is triggered. After a successful write, a callback function is called to process the successfully flushed IO requests and the other controllers are notified; on receiving this feedback, the other controllers mark the corresponding IO requests in their caches as Clean. The successfully flushed IO requests and the data marked Clean are then removed from the cache pool.
3. The block device write IO shaping and multi-controller synchronization system according to claim 2, characterized in that the merging of write IO gap data is completed as follows: the gap read module is first called to read the gap data from the bound underlying block device, and the gap data is then merged with the current write IO request and the IO requests in the cache.
4. The block device write IO shaping and multi-controller synchronization system according to claim 2, characterized in that the memory pool watermark trigger is checked periodically, or when an IO enters the memory pool, or both periodically and when an IO enters the memory pool.
5. The block device write IO shaping and multi-controller synchronization system according to claim 2, characterized in that, before the data in memory is written to the underlying block device, the controller determines whether it is currently the master controller; if so, the data of the IO requests that meet the memory pool watermark condition is flushed to the corresponding block device.
6. The block device write IO shaping and multi-controller synchronization system according to claim 2, characterized in that the underlying block device may be physical or logical; any device that provides the standard block device interface can be used together with the filter block device, including shared disks mapped over FC or IP SAN, or physical disks connected to all controllers through SAS or PCI-E full switching.
7. The block device write IO shaping and multi-controller synchronization system according to any one of claims 2-6, characterized in that the handling carried out according to the returned error information includes, but is not limited to, the following: if a controller times out, then after the set number of retransmissions it is marked as failed and all controllers are notified; the failed controller is removed from the cluster, and one of the remaining controllers is elected to take over its work.
8. A block device write IO shaping and multi-controller synchronization method, based on the block device write IO shaping and multi-controller synchronization system according to claim 2, comprising the following:
One, start the block device driver filter module; according to the underlying block device specified by the user, create a filter block device for front-end export and for receiving read and write IO from the server end, and allocate a memory pool for the filter block device; then call the block device binding module to bind the underlying block device to the filter block device. The underlying block device bound to the filter block device may be a single disk, a RAID composed of multiple disks, a logical disk mapped from SAN storage, and so on. The IO flush module is called periodically to check whether the data in the memory pool has reached the watermark and, if so, to flush the data to the underlying block device;
Two, the filter block device waits for write IO requests from the server end. When a write IO arrives, it is compared with the IOs already in the cache. Write IO requests that are address-contiguous with a cached IO request, and write IO requests that meet the address gap condition with a cached IO request, are passed to the write IO shaping module for shaping; write IO requests that meet neither the contiguity condition nor the gap condition are written directly into the cache;
Three, for a write IO request that is address-contiguous with a cached IO request, the write IO shaping module merges the write IOs and removes from the memory pool the cached IO request used for the merge. For a write IO request that meets the address gap condition with a cached IO request, it calls the gap read module to read the gap data between the write IO request and the write IO request in the memory pool, merges them, and removes from the memory pool the cached IO request used for the merge;
Four, the write IO shaping module writes the merged write IO requests and the write IO requests that cannot be merged into the memory pool, and calls the IO flush module to check whether the data in the memory pool has reached the watermark and, if so, to flush the data to the underlying block device. It also calls the forwarding module to forward the requests to the other controllers in the cluster at the same time and waits for their feedback; on success, the success information is returned to the filter block device, which then returns it to the server end. If a controller times out, then after the set number of retransmissions it is marked as failed and all controllers are notified; the failed controller is removed from the cluster, and a new controller is elected to take over the work of the failed controller.
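The forwarding and failure handling in the last step of claim 8 can be sketched in Python as follows. This is only an illustration under stated assumptions: `Sender`, `forward_write`, `Controller` and `max_retries` are hypothetical names, the transport is modelled as a synchronous function call rather than a registered feedback callback, notification of the other controllers is omitted, and the election rule (pick the lexicographically smallest remaining controller name) is a placeholder because the claims do not specify one.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical transport: sends one write IO to a peer controller and
# returns True on acknowledgement; may raise TimeoutError.
Sender = Callable[[bytes], bool]


def forward_write(io: bytes, peers: Dict[str, Sender],
                  max_retries: int = 3) -> Tuple[Set[str], Set[str]]:
    """Forward io to every peer; return (acknowledged, failed) peer names."""
    acked: Set[str] = set()
    failed: Set[str] = set()
    for name, send in peers.items():
        for _attempt in range(1 + max_retries):
            try:
                if send(io):
                    acked.add(name)
                    break
            except TimeoutError:
                pass  # treat a timeout like a missing ack and retransmit
        else:
            # No acknowledgement after the configured retransmissions.
            failed.add(name)
    return acked, failed


class Controller:
    def __init__(self, name: str, peers: Dict[str, Sender],
                 master: str, max_retries: int = 3) -> None:
        self.name = name
        self.peers = dict(peers)
        self.master = master
        self.max_retries = max_retries
        self.cache: List[bytes] = []   # stand-in for the local memory pool

    def write(self, io: bytes) -> bool:
        """Cache locally, forward to peers, evict peers that never answer."""
        self.cache.append(io)
        acked, failed = forward_write(io, self.peers, self.max_retries)
        for name in failed:
            del self.peers[name]       # remove the failed controller
            if self.master == name:
                # Placeholder election rule (assumption, not from the claims).
                self.master = min([self.name, *self.peers])
        # Success is reported to the server end only once every controller
        # still in the cluster has acknowledged the write.
        return acked == set(self.peers)


if __name__ == "__main__":
    def healthy(io: bytes) -> bool:
        return True

    def unreachable(io: bytes) -> bool:
        raise TimeoutError

    ctl = Controller("A", {"B": healthy, "C": unreachable}, master="C")
    print(ctl.write(b"payload"), sorted(ctl.peers), ctl.master)
```

A real implementation would forward to all peers concurrently and complete the server request from the feedback callbacks; the sequential loop here only keeps the control flow easy to follow.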
CN201710022292.2A 2016-12-30 2017-01-12 A kind of block device writes IO shaping and multi-controller synchronization system and synchronous method Active CN106886368B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611257652 2016-12-30
CN2016112576529 2016-12-30

Publications (2)

Publication Number Publication Date
CN106886368A true CN106886368A (en) 2017-06-23
CN106886368B CN106886368B (en) 2019-08-16

Family

ID=59175706

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710022292.2A Active CN106886368B (en) 2016-12-30 2017-01-12 A kind of block device writes IO shaping and multi-controller synchronization system and synchronous method

Country Status (1)

Country Link
CN (1) CN106886368B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996011430A3 (en) * 1994-10-03 1996-07-18 Ibm Coherency and synchronization mechanism for I/O channel controllers in a data processing system
CN101727299A (en) * 2010-02-08 2010-06-09 北京同有飞骥科技有限公司 RAID5-orientated optimal design method for writing operation in continuous data storage
CN102662607A (en) * 2012-03-29 2012-09-12 华中科技大学 RAID6 level mixed disk array, and method for accelerating performance and improving reliability
CN103049222A (en) * 2012-12-28 2013-04-17 中国船舶重工集团公司第七0九研究所 RAID5 (redundant array of independent disk 5) write IO optimization processing method
CN103425438A (en) * 2013-07-15 2013-12-04 记忆科技(深圳)有限公司 Solid state disk and method for optimizing write request of solid state disk
CN103577125A (en) * 2013-11-22 2014-02-12 浪潮(北京)电子信息产业有限公司 Cross controller group mirror image writing method and device applied to high-end disk array
CN103761051A (en) * 2013-12-17 2014-04-30 北京同有飞骥科技股份有限公司 Performance optimization method for multi-input/output stream concurrent writing based on continuous data
CN103838515A (en) * 2012-11-23 2014-06-04 中国科学院声学研究所 Method and system for allowing server cluster to have access to and dispatch multi-controller disk array
CN104503710A (en) * 2015-01-23 2015-04-08 福州瑞芯微电子有限公司 Method and device for increasing writing speed of nand flash
CN105204779A (en) * 2015-09-14 2015-12-30 北京鲸鲨软件科技有限公司 Double-control-based SCSI (Small Computer System Interface) TARGET access control method and device
CN106233270A (en) * 2014-04-29 2016-12-14 华为技术有限公司 Share Memory Controller and using method thereof
CN106227464A (en) * 2016-07-14 2016-12-14 中国科学院计算技术研究所 A kind of double-deck redundant storage system and data write, reading and restoration methods

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107800576A (en) * 2017-11-16 2018-03-13 郑州云海信息技术有限公司 A kind of system integrating management method and system based on multi-controller framework
CN107943422A (en) * 2017-12-07 2018-04-20 郑州云海信息技术有限公司 A kind of high speed storing media data management method, system and device
CN109086008A (en) * 2018-07-26 2018-12-25 浪潮电子信息产业股份有限公司 The data processing method and solid state hard disk of solid state hard disk
CN109086008B (en) * 2018-07-26 2021-06-29 浪潮电子信息产业股份有限公司 Data processing method of solid state disk and solid state disk
CN111813562A (en) * 2020-04-30 2020-10-23 中科院计算所西部高等技术研究院 Server host with OODA multi-partition IO resource pool mechanism
CN111813562B (en) * 2020-04-30 2023-09-26 中科院计算所西部高等技术研究院 Server host with OODA multi-partition IO resource pool mechanism
CN112612429A (en) * 2021-01-06 2021-04-06 武汉飞骥永泰科技有限公司 iscsi target tgt architecture optimization method and system
CN113687796A (en) * 2021-10-25 2021-11-23 苏州浪潮智能科技有限公司 IO task processing method and device, computer equipment and storage medium
CN114860634A (en) * 2022-04-29 2022-08-05 苏州浪潮智能科技有限公司 Forwarding current limiting method and device for write IO between storage controllers and storage medium
CN114860634B (en) * 2022-04-29 2023-08-04 苏州浪潮智能科技有限公司 Forwarding and current limiting method and device for write IO between storage controllers and storage medium

Also Published As

Publication number Publication date
CN106886368B (en) 2019-08-16

Similar Documents

Publication Publication Date Title
CN106886368B (en) A kind of block device writes IO shaping and multi-controller synchronization system and synchronous method
US8108644B2 (en) Storage control apparatus, storage system, and virtual volume control method
US20190171264A1 (en) Persistent reservations for virtual disk using multiple targets
US6598174B1 (en) Method and apparatus for storage unit replacement in non-redundant array
US7181578B1 (en) Method and apparatus for efficient scalable storage management
US7444541B2 (en) Failover and failback of write cache data in dual active controllers
US8874680B1 (en) Interconnect delivery process
CA2363726C (en) Methods and systems for implementing shared disk array management functions
CN100334534C (en) Methods and apparatus for implementing virtualization of storage within a storage area network
JP4557988B2 (en) System and method for takeover of partner resources related to core dumps
EP1686478A2 (en) Storage replication system with data tracking
US20140244578A1 (en) Highly available main memory database system, operating method and uses thereof
US20060277363A1 (en) Method and apparatus for implementing a grid storage system
US20050028028A1 (en) Method for establishing a redundant array controller module in a storage array network
US6477618B2 (en) Data storage system cluster architecture
GB2416414A (en) Host-side rerouting of I/O requests in a data storage system with redundant controllers
WO2017041616A1 (en) Data reading and writing method and device, double active storage system and realization method thereof
CN105934793A (en) Method for distributing data in storage system, distribution apparatus and storage system
CN104135514B (en) Fusion type virtual storage system
CN109582495A (en) Bidirectional replication
CN106933504A (en) Method and system for providing the access of storage system
CN104994135B (en) The method and device of SAN and NAS storage architectures is merged in storage system
CN108205573B (en) Data distributed storage method and system
Zhang et al. Leveraging glocality for fast failure recovery in distributed RAM storage
CN207704423U (en) Resource pool integration builds system in dispatching of power netwoks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant