CN100362462C - Method for managing magnetic disk array buffer storage - Google Patents

Publication number
CN100362462C
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2005100842693A
Other languages
Chinese (zh)
Other versions
CN1862475A (en)
Inventor
王玉林
吴小军
李广军
林水生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
University of Electronic Science and Technology of China
Original Assignee
Huawei Technologies Co Ltd
University of Electronic Science and Technology of China
Application filed by Huawei Technologies Co Ltd, University of Electronic Science and Technology of China
Priority to CNB2005100842693A
Publication of CN1862475A
Application granted
Publication of CN100362462C

Abstract

The present invention discloses a method for managing the cache of a disk array, covering both read requests and write requests. A read request is handled as follows. When a host read request arrives, it is determined whether the requested data is held in the disk array cache; if so, the data is fetched from the cache and returned to the host; otherwise the data is fetched from disk and returned to the host, and step b1 is executed. Step b1: it is determined whether the disk array cache has a free area; if so, the read request data is stored in the free area; otherwise step c1 is executed. Step c1: a read data block in the disk array cache is selected, and the read request data is stored in place of the selected read data block. Host write requests are managed in a similar way, and a write aggregation algorithm is used when the write data blocks are written to disk. Without lowering the cache hit ratio, the method shortens the average service time of host requests and improves the performance of the RAID system.

Description

Method for managing a disk array cache
Technical field
The present invention relates to storage management technology, and in particular to a method for managing a disk array cache for read requests and write requests.
Background
With the development of Internet technology and the widespread use of computers, the volume of stored data keeps growing and the demands placed on storage systems keep rising, especially where mission-critical workloads are stored. A redundant array of independent disks (RAID) uses striping and redundancy to improve the capacity, speed and reliability of a storage system, and has become the preferred architecture for high-performance data storage.
A RAID system generally includes a controller, a disk array cache, and a lower-level storage tier made up of multiple disks. The disk array cache temporarily holds the data blocks accessed by the host, so that most of the host's data accesses to the RAID system are served from the cache, which raises the host's data access speed. The cache is far smaller than the disks, generally no more than 1% of the disk capacity, so the data held in the cache is a subset of the data on disk.
For host data accesses to achieve a high cache hit ratio, that is, for the host to find the data block it wants in the cache, some algorithm must be used to update the cache contents. Such algorithms are based on the principle of locality of data access, which has two aspects. First, spatial locality: if a data block is accessed, the data blocks in the same stripe are likely to be accessed soon. In the prior art, spatial locality is exploited by disk prefetching algorithms. Second, temporal locality: if a data block is accessed, it is likely to be accessed again. Therefore, when data blocks are evicted from the cache, a least recently used (LRU) algorithm or a similar algorithm can be used to remove, as far as possible, the data blocks that the host will not access for the longest time, thereby improving the performance of the whole RAID system.
Host accesses to the RAID system comprise read operations and write operations on data blocks. In the existing cache management method, a host read request is processed as shown in Fig. 1, in the following steps:
Step 101: after a host read request arrives at the RAID system, determine whether the read request data block is in the cache; if so, go to step 102; otherwise go to step 103.
The read request data block is the data block that the host's read command asks for.
Step 102: fetch the read request data block from the cache and return it to the host, update the access record of this data block, and finish this read operation.
Step 103: start a disk read operation, fetch the read request data block from disk and return it to the host.
Step 104: determine whether the cache has a free area; if so, go to step 107; otherwise go to step 105.
Step 105: select a data block in the cache for eviction according to the LRU algorithm, and determine the type of the block to be evicted. If it is a read data block, go to step 107; if it is a write data block, go to step 106.
A read data block is a data block that was fetched from disk in response to a host read request and stored in the cache. A write data block is a data block that was sent by the host, or from outside the RAID system, in response to a host write request and stored in the cache.
Step 106: start a disk write operation and write the block to be evicted to disk.
The disk write operation includes reading the other data blocks of the same stripe from disk, computing the redundancy for the stripe data, and writing multiple data blocks.
Step 107: the RAID system stores the read request data block of step 101 in the cache and, according to the prefetch policy, determines whether subsequent data blocks need to be read from disk; if so, go to step 108; otherwise this read operation is finished.
On disk, data is stored in stripe units, and several stripe units make up a stripe; a same-stripe data block therefore means a data block located in the same stripe.
Step 108: start a disk read operation, fetch from disk the subsequent data blocks determined by the prefetch policy in step 107, and return to step 104.
In the prior art, host write operations to the RAID system can be implemented in two ways: write-through and write-back.
Write-through means that when a data block is written to disk in response to a host write request, it is also written to the cache. A write-through host write operation finishes only after the data block has been written to disk. With write-through, a host write request is processed as shown in Fig. 2, in the following steps:
Step 201: when a host write request arrives, determine whether the write request data block is in the cache; if so, go to step 204; otherwise go to step 202.
Step 202: determine whether the cache has a free area; if so, go to step 204; otherwise go to step 203.
Step 203: select a data block in the cache for eviction according to the LRU algorithm.
Step 204: store the write request data block in the cache and start a disk write operation to write it to disk; this write operation is then finished.
In this step, the write request data block is stored in the free area found in step 202, or in the data block selected for eviction by the LRU algorithm, overwriting that block's previous contents.
If write-through is used throughout, every write request data block is written to disk at the same time as it is written to the cache, so read data blocks and write data blocks alike can simply be discarded on eviction. However, the average service time of a host write operation is long under write-through, so this mode is seldom used.
Write-back means that the host write operation finishes as soon as the data block has been written to the disk array cache in response to the host write request. Writing the data block from the cache to disk is performed on the disk side and is not part of the host write operation. With write-back, a host write request is processed as shown in Fig. 3, in the following steps:
Step 301: when a host write request arrives, determine whether the write request data block is in the cache; if so, go to step 305; otherwise go to step 302.
Step 302: determine whether the cache has a free area; if so, go to step 305; otherwise go to step 303.
Step 303: select a data block in the cache for eviction according to the LRU algorithm, and determine the type of the block to be evicted. If it is a read data block, go to step 305; if it is a write data block, go to step 304.
Step 304: start a disk write operation and write the block to be evicted to disk.
Step 305: store the write request data block in the cache; this write operation is then finished.
As the descriptions of Fig. 2 and Fig. 3 show, the average service time of a host write request is far longer with write-through than with write-back, so host write operations are usually implemented with write-back.
In the host read/write operations described above, the disk array cache holds two types of host-accessed data blocks: read data blocks and write data blocks. A read data block is data stored in the cache by a host read operation, and a write data block is data stored in the cache by a host write operation. When the cache is full and a data block must be evicted, the LRU algorithm usually selects the block to be evicted from read data blocks and write data blocks without distinction. This approach has the following shortcomings:
(1) If the block to be evicted is a read data block, it can be discarded directly. If, however, it is a write data block that has not yet been written to disk, a disk write operation must be started in the middle of the read/write operation to write the block to disk, which lengthens the service time of that read or write.
The number of disk reads and writes that a disk write operation needs differs between redundancy levels. In RAID 5, writing one stripe unit to disk requires several disk reads and writes plus the computation of new redundancy information, and takes several or even more than ten orders of magnitude longer than simply discarding a data block.
(2) When blocks to be evicted are written from the cache to disk, whether the disk channel is busy or idle is not taken into account, which increases the average service time of host write operations and reduces the throughput of the RAID system.
(3) The blocks written to disk on eviction are selected only according to the host's access pattern, without regard to how the data blocks are striped in the RAID system. This increases the load on the disk channel and the disks and lowers the performance of the whole RAID system.
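Shortcoming (1) can be made concrete with a small sketch of a RAID 5 partial-stripe update. It assumes XOR parity over a stripe of byte-string units and follows the sequence given for step 106 (read the other units of the stripe, recompute the redundancy, write data and redundancy). The function names and the I/O counter are illustrative assumptions, not part of the original text.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together (XOR parity, assumed here)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def write_one_unit(stripe, index, new_data):
    """Update one stripe unit and recompute parity.

    `stripe` is a list of data units, a hypothetical in-memory stand-in for disks.
    Returns (new_parity, disk_ios), counting one I/O per unit read or written.
    """
    disk_ios = 0
    others = []
    for i, unit in enumerate(stripe):
        if i != index:
            others.append(unit)   # read the other units of the same stripe
            disk_ios += 1
    new_parity = xor_blocks(others + [new_data])
    stripe[index] = new_data      # write the new data unit
    disk_ios += 1
    disk_ios += 1                 # write the new parity unit
    return new_parity, disk_ios
```

With three data units per stripe, updating a single unit costs four disk I/Os in this model, whereas discarding a read data block costs none.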
Summary of the invention
In view of this, the main object of the present invention is to provide a method for managing a disk array cache for read requests that reduces the average service time of host read requests and improves the performance of the RAID system.
Another object of the present invention is to provide a method for managing a disk array cache for write requests that reduces the average service time of host write requests and improves the performance of the RAID system.
To achieve the above objects, the technical solution of the present invention is implemented as follows:
A method for managing a disk array cache for read requests, characterized in that, when a host read request arrives at the disk array cache, the method comprises the following steps:
a1. Determine whether the read request data is held in the disk array cache; if so, fetch the read request data from the disk array cache and return it to the host; otherwise fetch the read request data from disk, return it to the host, and go to step b1;
b1. Determine whether the disk array cache has a free area; if so, store the read request data in the free area, which thereby becomes a read data block; otherwise go to step c1;
c1. When the disk array cache contains read data blocks, select a read data block in the disk array cache and store the read request data in the selected read data block.
Further, a host read queue is set up in the disk array cache in advance, the host read queue containing the identification information of the read data blocks in the disk array cache. The read data block in step c1 is then selected as follows: select from the host read queue the identification information of the read data block to be evicted, and locate the read data block corresponding to that identification information in the disk array cache.
Further, the host read queue also contains the access times of the read data blocks, each access time being bound to the identification information of its data block. The read data block to be evicted is then selected as follows: determine the access time recorded in the host read queue that is furthest from the current time, and select the read data block corresponding to that access time.
Preferably, the access time is determined by using a least recently used algorithm to select the access time furthest from the current time.
Preferably, the identification information is a cache address.
After step c1, the method further comprises: determining, according to a prefetching algorithm, whether subsequent data blocks need to be read from disk; if so, returning to step b1; otherwise the host read operation ends.
Preferably, the subsequent data blocks are data blocks on the same disk track as the read request data.
A method for managing a disk array cache for write requests, characterized in that, when a host write request arrives at the disk array cache, the method comprises the following steps:
a2. Determine whether the write request data is held in the disk array cache; if so, update the data at the corresponding location with the write request data; otherwise go to step b2;
b2. Determine whether the disk array cache has a free area; if so, store the write request data in the free area, which thereby becomes a write data block; otherwise go to step c2;
c2. When the disk array cache contains read data blocks, select a read data block in the disk array cache and store the write request data in the selected read data block, which thereby becomes a write data block.
Further, a host read queue is set up in the disk array cache in advance, the host read queue containing the identification information of the read data blocks in the disk array cache. The read data block in step c2 is then selected as follows: select from the host read queue the identification information of the read data block to be evicted, and locate the read data block corresponding to that identification information in the disk array cache.
Further, the host read queue also contains the access times of the read data blocks, each access time being bound to the identification information of its data block. The read data block to be evicted is then selected as follows: determine the access time recorded in the host read queue that is furthest from the current time, and select the read data block corresponding to that access time.
Preferably, the access time is determined by using a least recently used algorithm to select the access time furthest from the current time.
Preferably, the identification information is a cache address.
Further, a host write queue is set up in the disk array cache in advance to record the identification information of the write data blocks in the disk array cache. Step a2 is then carried out as follows:
a21. Determine whether the identification information corresponding to the write request data is held in the host read queue; if so, store the write request data in the data block corresponding to that identification information and add the identification information to the host write queue; otherwise go to step a22;
a22. Determine whether the identification information corresponding to the write request data is held in the host write queue; if so, store the write request data in the data block corresponding to that identification information; otherwise go to step b2.
Further, after the write request data is stored in the free area in step b2, the method further comprises: adding the identification information of the write request data to the host write queue;
After the write request data is stored in step c2, the method further comprises: adding the identification information of the write request data to the host write queue.
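Steps a21, a22, b2 and c2 can be sketched as a small in-memory model. This is only an illustration under stated assumptions: logical block addresses stand in for the identification information, an `OrderedDict` kept in access order plays the role of the host read queue, and a block that is overwritten in step a21 is removed from the read queue when it is added to the write queue. The class and method names are hypothetical, and the behaviour when no read data block is available is not specified in the text, so the sketch simply raises an error.

```python
from collections import OrderedDict

class WriteCache:
    """Toy cache keyed by logical block address (an assumed form of identification info)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}                     # address -> contents
        self.read_queue = OrderedDict()    # host read queue, least recently used first
        self.write_queue = OrderedDict()   # host write queue, in arrival order

    def write(self, addr, contents):
        if addr in self.read_queue:            # a21: hit on a read data block
            del self.read_queue[addr]          # block becomes a write data block
            self.write_queue[addr] = True
        elif addr in self.write_queue:         # a22: hit on a write data block
            pass                               # contents are overwritten below
        elif len(self.data) < self.capacity:   # b2: a free area exists
            self.write_queue[addr] = True
        elif self.read_queue:                  # c2: reuse the least recently used read block
            victim, _ = self.read_queue.popitem(last=False)
            del self.data[victim]              # read block is discarded, no disk write
            self.write_queue[addr] = True
        else:                                  # case not covered by the text
            raise RuntimeError("no read data block available to replace")
        self.data[addr] = contents
```

In this model a host write never waits for a disk write: the only blocks replaced are read data blocks, which can be discarded.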
When identification information is added to the host write queue, the method further comprises:
d1. Search the host write queue for identification information of data blocks in the same stripe as the write request data; if any is found, write the write request data and the same-stripe data blocks to disk in a single write; otherwise go to step d2;
d2. Determine whether the length of the host write queue exceeds a preset length; if so, go to step d3; otherwise the write-to-disk procedure ends;
d3. Select a write data block whose identification information is recorded in the host write queue, find the data blocks recorded in the host write queue that are in the same stripe as that write data block, and write these data blocks to disk in a single write; the write-to-disk procedure then ends.
After the data is written to disk in a single write in step d1, the method further comprises: adding the identification information of the data block holding the write request data, and the identification information of the same-stripe data blocks, to the host read queue;
After the data is written to disk in a single write in step d3, the method further comprises: adding the identification information of the data block holding the write request data, and the identification information of the same-stripe data blocks, to the host read queue.
Before the procedure ends in step d2, the method further comprises:
d21. Determine whether the length of the host write queue exceeds a detection length; if so, go to step d22; otherwise the write-to-disk procedure ends;
d22. Determine whether the data blocks in the same stripe as the write request data are held in the disk array cache; if so, go to step d24; otherwise go to step d23;
d23. Determine whether the host write queue records any data block whose same-stripe data blocks are all held in the disk array cache; if so, go to step d24; otherwise the write-to-disk procedure ends;
d24. Write to disk in a single write the write data block whose identification information is recorded in the host write queue, together with its same-stripe data blocks recorded in the host write queue.
Preferably, the data block in step d3 is selected as follows: following the recording order of the host write queue, choose the data block whose identification information is recorded at the head of the queue.
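Steps d1 to d3 can be sketched as one function that decides, each time an address is appended to the host write queue, which blocks to write to disk together. This is a minimal sketch under assumptions the text does not fix: blocks are identified by integer logical addresses, the stripe of a block is `addr // blocks_per_stripe`, and `max_len` is the preset queue length. The optional steps d21 to d24 are omitted, and the function name is hypothetical.

```python
def flush_on_enqueue(write_queue, new_addr, blocks_per_stripe, max_len):
    """Return the addresses to write to disk in one operation, or [] if none.

    write_queue: list of logical block addresses in arrival order,
                 with new_addr already appended.
    blocks_per_stripe, max_len: assumed parameters (stripe width, preset length).
    """
    def stripe_of(addr):
        return addr // blocks_per_stripe

    def same_stripe(addr):
        return [a for a in write_queue if stripe_of(a) == stripe_of(addr)]

    group = same_stripe(new_addr)
    if len(group) > 1:                 # d1: a same-stripe block is already queued
        batch = group
    elif len(write_queue) > max_len:   # d2 -> d3: queue too long, flush from the head
        batch = same_stripe(write_queue[0])
    else:
        return []                      # d2: nothing to write yet
    for a in batch:
        write_queue.remove(a)          # flushed blocks leave the host write queue
    return batch
```

The caller would write the returned batch to disk in a single operation and then add those addresses to the host read queue, as the two paragraphs following step d3 describe.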
Further, the identification information comprises a logical address, a cache address, and the mapping between the two addresses;
The same-stripe data blocks are found as follows: from the logical address of the write data block, determine which data blocks recorded in the host write queue are in the same stripe as the write data block; then, using the mapping between logical addresses and cache addresses, locate those same-stripe data blocks in the disk array cache.
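The two-stage lookup just described (logical address to stripe, then logical address to cache address) might look like the following. The mapping is modelled as a plain dictionary and the stripe is derived as `logical_addr // blocks_per_stripe`; both are assumptions for illustration, since the text does not say how a stripe is derived from a logical address.

```python
def same_stripe_cache_addresses(write_queue, logical_addr, blocks_per_stripe):
    """Find cache addresses of queued blocks in the same stripe as logical_addr.

    write_queue: dict mapping logical address -> cache address (the recorded mapping).
    blocks_per_stripe: assumed stripe width in data blocks.
    """
    stripe = logical_addr // blocks_per_stripe
    return {
        la: ca
        for la, ca in write_queue.items()
        if la // blocks_per_stripe == stripe and la != logical_addr
    }
```

For example, with four blocks per stripe, logical addresses 8 to 11 fall in the same stripe, so a lookup for address 9 returns the cache addresses recorded for any of 8, 10 and 11 that are in the queue.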
As the above technical solution shows, in the disk array cache management method of the present invention for read and write requests, each data block in the cache is placed, according to its use, in one of three states: idle, host-read and host-write. An idle queue, a host read queue and a host write queue manage the information of the data blocks in the corresponding states. When a free area is needed in the cache to store a data block read by the host, the data blocks in different states are distinguished by means of these queues, and the LRU algorithm evicts a read data block from the host read queue. This reduces the average service time of host requests and improves the performance of the RAID system without lowering the cache hit ratio.
In another aspect, the present invention triggers the RAID system to write the write data blocks in the host write queue to disk only when a data block is added to the host write queue. This timing for writing data blocks from the cache to disk takes full account of whether the disk channel is busy or idle, so the method improves the throughput of the RAID system. During a host read operation, data has to be read from disk, so the disk channel is busy. Because the method of the present invention does not need to evict write data blocks from the cache at that point, it places no further load on the disk channel. During a host write operation the disk channel is idle, so it can readily be used to write the write data blocks in the host write queue to disk.
In a further aspect, the present invention exploits the fact that the RAID system stores data in stripes and applies a write aggregation algorithm to the write data blocks in the cache, writing all the data blocks of a stripe to disk in a single write wherever possible. This reduces the number of redundancy computations and disk reads and writes, raises the utilization of the disk channel, and so improves the performance of the whole RAID system.
Brief description of the drawings
Fig. 1 is a flowchart of host read request processing in the prior art;
Fig. 2 is a flowchart of host write request processing in write-through mode in the prior art;
Fig. 3 is a flowchart of host write request processing in write-back mode in the prior art;
Fig. 4 is a state transition diagram of the data blocks in the cache space in a preferred embodiment of the present invention;
Fig. 5 is a flowchart of host read request processing in a preferred embodiment of the present invention;
Fig. 6 is a flowchart of host write request processing in write-back mode in a preferred embodiment of the present invention;
Fig. 7 is a flowchart of writing data blocks to disk with the write aggregation algorithm in a preferred embodiment of the present invention;
Fig. 8 is a sequence diagram of writing write data blocks to disk in the prior art;
Fig. 9 is a sequence diagram of writing write data blocks to disk with the write aggregation algorithm in a preferred embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in more detail below with reference to the accompanying drawings and embodiments.
Depending on the design, the disk array cache used in the RAID system of the present invention may be divided into a level-1 cache and a level-2 cache, or into a single-port cache and a multi-port cache. The management method of the disk array cache is described below using a level-1 single-port cache as an example.
In this embodiment, the disk array cache space is divided into data blocks. Space holding no data is marked as idle, space holding a read data block is marked as host-read, and space holding a write data block is marked as host-write. Queues are set up to record the state of each data block in the cache space so that the use of the cache space can be managed; these are the idle queue, the host read queue and the host write queue. If a data block in the cache is idle, its identification information is kept in the idle queue. Likewise, if a data block is in the host-write state, its identification information is kept in the host write queue, and if it is in the host-read state, its identification information is kept in the host read queue. The identification information identifies the position of the data block in the RAID system and may include one or more of the following: the logical address at which the data block is stored on disk, the cache address at which the same data block is stored in the cache, and the mapping between the logical address and the cache address. The queues may also contain information such as the access time of each data block, with each access time bound to the corresponding identification information.
The state of a data block in the disk array cache changes as the cache space is used, as shown in Fig. 4. During a host read operation, data obtained through the prefetch process (401) may be stored in a free data block, whose state then changes from idle to host-read. If a data block in the host-read state is chosen in the selection process (402), its contents are discarded and it changes from host-read to idle. During a host write operation, the write request data may be stored in a free data block, whose state then changes from idle to host-write (403). Alternatively, if the host write request data block is already held in the cache space and that space is in the host-read state, the host updates the data in that space (404) and its state changes to host-write. If the contents of a data block in the host-write state are written to disk by a disk write operation (405), the state of the data block changes to host-read.
As the state transition diagram of Fig. 4 shows, the LRU algorithm in the present invention is used only to evict data blocks in the host read queue, all of which are in the host-read state. After a write data block has been written to disk it is added to the host read queue and takes part in LRU eviction along with the other read data blocks.
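The transitions 401 to 405 described for Fig. 4 can be written down as a lookup table. The sketch below is an illustration only: the state names follow the text, while the event names are hypothetical labels chosen for the five numbered transitions.

```python
from enum import Enum

class State(Enum):
    IDLE = "idle"
    HOST_READ = "host-read"
    HOST_WRITE = "host-write"

# (current state, event) -> next state; event names are illustrative labels
# for the numbered transitions 401-405 of Fig. 4.
TRANSITIONS = {
    (State.IDLE, "read_fill"): State.HOST_READ,               # 401
    (State.HOST_READ, "evict"): State.IDLE,                   # 402
    (State.IDLE, "host_write"): State.HOST_WRITE,             # 403
    (State.HOST_READ, "host_update"): State.HOST_WRITE,       # 404
    (State.HOST_WRITE, "flushed_to_disk"): State.HOST_READ,   # 405
}

def next_state(state, event):
    """Return the new block state, or raise if the transition is not listed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.value} on {event}") from None
```

The table has no entry that takes a host-write block directly to idle, which reflects the point made above: a write data block can only be evicted after it has been written to disk and has become a read data block.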
Based on above-mentioned state transition graph, what Fig. 5 showed is the treatment scheme of main frame read request among the present invention, and this flow process specifically comprises:
After step 501, main frame read request arrive, in whole buffer memory, search the read request data block, if exist then execution in step 502, otherwise execution in step 503.
Above-mentioned seek scope comprises that main frame reads all data blocks in formation and the main frame write queue.
Step 502, take out the read request data block return to main frame from buffer memory, and revise the Visitor Logs of this data block, this read data block operations is finished.
Step 503: start a disk read operation, take the requested data block from the disk and return it to the host.
Step 504: determine whether the cache has a free area. If so, execute step 506; otherwise execute step 505.
Step 505: select a read data block from the host read queue according to the LRU algorithm and evict it.
Because the host read queue records the access time of each data block, the read data block is evicted in this step as follows: use the LRU algorithm or a similar algorithm to select from the host read queue the read data block whose access time is furthest from the current time, and locate that block in the disk array cache by the cache address recorded for it in the queue.
Step 506: the RAID system stores the data block requested in step 501 in the cache and determines, according to the prefetch policy, whether subsequent data blocks on the disk also need to be read. If so, execute step 507; otherwise the read operation is complete.
The data is stored as follows: the RAID system writes the new content into the corresponding space in the cache, replacing the original content. In addition, when prefetching, data blocks on the same track as the requested data block can be selected on the disk, which speeds up host read operations.
Step 507: start a disk read operation, obtain from the disk the subsequent data blocks determined by the prefetch policy in step 506, and return to step 504.
In the process of Fig. 5, only read data blocks are evicted, so no disk write operation has to be started, which greatly reduces the processing time of a host read request.
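The read flow of steps 501 to 507 can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the patented implementation: the class name `ReadCacheSketch`, the use of a Python `dict` as a stand-in for the disk, the `prefetch` callback, and the use of `OrderedDict` ordering in place of recorded access times are all hypothetical choices.

```python
from collections import OrderedDict


class ReadCacheSketch:
    """Toy model of the host read flow; addresses are logical block numbers."""

    def __init__(self, capacity, disk, prefetch=None):
        self.capacity = capacity
        self.disk = disk                          # dict: address -> data (disk stand-in)
        self.prefetch = prefetch or (lambda addr: [])
        self.read_queue = OrderedDict()           # clean blocks, least recently used first
        self.write_queue = OrderedDict()          # dirty blocks, never evicted here

    def _store_read_block(self, addr, data):
        # Step 504: is there a free area?
        if len(self.read_queue) + len(self.write_queue) >= self.capacity:
            if not self.read_queue:
                return False                      # no clean block available to evict
            # Step 505: evict the least recently used clean block (no disk write).
            self.read_queue.popitem(last=False)
        # Step 506: store the block as a read data block.
        self.read_queue[addr] = data
        return True

    def read(self, addr):
        # Step 501: search both queues.
        if addr in self.read_queue:
            self.read_queue.move_to_end(addr)     # Step 502: update access record
            return self.read_queue[addr]
        if addr in self.write_queue:
            return self.write_queue[addr]
        data = self.disk[addr]                    # Step 503: read from disk
        self._store_read_block(addr, data)
        # Steps 506-507: fetch any follow-on blocks named by the prefetch policy.
        for nxt in self.prefetch(addr):
            cached = nxt in self.read_queue or nxt in self.write_queue
            if nxt in self.disk and not cached:
                self._store_read_block(nxt, self.disk[nxt])
        return data
```

For example, with a capacity of two blocks, reading addresses 0, 1, 0 and then 2 evicts block 1, because block 0 was touched more recently; no dirty block is ever chosen, so the miss path never triggers a disk write.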
Based on the state transition diagram of Fig. 4, Fig. 6 shows the flow for processing a host write request in write-back mode in the present invention. The flow comprises the following steps:
Step 601: when a host write request arrives, determine whether the data block targeted by the write request is in the host read queue. If so, execute step 605; otherwise execute step 602.
Step 602: determine whether the data block targeted by the write request is in the host write queue. If so, execute step 606; otherwise execute step 603.
Step 603: determine whether the cache has a free area. If so, execute step 605; otherwise execute step 604.
Step 604: select a data block in the host read queue according to the LRU algorithm and evict it.
Step 605: store the write request data block in the cache and add the information of this data block to the host write queue. The write operation is then complete.
Step 606: store the write request data block in the cache. The write operation is then complete.
In the process of Fig. 6, only read data blocks are evicted, so no disk write operation has to be started, which greatly reduces the processing time of a host write request.
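The write-back flow of steps 601 to 606 can be sketched in the same style. This is a hedged illustration, not the patent's code: the class name `WriteBackCacheSketch` is hypothetical, and removing a block from the host read queue when it becomes dirty is an assumption drawn from the state change to host-write.

```python
from collections import OrderedDict


class WriteBackCacheSketch:
    """Toy model of the host write flow in write-back mode."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.read_queue = OrderedDict()    # clean blocks, least recently used first
        self.write_queue = OrderedDict()   # dirty blocks, oldest first

    def write(self, addr, data):
        """Store data for addr; return the number of the final step taken."""
        if addr in self.read_queue:                      # Step 601
            del self.read_queue[addr]                    # assumed: block leaves the read queue
            self.write_queue[addr] = data                # Step 605
            return 605
        if addr in self.write_queue:                     # Step 602
            self.write_queue[addr] = data                # Step 606: overwrite in place
            return 606
        used = len(self.read_queue) + len(self.write_queue)
        if used >= self.capacity:                        # Step 603: no free area
            if not self.read_queue:
                raise RuntimeError("no clean block to evict; dirty data must be flushed first")
            self.read_queue.popitem(last=False)          # Step 604: evict LRU clean block
        self.write_queue[addr] = data                    # Step 605
        return 605
```

The request returns as soon as the data is in the cache; writing dirty blocks to disk is left to the separate flush process triggered when an entry joins the host write queue.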
In step 605 above, adding write data block information to the host write queue triggers the flow in which the write aggregation algorithm writes data blocks in the host write queue to disk. Fig. 7 shows the process of writing data blocks to disk, which comprises the following steps:
Step 701: when write data block information is added to the host write queue, check whether all data blocks in the same stripe as that data block are in the host write queue. If so, execute step 702; otherwise execute step 703.
Because the identification information comprises the logical address, the cache address and the mapping between the two, same-stripe data blocks are located as follows: from the logical address of the write data block, determine which data blocks recorded in the host write queue are in the same stripe as the write data block; then, using the mapping between logical address and cache address, locate those same-stripe data blocks in the disk array cache.
Step 702: write all data blocks of the stripe to disk in a single operation, then execute step 709.
Step 703: determine whether the length of the host write queue exceeds the preset length. If so, execute step 708; otherwise execute step 704.
Step 704: determine whether the length of the host write queue exceeds 2/3 of the preset length. If so, execute step 705; otherwise the write-to-disk process ends.
Step 705: determine whether all data blocks in the same stripe as the data block of step 701 are in the cache. If so, execute step 707; otherwise execute step 706.
Step 706: determine whether the host write queue contains a data block whose other same-stripe data blocks are all in the cache. If so, execute step 707; otherwise the write-to-disk process ends.
Step 707: write that data block in the host write queue, together with the other same-stripe data blocks that are in the host write queue, to disk in a single operation, then execute step 709.
In this case, the data written to disk in one operation is a subset of the data blocks of one stripe, namely those in the host write queue. Although only part of the stripe is written, the whole stripe's data is in the cache and can be used for the redundancy computation, so this process only needs to write to disk and does not need to read from disk.
Step 708: write the data block at the head of the queue, together with the other data blocks of the same stripe that are in the host write queue, to disk in a single operation.
In this case, if only part of the head-of-queue data block's stripe is in the host write queue, that part of the stripe is written to disk in a single operation. In addition, if not all the data of the head-of-queue data block's stripe is in the cache, the disk must be read before writing to obtain the data needed for the redundancy computation of this stripe. Even so, far fewer disk reads are needed than in the prior art, where a read is performed every time a single data block is written.
Step 709: delete the identification information of the data blocks written to disk from the host write queue and add the corresponding information to the host read queue. The write-to-disk process then ends.
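The decision logic of steps 701 to 708 can be sketched as a single function that picks which dirty blocks to write together. This is a minimal sketch under assumptions that are not in the source: a stripe is modeled as `width` consecutive logical addresses, and the names `stripe_members` and `choose_flush` are hypothetical.

```python
def stripe_members(addr, width):
    """All logical addresses in the stripe holding addr (assumed contiguous layout)."""
    first = (addr // width) * width
    return set(range(first, first + width))


def choose_flush(new_addr, write_queue, cached, width, preset_len):
    """Return (blocks_to_write, needs_disk_read), or None if nothing is flushed.

    write_queue: list of dirty block addresses, oldest first, including new_addr
    cached:      set of all block addresses held in the cache (clean and dirty)
    """
    dirty = set(write_queue)

    def dirty_part(addr):
        # Dirty blocks that share a stripe with addr.
        return sorted(stripe_members(addr, width) & dirty)

    # Steps 701-702: the whole stripe is dirty, so write it in one operation.
    if stripe_members(new_addr, width) <= dirty:
        return dirty_part(new_addr), False
    # Steps 703 and 708: queue too long, flush the head-of-queue block's stripe.
    if len(write_queue) > preset_len:
        head = write_queue[0]
        needs_read = not stripe_members(head, width) <= cached
        return dirty_part(head), needs_read
    # Step 704: below the detection length (2/3 of the preset length), do nothing.
    if len(write_queue) * 3 <= preset_len * 2:
        return None
    # Steps 705-707: prefer a stripe that is fully cached, so no disk read is needed.
    for addr in [new_addr] + list(write_queue):
        if stripe_members(addr, width) <= cached:
            return dirty_part(addr), False
    return None
```

After a flush, step 709 would move the returned addresses from the host write queue to the host read queue, so the flushed blocks become ordinary eviction candidates.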
As Fig. 7 shows, the write aggregation algorithm reduces both the number of redundancy computations and the number of reads and writes between disk and cache. Suppose that in the prior art the write data blocks are written to disk in the order shown in Fig. 8, where Sij in the logical address denotes that the data block is stored in the j-th stripe unit of the i-th stripe on disk. Writing all the stripe units of Fig. 8 to disk in that order under RAID 5 then takes 48 disk reads and writes and 12 redundancy computations. With the write aggregation algorithm the write order is as shown in Fig. 9: the four stripe units S11, S12, S14 and S13 need only one write operation, and likewise for the other stripe units, that is, the stripe units belonging to one stripe are written to disk in a single operation. Under RAID 5 this takes 15 disk reads and writes and 4 redundancy computations.
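The prior-art figure can be checked with simple arithmetic. In a RAID 5 read-modify-write, each separately written stripe unit costs four disk operations (read old data, read old parity, write new data, write new parity) and one parity computation. The sketch below assumes that Fig. 8 contains 12 stripe units, which is inferred from the stated 12 redundancy computations and is not given explicitly in the text; the function names are hypothetical.

```python
def rmw_cost(num_units):
    """Cost of writing each stripe unit separately with RAID 5 read-modify-write."""
    ios_per_unit = 4  # read old data, read old parity, write new data, write new parity
    return {"disk_ios": num_units * ios_per_unit, "parity_computations": num_units}


def full_stripe_cost(data_units_per_stripe, num_stripes):
    """Cost of writing whole stripes at once: no reads, one parity write per stripe."""
    return {
        "disk_ios": num_stripes * (data_units_per_stripe + 1),
        "parity_computations": num_stripes,
    }


# 12 separately written stripe units give the 48 disk operations and
# 12 redundancy computations quoted for the prior art.
print(rmw_cost(12))
```

`full_stripe_cost` shows the general shape of the saving for a hypothetical layout of fully aggregated stripes. The aggregated totals quoted above (15 and 4) depend on the exact grouping in Fig. 9, which is not reproduced here, so this sketch does not attempt to derive them.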
In practice, the length of the host write queue can be adjusted as circumstances require: when host write requests are frequent the host write queue length can be increased, and otherwise it can be reduced. The point in step 704 at which to start checking whether all data blocks in a data block's stripe are in the cache can also be adjusted as needed. In this embodiment the check starts when the host write queue length reaches 2/3 of the preset length; this length is called the detection length. In step 708, when the host write queue length exceeds the preset length, the data blocks in the queue need not be written to disk starting from the head of the queue; instead, an algorithm can predict the data block least likely to be rewritten in the future, and that block is written to disk.
The above is the management method of this embodiment for a single-level, single-port disk array cache. Multi-level or multi-port disk array caches can be managed by a similar method.
As the above embodiment shows, in the disk array cache management method for read/write requests of the present invention, each data block in the cache is set to one of three states, idle, host-read or host-write, according to how it is used. When the cache needs a free area to store data blocks read or written by the host, the LRU algorithm or another related algorithm evicts a data block in the host-read state, thereby shortening the average service time of host requests and improving RAID system performance without lowering the cache hit rate.

Claims (19)

1. A method for managing a disk array cache for read requests, characterized in that, when a host read request arrives at the disk array cache, the method comprises the following steps:
a1. determining whether the read request data is stored in the disk array cache; if so, obtaining the read request data from the disk array cache and returning it to the host; otherwise obtaining the read request data from the disk, returning it to the host, and executing step b1;
b1. determining whether the disk array cache has a free area; if so, storing the read request data in the free area, whereby the free area storing the read request data becomes a read data block; otherwise executing step c1;
c1. when read data blocks exist in the disk array cache, selecting a read data block in the disk array cache and storing the read request data in the selected read data block.
2. The method according to claim 1, characterized in that a host read queue is set up in the disk array cache in advance, the host read queue containing the identification information of the read data blocks in the disk array cache, and the method of selecting the read data block in step c1 is: selecting from the host read queue the identification information of the read data block to be evicted, and locating in the disk array cache the read data block corresponding to that identification information.
3. The method according to claim 2, characterized in that the host read queue also contains the access times of the read data blocks, each access time being bound to the identification information of the corresponding data block, and the method of selecting the read data block to be evicted is: determining the access time recorded in the host read queue that is furthest from the current time, and selecting the read data block corresponding to that access time.
4. The method according to claim 3, characterized in that the method of determining the access time is: using a least recently used algorithm to select the access time furthest from the current time.
5. The method according to any one of claims 2 to 4, characterized in that the identification information is a cache address.
6. The method according to claim 1, characterized in that, after step c1, the method further comprises: determining according to a prefetch algorithm whether subsequent data blocks on the disk need to be read; if so, returning to step b1; otherwise the host read process ends.
7. The method according to claim 6, characterized in that the subsequent data blocks are data blocks on the same disk track as the read request data.
8. A method for managing a disk array cache for write requests, characterized in that, when a host write request arrives at the disk array cache, the method comprises the following steps:
a2. determining whether the write request data is stored in the disk array cache; if so, updating the data at the corresponding position with the write request data; otherwise executing step b2;
b2. determining whether the disk array cache has a free area; if so, storing the write request data in the free area, whereby the free area storing the write request data becomes a write data block; otherwise executing step c2;
c2. when read data blocks exist in the disk array cache, selecting a read data block in the disk array cache, storing the write request data in the selected read data block, and changing the selected read data block into a write data block.
9. The method according to claim 8, characterized in that a host read queue is set up in the disk array cache in advance, the host read queue containing the identification information of the read data blocks in the disk array cache, and the method of selecting the read data block in step c2 is: selecting from the host read queue the identification information of the read data block to be evicted, and locating in the disk array cache the read data block corresponding to that identification information.
10. The method according to claim 9, characterized in that the host read queue also contains the access times of the read data blocks, each access time being bound to the identification information of the corresponding data block, and the method of selecting the read data block to be evicted is: determining the access time recorded in the host read queue that is furthest from the current time, and selecting the read data block corresponding to that access time.
11. The method according to claim 10, characterized in that the method of determining the access time is: using a least recently used algorithm to select the access time furthest from the current time.
12. The method according to any one of claims 9 to 11, characterized in that the identification information is a cache address.
13. The method according to claim 9, characterized in that a host write queue is set up in the disk array cache in advance to record the identification information of the write data blocks in the disk array cache, and step a2 is performed as follows:
a21. determining whether the identification information corresponding to the write request data is stored in the host read queue; if so, storing the write request data in the data block corresponding to the identification information and adding this identification information to the host write queue; otherwise executing step a22;
a22. determining whether the identification information corresponding to the write request data is stored in the host write queue; if so, storing the write request data in the data block corresponding to the identification information; otherwise executing step b2.
14. The method according to claim 13, characterized in that, after the write request data is stored in the free area in step b2, the method further comprises: adding the identification information of the write request data to the host write queue;
and, after the write request data is stored in step c2, the method further comprises: adding the identification information of the write request data to the host write queue.
15. The method according to claim 13 or 14, characterized in that, when identification information is added to the host write queue, the method further comprises:
d1. searching the host write queue to determine whether it contains the identification information of the same-stripe data blocks of the write request data; if so, writing the write request data and the same-stripe data blocks to disk in a single operation; otherwise executing step d2;
d2. determining whether the host write queue length exceeds the preset length; if so, executing step d3; otherwise the write-to-disk operation ends;
d3. selecting a write data block corresponding to identification information recorded in the host write queue, finding the data blocks in the same stripe as that write data block that are recorded in the host write queue, and writing these data blocks to disk in a single operation, whereupon the write-to-disk operation ends.
16. The method according to claim 15, characterized in that, after the data is written to disk in a single operation in step d1, the method further comprises: adding the identification information of the data block holding the write request data and the identification information of the same-stripe data blocks to the host read queue;
and, after the data is written to disk in a single operation in step d3, the method further comprises: adding the identification information of the data block holding the write request data and the identification information of the same-stripe data blocks to the host read queue.
17. The method according to claim 15, characterized in that, before the operation ends in step d2, the method further comprises:
d21. determining whether the host write queue length exceeds the detection length; if so, executing step d22; otherwise the write-to-disk operation ends;
d22. determining whether the same-stripe data blocks of the write request data are stored in the disk array cache; if so, executing step d24; otherwise executing step d23;
d23. determining whether the host write queue records a data block whose same-stripe data blocks are all stored in the disk array cache; if so, executing step d24; otherwise the write-to-disk operation ends;
d24. writing the write data block corresponding to the identification information recorded in the host write queue, together with its same-stripe data blocks recorded in the host write queue, to disk in a single operation.
18. The method according to claim 15, characterized in that the method of selecting the data block in step d3 is: selecting, according to the recording order of the host write queue, the data block recorded by the identification information at the head of the queue.
19. The method according to claim 15, characterized in that the identification information comprises a logical address, a cache address and the mapping between the two addresses;
and the method of finding the same-stripe data blocks is: from the logical address of the write data block, determining which data blocks recorded in the host write queue are in the same stripe as the write data block; then, using the mapping between logical address and cache address, locating those same-stripe data blocks in the disk array cache.
CNB2005100842693A 2005-07-15 2005-07-15 Method for managing magnetic disk array buffer storage Active CN100362462C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100842693A CN100362462C (en) 2005-07-15 2005-07-15 Method for managing magnetic disk array buffer storage

Publications (2)

Publication Number Publication Date
CN1862475A CN1862475A (en) 2006-11-15
CN100362462C true CN100362462C (en) 2008-01-16

Family

ID=37389919

Country Status (1)

Country Link
CN (1) CN100362462C (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101673188B (en) * 2008-09-09 2011-06-01 上海华虹Nec电子有限公司 Data access method for solid state disk
CN101751225B (en) * 2008-12-04 2011-12-14 上海华虹Nec电子有限公司 Data access method of hybrid hard drive
CN101520743B (en) * 2009-04-17 2010-12-08 杭州华三通信技术有限公司 Data storage method, system and device based on copy-on-write
CN102063264B (en) * 2009-11-18 2012-08-29 成都市华为赛门铁克科技有限公司 Data processing method, equipment and system
CN101794259B (en) * 2010-03-26 2012-05-30 成都市华为赛门铁克科技有限公司 Data storage method and device
CN102253810B (en) * 2010-05-17 2014-02-05 深圳市世纪光速信息技术有限公司 Method, apparatus and system used for reading data
CN101937321B (en) * 2010-09-15 2013-08-21 中兴通讯股份有限公司 Method and device for realizing mixed buffer
CN103823634B (en) * 2012-11-16 2017-12-12 腾讯科技(深圳)有限公司 A kind of data processing method and system supported without random WriteMode
CN103136121B (en) * 2013-03-25 2014-04-16 中国人民解放军国防科学技术大学 Cache management method for solid-state disc
CN103488582B (en) * 2013-09-05 2017-07-28 华为技术有限公司 Write the method and device of cache memory
CN104484287B (en) * 2014-12-19 2017-05-17 北京麓柏科技有限公司 Nonvolatile cache realization method and device
CN105022697A (en) * 2015-05-19 2015-11-04 江苏蓝深远望系统集成有限公司 Disk cache based virtual optical jukebox storage system replacement algorithm
CN106557277B (en) * 2015-09-30 2019-07-19 成都华为技术有限公司 The reading method and device of disk array
US10254999B2 (en) * 2016-03-31 2019-04-09 EMC IP Holding Company LLC Method and system for optimistic flow control for push-based input/output with buffer stealing
CN106528447A (en) * 2016-10-25 2017-03-22 郑州云海信息技术有限公司 Cache synchronization method for distributed SAN (Storage Area Network)
CN109213430B (en) * 2017-06-30 2021-09-10 伊姆西Ip控股有限责任公司 Storage management method and system
CN110716689B (en) * 2018-07-11 2023-05-26 阿里巴巴集团控股有限公司 Data processing method and device and computing equipment
CN109582222B (en) * 2018-10-31 2020-11-24 华中科技大学 Method for cleaning persistent cache in host sensing tile recording disk
CN110764697B (en) * 2019-09-29 2023-08-29 望海康信(北京)科技股份公司 Data management method and device
CN111581118B (en) * 2019-12-31 2021-04-13 北京忆芯科技有限公司 Computing acceleration system
CN117234430B (en) * 2023-11-13 2024-02-23 苏州元脑智能科技有限公司 Cache frame, data processing method, device, equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5761717A (en) * 1992-06-04 1998-06-02 Emc Corporation System and method for determining what position in cache memory to store data elements
US6282617B1 (en) * 1999-10-01 2001-08-28 Sun Microsystems, Inc. Multiple variable cache replacement policy

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
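Modern Operating Systems (现代操作系统). Tanenbaum, 76, China Machine Press (机械工业出版社). 1999 *
Research on Disk Cache Management Mechanisms (磁盘缓存管理机制研究). Zhu Ping (朱平), Wu Biwei (吴碧伟). Computer Engineering and Applications (计算机工程与应用), Vol. 2004, No. 20. 2004 *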

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant