CN102063271A - State machine based write back method for external disk Cache - Google Patents

Info

Publication number
CN102063271A
Authority
CN
China
Prior art keywords
dirty
disk
data
cache
write
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010106115142A
Other languages
Chinese (zh)
Other versions
CN102063271B (en)
Inventor
袁清波
骆志军
邵宗有
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Dawning Information Industry Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN201010611514.2A
Publication of CN102063271A
Application granted
Publication of CN102063271B
Legal status: Active
Anticipated expiration

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a write-back mechanism based on the application access pattern, the disk state, and the external Cache state. The mechanism uses a state machine to write back DIRTY blocks: CLEAN indicates that an external disk Cache block holds no dirty data, DIRTY indicates that the block holds dirty data, and FLUSH indicates that a write-back flow is processing the Cache block, for the entire duration of that processing. The mechanism reflects the user's requirements while ensuring that dirty data is written back to the disk, so that a large amount of data not yet written back is not lost. It also takes the overall performance of the system into account: when a certain amount of sequential reading and writing is in progress, the system does not issue extra write-back operations that would disturb those data streams, which improves IO (Input/Output) performance.

Description

A state-machine-based write-back method for an external disk Cache
Technical field
The present invention relates to the field of external disk Cache optimization, and in particular to a method that controls write-back of an external disk Cache by combining the application access pattern, the disk load, and the load of the external disk Cache.
Background technology
The data transfer rate of a disk is the speed at which the disk reads and writes data, and comprises an internal transfer rate and an external transfer rate. The internal transfer rate, also called the sustained transfer rate, reflects disk performance when the disk buffer is not in use and depends mainly on the rotational speed of the disk. The external transfer rate, also called the burst data transfer rate or interface transfer rate, is nominally the data transfer rate between the system bus and the disk buffer, and depends on the hard disk interface type and the size of the hard disk cache. The two rates differ because of a high-speed device called the disk buffer, which disk manufacturers add inside the disk to improve read/write performance. It is in fact a memory chip on the hard disk controller with a very high access speed, and it serves as a buffer between the disk's internal storage and the external interface. Because the internal data transfer speed of the hard disk differs from the speed of the external interface, the cache acts as a buffer between the two. The size and speed of the cache directly affect the transfer speed of the hard disk and can significantly improve its overall performance. When the hard disk accesses fragmented data, data must be exchanged continually between the hard disk and memory; with a large cache, such fragmented data can be held temporarily in the cache, which reduces the load on the external system and raises the data transfer speed.
However, limited by the hardware structure of the disk, its on-board buffer cannot be very large, so moving the disk Cache from inside the disk to the outside readily solves the capacity problem. Free of the space constraint, an external disk Cache can reach several GB or even tens of GB, and a cache of this size greatly improves the IO performance of the whole system. A large cache nevertheless brings a serious problem: a large amount of modified disk content resides in the Cache rather than on the disk, where the user believes it to be. When an unexpected event such as a power failure occurs, it is therefore difficult to guarantee consistency between the data in the external disk Cache and the data on the disk.
Operations on disk data in a Linux system fall into two classes: Direct IO and Buffered IO. Direct IO bypasses the kernel cache: data is written directly from user-space memory to the disk, or read from the disk directly into a user-space buffer. This approach replaces the kernel cache with a cache designed by the user and is generally used in database systems. Buffered IO is the more widely used of the two: user access to disk data must first pass through the kernel cache. On a read, if the data is cached in the kernel, it is read directly from kernel memory, which is far faster than reading from the slow disk. On a write, a region of memory is first allocated in the kernel and the data is written into it; the data is written to the disk at a suitable later time, which involves a number of write-back mechanisms. Because the data the user intends to write to the disk has not actually been written there and is only held temporarily in kernel space, disk data held in memory is lost entirely if the system crashes. Linux uses dedicated background threads to periodically write the disk data held in the kernel back to the disk.
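The gap between "written" and "on disk" under Buffered IO can be illustrated from user space. The following is a minimal sketch, not part of the patent: the helper names `durable_write` and `demo` are hypothetical, and the example only shows a buffered write followed by an explicit request to flush the kernel's cached pages, rather than waiting for the background write-back threads.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> int:
    """Write payload through the kernel cache, then request a flush to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # With Buffered IO the data first lands in kernel memory only.
        written = os.write(fd, payload)
        # Ask the kernel to write the cached data back now instead of later.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


def demo() -> bytes:
    """Round-trip a small payload through a temporary file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "block.bin")
        durable_write(path, b"dirty data")
        with open(path, "rb") as fh:
            return fh.read()
```

Without the `os.fsync` call the write would still succeed, but the data could remain only in kernel memory for some time, which is the window of potential loss described above.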
Summary of the invention
The present invention takes multiple factors into account: it neither interferes with the normal execution of application programs nor leaves dirty data in the external disk Cache for long periods. It proposes a write-back mechanism based on the application access pattern and the states of the disk and the external Cache.
A state-machine-based write-back method for an external disk Cache, in which the state machine write-back process is as follows:
A. First allocate an xinfo structure and an rbio; if this succeeds, go to B. Otherwise set the state to DIRTY: if the operation succeeds, this is the transition from FLUSH to DIRTY; if it fails, this is the transition from DIRTY to DIRTY;
B. If resource allocation succeeds, examine the current dirty-data bitmap, determine its left and right boundaries and whether the dirty data forms a single contiguous region, and then take one of the following paths:
B1. If there is a single contiguous region, mark it as contiguous in the Cache block or in xinfo, and set the left and right boundaries and the size of the rbio;
B2. If there are multiple contiguous regions, set the left and right boundaries to the left and right boundaries of the Cache block;
C. Set the other relevant fields of xinfo and the rbio, and send the rbio to the external disk Cache;
D. After the Cache block data has been read back, there are two cases:
D1. If the dirty data in it forms a single contiguous region, modify the relevant fields of the bio and forward it to the disk;
D2. If the dirty data in it is not contiguous, first allocate an rbio, set up this bio, and read the whole data block from the disk;
E. Transition from DIRTY to DIRTY, or from FLUSH to DIRTY;
F. After the disk data has been read back, merge it with the data previously read from the external disk Cache block and write the result back to the disk;
G. Transition from DIRTY to DIRTY, or from FLUSH to DIRTY;
H. If writing to the disk fails, transition from DIRTY or FLUSH to DIRTY;
I. Otherwise, transition from FLUSH to CLEAN; if that fails, transition from DIRTY to DIRTY;
J. Release the corresponding resources;
Here CLEAN indicates that the external disk Cache block holds no dirty data; DIRTY indicates that the Cache block holds dirty data; FLUSH indicates that a write-back flow is processing the Cache block, for the entire duration of that processing.
A preferred technical scheme of the present invention is as follows: before a write-back operation, the write-back policy is checked first, and write-back proceeds according to that policy. The write-back policies comprise: never writing dirty blocks back to the disk automatically; writing dirty blocks back when there are no write operations; writing dirty blocks back when there are neither read nor write operations; writing dirty blocks back when there is no sequential read/write or the sequential read/write does not exceed a certain amount; and writing dirty blocks back whenever any exist.
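The policy check that precedes a write-back can be sketched as a small decision function. This is an illustration under stated assumptions rather than the patented implementation: it reads the preferred scheme as listing five policies (never automatic, no writes pending, no reads or writes pending, little or no sequential IO, always), and the names `Policy`, `IoSnapshot`, `may_write_back`, and the `seq_threshold` default are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Policy(Enum):
    NEVER_AUTO = auto()      # never write dirty blocks back automatically
    NO_WRITES = auto()       # write back when no write operations are pending
    NO_IO = auto()           # write back when neither reads nor writes are pending
    LOW_SEQUENTIAL = auto()  # write back when sequential IO is absent or small
    ALWAYS = auto()          # write back whenever a dirty block exists


@dataclass
class IoSnapshot:
    """Hypothetical view of the current IO load on the cached disk."""
    pending_reads: int
    pending_writes: int
    sequential_ios: int


def may_write_back(policy: Policy, io: IoSnapshot, seq_threshold: int = 8) -> bool:
    """Decide whether a dirty block may be written back under `policy`."""
    if policy is Policy.NEVER_AUTO:
        return False
    if policy is Policy.NO_WRITES:
        return io.pending_writes == 0
    if policy is Policy.NO_IO:
        return io.pending_reads == 0 and io.pending_writes == 0
    if policy is Policy.LOW_SEQUENTIAL:
        # Assumed threshold: the source only says "a certain amount".
        return io.sequential_ios <= seq_threshold
    return True  # Policy.ALWAYS
```

For example, `may_write_back(Policy.NO_WRITES, IoSnapshot(2, 0, 0))` allows write-back while reads are still pending, whereas `Policy.NO_IO` would defer it.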
The present invention reflects the user's requirements while ensuring that dirty data has an opportunity to be written back to the disk, so that a large amount of data not yet written back is not lost when a serious unexpected failure occurs. The invention also takes full account of overall system performance: when a certain amount of sequential reading and writing is in progress, the system does not perform extra write-back operations that would disturb those data streams, which improves IO performance.
Description of drawings
Fig. 1 is a structural diagram of the external disk Cache
Fig. 2 shows the state machine transitions
Embodiment
The present invention is a write-back mechanism based on the application access pattern and the states of the disk and the external Cache.
Application access pattern: the user can configure one of several write-back policies as needed, for example never writing dirty data back automatically, writing back only when there are no IO requests, or writing back whenever dirty data exists regardless of circumstances. The system follows a different processing flow depending on the user's choice;
Disk state: when the disk is idle, some of the dirty data in the external disk Cache can be selected and written back to the disk. This completes the flush, avoids possible data loss, and makes full use of the disk bandwidth, so it is highly cost-effective;
External Cache state: if the external Cache holds too much dirty data, part of it must be written back to the disk. In particular, when no Cache block is available and an existing block must be replaced, if part of the data being replaced is dirty, that data should be written back to the disk immediately so that other flows can use the space.
Several write-back policies can be configured according to the application's access pattern. These policies are applied when the system finds that an external disk Cache block contains dirty data: once the user has set a write-back policy, only Cache blocks that satisfy the policy are written back to the disk.
When control of a Cache block is released, the system checks how busy the disk corresponding to the current external disk Cache device is and how many Cache blocks contain dirty data; if the disk is not busy, a portion of the Cache blocks can be selected and written back to the disk. In addition, because the capacity of the external disk Cache is smaller than that of the corresponding disk, a Cache block that already holds data must be replaced when no free Cache block is available. If the data in this block is fully consistent with the data on the disk, the space can be reused directly; if the Cache block contains data that is not present on the disk, that dirty data must first be written back to the disk.
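The two triggers just described, opportunistic flushing while the disk is idle and forced flushing when a victim block must be replaced, can be sketched as follows. This is a simplified model, not the patent's code: `Block`, `pick_idle_flush`, `reclaim`, and the `batch` size are hypothetical names and parameters.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Block:
    """Hypothetical external Cache block with a single dirty flag."""
    block_id: int
    dirty: bool = False


def pick_idle_flush(blocks: List[Block], disk_busy: bool, batch: int = 2) -> List[int]:
    """When the disk is idle, choose up to `batch` dirty blocks to write back."""
    if disk_busy:
        return []
    return [b.block_id for b in blocks if b.dirty][:batch]


def reclaim(victim: Block, write_back: Callable[[Block], None]) -> bool:
    """Prepare a victim block for reuse; return True if a flush was needed."""
    if not victim.dirty:
        # Cache and disk agree, so the space can be reused directly.
        return False
    # The block holds data the disk does not have: flush before reuse.
    write_back(victim)
    victim.dirty = False
    return True
```

How many blocks to flush per idle period is a tuning choice the source leaves open; the `batch` parameter merely stands in for it.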
The operations above determine which Cache blocks need to be written back to the disk. Based on these marks, the flow responsible for the actual write-back first reads all dirty data from the external disk Cache (because the external disk Cache is not in the memory address space, it cannot be instructed to write to the disk directly). If the dirty data forms a single contiguous region, only that region is read and, once it has been read back without error, it is written directly to the disk. Otherwise the whole Cache block is read (this can be optimized to read only from the leftmost boundary to the rightmost boundary of the dirty regions); once it has been read back without error, the corresponding whole data block is read from the disk, the up-to-date data in the Cache block is copied into the bio returned from the disk according to the dirty-data bitmap, and this bio is then written back to the disk. One point requires attention when detecting a single contiguous region: the dirty-data bitmap is not the only thing to check. If the Cache block is buffered, the smallest region containing all dirty regions should be read directly from the Cache and then written directly to the disk.
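The bitmap inspection and the bitmap-driven merge described above can be sketched in a few lines. This is an illustrative model that assumes one bitmap entry per sector and represents sectors as list elements; `dirty_extent` and `merge_by_bitmap` are hypothetical helper names, not identifiers from the patent.

```python
from typing import List, Optional, Sequence, Tuple


def dirty_extent(bitmap: Sequence[int]) -> Optional[Tuple[int, int, bool]]:
    """Return (left, right, single_region) for a dirty bitmap, or None if clean.

    `left` and `right` are the inclusive indices of the first and last dirty
    sectors; `single_region` is True when every sector between them is dirty.
    """
    dirty = [i for i, bit in enumerate(bitmap) if bit]
    if not dirty:
        return None
    left, right = dirty[0], dirty[-1]
    return left, right, (right - left + 1) == len(dirty)


def merge_by_bitmap(disk_sectors: Sequence, cache_sectors: Sequence,
                    bitmap: Sequence[int]) -> List:
    """Overlay dirty Cache sectors onto the block read back from the disk."""
    return [cache if bit else disk
            for disk, cache, bit in zip(disk_sectors, cache_sectors, bitmap)]
```

With the bitmap `[1, 0, 0, 1]` the extent is `(0, 3, False)`, so the whole block is read from both sides and merged; with `[0, 1, 1, 0]` the extent is `(1, 2, True)` and only sectors 1 to 2 need to travel from the Cache to the disk.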
The three states involved in write-back are CLEAN, DIRTY, and FLUSH. CLEAN indicates that the external disk Cache block holds no dirty data; note that this state does not imply that every sector of the Cache block contains valid data (for example, only a partial write may have occurred earlier). DIRTY indicates that the Cache block holds dirty data. FLUSH indicates that a write-back flow is processing the Cache block, for the entire duration of that processing (provided it is not interrupted by a write operation). The transitions among these three states under all circumstances are shown in Fig. 2.
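A minimal sketch of the three states and the two state-changing primitives follows. The source names `set_state` and `test_and_set_state` but does not give their signatures or return values, so the ones used here are assumptions: `set_state` reports whether the state actually changed, and `test_and_set_state` changes the state only if it currently equals the expected value. The lock and the `CacheBlock` class are likewise hypothetical.

```python
import threading
from enum import Enum


class State(Enum):
    CLEAN = "CLEAN"  # no dirty data in the block
    DIRTY = "DIRTY"  # the block holds dirty data
    FLUSH = "FLUSH"  # a write-back flow is processing the block


class CacheBlock:
    def __init__(self, state: State = State.CLEAN) -> None:
        self._state = state
        self._lock = threading.Lock()

    @property
    def state(self) -> State:
        with self._lock:
            return self._state

    def set_state(self, new: State) -> bool:
        """Set the state unconditionally; return True if it changed."""
        with self._lock:
            changed = self._state is not new
            self._state = new
            return changed

    def test_and_set_state(self, expected: State, new: State) -> bool:
        """Move to `new` only if the block is currently in `expected`."""
        with self._lock:
            if self._state is not expected:
                return False
            self._state = new
            return True
```

The conditional form matters at the end of a flush: if a write re-dirtied the block while the flush was running, `test_and_set_state(State.FLUSH, State.CLEAN)` fails and the block stays DIRTY instead of being wrongly marked CLEAN.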
The detailed processing flow according to this state machine is as follows:
1. First allocate an xinfo structure and an rbio; if this succeeds, go to step 2. Otherwise call set_state(DIRTY): success corresponds to the F2(!) transition from FLUSH to DIRTY in the state machine, and failure corresponds to the F2(!) transition from DIRTY to DIRTY. Note that an F2 or F2(!) transition from CLEAN to DIRTY can never occur;
2. Resource allocation has succeeded at this point. Examine the current dirty-data bitmap, determine its left and right boundaries and whether the dirty data forms a single contiguous region, and then take one of the following paths:
a) If there is a single contiguous region, mark it as contiguous in the Cache block or in xinfo, and set the left and right boundaries and the size of the rbio;
b) If there are multiple contiguous regions, set the left and right boundaries to the left and right boundaries of the Cache block;
3. Set the other relevant fields of xinfo and the rbio, and send the rbio to the external disk Cache;
4. After the Cache block data has been read back, there are two cases:
a) If the dirty data in it forms a single contiguous region, modify the relevant fields of the bio and forward it to the disk;
b) If the dirty data in it is not contiguous, first allocate an rbio, set up this bio, and read the whole data block from the disk;
5. set_state(DIRTY), which corresponds to the F2(!) transition from DIRTY to DIRTY, or the F2(!) transition from FLUSH to DIRTY, in the state machine;
6. After the disk data has been read back, merge it with the data previously read from the external disk Cache block and write the result back to the disk;
7. set_state(DIRTY), which corresponds to the F2(!) transition from DIRTY to DIRTY, or the F2(!) transition from FLUSH to DIRTY, in the state machine;
8. If writing to the disk fails, call set_state(DIRTY), which corresponds to the F3 (disaster) transition from DIRTY or FLUSH to DIRTY in the state machine;
9. Otherwise call test_and_set_state(FLUSH, CLEAN): success corresponds to the F3 transition from FLUSH to CLEAN in the state machine, and failure corresponds to the F3 transition from DIRTY to DIRTY;
10. Release the corresponding resources.
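The flow above can be condensed into a small in-memory simulation. It is a sketch under several assumptions, not the patented implementation: the disk and the Cache block are plain Python lists of sectors, the xinfo/rbio/bio allocations of steps 1, 3, and 10 are omitted (so the failure paths of steps 1, 5, and 7 are not modelled), a write failure is injected through the hypothetical `fail_write` flag, and clearing the bitmap on success is an added detail. `Block` and `write_back` are hypothetical names.

```python
from enum import Enum


class State(Enum):
    CLEAN = "CLEAN"
    DIRTY = "DIRTY"
    FLUSH = "FLUSH"


class Block:
    def __init__(self, sectors, bitmap):
        self.sectors = list(sectors)  # data held in the external Cache
        self.bitmap = list(bitmap)    # 1 marks a dirty sector
        self.state = State.FLUSH      # a write-back flow owns the block

    def test_and_set_state(self, expected, new):
        if self.state is not expected:
            return False
        self.state = new
        return True


def write_back(block, disk, start, fail_write=False):
    """Flush `block` to `disk` (a list of sectors) at sector offset `start`."""
    dirty = [i for i, bit in enumerate(block.bitmap) if bit]
    if not dirty:
        raise ValueError("block has no dirty sectors")
    left, right = dirty[0], dirty[-1]
    single = (right - left + 1) == len(dirty)       # step 2: one region?
    if not single:
        left, right = 0, len(block.sectors) - 1     # step 2b: whole block
    from_cache = block.sectors[left:right + 1]      # steps 3-4: read Cache
    if single:
        payload = from_cache                        # step 4a: forward as is
    else:
        from_disk = disk[start:start + len(block.sectors)]  # step 4b
        payload = [c if bit else d                  # step 6: merge by bitmap
                   for d, c, bit in zip(from_disk, from_cache, block.bitmap)]
    if fail_write:                                  # step 8: write error
        block.state = State.DIRTY
        return block.state
    disk[start + left:start + right + 1] = payload
    if block.test_and_set_state(State.FLUSH, State.CLEAN):  # step 9
        block.bitmap = [0] * len(block.bitmap)      # assumption: reset bitmap
    return block.state
```

If a concurrent write moved the block back to DIRTY during the flush, step 9 fails and the block remains DIRTY with its bitmap intact, so a later flush picks up the newer data.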

Claims (2)

1. A state-machine-based write-back method for an external disk Cache, characterized in that the state machine write-back process is as follows:
A. First allocate an xinfo structure and an rbio; if this succeeds, go to B. Otherwise set the state to DIRTY: if the operation succeeds, this is the transition from FLUSH to DIRTY; if it fails, this is the transition from DIRTY to DIRTY;
B. If resource allocation succeeds, examine the current dirty-data bitmap, determine its left and right boundaries and whether the dirty data forms a single contiguous region, and then take one of the following paths:
B1. If there is a single contiguous region, mark it as contiguous in the Cache block or in xinfo, and set the left and right boundaries and the size of the rbio;
B2. If there are multiple contiguous regions, set the left and right boundaries to the left and right boundaries of the Cache block;
C. Set the other relevant fields of xinfo and the rbio, and send the rbio to the external disk Cache;
D. After the Cache block data has been read back, there are two cases:
D1. If the dirty data in it forms a single contiguous region, modify the relevant fields of the bio and forward it to the disk;
D2. If the dirty data in it is not contiguous, first allocate an rbio, set up this bio, and read the whole data block from the disk;
E. Transition from DIRTY to DIRTY, or from FLUSH to DIRTY;
F. After the disk data has been read back, merge it with the data previously read from the external disk Cache block and write the result back to the disk;
G. Transition from DIRTY to DIRTY, or from FLUSH to DIRTY;
H. If writing to the disk fails, transition from DIRTY or FLUSH to DIRTY;
I. Otherwise, transition from FLUSH to CLEAN; if that fails, transition from DIRTY to DIRTY;
J. Release the corresponding resources;
Here CLEAN indicates that the external disk Cache block holds no dirty data; DIRTY indicates that the Cache block holds dirty data; FLUSH indicates that a write-back flow is processing the Cache block, for the entire duration of that processing.
2. The state-machine-based write-back method for an external disk Cache according to claim 1, characterized in that: before a write-back operation, the write-back policy is checked first, and write-back proceeds according to that policy; the write-back policies comprise never writing dirty blocks back to the disk automatically, writing dirty blocks back when there are no write operations, writing dirty blocks back when there are neither read nor write operations, writing dirty blocks back when there is no sequential read/write or the sequential read/write does not exceed a certain amount, and writing dirty blocks back whenever any exist.
CN201010611514.2A 2010-12-17 2010-12-17 State machine based write back method for external disk Cache Active CN102063271B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010611514.2A CN102063271B (en) 2010-12-17 2010-12-17 State machine based write back method for external disk Cache

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010611514.2A CN102063271B (en) 2010-12-17 2010-12-17 State machine based write back method for external disk Cache

Publications (2)

Publication Number Publication Date
CN102063271A true CN102063271A (en) 2011-05-18
CN102063271B CN102063271B (en) 2014-08-13

Family

ID=43998565

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010611514.2A Active CN102063271B (en) 2010-12-17 2010-12-17 State machine based write back method for external disk Cache

Country Status (1)

Country Link
CN (1) CN102063271B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010020260A1 (en) * 2000-01-27 2001-09-06 Atsushi Kanamaru Method and system of reading and writing data by a disk drive apparatus
CN1617110A (en) * 2003-11-12 2005-05-18 华为技术有限公司 Method for rewriting in magnetic disc array structure
CN101373417A (en) * 2007-08-22 2009-02-25 株式会社日立制作所 Storage system having function to backup data in cache memory

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102681795A (en) * 2012-05-02 2012-09-19 无锡众志和达存储技术有限公司 Method for data input/output (I/O) read-in in small computer system interface (SCSI) Target mode of Linux system
CN103577349A (en) * 2013-11-06 2014-02-12 华为技术有限公司 Method and device for selecting data from cache to write dirty data into hard disk
CN103577349B (en) * 2013-11-06 2016-11-23 华为技术有限公司 Select the method and apparatus that data carry out brush in the caches
WO2018141304A1 (en) * 2017-02-06 2018-08-09 中兴通讯股份有限公司 Flash file system and data management method thereof
CN113625937A (en) * 2020-05-09 2021-11-09 鸿富锦精密电子(天津)有限公司 Storage resource processing device and method
CN113625937B (en) * 2020-05-09 2024-05-28 富联精密电子(天津)有限公司 Storage resource processing device and method

Also Published As

Publication number Publication date
CN102063271B (en) 2014-08-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220727

Address after: 100193 No. 36 Building, No. 8 Hospital, Wangxi Road, Haidian District, Beijing

Patentee after: Dawning Information Industry (Beijing) Co.,Ltd.

Patentee after: DAWNING INFORMATION INDUSTRY Co.,Ltd.

Address before: 100084 Beijing Haidian District City Mill Street No. 64

Patentee before: Dawning Information Industry (Beijing) Co.,Ltd.