Summary of the invention
The object of the present invention is to provide a method and an apparatus for storing and reading file data. They address the problem that prior-art data tiering methods can improve the read/write response speed of only a very limited number of files. With the same storage medium configuration, more files gain improved read/write response speed, which significantly improves the overall input/output performance of the storage system.
To achieve this object, the invention provides a method for storing and reading file data, comprising the steps of: A. arranging the different kinds of storage media present in the storage system in order of response time from shortest to longest, and calculating in turn the differential time between the response time of the current kind of storage medium and that of the next kind; B. calculating in turn the data volume that can be transmitted at full bandwidth within each differential time, and setting and saving the result as the per-write data volume upper limit of the corresponding current storage medium, the storage medium ranked last having no per-write data volume upper limit; C. when a write request occurs, storing the data of the file to be stored into the storage media in turn according to the per-write data volume upper limit of each kind of storage medium; D. when a read request occurs, sending read commands, simultaneously or in one concentrated burst, to the storage media that hold the data of the requested file.
Step B comprises: B1. setting the read/write data bandwidth value of the application server towards the storage media; B2. calculating the data volume that can be transmitted within the differential time when transmitting at that full data bandwidth; B3. setting and saving the result as the per-write data volume upper limit of the corresponding current storage medium.
Step C comprises: C1. when a write request occurs, judging whether the data volume of the file to be stored is greater than the per-write data volume upper limit of the first-ranked storage medium; if so, executing step C2; if not, executing step C3; C2. storing the data of the file in turn into two or more kinds of storage media according to the per-write data volume upper limit of each kind of storage medium; C3. judging whether the read frequency of the file to be stored is higher than a preset frequency; if so, storing the file in the first-ranked storage medium; if not, storing it in one of the other storage media.
When the storage media comprise SSD hard disks, SAS hard disks and SATA hard disks, step A comprises: A1. arranging the three kinds of hard disks in order of response time from shortest to longest, the resulting order being SSD hard disk, SAS hard disk, SATA hard disk; A2. calculating the differential time between the response time of the SSD hard disk and that of the SAS hard disk, and calculating the differential time between the response time of the SAS hard disk and that of the SATA hard disk.
In that case, step B comprises: B1. setting the read/write data bandwidth value of the application server towards the storage array formed by the three kinds of hard disks; B2. calculating the data volume that can be transmitted at full bandwidth within the differential time between the response times of the SSD hard disk and the SAS hard disk, and calculating the data volume that can be transmitted at full bandwidth within the differential time between the response times of the SAS hard disk and the SATA hard disk; B3. setting the data volume that can be transmitted at full bandwidth within the SSD-to-SAS differential time as the per-write data volume upper limit of the SSD hard disk, and setting the data volume that can be transmitted at full bandwidth within the SAS-to-SATA differential time as the per-write data volume upper limit of the SAS hard disk, the SATA hard disk having no per-write data volume upper limit.
In that case, step C comprises: C1. when a write request occurs, judging whether the data volume of the file to be stored is greater than the per-write data volume upper limit of the SSD hard disk; if so, executing step C2; if not, executing step C3; C2. storing in the SSD hard disk a portion of the file equal in size to the per-write data volume upper limit of the SSD hard disk; storing the remaining data of the file in the SAS hard disk; and, if file data still remains, storing it in the SATA hard disk; C3. judging whether the read frequency of the file to be stored is higher than a preset frequency; if so, storing the file in the SSD hard disk; if not, storing it in the SAS hard disk or the SATA hard disk.
The present invention also provides an apparatus for storing and reading file data, comprising: a presetting module, located on the control server, for arranging the different kinds of storage media present in the storage system in order of response time from shortest to longest, calculating in turn the differential time between the response time of the current kind of storage medium and that of the next kind, calculating in turn the data volume that can be transmitted at full bandwidth within each differential time, and setting and saving the result as the per-write data volume upper limit of the corresponding current storage medium; a writing module, located on the application server, for storing, when a write request occurs, the data of the file to be stored into the storage media in turn according to the per-write data volume upper limit of each kind of storage medium; and a reading module, located on the application server, for sending, when a read request occurs, read commands simultaneously or in one concentrated burst to the storage media that hold the data of the requested file.
The presetting module comprises: a differential time acquisition module, for arranging the different kinds of storage media present in the storage system in order of response time from shortest to longest and calculating in turn the differential time between the response time of the current kind of storage medium and that of the next kind; a bandwidth setting module, for setting the read/write data bandwidth value of the application server towards the storage media; a computing module, for calculating the data volume that can be transmitted at full bandwidth within each differential time; and a storage module, for setting and saving the result as the per-write data volume upper limit of the corresponding current storage medium.
The beneficial effects of the invention are as follows. In the file data reading and writing method and apparatus described, the data of one file is segmented and stored on different storage media. After the application server sends read/write requests to several different storage media simultaneously, the storage medium with the fastest response sends its data first, while the other storage media are still seeking and performing internal data transfer. By the time it has finished, the next-fastest storage medium has its data ready to send, and so on. This distribution scheme gives the read requests of the application server an immediate, full-bandwidth data response.
Embodiment
To make the purpose, technical solution and advantages of the present invention clearer, the method and apparatus for storing and reading file data are described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
When a hard disk is read or written, the factors that limit performance are mainly the head seek time and the internal data transfer speed of the disk. The aim of the invention is therefore that, when the server writes data, the size of the data to be placed on each kind of storage medium is calculated from the differences in response time and read/write speed among the storage media, together with the read/write bandwidth between the application server and the hard disk array, and the data is striped accordingly. On reading, the different storage media then serve the request in relay, giving the read requests of the application server an immediate, full-bandwidth data response. This calculation, and the address allocation needed to store the data in segments, are carried out on the IPSAN control server inside the storage system.
The file data reading and writing method provided by the invention, as shown in Fig. 1, comprises the following steps:
A. Arrange the different kinds of storage media present in the storage system in order of response time from shortest to longest, and calculate in turn the differential time between the response time of the current kind of storage medium and that of the next kind.
The response time of a storage medium is the delay from the moment the medium receives a read/write command to the moment it starts sending the requested data; the longer the delay, the slower the response. The response time of a hard disk depends mainly on the head seek time and the internal data transfer speed of the disk.
After sorting, the kinds of storage media are designated, in order, the first storage medium, the second storage medium, the third storage medium, and so on.
The differential time between the response time of the current kind of storage medium and that of the next kind is calculated as follows. First, the difference between the response times of the first and second storage media is calculated, giving the first differential time. Next, the difference between the response times of the second and third storage media is calculated, giving the second differential time. The difference between the third and fourth storage media is calculated next, and so on. If there is no fourth storage medium, that difference is not calculated.
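Step A can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `differential_times`, the dictionary input and the use of milliseconds are hypothetical and do not come from the invention itself.

```python
def differential_times(response_times_ms):
    """Sort media by response time and return the ordered names and the
    response-time gap between each medium and the next slower one.

    response_times_ms: hypothetical mapping of medium name -> response time (ms).
    """
    ordered = sorted(response_times_ms.items(), key=lambda item: item[1])
    names = [name for name, _ in ordered]
    # One gap per adjacent pair; the slowest medium has no successor.
    gaps = [ordered[i + 1][1] - ordered[i][1] for i in range(len(ordered) - 1)]
    return names, gaps


names, gaps = differential_times({"SATA": 12.0, "SSD": 0.0, "SAS": 4.0})
print(names, gaps)
```

With three kinds of media the function returns two differential times, which matches the rule that no difference is calculated past the last medium.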
B. Calculate in turn the data volume that can be transmitted at full bandwidth within each differential time, and set and save the result as the per-write data volume upper limit of the corresponding current storage medium. The storage medium ranked last has no per-write data volume upper limit.
Bandwidth is a required parameter for calculating the data volume, so the port bandwidth of each application server towards the storage media should be set in advance on the IPSAN control server.
The data volume that can be transmitted within the first differential time at the fixed bandwidth is calculated and defined as the first data volume: first data volume = bandwidth × first differential time. The second data volume is then calculated: second data volume = bandwidth × second differential time, and so on. In practical applications the effective data transmission efficiency should also be taken into account, in which case first data volume = first differential time × bandwidth × effective data transmission efficiency, and second data volume = second differential time × bandwidth × effective data transmission efficiency. The data volumes for the remaining differential times are calculated in the same way and are not described again.
On the IPSAN control server, each calculated data volume is recorded as the per-write data volume upper limit of the corresponding storage medium and saved as static data, serving as the limit on how much data is written when data is stored to the different storage media. The upper limit means that the data volume written to that kind of storage medium in any one write may not exceed this value.
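The calculation in step B can be sketched as follows. The function name `write_caps` and the choice of units (bits per second, milliseconds, bytes) are assumptions made for illustration; the text only fixes the relation data volume = bandwidth × differential time × effective transmission efficiency.

```python
def write_caps(differential_times_ms, bandwidth_bps, efficiency=1.0):
    """Per-write data volume upper limits in bytes, one per differential time.

    The slowest medium gets no entry because it has no upper limit.
    """
    caps = []
    for gap_ms in differential_times_ms:
        # bits that fit into the gap at the effective bandwidth
        bits = bandwidth_bps * (gap_ms / 1000) * efficiency
        caps.append(round(bits / 8))  # convert bits to whole bytes
    return caps


# Hypothetical link of 8 Gbit/s with gaps of 4 ms and 8 ms, efficiency 100 %.
print(write_caps([4, 8], 8_000_000_000))
```

The returned list would be what the control server saves as static data, indexed in the same order as the sorted storage media.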
C. When a write request occurs, store the data of the file to be stored into the storage media in turn according to the per-write data volume upper limit of each kind of storage medium.
Because every kind of storage medium has its own per-write data volume upper limit, storing a file across different storage media also splits its data into several data segments. For example, suppose the data volume of a file is greater than the per-write data volume upper limit of the first storage medium (the first data volume) but less than the sum of the upper limits of the first and second storage media. During writing, the leading part of the file, equal in size to the upper limit of the first storage medium, is stored in the first storage medium, and the remaining data is written to the second storage medium, so the file is stored as two data segments. If the file is larger and exceeds the sum of the upper limits of the first and second storage media, the data that remains after those two media have received their share is stored in the third storage medium, so the file is stored as three data segments. If the storage system has only three kinds of storage media, the third storage medium has no per-write data volume upper limit.
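The segmentation described above amounts to filling each medium up to its upper limit and passing the rest on. The sketch below illustrates this; the name `split_file` and the byte-count interface are hypothetical, and the special handling of small files (step C3) is deliberately left out.

```python
def split_file(file_size, caps):
    """Return the number of bytes placed on each medium, fastest first.

    caps: per-write upper limits of every medium except the last one,
          which is uncapped and receives whatever remains.
    """
    segments = []
    remaining = file_size
    for cap in caps:
        part = min(remaining, cap)  # never exceed this medium's upper limit
        segments.append(part)
        remaining -= part
    segments.append(remaining)      # the last medium takes the rest
    return segments


caps = [440_000, 880_000]            # illustrative limits in bytes
print(split_file(1_000_000, caps))   # file between cap 1 and cap 1 + cap 2
print(split_file(2_000_000, caps))   # file larger than cap 1 + cap 2
```

A zero in the result simply means that no segment was needed on that medium.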
In practical applications, to make full use of the read/write bandwidth of the application server, the data of a file that is stored on one kind of storage medium can be striped across different storage units. For example, suppose the read/write bandwidth of the application server is 10Gbps, the first storage medium consists of SSD hard disks each connected through the network at 1Gbps, and the first data volume is calculated to be 4.3MB. The first 4.3MB of a file can then be striped across 10 or more SSD hard disks (the combined bandwidth of 10 or more 1Gbps links reaches 10Gbps or more). When data is transmitted, these SSD hard disks send data simultaneously, so that the read/write bandwidth of the application server is fully occupied and utilized.
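The striping example can be illustrated with a short sketch. The helper names `units_needed` and `stripe`, the round-robin layout and the 64,000-byte stripe size are all assumptions; the text only states that the segment is spread over enough units for their combined bandwidth to reach the server bandwidth.

```python
import math


def units_needed(server_bandwidth_gbps, unit_bandwidth_gbps):
    """Smallest number of storage units whose combined link bandwidth
    reaches the bandwidth of the application server."""
    return math.ceil(server_bandwidth_gbps / unit_bandwidth_gbps)


def stripe(segment_size, unit_count, stripe_size):
    """Distribute a segment round-robin in stripe_size chunks and
    return the number of bytes that lands on each unit."""
    per_unit = [0] * unit_count
    offset = 0
    index = 0
    while offset < segment_size:
        chunk = min(stripe_size, segment_size - offset)
        per_unit[index % unit_count] += chunk
        offset += chunk
        index += 1
    return per_unit


count = units_needed(10, 1)               # 10 Gbps server, 1 Gbps per SSD
layout = stripe(4_300_000, count, 64_000)
print(count, layout)
```

Round-robin placement keeps the units within one chunk of each other, so all of them stay busy for nearly the whole transfer.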
D. When a read request occurs, send read commands, simultaneously or in one concentrated burst, to the storage media that hold the data of the requested file.
When a read request occurs, every kind of storage medium, and every storage unit within each kind, should receive the read command at the same time. In practice, however, the application server cannot send several read commands truly simultaneously; it can only send them one by one in a concentrated burst. The read commands are sent in succession, with adjacent commands separated by one machine cycle. Because a machine cycle is very short, this interval can be ignored, and the read commands reach all kinds of storage media almost simultaneously.
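One way to picture the concentrated sending of read commands is to submit all reads before waiting for any of them. The sketch below uses Python threads purely as an illustration; the function `read_all` and the reader callables are hypothetical stand-ins for the commands sent to the storage media, not part of the invention.

```python
from concurrent.futures import ThreadPoolExecutor


def read_all(readers):
    """Issue every read at once and join the segments in file order.

    readers: callables ordered fastest medium first, each returning the
             bytes of the segment stored on its medium.
    """
    with ThreadPoolExecutor(max_workers=len(readers)) as pool:
        # All commands are submitted back to back before any result is awaited.
        futures = [pool.submit(reader) for reader in readers]
        # Results are consumed in submission order, so slower media have
        # time to prepare their data while faster ones are being read.
        return b"".join(future.result() for future in futures)


data = read_all([lambda: b"ssd-part|", lambda: b"sas-part|", lambda: b"sata-part"])
print(data)
```

Collecting results in submission order mirrors the relay behaviour: the segment from the fastest medium is consumed first while the others are still being prepared.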
Given that all kinds of storage media receive the read/write command almost simultaneously, the first storage medium responds fastest. After the response time of the first storage medium has elapsed, it sends the data of the first data volume of the requested file, and after the first differential time this transfer is complete. At that point, the time elapsed since the storage media received the command equals the response time of the first storage medium plus the first differential time, which is exactly the response time of the second storage medium. In other words, counting from the moment the command was received, the response time of the second storage medium has now passed, so the second storage medium has the data of the second data volume of the requested file ready and sends it to the application server. When the transfer of the second data volume is complete, the response time of the third storage medium has been reached and the third storage medium sends the data of the third data volume, and so on, until all the data of the file has been sent.
Therefore, after a read or write command for the data of a file has been sent, the requested data is delivered continuously at full bandwidth once the response time of the first storage medium has passed. If the performance of the first storage medium is good enough, its response time can be ignored, which means the application server obtains continuous, full-bandwidth data immediately after sending the read/write command.
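The relay behaviour can be checked with a small timeline model. This is a simplified sketch, not a description of a real system: the function `relay_timeline` is hypothetical, and it assumes a single shared link, constant bandwidth and media that are ready exactly at their response time.

```python
def relay_timeline(response_times_ms, segment_bytes, bandwidth_bps, efficiency=1.0):
    """Return (start_ms, end_ms) of each segment transfer, fastest medium first."""
    timeline = []
    link_free_at = 0.0
    for ready_at, size in zip(response_times_ms, segment_bytes):
        # A segment starts once its medium is ready and the link is free.
        start = max(ready_at, link_free_at)
        duration_ms = size * 8 * 1000.0 / (bandwidth_bps * efficiency)
        end = start + duration_ms
        timeline.append((start, end))
        link_free_at = end
    return timeline


# Illustrative values: 1 Gbit/s link, segments sized to fill the 4 ms and 8 ms gaps.
for start, end in relay_timeline([0.0, 4.0, 12.0],
                                 [500_000, 1_000_000, 250_000],
                                 1_000_000_000):
    print(f"{start:.1f} ms -> {end:.1f} ms")
```

When each segment is sized to its differential time, every transfer ends at the moment the next medium becomes ready, so the link never idles after the first response.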
Fig. 2 is a schematic diagram of a storage system to which the invention is applied. The application servers and the storage medium array of this system communicate directly through an Ethernet switch, forming a point-to-point interconnect architecture with fixed bandwidth.
In such a system architecture, the storage media inside the storage system may include SSD hard disks, SAS hard disks and SATA hard disks. When the technical solution of the invention is applied to this storage system, the storage media are the SATA, SAS and SSD hard disks and, as shown in Fig. 3, the technical solution comprises:
A1. Arrange the SATA, SAS and SSD hard disks in order of response time from shortest to longest.
Because the SSD hard disk uses semiconductor technology, it has no head seek time; its read/write speed can be very high and its read/write delay is only tens of microseconds. The SAS hard disk rotates at 10000 to 15000 revolutions per minute, its average head seek time is 2 to 4 milliseconds, and its internal transfer speed is comparable to that of the SATA hard disk. The SATA hard disk has an average head seek time of about 8 to 12 milliseconds and an average internal data transfer speed of 110MB per second. The SSD hard disk therefore ranks first and the SAS hard disk second, and the resulting order is: SSD hard disk, SAS hard disk, SATA hard disk.
A2. Calculate the first differential time, between the response time of the SSD hard disk and that of the SAS hard disk; calculate the second differential time, between the response time of the SAS hard disk and that of the SATA hard disk.
Suppose the response time of the SAS hard disk is 4ms (milliseconds), the response time of the SATA hard disk is 12ms, and the response time of the SSD hard disk is negligible. Then the first differential time = 4ms - 0ms = 4ms, and the second differential time = 12ms - 4ms = 8ms.
B1. Set the read/write data bandwidth value of the application server towards the storage array formed by the three kinds of hard disks.
B2. Calculate the data volume that can be transmitted at full bandwidth within the differential time between the response times of the SSD hard disk and the SAS hard disk; calculate the data volume that can be transmitted at full bandwidth within the differential time between the response times of the SAS hard disk and the SATA hard disk.
The data volume that can be transmitted at full bandwidth within the first differential time of 4ms is the first data volume: first data volume = 4ms × bandwidth. In practical applications the effective data transmission efficiency should also be taken into account, in which case first data volume = 4ms × bandwidth × effective data transmission efficiency. The data volume that can be transmitted at full bandwidth within the second differential time of 8ms is the second data volume: second data volume = 8ms × bandwidth × effective data transmission efficiency.
B3. Set the data volume that can be transmitted at full bandwidth within the SSD-to-SAS differential time as the per-write data volume upper limit of the SSD hard disk; set the data volume that can be transmitted at full bandwidth within the SAS-to-SATA differential time as the per-write data volume upper limit of the SAS hard disk. The SATA hard disk has no per-write data volume upper limit.
Steps A1 to B3 above are preparation for storing and reading data. Parameters such as the application server port bandwidth and the response times of the storage media can be set through the management software of the system at system initialization. Once the per-write data volume upper limits of the different hard disks have been calculated, the IPSAN control server can segment files and store them on the different hard disks according to these values.
C1. When a write request occurs, judge whether the data volume of the file to be stored is greater than the per-write data volume upper limit of the SSD hard disk; if so, execute step C2; if not, execute step C3.
C2. Store in the SSD hard disk a portion of the file equal in size to the per-write data volume upper limit of the SSD hard disk; store the remaining data of the file in the SAS hard disk; if file data still remains, store it in the SATA hard disk.
C3. Judge whether the read frequency of the file to be stored is higher than a preset frequency; if so, store the file in the SSD hard disk; if not, store it in the SAS hard disk and/or the SATA hard disk.
When a write request occurs and small files need to be stored, the IPSAN control server first compares the size of the file to be stored with the first data volume. If the data volume of the file is less than or equal to the first data volume, the small file is stored in a suitable storage medium according to its actual application scenario. In one embodiment, the storage location of a small file is determined by how frequently it is accessed: frequently accessed small files are stored in a fast-responding storage medium, and rarely accessed small files are stored in a slow-responding storage medium.
For a file whose data volume is greater than the first data volume, data equal to the first data volume is stored in the first storage medium, data equal to the second data volume out of the remaining data is stored in the second storage medium, and so on. If there is no fourth storage medium, all remaining data of the file is stored in the third storage medium.
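Steps C1 to C3 for the three kinds of hard disks can be sketched as a single placement function. The name `place_file`, its parameters and the returned dictionary are hypothetical, and sending infrequently read small files to the SAS hard disk rather than the SATA hard disk is an arbitrary choice made for this illustration.

```python
def place_file(file_size, read_frequency, ssd_cap, sas_cap, frequency_threshold):
    """Return a mapping of medium name -> bytes stored, following steps C1 to C3."""
    if file_size > ssd_cap:
        # C1 is true, so apply C2: fill SSD, then SAS, then SATA.
        to_sas = min(file_size - ssd_cap, sas_cap)
        to_sata = file_size - ssd_cap - to_sas
        placement = {"SSD": ssd_cap, "SAS": to_sas}
        if to_sata > 0:
            placement["SATA"] = to_sata
        return placement
    # C3: a small file is placed whole, according to its read frequency.
    if read_frequency > frequency_threshold:
        return {"SSD": file_size}
    return {"SAS": file_size}  # assumption: SAS is picked over SATA here


print(place_file(2_000_000, 0, 440_000, 880_000, 10))
print(place_file(100_000, 50, 440_000, 880_000, 10))
```

The read frequency only matters for files that fit within the upper limit of the SSD hard disk; larger files are always segmented.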
D. When a read request occurs, send read commands, simultaneously or in one concentrated burst, to the storage media that hold the data of the requested file.
Suppose the read/write bandwidth between the application server and the hard disk array is 1Gbps and the effective data transmission efficiency is 88%. Then about 430KB of data can be transmitted in the time (about 4ms) that the SAS hard disk needs to finish seeking and internal data transfer, and about 1.26MB of data can be transmitted in the time (about 12ms) that the SATA hard disk needs to finish seeking and internal data transfer. The first 430KB of the file is stored on the SSD hard disk, which responds fastest; the following 830KB (the data to be transmitted between 4ms and 12ms) is stored on the SAS hard disk; and the rest of the file is stored on the SATA hard disk.
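The figures in this example can be reproduced with a few lines of arithmetic. The helper `transferable_bytes` is hypothetical, and the conversion assumes 8 bits per byte and 1024 bytes per KB, an assumption under which 440,000 bytes corresponds to the quoted value of about 430KB.

```python
def transferable_bytes(window_ms, bandwidth_bps, efficiency):
    """Bytes that fit into a time window at the given bandwidth and efficiency."""
    return round(bandwidth_bps * window_ms * efficiency / 8000)


BANDWIDTH_BPS = 1_000_000_000  # 1 Gbps, from the text
EFFICIENCY = 0.88              # 88 %, from the text

ssd_bytes = transferable_bytes(4, BANDWIDTH_BPS, EFFICIENCY)     # 0 ms to 4 ms
total_bytes = transferable_bytes(12, BANDWIDTH_BPS, EFFICIENCY)  # 0 ms to 12 ms
sas_bytes = total_bytes - ssd_bytes                              # 4 ms to 12 ms

for label, size in [("SSD", ssd_bytes), ("SSD + SAS", total_bytes), ("SAS", sas_bytes)]:
    print(f"{label}: {size} bytes, about {size / 1024:.0f} KB")
```

Under these assumptions the 12ms total comes to roughly 1.26MB, as stated, while the 8ms window between 4ms and 12ms comes to roughly 860KB, slightly more than the rounded 830KB quoted in the text.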
After the data has been stored, when the application server issues a read command through the Ethernet interface that connects it to the storage system, the command is interpreted by the IPSAN control server inside the storage system. The virtualization engine installed on that server maps the read/write request to the real logical block addresses (LBA) on the physical hard disks, and the mapped addresses are returned to the application server. The application server then uses these addresses to send the read commands to the three kinds of hard disks in one concentrated burst.
On receiving the command, the SAS hard disk and the SATA hard disk start head seeking immediately. Thanks to its excellent response speed, the SSD hard disk delivers data first. By the time the SSD hard disk has sent its data (430KB), the SAS hard disk has the 830KB stored on it ready and sends it immediately. By the time the SAS hard disk has sent its data, the SATA hard disk has the remaining data ready and continues the transfer.
From the point of view of the application server, when it wants to read the data of a file, it issues the command and, after a delay of about tens of microseconds, receives the requested data continuously at full bandwidth. This effectively avoids the transmission latency caused by the head seek time of the SAS and SATA hard disks and greatly improves the performance of the storage system.
The present invention also provides an apparatus for reading and writing file data, comprising: a presetting module, located on the control server, for arranging the different kinds of storage media present in the storage system in order of response time from shortest to longest, calculating in turn the differential time between the response time of the current kind of storage medium and that of the next kind, calculating in turn the data volume that can be transmitted at full bandwidth within each differential time, and setting and saving the result as the per-write data volume upper limit of the corresponding current storage medium; a writing module, located on the application server, for storing, when a write request occurs, the data of the file to be stored into the storage media in turn according to the per-write data volume upper limit of each kind of storage medium; and a reading module, located on the application server, for sending, when a read request occurs, read commands simultaneously or in one concentrated burst to the storage media that hold the data of the requested file.
In one embodiment, the presetting module comprises: a differential time acquisition module, for arranging the different kinds of storage media present in the storage system in order of response time from shortest to longest and calculating in turn the differential time between the response time of the current kind of storage medium and that of the next kind; a bandwidth setting module, for setting the read/write data bandwidth value of the application server towards the storage media; a computing module, for calculating the data volume that can be transmitted at full bandwidth within each differential time; and a storage module, for setting and saving the result as the per-write data volume upper limit of the corresponding current storage medium.
With the method and apparatus for storing and reading file data provided by the invention, when the proportions of the various storage media configured in the system are the same, that is, at a cost comparable to that of an existing storage system, more files gain improved read/write response speed, which significantly improves the overall input/output performance of the storage system.
Finally, it should be noted that those skilled in the art can obviously make various changes and modifications to the invention without departing from its spirit and scope. If such changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them as well.