WO2022246727A1 - Data processing device and data processing method - Google Patents

Data processing device and data processing method

Info

Publication number
WO2022246727A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
raid controller
preprocessing
cache address
storage array
Prior art date
Application number
PCT/CN2021/096303
Other languages
English (en)
French (fr)
Inventor
秦军杰
常高嘉
许羡
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to CN202180098648.5A priority Critical patent/CN117377940A/zh
Priority to PCT/CN2021/096303 priority patent/WO2022246727A1/zh
Publication of WO2022246727A1 publication Critical patent/WO2022246727A1/zh

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 - Error detection; error correction; monitoring
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from the processing unit to the output unit, e.g. interface arrangements
    • G06F 3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present application relates to the field of information technology, and in particular to a data processing device and a data processing method.
  • Redundant Array of Independent Disks (RAID) is a high-performance, high-reliability storage technology that combines a series of individual disks in different ways to provide a logical disk for an application terminal or terminal cluster. RAID technology has been widely used in all kinds of data storage settings.
  • RAID technologies include RAID0, RAID1, RAID5, RAID6 and RAID10. Among them, RAID0 has no redundancy capability and RAID1 has low disk utilization, while RAID5, RAID6 and RAID10 each consist of multiple disks (for example, RAID5 contains at least 3 disks, and RAID6 and RAID10 contain at least 4 disks); each such RAID writes data to the disks in the array in the form of stripes, and stores the parity information on the disks in the array.
  • RAID5 is a storage solution that balances storage performance, data security and storage cost, using disk striping technology. RAID5 requires at least three disks. Instead of backing up the stored data, RAID5 stores the data and the corresponding parity information across the disks that make up the array, with the parity information and the corresponding data stored on different disks. When the data on one RAID5 disk is damaged, the remaining data and the corresponding parity information can be used to recover the damaged data.
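  • As an illustrative aside (not part of the disclosure), the recovery relation described above is the XOR identity over the stripe. The short sketch below, with invented block values and helper names, shows how a lost block is rebuilt from the surviving blocks and the parity:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# Illustrative RAID5 stripe: three data blocks plus one parity block.
d1, d2, d3 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
p = xor_blocks([d1, d2, d3])           # parity = D1 xor D2 xor D3

# If the disk holding D2 fails, D2 is rebuilt from the survivors and P.
recovered = xor_blocks([d1, d3, p])
assert recovered == d2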
  • In the prior art, the old data on the disk is first stored in the cache unit (buffer), and the RAID controller then reads the old data from the buffer to perform the corresponding parity update or data recovery. This occupies a relatively large amount of system space and incurs a relatively high processing delay; in addition, when the cache unit buffer is located outside the RAID controller, the bus bandwidth requirement is high and the system power consumption is high.
  • The embodiment of the present application discloses a data processing device and a data processing method, which can reduce the number of times old data on the disk is read from and written to the cache unit and the number of times it is transmitted on the bus, reduce the requirements on system bandwidth and storage space, and further reduce system power consumption; at the same time, calculations can be performed out of order, reducing the delay in obtaining consistency calculation results.
  • In a first aspect, the embodiment of the present application discloses a data processing device, including a redundant array of independent disks (RAID) controller and a storage array coupled to the RAID controller. The RAID controller is configured to: obtain the first data on the target stripe in the storage array and the first index information corresponding to the first data, wherein the first data is any one of the data on the target stripe; determine, based on the first index information, the first entry information corresponding to the first data from a preset mapping relationship, wherein the preset mapping relationship is generated based on the consistency of the stripe, and the first entry information is used to indicate the preprocessing type and cache address corresponding to the first data in the consistency operation; and perform the corresponding preprocessing on the first data according to the preprocessing type, and use the preprocessed first data to update the data at the cache address.
  • The above consistency operation may be the process of using the first data in RAID5/6 to recover damaged disk data, to update disk parity information, or a corresponding calculation process in other scenarios.
  • The above process of determining the first entry information corresponding to the first index information from the preset mapping relationship may be a table lookup or another method, which is not limited in this application.
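  • A minimal sketch of such a table lookup follows; the field names and key values are invented for illustration, since the application does not fix a concrete layout:

```python
from dataclasses import dataclass

@dataclass
class EntryInfo:
    preprocess_type: str  # selects the preprocessing function for this block
    cache_address: int    # offset of the accumulation buffer for this block

# Preset mapping, generated per target stripe before the data is requested;
# keys stand in for the index information carried back with each block.
preset_mapping = {
    0x01: EntryInfo(preprocess_type="identity", cache_address=0x000),
    0x02: EntryInfo(preprocess_type="gf_mul_2", cache_address=0x100),
}

def lookup_entry(index_info: int) -> EntryInfo:
    # Table lookup is one possible realization; the application leaves it open.
    return preset_mapping[index_info]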
  • In the prior art, the first data in the storage array is first cached in the cache unit, and the first data and its corresponding physical address are then read from the cache unit into the RAID controller. In the embodiment of the present application, the storage array can instead send the first data and the corresponding first index information directly to the RAID controller, which then uses the first index information to determine the required preprocessing type and cache address. This saves the step of first caching the first data in the cache unit and then reading it back into the RAID controller, effectively reducing the number of data reads and writes in the cache unit and, in turn, the read/write bandwidth requirement on the cache unit.
  • The embodiment of the present application can also reduce the subsequent delay in using the first data to recover the disk data and/or update the parity information.
  • When the cache unit is located outside the RAID controller, caching the first data of the storage array in the cache unit, as in the prior art, requires data transmission over the bus. Since this application omits that step of the prior art, the embodiment of the present application can also reduce the bus bandwidth requirement and thereby the system power consumption.
  • In a possible implementation, the above storage array includes M disks, the target stripe includes M stripe units, and the M stripe units are respectively located on the M disks, where M is an integer greater than 2; the first data is one or all data blocks on any one of the M stripe units.
  • the first data may be one or all data blocks on any stripe unit among the M stripe units. That is, when the data in the stripe unit is split into multiple data blocks, the first data can be one of the multiple data blocks; when the data in the stripe unit is not split, the first data can be All data contained in any stripe unit among the M stripe units.
  • When the first data is one data block among several, its size is small, so it can be quickly returned to the RAID controller for subsequent calculation, reducing the delay of subsequent processing and improving efficiency.
  • In a possible implementation, the above target stripe further includes second data, and the above storage array is used to send the second data and the second index information corresponding to the second data to the RAID controller; the sending time of the second data is before or after the sending time of the first data.
  • the foregoing second data may include at least one other data block in the target stripe except the first data.
  • When the second data includes multiple data blocks, the multiple data blocks can be returned to the RAID controller in any order, each carrying its corresponding index information. From the index information carried by each returned datum, the RAID controller can determine the preprocessing type and cache address required for the consistency operation, without waiting for all the required data on the target stripe to be returned. Therefore, when the second data contains multiple data blocks, the storage array can return the above first data and second data to the RAID controller in any order, and the process of recovering damaged data on the disk and/or updating the parity information can be completed out of order based on the first data and/or the second data, effectively improving system performance.
  • In a possible implementation, the first entry information includes a first preprocessing type and a first cache address, and the RAID controller is specifically configured to: preprocess the first data according to the first preprocessing type, and use the preprocessed first data to update the data at the first cache address to obtain the first reference information corresponding to the first data. The RAID controller is further configured to: determine, based on the second index information, the second entry information corresponding to the second data from the preset mapping relationship, wherein the second entry information includes a second preprocessing type and a second cache address; preprocess the second data according to the second preprocessing type, and use the preprocessed second data to update the data at the second cache address to obtain the second reference information corresponding to the second data; and obtain the data to be recovered in the storage array according to the first reference information and the second reference information.
  • Updating the data at the first cache address with the preprocessed first data to obtain the first reference information corresponding to the first data specifically means: the RAID controller reads the data at the first cache address, performs an XOR operation between the preprocessed first data and the data at the first cache address to obtain the first reference information corresponding to the first data, and writes the first reference information back to the first cache address.
  • The RAID controller determines the corresponding entry information from the index information carried by each datum, and then performs the corresponding consistency operation on each datum according to the preprocessing type and cache address indicated by that entry information; the consistency operation consists of the corresponding preprocessing followed by an XOR operation. Since the logic of the XOR operation is fixed, data returned to the RAID controller in any order can be processed directly in the order of arrival, yielding the reference information corresponding to each datum. When the second data contains multiple data blocks, the blocks can therefore be returned to the RAID controller out of order and the corresponding consistency operations performed in the order of return (out-of-order return, out-of-order calculation), effectively improving system performance.
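  • Because XOR is commutative and associative, the accumulated result is independent of arrival order, which is what makes out-of-order calculation safe. A hedged sketch (buffer layout, sizes and values are assumptions, not the claimed design):

```python
cache = {}  # cache_address -> accumulated bytes ("reference information")

def init_address(addr: int, length: int) -> None:
    # Initialize (clear) the cache address before the first block arrives.
    cache[addr] = bytes(length)

def accumulate(addr: int, preprocessed: bytes) -> bytes:
    # Read the data at the cache address, XOR in the preprocessed block,
    # and write the result back; the result is the new reference information.
    new = bytes(a ^ b for a, b in zip(cache[addr], preprocessed))
    cache[addr] = new
    return new

init_address(0x000, 2)
for blk in [b"\x11\x22", b"\x33\x44", b"\x55\x66"]:  # any permutation works
    accumulate(0x000, blk)
assert cache[0x000] == b"\x77\x00"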
  • In a possible implementation, the RAID controller is further configured to: receive third data to be written into the storage array and the third index information corresponding to the third data; determine, based on the third index information, the third entry information corresponding to the third data, wherein the third entry information includes a third preprocessing type and a third cache address; and preprocess the third data according to the third preprocessing type, and use the preprocessed third data to update the data at the third cache address to obtain the third reference information corresponding to the third data.
  • The above third data may be sent by the host to the RAID controller and subsequently written into the storage array; the third data may include one or more data blocks, the multiple data blocks may be returned to the RAID controller in any order, and each data block carries its corresponding index information.
  • the RAID controller may also perform a corresponding consistency operation on the third data returned from the host to obtain third reference information corresponding to the third data.
  • When the third data includes a plurality of data blocks, each carrying its corresponding index information, the RAID controller can process the blocks directly in the order in which they are returned, rather than in their order before splitting, and perform the corresponding consistency operations (out-of-order return, out-of-order calculation). This reduces the delay of the third data's consistency operation process and, in turn, the subsequent delay in obtaining the parity information from the result of that consistency operation.
  • In a possible implementation, the above first entry information includes a fourth preprocessing type and a fourth cache address, and the above RAID controller is specifically configured to: preprocess the first data according to the fourth preprocessing type, and use the preprocessed first data to update the data at the fourth cache address to obtain the fourth reference information corresponding to the first data. The RAID controller is further configured to obtain the parity information in the storage array according to the third reference information and the fourth reference information.
  • In this case the first entry information includes the fourth preprocessing type and the fourth cache address. Since the first index information corresponding to the first data is returned to the RAID controller together with the first data, the RAID controller can determine the corresponding first entry information from the first index information and, according to the fourth preprocessing type and fourth cache address contained in the first entry information, perform the corresponding consistency calculation on the first data in the order in which the data is returned, without waiting for all other data blocks in the storage array to be returned to the RAID controller (out-of-order return, out-of-order calculation), which can effectively improve system performance.
  • The corresponding preprocessing type and cache address can be determined from the first index information and the preset mapping relationship, without intermediate caching of the first data in the cache unit, thereby reducing the number of cache-unit reads and writes and the number of bus transfers, and in turn the subsequent delay in obtaining the parity information of the storage array to be updated from the consistency operation result of the first data.
  • the first entry information may include the first preprocessing type and/or the fourth preprocessing type, the first cache address and/or the fourth cache address.
  • the first entry information may include Q preprocessing types and Q cache addresses; wherein, the Q preprocessing types correspond to the Q cache addresses one-to-one.
  • Each preprocessing type and its corresponding cache address can correspond to one application scenario in RAID5/6; that is, the above Q preprocessing types and Q cache addresses respectively correspond to Q application scenarios. A scenario can be any one in which the first data is obtained from the disk for subsequent operation, for example disk data recovery, disk parity update, or disk data recovery together with disk parity update, or other scenarios; this application does not limit this. Q is a positive integer.
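  • One possible reading is that a single entry fans out to Q (preprocessing type, cache address) pairs, one per concurrent scenario; the structure below is a hypothetical illustration, not the claimed format:

```python
# Hypothetical entry for a block that participates in two scenarios at once:
# disk data recovery and parity update. Each pair drives an independent
# accumulation at its own cache address. All names are invented.
first_entry_info = [
    ("recover_preprocess", 0x000),  # scenario: disk data recovery
    ("parity_preprocess", 0x200),   # scenario: parity update
]

cache = {}

def accumulate(addr: int, block: bytes) -> None:
    # XOR-accumulate `block` into the buffer at `addr` (zero-initialized).
    old = cache.get(addr, bytes(len(block)))
    cache[addr] = bytes(a ^ b for a, b in zip(old, block))

def apply_entry(entry, block: bytes, preprocess_fns) -> None:
    # One arriving block drives Q independent accumulations.
    for ptype, addr in entry:
        accumulate(addr, preprocess_fns[ptype](block))

apply_entry(first_entry_info, b"\x0f",
            {"recover_preprocess": lambda b: b,  # placeholder functions
             "parity_preprocess": lambda b: bytes(x ^ 0xFF for x in b)})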
  • the RAID controller is further configured to: before receiving the first data, initialize the data in the cache address indicated by the first entry information.
  • Before the RAID controller receives the first data and performs the subsequent consistency calculation, it needs to initialize the data at the cache address indicated by the first entry information; the initialization may be a clearing operation. The RAID controller then begins to perform the corresponding consistency operation on the received first data, ensuring that the correct data to be recovered and/or parity information of the disk is generated.
  • In a second aspect, the embodiment of the present application discloses a RAID controller. The RAID controller includes a processor and an interface circuit; the processor is coupled to the storage array through the interface circuit, and is configured to: obtain the first data on the target stripe in the storage array and the first index information corresponding to the first data, wherein the first data is any one of the data on the target stripe; determine, based on the first index information, the first entry information corresponding to the first data from a preset mapping relationship, wherein the preset mapping relationship is generated based on the consistency of the stripe, and the first entry information is used to indicate the preprocessing type and cache address corresponding to the first data in the consistency operation; and perform the corresponding preprocessing on the first data according to the preprocessing type, and use the preprocessed first data to update the data at the cache address.
  • the foregoing RAID controller includes a memory, and the memory is configured to store the first entry information.
  • In a possible implementation, the above storage array includes M disks, the target stripe includes M stripe units, and the M stripe units are respectively located on the M disks, where M is an integer greater than 2; the first data is one or all data blocks on any one of the M stripe units.
  • In a possible implementation, the target stripe further includes second data, and the storage array is further configured to send the second data and the second index information corresponding to the second data to the RAID controller; the sending time of the second data is before or after the sending time of the first data.
  • In a third aspect, the embodiment of the present application discloses a data processing method, including: obtaining, by the RAID controller, the first data on the target stripe in the storage array and the first index information corresponding to the first data, wherein the first data is any one of the data on the target stripe; determining, by the RAID controller based on the first index information, the first entry information corresponding to the first data from a preset mapping relationship, wherein the preset mapping relationship is generated based on the consistency of the stripe, and the first entry information is used to indicate the preprocessing type and cache address of the consistency operation corresponding to the first data; and performing the corresponding preprocessing on the first data according to the preprocessing type, and using the preprocessed first data to update the data at the cache address.
  • In a possible implementation, the storage array includes M disks, the target stripe includes M stripe units, and the M stripe units are respectively located on the M disks, where M is an integer greater than 2; the first data is one or all data blocks on any one of the M stripe units.
  • In a possible implementation, the above target stripe further includes second data, and the method further includes: sending, by the storage array, the second data and the second index information corresponding to the second data to the RAID controller; the transmission time of the second data is before or after the transmission time of the first data.
  • In a possible implementation, the above first entry information includes the first preprocessing type and the first cache address, and performing the corresponding preprocessing on the first data according to the preprocessing type and updating the data at the cache address with the preprocessed first data includes: preprocessing, by the RAID controller, the first data according to the first preprocessing type, and using the preprocessed first data to update the data at the first cache address to obtain the first reference information corresponding to the first data.
  • The above method further includes: determining, by the RAID controller based on the second index information, the second entry information corresponding to the second data from the preset mapping relationship, wherein the second entry information includes the second preprocessing type and the second cache address; preprocessing the second data according to the second preprocessing type, and using the preprocessed second data to update the data at the second cache address to obtain the second reference information corresponding to the second data; and obtaining the data to be recovered in the storage array according to the first reference information and the second reference information.
  • In a possible implementation, the above method further includes: receiving, by the RAID controller, third data to be written into the storage array and the third index information corresponding to the third data; determining, based on the third index information, the third entry information corresponding to the third data from the preset mapping relationship, wherein the third entry information includes a third preprocessing type and a third cache address; and preprocessing the third data according to the third preprocessing type, and updating the data at the third cache address with the preprocessed third data to obtain the third reference information corresponding to the third data.
  • In a possible implementation, the above first entry information includes the fourth preprocessing type and the fourth cache address, and performing the corresponding preprocessing on the first data according to the preprocessing type and updating the data at the cache address with the preprocessed first data includes: preprocessing, by the RAID controller, the first data according to the fourth preprocessing type, and using the preprocessed first data to update the data at the fourth cache address to obtain the fourth reference information corresponding to the first data. The above method further includes: obtaining, by the RAID controller, the parity information in the storage array according to the third reference information and the fourth reference information.
  • the above method further includes: before receiving the first data, the RAID controller initializes the data in the cache address indicated by the first entry information.
  • In a fourth aspect, the embodiment of the present application discloses a chip system. The chip system includes at least one processor, a memory and an interface circuit; the memory, the interface circuit and the at least one processor are interconnected through lines, and the memory stores instructions; when the instructions are executed by the processor, the method described in any one of the above third aspect is implemented.
  • In a fifth aspect, the embodiment of the present application discloses a computer-readable storage medium. The computer-readable storage medium stores program instructions, and when the program instructions are run on a processor, the method described in any one of the above third aspect is implemented.
  • In a sixth aspect, the embodiment of the present application discloses a computer program product; when the computer program product runs on a processor, the method described in any one of the above third aspect is implemented.
  • an embodiment of the present application provides a terminal device, including the data processing apparatus provided in any one of the implementation manners in the first aspect above and a discrete device coupled to the data processing apparatus.
  • FIG. 1 is a schematic structural diagram of a storage array in RAID5 provided by an embodiment of the present application;
  • FIG. 2 is a schematic structural diagram of a storage array in RAID6 provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of a stripe structure of a storage array provided by an embodiment of the present application;
  • FIG. 4 is a schematic diagram of a data flow in the prior art;
  • FIG. 5 is a schematic structural diagram of a data processing device provided by an embodiment of the present application;
  • FIG. 6 is a schematic flowchart of a consistency operation provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of the cache addresses corresponding to different data blocks in a cache unit according to an embodiment of the present application;
  • FIG. 8 is a schematic structural diagram of another data processing device provided by an embodiment of the present application;
  • FIG. 9 is a schematic diagram of a data flow provided by an embodiment of the present application;
  • FIG. 10 is a schematic diagram of bus transmission times and data read/write times provided by an embodiment of the present application;
  • FIG. 11 is a schematic diagram of a hardware structure of a RAID controller provided by an embodiment of the present application;
  • FIG. 12 is a schematic flowchart of a data processing method provided by an embodiment of the present application.
  • FIG. 1 is a schematic structural diagram of a storage array in RAID 5 provided by an embodiment of the present application.
  • RAID5 includes four independent disks: Disk0, Disk1, Disk2, and Disk3.
  • the four independent disks may include four stripes, each stripe includes four stripe units, and the four stripe units included in each stripe are respectively located on the four independent disks.
  • The first stripe contains four stripe units A1, A2, A3 and Ap; the second stripe contains four stripe units B1, B2, Bp and B3; the third stripe contains four stripe units C1, Cp, C2 and C3; and the fourth stripe contains four stripe units Dp, D1, D2 and D3.
  • each stripe unit contained in each stripe has the same starting position and length on the respective disks.
  • The stripe units denoted by numerical subscripts (1, 2 and 3) are used to store disk data, and the stripe units denoted by the letter subscript (p) are used to store the parity information corresponding to the disk data (in all embodiments of the present application, the parity information in RAID5 may also be referred to as P data).
  • FIG. 2 is a schematic structural diagram of a storage array in RAID 6 provided by an embodiment of the present application.
  • RAID6 includes five independent disks, and the five independent disks may contain five stripes, each stripe including five stripe units; for the specific stripe units included in each stripe, refer to FIG. 2, which will not be repeated here.
  • The stripe units represented by the numerical subscripts (1, 2 and 3) are used to store disk data, and the stripe units represented by the letter subscripts (p and q) are used to store parity information corresponding to the disk data (in all embodiments of the present application, the two types of parity information in RAID6 may be called P data and Q data respectively).
  • Compared with RAID5, RAID6 adds a second independent block of parity information. The two independent parities use different algorithms, so data reliability is very high, and data integrity is not affected even when any two disks fail at the same time.
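  • The application does not name the two algorithms; a common concrete choice, shown here only as background, computes P as plain XOR and Q as a Reed-Solomon-style sum over GF(2^8) with generator 2 and polynomial 0x11D:

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) using polynomial 0x11D (a common RAID6 choice)."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
    return r

def gf_pow2(i: int) -> int:
    """2**i in GF(2^8), the per-disk coefficient used for the Q parity."""
    r = 1
    for _ in range(i):
        r = gf_mul(r, 2)
    return r

data = [0x11, 0x22, 0x33]            # one byte from each data disk
p = data[0] ^ data[1] ^ data[2]      # P parity: plain XOR
q = 0
for i, d in enumerate(data):         # Q parity: sum of 2^i * d_i over GF(2^8)
    q ^= gf_mul(gf_pow2(i), d)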
  • the application scenario of the embodiment of the present application will be introduced below with reference to FIG. 3 .
  • The embodiment of the present application can be applied to the following scenarios in RAID5 or RAID6: 1. recovering the damaged data in the disk; 2. updating the parity information in the disk; 3. recovering the damaged data in the disk and updating the parity information in the disk.
  • FIG. 3 is a schematic diagram of a stripe structure of a storage array according to an embodiment of the present application.
  • the stripe can be any one of the five stripes shown in FIG. 2 (that is, RAID6).
  • the above three application scenarios will be described below by taking the strip shown in FIG. 3 as an object.
  • The stripe shown in FIG. 3 contains five stripe units: D0, D1, D2, P and Q; among them, D0, D1 and D2 are used to store disk data, and P and Q are used to store the P data and Q data corresponding to the disk data (two independent pieces of parity information).
  • As another example, suppose a stripe on the disks contains six stripe units D0, D1, D2, D3, P and Q, where D0, D1, D2 and D3 are stripe units storing data and P and Q are stripe units storing parity information. Assuming that the data in D0 and D1 are damaged and new D2 data needs to be written to the disk, the data in D2 and D3 need to be read from the disk only once, and the same read data is used four times: to calculate the old D0 data and the old D1 data, and the new P data and the new Q data.
  • FIG. 4 is a schematic diagram of a data flow in the prior art, which is used to describe the data interaction process among the storage array (including multiple independent disks), the RAID controller and the cache unit in the prior art.
  • According to data flow 1, the old data in the storage array is read and written into the cache unit. The RAID controller then reads the old data of the storage array from the cache unit according to data flow 2; when the RAID controller needs to perform consistency calculations corresponding to different scenarios, for example restoring damaged data and updating parity information at the same time, the RAID controller can read the corresponding data from the cache unit separately, that is, data flow 2 can include multiple independent data reading processes.
  • the RAID controller can also receive new data to be written sent by the host through the bus (this process is not shown). After the RAID controller acquires the corresponding old data of the storage array from the cache unit, the RAID controller performs corresponding consistency calculations on the acquired old data of the storage array to obtain updated checksum information and/or data to be restored in the storage array.
  • the above data flow 3 may include multiple independent data writing processes.
  • When the cache unit is located outside the RAID controller, the data streams in processes 1, 2 and 3 above all need to be transmitted through the system bus.
  • The cache unit can be a readable and writable memory, such as a register or a random access memory (RAM), for example a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), or a double data rate SDRAM (DDR SDRAM), etc.
  • When the cache unit is located outside the RAID controller, the data interaction between the cache unit and the RAID controller is transmitted through the system bus; when the cache unit is located inside the RAID controller, the data interaction between the cache unit and the RAID controller does not need to pass through the system bus.
  • FIG. 5 is a schematic structural diagram of a data processing apparatus 500 provided in an embodiment of the present application.
  • the data processing device 500 may include a redundant array of independent disks RAID controller 510 and a storage array 520 coupled to the RAID controller.
  • the RAID controller 510 is configured to acquire the first data on the target stripe in the storage array 520 and the first index information corresponding to the first data; wherein, the first data is any one of the data on the target stripe.
  • the storage array 520 includes M independent disks, and the storage array 520 may be divided into N stripes.
  • the above-mentioned target strip may be any one of N strips, and M is an integer greater than 2.
  • The RAID controller 510 is configured to determine the first entry information corresponding to the first data from a preset mapping relationship based on the first index information, wherein the preset mapping relationship is generated based on the consistency of the stripe, and the first entry information is used to indicate the preprocessing type and cache address corresponding to the first data in the consistency operation; and to perform the corresponding preprocessing on the first data according to the preprocessing type and use the preprocessed first data to update the data at the cache address.
  • The above first index information may include the specific location information of the first data in the storage array 520 and the correspondence between the first data and the first entry information; that is, by parsing the first index information, the RAID controller can determine the first entry information corresponding to the first data from the preset mapping relationship.
  • the above preset mapping relationship is generated based on the consistency of the stripes, that is, the entry information corresponding to different data on the target stripe is generated according to the consistency of the stripes.
  • The above determination of the first entry information corresponding to the first data from the preset mapping relationship may be done by table lookup or other methods, which is not limited in this application; the first index information may be used as the command header of the first data and returned to the RAID controller 510 together with the first data.
  • the above specific location information may include the number of the disk to which the first data belongs in the storage array 520 and the specific location of the first data on the disk to which it belongs.
  • The above consistency calculation may be the process of using the first data in RAID5/6 to recover damaged disk data, to update disk parity information, or a corresponding calculation process in other scenarios; that is, scenario 1, scenario 2, scenario 3 above, or any other RAID5/6 scenario in which data is obtained from the storage array for calculation, may include a corresponding consistency calculation process, which is not limited in this application.
  • the above-mentioned first data may also be referred to as consistency operation data.
  • In the prior art, the first data in the storage array is first cached in the cache unit, and the first data in the cache unit and the physical address corresponding to the first data are then read into the RAID controller. In contrast, the storage array in the embodiment of the present application can send the first data and the corresponding first index information directly to the RAID controller, which determines through the first index information the preprocessing type and cache address required for the subsequent consistency operation on the first data. The process of first storing the first data of the disk in the cache unit and then reading it from the cache unit into the RAID controller is thereby saved, effectively reducing the number of data reads and writes in the cache unit and, in turn, the read/write bandwidth requirement on the cache unit.
  • the embodiment of the present application can also reduce the subsequent delay of using the first data to restore the disk data and/or update the verification information.
  • When the cache unit is located outside the RAID controller, caching the first data of the storage array in the cache unit, as in the prior art, requires data transmission over the bus. Since this application omits that step of the prior art, the embodiment of the present application can also reduce the bus bandwidth requirement and thereby the system power consumption.
  • In a possible implementation, the above storage array 520 includes M disks, the target stripe includes M stripe units, and the M stripe units are respectively located on the M disks, where M is an integer greater than 2; the first data is one or all data blocks on any one of the M stripe units.
  • When the storage array 520 splits the data in the M stripe units into multiple data blocks, the first data may be one of the multiple data blocks; when the storage array 520 does not split the data in the M stripe units, the first data may be all the data blocks contained in any one of the M stripe units, that is, all the data on that stripe unit. When the first data is one of the multiple data blocks, it is relatively small, so it can be quickly returned to the RAID controller for subsequent calculation, reducing the delay of subsequent processing and improving efficiency.
  • In a possible implementation, the target stripe further includes second data, and the storage array is further configured to send the second data and the second index information corresponding to the second data to the RAID controller; the transmission time of the second data is before or after the transmission time of the first data.
  • the above-mentioned second data may include at least one other data block in the target stripe except the first data.
  • When the second data includes multiple data blocks, the multiple data blocks can be returned to the RAID controller in any order, each carrying its corresponding index information. From the index information carried by each returned datum, the RAID controller can determine the preprocessing type and cache address required for the consistency operation, without waiting for all the required data on the target stripe to be returned. Therefore, when the second data contains multiple data blocks, the storage array can return the above first data and second data to the RAID controller in any order, and the process of recovering damaged data on the disk and/or updating the parity information can be completed out of order based on the first data and/or the second data, effectively improving system performance.
  • In a possible implementation, the above storage array 520 is used to send the first data and the multiple data blocks to the RAID controller in any order, sending along with each datum its corresponding index information.
  • The first data may be all the data in any one of the M stripe units, and the second data may include the data in at least one other stripe unit among the M stripe units; the specific data content contained in the second data is determined by the application scenario to which the consistency operation belongs.
  • the storage array 520 may sequentially send the first data and the second data in any order.
  • The first data may be a data block contained in any stripe unit, and the second data may include at least one remaining data block other than the first data; when there are multiple remaining data blocks, they may be located in the same stripe unit or in different stripe units.
  • the storage array 520 can send the first data and the second data to the RAID controller in the following two ways:
  • In the first way, the storage array 520 first sends all the data blocks belonging to one stripe unit in the target stripe, then sends all the data blocks belonging to another stripe unit, and continues by this rule until the first data and the second data have been sent; the storage array 520 can decide, based on the specific application scenario, which stripe unit's data blocks are all sent first and which are sent later, which is not specifically limited in this application. For example, the storage array 520 can first send all the data blocks contained in stripe unit 3 and then the data blocks contained in stripe unit 4, or first send all the data blocks contained in stripe unit 4 and then all the data blocks contained in stripe unit 3.
  • Within a stripe unit, the storage array 520 may send the data blocks to the RAID controller sequentially or out of order. For example, suppose the data contained in stripe unit 3 is split into data block 1, data block 2 and data block 3, where data block 1 is the data header and data block 3 is the data tail; when sending the three data blocks of stripe unit 3, the storage array 520 may send them sequentially in the order data block 1, data block 2, data block 3, or send the three data blocks to the RAID controller in another order.
  • In the second way, the storage array 520 may alternately send data blocks belonging to different stripe units in any order. Specifically, the storage array 520 may first send one or more (but not all) data blocks of one stripe unit, then one or more (but not all) data blocks of another stripe unit, and so on until the first data and the second data have been sent. The storage array 520 may determine, according to the specific application scenario, which partial data blocks of which stripe unit are sent first and which later, which is not specifically limited in this application.
  • For example, the storage array 520 may sequentially send data block 2 in stripe unit 3, data block 5 and data block 6 in stripe unit 4, data block 1 and data block 3 in stripe unit 3, and data block 4 in stripe unit 4, as sketched below.
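  • A sketch of this second, interleaved mode; the unit and block identifiers and payloads are invented, and the point is that each transfer carries its index information so the controller can process blocks on arrival:

```python
import random

def process(index_info, block):
    # Hypothetical controller-side handler keyed purely on index_info.
    print(index_info, block)

# (index_info, payload) pairs for the blocks of two stripe units.
transfers = [
    (("unit3", 1), b"a"), (("unit3", 2), b"b"), (("unit3", 3), b"c"),
    (("unit4", 4), b"d"), (("unit4", 5), b"e"),
]

random.shuffle(transfers)  # the array may interleave units arbitrarily
for index_info, block in transfers:
    process(index_info, block)  # arrival order is free; index_info disambiguates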
  • The RAID controller 510 can determine the entry information corresponding to each returned datum from its index information, and then perform the subsequent XOR operation based on the preprocessing type and cache address contained in that entry information. Because the logic of the XOR operation is fixed, the first data and the second data can be returned to the RAID controller out of order, and the calculation of the subsequent data to be recovered and/or parity information of the disks can likewise be completed out of order; the embodiment of the present application can therefore effectively improve system performance.
  • the above-mentioned first entry information includes a first preprocessing type and a first cache address; the RAID controller is specifically configured to: preprocess the first data according to the first preprocessing type, and use The preprocessed first data updates the data in the first cache address to obtain the first reference information corresponding to the first data; the RAID controller is also used to: determine from the preset mapping relationship based on the second index information and The second entry information corresponding to the second data; wherein, the second entry information includes the second preprocessing type and the second cache address; the second data is preprocessed according to the second preprocessing type, and the preprocessed The second data updates the data in the second cache address to obtain second reference information corresponding to the second data; and obtains the data to be restored in the storage array according to the first reference information and the second reference information.
  • the above-mentioned preprocessing is performed on the first data according to the first preprocessing type, and the data in the first cache address is updated by using the preprocessed first data to obtain the first reference information corresponding to the first data, It specifically includes: the RAID controller 510 obtains the data in the first cache address, performs an XOR operation on the preprocessed first data and the data in the first cache address, and obtains first reference information corresponding to the first data (that is, the result of the XOR operation), and write the first reference information into the first cache address.
  • the calculation process of the second reference information may refer to the corresponding process in the first reference information, which will not be repeated here.
  • The first data corresponds to the first index information, the first index information corresponds to the first entry information, and the first entry information is used to indicate the preprocessing type and cache address of the consistency operation on the first data.
  • the first entry information may include K preprocessing types and K cache addresses; wherein, the K preprocessing types and K cache addresses are in one-to-one correspondence, each preprocessing type and the corresponding cache address They are respectively used for consistency operations of the first data in different scenarios, and K is a positive integer.
  • Similarly, the second data corresponds to the second index information, the second index information corresponds to the second entry information, and the second entry information is used to indicate the preprocessing type and cache address of the consistency operation on the second data.
  • When the second data includes multiple data blocks, each data block corresponds to one piece of index information and one piece of entry information; that is, the second index information includes the index information corresponding to the multiple data blocks, and the second entry information includes the entry information corresponding to the multiple data blocks.
  • the above-mentioned preprocessing type is the data processing method in RAID5/6, that is, after the RAID controller 510 receives data from the storage array 520, it uses a preprocessing function to process the received data.
  • For example, the corresponding preprocessing function can be used to process the data 50 to obtain the preprocessed data 89; an XOR operation is then performed between the data at the cache address corresponding to the data 50 and the data 89, and the result of the XOR operation is the reference information corresponding to the data 50.
  • The above preprocessing is the data processing performed after the data on the storage array 520 is returned and before the XOR operation, as sketched below.
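  • Tying the numeric example together, a sketch of the preprocess-then-XOR step; the function mapping 50 to 89 is a stand-in, since the real preprocessing function is scenario-specific and not fixed by the application:

```python
def fake_preprocess(x: int) -> int:
    # Stand-in for the real preprocessing function, chosen only so that
    # 50 maps to 89 as in the example above.
    return x + 39

cache_value = 0x0F                   # data currently stored at the cache address
received = 50                        # raw data returned by the storage array
pre = fake_preprocess(received)      # 89, the preprocessed data
reference_info = cache_value ^ pre   # XOR result, written back to the address
cache_value = reference_info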
  • FIG. 6 is a schematic flow chart 600 of consistency operation, including step 610 , step 620 , step 630 and step 640 .
  • Step 610: the RAID controller receives the currently returned i-th data block, obtains the corresponding entry information from the preset mapping relationship according to the index information corresponding to the i-th data block, and preprocesses the data block; wherein, when the second data contains multiple data blocks, the i-th data block is the first data or one of the multiple data blocks; when the second data is a single data block, the i-th data block is the first data or the second data; i is a positive integer.
  • the process of the RAID controller preprocessing the i-th data block is correspondingly the same as the above process of the RAID controller preprocessing the first data, which will not be repeated here.
  • the preprocessing type corresponding to each returned data block is determined according to a specific application scenario, and may be the same or different, which is not limited in this application. Different preprocessing types correspond to different preprocessing functions.
  • The RAID controller preprocesses the i-th data block according to the preprocessing algorithm corresponding to the i-th data block. For example, when the data in the i-th data block is 67, preprocessing it with the corresponding preprocessing algorithm yields a value different from the data in the i-th data block, such as 34.
  • Step 620: judge whether the cache address corresponding to the i-th data block overlaps with the cache address corresponding to the data block currently undergoing XOR processing by the RAID controller (also referred to as the current data block).
  • the above overlapping part means that the cache address corresponding to the i-th data block completely or partially overlaps with the cache address corresponding to the current data block.
  • Step 630: when the judgment result in step 620 is "Yes", the RAID controller waits until the data at the cache address corresponding to the current data block has been updated, and then performs the corresponding consistency operation on the i-th data block to update the data at the cache address corresponding to the i-th data block, that is, a serial update.
  • Specifically, the RAID controller waits until the data at the cache address corresponding to the current data block has been updated; it then reads the data at the cache address corresponding to the i-th data block, performs an XOR between the preprocessed i-th data block and the data at that cache address to obtain the reference information corresponding to the i-th data block, and writes it back to the cache address corresponding to the i-th data block.
  • Step 640: when the judgment result in step 620 is "No", the RAID controller can immediately start to update the data at the cache address corresponding to the i-th data block, that is, a parallel update; the specific update process corresponds to that in the above embodiment and will not be repeated here.
  • When the RAID controller updates the data at the cache address corresponding to the i-th data block, the process needs to use the data currently stored at that cache address. Therefore, when the cache address corresponding to the i-th data block coincides, completely or partially, with the cache address corresponding to the current data block, the update must be serial; when the cache address corresponding to the i-th data block does not coincide at all with the cache address corresponding to the current data block, the updates can proceed in parallel, as the sketch below shows.
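  • The overlap test in step 620 can be expressed as a range intersection, under the assumption (not stated in the application) that a cache address is an (offset, length) range:

```python
def overlaps(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two (offset, length) cache-address ranges intersect at all."""
    a_off, a_len = a
    b_off, b_len = b
    return a_off < b_off + b_len and b_off < a_off + a_len

current = (0x100, 0x40)    # range being updated right now
incoming = (0x120, 0x40)   # range of the i-th data block

if overlaps(current, incoming):
    pass  # step 630: wait for the in-flight update, then update serially
else:
    pass  # step 640: start the update immediately, in parallel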
  • the target cache address in FIG. 7 is the union of the above-mentioned first cache address and the second cache address.
  • the first cache address is used to store the first reference information.
  • the second cache address is used to store the second reference information.
  • When the second data includes multiple data blocks, the second cache address may include multiple cache addresses respectively corresponding to the multiple data blocks; at this time, the target cache address is the union of the first cache address and the multiple cache addresses.
  • Any two addresses among the above first cache address and the multiple cache addresses may, within the target cache address, not overlap, completely overlap or partially overlap.
  • the i-th data block is one of multiple data blocks included in the first data or the second data.
  • For example, the cache address corresponding to the i-th data block may partially overlap with the cache address corresponding to the (i-1)-th data block, or the cache addresses corresponding to the two data blocks may completely overlap.
  • The above target cache address may be located in the cache unit (buffer) described above. The cache unit may be a readable and writable memory, such as a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), or a double data rate SDRAM (DDR SDRAM), etc.
  • the target cache unit can be located inside the RAID controller (that is, on-chip, such as SRAM), or outside the RAID controller (that is, off-chip, such as DDR SDRAM).
  • the process in which the RAID controller uses the first data received from the storage array 520 to generate parity information in the target stripe will be specifically described below, that is, the second scenario.
  • The RAID controller 510 is further configured to: receive the third data to be written into the storage array and the third index information corresponding to the third data; determine, based on the third index information, the third entry information corresponding to the third data from the preset mapping relationship, wherein the third entry information includes a third preprocessing type and a third cache address; and preprocess the third data according to the third preprocessing type, and use the preprocessed third data to update the data at the third cache address to obtain the third reference information corresponding to the third data.
  • The above third data may be sent by the host to the RAID controller and subsequently written into the storage array 520; the third data may include one or more data blocks, the multiple data blocks may be returned to the RAID controller in any order, and each data block carries its corresponding index information.
  • Specifically, the RAID controller 510 may determine the third entry information corresponding to the third data from the preset mapping relationship according to the third index information, and use the third preprocessing type and the third cache address contained in the third entry information to perform the corresponding consistency operation on the third data; for the process of the third data's consistency operation, refer to the corresponding process of the first data's consistency operation, which will not be repeated here.
  • the corresponding consistency operation process of each data block is the same as the above-mentioned consistency operation process when the third data is used as one data block, which will not be repeated here.
  • the RAID controller may also perform a corresponding consistency operation on the third data returned from the host to obtain third reference information corresponding to the third data.
  • When the third data includes a plurality of data blocks, each carrying its corresponding index information, the RAID controller can process the blocks directly in the order in which they are returned, rather than in their order before splitting, and perform the corresponding consistency operations (out-of-order return, out-of-order calculation). This reduces the delay of the third data's consistency operation process and, in turn, the subsequent delay in obtaining the parity information from the result of that consistency operation.
  • In a possible implementation, the above first entry information includes a fourth preprocessing type and a fourth cache address, and the RAID controller is specifically configured to: preprocess the first data according to the fourth preprocessing type, and use the preprocessed first data to update the data at the fourth cache address to obtain the fourth reference information corresponding to the first data. The RAID controller is further configured to obtain the parity information in the storage array according to the third reference information and the fourth reference information.
  • For the above process in which the RAID controller 510 uses the fourth preprocessing type and the fourth cache address to perform the consistency operation on the first data, refer to the corresponding operation process above, which will not be repeated here.
  • the process of RAID controller 510 generating parity information in the disk according to the third reference information and the fourth reference information is similar to the embodiment shown in FIG. 7.
  • The specific process of updating the parity information is as follows: any two addresses among the fourth cache address and the multiple cache addresses corresponding to the multiple data blocks in the third data may, within the reference cache address, not overlap, completely overlap or partially overlap; the reference cache address may be the union of the fourth cache address and the multiple cache addresses corresponding to the multiple data blocks in the third data.
  • the RAID controller 510 respectively updates the data in the cache addresses corresponding to the multiple data blocks in the first data and the third data according to the calculation process in the foregoing embodiments, and writes the reference information corresponding to the last data block for consistency calculation. After entering the cache address corresponding to the data block, the data stored in the reference cache address is the check information to be updated on the target stripe in the storage array 520 .
  • It should be understood that any one of the data blocks contained in the first data and the third data can serve as the i-th data block shown in FIG. 6, and the RAID controller 510 processes it following the principle of that embodiment: whether the i-th data block is processed in parallel or serially with the current data block is determined by whether the cache addresses of the two blocks overlap. Details are not repeated here; a sketch of this overlap test is given below.
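  • A small C sketch of the dispatch rule just described, under assumed names (range_t, ranges_overlap): two updates may run in parallel only when their cache-address ranges are disjoint, because the XOR update reads and rewrites the data at its cache address.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical descriptor of the cache-address range a block updates.     */
typedef struct {
    uintptr_t base;   /* start of the range */
    size_t    len;    /* length in bytes    */
} range_t;

/* Full or partial overlap forces serial ordering; disjoint ranges allow
 * the i-th block to be updated immediately, in parallel.                   */
static bool ranges_overlap(range_t a, range_t b)
{
    return a.base < b.base + b.len && b.base < a.base + a.len;
}

int main(void)
{
    range_t current = { .base = 0,  .len = 64 };
    range_t ith     = { .base = 32, .len = 64 };   /* partial overlap      */

    if (ranges_overlap(ith, current))
        puts("serial: wait for the current block's cache data to update");
    else
        puts("parallel: update the i-th block's cache data immediately");
    return 0;
}
```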
  • Optionally, in different scenarios, the RAID controller 510 may also use the second data to jointly update the parity information in the target stripe. In this case, the second entry information may include a fifth preprocessing type and a fifth cache address; the RAID controller 510 uses the fifth preprocessing type and the fifth cache address to perform the consistency operation on the second data to obtain fifth reference information, and then obtains the parity information to be updated in the target stripe according to the third reference information, the fourth reference information, and the fifth reference information.
  • It should be understood that when the second data contains multiple data blocks, the second entry information includes the preprocessing types and cache addresses corresponding to those data blocks respectively, and the fifth reference information includes multiple pieces of reference information corresponding to those data blocks; the calculation of each piece of reference information may refer to the foregoing embodiments and is not repeated here.
  • Optionally, the cache unit in which the reference cache address is located may be the same cache unit in which the target cache address is located; details are not repeated here.
  • It should be understood that this embodiment describes, by way of example, that the first entry information may include the first preprocessing type and/or the fourth preprocessing type, and the first cache address and/or the fourth cache address. Those skilled in the art will understand that the first entry information may include Q preprocessing types and Q cache addresses in one-to-one correspondence. Each preprocessing type and its corresponding cache address may correspond to one application scenario in RAID5/6; that is, the Q preprocessing types and Q cache addresses correspond respectively to Q application scenarios, which may be any scenarios in which the first data is obtained from disk for subsequent operations, for example disk data recovery, disk parity update, disk data recovery together with disk parity update, or other scenarios. This application does not limit this; Q is a positive integer. A sketch of such an entry layout is given below.
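  • A short C sketch of an entry holding Q (preprocessing type, cache address) pairs in one-to-one correspondence, one pair per application scenario that consumes the first data; the scenario count and the field names are illustrative assumptions.

```c
#include <stdint.h>

#define Q_SCENARIOS 3   /* assumed: recovery, parity update, or both       */

/* Hypothetical first-entry information: Q preprocessing types and Q cache
 * addresses, with pre_type[i] paired with cache_addr[i] for scenario i.    */
typedef struct {
    int      pre_type[Q_SCENARIOS];
    uint8_t *cache_addr[Q_SCENARIOS];
} first_entry_t;
```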
  • It can be seen that, in this embodiment of the present application, the first entry information also includes the fourth preprocessing type and the fourth cache address. Because the first index information corresponding to the first data is returned to the RAID controller together with the first data, the RAID controller can determine the corresponding first entry information from the first index information and, according to the fourth preprocessing type and fourth cache address contained in it, perform the corresponding consistency operation on the first data without waiting for all the other data blocks in the storage array to be returned to the RAID controller; the consistency operations are performed directly in the order in which the data is returned (out-of-order return, out-of-order calculation), which effectively improves system performance. Meanwhile, because the first data returned by the disk carries the corresponding first index information, the corresponding preprocessing type and cache address can be determined from the first index information and the preset mapping relationship without intermediate buffering of the first data in the cache unit, which reduces the number of cache-unit reads and writes and of bus transfers, and thus reduces the delay of subsequently obtaining the to-be-updated parity information of the storage array from the consistency-operation result of the first data.
  • In a feasible implementation, the RAID controller is further configured to initialize the data at the cache address indicated by the first entry information before receiving the first data.
  • Specifically, the data used for the consistency operation differs between RAID5/6 scenarios; before the RAID controller 510 receives the data required by a given scenario, the data at the corresponding cache addresses is initialized, and this initialization may be a zero-clearing operation. In scenario 1 under RAID5/6, the data used for the consistency operation comes from the storage array 520 (for example, it may include the first data and/or the second data); in scenario 2 under RAID5/6, it comes from the storage array 520 and the host (for example, it may include the first data and the third data); in scenario 3 under RAID5/6, it comes from the storage array 520 and the host (for example, it may include the first data, the second data, and the third data).
  • It should be understood that the above description of scenario 2 under RAID5/6 can be read as the process of generating one kind of parity information. When Q independent pieces of parity information in the target stripe need to be updated, the first entry information includes the corresponding Q preprocessing types and Q cache addresses, and the third entry information likewise includes the corresponding Q preprocessing types and Q cache addresses, where Q is a positive integer. The generation of the Q kinds of parity information can proceed in parallel, and each update proceeds in the same way as the parity-generation process described in the embodiment above; details are not repeated here.
  • It can be seen that, in this embodiment of the present application, before receiving the first data for the subsequent consistency operation, the RAID controller needs to initialize, that is, zero-clear, the data at the cache addresses indicated by the first entry information and/or the second entry information; the RAID controller then performs the corresponding consistency operation on the received first and/or second data blocks, which ensures that the correct to-be-restored data and/or parity information of the disks is generated. A sketch of this zero-clearing step is given below.
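  • A minimal C sketch of the initialization step, with assumed names: on receipt of the target command, and before any data block arrives, every cache range pointed at by the entry information is zero-cleared so that the first XOR update starts from a known state.

```c
#include <string.h>
#include <stdint.h>
#include <stddef.h>

/* Zero-clear the n cache ranges indicated by the entry information.       */
static void init_entry_caches(uint8_t *const addrs[], const size_t lens[],
                              int n)
{
    for (int i = 0; i < n; i++)
        memset(addrs[i], 0, lens[i]);   /* 0 is the XOR identity element   */
}
```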
  • The following describes the process in which the RAID controller 510 restores damaged data in the target stripe of the storage array 520 and updates the parity information in that stripe, i.e. scenario 3. The process follows the corresponding steps of scenarios 1 and 2. Note that when the parity update needs the to-be-restored data of the target stripe, the RAID controller must wait for that data to be restored before updating the parity; when it does not, the recovery of the to-be-restored data and the parity update can proceed in parallel.
  • In addition, this embodiment of the present application only exemplifies using the data in one stripe (the target stripe of this application) to restore damaged data in that stripe and/or to update the corresponding parity information of that stripe; those skilled in the art may use this embodiment to execute, in parallel on the data of the remaining one or more stripes of the storage array 520, any of the three scenarios above or other possible scenarios, which this application does not limit.
  • FIG. 8 is a schematic structural diagram of another data processing device 500 in an embodiment of the present application, refining the functional modules of the RAID controller 510 of the data processing device 500 in FIG. 5. As shown in FIG. 8, the RAID controller 510 may include a management unit 511, a determination unit 512, a parsing unit 513, and an operation unit 514.
  • The management unit 511 is configured to manage the entry information (for example, the first, second, or third entry information) corresponding to the consistency-operation data (which may include one or more of the first, second, or third data). Specifically, the management unit 511 can generate the entry information corresponding to the consistency-operation data according to a target command sent by the host, and clear the data at the cache addresses indicated by that entry information after receiving the target command; the target command indicates the specific application scenario that the RAID controller 510 will subsequently execute.
  • The determination unit 512 is configured to determine the entry information corresponding to the consistency-operation data from the preset mapping relationship according to the index information corresponding to that data. The parsing unit 513 is configured to parse the content of the entry information to obtain the preprocessing type and cache address corresponding to the consistency-operation data, and to send the data, together with its preprocessing type and cache address, to the operation unit 514.
  • The operation unit 514 is configured to perform the corresponding preprocessing on the consistency-operation data according to its preprocessing type, and to use the preprocessed data to update the data at the corresponding cache address, thereby obtaining the to-be-restored data and/or the parity information of the storage array 520. A sketch of the flow through these four units follows.
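  • A compact, runnable C sketch of the flow through the four units of FIG. 8. The unit numbering follows the figure, but every function body here is a toy stand-in (identity preprocessing, a single static cache region), not an interface defined by this application.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

typedef struct { int pre_type; uint8_t *cache_addr; } item_t;

static uint8_t cache[8];                       /* toy cache region          */

/* unit 512: preset mapping, index information -> entry information        */
static item_t determine(uint32_t index_info)
{
    (void)index_info;                          /* single-entry toy table    */
    return (item_t){ .pre_type = 0, .cache_addr = cache };
}

/* unit 513: parse the entry to get preprocessing type and cache address   */
static item_t parse(item_t entry) { return entry; }

/* unit 514: preprocess (identity here) and XOR-update the cache data      */
static void operate(item_t it, const uint8_t *block, size_t n)
{
    for (size_t i = 0; i < n; i++)
        it.cache_addr[i] ^= block[i];
}

int main(void)
{
    uint8_t block[8] = {1, 1, 2, 3, 5, 8, 13, 21};
    /* unit 511 (management) would have built the entries and cleared the
     * caches on the target command; the static cache above starts at 0.   */
    operate(parse(determine(0)), block, sizeof block);
    printf("%d\n", cache[4]);   /* prints 5 */
    return 0;
}
```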
  • FIG. 9 is a schematic diagram of a data flow in an embodiment of the present application, which is used to describe a process in which a disk sends data to a RAID controller.
  • As shown in FIG. 9, data flow ④ describes the process in which the storage array 520 returns consistency-operation data to the RAID controller 510 and the RAID controller 510 performs the corresponding consistency operation on the received data (this may be the calculation process of the three scenarios above or of other RAID5/6 scenarios). It can be seen that the storage array 520 can send the consistency-operation data and the index information corresponding to that data directly to the RAID controller 510; compared with the prior-art acquisition method (data flows ⑤ and ⑥ in FIG. 9), no buffering through the cache unit is needed, which reduces the number of cache-unit reads and writes and hence the number of bus transfers and the system power consumption. In addition, because part of the data-transfer process is omitted, this embodiment also reduces the delay of obtaining the consistency-operation result.
  • FIG. 10 illustrates the bus-transfer counts and data read/write counts in an embodiment of the present application. Under RAID6, the prior-art process of acquiring storage-array data for a parity update is as follows: the storage array first sends the data to the cache unit; the RAID controller reads the data from the cache unit and performs the corresponding consistency operation on it to obtain new P data and new Q data; finally, the new P data and new Q data are written into the cache unit. In this embodiment of the present application, the process is: the storage array sends the data directly to the RAID controller, which performs the corresponding consistency operation to obtain the new P data and new Q data; finally, the new P data and new Q data are written into the cache unit.
  • When writing the new P data into the cache unit, the RAID controller first reads out the data at the cache address to which the new P data will be written, and then writes the new P data to that address in the cache unit; this includes one read and one write of the cache unit, and two data-transfer passes over the system bus, as sketched below.
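  • A one-function C sketch of that read-modify-write, with assumed names: committing the new P data costs one cache read plus one cache write (1R1W), i.e. two transfers over the system bus.

```c
#include <string.h>
#include <stdint.h>
#include <stddef.h>

/* Commit new P data to the cache unit: read out the data currently at the
 * target cache address (bus transfer #1), then write the new P data to it
 * (bus transfer #2) — one read and one write of the cache unit.            */
static void commit_new_p(uint8_t *cache_addr, const uint8_t *new_p,
                         uint8_t *readout, size_t n)
{
    memcpy(readout, cache_addr, n);   /* read:  1R, bus transfer #1         */
    memcpy(cache_addr, new_p, n);     /* write: 1W, bus transfer #2         */
}
```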
  • The new P data or new Q data above may correspond to the third reference information and the fourth reference information of the foregoing embodiments. The above is described under the condition that the cache unit is outside the RAID controller; it should be understood that the cache unit may also be located inside the RAID controller, which this application does not limit. The bus-transfer counts and data read/write counts of the prior art and of this embodiment are compared below under three conditions.
  • Condition 1: in scenario 1 under RAID6 (disk data recovery), with the cache unit buffer located outside the RAID controller, this embodiment of the present application reduces the number of bus transfers and of cache-unit reads and writes relative to the prior art (see the scenario-1 statistics in Table 1). The scenario-1 counts in Table 1 are an example under the single-bad-disk condition in RAID6.
  • In the prior art, as shown in FIG. 10, the storage-array data is first written into the cache unit (1 bus transfer, 1 cache-unit access); the RAID controller then obtains the storage-array data from the cache unit (1 bus transfer, 1 cache-unit access); the RAID controller performs the corresponding consistency operation on the obtained storage-array data to obtain the to-be-restored data and writes that data into the cache unit (2 bus transfers, 2 cache-unit accesses). In total, under condition 1 the prior art uses 4 bus transfers and 4 cache-unit reads/writes.
  • In this embodiment of the present application, the RAID controller first obtains the storage-array data directly from the disks, one data transfer over the bus; the RAID controller then performs the corresponding consistency operation on the storage-array data to obtain the to-be-restored data and writes it into the cache unit (2 bus transfers, 2 cache-unit accesses). In total, under condition 1 this embodiment uses 3 bus transfers and 2 cache-unit reads/writes; a toy accounting model of these two tallies is given below.
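  • The condition-1 arithmetic, reproduced as a toy C accounting model; the step decomposition mirrors the two paragraphs above, and nothing here is an interface of this application. Conditions 2 and 3 can be tallied the same way.

```c
#include <stdio.h>

struct tally { int bus, cache; };

static void step(struct tally *t, int bus, int cache)
{
    t->bus   += bus;
    t->cache += cache;
}

int main(void)
{
    struct tally prior = {0, 0}, ours = {0, 0};

    /* prior art: array -> cache unit, cache unit -> controller,
     * to-be-restored data -> cache unit (read-modify-write)                */
    step(&prior, 1, 1);
    step(&prior, 1, 1);
    step(&prior, 2, 2);

    /* this embodiment: array -> controller directly,
     * to-be-restored data -> cache unit (read-modify-write)                */
    step(&ours, 1, 0);
    step(&ours, 2, 2);

    printf("prior art : %d bus transfers, %d cache accesses\n",
           prior.bus, prior.cache);                    /* 4, 4 */
    printf("embodiment: %d bus transfers, %d cache accesses\n",
           ours.bus, ours.cache);                      /* 3, 2 */
    return 0;
}
```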
  • Condition 2: in scenario 2 under RAID6 (updating the disk parity information), with the cache unit buffer located outside the RAID controller, the bus data-transfer counts and cache-unit read/write counts of the prior art and of this embodiment differ as follows.
  • In the prior art, as shown in FIG. 10, the storage-array data is first written into the cache unit over the bus (1 bus transfer, 1 cache-unit access). The parity information is then updated using the storage-array data in the cache unit. When updating the P data in the parity information: the RAID controller first reads the storage-array data from the cache unit (1 bus transfer, 1 cache-unit access); it then performs the corresponding consistency operation on the storage-array data and the to-be-written data to obtain the new P data and writes the new P data into the cache unit, which requires 2 data transfers over the bus. Specifically, when writing the new P data into the cache unit, the data at the target cache address must first be read out and the new P data then written to that address, so the cache unit undergoes one read and one write (1R1W); this step therefore costs 2 bus transfers and 2 cache-unit accesses. It can be seen that updating the P data uses 4 bus transfers and 4 cache-unit accesses. It should be understood that the update of the Q data (the second kind of parity information in RAID6) is the same as that of the P data and is not repeated here. In total, under condition 2 the prior art uses 8 bus transfers and 8 cache-unit reads/writes.
  • In this embodiment of the present application, the RAID controller first obtains the storage-array data directly from the storage array, one data transfer over the bus; the RAID controller then performs the corresponding consistency operation on the storage-array data and the to-be-written data to obtain the new P data. The process of writing the new P data into the cache unit is the same as the corresponding prior-art process. The P-data update therefore uses 3 bus transfers and 2 cache-unit accesses, and the Q-data update is the same as the P-data update. In total, under condition 2 this embodiment uses 6 bus transfers and 4 cache-unit reads/writes.
  • Condition 3: in scenario 3 under RAID6 (disk data recovery plus update of the parity information on the disks), under the conditions that the cache unit buffer is located outside the RAID controller, that two data disks in RAID6 are bad, and that the parity information is updated using only the old data returned from the disks, the bus-transfer counts and cache-unit read/write counts for the parity update and the data recovery are given in the corresponding statistics in Table 1.
  • In the prior art, as shown in FIG. 10, the disk-data-recovery part proceeds as follows: the storage-array data is first written into the cache unit (1 bus transfer, 1 cache-unit access). To recover the damaged data of either of the two bad data disks, the RAID controller obtains the corresponding storage-array data from the cache unit (1 bus transfer, 1 cache-unit access), recovers the damaged data of that disk from the obtained data to get its to-be-restored data, and writes the to-be-restored data into the cache unit (2 bus transfers, 2 cache-unit accesses). Recovering the damaged data of both bad disks therefore uses 7 bus transfers and 7 cache-unit accesses. Because no further reads from the storage array are needed while updating the parity information, the P-data update proceeds as follows: the RAID controller first obtains the storage-array data from the cache unit (1 bus transfer, 1 cache-unit access); it then performs the corresponding consistency operation on the obtained data and the to-be-written data to obtain the new P data and writes the new P data into the cache unit (2 bus transfers, 2 cache-unit accesses). The Q-data update is the same as the P-data update and is not repeated here. The parity update therefore uses 6 bus transfers and 6 cache-unit accesses. From the above, under condition 3 the prior art uses 13 bus transfers and 13 cache-unit reads/writes.
  • In this embodiment of the present application, the RAID controller first obtains the storage-array data directly from the disks, one data transfer over the bus; the RAID controller then recovers the damaged data of either bad disk from the obtained old storage-array data to get that disk's to-be-restored data, and writes the to-be-restored data into the cache unit (2 bus transfers, 2 cache-unit accesses). Recovering the damaged data of both bad disks therefore uses 5 bus transfers and 4 cache-unit accesses. Because no further reads from the storage array are needed while updating the parity information, the P-data update proceeds as follows: the RAID controller first obtains the computed to-be-restored data from the cache unit (1 bus transfer, 1 cache-unit access); it performs the corresponding consistency operation on the to-be-restored data and the to-be-written data to obtain the new P data, and writes the new P data into the cache unit (2 bus transfers, 2 cache-unit accesses). The Q-data update mirrors the P-data update and reuses the already-fetched data, so the parity update uses 5 bus transfers and 5 cache-unit accesses in total. From the above, under condition 3 this embodiment uses 10 bus transfers and 9 cache-unit reads/writes.
  • Table 1: Bus-transfer counts and cache-unit read/write counts under different conditions, prior art vs. the embodiment of the present application:

        Condition                                      Prior art           This embodiment
                                                       bus / cache R+W     bus / cache R+W
        1  Scenario 1: data recovery (1 bad disk)       4  /  4             3  /  2
        2  Scenario 2: parity update                    8  /  8             6  /  4
        3  Scenario 3: recovery + parity update        13  / 13            10  /  9
           (2 bad data disks)
  • It should be understood that the embodiments of the present application may also be applied to other RAID5/6 scenarios in which the RAID controller obtains data from the storage array to perform other consistency operations; this application does not specifically limit this. It can be seen that when the method of the embodiments of the present application is used to obtain data from the storage array for subsequent consistency operations, the number of cache-unit reads and writes and the number of bus transfers are effectively reduced, the demand for bus bandwidth is lowered, and system power consumption is reduced. In addition, although the data the RAID controller obtains from the storage array differs between scenarios, within the same scenario the prior art and the embodiments of the present application obtain the same data from the storage array; for ease of accounting, therefore, the bus transfers and cache-unit reads/writes of the data-acquisition step were each counted as one above, for both the embodiments of the present application and the prior art.
  • FIG. 11 is a schematic diagram 1100 of a hardware structure of a RAID controller provided in an embodiment of the present application.
  • the RAID controller includes a processor 1101 , a memory 1102 , an interface circuit 1103 and a bus 1104 .
  • The interface circuit 1103 may be coupled with the storage array.
  • The processor 1101 is configured to: receive, through the interface circuit 1103, the first data on the target stripe in the storage array and the first index information corresponding to the first data, the first data being any one of the data on the target stripe; determine, based on the first index information, the first entry information corresponding to the first data from the preset mapping relationship, where the preset mapping relationship is generated based on stripe consistency and the first entry information indicates the preprocessing type and cache address corresponding to the first data in the consistency operation; and perform the corresponding preprocessing on the first data according to the preprocessing type and use the preprocessed first data to update the data at the cache address. The memory 1102 is configured to store the first entry information. The processor 1101, the memory 1102, and the interface circuit 1103 exchange data over the bus 1104.
  • The memory 1102 includes but is not limited to random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or portable read-only memory (compact disc read-only memory, CD-ROM). The processor 1101 may be one or more central processing units (CPUs); where the processor 1101 is a CPU, it may be a single-core or a multi-core CPU.
  • In a feasible implementation, the storage array includes M disks and the target stripe includes M stripe units respectively located on the M disks, where M is an integer greater than 2; the first data is one or all of the data blocks on any one of the M stripe units. A sketch of this geometry follows.
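  • A short C sketch of the stripe geometry described above; the field names and the value of M are illustrative assumptions (M = 5 echoes the RAID6 example of FIG. 2).

```c
#include <stdint.h>

#define M 5   /* assumed disk count; the text requires M > 2 */

/* One stripe unit per disk; within a stripe, all M units share the same
 * start position and length on their respective disks.                    */
typedef struct {
    int      disk;     /* which of the M disks holds this unit             */
    uint64_t offset;   /* start position of the unit on that disk          */
    uint64_t length;   /* unit length, identical for all M units           */
} stripe_unit_t;

typedef struct {
    stripe_unit_t unit[M];   /* the target stripe's M stripe units         */
} stripe_t;
```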
  • In a feasible implementation, the target stripe further includes second data, and the storage array is further configured to send the second data and the second index information corresponding to the second data to the RAID controller; the second data may be sent before or after the first data.
  • FIG. 12 is a schematic flow chart of a data processing method 1200 provided by an embodiment of the present application; the method is applicable to either of the data processing devices of FIG. 5 and FIG. 8 and to equipment containing such a device. The method includes but is not limited to the following steps (a minimal end-to-end sketch follows the step list):
  • Step S1201: the RAID controller obtains the first data on the target stripe in the storage array and the first index information corresponding to the first data, the first data being any one of the data on the target stripe.
  • Step S1202: the RAID controller determines, based on the first index information, the first entry information corresponding to the first data from a preset mapping relationship, where the preset mapping relationship is generated based on stripe consistency and the first entry information indicates the preprocessing type and cache address of the consistency operation corresponding to the first data.
  • Step S1203: the first data is preprocessed according to the preprocessing type, and the preprocessed first data is used to update the data at the cache address.
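  • A minimal C driver skeleton of method 1200; every function name and signature is a hypothetical stand-in for the three steps, and get_block is assumed to deliver blocks in whatever order the storage array returns them.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct { int pre_type; uint8_t *cache_addr; } entry_t;

extern int      get_block(uint8_t *blk, uint32_t *index_info);    /* S1201 */
extern entry_t *lookup_entry(uint32_t index_info);                /* S1202 */
extern void     preprocess_and_xor(entry_t *e,
                                   const uint8_t *blk, size_t n); /* S1203 */

void method_1200(void)
{
    uint8_t  blk[4096];
    uint32_t idx;

    /* Blocks may arrive in any order; each carries its index information. */
    while (get_block(blk, &idx))
        preprocess_and_xor(lookup_entry(idx), blk, sizeof blk);
}
```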
  • In a feasible implementation, the storage array includes M disks and the target stripe includes M stripe units respectively located on the M disks, where M is an integer greater than 2; the first data is one or all of the data blocks on any one of the M stripe units. In a feasible implementation, the target stripe further includes second data, and the method further includes: sending, by the storage array, the second data and the second index information corresponding to the second data to the RAID controller, where the second data may be sent before or after the first data.
  • In a feasible implementation, the first entry information includes the first preprocessing type and the first cache address, and preprocessing the first data according to the preprocessing type and updating the data at the cache address with the preprocessed first data includes: preprocessing, by the RAID controller, the first data according to the first preprocessing type, and using the preprocessed first data to update the data at the first cache address to obtain the first reference information corresponding to the first data.
  • The method further includes: determining, by the RAID controller based on the second index information, the second entry information corresponding to the second data from the preset mapping relationship, where the second entry information includes the second preprocessing type and the second cache address; preprocessing the second data according to the second preprocessing type and using the preprocessed second data to update the data at the second cache address to obtain the second reference information corresponding to the second data; and obtaining the to-be-restored data of the storage array according to the first reference information and the second reference information.
  • In a feasible implementation, the method further includes: receiving, by the RAID controller, third data to be written into the storage array and third index information corresponding to the third data; determining, based on the third index information, the third entry information corresponding to the third data from the preset mapping relationship, where the third entry information includes the third preprocessing type and the third cache address; and preprocessing the third data according to the third preprocessing type and using the preprocessed third data to update the data at the third cache address to obtain the third reference information corresponding to the third data.
  • In a feasible implementation, the first entry information includes the fourth preprocessing type and the fourth cache address, and preprocessing the first data according to the preprocessing type and updating the data at the cache address with the preprocessed first data includes: preprocessing, by the RAID controller, the first data according to the fourth preprocessing type, and using the preprocessed first data to update the data at the fourth cache address to obtain the fourth reference information corresponding to the first data. The method further includes: obtaining, by the RAID controller, the parity information of the storage array according to the third reference information and the fourth reference information.
  • In a feasible implementation, the method further includes: before receiving the first data, initializing, by the RAID controller, the data at the cache address indicated by the first entry information.
  • An embodiment of the present application further provides a computer storage medium storing a computer program; when part of that program is executed by the processor (not shown in FIG. 5) of the data processing device 500, the processor can perform some or all of the steps described in any of the method embodiments above. The computer storage medium may be a cache unit (not shown in FIG. 5) included in the data processing device 500.
  • An embodiment of the present application further provides a computer program comprising instructions; when part of that program is executed by the processor of the data processing device 500, the processor may perform some or all of the steps described in any of the method embodiments above.
  • In the several embodiments provided in this application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are only illustrative: the division into units described above is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between devices or units may be electrical or take other forms. The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.


Abstract

Embodiments of this application provide a data processing device and a data processing method. The device includes a redundant array of independent disks (RAID) controller and a storage array coupled to the RAID controller. The RAID controller is configured to obtain first data on a target stripe in the storage array and first index information corresponding to the first data, the first data being any one of the data on the target stripe. The RAID controller is configured to determine, based on the first index information, corresponding first entry information from a preset mapping relationship, where the first entry information is used to indicate the preprocessing type and cache address corresponding to the first data in a consistency operation, and to perform the corresponding consistency operation on the first data according to the first entry information. Embodiments of this application can reduce the number of reads and writes of the first data in the cache unit and the number of its transfers over the bus, lower the demand on system bandwidth, and thereby reduce system power consumption.

在一种可行的实施方式中,接收待写入存储阵列的第三数据和与第三数据对应的第三索引信息;基于第三索引信息从预设的映射关系中确定与第三数据对应的第三表项信息;其中,第三表项信息包含第三预处理类型和第三缓存地址;根据第三预处理类型对第三数据块进行预处理,并利用预处理后的第三数据对第三缓存地址中的数据进行更新,得到第 三数据对应的第三参考信息。
应当理解,上述第三数据可以是主机向RAID控制器发送的,后续将写入存储阵列520中的数据;其中,第三数据可以包括一个或多个数据块,且该多个数据块可以以任意顺序返回RAID控制器,且每个数据块返回时会同时携带与每个数据块对应的索引信息。
具体地,当第三数据包含一个数据块时,RAID控制器510可以根据第三索引信息从预设的映射关系中确定与第三数据对应的第三表项信息。利用第三表项信息包含第三预处理类型和第三缓存地址对第三数据块进行对应的一致性运算;第三数据一致性运算的过程可以参照第一数据一致性运算中的对应过程,此处不再赘述。
应当理解,当第三数据包括多个数据块时,每个数据块对应的一致性运算过程与上述第三数据作为一个数据块时的一致性运算过程对应相同,此处不再赘述。
可以看出,在本申请实施例中,RAID控制器还可以对从主机返回的第三数据进行相应的一致性运算,得到第三数据对应的第三参考信息。当第三数据包括多个数据块时,由于每个数据块携带有对应的索引信息,因而RAID控制器可以不用依据每个数据块拆分前的顺序,直接依据每个数据块的返回顺序对每个数据块上述相应一致性运算(即乱序返回,乱序计算),从而有效提升系统性能;同时,由于可以进行乱序计算,因而可以降低第二数据一致性运算过程的延时,进而降低后续根据第二数据一致性运算结果得到校验信息的延时。
在一种可行的实施方式中,上述第一表项信息包括第四预处理类型和第四缓存地址;RAID控制器具体用于:根据第四预处理类型对第一数据进行预处理,并利用预处理后的第一数据对第四缓存地址中的数据进行更新,得到第一数据对应的第四参考信息;RAID控制器还用于:根据第三参考信息和第四参考信息,得到存储阵列中的校验信息。
具体地,上述RAID控制器510利用第四预处理类型和第四缓存地址对第一数据进行一致性运算的过程可以对应参照利用第一预处理类型和第一缓存地址对第一数据进行一致性运算的对应过程,此处不再赘述。
进一步,RAID控制器510根据第三参考信息和第四参考信息生成磁盘中的校验信息的过程与图7所示实施例类似,当第三数据包含多个数据块时,校验信息的更新过程具体如下:上述第四缓存地址和第三数据中多个数据块对应的多个缓存地址中任意两个缓存地址在参考缓存地址中可以不重叠、完全重叠或部分重叠;参考缓存地址可以是上述第四缓存地址和第三数据中多个数据块对应的多个缓存地址的并集。RAID控制器510根据前述实施例中的计算过程分别更新第一数据和第三数据中多个数据块分别对应缓存地址中的数据,待将最后一个进行一致性运算的数据块对应的参考信息写入该数据块对应的缓存地址后,参考缓存地址中所存储的数据即为存储阵列520中目标条带上的待更新校验信息。
应当理解,RAID控制器510利用上述第一数据和第三数据更新校验信息的计算顺序可以参照图6实施例中的对应过程。具体地,上述第一数据和第三数据所包含的多个数据块中的任一数据块可以作为图6中所示的第i个数据块,RAID控制器510对第i个数据块的处理顺序遵循图6实施例中的原则,即根据第i个数据块和当前数据块对应缓存地址是否发生重合来确定第i个数据块与当前数据块的处理过程是并行还是串行,此处不再赘述。
可选地,在不同场景下,RAID控制器510还可利用第二数据共同完成目标条带中校验信息的更新。此时,第二表项信息可以包含第五预处理类型和第五缓存地址,RAID控制器510利用第五预处理类型和第五缓存地址对第二数据进行一致性运算,得到第五参考信息;然后根据第三参考信息、第四参考信息和第五参考信息得到目标条带中的待更新校验信息。
具体地,利用第五预处理类型和第五缓存地址对第二数据进行一致性运算的过程,以及根据第三参考信息、第四参考信息和第五参考信息得到目标条带中的待更新校验信息的过程可以参见前述实施例的具体描述,此处不再赘述。
应当理解,当第二数据包含多个数据块时,第二表项信息中包含与多个数据块分别对应的预处理类型和缓存地址,第五参考信息包含分别与该多个数据块对应的多个参考信息;其中,该多个参考信息中每个参考信息的计算过程(即每个数据块的一致性运算过程)可以参见前述实施例,此处不再赘述。
可选地,上述参考缓存地址所位于的缓存单元可以与目标缓存地址位于的缓存单元相同,此处不再赘述。
应当理解,本申请实施例示例性地介绍了第一表项信息可以包含第一预处理类型和/或第四预处理类型,第一缓存地址和/或第四缓存地址。本领域技术人员应当理解第一表项信息中可以包含Q个预处理类型和Q个缓存地址;其中,该Q个预处理类型和Q个缓存地址一一对应。每个预处理类型和与该预处理类型对应的缓存地址可以对应RAID5/6中一个应用场景,即上述Q个预处理类型和Q个缓存地址分别对应Q种应用场景,该Q种应用场景可以是任意从磁盘获取第一数据进行后续操作的场景,例如,磁盘数据恢复、磁盘校验信息更新,以及磁盘数据恢复和磁盘校验信息更新或其它场景,本申请对此不限定,Q为正整数。
可以看出,在本申请实施例中,第一表项信息中还包含第四预处理类型和第四缓存地址。由于第一数据返回RAID控制器的同时,与第一数据对应的第一索引信息也一同返回RAID控制器,进而RAID控制器可以根据该第一索引信息确定对应的第一表项信息,并根据每个第一表项信息包含的第四预处理类型和第四缓存地址对第一数据块进行相应的一致性运算,无需等待存储阵列中其余数据块全部返回RAID控制器,直接依据数据的返回顺序进行相应的一致性运算(即乱序返回,乱序计算),从而可以有效地提升系统性能。同时,由于磁盘返回的第一数据携带有对应的第一索引信息,根据可以依据第一索引信息和预设映射关系确定对应的预处理类型和缓存地址,无需经过缓存单元对第一数据进行中间缓存,从而降低缓存单元读写次数以及总线传输次数,进而降低后续根据第一数据的一致性运算结果得到存储阵列待更新校验信息的延时。
在一种可行的实施方式中,上述RAID控制器还用于:在接收第一数据之前,对第一表项信息所指示缓存地址中的数据进行初始化。
具体地,在RAID5/6中不同场景下,用于进行一致性运算的数据不同,在RAID控制器510接收不同场景所需的数据之前,对用于进行一致性运算的数据所对应缓存地址中的数据进行初始化,该初始化过程可以是清零处理。
其中,在RAID5/6中场景一下,用于进行一致性运算的数据来自于存储阵列520(例 如,可以包括第一数据和/或第二数据);在RAID5/6中场景二下,用于进行一致性运算的数据来自于存储阵列520和主机(例如,可以包括第一数据和第三数据);在RAID5/6中场景三下,用于进行一致性运算的数据来自于存储阵列520和主机(例如,可以包括第一数据、第二数据和第三数据)。
应当理解,上述对RAID5/6下场景二的描述可以理解为生成一种校验信息的过程。当目标条带中存在Q个独立的校验信息需要更新时,上述第一表项信息包含对应的Q个预处理类型和Q个缓存地址;第三表项信息包含对应的Q个预处理类型和Q个缓存地址;其中,Q为正整数。Q种校验信息的生成过程可以并行进行,每种校验信息的更新过程与上述实施例描述的校验信息生成过程对应相同,此处不再赘述。
可以看出,在本申请实施例中,RAID控制器在接收第一数据进行后续一致性运算之前,需要对第一表项信息和/或第二表项信息所指示缓存地址中的数据进行初始化,即清零处理;然后RAID控制器开始对接收到的第一数据块和/或第二数据块进行相应的一致性运算,从而确保生成正确的磁盘待恢复数据和/或校验信息。
下面将描述RAID控制器510恢复存储阵列520目标条带中损坏数据,并更新目标条带中校验信息的过程,即场景三。
具体地,场景三中的过程可以参照上述场景一和场景二中相应的过程,此处不再赘述。应当注意,在场景三中,当更新校验信息需要用到目标条带中待恢复数据时,RAID控制器需要等待目标条带中的待恢复数据恢复后,才能进行校验信息的更新;当更新校验信息的过程不需要用到待恢复数据时,待恢复数据的恢复过程和校验信息的更新过程可以并行进行。
此外,本申请实施例只示例出利用一个条带(即本申请中的目标条带)中的数据进行该条带中损坏数据的恢复和/或该条带中对应校验信息的更新过程,本领域技术人员可以采用本申请实施例对存储阵列520中其余一个或多个条带中的数据分别并行执行上述三种场景中的任一场景或其它可能的场景,本申请对此不限定。
请参见图8,图8为本申请实施例中另一种数据处理装置500的结构示意图,作为对图5中数据处理装置500中RAID控制器510功能模块的细化。如图8所示,RAID控制器510可以包括管理单元511、确定单元512、解析单元513和运算单元514。管理单元511用于管理一致性运算的数据(例如,该一致性运算的数据可包括上述第一数据、第二数据或第三数据中的一个或多个)所对应的表项信息(例如,上述第一表项信息、第二表项信息或第三表项信息)。具体地,管理单元511可以根据主机发送的目标命令生成一致性运算数据所对应的表项信息,并在接收到目标命令后对表项信息所指示缓存地址中的数据进行清零;其中,目标命令指示了RAID控制器510后续将要执行的具体应用场景。确定单元512用于根据一致性运算数据所对应的索引信息从预设的映射关系中确定一致性运算数据对应的表项信息,其具体过程可参前述实施例的描述,此处不再赘述。解析单元513用于解析表项信息中的内容,得到用于一致性运算的数据对应的预处理类型和缓存地址;并将用于一致性运算的数据,以及一致性运算的数据对应的预处理类型和缓存地址发送到运算 单元514。运算单元514用于根据一致性运算的数据对应的预处理类型对用于进行一致性运算的数据进行相应的预处理,得到预处理后的一致性运算的数据;利用预处理后的一致性运算数据对一致性运算的数据对应缓存地址中的数据进行更新,得到存储阵列520中待恢复数据和/或校验信息。运算单元514利用一致性运算的数据生成存储阵列中待恢复数据和/或校验信息的过程的具体过程可参见前文对应实施例,此处不再赘述。
请参见图9,图9为本申请实施例中一种数据流示意图,用于描述磁盘向RAID控制器发送数据的过程。如图9所示,数据流④描述了本申请实施例中存储阵列520向RAID控制器510返回一致性运算数据的过程,RAID控制器510对接收一致性运算数据进行相应的一致性运算(该一致性运算可以是上述三种场景或其它RAID5/6场景中的计算过程)。
可以看出,在本申请实施例中,存储阵列520可以直接将一致性运算数据和数据对应的索引信息发送到RAID控制器510,相对于现有技术中数据获取方式(如图9中数据流⑤和数据流⑥所示)来说,无需经过缓存单元的缓存过程,降低缓存单元的读写次数,进而减少总线传输次数和系统功耗;此外,由于省去部分数据传输过程,因而本申请实施例还可以降低得到一致性运算结果的延时。
请参见图10,图10是本申请实施例提供的一种总线传输次数和数据读写次数流程示意图。如图10所示,RAID6下,现有技术中获取存储阵列数据进行校验信息更新的过程如下:存储阵列先将数据发送到缓存单元中,RAID控制器从缓存单元读取数据,然后RAID控制器对读取数据进行相应的一致性运算,得到新P数据和新Q数据;最后将新P数据和新Q数据写入缓存单元。在本申请实施例中获取存储阵列数据进行校验信息更新的过程如下:存储阵列直接将数据发送到RAID控制器中,RAID控制器对读取数据进行相应的一致性运算,得到新P数据和新Q数据;最后将新P数据和新Q数据写入缓存单元。
其中,在将新P数据写入缓存单元过程中,RAID控制器会先将新P数据待写入缓存地址中的数据读取出来,然后将新P数据写入缓存单元对应的缓存地址中,此过程包含一次读取和一次写入过程,系统总线包含两次数据传输过程。上述新P数据或新Q数据可以是前述实施例中的第三参考信息和第四参考信息。上述过程是以缓存单元在RAID控制器外部的条件下进行描述的。应当理解,缓存单元也可以位于RAID控制器内部,本申请不限定。
下面将详细描述在三种不同条件下,现有技术和本申请实施例中总线传输次数和数据读写次数示意图。
条件一:在RAID6下的场景一(磁盘数据恢复)中,缓存单元buffer位于RAID控制器外部的条件下,本申请实施例相对于现有技术可降低总线传输次数和缓存单元的读写次数(具体可参见表1中的场景一中的次数统计)。表1中场景一下的总线传输次数和缓存单元的读写次数是RAID6中单坏盘条件下的示例。
在现有技术中,如图10所示,首先将存储阵列数据写入缓存单元,此过程总线传输次数为1次,缓存单元读写次数为1次;然后RAID控制器从缓存单元中获取存储阵列数据,此过程总线传输次数为1次,缓存单元读写次数为1次;RAID控制器基于获取到的存储阵 列数据进行相应的一致性运算得到待恢复数据,并将待恢复数据写入缓存单元中,此过程总线传输次数为2次,缓存单元读写次数为2次。综上,条件一下,使用现有技术时,总线传输次数为4次,缓存单元读写次数为4次。
在本申请实施例中,如图10所示,RAID控制器首先直接从磁盘获取存储阵列数据,此过程经过总线进行1次数据传输;然后RAID控制器对存储阵列数据进行相应的一致性运算得到待恢复数据,并将该待恢复数据写入缓存单元,此过程总线传输次数为2次,缓存单元读写次数为2次。综上,条件一下,采用申请实施例时,总线传输次数为3次,缓存单元读写次数为2次。
条件二:在RAID6下的场景二(更新磁盘校验信息)中,缓存单元buffer位于RAID控制器外部条件下,现有技术和本申请实施例中总线数据传输次数和缓存单元数据读写次数的区别。
在现有技术中,如图10所示,首先将存储阵列数据写入存储单元,此过程经过总线进行数据传输,总线传输次数为1次,缓存单元读写次数为1次。然后利用缓存单元中的存储阵列数据来更新校验信息。在更新校验信息中的P数据时:RAID控制器首先从缓存单元读取存储阵列数据,此过程经过总线进行1次数据传输,缓存单元读写次数为1次;然后对存储阵列数据和待写入数据进行相应的一致性运算得到新P数据,并将新P数据写入缓存单元,此过程需要经过总线进行2次数据传输;具体的:在将新P数据写入缓存单元时,需要先将新P数据在缓存单元中待写入缓存地址中的数据读出,然后将新P数据写入待写入缓存地址,此过程缓存单元经历一次读过程和一次写过程(1R1W),即此过程中的总线传输次数和缓存单元读写次数分别为2次。可以看出,在更新P数据的过程中,总线传输次数为4次,缓存单元读写次数为4次。应当理解,校验信息中的Q数据(RAID6中的第二种校验信息)的更新过程与P数据的更新过程对应相同,此处不再赘述。综上,条件二下,使用现有技术时,总线传输次数为8次,缓存单元的读写次数为8次。
在本申请实施例中,如图10所示,RAID控制器首先直接从获取存储阵列数据,此过程经过总线进行1次数据传输;然后RAID控制器对存储阵列数据和待写入数据进行相应的一致性运算得到新P数据。RAID控制器向缓存单元写入新P数据的过程与现有技术对应过程相同。综上,在P数据更新过程中,总线传输次数为3次,缓存单元的读写次数为2次。同理,Q数据的更新过程与P数据对应相同。综上,条件二下,采用本申请实施例时,总线传输次数为6次,缓存单元的读写次数为4次。
条件三:在RAID6下的场景三(磁盘数据恢复且更新磁盘中的校验信息)中,缓存单元buffer位于RAID控制器外部、RAID6中存在双数据坏盘,且只需使用磁盘中返回的旧数据更新校验信息的条件下,在进行校验信息更新和磁盘数据恢复时,总线传输次数和缓存单元的读写次数具体可见表1中对应的次数统计。
在现有技术中,如图10所示,在磁盘数据恢复过程中:首先将存储阵列数据写入缓存单元,此过程总线进行1次数据传输,缓存单元读写次数为1次。对于双数据坏盘中任一数据坏盘中的损坏数据恢复时,RAID控制器从缓存单元中获取对应存储阵列数据,此过程 总线传输次数为1次,缓存单元读写次数为1次;然后RAID控制器基于获取的对应存储阵列数据对上述任一坏盘中的损坏数据进行恢复,得到任一数据坏盘中的待恢复数据,并将该待恢复数据写入缓存单元,此过程中的总线传输次数为2次,缓存单元读写次数为2次。综上可知,对两个坏盘数据中的损坏数据都进行恢复时,总线传输次数为7次,缓存单元读写次数为7次。由于在更新校验信息的过程中无需再次从存储阵列读取数据,在校验信息中P数据更新过程中:RAID控制器首先从缓存单元获取存储阵列数据,此过程经过总线进行1次数据传输,缓存单元读写次数为1次;然后RAID控制器对获取的存储阵列数据和待写入数据进行相应的一致性运算得到新P数据,并将新P数据写入缓存单元,此过程中的总线传输次数为2次,缓存单元读写次数为2次。同理,Q数据的更新过程与P数据对应相同,此处不再赘述。综上,校验信息更新过程中,总线传输次数为6次,缓存单元的读写次数为6次。由上述描述可知,在条件三下,使用现有技术时,总线传输次数为13次,缓存单元读写次数为13次。
在本申请实施例中,如图10所示,RAID控制器首先直接从磁盘获取存储阵列数据,此过程经过总线进行1次数据传输;然后RAID控制器基于获取的存储阵列旧数据对上述任一坏盘中的损坏数据进行恢复,得到任一数据坏盘中的待恢复数据,并将该待恢复数据写入缓存单元,此过程中的总线传输次数为2次,缓存单元读写次数为2次。综上可知,对两个坏盘数据中的损坏数据都进行恢复时,总线传输次数为5次,缓存单元读写次数为4次。由于在更新校验信息的过程中无需再次从存储阵列读取数据,在校验信息中P数据更新过程中:RAID控制器首先从缓存单元获取计算得到的待恢复数据,此过程经过总线进行1次数据传输,缓存单元读写次数为1次,RAID控制器对待恢复数据和待写入数据进行相应的一致性运算得到新P数据,并将新P数据写入缓存单元,此过程中的总线传输次数为2次,缓存单元读写次数为2次。同理,Q数据的更新过程与P数据对应相同,此处不再赘述。综上,校验信息更新过程中,总线传输次数为5次,缓存单元的读写次数为5次。由上述描述可知,在条件三下,采用申请实施例时,总线传输次数为10次,缓存单元读写次数为9次。
Figure PCTCN2021096303-appb-000001
表1:现有技术和本申请实施例中不同条件下总线传输次数和缓存单元数据读写次
应当理解,本申请实施例也可应用到RAID5/6中,RAID控制器从存储阵列获取数据进行其它一致性运算的场景,本申请对此不做具体限定。可以看出,采用本申请实施例中的方法从存储阵列获取数据进行后续一致性运算时,可以有效减少缓存单元的数据读写次数和总线传输次数,降低对总线带宽的需求,并减少系统功耗。
此外,虽然在不同场景下,RAID控制器从存储阵列获取的数据不同,但在同一场景下,采用现有技术和本申请实施例时,RAID控制器从存储阵列中获取的数据相同。因而,为便于统计,上文将本申请实施例和现有技术中从存储阵列获取数据过程中的总线传输次数和缓存单元读写次数分别按一次进行统计。
请参见图11,图11为本申请实施例提供的一种RAID控制器的硬件结构示意图1100。如图11所示,RAID控制器包括处理器1101、存储器1102、接口电路1103和总线1104。接口电路1103可以与存储阵列相耦合。
The processor 1101 is configured to: receive, through the interface circuit 1103, first data on a target stripe in the storage array and first index information corresponding to the first data, where the first data is any one piece of the data on the target stripe; determine, based on the first index information, first entry information corresponding to the first data from a preset mapping relationship, where the preset mapping relationship is generated based on the consistency of the stripe, and the first entry information indicates the preprocessing type and the cache address corresponding to the first data in the consistency operation; and preprocess the first data according to the preprocessing type and update the data at the cache address with the preprocessed first data. The memory 1102 is configured to store the first entry information. The processor 1101, the memory 1102, and the interface circuit 1103 exchange data over the bus 1104.
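As a software analogy of this lookup-and-update path (a sketch only: the entry fields, the type names, and the use of XOR accumulation with an optional GF(2^8) multiply are illustrative assumptions, not the claimed hardware design):

```python
from dataclasses import dataclass

@dataclass
class EntryInfo:
    """Hypothetical entry: index -> preprocessing type and cache address."""
    preprocess: str   # "none" or "gf_mul" (illustrative type names)
    coef: int         # GF(2^8) coefficient, used when preprocess == "gf_mul"
    cache_addr: int

def gf_mul(a: int, b: int) -> int:
    """GF(2^8) multiply with polynomial 0x11D (same helper as the earlier sketch)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = (a << 1) ^ (0x11D if a & 0x80 else 0)
        b >>= 1
    return r

def handle_block(mapping: dict, cache: dict, index: int, data: bytes) -> None:
    """Look up the entry for an incoming block, preprocess it according to
    the entry's type, and XOR-accumulate it at the entry's cache address."""
    e = mapping[index]
    if e.preprocess == "gf_mul":
        data = bytes(gf_mul(e.coef, b) for b in data)
    old = cache.get(e.cache_addr, bytes(len(data)))
    cache[e.cache_addr] = bytes(a ^ b for a, b in zip(old, data))

mapping = {0: EntryInfo("none", 1, 0x0), 1: EntryInfo("gf_mul", 2, 0x40)}
cache = {}
handle_block(mapping, cache, 1, b"\x01\x02\x03\x04")
```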
The memory 1102 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM). The processor 1101 may be one or more central processing units (CPUs); when the processor 1101 is a single CPU, that CPU may be a single-core CPU or a multi-core CPU.
In a feasible implementation, the storage array includes M disks, the target stripe includes M stripe units, and the M stripe units are located on the M disks respectively, where M is an integer greater than 2; the first data is one data block, or all data blocks, on any one of the M stripe units.
In a feasible implementation, the target stripe further carries second data; the storage array is further configured to send the second data and second index information corresponding to the second data to the RAID controller, where the second data is sent before or after the first data.
Specifically, for the specific functions of the processor 1101, the memory 1102, and the interface circuit 1103, refer to the corresponding descriptions in the embodiment of FIG. 5; details are not repeated here.
Refer to FIG. 12, which is a schematic flowchart of a data processing method 1200 provided by an embodiment of the present application. The data processing method is applicable to either of the data processing apparatuses in FIG. 5 and FIG. 8 above, and to devices containing such a data processing apparatus. The method includes, but is not limited to, the following steps:
Step S1201: the RAID controller obtains first data on a target stripe in a storage array and first index information corresponding to the first data, where the first data is any one piece of the data on the target stripe.
Step S1202: the RAID controller determines, based on the first index information, first entry information corresponding to the first data from a preset mapping relationship, where the preset mapping relationship is generated based on stripe consistency, and the first entry information indicates the preprocessing type and the cache address of the consistency operation corresponding to the first data.
Step S1203: the first data is preprocessed according to the preprocessing type, and the data at the cache address is updated with the preprocessed first data.
In a feasible implementation, the storage array includes M disks, the target stripe includes M stripe units, and the M stripe units are located on the M disks respectively, where M is an integer greater than 2; the first data is one data block, or all data blocks, on any one of the M stripe units.
In a feasible implementation, the target stripe further carries second data; the method further includes: sending, by the storage array, the second data and second index information corresponding to the second data to the RAID controller, where the second data is sent before or after the first data.
In a feasible implementation, the first entry information includes a first preprocessing type and a first cache address; preprocessing the first data according to the preprocessing type and updating the data at the cache address with the preprocessed first data includes: preprocessing, by the RAID controller, the first data according to the first preprocessing type, and updating the data at the first cache address with the preprocessed first data, to obtain first reference information corresponding to the first data. The method further includes: determining, by the RAID controller based on the second index information, second entry information corresponding to the second data from the preset mapping relationship, where the second entry information includes a second preprocessing type and a second cache address; preprocessing the second data according to the second preprocessing type, and updating the data at the second cache address with the preprocessed second data, to obtain second reference information corresponding to the second data; and obtaining the data to be recovered in the storage array from the first reference information and the second reference information.
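As a numerical illustration of the last step, assuming the simple single-failure XOR case in which the lost block equals the XOR of all surviving blocks and the two reference accumulations partition those blocks (the partitioning and the block values below are hypothetical):

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a sequence of equal-length blocks together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# First and second reference information: running XORs of the blocks that were
# routed to the first and the second cache address, respectively.
ref1 = xor_blocks([b"\x11\x22", b"\x33\x44"])
ref2 = xor_blocks([b"\x55\x66", b"\x0f\xf0"])

# The data to be recovered is then obtained by combining the two references.
recovered = xor_blocks([ref1, ref2])
```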
In a feasible implementation, the method further includes: receiving, by the RAID controller, third data to be written into the storage array and third index information corresponding to the third data; determining, based on the third index information, third entry information corresponding to the third data from the preset mapping relationship, where the third entry information includes a third preprocessing type and a third cache address; and preprocessing the third data according to the third preprocessing type, and updating the data at the third cache address with the preprocessed third data, to obtain third reference information corresponding to the third data.
In a feasible implementation, the first entry information includes a fourth preprocessing type and a fourth cache address; preprocessing the first data according to the preprocessing type and updating the data at the cache address with the preprocessed first data includes: preprocessing, by the RAID controller, the first data according to the fourth preprocessing type, and updating the data at the fourth cache address with the preprocessed first data, to obtain fourth reference information corresponding to the first data. The method further includes: obtaining, by the RAID controller, the parity information in the storage array from the third reference information and the fourth reference information.
In a feasible implementation, the method further includes: before receiving the first data, initializing, by the RAID controller, the data at the cache address indicated by the first entry information.
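A one-step sketch of that initialization, assuming XOR accumulation so that each mapped cache address must start as an all-zero block (the block size and the addresses are hypothetical):

```python
BLOCK_SIZE = 4096  # hypothetical block size

def init_cache_addresses(cache_addrs, cache: dict) -> None:
    """Zero-fill every cache address referenced by the mapping before data
    arrives, so the XOR accumulation starts from a known clean state."""
    for addr in cache_addrs:
        cache[addr] = bytes(BLOCK_SIZE)  # bytes(n) yields n zero bytes

cache = {}
init_cache_addresses([0x0, 0x1000], cache)
```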
It should be noted that, for the specific flow of the data processing method 1200 described in this embodiment of the present application, refer to the related descriptions of the embodiments of the present application in FIG. 5 and FIG. 8 above; details are not repeated here.
An embodiment of the present application further provides a computer storage medium, where the computer storage medium may store a computer program. When part of the computer program is executed by a processor (not shown in FIG. 5) in the data processing apparatus 500, the processor can perform some or all of the steps of any one of the methods described in the above method embodiments. The computer storage medium may be a cache unit (not shown in FIG. 5) included in the data processing apparatus 500.
An embodiment of the present application further provides a computer program, where the computer program includes instructions. When part of the computer program is executed by the processor in the data processing apparatus 500, the processor can perform some or all of the steps of any one of the methods described in the above method embodiments.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed in one embodiment, refer to the related descriptions of the other embodiments. It should be noted that, for brevity, the foregoing method embodiments are all described as a series of action combinations; however, those skilled in the art should know that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into the above units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical or take other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
The above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features therein, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (21)

  1. A data processing apparatus, wherein the apparatus comprises a redundant array of independent disks (RAID) controller and a storage array coupled to the RAID controller; wherein
    the RAID controller is configured to:
    obtain first data on a target stripe in the storage array and first index information corresponding to the first data, wherein the first data is any one piece of the data on the target stripe;
    determine, based on the first index information, first entry information corresponding to the first data from a preset mapping relationship, wherein the preset mapping relationship is generated based on a consistency operation of the stripe, and the first entry information indicates a preprocessing type and a cache address corresponding to the first data in the consistency operation; and
    preprocess the first data according to the preprocessing type, and update the data at the cache address with the preprocessed first data.
  2. The apparatus according to claim 1, wherein the storage array comprises M disks, the target stripe comprises M stripe units, and the M stripe units are located on the M disks respectively, wherein M is an integer greater than 2;
    the first data is one data block, or all data blocks, on any one of the M stripe units.
  3. The apparatus according to claim 1 or 2, wherein the target stripe further carries second data; the storage array is configured to:
    send the second data and second index information corresponding to the second data to the RAID controller, wherein the second data is sent before or after the first data.
  4. The apparatus according to claim 3, wherein the first entry information comprises a first preprocessing type and a first cache address;
    the RAID controller is specifically configured to:
    preprocess the first data according to the first preprocessing type, and update the data at the first cache address with the preprocessed first data, to obtain first reference information corresponding to the first data;
    the RAID controller is further configured to:
    determine, based on the second index information, second entry information corresponding to the second data from the preset mapping relationship, wherein the second entry information comprises a second preprocessing type and a second cache address;
    preprocess the second data according to the second preprocessing type, and update the data at the second cache address with the preprocessed second data, to obtain second reference information corresponding to the second data; and
    obtain data to be recovered in the storage array from the first reference information and the second reference information.
  5. The apparatus according to claim 2 or 3, wherein the RAID controller is further configured to:
    receive third data to be written into the storage array and third index information corresponding to the third data;
    determine, based on the third index information, third entry information corresponding to the third data from the preset mapping relationship, wherein the third entry information comprises a third preprocessing type and a third cache address; and
    preprocess the third data according to the third preprocessing type, and update the data at the third cache address with the preprocessed third data, to obtain third reference information corresponding to the third data.
  6. The apparatus according to claim 5, wherein the first entry information comprises a fourth preprocessing type and a fourth cache address;
    the RAID controller is specifically configured to:
    preprocess the first data according to the fourth preprocessing type, and update the data at the fourth cache address with the preprocessed first data, to obtain fourth reference information corresponding to the first data;
    the RAID controller is further configured to:
    obtain parity information in the storage array from the third reference information and the fourth reference information.
  7. The apparatus according to any one of claims 1-6, wherein the RAID controller is further configured to:
    initialize the data at the cache address indicated by the first entry information before receiving the first data.
  8. A RAID controller, wherein the RAID controller comprises a processor and an interface circuit; the processor is coupled to a storage array through the interface circuit; wherein
    the processor is configured to:
    receive, through the interface circuit, first data on a target stripe in the storage array and first index information corresponding to the first data, wherein the first data is any one piece of the data on the target stripe;
    determine, based on the first index information, first entry information corresponding to the first data from a preset mapping relationship, wherein the preset mapping relationship is generated based on the consistency of the stripe, and the first entry information indicates a preprocessing type and a cache address corresponding to the first data in the consistency operation; and
    preprocess the first data according to the preprocessing type, and update the data at the cache address with the preprocessed first data.
  9. The RAID controller according to claim 8, wherein the RAID controller comprises a memory, and the memory is configured to store the first entry information.
  10. The RAID controller according to claim 8 or 9, wherein
    the storage array comprises M disks, the target stripe comprises M stripe units, and the M stripe units are located on the M disks respectively, wherein M is an integer greater than 2;
    the first data is one data block, or all data blocks, on any one of the M stripe units.
  11. The RAID controller according to any one of claims 8-10, wherein the target stripe further carries second data; the storage array is configured to:
    send the second data and second index information corresponding to the second data to the RAID controller, wherein the second data is sent before or after the first data.
  12. A data processing method, wherein the method comprises:
    obtaining, by a RAID controller, first data on a target stripe in a storage array and first index information corresponding to the first data, wherein the first data is any one piece of the data on the target stripe; and
    determining, by the RAID controller based on the first index information, first entry information corresponding to the first data from a preset mapping relationship, wherein the preset mapping relationship is generated based on stripe consistency, and the first entry information indicates a preprocessing type and a cache address of the consistency operation corresponding to the first data; preprocessing the first data according to the preprocessing type, and updating the data at the cache address with the preprocessed first data.
  13. The method according to claim 12, wherein the storage array comprises M disks, the target stripe comprises M stripe units, and the M stripe units are located on the M disks respectively, wherein M is an integer greater than 2;
    the first data is one data block, or all data blocks, on any one of the M stripe units.
  14. The method according to claim 12 or 13, wherein the target stripe further carries second data; the method further comprises:
    sending, by the storage array, the second data and second index information corresponding to the second data to the RAID controller, wherein the second data is sent before or after the first data.
  15. The method according to claim 14, wherein the first entry information comprises a first preprocessing type and a first cache address; the preprocessing the first data according to the preprocessing type and updating the data at the cache address with the preprocessed first data comprises:
    preprocessing, by the RAID controller, the first data according to the first preprocessing type, and updating the data at the first cache address with the preprocessed first data, to obtain first reference information corresponding to the first data;
    the method further comprises:
    determining, by the RAID controller based on the second index information, second entry information corresponding to the second data from the preset mapping relationship, wherein the second entry information comprises a second preprocessing type and a second cache address;
    preprocessing the second data according to the second preprocessing type, and updating the data at the second cache address with the preprocessed second data, to obtain second reference information corresponding to the second data; and
    obtaining data to be recovered in the storage array from the first reference information and the second reference information.
  16. The method according to claim 13 or 14, wherein the method further comprises:
    receiving, by the RAID controller, third data to be written into the storage array and third index information corresponding to the third data;
    determining, based on the third index information, third entry information corresponding to the third data from the preset mapping relationship, wherein the third entry information comprises a third preprocessing type and a third cache address; and
    preprocessing the third data according to the third preprocessing type, and updating the data at the third cache address with the preprocessed third data, to obtain third reference information corresponding to the third data.
  17. The method according to claim 16, wherein the first entry information comprises a fourth preprocessing type and a fourth cache address; the preprocessing the first data according to the preprocessing type and updating the data at the cache address with the preprocessed first data comprises:
    preprocessing, by the RAID controller, the first data according to the fourth preprocessing type, and updating the data at the fourth cache address with the preprocessed first data, to obtain fourth reference information corresponding to the first data;
    the method further comprises:
    obtaining, by the RAID controller, parity information in the storage array from the third reference information and the fourth reference information.
  18. The method according to any one of claims 12-17, wherein the method further comprises:
    initializing, by the RAID controller, the data at the cache address indicated by the first entry information before receiving the first data.
  19. A chip system, wherein the chip system comprises at least one processor, a memory, and an interface circuit; the memory, the interface circuit, and the at least one processor are interconnected by lines, and instructions are stored in the memory; when the instructions are executed by the processor, the method according to any one of claims 12-18 is implemented.
  20. A computer-readable storage medium, wherein program instructions are stored in the computer-readable storage medium, and when the program instructions are run on a processor, the method according to any one of claims 12-18 is implemented.
  21. A computer program product, wherein when the computer program product runs on a terminal, the method according to any one of claims 12-18 is implemented.


