WO2023184843A1

WO2023184843A1 - Raid encoding/decoding method and apparatus, device and readable storage medium

Info

Publication number: WO2023184843A1
Application number: PCT/CN2022/115341
Authority: WO
Inventors: 吴睿振; 陈静静; 张永兴; 张旭; 王凛
Original assignee: 苏州浪潮智能科技有限公司
Priority date: 2022-03-30
Filing date: 2022-08-28
Publication date: 2023-10-05
Also published as: CN114416424A; CN114416424B

Abstract

The embodiments of the present application provide a RAID encoding/decoding method. The method comprises: first setting a target set comprising n values that are not equal to each other, and setting as parameters a first preset value and a second preset value that are not equal to each other; and then constructing, in a Galois field, a ternary linear encoding/decoding equation system (three linear equations) to solve for three unknowns.

Description

A RAID encoding and decoding method, device, equipment and readable storage medium

Cross-references to related applications

This application claims priority to the Chinese patent application filed with the China Patent Office on March 30, 2022, with application number CN202210321311.2 and entitled "A RAID encoding and decoding method, device, equipment and readable storage medium", the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of computer technology, and in particular to a RAID encoding and decoding method, device, equipment and readable storage medium.

Background

Currently, multiple disks can be combined into a large-capacity disk group called a disk array (Redundant Array of Independent Disks, RAID). Several schemes derive from this: RAID 0 (provides no data redundancy; at least two disks are joined through software or hardware into one large volume group, and data is written to each disk in turn); RAID 1 (of n disks, n/2 serve as mirror disks, so data written to one disk is simultaneously written to its mirror; n is at least 2); RAID 5 (user data and parity data are stored on the disks together, with each data block and its corresponding parity information on different disks, so no data is lost when one disk fails); and RAID 6 (double parity, so no data is lost when two disks fail at the same time). RAID 5 and RAID 6 are currently the most commonly used.

RAID 5 can tolerate at most 1 erroneous data block per stripe, and RAID 6 at most 2. RAID 6 encoding and decoding schemes require division, and the division may not come out exactly. Inexact division reduces encoding accuracy: when a quotient is not exact, only a fixed number of decimal places can be retained, so the retained data inevitably deviates from the actual data. In addition, when the stripe is large, different data blocks may have the same coefficient in an equation, in which case the equations may not be solvable.

Summary of the invention

In a first aspect, embodiments of the present application provide a RAID encoding and decoding method, including:

Obtain any stripe in RAID that needs to be encoded/decoded, and determine the target set; the target set includes n unequal values, where n is the number of disks corresponding to the stripe;

Determine the first preset value and the second preset value that are not equal to each other;

In response to the stripe including 3 unknown data blocks, use all known data blocks included in the stripe, the 3 unknown data blocks, the target set, the first preset value and the second preset value to construct a ternary linear encoding/decoding equation system in the Galois field;

For the ternary linear encoding/decoding equations, the parameter values calculated by Galois field division in any equation are formed into an array to obtain the first array and the second array;

Select two parameter values corresponding to the same unknown data block in the first array and the second array, and update the first array and the second array with the two parameter values to obtain a new first array and a new second array;

Determine the parameter value corresponding to the unselected unknown data block in the new first array and the new second array, and determine the dividend based on the parameter value corresponding to the unselected unknown data block; and

Three unknown data blocks are determined based on the parameter values corresponding to the unselected unknown data blocks, the dividend, the number of disks corresponding to the stripe, and all known data blocks.

In a second aspect, embodiments of the present application provide a RAID encoding and decoding device, including:

The acquisition module is used to obtain any stripe in RAID that needs to be encoded/decoded, and determine the target set; the target set includes n unequal values, where n is the number of disks corresponding to the stripe;

A first determination module, configured to determine a first preset value and a second preset value that are not equal to each other;

A construction module, configured to, in response to the stripe including 3 unknown data blocks, use all known data blocks included in the stripe, the 3 unknown data blocks, the target set, the first preset value and the second preset value to construct a ternary linear encoding/decoding equation system in the Galois field;

A composition module, configured to, for the ternary linear encoding/decoding equation system, form the parameter values calculated by Galois field division in any equation into an array, to obtain the first array and the second array;

An update module, configured to select two parameter values corresponding to the same unknown data block in the first array and the second array, and update the first array and the second array using the two parameter values to obtain a new first array and a new second array;

The second determination module is used to determine the parameter value corresponding to the unselected unknown data block in the new first array and the new second array, and determine the dividend based on the parameter value corresponding to the unselected unknown data block; and

The third determination module is used to determine three unknown data blocks based on the parameter values corresponding to the unselected unknown data blocks, the dividend, the number of disks corresponding to the stripe, and all known data blocks.

In a third aspect, embodiments of the present application provide an electronic device, including a memory and one or more processors. Computer-readable instructions are stored in the memory. When the computer-readable instructions are executed by the one or more processors, the one or more processors perform the steps of the RAID encoding and decoding method in any embodiment.

In a fourth aspect, embodiments of the present application provide one or more non-volatile computer-readable storage media storing computer-readable instructions. When executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the steps of the RAID encoding and decoding method in any embodiment.

Brief description of the drawings

In order to explain the technical solutions in the embodiments of the present application or in the related art more clearly, the drawings used in the description of the embodiments or the related art are briefly introduced below. The drawings described below show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Figure 1 is a flow chart of a first RAID encoding and decoding method according to one or more embodiments;

Figure 2 is a schematic diagram of a RAID encoding and decoding device according to one or more embodiments;

Figure 3 is a schematic diagram of an electronic device according to one or more embodiments;

Figure 4 is an internal structure diagram of an electronic device according to one or more embodiments.

Detailed description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, rather than all of the embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection of this application.

As shown in Figure 1, this embodiment of the present application discloses a RAID encoding and decoding method, which includes:

Step S101: Obtain any stripe in the RAID that needs to be encoded/decoded, and determine the target set.

A RAID can be composed of several disks of the same or different types, and the disks can be located in the same cabinet or in different cabinets. Each cabinet has a controller that controls data placement within that cabinet. The storage spaces of different cabinets are integrated through cloud services. User data is distributed across the disks according to the distribution strategy and control method of the RAID. In a disk array, the unit that spans different hard disks and is used to implement parity protection is called a stripe.

The target set includes n mutually unequal values, where n is the number of disks corresponding to the stripe, that is, the total number of disks that make up the current RAID. If the total number of disks is n, a stripe includes n data blocks located on n disks. Depending on the current RAID mode, the n blocks of a stripe can be written as n=k+m, where k is the number of user data blocks and m is the number of parity blocks. For example, n=k+1 for RAID 5 and n=k+2 for RAID 6; in the method provided by this embodiment, n=k+3. The n mutually unequal values in the target set therefore correspond one-to-one to the disks in the RAID, and also to the blocks in a stripe.

In a specific implementation, the n mutually unequal values in the target set can be set flexibly. For example: take the natural numbers from 1 to n; take arbitrary natural numbers such as 1, 4, 7, 8, 9, ... until n values have been taken; or take the values of an arithmetic or geometric sequence. Preferably, the values are powers of 2, in which case the n mutually unequal values are 2⁰, 2¹, ..., 2ⁿ⁻¹.

It should be noted that when the n mutually unequal values follow a regular pattern (such as 2⁰, 2¹, ..., 2ⁿ⁻¹), the encoding and decoding process can be simplified, because the coefficients derived from these values make the equations easier to solve.

Step S102: Determine a first preset value and a second preset value that are not equal to each other.

The first preset value and the second preset value can be flexibly set. For example, take two natural numbers: 1 and 2, or choose any two numbers from the target set.

The target set, the first preset value and the second preset value can be determined by software or hardware.

Step S103: In response to the stripe including 3 unknown data blocks, use all known data blocks included in the stripe, the 3 unknown data blocks, the target set, the first preset value and the second preset value to construct a ternary linear encoding/decoding equation system in the Galois field.

In some optional implementations, before step S103 is performed, it may first be determined whether the stripe includes 3 unknown data blocks. If it does, all known data blocks included in the stripe, the 3 unknown data blocks, the target set, the first preset value and the second preset value are used to construct a ternary linear encoding/decoding equation system in the Galois field.

If 3 unknown data blocks need to be solved, RAID 5 and RAID 6 cannot complete the solution. This embodiment therefore constructs a ternary linear encoding/decoding equation system in the Galois field from the previously set target set, the first preset value, the second preset value, all known data blocks and the 3 unknown data blocks.

In a specific implementation, the ternary linear encoding/decoding equation system includes:

Here, p₁, p₂, p₃ are the 3 unknown data blocks; n is the number of disks corresponding to the stripe; d₁, d₂, ..., dₙ₋₃ are the n-3 known data blocks; GF_div represents Galois field division; a₁ is the first preset value; a₂ is the second preset value; b₁, b₂, ..., bₙ are the n mutually unequal values in the target set; ⊕ represents Galois field addition. Since the number of disks corresponding to a stripe is n, a stripe that includes 3 unknown data blocks has n-3 known data blocks left, namely d₁, d₂, ..., dₙ₋₃.
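The equation system itself is not reproduced in this text. One form that would be consistent with the symbol definitions above, with the XOR-based single-block recovery described later, and with the numerical values quoted in the n=16 example of step S105 is sketched below. It is an assumption rather than the published formula: in particular, a_j + b_i is read as an ordinary integer sum, x_i is a hypothetical name for the i-th block of the stripe, and the ordering of the blocks is only illustrative.

```latex
\[
\begin{cases}
d_1 \oplus d_2 \oplus \cdots \oplus d_{n-3} \oplus p_1 \oplus p_2 \oplus p_3 = 0 \\[4pt]
\displaystyle\bigoplus_{i=1}^{n} \mathrm{GF\_div}\,(x_i,\; a_1 + b_i) = 0 \\[4pt]
\displaystyle\bigoplus_{i=1}^{n} \mathrm{GF\_div}\,(x_i,\; a_2 + b_i) = 0
\end{cases}
\qquad
(x_1, \dots, x_n) = (d_1, \dots, d_{n-3},\, p_1,\, p_2,\, p_3)
\]
```

Under this reading, the coefficient of block x_i in the second and third equations is GF_div(1, a_1 + b_i) and GF_div(1, a_2 + b_i) respectively.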

Step S104: For the ternary linear encoding/decoding equations, combine the parameter values calculated by Galois field division in any equation into an array to obtain a first array and a second array.

The ternary linear encoding/decoding equation system shown above contains 3 equations, 2 of which involve Galois field division. The coefficients of the known and unknown data blocks in these 2 equations are therefore extracted, giving 2 arrays: the first array and the second array.

Specifically, the parameter values in the first array are:

The parameter values in the second array are:
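The array definitions are not reproduced in this text. As a minimal sketch, assuming each parameter value is GF_div(1, a + bᵢ) with a + bᵢ taken as an ordinary integer sum, and assuming GF(2⁸) with the reduction polynomial 0x11D (x⁸+x⁴+x³+x²+1), the following Python reproduces the array entries quoted in the n=16 example of step S105. The helper names `gf_mul`, `gf_inv` and `parameter_array` are illustrative and are not taken from the application.

```python
POLY = 0x11D  # assumed reduction polynomial x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two GF(2^8) elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse by exhaustive search (the field has 256 elements)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in a Galois field")
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)


def parameter_array(a, targets):
    """Assumed coefficient of block i: GF_div(1, a + b_i), with an integer sum."""
    return [gf_inv(a + b) for b in targets]


targets = list(range(1, 17))           # target set 1..16 from the example
first = parameter_array(1, targets)    # first preset value a1 = 1
second = parameter_array(2, targets)   # second preset value a2 = 2
print(first[:2], first[-2:])    # [142, 244] [216, 114]
print(second[:2], second[-2:])  # [244, 71] [114, 192]
```

The printed entries match the values the example lists for the first and second arrays, which is what motivates the assumed coefficient form; the entries the example elides are not claimed here.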

Step S105: Select two parameter values corresponding to the same unknown data block in the first array and the second array, and use the two parameter values to update the first array and the second array to obtain a new first array and a new second array.

Select two parameter values corresponding to the same unknown data block in the first array and the second array. For example: select the two parameter values corresponding to the unknown data block p ₁ in the first array and the second array:

and

Alternatively, select the two parameter values corresponding to the unknown data block p₂ in the first array and the second array, or the two parameter values corresponding to the unknown data block p₃. Whichever of the three unknown data blocks is selected, the solution can be completed; only the subsequent update of the first array and the second array changes accordingly.

Assume n=16, so the target set includes 16 mutually unequal values, namely 1, 2, 3, 4, ..., 16, and let the first preset value a₁=1 and the second preset value a₂=2. According to the above, the parameter values in the first array are [142,244,…,216,114] and those in the second array are [244,71,…,114,192]. Assume the two parameter values corresponding to the unknown data block p₃ are selected: 114 (the last parameter value in the first array) and 192 (the last parameter value in the second array). Performing Galois field addition (that is, an XOR operation) of 114 with each parameter in the first array gives the new first array: [252,134,…,170,0]. Correspondingly, performing Galois field addition of 192 with each parameter in the second array gives the new second array: [52,135,…,178,0]. At this point, the first array and the second array have been updated.

Each element of the first array, the second array, the new first array and the new second array above is written as a decimal number. When the first array and the second array are updated, their parameter values are first converted from decimal to binary, and the XOR operation is then performed bitwise.

In a specific implementation, using the two parameter values to update the first array and the second array to obtain a new first array and a new second array includes: performing Galois field addition of the parameter value selected from the first array with each parameter value in the first array to obtain the new first array; and performing Galois field addition of the parameter value selected from the second array with each parameter value in the second array to obtain the new second array.
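The update of step S105 can be sketched in a few lines of Python. The function name `update_array` is illustrative; only the array entries that the n=16 example actually quotes are used, and the elided middle entries are left out rather than guessed.

```python
def update_array(values, selected_index):
    """XOR (Galois field addition) every entry with the selected entry."""
    selected = values[selected_index]
    return [v ^ selected for v in values]


# Only the entries quoted in the example; the elided middle entries are omitted.
first_quoted = [142, 244, 216, 114]
second_quoted = [244, 71, 114, 192]

new_first = update_array(first_quoted, -1)    # select p3: last entry, 114
new_second = update_array(second_quoted, -1)  # select p3: last entry, 192
print(new_first)   # [252, 134, 170, 0]
print(new_second)  # [52, 135, 178, 0]
```

The entry of the selected block always becomes 0, which is what removes that unknown from the two equations that involve Galois field division.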

Step S106: Determine the parameter value corresponding to the unselected unknown data block in the new first array and the new second array, and determine the dividend based on the parameter value corresponding to the unselected unknown data block.

In a specific implementation, determining the dividend based on the parameter values corresponding to the unselected unknown data blocks includes calculating the dividend according to a first formula: p_deno=GF_mul(p1_32, p2_22)⊕GF_mul(p2_32, p1_22), where p_deno is the dividend; GF_mul represents Galois field multiplication; p1_32 is the parameter value corresponding to the unselected unknown data block p₁ in the new second array; p2_22 is the parameter value corresponding to the unselected unknown data block p₂ in the new first array; p2_32 is the parameter value corresponding to the unselected unknown data block p₂ in the new second array; p1_22 is the parameter value corresponding to the unselected unknown data block p₁ in the new first array; and ⊕ represents Galois field addition.

If the unknown data block selected in step S105 is p₃, the unselected unknown data blocks are p₁ and p₂. In the above example, the parameter value of p₁ is 252 in the new first array and 52 in the new second array, and the parameter value of p₂ is 170 in the new first array and 178 in the new second array. Substituting these values into the first formula gives p_deno=GF_mul(52,170)⊕GF_mul(178,252), which yields the dividend p_deno.
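The first formula can be evaluated directly for the example values. The sketch below assumes GF(2⁸) with the reduction polynomial 0x11D (x⁸+x⁴+x³+x²+1), which is the polynomial under which the example's array values are mutually consistent; `gf_mul` and `dividend` are illustrative names.

```python
POLY = 0x11D  # assumed reduction polynomial x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two GF(2^8) elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result


def dividend(p1_32, p2_22, p2_32, p1_22):
    """First formula: p_deno = GF_mul(p1_32, p2_22) XOR GF_mul(p2_32, p1_22)."""
    return gf_mul(p1_32, p2_22) ^ gf_mul(p2_32, p1_22)


p_deno = dividend(p1_32=52, p2_22=170, p2_32=178, p1_22=252)
print(p_deno)  # 147
```

Under that assumption the two products are 4 and 151, so p_deno is 147. The value is nonzero, which is what allows the later Galois field division by p_deno.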

Step S107: Determine three unknown data blocks based on the parameter values corresponding to the unselected unknown data blocks, the dividend, the number of disks corresponding to the stripe, and all known data blocks.

In a specific implementation, determining the three unknown data blocks based on the parameter values corresponding to the unselected unknown data blocks, the dividend, the number of disks corresponding to the stripe, and all known data blocks includes calculating the 3 unknown data blocks according to a second formula. The second formula includes:

Here, p₁, p₂ and p₃ are the 3 unknown data blocks; n is the number of disks corresponding to the stripe; GF_div represents Galois field division; GF_mul represents Galois field multiplication; ⊕ represents Galois field addition; p1_32 is the parameter value corresponding to the unselected unknown data block p₁ in the new second array; p2_22 is the parameter value corresponding to the unselected unknown data block p₂ in the new first array; p2_32 is the parameter value corresponding to the unselected unknown data block p₂ in the new second array; p1_22 is the parameter value corresponding to the unselected unknown data block p₁ in the new first array; v2'(i) is the parameter value corresponding to the known data block i in the new second array; v1'(i) is the parameter value corresponding to the known data block i in the new first array; p_deno is the dividend; and dᵢ is the known data block i, where i=1,2,3,…,n-3.
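The second formula is not reproduced in this text. The sketch below shows one way the three unknowns could be obtained from the quantities just listed, by applying Cramer's rule to the two updated equations and then XORing for the selected block. It is a reconstruction under stated assumptions, not the published formula: the coefficients are assumed to be GF_div(1, a + bᵢ) with an integer sum, the target set is assumed to be 1..n, the field is assumed to be GF(2⁸) with polynomial 0x11D, and each block is reduced to a single byte for brevity. The names `solve_three`, `gf_mul` and `gf_inv` are illustrative.

```python
POLY = 0x11D  # assumed reduction polynomial x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two GF(2^8) elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse by exhaustive search."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in a Galois field")
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)


def solve_three(blocks, unknown, a1=1, a2=2):
    """Return a copy of `blocks` with the three `unknown` positions filled in."""
    n = len(blocks)
    v1 = [gf_inv(a1 + b) for b in range(1, n + 1)]   # first array (assumed form)
    v2 = [gf_inv(a2 + b) for b in range(1, n + 1)]   # second array (assumed form)
    i1, i2, i3 = unknown                             # i3 plays the selected block
    n1 = [v ^ v1[i3] for v in v1]                    # new first array
    n2 = [v ^ v2[i3] for v in v2]                    # new second array
    p1_22, p2_22 = n1[i1], n1[i2]
    p1_32, p2_32 = n2[i1], n2[i2]
    p_deno = gf_mul(p1_32, p2_22) ^ gf_mul(p2_32, p1_22)
    if p_deno == 0:
        raise ValueError("equations are not independent for these positions")
    inv_deno = gf_inv(p_deno)
    s0 = s1 = s2 = 0
    for i, d in enumerate(blocks):
        if i in unknown:
            continue
        s0 ^= d                      # plain XOR of the known blocks
        s1 ^= gf_mul(n1[i], d)       # sum of v1'(i) * d_i
        s2 ^= gf_mul(n2[i], d)       # sum of v2'(i) * d_i
    p1 = gf_mul(gf_mul(p2_32, s1) ^ gf_mul(p2_22, s2), inv_deno)
    p2 = gf_mul(gf_mul(p1_32, s1) ^ gf_mul(p1_22, s2), inv_deno)
    p3 = s0 ^ p1 ^ p2
    out = list(blocks)
    out[i1], out[i2], out[i3] = p1, p2, p3
    return out


# Encoding: compute three parity bytes at positions 0, 14 and 15 of a 16-block stripe.
stripe = solve_three([None] + list(range(10, 23)) + [None, None], (0, 14, 15))

# Decoding: lose three blocks (two of them user data here) and recover them.
damaged = list(stripe)
for lost in (0, 1, 15):
    damaged[lost] = None
recovered = solve_three(damaged, (0, 1, 15))
```

The same function serves for encoding and decoding, which mirrors the statement in the text that both are completed with the same logic. Because p1 and p2 depend only on the known blocks, they can be computed independently of each other.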

p ₁ , p ₂ , and p ₃ in the second formula can be solved in parallel, thereby increasing the computing speed.

It can be seen that the embodiment shown in Figure 1 first sets, as parameters, a target set of n mutually unequal values together with a first preset value and a second preset value that are not equal to each other. When three unknowns need to be solved, a ternary linear encoding/decoding equation system is constructed in the Galois field from these parameters, the known quantities (all known data blocks included in the stripe) and the unknown quantities (the 3 unknown data blocks). Because the n values are mutually unequal and the first preset value and the second preset value are not equal to each other, different data blocks have different coefficients in the equations, so the unsolvable case is avoided. At the same time, division in the Galois field never produces decimals, so inexact division is avoided and encoding and decoding accuracy improves. In this solution, the unknown data blocks to be solved can be the check blocks to be calculated during encoding, or the damaged data blocks to be calculated during decoding. In other words, this solution describes both the encoding process and the decoding process, and completes RAID encoding and RAID decoding with the same logic. Since this solution can solve 3 unknown data blocks, it has a higher fault tolerance than RAID 5 and RAID 6, and is suitable for encoding and decoding large stripes (such as stripes corresponding to no fewer than 32 disks).

It should be noted that the RAID encoding process corresponds to the process of storing data. After the data is stored, a stripe consisting of n blocks includes 3 parity blocks, p₁, p₂, p₃, and n-3 user data blocks, d₁, d₂, ..., dₙ₋₃. If any user data block (any one of d₁, d₂, ..., dₙ₋₃) in the stripe is changed, the difference between the changed data block and the data block before the change can be determined directly. The corresponding changes to the three check blocks are then determined from this difference, and the three check blocks are changed directly to complete the change of the stripe.

In a specific implementation, if any known data block that serves as user data in the stripe is changed, the parity difference of the parity data in the stripe is determined based on the difference between the changed known data block and the original known data block, and the parity data is updated based on that parity difference.

Assuming that the user data block d _x changes, the following formula can be used to update the parity block p ₁ :

Here, dₓ' is dₓ after the change; Δdₓ is the difference between dₓ after the change and dₓ before the change; Δp₁ is the corresponding change that p₁ needs to make (that is, the parity difference of p₁); and p₁' is p₁ after the change. v1'(x) is the parameter value corresponding to dₓ in the new first array, and v2'(x) is the parameter value corresponding to dₓ in the new second array.

In the same way, the check blocks p ₂ and p ₃ can be updated accordingly, thereby obtaining three updated check blocks: p ₁ ', p ₂ ', and p ₃ '.
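The update formula is not reproduced in this text. As a sketch of the idea, the code below derives a per-block coefficient for each parity block and applies only the difference Δd to the stored parity, then checks the result against a full re-encode. The coefficient expressions are a reconstruction via Cramer's rule under the same assumptions as before (coefficients GF_div(1, a + bᵢ) with an integer sum, target set 1..16, GF(2⁸) with polynomial 0x11D, one byte per block, parity blocks at the assumed positions 0, 14 and 15); `parity_coeffs`, `encode` and `update` are illustrative names.

```python
POLY = 0x11D           # assumed reduction polynomial
N = 16                 # number of disks in the stripe
PARITY = (0, 14, 15)   # assumed positions of p1, p2, p3


def gf_mul(a, b):
    """Multiply two GF(2^8) elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse by exhaustive search."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in a Galois field")
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)


def parity_coeffs(x, a1=1, a2=2):
    """Coefficients (c1, c2, c3): data block x contributes c_k * d_x to p_k."""
    i1, i2, i3 = PARITY
    v1 = [gf_inv(a1 + b) for b in range(1, N + 1)]
    v2 = [gf_inv(a2 + b) for b in range(1, N + 1)]
    n1 = [v ^ v1[i3] for v in v1]   # new first array, v1'
    n2 = [v ^ v2[i3] for v in v2]   # new second array, v2'
    inv_deno = gf_inv(gf_mul(n2[i1], n1[i2]) ^ gf_mul(n2[i2], n1[i1]))
    c1 = gf_mul(gf_mul(n2[i2], n1[x]) ^ gf_mul(n1[i2], n2[x]), inv_deno)
    c2 = gf_mul(gf_mul(n2[i1], n1[x]) ^ gf_mul(n1[i1], n2[x]), inv_deno)
    return c1, c2, 1 ^ c1 ^ c2      # p3 = d_x XOR p1 XOR p2 contribution


def encode(data):
    """Compute [p1, p2, p3] from scratch; `data` maps block index to byte."""
    parity = [0, 0, 0]
    for x, d in data.items():
        for k, c in enumerate(parity_coeffs(x)):
            parity[k] ^= gf_mul(c, d)
    return parity


def update(parity, x, old, new):
    """Apply only the difference of block x to the stored parity bytes."""
    delta = old ^ new
    return [p ^ gf_mul(c, delta) for p, c in zip(parity, parity_coeffs(x))]


data = {i: 3 * i + 1 for i in range(1, 14)}   # 13 user data bytes
parity = encode(data)
new_data = dict(data)
new_data[5] = 200                              # user block 5 is rewritten
updated = update(parity, 5, data[5], new_data[5])
```

Because every operation is linear over the Galois field, the incremental result equals a full re-encode of the changed stripe, without reading the other user data blocks.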

In some optional implementations, the maximum fault tolerance of the RAID encoding and decoding scheme provided by the embodiments of this application is 3, and it can also be downgraded to 2 or 1, that is, downgraded to RAID 6 or RAID 5. In other words, this application can correct errors and complete data recovery when a stripe contains 3, 2 or 1 erroneous blocks.

When there are three block errors in one stripe, data recovery can be completed according to the above embodiment.

If the stripe includes 1 unknown data block (that is, there is 1 erroneous block in the stripe), the unknown data block is determined by XORing all other blocks in the stripe. For example, if an error occurs in data block d₂, the formula for solving d₂ is:

d₂ = d₁ ⊕ d₃ ⊕ … ⊕ dₙ₋₃ ⊕ p₁ ⊕ p₂ ⊕ p₃
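A minimal sketch of this single-block case follows, assuming blocks are equal-length byte strings and that the XOR of all blocks in a consistent stripe is zero. The function names `xor_blocks` and `recover_single` are illustrative.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for j, byte in enumerate(block):
            out[j] ^= byte
    return bytes(out)


def recover_single(stripe, missing):
    """Rebuild the block at index `missing` from all other blocks."""
    return xor_blocks([b for i, b in enumerate(stripe) if i != missing])


d1, d2, d3 = b"\x01\x02", b"\x10\x20", b"\xaa\x55"
parity = xor_blocks([d1, d2, d3])          # makes the XOR of the stripe zero
stripe = [d1, d2, d3, parity]
rebuilt = recover_single(stripe, missing=1)
print(rebuilt == d2)  # True
```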

In some optional implementations, after the three unknown data blocks are determined based on the parameter values corresponding to the unselected unknown data blocks, the dividend, the number of disks corresponding to the stripe, and all known data blocks, in response to the stripe including 1 unknown data block, the other blocks in the stripe are XORed to determine the unknown data block.

If the stripe includes 2 unknown data blocks (that is, there are 2 erroneous blocks in the stripe), the parameter values of the 2 unknown data blocks calculated by Galois field division in the ternary linear encoding/decoding equation system are determined; the 2 unknown data blocks are then determined based on the determined parameter values, the blocks other than the 2 unknown data blocks, and the parameter values of those other blocks calculated by Galois field division in the ternary linear encoding/decoding equation system.

In some optional implementations, after the three unknown data blocks are determined based on the parameter values corresponding to the unselected unknown data blocks, the dividend, the number of disks corresponding to the stripe, and all known data blocks, in response to the stripe including 2 unknown data blocks, the parameter values of the 2 unknown data blocks calculated by Galois field division in the ternary linear encoding/decoding equation system are determined; the 2 unknown data blocks are then determined based on the determined parameter values, the blocks other than the 2 unknown data blocks, and the parameter values of those other blocks calculated by Galois field division in the ternary linear encoding/decoding equation system.

For example: Suppose errors occur in p ₁ and d ₂ , then the formula for solving p ₁ and d ₂ is:

Here, d₁, d₂, ..., dₙ₋₃, p₁, p₂, p₃ are the n data blocks in the stripe, of which p₁ and d₂ are the two unknown data blocks; GF_div represents Galois field division; a₁ is the first preset value; b₁, b₂, ..., bₙ are the n mutually unequal values in the target set; ⊕ represents Galois field addition.

It can be seen that when two errors occur, the coefficients corresponding to each block in any equation including Galois field division in the ternary linear encoding/decoding equations need to be used.
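The two-block formula is not reproduced in this text. As a sketch under assumptions, the code below combines the plain XOR equation with one equation that involves Galois field division, using coefficients of the assumed form GF_div(1, a₁ + bᵢ) with an integer sum, target set 1..n, GF(2⁸) with polynomial 0x11D and one byte per block. `recover_two` is an illustrative name, and the stripe built here only needs to satisfy the two equations that the function uses.

```python
POLY = 0x11D  # assumed reduction polynomial x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two GF(2^8) elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse by exhaustive search."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in a Galois field")
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)


def recover_two(blocks, i, j, a1=1):
    """Fill positions i and j from the XOR equation and one GF_div equation."""
    n = len(blocks)
    v1 = [gf_inv(a1 + b) for b in range(1, n + 1)]   # assumed coefficients
    s0 = s1 = 0
    for k, d in enumerate(blocks):
        if k in (i, j):
            continue
        s0 ^= d                     # x_i XOR x_j must equal s0
        s1 ^= gf_mul(v1[k], d)      # v1[i]*x_i XOR v1[j]*x_j must equal s1
    # Substitute x_i = s0 XOR x_j into the second equation and solve for x_j.
    x_j = gf_mul(s1 ^ gf_mul(v1[i], s0), gf_inv(v1[i] ^ v1[j]))
    out = list(blocks)
    out[i], out[j] = s0 ^ x_j, x_j
    return out


# Build a consistent 8-block stripe, then lose blocks 0 and 3 and recover them.
stripe = recover_two([7, 21, 99, 150, 33, 250, None, None], 6, 7)
damaged = list(stripe)
damaged[0] = damaged[3] = None
recovered = recover_two(damaged, 0, 3)
```

The divisor v1[i] ⊕ v1[j] is nonzero whenever the two coefficients differ, which is where the requirement that the target-set values be mutually unequal comes in.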

Based on the above embodiments, it should be noted that the Galois field generally used is GF(2⁸), with the polynomial P(x)=x⁸+x⁴+x³+x²+1. The corresponding data mapping table is shown in Table 1.

Table 1

Data mapping can be performed according to Table 1. For example, for the decimal number 10, conversion through the primitive polynomial P(x) of the Galois field above expresses 10 as the binary value 01110100. Through such conversion, each data item becomes an 8-bit binary number, and the four arithmetic operations are carried out on these 8-bit binary numbers.
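Table 1 is not reproduced in this text. As a sketch of how such a mapping can be generated, the code below builds the table of successive powers of the element 2, assuming the reduction polynomial x⁸+x⁴+x³+x²+1 (0x11D); this is the polynomial under which index 10 maps to the binary value 01110100 quoted above. `build_exp_table` is an illustrative name.

```python
POLY = 0x11D  # assumed polynomial x^8 + x^4 + x^3 + x^2 + 1


def build_exp_table():
    """Return [2^0, 2^1, ..., 2^254] computed in GF(2^8)."""
    table, value = [], 1
    for _ in range(255):
        table.append(value)
        value <<= 1              # multiply by the element 2
        if value & 0x100:        # degree reached 8: reduce by the polynomial
            value ^= POLY
    return table


exp_table = build_exp_table()
print(format(exp_table[10], "08b"))  # 01110100
```

All 255 entries are distinct, so every nonzero byte appears exactly once and the table can be inverted to obtain logarithms for table-driven multiplication and division.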

This application operates in the Galois field, so the following encoding formula can be designed:

Among them, d ₁ , d ₂ ,…, d _n are n user data blocks in a stripe, a ₁ is the first preset value; a ₂ is the second preset value; b ₁ , b ₂ ,…, b _n is n values that are not equal to each other in the target set.

Assume the total number of disks corresponding to a stripe is still n. Adding the 3 check blocks p₁, p₂, p₃ to the above encoding formula, so that n=(n-3)+3, gives the ternary linear encoding/decoding equation system:

Of course, the positions of the three added check blocks are determined by mechanisms such as load balancing in the current RAID system. The positions of the three check blocks in the ternary linear encoding/decoding equation system are only exemplary.

Then, the first array and the second array are determined from the ternary linear encoding/decoding equation system, the first array and the second array are updated, the dividend is determined, and the three check blocks are solved. For details, refer to the relevant description of the above embodiments.

It can be seen that, according to this application, the encoding and decoding process is completed using the parameter values that correspond to each block in the arrays, and triple-error-correcting encoding and decoding can be achieved. The operations are simple and efficient, involve no decimals and therefore no precision problems, are suitable for large-stripe encoding and decoding, and can be downgraded to RAID 5 and RAID 6. This application is also applicable to other Galois fields: under the polynomials of different Galois fields the generated arrays differ, but the operation logic is the same and no inverse matrix is needed, which gives clear advantages in speed and flexibility.

The following is an introduction to a RAID encoding and decoding device provided by embodiments of the present application. The RAID encoding and decoding device described below and the RAID encoding and decoding method described above may be referred to each other.

Referring to Figure 2, an embodiment of the present application discloses a RAID encoding and decoding device, which includes:

The acquisition module 201 is used to acquire any stripe in the RAID that needs to be encoded/decoded, and determine the target set; the target set includes: n unequal values, where n is the number of disks corresponding to the stripe;

The first determination module 202 is used to determine the first preset value and the second preset value that are not equal to each other;

The construction module 203 is configured to, in response to the stripe including 3 unknown data blocks, use all known data blocks included in the stripe, the 3 unknown data blocks, the target set, the first preset value and the second preset value to construct a ternary linear encoding/decoding equation system in the Galois field;

The composition module 204 is used to form an array with the parameter values calculated by Galois field division in any equation for the ternary linear encoding/decoding equations to obtain a first array and a second array;

The update module 205 is used to select two parameter values corresponding to the same unknown data block in the first array and the second array, and update the first array and the second array using the two parameter values to obtain a new first array and a new second array;

The second determination module 206 is configured to determine the parameter value corresponding to the unselected unknown data block in the new first array and the new second array, and determine the dividend based on the parameter value corresponding to the unselected unknown data block;

The third determination module 207 is used to determine three unknown data blocks based on the parameter value corresponding to the unselected unknown data block, the dividend, the number of disks corresponding to the stripe, and all known data blocks.

In a specific implementation, the ternary linear encoding/decoding equation system is:

Here, p₁, p₂, p₃ are the 3 unknown data blocks; n is the number of disks corresponding to the stripe; d₁, d₂, ..., dₙ₋₃ are the n-3 known data blocks; GF_div represents Galois field division; a₁ is the first preset value; a₂ is the second preset value; b₁, b₂, ..., bₙ are the n mutually unequal values in the target set; ⊕ represents Galois field addition.

In a specific implementation, the update module is specifically used to:

Perform Galois field addition on each parameter value in the first array using the parameter value selected from the first array to obtain a new first array;

Galois field addition is performed on each parameter value in the second array using the parameter value selected from the second array to obtain a new second array.
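
The update step can be sketched as follows, assuming a binary extension field in which Galois field addition is bitwise XOR. The function name update_array and the sample values are hypothetical and are not taken from this application.

```python
def update_array(values, selected_index):
    """Add (XOR) the selected parameter value to every entry of the array.

    The entry at selected_index becomes 0, so the corresponding unknown
    data block no longer contributes to the updated array.
    """
    pivot = values[selected_index]
    return [v ^ pivot for v in values]


# Hypothetical first array; index 2 is the entry of the selected unknown block.
first_array = [7, 12, 9, 30]
new_first_array = update_array(first_array, 2)
```

With these sample values, the entry at index 2 of new_first_array is 0, while the other entries are each XORed with 9.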

In a specific implementation, the second determination module is specifically used to:

Calculate the dividend according to the first formula; the first formula is: p_deno=GF_mul(p1_32,p2_22)⊕GF_mul(p2_32,p1_22);

where p_deno is the dividend; GF_mul represents Galois field multiplication; p1_32 is the parameter value corresponding to the unselected unknown data block p₁ in the new second array; p2_22 is the parameter value corresponding to the unselected unknown data block p₂ in the new first array; p2_32 is the parameter value corresponding to the unselected unknown data block p₂ in the new second array; p1_22 is the parameter value corresponding to the unselected unknown data block p₁ in the new first array; and ⊕ represents Galois field addition.
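
The first formula can be evaluated as in the sketch below. It assumes GF(2^8) with the reducing polynomial 0x11D, which is not stated in this application; the function names gf_mul and p_deno (as a function) are hypothetical.

```python
def gf_mul(a, b, poly=0x11D):
    """Shift-and-add multiplication in GF(2^8) (assumed field polynomial 0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def p_deno(p1_22, p2_22, p1_32, p2_32):
    """First formula: GF_mul(p1_32, p2_22) XOR GF_mul(p2_32, p1_22)."""
    return gf_mul(p1_32, p2_22) ^ gf_mul(p2_32, p1_22)
```

Read as a 2×2 determinant over the field, this value has to be non-zero for the two unselected unknown data blocks to be uniquely determined; a result of 0 would indicate an unsolvable system.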

In a specific implementation, the third determination module is specifically used to:

Calculate 3 unknown data blocks according to the second formula; the second formula includes:

where p₁, p₂, and p₃ are the three unknown data blocks; n is the number of disks corresponding to the stripe; GF_div represents Galois field division; GF_mul represents Galois field multiplication; ⊕ represents Galois field addition; p1_32 is the parameter value corresponding to the unselected unknown data block p₁ in the new second array; p2_22 is the parameter value corresponding to the unselected unknown data block p₂ in the new first array; p2_32 is the parameter value corresponding to the unselected unknown data block p₂ in the new second array; p1_22 is the parameter value corresponding to the unselected unknown data block p₁ in the new first array; v2'(i) is the parameter value corresponding to the known data block i in the new second array; v1'(i) is the parameter value corresponding to the known data block i in the new first array; p_deno is the dividend; and dᵢ is the known data block i.
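
The second formula is not reproduced in this text. The following is therefore only a generic Cramer's-rule sketch for a reduced system of two equations in p₁ and p₂, in which the divisor coincides with the first formula for p_deno. It assumes GF(2^8) with the reducing polynomial 0x11D; the names solve_reduced, gf_inv, s1 and s2 are hypothetical, where s1 and s2 stand for the right-hand sides accumulated from the known data blocks.

```python
def gf_mul(a, b, poly=0x11D):
    """Shift-and-add multiplication in GF(2^8) (assumed field polynomial 0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2^8), computed as a**254."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def gf_div(a, b):
    """Galois field division a / b; b must be non-zero."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    return gf_mul(a, gf_inv(b))


def solve_reduced(p1_22, p2_22, p1_32, p2_32, s1, s2):
    """Solve  p1_22*p1 + p2_22*p2 = s1  and  p1_32*p1 + p2_32*p2 = s2."""
    deno = gf_mul(p1_32, p2_22) ^ gf_mul(p2_32, p1_22)  # same shape as p_deno
    if deno == 0:
        raise ValueError("system has no unique solution")
    p1 = gf_div(gf_mul(s1, p2_32) ^ gf_mul(s2, p2_22), deno)
    p2 = gf_div(gf_mul(s2, p1_22) ^ gf_mul(s1, p1_32), deno)
    return p1, p2
```

The two quotients are independent of each other, so they could be evaluated side by side once the divisor is known.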

In a specific implementation, it also includes:

The change module is configured to, if any known data block in the stripe is changed, determine the verification difference of the verification data in the stripe based on the difference between the changed known data block and the original known data block, and update the verification data based on the verification difference.
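
The idea of updating verification data from a difference, rather than recomputing it from the whole stripe, can be sketched for a generic linear parity. The coefficients below are hypothetical and do not reproduce the update formula of this application; GF(2^8) with the reducing polynomial 0x11D is likewise an assumption.

```python
def gf_mul(a, b, poly=0x11D):
    """Shift-and-add multiplication in GF(2^8) (assumed field polynomial 0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def parity(blocks, coeffs):
    """Full recomputation: XOR of coeff_i * block_i over all data blocks."""
    acc = 0
    for d, c in zip(blocks, coeffs):
        acc ^= gf_mul(c, d)
    return acc


def delta_update(old_parity, coeff, old_block, new_block):
    """Incremental update using only the changed block and its coefficient."""
    delta_d = old_block ^ new_block           # difference of the data block
    delta_p = gf_mul(coeff, delta_d)          # resulting difference of the parity
    return old_parity ^ delta_p


# Hypothetical one-byte blocks and coefficients.
blocks = [0x11, 0x22, 0x33]
coeffs = [1, 2, 4]
old_p = parity(blocks, coeffs)
new_p = delta_update(old_p, coeffs[1], blocks[1], 0x7F)
```

Because the parity is linear over the field, new_p equals the parity recomputed from the modified stripe, while only one data block and one parity value are read.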

In a specific implementation, the n mutually unequal values are: 2⁰, 2¹, ..., 2ⁿ⁻¹.
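
A target set of this form can be generated as sketched below, assuming that the powers are taken in GF(2^8) with the reducing polynomial 0x11D (an assumption; the application does not state the field). The function name target_set is hypothetical.

```python
def target_set(n, poly=0x11D):
    """Return [2^0, 2^1, ..., 2^(n-1)] computed in GF(2^8)."""
    if not 1 <= n <= 255:
        raise ValueError("GF(2^8) provides at most 255 distinct powers of 2")
    values = []
    x = 1
    for _ in range(n):
        values.append(x)
        x <<= 1
        if x & 0x100:   # reduce modulo the field polynomial
            x ^= poly
    return values
```

Under this assumption 2 generates the multiplicative group of the field, so the values remain mutually unequal for any n up to 255.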

In a specific implementation, it also includes:

The single error correction module is configured to, if the stripe includes 1 unknown data block, perform XOR on the other blocks in the stripe except the unknown data block to determine the unknown data block.
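
Single-block recovery by XOR can be sketched as follows, assuming that the XOR of all blocks in a stripe is zero. The function names and the two-byte sample blocks are hypothetical.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def recover_single(stripe, missing_index):
    """Rebuild one missing block as the XOR of all other blocks in the stripe."""
    others = [b for i, b in enumerate(stripe) if i != missing_index]
    return xor_blocks(others)


# Hypothetical stripe: three data blocks plus one XOR parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
stripe = data + [xor_blocks(data)]
rebuilt = recover_single(stripe, 1)
```

Here rebuilt equals the original second data block, since every other block cancels out under XOR.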

In a specific implementation, it also includes:

The double error correction module is used to calculate the 2 unknown data blocks according to the third formula if the stripe includes 2 unknown data blocks; the third formula is:

where d₁ and d₂ are the 2 unknown data blocks; d₁, d₂, ..., dₙ are the n data blocks in the stripe; GF_div represents Galois field division; a₁ is the first preset value; b₁, b₂, ..., bₙ are the n mutually unequal values in the target set; and ⊕ represents Galois field addition.

For more specific working processes of each module and unit in this embodiment, reference may be made to the corresponding content disclosed in the foregoing embodiments, which will not be described again here.

It can be seen that this embodiment provides a RAID encoding and decoding device that can improve the accuracy of RAID encoding and decoding, avoid situations in which the equations cannot be solved, and achieve a higher fault tolerance than RAID 5 and RAID 6.

An electronic device provided by an embodiment of the present application is introduced below. The electronic device described below and the RAID encoding and decoding method and device described above may be referred to each other.

Referring to Figure 3, an embodiment of the present application discloses an electronic device, including:

Memory 301, used to store computer readable instructions;

The processor 302 is configured to execute computer-readable instructions to implement the RAID encoding and decoding method disclosed in any of the above embodiments.

In some optional implementations, computer-readable instructions are stored in the memory 301. When executed by the one or more processors 302, the computer-readable instructions cause the one or more processors 302 to execute the steps of the RAID encoding and decoding method in any of the foregoing embodiments.

In some embodiments, the electronic device is a computer device, and its internal structure may be as shown in Figure 4. The electronic device includes a processor, a memory, and a network interface connected through a system bus. The processor of the electronic device is used to provide computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and computer-readable instructions. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The network interface of the electronic device is used to communicate with an external terminal through a network connection. When executed by the processor, the computer-readable instructions implement the RAID encoding and decoding method in any embodiment.

The following is an introduction to a readable storage medium provided by embodiments of the present application. The readable storage medium described below and the RAID encoding and decoding method, apparatus and equipment described above may be referred to each other.

A readable storage medium used to store computer-readable instructions, wherein when the computer-readable instructions are executed by a processor, the RAID encoding and decoding method disclosed in the foregoing embodiments is implemented. Regarding the specific steps of this method, reference may be made to the corresponding content disclosed in the foregoing embodiments, which will not be described again here.

In some embodiments, the present application discloses one or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to execute the steps of the RAID encoding and decoding method in any of the foregoing embodiments.

“First”, “second”, “third”, “fourth”, etc. (if present) mentioned in the embodiments of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances so that the embodiments described herein can be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusions, e.g., a process, method or apparatus that encompasses a series of steps or units need not be limited to those steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such processes, methods or apparatuses.

It should be noted that the descriptions involving “first”, “second”, etc. in the embodiments of this application are only for descriptive purposes and cannot be understood as indicating or implying their relative importance or implicitly indicating the quantity of the indicated technical features. Therefore, features defined as "first" and "second" may explicitly or implicitly include at least one of these features. In addition, the technical solutions in the various embodiments can be combined with each other, but only on the basis that the combination can be realized by those of ordinary skill in the art. When a combination of technical solutions is contradictory or cannot be realized, it should be considered that such a combination does not exist and is not within the scope of protection claimed by this application.

Each embodiment in this specification is described in a progressive manner. Each embodiment focuses on its differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.

The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein may be implemented directly in hardware, in software modules executed by a processor, or in a combination of both. Software modules may be located in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disks, removable disks, CD-ROMs, or any other form of readable storage medium known in the technical field.

This document uses specific examples to illustrate the principles and implementations of the present application. The description of the above embodiments is only intended to help in understanding the method and the core idea of the present application. At the same time, for those of ordinary skill in the art, the specific implementation and the scope of application may change based on the ideas of this application. In summary, the contents of this description should not be understood as limiting the present application.

Claims

A RAID encoding and decoding method, characterized by including:

Obtain any stripe in the RAID that needs to be encoded/decoded, and determine a target set; the target set includes n unequal values, where n is the number of disks corresponding to the stripe;

Determine the first preset value and the second preset value that are not equal to each other;

In response to the stripe including 3 unknown data blocks, using all known data blocks included in the stripe, the 3 unknown data blocks, the target set, the first preset value and the second preset value, construct a system of ternary linear encoding/decoding equations in the Galois field;

For the system of ternary linear encoding/decoding equations, form the parameter values calculated by Galois field division in any one equation into an array, to obtain a first array and a second array;

Select two parameter values corresponding to the same unknown data block in the first array and the second array, and use the two parameter values to update the first array and the second array to obtain a new first array and a new second array;

Determine the parameter value corresponding to the unselected unknown data block in the new first array and the new second array, and determine the dividend based on the parameter value corresponding to the unselected unknown data block; and

Three unknown data blocks are determined based on the parameter value corresponding to the unselected unknown data block, the dividend, the number of disks corresponding to the stripe, and all known data blocks.
The method according to claim 1, characterized in that the system of ternary linear encoding/decoding equations includes:

where p₁, p₂, p₃ are the 3 unknown data blocks; n is the number of disks corresponding to the stripe; d₁, d₂, ..., dₙ₋₃ are the n-3 known data blocks; GF_div represents Galois field division; a₁ is the first preset value; a₂ is the second preset value; b₁, b₂, ..., bₙ are the n mutually unequal values in the target set; and ⊕ represents Galois field addition.
The method according to claim 1 or 2, characterized in that using the two parameter values to update the first array and the second array to obtain a new first array and a new second array includes:

performing Galois field addition on each parameter value in the first array using the parameter value selected from the first array to obtain the new first array; and

Perform Galois field addition on each parameter value in the second array using the parameter value selected from the second array to obtain the new second array.
The method according to any one of claims 1 to 3, characterized in that,

Determining the dividend based on the parameter value corresponding to the unselected unknown data block includes: calculating the dividend according to the first formula;

The first formula is: p_deno = GF_mul(p1_32, p2_22) ⊕ GF_mul(p2_32, p1_22);

where p_deno is the dividend; GF_mul represents Galois field multiplication; p1_32 is the parameter value corresponding to the unselected unknown data block p₁ in the new second array; p2_22 is the parameter value corresponding to the unselected unknown data block p₂ in the new first array; p2_32 is the parameter value corresponding to the unselected unknown data block p₂ in the new second array; p1_22 is the parameter value corresponding to the unselected unknown data block p₁ in the new first array; and ⊕ represents Galois field addition.
The method according to any one of claims 1 to 4, characterized in that,

Determining three unknown data blocks based on the parameter values corresponding to the unselected unknown data blocks, the dividend, the number of disks corresponding to the stripe, and all known data blocks includes: calculating the 3 unknown data blocks according to the second formula;

The second formula includes:

as well as

where p₁, p₂, and p₃ are the three unknown data blocks; n is the number of disks corresponding to the stripe; GF_div represents Galois field division; GF_mul represents Galois field multiplication; ⊕ represents Galois field addition; p1_32 is the parameter value corresponding to the unselected unknown data block p₁ in the new second array; p2_22 is the parameter value corresponding to the unselected unknown data block p₂ in the new first array; p2_32 is the parameter value corresponding to the unselected unknown data block p₂ in the new second array; p1_22 is the parameter value corresponding to the unselected unknown data block p₁ in the new first array; v2'(i) is the parameter value corresponding to the known data block i in the new second array; v1'(i) is the parameter value corresponding to the known data block i in the new first array; p_deno is the dividend; and dᵢ is the known data block i.
The method according to claim 5, characterized in that calculating the 3 unknown data blocks according to the second formula includes:

Solve p₁, p₂, p₃ in the second formula in parallel.
The method according to any one of claims 1 to 6, further comprising:

In response to any known data block serving as user data in the stripe being changed, determine a check difference of the check data in the stripe based on the difference between the changed known data block and the original known data block, and update the check data based on the check difference.
The method according to claim 7, characterized in that updating the check data based on the check difference includes: updating one of the check data using the following formula:

where dₓ is the known data block before the change; dₓ' is the known data block after the change; Δd is the difference between dₓ' and dₓ; Δp₁ represents the check difference of p₁; p₁' is the data obtained after updating p₁; v1'(x) is the parameter value corresponding to dₓ in the new first array; and v2'(x) is the parameter value corresponding to dₓ in the new second array.
The method according to any one of claims 1 to 8, characterized in that the n mutually unequal values are: 2⁰, 2¹, ..., 2ⁿ⁻¹.
The method according to any one of claims 1 to 8, characterized in that the n mutually unequal values are n mutually unequal, unordered natural numbers.
The method according to any one of claims 1 to 8, characterized in that the n mutually unequal numerical values form an arithmetic sequence or a geometric sequence.
The method according to any one of claims 1 to 11, further comprising:

After determining three unknown data blocks based on the parameter values corresponding to the unselected unknown data blocks, the dividend, the number of disks corresponding to the stripe, and all known data blocks, in response to the stripe including 1 unknown data block, perform XOR on the other blocks in the stripe except the unknown data block to determine the unknown data block.
The method according to claim 12, characterized in that performing XOR on the other blocks in the stripe except the unknown data block to determine the unknown data block includes:

Use the following formula to determine the unknown data block: d₂ = d₁ ⊕ d₃ ⊕ d₄ ⊕ ... ⊕ p₁ ⊕ p₂ ⊕ p₃;

where d₂ represents the unknown data block, and d₁, d₃, d₄, ..., p₁, p₂, p₃ represent the blocks other than the unknown data block.
The method according to any one of claims 1 to 11, further comprising:

After determining three unknown data blocks based on the parameter values corresponding to the unselected unknown data blocks, the dividend, the number of disks corresponding to the stripe, and all known data blocks, in response to the stripe including 2 unknown data blocks, determine the parameter values, calculated by Galois field division in the system of ternary linear encoding/decoding equations, of the 2 unknown data blocks;

Determine the 2 unknown data blocks based on the determined parameter values, the blocks other than the 2 unknown data blocks, and the parameter values calculated by Galois field division in the system of ternary linear encoding/decoding equations.
The method according to claim 14, characterized in that the two unknown data blocks are determined according to a third formula, and the third formula is:

where d₁ and d₂ are the 2 unknown data blocks; d₁, d₂, ..., dₙ are the n data blocks in the stripe; GF_div represents Galois field division; a₁ is the first preset value; b₁, b₂, ..., bₙ are the n mutually unequal values in the target set; and ⊕ represents Galois field addition.
The method according to any one of claims 1 to 15, characterized in that each element in the first array, the second array, the new first array, and the new second array is a decimal number;

Selecting two parameter values corresponding to the same unknown data block in the first array and the second array, and using the two parameter values to update the first array and the second array to obtain a new first array and a new second array, includes:

Before selecting two parameter values corresponding to the same unknown data block in the first array and the second array, convert each parameter value in the first array and the second array from a decimal number into a binary number.
A RAID encoding and decoding device, characterized by including:

An acquisition module is used to acquire any stripe in the RAID that needs to be encoded/decoded, and determine a target set; the target set includes n unequal values, where n is the number of disks corresponding to the stripe;

A first determination module, configured to determine a first preset value and a second preset value that are not equal to each other;

A construction module, configured to, in response to the stripe including 3 unknown data blocks, use all known data blocks included in the stripe, the 3 unknown data blocks, the target set, the first preset value and the second preset value to construct a system of ternary linear encoding/decoding equations in the Galois field;

A composition module configured to form an array with the parameter values calculated by Galois field division in any equation for the ternary linear encoding/decoding equations to obtain a first array and a second array;

An update module, configured to select two parameter values corresponding to the same unknown data block in the first array and the second array, and update the first array and the second array using the two parameter values to obtain a new first array and a new second array;

A second determination module, configured to determine the parameter value corresponding to the unselected unknown data block in the new first array and the new second array, and determine the dividend based on the parameter value corresponding to the unselected unknown data block; and

A third determination module, configured to determine three unknown data blocks based on the parameter value corresponding to the unselected unknown data block, the dividend, the number of disks corresponding to the stripe, and all known data blocks.
The device according to claim 17, characterized in that the update module is specifically used to:

Perform Galois field addition on each parameter value in the first array using the parameter value selected from the first array to obtain a new first array;

Galois field addition is performed on each parameter value in the second array using the parameter value selected from the second array to obtain a new second array.
An electronic device, characterized in that it includes a memory and one or more processors, wherein computer-readable instructions are stored in the memory and, when executed by the one or more processors, cause the one or more processors to perform the steps of the method according to any one of claims 1 to 6.
One or more non-volatile computer-readable storage media storing computer-readable instructions, characterized in that, when executed by one or more processors, the computer-readable instructions cause the one or more processors to carry out the steps of the method according to any one of claims 1 to 16.