CN110874181B - Data updating method and updating device - Google Patents
Data updating method and updating device
- Publication number
- CN110874181B (application CN201811015192.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- updating
- stripe
- update
- range
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the invention provides a data updating method and a data updating device, wherein the method comprises the following steps: receiving a data updating request for a stored file, wherein the data updating request carries the data content, the position offset and the data size value of the data to be updated; locating, according to the position offset and the data size value, the update range of the data to be updated in the stored file; determining a plurality of stripes covered by the update range; and, for the determined plurality of stripes, updating the data block units located in the update range in each stripe by using the data content. With the data updating method provided by the embodiment of the invention, only the data block units that need to be updated are updated, and the data of the whole stripe does not need to be read into memory for the update, so that the amount of data read and written during a data update is greatly reduced, which reduces the data transmission amount and the network transmission pressure inside the cloud storage server.
Description
Technical Field
The present invention relates to the field of data storage technologies, and in particular, to a data updating method and an updating apparatus.
Background
The cloud storage technology is a technology for data cloud storage, and when data is stored, a cloud storage server can receive the data sent by a client through a network, so that the data is stored. When the current cloud storage server stores data, only one copy is usually stored, and in order to avoid data loss due to machine failure, an EC (erasure code) technology is mostly adopted to store data.
The EC technique mainly performs striping processing on data and then calculates corresponding check data for each stripe, so that when part of the data in the stripe is lost, the data can be recovered through the check data. Striping is a mature technology: received data is stored into data block units according to the arrangement sequence of the data block units, and when the number of data block units storing data reaches a preset number, these data block units arranged in sequence form a stripe.
In the existing EC technology, for data written into a stripe, when a part of data in the stripe is updated, a cloud storage server needs to read the data of the whole stripe first, recalculate check data after the data is updated, and then write the updated data and the new check data into a disk, so that the data transmission amount inside the cloud storage server is large, which results in large transmission pressure of a network inside the cloud storage server. For example, if the data in one stripe is 4MB and 1 byte of data in the stripe needs to be updated, the cloud storage server needs to read 4MB of data, that is, 4194304 bytes of data. That is, although only 1 byte of data is updated, at least 4194304 bytes of data are transmitted inside the server.
Disclosure of Invention
The embodiment of the invention aims to provide a data updating method and a data updating device so as to reduce the data transmission quantity in a cloud storage server. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a data updating method, where the method includes:
receiving a data updating request aiming at a stored file, wherein the data updating request carries data content, position offset and data size value of data to be updated;
according to the position offset and the data size value, positioning a corresponding updating range of the data to be updated in the stored file;
determining a plurality of stripes covered by the updating range, wherein each stripe comprises the same number of data block units;
and, for the determined plurality of stripes, updating the data block units located in the updating range in each stripe by using the data content.
Optionally, the positioning, according to the position offset and the data size value, an update range of the data to be updated in the stored file, includes:
determining an update starting position value of the stored file according to the position offset;
determining an update ending position value of the stored file by the sum of the update starting position value and the data size value;
and determining the updating range through the updating starting position value and the updating ending position value.
Optionally, the determining a plurality of stripes covered by the update range includes:
determining the stripes located entirely within the updating range and the stripes that include a portion of the updating range.
Optionally, the performing, for the determined multiple stripes, data update on the data block unit located in the update range in each stripe by using the data content includes:
for the determined plurality of stripes, if a stripe with partial data block units located in the updating range exists in the plurality of stripes, performing data updating on the data block units located in the updating range in the stripe by using the data content;
and if the plurality of stripes have the stripes with all the data block units positioned in the updating range, updating the data of all the data block units in the stripes by using the data content.
Optionally, after the data block unit in each stripe located in the update range is updated with the data content for the determined multiple stripes, the method further includes:
and if, starting from the moment the data update is completed, no further data updating request for the stored file is received within a preset time period, generating, for the stored file after the data update, the verification data of the stripes corresponding to the file.
Optionally, after the data block unit in each stripe located in the update range is updated with the data content for the determined multiple stripes, the method further includes:
if a data updating request for a stored file is received for the first time, writing the data in the data block unit after data updating in each stripe into other storage areas of the disk array, and generating stripe information of the stripe after data updating, wherein the stripe information comprises: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
In a second aspect, an embodiment of the present invention provides a data updating apparatus, including:
the receiving module is used for receiving a data updating request aiming at a stored file, wherein the data updating request carries data content, position offset and data size value of data to be updated;
the positioning module is used for positioning the corresponding updating range of the data to be updated in the stored file according to the position offset and the data size value;
a determining module, configured to determine a plurality of stripes covered by the update range, where each stripe includes the same number of data block units;
and the updating module is used for updating, for the determined plurality of stripes, the data block units located in the updating range in each stripe by using the data content.
Optionally, the positioning module includes:
the first determining submodule is used for determining an updating initial position value of the stored file according to the position offset;
a second determining submodule, configured to determine an update end position value of the stored file according to a sum of the update start position value and the data size value;
and the third determining submodule is used for determining the updating range through the updating starting position value and the updating ending position value.
Optionally, the determining module is specifically configured to:
determine the stripes located entirely within the updating range and the stripes that include a portion of the updating range.
Optionally, the update module includes:
a first updating submodule, configured to, if there is a stripe in the plurality of stripes in which all data block units are located within the updating range, perform data updating on all data block units in the stripe by using the data content;
and the second updating submodule is used for updating the data of the data block units in the updating range in the stripe by using the data content if the stripe with partial data block units in the updating range exists in the plurality of stripes.
Optionally, the apparatus further comprises:
and the verification data generation module is used for generating, for the stored file after the data update, verification data of the stripes corresponding to the file if no data updating request for the stored file is received within a preset time period starting from the moment the data update is completed.
Optionally, the apparatus further comprises:
a stripe information generating module, configured to, if a data update request for a stored file is received for the first time, write data in a data block unit after data update in each stripe into another storage area of the disk array, and generate stripe information of the stripe after data update, where the stripe information includes: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
In a third aspect, an embodiment of the present invention provides a cloud storage server, including a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions executable by the processor, the machine-executable instructions causing the processor to implement the method steps of the data updating method provided in the first aspect of the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and the computer program is executed by a processor to perform the method steps of the data updating method provided in the first aspect of the embodiment of the present invention.
In the data updating method provided by the embodiment of the invention, for a file stored in a striped manner, when data in the file is updated, the corresponding updating range of the data to be updated in the stored file is located according to the position offset and the data size value of the data to be updated carried by the data updating request, so that a plurality of stripes covered by the updating range are determined, and data updating is performed on data block units in the stripes within the updating range. The embodiment of the invention only updates the data of the data block unit which needs to be updated, and does not need to read the data in the whole stripe to the memory for data updating. Compared with the prior art, when data updating is carried out, the data volume read and written in is greatly reduced, so that the data transmission volume in the cloud storage server is reduced, and the network transmission pressure in the cloud storage server is reduced. Of course, it is not necessary for any product or method of practicing the invention to achieve all of the above-described advantages at the same time.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a data updating method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a stripe corresponding to a stored file according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of locating the update range according to an embodiment of the present invention;
FIG. 4 is another schematic flow chart of a data updating method according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart illustrating a data updating method according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart illustrating the generation of parity data for a stripe according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data update apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a positioning module according to an embodiment of the present invention;
FIG. 9 is a block diagram of an update module according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a data update apparatus according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a cloud storage server according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In a conventional cloud storage system, user data is often stored as only one copy, so if a machine fails, the user data is easily lost. At present, a network RAID (Redundant Array of Independent Disks, i.e. disk array) in a cloud storage system can protect user data across nodes while keeping user cost low, and network RAID is mainly implemented based on the EC (erasure code) technology.
At present, in a network RAID based on the EC technology, when part of the data written in a stripe is updated, the cloud storage server needs to read the data of the entire stripe first. For example, if the data in one stripe is 4MB and 1 byte of data in that stripe needs to be updated, the cloud storage server needs to read 4MB of data, that is, 4194304 bytes. It can be seen that the amount of data read and written is at least 4194304 times the size of the data that needs to be updated. For this reason, the network RAID interfaces provided by current cloud storage service providers generally do not support random write operations, that is, write operations on an arbitrary part of already written data. How to realize random writing based on the EC technology, and how to reduce the data transmission amount of the cloud storage server when writing data, have become problems to be solved.
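For illustration only, the read amplification described above can be checked with a short Python sketch. The name `read_amplification` is an assumption introduced here and does not come from the embodiment; the sketch assumes the whole stripe is read once per update.

```python
STRIPE_BYTES = 4 * 1024 * 1024  # 4 MB stripe, as in the example above


def read_amplification(update_bytes: int, stripe_bytes: int = STRIPE_BYTES) -> float:
    """Bytes read from the stripe per byte actually updated (whole-stripe read)."""
    if update_bytes <= 0:
        raise ValueError("update_bytes must be positive")
    return stripe_bytes / update_bytes


print(STRIPE_BYTES)           # 4194304
print(read_amplification(1))  # 4194304.0
```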
Based on the foregoing problems, in a data updating method provided in an embodiment of the present invention, for a file stored in a striped manner, when any part of data in the file is updated, an update range corresponding to the data to be updated in the stored file is located according to a position offset and a data size value of the data to be updated carried in a data update request, so as to determine a plurality of stripes covered by the update range, and perform data update on data block units located in the update range in the stripes, without reading data in the whole stripe to a memory and then performing data update, so that not only random writing can be implemented based on an EC technology, but also a data transmission amount of a cloud storage server during data writing can be reduced.
As shown in FIG. 1, an embodiment of the present invention provides a data updating method, which may include the following steps:
S101, receiving a data updating request for a stored file, wherein the data updating request carries the data content, position offset and data size value of the data to be updated.
In the embodiment of the present invention, the server may be a cloud storage server based on the EC technology. The cloud storage server stores data in a striped manner: data is stored in a plurality of stripes arranged in sequence, and each stripe may include the same number of data block units. The data block unit is the smallest logical unit of data storage; the size of each data block unit is usually fixed and is generally set to 32KB by default. Data is usually written sequentially, which can be regarded as a process in which the data fills the individual data block units one after another. When the data block units in one stripe are filled, the data is written into the data block units of the next stripe in sequence. It follows that the stripes corresponding to a file are also arranged in sequence, so that each stripe also has a position offset in the file. Illustratively, assuming file A contains a total of 1000KB of data and each stripe contains 320KB of data, as shown in FIG. 2, file A may consist of 4 stripes, where stripe 1 has a position offset of 0 in the file, stripe 2 has a position offset of 320KB, stripe 3 has a position offset of 640KB, and stripe 4 has a position offset of 960KB. The position offset can be understood as the position of the first byte of each stripe in the file.
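The stripe offsets of file A can be reproduced with a minimal Python sketch. The function name `stripe_offsets` is hypothetical, and all values are in KB as in the example above.

```python
import math
from typing import List


def stripe_offsets(file_size_kb: int, stripe_size_kb: int) -> List[int]:
    """Position offset (in KB) of the first byte of each stripe in the file."""
    if file_size_kb < 0 or stripe_size_kb <= 0:
        raise ValueError("sizes must be non-negative and the stripe size positive")
    # The last stripe may be only partly filled, hence the ceiling.
    count = math.ceil(file_size_kb / stripe_size_kb)
    return [i * stripe_size_kb for i in range(count)]


print(stripe_offsets(1000, 320))  # [0, 320, 640, 960]
```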
In one possible implementation manner, the cloud storage server may update the data of a file stored in the cloud storage server according to a data update request from the client. The update request can carry information such as the data content, position offset and data size value of the data to be updated, and the cloud storage server can acquire this information after receiving the data update request. Illustratively, the file A stored in the cloud storage server contains 1000KB of data, and the data size of the data to be updated is 480KB, which indicates that 480KB of data in file A needs to be updated; the position offset is 240, indicating that the update start position is at 240KB of file A, i.e., 480KB of data is updated starting from 240KB of file A. Information interaction can be performed between the underlying NAS (Network Attached Storage) module of the cloud storage server and the client, so that the client can determine the position offset of the current update data relative to the stored file. The position offset can be determined by an existing position offset determination method, which is not described again in the embodiments of the present invention.
S102, locating the update range of the data to be updated in the stored file according to the position offset and the data size value.
In the embodiment of the invention, the position offset is taken as the update start position, and the size of the update data is known, so that the update end position of the data can be conveniently located and the corresponding data update range obtained.
As shown in FIG. 3, the process of locating the update range may specifically include:
S1021, determining an update starting position value of the data to be updated in the stored file according to the position offset.
Because the position offset is taken as the update start position, the update start position can be accurately located in the stored file according to the position offset, and the update starting position value corresponding to that position is determined. Illustratively, still taking file A as an example, file A contains 1000KB of data, the position offset is 240, and the corresponding update starting position value is 240KB.
S1022, determining an update ending position value of the data to be updated in the stored file through the sum of the update starting position value and the data size value.
After the update starting position value is determined, it is added to the data size value of the data to be updated to obtain the update ending position value, so that the update ending position of the data to be updated in the stored file is accurately determined. Illustratively, still taking file A as an example, with an update start position of 240KB and a data size value of 480KB, the determined update ending position value is 720KB (240KB + 480KB), i.e. the update end position is at 720KB of file A.
S1023, determining the update range from the update starting position value and the update ending position value.
In the embodiment of the invention, after the update starting position value and the update ending position value are determined, the update range of the file which needs to be updated can be conveniently determined. Illustratively, still taking file a as an example, after determining that the update start position is 240KB and the update end position is 720KB, the corresponding update range is from 240KB to 720KB of file a, i.e. it is determined that the data in the range is updated.
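Steps S1021 to S1023 can be sketched in Python as follows. The class `DataUpdateRequest` and its field names are hypothetical, and treating the end position as exclusive is an assumption of this sketch. Note that a data size of 480KB is what produces the 240KB to 720KB range used in the file A example.

```python
from dataclasses import dataclass
from typing import Tuple

KB = 1024


@dataclass(frozen=True)
class DataUpdateRequest:
    """Hypothetical request shape carrying the three items named in S101."""
    content: bytes        # data content of the data to be updated
    position_offset: int  # where the update starts in the stored file (bytes)
    data_size: int        # how many bytes are to be updated

    def update_range(self) -> Tuple[int, int]:
        """Return (start, end) of the update range; end is exclusive (assumption)."""
        start = self.position_offset  # S1021: start value from the position offset
        end = start + self.data_size  # S1022: end value = start + data size
        return start, end             # S1023: the two values define the range


req = DataUpdateRequest(content=bytes(480 * KB),
                        position_offset=240 * KB,
                        data_size=480 * KB)
start, end = req.update_range()
print(start // KB, end // KB)  # 240 720
```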
S103, determining a plurality of stripes covered by the update range.
After the update range is determined, it can be calculated which stripes the update range falls in, i.e. which stripes the update range covers. Illustratively, still taking file A as an example, file A contains 4 stripes of 320KB each, and the update range determined above is from 240KB to 720KB, which falls in stripe 1, stripe 2 and stripe 3. The update range completely covers stripe 2 and partially covers stripe 1 and stripe 3, i.e. stripe 2 is located within the update range, and stripe 1 and stripe 3 each contain part of the update range.
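The stripe calculation in S103 can be sketched as below. The function name `covered_stripes` is hypothetical, stripes are numbered from 1 as in the example, values are in KB, and the range is treated as half-open, which is an assumption of this sketch.

```python
from typing import List, Tuple


def covered_stripes(start: int, end: int, stripe_size: int) -> List[Tuple[int, bool]]:
    """Return (stripe_number, fully_covered) for every stripe overlapping [start, end)."""
    if end <= start:
        return []
    first = start // stripe_size      # index of the stripe holding the start position
    last = (end - 1) // stripe_size   # index of the stripe holding the last updated unit
    result = []
    for idx in range(first, last + 1):
        s_start = idx * stripe_size
        s_end = s_start + stripe_size
        fully = start <= s_start and s_end <= end
        result.append((idx + 1, fully))
    return result


print(covered_stripes(240, 720, 320))  # [(1, False), (2, True), (3, False)]
```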
S104, for the determined plurality of stripes, updating the data block units located in the update range in each stripe by using the data content.
In the embodiment of the invention, after the plurality of stripes covered by the update range are determined, the data of each stripe can be updated by using the data content. Because the number of data block units in each stripe is fixed, the data block units that need to be updated in the plurality of stripes can be located according to the update range. Illustratively, still taking file A as an example, the size of each data block unit is 80KB, and each of the 4 stripes included in file A includes 4 data block units. For the update range of 240KB to 720KB determined above, the update start position is located in the 4th data block unit of stripe 1, and the update end position is located in the 1st data block unit of stripe 3. It follows that the 4th data block unit of stripe 1, all the data block units of stripe 2, and the 1st data block unit of stripe 3 are located within the update range, so the data in these data block units needs to be updated.
Because the data block units that need to be updated in each stripe can be determined directly, the data updating method of the embodiment of the invention can write the data content of the data to be updated directly into the determined data block units, without first reading and caching the whole stripe into memory and then updating the data in memory.
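As a rough sketch of S104, the data block units touched by the update range can be enumerated directly from the block size and the number of blocks per stripe. The function name `blocks_in_range` is hypothetical, values are in KB, and numbering starts at 1 as in the example. The last two lines compare the data written block by block with the size of the whole stripes involved.

```python
from typing import List, Tuple


def blocks_in_range(start: int, end: int, block_size: int,
                    blocks_per_stripe: int) -> List[Tuple[int, int]]:
    """Return (stripe_number, block_number) for each block overlapping [start, end)."""
    if end <= start:
        return []
    first = start // block_size
    last = (end - 1) // block_size
    return [(g // blocks_per_stripe + 1, g % blocks_per_stripe + 1)
            for g in range(first, last + 1)]


blocks = blocks_in_range(240, 720, 80, 4)
print(blocks)  # [(1, 4), (2, 1), (2, 2), (2, 3), (2, 4), (3, 1)]

written_kb = len(blocks) * 80                             # only the touched blocks
whole_stripes_kb = len({s for s, _ in blocks}) * 4 * 80   # every involved stripe in full
print(written_kb, whole_stripes_kb)  # 480 960
```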
In the data updating method provided by the embodiment of the invention, for a file stored in a striped manner, when data in the file is updated, the corresponding updating range of the data to be updated in the stored file is located according to the position offset and the data size value of the data to be updated carried by the data updating request, so that a plurality of stripes covered by the updating range are determined, and data updating is performed on data block units in the stripes within the updating range. The embodiment of the invention only updates the data of the data block unit which needs to be updated, and does not need to read the data in the whole stripe to the memory for data updating. Compared with the prior art, when data updating is carried out, the data volume read and written in is greatly reduced, so that the data transmission volume in the cloud storage server is reduced, and the network transmission pressure in the cloud storage server is reduced.
An embodiment of the present invention further provides a data updating method, as shown in FIG. 4, which may include the following steps:
S201, receiving a data updating request for a stored file, wherein the data updating request carries the data content, position offset and data size value of the data to be updated.
S202, locating the update range of the data to be updated in the stored file according to the position offset and the data size value.
S203, determining a plurality of stripes covered by the update range.
In the embodiment of the present invention, steps S201 to S203 are performed in the same way as steps S101 to S103 in FIG. 1, which is not described herein again.
S204, for the determined plurality of stripes, if there is a stripe in which only some of the data block units are located within the update range, updating the data block units located within the update range in that stripe by using the data content.
In the embodiment of the present invention, after the plurality of stripes covered by the update range are determined, it may be determined which stripes have all of their data block units located in the update range and which stripes have only some of their data block units located in the update range. Optionally, which data block units in a stripe need to be updated may be determined according to the offset of each data block unit relative to the stripe. Illustratively, still taking file A as an example, file A includes 4 stripes, namely stripe 1, stripe 2, stripe 3, and stripe 4, each stripe includes 4 data block units, and each data block unit has a size of 80KB. Among the 4 data block units included in stripe 1, the offset of data block unit 1 in stripe 1 is 0, the offset of data block unit 2 is 80, the offset of data block unit 3 is 160, and the offset of data block unit 4 is 240. Since the determined update range is 240KB to 720KB, i.e. the update start position is located at 240KB, it can be determined that the 4th data block unit of stripe 1 is located at the update start position.
S205, if there is a stripe in which all the data block units are located within the update range, updating all the data block units in that stripe by using the data content.
Illustratively, for the 4 data block units included in stripe 2, the offsets of the data block units in stripe 2 are 0, 80, 160, and 240. Since the position offset of stripe 2 in file A is 320, it can be determined that the position offsets of these data block units in file A are 320 (0+320), 400 (80+320), 480 (160+320), and 560 (240+320), all of which are located within the determined update range, so that data update is performed on all the data block units in stripe 2.
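The offset arithmetic for stripe 2 can be checked with a small sketch. The function name `block_file_offsets` is hypothetical and values are in KB.

```python
from typing import List


def block_file_offsets(stripe_offset: int, block_size: int,
                       blocks_per_stripe: int) -> List[int]:
    """Offset of each block in the file = stripe offset + block offset within the stripe."""
    return [stripe_offset + i * block_size for i in range(blocks_per_stripe)]


offsets = block_file_offsets(320, 80, 4)
print(offsets)  # [320, 400, 480, 560]

# A block lies within the update range [240, 720) if it starts and ends inside it.
all_inside = all(240 <= off and off + 80 <= 720 for off in offsets)
print(all_inside)  # True
```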
As an optional implementation manner of the embodiment of the present invention, after the data block units that need to be updated in each stripe are updated, the check data of the stripes may also be generated. One file can correspond to a plurality of stripes, and the data of the file is stored in the corresponding stripes; once no further data updating request for the file is received, the check data of the stripes can be generated by using the EC technology. The EC technology can generate M pieces of redundant data from N pieces of original data, and can restore the original data from any N of the N+M pieces, thereby preventing data loss. In the embodiment of the invention, the plurality of stripes after data updating are equivalent to the N pieces of original data, and the generated check data is equivalent to the M pieces of redundant data.
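To make the N + M idea concrete, the sketch below uses plain XOR parity, which is only the simplest special case (M = 1) and is not the erasure code of the embodiment; practical EC schemes with M > 1 typically use codes such as Reed-Solomon. The function name `xor_blocks` and the sample data are assumptions for illustration.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same size")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]  # N = 3 pieces of original data
parity = xor_blocks(data)           # M = 1 piece of redundant (check) data

# Suppose data[1] is lost: any N of the N + M pieces are enough to rebuild it.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])  # True
```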
Because the data of some files is updated frequently, generating the check data by calculation after every data update would increase the computation load of the cloud storage server. Therefore, timing may start from the moment a data update is finished; if no further data update request for the stored file is received within a preset time period, the check data is then generated, which reduces the computing pressure on the cloud storage server. The preset time period can be set according to the actual load of the cloud storage server.
Optionally, the check data may instead be generated when the number of data updates for the stored file reaches a preset number, which also reduces the computation load of the cloud storage server. Specifically, data update requests for the stored file are counted starting from the first request, and when the number of received update requests reaches a preset value, the check data of the stripes corresponding to the file is generated for the stored file after the data update.
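The two triggering policies described above (a quiet period after the last update, or a preset number of updates) can be sketched as follows. The class `CheckDataScheduler`, its parameter names, and the use of caller-supplied timestamps are assumptions made for illustration.

```python
class CheckDataScheduler:
    """Decides when check data should be generated for one stored file."""

    def __init__(self, quiet_seconds=None, update_threshold=None):
        self.quiet_seconds = quiet_seconds        # preset time period, if used
        self.update_threshold = update_threshold  # preset number of updates, if used
        self.last_update_time = None
        self.pending_updates = 0

    def record_update(self, now):
        """Call when a data update for the file has finished."""
        self.last_update_time = now
        self.pending_updates += 1

    def should_generate(self, now):
        if self.pending_updates == 0:
            return False  # nothing changed since the last generation
        if (self.update_threshold is not None
                and self.pending_updates >= self.update_threshold):
            return True
        if (self.quiet_seconds is not None
                and now - self.last_update_time >= self.quiet_seconds):
            return True
        return False

    def mark_generated(self):
        """Call after the check data has been generated."""
        self.pending_updates = 0


# Quiet-period policy: generate only after 30 s without a new update request.
timer = CheckDataScheduler(quiet_seconds=30)
timer.record_update(now=0)
timer.record_update(now=20)            # a new update restarts the timing
print(timer.should_generate(now=40))   # False
print(timer.should_generate(now=50))   # True
```

Passing the current time in explicitly keeps the sketch deterministic; a real server would typically read a monotonic clock instead.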
Optionally, since the check data is also stored in a striped manner, stripe information of the stripe containing the check data may be generated after the check data is generated. The stripe information may include: the identification number of the stripe, used to identify the stripe; and the identification number of each data block unit constituting the stripe, used to identify the data block units in the stripe.
Optionally, the data security level of the file, that is, the values of N and M, may also be carried in the data update request for the stored file, and the cloud storage server may generate the corresponding check data according to the data security level of the file. The larger the value of M, the higher the data security of the file, but the larger the amount of calculation required to generate the check data.
As an optional implementation manner of the embodiment of the present invention, if the cloud storage server determines that a data update request for a stored file is received for the first time, the data of the updated data block units in each stripe may be written into another storage area of the disk array, so that the updated data is stored together in one location, which facilitates subsequent reading. The other storage area may be a storage area of the disk array other than the one holding the stored file. For example, if the stored file is stored in data storage service module A, the updated data of the data block units may be written into a storage area outside data storage service module A, for example into data storage service module N. Because the data is written into new data block units, the data block units corresponding to each stripe change, and the cloud storage server may generate new stripe information for each stripe whose data block units have changed. The stripe information may include: the identification number of the stripe, used to identify the stripe; the identification number of each data block unit constituting the stripe, used to identify the data block units in the stripe; and the offset of the stripe in the stored file, used to locate the position of the stripe in the stored file.
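A possible in-memory shape for the stripe information, and for regenerating it after some data block units have been rewritten elsewhere, is sketched below. The field names, the identifier strings, and the helper `relocate_units` are hypothetical; the embodiment only lists which items the stripe information contains.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class StripeInfo:
    stripe_id: str                   # identification number of the stripe
    block_unit_ids: Tuple[str, ...]  # identification numbers of its data block units
    offset: int                      # offset of the stripe in the stored file


def relocate_units(info: StripeInfo, relocated: Dict[int, str]) -> StripeInfo:
    """Return new stripe information in which the units at the given
    0-based positions are replaced by newly written unit IDs."""
    new_ids = tuple(relocated.get(i, old) for i, old in enumerate(info.block_unit_ids))
    return replace(info, block_unit_ids=new_ids)


# Stripe 1 of file a: its 4th data block unit was updated and written to module N.
old_info = StripeInfo("stripe-1", ("A-1", "A-2", "A-3", "A-4"), offset=0)
new_info = relocate_units(old_info, {3: "N-1"})
print(new_info.block_unit_ids)  # ('A-1', 'A-2', 'A-3', 'N-1')
```

Keeping the record immutable means the old stripe information stays valid until the new one has been stored.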
In the data updating method provided by the embodiment of the present invention, after a plurality of stripes covered by an updating range are determined, it can be determined which stripes have all data block units located in the updating range and which stripes have partial data block units located in the updating range, so as to update data of data block units that need to be updated, and not update data of data block units that do not need to be updated, so that the updating data is correctly written into each data block unit in a stripe.
As shown in fig. 5, an embodiment of the present invention further provides a data updating method. A plurality of data storage service modules (StorageService) are arranged in the cloud storage server and are used to store received data on a local disk. The cloud storage server is also provided with a stripe manager (StripeManager), which manages the stripes corresponding to the stored files and sends the calculation tasks corresponding to the stripes to a coding calculation unit (Encoder); the coding calculation unit executes the coding tasks and sends the encoded data to a data storage service module for storage. The method comprises the following steps:
S301, a client sends a data update request for a stored file to data storage service module A, where the data update request carries parameters such as the data content, position offset, and data size value of the data to be updated;
S302, data storage service module A sends the data update request to the stripe manager according to the position offset and the data size value, so as to query the stripes covered by the update range;
S303, after finding the stripes that meet the coverage condition, the stripe manager returns the stripe information to data storage service module A;
S304, data storage service module A determines, according to the position offset and the data size value, which data block units in the stripes need data updating;
S305, data storage service module A updates the data in the determined data block units by using the data content of the data to be updated;
S306, if data storage service module A determines that the data update request is the first such request, it writes the data of the updated data block units in each stripe into data storage service module N;
S307, data storage service module N sends the write result, for example the identification numbers of the written data block units, to data storage service module A;
S308, data storage service module A reports the updated stripe information to the stripe manager;
S309, the stripe manager updates the stripe information of the stripes corresponding to the stored file;
S3010, the stripe manager returns the update result to data storage service module A;
S3011, data storage service module A returns the write result to the client.
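The interaction in steps S301 to S3011 can be imitated in a single process with the following sketch. The class and function names, the way unit locations are represented, and the omission of the actual payload write are all assumptions made for illustration; network calls between the modules are replaced by direct method calls.

```python
class StripeManager:
    """Hypothetical stand-in for the stripe manager."""

    def __init__(self, unit_size, units_per_stripe, stripe_count):
        self.unit_size = unit_size
        self.units_per_stripe = units_per_stripe
        self.stripe_size = unit_size * units_per_stripe
        # stripe number -> one (module, unit id) location per data block unit
        self.stripes = {
            s: [("A", f"s{s}u{u}") for u in range(1, units_per_stripe + 1)]
            for s in range(1, stripe_count + 1)
        }

    def query(self, offset, size):
        """S302-S303: return the stripes covered by [offset, offset + size)."""
        first = offset // self.stripe_size + 1
        last = (offset + size - 1) // self.stripe_size + 1
        return {s: self.stripes[s] for s in range(first, last + 1)}

    def report(self, stripe_no, unit_no, location):
        """S308-S3010: record the new location of an updated unit."""
        self.stripes[stripe_no][unit_no - 1] = location
        return True


def handle_update(manager, module_n, offset, size, first_request):
    """Role of data storage service module A for one request (S301, S3011)."""
    end = offset + size
    touched = []
    for stripe_no in manager.query(offset, size):
        base = (stripe_no - 1) * manager.stripe_size
        for unit_no in range(1, manager.units_per_stripe + 1):
            unit_start = base + (unit_no - 1) * manager.unit_size
            if unit_start < end and unit_start + manager.unit_size > offset:  # S304
                # S305: the unit's data would be updated here (payload omitted).
                if first_request:
                    new_id = f"n{len(module_n) + 1}"
                    module_n.append(new_id)                            # S306-S307
                    manager.report(stripe_no, unit_no, ("N", new_id))  # S308-S3010
                touched.append((stripe_no, unit_no))
    return touched


manager = StripeManager(unit_size=80, units_per_stripe=4, stripe_count=4)
module_n = []  # unit IDs written to data storage service module N
print(handle_update(manager, module_n, offset=240, size=480, first_request=True))
# [(1, 4), (2, 1), (2, 2), (2, 3), (2, 4), (3, 1)]
```

After the call, the stripe manager's records for the six updated units point at module N, while stripe 4 is untouched.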
After the data block units that need updating in each stripe have been updated, the process of generating the check data of the stripes, as shown in fig. 6, may include:
S401, the stripe manager issues a command for calculating check data to the coding calculation unit;
S402, the coding calculation unit reads the data of the stripes corresponding to the stored file from data storage service module A and data storage service module N;
S403, data storage service module A and data storage service module N return the data of the stripes corresponding to the stored file;
S404, the coding calculation unit calculates the check data of the stripes corresponding to the stored file;
S405, the coding calculation unit sends the check data to data storage service module A;
S406, data storage service module A receives and writes the check data;
S407, data storage service module A returns a write result to the coding calculation unit, where the write result includes the stripe information corresponding to the check data;
S408, the coding calculation unit sends the stripe information corresponding to the check data to the stripe manager;
S409, the stripe manager updates the stripe information.
According to the data updating method provided by the embodiment of the present invention, after the plurality of stripes covered by the update range are determined, which data block units in the stripes need data updating is determined according to the position offset and the data size value, so that the data block units that need updating are updated and the data block units that do not need updating are left unchanged, and the update data is correctly written into the corresponding data block units in the stripes.
A specific embodiment of a data updating apparatus provided in an embodiment of the present invention corresponds to the flow shown in fig. 1, and referring to fig. 7, fig. 7 is a schematic structural diagram of the data updating apparatus provided in the embodiment of the present invention, including:
a receiving module 501, configured to receive a data update request for a stored file, where the data update request carries the data content, position offset, and data size value of the data to be updated;
a positioning module 502, configured to locate, according to the position offset and the data size value, the update range corresponding to the data to be updated in the stored file;
a determining module 503, configured to determine a plurality of stripes covered by the update range;
an updating module 504, configured to, for the determined plurality of stripes, update the data of the data block units located in the update range in each stripe by using the data content.
As shown in fig. 8, the positioning module 502 includes:
a first determining submodule 5021, configured to determine an update start position value of the stored file according to the position offset;
a second determining submodule 5022, configured to determine an update end position value of the stored file as the sum of the update start position value and the data size value;
a third determining submodule 5023, configured to determine the update range from the update start position value and the update end position value.
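The three determining submodules amount to a small calculation, sketched below. The function name `locate_update_range` and the input validation are assumptions made for illustration.

```python
def locate_update_range(position_offset, data_size):
    """Return (update_start, update_end) for a data update request."""
    if position_offset < 0 or data_size <= 0:
        raise ValueError("position offset must be >= 0 and data size must be > 0")
    update_start = position_offset           # first determining submodule
    update_end = update_start + data_size    # second determining submodule
    return update_start, update_end          # third determining submodule


# A request with position offset 240 KB and data size 480 KB.
print(locate_update_range(240, 480))  # (240, 720)
```

This matches the update range of 240 KB to 720 KB used in the file a example.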
The determining module 503 is specifically configured to:
determine the stripes located entirely within the update range, and the stripes that include a portion of the update range.
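One way to separate the two kinds of stripes handled by the determining module is sketched below. The function name, the equal stripe size, and the half-open update range are assumptions made for illustration.

```python
def covered_stripes(update_start, update_end, stripe_size, stripe_count):
    """Return (inside, partial): stripe numbers lying entirely within the
    update range, and stripe numbers that contain only part of it."""
    inside, partial = [], []
    for n in range(1, stripe_count + 1):
        stripe_start = (n - 1) * stripe_size
        stripe_end = stripe_start + stripe_size
        if update_start <= stripe_start and stripe_end <= update_end:
            inside.append(n)
        elif stripe_start < update_end and stripe_end > update_start:
            partial.append(n)
    return inside, partial


# File a: 4 stripes of 320 KB each, update range 240 KB to 720 KB.
print(covered_stripes(240, 720, stripe_size=320, stripe_count=4))  # ([2], [1, 3])
```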
As shown in fig. 9, the update module 504 includes:
a first update submodule 5041, configured to, if there is a stripe in the plurality of stripes in which all data block units are located within the update range, perform data update on all data block units in the stripe by using the data content;
and a second update sub-module 5042, configured to, if there is a stripe in the plurality of stripes in which a part of the data block units are located within the update range, update data, using the data content, on the data block unit in the stripe located within the update range.
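How the data content might be divided among the data block units in the update range is sketched below. The function `plan_unit_writes` is hypothetical, and it also handles update ranges that are not aligned to data block unit boundaries, which goes beyond the aligned example given in the description.

```python
def plan_unit_writes(position_offset, data, unit_size, units_per_stripe):
    """Split the data content into per-unit writes.

    Returns a list of (stripe_no, unit_no, offset_in_unit, chunk) tuples,
    with stripe and unit numbers starting at 1."""
    writes = []
    pos = position_offset
    consumed = 0
    while consumed < len(data):
        global_unit = pos // unit_size      # index of the unit within the file
        offset_in_unit = pos % unit_size
        take = min(unit_size - offset_in_unit, len(data) - consumed)
        stripe_no = global_unit // units_per_stripe + 1
        unit_no = global_unit % units_per_stripe + 1
        writes.append((stripe_no, unit_no, offset_in_unit, data[consumed:consumed + take]))
        pos += take
        consumed += take
    return writes


# Tiny sizes for readability: 4-byte units, 2 units per stripe.
for write in plan_unit_writes(6, b"abcdefg", unit_size=4, units_per_stripe=2):
    print(write)
# (1, 2, 2, b'ab')
# (2, 1, 0, b'cdef')
# (2, 2, 0, b'g')
```

Only the units named in the plan are touched, so the rest of each stripe never needs to be read into memory.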
Optionally, as shown in fig. 10, on the basis of the data updating apparatus shown in fig. 7, the data updating apparatus according to the embodiment of the present invention may further include:
the verification data generating module 505 is configured to, from a time point after the data update is finished, if a data update request for the stored file is not received after a preset time period elapses, generate verification data of a stripe corresponding to the file for the stored file after the data update.
A stripe information generating module 506, configured to, if a data update request for a stored file is received for the first time, write data in the data block unit after data update in each stripe into another storage area of the disk array, and generate stripe information of the stripe after data update, where the stripe information includes: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
With the data updating apparatus provided by the embodiment of the present invention, for a file stored in a striped manner, when data in the file is updated, the update range corresponding to the data to be updated in the stored file is located according to the position offset and the data size value carried by the data update request; the plurality of stripes covered by the update range are then determined, and data update is performed on the data block units located in the update range in those stripes. Thus only the data block units that need updating are updated, and the data of the whole stripe need not be read into memory before being updated. Compared with the prior art, the amount of data read and written during a data update is greatly reduced, which reduces the data transmission volume, and hence the network transmission pressure, inside the cloud storage server.
An embodiment of the present invention further provides a cloud storage server, as shown in fig. 11, the server 600 includes a processor 601 and a machine-readable storage medium 602, where the machine-readable storage medium stores machine-executable instructions capable of being executed by the processor, and the processor is caused by the machine-executable instructions to implement the following steps:
receiving a data updating request aiming at a stored file, wherein the data updating request carries the data content, the position offset and the data size value of data to be updated;
according to the position offset and the data size value, positioning a corresponding updating range of the data to be updated in the stored file;
determining a plurality of stripes covered by an updating range;
and updating data of the data block units positioned in the updating range in each strip by using the data content for the plurality of determined strips.
In the cloud storage server provided by the embodiment of the invention, for a file stored in a striped manner, when data in the file is updated, an update range corresponding to the data to be updated in the stored file is located according to a position offset and a data size value of the data to be updated carried by a data update request, so that a plurality of stripes covered by the update range are determined, and data update is performed on data block units in the stripes within the update range. The embodiment of the invention only updates the data of the data block unit which needs to be updated, and does not need to read the data in the whole stripe to the memory for data updating. Compared with the prior art, when data updating is carried out, the data volume read and written in is greatly reduced, so that the data transmission volume in the cloud storage server is reduced, and the network transmission pressure in the cloud storage server is reduced.
The machine-readable storage medium mentioned in the above server may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Alternatively, the storage medium may be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; the processor may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
An embodiment of the present invention further provides a computer-readable storage medium in which a computer program is stored, where the computer program, when executed by a processor, implements the following steps:
receiving a data updating request aiming at a stored file, wherein the data updating request carries the data content, the position offset and the data size value of data to be updated;
according to the position offset and the data size value, positioning a corresponding updating range of the data to be updated in the stored file;
determining a plurality of stripes covered by an updating range;
and updating data of the data block units positioned in the updating range in each strip by using the data content for the plurality of determined strips.
In the computer-readable storage medium, for a file stored in a striped manner, when data in the file is updated, an update range corresponding to the data to be updated in the stored file is located according to a position offset and a data size value of the data to be updated carried by a data update request, so as to determine a plurality of stripes covered by the update range, and perform data update on data block units located in the update range in the stripes. The embodiment of the invention only updates the data of the data block unit which needs to be updated, and does not need to read the data in the whole stripe to the memory for data updating. Compared with the prior art, when data updating is carried out, the data volume read and written in is greatly reduced, so that the data transmission volume in the cloud storage server is reduced, and the network transmission pressure in the cloud storage server is reduced.
For the device/server/storage medium embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to part of the description of the method embodiment.
It should be noted that the apparatus, the cloud storage server and the storage medium according to the embodiments of the present invention are an apparatus, a cloud storage server and a storage medium to which the data updating method is applied, and all embodiments of the data updating method are applicable to the apparatus, the cloud storage server and the storage medium, and can achieve the same or similar beneficial effects.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
Claims (12)
1. A method for updating data, the method comprising:
receiving a data updating request aiming at a stored file, wherein the data updating request carries data content, position offset and data size value of data to be updated;
according to the position offset and the data size value, positioning a corresponding updating range of the data to be updated in the stored file;
determining a plurality of stripes covered by the updating range, wherein each stripe comprises the same number of data block units;
for the determined plurality of stripes, updating data of the data block units in the update range in each stripe by using the data content;
if a data updating request for a stored file is received for the first time, writing the data in the data block unit after data updating in each stripe into other storage areas of the disk array, and generating stripe information of the stripe after data updating, wherein the stripe information comprises: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
2. The method according to claim 1, wherein the locating an update range of the data to be updated in the stored file according to the position offset and the data size value comprises:
determining an update starting position value of the stored file according to the position offset;
determining an update ending position value of the stored file by the sum of the update starting position value and the data size value;
and determining the updating range through the updating starting position value and the updating ending position value.
3. The data updating method of claim 1, wherein the determining the plurality of stripes covered by the updating range comprises:
determining a stripe located within the update range, and a stripe that includes a portion of the update range.
4. The data updating method according to claim 1, wherein the updating data of the data block unit in the update range in each stripe by using the data content for the determined plurality of stripes comprises:
for the determined plurality of stripes, if a stripe with partial data block units located in the updating range exists in the plurality of stripes, performing data updating on the data block units located in the updating range in the stripe by using the data content;
and if the plurality of stripes have the stripes with all the data block units positioned in the updating range, updating the data of all the data block units in the stripes by using the data content.
5. The data updating method according to claim 1 or 4, wherein after the data updating is performed on the data block units in each stripe within the updating range by using the data content for the determined plurality of stripes, the method further comprises:
starting from the moment after the data update is finished, if no data update request for the stored file is received within a preset time period, generating, for the stored file after the data update, the check data of the stripe corresponding to the file.
6. An apparatus for updating data, the apparatus comprising:
the receiving module is used for receiving a data updating request aiming at a stored file, wherein the data updating request carries data content, position offset and data size value of data to be updated;
the positioning module is used for positioning the corresponding updating range of the data to be updated in the stored file according to the position offset and the data size value;
a determining module, configured to determine a plurality of stripes covered by the update range, where each stripe includes the same number of data block units;
the updating module is used for updating data of the data block units in the updating range in each strip by using the data content aiming at the plurality of determined strips;
a stripe information generating module, configured to, if a data update request for a stored file is received for the first time, write data in a data block unit after data update in each stripe into another storage area of the disk array, and generate stripe information of the stripe after data update, where the stripe information includes: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
7. The data update apparatus of claim 6, wherein the positioning module comprises:
the first determining submodule is used for determining an updating initial position value of the stored file according to the position offset;
a second determining submodule, configured to determine an update end position value of the stored file according to a sum of the update start position value and the data size value;
and the third determining submodule is used for determining the updating range through the updating starting position value and the updating ending position value.
8. The data updating apparatus of claim 6, wherein the determining module is specifically configured to:
determine a stripe located within the update range, and a stripe that includes a portion of the update range.
9. The data update apparatus of claim 6, wherein the update module comprises:
a first updating submodule, configured to, if there is a stripe in the plurality of stripes in which all data block units are located within the updating range, perform data updating on all data block units in the stripe by using the data content;
and the second updating submodule is used for updating the data of the data block units in the updating range in the stripe by using the data content if the stripe with partial data block units in the updating range exists in the plurality of stripes.
10. A data update apparatus according to claim 6 or 9, characterized in that the apparatus further comprises:
and the verification data generation module is used for generating verification data of the strip corresponding to the file according to the stored file after the data is updated if a data updating request for the stored file is not received after a preset time period from the moment after the data is updated.
11. A cloud storage server comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to: carrying out the method steps of any one of claims 1 to 5.
12. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1-5.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811015192.8A CN110874181B (en) | 2018-08-31 | 2018-08-31 | Data updating method and updating device |
PCT/CN2019/102972 WO2020043119A1 (en) | 2018-08-31 | 2019-08-28 | Data updating method and updating device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811015192.8A CN110874181B (en) | 2018-08-31 | 2018-08-31 | Data updating method and updating device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110874181A CN110874181A (en) | 2020-03-10 |
CN110874181B true CN110874181B (en) | 2021-12-17 |
Family
ID=69644066
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811015192.8A Active CN110874181B (en) | 2018-08-31 | 2018-08-31 | Data updating method and updating device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110874181B (en) |
WO (1) | WO2020043119A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111736778B (en) * | 2020-07-21 | 2020-11-17 | 北京金山云网络技术有限公司 | Data updating method, device and system and electronic equipment |
CN114385067B (en) * | 2020-10-19 | 2023-07-18 | 澜起科技股份有限公司 | Data updating method for memory system and memory controller |
CN113821485B (en) * | 2021-09-27 | 2024-10-11 | 深信服科技股份有限公司 | Data changing method, device, equipment and computer readable storage medium |
CN115469818B (en) * | 2022-11-11 | 2023-03-24 | 苏州浪潮智能科技有限公司 | Disk array writing processing method, device, equipment and medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102521074A (en) * | 2011-12-01 | 2012-06-27 | 浪潮电子信息产业股份有限公司 | Method for quickening recovery of redundant array of independent disk (RAID) 5 |
CN106708651A (en) * | 2016-11-16 | 2017-05-24 | 北京三快在线科技有限公司 | Erasure code-based partial write-in method and device, storage medium and equipment |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6904498B2 (en) * | 2002-10-08 | 2005-06-07 | Netcell Corp. | Raid controller disk write mask |
US7913148B2 (en) * | 2004-03-12 | 2011-03-22 | Nvidia Corporation | Disk controller methods and apparatus with improved striping, redundancy operations and interfaces |
CN103984587B (en) * | 2008-06-12 | 2017-10-20 | 普安科技股份有限公司 | The method of the control program of physical storage device is updated in sas storage virtualization system |
CN101510175B (en) * | 2009-04-02 | 2015-06-03 | 北京中星微电子有限公司 | Method for updating target data to memory and apparatus thereof |
CN103294957B (en) * | 2013-05-06 | 2015-10-28 | 北京赛思信安技术有限公司 | Support data guard method during Data Update in data de-duplication file system |
CA2965715C (en) * | 2014-12-27 | 2019-02-26 | Huawei Technologies Co., Ltd. | Data processing method, apparatus, and system |
CN105930103B (en) * | 2016-05-10 | 2019-04-16 | 南京大学 | A kind of correcting and eleting codes covering write method of distributed storage CEPH |
CN106528125A (en) * | 2016-10-26 | 2017-03-22 | 腾讯科技(深圳)有限公司 | Data file incremental updating method, server, client and system |
CN107341070B (en) * | 2017-06-30 | 2020-07-10 | 长江大学 | Random writing method and system based on erasure codes |
CN107454161A (en) * | 2017-07-31 | 2017-12-08 | 郑州云海信息技术有限公司 | A kind of data back up method and device |
CN108123997A (en) * | 2017-11-21 | 2018-06-05 | 武汉中海庭数据技术有限公司 | A kind of navigation pack update method and system based on difference update |
Non-Patent Citations (1)
Title |
---|
面向多节点失效的纠删码及数据修复技术研究;林轩;《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》;20170315;I137-329 * |
Also Published As
Publication number | Publication date |
---|---|
CN110874181A (en) | 2020-03-10 |
WO2020043119A1 (en) | 2020-03-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||