CN110874181B - Data updating method and updating device - Google Patents


Info

Publication number
CN110874181B
Authority
CN
China
Prior art keywords
data
updating
stripe
update
range
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811015192.8A
Other languages
Chinese (zh)
Other versions
CN110874181A (en
Inventor
夏伟强
汪渭春
林起芊
王伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision System Technology Co Ltd
Original Assignee
Hangzhou Hikvision System Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision System Technology Co Ltd filed Critical Hangzhou Hikvision System Technology Co Ltd
Priority to CN201811015192.8A priority Critical patent/CN110874181B/en
Priority to PCT/CN2019/102972 priority patent/WO2020043119A1/en
Publication of CN110874181A publication Critical patent/CN110874181A/en
Application granted granted Critical
Publication of CN110874181B publication Critical patent/CN110874181B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a data updating method and a data updating device. The method comprises: receiving a data update request for a stored file, wherein the data update request carries the data content, the position offset and the data size value of the data to be updated; locating, according to the position offset and the data size value, the update range of the data to be updated in the stored file; determining a plurality of stripes covered by the update range; and, for the determined stripes, updating the data block units located within the update range in each stripe by using the data content. With this method, only the data block units that need to be updated are updated, and the data of the whole stripe does not need to be read into memory for the update. The amount of data read and written during an update is therefore greatly reduced, which reduces the data transmission volume and the network transmission pressure inside the cloud storage server.

Description

Data updating method and updating device
Technical Field
The present invention relates to the field of data storage technologies, and in particular, to a data updating method and an updating apparatus.
Background
The cloud storage technology is a technology for data cloud storage, and when data is stored, a cloud storage server can receive the data sent by a client through a network, so that the data is stored. When the current cloud storage server stores data, only one copy is usually stored, and in order to avoid data loss due to machine failure, an EC (erasure code) technology is mostly adopted to store data.
The EC technique mainly stripes the data and then calculates check data for each stripe, so that when part of the data in a stripe is lost, it can be recovered from the check data. Striping is a mature technology: received data is stored into data block units in the order in which the units are arranged, and when the number of data block units holding data reaches a set value, those sequentially arranged data block units form a stripe.
In the existing EC technology, for data written into a stripe, when a part of data in the stripe is updated, a cloud storage server needs to read the data of the whole stripe first, recalculate check data after the data is updated, and then write the updated data and the new check data into a disk, so that the data transmission amount inside the cloud storage server is large, which results in large transmission pressure of a network inside the cloud storage server. For example, if the data in one stripe is 4MB and 1 byte of data in the stripe needs to be updated, the cloud storage server needs to read 4MB of data, that is, 4194304 bytes of data. That is, although only 1 byte of data is updated, at least 4194304 bytes of data are transmitted inside the server.
Disclosure of Invention
The embodiment of the invention aims to provide a data updating method and a data updating device so as to reduce the data transmission quantity in a cloud storage server. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a data updating method, where the method includes:
receiving a data updating request aiming at a stored file, wherein the data updating request carries data content, position offset and data size value of data to be updated;
according to the position offset and the data size value, positioning a corresponding updating range of the data to be updated in the stored file;
determining a plurality of stripes covered by the updating range, wherein each stripe comprises the same number of data block units;
and, for the determined plurality of stripes, updating the data block units located within the update range in each stripe by using the data content.
Optionally, the positioning, according to the position offset and the data size value, an update range of the data to be updated in the stored file, includes:
determining an update starting position value of the stored file according to the position offset;
determining an update ending position value of the stored file by the sum of the update starting position value and the data size value;
and determining the updating range through the updating starting position value and the updating ending position value.
Optionally, the determining a plurality of stripes covered by the update range includes:
determining the stripes located within the update range and the stripes that include part of the update range.
Optionally, the performing, for the determined multiple stripes, data update on the data block unit located in the update range in each stripe by using the data content includes:
for the determined plurality of stripes, if a stripe with partial data block units located in the updating range exists in the plurality of stripes, performing data updating on the data block units located in the updating range in the stripe by using the data content;
and if the plurality of stripes have the stripes with all the data block units positioned in the updating range, updating the data of all the data block units in the stripes by using the data content.
Optionally, after the data block unit in each stripe located in the update range is updated with the data content for the determined multiple stripes, the method further includes:
and if no data update request for the stored file is received within a preset time period starting from the moment the data update is completed, generating, for the stored file after the data update, check data of the stripes corresponding to the file.
Optionally, after the data block unit in each stripe located in the update range is updated with the data content for the determined multiple stripes, the method further includes:
if a data updating request for a stored file is received for the first time, writing the data in the data block unit after data updating in each stripe into other storage areas of the disk array, and generating stripe information of the stripe after data updating, wherein the stripe information comprises: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
In a second aspect, an embodiment of the present invention provides a data updating apparatus, including:
the receiving module is used for receiving a data updating request aiming at a stored file, wherein the data updating request carries data content, position offset and data size value of data to be updated;
the positioning module is used for positioning the corresponding updating range of the data to be updated in the stored file according to the position offset and the data size value;
a determining module, configured to determine a plurality of stripes covered by the update range, where each stripe includes the same number of data block units;
and the updating module is used for updating, for the determined plurality of stripes, the data block units located within the update range in each stripe by using the data content.
Optionally, the positioning module includes:
the first determining submodule is used for determining an updating initial position value of the stored file according to the position offset;
a second determining submodule, configured to determine an update end position value of the stored file according to a sum of the update start position value and the data size value;
and the third determining submodule is used for determining the updating range through the updating starting position value and the updating ending position value.
Optionally, the determining module is specifically configured to:
determine the stripes located within the update range and the stripes that include part of the update range.
Optionally, the update module includes:
a first updating submodule, configured to, if there is a stripe in the plurality of stripes in which all data block units are located within the updating range, perform data updating on all data block units in the stripe by using the data content;
and the second updating submodule is used for updating the data of the data block units in the updating range in the stripe by using the data content if the stripe with partial data block units in the updating range exists in the plurality of stripes.
Optionally, the apparatus further comprises:
a check data generation module, configured to generate, for the stored file after the data update, check data of the stripes corresponding to the file if no data update request for the stored file is received within a preset time period starting from the moment the data update is completed.
Optionally, the apparatus further comprises:
a stripe information generating module, configured to, if a data update request for a stored file is received for the first time, write data in a data block unit after data update in each stripe into another storage area of the disk array, and generate stripe information of the stripe after data update, where the stripe information includes: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
In a third aspect, an embodiment of the present invention provides a cloud storage server, including a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to: the method steps of the data updating method provided in the first aspect of the embodiment of the present invention are implemented.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and the computer program is executed by a processor to perform the method steps of the data updating method provided in the first aspect of the embodiment of the present invention.
In the data updating method provided by the embodiment of the invention, for a file stored in a striped manner, when data in the file is updated, the corresponding updating range of the data to be updated in the stored file is located according to the position offset and the data size value of the data to be updated carried by the data updating request, so that a plurality of stripes covered by the updating range are determined, and data updating is performed on data block units in the stripes within the updating range. The embodiment of the invention only updates the data of the data block unit which needs to be updated, and does not need to read the data in the whole stripe to the memory for data updating. Compared with the prior art, when data updating is carried out, the data volume read and written in is greatly reduced, so that the data transmission volume in the cloud storage server is reduced, and the network transmission pressure in the cloud storage server is reduced. Of course, it is not necessary for any product or method of practicing the invention to achieve all of the above-described advantages at the same time.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a data updating method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a stripe corresponding to a stored file according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of locating the update range according to an embodiment of the present invention;
fig. 4 is another schematic flow chart of a data updating method according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart illustrating a data updating method according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart illustrating the generation of parity data for a stripe according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data update apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a positioning module according to an embodiment of the present invention;
FIG. 9 is a block diagram of an update module according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a data update apparatus according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of a cloud storage server according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In a conventional cloud storage system, user data is often only stored as one copy, and if a machine fails, the user data is easily lost. At present, a network RAID (Redundant Arrays of Independent Disks, disk array) in a cloud storage system can perform cross-node protection on data of a user on the basis of saving user cost, and an implementation manner of the network RAID is mainly based on an EC (erasure code) technology.
At present, in a network RAID based on the EC technology, when part of the data written in a stripe is updated, the cloud storage server needs to read the data of the entire stripe first. For example, if the data in one stripe is 4MB and 1 byte of data in the stripe needs to be updated, the cloud storage server needs to read 4MB of data, that is, 4194304 bytes. The amount of data read and written is thus at least 4194304 times the size of the data that needs to be updated. For this reason, the network RAID interfaces provided by current cloud storage service providers generally do not support random write operations, i.e., write operations on an arbitrary part of already written data. How to realize random writing based on the EC technology and how to reduce the data transmission amount of the cloud storage server when writing data have become problems to be solved.
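The arithmetic behind the 4MB example can be checked with a short calculation. This is only an illustrative sketch; the function name `read_amplification` is hypothetical and does not come from the embodiment.

```python
# Illustrative arithmetic only: how much data the conventional approach reads
# relative to the size of the update. The function name is hypothetical.

STRIPE_BYTES = 4 * 1024 * 1024  # the 4MB stripe from the example


def read_amplification(stripe_bytes: int, update_bytes: int) -> float:
    """Ratio of bytes read (the whole stripe) to bytes actually updated."""
    if update_bytes <= 0:
        raise ValueError("update_bytes must be positive")
    return stripe_bytes / update_bytes


if __name__ == "__main__":
    print(STRIPE_BYTES)                         # size of one stripe in bytes
    print(read_amplification(STRIPE_BYTES, 1))  # ratio for a 1-byte update
```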
Based on the foregoing problems, in a data updating method provided in an embodiment of the present invention, for a file stored in a striped manner, when any part of data in the file is updated, an update range corresponding to the data to be updated in the stored file is located according to a position offset and a data size value of the data to be updated carried in a data update request, so as to determine a plurality of stripes covered by the update range, and perform data update on data block units located in the update range in the stripes, without reading data in the whole stripe to a memory and then performing data update, so that not only random writing can be implemented based on an EC technology, but also a data transmission amount of a cloud storage server during data writing can be reduced.
As shown in fig. 1, an embodiment of the present invention provides a data updating method, which may include the following steps:
S101, receiving a data update request for a stored file, wherein the data update request carries the data content, position offset and data size value of the data to be updated.
In the embodiment of the present invention, the server may be a cloud storage server based on the EC technology. The cloud storage server stores data in a striped manner: data may be stored in a plurality of stripes arranged in sequence, and each stripe may include the same number of data block units. The data block unit is the smallest logical unit of data storage; the size of each data block unit is usually fixed and is generally 32KB by default. Data is usually written sequentially, which can be regarded as the data filling the data block units one after another. When the data block units in one stripe are full, the data is written into the data block units of the next stripe. The stripes corresponding to a file are therefore also arranged in sequence, so each stripe also has a position offset in the file. Illustratively, assuming file A contains a total of 1000KB of data and each stripe holds 320KB of data, as shown in FIG. 2, file A may consist of 4 stripes, where stripe 1 has a position offset of 0 in the file, stripe 2 has a position offset of 320, stripe 3 has a position offset of 640, and stripe 4 has a position offset of 960. The position offset can be understood as the position of the first byte of each stripe in the file.
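The stripe offsets of file A can be reproduced with a minimal sketch. The function name `stripe_offsets` and the choice of KB as the unit are assumptions made for illustration; the embodiment does not prescribe an implementation.

```python
import math


def stripe_offsets(file_size_kb: int, stripe_size_kb: int) -> list[int]:
    """Position offset (in KB) of the first byte of each stripe in the file.

    The last stripe may be only partly filled, so the stripe count is
    rounded up. Hypothetical helper, not taken from the embodiment.
    """
    if file_size_kb <= 0 or stripe_size_kb <= 0:
        raise ValueError("sizes must be positive")
    stripe_count = math.ceil(file_size_kb / stripe_size_kb)
    return [index * stripe_size_kb for index in range(stripe_count)]


if __name__ == "__main__":
    # File A: 1000KB of data in stripes of 320KB each.
    print(stripe_offsets(1000, 320))
```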
In one possible implementation, the cloud storage server may update the data of a file it stores according to a data update request from a client. The update request may carry information such as the data content, position offset and data size value of the data to be updated, and the cloud storage server can obtain this information after receiving the request. Illustratively, file A stored in the cloud storage server contains 1000KB of data, and the data size of the data to be updated is 480KB, indicating that 480KB of data in file A needs to be updated; the position offset is 240, indicating that the update start position is at 240KB of file A, i.e., 480KB of data is updated starting from 240KB of file A. Information can be exchanged between the underlying NAS (Network Attached Storage) module of the cloud storage server and the client, so that the client can determine the position offset of the current update data relative to the stored file. The position offset can be determined by existing methods, which are not described here.
S102, locating the update range of the data to be updated in the stored file according to the position offset and the data size value.
In the embodiment of the invention, the position offset records the update start position and the size of the update data is known, so the update end position can be readily located and the corresponding update range obtained.
As shown in fig. 3, the process of locating the update range may specifically include:
S1021, determining an update start position value of the data to be updated in the stored file according to the position offset.
Because the position offset records the update start position, the update start position can be accurately located in the stored file according to the position offset, and the update start position value corresponding to that position is determined. Illustratively, still taking file A as an example, file A contains 1000KB of data, the position offset is 240, and the corresponding update start position value is 240KB.
S1022, determining an update end position value of the data to be updated in the stored file from the sum of the update start position value and the data size value.
After the update start position value is determined, adding the data size value of the data to be updated to it gives the update end position value, so that the update end position of the data to be updated in the stored file is accurately determined. Illustratively, still taking file A as an example, the update start position is 240KB and the data size value is 480KB, so the update end position value is 240KB + 480KB = 720KB, and the corresponding update end position is 720KB.
S1023, determining the update range from the update start position value and the update end position value.
In the embodiment of the invention, after the update starting position value and the update ending position value are determined, the update range of the file which needs to be updated can be conveniently determined. Illustratively, still taking file a as an example, after determining that the update start position is 240KB and the update end position is 720KB, the corresponding update range is from 240KB to 720KB of file a, i.e. it is determined that the data in the range is updated.
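Steps S1021 to S1023 can be sketched as follows. The names `UpdateRange` and `locate_update_range` are hypothetical, the end position is treated as exclusive (an assumption), and the data size of 480KB is the value that yields the 240KB to 720KB range used in the example.

```python
from typing import NamedTuple


class UpdateRange(NamedTuple):
    """Update range in KB; `end` is treated as exclusive (an assumption)."""
    start: int
    end: int


def locate_update_range(position_offset: int, data_size: int) -> UpdateRange:
    """S1021: the start is the position offset.
    S1022: the end is the start plus the data size value.
    S1023: the two values together define the update range."""
    if position_offset < 0 or data_size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    start = position_offset
    end = start + data_size
    return UpdateRange(start, end)


if __name__ == "__main__":
    print(locate_update_range(240, 480))
```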
S103, determining a plurality of strips covered by the updating range.
After the update range is determined, the stripes in which it lies, i.e. the stripes it covers, can be calculated. Illustratively, still taking file A as an example, file A contains 4 stripes of 320KB each, and the update range determined above is from 240KB to 720KB, which lies in stripe 1, stripe 2 and stripe 3. The update range completely covers stripe 2 and partially covers stripe 1 and stripe 3; that is, stripe 2 is located within the update range, while stripe 1 and stripe 3 each include part of the update range.
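One way to compute the covered stripes is integer division of the range boundaries by the stripe size. The sketch below assumes fixed-size stripes, an exclusive end position and 1-based stripe numbers as in the example; the function name `covered_stripes` is hypothetical.

```python
def covered_stripes(start: int, end: int, stripe_size: int) -> list[tuple[int, bool]]:
    """Return (stripe_number, fully_covered) for each stripe that the update
    range [start, end) touches. Stripe numbers are 1-based, as in the example."""
    if end <= start or stripe_size <= 0:
        raise ValueError("need start < end and a positive stripe size")
    first = start // stripe_size
    last = (end - 1) // stripe_size  # stripe holding the last updated position
    result = []
    for index in range(first, last + 1):
        stripe_start = index * stripe_size
        stripe_end = stripe_start + stripe_size
        fully_covered = start <= stripe_start and stripe_end <= end
        result.append((index + 1, fully_covered))
    return result


if __name__ == "__main__":
    # File A: stripes of 320KB, update range 240KB to 720KB.
    for number, full in covered_stripes(240, 720, 320):
        print(number, "fully covered" if full else "partially covered")
```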
S104, for the determined plurality of stripes, updating the data block units located within the update range in each stripe by using the data content.
In the embodiment of the invention, after the plurality of stripes covered by the update range are determined, each stripe can be updated with the data content. Because the number of data block units in each stripe is fixed, the data block units that need updating in these stripes can be located from the update range. Illustratively, still taking file A as an example, the size of each data block unit is 80KB, and each of the 4 stripes of file A includes 4 data block units. With the update range of 240KB to 720KB determined above, the update start position is located in the 4th data block unit of stripe 1, and the update end position is located in the 1st data block unit of stripe 3. It follows that the 4th data block unit of stripe 1, all the data block units of stripe 2, and the 1st data block unit of stripe 3 are located within the update range, so the data in these data block units needs to be updated.
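The data block units touched by the update range can be enumerated as in the following sketch. It assumes fixed-size data block units, a fixed number of units per stripe, an exclusive end position and 1-based numbering; `blocks_in_range` is a hypothetical name.

```python
def blocks_in_range(start: int, end: int, block_size: int,
                    blocks_per_stripe: int) -> list[tuple[int, int]]:
    """Return (stripe_number, block_number) pairs, both 1-based, for every
    data block unit that the update range [start, end) touches."""
    if end <= start:
        raise ValueError("need start < end")
    first = start // block_size        # file-wide index of the first touched unit
    last = (end - 1) // block_size     # file-wide index of the last touched unit
    return [(index // blocks_per_stripe + 1, index % blocks_per_stripe + 1)
            for index in range(first, last + 1)]


if __name__ == "__main__":
    # File A: 80KB data block units, 4 per stripe, update range 240KB to 720KB.
    print(blocks_in_range(240, 720, 80, 4))
```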
Because the data block units that need to be updated in each stripe can be determined directly, the data updating method of the embodiment of the invention can write the data content of the data to be updated directly into the determined data block units, without first reading and caching the whole stripe into memory and then updating the data in memory.
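A direct write of this kind can be sketched with an in-memory stand-in for the disk array. Everything here is an assumption made for illustration: the `store` dictionary keyed by (stripe number, block number), the function name `write_update`, and the use of bytes instead of KB so that the example stays small. The point of the sketch is that only the touched data block units are accessed; no whole stripe is read.

```python
def write_update(store: dict, content: bytes, start: int,
                 block_size: int, blocks_per_stripe: int) -> list[tuple[int, int]]:
    """Write `content` at file offset `start` directly into the data block
    units it falls in. `store` maps (stripe_no, block_no) to a bytearray of
    `block_size` bytes. Returns the keys of the units that were written."""
    end = start + len(content)
    position = start
    touched = []
    while position < end:
        index = position // block_size            # file-wide unit index
        offset_in_block = position % block_size
        length = min(block_size - offset_in_block, end - position)
        key = (index // blocks_per_stripe + 1, index % blocks_per_stripe + 1)
        chunk = content[position - start:position - start + length]
        # Same-length slice assignment overwrites bytes in place.
        store[key][offset_in_block:offset_in_block + length] = chunk
        touched.append(key)
        position += length
    return touched


if __name__ == "__main__":
    # Scaled-down file A: 4 stripes of 4 units, 80 bytes per unit.
    store = {(s, b): bytearray(80) for s in range(1, 5) for b in range(1, 5)}
    print(write_update(store, b"\x01" * 480, 240, 80, 4))
```

Units outside the update range, such as the 3rd unit of stripe 1, are never read or written in this sketch.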
In the data updating method provided by the embodiment of the invention, for a file stored in a striped manner, when data in the file is updated, the corresponding updating range of the data to be updated in the stored file is located according to the position offset and the data size value of the data to be updated carried by the data updating request, so that a plurality of stripes covered by the updating range are determined, and data updating is performed on data block units in the stripes within the updating range. The embodiment of the invention only updates the data of the data block unit which needs to be updated, and does not need to read the data in the whole stripe to the memory for data updating. Compared with the prior art, when data updating is carried out, the data volume read and written in is greatly reduced, so that the data transmission volume in the cloud storage server is reduced, and the network transmission pressure in the cloud storage server is reduced.
An embodiment of the present invention further provides a data updating method, as shown in fig. 4, which may include the following steps:
S201, receiving a data update request for a stored file, wherein the data update request carries the data content, position offset and data size value of the data to be updated.
S202, according to the position offset and the data size value, positioning the corresponding updating range of the data to be updated in the stored file.
S203, determining a plurality of strips covered by the updating range.
In the embodiment of the present invention, steps S201 to S203 are performed in the same way as S101 to S103 in fig. 1, and are not described here again.
S204, for the determined plurality of stripes, if there is a stripe in which only some of the data block units are located within the update range, updating the data block units of that stripe located within the update range by using the data content.
In the embodiment of the present invention, after the plurality of stripes covered by the update range are determined, it may be determined which stripes have all of their data block units located within the update range and which stripes have only some of their data block units located within the update range. Optionally, which data block units in a stripe need to be updated may be determined from the offset of each data block unit relative to the stripe. Illustratively, still taking file A as an example, file A includes 4 stripes, namely stripe 1, stripe 2, stripe 3, and stripe 4; each stripe includes 4 data block units, and each data block unit has a size of 80KB. Among the 4 data block units of stripe 1, the offset of data block unit 1 in stripe 1 is 0, the offset of data block unit 2 is 80, the offset of data block unit 3 is 160, and the offset of data block unit 4 is 240. Since the determined update range is 240KB to 720KB, i.e. the update start position is at 240KB, it can be determined that the update start position is located in the 4th data block unit of stripe 1.
S205, if there is a stripe among the plurality of stripes in which all the data block units are located within the update range, updating all the data block units of that stripe by using the data content.
Illustratively, the offsets of the 4 data block units of stripe 2 within the stripe are 0, 80, 160, and 240. Since the position offset of stripe 2 in file A is 320, the position offsets of these data block units in file A are 320 (0+320), 400 (80+320), 480 (160+320), and 560 (240+320), all of which lie within the determined update range, so all the data block units in stripe 2 are updated.
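The offset arithmetic for stripe 2 can be verified with a short sketch. The helper names are hypothetical, the unit is KB, and the end of the update range is treated as exclusive.

```python
def block_offsets_in_file(stripe_offset: int, block_size: int,
                          blocks_per_stripe: int) -> list[int]:
    """File offset of each data block unit: stripe offset plus in-stripe offset."""
    return [stripe_offset + index * block_size for index in range(blocks_per_stripe)]


def all_blocks_in_range(offsets: list[int], block_size: int,
                        start: int, end: int) -> bool:
    """True if every data block unit lies entirely inside [start, end)."""
    return all(start <= offset and offset + block_size <= end for offset in offsets)


if __name__ == "__main__":
    offsets = block_offsets_in_file(320, 80, 4)   # stripe 2 of file A
    print(offsets)
    print(all_blocks_in_range(offsets, 80, 240, 720))
```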
As an optional implementation manner of the embodiment of the present invention, after the data block units that need to be updated in each stripe have been updated, the check data of the stripes may also be generated. One file can correspond to a plurality of stripes, and the data of the file is stored in the corresponding stripes; once no further data update request for the file is received, the check data of the stripes can be generated by using erasure coding (EC) technology. EC technology generates M pieces of redundant data from N pieces of original data, and the original data can be restored from any N of the N + M pieces, thereby preventing data loss. In the embodiment of the invention, the plurality of stripes after data updating are equivalent to the N pieces of original data, and the generated check data are equivalent to the M pieces of redundant data.
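The embodiment does not name a specific erasure code. As a hedged illustration of the N + M idea, the sketch below uses plain XOR parity, which is the simplest case with M = 1; a real deployment would more likely use a code such as Reed-Solomon that supports M > 1. The function names are hypothetical.

```python
def xor_parity(blocks):
    """Compute one parity block (M = 1) over N equally sized data blocks."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the N - 1 survivors
    and the parity block; XOR-ing them together cancels the survivors."""
    return xor_parity(list(surviving_blocks) + [parity])


data = [b"abcd", b"efgh", b"ijkl"]   # N = 3 pieces of original data
parity = xor_parity(data)            # M = 1 piece of redundant data
restored = recover_missing([data[0], data[2]], parity)
print(restored == data[1])
```

Any N of the N + M pieces are enough here: losing one data block still leaves two data blocks plus the parity block, from which the lost block is rebuilt.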
Because the data of some files is updated frequently, generating check data after every data update would increase the computation load of the cloud storage server. Therefore, timing may start from the moment the data update is finished; if no further data update request for the stored file is received within a preset time period, the check data is then generated, which reduces the computing pressure on the cloud storage server. The preset time period can be set according to the actual load of the cloud storage server.
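The quiet-period rule can be sketched as a small state holder. The class name `ParityScheduler` and its methods are hypothetical, and the caller is assumed to pass in the current time explicitly so the logic stays easy to test.

```python
class ParityScheduler:
    """Defers check data generation until no update has arrived
    for `quiet_seconds` after the most recent update finished."""

    def __init__(self, quiet_seconds):
        self.quiet_seconds = quiet_seconds
        self.last_update = None
        self.parity_pending = False

    def on_update_finished(self, now):
        # Every finished update restarts the quiet period.
        self.last_update = now
        self.parity_pending = True

    def should_generate(self, now):
        """Return True once, when the quiet period has fully elapsed."""
        if not self.parity_pending:
            return False
        if now - self.last_update >= self.quiet_seconds:
            self.parity_pending = False
            return True
        return False


scheduler = ParityScheduler(quiet_seconds=30)
scheduler.on_update_finished(now=0)
print(scheduler.should_generate(now=10))  # still inside the quiet period
```

A second update arriving inside the window simply resets `last_update`, so check data is computed once for a burst of updates rather than once per update.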
Optionally, the check data may instead be generated when the number of data updates for the stored file reaches a preset number, which also reduces the computation load of the cloud storage server. Specifically, the data update requests for the stored file are counted starting from the first request, and when the number of received update requests reaches a preset value, the check data of the stripes corresponding to the file is generated for the stored file after the data update.
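A count-based trigger of this kind could look like the sketch below. The class name `UpdateCounter` is hypothetical, and resetting the counter after the threshold is reached is an assumption; the text only says that check data is generated when the preset value is reached.

```python
class UpdateCounter:
    """Signals that check data should be generated once the number of
    update requests for a stored file reaches a preset threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0

    def record_update(self):
        """Count one update request; return True when the threshold is hit."""
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0  # assumption: counting starts over afterwards
            return True
        return False


counter = UpdateCounter(threshold=3)
print([counter.record_update() for _ in range(4)])
```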
Optionally, because the check data is also stored in a striped manner, stripe information of the stripe that contains the check data may be generated after the check data is generated. The stripe information may include: the identification number of the stripe, which identifies the stripe; and the identification number of each data block unit constituting the stripe, which identifies the data block units in the stripe.
Optionally, the data update request for the stored file may also carry the data security level of the file, that is, the values of N and M, and the cloud storage server may generate the corresponding check data according to that security level. The larger the value of M, the higher the data security of the file, but the larger the amount of calculation required to generate the check data.
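The trade-off behind the choice of N and M can be made concrete with a small calculation. The helper `ec_profile` is hypothetical; it relies only on the property stated earlier that the original data can be restored from any N of the N + M pieces.

```python
def ec_profile(n, m):
    """Summarize an (N, M) erasure coding setting.

    tolerated_losses: pieces that may be lost while data stays recoverable.
    storage_overhead: stored bytes divided by original bytes.
    """
    return {
        "tolerated_losses": m,
        "storage_overhead": (n + m) / n,
    }


print(ec_profile(4, 1))
print(ec_profile(4, 2))
```

Raising M from 1 to 2 with N = 4 doubles the number of tolerated losses but raises the storage overhead from 1.25 to 1.5, in addition to the extra computation noted in the text.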
As an optional implementation manner of the embodiment of the present invention, if the cloud storage server determines that a data update request for a stored file is received for the first time, the data of the updated data block units in each stripe may be written into another storage area of the disk array, so that the updated data is stored together in one location, which is convenient for subsequent reading. The other storage area may be a storage area of the disk array other than the one holding the stored file. For example, if the stored file is stored in data storage service module A, the updated data of the data block units may be written into a storage area outside data storage service module A, such as data storage service module N. Because the data is written into new data block units, the data block units that make up each stripe change, and the cloud storage server may generate new stripe information for each stripe whose data block units have changed. The stripe information may include: the identification number of the stripe, which identifies the stripe; the identification number of each data block unit constituting the stripe, which identifies the data block units in the stripe; and the offset of the stripe in the stored file, which locates the stripe within the stored file.
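The stripe information listed above maps naturally onto a small record type. The sketch below is illustrative only: the names `StripeInfo` and `relocate_blocks`, the identifier strings, and the decision to keep the stripe identification number unchanged after relocation are all assumptions.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class StripeInfo:
    stripe_id: str              # identification number of the stripe
    block_ids: Tuple[str, ...]  # identification numbers of its data block units
    file_offset_kb: int         # offset of the stripe in the stored file


def relocate_blocks(info: StripeInfo, new_ids: Dict[int, str]) -> StripeInfo:
    """Return new stripe information after some data block units were
    rewritten elsewhere; new_ids maps 0-based position -> new block id."""
    blocks = list(info.block_ids)
    for position, new_id in new_ids.items():
        blocks[position] = new_id
    return replace(info, block_ids=tuple(blocks))


old = StripeInfo("stripe-1", ("A-1", "A-2", "A-3", "A-4"), 0)
new = relocate_blocks(old, {3: "N-1"})  # 4th block now lives in module N
print(new.block_ids)
```

Keeping the record immutable means the previous stripe information stays available until the stripe manager has accepted the new version.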
In the data updating method provided by the embodiment of the present invention, after a plurality of stripes covered by an updating range are determined, it can be determined which stripes have all data block units located in the updating range and which stripes have partial data block units located in the updating range, so as to update data of data block units that need to be updated, and not update data of data block units that do not need to be updated, so that the updating data is correctly written into each data block unit in a stripe.
As shown in fig. 5, an embodiment of the present invention further provides a data updating method. A plurality of data storage service modules (storegerservice) are arranged in a cloud storage server and are used to store received data on a local disk. The cloud storage server is also provided with a stripe manager (StripeManager), which manages the stripes corresponding to the stored files and sends the calculation tasks corresponding to the stripes to a coding calculation unit (Encoder); the coding calculation unit executes the coding tasks and sends the encoded data to a data storage service module for storage. The method comprises the following steps:
S301, a client sends a data update request for a stored file to data storage service module A, where the data update request carries parameters such as the data content, position offset, and data size value of the data to be updated;
S302, data storage service module A sends the data update request to the stripe manager, according to the position offset and the data size value, to query the stripes covered by the update range;
S303, after finding the stripes that satisfy the coverage condition, the stripe manager returns the stripe information to data storage service module A;
S304, data storage service module A determines, according to the position offset and the data size value, which data block units in the stripes need to be updated;
S305, data storage service module A updates the data in the determined data block units by using the data content of the data to be updated;
S306, if data storage service module A determines that the data update request is the first such request, it writes the data of the updated data block units in each stripe into data storage service module N;
S307, data storage service module N sends the write result, for example the identification numbers of the written data block units, to data storage service module A;
S308, data storage service module A reports the updated stripe information to the stripe manager;
S309, the stripe manager updates the stripe information of the stripes corresponding to the stored file;
S3010, the stripe manager returns the update result to data storage service module A;
S3011, data storage service module A returns the write result to the client.
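The stripe manager's part in this exchange (answering the query in S302 and S303, and recording reported changes in S308 to S3010) can be sketched with a toy in-memory class. Everything about this interface is hypothetical: the method names, the dictionary layout, and the identifier format are illustrative and are not taken from the embodiment.

```python
class StripeManager:
    """Toy in-memory stand-in for the stripe manager (hypothetical API)."""

    def __init__(self, stripe_kb, blocks_per_stripe):
        self.stripe_kb = stripe_kb
        self.blocks_per_stripe = blocks_per_stripe
        self.files = {}  # file name -> list of stripe-info dicts

    def register_file(self, name, stripe_count):
        self.files[name] = [
            {
                "stripe_id": f"{name}-s{n + 1}",
                "offset_kb": n * self.stripe_kb,
                "block_ids": [f"{name}-s{n + 1}-b{b + 1}"
                              for b in range(self.blocks_per_stripe)],
            }
            for n in range(stripe_count)
        ]

    def query(self, name, offset_kb, size_kb):
        """S302-S303: return info for the stripes overlapping the update range."""
        start, end = offset_kb, offset_kb + size_kb
        return [s for s in self.files[name]
                if s["offset_kb"] < end and s["offset_kb"] + self.stripe_kb > start]

    def report_update(self, name, stripe_id, block_ids):
        """S308-S3010: record the new block ids and return an update result."""
        for s in self.files[name]:
            if s["stripe_id"] == stripe_id:
                s["block_ids"] = list(block_ids)
                return True
        return False


manager = StripeManager(stripe_kb=320, blocks_per_stripe=4)
manager.register_file("A", stripe_count=4)
covered = manager.query("A", offset_kb=240, size_kb=480)
print([s["stripe_id"] for s in covered])
```

In a real system these calls would be remote requests between the data storage service module and the stripe manager rather than local method calls.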
After the data block unit that needs to update data in each stripe is updated, as shown in fig. 6, the process of generating parity data of the stripe may include:
S401, the stripe manager issues a command for calculating check data to the coding calculation unit;
S402, the coding calculation unit reads the data of the stripes corresponding to the stored file from data storage service module A and data storage service module N;
S403, data storage service module A and data storage service module N return the data of the stripes corresponding to the stored file;
S404, the coding calculation unit calculates the check data of the stripes corresponding to the stored file;
S405, the coding calculation unit sends the check data to data storage service module A;
S406, data storage service module A receives and writes the check data;
S407, data storage service module A returns a write result to the coding calculation unit, where the write result includes the stripe information corresponding to the check data;
S408, the coding calculation unit sends the stripe information corresponding to the check data to the stripe manager;
S409, the stripe manager updates the stripe information.
According to the data updating method provided by the embodiment of the invention, after the plurality of stripes covered by the update range are determined, the data block units in the stripes that need to be updated are determined according to the position offset and the data size value. Data update is then performed only on the data block units that need to be updated, and not on those that do not, so that the update data is correctly written into the data block units of the stripes.
A specific embodiment of a data updating apparatus provided in an embodiment of the present invention corresponds to the flow shown in fig. 1, and referring to fig. 7, fig. 7 is a schematic structural diagram of the data updating apparatus provided in the embodiment of the present invention, including:
the receiving module 501 is configured to receive a data update request for a stored file, where the data update request carries data content, a position offset, and a data size value of data to be updated.
The positioning module 502 is configured to position an update range corresponding to the data to be updated in the stored file according to the position offset and the data size value.
A determining module 503, configured to determine a plurality of stripes covered by the update range.
And an updating module 504, configured to update, with respect to the determined multiple stripes, data of the data block unit located in the update range in each stripe by using the data content.
The positioning module 502, as shown in fig. 8, includes:
the first determining submodule 5021 is configured to determine an update start position value of the stored file according to the position offset.
A second determining submodule 5022, configured to determine an update end position value of the stored file by a sum of the update start position value and the data size value;
the third determining submodule 5023 is used for determining the updating range by updating the starting position value and the updating ending position value.
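The three determining submodules amount to a short calculation, sketched below. The function name `locate_update_range` is hypothetical, and the data size of 480 KB in the usage line is simply the value that reproduces the 240 KB to 720 KB range used in the earlier example.

```python
def locate_update_range(position_offset_kb, data_size_kb):
    """Return (update start position, update end position) in KB."""
    start = position_offset_kb    # first determining submodule
    end = start + data_size_kb    # second determining submodule
    return start, end             # third determining submodule


print(locate_update_range(240, 480))  # (240, 720)
```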
The determining module 503 is specifically configured to:
determine the stripes that are located within the update range, and the stripes that contain a part of the update range.
As shown in fig. 9, the update module 504 includes:
a first update submodule 5041, configured to, if there is a stripe in the plurality of stripes in which all data block units are located within the update range, perform data update on all data block units in the stripe by using the data content;
and a second update sub-module 5042, configured to, if there is a stripe in the plurality of stripes in which a part of the data block units are located within the update range, update data, using the data content, on the data block unit in the stripe located within the update range.
Optionally, as shown in fig. 10, on the basis of the data updating apparatus shown in fig. 7, the data updating apparatus according to the embodiment of the present invention may further include:
the verification data generating module 505 is configured to generate, for the stored file after the data update, verification data of the stripes corresponding to the file if no data update request for the stored file is received within a preset time period starting from the moment the data update is finished.
A stripe information generating module 506, configured to, if a data update request for a stored file is received for the first time, write data in the data block unit after data update in each stripe into another storage area of the disk array, and generate stripe information of the stripe after data update, where the stripe information includes: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
The data updating device provided by the embodiment of the device of the invention, for a file stored in a striped manner, when updating data in the file, locates the corresponding updating range of the data to be updated in the stored file according to the position offset and the data size value of the data to be updated carried by the data updating request, further determines a plurality of stripes covered by the updating range, and updates data for the data block units located in the updating range in the stripes, so that only the data block units needing to be updated are updated, and the data in the whole stripe does not need to be read into the memory for further data updating. Compared with the prior art, when data updating is carried out, the data volume read and written in is greatly reduced, so that the data transmission volume in the cloud storage server is reduced, and the network transmission pressure in the cloud storage server is reduced.
An embodiment of the present invention further provides a cloud storage server, as shown in fig. 11, the server 600 includes a processor 601 and a machine-readable storage medium 602, where the machine-readable storage medium stores machine-executable instructions capable of being executed by the processor, and the processor is caused by the machine-executable instructions to implement the following steps:
receiving a data updating request aiming at a stored file, wherein the data updating request carries the data content, the position offset and the data size value of data to be updated;
according to the position offset and the data size value, positioning a corresponding updating range of the data to be updated in the stored file;
determining a plurality of stripes covered by an updating range;
and updating data of the data block units positioned in the updating range in each strip by using the data content for the plurality of determined strips.
In the cloud storage server provided by the embodiment of the invention, for a file stored in a striped manner, when data in the file is updated, an update range corresponding to the data to be updated in the stored file is located according to a position offset and a data size value of the data to be updated carried by a data update request, so that a plurality of stripes covered by the update range are determined, and data update is performed on data block units in the stripes within the update range. The embodiment of the invention only updates the data of the data block unit which needs to be updated, and does not need to read the data in the whole stripe to the memory for data updating. Compared with the prior art, when data updating is carried out, the data volume read and written in is greatly reduced, so that the data transmission volume in the cloud storage server is reduced, and the network transmission pressure in the cloud storage server is reduced.
The machine-readable storage medium mentioned in the above server may include a Random Access Memory (RAM) or a non-volatile memory, such as at least one disk memory. Optionally, the storage medium may be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and is configured to execute the following steps:
receiving a data updating request aiming at a stored file, wherein the data updating request carries the data content, the position offset and the data size value of data to be updated;
according to the position offset and the data size value, positioning a corresponding updating range of the data to be updated in the stored file;
determining a plurality of stripes covered by an updating range;
and updating data of the data block units positioned in the updating range in each strip by using the data content for the plurality of determined strips.
In the computer-readable storage medium, for a file stored in a striped manner, when data in the file is updated, an update range corresponding to the data to be updated in the stored file is located according to a position offset and a data size value of the data to be updated carried by a data update request, so as to determine a plurality of stripes covered by the update range, and perform data update on data block units located in the update range in the stripes. The embodiment of the invention only updates the data of the data block unit which needs to be updated, and does not need to read the data in the whole stripe to the memory for data updating. Compared with the prior art, when data updating is carried out, the data volume read and written in is greatly reduced, so that the data transmission volume in the cloud storage server is reduced, and the network transmission pressure in the cloud storage server is reduced.
For the device/server/storage medium embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to part of the description of the method embodiment.
It should be noted that the apparatus, the cloud storage server and the storage medium according to the embodiments of the present invention are an apparatus, a cloud storage server and a storage medium to which the data updating method is applied, and all embodiments of the data updating method are applicable to the apparatus, the cloud storage server and the storage medium, and can achieve the same or similar beneficial effects.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (12)

1. A method for updating data, the method comprising:
receiving a data updating request aiming at a stored file, wherein the data updating request carries data content, position offset and data size value of data to be updated;
according to the position offset and the data size value, positioning a corresponding updating range of the data to be updated in the stored file;
determining a plurality of stripes covered by the updating range, wherein each stripe comprises the same number of data block units;
for the determined plurality of stripes, updating data of the data block units in the update range in each stripe by using the data content;
if a data updating request for a stored file is received for the first time, writing the data in the data block unit after data updating in each stripe into other storage areas of the disk array, and generating stripe information of the stripe after data updating, wherein the stripe information comprises: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
2. The method according to claim 1, wherein the locating an update range of the data to be updated in the stored file according to the position offset and the data size value comprises:
determining an update starting position value of the stored file according to the position offset;
determining an update ending position value of the stored file by the sum of the update starting position value and the data size value;
and determining the updating range through the updating starting position value and the updating ending position value.
3. The data updating method of claim 1, wherein the determining the plurality of stripes covered by the updating range comprises:
a stripe located within the update scope is determined, and a stripe that includes a portion of the update scope.
4. The data updating method according to claim 1, wherein the updating data of the data block unit in the update range in each stripe by using the data content for the determined plurality of stripes comprises:
for the determined plurality of stripes, if a stripe with partial data block units located in the updating range exists in the plurality of stripes, performing data updating on the data block units located in the updating range in the stripe by using the data content;
and if the plurality of stripes have the stripes with all the data block units positioned in the updating range, updating the data of all the data block units in the stripes by using the data content.
5. The data updating method according to claim 1 or 4, wherein after the data updating is performed on the data block units in each stripe within the updating range by using the data content for the determined plurality of stripes, the method further comprises:
and starting from the moment after the data updating is finished, when the preset time is elapsed, no data updating request aiming at the stored file is received, and aiming at the stored file after the data updating, generating the verification data of the strip corresponding to the file.
6. An apparatus for updating data, the apparatus comprising:
the receiving module is used for receiving a data updating request aiming at a stored file, wherein the data updating request carries data content, position offset and data size value of data to be updated;
the positioning module is used for positioning the corresponding updating range of the data to be updated in the stored file according to the position offset and the data size value;
a determining module, configured to determine a plurality of stripes covered by the update range, where each stripe includes the same number of data block units;
the updating module is used for updating data of the data block units in the updating range in each strip by using the data content aiming at the plurality of determined strips;
a stripe information generating module, configured to, if a data update request for a stored file is received for the first time, write data in a data block unit after data update in each stripe into another storage area of the disk array, and generate stripe information of the stripe after data update, where the stripe information includes: the identification number of the stripe, and the identification number of each data block unit constituting the stripe.
7. The data update apparatus of claim 6, wherein the positioning module comprises:
the first determining submodule is used for determining an updating initial position value of the stored file according to the position offset;
a second determining submodule, configured to determine an update end position value of the stored file according to a sum of the update start position value and the data size value;
and the third determining submodule is used for determining the updating range through the updating starting position value and the updating ending position value.
8. The data updating apparatus of claim 6, wherein the determining module is specifically configured to:
a stripe located within the update scope is determined, and a stripe that includes a portion of the update scope.
9. The data update apparatus of claim 6, wherein the update module comprises:
a first updating submodule, configured to, if there is a stripe in the plurality of stripes in which all data block units are located within the updating range, perform data updating on all data block units in the stripe by using the data content;
and the second updating submodule is used for updating the data of the data block units in the updating range in the stripe by using the data content if the stripe with partial data block units in the updating range exists in the plurality of stripes.
10. A data update apparatus according to claim 6 or 9, characterized in that the apparatus further comprises:
and the verification data generation module is used for generating verification data of the strip corresponding to the file according to the stored file after the data is updated if a data updating request for the stored file is not received after a preset time period from the moment after the data is updated.
11. A cloud storage server comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to: carrying out the method steps of any one of claims 1 to 5.
12. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1-5.
CN201811015192.8A 2018-08-31 2018-08-31 Data updating method and updating device Active CN110874181B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811015192.8A CN110874181B (en) 2018-08-31 2018-08-31 Data updating method and updating device
PCT/CN2019/102972 WO2020043119A1 (en) 2018-08-31 2019-08-28 Data updating method and updating device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811015192.8A CN110874181B (en) 2018-08-31 2018-08-31 Data updating method and updating device

Publications (2)

Publication Number Publication Date
CN110874181A CN110874181A (en) 2020-03-10
CN110874181B true CN110874181B (en) 2021-12-17

Family

ID=69644066

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811015192.8A Active CN110874181B (en) 2018-08-31 2018-08-31 Data updating method and updating device

Country Status (2)

Country Link
CN (1) CN110874181B (en)
WO (1) WO2020043119A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111736778B (en) * 2020-07-21 2020-11-17 北京金山云网络技术有限公司 Data updating method, device and system and electronic equipment
CN114385067B (en) * 2020-10-19 2023-07-18 澜起科技股份有限公司 Data updating method for memory system and memory controller
CN113821485B (en) * 2021-09-27 2024-10-11 深信服科技股份有限公司 Data changing method, device, equipment and computer readable storage medium
CN115469818B (en) * 2022-11-11 2023-03-24 苏州浪潮智能科技有限公司 Disk array writing processing method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521074A (en) * 2011-12-01 2012-06-27 浪潮电子信息产业股份有限公司 Method for quickening recovery of redundant array of independent disk (RAID) 5
CN106708651A (en) * 2016-11-16 2017-05-24 北京三快在线科技有限公司 Erasure code-based partial write-in method and device, storage medium and equipment

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6904498B2 (en) * 2002-10-08 2005-06-07 Netcell Corp. Raid controller disk write mask
US7913148B2 (en) * 2004-03-12 2011-03-22 Nvidia Corporation Disk controller methods and apparatus with improved striping, redundancy operations and interfaces
CN103984587B (en) * 2008-06-12 2017-10-20 普安科技股份有限公司 The method of the control program of physical storage device is updated in sas storage virtualization system
CN101510175B (en) * 2009-04-02 2015-06-03 北京中星微电子有限公司 Method for updating target data to memory and apparatus thereof
CN103294957B (en) * 2013-05-06 2015-10-28 北京赛思信安技术有限公司 Support data guard method during Data Update in data de-duplication file system
CA2965715C (en) * 2014-12-27 2019-02-26 Huawei Technologies Co., Ltd. Data processing method, apparatus, and system
CN105930103B (en) * 2016-05-10 2019-04-16 南京大学 A kind of correcting and eleting codes covering write method of distributed storage CEPH
CN106528125A (en) * 2016-10-26 2017-03-22 腾讯科技(深圳)有限公司 Data file incremental updating method, server, client and system
CN107341070B (en) * 2017-06-30 2020-07-10 长江大学 Random writing method and system based on erasure codes
CN107454161A (en) * 2017-07-31 2017-12-08 郑州云海信息技术有限公司 A kind of data back up method and device
CN108123997A (en) * 2017-11-21 2018-06-05 武汉中海庭数据技术有限公司 A kind of navigation pack update method and system based on difference update

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
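Research on Erasure Codes and Data Repair Techniques for Multi-Node Failures; Lin Xuan (林轩); China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology Series; 20170315; I137-329 *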

Also Published As

Publication number Publication date
CN110874181A (en) 2020-03-10
WO2020043119A1 (en) 2020-03-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant