CN111435288A - Data processing method and device - Google Patents

Info

Publication number
CN111435288A
Authority
CN
China
Prior art keywords
data
storage area
write request
container
written
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910032459.2A
Other languages
Chinese (zh)
Other versions
CN111435288B (en
Inventor
赵亚飞
董元元
魏舒展
庄灿伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201910032459.2A priority Critical patent/CN111435288B/en
Publication of CN111435288A publication Critical patent/CN111435288A/en
Application granted granted Critical
Publication of CN111435288B publication Critical patent/CN111435288B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and device. The method comprises: receiving a write request, wherein the write request requests that data be written to any one or more data stripes in a data container of a storage area; generating a check code when the data of the one or more data stripes in the data container of the storage area is aligned; and persisting the data and the check code. The invention solves the technical problem in the prior art that, because written data may be unaligned, such data is usually handled by read-and-rewrite or by filling in invalid data, which leaves complex holes.

Description

Data processing method and device
Technical Field
The invention relates to the technical field of the Internet, and in particular to a data processing method and device.
Background
The storage scale of distributed systems keeps growing, and device errors in distributed systems are a problem that cannot be ignored. Both the storage cost and the reliability of data must be considered when designing a distributed system. Erasure codes can minimize the storage overhead of the system while guaranteeing the same level of data reliability.
Distributed systems in the industry currently tend to use asynchronous encoding: data is first stored as multiple copies and is later converted to erasure-coded storage in the background according to some policy. This approach suffers from traffic amplification and high storage space occupancy. On the other hand, an erasure-coded file needs aligned data to calculate the check data, while the size of a user's write request is often arbitrary. If data is written directly to an erasure-coded file, padding data usually has to be added, which makes the system occupy redundant storage space and introduces the problem of managing the valid data in the storage space.
For the problem that data may be unaligned during writing, which the prior art usually handles by read-and-rewrite or by filling in invalid data and which therefore leaves complex holes, no effective solution has been proposed so far.
Disclosure of Invention
Embodiments of the present invention provide a data processing method and device, so as to at least solve the technical problem in the prior art that, because written data may be unaligned, such data is usually handled by read-and-rewrite or by filling in invalid data, which leaves complex holes.
According to one aspect of the embodiments of the present invention, a data processing method is provided, including: receiving a write request, wherein the write request requests that data be written to any one or more data stripes in a data container of a storage area; generating a check code when the data of the one or more data stripes in the data container of the storage area is aligned; and persisting the data and the check code.
Optionally, before the check code is generated when the data in the data container of the storage area is aligned, the method further includes: judging whether any one or more data stripes in the data container of the storage area are fully written; if they are not fully written, determining that the data in the data container of the storage area is not aligned; and if they are fully written, determining that the data in the data container of the storage area is aligned.
Further, optionally, after determining that the data in the data container of the storage area is not aligned, the method further includes: writing the data to the storage area, wherein if it is detected that a first part of the data has already been written to the storage area, only the second part, which has not yet been written, is written; and if it is detected that a first data stripe that is not fully written still exists in the data container of the storage area after the data is written, padding the first data stripe according to a predetermined rule until it is fully written.
Optionally, the padded portion is allowed to skip persistence.
Optionally, if a write request spans different data stripes, persistence is allowed to process the data of each data stripe in parallel.
Optionally, if it is detected that the data container of the storage area needs no padding after the data is written and data is held in the temporary cache, that cached data is deleted.
Optionally, after the write request is received, the method further includes: judging whether the storage area already holds data; if so, determining that the content written before the write request contains unaligned data; if not, judging whether the data of any one or more data stripes in the data container of the storage area is aligned.
Further, optionally, if the content written before the write request contains unaligned data, the unaligned data is completed with data from the write request, and the check code is calculated.
Optionally, after the data and the check code are persisted, the method further includes: judging whether residual data exists in the write request, wherein residual data is data in the write request that has not yet been written to any one or more data stripes; if so, returning to the step of generating the check code; if not, ending the request.
According to another aspect of the embodiments of the present invention, a data processing device is also provided, including: a receiving module, configured to receive a write request, wherein the write request requests that data be written to any one or more data stripes in a data container of a storage area; a generation module, configured to generate a check code when the data of the one or more data stripes in the data container of the storage area is aligned; and a processing module, configured to persist the data and the check code.
According to still another aspect of the embodiments of the present invention, there is also provided a storage medium including a stored program, wherein when the program runs, a device on which the storage medium is located is controlled to execute the above-mentioned data processing method.
In the embodiments of the present invention, a write request is received, wherein the write request requests that data be written to any one or more data stripes in a data container of a storage area; a check code is generated when the data of the one or more data stripes in the data container of the storage area is aligned; and the data and the check code are persisted. Only temporary check blocks need to be generated and simple check-block garbage collection added, which achieves online EC at low cost and with low technical difficulty, and thereby solves the technical problem in the prior art that, because written data may be unaligned, such data is usually handled by read-and-rewrite or by filling in invalid data, which leaves complex holes.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a block diagram of the hardware configuration of a computer terminal for a data processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a data processing method according to a first embodiment of the present invention;
FIG. 3 is a schematic diagram of EC in the data processing method according to the first embodiment of the present invention;
FIG. 4 is a schematic diagram of processing the Nth and (N+1)th user data for a chunk in a storage pool in the data processing method according to the first embodiment of the present invention;
FIG. 5 is a schematic diagram of data processing in the data processing method according to the first embodiment of the present invention;
FIG. 6 is a schematic diagram of a write request in a storage pool in the data processing method according to the first embodiment of the present invention;
FIG. 7 is a structural diagram of a data processing device according to a second embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The technical terms used in the present application are as follows:
Erasure Code: a fault-tolerant coding technique. The basic principle is to fragment the stored data, generate k + m pieces of data from k pieces of original data by a certain check calculation, and restore the original data from any k of the k + m pieces. Thus, even if part of the data is lost, the system can still recover the original data;
Block: the basic unit of data, i.e., a data block;
Encoding Group: the basic unit of Erasure Code, comprising k data blocks and m check blocks;
Stripe: the same as Encoding Group;
Chunk: a data container used to store user data.
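The k + m principle defined above can be illustrated with a minimal sketch. It assumes the simplest possible code, a single XOR check block (m = 1); the patent does not name a specific code, and a 4+2 layout would need a code such as Reed-Solomon instead. The function names `xor_blocks`, `encode`, and `recover` are hypothetical.

```python
from functools import reduce
from operator import xor

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

def encode(data_blocks):
    """Return the single check block for k data blocks (assumption: m = 1)."""
    return xor_blocks(data_blocks)

def recover(group, lost_index):
    """Rebuild the block at lost_index from the k surviving blocks of the group."""
    survivors = [b for i, b in enumerate(group) if i != lost_index]
    return xor_blocks(survivors)

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # k = 4 data blocks
group = data + [encode(data)]                 # one Encoding Group: k + m = 5 blocks
```

With this group, `recover(group, 2)` rebuilds the third data block from the other four blocks, which is the "any k of k + m" property for m = 1.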
Example 1
There is also provided, in accordance with an embodiment of the present invention, an embodiment of a method for processing data, it being noted that the steps illustrated in the flowchart of the figure may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than that presented herein.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, FIG. 1 is a block diagram of the hardware structure of a computer terminal for a data processing method according to an embodiment of the present invention. As shown in FIG. 1, the computer terminal 10 may include one or more (only one shown) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. It will be understood by those skilled in the art that the structure shown in FIG. 1 is only an illustration and is not intended to limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store software programs and modules of application software, such as program instructions/modules corresponding to the data processing method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing, i.e., implements the data processing method of the application program by running the software programs and modules stored in the memory 104. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module, which is used to communicate with the Internet wirelessly.
In the above operating environment, the present application provides the data processing method shown in FIG. 2. FIG. 2 is a flowchart of a data processing method according to a first embodiment of the present invention.
Step S202: receive a write request, wherein the write request requests that data be written to any one or more data stripes in a data container of a storage area.
The structure of a data stripe (Stripe) in the data container is shown in FIG. 3, which is a schematic diagram of EC in the data processing method according to the first embodiment of the present invention. Taking 4+2 erasure coding as an example, user data is stored in different chunks; within a chunk, data is organized into Stripes for erasure coding; and one Stripe contains a plurality of data blocks and check blocks.
Step S204: generate a check code when the data of the one or more data stripes in the data container of the storage area is aligned.
The generation of the check code is illustrated in FIG. 4, which is a schematic diagram of processing the Nth and (N+1)th user data for a chunk in a storage pool in the data processing method according to the first embodiment of the present invention. The Nth and (N+1)th user data (UserData) of the chunk are processed as follows:
First, the data is placed block by block (Block) into the data block area of one or more Stripes.
Second, for the Nth user data, the last part, UserData N(3), cannot fill the data area of its Stripe, so padding (Pad) is added and a temporary check code Temp Parity N(3) is generated. The data and the check code are persisted, but the Pad portion does not need to be persisted, because the padding follows a fixed rule, such as all zeros. Meanwhile, UserData N(3) is kept in a temporary cache area.
Third, for the (N+1)th user data, UserData N(3) in the temporary cache is completed with UserData N+1(1), a check code is calculated, and UserData N+1(1) and the check code Parity N+1(1) are persisted. UserData N(3) was already persisted during the previous round, so it does not need to be persisted again. After persistence succeeds, UserData N(3) can be deleted from the temporary cache area.
Fourth, UserData N+1(2) and Parity N+1(2) are processed normally.
Fifth, the unaligned part of the (N+1)th user data is processed in the same way as UserData N(3).
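The second and third steps above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the check code is a single XOR block rather than the two check blocks of a 4+2 code, the block size is an arbitrary 4 bytes, and `encode_tail` and `complete_stripe` are hypothetical names.

```python
from functools import reduce
from operator import xor

BLOCK_SIZE = 4                    # bytes per data block (assumed value)
K = 4                             # data blocks per Stripe, as in the 4+2 example
STRIPE_SIZE = BLOCK_SIZE * K

def xor_parity(stripe: bytes) -> bytes:
    """Stand-in check code: XOR of the K data blocks of one full Stripe."""
    blocks = [stripe[i:i + BLOCK_SIZE] for i in range(0, STRIPE_SIZE, BLOCK_SIZE)]
    return bytes(reduce(xor, col) for col in zip(*blocks))

def encode_tail(tail: bytes):
    """Second step: zero-pad an unaligned tail and compute a temporary check code.
    Only the tail itself is persisted (and cached); the padding is not."""
    padded = tail + bytes(STRIPE_SIZE - len(tail))
    return tail, xor_parity(padded)

def complete_stripe(cached_tail: bytes, new_data: bytes):
    """Third step: fill the cached tail with the head of the next user data.
    Assumes new_data is long enough to complete the Stripe.
    Returns (bytes to persist now, final check code, remaining new data)."""
    need = STRIPE_SIZE - len(cached_tail)
    head, rest = new_data[:need], new_data[need:]
    return head, xor_parity(cached_tail + head), rest

to_persist, temp_parity = encode_tail(b"abcdef")              # UserData N(3)
head, parity, rest = complete_stripe(b"abcdef", b"0123456789AB")
```

Here only `head` is written in the third step, because `b"abcdef"` was already persisted together with the temporary check code.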
Step S206: persist the data and the check code.
As shown in FIG. 5, which is a schematic diagram of data processing in the data processing method according to the first embodiment of the present invention, user data is always appended; it is never padded, invalidated, or overwritten. The handling of check blocks is more involved: because the check code of some Stripes is calculated several times, some temporary check blocks are later superseded by a newer check block of the same Stripe, so garbage collection of Temp Parity has to be added.
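One way to picture the Temp Parity garbage collection is a small bookkeeping structure that keeps the latest check block per Stripe and records superseded ones for later reclamation. The patent does not describe the bookkeeping, so the class `ParityStore`, its version counter, and its method names are all assumptions made for illustration.

```python
class ParityStore:
    """Tracks the latest check block per Stripe and queues superseded ones."""

    def __init__(self):
        self.latest = {}     # stripe_id -> (version, check block)
        self.garbage = []    # (stripe_id, version) of superseded check blocks

    def put(self, stripe_id, parity):
        """Record a newly computed check block; any older one becomes garbage."""
        old = self.latest.get(stripe_id)
        version = 0 if old is None else old[0] + 1
        if old is not None:
            self.garbage.append((stripe_id, old[0]))
        self.latest[stripe_id] = (version, parity)
        return version

    def collect(self):
        """Reclaim all superseded check blocks; returns how many were reclaimed."""
        reclaimed = len(self.garbage)
        self.garbage.clear()
        return reclaimed

store = ParityStore()
store.put(7, b"temp")     # temporary check block computed over padded data
store.put(7, b"final")    # check block computed once the Stripe is full
```

After the second `put`, the temporary check block of Stripe 7 is queued for collection while reads and reconstruction use the latest one.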
In the embodiments of the present invention, a write request is received, wherein the write request requests that data be written to any one or more data stripes in a data container of a storage area; a check code is generated when the data of the one or more data stripes in the data container of the storage area is aligned; and the data and the check code are persisted. Only temporary check blocks need to be generated and simple check-block garbage collection added, which achieves online EC at low cost and with low technical difficulty, and thereby solves the technical problem in the prior art that, because written data may be unaligned, such data is usually handled by read-and-rewrite or by filling in invalid data, which leaves complex holes.
Specifically, FIG. 6 is a schematic diagram of a write request in a storage pool in the data processing method according to the first embodiment of the present invention. As shown in FIG. 6, the data processing method provided in the embodiment of the present application is specifically as follows:
Optionally, before the check code is generated in step S204 when the data in the data container of the storage area is aligned, the data processing method provided in the embodiment of the present application further includes:
Step1: judge whether any one or more data stripes (Stripe) in the data container of the storage area are fully written;
Step2: if they are not fully written, determine that the data in the data container of the storage area is not aligned;
Step3: if they are fully written, determine that the data in the data container of the storage area is aligned.
Further, optionally, after it is determined in Step2 that the data in the data container of the storage area is not aligned, the data processing method provided in the embodiment of the present application further includes:
Step4: write the data to the storage area, wherein if it is detected that a first part of the data has already been written to the storage area, only the second part, which has not yet been written, is written;
Step5: if it is detected that a first Stripe that is not fully written still exists in the data container of the storage area after the data is written, pad the first Stripe according to a predetermined rule until it is fully written.
Optionally, the padded portion is allowed to skip persistence.
Optionally, Step6: if a write request spans different Stripes, persistence is allowed to process the data of each Stripe in parallel.
Optionally, Step7: if it is detected that the data container of the storage area needs no padding after the data is written and data is held in the temporary cache, that cached data is deleted.
Specifically, in combination with Step1 to Step4, the execution flow of the embodiment of the present application after a write request is received is shown in steps 107 to 108 of FIG. 6, as follows:
Step 107: persist the data and the generated check blocks to the storage pool. The padded portion does not need to be persisted, and for a write request that spans Stripes, persistence may proceed in parallel.
Step 108: if this write required no padding and the temporary cache holds data, that data can now be deleted from the temporary cache.
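Steps 107 and 108 might look like the following sketch. The in-memory dictionary `storage_pool` stands in for the real storage pool, a thread pool stands in for whatever parallel I/O mechanism an implementation uses, and `persist_stripe` and `persist_request` are hypothetical names; none of these details come from the patent.

```python
from concurrent.futures import ThreadPoolExecutor

storage_pool = {}   # hypothetical in-memory stand-in for the storage pool

def persist_stripe(stripe_id, data, parity):
    """Persist one Stripe's user data (without padding) and its check block."""
    storage_pool[stripe_id] = (data, parity)
    return stripe_id

def persist_request(stripes, temp_cache, padded):
    """Step 107: persist the Stripes touched by one request in parallel.
    Step 108: if nothing was padded, drop the data held in the temporary cache."""
    with ThreadPoolExecutor() as pool:
        done = list(pool.map(lambda s: persist_stripe(*s), stripes))
    if not padded:
        temp_cache.clear()
    return done

cache = bytearray(b"xy")   # tail left over from an earlier request
done = persist_request([(0, b"aaaa", b"p0"), (1, b"bbbb", b"p1")], cache, padded=False)
```

The cache is cleared only after all Stripes have been persisted, matching the order of steps 107 and 108.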
Optionally, after the write request is received in step S202, the data processing method provided in the embodiment of the present application further includes:
Step1: judge whether the storage area already holds data;
Step2: if so, determine that the content written before the write request contains unaligned data;
Step3: if not, judge whether the data of any one or more Stripes in the data container of the storage area is aligned.
Further, optionally, Step4: if the content written before the write request contains unaligned data, complete the unaligned data with data from the write request and calculate the check code.
Specifically, in combination with Step1 to Step4, the execution flow of the embodiment of the present application after a write request is received is shown in steps 100 to 106 of FIG. 6, as follows:
Step 100: receive a write request;
Step 101: check whether the temporary cache area holds data;
Step 102: if it does, the previous request left unaligned data whose check code was calculated over padding, so data from the current request must be used to complete it;
Step 103: the new Stripe may still not be full, so its alignment must be checked;
Step 104: when the Stripe is not full, the data has to be stored in the temporary cache area; if part of the data is already in the temporary cache area, only the data from the new request needs to be stored, and the data already in the cache area is retained;
Step 105: on the basis of step 104, pad according to the rule (e.g., all zeros) so that the data is aligned to the Stripe;
Step 106: once step 103 has confirmed alignment, or step 105 has padded the data to alignment, calculate and generate the check code.
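Steps 101 to 106 can be sketched for a single Stripe as follows. The XOR check code, the sizes, and the names `check_code` and `build_stripe` are assumptions made for illustration; persistence and the deletion of the cache after a full Stripe (steps 107 and 108) are left to the caller.

```python
from functools import reduce
from operator import xor

BLOCK_SIZE, K = 4, 4              # assumed sizes; K matches the 4+2 example
STRIPE_SIZE = BLOCK_SIZE * K

def check_code(stripe: bytes) -> bytes:
    """Stand-in check code: XOR of the K data blocks of one full Stripe."""
    blocks = [stripe[i:i + BLOCK_SIZE] for i in range(0, STRIPE_SIZE, BLOCK_SIZE)]
    return bytes(reduce(xor, col) for col in zip(*blocks))

def build_stripe(cache: bytearray, request: bytes):
    """Return (check code, padded?, bytes consumed from the request)."""
    take = request[:STRIPE_SIZE - len(cache)]        # steps 101-102: complete cached data
    stripe = bytes(cache) + take
    padded = len(stripe) < STRIPE_SIZE               # step 103: is the Stripe full?
    if padded:
        cache.extend(take)                           # step 104: cache only the new data
        stripe += bytes(STRIPE_SIZE - len(stripe))   # step 105: pad with zeros
    return check_code(stripe), padded, len(take)     # step 106: generate the check code

cache = bytearray(b"abcdef")                         # tail of an earlier request
first = build_stripe(cache, b"0123")                 # still short: padded, cached
second = build_stripe(cache, b"456789XYZ")           # completes the Stripe
```

In the second call only six bytes are consumed; the remaining `b"XYZ"` belongs to the next Stripe.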
Optionally, after the data and the check code are persisted in step S206, the data processing method provided in the embodiment of the present application further includes:
Step S207: judge whether residual data exists in the write request, wherein residual data is data in the write request that has not yet been written to any one or more Stripes;
Step S208: if so, return to the step of generating the check code;
Step S209: if not, the request ends.
Specifically, with reference to steps S207 to S209, the execution flow of the embodiment of the present application after a write request is received is shown in steps 109 to 110 of FIG. 6, as follows:
Step 109: check whether the request still contains unprocessed data and, if so, perform step 103.
Step 110: if not, the request ends.
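The loop formed by steps 109 and 110 can be sketched as a planner that walks a request Stripe by Stripe. The Stripe size of 16 bytes and the function `split_request` are assumptions made for illustration.

```python
STRIPE_SIZE = 16   # assumed Stripe data size in bytes

def split_request(cached_len: int, request_len: int):
    """Plan how a request is consumed, one Stripe per loop iteration.
    cached_len is the amount of unaligned data left by the previous request.
    Returns a list of (offset in request, length, Stripe becomes full?)."""
    plan, offset = [], 0
    while offset < request_len:                        # step 109: data remaining?
        room = STRIPE_SIZE - cached_len
        take = min(room, request_len - offset)
        plan.append((offset, take, cached_len + take == STRIPE_SIZE))
        offset += take
        cached_len = (cached_len + take) % STRIPE_SIZE
    return plan                                        # step 110: request ends

plan = split_request(cached_len=6, request_len=30)
```

For a 30-byte request arriving after a 6-byte cached tail, the plan fills the open Stripe, writes one full Stripe, and leaves a 4-byte unaligned tail that would be padded and cached.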
For the read request flow in the embodiment of the present application, the data area has no gaps and contains no padding. For the data reconstruction flow, if a Stripe has padding, the padded data is regenerated according to the padding rule, and the latest check block of the Stripe is used.
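A minimal sketch of reconstruction with regenerated padding follows. It assumes a single XOR check block, zero padding, and exactly one lost data block per Stripe; the function `rebuild_block` and the block-list convention are hypothetical.

```python
from functools import reduce
from operator import xor

BLOCK_SIZE = 4   # assumed block size in bytes

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(xor, col) for col in zip(*blocks))

def rebuild_block(blocks, latest_parity):
    """blocks holds one entry per data block of the Stripe: the stored bytes
    (shorter than BLOCK_SIZE if partly padding, b"" if pure padding) or None
    for the single lost block. Padding is regenerated as zeros by rule."""
    filled = [b.ljust(BLOCK_SIZE, b"\0") for b in blocks if b is not None]
    return xor_blocks(filled + [latest_parity])

# A Stripe whose only user data is b"abcdef"; the rest was padding, never stored.
parity = xor_blocks([b"abcd", b"ef\0\0", bytes(4), bytes(4)])
rebuilt = rebuild_block([None, b"ef", b"", b""], parity)
```

Because the padding rule is deterministic, the padded bytes never need to be read from storage during reconstruction.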
The data processing method provided by the embodiment of the application needs only to generate temporary check blocks and to add simple check-block garbage collection; it has low cost and low implementation difficulty, and it enables online EC. The method does introduce additional write amplification, and it is therefore suited to scenarios in which the Stripe size is smaller than most write requests.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the data processing method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
According to an embodiment of the present invention, a device for implementing the above data processing method is further provided. As shown in FIG. 7, which is a structural diagram of a data processing device according to a second embodiment of the present invention, the device includes:
a receiving module 72, configured to receive a write request, wherein the write request requests that data be written to any one or more data stripes (Stripe) in a data container of a storage area; a generation module 74, configured to generate a check code when the data of the one or more Stripes in the data container of the storage area is aligned; and a processing module 76, configured to persist the data and the check code.
Example 3
According to still another aspect of the embodiments of the present invention, there is also provided a storage medium including a stored program, wherein when the program runs, a device in which the storage medium is located is controlled to execute the data processing method in embodiment 1.
Example 4
The embodiment of the invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store a program code executed by the data processing method provided in the first embodiment.
Optionally, in this embodiment, the storage medium may be located in any one of computer terminals in a computer terminal group in a computer network, or in any one of mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: receiving a write request, wherein the write request is used for requesting to write data to any one or more data stripes (Stripe) in a data container of a storage area; generating a check code when the data of any one or more data stripes in the data container of the storage area is aligned; and performing persistence processing on the data and the check code.
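The three steps above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the stripe geometry (`UNIT_SIZE`, `UNITS_PER_STRIPE`), the use of XOR parity as the check code, and the in-memory list `store` standing in for persistent storage are all hypothetical choices that the text does not specify.

```python
from functools import reduce

UNIT_SIZE = 4          # bytes per stripe unit (assumed value)
UNITS_PER_STRIPE = 3   # data units per stripe (assumed value)
STRIPE_SIZE = UNIT_SIZE * UNITS_PER_STRIPE


def xor_parity(units):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*units))


def handle_write(data, store):
    """Persist `data` plus a check code if it fills one stripe exactly.

    Returns True when the stripe was aligned and persisted, False otherwise.
    `store` is a plain list used here as a stand-in for persistent storage.
    """
    if len(data) != STRIPE_SIZE:
        return False  # unaligned: no check code is generated in this sketch
    units = [data[i:i + UNIT_SIZE] for i in range(0, STRIPE_SIZE, UNIT_SIZE)]
    store.append((units, xor_parity(units)))
    return True
```

With XOR parity, any single lost unit of a stripe can be rebuilt by XOR-ing the surviving units with the check code, which is why the check code is only meaningful once the whole stripe is present.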
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: before generating the check code on the condition that the data in the data container of the storage area is aligned, judging whether any one or more data stripes in the data container of the storage area are in a fully written state; if any one or more data stripes in the data container of the storage area are not in the fully written state, determining that the data in the data container of the storage area is not aligned; if any one or more data stripes in the data container of the storage area are in the fully written state, determining that the data in the data container of the storage area is aligned.
Further, optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: after determining that the data in the data container of the storage area is not aligned, writing the data into the storage area, wherein if it is detected that a first part of the data has been written into the storage area, a second part of the data that has not yet been written is written into the storage area; and if it is detected that, after the data is written, a first data stripe that is not in the fully written state still exists in the data container of the storage area, padding the first data stripe according to a predetermined rule until the first data stripe is in the fully written state.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: the padded portion is allowed to skip the persistence processing.
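The alignment check and the padding step can be illustrated with a small sketch. The stripe size, the helper names `is_aligned` and `pad_to_full`, and zero-filling as the "predetermined rule" are assumptions made for illustration; the text does not say which padding rule is used.

```python
STRIPE_SIZE = 12  # bytes per stripe (assumed value)


def is_aligned(stripe: bytes) -> bool:
    """A stripe counts as aligned here when it is fully written."""
    return len(stripe) == STRIPE_SIZE


def pad_to_full(stripe: bytes, fill: bytes = b"\x00"):
    """Pad a partially written stripe until it is fully written.

    Returns the padded stripe and the number of padding bytes, so that a
    caller may choose not to persist the padded tail.
    """
    missing = STRIPE_SIZE - len(stripe)
    if missing < 0:
        raise ValueError("stripe is longer than STRIPE_SIZE")
    return stripe + fill * missing, missing
```

Returning the padding length separately reflects the option described above: because the padding follows a known rule, it can be regenerated on read and does not have to be written out.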
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: if write requests are directed to different data stripes, the persistence processing of the data in each data stripe is allowed to be performed in parallel.
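As a rough illustration of persisting different stripes in parallel, the sketch below submits one task per stripe to a thread pool. The function name `persist_stripes`, the dictionary `store` used as a stand-in for persistent storage, and the pool size are hypothetical; a real system would issue I/O to separate devices or files instead.

```python
from concurrent.futures import ThreadPoolExecutor


def persist_stripes(stripes, store, max_workers=4):
    """Persist each stripe independently and return the persisted stripe IDs.

    `stripes` maps a stripe ID to its payload. Each task writes only its own
    key in `store`, so tasks for different stripes do not depend on each other.
    """
    def persist_one(item):
        stripe_id, payload = item
        store[stripe_id] = payload  # placeholder for a real durable write
        return stripe_id

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sorted(pool.map(persist_one, stripes.items()))
```

Parallelism is safe in this sketch only because no two tasks touch the same stripe; writes to the same stripe would still need to be ordered.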
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: if it is detected that the data container of the storage area does not need to be padded after the data is written, and the data is in a temporarily cached state, deleting the data.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: after receiving the write request, judging whether the storage area already has data; if so, determining that the content written before the write request contains unaligned data; if not, judging whether any one or more data stripes in the data container of the storage area are aligned.
Further, optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: if the content written before the write request contains unaligned data, padding the unaligned data with the data of the write request and calculating a check code.
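Completing a previously unaligned stripe with the data of a new write request might look like the sketch below. The function `top_up`, its three-part return value, and the stripe size are assumptions introduced for illustration; the completed stripe returned by it would then be passed to whatever check-code calculation the system uses.

```python
STRIPE_SIZE = 12  # bytes per stripe (assumed value)


def top_up(pending: bytes, data: bytes):
    """Fill earlier unaligned data with bytes taken from a new write request.

    Returns (completed_stripe_or_None, new_pending, rest_of_data):
    - no earlier data: nothing to complete, the request is handled on its own;
    - still too short: the merged bytes stay pending as unaligned data;
    - otherwise: a full stripe is ready for check-code calculation.
    """
    if not pending:
        return None, b"", data
    need = STRIPE_SIZE - len(pending)
    merged = pending + data[:need]
    if len(merged) < STRIPE_SIZE:
        return None, merged, b""
    return merged, b"", data[need:]
```

For example, 8 pending bytes plus a 6-byte request yield one complete 12-byte stripe and 2 bytes left over for the next stripe.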
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: after performing persistence processing on the data and the check code, judging whether remaining data exists in the write request, wherein the remaining data is data in the write request that has not yet been written to any one or more data stripes; if so, returning to the step of generating the check code; if not, ending the request.
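The loop over remaining data can be sketched as follows. As before, the stripe geometry, XOR parity as the check code, zero padding of the final partial stripe, and the list `store` standing in for persistent storage are assumptions for illustration only.

```python
from functools import reduce

UNIT_SIZE = 4     # bytes per stripe unit (assumed value)
STRIPE_SIZE = 12  # bytes per stripe (assumed value)


def check_code(stripe: bytes) -> bytes:
    """XOR parity over the units of one full stripe (illustrative choice)."""
    units = [stripe[i:i + UNIT_SIZE] for i in range(0, STRIPE_SIZE, UNIT_SIZE)]
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*units))


def process_request(data: bytes, store: list) -> int:
    """Persist a write request stripe by stripe; return the number of rounds."""
    rounds = 0
    remaining = data
    while remaining:  # "does remaining data exist in the write request?"
        stripe, remaining = remaining[:STRIPE_SIZE], remaining[STRIPE_SIZE:]
        stripe = stripe.ljust(STRIPE_SIZE, b"\x00")  # pad a short last stripe
        store.append((stripe, check_code(stripe)))   # persist data + check code
        rounds += 1
    return rounds  # no remaining data: the request ends
```

Each pass through the loop corresponds to one return to the check-code generation step; the loop exits when the request has no unwritten data left.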
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (11)

1. A method of processing data, comprising:
receiving a write request, wherein the write request is used for requesting to write data to any one or more data bands in a data container of a storage area;
generating a check code under the condition that the data of any one or more data bands in the data container of the storage area are aligned;
and carrying out persistence processing on the data and the check code.
2. The method of claim 1, wherein, in the event that data in the data containers of the storage area are all aligned, prior to generating a check code, the method further comprises:
judging whether any one or more data bands in the data container of the storage area are in a full-written state;
determining that data in the data containers of the storage area are not aligned if any one or more data bands in the data containers of the storage area are not in the fully written state;
and if any one or more data bands in the data container of the storage area are in the full-written state, determining that the data in the data container of the storage area are aligned.
3. The method of claim 2, wherein after determining that data in the data container of the storage area is not aligned, the method further comprises:
writing the data into the storage area, wherein if it is monitored that a first part of the data is written into the storage area, a second part of the data which is not written into the storage area is written into the storage area;
if it is detected that the first data band in the non-fully written state still exists in the data container of the storage area after the data is written, filling processing is performed on the first data band according to a preset rule until the first data band is in the fully written state.
4. The method of claim 3, wherein the portion produced by the filling processing is allowed not to undergo the persistence processing.
5. The method of claim 3, wherein if write requests are directed to different data bands, the persistence processing of the data in each data band is allowed to be performed in parallel.
6. The method of claim 3, wherein if it is detected that the data container of the storage area does not need to be filled after the data is written and the data is in a temporarily cached state, the data is deleted.
7. The method of claim 1, wherein after receiving a write request, the method further comprises:
judging whether the storage area already has data or not;
if so, determining that the content written before the write request has unaligned data;
if not, judging whether any one or more data bands in the data container of the storage area are aligned.
8. The method of claim 7, wherein if the content written before the write request has unaligned data, the unaligned data is padded with the data of the write request and a check code is calculated.
9. The method of any of claims 1 to 8, wherein after persisting the data and the check code, the method further comprises:
judging whether residual data exist in the write request, wherein the residual data are used for representing data which are not written into any one or more data bands in the write request;
if yes, returning to execute the step of generating the check code;
if not, the request ends.
10. An apparatus for processing data, comprising:
a receiving module, configured to receive a write request, wherein the write request is used for requesting to write data to any one or more data bands in a data container of a storage area;
a generating module, configured to generate a check code under the condition that the data of any one or more data bands in the data container of the storage area are aligned;
and a processing module, configured to perform persistence processing on the data and the check code.
11. A storage medium comprising a stored program, wherein an apparatus in which the storage medium is located is controlled to execute the data processing method of claim 1 when the program is executed.
CN201910032459.2A 2019-01-14 2019-01-14 Data processing method and device Active CN111435288B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910032459.2A CN111435288B (en) 2019-01-14 2019-01-14 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910032459.2A CN111435288B (en) 2019-01-14 2019-01-14 Data processing method and device

Publications (2)

Publication Number Publication Date
CN111435288A true CN111435288A (en) 2020-07-21
CN111435288B CN111435288B (en) 2023-05-02

Family

ID=71580266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910032459.2A Active CN111435288B (en) 2019-01-14 2019-01-14 Data processing method and device

Country Status (1)

Country Link
CN (1) CN111435288B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101727299A (en) * 2010-02-08 2010-06-09 北京同有飞骥科技有限公司 RAID5-orientated optimal design method for writing operation in continuous data storage
US20150113222A1 (en) * 2013-10-18 2015-04-23 International Business Machines Corporation Read and Write Requests to Partially Cached Files
CN105824583A (en) * 2016-04-18 2016-08-03 北京鲸鲨软件科技有限公司 Processing method and device for improving sequential writing efficiency of erasure correction code clustered file system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YING-BO LIU; HUI-MEI DAI; FENG WANG; WEI DAI; HUI DENG: "Efficient Indexing and Querying of Massive Astronomical Data Using Compressed Word-Aligned Hybrid Bitmap" *
孙志卓;张全新;李元章;谭毓安;刘靖宇;马忠梅: "连续数据存储中面向RAID5的写操作优化设计" [RAID5-oriented optimal design of write operations in continuous data storage] *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113132661A (en) * 2021-03-11 2021-07-16 深圳市阿达视高新技术有限公司 Video data storage method and device, storage medium and camera equipment
CN113132661B (en) * 2021-03-11 2022-04-12 深圳市阿达视高新技术有限公司 Video data storage method and device, storage medium and camera equipment
CN114579352A (en) * 2022-04-29 2022-06-03 阿里云计算有限公司 Data reconstruction method and device
CN117389484A (en) * 2023-12-12 2024-01-12 深圳大普微电子股份有限公司 Data storage processing method, device, equipment and storage medium
CN117389484B (en) * 2023-12-12 2024-04-26 深圳大普微电子股份有限公司 Data storage processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111435288B (en) 2023-05-02

Similar Documents

Publication Publication Date Title
CN105808151B (en) Solid state hard disk stores the data access method of equipment and solid state hard disk storage equipment
CN111435288A (en) Data processing method and device
EP2908254A1 (en) Data redundancy implementation method and device
CN103959256A (en) Fingerprint-based data deduplication
CN104765693A (en) Data storage method, device and system
EP2989549A1 (en) Reference counter integrity checking
CN107807792A (en) A kind of data processing method and relevant apparatus based on copy storage system
CN108733311B (en) Method and apparatus for managing storage system
CN110941514B (en) Data backup method, data recovery method, computer equipment and storage medium
CN105206306A (en) Method of Handling Error Correcting Code in Non-volatile Memory and Non-volatile Storage Device Using the Same
CN112000627B (en) Data storage method, system, electronic equipment and storage medium
CN105471714A (en) Message processing method and device
US11455100B2 (en) Handling data slice revisions in a dispersed storage network
CN110018783A (en) A kind of date storage method, apparatus and system
CN109582213A (en) Data reconstruction method and device, data-storage system
CN113311993A (en) Data storage method and data reading method
CN111782152A (en) Data storage method, data recovery device, server and storage medium
CN107608821B (en) Data reading method, device and equipment based on erasure codes
CN109196478B (en) Fault tolerant enterprise object storage system for small objects
CN110889143A (en) File verification method and device
US10402262B1 (en) Fencing for zipheader corruption for inline compression feature system and method
CN102523205A (en) Determination method and device for content checksum
CN111211993A (en) Incremental persistence method and device for streaming computation
CN110968255B (en) Data processing method, device, storage medium and processor
CN109144766B (en) Data storage and reconstruction method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40033301; Country of ref document: HK)
GR01 Patent grant