CN104881466A - Method and device for processing data fragments and deleting garbage files - Google Patents

Info

Publication number
CN104881466A
Authority
CN
China
Prior art keywords
data
information
file
division
iterative
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510271710.2A
Other languages
Chinese (zh)
Other versions
CN104881466B (en)
Inventor
徐佩林
颜世光
覃安
李康
梁栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201510271710.2A
Publication of CN104881466A
Application granted
Publication of CN104881466B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Abstract

An embodiment of the invention discloses a method and device for processing data shards (data fragments) and for deleting garbage files. The method for processing data shards comprises: while a distributed totally ordered storage system generates shards of totally ordered data, obtaining at least one piece of attribute description information corresponding to each data shard, wherein the attribute description information comprises data iteration information; writing the attribute description information into the file meta-information corresponding to the data shard; and, when an instruction to process at least one target data shard is received, processing the data iteration information in the file meta-information corresponding to the target data shard so as to process the target data shard. With this technical scheme, a target data shard can be processed without moving or modifying any data file, which optimizes the shard-processing mechanism of existing distributed totally ordered storage systems and meets the growing demand for convenient and efficient processing of data shards.

Description

Method and device for processing data shards and for deleting garbage files
Technical field
Embodiments of the present invention relate to computer technology, and in particular to a method and device for processing data shards and to a method and device for deleting garbage files.
Background
In general, a database stores data mainly as key-value pairs. Each key has a corresponding value; the value can be found through its key, and data operations can then be performed on that value. In addition, to allow fast reads and writes, the data stored in a database is generally totally ordered.
Logically, totally ordered data is one very large data set sorted by key (with a row count on the order of trillions or more). Because of its volume, such a data set cannot be stored in full on only one or a few servers. An existing distributed totally ordered storage system therefore spreads the massive totally ordered data across the data shards of a server cluster. Different data shards are stored on one or more shard servers, and the range of data held by each shard is recorded centrally in the shard meta-information of a management server. In this way, a single management server schedules and configures multiple shard servers, so that all kinds of operations on the totally ordered data can be carried out.
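The role of the shard meta-information described above can be illustrated with a minimal Python sketch. It assumes single-letter keys, inclusive key ranges, and a table sorted by start key; the names `ShardMeta`, `SHARD_TABLE`, `locate` and the server names are hypothetical and do not come from the patent.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardMeta:
    start_key: str  # first key stored in the shard (inclusive)
    end_key: str    # last key stored in the shard (inclusive)
    server: str     # hypothetical name of the shard server hosting the shard


# Hypothetical shard meta-information held by the management server,
# kept sorted by start_key.
SHARD_TABLE = [
    ShardMeta("A", "M", "shard-server-1"),
    ShardMeta("N", "Z", "shard-server-2"),
]


def locate(key, table=SHARD_TABLE):
    """Return the shard entry whose key range contains `key`."""
    starts = [entry.start_key for entry in table]
    idx = bisect_right(starts, key) - 1  # last entry starting at or before key
    if idx < 0 or key > table[idx].end_key:
        raise KeyError(key)
    return table[idx]
```

Because the data is totally ordered, a binary search over the start keys is enough to find the shard server responsible for a key.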
The totally ordered data stored in a database changes dynamically. As data is continually added and deleted, shard sizes change, so larger shards need to be split and smaller shards need to be merged. How to split and merge the data shards that store totally ordered data reasonably and efficiently is currently an important research topic.
Existing shard split/merge methods are mainly of the following two kinds:
1. Offline split/merge. This scheme stops service while a shard is split or merged. The old data in the shard is written offline into new shards, and the shard meta-information is then modified so that the change takes effect. This approach is very inefficient: it needs double the bandwidth and computing resources and a long service interruption, which is unacceptable in scenarios with strict real-time requirements.
2. Split based on file links. In this scheme, the physical storage of a shard corresponds to a file system directory. To generate a new shard, it is only necessary to create links to the old files under the new shard's directory. This scheme moves no data and splits online, without stopping service. However, it depends on the link functionality of the file system and cannot implement shard merging effectively.
Summary of the invention
In view of this, embodiments of the present invention provide a method and device for processing data shards and a method and device for deleting garbage files, so as to optimize the shard-processing mechanism of existing distributed totally ordered storage systems and meet the growing demand for convenient and efficient processing of data shards.
In a first aspect, an embodiment of the present invention provides a method for processing data shards in a distributed totally ordered storage system, comprising:
While totally ordered data shards are generated in the distributed totally ordered storage system, obtaining at least one piece of attribute description information corresponding to the data shard, wherein the attribute description information comprises data iteration information;
Writing the attribute description information into the file meta-information corresponding to the data shard;
When a processing instruction for at least one target data shard is received, processing the data iteration information in the file meta-information corresponding to the target data shard, so as to process the target data shard.
In a second aspect, an embodiment of the present invention provides a method for deleting garbage files, comprising:
Querying the file meta-information corresponding to each data shard in a distributed totally ordered storage system to obtain a first file list, wherein the file meta-information stores attribute description information corresponding to the data shard, and the attribute description information includes data iteration information;
Scanning the file system corresponding to the distributed totally ordered storage system to obtain a second file list;
Calculating the difference set of the first file list and the second file list as a list of files to be deleted;
Deleting the data files in the distributed totally ordered storage system that match the list of files to be deleted.
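The four steps of the second aspect can be sketched in Python as follows. This is only an illustration under stated assumptions: the text says "difference set" without fixing a direction, and the sketch assumes the files to delete are those found on disk (second list) that no file meta-information references (first list). The function names, the `"path"` field and the demo file names are hypothetical.

```python
import os
import tempfile


def referenced_files(file_metas):
    """First file list: data files named by some file meta-information."""
    return {meta["path"] for meta in file_metas}


def scan_files(root):
    """Second file list: every file actually present under root."""
    return {
        os.path.join(dirpath, name)
        for dirpath, _dirs, names in os.walk(root)
        for name in names
    }


def delete_garbage(file_metas, root):
    """Delete and return the files on disk that no meta-information references."""
    to_delete = scan_files(root) - referenced_files(file_metas)
    for path in to_delete:
        os.remove(path)
    return to_delete


def demo():
    """Create three files, reference two of them, and delete the third."""
    with tempfile.TemporaryDirectory() as root:
        paths = [os.path.join(root, n) for n in ("a.dat", "b.dat", "c.dat")]
        for p in paths:
            open(p, "w").close()
        metas = [{"path": paths[0]}, {"path": paths[1]}]
        deleted = delete_garbage(metas, root)
        return [os.path.basename(p) for p in deleted], sorted(os.listdir(root))
```

Since splits and merges only rewrite meta-information, a data file becomes garbage once no file meta-information points at it any longer, which is why a set difference is sufficient here.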
In a third aspect, an embodiment of the present invention provides a device for processing data shards in a distributed totally ordered storage system, comprising:
An attribute description information obtaining module, configured to obtain, while totally ordered data shards are generated in the distributed totally ordered storage system, at least one piece of attribute description information corresponding to the data shard, wherein the attribute description information comprises data iteration information;
An attribute description information writing module, configured to write the attribute description information into the file meta-information corresponding to the data shard;
A file meta-information processing module, configured to process, when a processing instruction for at least one target data shard is received, the data iteration information in the file meta-information corresponding to the target data shard, so as to process the target data shard.
In a fourth aspect, an embodiment of the present invention provides a device for deleting garbage files, comprising:
A first file list obtaining unit, configured to query the file meta-information corresponding to each data shard in a distributed totally ordered storage system to obtain a first file list, wherein the file meta-information stores attribute description information corresponding to the data shard, and the attribute description information includes data iteration information;
A second file list obtaining unit, configured to scan the file system corresponding to the distributed totally ordered storage system to obtain a second file list;
A to-be-deleted file list calculating unit, configured to calculate the difference set of the first file list and the second file list as a list of files to be deleted;
A data file deleting unit, configured to delete the data files in the distributed totally ordered storage system that match the list of files to be deleted;
The first file list obtaining unit is specifically configured to:
Obtain the file meta-information corresponding to each data shard in the distributed totally ordered storage system as pending file meta-information;
According to the data iteration information and the file storage location information in each piece of pending file meta-information, obtain the split file meta-information and/or the merged file meta-information included in the pending file meta-information, wherein the split file meta-information is file meta-information produced by split processing, and the merged file meta-information is file meta-information produced by merge processing;
Generate the first file list according to the split file meta-information and/or the merged file meta-information.
By writing the data iteration information of a data shard into the file meta-information corresponding to that shard, embodiments of the present invention make it possible, when a processing instruction for a target data shard is received, to process the target data shard merely by processing its corresponding file meta-information, without directly processing the data files stored in the shard. No data file has to be moved or modified, which optimizes the shard-processing mechanism of existing distributed totally ordered storage systems and meets the growing demand for convenient and efficient processing of data shards.
Brief description of the drawings
Fig. 1 is a flowchart of a method for processing data shards in a distributed totally ordered storage system according to the first embodiment of the invention;
Fig. 2 is a flowchart of a method for processing data shards in a distributed totally ordered storage system according to the second embodiment of the invention;
Fig. 3 is a flowchart of a method for processing data shards in a distributed totally ordered storage system according to the third embodiment of the invention;
Fig. 4 is a schematic diagram of the relationship among shard meta-information, file meta-information and data shards as used in embodiments of the invention;
Fig. 5 is a schematic diagram of the information exchange between the management server and a shard server in embodiments of the invention;
Fig. 6 is a schematic diagram of the actual time consumed by splitting and merging data shards in embodiments of the invention;
Fig. 7 is a flowchart of a method for deleting garbage files according to the fourth embodiment of the invention;
Fig. 8 is a flowchart of a specific garbage file deletion process according to the fourth embodiment of the invention;
Fig. 9 is a structural diagram of a device for processing data shards in a distributed totally ordered storage system according to the fifth embodiment of the invention;
Fig. 10 is a structural diagram of a device for deleting garbage files according to the sixth embodiment of the invention.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, specific embodiments of the invention are described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention rather than everything. Before the exemplary embodiments are discussed in more detail, it should be mentioned that some of them are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously, and the order of the operations can be rearranged. A process may terminate when its operations are completed, but may also include additional steps not shown in the drawings. A process may correspond to a method, function, procedure, subroutine, subprogram, and so on.
First embodiment
Fig. 1 is a flowchart of a method for processing data shards in a distributed totally ordered storage system according to the first embodiment of the invention. The method of this embodiment can be performed by a device for processing data shards in a distributed totally ordered storage system. The device can be implemented in hardware and/or software, can generally be integrated into a shard server of the distributed totally ordered storage system, and is used together with the management server of that system. The method of this embodiment specifically comprises:
110. While totally ordered data shards are generated in the distributed totally ordered storage system, obtain at least one piece of attribute description information corresponding to the data shard, wherein the attribute description information comprises data iteration information.
As noted above, totally ordered data of more than a PB (petabyte, one thousand terabytes) is difficult to store in full on a single server. A distributed totally ordered storage system is therefore used to shard the totally ordered data and store the shards on different shard servers.
In this embodiment, while totally ordered data shards are generated in the distributed totally ordered storage system, attribute description information that includes data iteration information is obtained for each data shard.
The data iteration information identifies the range of totally ordered data contained in its corresponding data shard. Accordingly, it may comprise the start data identifier (typically the key of a key-value pair) and the end data identifier of the totally ordered data stored in the shard; or the start data serial number and the end data serial number of that data; or the start data identifier together with the number of totally ordered data items contained, and so on. This embodiment places no limit on this.
For example, suppose the totally ordered data is sorted by key from A to Z and the distributed totally ordered storage system divides it into two data shards. For the three cases above, the data iteration information of the first data shard may be (A->M) and that of the second (N->Z); or (1->13) and (14->26); or (A, 13) and (N, 13).
The attribute description information describes the basic attributes of its corresponding data shard. Besides the data iteration information, it may also include the physical address at which the data shard is stored, the size of the files stored in the shard, other flag information, and so on. This embodiment places no limit on this.
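The three forms of data iteration information in the A to Z example can be written down as small Python records. This is a sketch only: the class names `KeyRange`, `SerialRange` and `StartCount`, the 1-based serial numbers, and the conversion helpers are assumptions made for illustration.

```python
import string
from dataclasses import dataclass

KEYS = list(string.ascii_uppercase)  # the totally ordered keys A..Z


@dataclass(frozen=True)
class KeyRange:      # start and end data identifiers
    start: str
    end: str


@dataclass(frozen=True)
class SerialRange:   # 1-based start and end serial numbers
    first: int
    last: int


@dataclass(frozen=True)
class StartCount:    # start identifier plus number of data items
    start: str
    count: int


def to_serial(r):
    """Convert a key range into the equivalent serial-number range."""
    return SerialRange(KEYS.index(r.start) + 1, KEYS.index(r.end) + 1)


def to_start_count(r):
    """Convert a key range into a start identifier and an item count."""
    s = to_serial(r)
    return StartCount(r.start, s.last - s.first + 1)
```

All three forms describe the same range, so an implementation can pick whichever one is cheapest to evaluate for its storage format.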
120. Write the attribute description information into the file meta-information corresponding to the data shard.
In this embodiment, the file meta-information stores the attribute description information of the data shard. The file meta-information is generally stored on the shard server together with its corresponding data shard.
130. When a processing instruction for at least one target data shard is received, process the data iteration information in the file meta-information corresponding to the target data shard, so as to process the target data shard.
As noted above, the data in the totally ordered data files stored in a distributed totally ordered storage system changes dynamically. As data is continually added and deleted, the size of each data shard changes, so larger data shards need to be split and smaller data shards need to be merged.
Accordingly, in this embodiment, the management server can, based on the size of each data shard, periodically send a processing instruction to the shard server hosting a data shard that needs to be split or merged, so that the shard server completes the corresponding split or merge operation. Here, a target data shard is a data shard that needs to be split or merged.
In this embodiment, the file meta-information stores data iteration information, and the data iteration information identifies the range of totally ordered data contained in its corresponding data shard. The shard server therefore does not need to operate on the target data shard itself to split or merge it; it only needs to split or merge the data iteration information in the file meta-information corresponding to the target data shard.
Typically, when a split instruction for a target data shard is received, the target data shard is split by splitting the data iteration information in its corresponding file meta-information and generating at least two pieces of file meta-information corresponding to the target data shard. When a merge instruction for at least two target data shards is received, the shards are merged by merging the data iteration information in the at least two pieces of file meta-information corresponding to them and generating file meta-information corresponding to the at least two target data shards.
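A metadata-only split and merge might look like the following Python sketch. It is one possible reading, not the patent's implementation: `FileMeta`, `split_meta` and `merge_metas` are hypothetical names, key ranges are assumed inclusive, and the merge is assumed to coalesce two ranges only when they point at the same data file and otherwise to keep both pieces of meta-information side by side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    path: str   # location of the data file; never rewritten by split or merge
    start: str  # first key visible through this meta-information
    end: str    # last key visible through this meta-information


def split_meta(meta, left_end, right_start):
    """Split one FileMeta into two that reference the same data file."""
    if not (meta.start <= left_end < right_start <= meta.end):
        raise ValueError("split point outside the shard's range")
    return (FileMeta(meta.path, meta.start, left_end),
            FileMeta(meta.path, right_start, meta.end))


def merge_metas(left, right):
    """Return the file meta-information describing the merged shard."""
    if left.end >= right.start:
        raise ValueError("ranges must be ordered and disjoint")
    if left.path == right.path:
        # Same underlying file: one wider range is enough.
        return [FileMeta(left.path, left.start, right.end)]
    # Different files: the merged shard references both, unchanged.
    return [left, right]
```

In both directions only small records are created; the data file at `path` is read by later accesses but is not copied, moved or edited.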
By writing the data iteration information of a data shard into the file meta-information corresponding to that shard, this embodiment makes it possible, when a processing instruction for a target data shard is received, to process the target data shard merely by processing its corresponding file meta-information, without directly processing the data files stored in the shard. No data file has to be moved or modified, which optimizes the shard-processing mechanism of existing distributed totally ordered storage systems and meets the growing demand for convenient and efficient processing of data shards.
On the basis of the above embodiment, the attribute description information may further comprise an effective flag, which identifies whether the data iteration information is in effect. Whether the information is in effect governs which data on the corresponding data shard can be accessed. If the effective flag is marked "not in effect", all data on the data shard can be accessed; if the effective flag is marked "in effect", only part of the data on the data shard can be accessed.
For example, after a data shard is generated, the first file meta-information corresponding to it contains two pieces of attribute description information: data iteration information and an effective flag. The data iteration information is (A->G) and the effective flag is set to "not in effect". Accordingly, all data in the data shard (the data for A to G) can be accessed through the first file meta-information.
After a split instruction for this data shard is received, the shard server can split the first file meta-information into two pieces of file meta-information (the second and the third file meta-information). The data iteration information in the second file meta-information is (A->C), and its effective flag is set to "in effect". Accordingly, only part of the data in the data shard (the data for A to C) can be accessed through the second file meta-information.
The reason for this design is that, on a shard server, putting data iteration information into effect consumes additional CPU (Central Processing Unit) resources. Without an effective flag, the data iteration information would have to be in effect in every case, which greatly increases CPU consumption. By introducing the effective flag into the attribute description information, whether to put the data iteration information into effect can be chosen according to the actual situation, which significantly reduces CPU consumption and improves CPU processing efficiency.
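The A to G example can be sketched in Python as follows, under the assumption that "in effect" means a per-key range check is applied on access and "not in effect" means the check is skipped. The `FileMeta` record, its `effective` field and the `visible` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    start: str
    end: str
    effective: bool  # whether the range restriction is applied on access


def visible(meta, rows):
    """Return the rows of the data file that are visible through `meta`."""
    if not meta.effective:
        # No per-key comparison is performed: the whole file is visible.
        return dict(rows)
    # Range check per key; this is the extra CPU work the flag can avoid.
    return {k: v for k, v in rows.items() if meta.start <= k <= meta.end}


# Hypothetical shard contents with keys A..G.
rows = {k: k.lower() for k in "ABCDEFG"}
first = FileMeta("A", "G", effective=False)   # freshly generated shard
second = FileMeta("A", "C", effective=True)   # left half after a split
```

With these definitions, `visible(first, rows)` returns every row without comparing keys, while `visible(second, rows)` filters down to the keys A, B and C.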
Second embodiment
Fig. 2 is a flowchart of a method for processing data shards in a distributed totally ordered storage system according to the second embodiment of the invention. This embodiment is refined on the basis of the above embodiment. In this embodiment, the processing instruction is specifically a split instruction for a data shard. Accordingly, the step of processing, when a processing instruction for at least one target data shard is received, the data iteration information in the file meta-information corresponding to the target data shard so as to process the target data shard is refined into: receiving a split instruction for a target data shard sent by the management server; obtaining, according to the split instruction, the target file meta-information corresponding to the target data shard; splitting the data iteration information in the target file meta-information to generate at least two pieces of split file meta-information; and returning the split result to the management server, to instruct the management server to modify the corresponding shard meta-information according to the split result;
Meanwhile, the step of splitting the data iteration information in the target file meta-information to generate at least two pieces of split file meta-information is refined into: obtaining at least one node data identifier from the totally ordered data corresponding to the target data shard, wherein the node data is data lying between the start data and the end data of that totally ordered data; generating at least two pieces of split data iteration information according to the start data identifier and the end data identifier included in the data iteration information of the target file meta-information and the at least one node data identifier; and generating at least two pieces of split file meta-information according to the at least two pieces of split data iteration information.
Accordingly, the method of this embodiment specifically comprises:
210. While totally ordered data shards are generated in the distributed totally ordered storage system, obtain at least one piece of attribute description information corresponding to the data shard.
220. Write the attribute description information into the file meta-information corresponding to the data shard.
230. Receive a split instruction for a target data shard sent by the management server.
In this embodiment, the split instruction sent by the management server includes information about the target data shard that needs to be split, and instructs the shard server to split the target data shard into at least two data shards.
240. Obtain, according to the split instruction, the target file meta-information corresponding to the target data shard.
In this embodiment, the shard server can obtain the target file meta-information corresponding to the target data shard from the target data shard information included in the split instruction.
250. Obtain at least one node data identifier from the totally ordered data corresponding to the target data shard, wherein the node data is data lying between the start data and the end data of that totally ordered data.
In this embodiment, to split a target data shard it is only necessary to turn the target file meta-information corresponding to it into at least two pieces of file meta-information (referred to below as split file meta-information). That is, the data iteration information in the target file meta-information is turned into at least two pieces of data iteration information (referred to below as split data iteration information).
For example, suppose the data iteration information in the target file meta-information is (A->Z). From this data iteration information and its corresponding totally ordered data (the data whose keys run from A to Z), two pieces of split data iteration information, (A->M) and (N->Z), can be generated. Accordingly, two pieces of split file meta-information can be generated, containing (A->M) and (N->Z) respectively.
To achieve this, at least one node data identifier must first be obtained from the totally ordered data corresponding to the target data shard. The node data identifiers obtained are combined with the start data identifier and the end data identifier in the data iteration information of the target file meta-information to form the start and end data identifiers of the new split data iteration information.
Continuing the example, two node data identifiers, M and N, are obtained from the totally ordered data with keys from A to Z corresponding to the target data shard; the node data corresponding to M and N lies between the start data and the end data of that totally ordered data.
260. Generate at least two pieces of split data iteration information according to the start data identifier and the end data identifier included in the data iteration information of the target file meta-information and the at least one node data identifier.
Continuing the example, from the start data identifier A, the end data identifier Z and the two node data identifiers M and N, two pieces of split data iteration information, (A->M) and (N->Z), can be generated.
270. Generate at least two pieces of split file meta-information according to the at least two pieces of split data iteration information.
280. Return the split result to the management server, to instruct the management server to modify the corresponding shard meta-information according to the split result.
In general, the management server stores and maintains the shard meta-information, which mainly records which shard server hosts each data shard of the totally ordered data. In this embodiment, data shards are split and merged by processing file meta-information rather than by operating on the data shards directly. The existing shard meta-information therefore also needs to store information about the corresponding file meta-information. Then, when a data shard needs to be accessed, the shard meta-information is searched first, the corresponding file meta-information is found from the matching shard meta-information, and finally the matching data shard is found from the matching file meta-information and accessed.
As described above, after one piece of file meta-information is split into at least two pieces of split file meta-information, the split result needs to be returned to the management server, to instruct the management server to modify the corresponding shard meta-information according to the split result.
The method of the present embodiment is when receiving the division instruction to target data burst, without the need to directly dividing the data file stored in this target data burst, process by means of only carrying out division to the file meta-information corresponding with described target data burst, the technique effect that described target data burst is divided can be completed, without the need to carrying out any movement or amendment to data file, optimize the treatment mechanism of data fragmentation in existing distributed total order storage system, meet the processing demands of data fragmentation of the growing facilitation of people, high efficiency.
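The split of the data iteration information in steps 260 and 270 can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not the implementation of the embodiment: the class `IterInfo`, the function `split_iter_info`, and the use of plain string comparison as the key ordering are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IterInfo:
    """Hypothetical stand-in for data iteration information: a key range."""
    start: str  # start point data identifier
    end: str    # end point data identifier


def split_iter_info(info, left_end, right_start):
    """Split one key range into two at a pair of node data identifiers.

    left_end and right_start play the role of the node data identifiers
    (M and N in the example above); both must lie strictly between the
    start point and the end point. Only metadata is produced here; no
    data file is read, moved, or modified.
    """
    if not (info.start < left_end < right_start < info.end):
        raise ValueError("node identifiers must lie inside the range, in order")
    return IterInfo(info.start, left_end), IterInfo(right_start, info.end)


left, right = split_iter_info(IterInfo("A", "Z"), "M", "N")
print(left, right)
```

With the values of the example, the two returned ranges correspond to (A->M) and (N->Z).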
3rd embodiment
Fig. 3 is a flowchart of a method for processing data shards in a distributed totally ordered storage system according to the third embodiment of the present invention. The present embodiment is optimized on the basis of the above embodiments. In the present embodiment, the processing instruction is specifically a merge instruction for data shards. Accordingly, the step of processing, upon receiving a processing instruction for at least one target data shard, the data iteration information in the file meta-information corresponding to the target data shard so as to process the target data shard, is specifically optimized as: receiving a merge instruction for at least two target data shards sent by the management server; obtaining, according to the merge instruction, at least two pieces of target file meta-information corresponding to the at least two target data shards; merging the data iteration information in the at least two pieces of target file meta-information to generate merged file meta-information; and returning the merge result to the management server, to instruct the management server to modify the corresponding shard meta-information according to the merge result.
Meanwhile, the step of merging the data iteration information in the at least two pieces of target file meta-information to generate merged file meta-information is specifically optimized as: obtaining the data iteration information included in the at least two pieces of target file meta-information; determining a merged start point data identifier and a merged end point data identifier according to each start point data identifier and each end point data identifier included in the at least two pieces of data iteration information, and the ordering of the totally ordered data corresponding to the target data shards; generating merged data iteration information according to the merged start point data identifier and the merged end point data identifier; and generating merged file meta-information according to the merged data iteration information.
Accordingly, the method for the present embodiment specifically comprises:
310, in the process of generating a totally ordered data shard in the distributed totally ordered storage system, obtain at least one piece of attribute description information corresponding to the data shard.
320, write the attribute description information into the file meta-information corresponding to the data shard.
330, receive a merge instruction for at least two target data shards sent by the management server.
In the present embodiment, the merge instruction sent by the management server includes information about the at least two target data shards to be merged, and the merge instruction instructs the shard server to merge the at least two target data shards into one data shard.
340, obtain, according to the merge instruction, at least two pieces of target file meta-information corresponding to the at least two target data shards.
In the present embodiment, the shard server can obtain the at least two pieces of target file meta-information corresponding to the target data shards according to the information about the at least two target data shards included in the merge instruction.
350, obtain the data iteration information included in the at least two pieces of target file meta-information.
360, determine a merged start point data identifier and a merged end point data identifier according to each start point data identifier and each end point data identifier included in the at least two pieces of data iteration information, and the ordering of the totally ordered data corresponding to the target data shards.
370, generate merged data iteration information according to the merged start point data identifier and the merged end point data identifier.
In a specific example, the data iteration information included in the at least two pieces of target file meta-information obtained is (A->M) and (N->Z) respectively, so the data identifiers included in the two pieces of data iteration information are A, M, N, and Z. The totally ordered data corresponding to the data shards is ordered by Key value from A to Z. It can therefore be determined that the merged start point data identifier is A and the merged end point data identifier is Z, and accordingly the merged data iteration information generated is (A->Z).
380, generate merged file meta-information according to the merged data iteration information.
390, return the merge result to the management server, to instruct the management server to modify the corresponding shard meta-information according to the merge result.
With the method of the present embodiment, when a merge instruction for at least two target data shards is received, the data files stored in the at least two target data shards do not need to be merged directly. Merging the file meta-information corresponding to the at least two target data shards is sufficient to achieve the technical effect of merging the target data shards, without moving or modifying any data file. This optimizes the data shard processing mechanism of existing distributed totally ordered storage systems and meets the growing demand for convenient and efficient data shard processing.
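Steps 350 to 370 can be sketched as taking the smallest start point identifier and the largest end point identifier under the key ordering. The sketch below is an assumption-laden illustration: the function name `merge_iter_info`, the representation of data iteration information as `(start, end)` tuples, and the use of plain string comparison as the ordering are hypothetical, and the sketch does not check that the ranges are adjacent.

```python
def merge_iter_info(ranges):
    """Merge key ranges, given as (start, end) tuples, into one covering range.

    The merged start point is the smallest start identifier and the merged
    end point is the largest end identifier under the key ordering.
    Adjacency of the input ranges is assumed, not verified.
    """
    if len(ranges) < 2:
        raise ValueError("merging needs at least two ranges")
    merged_start = min(start for start, _ in ranges)
    merged_end = max(end for _, end in ranges)
    return merged_start, merged_end


print(merge_iter_info([("A", "M"), ("N", "Z")]))
```

Applied to the ranges produced by the split sketch of the second embodiment, this restores the original range, which is consistent with splitting and merging being a pair of reversible operations on metadata.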
It can be clearly seen from the second embodiment and the third embodiment that the method of the present embodiment in effect introduces a file metadata layer between the shard meta-information and the data shards. Because the file meta-information stored in this file metadata layer contains only the attribute information of the data shards, it is physically a very small file, typically only a few megabytes in size. Once the file metadata layer is introduced, split and merge operations on data shards are isolated from the actual data shards and become purely operations on file meta-information. Since file meta-information is a very small file, it can be processed very quickly.
Fig. 4 is a schematic diagram of the relationship among shard meta-information, file meta-information, and data shards. As shown in Fig. 4, shard meta-information is stored in the shard meta-information layer, file meta-information is stored in the file metadata layer, and data shards are stored in the file system layer. The shard meta-information layer is generally located in the management server, while the file metadata layer and the file system layer are generally located in the shard servers. The shard meta-information stores information about one or more pieces of file meta-information, and the file meta-information stores information about data shards. File meta-information and data shards are not in one-to-one correspondence: one data shard may correspond to one or more pieces of file meta-information (for example, when a data shard has been split several times). With this structure, splitting a data shard is converted into splitting file meta-information, and merging data shards is converted into merging file meta-information. This truly makes splitting and merging a pair of reversible operations that are independent of the actual data shards. Throughout the whole operation, the data shards are not moved or modified in any way.
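The three-layer relationship described above can be pictured with a small in-memory sketch. Everything in it is hypothetical (the dictionary layout, the identifiers such as `meta_left` and `file_001`, and the helper `resolve_shard`); it only illustrates that two pieces of file meta-information produced by a split may reference the same underlying stored data.

```python
# File system layer: the stored data, keyed by a hypothetical file name.
data_files = {"file_001": "stored data for keys A..Z"}

# File metadata layer (on the shard server): each entry holds a key range
# and the names of the data files it references. After a split, both
# entries still point at the same unmodified file.
file_meta = {
    "meta_left": {"range": ("A", "M"), "files": ["file_001"]},
    "meta_right": {"range": ("N", "Z"), "files": ["file_001"]},
}

# Shard meta-information layer (on the management server): each shard
# records which file meta-information describes it.
shard_meta = {"shard_1": "meta_left", "shard_2": "meta_right"}


def resolve_shard(shard_id):
    """Follow shard meta-information -> file meta-information -> data files."""
    meta = file_meta[shard_meta[shard_id]]
    return meta["range"], [data_files[name] for name in meta["files"]]


print(resolve_shard("shard_1"))
print(resolve_shard("shard_2"))
```

In this sketch, access to a shard always goes through the file meta-information, so narrowing or widening a key range changes what a shard exposes without touching `data_files`.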
Fig. 5 is a schematic diagram of the information interaction between the management server and the shard server in the embodiment of the present invention. As shown in Fig. 5, the management server sends a split/merge operation instruction to the shard server. The shard server updates the corresponding file meta-information according to the received instruction and, once the update succeeds, returns the successful result to the management server. The management server then modifies the corresponding shard meta-information according to that result.
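The interaction above can be sketched, for the split case, as two cooperating objects. This is a simplified illustration under assumptions: the class names, the method names `split` and `request_split`, the identifier scheme, and the shape of the returned result are all hypothetical, and a direct method call stands in for whatever network protocol a real system would use.

```python
class ShardServer:
    """Holds file meta-information; here just meta id -> (start, end)."""

    def __init__(self):
        self.file_meta = {"meta_1": ("A", "Z")}

    def split(self, meta_id, left_end, right_start):
        # Update only the file meta-information; no data file is touched.
        start, end = self.file_meta.pop(meta_id)
        left_id, right_id = meta_id + "_l", meta_id + "_r"
        self.file_meta[left_id] = (start, left_end)
        self.file_meta[right_id] = (right_start, end)
        return {"ok": True, "old": meta_id, "new": [left_id, right_id]}


class ManagementServer:
    """Holds shard meta-information; here shard id -> meta id."""

    def __init__(self, shard_server):
        self.shard_server = shard_server
        self.shard_meta = {"shard_1": "meta_1"}

    def request_split(self, shard_id, left_end, right_start):
        # 1) Send the instruction; 2) receive the result;
        # 3) modify shard meta-information only if the update succeeded.
        result = self.shard_server.split(
            self.shard_meta[shard_id], left_end, right_start)
        if result["ok"]:
            del self.shard_meta[shard_id]
            for i, meta_id in enumerate(result["new"], start=1):
                self.shard_meta[f"{shard_id}_{i}"] = meta_id
        return result


server = ShardServer()
manager = ManagementServer(server)
manager.request_split("shard_1", "M", "N")
print(manager.shard_meta)
print(server.file_meta)
```

The management server changes its own state only after the shard server reports success, mirroring the order of events shown in Fig. 5.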
A distributed totally ordered storage system implemented with this technical solution can split and merge shards automatically online. The main beneficial effects are as follows:
1) System availability: all operations are completed entirely online, each operation takes less than 0.5 seconds, and the service is essentially unaware of them, which is well suited to systems with high availability requirements. Fig. 6 is a schematic diagram of the actual time consumed by splitting/merging data shards in the embodiment of the present invention;
2) Data elasticity: no requirement at all is placed on the distribution of the input data; the system adjusts shards fully adaptively by splitting and merging;
3) Resource consumption: split and merge operations place no additional bandwidth, computing, or storage pressure on the server cluster, and resource consumption is approximately 0, which is extremely important in scenarios with large data scale and large cluster scale;
4) System load balancing and hot capacity expansion: splitting/merging keeps all shards of the whole system at the designed size, which lays a very good foundation for load balancing. Combining the method of the present embodiment with shard migration realizes hot capacity expansion, making system expansion very simple and fast, with no impact perceived by the service.
Based on the above beneficial effects, the method of the present embodiment can be applied to the link library of a crawling service, a web page library, the entity library of a knowledge graph, and the like.
4th embodiment
Fig. 7 is a flowchart of a method for deleting garbage files according to the fourth embodiment of the present invention. The method of the present embodiment may be performed by an apparatus for deleting garbage files. The apparatus may be implemented in hardware and/or software, may generally be integrated in the management server of a distributed totally ordered storage system, and is used in cooperation with the shard servers of the distributed totally ordered storage system. The method of the present embodiment specifically comprises:
710, obtain the file meta-information corresponding to each data shard in the distributed totally ordered storage system as pending file meta-information, wherein the file meta-information stores attribute description information corresponding to the data shard, and the attribute description information includes data iteration information.
In general, in a scenario with multiple shard servers, garbage files are generated very quickly, and whether the collection of garbage files can be handled well while splitting/merging is supported is also critical. It can be seen from the first embodiment to the third embodiment that the shard meta-information layer and the file system layer are separated by the file metadata layer, that one data shard may correspond to multiple pieces of file meta-information, and that different pieces of file meta-information may reside on different shard servers. A file judged to be garbage on one shard server may therefore still be in use by another shard server. If the deletion of garbage files were completed directly within a single shard server, useful files would very likely be deleted by mistake. The management server therefore needs to schedule all shard servers in a unified manner to complete the deletion of garbage files.
In the present embodiment, querying the file meta-information corresponding to each data shard in the distributed totally ordered storage system to obtain the first file list may comprise: querying all file meta-information stored in each shard server to obtain the corresponding first file list.
Preferably, the first file list may be generated according to the file storage location information included in the file meta-information.
However, the amount of file meta-information corresponding to the data shards is very large, so obtaining the first file list from all file meta-information consumes considerable CPU and is slow. Accordingly, in a preferred implementation of the present embodiment, querying the file meta-information corresponding to each data shard in the distributed totally ordered storage system to obtain the first file list may alternatively comprise:
obtaining the file meta-information corresponding to each data shard in the distributed totally ordered storage system as pending file meta-information; obtaining, according to the data iteration information and the file storage location information in each piece of pending file meta-information, the split file meta-information and/or merged file meta-information included in the pending file meta-information, wherein the split file meta-information specifically comprises file meta-information that has undergone split processing, and the merged file meta-information specifically comprises file meta-information that has undergone merge processing; and generating the first file list according to the split file meta-information and/or the merged file meta-information.
The reason for this arrangement is as follows. It can be seen from the first embodiment and the third embodiment that, if a piece of file meta-information has not undergone a split or merge operation, its corresponding data shard can only be stored in one shard server, so deleting the data shard corresponding to that file meta-information cannot cause a data shard to be deleted by mistake. Only after a piece of file meta-information has undergone a split or merge operation may its corresponding data shard be stored in different shard servers. It is therefore sufficient to obtain the split file meta-information and merged file meta-information included in the file meta-information to generate the first file list. This arrangement greatly reduces CPU consumption and increases CPU processing speed.
720, scan the file system corresponding to the distributed totally ordered storage system to obtain a second file list.
730, calculate the difference set of the first file list and the second file list as the list of files to be deleted.
740, delete the data files in the distributed totally ordered storage system that match the list of files to be deleted.
The method of the present embodiment queries the file meta-information corresponding to each data shard in the distributed totally ordered storage system to obtain a first file list; scans the file system corresponding to the distributed totally ordered storage system to obtain a second file list; calculates the difference set of the first file list and the second file list as the list of files to be deleted; and deletes the data files in the distributed totally ordered storage system that match the list of files to be deleted. With these technical means, mistaken deletion of data files can be prevented once the file metadata layer has been introduced. In addition, obtaining the first file list from the split file meta-information and merged file meta-information can greatly reduce the time needed to delete garbage files.
Fig. 8 is a flowchart of a specific garbage file deletion process. As shown in Fig. 8, the method comprises:
810, query the shard servers for the set a of files in use.
820, scan the file system to obtain the set b of all files in the current file system.
830, wait until all file data has been received.
840, verify the file information and determine whether the result shows that the files are complete: if so, perform 850; otherwise, return to 810 and 820.
850, compute the difference between set b and set a to obtain the set of garbage files.
860, delete the garbage files from the file system; this round of garbage collection is then complete.
870, after the timer for the next garbage collection is triggered, return to 810 and 820.
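Steps 810 to 850 can be sketched as a set difference that is computed only once the file information is complete. The sketch is illustrative only: the function name `find_garbage`, its parameters, and in particular the completeness check (every expected shard server has reported its in-use files) are assumptions, since the text does not specify how completeness is verified.

```python
def find_garbage(used_by_server, files_in_fs, expected_servers):
    """Return the set of garbage files, or None if the information is incomplete.

    used_by_server   -- dict: shard server name -> set of files it uses (set a, per server)
    files_in_fs      -- iterable of all files found by scanning the file system (set b)
    expected_servers -- names of all shard servers that must report (assumed check)
    """
    if set(used_by_server) != set(expected_servers):
        return None  # incomplete: the caller would go back to steps 810 and 820
    in_use = set().union(*used_by_server.values())  # set a across all servers
    return set(files_in_fs) - in_use                # b minus a


reports = {"s1": {"f1", "f2"}, "s2": {"f2", "f3"}}
print(find_garbage(reports, {"f1", "f2", "f3", "f4"}, ["s1", "s2"]))
```

A file is treated as garbage only if no reporting shard server uses it, which reflects why the deletion is coordinated centrally rather than decided by a single shard server.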
5th embodiment
Fig. 9 shows an apparatus for processing data shards in a distributed totally ordered storage system according to the fifth embodiment of the present invention. As shown in Fig. 9, the apparatus comprises:
an attribute description information acquiring module 91, configured to obtain, in the process of generating a totally ordered data shard in the distributed totally ordered storage system, at least one piece of attribute description information corresponding to the data shard, wherein the attribute description information comprises data iteration information;
an attribute description information writing module 92, configured to write the attribute description information into the file meta-information corresponding to the data shard;
a file meta-information processing module 93, configured to process, upon receiving a processing instruction for at least one target data shard, the data iteration information in the file meta-information corresponding to the target data shard, so as to process the target data shard.
By writing the data iteration information of a data shard into the file meta-information corresponding to the data shard, the embodiment of the present invention makes it unnecessary, when a processing instruction for a target data shard is received, to process the data files stored in the target data shard directly. Processing the file meta-information corresponding to the target data shard is sufficient to achieve the technical effect of processing the target data shard, without moving or modifying any data file. This optimizes the data shard processing mechanism of existing distributed totally ordered storage systems and meets the growing demand for convenient and efficient data shard processing.
On the basis of the above embodiments, the data iteration information may comprise: the start data identifier and the end data identifier of the totally ordered data stored in the data shard.
On the basis of the above embodiments, the processing instruction comprises a split instruction for a data shard;
accordingly, the file meta-information processing module may specifically comprise:
a split instruction receiving unit, configured to receive a split instruction for a target data shard sent by the management server;
a target file meta-information acquiring unit, configured to obtain, according to the split instruction, the target file meta-information corresponding to the target data shard;
a split processing unit, configured to split the data iteration information in the target file meta-information to generate at least two pieces of split file meta-information;
a split result returning unit, configured to return the split result to the management server, to instruct the management server to modify the corresponding shard meta-information according to the split result.
On the basis of the above embodiments, the split processing unit may further be specifically configured to:
obtain at least one node data identifier from the totally ordered data corresponding to the target data shard, wherein the node data is data located between the start point data and the end point data of the totally ordered data;
generate at least two pieces of split data iteration information according to the start point data identifier and the end point data identifier included in the data iteration information of the target file meta-information, together with the at least one node data identifier;
generate at least two pieces of split file meta-information according to the at least two pieces of split data iteration information.
On the basis of the above embodiments, the processing instruction comprises a merge instruction for data shards;
accordingly, the file meta-information processing module may specifically comprise:
a merge instruction receiving unit, configured to receive a merge instruction for at least two target data shards sent by the management server;
a target file meta-information acquiring unit, configured to obtain, according to the merge instruction, at least two pieces of target file meta-information corresponding to the at least two target data shards;
a merge processing unit, configured to merge the data iteration information in the at least two pieces of target file meta-information to generate merged file meta-information;
a merge result returning unit, configured to return the merge result to the management server, to instruct the management server to modify the corresponding shard meta-information according to the merge result.
On the basis of the above embodiments, the merge processing unit may further be specifically configured to:
obtain the data iteration information included in the at least two pieces of target file meta-information;
determine a merged start point data identifier and a merged end point data identifier according to each start point data identifier and each end point data identifier included in the at least two pieces of data iteration information, and the ordering of the totally ordered data corresponding to the target data shards;
generate merged data iteration information according to the merged start point data identifier and the merged end point data identifier;
generate merged file meta-information according to the merged data iteration information.
The apparatus for processing data shards in a distributed totally ordered storage system provided by the embodiment of the present invention can be used to perform the method for processing data shards in a distributed totally ordered storage system provided by any embodiment of the present invention, has the corresponding functional modules, and achieves the same beneficial effects.
6th embodiment
Fig. 10 shows an apparatus for deleting garbage files according to the sixth embodiment of the present invention. As shown in Fig. 10, the apparatus comprises:
a first file list acquiring unit 101, configured to query the file meta-information corresponding to each data shard in the distributed totally ordered storage system to obtain a first file list, wherein the file meta-information stores attribute description information corresponding to the data shard, and the attribute description information includes data iteration information;
a second file list acquiring unit 102, configured to scan the file system corresponding to the distributed totally ordered storage system to obtain a second file list;
a to-be-deleted file list calculating unit 103, configured to calculate the difference set of the first file list and the second file list as the list of files to be deleted;
a data file deleting unit 104, configured to delete the data files in the distributed totally ordered storage system that match the list of files to be deleted.
The first file list acquiring unit 101 may specifically be configured to:
obtain the file meta-information corresponding to each data shard in the distributed totally ordered storage system as pending file meta-information;
obtain, according to the data iteration information and the file storage location information in each piece of pending file meta-information, the split file meta-information and/or merged file meta-information included in the pending file meta-information, wherein the split file meta-information specifically comprises file meta-information that has undergone split processing, and the merged file meta-information specifically comprises file meta-information that has undergone merge processing;
generate the first file list according to the split file meta-information and/or the merged file meta-information.
The method of the present embodiment queries the file meta-information corresponding to each data shard in the distributed totally ordered storage system to obtain a first file list; scans the file system corresponding to the distributed totally ordered storage system to obtain a second file list; calculates the difference set of the first file list and the second file list as the list of files to be deleted; and deletes the data files in the distributed totally ordered storage system that match the list of files to be deleted. With these technical means, mistaken deletion of data files can be prevented once the file metadata layer has been introduced. In addition, obtaining the first file list from the split file meta-information and merged file meta-information can greatly reduce the time needed to delete garbage files.
The apparatus for deleting garbage files provided by the embodiment of the present invention can be used to perform the method for deleting garbage files provided by any embodiment of the present invention, has the corresponding functional modules, and achieves the same beneficial effects.
Obviously, those skilled in the art should understand that each module or step of the present invention described above can be implemented by the shard server and the management server described above. Optionally, the embodiments of the present invention can be implemented as programs executable by a computer device, so that they can be stored in a storage device and executed by a processor. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like. Alternatively, the modules or steps may each be made into individual integrated circuit modules, or several of them may be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (16)

1. the disposal route of data fragmentation in distributed total order storage system, is characterized in that, comprising:
Generate in distributed total order storage system in the process of total order data fragmentation, obtain at least one attribute description information corresponding with described data fragmentation, wherein, described attribute description information comprises data iterative information;
By in file meta-information corresponding with described data fragmentation for described attribute description information write;
When receiving the process instruction at least one target data burst, the data iterative information in the file meta-information corresponding with described target data burst is processed, to realize the process to described target data burst.
2. method according to claim 1, is characterized in that, described data iterative information comprises: the initial data mark of the total order data stored in described data fragmentation and end data mark.
3. method according to claim 2, is characterized in that, described process indicates to comprise and indicates the division of data fragmentation;
Accordingly, when receiving the process instruction at least one target data burst, the data iterative information in the file meta-information corresponding with described target data burst is processed, to realize comprising the process of described target data burst:
The instruction of the division to target data burst that receiving management server sends;
According to described division instruction, obtain the file destination metamessage corresponding with described target data burst;
Data iterative information in described file destination metamessage is carried out division process, generate at least two division file meta-information;
Division result is back to described management server, according to described division result, corresponding burst metamessage is modified to indicate described management server.
4. method according to claim 3, is characterized in that, the data iterative information in described file destination metamessage is carried out division process, generates at least two division file meta-information and comprises:
In the total order data corresponding with described target data burst, obtain at least one node data mark, wherein, described node data is the data in described total order data between start point data and endpoint data;
According to start point data mark, endpoint data mark and at least one node data described mark that the data iterative information in described file destination metamessage comprises, generate at least two division data iterative informations;
According to described at least two division data iterative informations, generate at least two division file meta-information.
5. method according to claim 2, is characterized in that, described process indicates to comprise and indicates the merging of data fragmentation;
Accordingly, when receiving the process instruction at least one target data burst, the data iterative information in the file meta-information corresponding with described target data burst is processed, to realize comprising the process of described target data burst:
The instruction of the merging at least two target data bursts that receiving management server sends;
According to described merging instruction, obtain at least two the file destination metamessages corresponding with described at least two target data bursts;
Data iterative information in described at least two file destination metamessages is carried out merging treatment, generates merged file metamessage;
Amalgamation result is back to described management server, according to described amalgamation result, corresponding burst metamessage is modified to indicate described management server.
6. The method according to claim 5, characterized in that merging the data iterative information in the at least two pieces of target file meta-information to generate merged file meta-information comprises:
Obtaining the data iterative information included in the at least two pieces of target file meta-information;
Determining a merged start-point data identifier and a merged end-point data identifier according to each start-point data identifier and each end-point data identifier included in the at least two pieces of data iterative information, and according to the ordering of the totally ordered data corresponding to the target data fragments;
Generating merged data iterative information according to the merged start-point data identifier and the merged end-point data identifier;
Generating merged file meta-information according to the merged data iterative information.
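A minimal sketch of the merge computation in claim 6 follows. It assumes string keys, half-open ranges, and that the fragments being merged are adjacent in the total order; the adjacency check and the names `DataIterInfo` and `merge_iter_info` are illustrative assumptions, not requirements stated in the claim.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataIterInfo:
    """Hypothetical stand-in for the data iterative information of one fragment."""
    start_key: str  # start-point data identifier
    end_key: str    # end-point data identifier


def merge_iter_info(infos: List[DataIterInfo]) -> DataIterInfo:
    """Derive the merged start and end identifiers from the ordered inputs."""
    if len(infos) < 2:
        raise ValueError("merging needs at least two fragments")
    ordered = sorted(infos, key=lambda i: i.start_key)
    for left, right in zip(ordered, ordered[1:]):
        if left.end_key != right.start_key:
            # Assumption: only fragments with touching ranges may be merged.
            raise ValueError("fragments are not adjacent in the total order")
    return DataIterInfo(ordered[0].start_key, ordered[-1].end_key)
```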
7. The method according to claim 1, characterized in that the attribute description information further comprises effective-flag information, wherein the effective-flag information identifies whether the data iterative information is in effect.
8. A method for deleting garbage files, characterized by comprising:
Querying the file meta-information corresponding to each data fragment in a distributed total-order storage system to obtain a first file list, wherein the file meta-information stores attribute description information corresponding to the data fragment, and the attribute description information includes data iterative information;
Scanning the file system corresponding to the distributed total-order storage system to obtain a second file list;
Calculating the difference set of the first file list and the second file list as a list of files to be deleted;
Deleting, from the distributed total-order storage system, the data files that match the list of files to be deleted.
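The steps of claim 8 can be sketched as a set difference. This sketch assumes that the garbage files are those present in the file system but not referenced by any file meta-information, and that a local directory walked with `os.walk` stands in for the distributed file system; `find_garbage_files` and `delete_garbage_files` are hypothetical names.

```python
import os
from typing import Iterable, Set


def find_garbage_files(referenced: Iterable[str], data_dir: str) -> Set[str]:
    """Return paths (relative to data_dir) that no file meta-information references."""
    first_list = set(referenced)  # files named in the file meta-information
    second_list = set()           # files actually found by scanning
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            second_list.add(os.path.relpath(os.path.join(root, name), data_dir))
    return second_list - first_list


def delete_garbage_files(referenced: Iterable[str], data_dir: str) -> Set[str]:
    """Delete every unreferenced file and return the set that was removed."""
    garbage = find_garbage_files(referenced, data_dir)
    for rel_path in garbage:
        os.remove(os.path.join(data_dir, rel_path))
    return garbage
```

In a live system the first list would have to be complete before anything is deleted, since a file missing from it is treated as garbage.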
9. The method according to claim 8, characterized in that querying the file meta-information corresponding to each data fragment in the distributed total-order storage system to obtain the first file list comprises:
Obtaining the file meta-information corresponding to each data fragment in the distributed total-order storage system as pending file meta-information;
Obtaining, according to the data iterative information and the file storage location information in each piece of pending file meta-information, the division file meta-information and/or merged file meta-information included in the pending file meta-information, wherein the division file meta-information is file meta-information produced by division processing, and the merged file meta-information is file meta-information produced by merge processing;
Generating the first file list according to the division file meta-information and/or the merged file meta-information.
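One possible reading of claim 9 is that a fragment's file meta-information may carry further meta-information produced by earlier division or merge operations, and that every file reachable this way must appear in the first file list. The sketch below follows that reading; the `FileMeta` structure, its `derived` field, and `build_first_file_list` are assumptions for illustration and are not defined in the patent.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class FileMeta:
    """Hypothetical file meta-information record."""
    path: str       # file storage location information
    start_key: str  # start of the data iterative information
    end_key: str    # end of the data iterative information
    # Meta-information produced by division or merge processing (assumed layout).
    derived: List["FileMeta"] = field(default_factory=list)


def build_first_file_list(pending: List[FileMeta]) -> Set[str]:
    """Collect every file path referenced directly or through derived meta-information."""
    paths: Set[str] = set()
    stack = list(pending)
    while stack:
        meta = stack.pop()
        paths.add(meta.path)
        stack.extend(meta.derived)
    return paths
```

Under this reading, a file shared by two divided fragments is listed once and is therefore never mistaken for garbage while either fragment still refers to it.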
10. A device for processing data fragments in a distributed total-order storage system, characterized by comprising:
An attribute description information obtaining module, configured to obtain, in the process of generating a totally ordered data fragment in the distributed total-order storage system, at least one piece of attribute description information corresponding to the data fragment, wherein the attribute description information comprises data iterative information;
An attribute description information writing module, configured to write the attribute description information into the file meta-information corresponding to the data fragment;
A file meta-information processing module, configured to, when a processing instruction for at least one target data fragment is received, process the data iterative information in the file meta-information corresponding to the target data fragment, so as to process the target data fragment.
11. The device according to claim 10, characterized in that the data iterative information comprises a start data identifier and an end data identifier of the totally ordered data stored in the data fragment.
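The obtaining and writing modules of claims 10 and 11 can be illustrated with a short sketch that derives the start and end data identifiers while a fragment file is generated and stores them in that file's meta-information. The dictionary layout and the name `build_file_meta` are hypothetical, and the end identifier is simply taken to be the last key stored.

```python
from typing import Dict, List, Tuple


def build_file_meta(path: str, records: List[Tuple[str, bytes]]) -> Dict[str, object]:
    """Record the first and last key of a sorted fragment in its file meta-information."""
    if not records:
        raise ValueError("a fragment must contain at least one record")
    keys = [key for key, _value in records]
    if keys != sorted(keys):
        raise ValueError("records must already be in total order")
    return {
        "path": path,
        # Attribute description information, holding the data iterative information.
        "attributes": {"iter_info": {"start": keys[0], "end": keys[-1]}},
    }
```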
12. The device according to claim 11, characterized in that the processing instruction comprises a division instruction for a data fragment;
Accordingly, the file meta-information processing module specifically comprises:
A division instruction receiving unit, configured to receive a division instruction, sent by a management server, for a target data fragment;
A target file meta-information obtaining unit, configured to obtain, according to the division instruction, the target file meta-information corresponding to the target data fragment;
A division processing unit, configured to perform division processing on the data iterative information in the target file meta-information to generate at least two pieces of division file meta-information;
A division result returning unit, configured to return a division result to the management server, so as to instruct the management server to modify the corresponding fragment meta-information according to the division result.
13. The device according to claim 12, characterized in that the division processing unit is further specifically configured to:
Obtain at least one node data identifier in the totally ordered data corresponding to the target data fragment, wherein the node data is data located between the start-point data and the end-point data in the totally ordered data;
Generate at least two pieces of division data iterative information according to the start-point data identifier and the end-point data identifier included in the data iterative information in the target file meta-information, together with the at least one node data identifier;
Generate at least two pieces of division file meta-information according to the at least two pieces of division data iterative information.
14. The device according to claim 11, characterized in that the processing instruction comprises a merge instruction for data fragments;
Accordingly, the file meta-information processing module specifically comprises:
A merge instruction receiving unit, configured to receive a merge instruction, sent by a management server, for at least two target data fragments;
A target file meta-information obtaining unit, configured to obtain, according to the merge instruction, at least two pieces of target file meta-information corresponding to the at least two target data fragments;
A merge processing unit, configured to merge the data iterative information in the at least two pieces of target file meta-information to generate merged file meta-information;
A merge result returning unit, configured to return a merge result to the management server, so as to instruct the management server to modify the corresponding fragment meta-information according to the merge result.
15. The device according to claim 14, characterized in that the merge processing unit is further specifically configured to:
Obtain the data iterative information included in the at least two pieces of target file meta-information;
Determine a merged start-point data identifier and a merged end-point data identifier according to each start-point data identifier and each end-point data identifier included in the at least two pieces of data iterative information, and according to the ordering of the totally ordered data corresponding to the target data fragments;
Generate merged data iterative information according to the merged start-point data identifier and the merged end-point data identifier;
Generate merged file meta-information according to the merged data iterative information.
16. A device for deleting garbage files, characterized by comprising:
A first file list obtaining unit, configured to query the file meta-information corresponding to each data fragment in a distributed total-order storage system to obtain a first file list, wherein the file meta-information stores attribute description information corresponding to the data fragment, and the attribute description information includes data iterative information;
A second file list obtaining unit, configured to scan the file system corresponding to the distributed total-order storage system to obtain a second file list;
A to-be-deleted file list calculating unit, configured to calculate the difference set of the first file list and the second file list as a list of files to be deleted;
A data file deleting unit, configured to delete, from the distributed total-order storage system, the data files that match the list of files to be deleted;
The first file list obtaining unit is specifically configured to:
Obtain the file meta-information corresponding to each data fragment in the distributed total-order storage system as pending file meta-information;
Obtain, according to the data iterative information and the file storage location information in each piece of pending file meta-information, the division file meta-information and/or merged file meta-information included in the pending file meta-information, wherein the division file meta-information is file meta-information produced by division processing, and the merged file meta-information is file meta-information produced by merge processing;
Generate the first file list according to the division file meta-information and/or the merged file meta-information.
CN201510271710.2A 2015-05-25 2015-05-25 Method and device for processing data fragments and deleting garbage files Active CN104881466B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510271710.2A CN104881466B (en) 2015-05-25 2015-05-25 Method and device for processing data fragments and deleting garbage files

Publications (2)

Publication Number Publication Date
CN104881466A (en) 2015-09-02
CN104881466B (en) 2018-09-07

Family

ID=53948959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510271710.2A Active CN104881466B (en) 2015-05-25 2015-05-25 Method and device for processing data fragments and deleting garbage files

Country Status (1)

Country Link
CN (1) CN104881466B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050033755A1 (en) * 2003-04-03 2005-02-10 Parag Gokhale System and method for extended media retention
CN101807207A (en) * 2010-03-22 2010-08-18 北京大用科技有限责任公司 Method for sharing document based on content difference comparison
CN102480494A (en) * 2010-11-23 2012-05-30 金蝶软件(中国)有限公司 File updating method, device and system
CN102779185A (en) * 2012-06-29 2012-11-14 浙江大学 High-availability distribution type full-text index method
CN103353873A (en) * 2013-06-07 2013-10-16 携程计算机技术(上海)有限公司 Method and system for optimization realization based on time dimension data real-time inquiry service
CN103488778A (en) * 2013-09-27 2014-01-01 华为技术有限公司 Data searching method and device
CN103955528A (en) * 2014-05-09 2014-07-30 北京华信博研科技有限公司 File data writing method, and file data reading method and device
US20140258300A1 (en) * 2011-12-23 2014-09-11 Daniel Baeumges Independent Table Nodes In Parallelized Database Environments
US8892507B1 (en) * 2012-03-29 2014-11-18 Emc Corporation Providing file system quota support for a file system having separated data and metadata
CN104573068A (en) * 2015-01-23 2015-04-29 四川中科腾信科技有限公司 Information processing method based on megadata

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017080139A1 (en) * 2015-11-11 2017-05-18 华为技术有限公司 Region division method in distributed database, region node and system
CN105354315B (en) * 2015-11-11 2018-10-30 华为技术有限公司 Method, sub-table node, and system for splitting sub-tables in a distributed database
CN105354315A (en) * 2015-11-11 2016-02-24 华为技术有限公司 Region division method in distributed database, Region node and system
US11868315B2 (en) 2015-11-11 2024-01-09 Huawei Cloud Computing Technologies Co., Ltd. Method for splitting region in distributed database, region node, and system
CN107656695B (en) * 2016-07-25 2020-12-25 杭州海康威视数字技术股份有限公司 Data storage and deletion method and device and distributed storage system
CN107656695A (en) * 2016-07-25 2018-02-02 杭州海康威视数字技术股份有限公司 Data storage and deletion method and device and distributed storage system
CN108733306A (en) * 2017-04-14 2018-11-02 华为技术有限公司 File merging method and device
CN108733306B (en) * 2017-04-14 2020-04-21 华为技术有限公司 File merging method and device
CN108170372A (en) * 2017-12-08 2018-06-15 厦门集微科技有限公司 data processing method and device based on cloud hard disk
CN110535793A (en) * 2018-05-25 2019-12-03 微软技术许可有限责任公司 The message total order mechanism of distributed system
CN109587278A (en) * 2019-01-16 2019-04-05 平安普惠企业管理有限公司 Data transmission method and relevant apparatus
CN111026751A (en) * 2019-11-22 2020-04-17 北京金山云网络技术有限公司 Processing method, device and system of distributed table and electronic equipment
CN111026751B (en) * 2019-11-22 2024-02-09 北京金山云网络技术有限公司 Distributed form processing method, device and system and electronic equipment
CN111597149B (en) * 2020-04-27 2023-03-31 五八有限公司 Data cleaning method and device for database
CN111597149A (en) * 2020-04-27 2020-08-28 五八有限公司 Data cleaning method and device for database
CN112445801A (en) * 2020-11-27 2021-03-05 杭州海康威视数字技术股份有限公司 Method and device for managing meta information of data table and storage medium
CN112632008A (en) * 2020-12-29 2021-04-09 华录光存储研究院(大连)有限公司 Data fragment transmission method and device and computer equipment
WO2024050972A1 (en) * 2022-09-05 2024-03-14 金蝶软件(中国)有限公司 Database table sharding method and apparatus, computer device, and storage medium

Also Published As

Publication number Publication date
CN104881466B (en) 2018-09-07

Similar Documents

Publication Publication Date Title
JP6778795B2 (en) Methods, devices and systems for storing data
CN104881466A (en) Method and device for processing data fragments and deleting garbage files
CN107391653B (en) Distributed NewSQL database system and picture data storage method
CN102129425B (en) The access method of big object set table and device in data warehouse
KR20170123336A (en) File manipulation method and apparatus
CN105677904B (en) Small documents storage method and device based on distributed file system
CN106970958B (en) A kind of inquiry of stream file and storage method and device
EP3646133B1 (en) Systems and methods of creation and deletion of tenants within a database
US11409722B2 (en) Database live reindex
CN111324606B (en) Data slicing method and device
US8880553B2 (en) Redistribute native XML index key shipping
KR101621385B1 (en) System and method for searching file in cloud storage service, and method for controlling file therein
CN107710189B (en) Multimodal sharing of content between documents
US20160203032A1 (en) Series data parallel analysis infrastructure and parallel distributed processing method therefor
CN111414422B (en) Data distribution method, device, equipment and storage medium
CN111930684A (en) Small file processing method, device and equipment based on HDFS (Hadoop distributed File System) and storage medium
CN114969165B (en) Data query request processing method, device, equipment and storage medium
KR20180077830A (en) Processing method for a relational query in distributed stream processing engine based on shared-nothing architecture, recording medium and device for performing the method
CN105653566B (en) A kind of method and device for realizing database write access
CN109684331A (en) A kind of object storage meta data management device and method based on Kudu
US9690886B1 (en) System and method for a simulation of a block storage system on an object storage system
US11074244B1 (en) Transactional range delete in distributed databases
CN105426489A (en) Memory calculation based distributed expandable data search system
CN111782834A (en) Image retrieval method, device, equipment and computer readable storage medium
CN110727672A (en) Data mapping relation query method and device, electronic equipment and readable medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by SIPO to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant