CN104142958A - Storage method for data in Key-Value system and related device - Google Patents

Publication number
CN104142958A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310172455.7A
Other languages
Chinese (zh)
Other versions
CN104142958B (en
Inventor
潘锋烽
张子刚
熊劲
岳银亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Institute of Computing Technology of CAS
Original Assignee
Huawei Technologies Co Ltd
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd and Institute of Computing Technology of CAS
Priority to CN201310172455.7A
Publication of CN104142958A
Application granted
Publication of CN104142958B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2219 Large Object storage; Management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a method for storing data in a Key-Value system, and a related device, which can improve the efficiency of storage operations on Value data. The method comprises: judging whether the data volume of the Value data in a Key-Value pair exceeds a data threshold; if the data volume of the Value data does not exceed the data threshold, slicing the Value data to obtain M fragment contents; generating fragment information for N of the fragment contents according to the M fragment contents, where the fragment information comprises the number of fragments of the Value data, the offset address of each of the N fragment contents, the sequence number (ID) of each of the N fragment contents, and the remaining (M-N) fragment contents; and storing the Key data and the fragment information in an LSM-Tree (log-structured merge-tree) and the N fragment contents in a Key-Value database, where the Key data corresponds to the fragment contents.

Description

Method for storing data in a Key-Value system, and related device
Technical field
The present invention relates to the field of computer technology, and in particular to a method for storing data in a Key-Value system and a related device.
Background
The LSM-Tree (Log-Structured Merge-Tree) is an indexing mechanism introduced in Key-Value (K-V) storage systems for workloads in which Values are updated frequently (for example, frequently inserted or deleted). It suits scenarios in which Values are queried far less often than they are updated, such as history tables and log files.
An LSM-Tree works as follows: updates to Value data are first kept in memory, and once the Values held in memory reach a specified threshold, the updates are written to disk in a batch. As shown in Fig. 1-a and Fig. 1-b, an LSM-Tree uses two levels of ordered data sets (the C0 tree and the C1 tree) or multiple levels of ordered data sets (the C0 tree, C1 tree, ..., Ck tree) to defer and batch index changes, and it merges the ordered data sets on disk efficiently in a manner similar to merge sort. The two-level and multi-level structures work on the same principle: through deferral and batching, recently updated records are placed first in the C0 tree (a fully memory-resident data structure), and only when a level reaches a certain size threshold is part of its records moved to the next higher level. This process is called a rolling merge. Between every pair of adjacent levels (Ci-1, Ci), an asynchronous rolling merge process moves a portion of the records from the Ci-1 tree to the Ci tree whenever the Ci-1 tree exceeds its size threshold. Older, out-of-date records therefore sit in the higher levels, and a record can move through the trees of several levels until it is replaced by an update or deleted.
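The buffering and merging behaviour described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `TinyLSM`, the record-count threshold `c0_limit`, and the `merged_records` counter are all hypothetical, the on-disk C1 tree is stood in for by a plain dictionary, and the merge moves every C0 record at once rather than only a portion of them.

```python
class TinyLSM:
    """Hypothetical two-level LSM-Tree sketch: C0 in memory, C1 standing in for disk."""

    def __init__(self, c0_limit=4):
        self.c0 = {}                # in-memory component (C0 tree)
        self.c1 = {}                # stand-in for the on-disk component (C1 tree)
        self.c0_limit = c0_limit    # assumed threshold: maximum records held in C0
        self.merged_records = 0     # records moved by merges, a rough proxy for I/O

    def put(self, key, value):
        self.c0[key] = value        # every update lands in memory first
        if len(self.c0) > self.c0_limit:
            self._rolling_merge()

    def _rolling_merge(self):
        # Simplification: move all C0 records into C1 in key order.
        for key in sorted(self.c0):
            self.c1[key] = self.c0[key]   # the newer record replaces the older one
            self.merged_records += 1
        self.c0.clear()

    def get(self, key):
        # The newest data is in C0, so look there before falling back to C1.
        if key in self.c0:
            return self.c0[key]
        return self.c1.get(key)
```

Every record that passes through `_rolling_merge` is copied in full, so the cost of a merge in this sketch grows with the size of the stored Values, not just with the number of keys.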
Existing Key-Value storage systems usually adopt the LSM-Tree index structure described above. However, in such an LSM-Tree every Ci tree (where i is any value less than k) incurs input/output (I/O) overhead when Value data is updated. When the Value data is large, large volumes of Value data must be moved between the data sets of different levels during rolling merges, which produces very high I/O overhead and inevitably reduces the overall efficiency of storage operations on Value data.
Summary of the invention
Embodiments of the present invention provide a method for storing data in a Key-Value system, and a related device, for storing Value data with lower I/O overhead and improving the efficiency of storage operations on Value data.
To solve the above technical problem, the embodiments of the present invention provide the following technical solutions:
In a first aspect, an embodiment of the present invention provides a method for storing data in a Key-Value system, comprising:
judging whether the data volume of the Value data in a key-value pair exceeds a data threshold, where the key-value pair comprises Key data and the Value data corresponding to the Key data;
if the data volume of the Value data does not exceed the data threshold, slicing the Value data to obtain M fragment contents, where M is a natural number greater than 1 and the M fragment contents are the data content of the Value data after slicing;
generating fragment information for N fragment contents according to the M fragment contents, where the fragment information comprises: the number of fragments of the Value data, the offset address of each of the N fragment contents, the sequence number (ID) of each of the N fragment contents, and the remaining (M-N) fragment contents, and where N is a natural number greater than 0 and less than or equal to M; and
storing the Key data and the fragment information in a log-structured merge-tree (LSM-Tree) and storing the N fragment contents in a Key-Value database, where the Key data corresponds to the fragment information.
With reference to the first aspect, in a first possible implementation of the first aspect, the method further comprises:
if the data volume of the Value data exceeds the data threshold, storing the key-value pair in the LSM-Tree.
With reference to the first aspect, in a second possible implementation of the first aspect, storing the Key data and the fragment information in the LSM-Tree comprises:
storing the Key data and the fragment information in a first ordered data set, where the first ordered data set is located in memory; and
when the first ordered data set exceeds a merge threshold, updating the fragment information corresponding to the Key data in a second ordered data set to the fragment information corresponding to the Key data in the first ordered data set, where the second ordered data set is located on disk.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, after updating the fragment information corresponding to the Key data in the second ordered data set to the fragment information corresponding to the Key data in the first ordered data set, the method further comprises:
deleting the Key data in the first ordered data set and the fragment information corresponding to that Key data.
With reference to the first aspect, in a fourth possible implementation of the first aspect, after storing the N fragment contents in the Key-Value database, the method further comprises:
obtaining the length of the data to be read from Value data, the offset address within the Value data of the data to be read, and the Key data corresponding to the Value data, where the Value data is the Value data currently to be read;
looking up, in the LSM-Tree according to the Key data, first fragment information corresponding to the Key data; and
reading the data to be read from the LSM-Tree and/or the Key-Value database according to the offset address of the data to be read, the length of the data to be read, and the first fragment information corresponding to the Key data.
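A minimal Python sketch of this read path follows. The layout is an assumption made for illustration only: fragments have a fixed `FRAGMENT_SIZE`, the Key-Value database is a dictionary keyed by `(key, fragment_id)`, the LSM-Tree is a dictionary from Key to fragment information, and a short tail fragment is carried inside the fragment information under the hypothetical field `carried`.

```python
FRAGMENT_SIZE = 4  # assumed fixed fragment size in bytes, tiny for illustration

# Hypothetical stores for the Value b"abcdefghij" under key "k".
kv_database = {("k", 0): b"abcd", ("k", 1): b"efgh"}        # N = 2 fragments
lsm_tree = {"k": {"count": 3, "ids": [0, 1], "carried": {2: b"ij"}}}


def read_range(key, offset, length):
    """Read `length` bytes starting at `offset` of the Value stored under `key`."""
    info = lsm_tree[key]                      # look up the fragment information
    end = offset + length
    first = offset // FRAGMENT_SIZE           # first fragment touched by the range
    last = min((end - 1) // FRAGMENT_SIZE, info["count"] - 1)
    out = bytearray()
    for frag_id in range(first, last + 1):
        if frag_id in info["carried"]:
            fragment = info["carried"][frag_id]      # served from the LSM-Tree
        else:
            fragment = kv_database[(key, frag_id)]   # served from the database
        start = frag_id * FRAGMENT_SIZE       # position of this fragment in the Value
        out += fragment[max(offset - start, 0):end - start]
    return bytes(out)
```

With these stores, `read_range("k", 2, 7)` touches both database fragments and the carried tail, which corresponds to the "LSM-Tree and/or Key-Value database" wording of the step.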
With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, after reading the data to be read from the LSM-Tree and/or the Key-Value database according to the offset address of the data to be read, the length of the data to be read, and the first fragment information corresponding to the Key data, the method further comprises:
replacing the data that has been read with update data in the fragment contents in which that data resides;
regenerating the first fragment information for the fragment contents of the Value data after the update; and
storing the regenerated first fragment information in the LSM-Tree.
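The update steps can be sketched in Python as follows. The sketch rests on assumptions the text does not state: fragments have a fixed `FRAGMENT_SIZE`, every fragment lives in a dictionary-based Key-Value database keyed by fragment ID, and each updated fragment is written under a fresh ID so that the regenerated fragment information simply points at the new fragments. The name `update_range` is hypothetical.

```python
FRAGMENT_SIZE = 4  # assumed fixed fragment size in bytes

# Hypothetical stores for the Value b"abcdefghij" under key "k".
kv_database = {0: b"abcd", 1: b"efgh", 2: b"ij"}    # fragment ID -> content
lsm_tree = {"k": {"count": 3, "ids": [0, 1, 2]}}    # Key -> fragment information


def update_range(key, offset, new_data):
    """Overwrite len(new_data) bytes of the Value at `offset` with `new_data`."""
    ids = list(lsm_tree[key]["ids"])
    total = sum(len(kv_database[i]) for i in ids)
    end = offset + len(new_data)
    if offset < 0 or end > total:
        raise ValueError("update must stay inside the existing Value data")
    next_id = max(kv_database) + 1
    pos = offset
    while pos < end:
        index = pos // FRAGMENT_SIZE                  # fragment holding byte `pos`
        start = index * FRAGMENT_SIZE
        fragment = bytearray(kv_database[ids[index]])
        take = min(end, start + len(fragment)) - pos  # bytes replaced in this fragment
        fragment[pos - start:pos - start + take] = new_data[pos - offset:pos - offset + take]
        kv_database[next_id] = bytes(fragment)        # updated fragment gets a fresh ID
        ids[index] = next_id
        next_id += 1
        pos += take
    # Regenerate the fragment information and store it back in the LSM-Tree.
    lsm_tree[key] = {"count": len(ids), "ids": ids}
```

In this sketch the superseded fragments stay in `kv_database`; reclaiming them is outside what the steps above describe.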
With reference to the first aspect, in a sixth possible implementation of the first aspect, after storing the N fragment contents in the Key-Value database, the method further comprises:
obtaining the length of the data to be inserted into second Value data, the offset address within the second Value data at which the data is to be inserted, and second Key data corresponding to the second Value data;
looking up, in the LSM-Tree according to the second Key data, second fragment information corresponding to the second Key data;
reading, from the LSM-Tree and/or the Key-Value database according to the insertion offset address and the second fragment information corresponding to the second Key data, the fragment content in which the insertion offset address of the second Value data lies;
inserting the data to be inserted into the fragment content in which the insertion offset address of the second Value data lies; and
regenerating the second fragment information for the fragment contents of the second Value data after the insertion, carrying the fragment content in which the insertion offset address lies, as modified by the insertion, in the regenerated second fragment information, and storing the regenerated second fragment information in the LSM-Tree; or, alternatively, regenerating the second fragment information for the fragment contents of the second Value data after the insertion, storing the regenerated second fragment information in the LSM-Tree, and storing the fragment content in which the insertion offset address lies, as modified by the insertion, in the Key-Value database.
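The insertion steps, including the two alternatives for where the modified fragment ends up, can be sketched as follows. This is an illustration under stated assumptions: the fragment information records, for each fragment, an ID, a length, and an offset that is taken here to be the fragment's position within the Value; the field `carried` and the flag `carry_in_info` are hypothetical names.

```python
kv_database = {0: b"abcd", 1: b"efgh"}     # fragment ID -> content
lsm_tree = {"k": {"count": 2,
                  "entries": [{"id": 0, "offset": 0, "length": 4},
                              {"id": 1, "offset": 4, "length": 4}],
                  "carried": {}}}          # fragments carried in the fragment information


def insert_at(key, offset, data, carry_in_info=False):
    """Insert `data` at `offset` of the Value stored under `key`."""
    info = lsm_tree[key]
    entries = [dict(e) for e in info["entries"]]
    carried = dict(info["carried"])
    # Locate the fragment whose byte range contains the insertion offset.
    target = next((e for e in entries
                   if e["offset"] <= offset <= e["offset"] + e["length"]), None)
    if target is None:
        raise ValueError("offset lies outside the Value data")
    frag_id = target["id"]
    old = carried[frag_id] if frag_id in carried else kv_database[frag_id]
    cut = offset - target["offset"]
    new = old[:cut] + data + old[cut:]
    if carry_in_info:
        carried[frag_id] = new             # first alternative: carry it in the LSM-Tree
        kv_database.pop(frag_id, None)
    else:
        kv_database[frag_id] = new         # second alternative: keep it in the database
        carried.pop(frag_id, None)
    # Regenerate the fragment information: new length, then recomputed offsets.
    target["length"] = len(new)
    position = 0
    for entry in entries:
        entry["offset"] = position
        position += entry["length"]
    lsm_tree[key] = {"count": len(entries), "entries": entries, "carried": carried}
```

Only the fragment that contains the insertion offset is rewritten; the other fragments keep their contents, and only their recorded offsets change.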
In a second aspect, an embodiment of the present invention further provides a device for storing data in a Key-Value system, comprising:
a judging module, configured to judge whether the data volume of the Value data in a key-value pair exceeds a data threshold, where the key-value pair comprises Key data and the Value data corresponding to the Key data;
a slicing module, configured to slice the Value data when the data volume of the Value data does not exceed the data threshold, to obtain M fragment contents, where M is a natural number greater than 1 and the M fragment contents are the data content of the Value data after slicing;
an acquisition module, configured to generate fragment information for N fragment contents according to the M fragment contents, where the fragment information comprises: the number of fragments of the Value data, the offset address of each of the N fragment contents, the sequence number (ID) of each of the N fragment contents, and the remaining (M-N) fragment contents, and where N is a natural number greater than 0 and less than or equal to M; and
a storage module, configured to store the Key data and the fragment information in a log-structured merge-tree (LSM-Tree) and to store the N fragment contents in a Key-Value database, where the Key data corresponds to the fragment information.
With reference to the second aspect, in a first possible implementation of the second aspect, the storage module is further configured to store the key-value pair in the LSM-Tree when the data volume of the Value data exceeds the data threshold.
With reference to the second aspect, in a second possible implementation of the second aspect, the storage module comprises:
a storage submodule, configured to store the Key data and the fragment information in a first ordered data set, where the first ordered data set is located in memory; and
a merging submodule, configured to, when the first ordered data set exceeds a merge threshold, update the fragment information corresponding to the Key data in a second ordered data set to the fragment information corresponding to the Key data in the first ordered data set, where the second ordered data set is located on disk.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the device further comprises a deletion module, configured to delete the Key data in the first ordered data set and the fragment information corresponding to that Key data.
With reference to the second aspect, in a fourth possible implementation of the second aspect, the device further comprises a lookup module and a reading module, wherein:
the acquisition module is further configured to obtain the length of the data to be read from Value data, the offset address within the Value data of the data to be read, and the Key data corresponding to the Value data, where the Value data is the Value data currently to be read;
the lookup module is configured to look up, in the LSM-Tree according to the Key data, first fragment information corresponding to the Key data; and
the reading module is configured to read the data to be read from the LSM-Tree and/or the Key-Value database according to the offset address of the data to be read, the length of the data to be read, and the first fragment information corresponding to the Key data.
With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the device further comprises an update module, wherein:
the update module is configured to replace the data that has been read with update data in the fragment contents in which that data resides;
the acquisition module is further configured to regenerate the first fragment information for the fragment contents of the Value data after the update; and
the storage module is further configured to store the regenerated first fragment information in the LSM-Tree.
With reference to the second aspect, in a sixth possible implementation of the second aspect, the device further comprises a lookup module, a reading module, and an insertion module, wherein:
the acquisition module is further configured to obtain the length of the data to be inserted into second Value data, the offset address within the second Value data at which the data is to be inserted, and second Key data corresponding to the second Value data;
the lookup module is configured to look up, in the LSM-Tree according to the second Key data, second fragment information corresponding to the second Key data;
the reading module is configured to read, from the LSM-Tree and/or the Key-Value database according to the insertion offset address and the second fragment information corresponding to the second Key data, the fragment content in which the insertion offset address of the second Value data lies;
the insertion module is configured to insert the data to be inserted into the fragment content in which the insertion offset address of the second Value data lies; and
the acquisition module is further configured to regenerate the second fragment information for the fragment contents of the second Value data after the insertion, where the fragment content in which the insertion offset address lies, as modified by the insertion, is carried in the regenerated second fragment information, and the storage module is further configured to store the regenerated second fragment information in the LSM-Tree; or, alternatively, the acquisition module is further configured to regenerate the second fragment information for the fragment contents of the second Value data after the insertion, and the storage module is further configured to store the regenerated second fragment information in the LSM-Tree and to store the fragment content in which the insertion offset address lies, as modified by the insertion, in the Key-Value database.
As can be seen from the above technical solutions, the embodiments of the present invention have the following advantages:
In the embodiments of the present invention, it is first judged whether the data volume of the Value data in a key-value pair exceeds a data threshold. When the data volume of the Value data exceeds the data threshold, the Value data is sliced to obtain M fragment contents; fragment information is generated for some of the M fragment contents, and the remaining fragment contents are carried in the fragment information. Finally, the fragment information and the Key data are stored in the LSM-Tree, and the fragment contents for which fragment information was generated are stored in the Key-Value database. Because fragment contents can be stored in the Key-Value database, they do not need to be moved between the data sets of different levels during rolling merges, which reduces I/O overhead and improves the efficiency of storage operations on Value data. In addition, the fragment contents obtained by slicing can flexibly be stored either in the Key-Value database or in the LSM-Tree as part of the fragment information, which improves the storage flexibility of the Key-Value system and makes it easier to use.
The terms "first", "second", and so on in the specification, the claims, and the accompanying drawings are used to distinguish between similar objects and do not necessarily describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate; they are merely the manner adopted, in describing the embodiments of the invention, to distinguish objects that have the same attributes.
Details are given below.
An embodiment of the method for storing data in a Key-Value system according to the present invention may comprise: judging whether the data volume of the Value data in a key-value pair exceeds a data threshold, where the key-value pair comprises Key data and the Value data corresponding to the Key data; if the data volume of the Value data does not exceed the data threshold, slicing the Value data to obtain M fragment contents, where M is a natural number greater than 1 and the M fragment contents are the data content of the Value data after slicing; generating fragment information for N fragment contents according to the M fragment contents, where the fragment information comprises the number of fragments of the Value data, the offset address of each of the N fragment contents, the sequence number (ID) of each of the N fragment contents, and the remaining (M-N) fragment contents, and where N is a natural number greater than 0 and less than or equal to M; and storing the Key data and the fragment information in a log-structured merge-tree (LSM-Tree) and the N fragment contents in a Key-Value database, where the Key data corresponds to the fragment information.
Referring to Fig. 2, a method for storing data in a Key-Value system provided by an embodiment of the present invention may comprise:
201. Judge whether the data volume of the Value data in a key-value pair exceeds a data threshold.
The key-value pair comprises Key data and the Value data corresponding to the Key data.
In some embodiments of the present invention, each key-value pair in a Key-Value system can usually be written as (Key/Value), and each key-value pair comprises Key data and the Value data corresponding to that Key data. A Key/Value pair is also called an object. Key data usually occupies little space, whereas Value data may be small or large. In the embodiments of the present invention, a data threshold can be preset for Value data, and the storage mode is chosen flexibly according to how the data volume of the Value data compares with the data threshold, so as to improve the efficiency of data storage in the Key-Value system. The data threshold can be set according to user requirements, according to the memory capacity of the system, or based on attribute information of the Value data, and after it has been set it can be adjusted flexibly according to various information. These are examples only and are not limiting.
202. If the data volume of the Value data does not exceed the data threshold, slice the Value data to obtain M fragment contents.
Here M is a natural number greater than 1, and the M fragment contents are the data content of the Value data after slicing.
In some embodiments of the present invention, if the data volume of the Value data does not exceed the data threshold, the Value data can be cut into multiple pieces to obtain multiple fragment contents. For convenience of description, slicing is said to yield M fragment contents, where M is a natural number greater than 1; these M fragment contents are the data content of the Value data. The Value data can be sliced in several ways. For example, a slice size (FRAGMENT_SIZE) can be set and the Value data divided into fragment contents of that size; alternatively, a slicing algorithm can be set in advance and the Value data sliced according to that algorithm.
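Fixed-size slicing with FRAGMENT_SIZE can be sketched in a few lines of Python. The function name `slice_value` and the 4-byte fragment size are assumptions chosen to keep the example small; the text does not prescribe a particular size.

```python
from typing import List

FRAGMENT_SIZE = 4  # bytes; deliberately tiny, a real setting would be far larger


def slice_value(value: bytes, fragment_size: int = FRAGMENT_SIZE) -> List[bytes]:
    """Cut `value` into consecutive fragments of at most `fragment_size` bytes."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    return [value[i:i + fragment_size] for i in range(0, len(value), fragment_size)]


fragments = slice_value(b"abcdefghij")   # [b"abcd", b"efgh", b"ij"], so M = 3
```

Every fragment except possibly the last has exactly `fragment_size` bytes, and concatenating the fragments in order gives back the original Value data.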
203. Generate fragment information for N fragment contents according to the M fragment contents.
The generated fragment information comprises: the number of fragments of the Value data, the offset address of each of the N fragment contents, the sequence number (ID) of each of the N fragment contents, and the remaining (M-N) fragment contents, where N is a natural number greater than 0 and less than or equal to M.
In the embodiments of the present invention, the contents of the fragment information listed above describe how the fragment information is generated in step 203. It should be noted that, to make data storage in the Key-Value system more flexible, fragment information is generated for N of the M fragment contents, where N is a natural number greater than zero and less than or equal to M. That is, when N equals M, step 203 generates fragment information for all M fragment contents, so that all of the data obtained by slicing is covered by fragment information. When N is less than M, step 203 generates fragment information for only some of the M fragment contents, and the remaining fragment contents can be carried in the fragment information. For example, if the Value data is 3.2 MB and FRAGMENT_SIZE is set to 1 MB, the Value data can be divided into 4 fragments: the first 3 fragment contents are 1 MB each, and the 4th is 0.2 MB. Fragment information can then be generated for all 4 fragment contents, or it can be generated for the first 3 fragment contents, with the 4th fragment content (0.2 MB, also called "tail data") carried in the fragment information generated for the first 3. The embodiments of the present invention can therefore generate fragment information flexibly.
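The 3.2 MB example can be worked through with a short Python sketch. Several details are assumptions made for illustration: 1 MB is taken as 1,000,000 bytes so that 3.2 MB is a whole number of bytes, the recorded offset is taken to be the position at which each fragment would sit if the N fragments were written back to back, and the names `build_fragment_info` and `carried` are hypothetical.

```python
MB = 1_000_000            # assumption: decimal megabytes
FRAGMENT_SIZE = 1 * MB


def build_fragment_info(value, n, fragment_size=FRAGMENT_SIZE):
    """Slice `value` and describe the first N fragments; carry the other M-N inline."""
    fragments = [value[i:i + fragment_size] for i in range(0, len(value), fragment_size)]
    m = len(fragments)
    if not 0 < n <= m:
        raise ValueError("N must satisfy 0 < N <= M")
    entries, offset = [], 0
    for frag_id in range(n):
        entries.append({"id": frag_id, "offset": offset,
                        "length": len(fragments[frag_id])})
        offset += len(fragments[frag_id])
    # "count" is M; "carried" holds the (M-N) fragment contents, e.g. the tail data.
    return {"count": m, "entries": entries, "carried": fragments[n:]}


value = bytes(32 * MB // 10)                # 3.2 MB of zero bytes
info = build_fragment_info(value, n=3)      # M = 4, N = 3: the 0.2 MB tail is carried
```

Calling `build_fragment_info(value, n=4)` instead corresponds to the N = M case, in which nothing is carried and all four fragments are described by entries.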
In some embodiments of the present invention, the fragment information includes the number of fragments of the Value data and the sequence number (ID, IDentity) of each of the N fragment contents, so that each fragment content can be found easily by its ID. The fragment information also includes the offset address (offset) of each of the N fragment contents, so that each fragment content can be fetched accurately from the Key-Value database by its offset address.
204. Store the Key data and the fragment information in a log-structured merge-tree (LSM-Tree, The Log-Structured Merge-Tree), and store the N fragment contents in a Key-Value database.
The Key data corresponds to the fragment information.
In the embodiments of the present invention, after the fragment information is generated, the Key data and the Value data are stored separately: the Key data and the fragment information are stored in the LSM-Tree, and the N fragment contents are stored in a Key-Value database (DB, Data Base). Because only the Key data and the fragment information are stored in the LSM-Tree, only they take part in the rolling merge process; the N fragment contents do not, and so they do not need to be moved between the data sets of different levels. This avoids the high I/O overhead that such movement would cause and improves the efficiency of storage operations on Value data.
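The separation of Key data and fragment information from the fragment contents can be sketched as follows, here for the N = M case in which every fragment goes to the database. The stores are stand-ins chosen for illustration: the LSM-Tree is a plain dictionary, the Key-Value database is an append-only `bytearray`, and the recorded offset is the position of each fragment inside that byte array. The name `store_sliced` is hypothetical.

```python
def store_sliced(key, value, fragment_size, lsm_tree, kv_database):
    """Append the fragments of `value` to `kv_database`; index them in `lsm_tree`."""
    entries = []
    for frag_id, start in enumerate(range(0, len(value), fragment_size)):
        fragment = value[start:start + fragment_size]
        entries.append({"id": frag_id,
                        "offset": len(kv_database),   # where this fragment begins
                        "length": len(fragment)})
        kv_database.extend(fragment)                  # fragment content goes to the DB
    # Only the Key and this small fragment information enter the LSM-Tree.
    lsm_tree[key] = {"count": len(entries), "entries": entries}


lsm_tree = {}
kv_database = bytearray()
store_sliced("k", b"abcdefghij", 4, lsm_tree, kv_database)
```

After the call, the record under `lsm_tree["k"]` holds three small entries and no Value bytes, so a rolling merge over the LSM-Tree would move only that metadata.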
In some embodiments of the invention, key-value pair database can be stored in raw device or file system, or is directly stored in disk.
In some embodiments of the invention, Key data and burst information are stored in LSM-Tree, specifically mode in the following manner: above-mentioned Key data and above-mentioned burst information are stored in to the first ordered data and concentrate, above-mentioned the first ordered data collection is arranged in internal memory (memory); In the time that above-mentioned the first ordered data collection exceedes merging threshold value, to concentrate burst information corresponding to Key data to be updated to above-mentioned the first ordered data the second ordered data and concentrate the burst information corresponding with above-mentioned Key data, above-mentioned the second ordered data collection is arranged in disk.Shown in Fig. 1-a, the first ordered data collection specifically can refer to C as the aforementioned 0tree, the second ordered data collection specifically can refer to C 1tree, first Key data and burst information are stored in C 0in tree, then work as C 0when tree meets the condition that merges threshold value, by C 1in tree, burst information corresponding to Key data is updated to C 0burst information corresponding to these Key data in tree.If be also stored in for example the 3rd ordered data collection of multiple ordered data collection and the 4th ordered data collection in disk time, can carry out rolling merge process to Key data and burst information according to aforesaid mode, concentrate thereby Key data and burst information are moved to more higher leveled ordered data.
In other embodiment of the present invention, after concentrated the second ordered data burst information corresponding to Key data is updated to the concentrated burst information corresponding with above-mentioned Key data of above-mentioned the first ordered data, the above-mentioned Key data that above-mentioned the first ordered data can also be concentrated and the burst information corresponding with above-mentioned Key data are deleted." legacy data " in this way the first ordered data concentrated deleted, thereby improves space availability ratio.
As can be seen, it is first determined whether the data volume of the Value data in a key-value pair exceeds the data threshold. When it does, the Value data are sliced to obtain M burst contents, burst information is generated for some of the M burst contents, and the remaining burst contents are carried in the burst information. Finally, the burst information and the Key data are stored in the LSM-Tree, and the burst contents for which burst information was generated are stored in the key-value pair database. Because burst contents can be stored in the key-value pair database, they do not need to be moved between data sets at different levels during the rolling merge process, which reduces I/O overhead and improves the efficiency of storage operations on the Value data. In addition, in the embodiments of the present invention the burst contents obtained by slicing can flexibly be stored either in the key-value pair database or, carried in the burst information, in the LSM-Tree, which makes storage in the key-value pair system more flexible and more convenient for users.
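The storing process summarized above can be sketched as follows. This is a minimal sketch under stated assumptions, not the embodiments' implementation: dictionaries stand in for the LSM-Tree and the key-value pair database, the sizes are tiny so the behaviour is easy to follow, and the function name store, the inline_tail flag and the record layout are all hypothetical.

```python
FRAGMENT_SIZE = 4    # bytes per burst content (tiny, for illustration only)
DATA_THRESHOLD = 8   # Value data larger than this are sliced


def store(key, value, lsm_tree, kv_db, inline_tail=True):
    """Store one key-value pair, slicing the Value data if it exceeds the threshold."""
    if len(value) <= DATA_THRESHOLD:
        lsm_tree[key] = {"value": value}   # small value: the whole pair goes to the LSM-Tree
        return
    # Slice the Value data into M burst contents.
    pieces = [value[i:i + FRAGMENT_SIZE] for i in range(0, len(value), FRAGMENT_SIZE)]
    m = len(pieces)
    # Optionally carry a short last piece (the tail) inside the burst information.
    tail = pieces.pop() if inline_tail and len(pieces[-1]) < FRAGMENT_SIZE else b""
    entries = []
    for seq, piece in enumerate(pieces):   # these are the N burst contents
        kv_db[(key, seq)] = piece          # burst content goes to the key-value pair database
        entries.append({"id": seq, "offset": seq * FRAGMENT_SIZE})
    # Key data and burst information go to the LSM-Tree.
    lsm_tree[key] = {"count": m, "fragments": entries, "tail": tail}


lsm_tree, kv_db = {}, {}
store("k", b"abcdefghij", lsm_tree, kv_db)
print(lsm_tree["k"])
print(kv_db)
```

With inline_tail=False every burst content is written to the key-value pair database and the burst information carries no content, which corresponds to the case N = M.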
Another embodiment of the method for storing data in a key-value pair system according to the present invention is described next. As shown in Fig. 3-a, it may include:
301. Determine whether the data volume of the Value data in a key-value pair exceeds the data threshold. If the data volume of the Value data does not exceed the data threshold, perform step 302; if it exceeds the data threshold, perform step 303.
302. If the data volume of the Value data does not exceed the data threshold, store the key-value pair in the LSM-Tree.
303. If the data volume of the Value data exceeds the data threshold, slice the Value data to obtain M burst contents.
304. According to the M burst contents, generate burst information for N of the burst contents.
305. Store the Key data and the burst information in the LSM-Tree, and store the N burst contents in the key-value pair database.
Steps 303 to 305 are implemented in a similar way to steps 202 to 204 in the preceding embodiment and are not described again here.
306. Obtain the data length to be read from the first Value data, the data offset address to be read in the first Value data, and the first Key data corresponding to the first Value data.
Here, the first Value data are the Value data that currently need to be read.
Steps 301 to 305 describe the data storing process of this embodiment. After the storing process is complete, when Value data need to be read, the read can be performed according to steps 306 to 308 of this embodiment. It should be noted that in the prior art the whole Value data are stored in the LSM-Tree, so every time part of the field content of the Value data is needed, the whole Value data must be read from the LSM-Tree, which incurs a large I/O overhead. In the embodiments of the present invention the Value data have been sliced, so reading part of the field content does not require reading the whole Value data: the position to read from can be located precisely and only the required length of data read, giving a read within a sub-range. This reduces I/O overhead and improves data reading efficiency.
In the embodiments of the present invention, for convenience of description, the Value data to be read are defined as the first Value data; the data length to be read from the first Value data, the data offset address to be read, and the first Key data corresponding to the first Value data can therefore be obtained first.
307. Look up, in the LSM-Tree, the first burst information corresponding to the first Key data according to the first Key data.
In some embodiments of the invention, because both the Key data and the burst information are stored in the LSM-Tree, once the data length to be read from the first Value data and the data offset address to be read have been obtained, the first burst information corresponding to the first Key data can be looked up in the LSM-Tree. Here the first Key data are the Key data corresponding to the first Value data and are called the "first Key data" for ease of description; likewise, the first burst information is the burst information corresponding to the first Key data and is called the "first burst information" for convenience of description.
308. Read the required data of the first Value data from the LSM-Tree and/or the key-value pair database according to the data offset address to be read in the first Value data, the data length to be read from the first Value data, and the first burst information corresponding to the first Key data.
After the first burst information has been found, the required data can be read from the LSM-Tree and/or the key-value pair database according to the data offset address to be read, the data length to be read, and the first burst information. Because burst contents can flexibly be stored either in the key-value pair database or, carried in the burst information, in the LSM-Tree, three cases arise: when the burst contents covering the data to be read are stored in the key-value pair database, the data are read from the key-value pair database; when they are stored in the LSM-Tree, the data are read from the LSM-Tree; and when they are stored in both, the data are read from both the key-value pair database and the LSM-Tree.
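The three cases of step 308 can be illustrated with a short sketch. It assumes fixed-size burst contents held in a dictionary that stands in for the key-value pair database, plus a tail carried in the burst information in the LSM-Tree; the function name read_range and the record layout are hypothetical.

```python
FRAGMENT_SIZE = 4  # bytes per burst content (tiny, for illustration only)


def read_range(key, offset, length, lsm_tree, kv_db):
    """Read `length` bytes starting at `offset` without loading the whole value."""
    info = lsm_tree[key]                    # step 307: look up the burst information by Key
    end = offset + length
    out = bytearray()
    for frag in info["fragments"]:          # burst contents kept in the key-value pair database
        start = frag["offset"]
        if start < end and offset < start + FRAGMENT_SIZE:   # fragment overlaps the range
            piece = kv_db[(key, frag["id"])]
            out += piece[max(offset - start, 0):end - start]
    tail_start = len(info["fragments"]) * FRAGMENT_SIZE
    if end > tail_start:                    # the range reaches the tail carried in the LSM-Tree
        out += info["tail"][max(offset - tail_start, 0):end - tail_start]
    return bytes(out)


lsm_tree = {"k": {"fragments": [{"id": 0, "offset": 0}, {"id": 1, "offset": 4}], "tail": b"ij"}}
kv_db = {("k", 0): b"abcd", ("k", 1): b"efgh"}
print(read_range("k", 3, 5, lsm_tree, kv_db))   # served from the key-value pair database only
print(read_range("k", 6, 4, lsm_tree, kv_db))   # served from both stores
```

Only the burst contents that overlap the requested range are fetched, which is where the saving over reading the whole value comes from.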
Steps 306 to 308 describe reading part of the data content of the first Value data. If the data that have been read also need to be updated after the read, the following steps may additionally be included:
309. In the burst content where the read data reside, replace the read data with the update data.
After the required data have been read from the first Value data, the read data are replaced with the update data in the burst content where they reside. If that burst content is stored in the LSM-Tree, then after the update data have been written into it, it is still stored in the LSM-Tree; if that burst content is stored in the key-value pair database, then after the update data have been written into it, it is still stored in the key-value pair database.
310. Regenerate the first burst information for each burst content of the first Value data after the update;
311. Store the regenerated first burst information in the LSM-Tree.
Because the read data in the burst content have been replaced with the update data, the data content of that burst content has changed. The first burst information therefore needs to be regenerated for each burst content of the first Value data after the update and stored in the LSM-Tree; the first burst information corresponding to the original first Value data can then be deleted from the LSM-Tree to save storage space and improve space utilization.
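Steps 309 to 311 might look roughly like the sketch below for a burst content held in the key-value pair database. The embodiments do not say what changes in the regenerated burst information; assigning a fresh sequence number to the rewritten burst content is purely an assumption made here so that the regeneration is visible, and the names replace_in_fragment and next_id are hypothetical.

```python
def replace_in_fragment(key, frag_index, pos, new_bytes, lsm_tree, kv_db):
    """Overwrite bytes inside one burst content and regenerate the burst information."""
    info = lsm_tree[key]
    old = info["fragments"][frag_index]
    piece = bytearray(kv_db[(key, old["id"])])
    if pos + len(new_bytes) > len(piece):
        raise ValueError("replacement must stay inside the burst content")
    piece[pos:pos + len(new_bytes)] = new_bytes          # step 309: same-length overwrite
    new_id = info["next_id"]                             # assumption: rewritten content gets a new ID
    kv_db[(key, new_id)] = bytes(piece)                  # content stays in the key-value pair database
    del kv_db[(key, old["id"])]
    fragments = list(info["fragments"])                  # step 310: regenerate the burst information
    fragments[frag_index] = {"id": new_id, "offset": old["offset"]}
    lsm_tree[key] = {"fragments": fragments, "next_id": new_id + 1}   # step 311: replaces the old record


lsm_tree = {"k": {"fragments": [{"id": 0, "offset": 0}, {"id": 1, "offset": 4}], "next_id": 2}}
kv_db = {("k", 0): b"abcd", ("k", 1): b"efgh"}
replace_in_fragment("k", 1, 1, b"XY", lsm_tree, kv_db)
print(lsm_tree["k"], kv_db)
```

Assigning lsm_tree[key] a new record also discards the original burst information, matching the space-saving deletion described above.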
Another embodiment of the method for storing data in a key-value pair system according to the present invention is described next. As shown in Fig. 3-b, it may include steps 301 to 305 and 312 to 317, wherein:
For steps 301 to 305, refer to the description of the preceding embodiment; they are not described again here.
Step 312. Obtain the data length to be inserted into the second Value data, the data offset address at which to insert in the second Value data, and the second Key data corresponding to the second Value data.
Steps 301 to 305 describe the data storing process of this embodiment. After the storing process is complete, when new data content needs to be inserted into the original Value data, the insertion can be performed according to steps 312 to 319 of this embodiment. It should be noted that in the prior art the whole Value data are stored in the LSM-Tree, so every time new Value data are to be inserted into Value data already stored in the LSM-Tree, the whole original Value data must be read from the LSM-Tree, which incurs a large I/O overhead. In the embodiments of the present invention the Value data have been sliced, so inserting new data content does not require reading the whole Value data: the insert position can be located precisely and the new data content of the required length inserted, giving an insertion within a sub-range. This reduces I/O overhead and thereby improves efficiency.
In the embodiments of the present invention, for convenience of description, the Value data into which data are to be inserted are defined as the second Value data; the data length to be inserted into the second Value data, the data offset address at which to insert in the second Value data, and the second Key data corresponding to the second Value data can therefore be obtained first.
313. Look up, in the LSM-Tree, the second burst information corresponding to the second Key data according to the second Key data.
In some embodiments of the invention, because both the Key data and the burst information are stored in the LSM-Tree, once the data length to be inserted into the second Value data and the data offset address at which to insert have been obtained, the second burst information corresponding to the second Key data can be looked up in the LSM-Tree. Here the second Key data are the Key data corresponding to the second Value data and are called the "second Key data" for ease of description; likewise, the second burst information is the burst information corresponding to the second Key data and is called the "second burst information" for convenience of description.
314. Read, from the LSM-Tree and/or the key-value pair database, the burst content in which the insert offset address of the second Value data is located, according to the data offset address at which to insert in the second Value data and the second burst information corresponding to the second Key data.
Because burst contents can flexibly be stored either in the key-value pair database or, carried in the burst information, in the LSM-Tree, three cases arise: when the burst content covering the insert offset address of the second Value data is stored in the key-value pair database, the corresponding burst content is read from the key-value pair database; when it is stored in the LSM-Tree, the corresponding burst content is read from the LSM-Tree; and when the burst contents involved are stored in both, they are read from both the key-value pair database and the LSM-Tree.
315. Insert the data to be inserted into the burst content in which the insert offset address of the second Value data is located.
After the data have been inserted into the burst content in which the insert offset address of the second Value data is located, either steps 316 and 317 or steps 318 and 319 may be performed.
316. Regenerate the second burst information for each burst content of the second Value data after the insertion, and carry the burst content in which the insert offset address is located, as it stands after the insertion, in the regenerated second burst information.
If the burst content in which the insert offset address of the second Value data is located was originally stored in the LSM-Tree, step 316 is performed: that burst content, as it stands after the insertion, is carried in the regenerated second burst information, and step 317 is then performed.
317. Store the regenerated second burst information in the LSM-Tree.
318. Regenerate the second burst information for each burst content of the second Value data after the insertion.
If the burst content in which the insert offset address of the second Value data is located was originally stored in the key-value pair database, step 318 is performed: the second burst information is regenerated for each burst content of the second Value data after the insertion.
319. Store the regenerated second burst information in the LSM-Tree, and store the burst content in which the insert offset address is located, as it stands after the insertion, in the key-value pair database.
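Steps 313 to 319 can be sketched as follows. The sketch assumes the burst information is a list of entries, each recording an offset and a length and either carrying its content inline (stored in the LSM-Tree) or pointing at the key-value pair database by ID; it also lets a burst content grow past its original size rather than re-slicing it, which is a simplification. The name insert_at and the entry layout are hypothetical.

```python
def insert_at(key, offset, data, lsm_tree, kv_db):
    """Insert `data` at `offset`, touching only the burst content that contains the offset."""
    old = lsm_tree[key]                                   # step 313: look up the burst information
    new, start, placed = [], 0, False
    for entry in old:
        length = entry["length"]
        if not placed and start <= offset <= start + length:
            inline = entry["inline"]
            piece = inline if inline is not None else kv_db[(key, entry["id"])]   # step 314
            cut = offset - start
            piece = piece[:cut] + data + piece[cut:]      # step 315: insert into this burst content
            if inline is not None:
                entry = {**entry, "inline": piece}        # step 316: carried in the burst information
            else:
                kv_db[(key, entry["id"])] = piece         # step 319: back to the key-value pair database
            length, placed = len(piece), True
        new.append({**entry, "offset": start, "length": length})   # steps 316/318: regenerate offsets
        start += length
    if not placed:
        raise ValueError("offset lies beyond the end of the value")
    lsm_tree[key] = new                                   # steps 317/319: store the new burst information


lsm_tree = {"k": [{"id": 0, "offset": 0, "length": 4, "inline": None},
                  {"id": None, "offset": 4, "length": 2, "inline": b"ij"}]}
kv_db = {("k", 0): b"abcd"}
insert_at("k", 2, b"ZZ", lsm_tree, kv_db)   # offset falls in a burst content held in the database
insert_at("k", 8, b"!", lsm_tree, kv_db)    # offset falls in the content carried in the LSM-Tree
print(lsm_tree["k"], kv_db)
```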
As can be seen, it is first determined whether the data volume of the Value data in a key-value pair exceeds the data threshold. When it does, the Value data are sliced to obtain M burst contents, burst information is generated for some of the M burst contents, and the remaining burst contents are carried in the burst information. Finally, the burst information and the Key data are stored in the LSM-Tree, and the burst contents for which burst information was generated are stored in the key-value pair database. Because burst contents can be stored in the key-value pair database, they do not need to be moved between data sets at different levels during the rolling merge process, which reduces I/O overhead and improves the efficiency of storage operations on the Value data. In addition, in the embodiments of the present invention the burst contents obtained by slicing can flexibly be stored either in the key-value pair database or, carried in the burst information, in the LSM-Tree, which makes storage in the key-value pair system more flexible and more convenient for users.
To facilitate understanding and implementation of the above solutions of the embodiments of the present invention, several application scenarios are described in detail below.
Referring to Fig. 4-a, Fig. 4-b and Fig. 4-c, another embodiment of the method for storing data in a key-value pair system according to the present invention may include:
Suppose the data volume of the Value data in a key-value pair is 4.7MB and the data threshold is set to 2MB. The Value data in this key-value pair can then be regarded as a large object. Fig. 4-a shows the prior-art storage method, in which the Value data of the large object are placed directly into the LSM-Tree together with the Key data.
In the embodiments of the present invention the following processing is used instead: the Value data of this object are cut into 5 burst contents, with FRAGMENT_SIZE equal to 1MB. The first 4 burst contents are placed in the key-value pair database, which is kept on a raw device or in a file system, and the 5th burst content, the remaining 0.7MB of tail data, can be handled in either of the following two ways:
(1) Burst information is generated for all 5 burst contents, and the Key data are kept in the LSM-Tree together with the burst information. The first 4 burst contents and the 0.7MB of tail data are each stored in the key-value pair database as one burst content, so all 5 burst contents obtained by slicing are stored in the key-value pair database. Fig. 4-b shows how the index of the large object is organized in the LSM-Tree: it contains the Key data and the burst information, namely burst information 1, burst information 2, burst information 3, burst information 4 and burst information 5, and the burst content corresponding to each burst information is stored in the key-value pair database. Burst information 1 corresponds to burst content 1, burst information 2 to burst content 2, burst information 3 to burst content 3, burst information 4 to burst content 4, and burst information 5 to burst content 5 (that is, the tail data). The data volume stored in the LSM-Tree is therefore less than 4.7MB, which reduces the I/O overhead of the rolling merge process in the LSM-Tree and improves the efficiency of storing data in the LSM-Tree.
(2) Burst information is generated for the first 4 burst contents, and the tail data are carried directly in the burst information generated for the first 4 burst contents; the Key data are kept in the LSM-Tree together with the burst information. The first 4 burst contents obtained by slicing are stored in the key-value pair database, while the tail data are carried in the burst information and stored in the LSM-Tree. Fig. 4-c shows how the index of the large object is organized in the LSM-Tree: it contains the Key data and the burst information, namely burst information 1, burst information 2, burst information 3, burst information 4 and the tail data, and the burst content corresponding to each burst information is stored in the key-value pair database. Burst information 1 corresponds to burst content 1, burst information 2 to burst content 2, burst information 3 to burst content 3, and burst information 4 to burst content 4. The data volume stored in the LSM-Tree is therefore less than 4.7MB, which reduces the I/O overhead of the rolling merge process in the LSM-Tree and improves the efficiency of storing data in the LSM-Tree.
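The slicing arithmetic of this example can be checked with a few lines. The helper name plan_slices is hypothetical, and 4.7MB is rounded down to whole bytes here, which is an assumption about how the figure should be read.

```python
MB = 1024 * 1024
FRAGMENT_SIZE = 1 * MB          # as in the example above
VALUE_SIZE = int(4.7 * MB)      # the 4.7MB large object, rounded down to whole bytes


def plan_slices(size, fragment_size=FRAGMENT_SIZE):
    """Return (number of full burst contents, tail length in bytes)."""
    return size // fragment_size, size % fragment_size


full, tail = plan_slices(VALUE_SIZE)
total = full + (1 if tail else 0)
print(full, total, round(tail / MB, 1))   # full burst contents, M, tail size in MB
```

In way (1) all M burst contents go to the key-value pair database (N = M); in way (2) only the full ones do (N = M - 1) and the tail travels with the burst information.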
The above embodiment has described the data storing process. After the storing process is complete, when Value data need to be read, refer to the following description of the data reading process:
Again taking Value data that are a large object as an example: Value data 1 in object A need to be read, starting at offset = 3M+50K with a length of 16KB, and the Key data corresponding to Value data 1 are k1. The read can be performed by calling the following interface: get(k1, &buffer, 3196928, 16384). The read operation proceeds as follows:
First, the object whose Key data are k1 is looked up in the LSM-Tree, and the burst information corresponding to this object, information a, is obtained;
Then, according to the given offset (3M+50K), the data length to be read (16KB) and information a, it is determined which burst contents need to be read. If the data are stored as shown in Fig. 4-b, only burst content 4, which corresponds to burst information 4, needs to be read here.
Next, after burst information 4 has been obtained, the data content of length 16KB starting at offset (3M+50K) is read from burst content 4 in the key-value pair database.
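The choice of burst content 4 follows directly from the offset and FRAGMENT_SIZE, as the following sketch shows. It assumes 1MB burst contents numbered from 1 as in Fig. 4-b; the function name locate is hypothetical.

```python
MB, KB = 1024 * 1024, 1024
FRAGMENT_SIZE = 1 * MB


def locate(offset, length, fragment_size=FRAGMENT_SIZE):
    """Return (burst content number, offset inside it, bytes to read) for each piece of the range."""
    first = offset // fragment_size
    last = (offset + length - 1) // fragment_size
    parts = []
    for i in range(first, last + 1):
        lo = max(offset, i * fragment_size)
        hi = min(offset + length, (i + 1) * fragment_size)
        parts.append((i + 1, lo - i * fragment_size, hi - lo))   # numbering starts at 1
    return parts


offset, length = 3 * MB + 50 * KB, 16 * KB
print(offset, length)            # the two numeric arguments of the get() call above
print(locate(offset, length))
```

A range that straddles a 1MB boundary would yield two entries, one per burst content, each still much smaller than the whole value.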
In the prior art, the whole Value data of the large object are read out, and the upper-layer application can obtain the data content of the desired range only through its own operations, which makes reading data very inefficient. The embodiments of the present invention, by contrast, support sub-range read operations, where a sub-range means a certain range within the Value data. When a user needs to read the data content of a certain range within large Value data, the embodiments of the present invention can perform a sub-range read operation, which reduces I/O overhead.
After the 16KB of data content have been read, if they need to be modified, the modified 16KB of data content are stored in the key-value pair database once the modification is complete. Because the content has changed, the burst information needs to be regenerated; suppose the regenerated burst information is information b. Information b is stored in the LSM-Tree, which completes the modification and replacement of data content within a sub-range of the Value data.
The above embodiment has described the data storing process. After the storing process is complete, when Value data need to be inserted into the original Value data, refer to the following description of the data update process:
Again taking Value data that are a large object as an example: suppose the data volume of the large object is 3MB+1KB, the data to be inserted are 20KB long, the insert position is the end of the large object, and the Key data of the large object are k2. The update can be performed by calling the following interface: update(k2, &buffer, 3146752). The update operation proceeds as follows:
First, the object whose Key is k2 is looked up in the LSM-Tree, and the burst information corresponding to this object, information c, is obtained;
Then, as shown in Fig. 4-c, the tail data in the burst information, information c, are spliced with the newly added 20KB of data. In this example the spliced data amount to 21KB (the 1KB tail plus the 20KB of new data), which is less than FRAGMENT_SIZE, so they become the new tail data. It should be noted that if the data volume of the spliced data is greater than FRAGMENT_SIZE, a new burst content and new tail data are produced. The burst information therefore needs to be regenerated for each burst content after the large object has been re-sliced; the regenerated burst information carries the new tail data and is stored in the LSM-Tree, and the new burst content is stored in the key-value pair database.
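The splicing step for an append can be sketched as follows, assuming the tail of the 3MB+1KB object is its trailing 1KB, carried in the burst information as in Fig. 4-c. The function name append_to_tail is hypothetical, and treating spliced data of exactly FRAGMENT_SIZE as a full burst content with an empty tail is a choice made for this sketch.

```python
KB, MB = 1024, 1024 * 1024
FRAGMENT_SIZE = 1 * MB


def append_to_tail(tail, new_data, fragment_size=FRAGMENT_SIZE):
    """Splice new data onto the tail; return (new full burst contents, new tail)."""
    spliced = tail + new_data
    cut = len(spliced) - len(spliced) % fragment_size      # bytes that fill whole burst contents
    full = [spliced[i:i + fragment_size] for i in range(0, cut, fragment_size)]
    return full, spliced[cut:]


assert 3 * MB + 1 * KB == 3146752          # the position argument of the update() call above
full, tail = append_to_tail(b"\x00" * KB, b"\x01" * (20 * KB))
print(len(full), len(tail) // KB)          # number of new burst contents, new tail size in KB
```

Only the tail is read and rewritten here; the three existing 1MB burst contents in the key-value pair database are left untouched.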
In the prior art, the whole Value data of the large object are read out, spliced with the newly added data and then written back, which makes inserting data very inefficient. The embodiments of the present invention, by contrast, support sub-range write operations, where a sub-range means a certain range within the Value data. When a user needs to insert new data into a certain range within large Value data, the embodiments of the present invention can perform a sub-range write operation, which reduces I/O overhead.
The above embodiment has described the data storing process. The merge process for large objects is illustrated next. As shown in Fig. 4-d, the burst information of a large object takes part in the rolling merge process; that is, the burst information in the Ci-1 tree is merged into the Ci tree, thereby updating the burst information of the old large object.
First, for the same Key1 in the Ci-1 tree and the Ci tree, the burst information of the large object in the Ci-1 tree (burst information 1' and burst information 2') is compared with the burst information of the large object in the Ci tree (burst information 1, burst information 2 and burst information 3); the original old burst information of the large object in the Ci tree is deleted, burst information 1' and burst information 2' in the Ci-1 tree are deleted, and the burst information in the Ci-1 tree is updated to burst information 1, burst information 2 and burst information 3.
Here the data in the LSM-Tree are stored in the manner shown in Fig. 4-b; that is, the Ci-1 tree in the LSM-Tree holds only burst information 1, burst information 2 and burst information 3, with no tail data.
As can be seen from the foregoing, it is first determined whether the data volume of the Value data in a key-value pair exceeds the data threshold. When it does, the Value data are sliced to obtain M burst contents, burst information is generated for some of the M burst contents, and the remaining burst contents are carried in the burst information. Finally, the burst information and the Key data are stored in the LSM-Tree, and the burst contents for which burst information was generated are stored in the key-value pair database. Because burst contents can be stored in the key-value pair database, they do not need to be moved between data sets at different levels during the rolling merge process, which reduces I/O overhead and improves the efficiency of storage operations on the Value data. In addition, in the embodiments of the present invention the burst contents obtained by slicing can flexibly be stored either in the key-value pair database or, carried in the burst information, in the LSM-Tree, which makes storage in the key-value pair system more flexible and more convenient for users.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
To better implement the above solutions of the embodiments of the present invention, related apparatus for implementing them is also provided below.
Referring to Fig. 5-a, an apparatus 500 for storing data in a key-value pair system provided by an embodiment of the present invention may include a judging module 501, a slicing module 502, an acquisition module 503 and a storage module 504, wherein:
The judging module 501 is configured to determine whether the data volume of the Value data in a key-value pair exceeds the data threshold, the key-value pair comprising Key data and the Value data corresponding to the Key data;
The slicing module 502 is configured to slice the Value data to obtain M burst contents when, according to the judgment result of the judging module 501, the data volume of the Value data exceeds the data threshold, where M is a natural number greater than 1 and the M burst contents are the data content of the Value data after slicing;
The acquisition module 503 is configured to generate burst information for N of the burst contents according to the M burst contents obtained by the slicing module 502, the burst information comprising: the number of burst contents into which the Value data were sliced, the offset address of each of the N burst contents, the sequence number (ID) of each of the N burst contents, and the (M-N) burst contents, where N is a natural number greater than 0 and N is less than or equal to M;
The storage module 504 is configured to store the Key data and the burst information obtained by the acquisition module 503 in a log-structured merge tree (LSM-Tree), and to store the N burst contents obtained by the slicing module 502 in the key-value pair database, the Key data corresponding to the burst information.
In some embodiments of the invention, the storage module 504 is further configured to store the key-value pair in the LSM-Tree when the data volume of the Value data does not exceed the data threshold.
Referring to Fig. 5-b, in some embodiments of the invention the apparatus 500 for storing data in a key-value pair system may further include a search module 505 and a read module 506, wherein:
The acquisition module 503 is further configured to obtain the data length to be read from the first Value data, the data offset address to be read in the first Value data, and the first Key data corresponding to the first Value data, the first Value data being the Value data that currently need to be read;
The search module 505 is configured to look up, in the LSM-Tree, the first burst information corresponding to the first Key data according to the first Key data;
The read module 506 is configured to read the required data of the first Value data from the LSM-Tree and/or the key-value pair database according to the data offset address to be read in the first Value data, the data length to be read from the first Value data, and the first burst information corresponding to the first Key data.
In some embodiments of the invention, the apparatus 500 for storing data in a key-value pair system may further include an update module 507, wherein:
The update module 507 is configured to replace the read data with the update data in the burst content where the read data reside;
The acquisition module 503 is further configured to regenerate the first burst information for each burst content of the first Value data after the update;
The storage module 504 is further configured to store the regenerated first burst information in the LSM-Tree.
In some embodiments of the invention, the storage module 504 may include:
A storage submodule 5041, configured to store the Key data and the burst information in a first ordered data set, the first ordered data set residing in memory;
A merge submodule 5042, configured to update, when the first ordered data set exceeds the merge threshold, the burst information corresponding to the Key data in a second ordered data set to the burst information corresponding to the Key data in the first ordered data set, the second ordered data set residing on disk.
In some embodiments of the invention, the apparatus 500 for storing data in a key-value pair system may further include a deleting module 508, configured to delete the Key data in the first ordered data set and the burst information corresponding to the Key data.
Referring to Fig. 5-c, in some embodiments of the invention the apparatus 500 for storing data in a key-value pair system may further include a search module 505, a read module 506 and an insert module 509, wherein:
The acquisition module 503 is further configured to obtain the data length to be inserted into the second Value data, the data offset address at which to insert in the second Value data, and the second Key data corresponding to the second Value data;
The search module 505 is configured to look up, in the LSM-Tree, the second burst information corresponding to the second Key data according to the second Key data;
The read module 506 is configured to read, from the LSM-Tree and/or the key-value pair database, the burst content in which the insert offset address of the second Value data is located, according to the data offset address at which to insert in the second Value data and the second burst information corresponding to the second Key data;
The insert module 509 is configured to insert the data to be inserted into the burst content in which the insert offset address of the second Value data is located;
The acquisition module 503 is further configured to regenerate the second burst information for each burst content of the second Value data after the insertion, and to carry the burst content in which the insert offset address is located, as it stands after the insertion, in the regenerated second burst information; the storage module 504 is further configured to store the regenerated second burst information in the LSM-Tree. Alternatively, the acquisition module 503 is further configured to regenerate the second burst information for each burst content of the second Value data after the insertion; the storage module 504 is further configured to store the regenerated second burst information in the LSM-Tree and to store the burst content in which the insert offset address is located, as it stands after the insertion, in the key-value pair database.
As can be seen from the above, the method first determines whether the data volume of the Value data in a key-value pair exceeds a data threshold. When it does, the Value data are sliced to obtain M burst contents; burst information is generated for some of the M burst contents, and the remaining burst contents are carried in the burst information. Finally, the burst information and the Key data are stored in the LSM-Tree, and the burst contents for which burst information was generated are stored in the key-value pair database. Because burst contents can be stored in the key-value pair database, they do not need to be moved between the data sets of different levels during rolling merge, which reduces I/O overhead and improves the efficiency of storage operations on Value data. Moreover, in the embodiments of the invention, a burst content can be stored either in the key-value pair database or, carried in the burst information, in the LSM-Tree, which makes storage in the key-value pair system more flexible and more convenient for users.
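The I/O argument above can be illustrated with a toy rolling merge. This is only a sketch under stated assumptions, not the patented implementation: the two levels are plain Python dicts, `BurstInfo`, `encoded_size`, and `rolling_merge` are hypothetical names, the 16-byte cost per offset entry is an assumed encoding, and counting rewritten bytes stands in for real disk I/O.

```python
from dataclasses import dataclass, field


@dataclass
class BurstInfo:
    """Hypothetical metadata record stored in the LSM-Tree for one sliced value."""
    count: int                                   # number of slices (M)
    offsets: dict = field(default_factory=dict)  # slice ID -> offset, for slices kept in the KV database
    inline: dict = field(default_factory=dict)   # slice ID -> content, for slices carried in the record

    def encoded_size(self) -> int:
        # Assumed encoding: 16 bytes per (ID, offset) entry plus the inline slice bytes.
        return 16 * len(self.offsets) + sum(len(c) for c in self.inline.values())


def rolling_merge(upper: dict, lower: dict) -> int:
    """Move every record from the upper level into the lower level.

    Returns the number of bytes rewritten, a stand-in for merge I/O.
    """
    moved = 0
    for key in sorted(upper):
        record = upper[key]
        lower[key] = record  # the newer record replaces any older one
        size = record.encoded_size() if isinstance(record, BurstInfo) else len(record)
        moved += len(key) + size
    upper.clear()
    return moved


value = bytes(1_000_000)  # a 1 MB value, imagined as four 250 kB slices

# Case 1: the whole value lives in the LSM-Tree and is rewritten by the merge.
cost_whole = rolling_merge({b"k": value}, {})

# Case 2: only burst information lives in the LSM-Tree; the slices stay in the
# key-value pair database and are not touched by the merge.
info = BurstInfo(count=4, offsets={i: i * 250_000 for i in range(4)})
cost_info = rolling_merge({b"k": info}, {})

print(cost_whole, cost_info)
```

In this toy model the merge rewrites roughly the full megabyte in the first case and only a few dozen bytes of metadata in the second, which is the effect the paragraph attributes to keeping burst contents out of the multi-level merge.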
An embodiment of the invention also provides a computer storage medium that stores a program; when executed, the program performs some or all of the steps described in the above method embodiments.
Next, another storage device for data in a key-value pair system provided by an embodiment of the invention is introduced. Referring to Fig. 6, the storage device 600 for data in the key-value pair system comprises:
An input device 601, an output device 602, a processor 603, and a memory 604 (the storage device 600 may have one or more processors 603; Fig. 6 takes one processor as an example). In some embodiments of the invention, the input device 601, the output device 602, the processor 603, and the memory 604 may be connected by a bus or in another manner; Fig. 6 takes a bus connection as an example.
The processor 603 is configured to perform the following steps: determining whether the data volume of the Value data in a key-value pair exceeds a data threshold, the key-value pair comprising Key data and Value data corresponding to the Key data; if the data volume of the Value data exceeds the data threshold, slicing the Value data to obtain M burst contents, where M is a natural number greater than 1 and the M burst contents are the data content of the Value data after slicing; generating burst information for N of the M burst contents, the burst information comprising: the number of slices into which the Value data were cut, the offset address of each of the N burst contents, the sequence number (ID) of each of the N burst contents, and the remaining (M-N) burst contents, where N is a natural number greater than 0 and N is less than or equal to M; and storing the Key data and the burst information in a log-structured merge tree (LSM-Tree) and storing the N burst contents in a key-value pair database, the Key data corresponding to the burst information.
In some embodiments of the invention, the processor 603 is further configured to perform the following step: if the data volume of the Value data does not exceed the data threshold, storing the key-value pair in the LSM-Tree.
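The store path can be sketched as follows, taking the reading given in the summary above: a value larger than the threshold is sliced, and a smaller one is stored whole in the LSM-Tree. Everything concrete here is an assumption made for illustration: the LSM-Tree and the key-value pair database are plain dicts, `DATA_THRESHOLD`, `SLICE_SIZE`, `BurstInfo`, and `put` are hypothetical names, the offset address is taken to be the slice's offset within the original value, and external slices are keyed by `(key, slice ID)`.

```python
from dataclasses import dataclass, field

DATA_THRESHOLD = 8  # assumed threshold in bytes (tiny, for illustration)
SLICE_SIZE = 4      # assumed fixed slice size in bytes


@dataclass
class BurstInfo:
    count: int                                   # number of slices (M)
    offsets: dict = field(default_factory=dict)  # slice ID -> offset, for the N external slices
    inline: dict = field(default_factory=dict)   # slice ID -> content, for the (M-N) carried slices


lsm_tree = {}  # stand-in for the LSM-Tree: key -> raw value or BurstInfo
kv_db = {}     # stand-in for the key-value pair database: (key, slice ID) -> content


def put(key: bytes, value: bytes, inline_ids=frozenset()) -> None:
    """Store one key-value pair; `inline_ids` picks the slices carried in the burst information."""
    if len(value) <= DATA_THRESHOLD:
        lsm_tree[key] = value  # small value: the pair goes into the LSM-Tree as is
        return

    slices = [value[i:i + SLICE_SIZE] for i in range(0, len(value), SLICE_SIZE)]
    if all(slice_id in inline_ids for slice_id in range(len(slices))):
        raise ValueError("at least one slice must be stored externally (N > 0)")

    info = BurstInfo(count=len(slices))
    for slice_id, content in enumerate(slices):
        if slice_id in inline_ids:
            info.inline[slice_id] = content             # carried in the burst information
        else:
            info.offsets[slice_id] = slice_id * SLICE_SIZE
            kv_db[(key, slice_id)] = content            # stored in the key-value pair database
    lsm_tree[key] = info


put(b"small", b"abc")
put(b"big", b"0123456789AB", inline_ids={2})
```

After these two calls, `lsm_tree[b"small"]` holds the raw bytes, while `lsm_tree[b"big"]` holds a `BurstInfo` with two offset entries and one carried slice; the other two slices sit in `kv_db`.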
In some embodiments of the invention, the processor 603 is further configured to perform the following steps: obtaining the data length to be read from Value data, the data offset address in the Value data from which to read, and the Key data corresponding to the Value data, the Value data being the Value data currently to be read; looking up, in the LSM-Tree and according to the Key data, the first burst information corresponding to the Key data; and reading the required data of the Value data from the LSM-Tree and/or the key-value pair database according to the data offset address, the data length, and the first burst information corresponding to the Key data.
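A partial read of this kind might look like the sketch below. It is illustrative only: the stores are dicts, `BurstInfo` and `read_range` are hypothetical names, a fixed `slice_size` recorded in the burst information is an assumption that makes slice lookup a simple division, and external slices are assumed to be keyed by `(key, slice ID)`.

```python
from dataclasses import dataclass, field


@dataclass
class BurstInfo:
    count: int                                   # number of slices (M)
    slice_size: int                              # assumed fixed slice size in bytes
    offsets: dict = field(default_factory=dict)  # slice ID -> offset, for external slices
    inline: dict = field(default_factory=dict)   # slice ID -> content, for carried slices


def read_range(lsm_tree: dict, kv_db: dict, key: bytes, offset: int, length: int) -> bytes:
    """Return up to `length` bytes of the value for `key`, starting at `offset`."""
    record = lsm_tree[key]                       # look up the burst information by key
    if isinstance(record, bytes):
        return record[offset:offset + length]    # small value stored whole in the LSM-Tree

    end = offset + length
    first = offset // record.slice_size
    last = min((end - 1) // record.slice_size, record.count - 1)
    out = bytearray()
    for slice_id in range(first, last + 1):
        # Carried slices come from the LSM-Tree record, the rest from the KV database.
        if slice_id in record.inline:
            content = record.inline[slice_id]
        else:
            content = kv_db[(key, slice_id)]
        base = slice_id * record.slice_size
        lo = max(offset, base) - base
        hi = min(end, base + len(content)) - base
        out += content[lo:hi]
    return bytes(out)


# The value b"0123456789" in three slices; the last one is carried in the burst information.
lsm = {b"k": BurstInfo(count=3, slice_size=4, offsets={0: 0, 1: 4}, inline={2: b"89"})}
kv = {(b"k", 0): b"0123", (b"k", 1): b"4567"}
part = read_range(lsm, kv, b"k", 3, 6)
```

Only the slices that overlap the requested range are fetched, so a short read of a large value touches one or two slices rather than the whole value.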
In some embodiments of the invention, the processor 603 is further configured to perform the following steps: replacing the data that have been read with updated data in the burst content where the read data reside; regenerating the first burst information for the burst contents of the Value data after the update; and storing the regenerated first burst information in the LSM-Tree.
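An in-place update of a byte range could be sketched as follows. The assumptions are the same kind as before and are not taken from the patent text: dict-backed stores, hypothetical names `BurstInfo` and `overwrite_range`, a fixed `slice_size`, a `length` field holding the total value size, and replacement data of the same length as the range it overwrites.

```python
from dataclasses import dataclass, field


@dataclass
class BurstInfo:
    count: int                                   # number of slices (M)
    slice_size: int                              # assumed fixed slice size in bytes
    length: int                                  # assumed field: total size of the value
    offsets: dict = field(default_factory=dict)  # slice ID -> offset, for external slices
    inline: dict = field(default_factory=dict)   # slice ID -> content, for carried slices


def overwrite_range(lsm_tree: dict, kv_db: dict, key: bytes, offset: int, new_data: bytes) -> None:
    """Replace len(new_data) bytes at `offset`, touching only the affected slices."""
    info = lsm_tree[key]
    end = offset + len(new_data)
    if offset < 0 or end > info.length:
        raise ValueError("range must lie inside the existing value")
    if not new_data:
        return

    new_inline = dict(info.inline)
    for slice_id in range(offset // info.slice_size, (end - 1) // info.slice_size + 1):
        carried = slice_id in new_inline
        old = new_inline[slice_id] if carried else kv_db[(key, slice_id)]
        base = slice_id * info.slice_size
        lo, hi = max(offset, base), min(end, base + len(old))
        patched = old[:lo - base] + new_data[lo - offset:hi - offset] + old[hi - base:]
        if carried:
            new_inline[slice_id] = patched       # updated slice stays in the burst information
        else:
            kv_db[(key, slice_id)] = patched     # updated slice stays in the KV database

    # Regenerate the burst information and store it back in the LSM-Tree.
    lsm_tree[key] = BurstInfo(info.count, info.slice_size, info.length,
                              dict(info.offsets), new_inline)


lsm = {b"k": BurstInfo(count=3, slice_size=4, length=10,
                       offsets={0: 0, 1: 4}, inline={2: b"89"})}
kv = {(b"k", 0): b"0123", (b"k", 1): b"4567"}
overwrite_range(lsm, kv, b"k", 3, b"abcdef")
```

Slices outside the updated range are neither read nor rewritten, and the regenerated record replaces the old one under the same key.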
In some embodiments of the invention, the processor 603 is further configured to perform the following steps: obtaining the data length to be inserted into second Value data, the data offset address in the second Value data at which the insertion is to occur, and the second Key data corresponding to the second Value data; looking up, in the LSM-Tree and according to the second Key data, the second burst information corresponding to the second Key data; reading, from the LSM-Tree and/or the key-value pair database, the burst content located at the data offset address at which the insertion is to occur, according to that data offset address and the second burst information corresponding to the second Key data; inserting the data to be inserted into the burst content located at that data offset address in the second Value data; regenerating the second burst information for the burst contents of the second Value data after the insertion, with the burst content located at the insertion offset address, as modified by the insertion, carried in the regenerated second burst information, and storing the regenerated second burst information in the LSM-Tree; or, alternatively, regenerating the second burst information for the burst contents of the second Value data after the insertion, storing the regenerated second burst information in the LSM-Tree, and storing the burst content located at the insertion offset address, as modified by the insertion, in the key-value pair database.
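The insertion steps, including the two alternatives for where the modified burst content ends up, can be sketched like this. It is a hedged illustration rather than the patented implementation: the stores are dicts, `BurstInfo`, `insert_into_value`, and the `carry_inline` flag are hypothetical, and the burst information is assumed to record the length of every slice so that the slice containing a given offset can be found after earlier insertions have changed slice sizes.

```python
from dataclasses import dataclass, field


@dataclass
class BurstInfo:
    lengths: list                                # assumed field: length of each slice, in ID order
    inline: dict = field(default_factory=dict)   # slice ID -> content, for carried slices


def insert_into_value(lsm_tree: dict, kv_db: dict, key: bytes,
                      offset: int, data: bytes, carry_inline: bool) -> None:
    """Insert `data` at `offset`, modifying only the slice that contains the offset."""
    info = lsm_tree[key]                         # look up the burst information by key

    start = 0
    for slice_id, size in enumerate(info.lengths):
        if offset <= start + size:               # slice that covers the insertion point
            break
        start += size
    else:
        raise ValueError("offset lies beyond the end of the value")

    # Read only the affected slice, from the record or from the KV database.
    if slice_id in info.inline:
        content = info.inline[slice_id]
    else:
        content = kv_db[(key, slice_id)]
    cut = offset - start
    content = content[:cut] + data + content[cut:]

    # Regenerate the burst information around the modified slice.
    lengths = list(info.lengths)
    lengths[slice_id] = len(content)
    inline = {i: c for i, c in info.inline.items() if i != slice_id}
    if carry_inline:
        inline[slice_id] = content               # alternative 1: carry it in the burst information
        kv_db.pop((key, slice_id), None)
    else:
        kv_db[(key, slice_id)] = content         # alternative 2: store it in the KV database
    lsm_tree[key] = BurstInfo(lengths, inline)


def read_all(lsm_tree: dict, kv_db: dict, key: bytes) -> bytes:
    """Reassemble the full value, for checking the result."""
    info = lsm_tree[key]
    return b"".join(info.inline[i] if i in info.inline else kv_db[(key, i)]
                    for i in range(len(info.lengths)))


lsm = {b"k": BurstInfo(lengths=[4, 4, 2], inline={2: b"89"})}
kv = {(b"k", 0): b"0123", (b"k", 1): b"4567"}
insert_into_value(lsm, kv, b"k", 5, b"XY", carry_inline=True)
whole = read_all(lsm, kv, b"k")
```

Which alternative to use is a policy choice the text leaves open; the flag here simply exposes both. Carrying a small modified slice in the burst information saves a lookup in the key-value pair database, while storing a large one externally keeps the LSM-Tree record small.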
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be carried out by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may be a read-only memory (ROM), a magnetic disk, an optical disc, or the like.
The storage method and related devices for data in a key-value pair system provided by the invention have been described in detail above. Those of ordinary skill in the art may, following the ideas of the embodiments of the invention, make changes to the specific implementations and to the scope of application. In summary, this description should not be construed as limiting the invention.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention, and those skilled in the art can derive other drawings from them.
Fig. 1-a is a schematic diagram of a prior-art LSM-Tree deferring and batching index changes through a two-level ordered data set;
Fig. 1-b is a schematic diagram of a prior-art LSM-Tree deferring and batching index changes through multi-level ordered data sets;
Fig. 2 is a schematic flow block diagram of a storage method for data in a key-value pair system provided in an embodiment of the invention;
Fig. 3-a is a schematic flow block diagram of another storage method for data in a key-value pair system provided in an embodiment of the invention;
Fig. 3-b is a schematic flow block diagram of a storage method for data in a key-value pair system provided in an embodiment of the invention;
Fig. 4-a is a schematic diagram of storing key-value pairs in an LSM-Tree in the prior art;
Fig. 4-b is a schematic diagram of one implementation of storing the Key data, the burst information, and the burst contents separately in an embodiment of the invention;
Fig. 4-c is a schematic diagram of another implementation of storing the Key data, the burst information, and the burst contents separately in an embodiment of the invention;
Fig. 4-d is a schematic diagram of an implementation in which the burst information of a large object participates in the rolling merge process in an embodiment of the invention;
Fig. 5-a is a schematic structural diagram of a storage device for data in a key-value pair system provided by an embodiment of the invention;
Fig. 5-b is a schematic structural diagram of another storage device for data in a key-value pair system provided by an embodiment of the invention;
Fig. 5-c is a schematic structural diagram of another storage device for data in a key-value pair system provided by an embodiment of the invention;
Fig. 6 is a schematic structural diagram of another storage device for data in a key-value pair system provided by an embodiment of the invention.
Detailed description
Embodiments of the invention provide a storage method and related devices for data in a key-value pair system, which store Value data with less I/O overhead and improve the efficiency of storage operations on Value data.
To make the objects, features, and advantages of the invention more apparent and easier to understand, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the embodiments described below are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention fall within the scope of protection of the invention.

Claims (14)

1. A storage method for data in a key-value pair system, characterized by comprising:
determining whether the data volume of the Value data in a key-value pair exceeds a data threshold, the key-value pair comprising Key data and Value data corresponding to the Key data;
if the data volume of the Value data exceeds the data threshold, slicing the Value data to obtain M burst contents, where M is a natural number greater than 1 and the M burst contents are the data content of the Value data after slicing;
generating burst information for N of the M burst contents, the burst information comprising: the number of slices into which the Value data were cut, the offset address of each of the N burst contents, the sequence number (ID) of each of the N burst contents, and the remaining (M-N) burst contents, where N is a natural number greater than 0 and N is less than or equal to M;
storing the Key data and the burst information in a log-structured merge tree (LSM-Tree), and storing the N burst contents in a key-value pair database, the Key data corresponding to the burst information.
2. The method according to claim 1, characterized in that the method further comprises:
if the data volume of the Value data does not exceed the data threshold, storing the key-value pair in the LSM-Tree.
3. The method according to claim 1, characterized in that said storing the Key data and the burst information in the log-structured merge tree (LSM-Tree) comprises:
storing the Key data and the burst information in a first ordered data set, the first ordered data set being located in memory;
when the first ordered data set exceeds a merge threshold, updating the burst information corresponding to the Key data in a second ordered data set to the burst information corresponding to the Key data in the first ordered data set, the second ordered data set being located on disk.
4. The method according to claim 3, characterized in that, after said updating the burst information corresponding to the Key data in the second ordered data set to the burst information corresponding to the Key data in the first ordered data set, the method further comprises:
deleting, from the first ordered data set, the Key data and the burst information corresponding to the Key data.
5. The method according to claim 1, characterized in that, after said storing the N burst contents in the key-value pair database, the method further comprises:
obtaining the data length to be read from Value data, the data offset address in the Value data from which to read, and the Key data corresponding to the Value data, the Value data being the Value data currently to be read;
looking up, in the LSM-Tree and according to the Key data, the first burst information corresponding to the Key data;
reading the required data of the Value data from the LSM-Tree and/or the key-value pair database according to the data offset address, the data length, and the first burst information corresponding to the Key data.
6. The method according to claim 5, characterized in that, after said reading the required data of the Value data from the LSM-Tree and/or the key-value pair database according to the data offset address, the data length, and the first burst information corresponding to the Key data, the method further comprises:
replacing the data that have been read with updated data in the burst content where the read data reside;
regenerating the first burst information for the burst contents of the Value data after the update;
storing the regenerated first burst information in the LSM-Tree.
7. The method according to claim 1, characterized in that, after said storing the N burst contents in the key-value pair database, the method further comprises:
obtaining the data length to be inserted into second Value data, the data offset address in the second Value data at which the insertion is to occur, and the second Key data corresponding to the second Value data;
looking up, in the LSM-Tree and according to the second Key data, the second burst information corresponding to the second Key data;
reading, from the LSM-Tree and/or the key-value pair database, the burst content located at the data offset address at which the insertion is to occur, according to that data offset address and the second burst information corresponding to the second Key data;
inserting the data to be inserted into the burst content located at that data offset address in the second Value data;
regenerating the second burst information for the burst contents of the second Value data after the insertion, with the burst content located at the insertion offset address, as modified by the insertion, carried in the regenerated second burst information, and storing the regenerated second burst information in the LSM-Tree; or, alternatively, regenerating the second burst information for the burst contents of the second Value data after the insertion, storing the regenerated second burst information in the LSM-Tree, and storing the burst content located at the insertion offset address, as modified by the insertion, in the key-value pair database.
8. A storage device for data in a key-value pair system, characterized by comprising:
a judgment module, configured to determine whether the data volume of the Value data in a key-value pair exceeds a data threshold, the key-value pair comprising Key data and Value data corresponding to the Key data;
a slicing module, configured to slice the Value data when the data volume of the Value data exceeds the data threshold, to obtain M burst contents, where M is a natural number greater than 1 and the M burst contents are the data content of the Value data after slicing;
an acquisition module, configured to generate burst information for N of the M burst contents, the burst information comprising: the number of slices into which the Value data were cut, the offset address of each of the N burst contents, the sequence number (ID) of each of the N burst contents, and the remaining (M-N) burst contents, where N is a natural number greater than 0 and N is less than or equal to M;
a storage module, configured to store the Key data and the burst information in a log-structured merge tree (LSM-Tree) and to store the N burst contents in a key-value pair database, the Key data corresponding to the burst information.
9. The device according to claim 8, characterized in that the storage module is further configured to store the key-value pair in the LSM-Tree when the data volume of the Value data does not exceed the data threshold.
10. The device according to claim 8, characterized in that the storage module comprises:
a storage submodule, configured to store the Key data and the burst information in a first ordered data set, the first ordered data set being located in memory;
a merge submodule, configured to, when the first ordered data set exceeds a merge threshold, update the burst information corresponding to the Key data in a second ordered data set to the burst information corresponding to the Key data in the first ordered data set, the second ordered data set being located on disk.
11. The device according to claim 10, characterized in that the device further comprises: a deletion module, configured to delete, from the first ordered data set, the Key data and the burst information corresponding to the Key data.
12. The device according to claim 8, characterized in that the device further comprises a search module and a read module, wherein:
the acquisition module is further configured to obtain the data length to be read from Value data, the data offset address in the Value data from which to read, and the Key data corresponding to the Value data, the Value data being the Value data currently to be read;
the search module is configured to look up, in the LSM-Tree and according to the Key data, the first burst information corresponding to the Key data;
the read module is configured to read the required data of the Value data from the LSM-Tree and/or the key-value pair database according to the data offset address, the data length, and the first burst information corresponding to the Key data.
13. The device according to claim 12, characterized in that the device further comprises an update module, wherein:
the update module is configured to replace the data that have been read with updated data in the burst content where the read data reside;
the acquisition module is further configured to regenerate the first burst information for the burst contents of the Value data after the update;
the storage module is further configured to store the regenerated first burst information in the LSM-Tree.
14. The device according to claim 8, characterized in that the device further comprises a search module, a read module, and an insert module, wherein:
the acquisition module is further configured to obtain the data length to be inserted into second Value data, the data offset address in the second Value data at which the insertion is to occur, and the second Key data corresponding to the second Value data;
the search module is configured to look up, in the LSM-Tree and according to the second Key data, the second burst information corresponding to the second Key data;
the read module is configured to read, from the LSM-Tree and/or the key-value pair database, the burst content located at the data offset address at which the insertion is to occur, according to that data offset address and the second burst information corresponding to the second Key data;
the insert module is configured to insert the data to be inserted into the burst content located at that data offset address in the second Value data;
the acquisition module is further configured to regenerate the second burst information for the burst contents of the second Value data after the insertion, with the burst content located at the insertion offset address, as modified by the insertion, carried in the regenerated second burst information, and the storage module is further configured to store the regenerated second burst information in the LSM-Tree; or, alternatively, the acquisition module is further configured to regenerate the second burst information for the burst contents of the second Value data after the insertion, and the storage module is further configured to store the regenerated second burst information in the LSM-Tree and to store the burst content located at the insertion offset address, as modified by the insertion, in the key-value pair database.
CN201310172455.7A 2013-05-10 2013-05-10 The storage method and relevant apparatus of data in a kind of key-value pair system Active CN104142958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310172455.7A CN104142958B (en) 2013-05-10 2013-05-10 The storage method and relevant apparatus of data in a kind of key-value pair system

Publications (2)

Publication Number Publication Date
CN104142958A true CN104142958A (en) 2014-11-12
CN104142958B CN104142958B (en) 2018-03-13

Family

ID=51852132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310172455.7A Active CN104142958B (en) 2013-05-10 2013-05-10 The storage method and relevant apparatus of data in a kind of key-value pair system

Country Status (1)

Country Link
CN (1) CN104142958B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7697518B1 (en) * 2006-09-15 2010-04-13 Netlogic Microsystems, Inc. Integrated search engine devices and methods of updating same using node splitting and merging operations
CN102541968A (en) * 2010-12-31 2012-07-04 百度在线网络技术(北京)有限公司 Indexing method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JERMAINE C. et al.: "The partitioned exponential file for database storage management", The VLDB Journal *
HU Hao et al.: "Design and implementation of XDB, a high-performance Key/Value database", 《计算机工程与科学》 (Computer Engineering & Science) *

Cited By (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105989129A (en) * 2015-02-15 2016-10-05 腾讯科技(深圳)有限公司 Real-time data statistic method and device
CN105989129B (en) * 2015-02-15 2019-03-26 腾讯科技(深圳)有限公司 Real time data statistical method and device
CN106156197A (en) * 2015-04-22 2016-11-23 中兴通讯股份有限公司 The querying method of a kind of data base and device
WO2016169322A1 (en) * 2015-04-22 2016-10-27 中兴通讯股份有限公司 Query method and device for database, and computer storage medium
CN104809237A (en) * 2015-05-12 2015-07-29 百度在线网络技术(北京)有限公司 LSM-tree (The Log-Structured Merge-Tree) index optimization method and LSM-tree index optimization system
CN104809237B (en) * 2015-05-12 2018-12-14 百度在线网络技术(北京)有限公司 The optimization method and device of LSM-tree index
CN105138622B (en) * 2015-08-14 2018-05-22 中国科学院计算技术研究所 For the insertion operation of LSM tree storage systems and reading and the merging method of load
CN105138622A (en) * 2015-08-14 2015-12-09 中国科学院计算技术研究所 Append operation method for LSM tree memory system and reading and merging method for loads of append operation
CN108351900A (en) * 2015-10-07 2018-07-31 甲骨文国际公司 Relational database tissue for fragment
CN105487820A (en) * 2015-11-30 2016-04-13 中国科学院信息工程研究所 Time slice rotation mechanism based tree storage structure write amplification optimization method
CN105487820B (en) * 2015-11-30 2018-11-16 中国科学院信息工程研究所 A kind of tree-like storage structure based on round-robin mechanism writes amplification optimization method
CN105447200A (en) * 2015-12-30 2016-03-30 金蝶软件(中国)有限公司 Data processing method and data processing apparatus
CN106227769A (en) * 2016-07-15 2016-12-14 北京奇虎科技有限公司 Date storage method and device
CN106227769B (en) * 2016-07-15 2019-11-26 北京奇虎科技有限公司 Date storage method and device
CN106682184A (en) * 2016-12-29 2017-05-17 华中科技大学 Light-weight combination method based on log combination tree structure
CN106682184B (en) * 2016-12-29 2019-12-20 华中科技大学 Lightweight merging method based on log merging tree structure
CN107038206A (en) * 2017-01-17 2017-08-11 阿里巴巴集团控股有限公司 The method for building up of LSM trees, the method for reading data and server of LSM trees
CN106844650A (en) * 2017-01-20 2017-06-13 中国科学院计算技术研究所 A kind of daily record merges the merging method and system of tree
CN106874459A (en) * 2017-02-14 2017-06-20 北京奇虎科技有限公司 Stream data storage method and device
CN106874459B (en) * 2017-02-14 2020-07-10 北京奇虎科技有限公司 Streaming data storage method and device
CN111226205A (en) * 2017-08-31 2020-06-02 美光科技公司 KVS tree database
CN111226205B (en) * 2017-08-31 2021-08-31 美光科技公司 KVS tree database
CN107526550A (en) * 2017-09-06 2017-12-29 中国人民大学 A kind of two benches merging method based on log-structured merging tree
CN107526550B (en) * 2017-09-06 2020-01-17 中国人民大学 Two-stage merging method based on log structure merging tree
CN108052643A (en) * 2017-12-22 2018-05-18 北京奇虎科技有限公司 Date storage method, device and storage engines based on LSM Tree structures
CN108153911A (en) * 2018-01-24 2018-06-12 广西师范学院 The distributed cloud storage method of data
CN108153911B (en) * 2018-01-24 2022-07-19 广西师范学院 Distributed cloud storage method of data
CN109684334A (en) * 2018-12-26 2019-04-26 百度在线网络技术(北京)有限公司 Date storage method, device, equipment and the storage medium of key-value pair storage system
CN109656886B (en) * 2018-12-26 2021-11-09 百度在线网络技术(北京)有限公司 Key value pair-based file system implementation method, device, equipment and storage medium
CN109656886A (en) * 2018-12-26 2019-04-19 百度在线网络技术(北京)有限公司 File system implementation method, device, equipment and storage medium based on key-value pair
CN111444138B (en) * 2019-01-16 2024-03-15 深圳市茁壮网络股份有限公司 File local modification method and system
CN111444138A (en) * 2019-01-16 2020-07-24 深圳市茁壮网络股份有限公司 File local modification method and system
CN110309110A (en) * 2019-05-24 2019-10-08 深圳壹账通智能科技有限公司 Big data log monitoring method and device, storage medium and computer equipment
WO2020248598A1 (en) * 2019-06-13 2020-12-17 创新先进技术有限公司 Data block storage method and apparatus, and electronic device
CN110377227B (en) * 2019-06-13 2020-07-07 阿里巴巴集团控股有限公司 Data block storage method and device and electronic equipment
CN110377227A (en) * 2019-06-13 2019-10-25 阿里巴巴集团控股有限公司 Data block storage method and device and electronic equipment
CN110704453A (en) * 2019-10-15 2020-01-17 腾讯音乐娱乐科技(深圳)有限公司 Data query method and device, storage medium and electronic equipment
WO2021082928A1 (en) * 2019-11-01 2021-05-06 华为技术有限公司 Data reduction method and apparatus, computing device, and storage medium
CN111104403B (en) * 2019-11-30 2022-06-07 北京浪潮数据技术有限公司 LSM tree data processing method, system, equipment and computer medium
CN111104403A (en) * 2019-11-30 2020-05-05 北京浪潮数据技术有限公司 LSM tree data processing method, system, equipment and computer medium
CN111046041A (en) * 2019-12-09 2020-04-21 珠海格力电器股份有限公司 Data processing method and device, storage medium and processor
CN111046041B (en) * 2019-12-09 2024-02-27 珠海格力电器股份有限公司 Data processing method and device, storage medium and processor
CN111241108A (en) * 2020-01-16 2020-06-05 北京百度网讯科技有限公司 Key value pair-based KV system indexing method and device, electronic equipment and medium
CN111241108B (en) * 2020-01-16 2023-12-26 北京百度网讯科技有限公司 Key value based indexing method and device for KV system, electronic equipment and medium
CN112527804B (en) * 2021-01-27 2022-09-16 中智关爱通(上海)科技股份有限公司 File storage method, file reading method and data storage system
CN112527804A (en) * 2021-01-27 2021-03-19 中智关爱通(南京)信息科技有限公司 File storage method, file reading method and data storage system
US11513704B1 (en) 2021-08-16 2022-11-29 International Business Machines Corporation Selectively evicting data from internal memory during record processing
US11675513B2 (en) 2021-08-16 2023-06-13 International Business Machines Corporation Selectively shearing data when manipulating data during record processing

Also Published As

Publication number Publication date
CN104142958B (en) 2018-03-13

Similar Documents

Publication Publication Date Title
CN104142958A (en) Storage method for data in Key-Value system and related device
JP4669067B2 (en) Dynamic fragment mapping
CN107391774B (en) Garbage collection method for log file system based on data deduplication
CN109416694A (en) Key-value storage system including resource-efficient index
WO2017065885A1 (en) Distributed pipeline optimization data preparation
CN109076021B (en) Data processing method and device
EP3362916B1 (en) Signature-based cache optimization for data preparation
CN104572920A (en) Data arrangement method and data arrangement device
CN103186622B (en) Method and device for updating index information in a text retrieval system
CN111832065A (en) Software implemented using circuitry and method for key-value storage
CN103914483B (en) File storage method and device, and file reading method and device
WO2010062554A2 (en) Index compression in databases
WO2015152830A1 (en) Method of maintaining data consistency
CN116450656B (en) Data processing method, device, equipment and storage medium
CN104424219A (en) Method and equipment of managing data documents
CN103246549A (en) Method and system for data transfer
CN111126625A (en) Extensible learning index method and system
WO2017065888A1 (en) Step editor for data preparation
CN103885721A (en) Data storing or reading method and device for key-value system
CN111831691B (en) Data reading and writing method and device, electronic equipment and storage medium
CN107423321B (en) Method and device suitable for cloud storage of large-batch small files
EP3362808A1 (en) Cache optimization for data preparation
CN108804571B (en) Data storage method, device and equipment
CN102955808A (en) Data acquisition method and distributed file system
CN110515897B (en) Method and system for optimizing reading performance of LSM storage system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant