CN106886375B - Method and apparatus for storing data - Google Patents

Method and apparatus for storing data

Info

Publication number
CN106886375B
Authority
CN
China
Prior art keywords
key
write
value pair
pair data
log file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710187260.8A
Other languages
Chinese (zh)
Other versions
CN106886375A (en)
Inventor
覃安
陈佳捷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201710187260.8A
Publication of CN106886375A
Application granted
Publication of CN106886375B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device

Abstract

This application discloses a method and apparatus for storing data. One specific embodiment of the method includes: obtaining key-value pair data to be stored; storing the key-value pair data in a write-ahead log file on disk; and generating, in memory, an index entry for the key-value pair data according to its storage location on disk and the key in the key-value pair data, so that a predetermined operation can be performed on the key-value pair data through the index entry and, when a write-ahead log file that meets a merge-and-compaction condition is detected, the key-value pair data retained in that write-ahead log file is append-written to disk according to the index. The storage location includes at least one of the following: the file name of the write-ahead log file, and the offset of the key-value pair data from the start of the write-ahead log file. This embodiment can reduce write amplification and improve the effectiveness of data storage.

Description

Method and apparatus for storing data
Technical field
This application relates to the field of computer technology, specifically to the field of data processing, and more particularly to a method and apparatus for storing data.
Background
Write amplification (WA) is a phenomenon in flash memory and solid-state drives (SSDs) in which the amount of physical information actually written is a multiple of the logical amount intended to be written. Because flash memory must be erased before it can be rewritten, carrying out these operations moves (or rewrites) user data and metadata more than once, which multiplies the amount of physical information written and increases the number of write requests.
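The ratio described above is commonly written as a write amplification factor. The symbols below are notation introduced here for illustration and do not appear in the patent text:

```latex
\mathrm{WA} = \frac{B_{\text{physical}}}{B_{\text{logical}}}
```

Here $B_{\text{logical}}$ is the amount of data the host asked to write and $B_{\text{physical}}$ is the amount actually written to the medium. For instance, if a 4 KB logical write ends up causing 12 KB of physical writes, then $\mathrm{WA} = 3$.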
As a core component for developing programs or systems on an electronic platform, an engine lets developers quickly build and lay out the functions a program needs, or supports the running of the program. An existing engine architecture generally comprises three parts: a log module, an index module, and a storage module. The log module holds data consistent with what is in memory, and the storage module holds data that has been moved out of the log module. When data is moved from the log module into the storage module, redundant data must be filtered out: the index module reads key-value pairs from the log module and removes duplicate or deleted keys, after which the retained key-value pairs are written to the storage module. This read-and-write process suffers from write amplification, which increases the demand for disk space and lowers the effectiveness of data storage.
Summary of the invention
The purpose of this application is to propose an improved method and apparatus for storing data, to solve the technical problems mentioned in the Background section above.
In a first aspect, an embodiment of this application provides a method for storing data. The method includes: obtaining key-value pair data to be stored; storing the key-value pair data in a write-ahead log file on disk; and generating, in memory, an index entry for the key-value pair data according to its storage location on disk and the key in the key-value pair data, so that a predetermined operation can be performed on the key-value pair data through the index entry and, when a write-ahead log file that meets a merge-and-compaction condition is detected, the key-value pair data retained in that write-ahead log file is append-written to disk according to the index. The storage location includes at least one of the following: the file name of the write-ahead log file, and the offset of the key-value pair data from the start of the write-ahead log file.
In some embodiments, when the predetermined operation includes a read operation, performing the predetermined operation on the key-value pair data stored in the write-ahead log file through the index entry includes: looking up, in the index, the index entry corresponding to the key-value pair data to be read, to obtain the storage location of that data; and reading the key-value pair data from the write-ahead log file according to the storage location.
In some embodiments, when the predetermined operation includes a delete operation, performing the predetermined operation on the key-value pair data stored in the write-ahead log file through the index entry includes: deleting, from the index, the index entry of the key-value pair data to be deleted, according to the key in the key-value pair data.
In some embodiments, the method further includes: deleting the write-ahead log file whose retained data has been append-written to disk; and updating the index in memory.
In some embodiments, the merge-and-compaction condition includes at least one of the following: the space occupied by the write-ahead log file exceeds a preset file space threshold; the ratio of the space occupied by the write-ahead log file to its total space exceeds a preset file space ratio threshold; the number of deleted key-value pairs in the write-ahead log file exceeds a preset threshold; the ratio of the space occupied by deleted key-value pair data in the write-ahead log file to the total space exceeds a preset deletion ratio threshold; the occupied space of the disk exceeds a preset disk space threshold; the ratio of the occupied space of the disk to the total disk capacity exceeds a preset disk space ratio threshold.
In some embodiments, the index in memory includes a trie (dictionary tree) generated from the keys of the key-value pair data in the write-ahead log files on disk. In the trie, each path starting from the root node corresponds to the key of one piece of key-value pair data, and the last node of each path stores the storage location of the key-value pair data corresponding to that path.
In a second aspect, an embodiment of this application provides an apparatus for storing data. The apparatus includes: an obtaining module configured to obtain key-value pair data to be stored; a storage module configured to store the key-value pair data in a write-ahead log file on disk; and an index entry generation module configured to generate, in memory, an index entry for the key-value pair data according to its storage location on disk and the key in the key-value pair data, so that a data operation module can perform a predetermined operation on the key-value pair data through the index entry and, when a write-ahead log file that meets a merge-and-compaction condition is detected, the key-value pair data retained in that write-ahead log file is append-written to disk according to the index. The storage location includes at least one of the following: the file name of the write-ahead log file, and the offset of the key-value pair data from the start of the write-ahead log file.
In some embodiments, when the predetermined operation includes a read operation, the data operation module is configured to: look up, in the index, the index entry corresponding to the key-value pair data to be read, to obtain the storage location of that data; and read the key-value pair data from the write-ahead log file according to the storage location.
In some embodiments, when the predetermined operation includes a delete operation, the data operation module is configured to: delete, from the index, the index entry of the key-value pair data to be deleted, according to the key in the key-value pair data.
In some embodiments, the apparatus further includes a deletion module configured to delete the write-ahead log file whose retained data has been append-written to disk; the index entry generation module is further configured to update the index in memory.
In some embodiments, the merge-and-compaction condition includes at least one of the following: the space occupied by the write-ahead log file exceeds a preset file space threshold; the ratio of the space occupied by the write-ahead log file to its total space exceeds a preset file space ratio threshold; the number of deleted key-value pairs in the write-ahead log file exceeds a preset threshold; the ratio of the space occupied by deleted key-value pair data in the write-ahead log file to the total space exceeds a preset deletion ratio threshold; the occupied space of the disk exceeds a preset disk space threshold; the ratio of the occupied space of the disk to the total disk capacity exceeds a preset disk space ratio threshold.
In some embodiments, the index in memory includes a trie (dictionary tree) generated from the keys of the key-value pair data in the write-ahead log files on disk. In the trie, each path starting from the root node corresponds to the key of one piece of key-value pair data, and the last node of each path stores the storage location of the key-value pair data corresponding to that path.
In a third aspect, an embodiment of this application provides a storage device, comprising: one or more processors; a program storage apparatus for storing one or more programs; and a disk for storing one or more write-ahead log files. When the one or more programs are executed by the one or more processors, the one or more processors implement any of the methods described above.
The method and apparatus for storing data provided by the embodiments of this application obtain key-value pair data to be stored, store the key-value pair data in a write-ahead log file on disk, and then generate, in memory, an index entry for the key-value pair data according to its storage location on disk and its key. Predetermined operations can then be performed, through the index entry, on the key-value pair data stored in the write-ahead log file, and when a write-ahead log file that meets the merge-and-compaction condition is detected, the key-value pair data retained in that file is append-written to disk according to the index. Because the key-value pair data is stored in the write-ahead log file and operated on through the index entries in memory, a merge-and-compaction operation writes only the retained key-value pair data, in append mode. This reduces write amplification and improves the effectiveness of data storage.
Brief description of the drawings
Other features, objects, and advantages of this application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 is a diagram of an exemplary system architecture to which this application can be applied;
Fig. 2 is a flowchart of one embodiment of the method for storing data according to this application;
Fig. 3 is a schematic diagram of a trie-form index in an optional implementation of the method for storing data according to this application;
Fig. 4 is a schematic diagram of an application scenario of the method for storing data according to this application;
Fig. 5 is a schematic diagram of the process of deleting key-value pair data in the application scenario shown in Fig. 4;
Fig. 6 is a structural schematic diagram of one embodiment of the apparatus for storing data according to this application;
Fig. 7 is a structural schematic diagram of a computer system suitable for implementing the terminal device or server of the embodiments of this application.
Detailed description of embodiments
The application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, provided they do not conflict, the embodiments of this application and the features in those embodiments can be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method or apparatus for storing data of this application can be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user can use the terminal devices 101, 102, 103 to interact with the server 105 over the network 104, to receive or send messages and the like. Various communication client applications can be installed on the terminal devices 101, 102, 103, such as web browser applications, data storage applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software.
The terminal devices 101, 102, 103 can be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, and desktop computers.
The server 105 can be a server that provides various services, for example a web server that supports the web pages displayed on the terminal devices 101, 102, 103. The server 105 can analyze and otherwise process the data it receives (such as search requests). For example, the server 105 can receive search requests sent from the terminal devices 101, 102, 103 and analyze and store the search keyword data they contain.
It should be noted that the method for storing data provided by the embodiments of this application is generally executed by the server 105; correspondingly, the apparatus for storing data is generally located in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There can be any number of terminal devices, networks, and servers, as the implementation requires.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for storing data according to this application is shown. The method for storing data includes the following steps:
Step 201: obtain key-value pair data to be stored.
In general, in a distributed storage system, key-value pairs can be stored in a distributed manner with the key used as an index, which enables fast, highly concurrent queries over large volumes of data. In this embodiment, the electronic device on which the method for storing data runs (for example, the server 105 shown in Fig. 1) can obtain the key-value data to be stored locally or remotely. The key-value data to be stored can be any key-value data that meets the conditions for data stored by the electronic device.
As an example, the key-value data to be stored by the electronic device may take the following form: the key is a search keyword entered by a user, and the value consists of other search keywords related to it, such as its synonyms, near-synonyms, and related terms. For example, if the search keyword is "Beijing weather", the related search keywords may include "Beijing weather", "capital weather", "Beijing's weather", and so on, and the following key-value data can be generated: key = "Beijing weather", value = "Beijing weather, capital weather, Beijing's weather, ...". When the electronic device is a background server that supports a web browser, it can store this key-value data itself, or generate it from search keywords collected from terminal devices, and can therefore obtain the data locally. When the electronic device is another device connected to such a background server, it can obtain the data remotely. It will be appreciated that in practice there are many other ways in which the electronic device running the method can obtain the key-value data to be stored, locally or remotely; they are not enumerated here. The method of generating key-value data is not the inventive point of this embodiment and can use various well-known techniques, which are not described further here.
Step 202: store the key-value pair data in a write-ahead log file on disk.
In this embodiment, the electronic device (for example, the server 105 shown in Fig. 1) can then store the key-value data to be stored, obtained in step 201, in a write-ahead log (WAL) file on disk.
The disk can be a magnetic platter used to store data; data is read and written by a magnetic head moving over the platter. The disk can hold multiple write-ahead log files. In this embodiment, the head can move in one direction and write the key-value data to be stored into write-ahead log files in the order of the available files. The electronic device can indicate whether a write-ahead log file is available by setting a file flag (such as a flag bit). For example, a write-ahead log file that holds data is unavailable. When the data of a write-ahead log file is deleted, the electronic device can change that file's flag to available (for example, by setting the bit). When the last available write-ahead log file on the disk is full, the electronic device can direct the head to scan sequentially from the starting position for available write-ahead log files and write data into them.
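The file-selection rule in the paragraph above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `next_available` and the representation of the file flags as a list of booleans are assumptions made here.

```python
def next_available(available, start):
    """Return the position of the next available WAL file at or after `start`,
    wrapping around to the first file once the last one has been checked.
    Returns None if no file is available."""
    count = len(available)
    for step in range(count):
        candidate = (start + step) % count
        if available[candidate]:
            return candidate
    return None

# Flags for four WAL files laid out in order; True means the file can be written.
flags = [True, False, False, True]
current = next_available(flags, 1)            # files 1 and 2 hold data, so file 3 is chosen
flags[current] = False                        # file 3 now holds data
current = next_available(flags, current + 1)  # past the last file: wraps to file 0
```

Scanning forward and wrapping only at the end mirrors the one-directional head movement the text describes, so writes stay sequential until the last file is used.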
Step 203: generate, in memory, an index entry for the key-value pair data according to its storage location on disk and the key in the key-value pair data.
In this embodiment, the electronic device can then generate, in memory, an index entry for the key-value data according to its storage location on disk and its key. The electronic device can thereby perform predetermined operations, through the index entry, on the key-value data stored in the write-ahead log files on disk and, when it detects a write-ahead log file that meets the merge-and-compaction condition, append-write to disk, according to the index, the key-value data retained in that file. In practice, the electronic device can traverse the key-value data in the write-ahead log file that meets the condition and look up each key in the index. If the index entry for a piece of key-value data is found and carries no deletion mark, that data is determined to need retaining. Otherwise, if no corresponding index entry is found, or the entry is found but is marked as deleted, the data is not retained.
The merge-and-compaction condition can include, but is not limited to, at least one of the following: the space occupied by the write-ahead log file exceeds a preset file space threshold (such as 15 gigabytes); the ratio of the space occupied by the write-ahead log file to its total space exceeds a preset file space ratio threshold (such as 99%); the number of deleted key-value pairs in the write-ahead log file exceeds a preset threshold (such as 50); the ratio of the space occupied by deleted key-value pair data in the write-ahead log file to the total space exceeds a preset deletion ratio threshold (such as 90%); the occupied space of the disk exceeds a preset disk space threshold (such as 15 terabytes); the ratio of the occupied space of the disk to the total disk capacity exceeds a preset disk space ratio threshold (such as 90%); and so on.
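The conditions listed above can be checked with a single predicate. The sketch below is illustrative only: `WalStats` and `should_compact` are hypothetical names, the thresholds are the example values from the text, binary units (1 GB = 1024³ bytes) are an assumption, and "total space" in the deletion ratio is taken to mean the total space of the log file.

```python
from dataclasses import dataclass

@dataclass
class WalStats:
    file_used: int      # bytes occupied in the write-ahead log file
    file_total: int     # total bytes allotted to the file (assumed > 0)
    deleted_count: int  # number of deleted key-value pairs in the file
    deleted_bytes: int  # bytes occupied by deleted key-value pairs
    disk_used: int      # bytes occupied on the disk
    disk_total: int     # total capacity of the disk in bytes (assumed > 0)

GB = 1024 ** 3
TB = 1024 ** 4

def should_compact(s: WalStats) -> bool:
    """Return True if at least one merge-and-compaction condition holds."""
    return any((
        s.file_used > 15 * GB,                  # file space threshold
        s.file_used / s.file_total > 0.99,      # file space ratio threshold
        s.deleted_count > 50,                   # deleted-pair count threshold
        s.deleted_bytes / s.file_total > 0.90,  # deletion ratio threshold
        s.disk_used > 15 * TB,                  # disk space threshold
        s.disk_used / s.disk_total > 0.90,      # disk space ratio threshold
    ))
```

Because the text says "at least one of", the conditions are combined with a logical OR; a deployment could equally enable only a subset of them.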
Here, the storage location can include, but is not limited to, at least one of the following: the file name of the write-ahead log file, and the offset of the key-value pair data from the start of the write-ahead log file. The offset can be expressed as a number of bytes (such as 1 byte) or as a number of bits (such as 8 bits); this application places no limit on this.
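A storage location of this kind can be modelled as a small record. The class below is a sketch under stated assumptions: the name `StorageLocation`, its field names, and the `unit` field are introduced here for illustration and are not taken from the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StorageLocation:
    file_name: str      # name of the write-ahead log file
    offset: int         # distance from the start of the file
    unit: str = "byte"  # "byte" or "bit"; the text allows either

    def byte_offset(self) -> int:
        """Return the offset in bytes, whichever unit it was recorded in."""
        if self.unit == "bit":
            return self.offset // 8
        return self.offset
```

For example, `StorageLocation("wal-0001.log", 8, "bit")` and `StorageLocation("wal-0001.log", 1)` refer to the same position, matching the 8-bit and 1-byte examples in the text.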
The predetermined operation here can be a basic operation on data, including but not limited to a read operation, a write operation, or a delete operation. It can also be a compound operation made up of basic operations, such as a merge-and-compaction operation, which can include one or more read, write, and delete operations.
Taking the write operation as an example, the electronic device can write the key-value data to be stored into write-ahead log files in the order of the available files, and then generate, in memory, an index entry for the key-value pair data according to its storage location on disk and its key.
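The write path just described can be sketched as an append followed by an index update. This is a hedged illustration: the record layout (an 8-byte header holding the key and value lengths), the function name `append_record`, and the use of a plain dictionary as the index are all assumptions made here; the patent does not specify an on-disk format.

```python
import os
import struct

HEADER = struct.Struct(">II")  # hypothetical record header: key length, value length

def append_record(wal_path, key, value, index):
    """Append one key-value record to the WAL file and index it by
    (file name, offset). `key` and `value` are bytes."""
    with open(wal_path, "ab") as wal:
        wal.seek(0, os.SEEK_END)
        offset = wal.tell()          # offset from the start of the file, in bytes
        wal.write(HEADER.pack(len(key), len(value)) + key + value)
    index[key] = (wal_path, offset)  # in-memory index entry for this key
    return offset
```

Every write lands at the end of the file, so the only bookkeeping needed is to remember where each record starts.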
Taking the read operation as an example, the electronic device can first find the corresponding index entry in the index according to the key of the key-value data to be read, and obtain from the index entry the storage location on disk of the key-value data for that key, such as the file name of the write-ahead log file and the offset of the data from the start of that file. The electronic device can then locate the write-ahead log file that holds the key-value data according to the storage location, find the data within the file according to its offset from the start of the file, and read it.
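The read path can be sketched as an index lookup followed by a single seek. As before, this is an illustration under assumptions: the 8-byte length header, the function name `read_record`, and the dictionary index mapping each key to a `(file name, offset)` tuple are hypothetical.

```python
import struct

HEADER = struct.Struct(">II")  # hypothetical record header: key length, value length

def read_record(index, key):
    """Look up the key in the in-memory index, then read its record
    from the WAL file the index entry points at."""
    entry = index.get(key)
    if entry is None:
        return None                  # no index entry: never stored, or deleted
    wal_path, offset = entry
    with open(wal_path, "rb") as wal:
        wal.seek(offset)             # jump straight to the start of the record
        key_len, value_len = HEADER.unpack(wal.read(HEADER.size))
        stored_key = wal.read(key_len)
        value = wal.read(value_len)
    if stored_key != key:
        raise ValueError("index entry does not point at the expected key")
    return value
```

The lookup never scans the log file: the cost of a read is one in-memory index lookup plus one positioned file read.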
Taking the delete operation as an example, the electronic device can find the corresponding index entry in the index according to the key of the key-value data to be deleted, and delete that index entry. Based on the index entry, the electronic device can locate and delete the corresponding key-value data on disk, mark that data on disk as deleted, or leave it untouched for the time being. When a write-ahead log file that meets the merge-and-compaction condition is detected, the key-value data retained in that file is append-written to disk according to the index. Because the delete operation has already removed the index entry of the data to be deleted, that data is not written again at this point; only the data to be retained is written. As a result, during merging and compaction only the retained key-value data temporarily occupies two copies of space, and the volume of data read and written is greatly reduced, which reduces write amplification.
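A delete that touches only the index can be sketched as follows. The names `delete_key` and `is_live`, the `mark_only` flag, and the dictionary-shaped index entries are assumptions made for illustration; the sketch covers the variant in which the record on disk is left untouched until compaction.

```python
def delete_key(index, key, mark_only=False):
    """Delete a key by touching only the in-memory index.

    With mark_only=False the index entry is removed; with mark_only=True it
    is kept but flagged as deleted. The record in the WAL file is left in
    place and is dropped later, when the file is merged and compacted.
    Returns True if the key was present and live.
    """
    entry = index.get(key)
    if entry is None or entry.get("deleted"):
        return False
    if mark_only:
        entry["deleted"] = True
    else:
        del index[key]
    return True

def is_live(index, key):
    """A record is retained during compaction only if its key is live."""
    entry = index.get(key)
    return entry is not None and not entry.get("deleted", False)
```

Either variant makes the delete itself a pure in-memory operation, which is why no data needs to be rewritten at delete time.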
In some optional implementations of this embodiment, after performing the merge-and-compaction operation on a write-ahead log file, the electronic device can delete the pre-compaction write-ahead log file to free disk space. In addition, because the storage locations of the key-value pair data from that file have changed, the electronic device can update the index in memory according to the new storage locations, so that subsequent operations on the corresponding key-value pair data can go through the index.
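Merging, index update, and deletion of the old file can be sketched together. This is a minimal illustration, not the patented implementation: the record layout and the name `compact` are hypothetical, and the check that an index entry still points at the exact record being scanned is an assumption added here so that overwritten values are dropped as well; the text itself only requires that an unmarked index entry exist for the key.

```python
import os
import struct

HEADER = struct.Struct(">II")  # hypothetical record header: key length, value length

def compact(old_path, new_path, index):
    """Append the live records of old_path to new_path, repoint their index
    entries, then delete old_path. Returns the number of records retained."""
    retained = 0
    with open(old_path, "rb") as old, open(new_path, "ab") as new:
        new.seek(0, os.SEEK_END)
        while True:
            offset = old.tell()
            header = old.read(HEADER.size)
            if len(header) < HEADER.size:
                break                                # end of file
            key_len, value_len = HEADER.unpack(header)
            key = old.read(key_len)
            value = old.read(value_len)
            # Retain only records the index still points at; deleted or
            # overwritten records have no such entry and are skipped.
            if index.get(key) == (old_path, offset):
                new_offset = new.tell()
                new.write(header + key + value)      # append-only write
                index[key] = (new_path, new_offset)  # update the in-memory index
                retained += 1
    os.remove(old_path)                              # free the disk space
    return retained
```

Only retained records are copied, so the extra data written by a compaction is bounded by the live data in the file rather than by the file size.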
In some optional implementations of this embodiment, the index in memory may include a trie (dictionary tree) generated from the keys of the key-value data in the write-ahead log files on disk. In the trie, each path starting from the root node can correspond to the key of one piece of key-value pair data, and the last node of each path can store the storage location of that data. As an example, suppose the keys of the stored key-value data are the search keywords users entered. In the resulting trie, shown in Fig. 3: on the path from root node 300 to node 311, nodes 300 and 310 correspond to the key "west gate", and node 311 stores the storage address "address 1" of the key-value pair data for that key; on the path from root node 300 to node 313, nodes 300, 310, and 312 correspond to the key "Xi Menqing", and node 313 stores the storage address "address 2" of the key-value data for that key; on the path from root node 300 to node 321, nodes 300 and 320 correspond to the key "watermelon", and node 321 stores the storage address "address 3" of the key-value pair data for that key; and so on.
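A trie index of this kind can be sketched in a few lines. The class names `TrieNode` and `TrieIndex` are introduced here for illustration. As a simplification, the sketch stores the location on the node where the key ends rather than in a separate terminal node as drawn in Fig. 3, and it branches on the characters of the translated keys, so the shared prefixes differ from those in the figure.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> child node
        self.location = None  # storage location, set only where a key ends

class TrieIndex:
    def __init__(self):
        self.root = TrieNode()

    def put(self, key, location):
        """Walk or create the path for `key` and store its location at the end."""
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.location = location

    def get(self, key):
        """Follow the path for `key`; return its location or None."""
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.location

index = TrieIndex()
index.put("west gate", "address 1")
index.put("Xi Menqing", "address 2")
index.put("watermelon", "address 3")
```

Keys that share a prefix share the nodes along that prefix, which keeps the in-memory index compact when many keys begin the same way.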
With continued reference to Fig. 4, Fig. 4 is a schematic diagram of an application scenario of the method for storing data according to this embodiment. In the application scenario of Fig. 4, the method is applied, for example, to a background server 410 that supports a web browser. A user can enter a search keyword 421 on a terminal device 420 running the web browser to retrieve web pages. The background server 410 can provide search results 422, 423, 424, and so on, based on the search keyword 421 entered by the user and its synonyms, near-synonyms, and related terms, and present them to the user through the terminal device 420. The background server 410 can also classify and expand search keywords based on the search keywords of all users, using methods such as clustering and semantic analysis. For example, it can store a group of synonyms, near-synonyms, and related terms as the value, with one of the words as the key, so that when a user searches for web pages with the search keyword 421, results corresponding to the search keyword 421 and its synonyms, near-synonyms, and related terms are retrieved and provided to the user.
In this application scenario, on receiving a search keyword entered by a user, the background server 410 can first match it against the keys in the index. The match can be a synonym, near-synonym, or related-term match. If a corresponding key is matched in the index, the background server 410 can determine that this search data does not need to be stored. If no corresponding key is matched, the background server can generate key-value data with the search keyword as the key and its synonyms, near-synonyms, and related terms as the value, and treat it as the key-value data to be stored. The background server 410 can then store this key-value data in a write-ahead log file on disk. In this application scenario, the magnetic head writes data sequentially through the available write-ahead log files, and after each read it returns to the position where data was being written, so the background server can store the key-value data to be stored in the write-ahead log file the head currently points to. An index entry for the key-value data is then generated in memory according to the data's storage location on disk and its key, so that predetermined operations can be performed on the key-value data through the index entry.
It can be appreciated that each period tends to produce its own network hot words. Assume that the background server 410 stores a predetermined number (e.g., 1000) of hot words per time period (e.g., 1 day); history hot words that are no longer hot words in the current time period then need to be deleted. Referring to Fig. 5, assume that the key-value pair data x corresponding to a certain history hot word needs to be deleted. In step 501, the background server can delete only the index entry of key-value pair data x from the index. When the merging finishing condition of the write-ahead log file where key-value pair data x is located is met (e.g., the deleted data in the write-ahead log file reaches 90%), the background server 410 can traverse that write-ahead log file. As shown in step 502, for key-value pair data x, the background server 410 matches its key against the keys in the index; no corresponding key can be matched at this time, so the index entry of key-value pair data x is empty. As shown in step 503, assume that the index entry of another key-value pair data y in the same write-ahead log file exists; the background server 410 can then store key-value pair data y to the disk by append writing and update the index entry of key-value pair data y in the index. Then, as shown in step 504, after traversing the write-ahead log file where key-value pair data x is located, the background server 410 can delete the entire file.
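Steps 501 to 504 can be illustrated with the following minimal sketch. It is offered under assumptions only: the record layout, the file names `wal-1.log` and `wal-2.log`, and the helper names `append`, `scan`, and `merge` are hypothetical and are not taken from the embodiment.

```python
import os
import struct
import tempfile

HEADER = struct.Struct(">II")  # assumed layout: key length, value length


def append(directory, name, key, value):
    """Append one record to the named log file and return its offset."""
    k, v = key.encode("utf-8"), value.encode("utf-8")
    with open(os.path.join(directory, name), "ab") as f:
        f.seek(0, os.SEEK_END)
        offset = f.tell()
        f.write(HEADER.pack(len(k), len(v)) + k + v)
    return offset


def scan(directory, name):
    """Yield (offset, key, value) for every record in a log file."""
    with open(os.path.join(directory, name), "rb") as f:
        while True:
            offset = f.tell()
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                return
            klen, vlen = HEADER.unpack(header)
            key = f.read(klen).decode("utf-8")
            value = f.read(vlen).decode("utf-8")
            yield offset, key, value


def merge(directory, index, old_name, active_name):
    """Rewrite only the records of old_name that the index still references."""
    for offset, key, value in scan(directory, old_name):
        if index.get(key) == (old_name, offset):  # index entry exists: retain
            new_offset = append(directory, active_name, key, value)
            index[key] = (active_name, new_offset)
    os.remove(os.path.join(directory, old_name))  # delete the traversed file


directory = tempfile.mkdtemp()
index = {}
for key, value in [("x", "old hot word"), ("y", "still relevant")]:
    index[key] = ("wal-1.log", append(directory, "wal-1.log", key, value))

del index["x"]  # step 501: deletion touches only the index
merge(directory, index, "wal-1.log", "wal-2.log")  # steps 502 to 504
```

After the merge in this sketch, only y has been written a second time; x was never rewritten, which is the source of the reduced write volume.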
In the method for storing data of the present embodiment, key-value pair data is stored in write-ahead log files, predetermined operations on the stored key-value pair data are performed through the index entries in memory, and during the merging and finishing operation only the retained key-value pair data is written, in append-write mode. This can reduce the write amplification problem and improve the effectiveness of data storage.
With further reference to Fig. 6, as an implementation of the method for storing data, the present application provides an embodiment of an apparatus for storing data. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2.
As shown in Fig. 6, the apparatus 600 for storing data of the present embodiment includes: an obtaining module 601, a storage module 602, an index entry generation module 603, and a data operation module 604. The obtaining module 601 may be configured to obtain key-value pair data to be stored. The storage module 602 may be configured to store the key-value pair data in a write-ahead log file on the disk. The index entry generation module 603 may generate, in memory, an index entry for the key-value pair data according to the storage location of the key-value pair data on the disk and the key in the key-value pair data, so that the data operation module 604 performs a predetermined operation on the key-value pair data through the index entry and, upon detecting a write-ahead log file that meets the merging finishing condition, appends to the disk, according to the index, the key-value pair data retained in the write-ahead log file that meets the predetermined condition. The storage location may include, but is not limited to, at least one of the following: the file name of the write-ahead log file, and the offset of the key-value pair data in the write-ahead log file relative to the start position of the file.
In some optional implementations of the present embodiment, when the predetermined operation includes a read operation, the data operation module 604 may be configured to: search the index for the index entry corresponding to the key-value pair data to be read, so as to obtain the storage location of the key-value pair data to be read; and read the key-value pair data to be read from the write-ahead log file according to the storage location.
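The read operation can be sketched as a lookup followed by a single seek. The following is a minimal sketch, assuming a hypothetical record layout (two 4-byte lengths followed by the key and value bytes) and a plain dictionary as the index; the function name `read_value` and the file name `wal-1.log` are illustrative only.

```python
import os
import struct
import tempfile

HEADER = struct.Struct(">II")  # assumed layout: key length, value length


def read_value(directory, index, key):
    """Return the value for key, or None when the index has no entry for it."""
    entry = index.get(key)
    if entry is None:
        return None
    file_name, offset = entry
    with open(os.path.join(directory, file_name), "rb") as f:
        f.seek(offset)  # jump straight to the record; the log is not scanned
        klen, vlen = HEADER.unpack(f.read(HEADER.size))
        f.seek(klen, os.SEEK_CUR)  # skip the stored key
        return f.read(vlen).decode("utf-8")


# Build a small log by hand so that the example is self-contained.
directory = tempfile.mkdtemp()
index = {}
with open(os.path.join(directory, "wal-1.log"), "wb") as f:
    for key, value in [("a", "alpha"), ("b", "beta")]:
        index[key] = ("wal-1.log", f.tell())
        k, v = key.encode("utf-8"), value.encode("utf-8")
        f.write(HEADER.pack(len(k), len(v)) + k + v)

found = read_value(directory, index, "b")
missing = read_value(directory, index, "c")
```

Because the index entry already carries the file name and offset, a read in this sketch costs one index lookup and one positioned disk read.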
In some optional implementations of the present embodiment, when the predetermined operation includes a delete operation, the data operation module 604 may be configured to: delete, from the index, the index entry of the key-value pair data to be deleted according to the key in the key-value pair data.
In some optional implementations of the present embodiment, the apparatus 600 may further include a deletion module (not shown), configured to delete the write-ahead log file that has been append-written to the disk; the index entry generation module 603 may be further configured to update the index in memory.
In some optional implementations of the present embodiment, the merging finishing condition may include, but is not limited to, at least one of the following: the space occupied by the write-ahead log file exceeds a preset file space threshold; the ratio of the space occupied by the write-ahead log file to its total space exceeds a preset file space ratio threshold; the amount of deleted key-value pair data in the write-ahead log file exceeds a preset threshold; the ratio of the space occupied by deleted key-value pair data in the write-ahead log file to the total space exceeds a preset deletion ratio threshold; the occupied space of the disk exceeds a preset disk space threshold; the ratio of the occupied space of the disk to the total capacity of the disk exceeds a preset disk space ratio threshold; and so on.
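The "at least one of the following" check over these conditions can be sketched as below. This is only an illustration under assumptions: the field names, the class names `LogStats` and `Thresholds`, and every default threshold value are hypothetical, and the deleted-space ratio is computed here against the space occupied by the file, which is one possible reading of "total space".

```python
from dataclasses import dataclass


@dataclass
class LogStats:
    file_bytes: int       # space occupied by the write-ahead log file
    file_capacity: int    # total space allotted to the file
    deleted_records: int  # records whose index entries were removed
    deleted_bytes: int    # space occupied by those records
    disk_used: int
    disk_total: int


@dataclass
class Thresholds:  # every default below is an illustrative assumption
    max_file_bytes: int = 64 * 1024 * 1024
    max_file_ratio: float = 0.9
    max_deleted_records: int = 100_000
    max_deleted_ratio: float = 0.9
    max_disk_used: int = 900 * 1024**3
    max_disk_ratio: float = 0.8


def ratio(part, whole):
    """Return part / whole, or 0.0 when whole is zero."""
    return part / whole if whole else 0.0


def should_merge(s, t):
    """True when at least one of the listed conditions is met."""
    return any((
        s.file_bytes > t.max_file_bytes,
        ratio(s.file_bytes, s.file_capacity) > t.max_file_ratio,
        s.deleted_records > t.max_deleted_records,
        ratio(s.deleted_bytes, s.file_bytes) > t.max_deleted_ratio,  # assumed denominator
        s.disk_used > t.max_disk_used,
        ratio(s.disk_used, s.disk_total) > t.max_disk_ratio,
    ))


quiet = LogStats(1_000, 2_000, 5, 200, 10**9, 10**12)
mostly_deleted = LogStats(1_000, 2_000, 50, 950, 10**9, 10**12)
```

With the assumed thresholds, `should_merge(quiet, Thresholds())` is false, while `should_merge(mostly_deleted, Thresholds())` is true because 95% of the file's occupied space belongs to deleted records.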
In some optional implementations of the present embodiment, the index in memory includes a dictionary tree (trie) generated according to the keys of the key-value pair data in the write-ahead log files on the disk. In the dictionary tree, each path starting from the root node corresponds to the key of one key-value pair data, and the last node of each path stores the storage location of the key-value pair data corresponding to that path.
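A dictionary tree of this kind can be sketched as follows. The sketch is illustrative only and makes its own assumptions: one node per character of the key, a `(file name, offset)` tuple as the storage location, and the hypothetical class names `TrieNode` and `TrieIndex`.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> child node
        self.location = None  # (file name, offset) on the last node of a key


class TrieIndex:
    def __init__(self):
        self.root = TrieNode()

    def put(self, key, location):
        """Create the path for key and store the location on its last node."""
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.location = location

    def get(self, key):
        node = self._walk(key)
        return node.location if node else None

    def delete(self, key):
        """Drop the index entry; the record itself stays in the log file."""
        node = self._walk(key)
        if node is None or node.location is None:
            return False
        node.location = None
        return True

    def _walk(self, key):
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node


index = TrieIndex()
index.put("car", ("wal-1.log", 0))
index.put("cart", ("wal-1.log", 37))
index.delete("car")
```

Keys that share a prefix, such as "car" and "cart" here, share the nodes of that prefix, so the memory cost of the index grows more slowly than the total length of the keys when many keys overlap.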
It is worth noting that the modules recorded in the apparatus 600 for storing data correspond to the respective steps of the method described with reference to Fig. 2. The operations and features described above for the method therefore apply equally to the apparatus 600 for storing data and the modules or units contained in it, and are not repeated here.
It will be understood by those skilled in the art that the apparatus 600 for storing data further includes some other well-known structures, such as a processor and a memory. In order not to unnecessarily obscure the embodiments of the present disclosure, these well-known structures are not shown in Fig. 6.
Referring now to Fig. 7, it shows a schematic structural diagram of a computer system 700 of a terminal device/server suitable for implementing the embodiments of the present application. The terminal device/server shown in Fig. 7 is only an example and should not impose any limitation on the function and scope of use of the embodiments of the present application.
As shown in Fig. 7, the computer system 700 includes a central processing unit (CPU) 701, which can perform various appropriate actions and processing according to a program stored in read-only memory (ROM) 702 or a program loaded from the storage section 706 into random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the system 700. The CPU 701, the ROM 702, and the RAM 703 are connected to one another through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the I/O interface 705: a storage section 706 including a hard disk and the like; and a communication portion 707 including a network interface card such as a LAN card or a modem. The communication portion 707 performs communication processing via a network such as the Internet. A driver 708 is also connected to the I/O interface 705 as needed. A detachable medium 709, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the driver 708 as needed, so that a computer program read from it can be installed into the storage section 706 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 707, and/or installed from the detachable medium 709. When the computer program is executed by the central processing unit (CPU) 701, the above-described functions defined in the method of the present application are performed. It should be noted that the non-volatile computer-readable medium described herein may be a non-volatile computer-readable signal medium, a non-volatile computer-readable storage medium, or any combination of the two. A non-volatile computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of non-volatile computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a non-volatile computer-readable storage medium may be any tangible medium that contains or stores a program, and the program can be used by or in connection with an instruction execution system, apparatus, or device.
In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules described in the embodiments of the present application may be implemented in software or in hardware. The described modules may also be provided in a processor, which may, for example, be described as: a processor including an obtaining module, a storage module, an index entry generation module, and a data operation module. The names of these modules do not in some cases constitute a limitation on the modules themselves; for example, the obtaining module may also be described as "a module configured to obtain key-value pair data to be stored".
In another aspect, the present application also provides a computer-readable medium. The computer-readable medium may be included in the apparatus described in the above embodiments, or it may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: obtain key-value pair data to be stored; store the key-value pair data in a write-ahead log file on a disk; generate, in memory, an index entry for the key-value pair data according to the storage location of the key-value pair data on the disk and the key in the key-value pair data, so as to perform a predetermined operation on the key-value pair data through the index entry; and, upon detecting a write-ahead log file that meets the merging finishing condition, append to the disk, according to the index, the key-value pair data retained in the write-ahead log file that meets the predetermined condition, wherein the storage location includes at least one of the following: the file name of the write-ahead log file, and the offset of the key-value pair data in the write-ahead log file relative to the start position of the file.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present application.

Claims (14)

1. A method for storing data, characterized in that the method comprises:
obtaining key-value pair data to be stored;
storing the key-value pair data in a write-ahead log file on a disk;
generating, in memory, an index entry for the key-value pair data according to the storage location of the key-value pair data on the disk and the key in the key-value pair data, so as to perform a predetermined operation on the key-value pair data through the index entry, and, upon detecting a write-ahead log file that meets a merging finishing condition, appending to the disk, according to the index, the key-value pair data retained in the write-ahead log file that meets the predetermined condition, wherein the storage location includes at least one of the following: the file name of the write-ahead log file, and the offset of the key-value pair data in the write-ahead log file relative to the start position of the file.
2. The method according to claim 1, characterized in that, when the predetermined operation includes a read operation, performing the predetermined operation through the index entry on the key-value pair data stored in the write-ahead log file comprises:
searching the index for the index entry corresponding to the key-value pair data to be read, so as to obtain the storage location of the key-value pair data to be read;
reading the key-value pair data to be read from the write-ahead log file according to the storage location.
3. The method according to claim 1, characterized in that, when the predetermined operation includes a delete operation, performing the predetermined operation through the index entry on the key-value pair data stored in the write-ahead log file comprises:
deleting, from the index, the index entry of the key-value pair data to be deleted according to the key in the key-value pair data.
4. The method according to claim 1, characterized in that the method further comprises:
deleting the write-ahead log file that has been append-written to the disk;
updating the index in memory.
5. The method according to claim 1, characterized in that the merging finishing condition includes at least one of the following:
the space occupied by the write-ahead log file exceeds a preset file space threshold;
the ratio of the space occupied by the write-ahead log file to its total space exceeds a preset file space ratio threshold;
the amount of deleted key-value pair data in the write-ahead log file exceeds a preset threshold;
the ratio of the space occupied by deleted key-value pair data in the write-ahead log file to the total space exceeds a preset deletion ratio threshold;
the occupied space of the disk exceeds a preset disk space threshold;
the ratio of the occupied space of the disk to the total capacity of the disk exceeds a preset disk space ratio threshold.
6. The method according to any one of claims 1 to 5, characterized in that the index in memory includes a dictionary tree generated according to the keys of the key-value pair data in the write-ahead log files on the disk, wherein, in the dictionary tree, each path starting from the root node corresponds to the key of one key-value pair data, and the last node of each path stores the storage location of the key-value pair data corresponding to that path.
7. An apparatus for storing data, characterized in that the apparatus comprises:
an obtaining module, configured to obtain key-value pair data to be stored;
a storage module, configured to store the key-value pair data in a write-ahead log file on a disk;
an index entry generation module, configured to generate, in memory, an index entry for the key-value pair data according to the storage location of the key-value pair data on the disk and the key in the key-value pair data, so that a data operation module performs a predetermined operation on the key-value pair data through the index entry and, upon detecting a write-ahead log file that meets a merging finishing condition, appends to the disk, according to the index, the key-value pair data retained in the write-ahead log file that meets the predetermined condition, wherein the storage location includes at least one of the following: the file name of the write-ahead log file, and the offset of the key-value pair data in the write-ahead log file relative to the start position of the file.
8. The apparatus according to claim 7, characterized in that, when the predetermined operation includes a read operation, the data operation module is configured to:
search the index for the index entry corresponding to the key-value pair data to be read, so as to obtain the storage location of the key-value pair data to be read;
read the key-value pair data to be read from the write-ahead log file according to the storage location.
9. The apparatus according to claim 7, characterized in that, when the predetermined operation includes a delete operation, the data operation module is configured to:
delete, from the index, the index entry of the key-value pair data to be deleted according to the key in the key-value pair data.
10. The apparatus according to claim 7, characterized in that the apparatus further comprises:
a deletion module, configured to delete the write-ahead log file that has been append-written to the disk;
wherein the index entry generation module is further configured to update the index in memory.
11. The apparatus according to claim 7, characterized in that the merging finishing condition includes at least one of the following:
the space occupied by the write-ahead log file exceeds a preset file space threshold;
the ratio of the space occupied by the write-ahead log file to its total space exceeds a preset file space ratio threshold;
the amount of deleted key-value pair data in the write-ahead log file exceeds a preset threshold;
the ratio of the space occupied by deleted key-value pair data in the write-ahead log file to the total space exceeds a preset deletion ratio threshold;
the occupied space of the disk exceeds a preset disk space threshold;
the ratio of the occupied space of the disk to the total capacity of the disk exceeds a preset disk space ratio threshold.
12. The apparatus according to any one of claims 7 to 11, characterized in that the index in memory includes a dictionary tree generated according to the keys of the key-value pair data in the write-ahead log files on the disk, wherein, in the dictionary tree, each path starting from the root node corresponds to the key of one key-value pair data, and the last node of each path stores the storage location of the key-value pair data corresponding to that path.
13. A storage device, comprising:
one or more processors;
a program storage device, for storing one or more programs;
a disk, for storing one or more write-ahead log files;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 6.
14. A non-volatile computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 6.
CN201710187260.8A 2017-03-27 2017-03-27 The method and apparatus of storing data Active CN106886375B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710187260.8A CN106886375B (en) 2017-03-27 2017-03-27 The method and apparatus of storing data

Publications (2)

Publication Number Publication Date
CN106886375A CN106886375A (en) 2017-06-23
CN106886375B true CN106886375B (en) 2019-11-05

Family

ID=59181426

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710187260.8A Active CN106886375B (en) 2017-03-27 2017-03-27 The method and apparatus of storing data

Country Status (1)

Country Link
CN (1) CN106886375B (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463512B (en) * 2017-06-26 2020-11-13 上海高顿教育培训有限公司 Data updating method of distributed high-speed storage system
CN107480233A (en) * 2017-08-07 2017-12-15 郑州云海信息技术有限公司 A kind of method and system of daily record data positioning
CN108153488B (en) * 2017-12-13 2021-05-04 北京小米移动软件有限公司 Data self-adding method and device
CN108228829B (en) * 2018-01-03 2022-01-11 北京百度网讯科技有限公司 Method and apparatus for generating information
US10579606B2 (en) * 2018-05-03 2020-03-03 Samsung Electronics Co., Ltd Apparatus and method of data analytics in key-value solid state device (KVSSD) including data and analytics containers
CN109164977B (en) * 2018-07-23 2022-01-11 中国建设银行股份有限公司 Data storage system and method, and storage medium
CN109254870B (en) * 2018-08-01 2021-05-18 华为技术有限公司 Data backup method and device
CN110837338A (en) * 2018-08-15 2020-02-25 阿里巴巴集团控股有限公司 Storage index processing method and device
CN109558457B (en) * 2018-12-11 2022-04-22 浪潮(北京)电子信息产业有限公司 Data writing method, device, equipment and storage medium
CN111444114B (en) * 2019-01-16 2023-06-13 阿里巴巴集团控股有限公司 Method, device and system for processing data in nonvolatile memory
CN110377531B (en) * 2019-07-19 2021-08-10 清华大学 Persistent memory storage engine device based on log structure and control method
CN110764705B (en) * 2019-10-22 2023-08-04 北京锐安科技有限公司 Data reading and writing method, device, equipment and storage medium
CN111008183B (en) * 2019-11-19 2023-09-15 武汉极意网络科技有限公司 Storage method and system for business wind control log data
CN112925473A (en) * 2019-12-06 2021-06-08 阿里巴巴集团控股有限公司 Data storage method, device, equipment and storage medium
CN111045898A (en) * 2019-12-22 2020-04-21 北京浪潮数据技术有限公司 Log collection method, device and equipment for multi-stage subsystem and readable storage medium
CN113032349A (en) * 2019-12-25 2021-06-25 阿里巴巴集团控股有限公司 Data storage method and device, electronic equipment and computer readable medium
CN112256650A (en) * 2020-10-20 2021-01-22 广州市百果园网络科技有限公司 Storage space management method, device, equipment and storage medium
CN112416940A (en) * 2020-11-27 2021-02-26 深信服科技股份有限公司 Key value pair storage method and device, terminal equipment and storage medium
CN112540731B (en) * 2020-12-22 2023-08-11 北京百度网讯科技有限公司 Data append writing method, device, equipment, medium and program product
CN112732191B (en) * 2021-01-08 2023-01-10 苏州浪潮智能科技有限公司 Method, system, device and medium for merging tree merging data based on log structure
CN112783896B (en) * 2021-01-12 2023-05-23 湖北宸威玺链信息技术有限公司 Method for reducing memory usage rate by loading files
CN113094372A (en) * 2021-04-16 2021-07-09 三星(中国)半导体有限公司 Data access method, data access control device and data access system
CN115202588B (en) * 2022-09-14 2022-12-27 本原数据(北京)信息技术有限公司 Data storage method and device and data recovery method and device
CN116561073B (en) * 2023-04-14 2023-12-19 云和恩墨(北京)信息技术有限公司 File merging method and system based on database, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133867A (en) * 2014-07-18 2014-11-05 中国科学院计算技术研究所 DOT in-fragment secondary index method and DOT in-fragment secondary index system
CN104809178A (en) * 2015-04-15 2015-07-29 北京科电高技术公司 Write-in method of key/value database memory log
CN105468298A (en) * 2015-11-19 2016-04-06 中国科学院信息工程研究所 Key value storage method based on log-structured merged tree

Also Published As

Publication number Publication date
CN106886375A (en) 2017-06-23

Similar Documents

Publication Publication Date Title
CN106886375B (en) The method and apparatus of storing data
US11740891B2 (en) Providing access to a hybrid application offline
US10951702B2 (en) Synchronized content library
CN109254733B (en) Method, device and system for storing data
US11221918B2 (en) Undo changes on a client device
US9747321B2 (en) Providing a content preview
US11816128B2 (en) Managing content across discrete systems
US20140195514A1 (en) Unified interface for querying data in legacy databases and current databases
US20140304384A1 (en) Uploading large content items
US20140229457A1 (en) Automatic content item upload
CN108614976A (en) Authority configuring method, device and storage medium
CN109697019A (en) The method and system of data write-in based on FAT file system
CN110119386A (en) Data processing method, data processing equipment, medium and calculating equipment
CN113361236A (en) Method and device for editing document
CN113688139B (en) Object storage method, gateway, device and medium
CN115878625A (en) Data processing method and device and electronic equipment
CN113760822A (en) HDFS-based distributed intelligent campus file management system optimization method and device
CN113722007A (en) Configuration method, device and system of VPN branch equipment
CN113792031B (en) Key value data processing method, system, equipment and medium
CN113760860B (en) Data reading method and device
US20170091300A1 (en) Distinguishing event type
CN115858496A (en) Data migration method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant