CN109947709A - Data storage method and device - Google Patents


Info

Publication number
CN109947709A
Authority
CN
China
Prior art keywords
data
target
index information
file
under
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910261866.0A
Other languages
Chinese (zh)
Other versions
CN109947709B (en)
Inventor
田勇
司春峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910261866.0A priority Critical patent/CN109947709B/en
Publication of CN109947709A publication Critical patent/CN109947709A/en
Application granted granted Critical
Publication of CN109947709B publication Critical patent/CN109947709B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

Embodiments of the present application disclose a data storage method and device. One specific embodiment of the method includes: determining index information of the data in a target data set, where the target data set includes multiple pieces of data and each piece of data includes a key and a value; and adding the index information of each piece of data to the target data set. Once the index information has been determined, data can subsequently be addressed by scanning the index information. This avoids the long startup time caused by having to scan every key-value pair in full, and improves data processing efficiency.

Description

Data storage method and device
Technical field
Embodiments of the present application relate to the field of computer technology, in particular to the field of Internet technology, and more particularly to a data storage method and device.
Background
With the development of data storage technology, a variety of data structures have emerged. With existing key-value data structures, every key-value pair must be scanned at startup and before garbage data is cleaned up.
As a result, startup and garbage data cleanup are inefficient under such storage methods, which lengthens data processing time.
Summary of the invention
Embodiments of the present application propose a data storage method and device.
In a first aspect, an embodiment of the present application provides a data storage method, including: determining index information of the data in a target data set, where the target data set includes multiple pieces of data and each piece of data includes a key and a value; and adding the index information of each piece of data to the target data set.
In some embodiments, the target data set belongs to a total data set, the total data set includes at least one data set, each data set is stored under at least two directories, the number of files under each directory is a preset number, and the preset number is less than or equal to a preset file count threshold.
In some embodiments, determining the index information of the data in the target data set includes: composing the index information from the key and at least one of the following: the serial number of the data file in which the data is located, the length of the data, the offset address of the data, and the number of the directory corresponding to the data, where the serial number characterizes the ordering of the files under the directory.
In some embodiments, the method further includes: in response to receiving an operation instruction for the target data set, scanning the index information of each piece of data.
In some embodiments, adding the index information of each piece of data to the target data set includes: writing the index information of each piece of data into an index file under the directory where the target data set is located, where corresponding index files and data files are stored under the directory.
In some embodiments, the method further includes: writing the valid index information in the index file, which indicates valid data, into a memory space.
In some embodiments, scanning the index information of each piece of data in response to receiving an operation instruction for the target data set includes: in response to receiving one of the following instructions for the target data set, scanning the valid index information of each piece of data in the memory space: a lookup instruction, a modify instruction, a delete instruction, an add-data instruction, and a read instruction; and in response to receiving a rewrite instruction for the target data set, scanning the valid index information of each piece of data in the index file and in the memory space.
In some embodiments, the method is applied to a target electronic device and the target data set is stored in the target electronic device. The method further includes: in response to receiving a shutdown instruction for the target electronic device, persisting the valid index information in the memory space to disk; and in response to receiving a startup instruction for the target electronic device, writing the persisted valid index information into the memory space.
In some embodiments, each data set is stored under two directories.
In some embodiments, the method further includes: receiving a data rewrite instruction; taking the data file with the largest or the smallest serial number under a preset target directory of the at least two directories as a starting point, rewriting the valid data under the target directory in order of serial number to obtain rewritten data; receiving a data update instruction; and taking the data file with the other of the largest and smallest serial numbers under the target directory as a starting point, adding update data under the target directory in order of serial number.
In some embodiments, the operation of adding update data under the target directory is executed before the rewriting of the valid data under the target directory is completed; or the operation of rewriting the valid data under the target directory is executed before the adding of update data under the target directory is completed.
In some embodiments, rewriting the valid data under the target directory in order of serial number includes: in response to determining that the size of the current data file obtained by rewriting reaches a preset file size threshold, continuing the rewrite in the next data file after the current data file, in order of serial number.
In some embodiments, adding update data under the target directory in order of serial number includes: in response to determining that the size of the current data file obtained by updating reaches a preset file size threshold, adding update data to the next data file after the current data file, in order of serial number.
In some embodiments, the method further includes: determining the index information of the update data and writing it to the target directory; and determining the index information of the rewritten data and writing it to the target directory.
In some embodiments, the method further includes: migrating the data under the directories other than the target directory to the target directory.
In a second aspect, an embodiment of the present application provides a data storage device, including: a determination unit configured to determine index information of the data in a target data set, where the target data set includes multiple pieces of data and each piece of data includes a key and a value; and an adding unit configured to add the index information of each piece of data to the target data set.
In some embodiments, the target data set belongs to a total data set, the total data set includes at least one data set, each data set is stored under at least two directories, the number of files under each directory is a preset number, and the preset number is less than or equal to a preset file count threshold.
In some embodiments, determining the index information of the data in the target data set includes: composing the index information from the key and at least one of the following: the serial number of the data file in which the data is located, the length of the data, the offset address of the data, and the number of the directory corresponding to the data, where the serial number characterizes the ordering of the files under the directory.
In some embodiments, the device further includes: a scanning unit configured to scan the index information of each piece of data in response to receiving an operation instruction for the target data set.
In some embodiments, the adding unit is configured to write the index information of each piece of data into an index file under the directory where the target data set is located, where corresponding index files and data files are stored under the directory.
In some embodiments, the device further includes: a memory writing unit configured to write the valid index information in the index file, which indicates valid data, into a memory space.
In some embodiments, the scanning unit is configured to scan the index information of each piece of data, in response to receiving an operation instruction for the target data set, as follows: in response to receiving one of the following instructions for the target data set, scanning the valid index information of each piece of data in the memory space: a lookup instruction, a modify instruction, a delete instruction, an add-data instruction, and a read instruction; and in response to receiving a rewrite instruction for the target data set, scanning the valid index information of each piece of data in the index file and in the memory space.
In some embodiments, the device is applied to a target electronic device and the target data set is stored in the target electronic device. The device further includes: a persistence unit configured to persist the valid index information in the memory space to disk in response to receiving a shutdown instruction for the target electronic device; and a writing unit configured to write the persisted valid index information into the memory space in response to receiving a startup instruction for the target electronic device.
In some embodiments, the scanning unit is configured to scan the index information of each piece of data, in response to receiving an operation instruction for the target data set, as follows: in response to receiving an operation instruction for the target data set, scanning the index information of each piece of data in the index file.
In some embodiments, each data set is stored under two directories.
In some embodiments, the device further includes: a first receiving unit configured to receive a data rewrite instruction; a rewriting unit configured to take the data file with the largest or the smallest serial number under a preset target directory of the at least two directories as a starting point and rewrite the valid data under the target directory in order of serial number to obtain rewritten data; a second receiving unit configured to receive a data update instruction; and an updating unit configured to take the data file with the other of the largest and smallest serial numbers under the target directory as a starting point and add update data under the target directory in order of serial number.
In some embodiments, the operation of adding update data under the target directory is executed before the rewriting of the valid data under the target directory is completed; or the operation of rewriting the valid data under the target directory is executed before the adding of update data under the target directory is completed.
In some embodiments, the rewriting unit is further configured to rewrite the valid data under the target directory in order of serial number as follows: in response to determining that the size of the current data file obtained by rewriting reaches a preset file size threshold, continuing the rewrite in the next data file after the current data file, in order of serial number.
In some embodiments, the updating unit is further configured to add update data under the target directory in order of serial number as follows: in response to determining that the size of the current data file obtained by updating reaches a preset file size threshold, adding update data to the next data file after the current data file, in order of serial number.
In some embodiments, the device further includes: a first index determination unit configured to determine the index information of the update data and write it to the target directory; and a second index determination unit configured to determine the index information of the rewritten data and write it to the target directory.
In some embodiments, the device further includes: a migration unit configured to migrate the data under the directories other than the target directory to the target directory.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the data storage method.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any embodiment of the data storage method.
The data storage scheme provided by the embodiments of the present application first determines the index information of the data in a target data set, where the target data set includes multiple pieces of data and each piece of data includes a key and a value. The index information of each piece of data is then added to the target data set. Once the index information has been determined, data can subsequently be addressed by scanning the index information. This avoids the long startup time caused by having to scan every key-value pair in full, and improves data processing efficiency.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 is a diagram of an exemplary system architecture to which the present application can be applied;
Fig. 2a is a flowchart of one embodiment of the data storage method according to the present application;
Fig. 2b is a schematic diagram of how the value of a piece of data is organized in the data storage method according to the present application;
Fig. 3a is a flowchart of another embodiment of the data storage method according to the present application;
Fig. 3b is a schematic diagram of how a piece of index information is organized in the data storage method according to the present application;
Fig. 4 is a schematic diagram of data update and data rewriting in the data storage method according to the present application;
Fig. 5 is a structural schematic diagram of one embodiment of the data storage device according to the present application;
Fig. 6 is a structural schematic diagram of a computer system suitable for implementing the electronic device of the embodiments of the present application.
Detailed description
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features of those embodiments can be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the data storage method or data storage device of the present application can be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
A user can use the terminal devices 101, 102, 103 to interact with the server 105 over the network 104, for example to receive or send messages. Various client applications can be installed on the terminal devices 101, 102, 103, such as data storage applications, video applications, live-streaming applications, instant messaging tools, email clients, and social platform software. When a client application on the terminal devices 101, 102, 103 starts, it can read data from the memory.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server that provides various services, for example a background data server that supports the terminal devices 101, 102, 103. The background data server can analyze and otherwise process received data, such as the data in a target data set, and feed the processing result (for example, the target data set with index information added) back to the terminal device.
It should be noted that the data storage method provided by the embodiments of the present application can be executed by the server 105 or by the terminal devices 101, 102, 103; correspondingly, the data storage device can be provided in the server 105 or in the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires.
With continued reference to Fig. 2a, a flow 200 of one embodiment of the data storage method according to the present application is shown. The data storage method includes the following steps:
Step 201: determine the index information of the data in a target data set, where the target data set includes multiple pieces of data and each piece of data includes a key and a value.
In this embodiment, the execution body of the data storage method (for example, the server or a terminal device shown in Fig. 1) can determine the index information of each piece of data. The target data set is stored in the form of key-value pairs. A piece of index information indicates where a piece of data is stored, so the data it indicates can be found through the index information. The index information determined here includes not only the key but may also include other information used to locate the data, for example the number of the directory corresponding to the data, or the serial number of the file within the directory. Specifically, the execution body can determine the index information in several ways. For example, it can compose the index information from the key and the number of the directory corresponding to the data, or from the key and a physical address.
In practice, data can be organized in the form shown in Fig. 2b, where header indicates the key length and the value length of a piece of data, value is the piece of data itself, and crc is the check value of the header and the value.
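As a rough illustration of the header, value, and crc organization described above, the following Python sketch encodes and decodes one record. The field widths (two little-endian 32-bit lengths), the use of CRC32, and the choice to store the key bytes directly in front of the value bytes are assumptions made for this example; the text does not specify them.

```python
import struct
import zlib

# Assumed layout (not given in the text): header = key length + value length
# as two little-endian uint32 fields, then key bytes, value bytes, and a
# CRC32 computed over the header and the payload.
HEADER = struct.Struct("<II")
CRC = struct.Struct("<I")


def encode_record(key: bytes, value: bytes) -> bytes:
    """Serialize one key-value pair as header + payload + checksum."""
    body = HEADER.pack(len(key), len(value)) + key + value
    return body + CRC.pack(zlib.crc32(body))


def decode_record(buf: bytes, offset: int = 0):
    """Return (key, value, next_offset); raise ValueError on a bad checksum."""
    key_len, value_len = HEADER.unpack_from(buf, offset)
    end = offset + HEADER.size + key_len + value_len
    (stored_crc,) = CRC.unpack_from(buf, end)
    if zlib.crc32(buf[offset:end]) != stored_crc:
        raise ValueError("checksum mismatch")
    key_start = offset + HEADER.size
    key = buf[key_start:key_start + key_len]
    value = buf[key_start + key_len:end]
    return key, value, end + CRC.size
```

The returned next offset lets a reader walk consecutive records in one data file, which is the full scan that the index information is meant to make unnecessary.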
Step 202: add the index information of each piece of data to the target data set.
In this embodiment, the execution body can add the determined index information of each piece of data to the target data set. The target data set then includes the index information and the data indicated by each piece of index information.
With the method provided by the above embodiment of the present application, once the index information has been determined, data can subsequently be addressed by scanning the index information. This avoids the long startup time caused by having to scan every key-value pair in full, and improves data processing efficiency.
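The two steps of flow 200 can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the in-memory `bytes` buffer standing in for a data file, and the `(key, offset, length)` index tuple are hypothetical choices, not the format used by the application.

```python
from typing import List, Optional, Tuple

IndexEntry = Tuple[bytes, int, int]  # (key, offset of value, length of value)


def build_dataset(pairs: List[Tuple[bytes, bytes]]):
    """Step 201 and step 202 in miniature: record an index entry per value
    and keep the entries alongside the stored values."""
    data = bytearray()
    index: List[IndexEntry] = []
    for key, value in pairs:
        index.append((key, len(data), len(value)))
        data += value
    return bytes(data), index


def lookup(data: bytes, index: List[IndexEntry], key: bytes) -> Optional[bytes]:
    """Address a value by scanning only the index, newest entry first."""
    for entry_key, offset, length in reversed(index):
        if entry_key == key:
            return data[offset:offset + length]
    return None
```

The lookup never parses the stored values themselves; it touches the data buffer only at the offset that the matching index entry points to.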
With further reference to Fig. 3a, a flow 300 of another embodiment of the data storage method is shown. The flow 300 of the data storage method includes the following steps:
Step 301: determine the index information of the data in a target data set, where the target data set includes multiple pieces of data and each piece of data includes a key and a value.
In this embodiment, the execution body of the data storage method (for example, the server or a terminal device shown in Fig. 1) can determine the index information of each piece of data. The target data set is stored in the form of key-value pairs. A piece of index information indicates where a piece of data is stored, so the data it indicates can be found through the index information. The index information determined here includes not only the key but may also include other information used to locate the data.
Step 302: add the index information of each piece of data to the target data set.
In this embodiment, the execution body can add the determined index information of each piece of data to the target data set. The target data set then includes the index information and the data indicated by each piece of index information.
Step 303: in response to receiving an operation instruction for the target data set, scan the index information of each piece of data.
In this embodiment, the execution body can scan the index information of each piece of data in response to receiving an operation instruction for the target data set. Scanning the index information determines the location of each piece of data, so that subsequent operations such as data lookup can be carried out. The operation instruction here can be any of various operation instructions, such as a data lookup instruction or a data modification instruction.
Moreover, scanning index information instead of scanning key-value pairs both addresses the data effectively and significantly shortens the scan time, thereby improving data processing efficiency.
In some optional implementations of this embodiment, step 302 includes: writing the index information of each piece of data into an index file under the directory where the target data set is located, where corresponding index files and data files are stored under the directory.
In these optional implementations, the execution body can write the determined index information of each piece of data into an index file. Thus, one of the directories where the target data set is located holds the index information and the data indicated by each piece of index information. Each data file has a corresponding index file under the same directory, and that index file stores the index information of the data in the data file. Index files are generally stored on disk. Specifically, even under the same directory, the file into which index information is written differs from the file into which data is written.
In these implementations the index information and the data reside in separate files, so that at startup and before garbage data cleanup only the index information in the index files needs to be scanned to quickly determine where the data is located. In addition, the generated index files can speed up building the index information in the memory space for further use.
In a further optional implementation, the valid index information in the index file, which indicates valid data, is written into the memory space.
In these optional implementations, the execution body can write only the valid index information into the memory space. Then, when an operation instruction is subsequently received, the valid index information in memory can be scanned, or the valid index information can be stored on disk and then scanned there. Scanning in this way skips invalid data, which avoids wasted scanning and increases scan speed.
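A minimal sketch of loading only valid index information into memory might look as follows. The `IndexEntry` fields mirror the components named in the text (directory number, file serial number, offset, length); the `deleted` flag is a hypothetical stand-in for however the index file marks an entry as invalid, which the text does not specify.

```python
from typing import Dict, Iterable, NamedTuple


class IndexEntry(NamedTuple):
    key: bytes
    dir_no: int     # number of the directory holding the data
    fno: int        # serial number of the data file in that directory
    off: int        # offset of the data inside the file
    length: int     # length of the data
    deleted: bool = False  # hypothetical invalidity marker (assumption)


def load_valid_index(entries: Iterable[IndexEntry]) -> Dict[bytes, IndexEntry]:
    """Replay index entries from oldest to newest, keeping only the entry
    that currently describes valid data for each key."""
    valid: Dict[bytes, IndexEntry] = {}
    for entry in entries:
        if entry.deleted:
            valid.pop(entry.key, None)   # data was invalidated
        else:
            valid[entry.key] = entry     # newer entry supersedes older one
    return valid
```

Because superseded and invalidated entries never reach the dictionary, later scans in memory visit only entries that point at valid data.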
In some optional cases of the above further optional implementation, step 303 may include:
In response to receiving one of the following instructions for the target data set, scanning the valid index information of each piece of data in the memory space: a lookup instruction, a modify instruction, a delete instruction, an add-data instruction, and a read instruction; and in response to receiving a rewrite instruction for the target data set, scanning the valid index information of each piece of data in the index file and in the memory space.
In these optional cases, if any one of the above instructions for the data in the target data set is received, the valid index information in the memory space is scanned, so that the index information is scanned quickly thanks to the higher processing speed of memory. If the received instruction is a rewrite instruction, which indicates that the valid data on disk is to be rewritten, both the index information on disk and the valid index information in memory need to be scanned. This is because, when update information is received in the memory space, the corresponding invalid data is deleted there; referring to the valid index information in the memory space therefore ensures that the data rewritten on disk is valid. The add-data instruction here may indicate adding update data that updates existing data, or adding other data.
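The choice of what to scan for each kind of instruction can be written as a small dispatch function. The enum members and the string labels `"memory"` and `"index_file"` are illustrative names introduced for this sketch only.

```python
from enum import Enum
from typing import Tuple


class Op(Enum):
    LOOKUP = "lookup"
    MODIFY = "modify"
    DELETE = "delete"
    ADD = "add"
    READ = "read"
    REWRITE = "rewrite"


def scan_targets(op: Op) -> Tuple[str, ...]:
    """Return which copies of the index information should be scanned."""
    if op is Op.REWRITE:
        # A rewrite touches data on disk, so the on-disk index file is
        # scanned and checked against the valid entries held in memory.
        return ("index_file", "memory")
    # All other instructions only need the valid index information in memory.
    return ("memory",)
```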
Optionally, the method in the above cases can be applied to a target electronic device, with the target data set stored in the target electronic device. The method then further includes: in response to receiving a shutdown instruction for the target electronic device, persisting the valid index information in the memory space to disk; and in response to receiving a startup instruction for the target electronic device, writing the persisted valid index information into the memory space.
The execution body can persist the valid index information in memory to disk when the electronic device is shut down, so that the next time the electronic device starts, the persisted valid index information can be written into the memory space to quickly rebuild the index information in memory. This also makes it possible, when an instruction for the target data set is received, to scan the index information in the memory space and so achieve a fast scan thanks to the higher processing speed of memory.
It is worth noting that memory space is usually relatively small, so index information with a large byte count occupies a large share of it. In that case, even if the disk space available for storing data is large, part of it is wasted because memory cannot hold all the index information. Because the data organization and index composition described above keep the byte count of the index information very small, the large-capacity disk in the device can be used more fully when the index information is held in memory and persisted. Persistence also preserves the valid index information and prevents it from being lost.
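The shutdown and startup behavior described above can be sketched as a pair of helper functions. JSON as the snapshot format, the function names, and the write-to-temporary-file-then-rename step are assumptions chosen to keep the example short; the text only states that the valid index information is persisted to disk and later written back into memory.

```python
import json
import os
from typing import Dict, List


def persist_index(index: Dict[str, List[int]], path: str) -> None:
    """On shutdown: write the in-memory valid index to disk.
    Writing to a temporary file and renaming it avoids leaving a
    half-written snapshot behind (a design choice of this sketch)."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(index, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def restore_index(path: str) -> Dict[str, List[int]]:
    """On startup: load the persisted valid index back into memory.
    Returns an empty index if no snapshot exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Here each key maps to a list such as `[dir, fno, off, len]`; restoring the snapshot replaces a scan of the index files at startup.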
In some optional implementations of the above-mentioned any embodiment of date storage method of the application, target data set Conjunction belongs to data total collection, and data total collection includes at least one data acquisition system, and each data acquisition system is stored at least two mesh Under record, the quantity of the file under each catalogue is preset quantity, and preset quantity is less than or equal to default file amount threshold.
In these optional implementations, each data in a database can be with composition data total collection.Data May exist multiple data acquisition systems (i.e. multiple group) in total collection, wherein may include target data set.Each data set Conjunction can store under at least two catalogues, such as catalogue 0, catalogue 1 and catalogue 2.The quantity of file under each catalogue is pre- It first sets, and the small number.Specifically, the quantity of the data file under catalogue and the quantity of index file are ok It is preset.For example, can have data file 64 and index file 64 under catalogue.
These implementations can determine a suitable data organization form and optimize how storage space is used. Providing two or more directories avoids having too many files under a single directory. This data organization form also keeps the byte count of each data item's index information very small, which prevents index information from occupying excessive space. In particular, when the index information is stored in memory, memory utilization can be improved.
In some further optional implementations, each data set is stored under two directories.
By fixing the number of directories precisely, the number of bytes that the directory occupies in the index field can be controlled, while avoiding the problem of too many files per directory that too few directories would cause.
In some further optional implementations, determining the index information of the data in the target data set in step 302 may include: composing the index information from the key of the data and at least one of the following: the serial number of the data file in which the data is located, the length of the data, the offset address of the data, and the number of the directory corresponding to the data, where the serial number characterizes the ordering of the files under a directory.
In these further optional implementations, the execution body may determine index information that includes the key. Specifically, as shown in Fig. 3b, the index information may be organized into the structure shown in the figure. Here, dir denotes the number of the directory corresponding to a data item; the data corresponding to a directory is the data written to files under that directory. fno denotes the serial number, within the directory, of the data file in which a data item is located. A value range may be set for the serial number to limit the number of files in the directory. off denotes the offset address of the data, from which the actual storage address of the data can be determined. len denotes the length of a data item. Under the same directory, the serial numbers of the data files are consecutive, and the serial numbers of the index files are consecutive.
These further optional implementations make the index information complete enough to address the data, so that the data it indicates can be found through the index information. Meanwhile, index information composed of the above components has a very small byte count, which further ensures that the index information occupies little space.
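As an illustrative, non-limiting sketch of the index structure described above, the following Python code packs a key together with the dir, fno, off, and len fields into a compact byte string. The field widths (1 byte each for dir and fno, 4 bytes each for off and len, and a 2-byte key-length prefix) and the name IndexEntry are assumptions made for illustration; the application does not specify them.

```python
import struct
from dataclasses import dataclass

# Assumed layout of the fixed part: dir (1 byte), fno (1 byte),
# off (4 bytes), len (4 bytes), little-endian, no padding.
_FIXED = struct.Struct("<BBII")


@dataclass(frozen=True)
class IndexEntry:
    key: bytes    # key of the data item
    dir_no: int   # number of the directory holding the data
    fno: int      # serial number of the data file within the directory
    off: int      # offset address of the data inside the data file
    length: int   # length of the data item

    def pack(self) -> bytes:
        # Hypothetical encoding: 2-byte key length, key bytes, fixed fields.
        head = struct.pack("<H", len(self.key))
        tail = _FIXED.pack(self.dir_no, self.fno, self.off, self.length)
        return head + self.key + tail

    @classmethod
    def unpack(cls, buf: bytes) -> "IndexEntry":
        (key_len,) = struct.unpack_from("<H", buf, 0)
        key = buf[2:2 + key_len]
        dir_no, fno, off, length = _FIXED.unpack_from(buf, 2 + key_len)
        return cls(key, dir_no, fno, off, length)
```

Under these assumed widths, everything other than the key costs 12 bytes per entry, which is consistent with the stated aim of keeping index information small.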
In some further optional implementations, the above method may further include the following steps:
Receiving a data rewrite instruction; taking the data file with the largest serial number or the smallest serial number under a preset target directory among the at least two directories as the starting point, and rewriting the valid data under the target directory in order of serial number to obtain rewritten data.
The data storage method may further include the following steps:
Receiving a data update instruction; taking the data file with the other of the largest and smallest serial numbers under the target directory as the starting point, and adding update data under the target directory in order of serial number.
In these optional implementations, the execution body may receive a data rewrite instruction and rewrite the valid data in order of serial number to obtain rewritten data. For example, there are 64 data files under a directory, with serial numbers 0 to 63. Data may be rewritten starting from file 0 under the target directory. Valid data is data other than invalid data. Invalid data may refer to data that has reached its expiration time, data that a message has indicated to be invalid, or data for which corresponding update data already exists. When data is updated, the old data is not deleted; only new data is added. For example, if the old data is X=1, the update adds update data X=2 and the old data X=1 is not deleted.
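The distinction between valid and invalid data can be sketched as follows. This is a minimal illustration only: the Record type, its expires_at and invalidated fields, and the function valid_records are hypothetical names, and the records are assumed to be listed in the order in which they were appended.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Record:
    key: str
    value: str
    expires_at: Optional[float] = None  # assumed expiry timestamp, if any
    invalidated: bool = False           # assumed flag set by an invalidation message


def valid_records(records: Iterable[Record], now: float) -> List[Record]:
    """Return the records a rewrite would keep, in append order."""
    log = list(records)
    # Position of the most recently appended record for each key.
    latest_pos = {rec.key: pos for pos, rec in enumerate(log)}
    kept = []
    for pos, rec in enumerate(log):
        superseded = latest_pos[rec.key] != pos  # update data exists for it
        expired = rec.expires_at is not None and rec.expires_at <= now
        if not (superseded or expired or rec.invalidated):
            kept.append(rec)
    return kept
```

With the example above, the appended record X=2 supersedes X=1, so only X=2 would be carried over by a rewrite.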
To ensure that rewriting data and updating data can both be carried out under the same directory without affecting each other, the update process takes the data file with the other of the largest and smallest serial numbers as its starting point. For example, rewriting starts from data file 0, and the next file to be rewritten is data file 1. Updating may start from data file 63, and the next file to which update data is written is data file 62.
As shown in Fig. 4, Fig. 4 illustrates update data and rewritten data within one directory, together with the update index and the rewrite index.
These further implementations place update data and rewritten data in different files so that the two do not interfere with each other. This removes the need to make rewriting and updating mutually exclusive in advance, and thus improves the efficiency of rewriting and updating data.
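The opposite starting points can be sketched as two iterators over the same range of serial numbers. The constant FILES_PER_DIR and the function file_sequence are hypothetical names; the count of 64 follows the example of data files 0 to 63 given above.

```python
from typing import Iterator

FILES_PER_DIR = 64  # matches the example of data files 0-63


def file_sequence(start_at_lowest: bool, count: int = FILES_PER_DIR) -> Iterator[int]:
    """Yield data file serial numbers from one end of the directory."""
    if start_at_lowest:
        yield from range(count)              # 0, 1, 2, ...
    else:
        yield from range(count - 1, -1, -1)  # 63, 62, 61, ...


# Rewriting walks upward from file 0; updating walks downward from file 63.
rewrite_files = file_sequence(start_at_lowest=True)
update_files = file_sequence(start_at_lowest=False)
```

Because the two sequences begin at opposite ends, the rewrite process and the update process write to different files for as long as the two cursors have not met.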
In some further optional implementations, rewriting the valid data under the target directory in order of serial number may include:
In response to determining that the current data file obtained by rewriting has reached a preset file size threshold, rewriting into the next data file after the current data file in order of serial number.
In these further implementations, the execution body rewrites data into the current data file. If the written data makes the file size reach the preset file size threshold, for example 2G, the execution body may switch to the data file with the adjacent serial number under the same directory and continue the rewrite process there. For example, if data is currently being rewritten into data file c, the next file to be rewritten may be data file d.
These further implementations can control the size of each data file under the directory, so that the data files obtained by rewriting are more uniform in size.
In some further optional implementations, adding update data under the target directory in order of serial number includes: in response to determining that the size of the current data file obtained by updating reaches the preset file size threshold, adding update data to the next data file after the current data file in order of serial number.
In these further implementations, the execution body writes update data into the current data file. When the data already written makes the file size reach the preset file size threshold, the execution body may switch to the data file with the adjacent serial number under the same directory and continue the update process there.
These further implementations can control the size of each data file under the directory, so that the data files obtained by updating are more uniform in size.
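The switch to the next data file on reaching the size threshold can be sketched as below. The class DirectionalWriter is a hypothetical, in-memory stand-in that only tracks file sizes rather than writing to disk, and the 8-byte threshold in the usage is chosen purely to keep the example small (the description mentions 2G as one possible threshold).

```python
from typing import Dict, Tuple


class DirectionalWriter:
    """Tracks which data file receives the next record (no real disk I/O)."""

    def __init__(self, start_fno: int, step: int, max_file_bytes: int) -> None:
        self.fno = start_fno              # current data file serial number
        self.step = step                  # +1 for rewriting, -1 for updating
        self.max_file_bytes = max_file_bytes
        self.sizes: Dict[int, int] = {}   # serial number -> bytes written so far

    def append(self, payload: bytes) -> Tuple[int, int]:
        """Account for one record and return its (fno, off) location."""
        # Move to the adjacent file once the current one has reached the threshold.
        if self.sizes.get(self.fno, 0) >= self.max_file_bytes:
            self.fno += self.step
        off = self.sizes.get(self.fno, 0)
        self.sizes[self.fno] = off + len(payload)
        return self.fno, off


rewriter = DirectionalWriter(start_fno=0, step=+1, max_file_bytes=8)
updater = DirectionalWriter(start_fno=63, step=-1, max_file_bytes=8)
```

The returned pair corresponds to the fno and off fields of the index information, so the same call could feed index construction.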
In some further optional implementations, the operation of adding update data under the target directory is performed before the rewriting of the valid data under the target directory is completed; or the operation of rewriting the valid data under the target directory is performed before the adding of update data under the target directory is completed.
In these further implementations, the execution times of rewriting data and updating data may overlap. That is, data updating and data rewriting can be carried out simultaneously under the same directory.
Because updating and rewriting start from different files, these further implementations allow data updating and data rewriting under the same directory to proceed at the same time. Carrying out the two kinds of data processing simultaneously shortens the time taken by data updating and data rewriting and improves data processing efficiency.
In some further optional implementations, the above method may further include: determining the index information of the update data and writing it to the target directory; and determining the index information of the rewritten data and writing it to the target directory.
In these further implementations, the execution body may determine the index information of the update data and write it to the target directory. The execution body may also determine the index information of the rewritten data and write it to the target directory. Here, the number of files holding index information may also be a preset quantity, which reduces the number of index files and prevents the files themselves from occupying excessive space.
In practice, the index information of the update data and the index information of the rewritten data may be written to different files. For example, the index information of the update data may be restricted to index file 0 through index file 63 under the target directory, and the index information of the rewritten data may be written to index file 64 through index file 127 under the target directory.
These further implementations determine index information separately for the rewritten data and the update data, so that the location of the data can be accurately determined from the index information of the rewritten data and the index information of the update data, respectively.
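The separation of index files in the example above can be sketched as two disjoint ranges of index file numbers. The names UPDATE_INDEX_FILES, REWRITE_INDEX_FILES, and index_file_for, and the notion of a per-kind slot, are assumptions for illustration; only the ranges 0 to 63 and 64 to 127 come from the example.

```python
# Index file numbers reserved for each kind of data, per the example above.
UPDATE_INDEX_FILES = range(0, 64)     # index file 0 through index file 63
REWRITE_INDEX_FILES = range(64, 128)  # index file 64 through index file 127


def index_file_for(kind: str, slot: int) -> int:
    """Map a hypothetical per-kind slot (0-63) to an index file number."""
    files = {"update": UPDATE_INDEX_FILES, "rewrite": REWRITE_INDEX_FILES}[kind]
    return files[slot]  # raises IndexError if the slot is out of range
```

Keeping the two ranges disjoint means the update index and the rewrite index never share a file, mirroring the separation of the data files themselves.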
In some further optional implementations, the above method may further include: migrating data under directories other than the target directory to the target directory.
In these further implementations, the execution body may perform data migration, moving data under the directories other than the target directory into the target directory. The target directory may be one fixed directory, or the role of target directory may rotate among the directories.
In practice, data migration may be performed before data is rewritten and updated. In this way, data rewriting and data updating need only be performed on the data in one directory, which avoids multi-threaded processing of multiple directories and simplifies data processing.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a data storage apparatus. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be applied to various electronic devices.
As shown in Fig. 5, the data storage apparatus 500 of this embodiment includes a determination unit 501 and an adding unit 502. The determination unit 501 is configured to determine the index information of the data in a target data set, where the target data set includes multiple pieces of data and each piece of data includes a key and a value; the adding unit 502 is configured to add the index information of each piece of data to the target data set.
In some embodiments, the determination unit 501 of the data storage apparatus 500 may determine the index information of each piece of data. The target data set is stored in the form of key-value pairs. A piece of index information indicates the location where a piece of data is stored, and the data it indicates can be found through the index information.
In some embodiments, the adding unit 502 may add the determined index information of each piece of data to the target data set. In this way, the target data set includes each piece of index information and the data it indicates.
In some optional implementations of this embodiment, the target data set belongs to a total data set, the total data set includes at least one data set, each data set is stored under at least two directories, the number of files under each directory is a preset quantity, and the preset quantity is less than or equal to a preset file quantity threshold.
In some optional implementations of this embodiment, determining the index information of the data in the target data set includes: composing the index information from the key and at least one of the following: the serial number of the data file in which the data is located, the length of the data, the offset address of the data, and the number of the directory corresponding to the data, where the serial number characterizes the ordering of the files under a directory.
In some optional implementations of this embodiment, the apparatus further includes: a scanning unit, configured to scan the index information of each piece of data in response to receiving an operation instruction for the target data set.
In some optional implementations of this embodiment, the adding unit is configured to write the index information of each piece of data into an index file under the directory in which the target data set is located, where corresponding index files and data files are stored under the directory.
In some optional implementations of this embodiment, the apparatus further includes: a memory writing unit, configured to write, into memory, the valid index information in the index file that indicates valid data.
In some optional implementations of this embodiment, the scanning unit is configured to scan the index information of each piece of data, in response to receiving an operation instruction for the target data set, as follows: in response to receiving one of the following instructions for the target data set, scanning the valid index information of each piece of data in memory: a query instruction, a modify instruction, a delete instruction, an add-data instruction, and a read instruction; in response to receiving a rewrite instruction for the target data set, scanning the valid index information of each piece of data in the index file and in memory.
In some optional implementations of this embodiment, the apparatus is applied to a target electronic device, the target data set is stored in the target electronic device, and the apparatus further includes: a persistence unit, configured to persist the valid index information in memory to disk in response to receiving a shutdown instruction for the target electronic device; and the apparatus further includes: a writing unit, configured to write the persisted valid index information into memory in response to receiving a startup instruction for the target electronic device.
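The persistence on shutdown and reload on startup can be sketched as follows. This is only one possible approach under stated assumptions: the in-memory index is taken to be a dictionary mapping string keys to [dir, fno, off, len] lists, JSON is used as the on-disk format, and the names persist_index and load_index are hypothetical. The application does not specify a serialization format.

```python
import json
import os


def persist_index(index: dict, path: str) -> None:
    """On shutdown: write the in-memory valid index information to disk."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(index, fh)
    # Replace the previous snapshot in one step so a partial write is never read.
    os.replace(tmp_path, path)


def load_index(path: str) -> dict:
    """On startup: read the persisted index information back into memory."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Reloading a persisted snapshot in this way avoids rescanning every key-value pair at startup, which is the benefit the abstract attributes to maintaining index information.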
In some optional implementations of this embodiment, the scanning unit is configured to scan the index information of each piece of data, in response to receiving an operation instruction for the target data set, as follows: in response to receiving an operation instruction for the target data set, scanning the index information of each piece of data in the index file.
In some optional implementations of this embodiment, each data set is stored under two directories.
In some optional implementations of this embodiment, the apparatus further includes: a first receiving unit, configured to receive a data rewrite instruction; a rewriting unit, configured to take the data file with the largest serial number or the smallest serial number under a preset target directory among the at least two directories as the starting point, and rewrite the valid data under the target directory in order of serial number to obtain rewritten data; a second receiving unit, configured to receive a data update instruction; and an updating unit, configured to take the data file with the other of the largest and smallest serial numbers under the target directory as the starting point, and add update data under the target directory in order of serial number.
In some optional implementations of this embodiment, the operation of adding update data under the target directory is performed before the rewriting of the valid data under the target directory is completed; or the operation of rewriting the valid data under the target directory is performed before the adding of update data under the target directory is completed.
In some optional implementations of this embodiment, the rewriting unit is further configured to rewrite the valid data under the target directory in order of serial number as follows: in response to determining that the size of the current data file obtained by rewriting reaches a preset file size threshold, rewriting into the next data file after the current data file in order of serial number.
In some optional implementations of this embodiment, the updating unit is further configured to add update data under the target directory in order of serial number as follows: in response to determining that the size of the current data file obtained by updating reaches the preset file size threshold, adding update data to the next data file after the current data file in order of serial number.
In some optional implementations of this embodiment, the apparatus further includes: a first index determination unit, configured to determine the index information of the update data and write it to the target directory; and a second index determination unit, configured to determine the index information of the rewritten data and write it to the target directory.
In some optional implementations of this embodiment, the apparatus further includes: a migration unit, configured to migrate data under directories other than the target directory to the target directory.
As shown in Fig. 6, the electronic device 600 may include a processing unit (for example, a central processing unit or a graphics processor) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600. The processing unit 601, the ROM 602, and the RAM 603 are connected to one another by a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: an input device 606; an output device 607 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; a storage device 608 including, for example, a magnetic tape and a hard disk; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate with other devices, wirelessly or by wire, to exchange data. Although Fig. 6 shows an electronic device 600 with various devices, it should be understood that it is not required to implement or have all of the devices shown. More or fewer devices may alternatively be implemented or provided. Each block shown in Fig. 6 may represent one device or, as needed, multiple devices.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing unit 601, the above functions defined in the method of the embodiments of the present disclosure are performed. It should be noted that the computer-readable medium of the embodiments of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In embodiments of the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted over any suitable medium, including but not limited to: an electric wire, an optical cable, RF (radio frequency), or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor, which may be described, for example, as: a processor including a determination unit and an adding unit. The names of these units do not, under certain circumstances, limit the units themselves; for example, the determination unit may also be described as "a unit that determines the index information of the data in a target data set".
As another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: determine the index information of the data in a target data set, where the target data set includes multiple pieces of data and each piece of data includes a key and a value; and add the index information of each piece of data to the target data set.
The above description is only of the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present application.

Claims (18)

1. A data storage method, comprising:
determining index information of data in a target data set, wherein the target data set includes multiple pieces of data, and each piece of data includes a key and a value;
adding the index information of each piece of data to the target data set.
2. The method according to claim 1, wherein the target data set belongs to a total data set, the total data set includes at least one data set, each data set is stored under at least two directories, the number of files under each directory is a preset quantity, and the preset quantity is less than or equal to a preset file quantity threshold.
3. The method according to claim 2, wherein the determining index information of data in a target data set comprises:
composing the index information from the key and at least one of the following: the serial number of the data file in which the data is located, the length of the data, the offset address of the data, and the number of the directory corresponding to the data, wherein the serial number characterizes the ordering of the files under a directory.
4. The method according to claim 1, wherein the method further comprises:
in response to receiving an operation instruction for the target data set, scanning the index information of each piece of data.
5. The method according to claim 1, wherein the adding the index information of each piece of data to the target data set comprises:
writing the index information of each piece of data into an index file under the directory in which the target data set is located, wherein corresponding index files and data files are stored under the directory.
6. The method according to claim 5, wherein the method further comprises:
writing, into memory, the valid index information in the index file that indicates valid data.
7. The method according to claim 6, wherein the scanning the index information of each piece of data in response to receiving an operation instruction for the target data set comprises:
in response to receiving one of the following instructions for the target data set, scanning the valid index information of each piece of data in memory: a query instruction, a modify instruction, a delete instruction, an add-data instruction, and a read instruction;
in response to receiving a rewrite instruction for the target data set, scanning the valid index information of each piece of data in the index file and in memory.
8. The method according to claim 7, wherein the method is applied to a target electronic device, the target data set is stored in the target electronic device, and the method further comprises:
in response to receiving a shutdown instruction for the target electronic device, persisting the valid index information in memory to disk; and
the method further comprises:
in response to receiving a startup instruction for the target electronic device, writing the persisted valid index information into the memory.
9. The method according to claim 2, wherein each data set is stored under two directories.
10. The method according to claim 2, wherein the method further comprises:
receiving a data rewrite instruction;
taking the data file with the largest serial number or the smallest serial number under a preset target directory among the at least two directories as the starting point, and rewriting the valid data under the target directory in order of serial number to obtain rewritten data;
receiving a data update instruction;
taking the data file with the other of the largest and smallest serial numbers under the target directory as the starting point, and adding update data under the target directory in order of serial number.
11. The method according to claim 10, wherein the operation of adding update data under the target directory is performed before the rewriting of the valid data under the target directory is completed; or
the operation of rewriting the valid data under the target directory is performed before the adding of update data under the target directory is completed.
12. The method according to claim 10, wherein the rewriting the valid data under the target directory in order of serial number comprises:
in response to determining that the size of the current data file obtained by rewriting reaches a preset file size threshold, rewriting into the next data file after the current data file in order of serial number.
13. The method according to claim 10, wherein the adding update data under the target directory in order of serial number comprises:
in response to determining that the size of the current data file obtained by updating reaches a preset file size threshold, adding update data to the next data file after the current data file in order of serial number.
14. The method according to claim 10, wherein the method further comprises:
determining the index information of the update data and writing it to the target directory;
determining the index information of the rewritten data and writing it to the target directory.
15. The method according to claim 10, wherein the method further comprises:
migrating data under directories other than the target directory to the target directory.
16. A data storage apparatus, comprising:
a determination unit, configured to determine index information of data in a target data set, wherein the target data set includes multiple pieces of data, and each piece of data includes a key and a value;
an adding unit, configured to add the index information of each piece of data to the target data set.
17. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-15.
18. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1-15.
CN201910261866.0A 2019-04-02 2019-04-02 Data storage method and device Active CN109947709B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910261866.0A CN109947709B (en) 2019-04-02 2019-04-02 Data storage method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910261866.0A CN109947709B (en) 2019-04-02 2019-04-02 Data storage method and device

Publications (2)

Publication Number Publication Date
CN109947709A true CN109947709A (en) 2019-06-28
CN109947709B CN109947709B (en) 2021-10-08

Family

ID=67013411

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910261866.0A Active CN109947709B (en) 2019-04-02 2019-04-02 Data storage method and device

Country Status (1)

Country Link
CN (1) CN109947709B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110505039A (en) * 2019-09-26 2019-11-26 北京达佳互联信息技术有限公司 A kind of data transfer control method, device, equipment and medium
CN110765076A (en) * 2019-10-25 2020-02-07 北京奇艺世纪科技有限公司 Data storage method and device, electronic equipment and storage medium
CN111241108A (en) * 2020-01-16 2020-06-05 北京百度网讯科技有限公司 Key value pair-based KV system indexing method and device, electronic equipment and medium
CN112491857A (en) * 2020-11-20 2021-03-12 北京人大金仓信息技术股份有限公司 Method, device and equipment for transmitting set type data
WO2022063059A1 (en) * 2020-09-23 2022-03-31 华为云计算技术有限公司 Data management method for key-value storage system and device thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080126298A1 (en) * 2006-11-23 2008-05-29 Samsung Electronics Co., Ltd. Apparatus and method for optimized index search
CN102323947A (en) * 2011-09-05 2012-01-18 东北大学 Generation method of pre-join table on ring-shaped schema database
CN103309950A (en) * 2013-05-22 2013-09-18 苏州雄立科技有限公司 Searching method for key value
CN104182508A (en) * 2014-08-19 2014-12-03 华为技术有限公司 Data processing method and data processing device
CN107704604A (en) * 2017-10-16 2018-02-16 中汇信息技术(上海)有限公司 A kind of information persistence method, server and computer-readable recording medium
CN108388569A (en) * 2018-01-09 2018-08-10 杭州电子科技大学 A kind of system and method for building up of quick key value database

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110505039A (en) * 2019-09-26 2019-11-26 北京达佳互联信息技术有限公司 Data transfer control method, device, equipment and medium
CN110505039B (en) * 2019-09-26 2022-04-01 北京达佳互联信息技术有限公司 Data transmission control method, device, equipment and medium
CN110765076A (en) * 2019-10-25 2020-02-07 北京奇艺世纪科技有限公司 Data storage method and device, electronic equipment and storage medium
CN111241108A (en) * 2020-01-16 2020-06-05 北京百度网讯科技有限公司 Key value pair-based KV system indexing method and device, electronic equipment and medium
CN111241108B (en) * 2020-01-16 2023-12-26 北京百度网讯科技有限公司 Key value based indexing method and device for KV system, electronic equipment and medium
WO2022063059A1 (en) * 2020-09-23 2022-03-31 华为云计算技术有限公司 Data management method for key-value storage system and device thereof
CN112491857A (en) * 2020-11-20 2021-03-12 北京人大金仓信息技术股份有限公司 Method, device and equipment for transmitting set type data

Also Published As

Publication number Publication date
CN109947709B (en) 2021-10-08

Similar Documents

Publication Publication Date Title
CN109947709A (en) Data storage method and device
CN105740048B (en) Mirror image management method, apparatus and system
KR100287137B1 (en) Method for managing version of portable information terminal
CN103927261B (en) Method and system for efficient allocation and reclamation of thin-provisioned storage
CN109032760A (en) Method and apparatus for application deployment
CN113641457B (en) Container creation method, device, apparatus, medium, and program product
CN105515872B (en) Method, apparatus and system for updating configuration information
CN106101256B (en) Method and apparatus for synchronizing data
CN103064637A (en) Network disk cache synchronizing method and system
CN110401724A (en) File management method, ftp server and storage medium
CN108846753A (en) Method and apparatus for handling data
CN110347651A (en) Cloud-storage-based data synchronization method, device, equipment and storage medium
JP7176209B2 (en) Information processing equipment
CN105740469B (en) Storage server and metadata access method
CN110019080A (en) Data access method and device
CN103049574B (en) Key-value file system and method for implementing dynamic file replicas
CN103067479A (en) Network disk synchronization method and system based on file hotness and coldness
CN108255989B (en) Picture storage method and device, terminal equipment and computer storage medium
CN109309734A (en) Method and device for transmitting data
CN109213604A (en) Method and device for managing data sources
CN109697019A (en) Method and system for writing data based on the FAT file system
CN104618445A (en) Method and device for arranging files based on cloud storage space
CN106528876B (en) Information processing method for a distributed system and distributed information processing system
CN114138558A (en) Object storage method and device, electronic equipment and storage medium
CN104378396B (en) Data management device and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 2019-06-28
Assignee: Beijing Intellectual Property Management Co., Ltd.
Assignor: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co., Ltd.
Contract record no.: X2023110000096
Denomination of invention: Data storage methods and devices
Granted publication date: 2021-10-08
License type: Common License
Record date: 2023-08-21