CN103902702A - Data storage system and data storage method - Google Patents

Info

Publication number
CN103902702A
Authority
CN
China
Prior art keywords
data
index
carrier store
entry
inquiry mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410126243.XA
Other languages
Chinese (zh)
Other versions
CN103902702B (en)
Inventor
韩明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING CHESHANGHUI SOFTWARE Co Ltd
Original Assignee
BEIJING CHESHANGHUI SOFTWARE Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING CHESHANGHUI SOFTWARE Co Ltd
Priority to CN201410126243.XA
Publication of CN103902702A
Application granted
Publication of CN103902702B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2228: Indexing structures
    • G06F16/2272: Management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data storage system and a data storage method, belonging to the technical field of database processing. The data storage method includes: constructing an index list according to the query modes for data entries in a first data store; storing every index in the index list in a second data store; dividing the first data store into multiple data regions and starting multiple first threads, wherein each first thread acquires data entries from one or more data regions of the first data store and stores them in the second data store; determining the one or more related data entries according to the query mode corresponding to each index, and acquiring the data entry identifier list related to that index; and storing the data entry identifier list related to each index in the second data store in association with the name of that index. The data storage system and method can increase the query efficiency for business data and increase the speed at which data entries and index data are written to the second data store.

Description

Data storage system and data storage method
Technical Field
The invention belongs to the technical field of database processing, and specifically relates to a data storage system and a data storage method.
Background
Traditional data storage and access is generally implemented on a relational database or on a distributed cache backed by a NoSQL database. In the relational database implementation, data is stored on the disk of the relational database, and the application program accesses the data through SQL statements. Because SQL syntax is very flexible, it supports complex queries under multiple sort orders and multiple grouping conditions, and can therefore support complex query services. However, a relational database must maintain data consistency and a large number of association relationships, and its data resides on hard disk, so data queries are slow; under highly concurrent access in particular, query efficiency is seriously affected.
In the NoSQL database implementation, data is stored in memory as key-value pairs. Because the data is held in memory, access is fast, and obtaining data through a standard API (application programming interface) is simple. However, this implementation cannot adequately support complex queries with multiple sort orders and multiple combined conditions, and therefore cannot support complex query services.
Each of the two implementations above thus has its strengths and weaknesses. How to improve data access efficiency while still supporting complex query services has become a technical problem in urgent need of a solution.
Summary of the Invention
In view of the above problems, the present invention is proposed to provide a data storage system and a data storage method that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a data storage method is provided, suitable for storing data entries from a first data store into a second data store, the method comprising:
Constructing an index list according to the query modes for the data entries in the first data store, each index in the index list corresponding to one query mode;
Storing each index of the index list in the second data store, wherein each index comprises an index name that identifies the index;
Dividing the first data store into multiple data regions and starting multiple first threads, each first thread being responsible for obtaining the data entries from one or more data regions of the first data store and storing the obtained data entries in the second data store, wherein, in the second data store, each data entry comprises a unique identifier of the data entry and the associated data content;
Determining the one or more associated data entries according to the query mode corresponding to each index, and determining the unique identifier of each such data entry in the second data store, to obtain the data entry identifier list associated with the index; and
Storing the data entry identifier list associated with each index in the second data store in association with the index name of that index.
Optionally, constructing the index list according to the query modes for the data entries in the first data store comprises:
Building a sort list according to the sort query modes for the data entries, and building a group list according to the grouping query modes for the data entries;
Taking the Cartesian product of the sort list and the group list to construct the index list.
Optionally, determining the one or more associated data entries according to the query mode corresponding to each index, determining the unique identifier of each data entry in the second data store, and obtaining the data entry identifier list associated with the index comprises:
Starting multiple second threads, each second thread being responsible for one or more sort query modes in the sort list and determining, from the first data store, the sorted data entry set corresponding to each sort query mode it is responsible for;
Starting multiple third threads, each third thread being responsible for one or more indexes in the index list, determining, from the sorted data entry set corresponding to each index it is responsible for, the one or more data entries associated with that index, and determining the unique identifier of each such data entry in the second data store, to obtain the data entry identifier list associated with the index.
Optionally, determining, from the sorted data entry set corresponding to the index, the one or more data entries associated with the index comprises:
Dividing the sorted data entry set into multiple data blocks and starting multiple fourth threads, each fourth thread being responsible for determining, from one or more data blocks, the one or more data entries associated with the index.
Optionally, the index name comprises a sort query mode identifier and a grouping query mode identifier;
In the second data store, the unique identifier of each data entry is the unique identifier of that data entry in the first data store, and the data content is the content obtained by serializing the fields of the data entry.
Optionally, the index name further comprises a data key value;
In the second data store, the unique identifier of each data entry further comprises the data key value, and the data content is the content obtained by serializing the fields of the data entry that correspond to the data key value;
Wherein the data key value identifies the key value information, and the key value is one or more field names of the data entry.
Optionally, the method further comprises:
Obtaining a query request from an application server, the query request comprising a query mode and a key value, wherein the key value is one or more field names of the data entry;
Determining the index name according to the query mode, looking up in the second data store the data entry identifier list associated with the index name, and obtaining the associated data content from the second data store according to the identifier list found;
Deserializing the obtained data content, extracting from the deserialized content the content corresponding to the key value, and returning it to the application server.
Optionally, the method further comprises:
Obtaining a query request from an application server, the query request comprising a query mode and a key value;
Determining the index name according to the query mode and the key value, looking up in the second data store the data entry identifier list associated with the index name, and obtaining the associated data content from the second data store according to the identifier list found;
Deserializing the obtained data content and returning the deserialized content to the application server.
Optionally, the data entries in the first data store are stored in relational database form; and
In the second data store, each index is stored as a key-value pair with the index name as the key and the associated data entry identifier list as the value, and each data entry is stored as a key-value pair with its unique identifier as the key and the associated data content as the value.
According to another aspect of the present invention, a data storage system is provided, suitable for storing data entries from a first data store into a second data store, the system comprising:
An index construction unit, adapted to construct an index list according to the query modes for the data entries in the first data store, each index in the index list corresponding to one query mode;
An index storage unit, adapted to store each index of the index list in the second data store, wherein each index comprises an index name that identifies the index;
A data entry storage unit, adapted to divide the first data store into multiple data regions and start multiple first threads, each first thread being responsible for obtaining the data entries from one or more data regions of the first data store and storing the obtained data entries in the second data store, wherein, in the second data store, each data entry comprises a unique identifier of the data entry and the associated data content; and
An association unit, adapted to determine the one or more associated data entries according to the query mode corresponding to each index, and to determine the unique identifier of each such data entry in the second data store, to obtain the data entry identifier list associated with the index;
Wherein the index storage unit is configured to store the data entry identifier list associated with each index in the second data store in association with the index name of that index.
Optionally, the index construction unit is configured to:
Build a sort list according to the sort query modes for the data entries, and build a group list according to the grouping query modes for the data entries;
Take the Cartesian product of the sort list and the group list to construct the index list.
Optionally, the association unit is configured to:
Start multiple second threads, each second thread being responsible for one or more sort query modes in the sort list and determining, from the first data store, the sorted data entry set corresponding to each sort query mode it is responsible for;
Start multiple third threads, each third thread being responsible for one or more indexes in the index list, determining, from the sorted data entry set corresponding to each index it is responsible for, the one or more data entries associated with that index, and determining the unique identifier of each such data entry in the second data store, to obtain the data entry identifier list associated with the index.
Optionally, the association unit is configured to:
Divide the sorted data entry set into multiple data blocks and start multiple fourth threads, each fourth thread being responsible for determining, from one or more data blocks, the one or more data entries associated with the index.
Optionally, the index name comprises a sort query mode identifier and a grouping query mode identifier;
In the second data store, the unique identifier of each data entry is the unique identifier of that data entry in the first data store, and the data content is the content obtained by serializing the fields of the data entry.
According to yet another aspect of the present invention, a data query system is provided, comprising a first data store, a data distribution server and a second data store, wherein the data distribution server comprises the data storage system described above.
According to one or more of the above technical solutions of the present invention, offline computation replaces synchronous computation, and the advantages of a traditional relational database and a NoSQL database are combined. This both satisfies the functional requirement of running complex queries on business data and solves the low query efficiency of a purely relational approach, improving data query efficiency particularly in highly concurrent environments. Parallel computing is also used during offline computation: the data entries in the first data store and the corresponding index data are written to the second data store in parallel, which substantially increases the data write speed.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief Description of the Drawings
Fig. 1 shows a schematic structural diagram of a data query system according to an embodiment of the present invention;
Fig. 2 shows a schematic structural diagram of a data storage system according to an embodiment of the present invention;
Fig. 3 shows a schematic flow chart of a data storage method according to an embodiment of the present invention; and
Fig. 4 shows a schematic diagram of the block-parallel computation of data and indexes in an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
Fig. 1 shows a schematic structural diagram of a data query system according to an embodiment of the present invention. With reference to Fig. 1, the data query system of the embodiment may comprise a first data store 100, a data distribution server 20 and a second data store 300. The data distribution server 20 is communicatively connected to the first data store 100 and to the second data store 300, and it contains the data storage system 200, which resides in the data distribution server 20.
The data storage system 200 can obtain each data entry from the first data store 100 and store the obtained data entries in the second data store 300, wherein, in the second data store 300, each data entry comprises a unique identifier of the data entry and the associated data content. The data storage system 200 can also construct an index list according to the various query modes for the data entries in the first data store 100 (each index in the index list corresponding to one query mode) and store each index of the index list in the second data store 300, wherein, in the second data store 300, each index comprises an index name that identifies the index and the associated data entry identifier list.
The data storage system 200 may also use parallel computing to store the data entries of the first data store 100 in the second data store 300, and/or use parallel computing to store each index of the index list in the second data store 300, which increases the speed at which data and indexes are written to the second data store.
Through the above processing, the data storage system 200 in effect completes in advance the queries on the data entries of the first data store 100 under the various query modes and stores the query results in the second data store 300, so that offline computation replaces traditional synchronous computation. When the application server 400 needs to query data entries of the first data store 100, it can obtain the query result directly from the query interface provided by the second data store 300; because nothing has to be computed in real time for the specific query mode, data query speed improves. Specifically, when the query interface receives a query request sent by the application server 400, it first determines the index name from the query mode information carried in the request, then obtains the corresponding data entry identifier list from the second data store 300 by index name, and finally obtains the corresponding data entry set from the second data store 300 according to that identifier list.
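The three lookup steps described above can be sketched as follows. This is only an illustrative sketch: plain Python dictionaries stand in for the second data store, and the index naming scheme, the store contents and the function names are hypothetical, not taken from the source.

```python
from typing import Dict, List

# Hypothetical in-memory stand-ins for the second data store (key-value).
index_store: Dict[str, List[int]] = {
    # index name -> precomputed, ordered data entry identifier list
    "order:price_asc|group:brand_x": [3, 1, 2],
}
entry_store: Dict[int, str] = {1: "b,20", 2: "c,30", 3: "a,10"}


def index_name_for(order_id: str, group_id: str) -> str:
    """Step 1: derive the index name from the query mode (assumed naming scheme)."""
    return f"order:{order_id}|group:{group_id}"


def query(order_id: str, group_id: str) -> List[str]:
    """Answer a query using only results precomputed in the second data store."""
    name = index_name_for(order_id, group_id)
    ids = index_store.get(name, [])        # step 2: identifier list by index name
    return [entry_store[i] for i in ids]   # step 3: data content by identifier


result = query("price_asc", "brand_x")
```

No sorting or filtering happens at query time; the order of the returned entries is whatever order was stored in the identifier list during offline computation.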
The first data store 100 can be a relational database, such as Oracle, DB2, Microsoft SQL Server or MySQL; the second data store 300 can be a NoSQL database of the key-value type, for example a Redis database. In that case the data query system combines the advantages of a traditional relational database and a NoSQL database: it both satisfies the functional requirement of running complex queries on business data and solves the low query efficiency of a purely relational approach, improving data query efficiency particularly in highly concurrent environments. Redis is a key-value storage system whose supported value types include string, list (linked list), set, sorted set and hash. These data types support operations such as push/pop, add/remove, and intersection, union and difference, and these operations are all atomic.
The specific composition and working principle of the data storage system 200 in the above data query system are described in detail below.
Fig. 2 shows a schematic structural diagram of a data storage system according to an embodiment of the present invention. The data storage system 200 can reside in the data distribution server 20, which is communicatively connected to the first data store 100 and to the second data store 300. Through the data storage system 200, data entries from the first data store 100 can be stored in the second data store 300, thereby completing the distribution of the data. The first data store 100 can be a relational database, such as Oracle, DB2, Microsoft SQL Server or MySQL; the second data store 300 can be a NoSQL database of the key-value type, for example a Redis database.
With reference to Fig. 2, the data storage system 200 of the embodiment may comprise an index construction unit 210, an index storage unit 220, a data entry storage unit 230 and an association unit 240.
Considering how business data is actually used, the set of data query modes used in practice is limited. The index construction unit 210 can therefore construct an index list according to the various query modes for the data entries in the first data store 100; that is, the index list comprises multiple indexes, each corresponding to one query mode.
The query modes for data entries generally comprise sort query modes (order by), grouping query modes (group by), and combinations of the two. The number of order by and group by modes is limited, and order by is mostly used together with group by. The index construction unit 210 can therefore first build a sort list according to the sort query modes for the data entries and a group list according to the grouping query modes for the data entries, and then take the Cartesian product of the sort list and the group list to construct the index list. The query modes for data entries can of course also be other query modes of the prior art, or even query modes that may appear in the future; the embodiment of the present invention does not limit this.
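The Cartesian-product construction can be illustrated with a short sketch. The sort and grouping mode names below are hypothetical placeholders; the source does not enumerate concrete modes.

```python
from itertools import product

# Hypothetical query modes, chosen only for illustration.
sort_list = ["price_asc", "price_desc", "date_desc"]   # sort query modes
group_list = ["brand", "city"]                          # grouping query modes

# One index per (sort mode, grouping mode) combination.
index_list = [(order_id, group_id) for order_id, group_id in product(sort_list, group_list)]
```

With three sort modes and two grouping modes the index list holds six indexes, which is why the approach depends on the number of query modes being limited.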
After the index construction unit 210 has constructed the index list, the index storage unit 220 can store each index of the index list in the second data store 300, wherein each index comprises an index name that identifies the index and the data entry identifier list associated with the index. In a specific implementation, the index storage unit 220 can first store the index name in the second data store 300 as the key, with the corresponding value temporarily empty. Later, once the association unit 240 has determined the data entry identifier list corresponding to the index name, that identifier list is stored in the second data store 300 as the value, in association with the corresponding index name. For a Redis database, for example, the value can use the sorted set type, so that the data entry identifiers in the stored identifier list are ordered, their order being determined by the query mode corresponding to the index.
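The two-phase storage of an index can be sketched as below. A dictionary of member-to-score mappings is used here as a stand-in for a sorted set; using the rank as the score is an assumption made for illustration, and all function names are hypothetical.

```python
from typing import Dict, List

# Stand-in for the second data store: index name -> {identifier: score}.
second_store: Dict[str, Dict[int, float]] = {}


def register_index(index_name: str) -> None:
    """Phase 1: store the index name as a key whose value is still empty."""
    second_store.setdefault(index_name, {})


def attach_identifier_list(index_name: str, ordered_ids: List[int]) -> None:
    """Phase 2: store the identifiers, using each one's rank as its score."""
    second_store[index_name] = {
        entry_id: float(rank) for rank, entry_id in enumerate(ordered_ids)
    }


def read_identifier_list(index_name: str) -> List[int]:
    """Return identifiers in score order, as a sorted-set range read would."""
    members = second_store[index_name]
    return sorted(members, key=lambda entry_id: members[entry_id])


register_index("o1:g1")
empty_at_first = dict(second_store["o1:g1"])
attach_identifier_list("o1:g1", [42, 7, 19])
```

Storing the rank as the score keeps the order produced by the query mode, regardless of the numeric order of the identifiers themselves.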
In one implementation, the index name comprises a sort query mode identifier (orderID) and a grouping query mode identifier (groupID). In another implementation, the index name comprises a data key value, a sort query mode identifier and a grouping query mode identifier. The data key value identifies the key value information, and the key value is one or more field names of the data entry. In a specific implementation, the data key value can be the key value itself or a value obtained by encoding the key value; the embodiment of the present invention does not limit the specific encoding.
For example, suppose a data entry has 4 fields named A, B, C and D, and the key value of the user's query is {A, C}. The data key value can then be {A, C} itself, or a value obtained by encoding {A, C}, for example 2.
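One possible way to form such an index name is sketched below. The source leaves the encoding open, so the code table that maps {A, C} to 2, the colon-separated name layout and the function names are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical code table: each key value (a set of field names) gets a small integer.
KEY_CODES: Dict[FrozenSet[str], int] = {
    frozenset({"A", "B", "C", "D"}): 1,
    frozenset({"A", "C"}): 2,
}


def data_key_value(fields: Iterable[str], encode: bool = True) -> str:
    """Return the data key value: either the encoded form or the key value itself."""
    names = frozenset(fields)
    if encode:
        return str(KEY_CODES[names])
    return "{" + ",".join(sorted(names)) + "}"


def index_name(fields: Iterable[str], order_id: str, group_id: str, encode: bool = True) -> str:
    """Combine data key value, orderID and groupID into one index name (assumed layout)."""
    return f"{data_key_value(fields, encode)}:{order_id}:{group_id}"


encoded_name = index_name(["A", "C"], "o1", "g1")
plain_name = index_name(["C", "A"], "o1", "g1", encode=False)
```

Using a frozenset makes the lookup independent of the order in which the field names are given.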
The data entry storage unit 230 is adapted to obtain each data entry from the first data store 100 and store the obtained data entries in the second data store 300, wherein, in the second data store 300, each data entry comprises a unique identifier of the data entry and the associated data content. For a Redis database, for example, each data entry is stored with its unique identifier as the key and the associated data content as the value.
To increase data processing speed, the data entry storage unit 230 can divide the first data store 100 into multiple data regions and start multiple first threads. Each first thread is responsible for obtaining the data entries from one or more data regions of the first data store 100 and storing them in the second data store 300, so that the multiple first threads write the data entries of the first data store 100 to the second data store 300 in parallel.
In a specific implementation, the data entry storage unit 230 can first judge whether the number of data entries stored in the first data store 100 is greater than a threshold, and then decide according to the result whether to use multithreaded processing. When the number of data entries is greater than the threshold, multithreaded processing is used; otherwise it need not be used. The threshold can be determined from experience and testing.
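The region-based parallel copy and the threshold check can be sketched together as follows. A list and a dictionary stand in for the first and second data stores, and the threshold, region size and worker count are hypothetical values; the source only says the threshold is set from experience and testing.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Entry = Tuple[int, str]  # (unique identifier, serialized data content)

THRESHOLD = 4     # hypothetical value
REGION_SIZE = 3   # hypothetical number of entries per data region


def copy_region(region: List[Entry], target: Dict[int, str]) -> int:
    """Work of one 'first thread': write every entry of a region to the second store."""
    for entry_id, content in region:
        target[entry_id] = content  # regions hold distinct identifiers
    return len(region)


def publish_entries(source: List[Entry], target: Dict[int, str]) -> int:
    """Copy all entries; use several threads only above THRESHOLD entries."""
    if len(source) <= THRESHOLD:
        return copy_region(source, target)
    regions = [source[i:i + REGION_SIZE] for i in range(0, len(source), REGION_SIZE)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        return sum(pool.map(lambda region: copy_region(region, target), regions))


first_store = [(i, f"content-{i}") for i in range(1, 11)]
second_store: Dict[int, str] = {}
copied = publish_entries(first_store, second_store)
```

In a real deployment each worker would read its region from the relational database and write to the key-value store over its own connection; the sketch only shows how the work is divided.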
When the index name comprises a sort query mode identifier and a grouping query mode identifier, the unique identifier of each data entry in the second data store 300 is the unique identifier of that data entry in the first data store 100, and the associated data content is the content obtained by serializing the fields of the data entry. For example, when the first data store 100 is a relational database, the unique identifier can be the primary key of the data entry.
Serializing the fields of a data entry means either of the following. In the first form, the value of each field of the data entry is taken in turn, each field value becomes one sequence item, and the items are combined into a sequence with a separator, for example a comma, between them. In the second form, the value of each field is taken in turn, the field name together with the field value becomes one sequence item, a separator, for example a colon, is placed between the field name and the field value, and a separator, for example a comma, is placed between sequence items.
For example, suppose a data entry has 4 fields named A, B, C and D, with field values a, b, c and d respectively. The content obtained after serialization is {a, b, c, d} in the first form, or {A:a, B:b, C:c, D:d} in the second form.
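Both serialization forms, and the matching deserialization of the second form, can be sketched as follows. The function names are hypothetical, and the sketch assumes that field names and values contain neither separator character.

```python
from typing import Dict

SEP = ","        # separator between sequence items
FIELD_SEP = ":"  # separator between a field name and its value


def serialize_values(entry: Dict[str, str]) -> str:
    """First form: field values only, in field order."""
    return SEP.join(entry.values())


def serialize_named(entry: Dict[str, str]) -> str:
    """Second form: name and value joined by FIELD_SEP, items joined by SEP."""
    return SEP.join(f"{name}{FIELD_SEP}{value}" for name, value in entry.items())


def deserialize_named(text: str) -> Dict[str, str]:
    """Invert serialize_named (assumes no separator characters inside names or values)."""
    return dict(item.split(FIELD_SEP, 1) for item in text.split(SEP))


entry = {"A": "a", "B": "b", "C": "c", "D": "d"}
values_only = serialize_values(entry)
named = serialize_named(entry)
```

The second form is self-describing, which allows individual fields to be picked out after deserialization; the first form is shorter but relies on a fixed field order.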
When the index name comprises a data key value, a sort query mode identifier and a grouping query mode identifier, the unique identifier of each data entry in the second data store 300 is the unique identifier of that data entry in the first data store 100, and the associated data content is the content obtained by serializing the fields of the data entry that correspond to the data key value. For example, when the first data store 100 is a relational database, the unique identifier can be the primary key of the data entry.
For example, suppose a data entry has 4 fields named A, B, C and D, with field values a, b, c and d respectively, and the data key value is {A, C}. The content obtained after serialization is {a, c}, or {A:a, C:c}.
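Restricting serialization to the fields named by the key value can be sketched as below; the function names and separator choices are assumptions for illustration.

```python
from typing import Dict, Sequence


def serialize_values_for_key(entry: Dict[str, str], key_fields: Sequence[str]) -> str:
    """Serialize only the values of the fields named in the key value."""
    return ",".join(entry[name] for name in key_fields)


def serialize_named_for_key(entry: Dict[str, str], key_fields: Sequence[str]) -> str:
    """Serialize name:value pairs only for the fields named in the key value."""
    return ",".join(f"{name}:{entry[name]}" for name in key_fields)


entry = {"A": "a", "B": "b", "C": "c", "D": "d"}
values_only = serialize_values_for_key(entry, ["A", "C"])
named = serialize_named_for_key(entry, ["A", "C"])
```

Storing only the requested fields keeps the stored content small, at the cost of one stored copy per key value.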
The association unit 240 is adapted to determine the one or more associated data entries according to the query mode corresponding to each index, and to determine the unique identifier of each such data entry in the second data store 300, to obtain the data entry identifier list associated with the index.
Each index corresponds to one query mode. For each query mode, the association unit 240 can construct the corresponding query statement and obtain the corresponding data entries from the first data store 100. For example, when the first data store 100 is a relational database, the association unit 240 can construct an SQL statement according to the query mode and run it against the first data store 100 to obtain the data entries corresponding to that query mode. An example SQL statement is:
SELECT * FROM table_name GROUP BY B ORDER BY A
After the association unit 240 has obtained the one or more data entries associated with the index, it determines the unique identifier of each data entry, for example the primary key of the data entry, and thus obtains the data entry identifier list associated with the index. The index storage unit 220 can then store the data entry identifier list associated with each index in the second data store 300 in association with the index name of that index. In this way, each index stored in the second data store is associated with the set of data entries corresponding to that index.
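Building a query statement per query mode and collecting the resulting primary keys can be sketched with an in-memory SQLite database. The table, its columns and its rows are hypothetical, and the grouping query mode is interpreted here as selecting the entries of one group with a WHERE clause, which is an assumption that differs from the GROUP BY form of the example statement above.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, price INTEGER, brand TEXT)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?)",
    [(1, 300, "x"), (2, 100, "x"), (3, 200, "y"), (4, 50, "x")],
)

ALLOWED_COLUMNS = {"price", "brand"}  # column names cannot be bound as parameters


def identifier_list(db: sqlite3.Connection, order_column: str,
                    group_column: str, group_value: str) -> List[int]:
    """Run the statement for one query mode and return the ordered primary keys."""
    if order_column not in ALLOWED_COLUMNS or group_column not in ALLOWED_COLUMNS:
        raise ValueError("unknown column")
    sql = f"SELECT id FROM cars WHERE {group_column} = ? ORDER BY {order_column}, id"
    return [row[0] for row in db.execute(sql, (group_value,))]


ids_brand_x_by_price = identifier_list(conn, "price", "brand", "x")
```

Only the identifiers are selected, since the data content is stored separately under each identifier in the second data store.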
For improving data processing speed, associative cell 240 can start multiple the second threads, each the second thread is responsible for the one or more sequence inquiry modes in sorted lists, determines the sorting data entry set corresponding with responsible sequence inquiry mode from the first data-carrier store 100; And start multiple the 3rd threads, each the 3rd thread is responsible for the one or more index in index, from the sorting data entry set corresponding with responsible index, determine the one or more data entries that are associated with this index, and determine the unique identification of each data entry in the second data-carrier store, obtain the data strip order identification list being associated with this index.
For further improving data processing speed, associative cell 240 can also be divided into sorting data entry set multiple data blocks, start multiple the 4th threads, each the 4th thread is responsible for determining from one or more data blocks the one or more data entries that are associated with this index.In the time of specific implementation, associative cell 240 can first judge whether the data strip object quantity in sorting data entry set is greater than threshold value, then determines whether to carry out multithreading processing according to judged result.In the time that data strip object quantity is greater than threshold value, carry out described multithreading processing, otherwise, can not carry out described multithreading processing.Wherein, described threshold value can be determined with test based on experience value.
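The threshold-gated block processing described above can be sketched as follows. This is an illustrative sketch only: the names select_associated, split_into_blocks and THRESHOLD, the threshold value, and the use of a predicate to decide whether an entry is associated with an index are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

THRESHOLD = 1000  # hypothetical value; the text says it is set from experience and testing

def split_into_blocks(entries, block_count):
    """Divide a sorted entry set into at most block_count contiguous blocks."""
    size = max(1, -(-len(entries) // block_count))  # ceiling division
    return [entries[i:i + size] for i in range(0, len(entries), size)]

def select_associated(entries, belongs_to_index, block_count=4, threshold=THRESHOLD):
    """Return the entries associated with an index, preserving the sorted order."""
    if len(entries) <= threshold:
        # Small set: a single thread processes all entries.
        return [e for e in entries if belongs_to_index(e)]
    blocks = split_into_blocks(entries, block_count)
    with ThreadPoolExecutor(max_workers=block_count) as pool:
        # Each worker plays the role of a "fourth thread" handling one block.
        parts = list(pool.map(lambda block: [e for e in block if belongs_to_index(e)], blocks))
    # pool.map yields results in block order, so the sort order is kept.
    return [e for part in parts for e in part]
```

Because the blocks are contiguous and their results are concatenated in order, the ordering established by the sort query mode survives the parallel step.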
The data storage system 200 according to the embodiment of the present invention replaces synchronous computation with offline computation, which effectively improves data query efficiency. For example, a query interface may be configured on the second data store 300; the application server 400 then does not need to access the first data store 100 and compute results in real time, but obtains the data entries corresponding to a query request directly from the query interface. This avoids complex query operations on the data entries in the first data store 100 and increases query speed.
As described above, in one implementation the index name stored in the second data store 300 comprises a sort query mode identifier and a grouping query mode identifier; in another implementation the index name comprises a data key value, a sort query mode identifier and a grouping query mode identifier.
Corresponding to these two implementations, the query interface can obtain the queried data in two ways. In one implementation, the query interface obtains data as follows:
Obtain a query request from the application server 400, the query request comprising a query mode and a key value;
Determine the index name according to the query mode, look up the data entry identifier list associated with the index name in the second data store 300, and obtain the associated data content from the second data store 300 according to the identifier list found;
Deserialize the obtained data content, extract from the deserialized content the content corresponding to the key value, and return it to the application server 400.
Deserialization is the inverse of serialization; a person skilled in the art can readily determine its specific implementation from the serialization process described above.
In the other implementation, the query interface obtains data as follows:
Obtain a query request from the application server 400, the query request comprising a query mode and a key value;
Determine the index name according to the query mode and the key value, look up the data entry identifier list associated with the index name in the second data store 300, and obtain the associated data content from the second data store 300 according to the identifier list found;
Deserialize the obtained data content and return the deserialized content to the application server 400.
One difference between the two implementations is as follows. In the first implementation, the deserialized content is the content of the entire data entry, so the content corresponding to the key value must still be extracted from it before being returned to the application server. In the second implementation, the deserialized content is already the part of the data entry that corresponds to the key value and can be returned to the application server directly.
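The two query paths can be contrasted in a small sketch. Plain dictionaries stand in for the second data store, JSON stands in for the serialization format, and the index name strings and the composite (key value, identifier) data keys are hypothetical layouts chosen only for illustration.

```python
import json

def query_full_entries(store, query_mode, key_fields):
    """First implementation: values hold whole entries, so project after deserializing."""
    result = []
    for entry_id in store["index"][query_mode]:
        entry = json.loads(store["data"][entry_id])
        result.append({name: entry[name] for name in key_fields})
    return result

def query_projected_entries(store, query_mode, key_fields):
    """Second implementation: the index name includes the key value; values are pre-projected."""
    key_value = ",".join(key_fields)
    index_name = f"{key_value}|{query_mode}"
    return [json.loads(store["data"][(key_value, entry_id)])
            for entry_id in store["index"][index_name]]

# Hypothetical contents of the second data store for each implementation.
full_store = {
    "index": {"A desc|B": [2, 1]},
    "data": {1: json.dumps({"A": 1, "B": "x", "C": 10}),
             2: json.dumps({"A": 5, "B": "y", "C": 20})},
}
projected_store = {
    "index": {"A,C|A desc|B": [2, 1]},
    "data": {("A,C", 1): json.dumps({"A": 1, "C": 10}),
             ("A,C", 2): json.dumps({"A": 5, "C": 20})},
}

first = query_full_entries(full_store, "A desc|B", ["A", "C"])
second = query_projected_entries(projected_store, "A desc|B", ["A", "C"])
```

Both calls return the same rows; the second path does less work per query at the cost of storing one projected copy per key value.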
Fig. 3 shows a schematic flowchart of a data storage method according to an embodiment of the present invention. The data storage method is adapted to store data entries from a first data store into a second data store, where the first data store may be a relational database such as Oracle, DB2, Microsoft SQL Server or MySQL, and the second data store may be a key-value NoSQL database such as a Redis database.
Referring to Fig. 3, the data storage method may comprise:
Step S310: construct an index set according to the query modes for the data entries in the first data store, each index in the index set corresponding to one query mode;
Query modes for data entries generally comprise sort query modes (order by), grouping query modes (group by), and combinations of the two. The number of order by and group by clauses is limited, and in most cases order by is used together with group by. Therefore, a sort list is first built from the sort query modes for the data entries and a grouping list from the grouping query modes; the Cartesian product of the sort list and the grouping list is then taken to construct the index set. Of course, the query modes for data entries may also be other query modes known in the prior art, or even query modes that may appear in the future; the embodiment of the present invention is not limited in this respect.
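Constructing the index set as a Cartesian product can be sketched as follows. The entries of sort_list and group_list here are placeholder values chosen for illustration, not the lists used in the application example.

```python
from itertools import product

# Hypothetical sort and grouping query modes.
sort_list = ["Price desc", "DTime desc"]
group_list = ["BrandId", "SeriesId", "SpecId"]

# One index per (sort query mode, grouping query mode) combination.
index_set = list(product(sort_list, group_list))
```

The size of the index set is the product of the two list lengths, so the number of precomputed indexes grows multiplicatively as query modes are added.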
Step S320: store each index of the index set in the second data store;
In the second data store, each index comprises an index name identifying the index and the data entry identifier list associated with the index. In this step, the index name may first be stored in the second data store as the key, with the corresponding value temporarily empty. In subsequent steps, once the data entry identifier list corresponding to the index name has been determined, that list is stored in the second data store as the value associated with the index name. For example, for a Redis database the value may be of the sorted set type, so that the data entry identifiers in the stored list are ordered, the order being determined by the query mode corresponding to the index.
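The two-phase storage of an index (name first, identifier list later) can be sketched with an in-memory stand-in for the second data store. The class IndexStore and its methods are hypothetical; using the rank of each identifier as its score is one assumed way of preserving order in a sorted-set-like value.

```python
class IndexStore:
    """In-memory stand-in for the index part of the second data store."""

    def __init__(self):
        self._lists = {}

    def register(self, index_name):
        # Step S320: store the index name as key; the value is empty for now.
        self._lists.setdefault(index_name, [])

    def fill(self, index_name, ordered_ids):
        # Later step: attach the identifier list, scored by rank to keep the order.
        self._lists[index_name] = [(rank, entry_id) for rank, entry_id in enumerate(ordered_ids)]

    def ids(self, index_name):
        # Read the identifiers back in score order.
        return [entry_id for _, entry_id in sorted(self._lists[index_name])]

store = IndexStore()
store.register("orderID1|groupID2")
before = store.ids("orderID1|groupID2")
store.fill("orderID1|groupID2", [42, 7, 19])
after = store.ids("orderID1|groupID2")
```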
In one implementation, the index name comprises a sort query mode identifier (orderID) and a grouping query mode identifier (groupID). In another implementation, the index name comprises a data key value, a sort query mode identifier and a grouping query mode identifier. The data key value identifies key value information, the key value being one or more field names of a data entry. In a specific implementation, the data key value may be the key value itself or a value obtained by encoding the key value; the embodiment of the present invention does not limit the specific encoding.
Step S330: obtain each data entry from the first data store and store the obtained data entries in the second data store;
In the second data store, each data entry comprises the unique identifier of the data entry and the associated data content. For example, for a Redis database, each data entry is stored with its unique identifier as the key and the associated data content as the value.
To improve data processing speed, the first data store may be divided into multiple data regions and multiple first threads started. Each first thread is responsible for obtaining the data entries from one or more data regions of the first data store and storing them in the second data store, so that the first threads write the data entries of the first data store into the second data store in parallel.
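The region-by-region parallel copy can be sketched as below. Dictionaries stand in for both data stores, regions are assumed to be contiguous ranges of integer primary keys, and the names copy_region and copy_in_parallel are hypothetical. The sketch assumes a non-empty first data store.

```python
import json
from concurrent.futures import ThreadPoolExecutor

def copy_region(first_store, id_range, second_store):
    """One 'first thread': copy every entry whose id falls in id_range."""
    for entry_id in id_range:
        row = first_store.get(entry_id)
        if row is not None:
            # Key: unique identifier; value: serialized data content.
            second_store[entry_id] = json.dumps(row)

def copy_in_parallel(first_store, region_size, second_store, workers=4):
    """Split the id space into regions and copy them concurrently; return the region count."""
    low, high = min(first_store), max(first_store)
    regions = [range(start, min(start + region_size, high + 1))
               for start in range(low, high + 1, region_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Regions are disjoint, so each worker writes a distinct set of keys.
        list(pool.map(lambda region: copy_region(first_store, region, second_store), regions))
    return len(regions)

first_store = {i: {"id": i, "value": i * i} for i in range(1, 11)}
second_store = {}
region_count = copy_in_parallel(first_store, region_size=3, second_store=second_store)
```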
In a specific implementation, it may first be judged whether the number of data entries stored in the first data store is greater than a threshold, and whether to use multithreaded processing is decided according to the result: when the number of data entries is greater than the threshold, multithreaded processing is performed; otherwise it may be skipped. The threshold may be determined from experience and testing.
When the index name comprises a sort query mode identifier and a grouping query mode identifier, the unique identifier of each data entry in the second data store is that entry's unique identifier in the first data store, and the associated data content is the result of serializing every field of the data entry. For example, when the first data store is a relational database, the unique identifier may be the primary key of the data entry.
When the index name comprises a data key value, a sort query mode identifier and a grouping query mode identifier, the unique identifier of each data entry in the second data store is that entry's unique identifier in the first data store, and the associated data content is the result of serializing those fields of the data entry that correspond to the data key value. For example, when the first data store is a relational database, the unique identifier may be the primary key of the data entry.
Step S340: determine, according to the query mode corresponding to each index, the one or more associated data entries, and determine the unique identifier of each such data entry in the second data store, thereby obtaining the data entry identifier list associated with the index;
Each index corresponds to one query mode. For each query mode, a corresponding query statement may be constructed and used to obtain the corresponding data entries from the first data store. For example, when the first data store is a relational database, an SQL statement may be constructed according to the query mode and executed against the first data store to obtain the data entries corresponding to that query mode.
After the one or more data entries associated with the index have been obtained, the unique identifier of each data entry, for example its primary key, is determined, thereby obtaining the data entry identifier list associated with the index.
To improve data processing speed, multiple second threads may be started, each responsible for one or more sort query modes in the sort list and determining, from the first data store, the sorted data entry set corresponding to each sort query mode it is responsible for. Multiple third threads may also be started, each responsible for one or more indexes in the index set and determining, from the sorted data entry set corresponding to each index it is responsible for, the one or more data entries associated with that index; the unique identifier of each such data entry in the second data store is then determined to obtain the data entry identifier list associated with the index.
To improve data processing speed further, a sorted data entry set may also be divided into multiple data blocks and multiple fourth threads started, each responsible for determining, from one or more data blocks, the one or more data entries associated with the index. In a specific implementation, it may first be judged whether the number of data entries in the sorted data entry set is greater than a threshold, and whether to use multithreaded processing is decided according to the result: when the number of data entries is greater than the threshold, multithreaded processing is performed; otherwise it may be skipped. The threshold may be determined from experience and testing.
Step S350: store the data entry identifier list associated with each index in the second data store in association with the index name of that index. In this way, each index stored in the second data store is associated with the set of data entries corresponding to it.
The data storage method according to the embodiment of the present invention replaces synchronous computation with offline computation, which effectively improves data query efficiency. For example, a query interface may be configured on the second data store; the application server then does not need to access the first data store and compute results in real time, but obtains the data entries corresponding to a query request directly from the query interface. This avoids complex query operations on the data entries in the first data store and increases query speed.
That is, the data storage method of the embodiment of the present invention may further comprise the following steps:
Obtain a query request from the application server, the query request comprising a query mode and a key value, where the key value is one or more field names of a data entry;
Determine the index name according to the query mode, look up the data entry identifier list associated with the index name in the second data store, and obtain the associated data content from the second data store according to the identifier list found;
Deserialize the obtained data content, extract from the deserialized content the content corresponding to the key value, and return it to the application server.
Alternatively, the data storage method of the embodiment of the present invention may further comprise the following steps:
Obtain a query request from the application server, the query request comprising a query mode and a key value;
Determine the index name according to the query mode and the key value, look up the data entry identifier list associated with the index name in the second data store, and obtain the associated data content from the second data store according to the identifier list found;
Deserialize the obtained data content and return the deserialized content to the application server.
An application example of the present invention is given below.
In this application example, car dealer price-reduction ranking data is stored in a conventional relational database, and the car data entries stored in this relational database comprise the following field names:
DealerId SpecId SpecName SpecImage SeriesId SeriesName BrandId FactoryId NewsId NewsType NewsTitle StartDate EndDate DTime Price OriginalPrice PriceOffPercent OrdersLastMonth OrdersLastQuarter CreateTime ModifyTime PID CID SID KindId NewsTemplateId serieslevel InventoryState PriceScope EquipCarId PackageName PackagePrice IsRecommend PriceW IsLastWeek
Three data entries from this relational database are listed below:
Data entry 1: 1 10463 2012 model S5 3.0T Sportback ~/upload/2013/7/19/l_201307191945189324136.jpg 2734 Audi S5 33 79 8266471 0 "Audi S5 in ample stock, maximum discount 94,600 yuan" 2013-12-31 00:00:00.000 2014-01-05 00:00:00.000 2013-12-31 10:35:33.723 633400 728000 13 1951 2013-12-31 10:55:00 2013-12-31 10:55:00 110000 110100 110105 1 2671390 4 0 70 0 0 0 64 1
Data entry 2: 1 10771 2013 model S6 4.0TFSI ~/upload/2013/4/19/l_201304191833167634435.jpg 2736 Audi S6 33 79 8266541 0 "Audi S6 in ample stock, maximum discount 69,300 yuan" 2013-12-31 00:00:00.000 2014-01-05 00:00:00.000 2013-12-31 10:37:12.510 988700 1058000 7 97 1 2013-12-31 10:55:00 2013-12-31 10:55:00 110000 110100 110105 1 2671429 5 0 100 0 0 0 99 1
Data entry 3: 1 12203 2012 model 30FSI poly-talented ~/upload/spec/12203/l_201207041848360214178.jpg 18 Audi A6L 33 9 8266413 0 "Audi A6L in ample stock, maximum discount 111,300 yuan" 2013-12-31 00:00:00.000 2014-01-05 00:00:00.000 2013-12-31 10:34:25.467 380900 432800 12 1700 5 2013-12-24 18:55:00 2013-12-31 10:55:00 110000 110100 110105 1 2671359 5 0 50 0 0 0 39 1
The price-reduction ranking data is published as follows:
(1) According to the query modes for the data entries in the price-reduction ranking data, the sort list (order by) is constructed as follows:
[Figure BDA0000484678630000171: sort list]
As can be seen, the constructed sort list contains 8 sort query modes.
The grouping list (group by) is constructed as follows:
new string[]{"BrandId","SeriesId","SpecId","PID","CID","BrandId,PID","BrandId,CID","SeriesId,PID","SeriesId,CID","SpecId,PID","SeriesLevel"}
As can be seen, the constructed grouping list contains 11 grouping query modes.
The Cartesian product of the two lists is then taken, giving 8*11=88 combinations; the index set thus comprises 88 indexes corresponding to 88 query modes.
(2) Each query condition is stored in the Redis database as the key of an index. An example key is:
"Promotion(PriceOffPercent desc,LessPrice desc,DTime desc|SpecId,PID)"
The corresponding sort query condition is:
"PriceOffPercent desc,LessPrice desc,DTime desc"
The corresponding grouping query condition is:
"SpecId,PID"
(3) The key and value of each data entry are stored in the Redis database in key-value form, for example:
Key:"Promotion_urn:promotion:92540000014492"
Value:"{\"DealerId\":9254,\"SpecId\":14492,\"SpecName\":\"2013\xe6\xac\xbe1.6L\xe6\x89\x8b\xe5\x8a\xa8\xe8\x88\x92\xe9\x80\x82\xe7\x89\x88\",\"SpecImage\":\"~/upload/2013/5/15/l_201305151900310633686.jpg\",\"SeriesId\":145,\"SeriesName\":\"POLO\",\"SeriesLevel\":2,\"BrandId\":1,\"FactoryId\":58,\"NewsId\":8313674,\"NewsType\":0,\"NewsTitle\":\"POLO\xe5\xb0\x91\xe9\x87\x8f\xe7\x8e\xb0\xe8\xbd\xa6\xe6\x9c\x80\xe9\xab\x98\xe4\xbc\x98\xe6\x83\xa00.5\xe4\xb8\x87\xe5\x85\x83\",\"StartDate\":\"\\/Date(1388505600000+0800)\\/\",\"EndDate\":\"\\/Date(1391097600000+0800)\\/\",\"DTime\":\"\\/Date(1388643961877+0800)\\/\",\"Price\":94900,\"OriginalPrice\":99900,\"PriceOffPercent\":5,\"OrdersLastMonth\":1178,\"OrdersLastQuarter\":0,\"CreateTime\":\"\\/Date(1385864760000+0800)\\/\",\"ModifyTime\":\"\\/Date(1388645760000+0800)\\/\",\"PID\":210000,\"CID\":210300,\"SID\":210303,\"KindId\":1,\"NewsTemplateId\":2699366,\"InventoryState\":1,\"PriceW\":10,\"PriceScope\":10,\"EquipCarId\":0,\"PackagePrice\":0,\"IsRecommend\":0,\"IsLastWeek\":1}"
The value is generated by serializing an object of the business type Promotion; part of the serialization code is as follows:
[Figure BDA0000484678630000181: serialization code]
[Figure BDA0000484678630000191: serialization code, continued]
(4) The query mode corresponding to each key generated in (2) is obtained. According to that query mode, the corresponding data entries are obtained from the relational database, and the keys of these data entries in the Redis database are determined to obtain a key list.
(5) The key of each index is stored in the Redis database in association with the corresponding key list, which associates the index with its data entries and completes the publication of the data.
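Steps (1) to (5) can be sketched end to end as follows. This is a minimal illustration, not the implementation shown in the figures: an in-memory SQLite table stands in for the relational database, a plain dictionary stands in for Redis, the table has only four hypothetical columns, the entry key format is simplified, and the sample rows are chosen so that every grouping contains exactly one row.

```python
import json
import sqlite3
from itertools import product

# First data store: in-memory SQLite table standing in for the relational database.
db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute("CREATE TABLE promotion (id INTEGER PRIMARY KEY, SpecId INTEGER, "
           "PID INTEGER, PriceOffPercent INTEGER)")
db.executemany("INSERT INTO promotion VALUES (?, ?, ?, ?)",
               [(1, 10463, 110000, 13), (2, 10771, 110000, 7), (3, 12203, 110000, 12)])

redis_like = {}  # second data store: a dict standing in for the Redis database

# (1) Sort list and grouping list (shortened, hypothetical).
sort_list = ["PriceOffPercent desc", "PriceOffPercent asc"]
group_list = ["SpecId,PID"]

# (2) One index key per combination, stored with an empty value for now.
index_keys = {f"Promotion({o}|{g})": (o, g) for o, g in product(sort_list, group_list)}
for key in index_keys:
    redis_like[key] = []

# (3) Store every data entry as key -> serialized value.
for row in db.execute("SELECT * FROM promotion"):
    redis_like[f"Promotion_urn:promotion:{row['id']}"] = json.dumps(dict(row))

# (4) and (5) Run each query mode and attach the resulting key list to its index key.
for key, (order_by, group_by) in index_keys.items():
    sql = f"SELECT id FROM promotion GROUP BY {group_by} ORDER BY {order_by}"
    redis_like[key] = [f"Promotion_urn:promotion:{r['id']}" for r in db.execute(sql)]

ranking = redis_like["Promotion(PriceOffPercent desc|SpecId,PID)"]
```

After publication, a query for a given sort and grouping combination needs only two dictionary lookups per entry: one for the key list and one for each serialized value.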
An API for querying the data can then be provided to the application server. An example call to the query API is as follows:
List<Promotion> list = Redis<Promotion>.Instance.GetData("PriceOffPercent desc,LessPrice desc,DTime desc", "SpecId,PID0").ToList();
The corresponding GetData method is implemented as follows:
[Figure BDA0000484678630000192: implementation of the GetData method]
After the above data storage system went into production, the TPS (transactions per second) of the dealer price-reduction ranking data query interface rose from the original 6714 to 1023088, roughly a 152-fold increase and a substantial performance improvement.
An example of block-parallel computation of data and indexes by the data storage system and data storage method of the embodiment of the present invention is given below.
Fig. 4 shows a schematic diagram of the block-parallel computation of data and indexes in the embodiment of the present invention. Referring to Fig. 4, the block-parallel computation proceeds as follows:
(1) For the data entries to be written from the first data store to the second data store, judge whether their number is greater than a threshold. If so, the first data store may be divided into multiple data regions (for example, i regions) and multiple first threads started (for example, i threads). Each first thread corresponds to one data computation subtask, and each data computation subtask is responsible for obtaining the data entries from one data region of the first data store and storing them in the second data store. If the number of data entries is not greater than the threshold, the first data store need not be partitioned and a single thread processes all data entries. Of course, each first thread may also correspond to multiple data computation subtasks.
(2) Start multiple second threads (for example, n threads). Each second thread corresponds to one sort index computation subtask, and each sort index computation subtask is responsible for one sort query mode in the sort list (which has, for example, n order by entries), determining from the first data store the sorted data entry set corresponding to that sort query mode. Of course, each second thread may also correspond to multiple sort index computation subtasks. For example, with 20 order by entries, 5 second threads may be started, each processing 4 order by entries.
(3) Start multiple third threads (for example, n*m threads, assuming there are m group by entries). Each third thread corresponds to one sort*grouping index computation subtask, and each such subtask is responsible for determining, from the corresponding sorted data entry set, the one or more data entries associated with the index (the index being determined by the corresponding combination of sort and grouping). Of course, each third thread may also correspond to multiple sort*grouping index computation subtasks.
(4) For each third thread, it may further be judged whether the number of data entries in its corresponding sorted data entry set is greater than a threshold. If so, the sorted data entry set may be divided into multiple data blocks (for example, x blocks) and multiple fourth threads started (for example, x threads), each fourth thread being responsible for determining, from one data block, the one or more data entries associated with the index. If the number of data entries is not greater than the threshold, the sorted data entry set need not be divided into blocks and a single thread processes all data entries. Of course, each fourth thread may also be responsible for processing multiple data blocks.
The parallel processing described above significantly improves the speed at which data and indexes are written to the second data store.
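The assignment of several sort query modes to each second thread, as in the example of 20 order by entries handled by 5 threads, can be sketched as follows. The function names are hypothetical, and sorting an in-memory list by a single field stands in for querying the first data store.

```python
from concurrent.futures import ThreadPoolExecutor

def assign_tasks(tasks, thread_count):
    """Split tasks into at most thread_count contiguous batches of near-equal size."""
    per_thread = max(1, -(-len(tasks) // thread_count))  # ceiling division
    return [tasks[i:i + per_thread] for i in range(0, len(tasks), per_thread)]

def sort_batch(batch, entries):
    """One 'second thread': build the sorted identifier list for each field in its batch."""
    return {field: [e["id"] for e in sorted(entries, key=lambda e: e[field])]
            for field in batch}

def build_sorted_sets(fields, entries, thread_count):
    """Compute one sorted identifier list per sort field, spread over thread_count workers."""
    merged = {}
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        for partial in pool.map(lambda b: sort_batch(b, entries), assign_tasks(fields, thread_count)):
            merged.update(partial)
    return merged

# 20 sort query modes distributed over 5 threads, 4 per thread.
batches = assign_tasks([f"order_by_{n}" for n in range(20)], 5)

entries = [{"id": 1, "price": 30, "time": 2},
           {"id": 2, "price": 10, "time": 3},
           {"id": 3, "price": 20, "time": 1}]
sorted_sets = build_sorted_sets(["price", "time"], entries, thread_count=2)
```

The same batching helper could be reused for the third threads, with the sort and grouping combinations as the task list.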
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the above description of a specific language is made in order to disclose the best mode of the invention.
Numerous specific details are set forth in the description provided herein. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the devices of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units or components of an embodiment may be combined into one module, unit or component, and may furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
In addition, those skilled in the art will understand that, although some embodiments described herein include certain features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the data storage system according to the embodiment of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
A1. A data storage method, adapted to store data entries from a first data store into a second data store, the method comprising: constructing indexes according to the query modes used on the data entries in the first data store, each index corresponding to one query mode; storing each of the indexes in the second data store, wherein each index comprises an index name that identifies the index; dividing the first data store into a plurality of data regions and starting a plurality of first threads, each first thread being responsible for obtaining the data entries from one or more data regions of the first data store and storing the obtained data entries in the second data store, wherein, in the second data store, each data entry comprises a unique identifier of the data entry and the associated data content; determining, according to the query mode corresponding to each index, the one or more associated data entries, and determining the unique identifier of each such data entry in the second data store, to obtain a data entry identifier list associated with the index; and storing the data entry identifier list associated with each index in the second data store in association with the index name of that index.
A2. The data storage method as described in A1, wherein constructing indexes according to the query modes used on the data entries in the first data store comprises: building a sort list according to the sort query modes for the data entries, and building a group list according to the grouping query modes for the data entries; and taking the Cartesian product of the sort list and the group list to construct the indexes.
A3. The data storage method as described in A2, wherein determining the one or more associated data entries according to the query mode corresponding to each index, and determining the unique identifier of each such data entry in the second data store to obtain the data entry identifier list associated with the index, comprises: starting a plurality of second threads, each second thread being responsible for one or more sort query modes in the sort list and determining, from the first data store, the sorted data entry set corresponding to each sort query mode it is responsible for; and starting a plurality of third threads, each third thread being responsible for one or more of the indexes and determining, from the sorted data entry set corresponding to an index it is responsible for, the one or more data entries associated with that index, and determining the unique identifier of each such data entry in the second data store, to obtain the data entry identifier list associated with the index.
A4. The data storage method as described in A3, wherein determining, from the sorted data entry set corresponding to the responsible index, the one or more data entries associated with the index comprises: dividing the sorted data entry set into a plurality of data blocks and starting a plurality of fourth threads, each fourth thread being responsible for determining, from one or more data blocks, the one or more data entries associated with the index.
A5. The data storage method as described in A2, A3 or A4, wherein the index name comprises a sort query mode identifier and a grouping query mode identifier; and, in the second data store, the unique identifier of each data entry is the unique identifier of that data entry in the first data store, and the data content is the content obtained by serializing the fields of the data entry.
A6. The data storage method as described in A5, wherein the index name further comprises a data key value; in the second data store, the unique identifier of each data entry further comprises the data key value, and the data content is the content obtained by serializing the fields of the data entry that correspond to the data key value; wherein the data key value identifies key value information, and the key value is one or more field names of the data entry.
A7. The data storage method as described in A5, further comprising: obtaining a query request from an application server, the query request comprising a query mode and a key value, wherein the key value is one or more field names of the data entry; determining an index name according to the query mode, looking up in the second data store the data entry identifier list associated with the index name, and obtaining the associated data content from the second data store according to the identifier list found; and deserializing the obtained data content, extracting from the deserialized content the content corresponding to the key value, and returning it to the application server.
A8. The data storage method as described in A6, further comprising: obtaining a query request from an application server, the query request comprising a query mode and a key value; determining an index name according to the query mode and the key value, looking up in the second data store the data entry identifier list associated with the index name, and obtaining the associated data content from the second data store according to the identifier list found; and deserializing the obtained data content and returning the deserialized content to the application server.
A9. The data storage method as described in A1, wherein the data entries in the first data store are stored in a relational database; and, in the second data store, each index is stored as a key-value pair with the index name as the key and the associated data entry identifier list as the value, and each data entry is stored as a key-value pair with the unique identifier of the data entry as the key and the associated data content as the value.
B1. A data storage system, adapted to store data entries from a first data store into a second data store, the system comprising: an index construction unit, adapted to construct indexes according to the query modes used on the data entries in the first data store, each index corresponding to one query mode; an index storage unit, adapted to store each of the indexes in the second data store, wherein each index comprises an index name that identifies the index; a data entry storage unit, adapted to divide the first data store into a plurality of data regions and start a plurality of first threads, each first thread being responsible for obtaining the data entries from one or more data regions of the first data store and storing the obtained data entries in the second data store, wherein, in the second data store, each data entry comprises a unique identifier of the data entry and the associated data content; and an association unit, adapted to determine, according to the query mode corresponding to each index, the one or more associated data entries, and to determine the unique identifier of each such data entry in the second data store, to obtain a data entry identifier list associated with the index; wherein the index storage unit is configured to store the data entry identifier list associated with each index in the second data store in association with the index name of that index.
B2. The data storage system as described in B1, wherein the index construction unit is configured to: build a sort list according to the sort query modes for the data entries, and build a group list according to the grouping query modes for the data entries; and take the Cartesian product of the sort list and the group list to construct the indexes.
B3. The data storage system as described in B2, wherein the association unit is configured to: start a plurality of second threads, each second thread being responsible for one or more sort query modes in the sort list and determining, from the first data store, the sorted data entry set corresponding to each sort query mode it is responsible for; and start a plurality of third threads, each third thread being responsible for one or more of the indexes and determining, from the sorted data entry set corresponding to an index it is responsible for, the one or more data entries associated with that index, and determining the unique identifier of each such data entry in the second data store, to obtain the data entry identifier list associated with the index.
B4. The data storage system as described in B3, wherein the association unit is configured to: divide the sorted data entry set into a plurality of data blocks and start a plurality of fourth threads, each fourth thread being responsible for determining, from one or more data blocks, the one or more data entries associated with the index.
B5. The data storage system as described in B2, B3 or B4, wherein the index name comprises a sort query mode identifier and a grouping query mode identifier; and, in the second data store, the unique identifier of each data entry is the unique identifier of that data entry in the first data store, and the data content is the content obtained by serializing the fields of the data entry.
B6. The data storage system as described in B5, wherein the index name further comprises a data key value; in the second data store, the unique identifier of each data entry further comprises the data key value, and the data content is the content obtained by serializing the fields of the data entry that correspond to the data key value; wherein the data key value identifies key value information, and the key value is one or more field names of the data entry.
B7. The data storage system as described in B5, wherein the second data store further comprises a query interface, the query interface being configured to: obtain a query request from an application server, the query request comprising a query mode and a key value, wherein the key value is one or more field names of the data entry; determine an index name according to the query mode, look up in the second data store the data entry identifier list associated with the index name, and obtain the associated data content from the second data store according to the identifier list found; and deserialize the obtained data content, extract from the deserialized content the content corresponding to the key value, and return it to the application server.
B8. The data storage system as described in B6, wherein the second data store further comprises a query interface, the query interface being configured to: obtain a query request from an application server, the query request comprising a query mode and a key value; determine an index name according to the query mode and the key value, look up in the second data store the data entry identifier list associated with the index name, and obtain the associated data content from the second data store according to the identifier list found; and deserialize the obtained data content and return the deserialized content to the application server.
B9. The data storage system as described in B1, wherein the data entries in the first data store are stored in a relational database; and, in the second data store, each index is stored as a key-value pair with the index name as the key and the associated data entry identifier list as the value, and each data entry is stored as a key-value pair with the unique identifier of the data entry as the key and the associated data content as the value.
C1. A data query system, comprising a first data store, a data distribution server and a second data store, wherein the data distribution server comprises the data storage system as described in any one of B1 to B9.
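As an illustration only, the storage flow of A1 and A9 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the in-memory list `ROWS` stands in for the first (relational) data store, the dictionary `kv` stands in for the second (key-value) data store, and the key prefixes `entry:` and `index:`, the sample fields, and all function names are hypothetical.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the first (relational) data store.
ROWS = [
    {"id": 1, "brand": "A", "price": 30},
    {"id": 2, "brand": "B", "price": 10},
    {"id": 3, "brand": "A", "price": 20},
    {"id": 4, "brand": "B", "price": 40},
]

def split_regions(rows, n):
    """Divide the source rows into at most n contiguous data regions."""
    size = max(1, -(-len(rows) // n))  # ceiling division, never zero
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def copy_region(region, kv):
    """Task of one 'first thread': store each entry as identifier -> serialized content."""
    for row in region:
        kv[f"entry:{row['id']}"] = json.dumps(row, sort_keys=True)

def build_store(rows, workers=2):
    kv = {}  # hypothetical stand-in for the second (key-value) data store
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and surfaces any worker exception
        list(pool.map(lambda region: copy_region(region, kv),
                      split_regions(rows, workers)))
    # One index per query mode; here: "sorted by price ascending, grouped by brand".
    by_price = sorted(rows, key=lambda r: r["price"])
    for brand in sorted({r["brand"] for r in rows}):
        kv[f"index:price_asc:brand={brand}"] = [
            r["id"] for r in by_price if r["brand"] == brand
        ]
    return kv

STORE = build_store(ROWS)
```

Each index name maps to an ordered identifier list, and each identifier maps to serialized content, which mirrors the key-value layout described in A9.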

Claims (10)

1. A data storage method, adapted to store data entries from a first data store into a second data store, the method comprising:
constructing indexes according to the query modes used on the data entries in the first data store, each index corresponding to one query mode;
storing each of the indexes in the second data store, wherein each index comprises an index name that identifies the index;
dividing the first data store into a plurality of data regions and starting a plurality of first threads, each first thread being responsible for obtaining the data entries from one or more data regions of the first data store and storing the obtained data entries in the second data store, wherein, in the second data store, each data entry comprises a unique identifier of the data entry and the associated data content;
determining, according to the query mode corresponding to each index, the one or more associated data entries, and determining the unique identifier of each such data entry in the second data store, to obtain a data entry identifier list associated with the index; and
storing the data entry identifier list associated with each index in the second data store in association with the index name of that index.
2. The data storage method as claimed in claim 1, wherein constructing indexes according to the query modes used on the data entries in the first data store comprises:
building a sort list according to the sort query modes for the data entries, and building a group list according to the grouping query modes for the data entries; and
taking the Cartesian product of the sort list and the group list to construct the indexes.
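For illustration, the index construction of claim 2 can be sketched with a Cartesian product of the two lists. The sort and grouping mode names and the `sort:group` naming scheme below are assumptions chosen for the example; the claim only requires that each resulting index correspond to one query mode.

```python
from itertools import product

# Hypothetical sort and grouping query modes; a real deployment would derive
# them from the queries actually issued against the first data store.
sort_list = ["price_asc", "price_desc"]
group_list = ["brand=A", "brand=B", "all"]

def construct_index_names(sorts, groups):
    """Return one index name per (sort mode, grouping mode) pair."""
    return [f"{s}:{g}" for s, g in product(sorts, groups)]

index_names = construct_index_names(sort_list, group_list)
```

With two sort modes and three grouping modes this yields six index names, one for every combination a query could request.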
3. The data storage method as claimed in claim 2, wherein determining the one or more associated data entries according to the query mode corresponding to each index, and determining the unique identifier of each such data entry in the second data store to obtain the data entry identifier list associated with the index, comprises:
starting a plurality of second threads, each second thread being responsible for one or more sort query modes in the sort list and determining, from the first data store, the sorted data entry set corresponding to each sort query mode it is responsible for; and
starting a plurality of third threads, each third thread being responsible for one or more of the indexes and determining, from the sorted data entry set corresponding to an index it is responsible for, the one or more data entries associated with that index, and determining the unique identifier of each such data entry in the second data store, to obtain the data entry identifier list associated with the index.
4. The data storage method as claimed in claim 3, wherein determining, from the sorted data entry set corresponding to the responsible index, the one or more data entries associated with the index comprises:
dividing the sorted data entry set into a plurality of data blocks and starting a plurality of fourth threads, each fourth thread being responsible for determining, from one or more data blocks, the one or more data entries associated with the index.
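The block-wise scan of claim 4 might look like the following sketch. It assumes the sorted entry set is already in memory and that membership in an index can be expressed as a predicate; the helper names, the block size, and the sample data are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_blocks(entries, block_size):
    """Divide a sorted entry set into consecutive data blocks."""
    return [entries[i:i + block_size] for i in range(0, len(entries), block_size)]

def ids_for_index(sorted_entries, predicate, block_size=2, workers=4):
    """Scan blocks in parallel and return matching identifiers in sorted order."""
    blocks = split_blocks(sorted_entries, block_size)

    def scan(block):
        # Task of one 'fourth thread': pick the entries belonging to the index.
        return [e["id"] for e in block if predicate(e)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in block order, so the sort order survives.
        parts = list(pool.map(scan, blocks))
    return [entry_id for part in parts for entry_id in part]

# Hypothetical entry set, already sorted by ascending price.
sorted_by_price = [
    {"id": 2, "brand": "B", "price": 10},
    {"id": 3, "brand": "A", "price": 20},
    {"id": 1, "brand": "A", "price": 30},
    {"id": 4, "brand": "B", "price": 40},
    {"id": 5, "brand": "A", "price": 50},
]
brand_a_ids = ids_for_index(sorted_by_price, lambda e: e["brand"] == "A")
```

Concatenating the per-block results in block order is one simple way to keep the identifier list consistent with the sort query mode; the claim itself does not prescribe how results are merged.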
5. The data storage method as claimed in claim 2, 3 or 4, wherein the index name comprises a sort query mode identifier and a grouping query mode identifier; and
in the second data store, the unique identifier of each data entry is the unique identifier of that data entry in the first data store, and the data content is the content obtained by serializing the fields of the data entry.
6. The data storage method as claimed in claim 5, wherein the index name further comprises a data key value;
in the second data store, the unique identifier of each data entry further comprises the data key value, and the data content is the content obtained by serializing the fields of the data entry that correspond to the data key value;
wherein the data key value identifies key value information, and the key value is one or more field names of the data entry.
7. The data storage method as claimed in claim 5, further comprising:
obtaining a query request from an application server, the query request comprising a query mode and a key value, wherein the key value is one or more field names of the data entry;
determining an index name according to the query mode, looking up in the second data store the data entry identifier list associated with the index name, and obtaining the associated data content from the second data store according to the identifier list found; and
deserializing the obtained data content, extracting from the deserialized content the content corresponding to the key value, and returning it to the application server.
8. The data storage method as claimed in claim 6, further comprising:
obtaining a query request from an application server, the query request comprising a query mode and a key value;
determining an index name according to the query mode and the key value, looking up in the second data store the data entry identifier list associated with the index name, and obtaining the associated data content from the second data store according to the identifier list found; and
deserializing the obtained data content and returning the deserialized content to the application server.
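The query path of claim 7 can be illustrated with a short sketch. It assumes JSON as the serialization format and a dictionary as the second data store; the key prefixes, the `query` function, and the sample contents are hypothetical and serve only to show the lookup order: index name, then identifier list, then content, then field extraction.

```python
import json

# Hypothetical contents of the second data store, laid out as in claim 5:
# index name -> identifier list, identifier -> serialized fields.
store = {
    "index:price_asc:brand=A": [3, 1],
    "entry:1": json.dumps({"id": 1, "brand": "A", "price": 30}),
    "entry:3": json.dumps({"id": 3, "brand": "A", "price": 20}),
}

def query(kv, query_mode, fields):
    """Resolve a query mode to an index, fetch its entries, keep the requested fields."""
    index_name = f"index:{query_mode}"          # index name from the query mode
    entry_ids = kv.get(index_name, [])          # identifier list for that index
    results = []
    for entry_id in entry_ids:
        record = json.loads(kv[f"entry:{entry_id}"])     # deserialize the content
        results.append({f: record[f] for f in fields})   # content for the key value
    return results

rows = query(store, "price_asc:brand=A", ["id", "price"])
```

Under claim 8 the key value would instead be folded into the index name and entry identifier at storage time, so the final field extraction step would not be needed.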
9. A data storage system, adapted to store data entries from a first data store into a second data store, the system comprising:
an index construction unit, adapted to construct indexes according to the query modes used on the data entries in the first data store, each index corresponding to one query mode;
an index storage unit, adapted to store each of the indexes in the second data store, wherein each index comprises an index name that identifies the index;
a data entry storage unit, adapted to divide the first data store into a plurality of data regions and start a plurality of first threads, each first thread being responsible for obtaining the data entries from one or more data regions of the first data store and storing the obtained data entries in the second data store, wherein, in the second data store, each data entry comprises a unique identifier of the data entry and the associated data content; and
an association unit, adapted to determine, according to the query mode corresponding to each index, the one or more associated data entries, and to determine the unique identifier of each such data entry in the second data store, to obtain a data entry identifier list associated with the index;
wherein the index storage unit is configured to store the data entry identifier list associated with each index in the second data store in association with the index name of that index.
10. A data query system, comprising a first data store, a data distribution server and a second data store, wherein the data distribution server comprises the data storage system as claimed in claim 9.
CN201410126243.XA 2014-03-31 2014-03-31 A kind of data-storage system and storage method Active CN103902702B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410126243.XA CN103902702B (en) 2014-03-31 2014-03-31 A kind of data-storage system and storage method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410126243.XA CN103902702B (en) 2014-03-31 2014-03-31 A kind of data-storage system and storage method

Publications (2)

Publication Number Publication Date
CN103902702A true CN103902702A (en) 2014-07-02
CN103902702B CN103902702B (en) 2017-11-28

Family

ID=50994024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410126243.XA Active CN103902702B (en) 2014-03-31 2014-03-31 A kind of data-storage system and storage method

Country Status (1)

Country Link
CN (1) CN103902702B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105069017A (en) * 2015-07-13 2015-11-18 深圳市永兴元科技有限公司 Report storage method in report system and report system
CN105653528A (en) * 2014-11-11 2016-06-08 金蝶软件(中国)有限公司 Business filed multi-modal display method and device
CN106446201A (en) * 2016-09-30 2017-02-22 福建中金在线信息科技有限公司 Processing method and device of social circle data
CN107656968A (en) * 2017-08-31 2018-02-02 武汉斗鱼网络科技有限公司 High-volume business datum deriving method and system
CN107817946A (en) * 2016-09-13 2018-03-20 阿里巴巴集团控股有限公司 For mixing the method and device of storage device read-write data
CN108268515A (en) * 2016-12-30 2018-07-10 北京国双科技有限公司 The selection method and device of Aggregation Table dimension
CN108536798A (en) * 2018-04-02 2018-09-14 携程旅游网络技术(上海)有限公司 The restoration methods and system of the other database data of order level
CN109145004A (en) * 2018-08-29 2019-01-04 智慧互通科技有限公司 A kind of method and device creating database index
CN109828987A (en) * 2019-01-21 2019-05-31 深圳乐信软件技术有限公司 A kind of millions method for computing data, device, electronic equipment and medium
CN110149529A (en) * 2018-11-01 2019-08-20 腾讯科技(深圳)有限公司 Processing method, server and the storage medium of media information
CN111143232A (en) * 2018-11-02 2020-05-12 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for storing metadata
TWI706260B (en) * 2018-05-29 2020-10-01 香港商阿里巴巴集團服務有限公司 Index establishment method and device based on mobile terminal NoSQL database
CN111752947A (en) * 2020-06-24 2020-10-09 中国银行股份有限公司 System integration method and device, storage medium and electronic equipment
CN113127659A (en) * 2019-12-31 2021-07-16 深圳云天励飞技术有限公司 Image data entry method and device, electronic equipment and storage medium
US20220012213A1 (en) * 2016-03-08 2022-01-13 International Business Machines Corporation Spatial-temporal storage system, method, and recording medium
CN117519839A (en) * 2024-01-05 2024-02-06 恒生电子股份有限公司 Data loading method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102375853A (en) * 2010-08-24 2012-03-14 中国移动通信集团公司 Distributed database system, method for building index therein and query method
CN102867070A (en) * 2012-09-29 2013-01-09 瑞庭网络技术(上海)有限公司 Method for updating cache of key-value distributed memory system
CN103177027A (en) * 2011-12-23 2013-06-26 北京新媒传信科技有限公司 Method and system for obtaining dynamic feed index
KR20130093999A (en) * 2012-02-15 2013-08-23 주식회사 시공미디어 Travel information sharing service system using location based service interlinked no-sql and rdbms

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102375853A (en) * 2010-08-24 2012-03-14 中国移动通信集团公司 Distributed database system, method for building index therein and query method
CN103177027A (en) * 2011-12-23 2013-06-26 北京新媒传信科技有限公司 Method and system for obtaining dynamic feed index
KR20130093999A (en) * 2012-02-15 2013-08-23 주식회사 시공미디어 Travel information sharing service system using location based service interlinked no-sql and rdbms
CN102867070A (en) * 2012-09-29 2013-01-09 瑞庭网络技术(上海)有限公司 Method for updating cache of key-value distributed memory system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
OLEKSIY KARPENKO ET AL.: ""Relational database index choices for genome annotation data"", 《2010 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS》 *
曹丹丹等: ""Redis数据库在视频推荐服务系统中的应用"", 《计算机与现代化》 *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105653528A (en) * 2014-11-11 2016-06-08 金蝶软件(中国)有限公司 Business filed multi-modal display method and device
CN105069017A (en) * 2015-07-13 2015-11-18 深圳市永兴元科技有限公司 Report storage method in report system and report system
US20220012213A1 (en) * 2016-03-08 2022-01-13 International Business Machines Corporation Spatial-temporal storage system, method, and recording medium
CN107817946A (en) * 2016-09-13 2018-03-20 阿里巴巴集团控股有限公司 For mixing the method and device of storage device read-write data
CN106446201A (en) * 2016-09-30 2017-02-22 福建中金在线信息科技有限公司 Processing method and device of social circle data
CN108268515B (en) * 2016-12-30 2020-07-31 北京国双科技有限公司 Selection method and device for dimension of aggregation table
CN108268515A (en) * 2016-12-30 2018-07-10 北京国双科技有限公司 The selection method and device of Aggregation Table dimension
CN107656968A (en) * 2017-08-31 2018-02-02 武汉斗鱼网络科技有限公司 High-volume business datum deriving method and system
CN107656968B (en) * 2017-08-31 2021-04-23 武汉斗鱼网络科技有限公司 Method and system for exporting large-batch business data
WO2019041707A1 (en) * 2017-08-31 2019-03-07 武汉斗鱼网络科技有限公司 Method and system for exporting mass service data
CN108536798B (en) * 2018-04-02 2020-12-01 携程旅游网络技术(上海)有限公司 Method and system for recovering database data of order level
CN108536798A (en) * 2018-04-02 2018-09-14 携程旅游网络技术(上海)有限公司 The restoration methods and system of the other database data of order level
TWI706260B (en) * 2018-05-29 2020-10-01 香港商阿里巴巴集團服務有限公司 Index establishment method and device based on mobile terminal NoSQL database
CN109145004A (en) * 2018-08-29 2019-01-04 智慧互通科技有限公司 A kind of method and device creating database index
CN110149529A (en) * 2018-11-01 2019-08-20 腾讯科技(深圳)有限公司 Processing method, server and the storage medium of media information
CN111143232A (en) * 2018-11-02 2020-05-12 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for storing metadata
CN111143232B (en) * 2018-11-02 2023-08-18 伊姆西Ip控股有限责任公司 Method, apparatus and computer readable medium for storing metadata
CN109828987A (en) * 2019-01-21 2019-05-31 深圳乐信软件技术有限公司 A kind of millions method for computing data, device, electronic equipment and medium
CN113127659A (en) * 2019-12-31 2021-07-16 深圳云天励飞技术有限公司 Image data entry method and device, electronic equipment and storage medium
CN111752947A (en) * 2020-06-24 2020-10-09 中国银行股份有限公司 System integration method and device, storage medium and electronic equipment
CN111752947B (en) * 2020-06-24 2023-08-18 中国银行股份有限公司 Method and device for integrating system, storage medium and electronic equipment
CN117519839A (en) * 2024-01-05 2024-02-06 恒生电子股份有限公司 Data loading method and device
CN117519839B (en) * 2024-01-05 2024-04-16 恒生电子股份有限公司 Data loading method and device

Also Published As

Publication number Publication date
CN103902702B (en) 2017-11-28

Similar Documents

Publication Publication Date Title
CN103902702A (en) Data storage system and data storage method
CN103902698A (en) Data storage system and data storage method
CN103902701A (en) Data storage system and data storage method
US20190258625A1 (en) Data partitioning and ordering
CN102725753B (en) Method and apparatus for optimizing data access, method and apparatus for optimizing data storage
CN103577440B (en) A kind of data processing method and device in non-relational database
CN102129425B (en) The access method of big object set table and device in data warehouse
US20140101167A1 (en) Creation of Inverted Index System, and Data Processing Method and Apparatus
CN104794162A (en) Real-time data storage and query method
CN108388604A (en) User right data administrator, method and computer readable storage medium
CN106202548A (en) Date storage method, lookup method and device
CN103294702A (en) Data processing method, device and system
CN104281664B (en) Distributed figure computing system data segmentation method and system
CN106055621A (en) Log retrieval method and device
CN106326475A (en) High-efficiency static hash table implement method and system
CN109062936B (en) Data query method, computer readable storage medium and terminal equipment
CN102591855A (en) Data identification method and data identification system
CN104216992A (en) Information processing method and device
CN110427364A (en) A kind of data processing method, device, electronic equipment and storage medium
US20140201132A1 (en) Storing a key value to a deleted row based on key range density
US11853279B2 (en) Data storage using vectors of vectors
CN104112011A (en) Method and device for extracting mass data
CN102169491A (en) Dynamic detection method for multi-data concentrated and repeated records
CN112699142A (en) Cold and hot data processing method and device, electronic equipment and storage medium
CN101963993B (en) Method for fast searching database sheet table record

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent of invention or patent application
CB02 Change of applicant information

Address after: 100080 Beijing city Haidian District Danleng Street No. 3 floor 11 block B room 1109

Applicant after: BEIJING PIERRE BLANEY SOFTWARE CO., LTD.

Address before: 100080 Beijing city Haidian District Danleng Street No. 3 floor 11 block B room 1109

Applicant before: BEIJING CHESHANGHUI SOFTWARE CO., LTD.

COR Change of bibliographic data

Free format text: CORRECT: APPLICANT; FROM: BEIJING CHESHANGHUI SOFTWARE CO., LTD. TO: BEIJING PIER BULAINI SOFTWARE CO., LTD.

GR01 Patent grant