WO2020073854A1 - Method and system for managing memory data and for maintaining data in memory - Google Patents

Method and system for managing memory data and for maintaining data in memory

Info

Publication number
WO2020073854A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
shard
data record
field
node
Prior art date
Application number
PCT/CN2019/109144
Other languages
English (en)
French (fr)
Inventor
邓龙
王太泽
黄亚建
范晓亮
刘永超
Original Assignee
第四范式(北京)技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 第四范式(北京)技术有限公司
Publication of WO2020073854A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries

Definitions

  • the present disclosure generally relates to the field of memory data management and maintenance, and more specifically, to a method and system for managing memory data and a method and system for maintaining data in memory.
  • Relational databases such as MySQL and SQL Server are mainly used to manage and maintain relational data.
  • Non-relational databases such as Redis and MongoDB are mainly used to manage and maintain non-relational data.
  • Relational data means data based on Relational Model (referred to as RM).
  • Non-relational data means data that is not based on a relational model.
  • a time series database such as InfluxDB
  • an in-memory database such as VoltDB
  • Traditional databases, including those listed above, are time-consuming to read from and write to, and can execute only a small number of data writing tasks and/or data query tasks simultaneously per unit time.
  • the prior art has the defects of low data writing efficiency and low data query efficiency.
  • Exemplary embodiments of the present disclosure aim to overcome the shortcomings in the prior art of low data writing efficiency and low data query efficiency.
  • A method for managing memory data, including: setting a plurality of shard groups, where each shard group includes at least one shard, all shards in each shard group correspond to a uniform index field and sort field, the index fields of different shard groups are different, and the sort fields of different shard groups are the same or different; and constructing a corresponding first-level skip list and second-level skip list for each shard, where the first-level skip list corresponding to each shard is set to store nodes whose key is the value of the data record's index field for that shard and whose value is a pointer or object indicating a second-level skip list, and the second-level skip list corresponding to each shard is set to store nodes whose key is the value of the data record's sort field for that shard and whose value is a pointer indicating a storage space for storing the value of at least one attribute field of the data record.
  • A method for maintaining data in a memory, including: for each shard group of a plurality of shard groups, determining the corresponding shard according to the value of the index field, for that shard group, of the data record to be inserted, where each shard group includes at least one shard, all shards in each shard group correspond to a uniform index field and sort field, the index fields of different shard groups are different, the sort fields of different shard groups are the same or different, and each shard corresponds to a first-level skip list that stores nodes whose key is the value of the data record's index field for that shard and whose value is a pointer or object indicating a second-level skip list; searching the first-level skip list corresponding to the determined shard for a node whose key is the value of the index field, for that shard group, of the data record to be inserted; and, in the case where such a node is found, adding to the second-level skip list indicated by the pointer or object in the found node a node whose key is the value of the sort field, for that shard group, of the data record to be inserted and whose value is a pointer indicating a storage space for storing the value of at least one attribute field of the data record to be inserted.
  • A system for managing memory data, including: a shard management device for setting a plurality of shard groups, where each shard group includes at least one shard, all shards in each shard group correspond to a uniform index field and sort field, the index fields of different shard groups are different, and the sort fields of different shard groups are the same or different; and a skip list management device for constructing a corresponding first-level skip list and second-level skip list for each shard, where the first-level skip list corresponding to each shard is set to store nodes whose key is the value of the data record's index field for that shard and whose value is a pointer or object indicating a second-level skip list, and the second-level skip list corresponding to each shard is set to store nodes whose key is the value of the data record's sort field for that shard and whose value is a pointer indicating a storage space for storing the value of at least one attribute field of the data record.
  • A system for maintaining data in a memory, including: a shard determination device for determining, for each shard group of a plurality of shard groups, the corresponding shard according to the value of the index field, for that shard group, of the data record to be inserted, where each shard group includes at least one shard, all shards in each shard group correspond to a uniform index field and sort field, the index fields of different shard groups are different, the sort fields of different shard groups are the same or different, and each shard corresponds to a first-level skip list that stores nodes whose key is the value of the data record's index field for that shard and whose value is a pointer or object indicating a second-level skip list; a node search device for searching the first-level skip list corresponding to the determined shard for a node whose key is the value of the index field, for that shard group, of the data record to be inserted; and a node management device for adding, in the case where such a node is found, to the second-level skip list indicated by the pointer or object in the found node, a node whose key is the value of the sort field, for that shard group, of the data record to be inserted and whose value is a pointer indicating a storage space for storing the value of at least one attribute field of the data record to be inserted.
  • A computer-readable storage medium storing instructions, wherein, when the instructions are executed by at least one computing device, the at least one computing device is caused to perform the method as described above.
  • A system including at least one computing device and at least one storage device storing instructions, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the method described above.
  • A plurality of shard groups can be set in which all shards in each shard group correspond to a uniform index field and sort field, and a corresponding first-level skip list and second-level skip list can be constructed for each shard, so that the first-level skip list corresponding to each shard stores nodes whose key is the value of the index field in the data record and whose value is a pointer or object indicating a second-level skip list, and the second-level skip list corresponding to each shard stores nodes whose key is the value of the sort field in the data record and whose value is a pointer indicating the storage space for the value of at least one attribute field of the data record.
  • The multiple shard groups can be processed in parallel, the corresponding shard and the corresponding first-level skip list can be quickly located based on the index field, and a shared storage space can also be used to store the values of the attribute fields. In this case, the processing time can be reduced and the processing efficiency can be improved.
  • Within the corresponding shard group, the corresponding shard and the first-level and second-level skip lists corresponding to that shard can be quickly located, and the storage space used to store the values of the attribute fields can be quickly located based on the sort field, which can reduce processing time and improve processing efficiency.
  • any attribute field of the data record can be set as an index field, so that the data record can be flexibly and conveniently written and queried according to the required attribute field.
  • FIG. 1 shows a flowchart of a method of managing memory data according to an exemplary embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of sharding according to an exemplary embodiment of the present disclosure
  • FIG. 3 shows a flowchart of a method for maintaining data in memory according to an exemplary embodiment of the present disclosure
  • FIG. 4 shows a schematic diagram of an operation of inserting a data record according to an exemplary embodiment of the present disclosure
  • FIG. 5 shows a flowchart of an operation of querying data in a memory according to an exemplary embodiment of the present disclosure
  • FIG. 6 shows a block diagram of a system for managing memory data according to an exemplary embodiment of the present disclosure.
  • FIG. 7 shows a block diagram of a system for maintaining data in memory according to an exemplary embodiment of the present disclosure.
  • FIG. 1 shows a flowchart of a method of managing memory data according to an exemplary embodiment of the present disclosure.
  • The method of managing memory data is executed by at least one computing device.
  • the method of managing memory data according to an exemplary embodiment of the present disclosure may include step S110 and step S120.
  • In step S110, multiple shard groups are set, where each shard group includes at least one shard, all shards in each shard group correspond to a uniform index field and sort field, the index fields of different shard groups are different, and the sort fields of different shard groups are the same or different.
  • In step S120, a corresponding first-level skip list and second-level skip list are constructed for each shard, where the first-level skip list corresponding to each shard is set to store nodes whose key is the value of the data record's index field for that shard and whose value is a pointer or object indicating a second-level skip list, and the second-level skip list corresponding to each shard is set to store nodes whose key is the value of the data record's sort field for that shard and whose value is a pointer indicating a storage space for storing the value of at least one attribute field of the data record.
  • the index field and sort field of each shard group are different.
  • Each shard stores a pointer or object indicating the corresponding first-level skip list.
  • As an example, the second-level skip lists corresponding to all shard groups share one storage space, and that storage space stores the values of all attribute fields of the data record; this may correspond to the case where the sort fields of different shard groups are the same.
  • As another example, the second-level skip lists corresponding to the same shard group share the same storage space, and that storage space stores the values of all attribute fields of the data record other than the index field and sort field corresponding to that shard group; this may correspond to the case where the sort fields of different shard groups are different.
  • A shard group may be a collection of shards, and one shard group may include one or more shards.
  • The index field of a shard group refers to the index field corresponding to that shard group.
  • The sort field of a shard group refers to the sort field corresponding to that shard group.
  • A data record can have one or more attribute fields; the index field can be an attribute field of the data record, and the sort field can also be an attribute field of the data record.
  • the index field is a card number or merchant category code (Merchant Category Code, MCC for short), etc.
  • the sort field is a time stamp or age.
  • The first-level and second-level skip lists are skiplists, which are also referred to as skip tables or jump tables.
  • A shard can be associated with its first-level skip list through a pointer or object.
  • Through the pointer or object stored in the shard, the first-level skip list corresponding to the shard can be located.
  • A first-level skip list can likewise be associated with a second-level skip list through a pointer or object.
  • The second-level skip list corresponding to a first-level skip list node can be located through the pointer or object stored in the first-level skip list.
  • The objects here are similar to the objects defined in object-oriented (OO) technology.
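  • As a point of reference for the skiplist structure named above, the following is a minimal sketch of a generic skiplist with insert and search. It is not taken from the disclosure: the class names, the level cap MAX_LEVEL, and the promotion probability 0.5 are all assumptions chosen for illustration.

```python
import random


class SkipNode:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        # forward[i] is the next node on level i (level 0 holds every node).
        self.forward = [None] * level


class SkipList:
    MAX_LEVEL = 8  # assumed cap on node height, for illustration only

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self.head = SkipNode(None, None, self.MAX_LEVEL)  # sentinel head
        self.level = 1  # number of levels currently in use

    def _random_level(self):
        # Promote a node to the next level with probability 0.5 (assumed).
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def search(self, key):
        node = self.head
        # Walk from the top level down, moving right while keys are smaller.
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        if node is not None and node.key == key:
            return node.value
        return None

    def insert(self, key, value):
        # update[i] is the rightmost node on level i that precedes the key.
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        nxt = node.forward[0]
        if nxt is not None and nxt.key == key:
            nxt.value = value  # key already present: overwrite its value
            return
        height = self._random_level()
        self.level = max(self.level, height)
        new_node = SkipNode(key, value, height)
        for i in range(height):
            new_node.forward[i] = update[i].forward[i]
            update[i].forward[i] = new_node

    def keys(self):
        # Level 0 links every node in ascending key order.
        out, node = [], self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out
```

  • In the two-level arrangement described here, the value stored in a first-level node would itself be (a reference to) another such list.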
  • The index field and/or sort field of each shard group may be defined, for example, by specifying which field is the index field and which field is the sort field.
  • The arrangement order of the index fields and/or sort fields may also be defined.
  • For example, the first index field may be defined as the card number and the second index field as the merchant category code.
  • For example, the first sort field may be defined as a time stamp and the second sort field as age.
  • The sort fields of different shard groups are the same or different.
  • For example, the sort fields of different shard groups are all time stamps.
  • The data table may include: a plurality of shard groups, and a first-level skip list and a second-level skip list respectively corresponding to each shard in the plurality of shard groups.
  • the data table may or may not include storage space for storing the value of the attribute field of the data record.
  • the storage space for storing the value of the attribute field of the data record is independent of the data table.
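  • The composition of a data table described above can be pictured with the following sketch. It is only a layout illustration under stated assumptions: plain dicts and lists stand in for the first-level and second-level skip lists, a Python list stands in for the shared storage space, and every class, field, and function name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Shard:
    # Stand-in for the first-level skip list: index value -> second-level list.
    # Each second-level entry is (sort value, pointer into the storage space).
    first_level: Dict[str, List[Tuple[int, int]]] = field(default_factory=dict)


@dataclass
class ShardGroup:
    index_field: str  # uniform index field for all shards in the group
    sort_field: str   # uniform sort field for all shards in the group
    shards: List[Shard]


@dataclass
class DataTable:
    name: str
    groups: List[ShardGroup]
    # Storage for encoded attribute values; it could equally live outside
    # the table, as the text allows.
    storage: List[str] = field(default_factory=list)


def make_table(name, index_fields, sort_field, shards_per_group):
    """Build one shard group per index field, each with its own shards."""
    groups = [
        ShardGroup(f, sort_field, [Shard() for _ in range(shards_per_group)])
        for f in index_fields
    ]
    return DataTable(name, groups)
```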
  • a data table may be created through the interface create (table_name, ttl, ttl-type, key_1: type_1: index, key_2: type_2: index, ..., key_N: type_N: index, value: type).
  • the user can create a data table in memory through the interface create.
  • steps S110 and S120 may be performed.
  • The information required to perform steps S110 and S120 can be received through the interface create, where table_name represents the name of the data table, and ttl (time to live) represents the survival threshold of the data stored in the data table, which has different value ranges according to the type of expiration (indicated by the ttl-type field).
  • For example, the value of ttl may be the maximum number of data records to keep: when ttl is 100, the data table can store 100 data records (for example, the most recent 100), and the excess data is deleted.
  • Alternatively, depending on the type indicated by the ttl-type field, the value of ttl can be the maximum retention period of a data record: when ttl is 3 days, the data table can store the data of the last three days, and data older than that is deleted.
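  • The two expiration behaviours described above can be sketched as follows. This is an illustrative assumption rather than the disclosed implementation: the ttl-type names "count" and "time", the function name, and the record representation as (timestamp, payload) tuples are all hypothetical.

```python
def evict(records, ttl, ttl_type, now=None):
    """Return the records that survive, newest first.

    records  -- list of (timestamp, payload) tuples
    ttl      -- maximum record count, or maximum age in timestamp units
    ttl_type -- "count" or "time" (hypothetical names for the two types)
    now      -- current time, required when ttl_type is "time"
    """
    newest_first = sorted(records, key=lambda r: r[0], reverse=True)
    if ttl_type == "count":
        # Keep only the most recent `ttl` records.
        return newest_first[:ttl]
    if ttl_type == "time":
        if now is None:
            raise ValueError("now is required for time-based expiration")
        # Keep only records no older than `ttl` time units.
        return [r for r in newest_first if now - r[0] <= ttl]
    raise ValueError(f"unknown ttl_type: {ttl_type!r}")
```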
  • key_i: type_i: index represents the name and data type of the index field corresponding to the i-th shard group (which can be called the i-th index field), where 1 ≤ i ≤ N and N is a natural number: key_i indicates the name of the i-th index field, type_i indicates the data type of the i-th index field, and "index" marks key_i as an index field; it is only an index field identifier and can be replaced with any character or combination of characters that indicates that key_i is the index field. "value: type" indicates the name of at least one non-index attribute field taken as a whole (the value of the at least one attribute field can be encoded as a whole) and the corresponding data type.
  • the sort field is a pre-set attribute field, for example, a time stamp, so the sort field may not be set in the interface.
  • the data types in the example may include: string type, floating point type, and the like.
  • As another example, a data table can be created through the interface create (table_name, ttl, ttl-type, key_1: type_1: index, key_2: type_2: index, ..., key_N: type_N: index, value_1: value_type_1, ..., value_M: value_type_M).
  • The interface parameters are given in the parentheses after create. The setting of parameters other than "value_1: value_type_1, ..., value_M: value_type_M" is similar to the setting of the corresponding parameters in the example of the above interface create, and is not repeated here.
  • In "value_1: value_type_1, ..., value_M: value_type_M", value_j: value_type_j represents the name and type of the j-th non-index attribute field, where 1 ≤ j ≤ M and M is a natural number.
  • The parameters "field_1: field_type_1: order, field_2: field_type_2: order, ..., field_N: field_type_N: order" may be added, where field_i represents the name of the sort field of the i-th shard group (which may be called the i-th sort field), 1 ≤ i ≤ N and N is a natural number, field_type_i indicates the data type of the i-th sort field, and "order" marks field_i as a sort field; it is only a sort field identifier and can be replaced with any character or combination of characters that indicates that field_i is the sort field.
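  • To make the parameter conventions of create concrete, the sketch below splits column specifications of the form "name: type: index", "name: type: order", and "name: type" into index fields, sort fields, and other attribute fields. The textual form of the parameters, the function name, and the returned dictionary layout are assumptions made for illustration; the disclosure does not specify how the interface parses its arguments.

```python
def parse_columns(specs):
    """Classify column specs into index, sort ("order"), and value fields.

    specs -- iterable of strings such as "card: string: index",
             "ts: int: order", or "value: string"
    """
    schema = {"index": [], "order": [], "value": []}
    for spec in specs:
        parts = [p.strip() for p in spec.split(":")]
        if len(parts) == 3 and parts[2] in ("index", "order"):
            # Third component is the role marker described in the text.
            schema[parts[2]].append((parts[0], parts[1]))
        elif len(parts) == 2:
            # No role marker: an ordinary (non-index) attribute field.
            schema["value"].append((parts[0], parts[1]))
        else:
            raise ValueError(f"unrecognized column spec: {spec!r}")
    return schema
```

  • The order of the entries in each list follows the order of the arguments, which matters because the put interface relies on the arrangement order of the index and sort fields.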
  • Table 1 shows data records related to bank transactions according to an exemplary embodiment of the present disclosure.
  • The data records shown in Table 1 may include the following attribute fields: card number, time stamp, transaction amount, transaction location, and POS number.
  • Table 1 includes 3 data records. The card number can be used as an index field, and the timestamp can be used as a sort field.
  • FIG. 2 shows a schematic diagram of sharding according to an exemplary embodiment of the present disclosure.
  • Shard 0 to shard n are n + 1 shards, where n is a natural number greater than 2.
  • Each of these shards corresponds to a first-level skip list.
  • The first-level skip list corresponding to shard 0 includes node 11 to node 1m, where m is a natural number.
  • Each shard may store a pointer or object indicating the corresponding first-level skip list, so as to locate the first-level skip list corresponding to the shard.
  • Each node in the first-level skip list may correspond to a second-level skip list.
  • For example, the second-level skip list corresponding to node 11 includes nodes 41 to 4k, the second-level skip list corresponding to node 12 includes nodes 31 to 3j, and the second-level skip list corresponding to node 1m includes nodes 21 to 2i, where i, j, and k are natural numbers.
  • Each node of the first-level skip list may store a pointer or object indicating the corresponding second-level skip list, so as to locate the second-level skip list corresponding to that node.
  • Key-value pairs can be set in the nodes of the skip lists.
  • In a node of the first-level skip list, the value of the index field of the data record can be set as the key, and the pointer or object indicating the second-level skip list can be set as the value corresponding to the key.
  • In a node of the second-level skip list, the value of the sort field of the data record can be set as the key, and the value corresponding to the key is a pointer indicating the storage space that stores the value of at least one attribute field of the data record.
  • The value of the at least one attribute field may include the value of the index field and/or the value of the sort field of the data record, or may include neither the value of the index field nor the value of the sort field.
  • A pointer or object indicating another node in the first-level skip list may be stored in a node of the first-level skip list.
  • A pointer or object indicating another node in the second-level skip list may also be stored in a node of the second-level skip list.
  • FIG. 3 shows a flowchart of a method of maintaining data in a memory according to an exemplary embodiment of the present disclosure.
  • the method for maintaining data in memory performed by at least one computing device according to an exemplary embodiment of the present disclosure may include step S210, step S220, and step S230.
  • In step S210, for each shard group in the plurality of shard groups, the corresponding shard is determined according to the value of the index field, for that shard group, of the data record to be inserted, where each shard group includes at least one shard, all shards in each shard group correspond to a uniform index field and sort field, the index fields of different shard groups are different, and the sort fields of different shard groups are the same or different. Each shard corresponds to a first-level skip list, which stores nodes whose key is the value of the data record's index field for that shard and whose value is a pointer or object indicating a second-level skip list. As an example, the index field and sort field of each shard group are different.
  • Each shard stores a pointer or object indicating the corresponding first-level skip list.
  • The step of determining the corresponding shard includes: calculating a hash value corresponding to the value of the index field, for each shard group, of the data record to be inserted; obtaining the remainder of dividing the calculated hash value by the total number of shards in that shard group; and determining the shard corresponding to the obtained remainder as the corresponding shard.
  • a hash function can be used to calculate the value of the index field to obtain a hash value.
  • the hash function used may be the Murmurhash hash function proposed by Austin Appleby.
  • the present disclosure does not limit the hash function used, and other hash functions can also be used for hash value calculation.
  • For example, suppose one shard group includes shard 0 to shard n, that is, the shards shown in FIG. 2 all belong to the same shard group. If the remainder is 0, shard 0 in that shard group corresponds to the data record to be inserted; if the remainder is h (0 ≤ h ≤ n), shard h in that shard group corresponds to the data record to be inserted.
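  • The hash-and-remainder rule for choosing a shard can be sketched as below. The text names MurmurHash as one possible hash function and allows others; because MurmurHash is not in the Python standard library, this sketch substitutes zlib.crc32 purely as a stand-in. The function name is hypothetical.

```python
import zlib


def shard_for(index_value: str, num_shards: int) -> int:
    """Return the shard number for an index value.

    The hash value is divided by the total number of shards in the shard
    group, and the remainder selects the shard.
    """
    if num_shards < 1:
        raise ValueError("a shard group includes at least one shard")
    # zlib.crc32 is a stand-in here; the text suggests MurmurHash.
    hash_value = zlib.crc32(index_value.encode("utf-8"))
    return hash_value % num_shards
```

  • The same index value always maps to the same shard, so all records sharing, for example, a card number land in the same first-level skip list.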
  • In step S220, the first-level skip list corresponding to the determined shard is searched for a node whose key is the value of the index field, for each shard group, of the data record to be inserted.
  • In step S230, in the case where such a node is found, a node is added to the second-level skip list indicated by the pointer or object in the found node; the key of the added node is the value of the sort field, for that shard group, of the data record to be inserted, and its value is a pointer indicating the storage space for storing the value of at least one attribute field of the data record to be inserted.
  • The method for maintaining data in memory further includes: in the case where no node whose key is the value of the index field, for that shard group, of the data record to be inserted can be found, creating a second-level skip list; creating, in the first-level skip list, a node whose key is the value of that index field and whose value is a pointer or object indicating the created second-level skip list; and adding, in the created second-level skip list, a node whose key is the value of the sort field, for that shard group, of the data record to be inserted and whose value is a pointer indicating the storage space for storing the value of at least one attribute field of the data record to be inserted.
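  • Steps S210 to S230, together with the case where no matching first-level node exists, can be sketched for a single shard group as follows. This is a simplified illustration, not the disclosed implementation: a dict stands in for each first-level skip list, a sorted Python list for each second-level skip list, a list index for the storage pointer, and zlib.crc32 for the hash function. All names are hypothetical.

```python
import bisect
import zlib


def put(group, storage, index_value, sort_value, encoded):
    """Insert one record into one shard group; return its storage pointer.

    group   -- list of shards; each shard maps index value -> sorted list
               of (sort value, pointer) pairs
    storage -- list acting as the storage space for encoded attribute values
    """
    storage.append(encoded)
    pointer = len(storage) - 1

    # S210: pick the shard from the hash of the index value.
    shard = group[zlib.crc32(index_value.encode("utf-8")) % len(group)]

    # S220: look up the first-level node keyed by the index value.
    second_level = shard.get(index_value)
    if second_level is None:
        # Not found: create the second-level list and the first-level node.
        second_level = []
        shard[index_value] = second_level

    # S230: add a second-level node keyed by the sort value.
    bisect.insort(second_level, (sort_value, pointer))
    return pointer
```

  • With several shard groups, the same steps would run once per group, which is what allows the groups to be processed in parallel.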
  • As an example, the second-level skip lists corresponding to all shard groups share one storage space, and that storage space stores the values of all attribute fields of the data record; this may correspond to the case where the sort fields of different shard groups are the same.
  • As another example, the second-level skip lists corresponding to the same shard group share the same storage space, and that storage space stores the values of all attribute fields of the data record other than the index field and sort field corresponding to that shard group; this may correspond to the case where the sort fields of different shard groups are different.
  • Each shard group is processed; processing a shard group includes processing the shards of that shard group and the first-level and second-level skip lists corresponding to those shards.
  • the value of each index field and the value of at least one attribute field of the data record may be received as the data record, and the value of the sort field of the data record may also be received.
  • the above values can be received from the user input, and when the user input is received, the processing can be performed according to the above-described steps S210, S220, and S230.
  • data records can be inserted through the interface put (table_name, ts, key_1, key_2, ..., key_N, value).
  • table_name represents the name of the data table into which the data record will be inserted.
  • ts represents the value of the sort field.
  • key_1, key_2, ..., key_N represent the values of the index fields of the data record to be inserted.
  • value indicates the value of at least one attribute field of the data record to be inserted.
  • the value may be a value obtained by encoding (eg, merging or serializing) the at least one attribute value according to specific rules.
  • the arrangement order of the index fields of the data records to be inserted is the same as the arrangement order of the index fields of the multiple shard groups.
  • The order of the values of the index fields (key_1, key_2, ..., key_N) in the interface put is the same as the order of the index fields of the shard groups of the data table.
  • the index field corresponding to each value can be determined according to the arrangement order of key_1, key_2, ..., key_N in the interface put, so that the data record insertion operation can be performed conveniently and efficiently.
  • As another example, data records can be inserted through the interface put (table_name, key_1, key_2, ..., key_N, field_1, field_2, ..., field_N, value), where field_1, field_2, ..., field_N represent the values of the sort fields corresponding to key_1, key_2, ..., key_N respectively. This example can correspond to the case where the sort fields of different shard groups are different.
  • Other parameters in this interface can be understood by referring to the description of the interface put (table_name, ts, key_1, key_2, ..., key_N, value).
  • the arrangement order of the sort fields of the data record to be inserted is the same as the arrangement order of the sort fields of the plurality of shard groups.
  • The order of the values of the sort fields (field_1, field_2, ..., field_N) in the interface put is the same as the order of the sort fields of the shard groups of the data table.
  • The sort field corresponding to each value in field_1, field_2, ..., field_N can be determined according to the arrangement order of field_1, field_2, ..., field_N in the interface put, so that the data record insertion operation can be performed conveniently and efficiently.
  • the data record to be inserted into the data table may be a data record as shown in Table 1.
  • Each data record in Table 1 can be inserted one by one by means of the above interface put, with the card number as the index field and the time stamp as the sort field.
  • The value of the card number, the value of the time stamp, the value of the transaction amount, the value of the transaction location, and the value of the POS number can be encoded (for example, serialized or merged) and stored in the storage space shared by all second-level skip lists.
  • a storage space for storing the value of at least one attribute field of the data record to be inserted stores a character string obtained in one of the following ways: according to a predetermined character string merge rule for the at least one attribute The values of the fields are combined, the values of the at least one attribute field are serialized according to a predetermined JSON format, the values of the at least one attribute field are serialized according to a predetermined ProtocolBuffer format, and according to a predetermined Schema The format serializes the value of the at least one attribute field.
  • the predetermined character string merging rule includes merging according to a specific symbol (for example, "
  • the character string is stored in the storage space.
  • the above description is only an example and should not be regarded as limiting.
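  • Two of the encoding options listed above, merging by a string rule and JSON serialization, can be sketched as follows. The separator "|" is an assumption (the example symbol is cut off in the text above), and the function names and the field names used in the usage note are hypothetical.

```python
import json


def encode_merged(values, sep="|"):
    """Merge attribute values into one string with an assumed separator."""
    return sep.join(str(v) for v in values)


def encode_json(record):
    """Serialize a dict of attribute fields to a JSON string."""
    # sort_keys gives a stable string for the same record contents.
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


def decode_json(text):
    """Recover the attribute fields from a JSON-encoded string."""
    return json.loads(text)
```

  • For instance, encode_merged(["6222XXXX01", 100]) yields the string "6222XXXX01|100", while the JSON form keeps the field names and can be decoded back into the same dictionary.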
  • the data records to be inserted are time series data records, and the sort fields are all time stamps.
  • the step of adding a node in the second-level jump table includes: adding nodes according to the time indicated by the value of the time stamp, so that the nodes in the second-level jump table are arranged in order of time from near to far. The distance between the timestamps can be determined by comparing the timestamp values. The time corresponding to the larger timestamp value is closer than the time corresponding to the smaller timestamp value. Therefore, in the second-level jump table, the node with the larger time stamp value can be arranged before the node with the smaller time stamp value.
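  • The near-to-far ordering described above, where a node with a larger time stamp value precedes a node with a smaller one, can be sketched with a sorted list standing in for the second-level skip list. Storing the negated time stamp is an implementation convenience assumed here so that ascending order of the stored key equals newest-first order; the function names are hypothetical.

```python
import bisect


def add_by_time(nodes, timestamp, pointer):
    """Insert (timestamp, pointer) so that newer time stamps come first.

    nodes holds (-timestamp, pointer) pairs in ascending order, which is
    the same as descending order of the time stamp itself.
    """
    bisect.insort(nodes, (-timestamp, pointer))


def timestamps(nodes):
    """Return the time stamps in stored order (most recent first)."""
    return [-negated for negated, _ in nodes]
```

  • Keeping the newest records at the front means a query for the most recent entries of a key can stop after reading the first few nodes.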
  • the method of maintaining data in the memory is described below by taking the data record to be inserted as the first data record in Table 1 as an example.
  • The data table includes two shard groups: the index field of the first shard group is the card number, the index field of the second shard group is the POS number, the sort fields of both shard groups are time stamps, and the second-level skip lists of the two shard groups share the storage space SP.
  • In step S210, for the first shard group, the corresponding shard (for example, shard 0) is determined according to the card number value "6222XXXX01"; for the second shard group, the corresponding shard (for example, shard 3) is determined according to the POS number value "10xxx".
  • In step S220, for the first shard group, the node whose key is "6222XXXX01" is searched for in the first-level skip list corresponding to shard 0 (for example, node 11 is found); for the second shard group, the node whose key is "10xxx" is searched for in the first-level skip list corresponding to shard 3 (for example, node 333 is found).
  • In step S230, for the first shard group, a node whose key is the time stamp value "2018052814520505" and whose value is a pointer to the storage space SP is inserted into the second-level skip list corresponding to node 11; for the second shard group, a node whose key is the time stamp value "2018052814520505" and whose value is a pointer to the storage space SP is inserted into the second-level skip list corresponding to node 333.
  • In addition, "6222XXXX01", "2018052814520505", "100", "Beijing Shangdi xx Road" and "10xxx" are encoded and stored in the storage space SP.
  • the data records to be inserted are the first to third data records in Table 1.
  • the first data record and the second data record have the same card number, so these two data records correspond to the same second-level jump table.
  • because the time stamp value of the second data record is greater than the time stamp value of the first data record, the node corresponding to the second data record is added to that second-level jump table before the node corresponding to the first data record.
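The ordering rule above (a node with a larger time stamp value is placed before a node with a smaller one) can be sketched as follows. This is only an illustration under stated assumptions: a sorted Python list stands in for the second-level jump table (skip list), the class name `SecondLevel` and its methods are hypothetical, and the second time stamp is an invented value that is merely larger than the first.

```python
import bisect


class SecondLevel:
    """Stand-in for a second-level jump table: keys are kept newest first."""

    def __init__(self):
        self._neg_keys = []  # negated time stamps; ascending here means newest first
        self._pointers = []  # parallel list of "pointers" to stored values

    def add(self, ts, pointer):
        # Find the slot that keeps the negated keys sorted, then insert there.
        i = bisect.bisect_left(self._neg_keys, -ts)
        self._neg_keys.insert(i, -ts)
        self._pointers.insert(i, pointer)

    def keys(self):
        return [-k for k in self._neg_keys]

    def pointers(self):
        return list(self._pointers)


table = SecondLevel()
table.add(2018052814520505, "record-1")  # time stamp of the first data record
table.add(2018052815000000, "record-2")  # hypothetical, later time stamp
```

A real skip list would give the same ordering with logarithmic expected insertion cost; the list is used here only to keep the sketch short.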
  • FIG. 4 illustrates a schematic diagram of an operation of inserting a data record according to an exemplary embodiment of the present disclosure.
  • a data record can be inserted into a data table in memory through a put interface (for example, the put interface as described above).
  • the values of each index field input through the put interface are key1, key2, and key3, and key1, key2, and key3 correspond to the dimensions a, b, and c, respectively.
  • the value of the sort field input through the put interface is ts, and the value of the at least one attribute field input through the put interface is value.
  • the data table may include shard group a, shard group b, and shard group c.
  • the index fields of shard group a, shard group b, and shard group c correspond to key1, key2, and key3, respectively.
  • the sort fields of shard group a, shard group b, and shard group c all correspond to ts.
  • the order of the index fields of shard group a, shard group b, and shard group c is the same as the order of key1, key2, and key3. Once the position of any one of key1, key2, and key3 is known, it can be determined which index field that value belongs to.
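The put operation described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: Python dictionaries and sorted lists stand in for the first-level and second-level jump tables, `zlib.crc32` stands in for the hash function, and the names `Table`, `shard_of`, `put`, and the default of four shards per group are assumptions made for the example.

```python
import bisect
import zlib


class Table:
    """Toy in-memory table with one shard group per index field."""

    def __init__(self, index_fields, shards_per_group=4):
        self.index_fields = list(index_fields)
        # Each shard is a first-level mapping: index value -> second-level list.
        self.groups = {
            field: [dict() for _ in range(shards_per_group)]
            for field in self.index_fields
        }
        self.storage = {}  # storage space shared by all shard groups
        self._next_id = 0

    def shard_of(self, field, key):
        shards = self.groups[field]
        return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]

    def put(self, keys, ts, value):
        """keys are given in the same order as index_fields (key1, key2, ...)."""
        pointer = self._next_id
        self._next_id += 1
        self.storage[pointer] = value  # the value is stored once
        for field, key in zip(self.index_fields, keys):
            first_level = self.shard_of(field, key)
            # Create the second-level list when this key is seen for the first time.
            second_level = first_level.setdefault(key, [])
            # Entries are (-ts, pointer), so ascending order is newest first.
            bisect.insort(second_level, (-ts, pointer))
        return pointer


table = Table(["a", "b", "c"])
p = table.put(["key1", "key2", "key3"], 1000, "value")
```

Storing the value once and placing only a pointer in each shard group's second-level structure mirrors the shared storage space SP of the earlier example.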
  • the operation of querying data in the memory according to an exemplary embodiment of the present disclosure includes steps S310 to S360.
  • in step S310, the index field of the data record to be queried, the value of that index field, and the value range of the sort field are received.
  • the index field of the data record to be queried refers to the name of the index field of the data record to be queried, for example, the card number.
  • the data records to be queried are time series data records, and the sort fields are all time stamps.
  • the value range specifies either the start value and the end value of the time stamp, or only the end value of the time stamp.
  • in step S320, the shard group whose index field is the same as the index field of the data record to be queried is determined from among the plurality of shard groups.
  • in step S330, within the determined shard group, the corresponding shard is determined according to the value of the index field of the data record to be queried.
  • the shard corresponding to the data record to be queried can be determined by the following operations: calculating the hash value corresponding to the value of the index field of the data record to be queried; obtaining the remainder of dividing the calculated hash value by the total number of shards of the determined shard group; and determining the shard corresponding to the obtained remainder as the corresponding shard.
  • a hash function can be used to calculate the value of the index field to obtain a hash value.
  • the hash function used may be the Murmurhash hash function proposed by Austin Appleby.
  • the present disclosure does not limit the hash function used, and other hash functions can also be used for hash value calculation.
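The shard selection just described (hash the index value, then take the remainder modulo the number of shards) can be sketched as follows. The function name `shard_for` is hypothetical, and `zlib.crc32` is used only as a readily available stand-in; it is not MurmurHash, and any other hash function could be substituted.

```python
import zlib


def shard_for(key_value: str, total_shards: int) -> int:
    """Return the shard number: hash(key) modulo the number of shards."""
    # zlib.crc32 is a stand-in hash; it returns a non-negative integer.
    hash_value = zlib.crc32(key_value.encode("utf-8"))
    return hash_value % total_shards


shard_number = shard_for("6222XXXX01", 8)
```

Because the same key always yields the same remainder, inserts and queries for one index value land on the same shard.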
  • in step S340, a node whose key is the value of the index field of the data record to be queried is searched for in the first-level jump table corresponding to the determined shard.
  • in step S350, the second-level jump table indicated by the pointer or object in the found node is searched for the nodes whose keys are within the value range of the sort field of the data record to be queried, and the pointers stored in those nodes are obtained.
  • in step S360, the value of at least one attribute field of the data record to be queried is taken from the storage space indicated by each queried pointer.
  • data query can be performed through the interface scan (table_name, key_name, key_value, start_time, end_time), where table_name defines the name of the data table from which data is queried, key_name defines the index field of the data record to be queried, key_value defines the value of the index field of the data record to be queried, and start_time and end_time define the value range of the data record to be queried, for example, the start time and the end time.
  • This example can correspond to the case where the sort fields corresponding to all the shard groups of the data table are the same.
  • data query can be performed through the interface get (table_name, key_name, key_value, ts), where table_name defines the name of the data table from which data is queried, key_name defines the index field of the data record to be queried, and key_value defines the value of the index field of the data record to be queried.
  • ts defines the value range of the data record to be queried. For example, ts may define the time stamp value of the data record to be queried, in which case what is actually expected to be queried is the data whose time stamp value is ts; alternatively, ts may define the end time of the data records to be queried, in which case what is expected to be queried is the data up to the specified ts. This example can correspond to the case where the sort fields corresponding to all the shard groups of the data table are the same.
  • interface scan and interface get are only used to describe the concept of the present disclosure, and are not used to limit the protection scope of the present disclosure.
  • Other interfaces for data query are also feasible; for example, one or more parameters may be omitted from the above interfaces, or one or more parameters may be added to them.
  • the step of retrieving the value of at least one attribute field of the data record to be queried from the storage space indicated by the queried pointer includes retrieving the value of the at least one attribute field in one of the following ways: splitting the stored string into the values of the at least one attribute field according to a predetermined string splitting rule; deserializing the values of the at least one attribute field according to a predetermined JSON format; deserializing the values of the at least one attribute field according to a predetermined ProtocolBuffer format; or deserializing the values of the at least one attribute field according to a predefined Schema format.
  • the values corresponding to the keywords are found as "100
  • the string is the value of the transaction amount "100”
  • the second split string is the value of the transaction location "Beijing Shangdi xx Road”
  • the third split string is the value of the POS number " 10xxx ".
  • the value of the transaction amount "50”, the value of the transaction location "Beijing Xi'erqi xx store” and the value of the POS number "20xxx" can be obtained from "50
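Two of the storage encodings mentioned in the text, string merging with splitting and JSON serialization, can be sketched as follows. The separator `|` is a hypothetical choice (the symbol used in the source example is not recoverable here), and the function names and the JSON field names are likewise assumptions for illustration.

```python
import json

SEPARATOR = "|"  # hypothetical delimiter; values must not contain it


def merge_fields(values):
    """Merge attribute values into one string for storage."""
    return SEPARATOR.join(values)


def split_fields(stored):
    """Split a stored string back into attribute values, in the same order."""
    return stored.split(SEPARATOR)


def to_json(fields):
    """Serialize named attribute values as a JSON string."""
    return json.dumps(fields, ensure_ascii=False)


def from_json(stored):
    """Deserialize a JSON string back into named attribute values."""
    return json.loads(stored)


values = ["100", "Beijing Shangdi xx Road", "10xxx"]
stored = merge_fields(values)
```

The merged form relies on field order to recover meaning, while the JSON form carries field names at the cost of a longer stored string.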
  • a threshold of the number of nodes corresponding to the second-level jump table may be set.
  • the step of querying, from the second-level jump table indicated by the pointer or object in the found node, the pointers in the nodes within the value range of the sort field of the data record to be queried includes: extracting, from that second-level jump table and in order from the most recent time to the oldest, the pointers stored in a predetermined number of nodes whose keys are within the value range, where the predetermined number does not exceed the node number threshold.
  • a threshold for the number of nodes of a second-level jump table may be set, and periodic deletion is performed according to the set threshold. That is, the first-level jump table and the second-level jump tables corresponding to each shard group are traversed at a predetermined period; when the number of nodes in a traversed second-level jump table exceeds the node number threshold, all nodes ordered after the node whose position in the second-level jump table corresponds to the node number threshold (for example, the 10th node when the threshold is 10) are deleted, and the storage space indicated by the pointer stored in each deleted node is also deleted, where the order is from the most recent time to the oldest.
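The threshold-based deletion can be sketched as follows, assuming that the first `threshold` nodes (the most recent ones) are retained and everything after them is removed. The function name `trim_to_threshold` is hypothetical, a list of `(ts, pointer)` pairs ordered newest first stands in for the second-level jump table, and a dictionary stands in for the storage space.

```python
def trim_to_threshold(second_level, storage, threshold):
    """Drop nodes beyond the threshold and free the storage they point to.

    second_level: list of (ts, pointer) pairs, ordered newest first.
    storage: dict mapping pointer -> stored value.
    Returns the number of deleted nodes.
    """
    if len(second_level) <= threshold:
        return 0
    removed = second_level[threshold:]
    del second_level[threshold:]
    for _, pointer in removed:
        # pop with a default tolerates storage already released elsewhere
        storage.pop(pointer, None)
    return len(removed)


nodes = [(3, "c"), (2, "b"), (1, "a")]
space = {"a": "old", "b": "mid", "c": "new"}
deleted = trim_to_threshold(nodes, space, 2)
```

A periodic job would call such a function for every second-level structure reached while traversing the first-level tables of each shard group.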
  • the following expired-data deletion operation can be performed: set the length of the expiration period; traverse the first-level and second-level jump tables corresponding to each shard at a predetermined period (for example, 3 months); locate, in each second-level jump table, the node whose time stamp value reaches the length of the expiration period; and delete the nodes in that second-level jump table that are ordered after the located node, where the order is from the most recent time to the oldest.
  • for example, if the time stamp value corresponding to the set expiration period length is 2018060000000000, the three nodes added to the second-level jump tables in the above example for the three data records in Table 1 are deleted by the above expired-data deletion operation.
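The expired-data deletion can be sketched as follows. The sketch assumes that a node counts as expired when its time stamp value is smaller than the cutoff value, which is one reading of the example above; the function name `delete_expired` is hypothetical, and every time stamp other than 2018052814520505 and 2018060000000000 is invented for illustration.

```python
def delete_expired(second_level, storage, cutoff_ts):
    """Delete nodes whose time stamp is older than cutoff_ts.

    second_level: list of (ts, pointer) pairs, ordered newest first.
    storage: dict mapping pointer -> stored value.
    Returns the number of deleted nodes.
    """
    cut = len(second_level)
    for i, (ts, _) in enumerate(second_level):
        if ts < cutoff_ts:
            cut = i  # the list is newest first, so everything from here on is expired
            break
    expired = second_level[cut:]
    del second_level[cut:]
    for _, pointer in expired:
        storage.pop(pointer, None)
    return len(expired)


nodes = [(2018070100000000, "r3"), (2018052815000000, "r2"), (2018052814520505, "r1")]
space = {"r1": "first", "r2": "second", "r3": "third"}
deleted = delete_expired(nodes, space, 2018060000000000)
```

Because the nodes are ordered from newest to oldest, the scan can stop at the first expired node instead of checking every entry.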
  • FIG. 6 shows a block diagram of a system for managing memory data according to an exemplary embodiment of the present disclosure.
  • the system for managing memory data according to an exemplary embodiment of the present disclosure includes: a shard management device 410 and a jump table management device 420.
  • the shard management device 410 is used to set multiple shard groups, where each shard group includes at least one shard, all shards in each shard group correspond to a unified index field and sort field, the index fields of different shard groups are different, and the sort fields of different shard groups are the same or different.
  • the jump table management device 420 is configured to construct a corresponding first-level jump table and second-level jump table for each shard, wherein the first-level jump table corresponding to each shard is set to store nodes whose key is the value of the index field of a data record with respect to that shard and whose value is a pointer or object indicating a second-level jump table, and the second-level jump table is set to store nodes whose key is the value of the sort field of a data record with respect to that shard and whose value is a pointer indicating the storage space used to store the value of at least one attribute field of the data record.
  • the second-level jump tables corresponding to all the shard groups share one storage space, and that storage space stores the values of all attribute fields of the data record; or the second-level jump tables corresponding to the same shard group share the same storage space, and that storage space stores the values of all attribute fields of the data record except the index field and the sort field corresponding to that shard group.
  • a system for maintaining data in memory according to an exemplary embodiment of the present disclosure includes: shard determination device 510, node lookup device 520, and node management device 530.
  • the shard determination device 510 is configured to determine, for each shard group in the plurality of shard groups, the corresponding shard according to the value of the index field of the data record to be inserted with respect to that shard group, wherein each shard group includes at least one shard, all shards in each shard group correspond to a unified index field and sort field, the index fields of different shard groups are different, the sort fields of different shard groups are the same or different, and each shard corresponds to a first-level jump table. The first-level jump table is used to store nodes whose key is the value of the index field of a data record with respect to the corresponding shard and whose value is a pointer or object indicating a second-level jump table.
  • the node lookup device 520 is used to search the first-level jump table corresponding to the determined shard for a node whose key is the value of the index field of the data record to be inserted with respect to each shard group.
  • the node management device 530 is used to, when such a node is found, add to the second-level jump table indicated by the pointer or object in the found node a node whose key is the value of the sort field of the data record to be inserted with respect to each shard group and whose value is a pointer indicating the storage space used to store the value of at least one attribute field of the data record to be inserted.
  • the second-level jump tables corresponding to all shard groups share one storage space, and that storage space stores the values of all attribute fields of the data record; or the second-level jump tables corresponding to the same shard group share the same storage space, and that storage space stores the values of all attribute fields of the data record except the index field and the sort field corresponding to that shard group.
  • the shard determination device 510 calculates a hash value corresponding to the value of the index field of the data record to be inserted with respect to each shard group, obtains the remainder of dividing the calculated hash value by the total number of shards of that shard group, and determines the shard corresponding to the obtained remainder as the corresponding shard.
  • each shard stores a pointer or object indicating the corresponding first-level jump table.
  • the system for maintaining data in a memory further includes: a jump table management device, wherein, when no node whose key is the value of the index field of the data record to be inserted with respect to each shard group is found, the jump table management device creates a second-level jump table, and the node management device 530 adds to the first-level jump table a node whose key is the value of the index field of the data record to be inserted with respect to each shard group and whose value is a pointer or object indicating the created second-level jump table, and adds to the created second-level jump table a node whose key is the value of the sort field of the data record to be inserted and whose value is a pointer indicating the storage space used to store the value of at least one attribute field of the data record to be inserted.
  • the storage space used to store the value of at least one attribute field of the data record to be inserted holds a character string obtained in one of the following ways: merging the values of the at least one attribute field according to a predetermined string merging rule; serializing the values of the at least one attribute field according to a predetermined JSON format; serializing the values of the at least one attribute field according to a predetermined ProtocolBuffer format; or serializing the values of the at least one attribute field according to a predetermined Schema format.
  • the system for maintaining data in memory further includes: a shard group determination device (not shown) and a data acquisition device (not shown), where the shard group determination device determines, among the plurality of shard groups, the shard group whose index field is the same as the index field of the data record to be queried; the shard determination device 510 determines, in the determined shard group, the corresponding shard according to the value of the index field of the data record to be queried; the node lookup device 520 searches the first-level jump table corresponding to the determined shard for the node whose key is the value of the index field of the data record to be queried; and the data acquisition device queries, from the second-level jump table indicated by the pointer or object in the found node, the pointers in the nodes whose keys are within the value range of the sort field of the data record to be queried, and takes out the value of at least one attribute field of the data record to be queried from the storage space indicated by each queried pointer.
  • the data acquisition device retrieves the value of the at least one attribute field of the data record to be queried according to the queried pointer in one of the following ways: splitting the stored string into the values of the at least one attribute field according to a predetermined string splitting rule; deserializing the values of the at least one attribute field according to a predetermined JSON format; deserializing the values of the at least one attribute field according to a predetermined ProtocolBuffer format; or deserializing the values of the at least one attribute field according to a predefined Schema format.
  • the shard determination device 510 calculates a hash value corresponding to the value of the index field of the data record to be queried, obtains the remainder of dividing the calculated hash value by the total number of shards of the determined shard group, and determines the shard corresponding to the obtained remainder as the corresponding shard.
  • the data records to be inserted and / or the data records to be queried are time-series data records, and the sort fields are all time stamps.
  • the value range specifies either the start value and the end value of the time stamp, or only the end value of the time stamp.
  • the node management device 530 adds nodes according to the time indicated by the value of the time stamp, so that the nodes in the second-level jump table are arranged in order from the most recent time to the oldest.
  • the system for maintaining data in memory further includes: a node number threshold setting device (not shown) for setting a node number threshold corresponding to the second-level jump table, wherein the data acquisition device extracts, from the second-level jump table indicated by the pointer or object in the found node and in order from the most recent time to the oldest, the pointers stored in a predetermined number of nodes whose keys are within the value range, where the predetermined number does not exceed the node number threshold.
  • the system for maintaining data in memory further includes: a node number threshold setting device (not shown), a jump table traversing device (not shown), and a data deleting device (not shown), where the node number threshold setting device sets the node number threshold corresponding to the second-level jump table; the jump table traversing device traverses the first-level jump table and the second-level jump tables corresponding to each shard group at a predetermined period; and, when the number of nodes in a traversed second-level jump table exceeds the node number threshold, the data deleting device deletes, according to the order of the nodes in that second-level jump table, all nodes ordered after the node corresponding to the node number threshold, and deletes the storage space indicated by the pointer stored in each deleted node, where the order is from the most recent time to the oldest.
  • the system for maintaining data in memory further includes: an expiration period length setting device (not shown), a jump table traversing device (not shown), and a data deleting device (not shown), where the expiration period length setting device sets the length of the expiration period; the jump table traversing device traverses the first-level jump table and the second-level jump tables corresponding to each shard at a predetermined period; and the data deleting device deletes the nodes ordered after the node whose time stamp value in the second-level jump table reaches the length of the expiration period, where the order is from the most recent time to the oldest.
  • the devices shown in FIGS. 6 and 7 may be respectively configured as software, hardware, firmware or any combination of the above to perform specific functions.
  • these devices may correspond to dedicated integrated circuits, may also correspond to pure software codes, and may also correspond to units or modules in which software and hardware are combined.
  • one or more functions implemented by these apparatuses can also be uniformly performed by components in physical entities (eg, processors, clients, or servers, etc.).
  • a computer-readable storage medium storing instructions may be provided, wherein the instructions, when run by at least one computing device, cause the at least one computing device to perform: setting a plurality of shard groups, where each shard group includes at least one shard, all shards in each shard group correspond to a unified index field and sort field, the index fields of different shard groups are different, and the sort fields of different shard groups are the same or different; and constructing a corresponding first-level jump table and second-level jump table for each shard, where the first-level jump table corresponding to each shard is set to store nodes whose key is the value of the index field of a data record with respect to that shard and whose value is a pointer or object indicating a second-level jump table.
  • a computer-readable storage medium storing instructions may be provided, wherein the instructions, when run by at least one computing device, cause the at least one computing device to perform: determining, for each shard group in a plurality of shard groups, the corresponding shard according to the value of the index field of the data record to be inserted with respect to that shard group, wherein each shard group includes at least one shard, all shards in each shard group correspond to a unified index field and sort field, the index fields of different shard groups are different, the sort fields of different shard groups are the same or different, each shard corresponds to a first-level jump table, and the first-level jump table is used to store nodes whose key is the value of the index field of a data record with respect to the corresponding shard and whose value is a pointer or object indicating a second-level jump table; and searching the first-level jump table corresponding to the determined shard for a node whose key is the value of the index field of the data record to be inserted with respect to each shard group.
  • the computer program in the above-mentioned computer-readable storage medium may run in an environment deployed in computer equipment such as a processor, a client, a host, an agent device, a server, etc.
  • the computing device here may serve as a computer, processor, computing unit (or module), client, host, agent device, server, and so on.
  • the computer program may also be used to perform additional steps beyond the above steps, or to perform more specific processing when performing the above steps. These additional steps and further processing have been described with reference to FIGS. 1 to 5 and, to avoid repetition, are not described again here.
  • the system for managing memory data and the system for maintaining data in memory can rely entirely on the running of the computer program to realize the corresponding functions; that is, each device corresponds to a step in the functional architecture of the computer program, so that the entire system is invoked through a dedicated software package (for example, a lib library) to realize the corresponding function.
  • each device shown in FIGS. 6 and 7 can also be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof.
  • the program code or code segments for performing the corresponding operations may be stored in a computer-readable medium such as a storage medium, so that the processor can read and run the corresponding program code or code segments to perform the corresponding operations.
  • a system including at least one computing device and at least one storage device storing instructions may be provided, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the following steps for managing memory data: setting up multiple shard groups, where each shard group includes at least one shard, all shards in each shard group correspond to a unified index field and sort field, the index fields of different shard groups are different, and the sort fields of different shard groups are the same or different; and constructing a corresponding first-level jump table and second-level jump table for each shard, where the first-level jump table corresponding to each shard is set to store nodes whose key is the value of the index field of a data record with respect to that shard and whose value is a pointer or object indicating a second-level jump table, and the second-level jump table corresponding to each shard is set to store nodes whose key is the value of the sort field of a data record with respect to that shard and whose value is a pointer indicating the storage space used to store the value of at least one attribute field of the data record.
  • a system including at least one computing device and at least one storage device storing instructions may be provided, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the following steps of maintaining data in memory: determining, for each shard group in a plurality of shard groups, the corresponding shard according to the value of the index field of the data record to be inserted with respect to that shard group, where each shard group includes at least one shard, all shards in each shard group correspond to a unified index field and sort field, the index fields of different shard groups are different, the sort fields of different shard groups are the same or different, each shard corresponds to a first-level jump table, and the first-level jump table is used to store nodes whose key is the value of the index field of a data record with respect to the corresponding shard and whose value is a pointer or object indicating a second-level jump table; searching the first-level jump table corresponding to the determined shard for a node whose key is the value of the index field of the data record to be inserted with respect to each shard group; and, when such a node is found, adding to the second-level jump table indicated by the pointer or object in the found node a node whose key is the value of the sort field of the data record to be inserted with respect to each shard group and whose value is a pointer indicating the storage space used to store the value of at least one attribute field of the data record to be inserted.
  • the system may constitute a stand-alone computing environment or a distributed computing environment, which includes at least one computing device and at least one storage device.
  • the computing device may be a general-purpose or dedicated computer, processor, etc.
  • the unit that uses software to perform processing may also be an entity combining software and hardware. That is, the computing device may be implemented as a computer, processor, computing unit (or module), client, host, agent device, server, or the like.
  • the storage device may be a physical storage device or a logically divided storage unit, which may be operatively coupled with the computing device, or may communicate with each other, for example, through an I / O port, a network connection, or the like.
  • the exemplary embodiments of the present disclosure may also be implemented as a computing device including a storage component and a processor, where the storage component stores a set of computer-executable instructions, and when the set of computer-executable instructions is executed by the processor, a method of managing memory data and / or a method of maintaining data in memory is executed.
  • the computing device may be deployed in a server or a client, or may be deployed on a node device in a distributed network environment.
  • the computing device may be a PC computer, a tablet device, a personal digital assistant, a smartphone, a web application, or other device capable of executing the above instruction set.
  • the computing device does not have to be a single computing device, but may also be any device or circuit assembly capable of executing the above-mentioned instructions (or instruction sets) individually or jointly.
  • the computing device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
  • the processor may include a central processing unit (CPU), a graphics processor (GPU), a programmable logic device, a dedicated processor system, a microcontroller, or a microprocessor.
  • processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and so on.
  • Some operations described in the method for managing memory data and / or the method for maintaining data in memory according to exemplary embodiments of the present disclosure may be implemented by software, and some operations may be implemented by hardware. In addition, these operations can be achieved through a combination of software and hardware.
  • the processor may execute instructions or codes stored in one of the storage components, wherein the storage component may also store data. Commands and data can also be sent and received over the network via a network interface device, where the network interface device can employ any known transmission protocol.
  • the storage unit may be integrated with the processor, for example, RAM or flash memory is arranged in an integrated circuit microprocessor or the like.
  • the storage component may include an independent device, such as an external disk drive, a storage array, or any other storage device that can be used by any database system.
  • the storage unit and the processor may be operatively coupled, or may communicate with each other, for example, through an I / O port, a network connection, etc., so that the processor can read files stored in the storage unit.
  • the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the computing device may be connected to each other via a bus and / or a network.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

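Methods and systems for managing memory data and for maintaining data in memory are provided. The method of managing memory data includes: setting a plurality of shard groups, wherein each shard group includes at least one shard and all shards in each shard group correspond to a unified index field and sort field; and constructing, for each shard, a corresponding first-level skip list and second-level skip list, wherein the first-level skip list corresponding to each shard is configured to store nodes whose key is the value of a data record for the index field and whose value is a pointer or object indicating a second-level skip list, and the second-level skip list corresponding to each shard is configured to store nodes whose key is the value of the data record for the sort field and whose value is a pointer indicating a storage space that stores the value of at least one attribute field of the data record. The methods and systems can improve the efficiency of data writing and data querying.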

Description

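Method and system for managing memory data and for maintaining data in memory
Technical Field
The present disclosure generally relates to the field of memory data management and maintenance, and more specifically, to a method and system for managing memory data and a method and system for maintaining data in memory.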
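Background
Existing databases include relational databases and non-relational databases. Relational databases such as MySQL and SQL Server are mainly used to manage and maintain relational data. Non-relational databases such as Redis and MongoDB are mainly used to manage and maintain non-relational data. Relational data means data based on the relational model (RM). Non-relational data means data not based on the relational model.
Time series databases (TSDB) such as InfluxDB have been proposed for processing time series data. In-memory databases such as VoltDB have been proposed for processing data in memory.
However, in specific scenarios that require fast data processing and the simultaneous execution of a large number of data write tasks and/or data query tasks, conventional databases, including those listed above, take a long time to read and write data and can execute only a small number of data write tasks and/or data query tasks simultaneously per unit time. In summary, the prior art suffers from low data write efficiency and low data query efficiency.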
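Summary
Exemplary embodiments of the present disclosure aim to overcome the low data write efficiency and low data query efficiency of the prior art.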
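According to an exemplary embodiment of the present disclosure, a method of managing memory data is provided, including: setting a plurality of shard groups, wherein each shard group includes at least one shard, all shards in each shard group correspond to a unified index field and sort field, different shard groups have different index fields, and different shard groups have the same or different sort fields; and constructing, for each shard, a corresponding first-level skip list and second-level skip list, wherein the first-level skip list corresponding to each shard is configured to store nodes whose key is the value, in a data record, of the index field of that shard and whose corresponding value is a pointer or object indicating a second-level skip list, and the second-level skip list corresponding to each shard is configured to store nodes whose key is the value, in the data record, of the sort field of that shard and whose corresponding value is a pointer indicating a storage space for storing the value of at least one attribute field of the data record.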
根据本公开的另一示例性实施例,提供一种在内存中维护数据的方法,包括:针对多个分片组中的每个分片组,根据待插入的数据记录的关于所述每个分片组的索引字段的取值来确定对应的分片,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同,每个分片分别对应第一级跳表,第一级跳表用于存储以数据记录中关于对应分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;从与确定的分片对应的第一级跳表中查找以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点;在查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
根据本公开的另一示例性实施例,提供一种管理内存数据的系统,包括:分片管理装 置,用于设置多个分片组,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同;跳表管理装置,用于为每个分片分别构建对应的第一级跳表和第二级跳表,其中,与每个分片对应的第一级跳表被设置为用于存储以数据记录中关于所述每个分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点,与所述每个分片对应的第二级跳表被设置为用于存储以所述数据记录中关于所述每个分片的排序字段的取值为关键字且以指示用于存储所述数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
根据本公开的另一示例性实施例,提供一种在内存中维护数据的系统,包括:分片确定装置,用于针对多个分片组中的每个分片组,根据待插入的数据记录的关于所述每个分片组的索引字段的取值来确定对应的分片,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同,每个分片分别对应第一级跳表,第一级跳表用于存储以数据记录中关于对应分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;节点查找装置,用于从与确定的分片对应的第一级跳表中查找以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点;节点管理装置,用于在查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
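According to another exemplary embodiment of the present disclosure, a computer-readable storage medium storing instructions is provided, wherein the instructions, when run by at least one computing device, cause the at least one computing device to perform the method described above.
According to another exemplary embodiment of the present disclosure, a system including at least one computing device and at least one storage device storing instructions is provided, wherein the instructions, when run by the at least one computing device, cause the at least one computing device to perform the method described above.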
在根据本公开的示例性实施例的方法和系统中,可设置每个分片组中的所有分片对应统一的索引字段和排序字段的多个分片组,可为每个分片分别构建对应的第一级跳表和第二级跳表,使得与每个分片对应的第一级跳表存储以数据记录中关于索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点,与所述每个分片对应的第二级跳表存储以数据记录中关于排序字段的取值为关键字且以指示所述数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
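In this way, when data is written, the plurality of shard groups can be processed in parallel, the corresponding shard and first-level skip list can be located quickly based on the index field, and a shared storage space can be used to store the values of the attribute fields. This reduces processing time and improves processing efficiency. When a query is performed, the corresponding shard, together with its first-level and second-level skip lists, can be located quickly in the corresponding shard group based on the value of the index field of the data record to be queried, and the storage space holding the attribute field values can be located quickly based on the sort field, which again reduces processing time and improves processing efficiency. When data is written, the shard groups can be processed in parallel, and the shards within one shard group can also be written in parallel; when data is queried, different shards from the same shard group or from different shard groups can be queried in parallel. This increases the number of data write tasks and/or data query tasks that can be executed simultaneously per unit time. In addition, any attribute field of a data record can be set as an index field, so data records can be written and queried flexibly and conveniently by whichever attribute field is needed.
Additional aspects and/or advantages of the general concept of the present disclosure will be set forth in part in the description that follows, and in part will be apparent from the description or may be learned by practice of the general concept.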
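Brief Description of the Drawings
The above and other objects and features of exemplary embodiments of the present disclosure will become clearer from the following description taken in conjunction with the accompanying drawings, which show embodiments by way of example, in which:
FIG. 1 shows a flowchart of a method of managing memory data according to an exemplary embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of shards according to an exemplary embodiment of the present disclosure;
FIG. 3 shows a flowchart of a method of maintaining data in memory according to an exemplary embodiment of the present disclosure;
FIG. 4 shows a schematic diagram of an operation of inserting a data record according to an exemplary embodiment of the present disclosure;
FIG. 5 shows a flowchart of an operation of querying data in memory according to an exemplary embodiment of the present disclosure;
FIG. 6 shows a block diagram of a system for managing memory data according to an exemplary embodiment of the present disclosure; and
FIG. 7 shows a block diagram of a system for maintaining data in memory according to an exemplary embodiment of the present disclosure.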
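Detailed Description
Reference will now be made in detail to embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like components throughout. The embodiments are described below with reference to the drawings in order to explain the present disclosure. Note that "and/or" in the present disclosure denotes three parallel cases. For example, "including A and/or B" means including at least one of A and B, that is, one of the following three cases: (1) including A; (2) including B; (3) including A and B. Likewise, "performing step one and/or step two" means performing at least one of step one and step two, that is, one of the following three cases: (1) performing step one; (2) performing step two; (3) performing step one and step two.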
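FIG. 1 shows a flowchart of a method of managing memory data according to an exemplary embodiment of the present disclosure. The method is performed by at least one computing device. As shown in FIG. 1, the method may include step S110 and step S120. In step S110, a plurality of shard groups are set, wherein each shard group includes at least one shard, all shards in each shard group correspond to a unified index field and sort field, different shard groups have different index fields, and different shard groups have the same or different sort fields. In step S120, a corresponding first-level skip list and second-level skip list are constructed for each shard, wherein the first-level skip list corresponding to each shard is configured to store nodes whose key is the value, in a data record, of the index field of that shard and whose corresponding value is a pointer or object indicating a second-level skip list, and the second-level skip list corresponding to each shard is configured to store nodes whose key is the value, in the data record, of the sort field of that shard and whose corresponding value is a pointer indicating a storage space for storing the value of at least one attribute field of the data record. As an example, the index field and the sort field of each shard group are different.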
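As an example, each shard stores a pointer or object indicating the corresponding first-level skip list.
As an example, the second-level skip lists corresponding to all shard groups share the storage space, and the storage space stores the values of all attribute fields of the data record; this may correspond to the case where different shard groups have the same sort field. As an example, the second-level skip lists corresponding to the same shard group share one storage space, and that storage space stores the values of all attribute fields of the data record other than the index field and sort field corresponding to that shard group; this may correspond to the case where different shard groups have different sort fields.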
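In exemplary embodiments of the present disclosure, a shard group may be a set of shards, and one shard group may include one or more shards. The index field of a shard group is the index field corresponding to that shard group, and the sort field of a shard group is the sort field corresponding to that shard group. A data record may have one or more attribute fields; the index field may be one attribute field of the data record, and so may the sort field. For example, the index field may be a card number or a Merchant Category Code (MCC), and the sort field may be a timestamp or an age.
In exemplary embodiments of the present disclosure, the first-level and second-level skip lists are skiplists. A shard may be associated with its first-level skip list through a pointer or object, so the first-level skip list corresponding to a shard can be located through the pointer or object stored in the shard. Likewise, a first-level skip list may be associated with a second-level skip list through a pointer or object, so the second-level skip list corresponding to the first-level skip list can be located through the pointer or object stored in the first-level skip list. The object here is similar to an object as defined in object-oriented (OO) programming.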
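As an example, when a data table is created, the index field and/or sort field of a shard group may be defined, for example, which field is the index field and which field is the sort field. Optionally, the order of the index fields and/or sort fields may be defined. For example, the first index field may be defined as the card number and the second index field as the merchant category code. As another example, the first sort field may be defined as the timestamp and the second sort field as the age.
As an example, when a data table is created, it may be specified whether different shard groups have the same or different sort fields. For example, it may be specified that the sort field of every shard group is the timestamp.
In exemplary embodiments of the present disclosure, a data table may include a plurality of shard groups and the first-level and second-level skip lists corresponding to each shard of those shard groups. The data table may or may not also include the storage space for storing the values of the attribute fields of data records. Optionally, that storage space is independent of the data table.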
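As an example, a data table may be created through the interface create(table_name,ttl,ttl-type,key_1:type_1:index,key_2:type_2:index,……,key_N:type_N:index,value:type). A user may create a data table in memory through the create interface. Accordingly, when a user's request to create a data table is received through create, step S110 and step S120 may be performed. In other words, the information needed to perform steps S110 and S120 may be received through create. Here, "table_name" is the name of the data table. ttl (time to live) is the survival threshold of the data stored in the table; its range of values depends on the expiry type indicated by the ttl-type field. For example, when ttl-type indicates that expired data is deleted according to a count threshold, ttl may be the maximum number of data records retained: when ttl is 100, the table can store 100 records (for example, the most recent 100), and records beyond that are deleted. As another example, when ttl-type indicates that expired data is deleted according to a time threshold, ttl may be the maximum retention period: when ttl is 3 days, the table can store the data of the last three days, and earlier data is deleted. "key_i:type_i:index" gives the name and data type of the index field corresponding to the i-th shard group (the i-th index field), where 1≤i≤N and N is a natural number: key_i is the name of the i-th index field, type_i is its data type, and "index" marks key_i as an index field; it is merely an index-field identifier and may be replaced by any character or character combination indicating that key_i is an index field. "value:type" gives the name of at least one non-index attribute field taken as a whole (the values of the at least one attribute field may be encoded as a whole) and the corresponding data type. In this example the sort field is a preset attribute field, for example the timestamp, so the sort field need not be set in the interface. Data types in the exemplary embodiments of the present disclosure may include string, floating point, and so on.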
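The parameters of the create interface described above can be pictured with a small parser. This is only a sketch under stated assumptions: the names `TableSpec` and `parse_create`, the `"count"` label for ttl-type, and passing each column as a colon-separated string are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class TableSpec:
    """Hypothetical container for the arguments of create(...)."""
    table_name: str
    ttl: int
    ttl_type: str       # e.g. "count" or "time"; these labels are assumptions
    index_fields: list  # [(name, type)], one shard group per index field
    value_fields: list  # [(name, type)] for the non-index attribute fields


def parse_create(table_name, ttl, ttl_type, *columns):
    """Split 'name:type:index' and 'name:type' column strings into a TableSpec."""
    index_fields, value_fields = [], []
    for column in columns:
        parts = column.split(":")
        if len(parts) == 3 and parts[2] == "index":
            index_fields.append((parts[0], parts[1]))
        elif len(parts) == 2:
            value_fields.append((parts[0], parts[1]))
        else:
            raise ValueError(f"unrecognized column spec: {column!r}")
    if not index_fields:
        raise ValueError("at least one index field is required")
    return TableSpec(table_name, ttl, ttl_type, index_fields, value_fields)


spec = parse_create("trades", 100, "count",
                    "card:string:index", "mcc:string:index", "value:string")
```

Each entry of `spec.index_fields` would then correspond to one shard group to be set up in step S110.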
作为示例,可通过接口create(table_name,ttl,ttl-type,key_1:type_1:index,key_2:type_2:index,……,key_N:type_N:index,value_1:value_type_1,……,value_M:value_type_M)来创建数据表。接口create后的括号内是接口的参数。除了“value_1:value_type_1,……,value_M:value_type_M”之外的参数的设置与以上接口create的示例中对应参数的设置类似,这里不再赘述。对于“value_1:value_type_1,……,value_M:value_type_M”,value_j:value_type_j表示第j个非索引字段的属性字段的名称和类型,其中,1≤j≤M,M为自然数。
作为示例,在以上任意一个接口create的示例中,可增加参数“field_1:field_type_1:order,field_2:field_type_2:order,……,field_N:field_type_N:order”,其中,field_i表示第i个分片组对应的排序字段(可称为第i个排序字段)的名称(1≤i≤N),N为自然数,field_type_i表示第i个排序字段的数据类型,“order”表示field_i为排序字段,其仅仅是排序字段标识,可被替换为任何表示field_i是排序字段的字符或字符组合。
以上接口create的示例仅仅用于描述本公开的构思,并不用于限制本公开的保护范围,其他用于创建数据表的接口也是可行的,例如,可省略以上接口中的一个或更多个参数,也可为以上接口增加一个或更多个参数。
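The data records in the exemplary embodiments of the present disclosure can be understood with reference to Table 1. Table 1 shows data records related to bank transactions according to an exemplary embodiment of the present disclosure. The data records shown in Table 1 may include the following attribute fields: card number, timestamp, transaction amount, transaction location, and POS (Point of Sale) number. Table 1 includes 3 data records. The card number may serve as the index field, and the timestamp may serve as the sort field.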
Table 1 (image PCTCN2019109144-appb-000001)
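FIG. 2 shows a schematic diagram of shards according to an exemplary embodiment of the present disclosure. As shown in FIG. 2, shard 0 to shard n are n+1 shards, where n is a natural number greater than 2. Each of these shards corresponds to one first-level skip list. The first-level skip list corresponding to shard 0 includes node 11 to node 1m, where m is a natural number. Each shard may store a pointer or object indicating the corresponding first-level skip list, so that the first-level skip list corresponding to the shard can be located. Each node in a first-level skip list may correspond to one second-level skip list. The second-level skip list corresponding to node 11 includes node 41 to node 4k, the one corresponding to node 12 includes node 31 to node 3j, and the one corresponding to node 1m includes node 21 to node 2i, where i, j, and k are natural numbers. Each node of a first-level skip list may store a pointer or object indicating the corresponding second-level skip list, so that the second-level skip list corresponding to that node can be located.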
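A key-value pair may be set in each skip list node. Specifically, for a node of a first-level skip list, the value of the index field of a data record may be set as the key, and a pointer or object indicating a second-level skip list may be set as the value corresponding to the key. For a node of a second-level skip list, the value of the sort field of the data record may be set as the key, and the value corresponding to the key is a pointer indicating the storage space that stores the value of at least one attribute field of the data record.
As an example, the values of the at least one attribute field include the value of the index field and/or the value of the sort field of the data record, or include neither of them.
In addition, a node of a first-level skip list may store a pointer or object indicating another node in the same first-level skip list, and a node of a second-level skip list may likewise store a pointer or object indicating another node in the same second-level skip list. For a first-level or second-level skip list that already contains nodes, every node except the tail node stores an object or pointer indicating a node belonging to the same skip list, so that the skip list forms a chain. When a third node is inserted between a first node and a second node, the pointer or object in the first node that indicates the second node must be changed to indicate the third node, and a pointer or object in the third node is used to indicate the second node.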
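The node layout and the pointer relinking described above can be illustrated with a compact skip list. This is a minimal sketch, not the implementation of the disclosure: the class names, the fixed maximum height, and the coin-flip height choice are assumptions, and keys are kept in ascending order here purely for simplicity.

```python
import random


class Node:
    def __init__(self, key, value, height):
        self.key = key
        self.value = value
        self.forward = [None] * height  # forward[i] is the next node at level i


class SkipList:
    MAX_HEIGHT = 8  # assumed cap on node height

    def __init__(self, seed=0):
        self.head = Node(None, None, self.MAX_HEIGHT)
        self._rng = random.Random(seed)

    def _predecessors(self, key):
        """For every level, find the last node whose key is smaller than key."""
        preds = [self.head] * self.MAX_HEIGHT
        node = self.head
        for level in range(self.MAX_HEIGHT - 1, -1, -1):
            while node.forward[level] is not None and node.forward[level].key < key:
                node = node.forward[level]
            preds[level] = node
        return preds

    def find(self, key):
        candidate = self._predecessors(key)[0].forward[0]
        if candidate is not None and candidate.key == key:
            return candidate
        return None

    def insert(self, key, value):
        preds = self._predecessors(key)
        height = 1
        while height < self.MAX_HEIGHT and self._rng.random() < 0.5:
            height += 1
        node = Node(key, value, height)
        for level in range(height):
            # The new ("third") node points at the old successor ("second" node),
            # then the predecessor ("first" node) is redirected to the new node.
            node.forward[level] = preds[level].forward[level]
            preds[level].forward[level] = node
        return node

    def keys(self):
        out, node = [], self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out


# First level: key = index field value, value = the second-level skip list.
first_level = SkipList()
second_level = SkipList()
first_level.insert("6222XXXX01", second_level)
# Second level: key = sort field value, value = reference to the stored attributes.
second_level.insert(2018052815520505, {"amount": "50"})
second_level.insert(2018052814520505, {"amount": "100"})
```

In Python an object reference plays the role of the "pointer or object" stored in a node; the attribute dictionaries stand in for the storage space.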
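FIG. 3 shows a flowchart of a method of maintaining data in memory according to an exemplary embodiment of the present disclosure. As shown in FIG. 3, the method, which is performed by at least one computing device, may include step S210, step S220, and step S230.
In step S210, for each of a plurality of shard groups, the corresponding shard is determined according to the value of the to-be-inserted data record for the index field of that shard group, wherein each shard group includes at least one shard, all shards in each shard group correspond to a unified index field and sort field, different shard groups have different index fields, different shard groups have the same or different sort fields, and each shard corresponds to a first-level skip list that stores nodes whose key is the value, in a data record, of the index field of the corresponding shard and whose corresponding value is a pointer or object indicating a second-level skip list. As an example, the index field and the sort field of each shard group are different.
As an example, each shard stores a pointer or object indicating the corresponding first-level skip list.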
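As an example, determining the corresponding shard includes: computing a hash value corresponding to the value of the to-be-inserted data record for the index field of the shard group; obtaining the remainder of dividing the computed hash value by the total number of shards in the shard group; and determining the shard corresponding to the obtained remainder as the corresponding shard. A hash function may be applied to the value of the index field to obtain the hash value. For example, the hash function may be MurmurHash, proposed by Austin Appleby. Of course, the present disclosure does not limit the hash function used; other hash functions may also be used to compute the hash value. Where a shard group includes shard 0 to shard n (for example, when the shards shown in FIG. 2 belong to the same shard group), if the remainder is 0, shard 0 of that shard group corresponds to the to-be-inserted data record; if the remainder is h (0<h≤n), shard h of that shard group corresponds to the to-be-inserted data record.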
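The hash-and-remainder rule above can be sketched in a few lines. MurmurHash is not part of the Python standard library, so this sketch substitutes `zlib.crc32` as a stand-in hash, which the text permits since the hash function is not limited; the function name `shard_for` is hypothetical.

```python
import zlib


def shard_for(index_value: str, shard_count: int) -> int:
    """Return the shard number for an index field value.

    zlib.crc32 is an assumed stand-in for MurmurHash; any deterministic
    hash function gives the same hash-then-remainder scheme.
    """
    if shard_count < 1:
        raise ValueError("a shard group contains at least one shard")
    hash_value = zlib.crc32(index_value.encode("utf-8"))
    return hash_value % shard_count  # remainder h selects shard h


shard_number = shard_for("6222XXXX01", 8)
```

Because the hash is deterministic, the same index value always maps to the same shard, which is what lets a later query find the shard that an earlier insert used.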
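In step S220, the first-level skip list corresponding to the determined shard is searched for a node whose key is the value of the to-be-inserted data record for the index field of the shard group.
In step S230, if such a node is found, a node is added to the second-level skip list indicated by the pointer or object in the found node. The key of the added node is the value of the to-be-inserted data record for the sort field of the shard group, and its corresponding value is a pointer indicating the storage space for storing the value of at least one attribute field of the to-be-inserted data record.
As an example, the method of maintaining data in memory according to an exemplary embodiment of the present disclosure further includes: if no node whose key is the value of the to-be-inserted data record for the index field of the shard group is found, creating a second-level skip list; creating, in the first-level skip list, a node whose key is the value of the to-be-inserted data record for the index field of the shard group and whose corresponding value is a pointer or object indicating the created second-level skip list; and adding, to the created second-level skip list, a node whose key is the value of the to-be-inserted data record for the sort field of the shard group and whose corresponding value is a pointer indicating the storage space for storing the value of at least one attribute field of the to-be-inserted data record.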
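Steps S210 to S230, including the create-on-miss branch, can be sketched as follows. To keep the example short, a `dict` stands in for each first-level skip list and a sorted `list` for each second-level skip list; these stand-ins, the class name `Table`, the CRC32 hash, and the use of list positions as "pointers" into a shared storage list are all assumptions made for illustration.

```python
import bisect
import zlib


class Table:
    def __init__(self, index_fields, shards_per_group=4):
        self.index_fields = list(index_fields)
        # One shard group per index field; each shard holds a first-level
        # structure mapping index value -> second-level structure.
        self.groups = {
            field: [dict() for _ in range(shards_per_group)]
            for field in self.index_fields
        }
        self.storage = []  # shared storage space; list positions act as pointers

    def _shard(self, field, value):
        shards = self.groups[field]
        return shards[zlib.crc32(str(value).encode("utf-8")) % len(shards)]

    def put(self, ts, record):
        self.storage.append(record)
        pointer = len(self.storage) - 1
        for field in self.index_fields:
            first_level = self._shard(field, record[field])  # S210: pick the shard
            second_level = first_level.get(record[field])    # S220: look up the node
            if second_level is None:
                # No node for this index value yet: create the second-level
                # structure and the first-level node that refers to it.
                second_level = []
                first_level[record[field]] = second_level
            # S230: add a node keyed by the sort field value.
            bisect.insort(second_level, (ts, pointer))
        return pointer


table = Table(["card", "pos"])
ptr = table.put(2018052814520505,
                {"card": "6222XXXX01", "pos": "10xxx", "amount": "100"})
```

The loop over `index_fields` is sequential here; the disclosure notes that shard groups can be processed in parallel, which this sketch does not attempt.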
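As an example, the second-level skip lists corresponding to all shard groups share the storage space, and the storage space stores the values of all attribute fields of the data record; this may correspond to the case where different shard groups have the same sort field. As an example, the second-level skip lists corresponding to the same shard group share one storage space, and that storage space stores the values of all attribute fields of the data record other than the index field and sort field corresponding to that shard group; this may correspond to the case where different shard groups have different sort fields.
As an example, when a data record is inserted, every shard group is processed; processing a shard group includes processing the shard corresponding to the shard group and the first-level and second-level skip lists corresponding to that shard. Specifically, when a data record is inserted, the value of the data record for each index field and the value of at least one attribute field of the data record may be received, and the value of the sort field of the data record may also be received. These values may be received from user input; when the user input is received, processing may proceed according to steps S210, S220, and S230 described above.
As an example, a data record may be inserted through the interface put(table_name,ts,key_1,key_2,……,key_N,value); this example may correspond to the case where different shard groups have the same sort field. Before a data record is inserted, it may be specified in advance that different shard groups have the same sort field, for example, the timestamp. Here, "table_name" is the name of the data table into which the record is inserted, "ts" is the value of the sort field, key_1,key_2,……,key_N are the values of the index fields of the to-be-inserted data record, and "value" is the value of at least one attribute field of the to-be-inserted data record. As an example, value may be obtained by encoding (for example, merging or serializing) the at least one attribute value according to a specific rule.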
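As an example, the values of the index fields of the to-be-inserted data record are arranged in the same order as the index fields of the plurality of shard groups. In other words, the order of the index field values (key_1,key_2,……,key_N) in the put interface is the same as the order of the index fields of the shard groups of the data table. In this case, the index field to which each value belongs can be determined from the order of key_1,key_2,……,key_N in put, so data records can be inserted conveniently and efficiently.
As an example, a data record may be inserted through the interface put(table_name,key_1,key_2,……,key_N,field_1,field_2,……,field_N,value); this example may correspond to the case where different shard groups have different sort fields. Here, field_1,field_2,……,field_N are the values of the sort fields corresponding to key_1,key_2,……,key_N respectively. The other parameters of this interface can be understood with reference to the description of put(table_name,ts,key_1,key_2,……,key_N,value).
The above examples of the put interface merely illustrate the concept of the present disclosure and do not limit its scope of protection. Other interfaces for inserting data records are also feasible; for example, one or more parameters of the above interfaces may be omitted, or one or more parameters may be added.
As an example, the values of the sort fields of the to-be-inserted data record are arranged in the same order as the sort fields of the plurality of shard groups. In other words, the order of the sort field values (field_1,field_2,……,field_N) in the put interface is the same as the order of the sort fields of the shard groups of the data table. In this case, the sort field to which each of field_1,field_2,……,field_N belongs can be determined from their order in put, so data records can be inserted conveniently and efficiently.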
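According to an exemplary embodiment of the present disclosure, the data records to be inserted into the data table may be the data records shown in Table 1. The records in Table 1 may be inserted one by one through the put interface described above. The card number may serve as the index field and the timestamp as the sort field. The values of the card number, timestamp, transaction amount, transaction location, and POS number may be encoded (for example, serialized or merged) so that they can be stored in the storage space shared by all second-level skip lists.
As an example, the storage space for storing the value of at least one attribute field of the to-be-inserted data record stores a character string obtained in one of the following ways: merging the values of the at least one attribute field according to a predetermined string merging rule; serializing the values of the at least one attribute field according to a predetermined JSON format; serializing the values of the at least one attribute field according to a predetermined ProtocolBuffer format; and serializing the values of the at least one attribute field according to a predetermined Schema format.
As an example, the predetermined string merging rule includes merging with a specific symbol (for example, "|"). For example, using a preset symbol such as "|", the values of the transaction amount, transaction location, and POS number may be merged into the string "100|北京上地xx路|10xxx", and the string is stored in the storage space. Besides merging the values according to a predetermined string merging rule, obtaining the string with the JSON, ProtocolBuffer, or Schema format described above is also feasible. Of course, the above description is merely an example and should not be regarded as limiting.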
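The separator-based merging rule can be sketched as below. The function name `merge_values` is hypothetical, and rejecting values that already contain the separator is a design choice of this sketch rather than something the disclosure specifies.

```python
def merge_values(values, sep="|"):
    """Join attribute field values into one string using a preset separator."""
    parts = [str(v) for v in values]
    for part in parts:
        if sep in part:
            # Assumption of this sketch: refuse ambiguous input rather than escape it.
            raise ValueError(f"value {part!r} contains the separator {sep!r}")
    return sep.join(parts)


encoded = merge_values([100, "北京上地xx路", "10xxx"])
```

A structured format such as JSON avoids the separator-collision problem at the cost of a somewhat larger stored string.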
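As an example, the to-be-inserted data record is a time-series data record, and the sort fields are all timestamps. As an example, adding a node to the second-level skip list includes: adding the node according to the time indicated by the timestamp value, so that the nodes in the second-level skip list are ordered from most recent to least recent. Recency can be determined by comparing timestamp values: a larger timestamp value corresponds to a more recent time than a smaller one. Therefore, in the second-level skip list, a node with a larger timestamp value may be placed before a node with a smaller timestamp value.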
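The method of maintaining data in memory is described below taking the first data record in Table 1 as the record to be inserted. Assume that the data table includes two shard groups, that the index field of the first shard group is the card number and the index field of the second shard group is the POS number, that the sort field of both shard groups is the timestamp, and that the second-level skip lists of both shard groups share the storage space SP. In step S210, for the first shard group, the corresponding shard (for example, shard 0) is determined from the card number value "6222XXXX01"; for the second shard group, the corresponding shard (for example, shard 3) is determined from the POS number value "10xxx". In step S220, for the first shard group, the first-level skip list corresponding to shard 0 is searched for a node whose key is "6222XXXX01" (for example, node 11 is found); for the second shard group, the first-level skip list corresponding to shard 3 is searched for a node whose key is "10xxx" (for example, node 333 is found). In step S230, for the first shard group, a node whose key is the timestamp value "2018052814520505" and whose corresponding value is a pointer to the storage space SP is inserted into the second-level skip list corresponding to node 11; for the second shard group, a node whose key is the timestamp value "2018052814520505" and whose corresponding value is a pointer to the storage space SP is inserted into the second-level skip list corresponding to node 333. "6222XXXX01", "2018052814520505", "100", "北京上地xx路", and "10xxx" are encoded and stored in the storage space SP.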
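Now consider the case where the records to be inserted are the first to third data records in Table 1. Within the same shard group, the first and second data records have the same card number, so these two records correspond to the same second-level skip list. In that second-level skip list, after the first data record has been added, the node corresponding to the second data record is added before the node corresponding to the first data record, because the timestamp value of the second data record is larger than that of the first.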
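The most-recent-first ordering can be sketched with a sorted list whose keys are negated timestamps, so that a larger timestamp sorts earlier. The list is an assumed stand-in for the second-level skip list, the function names are hypothetical, and the second timestamp is an illustrative value rather than one confirmed from Table 1.

```python
import bisect


def add_newest_first(nodes, ts, pointer):
    """Insert (ts, pointer) so that larger timestamps come first.

    Keys are stored negated, which lets an ascending sorted insert
    produce a most-recent-first order.
    """
    bisect.insort(nodes, (-ts, pointer))


def timestamps(nodes):
    """Return the timestamps in list order (most recent first)."""
    return [-neg_ts for neg_ts, _ in nodes]


nodes = []
add_newest_first(nodes, 2018052814520505, 0)  # first record
add_newest_first(nodes, 2018052815520505, 1)  # later record (illustrative value)
```

After both inserts, the later record sits ahead of the earlier one, matching the placement described above.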
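FIG. 4 shows a schematic diagram of an operation of inserting a data record according to an exemplary embodiment of the present disclosure. As shown in FIG. 4, a data record may be inserted into a data table in memory through a put interface (for example, the put interface described above). This exemplary embodiment corresponds to the case where different shard groups have the same sort field. The index field values input through put are key1, key2, and key3, which correspond to dimension a, dimension b, and dimension c respectively; the sort field value input through put is ts; and the value of at least one attribute field input through put is value. The data table may include shard group a, shard group b, and shard group c, whose index fields correspond to key1, key2, and key3 respectively and whose sort field corresponds to ts. The index fields of shard group a, shard group b, and shard group c are arranged in the same order as key1, key2, and key3. Once the position of any of key1, key2, and key3 is known, the index field to which that value belongs can be determined.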
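FIG. 5 shows a flowchart of an operation of querying data in memory according to an exemplary embodiment of the present disclosure. As shown in FIG. 5, the operation includes step S310 to step S360.
In step S310, the index field of the data record to be queried, the value of the index field, and the value range of the sort field are received. Here, the index field of the data record to be queried means the name of the index field, for example, the card number.
As an example, the data record to be queried is a time-series data record, and the sort fields are all timestamps.
As an example, the value range specifies a start value and an end value of the timestamp, or specifies only an end value of the timestamp.
In step S320, the shard group whose index field is the same as the index field of the data record to be queried is determined among the plurality of shard groups.
In step S330, within the determined shard group, the corresponding shard is determined according to the value of the index field of the data record to be queried.
As an example, the shard corresponding to the data record to be queried may be determined as follows: computing a hash value corresponding to the value of the index field of the data record to be queried; obtaining the remainder of dividing the computed hash value by the total number of shards in the determined shard group; and determining the shard corresponding to the obtained remainder as the corresponding shard. A hash function may be applied to the value of the index field to obtain the hash value. For example, the hash function may be MurmurHash, proposed by Austin Appleby. Of course, the present disclosure does not limit the hash function used; other hash functions may also be used to compute the hash value.
In step S340, the first-level skip list corresponding to the determined shard is searched for a node whose key is the value of the index field of the data record to be queried.
In step S350, the second-level skip list indicated by the pointer or object in the found node is queried for the pointers in the nodes whose keys fall within the value range of the sort field of the data record to be queried.
In step S360, the value of at least one attribute field of the data record to be queried is retrieved from the storage space indicated by the pointers obtained.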
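As an example, data may be queried through the interface scan(table_name,key_name,key_value,start_time,end_time), where table_name specifies the name of the data table to query, key_name specifies the index field of the data record to be queried, key_value specifies the value of that index field, and start_time and end_time specify the value range of the data record to be queried, for example, a start time and an end time. This example may correspond to the case where all shard groups of the data table have the same sort field.
As an example, data may be queried through the interface get(table_name,key_name,key_value,ts), where table_name specifies the name of the data table to query, key_name specifies the index field of the data record to be queried, key_value specifies the value of that index field, and ts specifies the value range of the data record to be queried. For example, ts may specify the timestamp value of the data record to be queried, in which case the data actually sought is the data whose timestamp value is ts. As another example, ts may specify the end time of the data record to be queried, in which case the data actually sought is the data from the moment of the query back to the specified ts. This example may correspond to the case where all shard groups of the data table have the same sort field.
The above examples of the scan and get interfaces merely illustrate the concept of the present disclosure and do not limit its scope of protection. Other interfaces for querying data are also feasible; for example, one or more parameters of the above interfaces may be omitted, or one or more parameters may be added.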
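Taking the data records in Table 1 as an example, suppose the data to be queried has the card number value "6222XXXX01", a timestamp start value of "2018052815520505", and a timestamp end value of "2018052814520505". Assume that, based on "6222XXXX01", the shard corresponding to the data record to be queried in the shard group whose index field is the card number is determined to be shard 0, and that among node 11 to node 1m of the first-level skip list corresponding to shard 0, the node whose key is "6222XXXX01" is found to be node 11. The second-level skip list corresponding to node 11 is determined to include node 41 to node 4k. Node 41 to node 4k are searched for the nodes whose keys lie between the start value "2018052815520505" and the end value "2018052814520505", and the data is retrieved from the storage space pointed to by the pointers stored in the nodes found. As another example, when the value range specifies only the end value of the timestamp (for example, "2018052814520505"), the nodes of the second-level skip list whose keys are greater than or equal to "2018052814520505" may be found, and the data is retrieved from the storage space pointed to by the pointers stored in those nodes.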
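The query steps S320 to S360 can be sketched as a single function. As assumptions of this sketch: a `dict` per shard stands in for the first-level skip list, a most-recent-first list of `(timestamp, pointer)` pairs stands in for the second-level skip list, CRC32 replaces MurmurHash, the function name `scan` and its argument layout are hypothetical, and the range is filtered linearly instead of seeking within a skip list. Following the worked example, the start value is the more recent bound.

```python
import zlib


def scan(groups, storage, key_name, key_value, start_ts, end_ts):
    """Return stored records for key_value whose timestamp lies in [end_ts, start_ts]."""
    shards = groups[key_name]                                   # S320: pick shard group
    shard = shards[zlib.crc32(key_value.encode("utf-8")) % len(shards)]  # S330
    second_level = shard.get(key_value)                         # S340: first-level lookup
    if second_level is None:
        return []
    # S350: collect pointers of nodes in range; S360: dereference them.
    return [storage[ptr] for ts, ptr in second_level if end_ts <= ts <= start_ts]


# Hand-built example data (illustrative values only).
storage = [{"amount": "100"}, {"amount": "50"}]
groups = {"card": [dict() for _ in range(4)]}
shard_index = zlib.crc32(b"6222XXXX01") % 4
groups["card"][shard_index]["6222XXXX01"] = [
    (2018052815520505, 1),
    (2018052814520505, 0),
]
result = scan(groups, storage, "card", "6222XXXX01",
              2018052815520505, 2018052814520505)
```

Results come back most recent first because the second-level list is kept in that order.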
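As an example, retrieving the value of at least one attribute field of the data record to be queried from the storage space indicated by the pointers obtained includes retrieving the values in one of the following ways: splitting the values of the at least one attribute field according to a predetermined string splitting rule; deserializing the values of the at least one attribute field according to a predetermined JSON format; deserializing the values of the at least one attribute field according to a predetermined ProtocolBuffer format; and deserializing the values of the at least one attribute field according to a predefined Schema format.
For example, suppose the values corresponding to the keys in the nodes found are "100|北京上地xx路|10xxx" and "50|北京西二旗xx店|20xxx". Using a preset symbol such as "|", "100|北京上地xx路|10xxx" may be split, and according to the preset meaning of each resulting substring, the first substring is the transaction amount value "100", the second is the transaction location value "北京上地xx路", and the third is the POS number value "10xxx". Similarly, the transaction amount value "50", the transaction location value "北京西二旗xx店", and the POS number value "20xxx" can be obtained from "50|北京西二旗xx店|20xxx".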
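The splitting rule can be sketched as the inverse of a separator-based merge. The function name `split_values` and the English field names are hypothetical labels chosen for this sketch; the split values remain strings, since the disclosure does not say how types are restored.

```python
def split_values(encoded, field_names, sep="|"):
    """Split a stored string and label each part by its preset position."""
    parts = encoded.split(sep)
    if len(parts) != len(field_names):
        raise ValueError("stored string does not match the expected field count")
    return dict(zip(field_names, parts))


fields = ("amount", "location", "pos_id")  # assumed names for the three positions
first = split_values("100|北京上地xx路|10xxx", fields)
second = split_values("50|北京西二旗xx店|20xxx", fields)
```

The count check catches stored strings that were written under a different field layout.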
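As an example, to ensure that not too much data is stored in memory (for example, in the in-memory data table), a node count threshold corresponding to the second-level skip list may be set. On this basis, querying the second-level skip list indicated by the pointer or object in the found node for the pointers in the nodes whose keys fall within the value range of the sort field includes: taking, from that second-level skip list, in order from most recent to least recent, the pointer stored in each of a predetermined number of nodes whose keys fall within the value range, where the predetermined number does not exceed the node count threshold.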
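As an example, a node count threshold corresponding to the second-level skip list may be set, and periodic deletion may be performed according to that threshold, namely: traversing, at a predetermined period, the first-level and second-level skip lists corresponding to each shard group; and when the number of nodes in a traversed second-level skip list exceeds the node count threshold, deleting, according to the order of the nodes in that second-level skip list, all nodes that come after the node corresponding to the threshold (for example, when the threshold is 10, the node corresponding to the threshold is the 10th node in the order of the second-level skip list), and deleting the storage space indicated by the pointers stored in the deleted nodes, where the order is from most recent to least recent.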
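The count-based trimming of one second-level skip list can be sketched as below. A most-recent-first list of `(timestamp, pointer)` pairs stands in for the skip list and a `dict` keyed by pointer stands in for the storage space; both stand-ins and the function name are assumptions of this sketch.

```python
def trim_to_threshold(second_level, storage, threshold):
    """Keep the first `threshold` nodes and drop everything after them.

    second_level: list of (timestamp, pointer), most recent first.
    storage: dict mapping pointer -> stored attribute values.
    Returns the number of nodes removed.
    """
    if len(second_level) <= threshold:
        return 0
    removed = second_level[threshold:]
    del second_level[threshold:]
    for _, pointer in removed:
        # Free the storage referenced by each removed node.
        storage.pop(pointer, None)
    return len(removed)


second_level = [(30, "c"), (20, "b"), (10, "a")]
storage = {"a": "oldest", "b": "middle", "c": "newest"}
removed_count = trim_to_threshold(second_level, storage, 2)
```

If the storage space is shared by several shard groups, freeing an entry here would leave stale pointers in the other groups unless they are trimmed consistently; the sketch does not handle that coordination.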
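As an example, to improve processing efficiency, the following expired-data deletion may be performed: setting an expiration length; and traversing, at a predetermined period (for example, 3 months), the first-level and second-level skip lists corresponding to each shard, locating in a second-level skip list the node whose timestamp value reaches the expiration length, and deleting as a whole the nodes that come after that node in the order of the second-level skip list, where the order is from most recent to least recent. For example, if the timestamp value corresponding to the set expiration length is 2018060000000000, this deletion removes all 3 nodes that were added to the second-level skip list for the 3 records of Table 1 in the example above.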
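The time-based deletion can be sketched with a binary search over a most-recent-first list. As assumptions of this sketch: the list stands in for a second-level skip list, nodes whose timestamp is strictly smaller than the cutoff are treated as expired, the function name is hypothetical, and the first and third timestamps are illustrative values.

```python
import bisect


def expire_older_than(second_level, cutoff_ts):
    """Cut off the tail of a most-recent-first list at the first expired node.

    second_level: list of (timestamp, pointer), most recent first.
    Returns the removed nodes so that their storage can be released.
    """
    # Negated timestamps are ascending, so bisect can locate the cut point.
    negated = [-ts for ts, _ in second_level]
    cut = bisect.bisect_right(negated, -cutoff_ts)
    removed = second_level[cut:]
    del second_level[cut:]
    return removed


nodes = [(2018052816520505, 2), (2018052815520505, 1), (2018052814520505, 0)]
removed = expire_older_than(nodes, 2018060000000000)
```

Because the nodes are ordered by time, everything after the cut point is expired, so the tail can be dropped in one operation instead of checking nodes one by one.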
图6示出根据本公开示例性实施例的管理内存数据的系统的框图。如图6中所示,根据本公开示例性实施例的管理内存数据的系统包括:分片管理装置410和跳表管理装置420。
分片管理装置410用于设置多个分片组,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同。
跳表管理装置420用于为每个分片分别构建对应的第一级跳表和第二级跳表,其中,与每个分片对应的第一级跳表被设置为用于存储以数据记录中关于所述每个分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点,与所述每个分片对应的第二级跳表被设置为用于存储以所述数据记录中关于所述每个分片的排序字段的取值为关键字且以指示用于存储所述数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
作为示例,与所有分片组对应的第二级跳表共享所述存储空间且所述存储空间存储所述数据记录的全部属性字段的取值,或者,与同一分片组对应的第二级跳表共享同一存储空间且所述同一存储空间存储所述数据记录的除了与所述同一分片组对应的索引字段和排序字段之外的所有属性字段的取值。
图7示出根据本公开示例性实施例的在内存中维护数据的系统的框图。如图7中所示,根据本公开示例性实施例的在内存中维护数据的系统包括:分片确定装置510、节点查找装置520和节点管理装置530。
分片确定装置510用于针对多个分片组中的每个分片组,根据待插入的数据记录的关于所述每个分片组的索引字段的取值来确定对应的分片,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同,每个分片分别对应第一级跳表,第一级跳表用于存储以数据记录中关于对应分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点。
节点查找装置520用于从与确定的分片对应的第一级跳表中查找以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点。
节点管理装置530用于在查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
作为示例,与所有分片组对应的第二级跳表共享所述存储空间且所述存储空间存储所述数据记录的全部属性字段的取值,或者与同一分片组对应的第二级跳表共享同一存储空间且所述同一存储空间存储所述数据记录的除了与所述同一分片组对应的索引字段和排序字段之外的所有属性字段的取值。
作为示例,分片确定装置510计算与所述待插入的数据记录的关于所述每个分片组的索引字段的取值对应的哈希值,获得计算出的哈希值除以所述每个分片组的分片总数所得的余数,并且将与获得的余数对应的分片确定为对应的分片。
作为示例,每个分片中存储有指示对应的第一级跳表的指针或对象。
作为示例,根据本公开示例性实施例的在内存中维护数据的系统还包括:跳表管理装置,其中,在未能查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,跳表管理装置创建第二级跳表,节点管理装置530在第一级跳表中创建以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字且以指示创建的第二级跳表的指针或对象为与该关键字对应的值的节点,并在创建的第二级跳表中添加以与所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
作为示例,用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间中存储有通过以下方式之一获得的字符串:按照预定的字符串合并规则对所述至少一个属性字段的取值进行合并,按照预定的JSON格式对所述至少一个属性字段的取值进行序列化,按照预定的ProtocolBuffer格式对所述至少一个属性字段的取值进行序列化,以及按照预定的Schema格式对所述至少一个属性字段的取值进行序列化。
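As an example, the system for maintaining data in memory according to an exemplary embodiment of the present disclosure further includes a shard group determination device (not shown) and a data acquisition device (not shown). The shard group determination device determines, among the plurality of shard groups, the shard group whose index field is the same as the index field of the data record to be queried. The shard determination device 510 determines, within the determined shard group, the corresponding shard according to the value of the index field of the data record to be queried. The node search device 520 searches the first-level skip list corresponding to the determined shard for a node whose key is the value of the index field of the data record to be queried. The data acquisition device queries the second-level skip list indicated by the pointer or object in the found node for the pointers in the nodes whose keys fall within the value range of the sort field of the data record to be queried, and retrieves the value of at least one attribute field of the data record to be queried from the storage space indicated by the pointers obtained.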
作为示例,数据获取装置根据查询到的指针,通过以下方式之一来取出所述待查询的数据记录的所述至少一个属性字段的取值:按照预定的字符串拆分规则对所述至少一个属性字段的取值进行拆分,按照预定的JSON格式对所述至少一个属性字段的取值进行反序列化,按照预定的ProtocolBuffer格式对所述至少一个属性字段的取值进行反序列化,以及按照预定义的Schema格式对所述至少一个属性字段的取值进行反序列化。
作为示例,分片确定装置510计算与所述待查询的数据记录的索引字段的取值对应的哈希值,获得计算出的哈希值除以所述确定的分片组的分片总数所得的余数,并将与获得的余数对应的分片确定为对应的分片。
作为示例,所述待插入的数据记录和/或所述待查询的数据记录是时序型数据记录,并且,排序字段均为时间戳。
作为示例,所述取值范围指定时间戳的起始值和终止值或者指定时间戳的终止值。
作为示例,节点管理装置530按照时间戳的取值指示的时间添加节点,使得第二级跳表中的节点按照时间由近及远的顺序排列。
作为示例,根据本公开示例性实施例的在内存中维护数据的系统还包括:节点数量阈值设置装置(未示出),用于设置与第二级跳表对应的节点数量阈值,其中,数据获取装置从查找到的节点中的指针或对象所指示的第二级跳表中,按照由近及远的顺序取出关键字在所述取值范围内的预定数量的节点中的每个节点中存储的指针,其中,所述预定数量不超过所述节点数量阈值。
作为示例,根据本公开示例性实施例的在内存中维护数据的系统还包括:节点数量阈值设置装置(未示出)、跳表遍历装置(未示出)和数据删除装置(未示出),其中,节点数量阈值设置装置设置与第二级跳表对应的节点数量阈值,跳表遍历装置以预定周期分别遍历与每个分片组的对应的第一级跳表和第二级跳表,当遍历到的与所述每个分片组对应的第二级跳表中的节点数量超过节点数量阈值时,节点删除装置根据该第二级跳表中的节点的排列顺序,删除该第二级跳表中排列顺序在与节点数量阈值对应的节点之后的所有 节点,并且删除被删除的节点所存储的指针所指示的存储空间,其中,所述排列顺序为时间由近及远的排列顺序。
作为示例,根据本公开示例性实施例的在内存中维护数据的系统还包括:过期期限长度设置装置(未示出)、跳表遍历装置(未示出)和数据删除装置(未示出),其中,过期期限长度设置装置设置过期期限长度,跳表遍历装置以预定周期遍历与各个分片对应的第一级跳表和第二级跳表,数据删除装置整体删除排列顺序在第二级跳表中的时间戳的取值达到所述过期期限长度的节点之后的节点,其中,所述排列顺序为时间由近及远的排列顺序。
应该理解,根据本公开示例性实施例的管理内存数据的系统和在内存中维护数据的系统的具体实现方式可参照结合图1至图5以及表1描述的相关具体实现方式来实现,在此不再赘述。
图6和图7所示出的装置可被分别配置为执行特定功能的软件、硬件、固件或上述项的任意组合。例如,这些装置可对应于专用的集成电路,也可对应于纯粹的软件代码,还可对应于软件与硬件相结合的单元或模块。此外,这些装置所实现的一个或多个功能也可由物理实体设备(例如,处理器、客户端或服务器等)中的组件来统一执行。
以上参照图1到图5描述了根据本公开示例性实施例的管理内存数据及在内存中维护数据的方法和系统。应理解,上述方法可通过记录在计算机可读存储介质上的程序来实现,例如,根据本公开的示例性实施例,可提供一种存储指令的计算机可读存储介质,其中,当所述指令被至少一个计算装置运行时,促使所述至少一个计算装置执行:设置多个分片组,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同;为每个分片分别构建对应的第一级跳表和第二级跳表,其中,与每个分片对应的第一级跳表被设置为用于存储以数据记录中关于所述每个分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点,与所述每个分片对应的第二级跳表被设置为用于存储以所述数据记录中关于所述每个分片的排序字段的取值为关键字且以指示用于存储所述数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
又如,根据本公开的示例性实施例,可提供一种存储指令的计算机可读存储介质,其中,当所述指令被至少一个计算装置运行时,促使所述至少一个计算装置执行:针对多个分片组中的每个分片组,根据待插入的数据记录的关于所述每个分片组的索引字段的取值来确定对应的分片,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同,每个分片分别对应第一级跳表,第一级跳表用于存储以数据记录中关于对应分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;从与确定的分片对应的第一级跳表中查找以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点;在查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
上述计算机可读存储介质中的计算机程序可在诸如处理器、客户端、主机、代理装置、服务器等计算机设备中部署的环境中运行,例如,由位于单机环境或分布式集群环境的至少一个计算装置来运行,作为示例,这里的计算装置可作为计算机、处理器、计算单元(或模块)、客户端、主机、代理装置、服务器等。应注意,所述计算机程序还可用于执行除了上述步骤以外的附加步骤或者在执行上述步骤时执行更为具体的处理,这些附加步骤和进一步处理的内容已经参照图1到图5进行了描述,这里为了避免重复将不再进行赘述。
应注意,根据本公开示例性实施例的管理内存数据的系统及在内存中维护数据的系统可完全依赖计算机程序的运行来实现相应的功能,即,各个装置与计算机程序的功能架构中与各步骤相应,使得整个系统通过专门的软件包(例如,lib库)而被调用,以实现相应的功能。
另一方面,图6和图7所示的各个装置也可以通过硬件、软件、固件、中间件、微代码或其任意组合来实现。当以软件、固件、中间件或微代码实现时,用于执行相应操作的程序代码或者代码段可以存储在诸如存储介质的计算机可读介质中,使得处理器可通过读取并运行相应的程序代码或者代码段来执行相应的操作。
例如,根据本公开示例性实施例,可提供一种包括至少一个计算装置和至少一个存储指令的存储装置的系统,其中,所述指令在被所述至少一个计算装置运行时,促使所述至少一个计算装置执行管理内存数据的以下步骤:设置多个分片组,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同;为每个分片分别构建对应的第一级跳表和第二级跳表,其中,与每个分片对应的第一级跳表被设置为用于存储以数据记录中关于所述每个分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点,与所述每个分片对应的第二级跳表被设置为用于存储以所述数据记录中关于所述每个分片的排序字段的取值为关键字且以指示用于存储所述数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
又如,根据本公开示例性实施例,可提供一种包括至少一个计算装置和至少一个存储指令的存储装置的系统,其中,所述指令在被所述至少一个计算装置运行时,促使所述至少一个计算装置执行在内存中维护数据的以下步骤:针对多个分片组中的每个分片组,根据待插入的数据记录的关于所述每个分片组的索引字段的取值来确定对应的分片,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同,每个分片分别对应第一级跳表,第一级跳表用于存储以数据记录中关于对应分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;从与确定的分片对应的第一级跳表中查找以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点;在查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
这里,所述系统可构成单机计算环境或分布式计算环境,其包括至少一个计算装置和至少一个存储装置,这里,作为示例,计算装置可以是通用或专用的计算机、处理器等,可以是单纯利用软件来执行处理的单元,还可以是软硬件相结合的实体。也就是说,计算装置可实现为计算机、处理器、计算单元(或模块)、客户端、主机、代理装置、服务器等。此外,存储装置可以是物理上的存储设备或逻辑上划分出的存储单元,其可与计算装置在操作上进行耦合,或者可例如通过I/O端口、网络连接等互相通信。
此外,例如,本公开的示例性实施例还可以实现为计算装置,该计算装置包括存储部件和处理器,存储部件中存储有计算机可执行指令集合,当所述计算机可执行指令集合被所述处理器执行时,执行管理内存数据的方法和/或在内存中维护数据的方法。
具体说来,所述计算装置可以部署在服务器或客户端中,也可以部署在分布式网络环境中的节点装置上。此外,所述计算装置可以是PC计算机、平板装置、个人数字助理、智能手机、web应用或其他能够执行上述指令集合的装置。
这里,所述计算装置并非必须是单个的计算装置,还可以是任何能够单独或联合执行上述指令(或指令集)的装置或电路的集合体。计算装置还可以是集成控制系统或系统管 理器的一部分,或者可被配置为与本地或远程(例如,经由无线传输)以接口互联的便携式电子装置。
在所述计算装置中,处理器可包括中央处理器(CPU)、图形处理器(GPU)、可编程逻辑装置、专用处理器系统、微控制器或微处理器。作为示例而非限制,处理器还可包括模拟处理器、数字处理器、微处理器、多核处理器、处理器阵列、网络处理器等。
根据本公开示例性实施例的管理内存数据的方法和/或在内存中维护数据的方法中所描述的某些操作可通过软件方式来实现,某些操作可通过硬件方式来实现,此外,还可通过软硬件结合的方式来实现这些操作。
处理器可运行存储在存储部件之一中的指令或代码,其中,所述存储部件还可以存储数据。指令和数据还可经由网络接口装置而通过网络被发送和接收,其中,所述网络接口装置可采用任何已知的传输协议。
存储部件可与处理器集成为一体,例如,将RAM或闪存布置在集成电路微处理器等之内。此外,存储部件可包括独立的装置,诸如,外部盘驱动、存储阵列或任何数据库系统可使用的其他存储装置。存储部件和处理器可在操作上进行耦合,或者可例如通过I/O端口、网络连接等互相通信,使得处理器能够读取存储在存储部件中的文件。
此外,所述计算装置还可包括视频显示器(诸如,液晶显示器)和用户交互接口(诸如,键盘、鼠标、触摸输入装置等)。计算装置的所有组件可经由总线和/或网络而彼此连接。
根据本公开示例性实施例的管理内存数据的方法和/或在内存中维护数据的方法所涉及的操作可被描述为各种互联或耦合的功能块或功能示图。然而,这些功能块或功能示图可被均等地集成为单个的逻辑装置或按照非确切的边界进行操作。
以上描述了本公开的各示例性实施例,应理解,上述描述仅是示例性的,并非穷尽性的,本公开不限于所披露的各示例性实施例。在不偏离本公开的范围和精神的情况下,对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。因此,本公开的保护范围应该以权利要求的范围为准。

Claims (37)

  1. 一种由至少一个计算装置执行的管理内存数据的方法,包括:
    设置多个分片组,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同;
    为每个分片分别构建对应的第一级跳表和第二级跳表,其中,与每个分片对应的第一级跳表被设置为用于存储以数据记录中关于所述每个分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点,与所述每个分片对应的第二级跳表被设置为用于存储以所述数据记录中关于所述每个分片的排序字段的取值为关键字且以指示用于存储所述数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
  2. 如权利要求1所述的方法,其中,与所有分片组对应的第二级跳表共享所述存储空间且所述存储空间存储所述数据记录的全部属性字段的取值,或者,与同一分片组对应的第二级跳表共享同一存储空间且所述同一存储空间存储所述数据记录的除了与所述同一分片组对应的索引字段和排序字段之外的所有属性字段的取值。
  3. 一种由至少一个计算装置执行的在内存中维护数据的方法,包括:
    针对多个分片组中的每个分片组,根据待插入的数据记录的关于所述每个分片组的索引字段的取值来确定对应的分片,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同,每个分片分别对应第一级跳表,第一级跳表用于存储以数据记录中关于对应分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;
    从与确定的分片对应的第一级跳表中查找以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点;
    在查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
  4. 如权利要求3所述的方法,其中,与所有分片组对应的第二级跳表共享所述存储空间且所述存储空间存储所述数据记录的全部属性字段的取值,或者与同一分片组对应的第二级跳表共享同一存储空间且所述同一存储空间存储所述数据记录的除了与所述同一分片组对应的索引字段和排序字段之外的所有属性字段的取值。
  5. 如权利要求3所述的方法,其中,确定对应的分片的步骤包括:
    计算与所述待插入的数据记录的关于所述每个分片组的索引字段的取值对应的哈希值;
    获得计算出的哈希值除以所述每个分片组的分片总数所得的余数;
    将与获得的余数对应的分片确定为对应的分片。
  6. 如权利要求3所述的方法,其中,每个分片中存储有指示对应的第一级跳表的指针或对象。
  7. 如权利要求3所述的方法,还包括:
    在未能查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,创建第二级跳表,在第一级跳表中创建以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字且以指示创建的第二级跳表的指针或对象为与该关键字对应的值的节点,并在创建的第二级跳表中添加以与所述待插入的数据 记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
  8. 如权利要求3或7所述的方法,其中,用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间中存储有通过以下方式之一获得的字符串:
    按照预定的字符串合并规则对所述至少一个属性字段的取值进行合并,按照预定的JSON格式对所述至少一个属性字段的取值进行序列化,按照预定的ProtocolBuffer格式对所述至少一个属性字段的取值进行序列化,以及按照预定的Schema格式对所述至少一个属性字段的取值进行序列化。
  9. 如权利要求3所述的方法,还包括:
    确定所述多个分片组中索引字段与待查询的数据记录的索引字段相同的分片组;
    在确定的分片组中,根据所述待查询的数据记录的索引字段的取值来确定对应的分片;
    从与确定的分片对应的第一级跳表中查找以所述待查询的数据记录的索引字段的取值为关键字的节点;
    从查找到的节点中的指针或对象所指示的第二级跳表中查询关键字在所述待查询的数据记录的排序字段的取值范围内的节点中的指针;
    从查询到的指针所指示的存储空间中取出待查询的数据记录的至少一个属性字段的取值。
  10. 如权利要求9所述的方法,其中,从查询到的指针所指示的存储空间中取出待查询的数据记录的至少一个属性字段的取值的步骤包括:通过以下方式之一来取出所述待查询的数据记录的所述至少一个属性字段的取值:
    按照预定的字符串拆分规则对所述至少一个属性字段的取值进行拆分,按照预定的JSON格式对所述至少一个属性字段的取值进行反序列化,按照预定的ProtocolBuffer格式对所述至少一个属性字段的取值进行反序列化,以及按照预定义的Schema格式对所述至少一个属性字段的取值进行反序列化。
  11. 如权利要求9所述的方法,其中,确定对应的分片的步骤包括:
    计算与所述待查询的数据记录的索引字段的取值对应的哈希值;
    获得计算出的哈希值除以所述确定的分片组的分片总数所得的余数;
    将与获得的余数对应的分片确定为对应的分片。
  12. 如权利要求3或9所述的方法,其中,所述待插入的数据记录和所述待查询的数据记录中的至少一个是时序型数据记录,并且,排序字段均为时间戳。
  13. 如权利要求12所述的方法,其中,所述取值范围指定时间戳的起始值和终止值或者指定时间戳的终止值。
  14. 如权利要求12所述的方法,其中,在第二级跳表中添加节点的步骤包括:按照时间戳的取值指示的时间添加节点,使得第二级跳表中的节点按照时间由近及远的顺序排列。
  15. 如权利要求14所述的方法,还包括:设置与第二级跳表对应的节点数量阈值,其中,从查找到的节点中的指针或对象所指示的第二级跳表中查询关键字在所述待查询的数据记录的排序字段的取值范围内的节点中的指针的步骤包括:
    从查找到的节点中的指针或对象所指示的第二级跳表中,按照由近及远的顺序取出关键字在所述取值范围内的预定数量的节点中的每个节点中存储的指针,其中,所述预定数量不超过所述节点数量阈值。
  16. 如权利要求12所述的方法,还包括:
    设置与第二级跳表对应的节点数量阈值;
    以预定周期分别遍历与每个分片组的对应的第一级跳表和第二级跳表;
    当遍历到的与所述每个分片组对应的第二级跳表中的节点数量超过节点数量阈值时,根据该第二级跳表中的节点的排列顺序,删除该第二级跳表中排列顺序在与节点数量阈值 对应的节点之后的所有节点,并且删除被删除的节点所存储的指针所指示的存储空间,
    其中,所述排列顺序为时间由近及远的排列顺序。
  17. 如权利要求12所述的方法,还包括:
    设置过期期限长度;
    以预定周期遍历与各个分片对应的第一级跳表和第二级跳表,通过定位第二级跳表中的时间戳的取值达到所述过期期限长度的节点来整体删除第二级跳表中的排列顺序在该节点之后的节点,
    其中,所述排列顺序为时间由近及远的排列顺序。
  18. 一种包括至少一个计算装置和至少一个存储指令的存储装置的系统,其中,所述指令在被所述至少一个计算装置运行时,促使所述至少一个计算装置执行管理内存数据的以下步骤:
    设置多个分片组,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同;
    为每个分片分别构建对应的第一级跳表和第二级跳表,其中,与每个分片对应的第一级跳表被设置为用于存储以数据记录中关于所述每个分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点,与所述每个分片对应的第二级跳表被设置为用于存储以所述数据记录中关于所述每个分片的排序字段的取值为关键字且以指示用于存储所述数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
  19. 如权利要求18所述的系统,其中,与所有分片组对应的第二级跳表共享所述存储空间且所述存储空间存储所述数据记录的全部属性字段的取值,或者,与同一分片组对应的第二级跳表共享同一存储空间且所述同一存储空间存储所述数据记录的除了与所述同一分片组对应的索引字段和排序字段之外的所有属性字段的取值。
  20. 一种包括至少一个计算装置和至少一个存储指令的存储装置的系统,其中,所述指令在被所述至少一个计算装置运行时,促使所述至少一个计算装置执行在内存中维护数据的以下步骤:
    针对多个分片组中的每个分片组,根据待插入的数据记录的关于所述每个分片组的索引字段的取值来确定对应的分片,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同,每个分片分别对应第一级跳表,第一级跳表用于存储以数据记录中关于对应分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;
    从与确定的分片对应的第一级跳表中查找以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点;
    在查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
  21. 如权利要求20所述的系统,其中,与所有分片组对应的第二级跳表共享所述存储空间且所述存储空间存储所述数据记录的全部属性字段的取值,或者与同一分片组对应的第二级跳表共享同一存储空间且所述同一存储空间存储所述数据记录的除了与所述同一分片组对应的索引字段和排序字段之外的所有属性字段的取值。
  22. 如权利要求20所述的系统,其中,确定对应的分片的步骤包括:
    计算与所述待插入的数据记录的关于所述每个分片组的索引字段的取值对应的哈希 值,获得计算出的哈希值除以所述每个分片组的分片总数所得的余数,并且将与获得的余数对应的分片确定为对应的分片。
  23. 如权利要求20所述的系统,其中,每个分片中存储有指示对应的第一级跳表的指针或对象。
  24. 如权利要求20所述的系统,还执行以下步骤:在未能查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,创建第二级跳表,在第一级跳表中创建以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字且以指示创建的第二级跳表的指针或对象为与该关键字对应的值的节点,并在创建的第二级跳表中添加以与所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
  25. 如权利要求20或24所述的系统,其中,用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间中存储有通过以下方式之一获得的字符串:
    按照预定的字符串合并规则对所述至少一个属性字段的取值进行合并,按照预定的JSON格式对所述至少一个属性字段的取值进行序列化,按照预定的ProtocolBuffer格式对所述至少一个属性字段的取值进行序列化,以及按照预定的Schema格式对所述至少一个属性字段的取值进行序列化。
  26. 如权利要求20所述的系统,还执行以下步骤:
    确定所述多个分片组中索引字段与待查询的数据记录的索引字段相同的分片组;在确定的分片组中,根据所述待查询的数据记录的索引字段的取值来确定对应的分片;从与确定的分片对应的第一级跳表中查找以所述待查询的数据记录的索引字段的取值为关键字的节点;从查找到的节点中的指针或对象所指示的第二级跳表中查询关键字在所述待查询的数据记录的排序字段的取值范围内的节点中的指针,并从查询到的指针所指示的存储空间中取出待查询的数据记录的至少一个属性字段的取值。
  27. 如权利要求26所述的系统,其中,从查询到的指针所指示的存储空间中取出待查询的数据记录的至少一个属性字段的取值的步骤包括:通过以下方式之一来取出所述待查询的数据记录的所述至少一个属性字段的取值:
    按照预定的字符串拆分规则对所述至少一个属性字段的取值进行拆分,按照预定的JSON格式对所述至少一个属性字段的取值进行反序列化,按照预定的ProtocolBuffer格式对所述至少一个属性字段的取值进行反序列化,以及按照预定义的Schema格式对所述至少一个属性字段的取值进行反序列化。
  28. 如权利要求26所述的系统,其中,确定对应的分片的步骤包括:
    计算与所述待查询的数据记录的索引字段的取值对应的哈希值,获得计算出的哈希值除以所述确定的分片组的分片总数所得的余数,并将与获得的余数对应的分片确定为对应的分片。
  29. 如权利要求20或26所述的系统,其中,所述待插入的数据记录和所述待查询的数据记录中的至少一个是时序型数据记录,并且,排序字段均为时间戳。
  30. 如权利要求29所述的系统,其中,所述取值范围指定时间戳的起始值和终止值或者指定时间戳的终止值。
  31. 如权利要求29所述的系统,其中,在第二级跳表中添加节点的步骤包括:按照时间戳的取值指示的时间添加节点,使得第二级跳表中的节点按照时间由近及远的顺序排列。
  32. 如权利要求31所述的系统,其中,还执行以下步骤:设置与第二级跳表对应的节点数量阈值,其中,从查找到的节点中的指针或对象所指示的第二级跳表中,按照由近及远的顺序取出关键字在所述取值范围内的预定数量的节点中的每个节点中存储的指针,其中,所述预定数量不超过所述节点数量阈值。
  33. 如权利要求29所述的系统,其中,还执行以下步骤:
    设置与第二级跳表对应的节点数量阈值;
    以预定周期分别遍历与每个分片组的对应的第一级跳表和第二级跳表;
    当遍历到的与所述每个分片组对应的第二级跳表中的节点数量超过节点数量阈值时,根据该第二级跳表中的节点的排列顺序,删除该第二级跳表中排列顺序在与节点数量阈值对应的节点之后的所有节点,并且删除被删除的节点所存储的指针所指示的存储空间,其中,所述排列顺序为时间由近及远的排列顺序。
  34. 如权利要求29所述的系统,还执行以下步骤:
    设置过期期限长度;
    以预定周期遍历与各个分片对应的第一级跳表和第二级跳表,整体删除排列顺序在第二级跳表中的时间戳的取值达到所述过期期限长度的节点之后的节点,其中,所述排列顺序为时间由近及远的排列顺序。
  35. 一种存储指令的计算机可读存储介质,其中,当所述指令被至少一个计算装置运行时,促使所述至少一个计算装置执行如权利要求1到17中的任一权利要求所述的方法。
  36. 一种管理内存数据的系统,包括:
    分片管理装置,用于设置多个分片组,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同;
    跳表管理装置,用于为每个分片分别构建对应的第一级跳表和第二级跳表,其中,与每个分片对应的第一级跳表被设置为用于存储以数据记录中关于所述每个分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点,与所述每个分片对应的第二级跳表被设置为用于存储以所述数据记录中关于所述每个分片的排序字段的取值为关键字且以指示用于存储所述数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
  37. 一种在内存中维护数据的系统,包括:
    分片确定装置,用于针对多个分片组中的每个分片组,根据待插入的数据记录的关于所述每个分片组的索引字段的取值来确定对应的分片,其中,每个分片组包括至少一个分片,每个分片组中的所有分片对应统一的索引字段和排序字段,不同分片组的索引字段不同,并且,不同分片组的排序字段相同或不同,每个分片分别对应第一级跳表,第一级跳表用于存储以数据记录中关于对应分片的索引字段的取值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;
    节点查找装置,用于从与确定的分片对应的第一级跳表中查找以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点;
    节点管理装置,用于在查找到以所述待插入的数据记录的关于所述每个分片组的索引字段的取值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以所述待插入的数据记录的关于所述每个分片组的排序字段的取值为关键字且以指示用于存储所述待插入的数据记录的至少一个属性字段的取值的存储空间的指针为与该关键字对应的值的节点。
PCT/CN2019/109144 2018-10-12 2019-09-29 管理内存数据及在内存中维护数据的方法和系统 WO2020073854A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811187043.XA CN109299100B (zh) 2018-10-12 2018-10-12 管理内存数据及在内存中维护数据的方法和系统
CN201811187043.X 2018-10-12

Publications (1)

Publication Number Publication Date
WO2020073854A1 true WO2020073854A1 (zh) 2020-04-16

Family

ID=65162271

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/109144 WO2020073854A1 (zh) 2018-10-12 2019-09-29 管理内存数据及在内存中维护数据的方法和系统

Country Status (2)

Country Link
CN (2) CN111046034B (zh)
WO (1) WO2020073854A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114020986A (zh) * 2022-01-05 2022-02-08 深圳思谋信息科技有限公司 Content retrieval system
CN114706836A (zh) * 2022-03-29 2022-07-05 中国科学院软件研究所 Data life cycle management method based on an airborne embedded database
EP4254905A4 (en) * 2021-04-30 2024-02-14 Softgear Co Ltd SERIALIZATION METHOD, DESERIALIZATION METHOD, INFORMATION PROCESSING PROGRAM, INFORMATION PROCESSING APPARATUS AND COMMUNICATIONS SYSTEM

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
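CN111046034B (zh) * 2018-10-12 2024-02-13 第四范式(北京)技术有限公司 Method and system for managing memory data and maintaining data in memory
CN110555000A (zh) * 2019-09-05 2019-12-10 重庆紫光华山智安科技有限公司 Method for concurrently writing and reading traffic-checkpoint image metadata
CN111611245B (zh) * 2020-05-21 2023-09-05 第四范式(北京)技术有限公司 Method and system for processing data tables
CN111913801B (zh) * 2020-07-15 2023-08-29 广州虎牙科技有限公司 Data processing method and apparatus, proxy server, storage system, and storage medium
CN112948466A (zh) * 2021-03-15 2021-06-11 北京微纳星空科技有限公司 Satellite data processing method and apparatus, electronic device, and storage medium
CN112925792B (zh) * 2021-03-26 2024-01-05 北京中经惠众科技有限公司 Data storage control method and apparatus, computing device, and medium
CN113253926A (zh) * 2021-05-06 2021-08-13 天津大学深圳研究院 In-storage index construction method for improving the query and storage performance of novel memory
CN113377636B (zh) * 2021-06-07 2022-08-26 上海微盟企业发展有限公司 Method, system, and device for calculating page views, and readable storage medium
CN113434518B (zh) * 2021-08-26 2021-12-03 西安热工研究院有限公司 Time-series database query method, system, device, and storage medium
CN115618050B (zh) * 2022-12-06 2023-03-21 苏州浪潮智能科技有限公司 Video data storage and analysis method, apparatus, and system, communication device, and storage medium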

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090132563A1 (en) * 2007-11-19 2009-05-21 Sun Microsystems, Inc. Simple optimistic skiplist
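CN103942289A (zh) * 2014-04-12 2014-07-23 广西师范大学 Memory caching method for range queries on Hadoop
CN109086133A (zh) * 2018-07-06 2018-12-25 第四范式(北京)技术有限公司 Method and system for managing memory data and maintaining data in memory
CN109299100A (zh) * 2018-10-12 2019-02-01 第四范式(北京)技术有限公司 Method and system for managing memory data and maintaining data in memory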

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100428226C (zh) * 2003-12-27 2008-10-22 海信集团有限公司 Method for implementing memory-database-like access and retrieval
CN101464901B (zh) * 2009-01-16 2012-03-21 华中科技大学 Object lookup method in an object storage device
EP2290562A1 (en) * 2009-08-24 2011-03-02 Amadeus S.A.S. Segmented main-memory stored relational database table system with improved collaborative scan algorithm
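CN106462575A (zh) * 2013-12-02 2017-02-22 丘贝斯有限责任公司 Design and implementation of a clustered in-memory database
CN105117417B (zh) * 2015-07-30 2018-04-17 西安交通大学 Read-optimized Trie index method for in-memory databases
CN107679212A (zh) * 2017-10-17 2018-02-09 安徽慧视金瞳科技有限公司 Data query optimization method applied to the skip list data structure
CN108228796A (zh) * 2017-12-29 2018-06-29 百度在线网络技术(北京)有限公司 Management method, apparatus, and system for an MPP database, server, and medium
CN109669929A (zh) * 2018-12-14 2019-04-23 江苏瑞中数据股份有限公司 Real-time data storage method and system based on a distributed parallel database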

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4254905A4 (en) * 2021-04-30 2024-02-14 Softgear Co Ltd SERIALIZATION METHOD, DESERIALIZATION METHOD, INFORMATION PROCESSING PROGRAM, INFORMATION PROCESSING APPARATUS AND COMMUNICATIONS SYSTEM
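CN114020986A (zh) * 2022-01-05 2022-02-08 深圳思谋信息科技有限公司 Content retrieval system
CN114706836A (zh) * 2022-03-29 2022-07-05 中国科学院软件研究所 Data life cycle management method based on an airborne embedded database
CN114706836B (zh) * 2022-03-29 2023-01-10 中国科学院软件研究所 Data life cycle management method based on an airborne embedded database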

Also Published As

Publication number Publication date
CN111046034A (zh) 2020-04-21
CN111046034B (zh) 2024-02-13
CN109299100A (zh) 2019-02-01
CN109299100B (zh) 2019-08-30

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 19871127

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 19871127

Country of ref document: EP

Kind code of ref document: A1