WO2020007288A1 - Method and system for managing memory data and maintaining data in memory - Google Patents

Method and system for managing memory data and maintaining data in memory

Info

Publication number
WO2020007288A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
value
attribute value
node
level jump
Prior art date
Application number
PCT/CN2019/094365
Other languages
English (en)
French (fr)
Inventor
邓龙
王太泽
黄亚建
范晓亮
Original Assignee
第四范式(北京)技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 第四范式(北京)技术有限公司
Publication of WO2020007288A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory

Definitions

  • The present disclosure relates generally to the field of memory data management and maintenance, and more particularly to a method and system for managing memory data, a method and system for maintaining data in memory, and corresponding computer-readable media and computing devices.
  • Relational databases such as MySQL and SQL Server are mainly used to manage and maintain relational data.
  • Non-relational databases such as Redis and MongoDB are mainly used to manage and maintain non-relational data.
  • Relational data means data based on a relational model (RM).
  • Non-relational data means data that is not based on a relational model.
  • A time series database (TSDB), such as InfluxDB, has been proposed.
  • An in-memory database, such as VoltDB, has also been proposed.
  • However, traditional databases, including those listed above, take a long time to read and write data, and can perform only a small number of data writing tasks and/or data query tasks simultaneously per unit time.
  • Exemplary embodiments of the present disclosure provide a method and system for managing memory data, a method and system for maintaining data in memory, and a corresponding computer-readable medium and computing device, in order to solve the problems in the prior art that reading and writing data takes a long time and that only a small number of data writing tasks and/or data query tasks can be performed simultaneously per unit time.
  • A method for managing memory data includes: setting a data table including a plurality of shards, wherein each shard corresponds to a first-level jump table; setting the first-level jump table to store nodes in which a first attribute value of the data is the key and a pointer or object indicating a second-level jump table is the value corresponding to the key; and setting the second-level jump table to store nodes in which a second attribute value of the data is the key and the value corresponding to the key includes at least one attribute value of the data.
  • A method for maintaining data in memory includes: determining, according to a first attribute value of data to be inserted, the shard corresponding to the data to be inserted in a data table including a plurality of shards, wherein each shard corresponds to a first-level jump table.
  • The first-level jump table stores nodes in which the first attribute value of the data is the key and a pointer or object indicating a second-level jump table is the value corresponding to the key.
  • The step of determining the shard corresponding to the data to be inserted includes: calculating a hash value corresponding to the first attribute value of the data to be inserted; obtaining the remainder of the calculated hash value divided by the total number of shards in the data table; and determining the shard corresponding to the obtained remainder as the shard corresponding to the data to be inserted.
  • a pointer or an object indicating a corresponding first-level jump table is stored in each of the plurality of fragments.
  • The method for maintaining data in memory further includes: in the case that no node with the first attribute value of the data to be inserted as the key is found in the first-level jump table, creating a second-level jump table; creating, in the first-level jump table, a node with the first attribute value of the data to be inserted as the key and a pointer or object indicating the created second-level jump table as the value corresponding to the key; and adding, to the created second-level jump table, a node with the second attribute value of the data to be inserted as the key and a value corresponding to the key that includes the at least one attribute value of the data to be inserted.
  • The at least one attribute value of the data to be inserted includes the first attribute value and/or the second attribute value of the data to be inserted, or includes neither the first attribute value nor the second attribute value of the data to be inserted.
  • In the node added to the second-level jump table, the value corresponding to the second attribute value of the data to be inserted includes a character string obtained in one of the following ways: merging the at least one attribute value according to a predetermined string merging rule, serializing the at least one attribute value according to a predetermined JSON format, serializing the at least one attribute value according to a predetermined ProtocolBuffer format, or serializing the at least one attribute value according to a predefined Schema format.
  • The method for maintaining data in memory further includes: receiving a first attribute value of data to be queried and a value range for the second attribute value; determining, according to the first attribute value of the data to be queried, the shard corresponding to the data to be queried in the data table; finding, in the first-level jump table corresponding to the determined shard, the node with the first attribute value of the data to be queried as the key; and extracting, from the second-level jump table indicated by the pointer or object in the found node, the at least one attribute value of each node whose key is within the value range.
  • The step of extracting, from the second-level jump table indicated by the pointer or object in the found node, the at least one attribute value of each node whose key is within the value range includes: taking out, from the nodes of the second-level jump table indicated by the pointer or object in the found node, the values corresponding to keys within the value range; and obtaining the at least one attribute value of the data to be queried in one of the following ways: splitting the retrieved value according to a predetermined string splitting rule, deserializing the retrieved value according to a predetermined JSON format, deserializing the retrieved value according to a predetermined ProtocolBuffer format, or deserializing the retrieved value according to a predefined Schema format.
  • The step of determining the shard corresponding to the data to be queried includes: calculating a hash value corresponding to the first attribute value of the data to be queried; obtaining the remainder of the calculated hash value divided by the total number of shards in the data table; and determining the shard corresponding to the obtained remainder as the shard corresponding to the data to be queried.
  • The data to be inserted or the data to be queried is time-series data, the second attribute value is a timestamp value, and the value range specifies either a start value and an end value of the timestamp value, or an end value of the timestamp value alone.
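The query steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: plain dicts and sorted lists stand in for the first-level and second-level jump tables, `zlib.crc32` stands in for the unspecified hash function, timestamps are kept in ascending order only so that `bisect` can be used, and `NUM_SHARDS`, `shard_of`, and `get` are hypothetical names. The sample timestamps and values are made up.

```python
import bisect
import zlib

NUM_SHARDS = 4  # hypothetical shard count

def shard_of(key):
    # Hash of the first attribute value, modulo the total number of shards.
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS

# First-level stand-in per shard: key -> (sorted timestamps, values in the same order).
shards = [dict() for _ in range(NUM_SHARDS)]
shards[shard_of("6222XXXX01")]["6222XXXX01"] = (
    [100, 200, 300, 400],   # hypothetical timestamp values
    ["a", "b", "c", "d"],   # hypothetical encoded values
)

def get(key, start_ts, end_ts):
    # Locate the shard, then the first-level node, then scan the timestamp range.
    entry = shards[shard_of(key)].get(key)
    if entry is None:
        return []
    timestamps, values = entry
    lo = bisect.bisect_left(timestamps, start_ts)
    hi = bisect.bisect_right(timestamps, end_ts)
    return values[lo:hi]
```

A range given only by an end value can be expressed in this sketch by passing `float("-inf")` as `start_ts`.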
  • The step of adding a node to the second-level jump table includes: adding the node according to the time indicated by the timestamp value, so that the nodes in the second-level jump table are arranged in order of time from most recent to least recent.
  • The method for maintaining data in memory further includes: setting a node number threshold corresponding to the second-level jump table, wherein the step of extracting, from the second-level jump table indicated by the pointer or object in the found node, the at least one attribute value of each node whose key is within the value range includes: taking out, in order from most recent to least recent, the at least one attribute value of nodes whose keys are within the value range and whose number does not exceed the node number threshold.
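A bounded read of this kind might look like the sketch below. It assumes the second-level nodes are available as a Python list of `(timestamp, value)` pairs already ordered from most recent to least recent; the function name `take_recent` and the sample data are hypothetical. The linear scan is a simplification: a real skip list would seek directly to the first in-range node.

```python
from itertools import islice

def take_recent(nodes_desc, start_ts, end_ts, max_nodes):
    """Return at most max_nodes values whose timestamps fall in [start_ts, end_ts].

    nodes_desc: (timestamp, value) pairs ordered most recent first.
    """
    in_range = (value for ts, value in nodes_desc if start_ts <= ts <= end_ts)
    return list(islice(in_range, max_nodes))

# Hypothetical second-level contents, most recent first.
nodes = [(50, "e"), (40, "d"), (30, "c"), (20, "b"), (10, "a")]
result = take_recent(nodes, 15, 45, 2)
```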
  • The method for maintaining data in memory further includes: setting a node number threshold corresponding to the second-level jump table; traversing the first-level jump table and the second-level jump table at a predetermined period; and, when the number of nodes in a second-level jump table exceeds the node number threshold, deleting, according to the order of the nodes in the second-level jump table, all nodes ranked after the node corresponding to the node number threshold.
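One pass of such a periodic cleanup could be sketched as below. The sketch assumes a dict stands in for the first-level jump table and each second-level jump table is a Python list ordered most recent first; `trim_to_threshold` and `sweep` are hypothetical names, and scheduling the pass at a fixed period is left to the caller.

```python
def trim_to_threshold(nodes_desc, max_nodes):
    """Delete, in place, every node ranked after the max_nodes-th node."""
    removed = len(nodes_desc) - max_nodes
    if removed <= 0:
        return 0
    del nodes_desc[max_nodes:]
    return removed

def sweep(first_level, max_nodes):
    """Visit every second-level list once and return how many nodes were deleted."""
    return sum(trim_to_threshold(second, max_nodes) for second in first_level.values())
```

Because the nodes are ordered most recent first, truncating the tail discards the oldest entries and keeps the newest ones.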
  • The method for maintaining data in memory further includes: setting an expiration period length; traversing the first-level jump table and the second-level jump table at a predetermined period; locating the node whose timestamp value reaches the expiration period length; and deleting the nodes after that node as a whole.
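The expiration step can be illustrated with the sketch below. It assumes each second-level jump table is a list of `(timestamp, value)` pairs ordered most recent first, and it interprets "reaches the expiration period length" as `now - timestamp >= ttl`. Dropping the located node together with the nodes after it is an interpretation of the text, not something the source states explicitly. The name `expire` is hypothetical.

```python
def expire(nodes_desc, now, ttl):
    """Remove the expired tail of a most-recent-first list in place.

    Returns the number of nodes removed.
    """
    cutoff = now - ttl
    # Binary search for the first node whose timestamp is at or before the cutoff.
    lo, hi = 0, len(nodes_desc)
    while lo < hi:
        mid = (lo + hi) // 2
        if nodes_desc[mid][0] <= cutoff:
            hi = mid
        else:
            lo = mid + 1
    removed = len(nodes_desc) - lo
    del nodes_desc[lo:]  # one truncation drops the whole expired tail
    return removed
```

Since the list is ordered by time, every node after the first expired one is also expired, so a single truncation is enough and no per-node check is needed.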
  • A system for managing memory data includes: a data table setting unit for setting a data table including a plurality of shards, wherein each shard corresponds to a first-level jump table; a first-level jump table setting unit for setting the first-level jump table to store nodes in which the first attribute value of the data is the key and a pointer or object indicating a second-level jump table is the value corresponding to the key; and a second-level jump table setting unit for setting the second-level jump table to store nodes in which a second attribute value of the data is the key and the value corresponding to the key includes at least one attribute value of the data.
  • A system for maintaining data in memory includes: a shard determining unit configured to determine, according to a first attribute value of data to be inserted, the shard corresponding to the data to be inserted in a data table including a plurality of shards, wherein each shard corresponds to a first-level jump table, and the first-level jump table stores nodes in which the first attribute value of the data is the key and a pointer or object indicating a second-level jump table is the value corresponding to the key.
  • The node added to the second-level jump table is a node in which the second attribute value of the data is the key and the value corresponding to the key includes at least one attribute value of the data to be inserted.
  • The shard determining unit calculates a hash value corresponding to the first attribute value of the data to be inserted, obtains the remainder of the calculated hash value divided by the total number of shards in the data table, and determines the shard corresponding to the obtained remainder as the shard corresponding to the data to be inserted.
  • a pointer or an object indicating a corresponding first-level jump table is stored in each of the plurality of fragments.
  • The data adding unit creates a second-level jump table, creates in the first-level jump table a node with the first attribute value of the data to be inserted as the key and a pointer or object indicating the created second-level jump table as the value corresponding to the key, and adds to the created second-level jump table a node in which the second attribute value of the data to be inserted is the key and the value corresponding to the key includes the at least one attribute value of the data to be inserted.
  • The at least one attribute value of the data to be inserted includes the first attribute value and/or the second attribute value of the data to be inserted, or includes neither the first attribute value nor the second attribute value of the data to be inserted.
  • In the node added to the second-level jump table, the value corresponding to the second attribute value of the data to be inserted includes a character string obtained in one of the following ways: merging the at least one attribute value according to a predetermined string merging rule, serializing the at least one attribute value according to a predetermined JSON format, serializing the at least one attribute value according to a predetermined ProtocolBuffer format, or serializing the at least one attribute value according to a predefined Schema format.
  • The system for maintaining data in memory further includes an input receiving unit and a data obtaining unit, wherein the input receiving unit receives the first attribute value of the data to be queried and the value range for the second attribute value, the shard determining unit determines, according to the first attribute value of the data to be queried, the shard corresponding to the data to be queried in the data table, and the data obtaining unit finds, in the first-level jump table corresponding to the determined shard, the node with the first attribute value of the data to be queried as the key, and extracts, from the second-level jump table indicated by the pointer or object in the found node, the at least one attribute value of each node whose key is within the value range.
  • The data obtaining unit takes out, from the nodes of the second-level jump table indicated by the pointer or object in the found node, the values corresponding to keys within the value range, and obtains the at least one attribute value of the data to be queried in one of the following ways: splitting the retrieved value according to a predetermined string splitting rule, deserializing the retrieved value according to a predetermined JSON format, deserializing the retrieved value according to a predetermined ProtocolBuffer format, or deserializing the retrieved value according to a predefined Schema format.
  • The shard determining unit calculates a hash value corresponding to the first attribute value of the data to be queried, obtains the remainder of the calculated hash value divided by the total number of shards in the data table, and determines the shard corresponding to the obtained remainder as the shard corresponding to the data to be queried.
  • The data to be inserted or the data to be queried is time-series data, the second attribute value is a timestamp value, and the value range specifies either a start value and an end value of the timestamp value, or an end value of the timestamp value alone.
  • The data adding unit adds nodes according to the time indicated by the timestamp value, so that the nodes in the second-level jump table are arranged in order of time from most recent to least recent.
  • The system for maintaining data in memory further includes a node number threshold setting unit for setting a node number threshold corresponding to the second-level jump table, wherein the data obtaining unit takes out, from the second-level jump table indicated by the pointer or object in the found node and in order from most recent to least recent, the at least one attribute value of nodes whose keys are within the value range and whose number does not exceed the node number threshold.
  • The system for maintaining data in memory further includes a node number threshold setting unit and a node deleting unit, wherein the node number threshold setting unit sets a node number threshold corresponding to the second-level jump table, the search unit traverses the first-level jump table and the second-level jump table at a predetermined period, and, when the number of nodes in a second-level jump table exceeds the node number threshold, the node deleting unit deletes, according to the order of the nodes in the second-level jump table, all nodes ranked after the node corresponding to the node number threshold.
  • The system for maintaining data in memory further includes an expiration period length setting unit and a data deletion unit, wherein the expiration period length setting unit sets the expiration period length, the search unit traverses the first-level jump table and the second-level jump table, and the data deletion unit deletes the nodes after the node whose timestamp value reaches the expiration period length.
  • A computer-readable medium having recorded thereon a computer program for performing the method of managing memory data as described above.
  • A computing device including a storage component and a processor, wherein the storage component stores a computer-executable instruction set, and when the computer-executable instruction set is executed by the processor, the method of managing memory data as described above is performed.
  • A computer-readable medium having recorded thereon a computer program for performing the method of maintaining data in memory as described above.
  • A computing device including a storage component and a processor, wherein the storage component stores a computer-executable instruction set, and when the computer-executable instruction set is executed by the processor, the method of maintaining data in memory as described above is performed.
  • In the methods and systems according to exemplary embodiments of the present disclosure, a first-level jump table and a second-level jump table may be set. The first-level jump table stores nodes in which the first attribute value of the data is the key and a pointer or object indicating a second-level jump table is the value corresponding to the key, and the second-level jump table stores nodes in which the second attribute value of the data is the key. The key corresponding to the first attribute value and the key corresponding to the second attribute value can therefore be used to locate nodes in the second-level jump table quickly, which improves the read/write speed of the data. In addition, multiple shards and a first-level jump table corresponding to each shard may be set, so that processing the shards in parallel increases the number of data writing tasks and/or data query tasks that can be performed simultaneously per unit time.
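One way the per-shard parallelism could be realized is sketched below. The source states only that the shards can be processed in parallel; the per-shard lock, the thread pool, `zlib.crc32` as the hash function, nested dicts in place of jump tables, and the names `NUM_SHARDS`, `put`, and `write_all` are all assumptions made for illustration.

```python
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 8  # hypothetical shard count
shards = [dict() for _ in range(NUM_SHARDS)]            # first-level stand-ins
locks = [threading.Lock() for _ in range(NUM_SHARDS)]   # one lock per shard (assumption)

def put(key, ts, value):
    idx = zlib.crc32(key.encode("utf-8")) % NUM_SHARDS
    with locks[idx]:  # only writers that hit the same shard contend
        shards[idx].setdefault(key, {})[ts] = value

def write_all(records, workers=4):
    # Submit every record to a pool; writes to different shards do not block each other.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda record: put(*record), records))
```

With a single table-wide lock every writer would serialize; splitting the table into shards narrows contention to writers whose keys hash to the same shard.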
  • FIG. 1 illustrates a schematic diagram of a data table according to an exemplary embodiment of the present disclosure
  • FIG. 2 illustrates a flowchart of a method for managing memory data according to an exemplary embodiment of the present disclosure
  • FIG. 3 illustrates a flowchart of an operation of inserting data in a memory according to an exemplary embodiment of the present disclosure
  • FIG. 4 illustrates a flowchart of an operation of querying data in a memory according to an exemplary embodiment of the present disclosure
  • FIG. 5 illustrates a block diagram of a system for managing memory data according to an exemplary embodiment of the present disclosure
  • FIG. 6 illustrates a block diagram of a system for maintaining data in a memory according to an exemplary embodiment of the present disclosure.
  • The first-level jump table and the second-level jump table involved in the exemplary embodiments of the present disclosure are skip lists (skiplists), which are also referred to as skip tables.
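For readers unfamiliar with the structure, a generic textbook skip list is sketched below. It is not taken from the disclosure: the class names, the maximum level of 16, and the promotion probability of 0.5 are conventional choices, and keys are kept in ascending order (a most-recent-first ordering could be obtained by negating timestamp keys).

```python
import random

class Node:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level  # forward[i] is the next node on level i

class SkipList:
    MAX_LEVEL = 16

    def __init__(self, seed=None):
        self.head = Node(None, None, self.MAX_LEVEL)  # sentinel, holds no data
        self.level = 1
        self.rng = random.Random(seed)

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self.rng.random() < 0.5:
            level += 1
        return level

    def _predecessors(self, key):
        # For each level, the last node whose key is strictly less than `key`.
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key, value):
        update = self._predecessors(key)
        nxt = update[0].forward[0]
        if nxt is not None and nxt.key == key:
            nxt.value = value  # key already present: overwrite
            return
        level = self._random_level()
        self.level = max(self.level, level)
        node = Node(key, value, level)
        for i in range(level):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def get(self, key, default=None):
        node = self._predecessors(key)[0].forward[0]
        if node is not None and node.key == key:
            return node.value
        return default

    def items(self):
        node = self.head.forward[0]
        while node is not None:
            yield node.key, node.value
            node = node.forward[0]
```

Search and insertion take expected logarithmic time, and the bottom level is an ordered linked list, which is what makes range scans over keys such as timestamps straightforward.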
  • FIG. 1 illustrates a data table according to an exemplary embodiment of the present disclosure.
  • A data table according to an exemplary embodiment of the present disclosure includes shard 0 to shard n, where n is a natural number greater than 1.
  • Each of these shards corresponds to a first-level jump table.
  • FIG. 1 shows the first-level jump table corresponding to shard 0.
  • The first-level jump table corresponding to shard 0 includes nodes 11 to 1m, where m is a natural number.
  • A pointer or an object indicating the corresponding first-level jump table may be stored in each shard, so that the first-level jump table corresponding to the shard can be located.
  • Each node in the first-level jump table can correspond to a second-level jump table.
  • The second-level jump table corresponding to node 11 includes nodes 41 to 4k, the second-level jump table corresponding to node 12 includes nodes 31 to 3j, and the second-level jump table corresponding to node 1m includes nodes 21 to 2i, where i, j, and k are natural numbers.
  • A pointer or an object indicating the corresponding second-level jump table may be stored in each node of the first-level jump table, so that the second-level jump table corresponding to that node can be located.
  • Key-value pairs can be set in the nodes of the jump tables.
  • In a node of the first-level jump table, the first attribute value of the data may be set as the key, and a pointer or object indicating the second-level jump table may be set as the value corresponding to the key.
  • In a node of the second-level jump table, the second attribute value of the data may be set as the key, and the value corresponding to the key may be used to store at least one attribute value of the data.
  • The at least one attribute value includes the first attribute value and/or the second attribute value of the data, or includes neither the first attribute value nor the second attribute value of the data.
  • A pointer or object indicating another node in the first-level jump table may be stored in a node of the first-level jump table.
  • A pointer or object indicating another node in the second-level jump table may also be stored in a node of the second-level jump table.
  • Each node except the tail node stores an object or pointer indicating a node that belongs to the same jump table, so that the jump table forms a chain structure.
  • When a third node is inserted between a first node and a second node, the pointer or object in the first node that indicates the second node needs to be changed to indicate the third node, and a pointer or object indicating the second node is stored in the third node.
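The pointer update for inserting a node into the chain can be shown on a single-level linked list. The class and function names below are hypothetical, and the sketch ignores the additional levels a full skip list maintains.

```python
class Node:
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node  # reference to the following node in the same chain

def insert_after(first, third):
    """Insert `third` between `first` and the node `first` currently points to."""
    second = first.next
    third.next = second   # the new node now indicates the old successor
    first.next = third    # the predecessor is redirected to the new node

second = Node("second")
first = Node("first", second)
third = Node("third")
insert_after(first, third)
```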
  • FIG. 2 illustrates a flowchart of a method of managing memory data according to an exemplary embodiment of the present disclosure.
  • the method for managing memory data is executed by at least one computing device.
  • a method of managing memory data according to an exemplary embodiment of the present disclosure includes steps S101 to S103.
  • In step S101, a data table including a plurality of shards is set, where each shard corresponds to a first-level jump table.
  • The plurality of shards can be set in memory in the same way as an array is set in memory, and a corresponding first-level jump table can then be set for each shard. The first-level jump table set initially can be an empty jump table, in which no node exists.
  • A pointer or an object indicating the corresponding first-level jump table may be stored in each of the plurality of shards, so that the corresponding first-level jump table can be located through the pointer or the object.
  • In step S102, the first-level jump table is set to store nodes in which the first attribute value of the data is the key and a pointer or object indicating the second-level jump table is the value corresponding to the key.
  • A key-value pair is stored in each node of the first-level jump table, where the first attribute value of the data is used as the key, and a pointer or an object is used as the value corresponding to the key.
  • In step S103, the second-level jump table is set to store nodes in which the second attribute value of the data is the key and the value corresponding to the key includes at least one attribute value of the data.
  • Each node of the second-level jump table also stores a key-value pair, where the second attribute value is used as the key, and the value corresponding to the key includes at least one attribute value of the data.
  • For example, the first attribute value may be a card number, the second attribute value may be a timestamp value, and the value corresponding to the timestamp value may include a transaction amount value, a transaction place, or a point of sale (POS) number.
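The three-level layout set up by these steps can be pictured with the sketch below. Dicts stand in for the jump tables, and the class names, the shard count, the timestamp `1530000000`, and the stored string are hypothetical illustrations rather than values from the disclosure. Only the card number `6222XXXX01` comes from the source text.

```python
from dataclasses import dataclass, field

@dataclass
class SecondLevel:
    nodes: dict = field(default_factory=dict)  # timestamp value -> encoded attribute values

@dataclass
class FirstLevel:
    nodes: dict = field(default_factory=dict)  # first attribute value -> SecondLevel

@dataclass
class DataTable:
    shards: list  # each entry refers to that shard's first-level structure

def make_table(num_shards):
    # Shards are laid out like an array; every first-level structure starts empty.
    return DataTable(shards=[FirstLevel() for _ in range(num_shards)])

table = make_table(4)
second = SecondLevel()
table.shards[0].nodes["6222XXXX01"] = second   # card number as the first-level key
second.nodes[1530000000] = "amount=100.00"     # hypothetical timestamp and value
```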
  • Table 1 below shows data according to an exemplary embodiment of the present disclosure:
  • each piece of data may include the following attribute values: card number, timestamp value, transaction amount value, transaction place, and POS number.
  • Table 1 includes 3 pieces of data.
  • The at least one attribute value of the data includes the first attribute value and/or the second attribute value of the data, or includes neither the first attribute value nor the second attribute value of the data.
  • In the node added to the second-level jump table, the value corresponding to the second attribute value of the data includes a character string obtained in one of the following ways: merging the at least one attribute value according to a predetermined string merging rule, serializing the at least one attribute value according to a predetermined JSON (JavaScript Object Notation) format, serializing the at least one attribute value according to a predetermined ProtocolBuffer format, or serializing the at least one attribute value according to a predefined Schema format.
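Two of the listed encodings can be illustrated with the standard library alone. The separator `|`, the function names, and the sample field values are hypothetical; the ProtocolBuffer and Schema variants are omitted because they depend on message or schema definitions the text does not give.

```python
import json

SEP = "|"  # hypothetical separator used by the string merging rule

def merge_fields(values):
    # String merging rule: join the attribute values with a fixed separator.
    return SEP.join(str(v) for v in values)

def split_fields(merged):
    # Matching string splitting rule used when the value is read back.
    return merged.split(SEP)

def to_json(record):
    return json.dumps(record, sort_keys=True)

def from_json(text):
    return json.loads(text)
```

The merge rule only round-trips when the separator never appears inside an attribute value, and it returns every field as a string. JSON avoids both limitations at the cost of a longer encoding.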
  • The shards, the first-level jump table, and the second-level jump table described above can be set in memory.
  • The correspondence between the shards, the first-level jump table, and the second-level jump table can be set, and the data in memory can be managed and maintained based on the set correspondence. The operation of maintaining data in memory is described in the following exemplary embodiments.
  • FIG. 3 illustrates a flowchart of an operation of inserting data in a memory according to an exemplary embodiment of the present disclosure.
  • an operation of maintaining data in a memory performed by at least one computing device according to an exemplary embodiment of the present disclosure includes steps S201 to S203.
  • Insertion can be performed through an interface of the form put(table_name, key, ts, value), where table_name specifies the name of the data table into which the data is to be inserted, key specifies the first attribute value of the data to be inserted, ts specifies the second attribute value of the data to be inserted, and value specifies at least one attribute value of the data to be inserted (for example, the value may be obtained by encoding, e.g. merging or serializing, the at least one attribute value according to a specific rule).
  • Insertion can also be performed through an interface of the form put(table_name, key, ts, field_1, field_1 type, field_2, field_2 type, ..., field_n, field_n type), where table_name specifies the data table into which the data is to be inserted, key specifies the first attribute value of the data to be inserted, ts specifies the second attribute value of the data to be inserted, and n is a natural number.
  • In step S201, the shard corresponding to the data to be inserted in a data table including a plurality of shards is determined according to the first attribute value of the data to be inserted, where each shard corresponds to a first-level jump table, and the first-level jump table stores nodes in which the first attribute value of the data is the key and a pointer or object indicating the second-level jump table is the value corresponding to the key.
  • data to be inserted may be received, for example, bank transaction data as shown in Table 1.
  • Bank transaction data may include the following attribute values: card number, timestamp value, transaction amount value, transaction place, and POS number. The data table into which the data is to be inserted can be selected from the in-memory database. Multiple shards are stored in the selected data table.
  • The shard corresponding to the data to be inserted may be selected as follows: calculating a hash value corresponding to the first attribute value of the data to be inserted; obtaining the remainder of the calculated hash value divided by the total number of shards in the data table; and determining the shard corresponding to the obtained remainder as the shard corresponding to the data to be inserted.
  • a hash function may be used to calculate the first attribute value to obtain a hash value.
  • the hash function used may be the Murmurhash hash function proposed by Austin Appleby.
  • the present disclosure does not limit the hash function used, and other hash functions can also be used for the calculation of the hash value.
  • The data table may include shard 0 to shard n. If the remainder is 0, shard 0 corresponds to the data to be inserted; if the remainder is h (0 < h ≤ n), shard h corresponds to the data to be inserted.
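The shard selection rule can be written in a few lines. The source names MurmurHash as one possible hash function but does not fix the choice, so this sketch uses `zlib.crc32` from the Python standard library as a stand-in; `shard_index` and the shard count of 8 are hypothetical.

```python
import zlib

def shard_index(first_attr, num_shards):
    """Map a first attribute value to a shard number in [0, num_shards)."""
    h = zlib.crc32(first_attr.encode("utf-8"))  # stand-in for MurmurHash
    return h % num_shards                        # remainder selects the shard

idx = shard_index("6222XXXX01", 8)
```

Because the same key always hashes to the same remainder, all records sharing a first attribute value land in the same shard, which is what lets a later query go straight to one first-level jump table. With shards numbered 0 to n, `num_shards` is n + 1.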
  • A shard can be associated with a first-level jump table by a pointer or an object. That is, a pointer or an object indicating the corresponding first-level jump table is stored in each of the shards of the data table.
  • The object is similar to an object in object-oriented (OO) programming.
  • The pointer or object stored in a shard can be used to locate the first-level jump table corresponding to that shard.
  • In step S202, a node with the first attribute value of the data to be inserted as the key is searched for in the first-level jump table corresponding to the determined shard.
  • In step S203, in the case where a node with the first attribute value of the data to be inserted as the key is found in the first-level jump table, a node with the second attribute value of the data to be inserted as the key, and with a value corresponding to the key that includes at least one attribute value of the data to be inserted, is added to the second-level jump table indicated by the pointer or object in the found node.
  • the data to be inserted is time-series data
  • the second attribute value is a time stamp value
  • the search result of step S202 is empty.
  • a second-level jump table can be created.
  • a node is created in the first-level jump table with the first attribute value of the data to be inserted as the key and a pointer or object indicating the created second-level jump table as the value corresponding to the key.
  • a node with the second attribute value of the data to be inserted as a key and the value corresponding to the key including at least one attribute value of the data to be inserted may be added to the created second-level jump table.
  • Taking the first piece of data in Table 1 as the data to be inserted, in step S201 it is assumed that the corresponding shard is determined to be shard 0 according to the card number value "6222XXXX01".
  • in step S202, referring to FIG. 1, a node with a key of "6222XXXX01" is searched for in the first-level jump table corresponding to shard 0. For example, node 11 is found. Based on this, it can be determined that the second-level jump table corresponding to node 11 includes node 41 to node 4k.
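The insert flow of steps S201 to S203, including the case where the first-level lookup comes back empty, can be sketched as follows. This is a simplified illustration under stated assumptions: a plain `dict` stands in for each first-level table and a sorted `list` for each second-level table, `NUM_SHARDS`, `shards` and `put` are hypothetical names, `zlib.crc32` replaces the unspecified hash function, and keys are kept in ascending order purely for brevity.

```python
import bisect
import zlib

NUM_SHARDS = 4  # hypothetical shard count

# One first-level container per shard: first attribute -> second-level list.
shards = [dict() for _ in range(NUM_SHARDS)]


def put(first_attr, second_attr, value):
    """Insert one record into the two-level structure."""
    # S201: choose the shard from the hash of the first attribute value.
    shard = shards[zlib.crc32(first_attr.encode("utf-8")) % NUM_SHARDS]
    # S202: look for the first-level node keyed by the first attribute value.
    second_level = shard.get(first_attr)
    if second_level is None:
        # Lookup was empty: create the second-level table and the
        # first-level node that points to it.
        second_level = []
        shard[first_attr] = second_level
    # S203: add a node keyed by the second attribute value.
    bisect.insort(second_level, (second_attr, value))
```

For instance, two calls to `put` with the same card number and different timestamps end up as two nodes in a single second-level list, while a new card number triggers the creation branch.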
  • the at least one attribute value of the data to be inserted includes the first attribute value and/or the second attribute value of the data to be inserted, or includes neither the first attribute value nor the second attribute value of the data to be inserted. The value corresponding to the second attribute value of the data to be inserted in the node added to the second-level jump table includes a character string obtained in one of the following ways: merging the at least one attribute value according to a predetermined string merging rule, serializing the at least one attribute value according to a predetermined JSON format, serializing the at least one attribute value according to a predetermined ProtocolBuffer format, or serializing the at least one attribute value according to a predefined Schema format.
  • the predetermined character string merging rule includes merging according to a specific symbol (for example, "
  • obtaining the character string that serves as the value corresponding to the key in a node of the second-level jump table by using the JSON format, the ProtocolBuffer format, or the Schema format is also possible.
  • the above description is only an example and should not be considered limiting.
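Two of the value encodings mentioned above, delimiter-based merging and JSON serialization, can be illustrated as follows. The helper names are hypothetical, and the `"|"` separator is an assumption, since the extracted text does not preserve the actual symbol. ProtocolBuffer and Schema-based encodings are omitted because they need definitions the text does not give.

```python
import json

DELIMITER = "|"  # assumed separator; the source text does not show the symbol


def merge_values(values):
    """Merge attribute values into one string using the delimiter."""
    return DELIMITER.join(str(v) for v in values)


def split_values(merged):
    """Inverse of merge_values; every field comes back as a string."""
    return merged.split(DELIMITER)


def to_json(record):
    """Serialize a dict of attribute values to a JSON string."""
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


def from_json(text):
    """Inverse of to_json."""
    return json.loads(text)
```

Delimiter merging is compact but loses type information and breaks if a field contains the delimiter itself; JSON avoids both issues at the cost of a longer string.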
  • the step of adding a node in the second-level jump table includes: adding the node according to the time indicated by the timestamp value, so that the nodes in the second-level jump table are arranged from the most recent time to the earliest. How recent a time is can be determined by comparing timestamp values: the time corresponding to a larger timestamp value is more recent than the time corresponding to a smaller one. Therefore, in the second-level jump table, a node with a larger timestamp value is placed before a node with a smaller timestamp value.
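The newest-first ordering can be sketched with a sorted list standing in for the second-level table. The function names are hypothetical, and timestamps are assumed to be integers so that they can be negated.

```python
import bisect


def add_newest_first(nodes, timestamp, value):
    """Insert a node so that larger (more recent) timestamps come first.

    Keys are stored negated, so bisect's ascending order yields a
    newest-first sequence.
    """
    bisect.insort(nodes, (-timestamp, value))


def timestamps(nodes):
    """Return the stored timestamps in list order (newest first)."""
    return [-negated for negated, _ in nodes]
```

With this layout, a query for the latest few records only has to read the front of the list, and old records cluster at the tail where they are cheap to cut off.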
  • the shard corresponding to the third piece of data is determined to be shard 1 according to the card number value "6222XXXX02", and the first-level jump table corresponding to shard 1 is located through the pointer or object stored in shard 1 (the first-level jump table corresponding to shard 1 is not shown in FIG. 1).
  • a second-level jump table will be created, and a node with "6222XXXX02" as the key and a pointer or object indicating the created second-level jump table as the value corresponding to the key will be created in the first-level jump table.
  • 30xxx" as a value corresponding to the keyword is added to the one-level jump table that is created.
  • FIG. 4 illustrates a flowchart of an operation of querying data in a memory according to an exemplary embodiment of the present disclosure. As shown in FIG. 4, an operation of querying data in a memory according to an exemplary embodiment of the present disclosure includes steps S301 to S304.
  • the query can be performed through an interface of the form scan(table_name, key, start_time, end_time), where table_name specifies the name of the data table from which data is queried, key specifies the first attribute value of the data to be queried, and start_time and end_time specify the value range of the data to be queried, for example, a start time and an end time.
  • the query can be performed through an interface of the form get(table_name, key, ts), where table_name specifies the name of the data table from which data is queried, key specifies the first attribute value of the data to be queried, and ts restricts the value range of the data to be queried.
  • for example, ts may specify the timestamp value of the data to be queried, in which case the query is expected to return the data whose timestamp value is ts; as another example, ts may specify the end time of the data to be queried, in which case the query is expected to return the data from the time of the query back to the specified ts.
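The two interface forms can be sketched over a toy in-memory table. Only the signatures `scan(table_name, key, start_time, end_time)` and `get(table_name, key, ts)` come from the text; the table contents, the table name `transactions`, and the values are hypothetical. The sketch also assumes, as the example values in the text suggest, that `start_time` is the more recent bound and `end_time` the older one, and it implements only the exact-timestamp reading of `ts`. Sharding is left out to keep the example short.

```python
# Toy data: table name -> first attribute -> nodes ordered newest first.
tables = {
    "transactions": {
        "6222XXXX01": [
            (2018052815520505, "v3"),
            (2018052814520505, "v2"),
            (2018052813520505, "v1"),
        ]
    }
}


def scan(table_name, key, start_time, end_time):
    """Return values whose timestamp lies between end_time and start_time."""
    nodes = tables.get(table_name, {}).get(key, [])
    return [value for ts, value in nodes if end_time <= ts <= start_time]


def get(table_name, key, ts):
    """Return the value stored under exactly timestamp ts, or None."""
    for node_ts, value in tables.get(table_name, {}).get(key, []):
        if node_ts == ts:
            return value
    return None
```

A linear pass is used here for clarity; in a sorted structure the start of the range would normally be located by search and the scan would stop at the first node older than `end_time`.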
  • step S301 a first attribute value of data to be queried and a value range regarding a second attribute value are received.
  • the data to be queried is time-series data
  • the second attribute value is a time stamp value
  • the value range specifies a start value and an end value of the timestamp value, or specifies an end value of the timestamp value.
  • step S302 a slice corresponding to the data to be queried in the data table is determined according to a first attribute value of the data to be queried.
  • the shard corresponding to the data to be queried may be determined by: calculating a hash value corresponding to the first attribute value of the data to be queried; obtaining the remainder of dividing the calculated hash value by the total number of shards in the data table; determining the shard corresponding to the obtained remainder as the shard corresponding to the data to be queried.
  • a hash function may be used to calculate the first attribute value to obtain a hash value.
  • the hash function used may be the Murmurhash hash function proposed by Austin Appleby.
  • the present disclosure does not limit the hash function used, and other hash functions can also be used for the calculation of the hash value.
  • in step S303, a node whose key is the first attribute value of the data to be queried is searched for in the first-level jump table corresponding to the determined shard.
  • in step S304, the at least one attribute value is retrieved from each node whose key is within the value range in the second-level jump table indicated by the pointer or object in the found node.
  • nodes whose keys lie between the start value "2018052815520505" and the end value "2018052814520505" are found, so that the values corresponding to those keys can be retrieved from the found nodes.
  • if the value range only specifies the end value of the timestamp value (for example, "2018052814520505"), data corresponding to the nodes whose timestamp value is greater than or equal to "2018052814520505" can be queried.
  • the step of retrieving the at least one attribute value from the nodes whose keys are within the value range in the second-level jump table indicated by the pointer or object in the found node includes: retrieving, from the nodes of the second-level jump table indicated by the pointer or object in the found node, the values corresponding to the keys within the value range; and obtaining the at least one attribute value of the data to be queried in one of the following ways: splitting the retrieved value according to a predetermined string splitting rule (corresponding to the above string merging rule), deserializing the retrieved value according to a predetermined JSON format, deserializing the retrieved value according to a predetermined ProtocolBuffer format, or deserializing the retrieved value according to a predefined Schema format.
  • the values corresponding to the keywords are “100
  • the first split string is the transaction amount value "100"
  • the second split string is the transaction location "Beijing Shangdi xx Road”
  • the third split string is the POS number "10xxx”.
  • the transaction amount value "50”, the transaction place “Beijing Xierqi xx shop”, and the POS number "20xxx” can be obtained from “50
  • a threshold for the number of nodes corresponding to the second-level jump table may be set.
  • the step of retrieving the at least one attribute value from the nodes whose keys are within the value range in the second-level jump table indicated by the pointer or object in the found node includes: retrieving, from that second-level jump table and in order from the most recent to the earliest, the at least one attribute value of the nodes whose keys are within the value range, where the number of such nodes does not exceed the node number threshold.
  • periodic deletion can also be performed according to the set node number threshold: the first-level jump table and the second-level jump table are traversed at a predetermined period, and when the number of nodes in a traversed second-level jump table exceeds the node number threshold, all nodes ranked after the node corresponding to the node number threshold are deleted according to the order of the nodes in that second-level jump table. For example, when the node number threshold is 10, the node corresponding to the node number threshold is the 10th node in the order of the nodes in the second-level jump table.
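The threshold-based trimming can be sketched as a single pass over the second-level tables. Here each table is modelled as a Python list already ordered newest first, and the name `trim_to_threshold` is hypothetical; scheduling the pass at a fixed period is left to the caller.

```python
def trim_to_threshold(second_level_tables, max_nodes):
    """Delete every node ranked after position max_nodes in each table.

    second_level_tables is an iterable of lists ordered newest first.
    Returns the total number of nodes removed.
    """
    removed = 0
    for nodes in second_level_tables:
        if len(nodes) > max_nodes:
            removed += len(nodes) - max_nodes
            # Everything after the max_nodes-th node is dropped in one cut.
            del nodes[max_nodes:]
    return removed
```

Because the nodes are ordered newest first, the cut always discards the oldest records, which is why the tail can be removed as one contiguous block.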
  • the following expired-data deletion operation can be performed: setting the expiration period length; traversing the first-level jump table and the second-level jump table at a predetermined period (for example, 3 months); and locating the node whose timestamp value reaches the expiration period length, so as to delete all nodes after that node as a whole.
  • a node whose time stamp value in the second-level skip table is less than the time stamp value corresponding to the set expiration period length can be deleted.
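Deleting expired nodes as a block can be sketched as follows, again with a newest-first list standing in for a second-level table. The name `delete_expired` and the idea of passing a precomputed `cutoff_timestamp` (the timestamp that corresponds to the expiration period length) are assumptions made for the illustration.

```python
import bisect


def delete_expired(nodes, cutoff_timestamp):
    """Remove all nodes whose timestamp is smaller than cutoff_timestamp.

    nodes is a list of (timestamp, value) pairs ordered newest first.
    Returns the number of nodes removed.
    """
    # Negated timestamps are ascending, so bisect can locate the boundary.
    negated = [-ts for ts, _ in nodes]
    first_expired = bisect.bisect_right(negated, -cutoff_timestamp)
    removed = len(nodes) - first_expired
    # All expired nodes sit at the tail and are deleted in one cut.
    del nodes[first_expired:]
    return removed
```

The point of the ordering is visible here: expiry needs only one boundary search per table, after which the whole tail is released together rather than node by node.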
  • FIG. 5 illustrates a block diagram of a system for managing memory data according to an exemplary embodiment of the present disclosure.
  • a system 400 for managing memory data according to an exemplary embodiment of the present disclosure includes a data table setting unit 401, a first-level jump-table setting unit 402, and a second-level jump-table setting unit 403.
  • the data table setting unit 401 is configured to set a data table including a plurality of fragments, where each fragment corresponds to a first-level skip table, respectively.
  • the first-level jump table setting unit 402 is configured to set the first-level jump table to store nodes that have the first attribute value of the data as the key and a pointer or object indicating the second-level jump table as the value corresponding to the key.
  • the second-level jump table setting unit 403 is configured to set the second-level jump table to store nodes that have the second attribute value of the data as the key, where the value corresponding to the key includes at least one attribute value of the data.
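The "jump table" of this translation appears to be what is usually called a skip list. The following minimal skip list is a sketch of how such a structure could back both levels; it is not taken from the disclosure. The class names, `MAX_LEVEL`, the promotion probability of 0.5, and ascending key order are all assumptions.

```python
import random


class SkipListNode:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level  # forward[i] is the next node on level i


class SkipList:
    MAX_LEVEL = 8  # assumed cap on tower height

    def __init__(self, seed=None):
        self.head = SkipListNode(None, None, self.MAX_LEVEL)
        self.level = 1
        self.rng = random.Random(seed)

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self.rng.random() < 0.5:
            level += 1
        return level

    def search(self, key):
        """Return the value stored under key, or None if absent."""
        node = self.head
        for i in reversed(range(self.level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node.value if node is not None and node.key == key else None

    def insert(self, key, value):
        """Insert key, or overwrite its value if it already exists."""
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in reversed(range(self.level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node  # last node before the insert position on level i
        candidate = node.forward[0]
        if candidate is not None and candidate.key == key:
            candidate.value = value
            return
        new_level = self._random_level()
        self.level = max(self.level, new_level)
        new_node = SkipListNode(key, value, new_level)
        for i in range(new_level):
            new_node.forward[i] = update[i].forward[i]
            update[i].forward[i] = new_node

    def items(self):
        """Yield (key, value) pairs in key order."""
        node = self.head.forward[0]
        while node is not None:
            yield node.key, node.value
            node = node.forward[0]
```

The two-level arrangement then amounts to storing one `SkipList` as the value of a node in another: the outer list is keyed by the first attribute value and the inner list by the second attribute value.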
  • FIG. 6 illustrates a block diagram of a system for maintaining data in a memory according to an exemplary embodiment of the present disclosure.
  • a system 500 for maintaining data in a memory according to an exemplary embodiment of the present disclosure includes a fragment determination unit 501, a lookup unit 502, and a data addition unit 503.
  • the fragment determining unit 501 is configured to determine, according to the first attribute value of the data to be inserted, the fragment corresponding to the data to be inserted in a data table including a plurality of fragments, where each fragment corresponds to a first-level jump table, and the first-level jump table is used to store nodes that have the first attribute value of the data as the key and a pointer or object indicating the second-level jump table as the value corresponding to the key.
  • the searching unit 502 is configured to search a node with a first attribute value of the data to be inserted as a key from the first-level jump table corresponding to the determined segment.
  • the data adding unit 503 is configured to, in the case where a node with the first attribute value of the data to be inserted as its key is found in the first-level jump table, add to the second-level jump table indicated by the pointer or object in the found node a node that has the second attribute value of the data to be inserted as its key, where the value corresponding to the key includes at least one attribute value of the data to be inserted.
  • the fragment determination unit 501 calculates a hash value corresponding to the first attribute value of the data to be inserted, obtains the remainder of dividing the calculated hash value by the total number of fragments in the data table, and determines the fragment corresponding to the obtained remainder as the fragment corresponding to the data to be inserted.
  • a pointer or an object indicating a corresponding first-level jump table is stored in each of the plurality of shards.
  • in the case where no node with the first attribute value of the data to be inserted as its key is found in the first-level jump table, the data adding unit 503 creates a second-level jump table, creates in the first-level jump table a node with the first attribute value of the data to be inserted as the key and a pointer or object indicating the created second-level jump table as the value corresponding to the key, and adds to the created second-level jump table a node that has the second attribute value of the data to be inserted as its key, where the value corresponding to the key includes the at least one attribute value of the data to be inserted.
  • the at least one attribute value of the data to be inserted includes the first attribute value and/or the second attribute value of the data to be inserted, or includes neither the first attribute value nor the second attribute value of the data to be inserted. The value corresponding to the second attribute value of the data to be inserted in the node added to the second-level jump table includes a character string obtained in one of the following ways: merging the at least one attribute value according to a predetermined string merging rule, serializing the at least one attribute value according to a predetermined JSON format, serializing the at least one attribute value according to a predetermined ProtocolBuffer format, or serializing the at least one attribute value according to a predefined Schema format.
  • the system for maintaining data in the memory further includes an input receiving unit (not shown) and a data acquisition unit (not shown), wherein the input receiving unit receives the first attribute value of the data to be queried and a value range regarding the second attribute value; the fragment determination unit 501 determines the fragment corresponding to the data to be queried in the data table according to the first attribute value of the data to be queried; and the data acquisition unit searches the first-level jump table corresponding to the determined fragment for the node with the first attribute value of the data to be queried as its key, and retrieves, from the second-level jump table indicated by the pointer or object in the found node, the at least one attribute value of the nodes whose keys are within the value range.
  • the data acquisition unit retrieves, from the nodes of the second-level jump table indicated by the pointer or object in the found node, the values corresponding to the keys within the value range, and obtains the at least one attribute value of the data to be queried in one of the following ways: splitting the retrieved value according to a predetermined string splitting rule, deserializing the retrieved value according to a predetermined JSON format, deserializing the retrieved value according to a predetermined ProtocolBuffer format, or deserializing the retrieved value according to a predefined Schema format.
  • the fragment determination unit 501 calculates a hash value corresponding to the first attribute value of the data to be queried, obtains the remainder of dividing the calculated hash value by the total number of fragments in the data table, and determines the fragment corresponding to the obtained remainder as the fragment corresponding to the data to be queried.
  • the data to be inserted or data to be queried is time series data
  • the second attribute value is a time stamp value
  • the value range specifies a start value and an end value of the timestamp value, or specifies an end value of the timestamp value.
  • the data adding unit 503 adds nodes according to the time indicated by the timestamp value, so that the nodes in the second-level jump table are arranged in order of time from near to far.
  • the system for maintaining data in memory further includes a node number threshold setting unit (not shown) for setting a node number threshold corresponding to the second-level jump table, where the data acquisition unit retrieves, from the second-level jump table indicated by the pointer or object in the found node and in order from near to far, the at least one attribute value of the nodes whose keys are within the value range, where the number of such nodes does not exceed the node number threshold.
  • the system for maintaining data in memory further includes a node deletion unit, wherein the node number threshold setting unit sets a node number threshold corresponding to the second-level jump table, and the search unit traverses the first-level jump table and the second-level jump table at a predetermined period.
  • when the number of nodes in a traversed second-level jump table exceeds the node number threshold, the node deletion unit deletes, according to the order of the nodes in that second-level jump table, all nodes ranked after the node corresponding to the node number threshold.
  • the system for maintaining data in memory further includes an expiration period length setting unit (not shown) and a data deletion unit (not shown), wherein the expiration period length setting unit sets the expiration period length, the search unit traverses the first-level jump table and the second-level jump table at a predetermined period, and the data deletion unit deletes as a whole the nodes after the node whose timestamp value reaches the expiration period length.
  • the units included in the system according to an exemplary embodiment of the present disclosure may be respectively configured as software, hardware, firmware, or any combination of the above, which perform specific functions.
  • these units may correspond to dedicated integrated circuits, may also correspond to pure software code, and may also correspond to units that combine software and hardware.
  • one or more functions implemented by these units may be uniformly performed by components in a physical entity device (for example, a processor, a client, or a server).
  • a computer storage medium storing instructions may be provided, wherein the instructions, when executed by at least one computing device, cause the at least one computing device to execute a method for managing memory data: setting a data table including a plurality of fragments, where each fragment corresponds to a first-level jump table; setting the first-level jump table to store nodes that have the first attribute value of the data as the key and a pointer or object indicating the second-level jump table as the value corresponding to the key; and setting the second-level jump table to store nodes that have the second attribute value of the data as the key, where the value corresponding to the key includes at least one attribute value of the data.
  • a computer storage medium storing instructions may be provided, wherein the instructions, when executed by at least one computing device, cause the at least one computing device to execute a method for maintaining data in memory: determining, according to the first attribute value of the data to be inserted, the fragment corresponding to the data to be inserted in a data table including a plurality of fragments, where each fragment corresponds to a first-level jump table, and the first-level jump table is used to store nodes that have the first attribute value of the data as the key and a pointer or object indicating the second-level jump table as the value corresponding to the key; searching the first-level jump table corresponding to the determined fragment for a node with the first attribute value of the data to be inserted as its key; and, in the case where such a node is found in the first-level jump table, adding to the second-level jump table indicated by the pointer or object in the found node a node that has the second attribute value of the data to be inserted as its key, where the value corresponding to the key includes at least one attribute value of the data to be inserted.
  • the computer program in the computer-readable medium described above can be run in an environment deployed in a computer device such as a client, host, proxy device, or server. It should be noted that the computer program can also be used to perform additional steps beyond the above steps, or to perform more specific processing when the above steps are performed. The content of these additional steps and further processing has been described with reference to FIGS. 1 to 4 and Table 1 and, to avoid repetition, is not repeated here.
  • the system may rely entirely on the running of the computer program to implement the corresponding functions, that is, each unit corresponds to a step in the functional architecture of the computer program, so that the entire system is invoked through a special software package (for example, a lib library) to implement the corresponding functions.
  • each unit included in the system according to an exemplary embodiment of the present disclosure may also be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof.
  • the program code or code segments for performing the corresponding operations may be stored in a computer-readable medium such as a storage medium, so that the processor can read and run the corresponding program code or code segments to perform the corresponding operations.
  • the exemplary embodiments of the present disclosure may also be implemented as a computing device.
  • the computing device includes a storage component and a processor.
  • the storage component stores a set of computer-executable instructions which, when executed by the processor, perform the method for managing memory data or the method for maintaining data in memory.
  • the computing device may be deployed in a server or a client, or may be deployed on a node device in a distributed network environment.
  • the computing device may be a PC computer, a tablet device, a personal digital assistant, a smart phone, a web application, or other devices capable of executing the above instruction set.
  • the computing device does not have to be a single computing device, and may also be any device or circuit assembly capable of executing the above-mentioned instructions (or instruction sets) individually or jointly.
  • the computing device may also be part of an integrated control system or system manager, or a portable electronic device that may be configured to interface with a local or remote (e.g., via wireless transmission) interface.
  • the processor may include a central processing unit (CPU), a graphics processor (GPU), a programmable logic device, a special-purpose processor system, a microcontroller, or a microprocessor.
  • processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
  • Some operations described in the method according to the exemplary embodiment of the present disclosure may be implemented by software, some operations may be implemented by hardware, and in addition, these operations may also be implemented by a combination of software and hardware.
  • the processor may execute instructions or code stored in one of the storage components, wherein the storage component may also store data. Instructions and data may also be sent and received over a network via a network interface device, which may employ any known transmission protocol.
  • the storage unit may be integrated with the processor, for example, the RAM or the flash memory is arranged in an integrated circuit microprocessor or the like.
  • the storage component may include a stand-alone device, such as an external disk drive, a storage array, or other storage device usable by any database system.
  • the storage component and the processor may be operatively coupled, or may communicate with each other, for example, through an I / O port, a network connection, or the like, so that the processor can read a file stored in the storage component.
  • the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the computing device may be connected to each other via a bus and / or a network.
  • Operations involved in a method of managing memory data and / or a method of maintaining data in memory may be described as various interconnected or coupled function blocks or function diagrams. However, these functional blocks or functional diagrams may be equally integrated into a single logical device or operated on imprecise boundaries.
  • a system including at least one computing device and at least one storage device storing instructions may be provided, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the following steps for managing memory data: setting a data table including a plurality of shards, where each shard corresponds to a first-level jump table; setting the first-level jump table to store nodes that have the first attribute value of the data as the key and a pointer or object indicating the second-level jump table as the value corresponding to the key; and setting the second-level jump table to store nodes that have the second attribute value of the data as the key, where the value corresponding to the key includes at least one attribute value of the data.
  • a system including at least one computing device and at least one storage device storing instructions may be provided, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the following steps of maintaining data in memory: determining, according to the first attribute value of the data to be inserted, the shard corresponding to the data to be inserted in a data table including a plurality of shards, where each shard corresponds to a first-level jump table, and the first-level jump table is used to store nodes that have the first attribute value of the data as the key and a pointer or object indicating the second-level jump table as the value corresponding to the key; searching the first-level jump table corresponding to the determined shard for a node with the first attribute value of the data to be inserted as its key; and, in the case where such a node is found in the first-level jump table, adding to the second-level jump table indicated by the pointer or object in the found node a node that has the second attribute value of the data to be inserted as its key, where the value corresponding to the key includes at least one attribute value of the data to be inserted.


Abstract

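A method and system for managing memory data, a method and system for maintaining data in memory, and a computer-readable medium and computing device corresponding to the provided methods and systems are provided. The method for managing memory data includes: setting a data table including a plurality of shards, where each shard corresponds to a first-level jump table (skip list); setting the first-level jump table to store nodes that have a first attribute value of data as the key and a pointer or object indicating a second-level jump table as the value corresponding to the key; and setting the second-level jump table to store nodes that have a second attribute value of the data as the key, where the value corresponding to the key includes at least one attribute value of the data. According to the present disclosure, the time consumed in reading/writing data can be reduced, and the number of data writing tasks and/or data query tasks executed concurrently per unit time can be increased.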

Description

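Method and system for managing memory data and maintaining data in memory
Technical Field
The present disclosure relates generally to the field of memory data management and maintenance, and more particularly, to a method and system for managing memory data, a method and system for maintaining data in memory, and corresponding computer-readable media and computing devices.
Background
Existing databases include relational databases and non-relational databases. Relational databases such as MySQL and SQL Server are mainly used to manage and maintain relational data. Non-relational databases such as Redis and MongoDB are mainly used to manage and maintain non-relational data. Relational data means data based on the relational model (RM). Non-relational data means data not based on the relational model.
To process time series data, time series databases (TSDB) such as InfluxDB have been proposed. To manage data in memory, in-memory databases such as VoltDB have been proposed.
However, in specific scenarios that require fast data processing and the simultaneous execution of a large number of data writing tasks and/or data query tasks, traditional databases, including those listed above, suffer from long read/write times and from a small number of data writing tasks and/or data query tasks that can be executed concurrently per unit time.
Summary
Exemplary embodiments of the present disclosure provide a method and system for managing memory data, a method and system for maintaining data in memory, and a computer-readable medium and computing device corresponding to the provided methods and systems, so as to solve the problems in the prior art that reading/writing data takes a long time and that the number of data writing tasks and/or data query tasks that can be executed concurrently per unit time is small.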
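According to an exemplary embodiment of the present disclosure, a method for managing memory data is provided. The method for managing memory data includes: setting a data table including a plurality of shards, where each shard corresponds to a first-level jump table; setting the first-level jump table to store nodes that have a first attribute value of data as the key and a pointer or object indicating a second-level jump table as the value corresponding to the key; and setting the second-level jump table to store nodes that have a second attribute value of the data as the key, where the value corresponding to the key includes at least one attribute value of the data.
According to another exemplary embodiment of the present disclosure, a method for maintaining data in memory is provided. The method for maintaining data in memory includes: determining, according to a first attribute value of data to be inserted, the shard corresponding to the data to be inserted in a data table including a plurality of shards, where each shard corresponds to a first-level jump table, and the first-level jump table is used to store nodes that have the first attribute value of data as the key and a pointer or object indicating a second-level jump table as the value corresponding to the key; searching the first-level jump table corresponding to the determined shard for a node with the first attribute value of the data to be inserted as its key; and, in the case where such a node is found in the first-level jump table, adding to the second-level jump table indicated by the pointer or object in the found node a node that has a second attribute value of the data to be inserted as its key, where the value corresponding to the key includes at least one attribute value of the data to be inserted.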
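Optionally, the step of determining the shard corresponding to the data to be inserted includes: calculating a hash value corresponding to the first attribute value of the data to be inserted; obtaining the remainder of dividing the calculated hash value by the total number of shards in the data table; and determining the shard corresponding to the obtained remainder as the shard corresponding to the data to be inserted.
Optionally, each of the plurality of shards stores a pointer or object indicating the corresponding first-level jump table.
Optionally, the method for maintaining data in memory further includes: in the case where no node with the first attribute value of the data to be inserted as its key is found in the first-level jump table, creating a second-level jump table, creating in the first-level jump table a node with the first attribute value of the data to be inserted as the key and a pointer or object indicating the created second-level jump table as the value corresponding to the key, and adding to the created second-level jump table a node that has the second attribute value of the data to be inserted as its key, where the value corresponding to the key includes the at least one attribute value of the data to be inserted.
Optionally, the at least one attribute value of the data to be inserted includes the first attribute value and/or the second attribute value of the data to be inserted, or includes neither the first attribute value nor the second attribute value of the data to be inserted, where the value corresponding to the second attribute value of the data to be inserted in the node added to the second-level jump table includes a character string obtained in one of the following ways: merging the at least one attribute value according to a predetermined string merging rule, serializing the at least one attribute value according to a predetermined JSON format, serializing the at least one attribute value according to a predetermined ProtocolBuffer format, or serializing the at least one attribute value according to a predefined Schema format.
Optionally, the method for maintaining data in memory further includes: receiving a first attribute value of data to be queried and a value range regarding the second attribute value; determining, according to the first attribute value of the data to be queried, the shard corresponding to the data to be queried in the data table; searching the first-level jump table corresponding to the determined shard for a node with the first attribute value of the data to be queried as its key; and retrieving, from the second-level jump table indicated by the pointer or object in the found node, the at least one attribute value of the nodes whose keys are within the value range.
Optionally, the step of retrieving the at least one attribute value of the nodes whose keys are within the value range from the second-level jump table indicated by the pointer or object in the found node includes: retrieving, from the nodes of the second-level jump table indicated by the pointer or object in the found node, the values corresponding to the keys within the value range; and obtaining the at least one attribute value of the data to be queried in one of the following ways: splitting the retrieved value according to a predetermined string splitting rule, deserializing the retrieved value according to a predetermined JSON format, deserializing the retrieved value according to a predetermined ProtocolBuffer format, or deserializing the retrieved value according to a predefined Schema format.
Optionally, the step of determining the shard corresponding to the data to be queried includes: calculating a hash value corresponding to the first attribute value of the data to be queried; obtaining the remainder of dividing the calculated hash value by the total number of shards in the data table; and determining the shard corresponding to the obtained remainder as the shard corresponding to the data to be queried.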
可选地,所述待插入的数据或待查询的数据是时序型数据,所述第二属性值为时间戳值。
可选地,所述取值范围指定时间戳值的起始值和终止值或者指定时间戳值的终止值。
可选地,在第二级跳表中添加节点的步骤包括:按照时间戳值指示的时间添加节点,使得第二级跳表中的节点按照时间从近及远的顺序排列。
可选地,所述在内存中维护数据的方法还包括:设置与第二级跳表对应的节点数量阈值,其中,从查找到的节点中的指针或对象所指示的第二级跳表中取出关键字在取值范围内的节点中的对应的所述至少一个属性值的步骤包括:从查找到的节点中的指针或对象所指示的第二级跳表中,按照从近及远的顺序取出关键字在取值范围内且数量不超过所述节点数量阈值的节点中的对应的所述至少一个属性值。
可选地,所述在内存中维护数据的方法还包括:设置与第二级跳表对应的节点数量阈值;以预定周期遍历第一级跳表和第二级跳表,当遍历到的第二级跳表中的节点数量超过节点数量阈值时,根据该第二级跳表中的节点的排列顺序,删除排在与节点数量阈值对应的节点之后的所有节点。
可选地,所述在内存中维护数据的方法还包括:设置过期期限长度;以预定周期遍历第一级跳表和第二级跳表,通过定位时间戳值达到所述过期期限长度的节点来整体删除在该节点之后的节点。
根据本公开的另一示例性实施例,提供一种管理内存数据的系统。所述管理内存数据的系统包括:数据表设置单元,用于设置包括多个分片的数据表,其中,每个分片分别对应第一级跳表;第一级跳表设置单元,用于将第一级跳表设置为用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;第二级跳表设置单元,用于将第二级跳表设置为用于存储以所述数据的第二属性值为关键字且与该关键字对应的值包括所述数据的至少一个属性值的节点。
根据本公开的另一示例性实施例,提供在内存中维护数据的系统。所述在内存中维护数据的系统包括:分片确定单元,用于根据待插入的数据的第一属性值来确定包括多个分片的数据表中的与待插入的数据对应的分片,其中,每个分片分别对应第一级跳表,第一级跳表用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;查找单元,用于从与确定的分片对应的第一级跳表中查找以待插入的数据的第一属性值为关键字的节点;数据添加单元,用于在从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的至少一个属性值的节点。
可选地,分片确定单元计算与待插入的数据的第一属性值对应的哈希值;获得计算出的哈希值除以所述数据表中的分片总数所得的余数;将与获得的余数对应的分片确定为与待插入的数据对应的分片。
可选地,所述多个分片中的每个分片中存储有指示对应的第一级跳表的指针或对象。
可选地,在未能从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,数据添加单元创建第二级跳表,在第一级跳表中创建以待插入的数据的第一属性值为关键字且以指示创建的第二级跳表的指针或对象为与该关键字对应的值的节点,并在创建的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的所述至少一个属性值的节点。
可选地,待插入的数据的所述至少一个属性值包括待插入的数据的第一属性值和/或第二属性值,或者待插入的数据的所述至少一个属性值既不包括待插入的数据的第一属性值也不包括待插入的数据的第二属性值,其中,添加到第二级跳表中的节点中的与待插入的数据的第二属性值对应的值包括通过以下方式之一获得的字符串:按照预定的字符串合并规则对所述至少一个属性值进行合并,按照预定的JSON格式对所述至少一个属性值进行序列化,按照预定的ProtocolBuffer格式对所述至少一个属性值进行序列化,按照预定义的Schema格式对所述至少一个属性值进行序列化。
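Optionally, the system for maintaining data in memory further includes an input receiving unit and a data acquisition unit. The input receiving unit receives the first attribute value of data to be queried and a value range for the second attribute value; the shard determination unit determines, from the first attribute value of the data to be queried, the shard in the data table that corresponds to the data to be queried; and the data acquisition unit looks up, in the first-level skip list corresponding to the determined shard, the node whose key is the first attribute value of the data to be queried, and retrieves, from the second-level skip list indicated by the pointer or object in the node found, the corresponding at least one attribute value in the nodes whose keys fall within the value range.
Optionally, the data acquisition unit retrieves, from the nodes of the second-level skip list indicated by the pointer or object in the node found, the values corresponding to the keys within the value range, and obtains the at least one attribute value of the data to be queried in one of the following ways: splitting the retrieved value according to a predetermined string splitting rule; deserializing the retrieved value according to a predetermined JSON format; deserializing the retrieved value according to a predetermined ProtocolBuffer format; deserializing the retrieved value according to a predefined Schema format.
Optionally, the shard determination unit computes a hash value corresponding to the first attribute value of the data to be queried, obtains the remainder of dividing the computed hash value by the total number of shards in the data table, and determines the shard corresponding to the obtained remainder as the shard corresponding to the data to be queried.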
可选地,所述待插入的数据或待查询的数据是时序型数据,所述第二属性值为时间戳值。
可选地,所述取值范围指定时间戳值的起始值和终止值或者指定时间戳值的终止值。
可选地,数据添加单元按照时间戳值指示的时间添加节点,使得第二级跳表中的节点按照时间从近及远的顺序排列。
可选地,在内存中维护数据的系统还包括:节点数量阈值设置单元,用于设置与第二级跳表对应的节点数量阈值,其中,数据获取单元从查找到的节点中的指针或对象所指示的第二级跳表中,按照从近及远的顺序取出关键字在取值范围内且数量不超过所述节点数量阈值的节点中的对应的所述至少一个属性值。
可选地,在内存中维护数据的系统还包括:节点数量阈值设置单元和节点删除单元,其中,节点数量阈值设置单元设置与第二级跳表对应的节点数量阈值,查找单元以预定周期遍历第一级跳表和第二级跳表,当遍历到的第二级跳表中的节点数量超过节点数量阈值时,节点删除单元根据该第二级跳表中的节点的排列顺序,删除排在与节点数量阈值对应的节点之后的所有节点。
可选地,在内存中维护数据的系统还包括:过期期限长度设置单元和数据删除单元,其中,过期期限长度设置单元设置过期期限长度,查找单元以预定周期遍历第一级跳表和第二级跳表,数据删除单元整体删除时间戳值达到所述过期期限长度的节点之后的节点。
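According to another exemplary embodiment of the present disclosure, a computer-readable medium is provided, on which a computer program for executing the method for managing in-memory data described above is recorded.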
根据本公开的另一示例性实施例,提供一种计算装置,包括存储部件和处理器,其中,存储部件中存储有计算机可执行指令集合,当所述计算机可执行指令集合被所述处理器执行时,执行如上所述的管理内存数据的方法。
根据本公开的另一示例性实施例,提供一种计算机可读介质,其中,在所述计算机可读介质上记录有用于执行如上所述的在内存中维护数据的方法的计算机程序。
根据本公开的另一示例性实施例,提供一种计算装置,包括存储部件和处理器,其中,存储部件中存储有计算机可执行指令集合,当所述计算机可执行指令集合被所述处理器执行时,执行如上所述的在内存中维护数据的方法。
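According to the methods, systems, computer-readable media and computing devices of the exemplary embodiments of the present disclosure, a first-level skip list and a second-level skip list can be set up. The first-level skip list stores nodes whose key is the first attribute value of the data and whose value is a pointer or object indicating a second-level skip list, and the second-level skip list stores nodes whose key is the second attribute value of the data. A node in the second-level skip list can therefore be located quickly using the preset keys corresponding to the first and second attribute values, which can increase the speed of reading and writing data. A plurality of shards, each with its own first-level skip list, is also set up, so that the shards can be processed in parallel to increase the number of data write tasks and/or data query tasks that can be executed concurrently per unit time.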
将在接下来的描述中部分阐述本公开总体构思另外的方面和/或优点,还有一部分通过描述将是清楚的,或者可以经过本公开总体构思的实施而得知。
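Brief Description of the Drawings
The above and other objects and features of the exemplary embodiments of the present disclosure will become clearer from the following description, taken in conjunction with the accompanying drawings that illustrate the embodiments by way of example, in which:
FIG. 1 is a schematic diagram of a data table according to an exemplary embodiment of the present disclosure;
FIG. 2 is a flowchart of a method for managing in-memory data according to an exemplary embodiment of the present disclosure;
FIG. 3 is a flowchart of an operation of inserting data in memory according to an exemplary embodiment of the present disclosure;
FIG. 4 is a flowchart of an operation of querying data in memory according to an exemplary embodiment of the present disclosure;
FIG. 5 is a block diagram of a system for managing in-memory data according to an exemplary embodiment of the present disclosure;
FIG. 6 is a block diagram of a system for maintaining data in memory according to an exemplary embodiment of the present disclosure.
Detailed Description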
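Reference will now be made in detail to embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings, in which like reference numerals refer to like parts throughout. The embodiments are described below with reference to the drawings in order to explain the present disclosure. Note that "and/or" as used in the present disclosure covers three parallel cases. For example, "including A and/or B" means including at least one of A and B, that is, one of the following three parallel cases: (1) including A; (2) including B; (3) including A and B. Likewise, "performing step one and/or step two" means performing at least one of step one and step two, that is: (1) performing step one; (2) performing step two; (3) performing step one and step two.
The first-level and second-level skip lists referred to in the exemplary embodiments of the present disclosure are skip lists (skiplist).
FIG. 1 is a schematic diagram of a data table according to an exemplary embodiment of the present disclosure. As shown in FIG. 1, the data table includes shard 0 to shard n, where n is a natural number greater than 1. Each of these shards corresponds to one first-level skip list. FIG. 1 shows the first-level skip list corresponding to shard 0, which includes node 11 to node 1m, where m is a natural number. Each shard may store a pointer or object indicating its first-level skip list, so that the first-level skip list corresponding to the shard can be located.
Each node in a first-level skip list may correspond to one second-level skip list. As shown in FIG. 1, the second-level skip list corresponding to node 11 includes node 41 to node 4k, the second-level skip list corresponding to node 12 includes node 31 to node 3j, and the second-level skip list corresponding to node 1m includes node 21 to node 2i, where i, j and k are natural numbers. Each node of the first-level skip list may store a pointer or object indicating its second-level skip list, so that the second-level skip list corresponding to that node can be located.
Key-value pairs may be stored in the nodes of the skip lists. Specifically, for a node of a first-level skip list, the first attribute value of the data may be set as the key, and a pointer or object indicating a second-level skip list may be set as the value corresponding to the key; for a node of a second-level skip list, the second attribute value of the data may be set as the key, and the value corresponding to the key may be used to store at least one attribute value of the data.
As an example, the at least one attribute value includes the first attribute value and/or the second attribute value of the data, or includes neither the first nor the second attribute value of the data.
In addition, a node of a first-level skip list may store a pointer or object indicating another node of the same first-level skip list, and a node of a second-level skip list may likewise store a pointer or object indicating another node of the same second-level skip list. For either kind of skip list, once nodes exist in the list, every node except the tail node stores an object or pointer indicating a node that belongs to the same skip list, so that the skip list forms a chain. When a third node is inserted between a first node and a second node, the pointer or object in the first node that indicates the second node must be changed to indicate the third node, and a pointer or object in the third node must be set to indicate the second node.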
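The pointer update described above can be sketched in a few lines. This is a minimal illustration only: it models just the bottom-level chain of a skip list (no upper express levels), and the names `Node`, `insert_after` and `keys` are assumptions made for the example, not names from the disclosure.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    key: Any
    value: Any
    next: Optional["Node"] = None  # link to the following node; None for the tail


def insert_after(first: Node, third: Node) -> None:
    """Insert `third` between `first` and the node `first` currently indicates."""
    third.next = first.next  # the third node now indicates the second node
    first.next = third       # the first node now indicates the third node


def keys(head: Optional[Node]) -> list:
    """Walk the chain from `head` and collect the keys in order."""
    out = []
    while head is not None:
        out.append(head.key)
        head = head.next
    return out


second = Node("b", 2)
first = Node("a", 1, next=second)
insert_after(first, Node("ab", 3))
print(keys(first))
```

The order of the two assignments matters: the new node must take over the old link before the first node is redirected, otherwise the reference to the second node is lost.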
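FIG. 2 is a flowchart of a method for managing in-memory data according to an exemplary embodiment of the present disclosure. The method is performed by at least one computing device. As shown in FIG. 2, the method includes steps S101 to S103.
In step S101, a data table including a plurality of shards is set up, where each shard corresponds to a first-level skip list. The shards may be set up in memory in the same way an array is set up in memory. A corresponding first-level skip list may then be set up for each shard; the first-level skip list as initially set up may be an empty skip list containing no nodes. A pointer or object indicating the corresponding first-level skip list may be stored in each shard, so that the first-level skip list can be located through the pointer or object.
In step S102, the first-level skip list is configured to store nodes whose key is the first attribute value of the data and whose value is a pointer or object indicating a second-level skip list. Each node of the first-level skip list stores a key-value pair in which the first attribute value of the data serves as the key and the pointer or object serves as the value corresponding to the key.
In step S103, the second-level skip list is configured to store nodes whose key is the second attribute value of the data and whose value includes at least one attribute value of the data. Each node of the second-level skip list also stores a key-value pair, in which the second attribute value serves as the key and the value corresponding to the key includes at least one attribute value of the data. When the data is bank transaction data, the first attribute value may be the card number, the second attribute value may be the timestamp value, and the value corresponding to the timestamp value may include at least one of the transaction amount, the transaction location, and the point-of-sale (POS) number. Table 1 below shows data according to an exemplary embodiment of the present disclosure: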
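Table 1
Card number | Timestamp value | Transaction amount | Transaction location | POS number
6222XXXX01 | 2018052814520505 | 100 | 北京上地xx路 | 10xxx
6222XXXX01 | 2018052815520505 | 50 | 北京西二旗xx店 | 20xxx
6222XXXX02 | 2018052811520505 | 1000 | 南京鼓楼区xxx | 30xxx
As shown in Table 1, each record may include the following attribute values: card number, timestamp value, transaction amount, transaction location, and POS number. Table 1 contains 3 records.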
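As an example, the at least one attribute value of the data includes the first attribute value and/or the second attribute value of the data, or includes neither the first nor the second attribute value of the data. The value corresponding to the second attribute value in a node added to the second-level skip list includes a string obtained in one of the following ways: merging the at least one attribute value according to a predetermined string merging rule; serializing the at least one attribute value according to a predetermined JSON (JavaScript Object Notation) format; serializing the at least one attribute value according to a predetermined ProtocolBuffer format; serializing the at least one attribute value according to a predefined Schema format.
The shards, first-level skip lists and second-level skip lists described above may be set up in memory. Setting up the first-level and second-level skip lists in this way establishes the correspondence between shards, first-level skip lists and second-level skip lists, and the data in memory can be managed and maintained on the basis of that correspondence. Operations for maintaining data in memory are described in the exemplary embodiments below.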
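Steps S101 to S103 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the disclosure: the `SkipList` class, its fixed `MAX_LEVEL`, the coin-flip level choice and the helper `make_table` are all illustrative. Keys are kept in ascending order here; the newest-first ordering described later for timestamps would need a reversed comparison (for example, negated keys).

```python
import random


class SkipNode:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level  # forward[i] is the next node on level i


class SkipList:
    MAX_LEVEL = 8  # assumed cap on tower height

    def __init__(self, seed=0):
        self.head = SkipNode(None, None, self.MAX_LEVEL)
        self.rng = random.Random(seed)

    def _predecessors(self, key):
        """For every level, find the last node whose key is smaller than `key`."""
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in reversed(range(self.MAX_LEVEL)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def get(self, key):
        cand = self._predecessors(key)[0].forward[0]
        return cand.value if cand is not None and cand.key == key else None

    def put(self, key, value):
        update = self._predecessors(key)
        cand = update[0].forward[0]
        if cand is not None and cand.key == key:
            cand.value = value  # key already present: overwrite the value
            return
        level = 1
        while level < self.MAX_LEVEL and self.rng.random() < 0.5:
            level += 1
        node = SkipNode(key, value, level)
        for i in range(level):  # splice the new node into each of its levels
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def keys(self):
        out, node = [], self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out


def make_table(shard_count):
    """S101: an array of shards, each referring to an empty first-level skip list."""
    return [SkipList(seed=i) for i in range(shard_count)]


table = make_table(4)
second_level = SkipList()
table[0].put("6222XXXX01", second_level)                      # S102: key -> second level
second_level.put(2018052814520505, "100|北京上地xx路|10xxx")   # S103: key -> attribute values
```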
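FIG. 3 is a flowchart of an operation of inserting data in memory according to an exemplary embodiment of the present disclosure. As shown in FIG. 3, the operation of maintaining data in memory, performed by at least one computing device, includes steps S201 to S203.
As an example, insertion may be performed through an interface of the form put(table_name, key, ts, value), where table_name specifies the name of the data table into which the data is inserted, key specifies the first attribute value of the data to be inserted, ts specifies the second attribute value of the data to be inserted, and value specifies at least one attribute value of the data to be inserted (as an example, value may be obtained by encoding, for example merging or serializing, the at least one attribute value according to a specific rule).
As an example, insertion may also be performed through an interface of the form put(table_name, key, ts, field_1, field_1 type, field_2, field_2 type, ..., field_n, field_n type), where table_name specifies the name of the data table into which the data is inserted, key specifies the first attribute value of the data to be inserted, ts specifies the second attribute value of the data to be inserted, and field_1, field_1 type, field_2, field_2 type, ..., field_n, field_n type specify n attribute values of the data to be inserted (Schema format), n being a natural number.
In step S201, the shard corresponding to the data to be inserted is determined, from the first attribute value of the data to be inserted, in a data table that includes a plurality of shards, where each shard corresponds to a first-level skip list and the first-level skip list stores nodes whose key is the first attribute value of the data and whose value is a pointer or object indicating a second-level skip list.
In the exemplary embodiments of the present disclosure, the data to be inserted may be received, for example bank transaction data as shown in Table 1. Bank transaction data may include the following attribute values: card number, timestamp value, transaction amount, transaction location, and POS number. The data table into which the data is to be inserted may be selected from the in-memory database. The selected data table stores a plurality of shards.
The shard corresponding to the data to be inserted may be selected as follows: compute a hash value corresponding to the first attribute value of the data to be inserted; obtain the remainder of dividing the computed hash value by the total number of shards in the data table; determine the shard corresponding to the obtained remainder as the shard corresponding to the data to be inserted. A hash function may be applied to the first attribute value to obtain the hash value. For example, the hash function may be MurmurHash, proposed by Austin Appleby. The present disclosure does not limit the hash function used; other hash functions may also be used to compute the hash value. Referring to FIG. 1, the data table may include shard 0 to shard n; if the remainder is 0, shard 0 corresponds to the data to be inserted; if the remainder is h (0<h≤n), shard h corresponds to the data to be inserted.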
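The hash-and-remainder rule can be illustrated with a short sketch. The text names MurmurHash as one possible hash function; since it is not part of the Python standard library, this example substitutes `zlib.crc32` as a stand-in, which the text permits ("other hash functions may also be used"). The function name `shard_index` is an assumption made for the example.

```python
import zlib


def shard_index(first_attribute: str, shard_count: int) -> int:
    """Map a first attribute value (e.g. a card number) to a shard number."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Stand-in hash: CRC-32 of the UTF-8 bytes (MurmurHash would be used the same way).
    hash_value = zlib.crc32(first_attribute.encode("utf-8"))
    # The remainder of hash / shard count selects the shard.
    return hash_value % shard_count


print(shard_index("6222XXXX01", 8))
```

Because the hash is deterministic, every record with the same card number lands in the same shard, so inserts and queries for one key always reach the same first-level skip list.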
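As an example, a shard may be associated with its first-level skip list through a pointer or object. That is, each of the shards of the data table stores a pointer or object indicating the corresponding first-level skip list. The object is similar to an object in object-oriented (OO) programming. The first-level skip list corresponding to a shard can be located through the pointer or object stored in that shard.
In step S202, the first-level skip list corresponding to the determined shard is searched for a node whose key is the first attribute value of the data to be inserted.
In step S203, if a node whose key is the first attribute value of the data to be inserted is found in the first-level skip list, a node whose key is the second attribute value of the data to be inserted and whose value includes at least one attribute value of the data to be inserted is added to the second-level skip list indicated by the pointer or object in the node found.
As an example, the data to be inserted is time series data, and the second attribute value is a timestamp value.
As an example, if no node whose key is the first attribute value of the data to be inserted has been inserted into the first-level skip list since that list was created for the determined shard, the lookup in step S202 returns nothing. In this case, a second-level skip list may be created, and a node whose key is the first attribute value of the data to be inserted and whose value is a pointer or object indicating the created second-level skip list may be created in the first-level skip list. A node whose key is the second attribute value of the data to be inserted and whose value includes at least one attribute value of the data to be inserted may then be added to the created second-level skip list.
Taking the first record in Table 1 as the data to be inserted: in step S201, suppose the shard determined from the card number "6222XXXX01" is shard 0. In step S202, referring to FIG. 1, the first-level skip list corresponding to shard 0 is searched for a node whose key is "6222XXXX01". Suppose node 11 is found. It can then be determined that the second-level skip list corresponding to node 11 includes node 41 to node 4k.
As an example, the at least one attribute value of the data to be inserted includes the first attribute value and/or the second attribute value of the data to be inserted, or includes neither of them. The value corresponding to the second attribute value in a node added to the second-level skip list includes a string obtained in one of the following ways: merging the at least one attribute value according to a predetermined string merging rule; serializing the at least one attribute value according to a predetermined JSON format; serializing the at least one attribute value according to a predetermined ProtocolBuffer format; serializing the at least one attribute value according to a predefined Schema format.
As an example, the predetermined string merging rule includes merging with a specific symbol (for example, "|"). For example, using a preset symbol such as "|", the transaction amount, transaction location and POS number can be merged into the string "100|北京上地xx路|10xxx"; a node 4g (g being a natural number) is created with "2018052814520505" as its key and "100|北京上地xx路|10xxx" as the value corresponding to the key, and node 4g is inserted into the second-level skip list, for example between node 41 and node 42. Besides merging the at least one attribute value according to a predetermined string merging rule, obtaining the string that serves as the value in a second-level node by means of the JSON, ProtocolBuffer or Schema formats mentioned above is equally feasible. Of course, the above description is only an example and should not be regarded as limiting.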
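The two simplest encodings mentioned above, merging with a separator and JSON serialization, might look like this in practice. The function names and the JSON field names (`amount`, `location`, `pos_id`) are assumptions made for illustration; the disclosure does not name them. The check that rejects fields containing the separator is also an added safeguard, not something the text specifies.

```python
import json

SEPARATOR = "|"  # merge symbol taken from the example in the text


def merge_fields(fields):
    """Merge attribute values into one string, e.g. [100, 'x', 'y'] -> '100|x|y'."""
    parts = [str(field) for field in fields]
    if any(SEPARATOR in part for part in parts):
        # Assumed safeguard: a separator inside a field would make splitting ambiguous.
        raise ValueError("field contains the separator")
    return SEPARATOR.join(parts)


def to_json(amount, location, pos_id):
    """Alternative encoding: serialize the same attribute values as JSON."""
    record = {"amount": amount, "location": location, "pos_id": pos_id}
    return json.dumps(record, ensure_ascii=False)


merged = merge_fields([100, "北京上地xx路", "10xxx"])
print(merged)
print(to_json(100, "北京上地xx路", "10xxx"))
```

The merged form is compact but untyped (everything comes back as a string), whereas the JSON form keeps field names and numeric types at the cost of a longer value.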
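As an example, the step of adding a node to the second-level skip list includes: adding the node according to the time indicated by the timestamp value, so that the nodes in the second-level skip list are arranged in order of time from most recent to least recent. Which time is more recent can be determined by comparing timestamp values: the time corresponding to a larger timestamp value is more recent than the time corresponding to a smaller one. In the second-level skip list, therefore, a node with a larger timestamp value may be placed before a node with a smaller timestamp value.
The first to third records in Table 1 are used as the data to be inserted for illustration. The first and second records have the same card number, so both correspond to the same second-level skip list. In that second-level skip list, after the first record has been added, the second record has a larger timestamp value than the first, so the node corresponding to the second record is added before the node corresponding to the first record. That is, node 4g, with "2018052814520505" as its key and "100|北京上地xx路|10xxx" as the value corresponding to the key, is inserted into the second-level skip list, and a node with "2018052815520505" as its key and "50|北京西二旗xx店|20xxx" as the value corresponding to the key is inserted before node 4g.
In the above example, the shard determined for the third record from the card number "6222XXXX02" is shard 1, and the first-level skip list corresponding to shard 1 is located through the pointer or object stored in shard 1 (FIG. 1 does not show the first-level skip list corresponding to shard 1). Suppose no node whose key is "6222XXXX02" is found in the first-level skip list corresponding to shard 1. A node and a second-level skip list are then created, and the node, whose key is "6222XXXX02" and whose value is a pointer or object indicating the created second-level skip list, is added to the first-level skip list corresponding to shard 1. A node whose key is "2018052811520505" and whose value is "1000|南京鼓楼区xxx|30xxx" is then added to the created second-level skip list.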
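The insert path (steps S201 to S203, including the case where the key is not yet present) can be sketched end to end. To keep the example short, ordinary Python containers stand in for the skip lists: a `dict` for each first-level list and a sorted `list` for each second-level list. These substitutions, the class name `Table`, the CRC-32 hash and the trick of storing negated timestamps are all assumptions of this sketch, not part of the disclosure.

```python
import bisect
import zlib


class Table:
    """Shards -> first level (card number) -> second level (timestamps, newest first)."""

    def __init__(self, shard_count):
        # A dict stands in for each shard's first-level skip list.
        self.shards = [{} for _ in range(shard_count)]

    def _first_level(self, key):
        index = zlib.crc32(key.encode("utf-8")) % len(self.shards)  # step S201
        return self.shards[index]

    def put(self, key, ts, value):
        first_level = self._first_level(key)
        second_level = first_level.get(key)       # step S202: look up the key
        if second_level is None:
            second_level = []                     # not found: create the second level
            first_level[key] = second_level       # and register it in the first level
        # Step S203: storing -ts keeps the ascending list ordered newest first.
        bisect.insort(second_level, (-ts, value))

    def timestamps(self, key):
        """Timestamps stored for `key`, in stored (newest-first) order."""
        return [-neg_ts for neg_ts, _ in self._first_level(key).get(key, [])]


table = Table(shard_count=8)
table.put("6222XXXX01", 2018052814520505, "100|北京上地xx路|10xxx")
table.put("6222XXXX01", 2018052815520505, "50|北京西二旗xx店|20xxx")
table.put("6222XXXX02", 2018052811520505, "1000|南京鼓楼区xxx|30xxx")
print(table.timestamps("6222XXXX01"))
```

Because the shard is chosen from the key alone, separate shards never share state, which is what allows inserts for different shards to proceed in parallel in a real implementation (for instance with one lock per shard).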
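FIG. 4 is a flowchart of an operation of querying data in memory according to an exemplary embodiment of the present disclosure. As shown in FIG. 4, the operation of querying data in memory includes steps S301 to S304.
As an example, a query may be performed through an interface of the form scan(table_name, key, start_time, end_time), where table_name specifies the name of the data table to query, key specifies the first attribute value of the data to be queried, and start_time and end_time specify the value range of the data to be queried, for example a start time and an end time.
As an example, a query may be performed through an interface of the form get(table_name, key, ts), where table_name specifies the name of the data table to query, key specifies the first attribute value of the data to be queried, and ts specifies the value range of the data to be queried. For example, ts may specify the timestamp value of the data to be queried, in which case the query is for the data whose timestamp value is ts. Alternatively, ts may specify the end time of the data to be queried, in which case the query is for the data from the time of the query to the specified ts.
In step S301, the first attribute value of the data to be queried and a value range for the second attribute value are received.
As an example, the data to be queried is time series data, and the second attribute value is a timestamp value.
As an example, the value range specifies a start value and an end value for the timestamp value, or specifies only an end value for the timestamp value.
In step S302, the shard in the data table that corresponds to the data to be queried is determined from the first attribute value of the data to be queried.
As an example, the shard corresponding to the data to be queried may be determined as follows: compute a hash value corresponding to the first attribute value of the data to be queried; obtain the remainder of dividing the computed hash value by the total number of shards in the data table; determine the shard corresponding to the obtained remainder as the shard corresponding to the data to be queried. A hash function may be applied to the first attribute value to obtain the hash value. For example, the hash function may be MurmurHash, proposed by Austin Appleby. The present disclosure does not limit the hash function used; other hash functions may also be used to compute the hash value.
In step S303, the first-level skip list corresponding to the determined shard is searched for a node whose key is the first attribute value of the data to be queried.
In step S304, the corresponding at least one attribute value in the nodes whose keys fall within the value range is retrieved from the second-level skip list indicated by the pointer or object in the node found.
Taking the data in Table 1 as an example, suppose the query is for data whose card number is "6222XXXX01" and whose timestamp value has the start value "2018052815520505" and the end value "2018052814520505". The shard corresponding to the data to be queried is determined from "6222XXXX01" to be shard 0. Among node 11 to node 1m of the first-level skip list corresponding to shard 0, the node whose key is "6222XXXX01" is found to be node 11. The second-level skip list corresponding to node 11 is determined to include node 41 to node 4k. Among node 41 to node 4k, the nodes whose keys lie between the start value "2018052815520505" and the end value "2018052814520505" are found, and the values corresponding to those keys are read from the nodes found. As another example, when the value range specifies only an end value for the timestamp (for example, "2018052814520505"), the query returns the data corresponding to nodes whose timestamp value is greater than or equal to "2018052814520505".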
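The range read in step S304 can be sketched as a walk over a newest-first sequence of nodes. This is an illustrative sketch only: a plain list of `(timestamp, value)` pairs stands in for the bottom level of a second-level skip list, and the function name `scan` and its defaults are assumptions. Note that, as in the example above, the start value is the more recent (larger) bound and the end value is the older (smaller) bound.

```python
def scan(nodes, start_ts=None, end_ts=0):
    """Return the (timestamp, value) pairs with end_ts <= timestamp <= start_ts.

    `nodes` must be ordered newest first. With start_ts=None only the end value
    bounds the range, i.e. every node with timestamp >= end_ts is returned.
    """
    result = []
    for ts, value in nodes:
        if start_ts is not None and ts > start_ts:
            continue  # newer than the requested start: keep walking
        if ts < end_ts:
            break     # every later node is older still, so stop here
        result.append((ts, value))
    return result


nodes = [
    (2018052815520505, "50|北京西二旗xx店|20xxx"),
    (2018052814520505, "100|北京上地xx路|10xxx"),
]
print(scan(nodes, start_ts=2018052815520505, end_ts=2018052814520505))
print(scan(nodes, end_ts=2018052815000000))
```

In a real skip list the leading `continue` loop would be replaced by a logarithmic seek to the first node not newer than the start value; the early `break` is what the newest-first ordering buys in either case.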
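As an example, the step of retrieving, from the second-level skip list indicated by the pointer or object in the node found, the corresponding at least one attribute value in the nodes whose keys fall within the value range includes: retrieving, from the nodes of that second-level skip list, the values corresponding to the keys within the value range; and obtaining the at least one attribute value of the data to be queried in one of the following ways: splitting the retrieved value according to a predetermined string splitting rule (the counterpart of the string merging rule described above); deserializing the retrieved value according to a predetermined JSON format; deserializing the retrieved value according to a predetermined ProtocolBuffer format; deserializing the retrieved value according to a predefined Schema format.
For example, the values corresponding to the keys in the nodes found are "100|北京上地xx路|10xxx" and "50|北京西二旗xx店|20xxx". "100|北京上地xx路|10xxx" can be split on a preset symbol such as "|", and, according to the preset meaning of each resulting substring, the first substring is the transaction amount "100", the second is the transaction location "北京上地xx路", and the third is the POS number "10xxx". Similarly, the transaction amount "50", the transaction location "北京西二旗xx店" and the POS number "20xxx" can be obtained from "50|北京西二旗xx店|20xxx".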
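Splitting a retrieved value back into named attributes is the inverse of the merge step. In the sketch below the field names in `FIELD_NAMES` and the function name `split_fields` are hypothetical; the text only says that the meaning of each position is agreed in advance.

```python
FIELD_NAMES = ("amount", "location", "pos_id")  # assumed, position-based meaning


def split_fields(value, separator="|"):
    """Split a merged value and label each part by its agreed position."""
    parts = value.split(separator)
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))


print(split_fields("100|北京上地xx路|10xxx"))
```

All parts come back as strings, so a caller that needs the amount as a number has to convert it explicitly; the JSON, ProtocolBuffer and Schema options avoid that by carrying type information.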
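As an example, to ensure that memory does not hold too much data, a node-count threshold may be set for the second-level skip list. On this basis, the step of retrieving, from the second-level skip list indicated by the pointer or object in the node found, the corresponding at least one attribute value in the nodes whose keys fall within the value range includes: retrieving from that second-level skip list, in order from most recent to least recent, the corresponding at least one attribute value in the nodes whose keys fall within the value range, up to a number of nodes not exceeding the node-count threshold.
As an example, periodic deletion may also be performed on the basis of the node-count threshold, namely: traverse the first-level and second-level skip lists at a predetermined period, and when the number of nodes in a traversed second-level skip list exceeds the node-count threshold, delete all nodes that come after the node corresponding to the node-count threshold, according to the order of the nodes in that second-level skip list. For example, when the node-count threshold is 10, the node corresponding to the threshold is the 10th node in the order of the nodes in the second-level skip list.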
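Both uses of the node-count threshold, capping a read and trimming a list during periodic cleanup, reduce to keeping the first N nodes of a newest-first sequence. The sketch below uses a plain list as a stand-in for a second-level skip list; the function names `read_newest` and `truncate` are assumptions made for the example.

```python
def read_newest(nodes, max_nodes):
    """Return at most `max_nodes` nodes, taken from the newest end."""
    return nodes[:max_nodes]


def truncate(nodes, max_nodes):
    """Delete every node after the `max_nodes`-th one; return how many were removed."""
    removed = max(0, len(nodes) - max_nodes)
    del nodes[max_nodes:]  # no-op when the list is already within the threshold
    return removed


# Twelve nodes with timestamps 12, 11, ..., 1 (newest first).
nodes = [(ts, f"value-{ts}") for ts in range(12, 0, -1)]
print(truncate(nodes, 10), len(nodes))
```

Because the list is ordered newest first, the nodes that fall beyond the threshold are always the oldest ones, so the whole tail can be dropped in one step rather than node by node.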
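As an example, to improve processing efficiency, the following expired-data deletion operation may be performed: set an expiration period length; traverse the first-level and second-level skip lists at a predetermined period (for example, 3 months), locate the node whose timestamp value reaches the expiration period length, and delete as a whole the nodes that follow it. Through this operation, the nodes in a second-level skip list whose timestamp values are smaller than the timestamp value corresponding to the set expiration period length can be deleted. For example, if the timestamp value corresponding to the set expiration period length is 2018060000000000, the operation deletes all 3 nodes that correspond to the 3 records in Table 1 and were added to the second-level skip lists in the examples above.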
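The expiry step can be sketched in the same style. A newest-first list again stands in for a second-level skip list, and `expire` and `cutoff_ts` are names assumed for this example; `cutoff_ts` plays the role of "the timestamp value corresponding to the set expiration period length".

```python
def expire(nodes, cutoff_ts):
    """Delete every node whose timestamp is smaller than `cutoff_ts`.

    `nodes` is ordered newest first, so once the first expired node is located
    everything after it is expired as well and the tail is removed as a whole.
    Returns the number of nodes removed.
    """
    for index, (ts, _value) in enumerate(nodes):
        if ts < cutoff_ts:
            removed = len(nodes) - index
            del nodes[index:]
            return removed
    return 0


nodes = [
    (2018052815520505, "50|北京西二旗xx店|20xxx"),
    (2018052814520505, "100|北京上地xx路|10xxx"),
]
print(expire(nodes, 2018060000000000), nodes)
```

With the cutoff 2018060000000000 from the text, both nodes for card "6222XXXX01" are older than the cutoff and are removed together; the third record lives in a different second-level list and would be removed when that list is traversed.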
图5示出根据本公开示例性实施例的管理内存数据的系统的框图。如图5中所示,根据本公开示例性实施例的管理内存数据的系统400包括:数据表设置单元401、第一级跳表设置单元402以及第二级跳表设置单元403。
数据表设置单元401用于设置包括多个分片的数据表,其中,每个分片分别对应第一级跳表。第一级跳表设置单元402用于将第一级跳表设置为用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点。第二级跳表设置单元403用于将第二级跳表设置为用于存储以所述数据的第二属性值为关键字且与该关键字对应的值包括所述数据的至少一个属性值的节点。
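FIG. 6 is a block diagram of a system for maintaining data in memory according to an exemplary embodiment of the present disclosure. As shown in FIG. 6, the system 500 for maintaining data in memory includes a shard determination unit 501, a lookup unit 502, and a data adding unit 503.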
分片确定单元501用于根据待插入的数据的第一属性值来确定包括多个分片的数据表中的与待插入的数据对应的分片,其中,每个分片分别对应第一级跳表,第一级跳表用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点。查找单元502用于从与确定的分片对应的第一级跳表中查找以待插入的数据的第一属性值为关键字的节点。数据添加单元503用于在从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的至少一个属性值的节点。
作为示例,分片确定单元501计算与待插入的数据的第一属性值对应的哈希值;获得计算出的哈希值除以所述数据表中的分片总数所得的余数;将与获得的余数对应的分片确定为与待插入的数据对应的分片。
作为示例,所述多个分片中的每个分片中存储有指示对应的第一级跳表的指针或对象。
作为示例,在未能从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,数据添加单元503创建第二级跳表,在第一级跳表中创建以待插入的数据的第一属性值为关键字且以指示创建的第二级跳表的指针或对象为与该关键字对应的值的节点,并在创建的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的所述至少一个属性值的节点。
作为示例,待插入的数据的所述至少一个属性值包括待插入的数据的第一属性值和/或第二属性值,或者待插入的数据的所述至少一个属性值既不包括待插入的数据的第一属性值也不包括待插入的数据的第二属性值,其中,添加到第二级跳表中的节点中的与待插入的数据的第二属性值对应的值包括通过以下方式之一获得的字符串:按照预定的字符串合并规则对所述至少一个属性值进行合并,按照预定的JSON格式对所述至少一个属性值进行序列化,按照预定的ProtocolBuffer格式对所述至少一个属性值进行序列化,按照预定义的Schema格式对所述至少一个属性值进行序列化。
作为示例,所述在内存中维护数据的系统还包括:输入接收单元(未示出)和数据获取单元(未示出),其中,输入接收单元接收待查询的数据的第一属性值和关于第二属性值的取值范围,其中,分片确定单元501根据待查询的数据的第一属性值来确定所述数据表中的与待查询的数据对应的分片;数据获取单元从与确定的分片对应的第一级跳表中查找以待查询的数据的第一属性值为关键字的节点,从查找到的节点中的指针或对象所指示的第二级跳表中取出关键字在取值范围内的节点中的对应的所述至少一个属性值。
作为示例,数据获取单元从查找到的节点中的指针或对象所指示的第二级跳表的节点中取出与在取值范围内的关键字对应的值,通过以下方式之一来获得待查询的数据的所述至少一个属性值:按照预定的字符串拆分规则对取出的值进行拆分,按照预定的JSON格式对取出的值进行反序列化,按照预定的ProtocolBuffer格式对取出的值进行反序列化,按照预定义的Schema格式对取出的值进行反序列化。
作为示例,分片确定单元501计算与待查询的数据的第一属性值对应的哈希值;获得计算出的哈希值除以所述数据表中的分片总数所得的余数,将与获得的余数对应的分片确定为与待查询的数据对应的分片。
作为示例,所述待插入的数据或待查询的数据是时序型数据,所述第二属性值为时间戳值。
作为示例,所述取值范围指定时间戳值的起始值和终止值或者指定时间戳值的终止值。
作为示例,数据添加单元503按照时间戳值指示的时间添加节点,使得第二级跳表中的节点按照时间从近及远的顺序排列。
作为示例,在内存中维护数据的系统还包括:节点数量阈值设置单元(未示出),用于设置与第二级跳表对应的节点数量阈值,其中,数据获取单元从查找到的节点中的指针或对象所指示的第二级跳表中,按照从近及远的顺序取出关键字在取值范围内且数量不超过所述节点数量阈值的节点中的对应的所述至少一个属性值。
作为示例,在内存中维护数据的系统还包括:节点删除单元,其中,节点数量阈值设置单元设置与第二级跳表对应的节点数量阈值,查找单元以预定周期遍历第一级跳表和第二级跳表,当遍历到的第二级跳表中的节点数量超过节点数量阈值时,节点删除单元根据该第二级跳表中的节点的排列顺序,删除排在与节点数量阈值对应的节点之后的所有节点。
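As an example, the system for maintaining data in memory further includes an expiration period length setting unit (not shown) and a data deletion unit (not shown). The expiration period length setting unit sets the expiration period length, the lookup unit traverses the first-level and second-level skip lists at a predetermined period, and the data deletion unit deletes as a whole the nodes that follow the node whose timestamp value reaches the expiration period length.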
应该理解,根据本公开示例性实施例的管理内存数据的系统和在内存中维护数据的系统的具体实现方式可参照结合图1至图4以及表1描述的相关具体实现方式来实现,在此不再赘述。
根据本公开示例性实施例的系统所包括的单元可被分别配置为执行特定功能的软件、硬件、固件或上述项的任意组合。例如,这些单元可对应于专用的集成电路,也可对应于纯粹的软件代码,还可对应于软件与硬件相结合的单元。此外,这些单元所实现的一个或多个功能也可由物理实体设备(例如,处理器、客户端或服务器等)中的组件来统一执行。
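It should be understood that the methods according to the exemplary embodiments of the present disclosure may be implemented by a program recorded on a computer-readable medium. For example, according to an exemplary embodiment of the present disclosure, a computer storage medium storing instructions may be provided, where the instructions, when executed by at least one computing device, cause the at least one computing device to perform a method for managing in-memory data: setting up a data table that includes a plurality of shards, where each shard corresponds to a first-level skip list; configuring the first-level skip list to store nodes whose key is a first attribute value of the data and whose value is a pointer or object indicating a second-level skip list; and configuring the second-level skip list to store nodes whose key is a second attribute value of the data and whose value includes at least one attribute value of the data.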
又如,根据本公开的示例性实施例,可提供一种存储指令的计算机存储介质,其中,当所述指令被至少一个计算装置运行时,促使所述至少一个计算装置执行用于在内存中维护数据的方法:根据待插入的数据的第一属性值来确定包括多个分片的数据表中的与待插入的数据对应的分片,其中,每个分片分别对应第一级跳表,第一级跳表用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;从与确定的分片对应的第一级跳表中查找以待插入的数据的第一属性值为关键字的节点;在从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的至少一个属性值的节点。
上述计算机可读介质中的计算机程序可在诸如客户端、主机、代理装置、服务器等计算机设备中部署的环境中运行,应注意,所述计算机程序还可用于执行除了上述步骤以外的附加步骤或者在执行上述步骤时执行更为具体的处理,这些附加步骤和进一步处理的内容已经参照图1至图4以及表1进行了描述,这里为了避免重复将不再进行赘述。
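It should be noted that the system according to the exemplary embodiments of the present disclosure may rely entirely on the execution of a computer program to realize the corresponding functions; that is, each unit corresponds to a step in the functional architecture of the computer program, so that the whole system can be invoked through a dedicated software package (for example, a lib library) to realize the corresponding functions.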
另一方面,根据本公开示例性实施例的系统所包括的各个单元也可以通过硬件、软件、固件、中间件、微代码或其任意组合来实现。当以软件、固件、中间件或微代码实现时,用于执行相应操作的程序代码或者代码段可以存储在诸如存储介质的计算机可读介质中,使得处理器可通过读取并运行相应的程序代码或者代码段来执行相应的操作。
例如,本公开的示例性实施例还可以实现为计算装置,该计算装置包括存储部件和处理器,存储部件中存储有计算机可执行指令集合,当所述计算机可执行指令集合被所述处理器执行时,执行用于管理内存数据的方法或者执行在内存中维护数据的方法。
具体说来,所述计算装置可以部署在服务器或客户端中,也可以部署在分布式网络环境中的节点装置上。此外,所述计算装置可以是PC计算机、平板装置、个人数字助理、智能手机、web应用或其他能够执行上述指令集合的装置。
这里,所述计算装置并非必须是单个的计算装置,还可以是任何能够单独或联合执行上述指令(或指令集)的装置或电路的集合体。计算装置还可以是集成控制系统或系统管理器的一部分,或者可被配置为与本地或远程(例如,经由无线传输)以接口互联的便携式电子装置。
在所述计算装置中,处理器可包括中央处理器(CPU)、图形处理器(GPU)、可编程逻辑装置、专用处理器系统、微控制器或微处理器。作为示例而非限制,处理器还可包括模拟处理器、数字处理器、微处理器、多核处理器、处理器阵列、网络处理器等。
根据本公开示例性实施例的方法中所描述的某些操作可通过软件方式来实现,某些操作可通过硬件方式来实现,此外,还可通过软硬件结合的方式来实现这些操作。
处理器可运行存储在存储部件之一中的指令或代码,其中,所述存储部件还可以存储数据。指令和数据还可经由网络接口装置而通过网络被发送和接收,其中,所述网络接口装置可采用任何已知的传输协议。
存储部件可与处理器集成为一体,例如,将RAM或闪存布置在集成电路微处理器等之内。此外,存储部件可包括独立的装置,诸如,外部盘驱动、存储阵列或任何数据库系统可使用的其他存储装置。存储部件和处理器可在操作上进行耦合,或者可例如通过I/O端口、网络连接等互相通信,使得处理器能够读取存储在存储部件中的文件。
此外,所述计算装置还可包括视频显示器(诸如,液晶显示器)和用户交互接口(诸如,键盘、鼠标、触摸输入装置等)。计算装置的所有组件可经由总线和/或网络而彼此连接。
根据本公开示例性实施例的管理内存数据的方法和/或在内存中维护数据的方法所涉及的操作可被描述为各种互联或耦合的功能块或功能示图。然而,这些功能块或功能示图可被均等地集成为单个的逻辑装置或按照非确切的边界进行操作。
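For example, as described above, according to an exemplary embodiment of the present disclosure, a system includes at least one computing device and at least one storage device storing instructions, where the instructions, when executed by the at least one computing device, cause the at least one computing device to perform the following steps of managing in-memory data: setting up a data table that includes a plurality of shards, where each shard corresponds to a first-level skip list; configuring the first-level skip list to store nodes whose key is a first attribute value of the data and whose value is a pointer or object indicating a second-level skip list; and configuring the second-level skip list to store nodes whose key is a second attribute value of the data and whose value includes at least one attribute value of the data.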
又如,如上所述,根据本公开示例性实施例,一种包括至少一个计算装置和至少一个存储指令的存储装置的系统,其中,所述指令在被所述至少一个计算装置运行时,促使所述至少一个计算装置执行在内存中维护数据的以下步骤:根据待插入的数据的第一属性值来确定包括多个分片的数据表中的与待插入的数据对应的分片,其中,每个分片分别对应第一级跳表,第一级跳表用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;从与确定的分片对应的第一级跳表中查找以待插入的数据的第一属性值为关键字的节点;在从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的至少一个属性值的节点。
关于以上各个方法步骤的细节,已经参照相应的附图进行了描述,这里将不再赘述。
以上描述了本公开的各示例性实施例,应理解,上述描述仅是示例性的,并非穷尽性的,本公开不限于所披露的各示例性实施例。在不偏离本公开的范围和精神的情况下,对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。因此,本公开的保护范围应该以权利要求的范围为准。

Claims (34)

  1. 一种由至少一个计算装置执行的管理内存数据的方法,包括:
    设置包括多个分片的数据表,其中,每个分片分别对应第一级跳表;
    将第一级跳表设置为用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;
    将第二级跳表设置为用于存储以所述数据的第二属性值为关键字且与该关键字对应的值包括所述数据的至少一个属性值的节点。
  2. 一种由至少一个计算装置执行的在内存中维护数据的方法,包括:
    根据待插入的数据的第一属性值来确定包括多个分片的数据表中的与待插入的数据对应的分片,其中,每个分片分别对应第一级跳表,第一级跳表用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;
    从与确定的分片对应的第一级跳表中查找以待插入的数据的第一属性值为关键字的节点;
    在从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的至少一个属性值的节点。
  3. 如权利要求2所述的方法,其中,确定与待插入的数据对应的分片的步骤包括:
    计算与待插入的数据的第一属性值对应的哈希值;
    获得计算出的哈希值除以所述数据表中的分片总数所得的余数;
    将与获得的余数对应的分片确定为与待插入的数据对应的分片。
  4. 如权利要求2所述的方法,其中,所述多个分片中的每个分片中存储有指示对应的第一级跳表的指针或对象。
  5. 如权利要求2所述的方法,还包括:
    在未能从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,创建第二级跳表,在第一级跳表中创建以待插入的数据的第一属性值为关键字且以指示创建的第二级跳表的指针或对象为与该关键字对应的值的节点,并在创建的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的所述至少一个属性值的节点。
  6. 如权利要求2或5所述的方法,待插入的数据的所述至少一个属性值包括待插入的数据的第一属性值和第二属性值之中的至少一个,或者待插入的数据的所述至少一个属性值既不包括待插入的数据的第一属性值也不包括待插入的数据的第二属性值,
    其中,添加到第二级跳表中的节点中的与待插入的数据的第二属性值对应的值包括通过以下方式之一获得的字符串:
    按照预定的字符串合并规则对所述至少一个属性值进行合并,按照预定的JSON格式对所述至少一个属性值进行序列化,按照预定的ProtocolBuffer格式对所述至少一个属性值进行序列化,按照预定的Schema格式对所述至少一个属性值进行序列化。
  7. 如权利要求2所述的方法,还包括:
    接收待查询的数据的第一属性值和关于第二属性值的取值范围;
    根据待查询的数据的第一属性值来确定所述数据表中的与待查询的数据对应的分片;
    从与确定的分片对应的第一级跳表中查找以待查询的数据的第一属性值为关键字的节点;
    从查找到的节点中的指针或对象所指示的第二级跳表中取出关键字在取值范围内的节点中的对应的所述至少一个属性值。
  8. 如权利要求7所述的方法,其中,从查找到的节点中的指针或对象所指示的第二级跳表中取出关键字在取值范围内的节点中的对应的所述至少一个属性值的步骤包括:
    从查找到的节点中的指针或对象所指示的第二级跳表的节点中取出与在取值范围内的关键字对应的值;
    通过以下方式之一来获得待查询的数据的所述至少一个属性值:
    按照预定的字符串拆分规则对取出的值进行拆分,按照预定的JSON格式对取出的值进行反序列化,按照预定的ProtocolBuffer格式对取出的值进行反序列化,按照预定义的Schema格式对取出的值进行反序列化。
  9. 如权利要求7所述的方法,其中,确定与待查询的数据对应的分片的步骤包括:
    计算与待查询的数据的第一属性值对应的哈希值;
    获得计算出的哈希值除以所述数据表中的分片总数所得的余数;
    将与获得的余数对应的分片确定为与待查询的数据对应的分片。
  10. 如权利要求7所述的方法,其中,所述待插入的数据或待查询的数据是时序型数据,所述第二属性值为时间戳值。
  11. 如权利要求10所述的方法,其中,所述取值范围指定时间戳值的起始值和终止值或者指定时间戳值的终止值。
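  12. The method of claim 10, wherein the step of adding a node to the second-level skip list comprises: adding the node according to the time indicated by the timestamp value, so that the nodes in the second-level skip list are arranged in order of time from most recent to least recent.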
  13. 如权利要求12所述的方法,其中,还包括:设置与第二级跳表对应的节点数量阈值,
    其中,从查找到的节点中的指针或对象所指示的第二级跳表中取出关键字在取值范围内的节点中的对应的所述至少一个属性值的步骤包括:
    从查找到的节点中的指针或对象所指示的第二级跳表中,按照从近及远的顺序取出关键字在取值范围内且数量不超过所述节点数量阈值的节点中的对应的所述至少一个属性值。
  14. 如权利要求12所述的方法,其中,还包括:
    设置与第二级跳表对应的节点数量阈值;
    以预定周期遍历第一级跳表和第二级跳表;
    当遍历到的第二级跳表中的节点数量超过节点数量阈值时,根据该第二级跳表中的节点的排列顺序,删除排在与节点数量阈值对应的节点之后的所有节点。
  15. 如权利要求10所述的方法,还包括:
    设置过期期限长度;
    以预定周期遍历第一级跳表和第二级跳表,通过定位时间戳值达到所述过期期限长度的节点来整体删除在该节点之后的节点。
  16. 一种管理内存数据的系统,包括:
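    a data table setting unit, configured to set up a data table including a plurality of shards, wherein each shard corresponds to a first-level skip list;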
    第一级跳表设置单元,用于将第一级跳表设置为用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;
    第二级跳表设置单元,用于将第二级跳表设置为用于存储以所述数据的第二属性值为关键字且与该关键字对应的值包括所述数据的至少一个属性值的节点。
  17. 一种包括至少一个计算装置和至少一个存储指令的存储装置的系统,其中,所述指令在被所述至少一个计算装置运行时,促使所述至少一个计算装置执行在内存中维护数据的以下步骤:
    根据待插入的数据的第一属性值来确定包括多个分片的数据表中的与待插入的数据对应的分片,其中,每个分片分别对应第一级跳表,第一级跳表用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;
    从与确定的分片对应的第一级跳表中查找以待插入的数据的第一属性值为关键字的节点;
    在从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的至少一个属性值的节点。
  18. 如权利要求17所述的系统,其中,确定与待插入的数据对应的分片的步骤包括:计算与待插入的数据的第一属性值对应的哈希值;获得计算出的哈希值除以所述数据表中的分片总数所得的余数;将与获得的余数对应的分片确定为与待插入的数据对应的分片。
  19. 如权利要求17所述的系统,其中,所述多个分片中的每个分片中存储有指示对应的第一级跳表的指针或对象。
  20. 如权利要求17所述的系统,其中,所述指令在被所述至少一个计算装置运行时,将促使所述至少一个计算装置还执行以下步骤:在未能从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,创建第二级跳表,在第一级跳表中创建以待插入的数据的第一属性值为关键字且以指示创建的第二级跳表的指针或对象为与该关键字对应的值的节点,并在创建的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的所述至少一个属性值的节点。
  21. 如权利要求17或20所述的系统,待插入的数据的所述至少一个属性值包括待插入的数据的第一属性值和第二属性值之中的至少一个,或者待插入的数据的所述至少一个属性值既不包括待插入的数据的第一属性值也不包括待插入的数据的第二属性值,
    其中,添加到第二级跳表中的节点中的与待插入的数据的第二属性值对应的值包括通过以下方式之一获得的字符串:
    按照预定的字符串合并规则对所述至少一个属性值进行合并,按照预定的JSON格式对所述至少一个属性值进行序列化,按照预定的ProtocolBuffer格式对所述至少一个属性值进行序列化,按照预定义的Schema格式对所述至少一个属性值进行序列化。
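  22. The system of claim 17, wherein the instructions, when executed by the at least one computing device, further cause the at least one computing device to perform the following steps: receiving a first attribute value of data to be queried and a value range for the second attribute value; determining, according to the first attribute value of the data to be queried, the shard in the data table that corresponds to the data to be queried; looking up, in the first-level skip list corresponding to the determined shard, a node whose key is the first attribute value of the data to be queried; and retrieving, from the second-level skip list indicated by the pointer or object in the node found, the corresponding at least one attribute value in the nodes whose keys fall within the value range.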
  23. 如权利要求22所述的系统,其中,从查找到的节点中的指针或对象所指示的第二级跳表中取出关键字在取值范围内的节点中的对应的所述至少一个属性值的步骤包括:从查找到的节点中的指针或对象所指示的第二级跳表的节点中取出与在取值范围内的关键字对应的值,通过以下方式之一来获得待查询的数据的所述至少一个属性值:
    按照预定的字符串拆分规则对取出的值进行拆分,按照预定的JSON格式对取出的值进行反序列化,按照预定的ProtocolBuffer格式对取出的值进行反序列化,按照预定义的Schema格式对取出的值进行反序列化。
  24. 如权利要求22所述的系统,其中,确定与待查询的数据对应的分片的步骤包括:计算与待查询的数据的第一属性值对应的哈希值,获得计算出的哈希值除以所述数据表中的分片总数所得的余数,将与获得的余数对应的分片确定为与待查询的数据对应的分片。
  25. 如权利要求22所述的系统,其中,所述待插入的数据或待查询的数据是时序型数据,所述第二属性值为时间戳值。
  26. 如权利要求25所述的系统,其中,所述取值范围指定时间戳值的起始值和终止值或者指定时间戳值的终止值。
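  27. The system of claim 25, wherein the step of adding a node to the second-level skip list comprises: adding the node according to the time indicated by the timestamp value, so that the nodes in the second-level skip list are arranged in order of time from most recent to least recent.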
  28. 如权利要求27所述的系统,其中,所述指令在被所述至少一个计算装置运行时,将促使所述至少一个计算装置还执行以下步骤:设置与第二级跳表对应的节点数量阈值,其中,从查找到的节点中的指针或对象所指示的第二级跳表中取出关键字在取值范围内的节点中的对应的所述至少一个属性值的步骤包括:从查找到的节点中的指针或对象所指示的第二级跳表中,按照从近及远的顺序取出关键字在取值范围内且数量不超过所述节点数量阈值的节点中的对应的所述至少一个属性值。
  29. 如权利要求27所述的系统,其中,所述指令在被所述至少一个计算装置运行时,将促使所述至少一个计算装置还执行以下步骤:设置与第二级跳表对应的节点数量阈值,以预定周期遍历第一级跳表和第二级跳表,当遍历到的第二级跳表中的节点数量超过节点数量阈值时,根据该第二级跳表中的节点的排列顺序,删除排在与节点数量阈值对应的节点之后的所有节点。
  30. 如权利要求25所述的系统,其中,所述指令在被所述至少一个计算装置运行时,将促使所述至少一个计算装置还执行以下步骤:设置过期期限长度,以预定周期遍历第一级跳表和第二级跳表,整体删除时间戳值达到所述过期期限长度的节点之后的节点。
  31. 一种存储指令的计算机存储介质,其中,当所述指令被至少一个计算装置运行时,促使所述至少一个计算装置执行如权利要求1所述的管理内存数据的方法。
  32. 一种包括至少一个计算装置和至少一个存储指令的存储装置的系统,其中,所述指令在被所述至少一个计算装置运行时,促使所述至少一个计算装置执行如权利要求1所述的管理内存数据的方法。
  33. 一种存储指令的计算机存储介质,其中,当所述指令被至少一个计算装置运行时,促使所述至少一个计算装置执行如权利要求2至15中的任一项权利要求所述的在内存中维护数据的方法。
  34. 一种在内存中维护数据的系统,包括:
    分片确定单元,用于根据待插入的数据的第一属性值来确定包括多个分片的数据表中的与待插入的数据对应的分片,其中,每个分片分别对应第一级跳表,第一级跳表用于存储以数据的第一属性值为关键字且以指示第二级跳表的指针或对象为与该关键字对应的值的节点;
    查找单元,用于从与确定的分片对应的第一级跳表中查找以待插入的数据的第一属性值为关键字的节点;以及
    数据添加单元,用于在从第一级跳表中查找到以待插入的数据的第一属性值为关键字的节点的情况下,在查找到的节点中的指针或对象所指示的第二级跳表中添加以待插入的数据的第二属性值为关键字且与该关键字对应的值包括待插入的数据的至少一个属性值的节点。
PCT/CN2019/094365 2018-07-06 2019-07-02 管理内存数据及在内存中维护数据的方法和系统 WO2020007288A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810735798.2A CN109086133B (zh) 2018-07-06 2018-07-06 在内存中维护数据的方法和系统
CN201810735798.2 2018-07-06

Publications (1)

Publication Number Publication Date
WO2020007288A1 true WO2020007288A1 (zh) 2020-01-09

Family

ID=64837006

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/094365 WO2020007288A1 (zh) 2018-07-06 2019-07-02 管理内存数据及在内存中维护数据的方法和系统

Country Status (2)

Country Link
CN (2) CN109086133B (zh)
WO (1) WO2020007288A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086133B (zh) * 2018-07-06 2019-08-30 第四范式(北京)技术有限公司 在内存中维护数据的方法和系统
CN109299100B (zh) * 2018-10-12 2019-08-30 第四范式(北京)技术有限公司 管理内存数据及在内存中维护数据的方法和系统
CN111124312B (zh) * 2019-12-23 2023-10-31 第四范式(北京)技术有限公司 数据去重的方法及其装置
CN111176842A (zh) * 2019-12-23 2020-05-19 中国平安财产保险股份有限公司 数据处理方法、装置、电子设备及存储介质
CN111597076B (zh) * 2020-05-12 2024-04-16 第四范式(北京)技术有限公司 操作数据的方法和装置以及管理持久化跳表的方法和装置
CN111913801B (zh) * 2020-07-15 2023-08-29 广州虎牙科技有限公司 数据处理方法和装置、代理服务器、存储系统及存储介质
CN112597152B (zh) * 2020-12-04 2022-08-23 国创移动能源创新中心(江苏)有限公司 基于跳跃表的带特征的时序数据的索引方法、索引装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942289A (zh) * 2014-04-12 2014-07-23 广西师范大学 一种Hadoop上面向范围查询的内存缓存方法
CN106209645A (zh) * 2016-07-29 2016-12-07 北京邮电大学 一种数据包的起始查找节点确定方法及装置
US9665623B1 (en) * 2013-03-15 2017-05-30 EMC IP Holding Company LLC Key-value store utilizing ranged keys in skip list data structure
CN109086133A (zh) * 2018-07-06 2018-12-25 第四范式(北京)技术有限公司 管理内存数据及在内存中维护数据的方法和系统

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030196024A1 (en) * 2002-04-16 2003-10-16 Exanet, Inc. Apparatus and method for a skip-list based cache
US7552306B2 (en) * 2005-11-14 2009-06-23 Kabushiki Kaisha Toshiba System and method for the sub-allocation of shared memory
US9055011B2 (en) * 2010-08-31 2015-06-09 Intel Corporation Methods and apparatus for linked-list circular buffer management
US9361215B2 (en) * 2013-05-31 2016-06-07 Apple Inc. Memory allocation improvements
CN104346362B (zh) * 2013-07-29 2019-03-26 腾讯科技(深圳)有限公司 一种基于属性值查找目标对象的方法和装置
CN104766013A (zh) * 2015-04-10 2015-07-08 北京理工大学 一种基于跳表的跨站脚本攻击防御方法
JP6133960B2 (ja) * 2015-11-12 2017-05-24 株式会社Pfu 映像処理装置、および、映像処理方法
CN105574104B (zh) * 2015-12-11 2019-04-05 上海爱数信息技术股份有限公司 一种基于ObjectStore的LogStructure存储系统及其数据写入方法
CN105701209A (zh) * 2016-01-13 2016-06-22 广西师范大学 一种提高大数据上并行连接性能的负载平衡方法
TWI814707B (zh) * 2016-08-14 2023-09-11 加拿大商Www信託科技公司 有助於金融交易之方法和系統
CN106815326B (zh) * 2016-12-28 2021-03-02 中国民航信息网络股份有限公司 一种检测无主键数据表一致性的系统及方法
CN107609089B (zh) * 2017-09-07 2019-11-19 北京神州绿盟信息安全科技股份有限公司 一种数据处理方法、装置及系统

Also Published As

Publication number Publication date
CN110704194A (zh) 2020-01-17
CN109086133B (zh) 2019-08-30
CN109086133A (zh) 2018-12-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19831268

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19831268

Country of ref document: EP

Kind code of ref document: A1