WO2016070750A1 - Distributed buffering range querying method, device, and system - Google Patents


Info

Publication number
WO2016070750A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
value
query
identifier value
keyword
Prior art date
Application number
PCT/CN2015/093310
Other languages
French (fr)
Chinese (zh)
Inventor
湛滨瑜
于君泽
Original Assignee
阿里巴巴集团控股有限公司
湛滨瑜
于君泽
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司, 湛滨瑜, 于君泽
Publication of WO2016070750A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • The present invention relates to distributed caching, and in particular to a distributed cache range query method, device, and system.
  • Distributed cache is a data cache method in which cached data is stored in a memory hash table in the form of key-value (keyword-cache data) through a distributed cache server cluster.
  • Distributed caching reduces the number of accesses to the database by caching data and objects in memory, increasing data access speed.
  • In existing approaches, range queries over the associated keys are implemented by building an index that supports range queries in a relational database.
  • The index of the relational database is used to find the associated keys that satisfy the range query condition, and the corresponding values are then fetched from the distributed cache directly by key.
  • The purpose of the present application is to provide a distributed cache range query method, apparatus, and system that support range queries while fully decoupled from the database.
  • A distributed cache range query method may include: pre-storing, in a storage area of memory, the identifier values corresponding to the field values, within the keywords used to map cached data, that can be used for range queries; in response to receiving a query request for keywords in a specified range, looking up in the storage area the identifier values corresponding to the endpoint values of the specified range; and determining the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
  • A distributed cache range query device may include: a pre-processing unit, configured to pre-store, in a storage area of memory, the identifier values corresponding to the field values, within the keywords used to map cached data, that can be used for range queries; a query response unit, configured to look up in the storage area, in response to receiving a query request for keywords in a specified range, the identifier values corresponding to the endpoint values of the specified range; and a keyword obtaining unit, configured to determine the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
  • A distributed cache range query system may include a cache server, a query server, and a client. The cache server is configured to store cached data that has a mapping relationship with keywords, to receive a query request sent by the query server for the cached data corresponding to a keyword set, and to return that cached data to the query server.
  • The query server may be configured to pre-store, in a storage area of memory, the identifier values corresponding to the field values, within the keywords used to map the cached data, that can be used for range queries; in response to receiving from the client a query request for keywords in a specified range, to look up in the storage area the identifier values corresponding to the endpoint values of the specified range; to determine the keyword set corresponding to the specified range according to those identifier values; to obtain the cached data corresponding to the keyword set from the cache server; and to return the obtained cached data to the client that sent the query request.
  • The client may be configured to send the query server a query request for the cached data corresponding to keywords in a specified range, and to receive the cached data returned by the query server.
  • The identifier values corresponding to the field values usable for range queries are pre-stored in a storage area of memory.
  • When a query request for keywords in a specified range is received, the lookup of the identifier values corresponding to the endpoint values of the specified range can be completed entirely in memory, so a range query decoupled from the database is achieved without accessing the database.
  • FIG. 1 is a schematic flowchart of a distributed cache range query method according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a distributed cache range query method according to another embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a node ring according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a node ring according to another embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a distributed cache range query apparatus according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a distributed cache range query apparatus according to another embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of a distributed cache range query system according to an embodiment of the present application.
  • embodiments of the present application can be applied to separate query servers that are different from one or more cache servers for storing cached data.
  • For example, a plurality of cache servers store mappings between cached data and keywords composed of hotel IDs and date values.
  • the query server applying the method provided by the embodiment of the present application may receive a query request for a hotel ID in a specified range, and obtain a keyword set corresponding to the query request. For example, a collection of keywords with a hotel ID in the range of 7 to 32 can be queried.
  • the embodiment of the present application provides the following distributed cache range query method and apparatus.
  • FIG. 1 is a schematic flowchart of a distributed cache range query method according to an embodiment of the present application. As shown in FIG. 1, the method can include:
  • the identifier value corresponding to the field value that can be used for the range query is pre-stored in the storage area of the memory.
  • The field values of the same field across all keywords can be deduplicated, so that only the distinct field values of that field are pre-stored in a storage area of memory.
  • the storage structure of the storage area located in the memory is not limited, and may be, for example, a singly linked list, an array, a circular linked list, or the like.
  • The identifier values corresponding to the field values in the keywords that can be used for range queries may be pre-stored in a node ring (that is, a circular linked list) located in memory, where each identifier value is stored in one node. A routing table is established for each node in the node ring, and the routing table records the identifier values of one or more other nodes determined according to a preset index algorithm.
  • The preset index algorithm may be: the routing table records the identifier values in the node ring whose distance from the identifier value of the node that owns the routing table is a power of 2.
  • Step S110 may specifically be: storing the identifier values corresponding to the field values of the same field in the keywords that can be used for range queries in a node ring located in memory, ordered by identifier value, wherein each routing table records the identifier values in the node ring whose distance from the identifier value of the corresponding node is a power of 2.
  • Because the identifier values corresponding to range-queryable field values in the cached data usually differ little from one another, having the routing table record nodes whose identifier values lie at power-of-2 distances from the corresponding node facilitates subsequent lookups and improves search efficiency.
  • the preset indexing algorithm is not limited to the one in the above embodiment, and may be set according to the actual query efficiency. This application does not limit this.
  • The preset index algorithm may alternatively be: the routing table records the identifier values in the node ring whose distance from the identifier value of the node that owns the routing table is an integer multiple of a specified constant, and so on.
  • A keyword can be composed of several different fields, among which the fields usable for range queries can include numeric fields, date fields, and the like.
  • the date value may be extracted from each keyword, and the extracted different date values are correspondingly converted into identifier values that can be calculated according to the preset index algorithm.
  • The identifier values may be stored as a linked list forming a node ring that is connected end to end and sorted by date value.
  • For example, suppose the original keyword is "date-August 07, 2014-hotelId-18873".
  • This keyword reflects the correspondence between the date and the hotel ID.
  • The date "August 07, 2014" can be converted to the identifier value 140807.
  • Sorting by identifier value means sorting by field value either from smallest to largest or from largest to smallest.
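The date-to-identifier conversion described above can be sketched as follows. This is only an illustration: the key layout in `KEY_PATTERN` and the function name `date_identifier` are assumptions modeled on the single example keyword in the text, not a format the application defines.

```python
import re
from datetime import datetime

# Hypothetical key layout, modeled on the example keyword in the text.
KEY_PATTERN = re.compile(r"^date-(?P<date>.+)-hotelId-(?P<hotel>\d+)$")


def date_identifier(keyword: str) -> int:
    """Extract the date field from a keyword and convert it to a YYMMDD integer."""
    match = KEY_PATTERN.match(keyword)
    if match is None:
        raise ValueError(f"unrecognized keyword: {keyword!r}")
    # "%B %d, %Y" parses English month names such as "August 07, 2014".
    parsed = datetime.strptime(match.group("date"), "%B %d, %Y")
    return int(parsed.strftime("%y%m%d"))


# Distinct identifier values, sorted ascending, ready to be placed on the ring.
keywords = [
    "date-August 07, 2014-hotelId-18873",
    "date-August 07, 2014-hotelId-20001",
    "date-August 09, 2014-hotelId-18873",
]
identifiers = sorted({date_identifier(k) for k in keywords})
```

Collecting the identifiers into a set before sorting mirrors the deduplication of field values of the same field.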
  • When the identifier values are stored in a node ring, the lookup may proceed as follows: with any node as the current node, the identifier value closest to the endpoint value of the specified range is searched for in the routing table of the current node.
  • If the found identifier value is the identifier value in the node ring closest to the endpoint value of the specified range, it is used as the identifier value corresponding to that endpoint value.
  • If the found identifier value is not the identifier value in the node ring closest to the endpoint value of the specified range, the node holding the found identifier value becomes the current node, and the process returns to the step of searching the routing table of the current node for the identifier value closest to the endpoint value of the specified range.
  • the specified range may include one or more specified ranges
  • the endpoint value may be an endpoint value used to determine the specified range interval.
  • the specified date range may include: January 1, 2001 to May 1, 2001, and January 1, 2002 to May 1, 2002.
  • the endpoint values may include: 010101 and 010501, 020101 and 020501.
  • The identifier value corresponding to an endpoint value may be the identifier value equal to that endpoint value; if no identifier value equals the endpoint value, it may be the identifier value in the node ring that lies within the specified range and is closest to the endpoint value.
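Since the storage structure is not limited and may be an array, the endpoint mapping can be illustrated with a sorted array and binary search. This is a minimal sketch under that assumption, not the node-ring lookup; the stored identifier values and the name `identifiers_for_range` are made up for illustration. Identifiers are kept as fixed-width strings so that leading zeros such as `010101` survive.

```python
from bisect import bisect_left, bisect_right


def identifiers_for_range(sorted_ids, low, high):
    """Return the stored identifiers that fall inside [low, high].

    The first and last elements of the result are the identifier values
    corresponding to the two endpoint values: equal to the endpoint if
    present, otherwise the closest stored value inside the range.
    """
    start = bisect_left(sorted_ids, low)
    end = bisect_right(sorted_ids, high)
    return sorted_ids[start:end]


# Hypothetical stored identifiers (YYMMDD strings, already deduplicated and sorted).
stored = ["010101", "010315", "010601", "020201", "020501"]

# A request may carry several specified ranges, as in the text's date example.
ranges = [("010101", "010501"), ("020101", "020501")]
results = [identifiers_for_range(stored, lo, hi) for lo, hi in ranges]
```

For the second range the lower endpoint `020101` is not stored, so the in-range value closest to it, `020201`, stands in for it.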
  • a keyword within the specified range may be constructed according to an identifier value corresponding to the endpoint value of the specified range, and a keyword set corresponding to the specified range may be obtained.
  • Keyword construction rules may be set in advance for different types of keywords. According to the type of keyword being queried, the corresponding construction rule is applied with the field value corresponding to each identifier value as the input variable, constructing the corresponding keywords and yielding the keyword set for the query request.
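A keyword construction rule of this kind might look like the sketch below. The rule name `hotel_by_date`, the `RULES` registry, and the key template are hypothetical; the template simply reuses the layout of the example keyword "date-August 07, 2014-hotelId-18873".

```python
from datetime import datetime


def date_field(identifier: int) -> str:
    """Convert a YYMMDD identifier value back into the date field value."""
    parsed = datetime.strptime(f"{identifier:06d}", "%y%m%d")
    return parsed.strftime("%B %d, %Y")


# Hypothetical registry: one construction rule per keyword type.
RULES = {
    "hotel_by_date": lambda ident, hotel_id: (
        f"date-{date_field(ident)}-hotelId-{hotel_id}"
    ),
}


def build_keywords(kind, identifiers, **fixed_fields):
    """Apply the rule for `kind` to every identifier value in the range."""
    rule = RULES[kind]
    return [rule(ident, **fixed_fields) for ident in identifiers]


keyword_set = build_keywords("hotel_by_date", [140807, 140808], hotel_id=18873)
```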
  • The method provided by the embodiments of the present application may be applied to a separate query server distinct from the one or more cache servers that store the cached data. After the keyword set corresponding to the specified range is obtained, the cached data corresponding to the keyword set may be fetched concurrently from the one or more cache servers using multiple threads, and the obtained cached data is returned to the client that issued the query request.
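The concurrent fetch can be sketched with a thread pool. Everything here is a stand-in: `FakeCacheServer` is an in-memory substitute for a real cache server, and `pick_server` uses a simple CRC-based placement that the application does not specify.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


class FakeCacheServer:
    """In-memory stand-in for one cache server (illustrative only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def pick_server(servers, key):
    # Assumed placement rule: a stable hash of the key modulo the server count.
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


def fetch_all(servers, keywords, max_workers=4):
    """Fetch the cached data for every keyword concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        values = pool.map(lambda k: pick_server(servers, k).get(k), keywords)
        return dict(zip(keywords, values))
```

A query server would call `fetch_all` with the keyword set derived from the range and return the resulting mapping to the client.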
  • The lookup of the identifier values corresponding to the endpoint values of the specified range can be completed entirely in memory, so a range query decoupled from the database is achieved without accessing the database.
  • The following describes an embodiment in which the routing table records the identifier values in the node ring whose distance from the identifier value of the corresponding node is a power of 2.
  • the embodiment can include:
  • The identifier values corresponding to the field values of the same field that can be used for range queries are pre-stored in the node ring in memory, ordered by identifier value, wherein each identifier value is stored in one node, a routing table is established for each node in the node ring, and the routing table records the identifier values in the node ring whose distance from the identifier value of the corresponding node is a power of 2.
  • An identifier value at a power-of-2 distance may refer to the identifier value of the node whose distance equals $2^{i-1}$; when no identifier value lies at a distance exactly equal to $2^{i-1}$, it refers to the identifier value of the node whose distance is closest to $2^{i-1}$.
  • The identifier value closest to $2^{i-1}$ may be, among the identifier values at a distance greater than $2^{i-1}$, the one whose distance is closest to $2^{i-1}$, where $i$ is an integer with $1 \le i \le \lceil \log_2 L \rceil$ and $L$ is the largest identifier value of the nodes in the node ring.
  • In the node ring shown in FIG. 3, the numbers 2, 8, 10, 16, and so on marked next to the nodes are the identifier values that identify the nodes.
  • Each node in the node ring shown in FIG. 3 maintains a routing table of m entries, where m is the number of bits in the binary representation of the largest identifier value in the node ring.
  • If L is the largest identifier value in the ring, then m is the base-2 logarithm of L rounded up, that is: $m = \lceil \log_2 L \rceil$
  • Because all nodes need to be distributed on the ring, the value of m here should be 6.
  • The identifier value of the i-th entry of the routing table is equal to: $\mathrm{successor}((n + 2^{i-1}) \bmod 2^m)$, $1 \le i \le m$, where $n$ is the identifier value of the node that owns the routing table.
  • Because the routing table records the identifier values at power-of-2 distances from the identifier value of the corresponding node, the direct successor node of each node is the first entry of its routing table. To facilitate looking up the identifier value corresponding to an endpoint value of the specified range, each node in the node ring also keeps track of its own direct predecessor node. In the embodiments of the present application, because the spacing between the identifier values recorded in the routing table grows exponentially, the routing table records nodes near the corresponding node more densely than nodes far from it, which shapes how the routing table works as an index.
  • If the endpoint of the range is far from the identifier value of the current node, the sparsely recorded remote nodes in the routing table allow a quick jump to a distant node to continue the query. If the endpoint of the range is close to the identifier value of the current node, the densely recorded neighboring nodes allow a more precise jump to a node closer to the endpoint value. The routing tables established for the nodes in the embodiments of the present application therefore support efficient range queries.
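The routing table construction described above can be sketched as follows. The ring membership `[2, 8, 10, 16, 21, 28, 33]` is an assumption pieced together from the identifier values the text mentions for FIG. 3; the function names are illustrative.

```python
import math
from bisect import bisect_left


def successor(ring, value):
    """First identifier >= value on the sorted ring, wrapping to the start."""
    return ring[bisect_left(ring, value) % len(ring)]


def routing_table(ring, node, m):
    """Entry i (1-based) is successor((node + 2**(i-1)) mod 2**m)."""
    return [
        successor(ring, (node + 2 ** (i - 1)) % 2 ** m)
        for i in range(1, m + 1)
    ]


ring = [2, 8, 10, 16, 21, 28, 33]      # assumed membership, sorted ascending
m = math.ceil(math.log2(max(ring)))    # base-2 log of the largest identifier, rounded up

table_2 = routing_table(ring, 2, m)
table_28 = routing_table(ring, 28, m)
```

With these assumed nodes, `m` comes out as 6, matching the value given in the text, and the first entry of each table is the node's direct successor. The early entries cluster on nearby nodes while the later ones reach across the ring, which is the dense-near, sparse-far property the passage relies on.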
  • If the identifier value found is the identifier value in the node ring closest to the endpoint value of the specified range, it is used as the identifier value corresponding to that endpoint value.
  • the specified range may be a range between the first endpoint value and the second endpoint value, wherein the first endpoint value is smaller than the second endpoint value
  • S220-S250 may be
  • the query step can include:
  • any node in the node ring can be used as the current node.
  • If an identifier value equal to the first endpoint value exists among the identifier values recorded in the routing table of the current node, that identifier value is used as the identifier value corresponding to the first endpoint value.
  • Otherwise, it is determined whether the first endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node or direct successor node. If the endpoint value lies between the identifier value of the current node and that of its direct predecessor or direct successor node, no identifier value equal to the endpoint value exists in the node ring, and the identifier value corresponding to the endpoint value can only be chosen from among the current node, its direct predecessor node, and its direct successor node, according to whether the endpoint value is the starting endpoint or the ending endpoint of the range.
  • If the endpoint value is not between the identifier value of the current node and that of its direct predecessor or direct successor node, another node in the node ring may hold an identifier value equal to the endpoint value, so the search jumps from the current node
  • to the node identified by the identifier value, recorded in the current node's routing table, that is closest to the endpoint value, and the determination continues there.
  • If the first endpoint value lies between the identifier value of the current node and that of its direct predecessor node, the identifier value of the current node is used as the identifier value corresponding to the first endpoint value.
  • If the first endpoint value lies between the identifier value of the current node and that of its direct successor node, the identifier value of the direct successor node of the current node is used as the identifier value corresponding to the first endpoint value.
  • Otherwise, the current node is updated to the node identified by the identifier value, recorded in the routing table of the current node, that is closest to the first endpoint value, and the process returns to the step of determining whether an identifier value equal to the first endpoint value exists among the identifier values recorded in the routing table of the current node.
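  • If an identifier value equal to the second endpoint value exists among the identifier values recorded in the routing table of the current node, that identifier value is used as the identifier value corresponding to the second endpoint value.
  • If the second endpoint value lies between the identifier value of the current node and that of its direct predecessor node, the identifier value of the direct predecessor node of the current node is used as the identifier value corresponding to the second endpoint value.
  • If the second endpoint value lies between the identifier value of the current node and that of its direct successor node, the identifier value of the current node is used as the identifier value corresponding to the second endpoint value.
  • Otherwise, the current node is updated to the node holding the identifier value, recorded in the routing table of the current node, that is closest to the second endpoint value, and the process returns to the step of determining whether an identifier value equal to the second endpoint value exists among the identifier values recorded in the routing table of the current node.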
  • The query steps for the identifier value corresponding to the first endpoint value and for the identifier value corresponding to the second endpoint value may be performed concurrently or one after the other; the embodiments of the present application place no restriction on the order of the query steps for different endpoint values.
  • The query steps above are illustrated using the node ring shown in FIG. 3 and a query request for cached information whose values lie in the range 7 to 32. The numeric example is only for ease of understanding.
  • If the field value available for the range query is non-numeric, the non-numeric field value can be converted into a numeric identifier value. For example, starting from node 2, endpoint value 7 lies between node 2 and its direct successor node 8, so it is determined that there is no node with an identifier value of 7 in the node ring. Therefore, the identifier value corresponding to endpoint value 7 is 8.
  • For endpoint value 32, the query jumps to node 28, the node recorded in the routing table that is closest to 32, and the routing table of node 28 is then queried.
  • From there the routing table of node 30 is queried. Endpoint value 32 lies between node 30 and its direct successor node 33, so it is determined that there is no node with an identifier value of 32 in the node ring. Therefore, the identifier value corresponding to 32 is 30.
  • The identifier values in the specified range 7 to 32 are thus found to be: 8, 10, 16, 21, 28, 30.
  • This completes the query, using the routing tables of the ring nodes, for the identifier values in the specified range 7 to 32.
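The endpoint lookup and the 7 to 32 example can be reproduced with the sketch below. Several things are assumptions rather than statements from the text: the ring membership (the identifier values named in the example, including 30), the helper names, and in particular the reading of "closest to the endpoint value" as the routing-table entry that most closely precedes the endpoint going clockwise, which is chosen here because it guarantees that every jump makes progress.

```python
from bisect import bisect_left

M_BITS = 6                                   # m from the text's example
SPACE = 2 ** M_BITS
RING = [2, 8, 10, 16, 21, 28, 30, 33]        # assumed ring membership


def successor(value):
    return RING[bisect_left(RING, value % SPACE) % len(RING)]


def fingers(node):
    """Routing table of `node`: successor((node + 2**(i-1)) mod 2**m)."""
    return [successor(node + 2 ** (i - 1)) for i in range(1, M_BITS + 1)]


def predecessor(node):
    return RING[RING.index(node) - 1]


def between(x, a, b):
    """True if x lies strictly between a and b going clockwise from a."""
    return 0 < (x - a) % SPACE < (b - a) % SPACE


def locate(endpoint, start, is_lower):
    """Identifier value corresponding to a lower or upper range endpoint."""
    current = start
    while True:
        table = fingers(current)
        if endpoint == current or endpoint in table:
            return endpoint
        pred, succ = predecessor(current), table[0]
        if between(endpoint, pred, current):
            return current if is_lower else pred
        if between(endpoint, current, succ):
            return succ if is_lower else current
        # Assumption: jump to the entry that most closely precedes the endpoint.
        current = min(table, key=lambda f: (endpoint - f) % SPACE)


def identifiers_in_range(low, high, start=RING[0]):
    first = locate(low, start, True)
    last = locate(high, start, False)
    result, node = [first], first
    while node != last:
        node = fingers(node)[0]              # first entry = direct successor
        result.append(node)
    return result
```

Starting from node 16 instead of node 2, the upper-endpoint search passes through node 28 and then node 30, which agrees with the hops named in the example. The sketch does not handle ranges that contain no stored identifier or that wrap past the largest identifier.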
  • S260: Determine, according to the identifier values corresponding to the endpoint values of the specified range, the keyword set corresponding to the specified range.
  • The node ring can be read directly from memory, and the routing tables of the nodes in the node ring serve as the index for range queries, which removes the dependency on the database and gives fast reads. In addition, because the routing table records the identifier values in the node ring at power-of-2 distances from the identifier value of the corresponding node, each jump made while searching by routing table for the identifier value corresponding to an endpoint value of the specified range always lands on the node closest to the endpoint value.
  • cached keywords may be added or removed at any time.
  • the embodiment of the present application may further include:
  • The node that stores the identifier value corresponding to the field value is taken as the node to be deleted. The direct predecessor node of the node to be deleted is set as the direct predecessor node of the direct successor node of the node to be deleted, and the node to be deleted is removed from the node ring.
  • Because the routing table of each node should record the field values whose distance from the field value of the corresponding node is a power of 2, the routing tables affected by the joining of a new node or by the deletion of the node to be deleted need to be updated.
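The pointer update for deletion can be sketched on a circular doubly linked ring. `RingNode`, `build_ring`, and `delete_node` are illustrative names; only the predecessor and direct-successor links are modeled, and the rest of each routing table is left out.

```python
class RingNode:
    """One ring node; only predecessor/successor links are modeled here."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.predecessor = self
        self.successor = self   # corresponds to the first routing-table entry


def build_ring(identifiers):
    """Link the distinct identifiers into a sorted, end-to-end connected ring."""
    nodes = [RingNode(i) for i in sorted(set(identifiers))]
    for idx, node in enumerate(nodes):
        node.predecessor = nodes[idx - 1]
        node.successor = nodes[(idx + 1) % len(nodes)]
    return {node.identifier: node for node in nodes}


def delete_node(nodes, identifier):
    """Unlink a node: its successor inherits its predecessor, and vice versa."""
    doomed = nodes.pop(identifier)
    doomed.successor.predecessor = doomed.predecessor
    doomed.predecessor.successor = doomed.successor
    return doomed
```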
  • The following is an example of how to update the routing table. For example, if the field value of the i-th entry of the routing table above equals $\mathrm{successor}((n + 2^{i-1}) \bmod 2^m)$, where $n$ is the identifier value of the node and $1 \le i \le m$, then when a node P newly joins the node ring, the following steps can be used to update the routing tables affected by P:
  • The information recorded in the routing tables of the predecessor nodes of node P is updated recursively until a predecessor node reached by the recursion cannot satisfy both update conditions at the same time, at which point the recursion terminates.
  • The two conditions are as follows. Condition 1: the distance between the identifier values of the recursive predecessor node S and node P is greater than or equal to $2^{i-1}$.
  • Otherwise, the i-th entry of node S's routing table necessarily lies after node P, so that entry does not need to be updated.
  • Condition 2: given that Condition 1 is satisfied, the current i-th entry of node S's routing table must lie after node P. If the i-th entry of node S's routing table lies before node P, then node P comes after that entry, and the entry does not need to be updated.
  • the routing table information of the predecessor node of the newly joined node can be recursively updated in the opposite direction of the preset order of the ring node.
  • The update of the routing tables of the nodes affected by a deletion is the same as the update performed when a new node joins, and is not repeated here. Inserting or deleting a node does not affect the routing tables of the successor nodes of the current node; it affects only the predecessor nodes of the current node. This is why each node maintains its direct predecessor node in addition to its routing table information.
  • If node S needs to update its routing table information, the direct predecessor node of S may also need to update its routing table information. Conversely, if node S does not need to update its routing table information, the predecessor nodes of S do not need to either, and the recursive update of routing table information ends.
  • Suppose the newly joined node is node 30. The routing table information of the predecessor nodes of node 30 is updated recursively, counterclockwise along the node ring, from predecessor node 28 through to node 16.
  • The first and second entries of node 28's routing table are updated from 33 to 30. Because node 16 does not satisfy both conditions for updating the routing table, its routing table information does not change, and the recursive update ends.
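The effect of node 30 joining can be checked by brute force: rebuild every routing table before and after the join and compare. This is deliberately not the recursive update procedure described above, only a way to see which tables that procedure has to touch. The ring membership is an assumption based on the identifier values named in the text.

```python
from bisect import bisect_left, insort

M_BITS = 6
SPACE = 2 ** M_BITS


def routing_table(ring, node):
    """Entry i is successor((node + 2**(i-1)) mod 2**m) on the sorted ring."""
    def successor(value):
        return ring[bisect_left(ring, value % SPACE) % len(ring)]
    return [successor(node + 2 ** (i - 1)) for i in range(1, M_BITS + 1)]


def changed_tables_after_join(ring, new_node):
    """Map each existing node whose table changes to its (before, after) tables."""
    before = {n: routing_table(ring, n) for n in ring}
    updated = list(ring)
    insort(updated, new_node)
    after = {n: routing_table(updated, n) for n in ring}
    return {n: (before[n], after[n]) for n in ring if before[n] != after[n]}


ring = [2, 8, 10, 16, 21, 28, 33]            # assumed membership before the join
changed = changed_tables_after_join(ring, 30)
```

Under these assumptions only predecessors of node 30 are affected: node 28's first two entries move from 33 to 30, node 16's table is unchanged, and no successor of node 30 changes, consistent with the recursion stopping at node 16.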
  • a distributed cache range query device is also provided.
  • FIG. 5 is a schematic structural diagram of a distributed cache range query apparatus according to an embodiment of the present application.
  • the apparatus may include:
  • The pre-processing unit 510 can be configured to pre-store, in a storage area of memory, the identifier values corresponding to the field values, within the keywords used to map the cached data, that can be used for range queries; the query response unit 520 can be configured to look up in the storage area, in response to receiving a query request for keywords in a specified range, the identifier values corresponding to the endpoint values of the specified range; the keyword obtaining unit 530 can be configured to determine the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
  • The pre-processing unit 510 may be configured to pre-store, in a node ring located in memory, the identifier values corresponding to the field values in the keywords that are available for range queries, wherein each identifier value is stored in one node, a routing table is established for each node in the node ring, and the routing table records the identifier values of one or more other nodes determined according to the preset index algorithm.
  • The query response unit 520 may include a lookup subunit 521, which may be configured to, in response to a query request for keywords in a specified range, take any node in the node ring as the current node and search the routing table of the current node for the identifier value closest to the endpoint value of the specified range.
  • the first determining sub-unit 522 may be configured to: if it is determined that the found identifier value is the identifier value closest to the endpoint value of the specified range in the node ring, use the found identifier value as the endpoint value of the specified range. The corresponding identification value.
  • The second determining subunit 523 may be configured to: if it is determined that the found identifier value is not the identifier value in the node ring closest to the endpoint value of the specified range, take the node holding the found identifier value as the current node and trigger the lookup subunit to search the routing table of the current node for the identifier value closest to the endpoint value of the specified range.
  • the pre-processing unit 510 may be configured to store, in the keyword, the identifier values corresponding to the field values of the same field that are available for the range query, in the order of the identifier value, in the node ring located in the memory.
  • The routing table records the field values in the node ring whose distance from the field value of the corresponding node is a power of 2.
  • The case in which the routing table records the field values in the node ring whose distance from the field value of the corresponding node is a power of 2 is described in detail below.
  • the identifier values are sorted in ascending order, the specified range being a range between the first endpoint value and the second endpoint value, wherein the first endpoint value is less than the second endpoint value.
  • the search sub-unit 521 in the embodiment of the present application may include:
  • the departure sub-unit 5210 may be configured to respond to the query request for the specified range of keywords, using any one of the node rings as the current node.
  • the first endpoint determining sub-unit 5211 is configured to determine whether an identifier value equal to the first endpoint value exists in the identifier value of the routing table record of the current node.
  • The first endpoint determining subunit 5212 may be configured to: if the first endpoint determining subunit 5211 determines that such an identifier value exists, use the identifier value equal to the first endpoint value as the identifier value in the node ring closest to the first endpoint value.
  • The first endpoint continuation subunit 5213 may be configured to: if the first endpoint determining subunit 5211 determines that no such identifier value exists, determine whether the first endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node or direct successor node.
  • The first endpoint continuation determining subunit 5214 may be configured to: if the first endpoint continuation subunit 5213 determines that the first endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node, use the identifier value of the current node as the identifier value in the node ring closest to the first endpoint value; and if the first endpoint continuation subunit 5213 determines that the first endpoint value lies between the identifier value of the current node and the identifier value of its direct successor node, use the identifier value of the direct successor node of the current node as the identifier value in the node ring closest to the first endpoint value.
  • The second determining subunit 523 may be configured to: if the first endpoint continuation subunit 5213 determines that the first endpoint value lies neither between the identifier value of the current node and the identifier value of its direct successor node nor between the identifier value of the current node and the identifier value of its direct predecessor node, update the current node to the node holding the identifier value, recorded in the routing table of the current node, that is closest to the first endpoint value, and re-trigger the first endpoint determining subunit 5211.
  • the search sub-unit 521 in the embodiment of the present application, as shown in FIG. 6, may further include:
  • The second endpoint judging subunit 5215 may be configured to determine whether the identifier values recorded in the routing table of the current node include an identifier value equal to the second endpoint value.
  • The second endpoint determining subunit 5216 may be configured to: if the second endpoint judging subunit 5215 determines that such an identifier value exists, use the identifier value equal to the second endpoint value as the identifier value closest to the second endpoint value in the node ring.
  • The second endpoint continuation subunit 5217 may be configured to: if the second endpoint judging subunit 5215 determines that no such identifier value exists, determine whether the second endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node or direct successor node.
  • The second endpoint continued determining subunit 5218 may be configured to: if the second endpoint continuation subunit 5217 determines that the second endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node, use the identifier value of the direct predecessor node of the current node as the identifier value closest to the second endpoint value in the node ring; if the second endpoint continuation subunit 5217 determines that the second endpoint value lies between the identifier value of the current node and the identifier value of its direct successor node, use the identifier value of the current node as the identifier value closest to the second endpoint value in the node ring.
  • The second determining subunit 523 may be configured to: if the second endpoint continuation subunit 5217 determines that the second endpoint value lies neither between the identifier value of the current node and that of its direct successor node nor between the identifier value of the current node and that of its direct predecessor node, update the current node to the node holding the identifier value that, among those recorded in the routing table of the current node, is closest to the second endpoint value, and re-trigger the second endpoint judging subunit 5215.
  • the apparatus provided in this embodiment of the present application may further include:
  • The node joining unit 540 may be configured to: for a keyword newly added to the cache, determine whether the identifier value corresponding to the range-queryable field value of the new keyword already exists in the node ring; if not, store the identifier value in a new node, find in the node ring the node N that can serve as the direct predecessor node of the new node, update the direct predecessor node of node N's direct successor node to be the new node, set node N as the direct predecessor node of the new node, and establish a corresponding routing table for the new node;
  • The node deleting unit 550 may be configured to: for a keyword deleted from the cache, if the range-queryable field value of the deleted keyword does not occur in any other keyword, take the node storing the identifier value corresponding to that field value as the node to be deleted, update the direct predecessor node of the to-be-deleted node's direct successor node to be the direct predecessor node of the to-be-deleted node, and delete the to-be-deleted node from the node ring;
  • The routing update unit 560 may be configured to update, according to the preset index algorithm, the routing tables affected by the joining of the new node or by the deletion of the to-be-deleted node.
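The pointer updates performed by the node joining unit and the node deleting unit can be sketched in Python as below. This is only an illustration under stated assumptions: the names `RingNode`, `join`, `delete` and `walk` are hypothetical, routing tables are left out, and the explicit `succ` pointer stands in for the first routing-table entry, which the routing update unit would maintain.

```python
class RingNode:
    """One node of the ring, holding an identifier and its two neighbours."""

    def __init__(self, ident: int) -> None:
        self.ident = ident
        self.pred: "RingNode" = self
        self.succ: "RingNode" = self


def join(head: RingNode, ident: int) -> RingNode:
    """Insert `ident` into the sorted ring containing `head`, unless already present."""
    n = head
    while True:
        if n.ident == ident:
            return n  # identifier already on the ring: nothing to add
        nxt = n.succ
        wraps = nxt.ident <= n.ident  # n holds the largest identifier on the ring
        fits_gap = n.ident < ident < nxt.ident
        fits_wrap = wraps and (ident > n.ident or ident < nxt.ident)
        if fits_gap or fits_wrap:
            break  # n is node N, the direct predecessor of the new node
        n = nxt
    new = RingNode(ident)
    new.pred, new.succ = n, n.succ
    n.succ.pred = new  # predecessor of N's old successor becomes the new node
    n.succ = new
    return new


def delete(node: RingNode) -> None:
    """Unlink `node` from the ring by reconnecting its neighbours."""
    node.succ.pred = node.pred
    node.pred.succ = node.succ


def walk(head: RingNode) -> list:
    """List identifiers clockwise, starting from `head`."""
    out, n = [head.ident], head.succ
    while n is not head:
        out.append(n.ident)
        n = n.succ
    return out


head = RingNode(8)
for ident in (2, 10, 16, 10):  # the repeated 10 is ignored
    join(head, ident)
print(walk(head))              # [8, 10, 16, 2]
delete(head.succ)              # remove node 10
print(walk(head))              # [8, 16, 2]
```

After either operation, the routing tables of the affected nodes would still need to be recomputed with the preset index algorithm.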
  • The key-value information of each node may be stored in multiple cache servers of the distributed cache system.
  • the apparatus provided in this embodiment of the present application may be configured in a separate query server different from multiple cache servers for storing cached data.
  • The device may further include a data feedback unit 570, configured to: after the keyword obtaining unit obtains the keyword set corresponding to the specified range, obtain the cached data corresponding to the keyword set from the multiple cache servers concurrently, through a single multi-threaded download, and return the obtained cached data to the client that issued the request.
  • The query response unit 520 can read the node ring directly from memory and use the routing tables of the nodes in the ring as an index for range queries, which removes the dependence on the database and makes reads fast. When searching the routing tables for the identifier value corresponding to an endpoint value of the specified range, the search always jumps to the routing table of the node whose identifier value is closest to the endpoint value, so that the keyword obtaining unit 530 can finally find the keyword set corresponding to the specified range. The query thus proceeds much like a binary (halving) search, which achieves efficient range queries.
  • In FIG. 6, the search subunit 521, the first determining subunit 522, the second determining subunit 523, the starting subunit 5210, the first endpoint judging subunit 5211, the first endpoint determining subunit 5212, the first endpoint continuation subunit 5213, the first endpoint continued determining subunit 5214, the second endpoint judging subunit 5215, the second endpoint determining subunit 5216, the second endpoint continuation subunit 5217, the second endpoint continued determining subunit 5218, the node joining unit 540, the node deleting unit 550, the routing update unit 560, and the data feedback unit 570 are drawn with dashed lines to indicate that these units or subunits are not necessary units of the apparatus provided by the embodiments of the present application.
  • The embodiments of the present application further provide a distributed cache range query system.
  • FIG. 7 is a schematic structural diagram of a distributed cache range query system according to an embodiment of the present application.
  • the system can include:
  • the cache server 710 may be configured to store the cached data that has a mapping relationship with the keyword, receive the query request sent by the query server 720 for the cached data corresponding to the keyword set, and feed back the cached data corresponding to the keyword set to the query server 720;
  • the query server 720 may be configured to: pre-store, in a storage area of memory, the identifier values corresponding to the range-queryable field values in the keywords used to map cached data; in response to receiving, from the client, a query request for the cached data corresponding to keywords in a specified range, find from the storage area the identifier values corresponding to the endpoint values of the specified range; determine the keyword set corresponding to the specified range according to those identifier values; obtain the cached data corresponding to the keyword set from the cache server; and feed the obtained cached data back to the client that sent the query request;
  • the client 730 may be configured to send, to the query server, a query request for cached data corresponding to a specified range of keywords; and receive cached data fed back by the query server.
  • There may be one or more cache servers 710. The different node rings and routing tables established by the embodiments of the present application for range queries may be saved in the separate query server 720.
  • the query server 720 can determine the node ring that needs to be read according to the query request, and use any node in the node ring as the current node to perform a subsequent query step.
  • The query server 720 may obtain the cached data corresponding to the keyword set from the one or more cache servers concurrently, through multi-threaded download, and return the obtained cached data to the client that sent the request.
  • The present invention can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes over the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium such as a ROM/RAM, a magnetic disk or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention or in certain parts of the embodiments.
  • the invention is applicable to a wide variety of general purpose or special purpose computing system environments or configurations.
  • the invention may be described in the general context of computer-executable instructions executed by a computer, such as a program module.
  • program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are connected through a communication network.
  • program modules can be located in both local and remote computer storage media including storage devices.

Abstract

Disclosed is a distributed buffering range querying method. The method comprises: prestoring, in a storage area of a memory, an identifier value, corresponding to a field value that can be used for range querying, in keywords used for mapping buffered data; in response to a received query request aimed at a keyword within a specified range, finding, from the storage area, an identifier value corresponding to an end value of the specified range; and determining, according to the identifier value corresponding to the end value of the specified range, a keyword set corresponding to the specified range, to implement range querying decoupled from a database. Also disclosed are a distributed buffering range querying device and system.

Description

Distributed cache range query method, device and system
Technical field
The present invention relates to distributed caching, and in particular to a distributed cache range query method, device and system.
Background
Distributed caching is a data caching approach in which a cluster of distributed cache servers stores cached data in in-memory hash tables in key-value (keyword to cached data) form. By caching data and objects in memory, distributed caching reduces the number of database accesses and increases data access speed.
Currently, range queries on a distributed cache are supported mainly by building an index that supports range queries in a relational database and using it to run range queries over the associated keys. When the server receives a range-condition query request, it uses the relational database index to find the associated keys that fall within the specified range, and then looks up the corresponding values in the distributed cache directly by key.
However, because range queries on a distributed cache currently have to go through a database index, they remain strongly dependent on the database, and query performance is relatively poor.
Summary of the invention
In view of this, the purpose of the present application is to provide a distributed cache range query method, device and system that implement range queries while fully decoupled from the database.
In a first aspect of the embodiments of the present application, a distributed cache range query method is provided. For example, the method may include: pre-storing, in a storage area of memory, the identifier values corresponding to the field values that can be used for range queries in the keywords used to map cached data; in response to receiving a query request for keywords in a specified range, finding from the storage area the identifier values corresponding to the endpoint values of the specified range; and determining the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
In a second aspect of the embodiments of the present application, a distributed cache range query device is provided. For example, the device may include: a preprocessing unit, configured to pre-store, in a storage area of memory, the identifier values corresponding to the range-queryable field values in the keywords used to map cached data; a query response unit, configured to find from the storage area, in response to receiving a query request for keywords in a specified range, the identifier values corresponding to the endpoint values of the specified range; and a keyword obtaining unit, configured to determine the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
In a third aspect of the embodiments of the present application, a distributed cache range query system is provided. For example, the system may include: a cache server, which may be configured to store cached data that has a mapping relationship with keywords, receive query requests sent by the query server for the cached data corresponding to a keyword set, and feed the cached data corresponding to the keyword set back to the query server; a query server, which may be configured to pre-store, in a storage area of memory, the identifier values corresponding to the range-queryable field values in the keywords used to map cached data, to find from the storage area, in response to receiving from the client a query request for the cached data corresponding to keywords in a specified range, the identifier values corresponding to the endpoint values of the specified range, to determine the keyword set corresponding to the specified range according to those identifier values, to obtain the cached data corresponding to the keyword set from the cache server, and to feed the obtained cached data back to the client that sent the query request; and a client, which may be configured to send the query server a query request for the cached data corresponding to keywords in a specified range and to receive the cached data fed back by the query server.
It can be seen that the present application has the following beneficial effects:
Because the embodiments of the present application pre-store, in a storage area of memory, the identifier values corresponding to the range-queryable field values in the keywords used to map cached data, the lookup of the identifier values corresponding to the endpoint values of the specified range, performed after a query request for keywords in a specified range is received, can be completed entirely in memory without accessing the database. This implements range queries decoupled from the database.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some of the embodiments recorded in the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a distributed cache range query method according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a distributed cache range query method according to another embodiment of the present application;
FIG. 3 is a schematic diagram of a node ring according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a node ring according to another embodiment of the present application;
FIG. 5 is a schematic structural diagram of a distributed cache range query device according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a distributed cache range query device according to another embodiment of the present application;
FIG. 7 is a schematic structural diagram of a distributed cache range query system according to an embodiment of the present application.
Detailed description
To help those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort shall fall within the protection scope of the present invention.
To make the embodiments of the present application easier to understand, a possible application scenario is illustrated first. For example, the embodiments may be applied in a separate query server, distinct from the one or more cache servers that store the cached data. The cache servers store mappings from keywords, each composed of a hotel ID and a date value, to cached data. A query server applying the method provided by the embodiments may receive a query request for hotel IDs in a specified range and obtain the keyword set corresponding to the request. For example, it can find the set of keywords whose hotel ID is in the range 7 to 32.
Based on the above analysis, the embodiments of the present application provide the following distributed cache range query method and device.
For example, FIG. 1 is a schematic flowchart of the distributed cache range query method provided by an embodiment of the present application. As shown in FIG. 1, the method may include:
S110: Pre-store, in a storage area of memory, the identifier values corresponding to the range-queryable field values in the keywords used to map cached data.
For example, to reduce the number of nodes in the node ring and improve query efficiency, duplicate values among the field values that belong to the same field across all keywords may be removed, so that only the distinct field values of that field are pre-stored in one storage area of memory.
The storage structure of the in-memory storage area is not limited; it may be, for example, a singly linked list, an array, or a circular linked list. In some possible implementations, the identifier values corresponding to the range-queryable field values in the keywords may be pre-stored in a node ring (that is, a circular linked list) located in memory, where each identifier value is stored in one node and a corresponding routing table is established for each node in the node ring. The routing table records the identifier values of one or more other nodes, determined according to a preset index algorithm.
In some possible implementations, the preset index algorithm may be: each routing table records the identifier values in the node ring whose distance from the identifier value of the routing table's node is a power of 2. Accordingly, step S110 may specifically be: storing the identifier values corresponding to the range-queryable field values of the same field in the keywords, in order of identifier value, in a node ring located in memory, where the routing table records the identifier values in the node ring whose distance from the identifier value of the corresponding node is a power of 2. Because the identifier values corresponding to range-queryable field values in cached data are usually close to one another, recording in the routing table the nodes whose identifier values lie at power-of-2 distances from a node's identifier value facilitates subsequent lookups and improves lookup efficiency.
Of course, the preset index algorithm is not limited to the one in the implementation above; it may be set according to actual query-efficiency requirements, and the present application does not limit it. For example, the preset index algorithm may instead be: each routing table records the identifier values in the node ring whose distance from the identifier value of the routing table's node is an integer multiple of a specified constant, and so on.
It can be understood that a key (keyword) may be composed of various different fields, among which the fields usable for range queries may include, for example, numeric and date fields. For example, if a range query over dates is needed, the date values may be extracted from the keywords and the distinct extracted date values converted into identifier values that the preset index algorithm can operate on. Combined with the node-ring implementation above, the identifier values may then be stored as a linked list forming a ring, joined end to end and sorted by date value. For example, given the original keyword "date-2014年08月07日-hotelId-18873" (that is, August 7, 2014), which expresses a correspondence between a date and a hotel ID, the date "2014年08月07日" can be converted into the identifier value 140807. Sorting by identifier value may mean sorting the field values in ascending order, or in descending order.
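The date-to-identifier conversion described above can be sketched in Python as follows. This is a minimal illustration, assuming that keys follow the layout of the example key and that the identifier is the two-digit year, month and day concatenated; the function name `date_identifier` and the extra sample keys are hypothetical.

```python
import re

# Assumed key layout, taken from the example key in the text.
KEY_PATTERN = re.compile(r"date-(\d{4})年(\d{2})月(\d{2})日-hotelId-(\d+)")


def date_identifier(key: str) -> int:
    """Convert the date field of a key into an integer identifier value (YYMMDD)."""
    match = KEY_PATTERN.fullmatch(key)
    if match is None:
        raise ValueError(f"unrecognized key layout: {key!r}")
    year, month, day, _hotel_id = match.groups()
    return int(year[2:] + month + day)


keys = [
    "date-2014年08月07日-hotelId-18873",
    "date-2014年08月07日-hotelId-20001",  # same date, hypothetical hotel ID
    "date-2014年08月09日-hotelId-18873",  # hypothetical second date
]
# Duplicate dates collapse to one identifier; sorting gives the ring order.
identifiers = sorted({date_identifier(k) for k in keys})
print(identifiers)  # [140807, 140809]
```

Collapsing duplicates before building the ring corresponds to the deduplication step described for S110.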
S120: In response to receiving a query request for keywords in a specified range, find from the storage area the identifier values corresponding to the endpoint values of the specified range.
For example, in an embodiment where the identifier values are stored in a node ring, in response to receiving a query request for keywords in a specified range, any node in the node ring may be taken as the current node, and the routing table of the current node searched for the identifier value closest to an endpoint value of the specified range. If the found identifier value is determined to be the identifier value in the node ring closest to that endpoint value, it is taken as the identifier value corresponding to the endpoint value. If the found identifier value is determined not to be the closest one in the node ring, the node holding the found identifier value becomes the current node, and the process returns to the step of searching the current node's routing table for the identifier value closest to the endpoint value.
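The hop-by-hop search above can be sketched in Python for the lower endpoint of a range. This is a sketch under several assumptions rather than the method as claimed: the ring contents beyond 2, 8, 10 and 16 are hypothetical, the names `Node`, `build_ring`, `between` and `find_lower_endpoint` are invented for illustration, and "closest" is measured clockwise around the ring so that every hop is guaranteed to make progress.

```python
from bisect import bisect_left
from dataclasses import dataclass
from math import ceil, log2
from typing import Dict, List


@dataclass
class Node:
    ident: int
    pred: int         # identifier of the direct predecessor
    succ: int         # identifier of the direct successor
    table: List[int]  # routing table: identifiers at power-of-2 distances


def build_ring(idents: List[int]) -> Dict[int, Node]:
    """Build a node ring with routing tables (layout assumed for this sketch)."""
    ring = sorted(set(idents))
    m = ceil(log2(max(ring)))

    def successor(k: int) -> int:
        # First identifier >= k, wrapping around to the smallest one.
        return ring[bisect_left(ring, k) % len(ring)]

    nodes = {}
    for pos, n in enumerate(ring):
        table = [successor((n + 2 ** (i - 1)) % 2 ** m) for i in range(1, m + 1)]
        nodes[n] = Node(n, ring[pos - 1], ring[(pos + 1) % len(ring)], table)
    return nodes


def between(a: int, x: int, b: int) -> bool:
    """True if x lies strictly between a and b, walking clockwise from a to b."""
    return a < x < b if a < b else (x > a or x < b)


def find_lower_endpoint(nodes: Dict[int, Node], start: int, endpoint: int) -> int:
    """Hop through routing tables to the first identifier >= endpoint."""
    modulus = 2 ** len(nodes[start].table)
    current = start
    while True:
        node = nodes[current]
        if endpoint == current or endpoint in node.table:
            return endpoint
        if between(node.pred, endpoint, current):
            return current
        if between(current, endpoint, node.succ):
            return node.succ
        # Otherwise jump to the table entry closest before the endpoint, clockwise.
        current = min(node.table, key=lambda v: (endpoint - v) % modulus)


nodes = build_ring([2, 8, 10, 16, 24, 33, 40, 50])  # partly hypothetical ring
print(find_lower_endpoint(nodes, start=2, endpoint=7))    # 8
print(find_lower_endpoint(nodes, start=50, endpoint=16))  # 16
print(find_lower_endpoint(nodes, start=2, endpoint=45))   # 50
```

The upper endpoint would be handled symmetrically, returning the predecessor side of the gap instead of the successor side. Endpoints larger than every identifier wrap around to the smallest one here, so a caller would need to check that the result actually lies inside the requested range.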
It can be understood that the specified range may comprise one or more ranges, and the endpoint values are the values that delimit each range interval. For example, the specified date ranges may include January 1, 2001 to May 1, 2001, and January 1, 2002 to May 1, 2002. The endpoint values may then include 010101 and 010501, and 020101 and 020501.
It should be noted that the identifier value corresponding to an endpoint value may be the identifier value equal to the endpoint value or, when no identifier value equal to the endpoint value exists, the identifier value in the node ring that lies within the specified range and is closest to the endpoint value.
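The rule above can be illustrated without the node ring, using a sorted list and binary search from Python's standard `bisect` module. This sketch only shows which stored identifier each endpoint resolves to, not the routing-table lookup itself; the function name `resolve_endpoints` and the sample identifiers are assumptions.

```python
from bisect import bisect_left, bisect_right
from typing import Optional, Sequence, Tuple


def resolve_endpoints(
    identifiers: Sequence[int], low: int, high: int
) -> Optional[Tuple[int, int]]:
    """Map the endpoints of [low, high] to stored identifier values.

    `identifiers` must be sorted in ascending order. Returns None when no
    stored identifier falls inside the range.
    """
    start = bisect_left(identifiers, low)       # first identifier >= low
    stop = bisect_right(identifiers, high) - 1  # last identifier <= high
    if start > stop:
        return None
    return identifiers[start], identifiers[stop]


stored = [2, 8, 10, 16, 24, 33]  # hypothetical identifier values
print(resolve_endpoints(stored, 7, 32))   # endpoints absent: (8, 24)
print(resolve_endpoints(stored, 8, 16))   # endpoints present: (8, 16)
print(resolve_endpoints(stored, 11, 15))  # nothing in range: None
```

In the first call neither 7 nor 32 is stored, so each endpoint resolves to the nearest stored identifier inside the range.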
S130: Determine the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
For example, the keywords within the specified range may be constructed from the identifier values corresponding to the endpoint values of the specified range, yielding the keyword set corresponding to the specified range. Specifically, keyword construction rules may be set in advance for different types of keywords; according to the type of keyword being queried, the corresponding construction rule is applied with the field values corresponding to the identifier values as input variables, producing the corresponding keywords and hence the keyword set for the query request. Suppose keywords expressing a correspondence between dates and hotel IDs need to be constructed: the date field values corresponding to the found identifier values can each be spliced with the different hotel IDs preset by the keyword construction rule, yielding complete keywords. Of course, there may be other ways to determine keywords from identifier values, which those skilled in the art can choose according to actual implementation needs and which are not detailed here.
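The splicing of date values with preset hotel IDs can be sketched in Python as follows. It assumes the in-range identifier values have already been collected, that all dates fall in the 2000s, and that keys use the layout of the earlier example; `format_date`, `build_keys` and the sample values are hypothetical.

```python
from typing import Iterable, List


def format_date(identifier: int) -> str:
    """Turn a YYMMDD identifier back into the date field used in keys.

    Assumes all dates fall in the 2000s, as in the examples in the text.
    """
    text = f"{identifier:06d}"
    return f"20{text[0:2]}年{text[2:4]}月{text[4:6]}日"


def build_keys(identifiers: Iterable[int], hotel_ids: Iterable[int]) -> List[str]:
    """Splice each in-range date with each preset hotel ID to form full keys."""
    hotel_ids = list(hotel_ids)
    return [
        f"date-{format_date(identifier)}-hotelId-{hotel_id}"
        for identifier in identifiers
        for hotel_id in hotel_ids
    ]


in_range = [140807, 140808]  # hypothetical identifiers found by the range query
hotels = [18873, 20001]      # hypothetical preset hotel IDs
for key in build_keys(in_range, hotels):
    print(key)
```

Each of the two dates is paired with each of the two hotel IDs, so the sketch prints four keys.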
For example, in some possible implementations, the method provided by the embodiments of the present application may be applied in a separate query server, distinct from the one or more cache servers that store the cached data. After the keyword set corresponding to the specified range is obtained, the cached data corresponding to the keyword set may further be fetched from the one or more cache servers concurrently, in a single multi-threaded download, and the fetched data returned to the client that issued the query request.
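A concurrent fetch of this kind might look like the Python sketch below, which uses `ThreadPoolExecutor` from the standard library. The cache servers are stand-in dictionaries rather than real network clients, and `CACHE_SERVERS`, `fetch`, `fetch_all` and the cached values are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional

# Stand-ins for remote cache servers: plain dicts instead of network clients.
CACHE_SERVERS: List[Dict[str, str]] = [
    {"date-2014年08月07日-hotelId-18873": "rooms=12"},
    {"date-2014年08月08日-hotelId-18873": "rooms=9"},
]


def fetch(key: str) -> Optional[str]:
    """Look a key up on each stand-in server; a real client would do network I/O."""
    for server in CACHE_SERVERS:
        if key in server:
            return server[key]
    return None


def fetch_all(keys: List[str], workers: int = 4) -> Dict[str, Optional[str]]:
    """Fetch the values for all keys concurrently, keeping them paired with their keys."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        values = list(pool.map(fetch, keys))  # map preserves input order
    return dict(zip(keys, values))


wanted = [
    "date-2014年08月07日-hotelId-18873",
    "date-2014年08月08日-hotelId-18873",
    "date-2014年08月09日-hotelId-18873",  # not cached on any stand-in server
]
print(fetch_all(wanted))
```

Threads suit this step because fetching from cache servers is I/O-bound; keys missing from every server simply map to `None`.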
It can be seen that, with the method provided by the embodiments of the present application, after a query request for keywords in a specified range is received, the lookup of the identifier values corresponding to the endpoint values of the specified range can be completed entirely in memory without accessing the database, which implements range queries decoupled from the database.
The embodiment in which the routing table records the identifier values in the node ring at power-of-2 distances from the identifier value of the corresponding node is described in detail below. For example, this embodiment may include:
S210: Pre-store the identifier values corresponding to the range-queryable field values of the same field in the keywords, in order of identifier value, in a node ring located in memory, where each identifier value is stored in one node and a corresponding routing table is established for each node in the node ring. The routing table records the identifier values in the node ring whose distance from the identifier value of the corresponding node is a power of 2.
Here, the identifier values at power-of-2 distances may refer to the identifier values of nodes whose distance equals $2^{i-1}$ and, when no identifier value at a distance of exactly $2^{i-1}$ exists, the identifier value of the node whose distance is closest to $2^{i-1}$. For example, in some possible implementations, in order to keep the entries suitably spread out and improve query efficiency, the identifier value at the distance closest to $2^{i-1}$ may be chosen from among the identifier values at a distance greater than $2^{i-1}$. Here $i$ is an integer that is at least 1 and at most the base-2 logarithm of the largest identifier value in the node ring, rounded up.
For example, in the node ring shown in FIG. 3, the numbers 2, 8, 10, 16 and so on marked beside the nodes are the identifier values used to identify the nodes. Each node in the node ring shown in FIG. 3 maintains a routing table with m entries. If the identifier values are expressed in binary, m is the number of bits of the largest binary identifier value in the node ring. If L denotes the largest identifier value of any node in the ring, then m is the base-2 logarithm of L, rounded up, that is:
$m = \lceil \log_2 L \rceil$
As shown in FIG. 3, in order to distribute all nodes on the ring, the value of m should be 6. In the m-entry routing table maintained by each node, the identifier value recorded in the i-th entry equals:
$\mathrm{successor}\big((\text{identifier value of the node} + 2^{i-1}) \bmod 2^{m}\big), \quad 1 \le i \le m$.
Because the routing table records the identifier values whose distance from the identifier value of the corresponding node is a power of 2, the direct successor node of each node is the first entry of its routing table. To make it easier to look up the identifier values corresponding to the endpoint values of the specified range, each node in the node ring also maintains its own direct predecessor node. In the embodiments of the present application, the spacing of the identifier values recorded in the routing table grows exponentially, so the routing table records nodes near the corresponding node more densely than distant nodes. Consequently, when the routing table is used as an index to look up the identifier values corresponding to the specified range, if the endpoint value of the specified range is far from the identifier value of the current node, the sparse distant entries allow a quick jump to a node farther away; if the endpoint value is close to the identifier value of the current node, the dense nearby entries allow a more accurate jump to a node closer to the target identifier value. The routing tables established for the nodes in the embodiments of the present application therefore support efficient range queries.
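The routing-table rule above can be illustrated with a minimal Python sketch. The function names `successor` and `build_routing_tables` are hypothetical, and the ring contents (2, 8, 10, 16, 21, 28, 30, 33) are an assumption inferred from the identifier values mentioned in the worked example, not a confirmed copy of FIG. 3. The sketch also assumes every identifier value is smaller than $2^m$.

```python
import math
from bisect import bisect_left

def successor(ids, value):
    """First identifier >= value on the sorted ring, wrapping to the smallest one."""
    return ids[bisect_left(ids, value) % len(ids)]

def build_routing_tables(ids):
    """Map each identifier to its m-entry routing table."""
    ids = sorted(ids)
    m = math.ceil(math.log2(max(ids)))      # m = ceil(log2 L)
    return {
        n: [successor(ids, (n + 2 ** (i - 1)) % 2 ** m) for i in range(1, m + 1)]
        for n in ids
    }

# Assumed ring: identifiers inferred from the worked example, not taken from FIG. 3 itself
ring = [2, 8, 10, 16, 21, 28, 30, 33]
tables = build_routing_tables(ring)
print(tables[8])    # [10, 10, 16, 16, 28, 2]
```

Under these assumptions the first entry of every table is the direct successor of the node, and the later entries reach exponentially farther around the ring.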
S220. In response to receiving a query request for keywords in a specified range, take any node in the node ring as the current node.
S230. Look up, in the routing table of the current node, the identifier value closest to the endpoint value of the specified range.
S240. If it is determined that the identifier value found is the identifier value in the node ring closest to the endpoint value of the specified range, take the identifier value found as the identifier value corresponding to the endpoint value of the specified range.
S250. If it is determined that the identifier value found is not the identifier value in the node ring closest to the endpoint value of the specified range, update the current node to the node holding the identifier value found, and return to the step in S230 of looking up, in the routing table of the current node, the identifier value closest to the endpoint value of the specified range.
Possible implementations of steps S220 to S250 are described in detail below for the case in which the nodes are arranged in the node ring in ascending order of identifier value. For example, in this implementation the specified range may be the range from a first endpoint value to a second endpoint value, where the first endpoint value is smaller than the second endpoint value, and the possible query steps of S220 to S250 may include:
When the query request is received, any node in the node ring may be taken as the current node.
Determine whether the identifier values recorded in the routing table of the current node include an identifier value equal to the first endpoint value.
If so, take the identifier value equal to the first endpoint value as the identifier value corresponding to the first endpoint value.
If not, determine whether the first endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node or direct successor node. It can be understood that if the endpoint value lies between the identifier value of the current node and that of its direct predecessor or direct successor, the node ring contains no identifier value equal to the endpoint value, and the identifier value corresponding to the endpoint value can only be chosen from the current node, its direct predecessor node or its direct successor node; which one is chosen depends on whether the endpoint value is the start or the end of the range. If the endpoint value does not lie between the identifier value of the current node and that of its direct predecessor or direct successor, an identifier value equal to the endpoint value may exist elsewhere in the node ring, so the query can jump to the node identified by the identifier value, recorded in the routing table of the current node, that is closest to the endpoint value, and continue the determination there.
If the first endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node, take the identifier value of the current node as the identifier value corresponding to the first endpoint value.
If the first endpoint value lies between the identifier value of the current node and the identifier value of its direct successor node, take the identifier value of the direct successor node of the current node as the identifier value corresponding to the first endpoint value.
If the first endpoint value lies neither between the identifier value of the current node and that of its direct successor node nor between the identifier value of the current node and that of its direct predecessor node, update the current node to the node identified by the identifier value, recorded in the routing table of the current node, that is closest to the first endpoint value, and return to the step of determining whether the identifier values recorded in the routing table of the current node include an identifier value equal to the first endpoint value.
Determine whether the identifier values recorded in the routing table of the current node include an identifier value equal to the second endpoint value.
If so, take the identifier value equal to the second endpoint value as the identifier value corresponding to the second endpoint value.
If not, determine whether the second endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node or direct successor node.
If the second endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node, take the identifier value of the direct predecessor node of the current node as the identifier value corresponding to the second endpoint value.
If the second endpoint value lies between the identifier value of the current node and the identifier value of its direct successor node, take the identifier value of the current node as the identifier value corresponding to the second endpoint value.
If the second endpoint value lies neither between the identifier value of the current node and that of its direct successor node nor between the identifier value of the current node and that of its direct predecessor node, update the current node to the node holding the identifier value, recorded in the routing table of the current node, that is closest to the second endpoint value, and return to the step of determining whether the identifier values recorded in the routing table of the current node include an identifier value equal to the second endpoint value.
It should be noted that the query steps for the identifier value corresponding to the first endpoint value and those for the identifier value corresponding to the second endpoint value may be executed concurrently or one after the other; the embodiments of the present application place no restriction on the order in which the query steps for the different endpoint values are executed.
The above query steps are illustrated below with reference to the node ring shown in FIG. 3, taking as an example a query request for the cached information whose identifier values lie in the range 7 to 32. Numbers are used here only for ease of understanding; if the field values available for range queries are of a non-numeric type, they can be converted into numeric identifier values. For example, the query may first start from node 2. Since the endpoint value 7 lies between node 2 and node 8, the direct successor of node 2, it is determined that the node ring contains no node with the identifier value 7, so the identifier value corresponding to the endpoint value 7 is 8. Then, starting from node 8, the node in the routing table of node 8 that is closest to 32 is node 28, so the query jumps to the routing table of node 28. In the routing table of node 28 the node closest to 32 is node 30, so the query jumps to the routing table of node 30. Since 32 lies between node 30 and its direct successor node 33, it is determined that the node ring contains no node with the identifier value 32, so the identifier value corresponding to 32 is 30. Thus, for the node ring shown in FIG. 3, the identifier values found within the specified range 7 to 32 are 8, 10, 16, 21, 28 and 30. This completes the lookup of the identifier values in the specified range 7 to 32 using the routing tables of the ring nodes.
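The endpoint lookup and the worked example can be sketched in Python as follows. Several points are assumptions rather than statements from the text: the ring contents are inferred from the example, "closest" is interpreted as the routing-table entry nearest to the endpoint value without passing it in ring order (which reproduces the hops 8, 28, 30 described above), endpoint values are assumed to lie between the smallest and largest identifier values, and the names `build_ring`, `resolve_endpoint` and `ids_in_range` are hypothetical.

```python
import math
from bisect import bisect_left

def build_ring(ids):
    """Return (m, nodes); nodes[n] holds the neighbours and routing table of identifier n."""
    ids = sorted(ids)
    m = math.ceil(math.log2(max(ids)))
    succ_of = lambda v: ids[bisect_left(ids, v) % len(ids)]
    nodes = {}
    for k, n in enumerate(ids):
        nodes[n] = {
            "pred": ids[k - 1],
            "succ": ids[(k + 1) % len(ids)],
            "table": [succ_of((n + 2 ** (i - 1)) % 2 ** m) for i in range(1, m + 1)],
        }
    return m, nodes

def resolve_endpoint(nodes, m, start, value, is_lower):
    """Hop through routing tables until value is matched or bracketed by neighbours."""
    cw = lambda a, b: (b - a) % 2 ** m        # clockwise distance from a to b
    cur, hops = start, [start]
    while True:
        node = nodes[cur]
        if value == cur or value in node["table"]:
            return value, hops
        if 0 < cw(node["pred"], value) < cw(node["pred"], cur):
            # value lies between the direct predecessor and the current node
            return (cur if is_lower else node["pred"]), hops
        if 0 < cw(cur, value) < cw(cur, node["succ"]):
            # value lies between the current node and its direct successor
            return (node["succ"] if is_lower else cur), hops
        # assumed reading of "closest": the farthest entry that does not pass value
        cur = max((e for e in node["table"] if cw(cur, e) <= cw(cur, value)),
                  key=lambda e: cw(cur, e))
        hops.append(cur)

def ids_in_range(nodes, first, last):
    """Follow direct-successor links from first to last, inclusive."""
    if first > last:                          # no identifier falls inside the range
        return []
    out, cur = [first], first
    while cur != last:
        cur = nodes[cur]["succ"]
        out.append(cur)
    return out

m, nodes = build_ring([2, 8, 10, 16, 21, 28, 30, 33])   # assumed ring
low, _ = resolve_endpoint(nodes, m, 2, 7, is_lower=True)
high, hops = resolve_endpoint(nodes, m, 8, 32, is_lower=False)
print(low, high, hops)                  # 8 30 [8, 28, 30]
print(ids_in_range(nodes, low, high))   # [8, 10, 16, 21, 28, 30]
```

Each hop strictly shortens the clockwise distance to the endpoint value, so the loop terminates; the final walk along successor links collects the identifier values whose keywords form the result set.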
S260. Determine the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
As can be seen, with this embodiment the node ring can be read directly from memory and the routing tables of the nodes in the ring can be used as an index for range queries, which removes the dependence on the database and gives fast reads. Moreover, the routing table records the identifier values in the node ring whose distance from the identifier value of the corresponding node is a power of 2. Therefore, while the identifier values corresponding to the endpoint values of the specified range are being looked up, the query always jumps to the routing table of the node identified by the value closest to the endpoint value, until the keyword set corresponding to the specified range is found. The query process thus resembles a binary search, which achieves an efficient range query. For example, with one hundred million nodes, L = 100,000,000, each node needs to maintain $\lceil \log_2 L \rceil = 27$ routing table entries, and looking up any node takes at most $\log L = 8$ hops, so query performance is very high.
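The two figures quoted above can be checked with a few lines of Python. The text does not state the base of the logarithm used for the hop count; the value 8 is what a base-10 logarithm of L gives, so the sketch uses base 10 as an assumption.

```python
import math

L = 100_000_000                        # largest identifier value in the example
m = math.ceil(math.log2(L))            # routing-table entries per node
hops = round(math.log10(L))            # assumed base 10, which yields the quoted 8
print(m, hops)                         # 27 8
```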
In addition, in a distributed cache system, cached keywords may be added or deleted at any time. To keep the node ring and the routing tables consistent with the keywords in the cache, the node ring and the routing tables need to be updated whenever a keyword is added or deleted. Specifically, the embodiments of the present application may further include:
For a keyword newly added to the cache, determine whether the identifier value corresponding to the field value of the new keyword that can be used for range queries already exists in the node ring. If not, store the identifier value in a new node, find in the node ring the node N that can serve as the direct predecessor node of the new node, set the direct predecessor node of node N's direct successor node to the new node, set node N as the direct predecessor node of the new node, and establish a corresponding routing table for the new node.
For a keyword deleted from the cache, if the field value of the deleted keyword that can be used for range queries does not exist in any other keyword, take the node storing the identifier value corresponding to that field value as the node to be deleted, set the direct predecessor node of the direct successor node of the node to be deleted to the direct predecessor node of the node to be deleted, and delete the node to be deleted from the node ring.
And, according to a preset indexing algorithm, update the routing tables that need updating because they are affected by the addition of the new node or by the deletion of the node to be deleted. For example, based on the rule that the routing table of each node records the field values whose distance from the field value of the corresponding node is a power of 2, the routing tables affected by the addition of the new node or by the deletion of the node to be deleted may be updated.
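The pointer updates for adding and deleting a node can be sketched as below. This is a simplified illustration under stated assumptions: the `Node` class, the explicit `succ` pointer (in the text the direct successor is the first routing-table entry), the `refs` counter that tracks how many cached keywords share a field value, and the linear walk used to find node N are all hypothetical. Routing-table updates are left out of this sketch.

```python
class Node:
    def __init__(self, ident):
        self.ident = ident
        self.pred = self      # direct predecessor
        self.succ = self      # direct successor (kept explicit here for simplicity)
        self.refs = 1         # hypothetical count of keywords sharing this field value

def ring_nodes(start):
    """Yield every node once, following direct-successor links."""
    n = start
    while True:
        yield n
        n = n.succ
        if n is start:
            return

def join(start, ident):
    """Add an identifier value; reuse the node if the value is already on the ring."""
    for n in ring_nodes(start):
        if n.ident == ident:
            n.refs += 1
            return n
    new = Node(ident)
    # N: largest identifier below ident, or the largest overall when ident is the new minimum
    smaller = [n for n in ring_nodes(start) if n.ident < ident]
    N = max(smaller or ring_nodes(start), key=lambda n: n.ident)
    new.pred, new.succ = N, N.succ
    N.succ.pred = new         # successor of N now points back to the new node
    N.succ = new
    return new

def remove(node):
    """Drop one keyword reference; unlink the node when no keyword uses the value."""
    node.refs -= 1
    if node.refs > 0:
        return False
    node.succ.pred = node.pred
    node.pred.succ = node.succ
    return True

start = Node(2)
for ident in [8, 10, 16, 21, 28, 33]:
    join(start, ident)
join(start, 30)
print([n.ident for n in ring_nodes(start)])   # [2, 8, 10, 16, 21, 28, 30, 33]
```

A production version would presumably locate node N through the routing tables rather than by walking the whole ring; the linear walk only keeps the sketch short.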
How the routing tables are updated is illustrated below. For example, in the implementation described above in which the field value of the i-th entry of the routing table equals $\mathrm{successor}\big((\text{identifier value of the node} + 2^{i-1}) \bmod 2^{m}\big)$, $1 \le i \le m$, if a node P is newly added to the node ring, the routing tables that need updating because of the addition of P can be updated through the following steps:
According to the rule that the field value of the i-th entry equals $\mathrm{successor}\big((\text{identifier value of the node} + 2^{i-1}) \bmod 2^{m}\big)$, the information recorded in the routing tables of the predecessor nodes of node P is updated recursively, and the recursion terminates when the predecessor node reached no longer satisfies both update conditions at the same time. The two conditions are as follows. Condition 1: the distance between the identifier values of the predecessor node S reached by the recursion and node P is greater than or equal to $2^{i-1}$. If the distance between the identifier values of node S and node P is less than $2^{i-1}$, the i-th entry of node S's routing table necessarily lies after node P, so the i-th entry does not need to be updated. Condition 2: provided condition 1 is satisfied, the current i-th entry of node S's routing table must lie after node P. If the current i-th entry of node S's routing table lies before node P, then node P lies after the current i-th entry, and the current entry does not need to be updated.
Based on these two conditions, the routing table information of the predecessor nodes of the newly added node can be updated recursively and accurately in the direction opposite to the preset ordering of the ring nodes. Updating routing tables after a node is deleted follows the same principle as updating them after a node is added, so it is not described again here. Inserting or deleting a node does not affect the routing tables of the successor nodes of the node concerned; it affects only its predecessor nodes. For this reason each node needs to maintain its direct predecessor node in addition to its routing table information. During the recursive update, if node S needs to update the i-th entry of its routing table, the direct predecessor node of node S may also need to update its routing table information; conversely, if node S does not need to update its routing table information, the predecessor nodes of S do not need to update theirs either. The recursive update of the routing table information then ends.
For example, as shown in FIG. 4, the newly added node is node 30. The routing table information of the predecessor nodes of node 30 is updated recursively in the counterclockwise direction along the ring, from predecessor node 28 through to node 16. As shown in FIG. 4, the first and second entries of node 28's routing table are updated from 33 to 30. When the update reaches node 16, the two conditions for updating the routing table are not both satisfied and the routing table information of node 16 does not change, so the recursive update ends.
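The two update conditions can be sketched in Python as follows. The ring contents, the fixed table size `M = 6`, and the function names are assumptions for illustration. The sketch applies the two conditions to every predecessor once around the ring instead of stopping at the first unchanged node; that is a deliberate simplification so that the patched tables can be compared against a full rebuild, not a description of the recursion in the text.

```python
from bisect import bisect_left, insort

M = 6                      # assumed number of routing-table entries (m in the text)
SIZE = 2 ** M

def successor(ids, value):
    """First identifier >= value on the sorted ring, wrapping to the smallest one."""
    return ids[bisect_left(ids, value) % len(ids)]

def build_tables(ids):
    return {n: [successor(ids, (n + 2 ** (i - 1)) % SIZE) for i in range(1, M + 1)]
            for n in ids}

def dist(a, b):
    """Clockwise distance from a to b; a full lap when b == a."""
    return (b - a) % SIZE or SIZE

def insert_and_update(ids, tables, p):
    """Add identifier p, then patch predecessors' tables using the two conditions."""
    insort(ids, p)
    tables[p] = [successor(ids, (p + 2 ** (i - 1)) % SIZE) for i in range(1, M + 1)]
    updated = []
    k = ids.index(p)
    for step in range(1, len(ids)):                 # walk predecessors counterclockwise
        s = ids[(k - step) % len(ids)]
        for i in range(1, M + 1):
            cond1 = dist(s, p) >= 2 ** (i - 1)               # distance to P is at least 2^(i-1)
            cond2 = dist(s, tables[s][i - 1]) > dist(s, p)   # current entry lies after P
            if cond1 and cond2:
                tables[s][i - 1] = p
                updated.append((s, i))
    return updated

ring = [2, 8, 10, 16, 21, 28, 33]      # assumed ring before node 30 joins
tables = build_tables(ring)
changed = insert_and_update(ring, tables, 30)
print(changed)                          # [(28, 1), (28, 2), (21, 4)]
```

With this assumed ring, entries 1 and 2 of node 28 change from 33 to 30, as in the FIG. 4 example, and node 16 is left unchanged.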
Corresponding to the distributed cache range query method provided by the embodiments of the present application, the embodiments of the present application further provide a distributed cache range query apparatus.
For example, FIG. 5 is a schematic structural diagram of the distributed cache range query apparatus provided by an embodiment of the present application. As shown in FIG. 5, the apparatus may include:
A preprocessing unit 510, which may be configured to store in advance, in a storage area in memory, the identifier values corresponding to the field values that can be used for range queries among the keywords used for mapping cached data; a query response unit 520, which may be configured to look up from the storage area, in response to receiving a query request for keywords in a specified range, the identifier values corresponding to the endpoint values of the specified range; and a keyword obtaining unit 530, which may be configured to determine the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
In some possible implementations, the preprocessing unit 510 may be configured to store in advance, in a node ring located in memory, the identifier values corresponding to the field values of the keywords that can be used for range queries, where each identifier value is stored in one node and a corresponding routing table is established for each node in the node ring, the routing table recording the identifier values of one or more other nodes determined according to a preset indexing algorithm. Correspondingly, referring to FIG. 6, the query response unit 520 may include: a lookup subunit 521, which may be configured to, in response to receiving a query request for keywords in a specified range, take any node in the node ring as the current node and look up, in the routing table of the current node, the identifier value closest to the endpoint value of the specified range; a first determining subunit 522, which may be configured to, if it is determined that the identifier value found is the identifier value in the node ring closest to the endpoint value of the specified range, take the identifier value found as the identifier value corresponding to the endpoint value of the specified range; and a second determining subunit 523, which may be configured to, if it is determined that the identifier value found is not the identifier value in the node ring closest to the endpoint value of the specified range, take the node holding the identifier value found as the current node and trigger the lookup subunit to look up, in the routing table of the current node, the identifier value closest to the endpoint value of the specified range.
In combination with the above implementation, the preprocessing unit 510 may be configured to store, in a node ring located in memory and in order of identifier value, the identifier values corresponding to the field values of the keywords that can be used for range queries and belong to the same field, where the routing table records the field values in the node ring whose distance from the field value of the corresponding node is a power of 2.
The implementation in which the routing table records the field values in the node ring whose distance from the field value of the corresponding node is a power of 2 is described in detail below. Assume that the nodes in the node ring are arranged in ascending order of identifier value and that the specified range is the range from a first endpoint value to a second endpoint value, where the first endpoint value is smaller than the second endpoint value. For the lookup of the identifier value corresponding to the first endpoint value, the lookup subunit 521 of the embodiments of the present application, as shown in FIG. 6, may include:
A starting subunit 5210, which may be configured to, in response to receiving a query request for keywords in a specified range, take any node in the node ring as the current node.
A first endpoint judging subunit 5211, which may be configured to determine whether the identifier values recorded in the routing table of the current node include an identifier value equal to the first endpoint value.
A first endpoint determining subunit 5212, which may be configured to, if the first endpoint judging subunit 5211 determines that such a value exists, take the identifier value equal to the endpoint value as the identifier value in the node ring closest to the first endpoint value.
A first endpoint further-judging subunit 5213, which may be configured to, if the first endpoint judging subunit 5211 determines that no such value exists, determine whether the first endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node or direct successor node.
A first endpoint further-determining subunit 5214, which may be configured to: if the first endpoint further-judging subunit 5213 determines that the first endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node, take the identifier value of the current node as the identifier value in the node ring closest to the first endpoint value; and if the first endpoint further-judging subunit 5213 determines that the first endpoint value lies between the identifier value of the current node and the identifier value of its direct successor node, take the identifier value of the direct successor node of the current node as the identifier value in the node ring closest to the first endpoint value.
The second determining subunit 523 may be configured to, if the first endpoint further-judging subunit 5213 determines that the first endpoint value lies neither between the identifier value of the current node and that of its direct successor node nor between the identifier value of the current node and that of its direct predecessor node, update the current node to the node holding the identifier value, recorded in the routing table of the current node, that is closest to the first endpoint value, and trigger the first endpoint judging subunit 5211 again.
For the lookup of the identifier value corresponding to the second endpoint value, the lookup subunit 521 of the embodiments of the present application, as shown in FIG. 6, may further include:
A second endpoint judging subunit 5215, which may be configured to determine whether the identifier values recorded in the routing table of the current node include an identifier value equal to the second endpoint value.
A second endpoint determining subunit 5216, which may be configured to, if the second endpoint judging subunit 5215 determines that such a value exists, take the identifier value equal to the endpoint value as the identifier value in the node ring closest to the second endpoint value.
A second endpoint further-judging subunit 5217, which may be configured to, if the second endpoint judging subunit 5215 determines that no such value exists, determine whether the second endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node or direct successor node.
A second endpoint further-determining subunit 5218, which may be configured to: if the second endpoint further-judging subunit 5217 determines that the second endpoint value lies between the identifier value of the current node and the identifier value of its direct predecessor node, take the identifier value of the direct predecessor node of the current node as the identifier value in the node ring closest to the second endpoint value; and if the second endpoint further-judging subunit 5217 determines that the second endpoint value lies between the identifier value of the current node and the identifier value of its direct successor node, take the identifier value of the current node as the identifier value in the node ring closest to the second endpoint value.
The second determining subunit 523 may be configured to, if the second endpoint further-judging subunit 5217 determines that the second endpoint value lies neither between the identifier value of the current node and that of its direct successor node nor between the identifier value of the current node and that of its direct predecessor node, update the current node to the node holding the identifier value, recorded in the routing table of the current node, that is closest to the second endpoint value, and trigger the second endpoint judging subunit 5215 again.
The following describes how nodes are added to or deleted from the node ring in the embodiments of the present application. For example, referring to FIG. 6, the apparatus provided in the embodiments of the present application may further include:
A node joining unit 540, which may be configured to: for a keyword newly added to the cache, judge whether the identifier value corresponding to the field value usable for range queries in the newly added keyword already exists in the node ring; if not, store the identifier value in a new node, find in the node ring a node N that can serve as the direct predecessor node of the new node, update the direct predecessor node of the direct successor node of the node N to be the new node, set the node N as the direct predecessor node of the new node, and establish a corresponding routing table for the new node;
A node deleting unit 550, which may be configured to: for a keyword deleted from the cache, if the field value usable for range queries in the fields of the deleted keyword does not exist in any other keyword, take the node storing the identifier value corresponding to that field value as the node to be deleted, update the direct predecessor node of the direct successor node of the node to be deleted to be the direct predecessor node of the node to be deleted, and delete the node to be deleted from the node ring;
And a route updating unit 560, which may be configured to update, according to the preset index algorithm, the routing tables that need updating because they are affected by the joining of the new node or by the deletion of the node to be deleted.
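As a hedged illustration of the joining and deleting operations just described, the following Python sketch models the node ring as a doubly linked ring. The class names `Node` and `NodeRing`, the per-node `keys` set used to decide when a field value is no longer referenced, and the linear scan used to find node N are assumptions made for brevity. Rebuilding the affected routing tables is omitted, and the predecessor's successor pointer is also updated, which the text implies but does not spell out.

```python
class Node:
    def __init__(self, value):
        self.value = value      # identifier value stored in this node
        self.prev = self        # direct predecessor node
        self.next = self        # direct successor node
        self.keys = set()       # cache keys whose field value maps to this node


class NodeRing:
    def __init__(self):
        self.nodes = {}         # identifier value -> Node
        self.head = None        # node holding the smallest identifier value

    def add_key(self, key, value):
        node = self.nodes.get(value)
        if node is None:                         # identifier value not yet in the ring
            node = Node(value)
            self.nodes[value] = node
            if self.head is None:
                self.head = node
            else:
                if value > self.head.value:
                    n = self.head                # scan for node N, the new node's predecessor
                    while n.next is not self.head and n.next.value < value:
                        n = n.next
                else:
                    n = self.head.prev           # new smallest value: N is the largest node
                    self.head = node
                succ = n.next
                succ.prev = node                 # successor of N now points back to the new node
                node.next = succ
                node.prev = n                    # N becomes the new node's direct predecessor
                n.next = node
        node.keys.add(key)

    def remove_key(self, key, value):
        node = self.nodes.get(value)
        if node is None:
            return
        node.keys.discard(key)
        if node.keys:                            # field value still used by another key
            return
        del self.nodes[value]
        if node.next is node:                    # last node in the ring
            self.head = None
            return
        node.next.prev = node.prev               # successor's predecessor := deleted node's predecessor
        node.prev.next = node.next
        if self.head is node:
            self.head = node.next

    def values(self):
        out, n = [], self.head
        while n is not None:
            out.append(n.value)
            n = n.next
            if n is self.head:
                break
        return out
```

A short usage example: after `add_key("k1", 20)`, `add_key("k2", 10)` and `add_key("k3", 30)`, `values()` walks the ring in ascending order of identifier value.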
Some possible application scenarios of the embodiments of the present invention are described schematically below.
For example, in some possible implementations, the key-value information of each node may be stored on multiple cache servers of the distributed cache system according to a consistent hashing rule. To improve query performance, the apparatus provided in the embodiments of the present application may be deployed on a separate query server that is distinct from the multiple cache servers used to store the cached data. Correspondingly, the apparatus may further include a data feedback unit 570, which may be configured to: after the keyword obtaining unit obtains the keyword set corresponding to the specified range, obtain the cached data corresponding to the keyword set from the multiple cache servers through a single multi-threaded concurrent download, and return the obtained cached data to the client that issued the request.
It can be seen that, with the apparatus provided in the embodiments of the present application, the query response unit 520 can read the node ring directly from memory and perform the range query using the routing tables of the nodes in the node ring as an index, which removes the dependence on the database and gives fast reads. Moreover, when looking up, according to the routing tables, the identifier value corresponding to an endpoint value of the specified range, the lookup always jumps to the routing table of the node identified by the field value closest to the endpoint value, so that the keyword obtaining unit 530 can finally find the keyword set corresponding to the specified range. The query thus becomes a binary (halving) search, which achieves efficient range queries.
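To make the final step concrete, the sketch below shows how a keyword set could be collected once both endpoints of the specified range have been resolved. It is only a sketch under assumptions: the in-memory index is modelled as a dictionary from identifier value to the set of cache keys, and the standard-library `bisect` functions stand in for the routing-table lookup of the two endpoints. The function name `range_keys` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def range_keys(index, lo, hi):
    """Collect the cache keys whose identifier value lies in [lo, hi].

    index -- dict mapping identifier value -> set of cache keys
    """
    # Sorting per call is for brevity only; the node ring described in the
    # text is already kept in identifier-value order.
    ids = sorted(index)
    first = bisect_left(ids, lo)     # first identifier value >= lo
    last = bisect_right(ids, hi)     # one past the last identifier value <= hi
    keys = []
    for value in ids[first:last]:
        keys.extend(sorted(index[value]))
    return keys
```

In the apparatus described here the two endpoint positions would come from the routing-table lookup rather than from `bisect`; the walk between the two positions is the same in either case.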
It should be noted that the searching subunit 521, the first determining subunit 522, the second determining subunit 523, the starting subunit 5210, the first endpoint judging subunit 5211, the first endpoint determining subunit 5212, the first endpoint further-judging subunit 5213, the first endpoint further-determining subunit 5214, the second endpoint judging subunit 5215, the second endpoint determining subunit 5216, the second endpoint further-judging subunit 5217, the second endpoint further-determining subunit 5218, the node joining unit 540, the node deleting unit 550, the route updating unit 560, and the data feedback unit 570 described in the embodiments of the present application are drawn with dashed lines in FIG. 6 to indicate that these units or subunits are not essential units of the apparatus provided in the embodiments of the present application.
Corresponding to the above distributed cache range query method, the embodiments of the present application further provide a distributed cache range query system.
For example, FIG. 7 is a schematic structural diagram of the distributed cache range query system provided in the embodiments of the present application. As shown in FIG. 7, the system may include:
A cache server 710, which may be configured to store cached data that has a mapping relationship with keywords, receive a query request issued by the query server 720 for the cached data corresponding to a keyword set, and return the cached data corresponding to the keyword set to the query server 720;
A query server 720, which may be configured to: store in advance, in a storage area of memory, the identifier values corresponding to the field values usable for range queries in the keywords used to map the cached data; in response to receiving from a client a query request for the cached data corresponding to keywords in a specified range, look up in the storage area the identifier values corresponding to the endpoint values of the specified range; determine the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range; obtain the cached data corresponding to the keyword set from the cache server; and return the obtained cached data to the client that issued the query request;
A client 730, which may be configured to send to the query server a query request for the cached data corresponding to keywords in a specified range, and to receive the cached data returned by the query server.
For example, there may be one or more cache servers 710. The different node rings and routing tables usable for range queries that are established in the embodiments of the present application may all be stored on the separate query server 720. For example, the query server 720 may determine, according to the query request, the node ring that needs to be read, take any node in the node ring as the current node, and perform the subsequent query steps. After obtaining the keyword set corresponding to the query request, it may obtain the cached data corresponding to the keyword set from the one or more cache servers through a single multi-threaded concurrent download, and return the obtained cached data to the client that issued the request.
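The concurrent download step can be illustrated with a short Python sketch. It rests on several assumptions that are not taken from the present application: each cache server is modelled as an in-memory dictionary, keys are assigned to servers by a CRC32 hash modulo the number of servers as a simple stand-in for the consistent hashing rule, and the names `server_for` and `fetch_all` are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def server_for(key, n_servers):
    """Pick the server index for a key (stand-in for consistent hashing)."""
    return zlib.crc32(key.encode("utf-8")) % n_servers


def fetch_all(keys, servers):
    """Fetch the cached data for all keys, one worker thread per server involved.

    keys    -- list of cache keys produced by the range query
    servers -- list of dict-like stores, one per cache server
    """
    # Group the keys by the server that holds them.
    groups = {}
    for key in keys:
        groups.setdefault(server_for(key, len(servers)), []).append(key)

    def fetch(item):
        idx, group = item
        store = servers[idx]
        return {k: store[k] for k in group if k in store}

    result = {}
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        # All per-server fetches are issued together and merged as they return.
        for part in pool.map(fetch, groups.items()):
            result.update(part)
    return result
```

Grouping keys by server before dispatching keeps the number of requests equal to the number of servers involved rather than the number of keys, which is one plausible reading of obtaining the data in a single concurrent download.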
For convenience of description, the above apparatus is described with its functions divided into various units. Of course, when the present invention is implemented, the functions of the units may be implemented in one or more pieces of software and/or hardware.
From the description of the above embodiments, those skilled in the art can clearly understand that the present invention can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For the same or similar parts of the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is basically similar to the method embodiment, it is described relatively briefly; for the relevant parts, refer to the description of the method embodiment.
The present invention can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
The present invention may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present invention may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above are only preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (11)

  1. A distributed cache range query method, characterized by comprising:
    storing in advance, in a storage area of memory, the identifier values corresponding to the field values usable for range queries in the keywords used to map cached data;
    in response to receiving a query request for keywords in a specified range, looking up in the storage area the identifier values corresponding to the endpoint values of the specified range;
    determining the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
  2. The method according to claim 1, characterized in that storing in advance, in a storage area of memory, the identifier values corresponding to the field values usable for range queries in the keywords used to map cached data comprises:
    storing in advance, in a node ring located in memory, the identifier values corresponding to the field values usable for range queries in the keywords, wherein one identifier value is stored in one node, and a corresponding routing table is established for each node in the node ring, the routing table recording the identifier values of one or more other nodes determined according to a preset index algorithm;
    and looking up in the storage area, in response to receiving a query request for keywords in a specified range, the identifier values corresponding to the endpoint values of the specified range comprises:
    in response to receiving the query request for keywords in the specified range, taking any node in the node ring as the current node, and looking up in the routing table of the current node the identifier value closest to an endpoint value of the specified range;
    if it is determined that the found identifier value is the identifier value in the node ring closest to the endpoint value of the specified range, taking the found identifier value as the identifier value corresponding to the endpoint value of the specified range;
    if it is determined that the found identifier value is not the identifier value in the node ring closest to the endpoint value of the specified range, updating the current node to the node holding the found identifier value, and returning to the step of looking up in the routing table of the current node the identifier value closest to the endpoint value of the specified range.
  3. The method according to claim 2, characterized in that storing in advance, in a node ring located in memory, the identifier values corresponding to the field values usable for range queries in the keywords comprises:
    storing in advance, in the node ring located in memory and in order of identifier value, the identifier values corresponding to the field values that are usable for range queries and belong to the same field in the keywords, wherein the routing table records the identifier values in the node ring whose spacing from the identifier value of the corresponding node is a power of 2.
  4. The method according to claim 2, characterized by further comprising:
    for a keyword newly added to the cache, judging whether the identifier value corresponding to the field value usable for range queries in the newly added keyword already exists in the node ring; if not, storing the identifier value in a new node, finding in the node ring a node N that can serve as the direct predecessor node of the new node, updating the direct predecessor node of the direct successor node of the node N to be the new node, setting the node N as the direct predecessor node of the new node, and establishing a corresponding routing table for the new node;
    for a keyword deleted from the cache, if the field value usable for range queries in the fields of the deleted keyword does not exist in any other keyword, taking the node storing the identifier value corresponding to that field value as the node to be deleted, updating the direct predecessor node of the direct successor node of the node to be deleted to be the direct predecessor node of the node to be deleted, and deleting the node to be deleted from the node ring;
    and updating, according to the preset index algorithm, the routing tables that need updating because they are affected by the joining of the new node or by the deletion of the node to be deleted.
  5. The method according to claim 1, characterized in that the method is applied in a query server that is distinct from the one or more cache servers used to store the cached data;
    and, after the keyword set corresponding to the specified range is obtained, the method further comprises: obtaining the cached data corresponding to the keyword set from the one or more cache servers through a single multi-threaded concurrent download, and returning the obtained cached data to the client that issued the query request.
  6. A distributed cache range query apparatus, characterized by comprising:
    a preprocessing unit, configured to store in advance, in a storage area of memory, the identifier values corresponding to the field values usable for range queries in the keywords used to map cached data;
    a query response unit, configured to look up in the storage area, in response to receiving a query request for keywords in a specified range, the identifier values corresponding to the endpoint values of the specified range;
    a keyword obtaining unit, configured to determine the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range.
  7. The apparatus according to claim 6, wherein the preprocessing unit is configured to store in advance, in a node ring located in memory, the identifier values corresponding to the field values usable for range queries in the keywords, wherein one identifier value is stored in one node, and a corresponding routing table is established for each node in the node ring, the routing table recording the identifier values of one or more other nodes determined according to a preset index algorithm;
    the query response unit comprises:
    a searching subunit, configured to, in response to receiving a query request for keywords in a specified range, take any node in the node ring as the current node and look up in the routing table of the current node the identifier value closest to an endpoint value of the specified range;
    a first determining subunit, configured to, if it is determined that the found identifier value is the identifier value in the node ring closest to the endpoint value of the specified range, take the found identifier value as the identifier value corresponding to the endpoint value of the specified range;
    a second determining subunit, configured to, if it is determined that the found identifier value is not the identifier value in the node ring closest to the endpoint value of the specified range, update the current node to the node holding the found identifier value, and trigger the searching subunit to look up in the routing table of the current node the identifier value closest to the endpoint value of the specified range.
  8. The apparatus according to claim 7, characterized in that the preprocessing unit is specifically configured to store, in the node ring located in memory and in order of identifier value, the identifier values corresponding to the field values that are usable for range queries and belong to the same field in the keywords, wherein the routing table records the field values in the node ring whose spacing from the field value of the corresponding node is a power of 2.
  9. The apparatus according to claim 7, characterized by further comprising:
    a node joining unit, configured to: for a keyword newly added to the cache, judge whether the identifier value corresponding to the field value usable for range queries in the newly added keyword already exists in the node ring; if not, store the identifier value in a new node, find in the node ring a node N that can serve as the direct predecessor node of the new node, update the direct predecessor node of the direct successor node of the node N to be the new node, set the node N as the direct predecessor node of the new node, and establish a corresponding routing table for the new node;
    a node deleting unit, configured to: for a keyword deleted from the cache, if the field value usable for range queries in the fields of the deleted keyword does not exist in any other keyword, take the node storing the identifier value corresponding to that field value as the node to be deleted, update the direct predecessor node of the direct successor node of the node to be deleted to be the direct predecessor node of the node to be deleted, and delete the node to be deleted from the node ring;
    and a route updating unit, configured to update, according to the preset index algorithm, the routing tables that need updating because they are affected by the joining of the new node or by the deletion of the node to be deleted.
  10. The apparatus according to claim 6, characterized in that the apparatus is deployed in a query server that is distinct from the one or more cache servers used to store the cached data;
    the apparatus further comprises: a data feedback unit, configured to, after the keyword obtaining unit obtains the keyword set corresponding to the specified range, obtain the cached data corresponding to the keyword set from the one or more cache servers through a single multi-threaded concurrent download, and return the obtained cached data to the client that issued the query request.
  11. A distributed cache range query system, characterized by comprising:
    a cache server, configured to store cached data that has a mapping relationship with keywords, receive a query request issued by a query server for the cached data corresponding to a keyword set, and return the cached data corresponding to the keyword set to the query server;
    a query server, configured to: store in advance, in a storage area of memory, the identifier values corresponding to the field values usable for range queries in the keywords used to map the cached data; in response to receiving from a client a query request for the cached data corresponding to keywords in a specified range, look up in the storage area the identifier values corresponding to the endpoint values of the specified range; determine the keyword set corresponding to the specified range according to the identifier values corresponding to the endpoint values of the specified range; obtain the cached data corresponding to the keyword set from the cache server; and return the obtained cached data to the client that issued the query request;
    a client, configured to send to the query server a query request for the cached data corresponding to keywords in a specified range, and to receive the cached data returned by the query server.
PCT/CN2015/093310 2014-11-06 2015-10-30 Distributed buffering range querying method, device, and system WO2016070750A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410637554.2 2014-11-06
CN201410637554.2A CN105610881B9 (en) 2014-11-06 2014-11-06 Distributed cache range query method, device and system

Publications (1)

Publication Number Publication Date
WO2016070750A1 true WO2016070750A1 (en) 2016-05-12

Family

ID=55908568

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/093310 WO2016070750A1 (en) 2014-11-06 2015-10-30 Distributed buffering range querying method, device, and system

Country Status (2)

Country Link
CN (1) CN105610881B9 (en)
WO (1) WO2016070750A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107622057A (en) * 2016-07-13 2018-01-23 阿里巴巴集团控股有限公司 A kind of method and apparatus of lookup task

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107622079A (en) * 2017-07-28 2018-01-23 阿里巴巴集团控股有限公司 Data storage, querying method and device
CN110020040B (en) * 2017-08-17 2021-07-06 北京京东尚科信息技术有限公司 Method, device and system for querying data
CN110740155B (en) * 2018-07-18 2022-05-27 阿里巴巴集团控股有限公司 Request processing method and device in distributed system
CN110377647B (en) * 2019-07-30 2022-07-29 江门职业技术学院 Demand information query method and system based on distributed database

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101635741A (en) * 2009-08-27 2010-01-27 中国科学院计算技术研究所 Method and system thereof for inquiring recourses in distributed network
CN101855921A (en) * 2007-06-15 2010-10-06 泰克莱克公司 Methods, systems, and computer program products for identifying a serving home subscriber server (hss) in a communications network
US20130173634A1 (en) * 2011-12-30 2013-07-04 Microsoft Corporation Identifying files stored on client devices as web-based search results

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8275761B2 (en) * 2008-05-15 2012-09-25 International Business Machines Corporation Determining a density of a key value referenced in a database query over a range of rows
CN103473267B (en) * 2013-08-09 2016-11-16 深圳市中科新业信息科技发展有限公司 Data store query method and system
CN103942289B (en) * 2014-04-12 2017-01-25 广西师范大学 Memory caching method oriented to range querying on Hadoop

Also Published As

Publication number Publication date
CN105610881B9 (en) 2019-06-21
CN105610881B (en) 2019-04-09
CN105610881A (en) 2016-05-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15857736

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15857736

Country of ref document: EP

Kind code of ref document: A1