CN111177245A - Key value traversal method of Redis cluster, server and storage medium - Google Patents

Publication number
CN111177245A
Authority
CN
China
Prior art keywords
key
redis
redis cluster
cursor
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911371822.XA
Other languages
Chinese (zh)
Inventor
齐天亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN201911371822.XA
Publication of CN111177245A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/2445 Data retrieval commands; View definitions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to cluster storage technology and discloses a key value traversal method for a Redis cluster, comprising the following steps: acquiring the set of Redis cluster nodes through a Jedis connection pool, and establishing a connection with each node in the Redis cluster; and polling all nodes in the Redis cluster and querying the key values in each node with the SCAN command, to obtain all key values in the Redis cluster that meet the conditions. The invention also provides a server and a storage medium. The key value traversal method, the server and the storage medium can improve the performance of the server and are simple to implement.

Description

Key value traversal method of Redis cluster, server and storage medium
Technical Field
The invention relates to the technical field of cluster storage, in particular to a key value traversal method of a Redis cluster, a server and a computer readable storage medium.
Background
Redis is a key-value storage system that mainly runs in either a master-slave mode or a cluster mode. In the master-slave mode each node stores all keys; in the cluster mode each node stores only part of the keys, and the keys of all nodes together make up the full key set of the cluster.
When scanning the keys under each node in a Redis cluster, the existing KEYS command scans all records at once. Moreover, KEYS is a full traversal with O(n) complexity, so if the Redis data volume is very large, for example ten million keys or more, the command can cause the Redis service to stall. Because Redis is a single-threaded program, all commands are executed sequentially, and other commands must wait until the current KEYS command finishes, so all other commands that read or write Redis may be delayed or may even time out and report errors. The KEYS command therefore degrades Redis performance, blocks the environment, and can ultimately make the system unusable, which makes it unsuitable for a production environment.
Redis 2.8 introduced a new command, SCAN, which can scan Redis records in batches. However, a Redis cluster does not directly support a cluster-wide SCAN, so development runs into difficulty or the existing architecture has to be substantially modified, which is troublesome to implement.
Disclosure of Invention
In view of the above, the present invention provides a key value traversal method, a server and a computer-readable storage medium for a Redis cluster, so as to solve at least one of the above technical problems.
Firstly, in order to achieve the above object, the present invention provides a key value traversal method for a Redis cluster, including the steps of:
acquiring a Redis cluster node set through a jedis connection pool, and establishing connection with each node in the Redis cluster;
polling all nodes in the Redis cluster, and respectively querying the key value in each node by adopting a SCAN command to obtain all key values meeting the conditions in the Redis cluster.
Optionally, the polling all nodes in the Redis cluster, and using a SCAN command to respectively query the key value in each node to obtain all key values meeting the condition in the Redis cluster includes:
setting query parameters of a current node according to user input;
scanning a current node by adopting an SCAN command according to the query parameters, and querying a key value in the current node through one or more times of iteration;
and continuing polling the next node until all nodes in the Redis cluster are completely scanned, and obtaining all key values in the Redis cluster.
Optionally, the method further includes, after the step of scanning the current node by using the SCAN command according to the query parameter and querying a key value in the current node through one or more iterations:
and putting the query result into a set container for deduplication.
Optionally, the step of putting the query result into a set container for deduplication includes:
and when receiving an inquiry result returned by the SCAN command every time, judging whether the inquiry result is empty, and if not, assigning the inquiry result to a set container to automatically remove repeated data.
Optionally, the query parameters include cursor, MATCH, and COUNT, where cursor represents a cursor, COUNT represents the number of elements returned per iteration, and MATCH represents the matching pattern for the queried keys.
Optionally, the step of scanning the current node by using a SCAN command according to the query parameter and querying a key value in the current node through one or more iterations further includes:
each time an inquiry result returned by the SCAN command is received, acquiring a cursor in the inquiry result, and judging whether the cursor is equal to 0 or not;
if the cursor is not 0, continuing to perform the next iteration by taking the cursor as a new iteration parameter to obtain a query result returned by the SCAN command;
if the cursor is equal to 0, the iteration in the current node is ended, and the next node is continuously polled.
Optionally, the value of the COUNT parameter may be set to the same value or to different values across iterations.
Optionally, in the step of scanning the current node by using the SCAN command according to the query parameter and querying a key value in the current node through one or more iterations, each iteration includes:
and traversing the array storing the key value in the current node by the SCAN command according to the element quantity set by the COUNT parameter and the condition set by the MATCH parameter, returning the query result meeting the condition in the array, and returning the cursor according to the index of the traversed array.
In addition, in order to achieve the above object, the present invention further provides a server, including a memory and a processor, where the memory stores a key value traversal system of a Redis cluster that can run on the processor, and when executed by the processor, the key value traversal system of the Redis cluster implements the steps of the key value traversal method of the Redis cluster as described above.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium storing a key value traversal system of a Redis cluster, which is executable by at least one processor to cause the at least one processor to execute the steps of the key value traversal method of the Redis cluster as described above.
Compared with the prior art, the key value traversal method, the server and the computer-readable storage medium for the Redis cluster provided by the present invention can poll each node of the Redis cluster through the Jedis connection pool, scan the current node with the SCAN command, and query the keys under that node, thereby obtaining the keys under every node in the cluster using Redis's own native approach. Although the SCAN method does not execute as fast as the KEYS method, it proceeds in batches and does not block the thread, which greatly improves the performance of the server. In addition, the Redis SCAN method does not need to be rewritten and no configuration file or configuration method needs to be added, so the method is simple to implement.
Drawings
FIG. 1 is a schematic diagram of an alternative hardware architecture for a server according to the present invention;
FIG. 2 is a schematic diagram of program modules of a first embodiment of a key value traversal system of a Redis cluster according to the present invention;
FIG. 3 is a schematic diagram of program modules of a key value traversal system of a Redis cluster according to a second embodiment of the present invention;
FIG. 4 is a flowchart illustrating a key value traversal method for Redis clusters according to a preferred embodiment of the present invention;
FIG. 5 is a detailed flowchart of the first alternative embodiment of step S402 in FIG. 4;
FIG. 6 is a detailed flowchart of a second alternative embodiment of step S402 in FIG. 4;
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
Fig. 1 is a schematic diagram of an alternative hardware architecture of the server 2 according to the present invention.
In this embodiment, the server 2 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13, which may be communicatively connected to each other through a system bus. It is noted that fig. 1 only shows the server 2 with components 11-13, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The server 2 may be a rack server, a blade server, a tower server, or a cabinet server, and the server 2 may be an independent server or a server cluster formed by a plurality of servers.
The memory 11 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 11 may be an internal storage unit of the server 2, such as a hard disk or a memory of the server 2. In other embodiments, the memory 11 may also be an external storage device of the server 2, such as a plug-in hard disk, a Smart Memory Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like, provided on the server 2. Of course, the memory 11 may also comprise both an internal storage unit of the server 2 and an external storage device thereof. In this embodiment, the memory 11 is generally used for storing an operating system and various application software installed in the server 2, for example, program codes of the key value traversal system 200 of the Redis cluster, and the like. Furthermore, the memory 11 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 12 is typically used to control the overall operation of the server 2. In this embodiment, the processor 12 is configured to run a program code or process data stored in the memory 11, for example, run the key-value traversal system 200 of the Redis cluster.
The network interface 13 may comprise a wireless network interface or a wired network interface, and the network interface 13 is generally used for establishing communication connection between the server 2 and other electronic devices.
The hardware structure and functions of the related devices of the present invention have been described in detail so far. Various embodiments of the present invention will be presented based on the above description.
First, the present invention provides a key value traversal system 200 of a Redis cluster.
Referring to fig. 2, a program module diagram of a first embodiment of a key-value traversal system 200 of a Redis cluster according to the present invention is shown.
In this embodiment, the key value traversal system 200 of the Redis cluster includes a series of computer program instructions stored on the memory 11, and when the computer program instructions are executed by the processor 12, the key value traversal operation of the Redis cluster according to the embodiments of the present invention can be implemented. In some embodiments, the key-value traversal system 200 of a Redis cluster may be partitioned into one or more modules based on the particular operations implemented by portions of the computer program instructions. For example, in fig. 2, the key-value traversal system 200 of the Redis cluster may be partitioned into a connection module 201 and a polling module 202. Wherein:
the connection module 201 is configured to obtain a Redis cluster node set through a jedis connection pool, and establish a connection with each node in the Redis cluster.
Specifically, the present embodiment acquires the set of Redis cluster nodes using a Redis-native method. First, a JedisCluster connection pool is acquired, and then the nodes in the connection pool are acquired through a cluster. The current node may be acquired by a jedis = entry.
Redis, as a cache database, requires the client and the server 2 to establish a connection before performing related operations. With direct connections, a new connection is established for every access and closed after use, which consumes a large amount of database resources and is clearly inefficient for frequently accessed scenarios. Production environments therefore generally manage Redis connections through a Jedis connection pool. Jedis is the Java client officially recommended by Redis. The Jedis connection pool is implemented on top of Apache Commons Pool 2. All Jedis objects are first placed in the pool; whenever a connection to Redis is needed, a Jedis object is simply borrowed from the pool and returned to the pool after use.
The client connects to Redis over the Transmission Control Protocol (TCP). The connection pool initializes the Jedis connections in advance, so each access only needs to borrow one from the Jedis connection pool. Borrowing and returning happen locally and incur only a small concurrency synchronization cost, far less than the cost of establishing a new TCP connection. In addition, direct connections cannot limit the number of Jedis objects and may cause connection leaks in extreme cases, whereas a connection pool can effectively protect and control resource usage.
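The borrow-and-return behaviour described above can be illustrated with a minimal sketch. It is written in Python rather than with Jedis, and the names FakeConnection, ConnectionPool, borrow and give_back are hypothetical stand-ins, not part of Jedis or Apache Commons Pool 2. It only shows that connections are created once, up front, and that the pool size caps how many exist.

```python
import queue


class FakeConnection:
    """Hypothetical stand-in for a TCP connection to Redis (no real socket)."""
    created = 0

    def __init__(self):
        FakeConnection.created += 1
        self.id = FakeConnection.created


class ConnectionPool:
    """Pre-initializes a fixed number of connections and lends them out."""

    def __init__(self, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(FakeConnection())

    def borrow(self, timeout=1.0):
        # Waits (up to `timeout` seconds) when every connection is in use,
        # which caps the number of live connections at the pool size.
        return self._idle.get(timeout=timeout)

    def give_back(self, conn):
        self._idle.put(conn)


pool = ConnectionPool(size=2)
for _ in range(5):
    conn = pool.borrow()
    try:
        pass  # a command would be sent over `conn` here
    finally:
        pool.give_back(conn)  # always return the connection, even on error

connections_created = FakeConnection.created
```

Five accesses reuse the two pre-built connections; returning the connection in a finally block is one way to avoid the connection leaks mentioned above.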
The polling module 202 is configured to poll each node in the Redis cluster.
Specifically, since the Redis cluster has no direct SCAN method, after all nodes of the Redis cluster are acquired, polling is performed on each node of the Redis cluster, and then scanning is performed on each node by using a SCAN command to query a key in the node.
In this embodiment, the polling module 202 includes a setting sub-module 203 and a query sub-module 204.
The setting sub-module 203 is configured to set a query parameter of each node according to a user input.
In particular, query parameters need to be defined before scanning each cluster node. The basic format of the SCAN command is: SCAN cursor [MATCH pattern] [COUNT count]. That is, the SCAN command takes three parameters: cursor, MATCH, and COUNT. Here cursor is an integer cursor value. The server 2 does not need to keep any state for the cursor; the only state is the cursor integer that the SCAN command returns to the client. COUNT and MATCH are the query parameters.
COUNT indicates the number of elements to return per iteration, i.e., a limit hint for the traversal, which can be used to control roughly how many results each iteration returns. The SCAN command is an incremental iteration command: each call returns only a small portion of the elements, and the number of elements returned per iteration is not guaranteed, but the COUNT parameter can adjust the command's behavior to a certain degree. The COUNT parameter tells the iteration command how many elements should be returned from the data set in each iteration. It acts as a hint to the incremental iteration command, and in most cases this hint is effective in controlling the number of returned values.
In this embodiment, if the number of keys is less than or equal to 5000, the scan can be completed in a single execution; if the number of keys is greater than 5000, it is recommended that each execution cover no more than 10000. For example, COUNT may be set to 5000 when there are 5000 keys or fewer, and to 8000 when there are more than 10000 keys.
It is noted that the COUNT parameter does not strictly control the number of keys returned; it is only a rough constraint. The same COUNT value need not be used in every iteration: the user may change the COUNT value freely from one iteration to the next, as long as the cursor returned by the previous iteration is passed to the next iteration.
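As a small illustration of changing COUNT between iterations, the following Python sketch uses a hypothetical scan function over an in-memory list in place of a real Redis node. Treating the cursor as a plain list offset is an assumption made for brevity. The point is only that the cursor, not the COUNT value, carries the iteration forward.

```python
def scan(keys, cursor, count):
    """Simplified stand-in for SCAN: the cursor is an offset into `keys`."""
    end = min(cursor + count, len(keys))
    next_cursor = 0 if end == len(keys) else end  # 0 signals completion
    return next_cursor, keys[cursor:end]


keys = [f"k{i}" for i in range(10)]
counts = [2, 5, 1, 100]  # a different COUNT hint on each call

cursor, seen, calls = 0, [], 0
while True:
    cursor, batch = scan(keys, cursor, counts[calls % len(counts)])
    calls += 1
    seen.extend(batch)
    if cursor == 0:
        break
```

All ten keys are visited exactly once even though every call used a different COUNT.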
MATCH is a pattern parameter; in this embodiment it is the matching pattern for keys. As with the KEYS command, supplying a glob-style pattern through the MATCH parameter makes the incremental iteration command return only the elements that match the given pattern. For example, to query all keys under csj:risk, the MATCH parameter is set to a pattern with the prefix csj:risk:. Matching against the MATCH pattern is performed after the command takes elements out of the data set and before it returns them to the client, so if only a small number of elements in the data set being iterated match the pattern, the iteration command may return no elements at all over many executions.
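The effect of filtering after elements are fetched can be sketched as follows. Python's fnmatch module is used here as an approximation of Redis glob matching, and the pattern csj:risk:* with a trailing wildcard is an assumption based on the prefix in the example above. The batches are made-up data.

```python
from fnmatch import fnmatchcase


def filter_batch(batch, pattern):
    """Apply the pattern to a batch that has already been taken from the data set."""
    return [key for key in batch if fnmatchcase(key, pattern)]


# Three batches as they might come out of successive iterations (made-up keys).
batches = [["user:1", "user:2"], ["csj:risk:a", "user:3"], ["order:9"]]
pattern = "csj:risk:*"  # assumed wildcard form of the prefix

filtered = [filter_batch(batch, pattern) for batch in batches]
```

Two of the three iterations return nothing to the client even though each one examined keys, which is why an empty result must not be treated as the end of the traversal.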
The query submodule 204 is configured to SCAN a current node by using a SCAN command, and query a key in the node.
Specifically, the set query parameters are assigned to the SCAN method, the SCAN command is executed, the nodes are scanned, and a return result is queried.
Redis uses a hash table as its underlying implementation for the sake of efficiency and simplicity. The underlying storage structure for Redis keys is an array plus linked lists, similar to a HashMap. The size of the first-dimension array is 2^n (n ≥ 0), and each expansion doubles the length of the array. The SCAN command traverses this one-dimensional array, and the cursor value returned each time is an index into the array. The limit parameter (COUNT) indicates how many array elements to traverse, and the qualifying results hanging under those elements are returned. Because the linked lists attached to the elements differ in size, the number of results returned each time also differs.
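The array-plus-linked-list structure can be modelled with a toy Python sketch. The hash function (sum of the key's bytes), the function names, and the sequential cursor are all simplifying assumptions; real Redis visits the slots in a different order so that a traversal tolerates resizing. The sketch only shows why the same COUNT yields batches of different sizes.

```python
def build_table(keys, size):
    """Place keys into `size` slots (a power of two), chaining collisions."""
    buckets = [[] for _ in range(size)]
    for key in keys:
        slot = sum(key.encode()) & (size - 1)  # toy hash, not Redis's
        buckets[slot].append(key)
    return buckets


def scan_buckets(buckets, cursor, count):
    """Visit `count` slots starting at `cursor`; return (next_cursor, keys)."""
    end = min(cursor + count, len(buckets))
    found = [key for chain in buckets[cursor:end] for key in chain]
    next_cursor = 0 if end == len(buckets) else end
    return next_cursor, found


table = build_table(["a", "b", "d", "e", "i"], size=4)
first = scan_buckets(table, cursor=0, count=2)
second = scan_buckets(table, cursor=first[0], count=2)
```

Both calls visit two slots, but the first returns four keys and the second returns one, because the chains under the slots have different lengths.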
The time complexity of the SCAN command is also O(N), but the work is done in multiple passes and does not block the thread. The SCAN command returns only a small number of elements per execution and can therefore be used in a production environment without the risk of blocking the server 2 that comes with commands such as KEYS or SMEMBERS. The SCAN command is a cursor-based iterator: each time the command is called, the cursor returned by the previous call must be passed as the cursor parameter of the current call in order to continue the previous iteration. When the cursor parameter of the SCAN command is set to 0, a new iteration begins. The cursor returned by the first iteration is used as the iteration parameter of the second iteration, and so on. When the command returns a cursor with the value 0, the iteration has ended and one complete traversal is finished. Accordingly, the cursor of the current result is obtained and checked against 0; if it is not 0, the next batch of results is fetched. The SCAN command does not guarantee that every execution returns a given number of elements, and it may even return zero elements, but as long as the command returns a cursor that is not 0, the iteration must not be considered finished. Also, the returned cursor does not necessarily increase; the cursor returned by a later call may be smaller than the one returned before.
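A minimal sketch of this cursor loop for a single node is given here in Python. FakeNode is a hypothetical in-memory stand-in for a Redis node, its offset-style cursor is a simplification, and fnmatch approximates Redis glob matching. The loop ends only when the returned cursor is 0, never merely because a batch is empty.

```python
from fnmatch import fnmatchcase


class FakeNode:
    """In-memory stand-in for one Redis node (hypothetical, for illustration)."""

    def __init__(self, keys):
        self._keys = list(keys)

    def scan(self, cursor, match="*", count=10):
        end = min(cursor + count, len(self._keys))
        batch = [k for k in self._keys[cursor:end] if fnmatchcase(k, match)]
        return (0 if end == len(self._keys) else end), batch


def scan_node(node, match="*", count=10):
    """Iterate until the returned cursor is 0, collecting matching keys."""
    cursor, found, empty_batches = 0, [], 0
    while True:
        cursor, batch = node.scan(cursor, match=match, count=count)
        if batch:
            found.extend(batch)
        else:
            empty_batches += 1  # an empty batch does not end the iteration
        if cursor == 0:  # only a zero cursor ends it
            return found, empty_batches


node = FakeNode(["a:1", "a:2", "a:3", "a:4", "b:1", "b:2"])
found, empty_batches = scan_node(node, match="b:*", count=2)
```

Here the first two calls return no keys, yet the traversal continues and the matching keys are found on the third call.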
The KEYS command matches every key in Redis against the given pattern one by one, and this matching process costs the server 2 a great deal of performance. The SCAN command instead processes the keys page by page, with the number of items per page passed as a parameter. After each page is processed a cursor is returned, and the next request carries that cursor, which greatly improves the performance of the server 2.
Then, the query submodule 204 continues to scan the next node until all the nodes are scanned, and all keys in the Redis cluster are obtained.
The query submodule 204 obtains the cursor in the current result each time it receives a query result returned by the SCAN command, and determines whether the cursor is equal to 0. If the cursor is not 0, the next iteration is performed with that cursor as the new iteration parameter to obtain the next query result returned by the SCAN command. If the cursor is equal to 0, the iteration in that node has ended and scanning moves on to the next node. Once all nodes of the Redis cluster have been scanned, the query results of all nodes together form the query result of the Redis cluster. In this embodiment, the SCAN method can thus be used on a Redis cluster, and large numbers of Redis key values can be modified in batches.
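The node-by-node polling can be sketched in Python as follows. The make_node helper and the node names are hypothetical, each node is modelled as an in-memory key list with an offset-style cursor, and the pattern csj:risk:* is an assumed example. A real implementation would issue SCAN over a connection to each node instead.

```python
from fnmatch import fnmatchcase


def make_node(keys):
    """Return a scan(cursor, match, count) function over an in-memory key list."""
    keys = list(keys)

    def scan(cursor, match, count):
        end = min(cursor + count, len(keys))
        batch = [k for k in keys[cursor:end] if fnmatchcase(k, match)]
        return (0 if end == len(keys) else end), batch

    return scan


def scan_cluster(nodes, match="*", count=100):
    """Poll every node in turn; finish each node's cursor loop before moving on."""
    results = []
    for scan in nodes.values():
        cursor = 0
        while True:
            cursor, batch = scan(cursor, match, count)
            results.extend(batch)
            if cursor == 0:
                break  # this node is done, continue with the next node
    return results


nodes = {
    "node-a": make_node(["csj:risk:1", "user:1"]),
    "node-b": make_node(["csj:risk:2"]),
    "node-c": make_node([]),
}
all_keys = scan_cluster(nodes, match="csj:risk:*", count=1)
```

The combined result contains the matching keys from every node, including nodes that hold no keys at all.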
The key value traversal system for the Redis cluster provided in this embodiment can poll each node of the Redis cluster through the Jedis connection pool, scan the current node with the SCAN command, and query the keys under that node, thereby obtaining the keys under every node in the cluster using Redis's own native approach. Although the SCAN method does not execute as fast as the KEYS method, it proceeds in multiple passes, so it does not block the thread or the server 2, and the performance of the server 2 is greatly improved. Moreover, this embodiment does not need to rewrite the Redis SCAN method or add any configuration file or method.
Referring to fig. 3, a program module diagram of a second embodiment of the key-value traversal system 200 of the Redis cluster according to the present invention is shown. In this embodiment, the polling module 202 includes a deduplication submodule 205 in addition to the setting sub-module 203 and the query sub-module 204 of the first embodiment.
The deduplication submodule 205 is configured to put the query result into a set container for deduplication.
Specifically, since the SCAN method may iterate many times (and the dictionary may expand or shrink during the traversal), the returned results may contain duplicate data. To avoid the performance waste of processing the same key repeatedly, a set is used to hold the data returned from the Redis cluster nodes, which achieves deduplication. Each time a query result returned by the SCAN command is received, it is checked for emptiness; if it is not empty (that is, the result size is greater than 0), the query result is added to the set container, which deduplicates automatically. Keys in the set container are unique: multiple identical keys are stored in the set container only once.
In other embodiments, other existing deduplication methods may also be used to perform deduplication processing on the query result returned by the SCAN command.
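The set-based deduplication described above can be sketched in a few lines of Python. The function name collect_unique and the batches are made up for illustration; the batches stand in for results that successive SCAN calls might return when a key is reported more than once.

```python
def collect_unique(batches):
    """Add every non-empty batch to a set so repeated keys are stored once."""
    unique = set()
    for batch in batches:
        if batch:  # skip empty results
            unique.update(batch)
    return unique


# "k1" and "k2" each appear twice across the batches.
batches = [["k1", "k2"], [], ["k2", "k3"], ["k1"]]
unique_keys = collect_unique(batches)
```

Five returned entries collapse to three distinct keys, so any later processing touches each key once.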
Then, the query submodule 204 continues to scan the next node, and the deduplication submodule 205 continues to deduplicate the query result of the next node until all scanning is completed, so as to obtain all keys in the Redis cluster.
The query submodule 204 obtains the cursor in the current result each time it receives a query result returned by the SCAN command, and determines whether the cursor is equal to 0. If the cursor is not 0, the next iteration is performed with that cursor as the new iteration parameter to obtain the next query result returned by the SCAN command, and the deduplication submodule 205 places the query result in the set container for deduplication. If the cursor is equal to 0, the iteration in that node has ended and scanning moves on to the next node. Once all nodes of the Redis cluster have been scanned, the deduplicated query results of all nodes together form the query result of the Redis cluster. In this embodiment, the SCAN method can thus be used on a Redis cluster, and large numbers of Redis key values can be modified in batches.
The key value traversal system for the Redis cluster provided in this embodiment can poll each node of the Redis cluster through the Jedis connection pool, scan the current node with the SCAN command, and query the keys under that node, thereby obtaining the keys under every node in the cluster using Redis's own native approach. Although the SCAN method does not execute as fast as the KEYS method, it proceeds in multiple passes, so it does not block the thread or the server 2, and the performance of the server 2 is greatly improved. Moreover, this embodiment does not need to rewrite the Redis SCAN method or add any configuration file or method. In addition, the query results can be deduplicated with the set container, which avoids the duplicate data caused by multiple iterations of the SCAN method and makes the query result more accurate.
In addition, the invention also provides a key value traversal method of the Redis cluster.
Fig. 4 is a schematic flow chart of a preferred embodiment of the key value traversal method for the Redis cluster of the present invention. In this embodiment, the execution order of the steps in the flowchart shown in fig. 4 may be changed and some steps may be omitted according to different requirements. The method comprises the following steps:
Step S400, acquiring a Redis cluster node set through the Jedis connection pool, and establishing a connection with each node in the Redis cluster.
Specifically, the present embodiment acquires the set of Redis cluster nodes using a Redis-native method. First, a JedisCluster connection pool is acquired, and then the nodes in the connection pool are acquired through a cluster. The current node may be acquired by a jedis = entry.
Redis, as a cache database, requires the client and the server 2 to establish a connection before performing related operations. With direct connections, a new connection is established for every access and closed after use, which consumes a large amount of database resources and is clearly inefficient for frequently accessed scenarios. Production environments therefore generally manage Redis connections through a Jedis connection pool. Jedis is the Java client officially recommended by Redis. The Jedis connection pool is implemented on top of Apache Commons Pool 2. All Jedis objects are first placed in the pool; whenever a connection to Redis is needed, a Jedis object is simply borrowed from the pool and returned to the pool after use.
The client connects to Redis over TCP. The connection pool initializes the Jedis connections in advance, so each access only needs to borrow one from the Jedis connection pool. Borrowing and returning happen locally and incur only a small concurrency synchronization cost, far less than the cost of establishing a new TCP connection. In addition, direct connections cannot limit the number of Jedis objects and may cause connection leaks in extreme cases, whereas a connection pool can effectively protect and control resource usage.
Step S402, polling all nodes in the Redis cluster, and querying the keys in each node with the SCAN command to obtain all keys in the Redis cluster that meet the conditions.
Specifically, since a Redis cluster provides no SCAN method that covers the whole cluster directly, after all nodes of the Redis cluster are acquired, each node is polled in turn and scanned with the SCAN command to query the keys in that node.
Referring to fig. 5, in the first alternative embodiment, the step S402 specifically includes:
Step S404, setting the query parameters of the current node according to user input.
Specifically, the query parameters need to be defined before each cluster node is scanned. The basic format of the SCAN command is: SCAN cursor [MATCH pattern] [COUNT count]. That is, the SCAN command takes three parameters: cursor, MATCH, and COUNT. Here cursor is the cursor, an integer value. The server 2 does not need to save any state for the cursor; the only state is the cursor integer that the SCAN command returns to the client. COUNT and MATCH are the query parameters.
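As a small illustration of the command format above, the following sketch assembles the token list for one SCAN call, adding MATCH and COUNT only when they are supplied. `ScanArgsSketch` and `build` are hypothetical names chosen for this example; a real client library would normally build and send the command itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds the tokens of one SCAN command; names here are illustrative only. */
class ScanArgsSketch {
    /**
     * Produces: SCAN cursor [MATCH pattern] [COUNT count].
     * The cursor is kept as a string so that the value returned by the
     * previous call can be passed back unchanged.
     */
    static List<String> build(String cursor, String pattern, Integer count) {
        List<String> args = new ArrayList<>();
        args.add("SCAN");
        args.add(cursor);
        if (pattern != null) {           // MATCH is optional
            args.add("MATCH");
            args.add(pattern);
        }
        if (count != null) {             // COUNT is optional
            args.add("COUNT");
            args.add(String.valueOf(count));
        }
        return args;
    }
}
```

Calling `build("0", null, null)` yields just `SCAN 0`, which starts a new iteration with the server's default behaviour for the two optional parameters.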
COUNT indicates the number of elements to return per iteration, i.e., a limit hint for the traversal that bounds roughly how many results each iteration returns. The SCAN command is an incremental iteration command: each call returns only a small portion of the elements, and the number of elements returned per iteration is not guaranteed, but the behavior of the command can be adjusted to some extent with the COUNT parameter. The COUNT parameter tells the iteration command how many elements it should return from the data set in each iteration. It is only a hint to the incremental iteration command, but in most cases the hint is effective in controlling the number of returned values.
In this embodiment, if the number of keys is less than or equal to 5000, the traversal can be completed in a single execution; if the number of keys is greater than 5000, it is recommended that each execution handle no more than 10000. For example, COUNT may be set to 5000 when there are no more than 5000 keys, and to 8000 when there are more than 10000 keys.
It should be noted that the COUNT parameter does not strictly control the number of keys returned; it is only a rough constraint. The same COUNT value need not be used in every iteration: the user may change the COUNT value freely from one iteration to the next, as long as the cursor returned by the previous iteration is passed to the next iteration.
MATCH is the pattern parameter; in this embodiment it is the matching pattern of the keys. As with the KEYS command, supplying a glob-style pattern through the MATCH parameter makes the incremental iteration command return only the elements that match the given pattern. For example, to query all keys under csj:risk, the MATCH parameter is set to csj:risk:*. Matching against the MATCH pattern is performed after the command has taken the elements out of the data set and before they are returned to the client, so if only a few elements in the iterated data set match the pattern, the iteration command may return no elements at all over many executions.
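The effect of applying the pattern after a page of elements has already been fetched can be sketched as follows. This is an illustrative model only: `MatchFilterSketch` is a hypothetical name, the glob conversion handles just `*` and `?` (Redis glob patterns support more than that), and the filtering here runs in the example itself rather than on a server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Models MATCH as a filter applied to a page that was already fetched. */
class MatchFilterSketch {
    /** Converts a glob that uses only '*' and '?' into a regex; all other characters are literal. */
    static Pattern globToRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (char ch : glob.toCharArray()) {
            if (ch == '*') {
                sb.append(".*");
            } else if (ch == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(ch)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    /** Keeps only the keys of one page that match the glob. */
    static List<String> filterPage(List<String> page, String glob) {
        Pattern p = globToRegex(glob);
        List<String> out = new ArrayList<>();
        for (String key : page) {
            if (p.matcher(key).matches()) {
                out.add(key);
            }
        }
        return out;
    }
}
```

A page that contains no matching key filters down to an empty list, which is why a caller can receive empty results on some iterations while the traversal is still in progress.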
Step S406, scanning the current node with the SCAN command and querying the keys in the node.
Specifically, the configured query parameters are passed to the SCAN method, the SCAN command is executed to scan the node, and the returned result is obtained.
For efficiency and simplicity of implementation, Redis uses a hash table as its underlying implementation. The underlying storage structure for Redis keys is an array plus linked lists, similar to a HashMap. The size of the first-dimension array is 2^n (n ≥ 0), and the array doubles in length each time it is expanded. The SCAN command traverses this one-dimensional array, and the cursor value returned each time is an index into the array. The limit parameter (COUNT) indicates how many array elements are traversed; the matching results attached under those elements are returned. Because the linked lists attached under the elements differ in length, the number of results returned differs from call to call.
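The relationship between the array, the cursor, and COUNT can be made concrete with a toy table. This is a simplified model and not Redis's implementation: `BucketScanSketch` is a hypothetical name, the cursor here simply counts upward through the slots, whereas Redis advances its cursor in a different bit order so that traversal tolerates resizing, and no pattern matching is applied.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy hash table: an array of 2^n slots, each holding a chain of keys. */
class BucketScanSketch {
    final List<List<String>> buckets = new ArrayList<>();

    BucketScanSketch(int power) {
        for (int i = 0; i < (1 << power); i++) {   // table size is 2^power
            buckets.add(new ArrayList<>());
        }
    }

    void put(String key) {
        // Masking works because the table size is a power of two.
        int slot = key.hashCode() & (buckets.size() - 1);
        buckets.get(slot).add(key);
    }

    /** Result of one scan call: the cursor to pass next and the keys found. */
    static final class Page {
        final int nextCursor;
        final List<String> keys;
        Page(int nextCursor, List<String> keys) {
            this.nextCursor = nextCursor;
            this.keys = keys;
        }
    }

    /** Visits up to `count` slots starting at `cursor`; cursor 0 is returned once the array is exhausted. */
    Page scan(int cursor, int count) {
        List<String> keys = new ArrayList<>();
        int i = cursor;
        for (int visited = 0; visited < count && i < buckets.size(); visited++, i++) {
            keys.addAll(buckets.get(i));           // the whole chain under a slot is returned
        }
        return new Page(i >= buckets.size() ? 0 : i, keys);
    }
}
```

With four slots, the keys "a" to "e", and a count of 2, the first call in this model returns three keys and the second returns two, even though both calls visit the same number of slots, because the chains differ in length.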
Although the time complexity of the SCAN command is also O(N), the scan is carried out in multiple passes and does not block the thread. Because each execution returns only a small number of elements, the SCAN command can be used in a production environment without the risk of blocking the server 2 that the KEYS or SMEMBERS commands carry. The SCAN command is a cursor-based iterator: each call must pass the cursor returned by the previous call as its cursor parameter in order to continue the previous iteration. Setting the cursor parameter of the SCAN command to 0 starts a new iteration. The cursor returned by the first iteration is used as the parameter of the second iteration, and so on. When the command returns a cursor of 0, the iteration has ended and one complete traversal is finished. The cursor of each result is therefore read and compared with 0, and if it is not 0, the next batch of results is fetched. The SCAN incremental iteration command does not guarantee that each execution returns any given number of elements, and may even return zero elements, but as long as the returned cursor is not 0 the iteration must not be regarded as finished. Moreover, the returned cursor does not necessarily increase; the cursor returned by one call may be smaller than the one returned by the previous call.
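The cursor protocol described above reduces to a short loop. The sketch below assumes a hypothetical `PageSource` interface that stands for "send SCAN with this cursor and read the reply", and uses canned replies in place of a server; the canned cursors are deliberately non-increasing and one page is empty, to show that only a returned cursor of 0 ends the loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Cursor-driven iteration; PageSource is a hypothetical stand-in for a SCAN call. */
class CursorLoopSketch {
    static final class Page {
        final String cursor;
        final List<String> keys;
        Page(String cursor, List<String> keys) {
            this.cursor = cursor;
            this.keys = keys;
        }
    }

    interface PageSource {
        Page fetch(String cursor);
    }

    static List<String> scanAll(PageSource source) {
        List<String> found = new ArrayList<>();
        String cursor = "0";                     // cursor 0 starts a new iteration
        do {
            Page page = source.fetch(cursor);
            found.addAll(page.keys);             // a page may legitimately be empty
            cursor = page.cursor;                // always pass back the returned cursor
        } while (!"0".equals(cursor));           // cursor 0 again: traversal complete
        return found;
    }

    /** Canned replies: cursors go 0 -> 8 -> 4 -> 0, and the middle page is empty. */
    static PageSource cannedSource() {
        Map<String, Page> replies = new HashMap<>();
        replies.put("0", new Page("8", Arrays.asList("k1", "k2")));
        replies.put("8", new Page("4", Collections.<String>emptyList()));
        replies.put("4", new Page("0", Arrays.asList("k3")));
        return replies::get;
    }
}
```

A loop that stopped on the first empty page would miss `k3` here, which is the mistake the paragraph above warns against.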
The KEYS command matches every key in Redis against the KEYS pattern one by one, and this matching process costs the server 2 a great deal of performance. The SCAN command instead processes the keys page by page, with the number processed per page passed in as a parameter. A cursor is returned after each page and is carried in the next request, which avoids blocking the server 2 for the duration of a full traversal.
Step S408, judging whether all nodes in the Redis cluster have been scanned. If so, the process ends. If not, step S410 is executed: the next node is polled and steps S404 to S408 are repeated until all nodes have been scanned, so as to obtain all keys in the Redis cluster.
Specifically, each time a query result returned by the SCAN command is received, the cursor in the current result is obtained and compared with 0. If the cursor is not 0, the next iteration is performed with that cursor as the new iteration parameter to obtain the next query result returned by the SCAN command. If the cursor is equal to 0, the iteration in that node has ended and polling continues with the next node. Once all nodes of the Redis cluster have been scanned, the query results of all nodes together form the query result of the Redis cluster. In this embodiment, the SCAN method can thus be used on a Redis cluster, and large numbers of Redis key values can be modified in batches.
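Steps S404 to S410 can be sketched end to end as an outer loop over nodes and an inner cursor loop per node. Everything named here is an assumption made for illustration: `NodeClient` is a hypothetical interface standing for a per-node connection, `fakeNode` is an in-memory substitute for a Redis node, and simple prefix matching stands in for the MATCH pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Node-by-node traversal; NodeClient and fakeNode are hypothetical stand-ins. */
class ClusterTraversalSketch {
    static final class Page {
        final String cursor;
        final List<String> keys;
        Page(String cursor, List<String> keys) {
            this.cursor = cursor;
            this.keys = keys;
        }
    }

    interface NodeClient {
        Page scan(String cursor, String pattern, int count);
    }

    static List<String> traverse(Map<String, NodeClient> nodes, String pattern, int count) {
        List<String> all = new ArrayList<>();
        for (Map.Entry<String, NodeClient> entry : nodes.entrySet()) {   // poll node by node
            NodeClient node = entry.getValue();
            String cursor = "0";
            do {
                Page page = node.scan(cursor, pattern, count);
                all.addAll(page.keys);
                cursor = page.cursor;
            } while (!"0".equals(cursor));       // cursor 0: this node is done, move to the next
        }
        return all;
    }

    /** In-memory fake: pages through a fixed key list `count` keys at a time; prefix match stands in for MATCH. */
    static NodeClient fakeNode(List<String> keys) {
        return (cursor, pattern, count) -> {
            int start = Integer.parseInt(cursor);
            int end = Math.min(start + count, keys.size());
            List<String> hits = new ArrayList<>();
            for (String k : keys.subList(start, end)) {
                if (k.startsWith(pattern)) {
                    hits.add(k);
                }
            }
            return new Page(end >= keys.size() ? "0" : String.valueOf(end), hits);
        };
    }
}
```

The result of `traverse` is the concatenation of every node's pages, which corresponds to taking the query results of all nodes as the query result of the cluster.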
The key value traversal method for the Redis cluster provided in this embodiment polls each node of the Redis cluster through the Jedis connection pool, scans the current node with the SCAN command, and queries the keys under that node, thereby obtaining the keys under every node in the cluster in a Redis-native manner. Although the SCAN method does not execute as fast as the KEYS method, it runs in multiple passes and does not block the thread or the server 2, which greatly improves the performance of the server 2. Moreover, this embodiment does not need to rewrite the Redis SCAN method or add any configuration file or method.
Referring to fig. 6, in the second alternative embodiment, the step S402 specifically includes:
Step S504, setting the query parameters of the current node according to user input.
Specifically, the query parameters are defined before each cluster node is scanned, in the same way as in step S404 of the first alternative embodiment. The basic format of the SCAN command is: SCAN cursor [MATCH pattern] [COUNT count], where cursor is the integer cursor returned to the client by the previous call, and COUNT and MATCH are the query parameters. COUNT is a hint for the number of elements returned per iteration; it is a rough constraint rather than a strict limit, and its value may be changed from one iteration to the next as long as the cursor returned by the previous iteration is passed on. As in the first alternative embodiment, a traversal of no more than 5000 keys can be completed in one execution, and for more than 5000 keys it is recommended that each execution handle no more than 10000. MATCH is a glob-style pattern for the keys and is applied after the elements are taken out of the data set and before they are returned to the client, so an iteration may return no elements when few keys match.
Step S506, scanning the current node with the SCAN command and querying the keys in the node.
Specifically, the configured query parameters are passed to the SCAN method, the SCAN command is executed to scan the node, and the returned result is obtained.
As described for step S406 of the first alternative embodiment, Redis stores its keys in a hash table made up of an array of size 2^n (n ≥ 0) plus linked lists; the SCAN command traverses the array, the returned cursor is an index into the array, and COUNT indicates how many array elements are traversed per call, so the number of results differs from call to call. The SCAN command is a cursor-based iterator whose time complexity is O(N) but whose work is spread over multiple calls, so it does not block the server 2 the way the KEYS or SMEMBERS commands may. An iteration starts with a cursor of 0, each call passes the cursor returned by the previous call, and the traversal is complete when the command returns a cursor of 0. A call may return zero elements, and the returned cursor does not necessarily increase, so only a returned cursor of 0 marks the end of the iteration.
Step S508, putting the query result into a set container for deduplication.
Specifically, since the SCAN method may take many iterations (for example when the dictionary is expanded or shrunk during the traversal), the returned results may contain repeated data. To avoid the performance wasted on processing the same key repeatedly, a set is used to hold the data returned from the Redis cluster nodes, which removes the duplicates. Each time a query result returned by the SCAN command is received, it is judged whether the query result is empty (that is, whether its size is greater than 0); if it is not empty, the query result is added to the set container, which deduplicates it automatically. The keys in the set container are unique: when several identical keys are added, only one of them is stored.
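The deduplication step can be sketched with a standard Java set. `DedupSketch` and `collect` are hypothetical names, and the pages are supplied directly as lists rather than read from a Redis node; a `LinkedHashSet` is used here only so that the output order is predictable, and any `Set` implementation would remove the duplicates.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Collects SCAN pages into a set so that repeated keys are stored once. */
class DedupSketch {
    static Set<String> collect(List<List<String>> pages) {
        Set<String> unique = new LinkedHashSet<>();      // keeps first-seen order
        for (List<String> page : pages) {
            if (page != null && !page.isEmpty()) {       // skip empty query results
                unique.addAll(page);                     // keys already present are ignored
            }
        }
        return unique;
    }
}
```

If the same key appears in two different pages, the second occurrence leaves the set unchanged, so later batch processing handles each key only once.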
In other embodiments, other existing deduplication methods may also be used to perform deduplication processing on the query result returned by the SCAN command.
Step S510, judging whether all nodes in the Redis cluster have been scanned. If so, the process ends. If not, step S512 is executed: the next node is polled and steps S504 to S510 are repeated until all nodes have been scanned, so as to obtain all keys in the Redis cluster.
Specifically, each time a query result returned by the SCAN command is received, the cursor in the current result is obtained and compared with 0. If the cursor is not 0, the next iteration is performed with that cursor as the new iteration parameter to obtain the next query result returned by the SCAN command. Each query result is put into the set container for deduplication. If the cursor is equal to 0, the iteration in that node has ended and polling continues with the next node. Once all nodes of the Redis cluster have been scanned, the deduplicated query results of all nodes together form the query result of the Redis cluster. In this embodiment, the SCAN method can thus be used on a Redis cluster, and large numbers of Redis key values can be modified in batches.
The key value traversal method for the Redis cluster provided in this embodiment polls each node of the Redis cluster through the Jedis connection pool, scans the current node with the SCAN command, and queries the keys under that node, thereby obtaining the keys under every node in the cluster in a Redis-native manner. Although the SCAN method does not execute as fast as the KEYS method, it runs in multiple passes and does not block the thread or the server 2, which greatly improves the performance of the server 2. Moreover, this embodiment does not need to rewrite the Redis SCAN method or add any configuration file or method. In addition, the query results are deduplicated by the set container, which avoids the repeated data caused by the multiple iterations of the SCAN method and makes the query result more accurate.
The present invention also provides another embodiment, which is to provide a computer-readable storage medium storing a key value traversal program of a Redis cluster, the key value traversal program of the Redis cluster being executable by at least one processor to cause the at least one processor to perform the steps of the key value traversal method of the Redis cluster as described above.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A key value traversal method for Redis cluster, the method comprising the steps of:
acquiring a Redis cluster node set through a jedis connection pool, and establishing connection with each node in the Redis cluster;
polling all nodes in the Redis cluster, and respectively querying the key value in each node by adopting a SCAN command to obtain all key values meeting the conditions in the Redis cluster.
2. The key value traversal method for the Redis cluster as claimed in claim 1, wherein the polling all nodes in the Redis cluster, respectively querying the key value in each node by adopting a SCAN command to obtain all key values meeting the condition in the Redis cluster comprises:
setting query parameters of a current node according to user input;
scanning a current node by adopting an SCAN command according to the query parameters, and querying a key value in the current node through one or more times of iteration;
and continuing polling the next node until all nodes in the Redis cluster are completely scanned, and obtaining all key values in the Redis cluster.
3. The key-value traversal method for Redis clusters according to claim 2, wherein the method further comprises, after the step of scanning a current node by using SCAN commands according to the query parameters, and querying out the key value in the current node through one or more iterations:
and putting the query result into a set container for deduplication.
4. The key-value traversal method for Redis clusters as claimed in claim 3, wherein the step of putting query results into set containers for deduplication comprises:
and when receiving an inquiry result returned by the SCAN command every time, judging whether the inquiry result is empty, and if not, assigning the inquiry result to a set container to automatically remove repeated data.
5. The key-value traversal method for Redis clusters according to claim 2 or 3, wherein the query parameters comprise cursor, MATCH, COUNT, wherein cursor represents the cursor, COUNT represents the number of elements returned per iteration, and MATCH represents the regular pattern of the queried key.
6. The key-value traversal method for Redis clusters according to claim 5, wherein the step of scanning a current node by using a SCAN command according to the query parameter, and querying out the key value in the current node through one or more iterations further comprises:
each time an inquiry result returned by the SCAN command is received, acquiring a cursor in the inquiry result, and judging whether the cursor is equal to 0 or not;
if the cursor is not 0, continuing to perform the next iteration by taking the cursor as a new iteration parameter to obtain a query result returned by the SCAN command;
if the cursor is equal to 0, the iteration in the current node is ended, and the next node is continuously polled.
7. The key-value traversal method of a Redis cluster according to claim 5, wherein the value of the COUNT parameter can be set to be the same or different for each iteration.
8. The key-value traversal method for Redis clusters according to claim 5, wherein in the step of scanning a current node by using a SCAN command according to the query parameter and querying out the key value in the current node through one or more iterations, each iteration comprises:
and traversing the array storing the key value in the current node by the SCAN command according to the element quantity set by the COUNT parameter and the condition set by the MATCH parameter, returning the query result meeting the condition in the array, and returning the cursor according to the index of the traversed array.
9. A server, characterized in that the server comprises a memory, a processor, the memory having stored thereon a key-value traversal system of a Redis cluster executable on the processor, the key-value traversal system of a Redis cluster implementing the steps of the key-value traversal method of a Redis cluster as claimed in any one of claims 1 to 8 when executed by the processor.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a key-value traversal system of a Redis cluster, executable by at least one processor, to cause the at least one processor to perform the steps of the key-value traversal method of the Redis cluster as claimed in any one of claims 1-8.
CN201911371822.XA 2019-12-25 2019-12-25 Key value traversal method of Redis cluster, server and storage medium Pending CN111177245A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911371822.XA CN111177245A (en) 2019-12-25 2019-12-25 Key value traversal method of Redis cluster, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911371822.XA CN111177245A (en) 2019-12-25 2019-12-25 Key value traversal method of Redis cluster, server and storage medium

Publications (1)

Publication Number Publication Date
CN111177245A true CN111177245A (en) 2020-05-19

Family

ID=70654043

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911371822.XA Pending CN111177245A (en) 2019-12-25 2019-12-25 Key value traversal method of Redis cluster, server and storage medium

Country Status (1)

Country Link
CN (1) CN111177245A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180165331A1 (en) * 2016-12-09 2018-06-14 Futurewei Technologies, Inc. Dynamic computation node grouping with cost based optimization for massively parallel processing
CN108280031A (en) * 2017-12-22 2018-07-13 努比亚技术有限公司 Redis cache cleaner method, server and computer readable storage medium
CN109471635A (en) * 2018-09-03 2019-03-15 中新网络信息安全股份有限公司 A kind of algorithm optimization method realized based on Java Set set
CN109918429A (en) * 2019-01-21 2019-06-21 武汉烽火众智智慧之星科技有限公司 Spark data processing method and system based on Redis
CN110059129A (en) * 2019-04-28 2019-07-26 顶象科技有限公司 Date storage method, device and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"RedisCluster的scan命令", pages 1 - 3, Retrieved from the Internet <URL:https://blog.csdn.net/lengnado/article/details/53718866?spm=1001.2014.3001.5502> *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111966690A (en) * 2020-08-21 2020-11-20 西安寰宇卫星测控与数据应用有限公司 Method and device for loading table full data, computer equipment and storage medium
CN111966690B (en) * 2020-08-21 2024-07-12 西安寰宇卫星测控与数据应用有限公司 Method, device, computer equipment and storage medium for loading form full data
CN112732427A (en) * 2021-01-13 2021-04-30 广州虎牙科技有限公司 Data processing method, system and related device based on Redis cluster
CN112732427B (en) * 2021-01-13 2024-03-01 广州虎牙科技有限公司 Data processing method, system and related device based on Redis cluster
CN113836366A (en) * 2021-08-18 2021-12-24 广州致远电子有限公司 Data traversal method and device based on embedded system
CN116521688A (en) * 2023-07-04 2023-08-01 浩鲸云计算科技股份有限公司 Key prefix operation KEY value method based on Redis cluster
CN116521688B (en) * 2023-07-04 2023-09-26 浩鲸云计算科技股份有限公司 Key prefix operation KEY value method based on Redis cluster

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination