CN109815232B - Method and system for retrieving and processing data ranking by using binary search tree - Google Patents

Info

Publication number
CN109815232B
Authority
CN
China
Prior art keywords
node
tree
key
nodes
current node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811613883.8A
Other languages
Chinese (zh)
Other versions
CN109815232A (en)
Inventor
朱智佳
李山
林长录
常鹏
张永光
周成祖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Meiya Pico Information Co Ltd
Original Assignee
Xiamen Meiya Pico Information Co Ltd
Application filed by Xiamen Meiya Pico Information Co Ltd
Priority to CN201811613883.8A
Publication of CN109815232A
Publication of CN109815232B

Abstract

Disclosed are a method and system for data rank retrieval and data processing using a binary search tree. A subtree size (size) and an access count (freq) are added to the nodes of the binary search tree. If the key of the current node is equal to the key to be searched, the rank of the key to be searched is rank - node.left.size, and the access count of the current node is incremented by 1. If the key of the current node is smaller than the key to be searched, the search proceeds in the direction of the right subtree and the rank is raised, that is, the rank value R of the minimum node in the right subtree of the current node is R = rank - 1 - node.left.size. If the key of the current node is larger than the key to be searched, the search proceeds in the direction of the left subtree and the rank is unchanged until the search is finished. If the access count of the current node is smaller than that of its left or right subtree node, the current node exchanges positions with that node. Moving nodes with high access counts as close to the root node as possible improves the query efficiency for frequently queried nodes and therefore the retrieval efficiency of the whole system.

Description

Method and system for retrieving and processing data ranking by using binary search tree
Technical Field
The present invention relates to the technical field of data retrieval, and in particular, to a method and system for retrieving and processing data ranking using a binary search tree.
Background
Many systems have a ranking list (leaderboard) function that requires real-time ranking of users by points, contribution values, or other values. A basic ranking list is simple to implement: the data is stored in a database, the fields the ranking list needs are indexed, and the database can then efficiently query the top-k users of the ranking list.
Assuming that the ranking list is sorted by points, users generally also have the following requirements:
1. A user wants to know their own exact rank.
2. A user wants to know the points of the users ranked close to them.
Therefore, the rank of a specified point value must be queried efficiently, and the corresponding user information must be retrievable from a specific rank. The point data is updated in real time. Although these requirements can be met through the statistics, sorting, and paging query functions of the database, the performance is poor, and the database is not well suited to these functions. Taking a MySQL database as an example, for a system with ten million users, the information of the user at a specified rank was queried using the ORDER BY and LIMIT keywords; across the ranks tested, a query took about 10 seconds on average, which cannot meet the requirements of the system.
When data is stored in sorted order, binary search can efficiently retrieve the storage position of a specified value, and because the data is stored in order, knowing the position gives the rank. However, the ranking must be real-time and the data is updated in real time, so every update has to maintain the ordered, sequential storage of the data, which is costly.
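The sorted-storage approach described above can be sketched in Python as follows. This is an illustration only: the list `scores`, the helper `rank_in_sorted`, and the use of the standard-library `bisect` module are assumptions made for the example and do not come from the patent. Keys are assumed to be distinct, and ranks are counted in descending order (rank 1 is the highest value).

```python
import bisect

def rank_in_sorted(scores, value):
    """Descending rank (1 = highest) of value in an ascending sorted list, or None."""
    i = bisect.bisect_left(scores, value)      # O(log n) position lookup
    if i == len(scores) or scores[i] != value:
        return None                            # value is not stored
    return len(scores) - i                     # position gives the rank directly

scores = [20, 30, 40, 50, 60, 70, 80]          # hypothetical point values, kept sorted
top = rank_in_sorted(scores, 80)

# Keeping the list sorted on update is the costly part: insort must shift
# every later element, which is O(n) per insertion.
bisect.insort(scores, 55)
```

The lookup is fast, but each real-time update pays a linear cost to keep the storage sequential, which is the drawback the background section points out.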
Disclosure of Invention
The invention provides a method and a system for retrieving and processing data ranking by using a binary search tree.
In one aspect, the present invention provides a method for retrieving a data rank, comprising the following steps:
S101: adding a subtree size (size) and an access count (freq) to the nodes of the binary search tree, initializing rank to root.size for the key to be queried, and setting the current node to the root node to start the search;
S102: comparing the key of the current node with the key to be searched;
S103: if the key of the current node is equal to the key to be searched, the rank rankX of the key to be searched is rank - node.left.size, and the access count of the current node is incremented by 1;
S104: if the key of the current node is smaller than the key to be searched, searching in the direction of the right subtree and raising the rank, that is, the rank R of the minimum node of the right subtree of the current node is rank - 1 - node.left.size;
S105: if the key of the current node is larger than the key to be searched, searching in the direction of the left subtree with the rank unchanged until the search is completed, and if the access count of the current node is smaller than that of the left subtree node, exchanging the positions of the current node and the left subtree node;
wherein the key value represents a field of the user data, node.left.size represents the total number of nodes in the left subtree of the current node, and rank is initialized to root.size, the total number of nodes of the binary search tree.
The key value is, for example, the points of a user. Rank retrieval is performed using the user's points as the key: the node corresponding to the key to be searched is located by comparing key values, and after the search is finished the search path is traced back, the nodes of the binary tree are rearranged according to their access counts, and nodes with high access frequency are moved closer to the root node, so that the whole retrieval system can work more efficiently.
In some embodiments, in steps S104 and S105, after the key to be searched is found, the search path is traced back level by level, and the positions of the current node and its left or right subtree node are exchanged according to the access counts.
In some embodiments, the rank value in step S104 decreases each time the search moves into a right subtree, and the amount of the decrease is the number of nodes in the left subtree of the current node plus 1.
In some embodiments, the method further comprises the following step:
S106: repeating steps S102 to S105 until the key of the current node is equal to the key to be queried or the current node is a leaf node.
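The rank calculation in steps S101 to S106 can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the `Node` class, the helpers `size_of` and `insert`, and the function name `rank_of` are hypothetical, keys are assumed to be distinct, and the access-count-based position exchange is deliberately left out so that only the rank arithmetic is shown.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    size: int = 1   # number of nodes in the subtree rooted here
    freq: int = 1   # access count

def size_of(node):
    return node.size if node else 0

def insert(node, key):
    """Hypothetical helper: plain BST insertion that keeps size up to date."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    node.size = 1 + size_of(node.left) + size_of(node.right)
    return node

def rank_of(root, key):
    """Descending rank (1 = largest key) of key, or None if it is absent."""
    if root is None:
        return None
    rank, node = root.size, root               # S101: rank starts at root.size
    while node is not None:                    # S106: repeat S102 to S105
        if key == node.key:                    # S103: found
            node.freq += 1
            return rank - size_of(node.left)
        if key > node.key:                     # S104: go right, rank is raised
            rank -= 1 + size_of(node.left)
            node = node.right
        else:                                  # S105: go left, rank unchanged
            node = node.left
    return None

root = None
for k in [50, 30, 70, 20, 40, 60, 80]:        # hypothetical point values
    root = insert(root, k)
```

With this tree, the largest key 80 has rank 1 and the root key 50 has rank 4, because three keys (60, 70, 80) are larger than it.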
In some embodiments, while the positions of the current node and the left or right subtree node are exchanged in steps S104 and S105, the positions of the child nodes of the current node and of the left or right subtree node are adjusted according to the traversal order of the original binary search tree, and the size of each adjusted node is updated. This keeps the data ordered according to the original traversal order of the binary search tree and ensures that the data ranking remains accurate.
In some embodiments, the method further comprises the following steps: searching for the information of an object according to the rank of the object, and comparing the number of nodes in the left subtree of the current node with size;
if the number of nodes in the left subtree of the current node is equal to size, the information of the current node is the information of the object, and the access count of the current node is incremented by 1;
if the number of nodes in the left subtree of the current node is smaller than size, the object node is in the right subtree, and size becomes size - 1 - node.left.size;
if the number of nodes in the left subtree of the current node is larger than size, the object node is in the left subtree and size is unchanged until the search is completed, and if the access count of the current node is smaller than that of the left subtree node, the positions of the current node and the left subtree node are exchanged;
wherein size is the number of nodes to the left of the object with the given rank, that is, the nodes with smaller keys, and size is initialized to root.size - rank.
The information of the object is queried according to the rank of the object. After the object information is found, the search path is traced back level by level and the node positions of the binary search tree are updated according to the access counts of the nodes, so as to maximize retrieval efficiency.
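The rank-to-object lookup described above can be sketched as follows. The sketch assumes that size is initialized to root.size - rank, which is inferred from the descending rank definition used for key lookup; the `Node` class and the names `size_of` and `select_by_rank` are hypothetical, and the access-count-based exchange during backtracking is omitted.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    size: int = 1   # number of nodes in the subtree rooted here
    freq: int = 1   # access count

def size_of(node):
    return node.size if node else 0

def select_by_rank(root, rank):
    """Return the node whose descending rank is `rank` (1 = largest key), or None."""
    if root is None or not 1 <= rank <= root.size:
        return None
    size = root.size - rank          # assumed initialization: nodes smaller than the target
    node = root
    while node is not None:
        left = size_of(node.left)
        if left == size:             # exactly `size` smaller nodes: this is the object
            node.freq += 1
            return node
        if left < size:              # target is in the right subtree
            size -= 1 + left
            node = node.right
        else:                        # target is in the left subtree, size unchanged
            node = node.left
    return None

# Hypothetical tree holding the keys 20..80, with size fields filled in by hand.
root = Node(50,
            Node(30, Node(20), Node(40), size=3),
            Node(70, Node(60), Node(80), size=3),
            size=7)
```

Here `select_by_rank(root, 1)` reaches the node with key 80 and `select_by_rank(root, 7)` reaches the node with key 20, mirroring the key-to-rank lookup in the opposite direction.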
In some specific embodiments, the method further includes a step of updating the nodes of the binary search tree, which specifically includes a step of adding a node and a step of deleting a node. Adding and deleting node data keeps the data held in the binary search tree valid and complete.
In a further embodiment, the step of adding a node specifically includes: using the key, finding a node that is larger than the key and has no left subtree node, or a node that is smaller than the key and has no right subtree node; creating the new node there; and tracing back level by level to the root node, incrementing size by 1 along the way.
In a further preferred embodiment, the step of deleting a node specifically includes: using the key, finding the node to be deleted; rotating the node to be deleted down until it becomes a leaf node; deleting it; and tracing back level by level to the root node, decrementing size by 1 along the way.
According to another aspect of the present invention, a data processing method using a binary search tree is provided, which includes the steps of:
S601: adding a node access count value freq and a subtree size value size to the nodes of the binary search tree;
S602: in the process of searching the nodes of the binary search tree, incrementing the access count of the node that is found by 1;
S603: deciding whether to adjust the nodes of the binary search tree according to the access count values freq of the parent node and its left or right subtree node;
S604: if the value of the current node is smaller than the value being searched for, searching the right subtree, and after the search is finished, if the access count of the parent node is smaller than that of the right subtree node, interchanging the positions of the two nodes;
S605: if the value of the current node is larger than the value being searched for, searching the left subtree, and after the search is finished, if the access count of the parent node is smaller than that of the left subtree node, interchanging the positions of the two nodes.
In this scheme, the positions of the nodes are adjusted according to their access counts, and nodes with high access counts are moved closer to the root node. This improves the query efficiency for frequently queried nodes in the binary search tree, so that the object being searched for can be found as quickly as possible, and improves the efficiency of searching with the binary search tree.
In some embodiments, while the positions of the parent node and the left or right subtree node are exchanged according to the access counts, the corresponding child nodes of the left or right subtree node and of the parent node are adjusted according to the traversal order of the original binary search tree. This guarantees that the overall traversal order of the binary search tree is unchanged after the exchange.
In some embodiments, steps S603 to S605 are repeated until the current node is a leaf node. Adjusting the nodes of the whole binary search tree in this way distributes them more reasonably and improves retrieval efficiency.
According to a third aspect, a computer-readable storage medium is proposed, on which one or more computer programs are stored, characterized in that the one or more computer programs, when executed by a computer processor, implement the method of any of the above.
According to a fourth aspect, a data ranking retrieval system is proposed, the system comprising:
a data recording unit, configured to add a subtree size value size and a node access count value freq to the nodes of the binary search tree and to record them;
a data query judging unit, configured to compare the key of the current node with the key to be searched;
a rank searching unit, configured to search for the rank according to the key to be searched, wherein if the key of the current node is equal to the key to be searched, the rank rankX of the key to be searched is rank - node.left.size;
if the key of the current node is smaller than the key to be searched, the search proceeds in the direction of the right subtree, and the rank R of the minimum node of the right subtree of the current node is rank - 1 - node.left.size;
if the key of the current node is larger than the key to be searched, the search proceeds in the direction of the left subtree, and the rank is unchanged until the key to be searched is found;
a node updating unit, configured to move nodes with high access counts to positions closer to the root node;
when the access count of the current node is smaller than that of the right subtree node, the positions of the current node and the right subtree node are exchanged;
when the access count of the current node is smaller than that of the left subtree node, the positions of the current node and the left subtree node are exchanged;
and while the current node and the left or right subtree node exchange positions according to the access counts, the corresponding child nodes of the left or right subtree node are adjusted according to the traversal order of the original binary search tree.
In the data processing and data rank retrieval method and system using a binary search tree, each node of the binary search tree records the total number of nodes in the subtree rooted at it, that is, the node itself plus all nodes below it. With this field, the rank of a queried node can be calculated during the search, and the data at a specified rank can also be obtained. In addition, each node records its access count, and nodes with high access counts are moved as close to the root node as possible, which improves the query efficiency for frequently queried nodes and therefore the retrieval efficiency of the whole system.
Drawings
The accompanying drawings are included to provide a further understanding of the embodiments and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments and together with the description serve to explain the principles of the invention. Other embodiments and many of the intended advantages of embodiments will be readily appreciated as they become better understood by reference to the following detailed description. Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a flow diagram of a method of data rank retrieval utilizing a binary search tree, according to one embodiment of the invention;
FIG. 2 is a schematic illustration of a subtree rotation according to an embodiment of the invention;
FIG. 3 is a flow diagram of a retrieval method utilizing key data ranking in accordance with a specific embodiment of the present invention;
FIG. 4 is a flow diagram of a method for retrieving object information using size data according to another specific embodiment of the present invention;
FIG. 5 is a flow diagram of a method of deleting a node according to another embodiment of the present invention;
FIG. 6 is a flow diagram of a data processing method for a binary search tree according to one embodiment of the invention;
FIG. 7 is a retrieval system for data ranking according to the present invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
FIG. 1 shows a flow diagram of a method for data rank retrieval using a binary search tree according to one embodiment of the invention. As shown in FIG. 1, the method comprises the following steps:
S101: A subtree size (size) and an access count (freq) are added to the nodes of the binary search tree, rank is initialized to root.size for the key to be queried, and the current node is set to the root node to start the retrieval.
S102: The key of the current node is compared with the key to be searched. The comparison shows where the node holding the key to be searched lies relative to the current node, and the search proceeds accordingly.
S103: If the key of the current node is equal to the key to be searched, the rank rankX of the key to be searched is rank - node.left.size, and the access count of the current node is incremented by 1. Equality means that the current node is the node being searched for. The rank is calculated as rankX = rank - node.left.size, where rank is initialized to root.size, the total number of nodes of the entire binary tree, and node.left.size is the total number of nodes in the left subtree of the current node.
S104: If the key of the current node is smaller than the key to be searched, the search proceeds in the direction of the right subtree and the rank is raised, that is, the rank R of the minimum node of the right subtree of the current node is rank - 1 - node.left.size.
In a specific embodiment, the key of the current node being smaller than the key to be searched indicates that the key to be searched lies in the right subtree of the current node, and the rank value of the minimum node in the right subtree of the current node is R = rank - 1 - node.left.size.
S105: If the key of the current node is larger than the key to be searched, the search proceeds in the direction of the left subtree and the rank is unchanged until the search is completed; if the access count of the current node is smaller than that of the left subtree node, the positions of the current node and the left subtree node are exchanged.
In a specific embodiment, the key of the current node being larger than the key to be searched indicates that the key to be searched lies in the left subtree of the current node, and the rank is unchanged. The left subtree node becomes the current node and is compared with the key to be searched, and the above steps are repeated until the key of the node reached equals the key to be searched. The search then ends and the path is traced back, comparing the access counts of the nodes: if the access count of the current node is smaller than that of the left subtree node, the positions of the current node and the left subtree node are exchanged.
The key value represents a field of the user data, node.left.size represents the total number of nodes in the left subtree of the current node, and rank is initialized to root.size.
In a specific embodiment, the key value is, for example, the points of a user. Rank retrieval is performed using the user's points as the key: the node corresponding to the key to be searched is located by comparing key values. After the search is finished, the path is traced back, the nodes of the binary tree are rearranged according to their access counts, and nodes with high access frequency are moved closer to the root node. The whole retrieval system can then retrieve more efficiently, and a better-organized binary search tree is available for subsequent queries.
In a specific embodiment, as shown in the schematic diagram of the right subtree rotation in FIG. 2, when the access count of the parent node A is smaller than that of the right subtree node C, the positions of the parent node A and the right subtree node C are exchanged. During the exchange, the positions of the child nodes of the parent node A and the right subtree node C are adjusted at the same time so that the traversal order of the original binary search tree remains unchanged. For example, the traversal order of the original binary search tree in the right subtree rotation of FIG. 2 is B-A-C1-C2; after node A and node C are exchanged, the position of C1 must be adjusted so that the order of the binary search tree remains B-A-C1-C2. Because the number of nodes in a node's subtrees may change during the adjustment, the size of each adjusted node must be updated at the same time, to the number of nodes in its left subtree plus the number of nodes in its right subtree plus 1.
In a specific embodiment, as shown in the schematic diagram of the left subtree rotation in FIG. 2, when the access count of the parent node A is smaller than that of the left subtree node B, the positions of the parent node A and the left subtree node B are exchanged. During the exchange, the positions of the child nodes of the parent node A and the left subtree node B are adjusted at the same time so that the traversal order of the original binary search tree remains unchanged. For example, the traversal order of the original binary search tree in the left subtree rotation of FIG. 2 is B1-B2-A-C; after node A and node B are exchanged, the position of B2 must be adjusted so that the order of the binary search tree remains B1-B2-A-C. Because the number of nodes in a node's subtrees may change during the adjustment, the size of each adjusted node must be updated at the same time, to the number of nodes in its left subtree plus the number of nodes in its right subtree plus 1.
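The two position exchanges described above correspond to standard tree rotations. The following minimal sketch shows them with the size update; the `Node` class and the function names are hypothetical, and the integer keys merely stand in for the nodes B, A, C1, C, and C2 of the figure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    size: int = 1   # number of nodes in the subtree rooted here
    freq: int = 1   # access count

def size_of(n):
    return n.size if n else 0

def update_size(n):
    # size = nodes in left subtree + nodes in right subtree + 1
    n.size = 1 + size_of(n.left) + size_of(n.right)

def exchange_with_right(a):
    """Right child C takes A's position; C's former left child moves under A."""
    c = a.right
    a.right = c.left      # C1 becomes A's right child, keeping the traversal order
    c.left = a
    update_size(a)        # update the lower node first,
    update_size(c)        # then the node that moved up
    return c

def exchange_with_left(a):
    """Left child B takes A's position; B's former right child moves under A."""
    b = a.left
    a.left = b.right      # B2 becomes A's left child, keeping the traversal order
    b.right = a
    update_size(a)
    update_size(b)
    return b

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

# Hypothetical subtree: A=2 with left child B=1 and right child C=4 (children 3 and 5).
a = Node(2, Node(1), Node(4, Node(3), Node(5), size=3), size=5)
before = inorder(a)
top = exchange_with_right(a)
```

The order of the two `update_size` calls matters: the node that moved down must be recomputed before the node that moved up, because the upper node's size depends on it.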
The access count is recorded in each node, the positions of the nodes are adjusted according to the access counts, and nodes with high access counts move closer to the root node. This improves the query efficiency for frequently queried nodes, so that the required information can be found more quickly during data retrieval and the retrieval efficiency of the whole system is greatly improved, to several times that of ordinary database retrieval. The coding is simple, and the method is suitable for a variety of application scenarios.
FIG. 3 illustrates a flow diagram of a retrieval method utilizing key data ranking in accordance with a specific embodiment of the present invention.
The method specifically comprises the following steps:
S300: The search starts from the root node of the binary search tree. The current node is set to the root node, node = root, and rank is initialized to the size of the whole tree, rank = root.size.
S301: It is judged whether the key value node.key of the current node is equal to the searched key.
In a specific embodiment, the key value corresponds to the user's points, and the rank of those points within the whole points database is retrieved from the user's points, so that the user's ranking can be judged more intuitively.
S302: If the key value node.key of the current node is equal to the searched key, the current node is the searched node, and its access count node.freq is incremented by 1. The access count of the current node is recorded and serves as the basis for node adjustment.
S303: The number of nodes in the left subtree of the current node is subtracted from rank to obtain the rank of the current node. At the root node, for example, rank is still the total number of nodes of the whole tree, so the result is that total minus the number of nodes in the root's left subtree.
S304: The search path is traced back in the reverse direction, and it is judged whether the tree needs to be adjusted. Tracing back upward level by level, the access counts of the nodes are compared to decide whether adjustment is needed.
In a specific embodiment, whether the binary search tree needs to be adjusted is decided according to the access counts of the nodes. During backtracking, if the access count of a parent node is smaller than that of its left or right subtree node, the parent node must exchange positions with that node. At the same time, the child nodes of that left or right subtree node must be adjusted according to the traversal order of the original binary search tree, so that the traversal order is unchanged and retrieval over the data of the binary search tree is not affected.
S305: The node is found and rank is returned. The current node is the node being searched for; its rank value is recorded as the rank value of the searched node, the rank of the node is output, and the query process ends.
S306: The searched key is compared with node.key of the current node. If the searched key is smaller than node.key, step S307 is performed; if the searched key is larger than node.key, step S310 is performed.
S307: It is judged whether the current node has no left subtree. This applies when the key to be searched is smaller than the key of the current node. If the current node has no left subtree, the process proceeds to step S309; if the current node has a left subtree, the process proceeds to step S308.
S308: node = node.left. If the current node has a left subtree, the searched key is in the left subtree. The left subtree node node.left becomes the current node, and the process returns to step S301 to judge again, until either the searched key equals node.key and the node is found, or the final node has no left subtree, in which case the key to be searched is not found and a null value is returned.
S309: Not found; null is returned. The key to be searched does not exist in the binary search tree; the result that no node with this key was found is output to the system, and a null value is returned.
In an optional implementation of the present invention, nodes may be added to the binary search tree. When the current node has no left subtree, a new node holding the newly inserted key and its value information may be created as the left child of the current node, with its size (the node itself plus its subtree nodes) and its access count freq both initialized to 1. Likewise, when the current node has no right subtree, a new node holding the key and its value information may be created as the right child of the current node, with size and freq both initialized to 1. The path is then traced back level by level to the root node, incrementing the node count size by 1 at each node. Adding nodes keeps the binary search tree up to date, so that its data is more complete and data retrieval is more convenient.
S310: It is judged whether the current node has no right subtree. This applies when the key to be searched is larger than the key of the current node. If the current node has no right subtree, step S309 is performed; if the current node has a right subtree, step S311 is performed.
S311: The current node and the number of nodes in its left subtree, node.left.size, are subtracted from rank, that is, rank = rank - 1 - node.left.size. If the current node has a right subtree, the searched key is in the right subtree. The right subtree node node.right becomes the current node, rank is reduced as described, and the process returns to step S301 to judge again until the searched key equals node.key.
In the process of carrying out ranking query by using the key value, the access times of each node are recorded through continuous loop query, and after the query is finished, the adjustment of the binary search tree is carried out by using the access times of the nodes, so that the structure of the binary search tree is more reasonable, and the retrieval and query of data are more efficient.
Fig. 4 illustrates a flowchart of a method of retrieving object information using size data according to a specific embodiment of the present invention. The method specifically comprises the following steps:
s400: starting searching from a Root Node of a binary search tree, recording a current Node as the Root Node, setting the Node as Root, setting a left sub-tree to be searched and ranked as a rank object to have size nodes, and initializing size-rank.
S401: Judge whether the number of nodes in the left subtree of the current node is equal to size, the number of nodes in the left subtree of the object to be searched.
In a specific embodiment, rank is the rank of the object to be searched. Other information about the object in the database is retrieved through its rank, so that the object can be located quickly and its related information obtained.
S402: If the number of nodes in the left subtree of the current node is equal to size, the current node is the node being searched for, and its access count freq is increased by 1. The access count of the current node is recorded and used as the basis for node adjustment.
S403: Backtrack along the reverse of the search path and judge whether the tree needs to be adjusted. Trace back upward step by step, compare the access counts of the nodes, and judge whether an adjustment is needed.
In a specific embodiment, whether the binary search tree needs to be adjusted is judged from the access counts of the nodes. During backtracking, if the access count of a parent node is smaller than that of its left or right subtree node, the parent node and that node exchange positions. At the same time, the child nodes of the left or right subtree are adjusted according to the traversal order of the original binary search tree, so that the traversal order remains unchanged and the retrieval traversal order of the data in the binary search tree is not affected.
S404: The node is found and returned. The current node is the node to be searched; its information is recorded as the information of the node to be searched and output, and the query process ends.
S405: Compare size, the number of nodes in the left subtree of the object to be searched, with the number of nodes in the left subtree of the current node. If size is smaller, proceed to step S406; if size is larger, proceed to step S409.
S406: Judge whether the current node lacks a left subtree. When size is smaller than the number of nodes in the left subtree of the current node, judge whether the current node has a left subtree. If it has no left subtree, proceed to step S408; if it has one, proceed to step S407.
S407: Node = Node.left. The current node has a left subtree, which indicates that the object to be searched lies in the left subtree. The left child of the current node becomes the current node, and the process returns to step S401 to judge again, until either the number of nodes in the left subtree equals size, in which case the node is found and its information is output, or the final node has no left subtree, in which case the object to be searched is not found and a null value is output.
S408: Not found; null is returned. The object to be searched does not exist in the binary search tree; the system is informed that no node for the object was found, and a null value is output.
S409: Judge whether the current node lacks a right subtree. When size is larger than the number of nodes in the left subtree of the current node, judge whether the current node has a right subtree. If it has no right subtree, proceed to step S408; if it has one, proceed to step S410.
S410: Node = Node.right, and size is reduced by the current node and the number of nodes in its left subtree. The current node has a right subtree, which indicates that the object to be searched lies in the right subtree. size is reduced by 1 plus the number of nodes in the left subtree of the current node (size = size - 1 - node.left.size), the right child of the current node becomes the current node, and the process returns to step S401 to judge again, until either the number of nodes in the left subtree equals size, in which case the node is found and its information is output, or the final node has no right subtree, in which case the object to be searched is not found and a null value is output.
Querying object data by the rank of the object is similar to querying the object with its points as the key. Each query of the binary search tree records the access count of the node that is found, and the tree is adjusted according to these access counts, so that its structure is more reasonable, data retrieval and query are more efficient, the data can be reordered in real time, and the retrieval efficiency of the whole system is improved.
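Steps S400 to S410 can be sketched in Python as follows. The sketch assumes that size is initialized to root.size - rank, that is, to the number of keys smaller than the object when rank 1 denotes the largest key; this initialization is an assumption inferred from the ranking formulas. Node, size_of and find_by_rank are illustrative names, and the backtracking adjustment of step S403 is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    size: int = 1   # number of nodes in the subtree rooted here
    freq: int = 1   # access count
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def size_of(node: Optional[Node]) -> int:
    """Subtree size, treating a missing subtree as 0."""
    return node.size if node is not None else 0


def find_by_rank(root: Optional[Node], rank: int) -> Optional[Node]:
    """Return the node whose rank is rank (1 = largest key), or None."""
    if root is None or not 1 <= rank <= root.size:
        return None
    size = root.size - rank   # assumption: count of keys smaller than the target
    node = root
    while node is not None:
        left_size = size_of(node.left)
        if left_size == size:           # S401/S402: found
            node.freq += 1
            return node
        if size < left_size:            # S406/S407: target is in the left subtree
            node = node.left
        else:                           # S409/S410: target is in the right subtree
            size -= 1 + left_size
            node = node.right
    return None                         # S408: not found
```

The range check on rank is an addition of this sketch; the flow in the text reaches the "not found" step by running out of subtrees instead.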
Fig. 5 shows a flow diagram of a method of deleting a node according to another embodiment of the invention. The method specifically comprises the following steps:
S500: Search from the root node of the binary search tree to find the node Node where the key is located.
S501: Judge whether Node has both a left node and a right node and whether the freq of the left node is greater than that of the right node. If both exist and the freq of the left node is greater, proceed to step S502; otherwise, proceed to step S503.
S502: If both the left node and the right node exist and the freq of the left node is greater than that of the right node, Node is rotated to the left. After the rotation, the process returns to S501 to judge again.
In a specific embodiment, the left rotation is illustrated in Fig. 2. When a node is rotated left, the positions of the corresponding child nodes must be adjusted according to the traversal order of the original binary search tree, so that the traversal order stays consistent.
S503: Judge whether Node has a right node. If it has, proceed to step S504; if not, proceed to step S505.
S504: If Node has a right node, Node is rotated to the right. After the rotation, the process returns to S501 to judge again.
In a specific embodiment, the right rotation is illustrated in Fig. 2. When a node is rotated right, the positions of the corresponding child nodes must be adjusted according to the traversal order of the original binary search tree, so that the traversal order stays consistent.
S505: Delete Node, which is now a leaf node, then trace back step by step to the root node, reducing the size of each node on the path by 1.
Rotating left and right moves the node to be deleted down to a leaf position, where it is deleted. This does not affect the traversal order of the binary search tree as a whole. It also keeps the data of the tree up to date: node information that is no longer needed can be removed, which maintains the validity of the data in the tree.
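The deletion flow can be sketched in Python as follows. This is an illustrative sketch, not the implementation of the invention. To avoid ambiguity about rotation direction, the two rotations are named after the child they move up (promote_left, promote_right). The sketch also assumes that a node with only a left child has that child promoted; the steps above do not spell out this case. Sizes are recomputed on the way back to the root, which has the same effect as reducing each size on the path by 1.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    size: int = 1   # number of nodes in the subtree rooted here
    freq: int = 1   # access count
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def size_of(node: Optional[Node]) -> int:
    return node.size if node is not None else 0


def update(node: Node) -> None:
    node.size = 1 + size_of(node.left) + size_of(node.right)


def promote_left(node: Node) -> Node:
    """Rotate so that the left child takes the place of node."""
    child = node.left
    node.left, child.right = child.right, node
    update(node)
    update(child)
    return child


def promote_right(node: Node) -> Node:
    """Rotate so that the right child takes the place of node."""
    child = node.right
    node.right, child.left = child.left, node
    update(node)
    update(child)
    return child


def delete(root: Optional[Node], key: int) -> Optional[Node]:
    """Delete key from the subtree rooted at root and return the new subtree root."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        if root.left is None and root.right is None:
            return None                          # leaf reached: remove it
        if root.left is not None and (
            root.right is None or root.left.freq > root.right.freq
        ):
            root = promote_left(root)            # target is now root.right
            root.right = delete(root.right, key)
        else:
            root = promote_right(root)           # target is now root.left
            root.left = delete(root.left, key)
    update(root)                                 # sizes are fixed while tracing back
    return root


def inorder(node: Optional[Node]) -> List[int]:
    """Keys in ascending order; used to check that the traversal order is kept."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)
```

Promoting the child with the higher freq keeps frequently accessed nodes near the top while the deleted node sinks toward a leaf.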
A data processing method for a binary search tree is provided according to an embodiment of the present invention. Fig. 6 shows the flow of this method. As shown in Fig. 6, the method comprises the following steps:
S601: Add a node access count freq and a subtree size to each node of the binary search tree, alongside the left and right subtree pointers of the original binary search tree. Each node then has six attributes in total: the key, other information value, the total number of nodes size, the access count freq, the left subtree left, and the right subtree right.
S602: While searching the nodes of the binary search tree, add 1 to the access count of the node that is found. The search uses the key and starts from the root node of the binary search tree. The key of the current node is compared with the key to be searched; when they are equal, the current node is the node to be searched, and its access count is increased by 1.
S603: Judge whether to adjust the nodes of the binary search tree according to the access counts freq of the parent node and of its left or right subtree node. Using the access counts recorded in step S602, the access count of the current node, taken as the parent node, is compared with that of its left or right subtree node to judge whether node positions need to be adjusted.
S604: If the value of the current node is smaller than the value of the node to be searched, search the right subtree; if the access count of the parent node is smaller than that of the right subtree node, the positions of the two nodes are exchanged.
S605: If the value of the current node is larger than the value of the node to be searched, search the left subtree; if the access count of the parent node is smaller than that of the left subtree node, the positions of the two nodes are exchanged.
The access count is recorded in each node and used to adjust node positions: nodes with higher access counts move toward the root node. This improves the query efficiency for frequently queried nodes, so that the required information can be found more quickly during data retrieval. The data retrieval efficiency of the whole system is greatly improved, by a multiple compared with ordinary database retrieval; the coding is simple, and the method is suitable for a variety of application scenarios.
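Steps S601 to S605 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the implementation of the invention: the exchange of positions is realized here as a single rotation, which keeps the traversal order unchanged, and the names Node, promote_left, promote_right and search are illustrative.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class Node:
    # The six attributes listed in step S601.
    key: int
    value: Any = None
    size: int = 1
    freq: int = 1
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def size_of(node: Optional[Node]) -> int:
    return node.size if node is not None else 0


def update(node: Node) -> None:
    node.size = 1 + size_of(node.left) + size_of(node.right)


def promote_left(node: Node) -> Node:
    """Rotate so that the left child takes the place of node."""
    child = node.left
    node.left, child.right = child.right, node
    update(node)
    update(child)
    return child


def promote_right(node: Node) -> Node:
    """Rotate so that the right child takes the place of node."""
    child = node.right
    node.right, child.left = child.left, node
    update(node)
    update(child)
    return child


def search(root: Optional[Node], key: int) -> Tuple[Optional[Node], Optional[Node]]:
    """Return (new subtree root, found node or None).

    The found node gets freq + 1; while backtracking, a parent whose freq is
    smaller than that of the child on the search path exchanges places with it.
    """
    if root is None:
        return None, None
    if key == root.key:
        root.freq += 1
        return root, root
    if key > root.key:
        root.right, found = search(root.right, key)
        if found is not None and root.freq < root.right.freq:
            root = promote_right(root)
    else:
        root.left, found = search(root.left, key)
        if found is not None and root.freq < root.left.freq:
            root = promote_left(root)
    return root, found
```

A caller keeps the returned root, for example root, node = search(root, 40), because a frequently accessed node may replace the previous root.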
Embodiments of the present invention also relate to a computer-readable storage medium having stored thereon one or more computer programs which, when executed by a computer processor, implement the above method. The computer program comprises program code for performing the method illustrated in the flowcharts. It should be noted that the computer-readable medium of the present application can be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two.
Fig. 7 shows a data ranking retrieval system according to the present invention. The system includes a data recording unit 1, a data query judging unit 2, a ranking searching unit 3, and a node updating unit 4.
In a specific embodiment, the data recording unit 1 is configured to add a subtree size value and a node access count freq value to each node of the binary search tree and to record them. Recording size and freq provides the data basis for data retrieval and node updating in the binary search tree and ensures the accuracy of its data.
In a specific embodiment, the data query judging unit 2 is configured to judge the relationship between the key of the current node and the key to be searched. The unit compares the key to be searched with the keys of the nodes of the binary search tree to obtain the position of the key to be searched, so that the object to be searched can be located quickly and search efficiency is improved.
In a specific embodiment, the ranking searching unit 3 is configured to query the rank according to the key to be searched. If the key of the current node is equal to the key to be searched, the rank rankX of the key to be searched is equal to rank - node.left.size. If the key of the current node is smaller than the key to be searched, the search proceeds in the right subtree direction, and the rank of the minimum node of the right subtree of the current node is equal to rank - 1 - node.left.size; this continues until the key to be searched is found. If the key of the current node is larger than the key to be searched, the search proceeds in the left subtree direction and rank remains unchanged, until the key to be searched is found. The ranking searching unit 3 obtains the rank of the object to be searched directly by counting nodes, which improves the efficiency of data retrieval and also ensures the accuracy of the rank information.
In a specific embodiment, the node updating unit 4 is configured to move nodes with high access counts closer to the root node. If the access count of the current node is smaller than that of its right subtree node, the current node and the right subtree node exchange positions; if it is smaller than that of its left subtree node, the current node and the left subtree node exchange positions. When positions are exchanged according to the access counts, the corresponding child nodes of the left or right subtree are adjusted according to the traversal order of the original binary search tree, and the size of each affected node is updated. By adjusting the nodes of the binary search tree in this way, the node updating unit moves frequently accessed nodes as close to the root node as possible, which improves the query efficiency for frequently queried nodes and greatly improves the retrieval efficiency of the whole binary search tree.
According to the data processing and data ranking retrieval method and system of the binary search tree, two attributes, size and freq, are added to the nodes of the binary tree. The nodes are adjusted according to their access count freq, and nodes with a high freq are moved toward the root node, which greatly improves retrieval efficiency. At the same time, the rank of an object can be located quickly by its key. The nodes are adjusted after each retrieval finishes, so that the access counts and node positions are updated after every retrieval and the whole system stays organized for efficient retrieval.
It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present invention without departing from the spirit and scope of the invention. In this way, if these modifications and changes are within the scope of the claims of the present invention and their equivalents, the present invention is also intended to cover these modifications and changes. The word "comprising" does not exclude the presence of other elements or steps than those listed in a claim. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. Any reference signs in the claims shall not be construed as limiting the scope.

Claims (14)

1. A data ranking retrieval method is characterized by comprising the following steps:
S101: adding a subtree size and an access count freq to the binary search tree, initializing the rank of the key to be queried as rank = root.size, and setting the current node to the root node to start searching;
S102: judging the relationship between the key of the current node and the key to be searched;
S103: if the key of the current node is equal to the key to be searched, the rank rankX of the key to be searched is equal to rank - node.left.size, and the access count of the current node is increased by 1;
S104: if the key of the current node is smaller than the key to be searched, searching in the right subtree direction, and the rank is raised, that is, the rank value of the minimum node in the right subtree direction of the current node is rankR = rank - 1 - node.left.size;
S105: if the key of the current node is larger than the key to be searched, searching in the left subtree direction, the rank remaining unchanged until the search is completed, and if the access count of the current node is smaller than that of the left subtree node, exchanging the positions of the current node and the left subtree node;
wherein the value of the key represents the points of a user, node.left.size represents the number of nodes in the left subtree of the current node, and root.size represents the total number of nodes of the binary search tree.
2. The method of claim 1, wherein in step S104 and step S105, after the key to be searched is found, the method traces back step by step and uses the access counts to exchange the position of the current node with its left subtree node or right subtree node.
3. The method of claim 1, wherein in step S104 the rank value decreases as the current node changes, and the amount of the decrease is the number of nodes in the left subtree of the current node plus 1.
4. The method of claim 1, further comprising the steps of:
S106: repeating step S102 to step S105 until the key of the current node is equal to the key to be queried or the current node is a leaf node.
5. The method of claim 3, wherein when the positions of the current node and the left subtree node or the right subtree node are exchanged in steps S104 and S105, the child nodes of the current node and of the left subtree node or the right subtree node are adjusted according to the traversal order of the original binary search tree, and the size of each adjusted node is updated.
6. The method of claim 1, further comprising the steps of:
starting the search from the root node of the binary search tree, taking the root node as the current node, letting the object to be searched, whose rank is rank, have size nodes in its left subtree, initializing size = root.size - rank, and judging the relationship between the number of nodes in the left subtree of the current node and size;
if the number of nodes in the left subtree of the current node is equal to size, the information of the current node is the information of the object, and the access count of the current node is increased by 1;
if the number of nodes in the left subtree of the current node is smaller than size, the object lies in the right subtree of the current node, where its size is size - 1 - node.left.size, and when the access count of the current node is smaller than that of the right subtree node, the positions of the current node and the right subtree node are exchanged;
if the number of nodes in the left subtree of the current node is larger than size, the object lies in the left subtree, where size is unchanged, and when the access count of the current node is smaller than that of the left subtree node, the positions of the current node and the left subtree node are exchanged;
wherein size is the number of nodes in the left subtree of the object whose rank is rank.
7. The method for retrieving the data rank according to claim 1, further comprising a step of updating the nodes of the binary search tree, wherein the step of updating the nodes specifically comprises a step of adding the nodes and a step of deleting the nodes.
8. The method for retrieving the data rank according to claim 7, wherein the step of adding the node specifically includes: finding, by means of the key, a node that is larger than the key and has no left subtree node, or that is smaller than the key and has no right subtree node, creating a new node, tracing back step by step to the root node, and increasing size by 1.
9. The method for retrieving the data rank according to claim 7, wherein the step of deleting the node specifically includes: finding, by means of the key, the node to be deleted, rotating the node to be deleted down to a leaf position, deleting it, tracing back step by step to the root node, and reducing size by 1.
10. A data processing method using a binary search tree, for use in performing the data ranking retrieval method according to any one of claims 1-9, characterized by comprising the following steps:
S601: adding a node access count freq value to the nodes of the binary search tree;
S602: while searching the nodes of the binary search tree, adding 1 to the access count of the node that is found;
S603: judging whether to adjust the nodes of the binary search tree according to the access counts freq of the parent node and of its left subtree node or right subtree node;
S604: if the value of the current node is smaller than the value of the node to be searched, searching the right subtree, and if the access count of the parent node is smaller than that of the right subtree node, exchanging the positions of the two nodes;
S605: if the value of the current node is larger than the value of the node to be searched, searching the left subtree, and if the access count of the parent node is smaller than that of the left subtree node, exchanging the positions of the two nodes.
11. The method of claim 10, wherein the positions of the parent node and the left or right subtree node are exchanged according to the access counts, and the corresponding child nodes of the left or right subtree and of the parent node are adjusted according to the traversal order of the original binary tree.
12. The method of claim 10, further comprising repeating steps S603-S605 until the current node becomes a leaf node.
13. A computer-readable storage medium having one or more computer programs stored thereon, which when executed by a computer processor perform the method of any one of claims 1 to 9.
14. A data ranking retrieval system, the system comprising:
a data recording unit, configured to add a subtree size value and a node access count freq value to the nodes of the binary search tree and to record the subtree size value and the node access count freq value;
a data query judging unit, configured to judge the relationship between the key of the current node and the key to be searched;
a ranking searching unit, configured to query the rank according to the key to be searched, wherein if the key of the current node is equal to the key to be searched, the rank rankX of the key to be searched is equal to rank - node.left.size;
if the key of the current node is smaller than the key to be searched, the search proceeds in the right subtree direction, and the rank of the minimum node of the right subtree of the current node is equal to rank - 1 - node.left.size;
if the key of the current node is larger than the key to be searched, the search proceeds in the left subtree direction, and the rank remains unchanged until the key to be searched is found;
a node updating unit, configured to move nodes with high access counts closer to the root node;
when the access count of the current node is smaller than that of the right subtree node, the positions of the current node and the right subtree node are exchanged;
when the access count of the current node is smaller than that of the left subtree node, the positions of the current node and the left subtree node are exchanged;
and when the positions of the current node and the left or right subtree node are exchanged according to the access counts, the corresponding child nodes of the left or right subtree are adjusted according to the traversal order of the original binary search tree.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811613883.8A CN109815232B (en) 2018-12-27 2018-12-27 Method and system for retrieving and processing data ranking by using binary search tree

Publications (2)

Publication Number Publication Date
CN109815232A CN109815232A (en) 2019-05-28
CN109815232B true CN109815232B (en) 2022-03-18

Family

ID=66602551

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811613883.8A Active CN109815232B (en) 2018-12-27 2018-12-27 Method and system for retrieving and processing data ranking by using binary search tree

Country Status (1)

Country Link
CN (1) CN109815232B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110413228B (en) * 2019-07-09 2022-10-14 江苏芯盛智能科技有限公司 Mapping table management method and system, electronic equipment and storage medium
CN111367947A (en) * 2020-03-09 2020-07-03 北京奇艺世纪科技有限公司 Information retrieval method and device, electronic equipment and storage medium
CN113449003B (en) * 2021-07-07 2024-04-16 京东科技控股股份有限公司 Information query method, device, electronic equipment and medium
CN113535171B (en) * 2021-07-23 2024-03-08 上海米哈游璃月科技有限公司 Information searching method, device, equipment and storage medium
CN116107932B (en) * 2023-04-13 2023-07-11 摩尔线程智能科技(北京)有限责任公司 Data queue updating method and device, storage medium and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102315979A (en) * 2010-07-05 2012-01-11 国讯新创软件技术有限公司 Method and device for monitoring network flow
CN104036531A (en) * 2014-06-16 2014-09-10 西安交通大学 Information hiding method based on vector quantization and bintree
CN106294545A (en) * 2016-07-22 2017-01-04 中国农业银行股份有限公司 The access method of a kind of tree structure data and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8533129B2 (en) * 2008-09-16 2013-09-10 Yahoo! Inc. Efficient data layout techniques for fast machine learning-based document ranking
CN105512320B (en) * 2015-12-18 2019-03-01 北京金山安全软件有限公司 User ranking obtaining method and device and server

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"红黑树算法研究综述";马博韬等;《网络新媒体技术》;20180715;56-62页 *

Also Published As

Publication number Publication date
CN109815232A (en) 2019-05-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant