CN116910125B - Digital object distributed search method and device integrating distance and longest prefix - Google Patents

Publication number
CN116910125B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311167276.4A
Other languages
Chinese (zh)
Other versions
CN116910125A (en
Inventor
蔡华谦
黄罡
李影
景翔
张齐勋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471: Distributed queries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/955: Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566: URL specific, e.g. using aliases, detecting broken or misspelled links
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a digital object distributed search method and device integrating distance and longest prefix, and relates to the technical field of data search. The method comprises: issuing a search request through a coordination node; passing the search request to a smart contract layer through a contract interface; calling a search engine interface through the smart contract layer to pass the search request to a search engine; executing, by the search engine, a preset prefix distance scheduling algorithm to determine a target node list for the data search; and calling, by the search engine, the contract interface of each target node in the target node list to obtain the data corresponding to the search request, aggregating the obtained data at the coordination node, and returning the data through the coordination node to the terminal where the user is located. The method aims to reduce computing resource cost while keeping data search trustworthy and controllable, thereby improving data search efficiency.

Description

Digital object distributed search method and device integrating distance and longest prefix
Technical Field
The invention relates to the technical field of data search, in particular to a digital object distributed search method and device integrating distance and longest prefix.
Background
The current Internet is built on the TCP/IP protocol stack, which transmits, encodes, and decodes binary information across heterogeneous software and has given rise to a large number of network applications and services. However, TCP/IP provides only the underlying data interconnection; how data is encoded, decoded, and used on top of it is decided by upper-layer systems and applications. To achieve security and efficiency, upper-layer applications adopt different solutions for reliable data transmission. As a result, data can be interconnected within one system, but information islands form between systems built to different design standards, so the value of the data cannot be fully realized and its circulation and use are blocked.
To address these problems, namely data interconnection between heterogeneous platforms and the trustworthiness, manageability, and controllability of data, the Internet of Data, referred to below as digital networking, has emerged by borrowing the design ideas of the Internet. Unlike platform-based big data solutions, digital networking aims to interconnect data by means of protocols. It is a virtual network built on the Internet that takes data as its center and connects heterogeneous systems through standardized operation protocols. It enables interconnection, intercommunication, and interoperability of data that is heterogeneous in format, location, and ownership, and on that basis supports integrated, network-wide data applications on top of the Internet.
Search is the most direct way to extract value from massive data and is a critical part of designing and implementing a high-performance system. When data is persisted on a single machine, data structures such as B+ trees, skip lists, red-black trees, and tries (dictionary trees) allow it to be retrieved efficiently. In the platform stage the data volume is larger, but the growth is essentially in the scale of data within one system, so the original single-node architecture is generally replaced by a distributed master-slave architecture, which achieves higher concurrency and throughput while preserving data consistency and reliability.
In the digital networking scenario, the search requirement differs substantially from the single-platform big data distributed search described above. First, the overall architecture is different. Platform-based distributed search requires one powerful search platform that sets the system's standards uniformly and maintains the data index, with multiple machines jointly implementing that function. In digital networking, each data domain is independent and heterogeneous, with its own standards and performance, and the key problem is to interconnect heterogeneous data by means of protocols. Data search in this scenario therefore has to combine the key technologies of digital networking with traditional distributed search engine technology to realize protocol-based distributed search, and efficient search is receiving more and more attention in digital networking.
Disclosure of Invention
In view of this, the present invention provides a digital object distributed search method that fuses distance and longest prefix. The method aims to reduce the cost of computing resources while guaranteeing the credibility and controllability of data searching so as to improve the data searching efficiency.
In a first aspect of an embodiment of the present invention, there is provided a digital object distributed search method for fusing distance and longest prefix, the method including:
issuing a search request through a coordination node;
passing the search request to a smart contract layer through a contract interface;
calling a search engine interface through the smart contract layer to pass the search request to a search engine;
executing, by the search engine, a preset prefix distance scheduling algorithm to determine a target node list for the data search;
and calling, by the search engine, the contract interface of each target node in the target node list to obtain the data corresponding to the search request, aggregating the obtained data at the coordination node, and returning the data through the coordination node to the terminal where the user is located.
Optionally, the method further comprises:
connecting a node to be accessed to the digital networking network through an identification resolution system, according to node information of the node to be accessed;
entering the node information of the node to be accessed into the parent node of the node to be accessed;
sending, by the parent node and according to the node information it received, a digital object data acquisition request to the node to be accessed;
returning, by the node to be accessed and according to the acquisition request, the digital object data corresponding to the acquisition request to the parent node;
and, by the parent node and according to the acquired digital object data, indexing all metadata, maintaining the configuration information of the node to be accessed, synchronizing the acquired digital object data to the parent node's own upper node, and updating the parent node's statistical data.
Optionally, the determining, by the search engine executing a preset prefix distance scheduling algorithm, a target node list for performing data search includes:
step S41: determining an initial target node set for the data search according to the search request;
step S42: determining the prefix distances between the nodes in the initial target node set;
step S43: when the prefix distance between two nodes is 0, selecting, of the two nodes, the node that is their longest common prefix;
step S44: when the prefix distance between a plurality of nodes is 1, selecting the parent node of the plurality of nodes;
step S45: determining the prefix distances between the selected nodes; when these include a prefix distance of 0 and/or 1, returning to step S43; and when the prefix distances between the selected nodes are all greater than 1, determining the selected nodes as target nodes to obtain the target node list.
Optionally, the method further comprises:
and determining the root node as the target node under the condition that the number of the nodes in the initial target node set exceeds the preset proportion of the total number of the networking nodes.
Optionally, the method further comprises:
each node in the digital network persists its own digital object target files, wherein the digital object target files comprise: sequentially stored digital object metadata files; sequentially stored files of the digital object repositories maintained locally; full-text index files for the digital objects; files recording the number of digital objects in each digital object repository; and files persisting the repository update events monitored by the local registry.
Optionally, before the node to be accessed is accessed to the digital networking network through the identification resolution system, the method further comprises: and configuring parameter sets of node attributes and state attributes of the node to be accessed.
A second aspect of the present invention provides a digital object distributed search apparatus fusing distance and longest prefix, the apparatus comprising:
a front-end user interface, configured to issue a search request through a coordination node;
a contract interface, configured to pass the search request to a smart contract layer;
a search engine interface, configured to be called by the smart contract layer and to pass the search request to a search engine;
the search engine, configured to execute a preset prefix distance scheduling algorithm to determine a target node list for the data search, call the contract interface of each target node in the target node list to obtain the data corresponding to the search request, aggregate the obtained data at the coordination node, and return the data through the coordination node to the terminal where the user is located.
Compared with the prior art, the invention has the following advantages:
In the digital object distributed search method integrating distance and longest prefix provided by the invention, a user first enters, at a local terminal, the URL of a registry node in the digital network and thereby opens the user interface of that registry node (the coordination node). Through this user interface the user enters a search request that includes the data to be searched and the registry nodes to be searched, and the coordination node issues the request. The search request is passed to the smart contract layer through the smart contract interface of the coordination node, so the data search runs through the smart contract layer, which ensures that the search is trustworthy and controllable. The smart contract layer then calls the search engine interface, through which the search request reaches the search engine of the coordination node. That search engine executes the preset prefix distance scheduling algorithm and selects, from the registry nodes the user asked to search, a target node list for the data search, which reduces the computing resource cost of the search and improves its efficiency. Once the target node list is determined, the search engine of the coordination node calls the contract interface of each target node, which leads into that node's smart contract layer; the smart contract layer calls the node's search engine interface, and the node's search engine searches the metadata the node maintains for the data the user wants (the data corresponding to the search request). The results are returned to the coordination node, aggregated there, and returned to the local terminal where the user is located. In this way the computing resource cost is reduced while the data search remains trustworthy and controllable, so data search efficiency is improved.
The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features, and advantages of the invention are more readily apparent, specific embodiments are described below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flowchart of a method for distributed searching of digital objects that merges distance and longest prefix according to an embodiment of the present invention;
FIG. 2 is an exemplary schematic diagram of a preset prefix distance scheduling algorithm in a digital object distributed search method of merging distance and longest prefix according to an embodiment of the present invention;
FIG. 3 is a flowchart of a prefix distance algorithm in a digital object distributed search method that merges distance and longest prefix according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a digital object distributed search device that merges distance and longest prefix according to an embodiment of the present invention;
FIG. 5 is a general architecture diagram of a digital object distributed search device that fuses distance and longest prefix according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a digital object distributed search method for merging distance and longest prefix, which is provided in an embodiment of the present invention, and as shown in fig. 1, the method includes:
step S11: issuing a search request through a coordination node;
step S12: passing the search request to a smart contract layer through a contract interface;
step S13: calling a search engine interface through the smart contract layer to pass the search request to a search engine;
step S14: executing, by the search engine, a preset prefix distance scheduling algorithm to determine a target node list for the data search;
step S15: calling, by the search engine, the contract interface of each target node in the target node list to obtain the data corresponding to the search request, aggregating the obtained data at the coordination node, and returning the data through the coordination node to the terminal where the user is located.
In the embodiments of the invention, a node represents a registry in the digital network, and the data platform behind it may be a search engine, a database, a data center, an Internet of Things device, and so on. Each node comprises a user interface layer, a smart contract layer, a search engine layer, and a data persistence layer.
A user first enters the URL of a node in the digital network at a local terminal, which opens the user interface of the node that the URL identifies, namely the coordination node. The user enters a search request in this user interface, and the coordination node issues a search request based on that input. The search request includes the data the user wants to search for (for example, keyword information) and the list of nodes the user wants to search.
Illustratively, the user enters the search keyword abdesf in the user interface and specifies that registry node A, registry node B, registry node C, registry node D, registry node E, and registry node F in the digital network are to be searched for the keyword abdesf.
Digital networking is a software-defined approach: it takes data as its center and connects heterogeneous systems through standardized operation protocols to form a virtual network built on the Internet, enabling interconnection, intercommunication, and interoperability of data that is heterogeneous in format, location, and ownership, and on that basis supporting integrated, network-wide data applications on top of the Internet. Specifically, the search request issued by the coordination node is passed to the smart contract layer of the coordination node through the node's smart contract interface, and the smart contract layer calls the search engine interface to pass the search request to the search engine of the coordination node. The smart contract layer thus records the data search and the calls made during it, which ensures the accuracy of the data and the trustworthiness and controllability of operations on it.
After the search engine of the coordination node receives the user's search request, it executes the preset prefix distance scheduling algorithm and selects, from the list of nodes the user wants to search (carried in the search request), a subset of optimal nodes on which to run the search. This reduces computing resource cost and improves data search efficiency.
Once the target node list is determined, the search engine of the coordination node calls the contract interface of each target node in the list, which leads into that target node's smart contract layer. The smart contract layer calls the target node's search engine interface, and the target node's search engine searches the metadata that node maintains for the data the user wants (the data corresponding to the search request). The results are returned to the coordination node, aggregated there, and returned to the local terminal where the user is located.
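The flow of steps S11 to S15 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class and function names (`SearchRequest`, `RegistryNode`, `coordinated_search`), the in-memory `network` dictionary, the substring match, and the pass-through scheduler are all hypothetical stand-ins for components the description only names.

```python
from dataclasses import dataclass, field


@dataclass
class SearchRequest:
    keyword: str          # what the user wants to find
    node_ids: list        # registry nodes the user asked to search


@dataclass
class RegistryNode:
    node_id: str
    metadata: list = field(default_factory=list)   # metadata this node maintains
    call_log: list = field(default_factory=list)   # calls recorded by the contract layer

    def contract_search(self, keyword):
        """Contract interface: record the call, then invoke the local search engine."""
        self.call_log.append(("search", keyword))
        return self.local_search(keyword)

    def local_search(self, keyword):
        """Search engine: naive substring match over the node's own metadata."""
        return [m for m in self.metadata if keyword in m]


def coordinated_search(coordinator, request, network, schedule):
    """Run one search from the coordination node and return the aggregated hits."""
    # S12/S13: the request enters the coordinator's contract layer first.
    coordinator.call_log.append(("coordinate", request.keyword))
    # S14: the scheduling algorithm reduces the requested nodes to a target list.
    targets = schedule(request.node_ids)
    # S15: call each target's contract interface and aggregate the results.
    hits = []
    for node_id in targets:
        hits.extend(network[node_id].contract_search(request.keyword))
    return hits


network = {
    "A": RegistryNode("A", ["abdesf report", "other"]),
    "B": RegistryNode("B", ["abdesf dataset"]),
}
request = SearchRequest(keyword="abdesf", node_ids=["A", "B"])
# A pass-through scheduler stands in for the prefix distance scheduling algorithm.
results = coordinated_search(network["A"], request, network, schedule=lambda ids: list(ids))
print(results)
```

Passing the scheduler in as a parameter keeps the sketch independent of how the target node list is actually computed; every call still goes through `contract_search`, mirroring the requirement that searches pass through the contract layer.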
In the present invention, the method further comprises: connecting a node to be accessed to the digital networking network through an identification resolution system, according to node information of the node to be accessed; entering the node information of the node to be accessed into the parent node of the node to be accessed; sending, by the parent node and according to the node information it received, a digital object data acquisition request to the node to be accessed; returning, by the node to be accessed and according to the acquisition request, the digital object data corresponding to the acquisition request to the parent node; and, by the parent node and according to the acquired digital object data, indexing all metadata, maintaining the configuration information of the node to be accessed, synchronizing the acquired digital object data to the parent node's own upper node, and updating the parent node's statistical data.
In the embodiments of the invention, a registry node that conforms to the specification can be connected to the digital network through the identification resolution system, and every node connected to the digital network must synchronize its data to its upper node. Specifically, the node information of the node to be accessed includes local identification resolution information. The identification resolution system resolves this node information, and the node is connected to the digital networking network based on the local identifier obtained from the resolution. After the node has been connected, its node information, which includes its name, address, and so on, is entered into its parent node. On receiving the node information, the parent node sends a digital object data acquisition request to the node to be accessed, asking for the information about the digital objects in that node, so that the node's digital object data is synchronized to the parent node. In response to the request, the node to be accessed returns the corresponding digital object data to its parent node. Based on the digital object data obtained, the parent node builds an index over all of its metadata, which comprises the digital object data synchronized from the node to be accessed and the metadata the parent node itself maintains.
The parent node also maintains the configuration information of the node to be accessed, so that when a digital object update event occurs on that node, the parent node can synchronize the node's digital object data again. In addition, the parent node synchronizes the acquired digital object data to its own upper node, so that every upper node of the node to be accessed receives the node's digital object data. The parent node further updates its digital object statistics according to the statistics contained in the received digital object data. The digital object data acquisition request comprises: the request event name, a request for the digital object data of the node to be accessed, and a request to synchronize all metadata of the node to be accessed.
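The access and synchronization procedure above can be illustrated with a small in-memory Python sketch. It rests on several assumptions that are not in the source: the `Registry` class and its method names are hypothetical, the acquisition request is modelled as a direct method call rather than a network message, metadata is plain text indexed by whitespace-separated terms, and the only statistic kept is an object count.

```python
class Registry:
    """Toy registry node; all names here are illustrative."""

    def __init__(self, prefix, parent=None):
        self.prefix = prefix
        self.parent = parent      # upper node, or None at the top
        self.own = {}             # object id -> metadata text kept locally
        self.synced = {}          # child prefix -> {object id -> metadata text}
        self.child_config = {}    # child prefix -> node information (name, address, ...)
        self.index = {}           # term -> set of object ids
        self.object_count = 0     # simple statistic

    def all_metadata(self):
        """Own metadata plus everything synchronized from lower nodes."""
        merged = dict(self.own)
        for objects in self.synced.values():
            merged.update(objects)
        return merged

    def handle_acquisition_request(self):
        """Child side: return the digital object data the parent asked for."""
        return self.all_metadata()

    def accept_child(self, child, node_info):
        """Parent side: record the child's node information, then pull its data."""
        self.child_config[child.prefix] = node_info
        self.sync_from(child)

    def sync_from(self, child):
        """Pull the child's data, re-index, update statistics, and pass it upward."""
        self.synced[child.prefix] = child.handle_acquisition_request()
        metadata = self.all_metadata()
        self.index = {}
        for object_id, text in metadata.items():
            for term in text.split():
                self.index.setdefault(term, set()).add(object_id)
        self.object_count = len(metadata)
        if self.parent is not None:
            self.parent.sync_from(self)


root = Registry("86")
mid = Registry("86.1000", parent=root)
leaf = Registry("86.1000.21", parent=mid)
leaf.own["do-1"] = "climate dataset"
mid.accept_child(leaf, {"name": "leaf registry", "address": "10.0.0.21"})
print(mid.index["climate"], root.object_count)
```

Because the parent keeps the child's entry in `child_config`, a later update event on the child could be handled by calling `sync_from(child)` again, which re-indexes and propagates the change to every upper node.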
In an embodiment of the invention, each registry node has a prefix identifier that is divided into several levels, with adjacent levels separated by a ".". The parent node of the node to be accessed is determined from the other nodes in the digital network according to the prefix identifier of the node to be accessed. Illustratively, if the prefix of the node to be accessed is 86.1000.21, the prefix of its parent node is 86.1000, and the parent node is determined based on that prefix.
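Deriving the parent prefix amounts to dropping the last level of the identifier. A minimal Python sketch follows, assuming "." is the level separator and that a single-level prefix has no parent (the source does not say how top-level prefixes are handled); the function name `parent_prefix` is illustrative.

```python
def parent_prefix(prefix, sep="."):
    """Return the prefix of the parent node, or None for a single-level prefix."""
    levels = prefix.split(sep)
    if len(levels) < 2:
        return None               # assumption: a top-level prefix has no parent
    return sep.join(levels[:-1])


print(parent_prefix("86.1000.21"))
```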
In the present invention, the determining, by the search engine executing a preset prefix distance scheduling algorithm, a target node list for performing data search includes:
step S41: determining an initial target node set for the data search according to the search request;
step S42: determining the prefix distances between the nodes in the initial target node set;
step S43: when the prefix distance between two nodes is 0, selecting, of the two nodes, the node that is their longest common prefix;
step S44: when the prefix distance between a plurality of nodes is 1, selecting the parent node of the plurality of nodes;
step S45: determining the prefix distances between the selected nodes; when these include a prefix distance of 0 and/or 1, returning to step S43; and when the prefix distances between the selected nodes are all greater than 1, determining the selected nodes as target nodes to obtain the target node list.
In the embodiments of the invention, the search request includes the list of nodes the user wants to search. The search engine parses the search request to obtain that list, and all nodes in the list form the initial target node set for the data search. Once the initial target node set is obtained, the prefix distance between nodes is determined from their prefix identifiers.
When the prefix distance between two nodes is 0, the two nodes are in a subordinate (ancestor-descendant) relationship, and the node that is the longest common prefix of the two is selected to take part in the subsequent determination of target nodes. As shown in FIG. 2, node 1.1.1.1 is subordinate to node 1.1.1, so when the initial target node set includes both node 1.1.1.1 and node 1.1.1, node 1.1.1, their longest common prefix, is selected to take part in the subsequent determination. When the initial target node set includes nodes 1.1.1.1, 1.1.1, and 1.1, all three are in a subordinate relationship, so node 1.1, their longest common prefix, is selected.
When the prefix distance between nodes is 1, the parent node of those nodes takes part in the subsequent determination of target nodes. As shown in FIG. 2, the prefix identifiers of node 1.1.1.1 and node 1.1.1.2 differ only in the last level, so their prefix distance is 1; when the initial target node set includes node 1.1.1.1 and node 1.1.1.2, their parent node 1.1.1 takes part in the subsequent determination. For nodes whose prefix distance is 1 (here node 1.1.1.1 and node 1.1.1.2), the parent node (node 1.1.1) is also their longest common prefix node.
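The description gives examples of prefix distance but no formula. The Python sketch below uses one assumed definition that is consistent with those examples: the distance is 0 when one prefix identifier is an ancestor of the other, and otherwise it is the number of levels by which the longer identifier extends past the longest common prefix. The function names are illustrative, and other definitions may fit the source equally well.

```python
def levels(prefix):
    """Split a prefix identifier such as '1.1.1.2' into its levels."""
    return prefix.split(".")


def common_levels(a, b):
    """Count the leading levels that two prefix identifiers share."""
    count = 0
    for x, y in zip(levels(a), levels(b)):
        if x != y:
            break
        count += 1
    return count


def prefix_distance(a, b):
    """Assumed definition, not taken verbatim from the source."""
    la, lb = len(levels(a)), len(levels(b))
    shared = common_levels(a, b)
    if shared == min(la, lb):
        return 0          # one node is an ancestor of the other (subordinate)
    return max(la, lb) - shared


print(prefix_distance("1.1.1.1", "1.1.1"))    # subordinate pair
print(prefix_distance("1.1.1.1", "1.1.1.2"))  # siblings differing in the last level
print(prefix_distance("1.1", "1.3.1"))        # separate branches
```

Under this definition a distance of 1 can only occur between two identifiers of equal length that share every level but the last, so their parent node and their longest common prefix coincide.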
For example, as shown in fig. 2, suppose the initial target node set includes node 1.1.1.1, node 1.1.1.2, node 1.1.2.1, node 1.1.2.2, node 1.3.1.1, node 1.3.1, node 1.99.2.1 and node 1.99.2.2. Using the embodiment of step S14, the target node list is determined as follows: the prefix distance between node 1.1.1.1 and node 1.1.1.2 is 1, so their parent node 1.1.1 is selected to participate in the subsequent determination of target nodes; the prefix distance between node 1.1.2.1 and node 1.1.2.2 is 1, so their parent node 1.1.2 is selected; the prefix distance between node 1.3.1.1 and node 1.3.1 is 0, meaning the two are in a subordinate relationship, so node 1.3.1, their longest common prefix, is selected; the prefix distance between node 1.99.2.1 and node 1.99.2.2 is 1, so their parent node 1.99.2 is selected. The selected nodes are now node 1.1.1, node 1.1.2, node 1.3.1 and node 1.99.2. Among them, node 1.1.1 and node 1.1.2 have a prefix distance of 1, so the process returns to step S43 and continues to select nodes from those with prefix distances of 0 or 1. Because the prefix distance between node 1.1.1 and node 1.1.2 is 1, their parent node 1.1 is selected to participate in the subsequent determination of target nodes.
The nodes determined at this point are node 1.1, node 1.3.1 and node 1.99.2. The pairwise prefix distances among these three nodes are all greater than 1, so no further merging is possible. All three nodes are therefore determined to be target nodes and form the target node list used for the subsequent data search.
In the embodiment of the present invention, the algorithm for calculating the prefix distance between two nodes is shown in fig. 3, where prefix1 and prefix2 denote the prefix strings of the two nodes whose prefix distance is calculated. Lines 1 and 2 of the algorithm compare the lengths of the two prefix strings, assigning the shorter one to prefix1 and the longer one to prefix2. Lines 3 to 6 determine whether the longer prefix string prefix2 contains the shorter prefix string prefix1; if it does, the two nodes are in a subordinate relationship and the distance between them is 0. Lines 7 to 9 first split prefix1 and prefix2 on the symbol "." to obtain two arrays, prefix1Array and prefix2Array, and initialize the distance to 0. Lines 10 and 11 compare the values at each level position of the two arrays with an exclusive-OR style operation (0 if the values are the same, 1 if they differ); the resulting bit string, read as a binary number, is the distance between the two nodes.
For example, for node 1.1.1.1 and node 1.3.1.1, the two prefix strings have the same length, so they do not need to be reordered, and neither contains the other, so there is no subordinate relationship. The algorithm therefore jumps directly to line 7 and splits the prefix strings of the two nodes on the symbol "." to obtain two arrays, (1,1,1,1) and (1,3,1,1). The 1st elements of the two arrays are both 1, so the comparison yields 0; the 2nd elements are 1 and 3, which differ, so the comparison yields 1; the 3rd elements are both 1, so the comparison yields 0; the 4th elements are both 1, so the comparison yields 0. The result is binary 0100, which converts to decimal 4, so the distance between node 1.1.1.1 and node 1.3.1.1 is 4.
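The prefix distance calculation described above can be illustrated with a short Python sketch. It is not the algorithm of fig. 3 itself, only a reading of the description: each level contributes one bit (0 when the two values are equal, 1 when they differ) and the bits are read as a binary number. Two points are assumptions not stated in the text: the containment test is done level by level rather than on raw strings, and a level missing from the shorter prefix is counted as different.

```python
from itertools import zip_longest


def prefix_distance(prefix1: str, prefix2: str) -> int:
    """Prefix distance between two dotted node prefixes such as "1.3.1.1"."""
    a, b = prefix1.split("."), prefix2.split(".")
    # Make `a` the shorter prefix, as in the first two lines described above.
    if len(a) > len(b):
        a, b = b, a
    # A subordinate (ancestor/descendant) pair has distance 0.
    # Assumption: containment is checked level by level, so "1.1" is not
    # treated as an ancestor of "1.10".
    if b[: len(a)] == a:
        return 0
    # One bit per level: 0 if the values are equal, 1 if they differ.
    # The top level is the highest bit of the resulting binary number.
    # Assumption: a level missing from the shorter prefix counts as different.
    distance = 0
    for x, y in zip_longest(a, b):
        distance = (distance << 1) | int(x != y)
    return distance


print(prefix_distance("1.1.1.1", "1.3.1.1"))  # 4 (binary 0100)
print(prefix_distance("1.1.1.1", "1.1.1.2"))  # 1 (binary 0001)
print(prefix_distance("1.3.1.1", "1.3.1"))    # 0 (subordinate relationship)
```

Under the padding assumption, node 1.1 and node 1.3.1 have distance 3, which agrees with the statement that the final nodes of the worked example are all more than 1 apart.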
In the present invention, the method further comprises: determining the root node as the target node when the number of nodes in the initial target node set exceeds a preset proportion of the total number of digital networking nodes.
In the embodiment of the invention, when a user searches for data, the node list to be searched may contain too many nodes and cover too wide a range. For example, the number of nodes in the node list to be searched (that is, the initial target node set) may exceed half of the total number of nodes in the whole digital networking network. If the target node list were determined with the preset prefix distance scheduling algorithm provided by the invention, the result would likely converge to the root node of the digital networking network anyway, and running the algorithm over so many nodes would consume considerable computing resources. To avoid this, the invention sets a preset proportion: when the number of nodes in the node list to be searched (that is, the initial target node set) exceeds the preset proportion of the total number of digital networking nodes, the root node of the digital networking network is directly determined as the target node for the data search. The preset proportion may be set according to the actual application scenario and is not limited herein; for example, it may be set to 40%, 60% or 70% of the total number of nodes in the digital networking network.
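The root-node shortcut described above amounts to a single threshold check before the scheduling algorithm runs. The following Python sketch illustrates it; the function name, the default proportion of 0.6 and the root identifier "1" are assumptions chosen for illustration and do not come from the text.

```python
def pick_root_if_too_broad(initial_nodes, total_nodes, preset_ratio=0.6, root="1"):
    """Return [root] when the request covers too many nodes, otherwise None.

    `preset_ratio` and `root` are hypothetical defaults; the text leaves the
    proportion to the application scenario.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    # "Exceeds" is read here as strictly greater than the preset proportion.
    if len(set(initial_nodes)) > preset_ratio * total_nodes:
        return [root]
    return None  # caller falls back to the prefix distance scheduling algorithm


requested = ["1.1", "1.2", "1.3", "1.4", "1.5", "1.6", "1.7"]
print(pick_root_if_too_broad(requested, total_nodes=10))      # ['1']
print(pick_root_if_too_broad(requested[:3], total_nodes=10))  # None
```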
In the present invention, the method further comprises: each node in the digital networking network persists its own digital object target files, where the digital object target files include: sequentially stored digital object metadata files; sequentially stored files of the locally maintained digital object warehouses; full-text index files for the digital objects; files recording the number of digital objects in each digital object warehouse; and files persisting the warehouse update events monitored by the local registry.
In an embodiment of the invention, the function of a registry node is to provide search over digital objects, which requires persisting information about those digital objects. Each registry node in the digital networking network therefore persists its own digital object target files. As shown in table 1, the digital object target files include: sequentially stored digital object metadata files; sequentially stored files of the locally maintained digital object warehouses; full-text index files for the digital objects; files recording the number of digital objects in each digital object warehouse; and files persisting the warehouse update events monitored by the local registry. Table 1 is as follows:
In the present invention, before the node to be accessed is accessed to the digital networking network through the identification analysis system, the method further comprises: configuring parameter sets for the node attributes and state attributes of the node to be accessed.
In the embodiment of the invention, to support access and authority control after network access, each registry node configures a parameter set of node attributes at startup; the configured node attribute parameter set is shown in table 2. In addition, each registry node maintains digital object warehouses that share its prefix identifier, so after startup the registry node applies to register a prefix in its local identification analysis system, which is then used for the analysis service of the digital objects under that registry node. Each registry also needs to provide statistics on digital object usage to its upper node and therefore needs configuration information for some initial state attributes, which is likewise configured when the registry node starts. As shown in table 3, the configured state attribute parameter set mainly includes the number of digital objects, the number of users, the number of accesses and similar information. Tables 2 and 3 are as follows:
In the embodiment of the invention, data in the digital networking network exists in the form of digital objects. Each digital object has corresponding metadata that includes at least an identifier and a name. The identifier doId uniquely identifies the digital object, so that the digital object can be quickly found in the whole network by its identifier; other fields can be dynamically extended and maintained according to different system designs. The metadata further includes: the digital object type, according to which the digital object warehouse provides different access interfaces and encoding modes; the digital object description, used for content search by the search engine; the digital object format schema, used to implement format conversion for different types of digital objects; the digital object version number version, used to keep different registries and warehouses consistent when the same digital object is updated; and the attribute enableIndex, which indicates whether the digital object is indexed. Taking data management logic into account, enableIndex may be set to false for digital objects that a node does not wish to have indexed for whole-network fuzzy search. Table 4 below gives an exemplary representation of the digital object fields:
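The metadata fields described above can be sketched as a small Python data structure. The names doId, schema, version and enableIndex are taken from the description; the field names name, type, description and extra, all default values, and the example identifiers are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DigitalObjectMetadata:
    doId: str                  # unique identifier of the digital object
    name: str                  # required together with doId
    type: str = ""             # selects access interface and encoding (assumed name)
    description: str = ""      # used for content search (assumed name)
    schema: str = ""           # format description used for format conversion
    version: int = 1           # consistency across registries and warehouses
    enableIndex: bool = True   # False keeps the object out of the index
    extra: Dict[str, str] = field(default_factory=dict)  # dynamically extended fields


def indexable(objects: List[DigitalObjectMetadata]) -> List[DigitalObjectMetadata]:
    """Keep only the objects that may be added to the full-text index."""
    return [obj for obj in objects if obj.enableIndex]


objects = [
    DigitalObjectMetadata(doId="1.1.1/report-a", name="report-a"),
    DigitalObjectMetadata(doId="1.1.1/draft-b", name="draft-b", enableIndex=False),
]
print([obj.doId for obj in indexable(objects)])  # ['1.1.1/report-a']
```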
In the embodiment of the invention, because the invention introduces an intelligent contract layer to ensure that data search is trustworthy and controllable, the digital object warehouse also needs to provide: persistence of the complete data of digital objects; synchronization of metadata to a registry; and access to digital objects through the DOIP protocol. The digital object warehouse therefore needs to be configured accordingly, and the design of its configuration information is shown in table 5:
In an embodiment of the present invention, the registry record of a node to be accessed, maintained by the parent node of that node, includes: the registry id of the node to be accessed (the unique identifier of the registry), the registry name (used for remotely calling functions of the node to be accessed), the address (used for remote calls when synchronizing data), the update event (the parent node subscribes to this event in order to synchronize the data of the node to be accessed), and the total number of digital objects (used for incremental synchronization on update). The registry record of the upper node maintained by the parent node of the node to be accessed includes: the registry id, registry name and address of the upper node. It should be understood that every registry can be searched by the search engine in the digital object distributed search method fusing distance and longest prefix provided by the invention. The registry record of a node to be accessed maintained by its parent node is therefore the registry record that each parent node in the digital networking network maintains for each of its child nodes, since each child node was, at some moment, a node to be accessed that then joined the digital networking network.
A second aspect of the present invention provides a digital object distributed search apparatus for merging distance and longest prefix, as shown in fig. 4, the apparatus 400 includes:
a front-end user interface 401 for issuing a search request through the coordinator node;
a contract interface 402, configured to access the search request to an intelligent contract layer;
a search engine interface 403, configured to invoke a search engine interface through the intelligent contract layer, and access the search request to a search engine;
the search engine 404 is configured to execute a preset prefix distance scheduling algorithm, determine a target node list for searching data, and call respective contract interfaces of each target node in the target node list to obtain data corresponding to the search request, and aggregate the obtained data at the coordination node, and return the data to a terminal where the user is located through the coordination node.
In the embodiment of the present invention, the front-end user interface 401 is configured to send a search request through the coordination node; it is a Vue-based presentation interface that implements the search function for visual access by the user. The contract interface 402 is configured to access the search request to the intelligent contract layer; specifically, through the contract engine SDK it provides HTTP and WebSocket interfaces as well as Java and JS call interfaces, so that users on different platforms can use the intelligent contracts. The intelligent contract layer is implemented on the North Dacron intelligent contract engine and is used to control user authority (for example, an ordinary user can only add and delete digital object warehouses, access the digital networking network and so on) and to upload call results. The search engine interface 403 is configured to invoke the search engine interface through the intelligent contract layer and access the search request to the search engine. The search engine is implemented in Java; through encapsulation in a SearchEngine class it exposes the digital object index and the search engine interface externally, and it connects to the DOIP protocol interface. Data persistence on the nodes relies on Lucene and RocksDB, which implement the full-text index and the sequential storage of digital object metadata, respectively.
In the embodiment of the invention, the digital object distributed search device fusing distance and longest prefix further comprises a gateway system, which provides access, statistics and management functions for digital object warehouses and is logically equivalent to one large digital object warehouse. The device also comprises an identification analysis system, specifically a global identification analysis system (GRS, Global Resolution System) and a local identification analysis system (LRS, Local Resolution System). The main function of the LRS is to register an identifier for each resource (participant, registry, warehouse, digital object, etc.) in the digital networking network and to provide an analysis service. The main function of the GRS is to assign identifiers (prefixes) to the LRSs and to provide analysis for them. The device further comprises a registry system, to which the front-end user interface 401 and the search engine 404 belong, and which is responsible for storing and managing the metadata of digital objects and providing search services. Each digital networking node has a registry system and an LRS. The prefix of a registry has the format "x.y.z", and the registry maintains all digital object identifiers that begin with that prefix. Fig. 5 shows the overall architecture of the digital object distributed search device fusing distance and longest prefix provided by the invention, which includes: digital objects; a digital object warehouse for persisting digital objects; a gateway system (gateway) for connecting the digital object warehouse to the identification analysis system so that the warehouse can participate in subsequent data searches; a local identification analysis system (LRS) for registering an identifier for each resource (participant, registry, warehouse, digital object, etc.) in the digital networking network and providing an analysis service; a global identification analysis system (GRS) for assigning identifiers (prefixes) to the LRSs and providing analysis for them; and a registry system (registry) for storing and managing the metadata of digital objects and providing search services. Each node in the digital networking network comprises its own registry system and local identification analysis system LRS.
Optionally, the apparatus 400 further includes:
the digital networking access module is used for accessing the node to be accessed into the digital networking network through the identification analysis system according to the node information of the node to be accessed;
the node information sending module is used for inputting the node information of the node to be accessed into the father node of the node to be accessed;
the acquisition request sending module is used for sending a digital object data acquisition request to the node to be accessed according to the node information received by the father node;
the digital object data return module is used for returning the digital object data corresponding to the acquisition request to the father node by the node to be accessed according to the acquisition request;
and the metadata index synchronization module is used for indexing all metadata according to the acquired digital object data, maintaining the configuration information of the node to be accessed, synchronizing the acquired digital object data with the upper node of the node and updating the statistical data of the node.
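The interaction among the modules listed above (a node joins, its parent requests its digital object data, indexes the metadata, records the child and synchronizes upward) can be sketched in memory as follows. This is only an illustrative model: the class and method names, the shape of the registry record and the example doId are assumptions, and the registration through the identification analysis system and any network transport are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RegistryNode:
    prefix: str
    objects: Dict[str, dict] = field(default_factory=dict)   # own data, keyed by doId
    index: Dict[str, dict] = field(default_factory=dict)     # metadata indexed from descendants
    children: Dict[str, dict] = field(default_factory=dict)  # registry records of child nodes
    parent: Optional["RegistryNode"] = None

    def get_digital_objects(self) -> Dict[str, dict]:
        """Answer a parent's acquisition request with everything known locally."""
        return {**self.index, **self.objects}

    def accept_child(self, child: "RegistryNode") -> None:
        """Parent side of the access flow for a node to be accessed."""
        child.parent = self
        data = child.get_digital_objects()              # acquisition request and reply
        self.children[child.prefix] = {                 # maintain the child's record
            "prefix": child.prefix,
            "object_count": len(data),                  # statistic kept per child
        }
        self._index_and_sync_up(data)

    def _index_and_sync_up(self, data: Dict[str, dict]) -> None:
        self.index.update(data)                         # index the metadata locally
        if self.parent is not None:                     # synchronize with the upper node
            self.parent._index_and_sync_up(data)


root = RegistryNode("1")
mid = RegistryNode("1.1")
root.accept_child(mid)
leaf = RegistryNode("1.1.1", objects={"1.1.1/doc-a": {"name": "doc-a"}})
mid.accept_child(leaf)
print(sorted(root.index))  # ['1.1.1/doc-a']
```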
Optionally, the search engine 404 is configured to perform steps including:
step S41: determining an initial target node set for data searching according to the search request;
Step S42: determining prefix distances among all nodes in the initial target node set according to the initial target node set;
step S43: when the prefix distance between two nodes is 0, acquiring the node of the longest common prefix in the two nodes;
step S44: when the prefix distance between a plurality of nodes is 1, acquiring father nodes of the plurality of nodes;
step S45: and determining the prefix distance between the acquired nodes, returning to the step S43 when the prefix distance between the acquired nodes comprises the prefix distance with the value of 0 and/or 1, and determining the acquired nodes as target nodes to obtain a target node list when the prefix distances between the acquired nodes are all larger than 1.
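Steps S41 to S45 above can be sketched in Python as follows. The sketch is a simplified reading rather than the exact procedure: it merges one qualifying pair at a time and rescans, instead of processing all pairs in rounds, and its distance helper assumes level-by-level containment and treats a level missing from the shorter prefix as different. The function names are illustrative.

```python
from itertools import combinations, zip_longest


def prefix_distance(p1: str, p2: str) -> int:
    # Shorter prefix first; subordinate pairs have distance 0.
    a, b = sorted((p1.split("."), p2.split(".")), key=len)
    if b[: len(a)] == a:
        return 0
    distance = 0
    for x, y in zip_longest(a, b):  # missing level counts as different (assumption)
        distance = (distance << 1) | int(x != y)
    return distance


def schedule(initial_nodes):
    """Reduce the initial target node set to the target node list."""
    nodes = set(initial_nodes)                           # S41
    while True:
        for n1, n2 in combinations(sorted(nodes), 2):
            d = prefix_distance(n1, n2)                  # S42
            if d == 0:                                   # S43: keep the ancestor
                keep = min(n1, n2, key=lambda p: p.count("."))
            elif d == 1 and "." in n1:                   # S44: replace by the parent
                keep = n1.rsplit(".", 1)[0]
            else:
                continue
            nodes -= {n1, n2}
            nodes.add(keep)
            break                                        # S45: recompute distances
        else:
            return sorted(nodes)                         # all distances greater than 1


initial = ["1.1.1.1", "1.1.1.2", "1.1.2.1", "1.1.2.2",
           "1.3.1.1", "1.3.1", "1.99.2.1", "1.99.2.2"]
print(schedule(initial))  # ['1.1', '1.3.1', '1.99.2']
```

For the initial set used in the worked example of the description, the sketch returns node 1.1, node 1.3.1 and node 1.99.2, matching the target node list given there.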
Optionally, the apparatus 400 further includes:
the first target node determining module is used for determining the root node as a target node under the condition that the number of nodes in the initial target node set exceeds the preset proportion of the total number of the nodes of the digital network.
Optionally, the apparatus 400 further includes:
the data persistence module is used for carrying out data persistence on a digital object target file of the node in the digital network, and the digital object target file comprises: sequentially stored digital object metadata files, sequentially stored locally maintained digital object repository files, full text index files for digital objects; and recording the number files of the digital objects in each digital object warehouse and the warehouse update event files monitored by the persistent local registry.
Optionally, the apparatus 400 further includes: and the node attribute configuration module is used for configuring the parameter set of the node attribute and the state attribute of the node to be accessed.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)).
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; for identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, see the corresponding parts of the description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (6)

1. A method for distributed searching of a digital object with a fusion of distance and longest prefix, the method comprising:
sending out a search request through the coordination node;
accessing the search request to an intelligent contract layer based on a contract interface;
calling a search engine interface through the intelligent contract layer, and accessing the search request into a search engine;
executing a preset prefix distance scheduling algorithm through the search engine to determine a target node list for data search;
the search engine calls contract interfaces of all target nodes in the target node list to acquire data corresponding to the search request, the acquired data are summarized in the coordination node, and the data are returned to the terminal where the user is located through the coordination node;
the method for determining the target node list for searching data by executing a preset prefix distance scheduling algorithm through the search engine comprises the following steps:
Step S41: determining an initial target node set for data searching according to the search request;
step S42: determining prefix distances among all nodes in the initial target node set according to the initial target node set;
step S43: when the prefix distance between two nodes is 0, acquiring the node of the longest common prefix in the two nodes;
step S44: when the prefix distance between a plurality of nodes is 1, acquiring father nodes of the plurality of nodes;
step S45: and determining the prefix distance between the acquired nodes, returning to the step S43 when the prefix distance between the acquired nodes comprises the prefix distance with the value of 0 and/or 1, and determining the acquired nodes as target nodes to obtain a target node list when the prefix distances between the acquired nodes are all larger than 1.
2. The method of digital object distributed search fusing distance and longest prefix of claim 1, further comprising:
according to node information of a node to be accessed, accessing the node to be accessed into a digital networking network through an identification analysis system;
inputting the node information of the node to be accessed into a father node of the node to be accessed;
According to the node information received by the father node, the father node sends a digital object data acquisition request to the node to be accessed;
according to the acquisition request, the node to be accessed returns digital object data corresponding to the acquisition request to the father node;
and indexing all metadata according to the acquired digital object data by the father node, maintaining the configuration information of the node to be accessed, synchronizing the acquired digital object data with the upper node of the node and updating the statistical data of the node.
3. The method of digital object distributed search fusing distance and longest prefix of claim 1, further comprising:
and determining the root node as the target node under the condition that the number of the nodes in the initial target node set exceeds the preset proportion of the total number of the networking nodes.
4. The method of digital object distributed search fusing distance and longest prefix of claim 1, further comprising:
the node in the digital network carries out data persistence on the own digital object target file, wherein the digital object target file comprises the following components: sequentially stored digital object metadata files, sequentially stored locally maintained digital object repository files, full text index files for digital objects; and recording the number files of the digital objects in each digital object warehouse and the warehouse update event files monitored by the persistent local registry.
5. The method of digital object distributed search incorporating distance and longest prefix according to claim 2, wherein prior to accessing the node to be accessed to a digital networking network by an identity resolution system, the method further comprises: and configuring parameter sets of node attributes and state attributes of the node to be accessed.
6. A digital object distributed search apparatus that fuses distance and longest prefix, the apparatus comprising:
the front-end user interface is used for sending out a search request through the coordination node;
the contract interface is used for accessing the search request into an intelligent contract layer;
the search engine interface is used for calling the search engine interface through the intelligent contract layer and accessing the search request into a search engine;
the search engine is used for executing a preset prefix distance scheduling algorithm, determining a target node list for searching data, calling contract interfaces of all target nodes in the target node list to acquire data corresponding to the search request, summarizing the acquired data at the coordination node, and returning the data to a terminal where a user is located through the coordination node;
the search engine is configured to perform the steps comprising:
Step S41: determining an initial target node set for data searching according to the search request;
step S42: determining prefix distances among all nodes in the initial target node set according to the initial target node set;
step S43: when the prefix distance between two nodes is 0, acquiring the node of the longest common prefix in the two nodes;
step S44: when the prefix distance between a plurality of nodes is 1, acquiring father nodes of the plurality of nodes;
step S45: and determining the prefix distance between the acquired nodes, returning to the step S43 when the prefix distance between the acquired nodes comprises the prefix distance with the value of 0 and/or 1, and determining the acquired nodes as target nodes to obtain a target node list when the prefix distances between the acquired nodes are all larger than 1.
CN202311167276.4A 2023-09-12 2023-09-12 Digital object distributed search method and device integrating distance and longest prefix Active CN116910125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311167276.4A CN116910125B (en) 2023-09-12 2023-09-12 Digital object distributed search method and device integrating distance and longest prefix

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311167276.4A CN116910125B (en) 2023-09-12 2023-09-12 Digital object distributed search method and device integrating distance and longest prefix

Publications (2)

Publication Number Publication Date
CN116910125A CN116910125A (en) 2023-10-20
CN116910125B true CN116910125B (en) 2023-12-26

Family

ID=88368089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311167276.4A Active CN116910125B (en) 2023-09-12 2023-09-12 Digital object distributed search method and device integrating distance and longest prefix

Country Status (1)

Country Link
CN (1) CN116910125B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114186115A (en) * 2021-11-24 2022-03-15 北京大学 Physical topology sensitive human-computer digital object searching method and system
CN116303675A (en) * 2023-03-20 2023-06-23 东北大学 Coalition chain data query method based on consensus group division and multi-layer index
CN116628721A (en) * 2023-06-06 2023-08-22 北京大学 Searchable encryption method and system for digital object

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
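BETASCO: A Consortium Blockchain System for Smart Contract Sharding; 吴恺东; Journal of Software (软件学报); full text *
Feasibility of Longest Prefix Matching using Learned Index Structures; Shunsuke Higuchi et al.; Performance Evaluation Review; Vol. 48, No. 4; full text *
Research Progress in Blockchain Theory; 单进勇; 高胜; Journal of Cryptologic Research (密码学报), No. 05; full text *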

Also Published As

Publication number Publication date
CN116910125A (en) 2023-10-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant