CN113986841A - All-node rapid acquisition and analysis system and method for I2P network - Google Patents

All-node rapid acquisition and analysis system and method for I2P network

Info

Publication number
CN113986841A
Authority
CN
China
Prior art keywords
task
node
information
component
detection
Prior art date
Legal status
Pending
Application number
CN202111262743.2A
Other languages
Chinese (zh)
Inventor
刘冠华
王轶骏
Current Assignee
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Application filed by Shanghai Jiaotong University
Priority to CN202111262743.2A
Publication of CN113986841A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a system and a method for rapidly acquiring and analyzing all nodes of an I2P network. A central control component triggers a task; after receiving the task context, the detection nodes in the detection component start the task in parallel, which accelerates node collection; the collected node information enters a calculation component in real time for index calculation, which saves calculation time; and after a detection task is finished, a data analysis component analyzes the index information stored in the database to obtain recommended nodes and abnormal nodes. Compared with the prior art, the method is flexible, effective and time-saving.

Description

All-node rapid acquisition and analysis system and method for I2P network
Technical Field
The invention relates to the technical field of network security, in particular to a system and a method for quickly acquiring and analyzing all nodes of an I2P network.
Background
I2P is an extensible, self-organizing, resilient packet-switched anonymous network layer on which any number of applications with differing anonymity or security requirements can run. Users of an I2P network can freely trade off anonymity, reliability, bandwidth usage and latency according to their requirements. Applications available in I2P include Web browsing, chat, file sharing, email and blogs, and new applications are constantly being developed.
I2P anonymous networks face a number of problems and challenges. An attacker can attack from outside by monitoring the network, and can also attack the anonymous network from inside through nodes it controls. In anonymous communication systems based on multi-hop routing, the collusion attack is one of the most common attacks: by controlling several nodes along a route, the attacker and the malicious nodes infer the identity of a user and the content of the communication, and once the proportion of malicious nodes exceeds a certain threshold, anonymity is destroyed. In addition, an attacker can determine the identities of the sender and the recipient by observing the synchronization pattern of the communication messages. The longer the attacker observes the synchronized communication, the higher the probability that the communicating nodes can be associated.
At present, defenses against collusion attacks on the I2P node selection algorithm mainly aim to raise the cost of entering the hash table, for example through passive schemes such as authentication codes and community trust mechanisms. No existing scheme improves the security of I2P node selection by collecting and analyzing information about all nodes and then actively selecting, or avoiding, particular nodes.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a system and a method for rapidly acquiring and analyzing the full nodes of an I2P network.
The purpose of the invention can be realized by the following technical scheme:
one aspect of the present invention provides a full-node fast acquisition and analysis system of an I2P network, including:
the central control component: managing all other components, acquiring task information through interaction with a user, and triggering various tasks;
a detection component: after receiving the context of the task, starting the task in parallel through each detection node, and quickly collecting all node information;
a computing component: consuming node information generated in real time and historical information in a library, and calculating indexes;
a data analysis component: after one detection task is finished, analyzing the index information falling into the database to obtain a recommended node and an abnormal node;
communication between the central control component and the detection component, the calculation component and the data analysis component is carried out through a distributed application coordination service; the detection component and the calculation component are connected by a message queue that supports long-lived connections.
Specifically, the central control assembly comprises:
a task triggering module: triggering node collection tasks, return-recommended-nodes tasks and return-abnormal-nodes tasks through user interaction or timed triggering;
a task management module: adopting a distributed file management architecture to uniformly manage all task contexts;
a task result module: handling the use of task results, including presenting results and generating a netDB for direct use by I2P.
Specifically, the detection assembly includes:
a node detection module: generating a detection task according to the task context, and collecting node information of the I2P network;
a progress saving module: saving the task progress so that a task does not start from the beginning when it is brought up again after an exception;
an information transmission module: rewriting the information as required and writing it to the producer end of the message queue.
Specifically, the computing component includes:
an information input module: connecting to the consumer end of the message queue to receive streaming data, and connecting to the database to receive batch data;
an information calculation module: aggregating the real-time node information and the historical node information, and performing index calculation by adopting multiple operators;
an information output module: writing the calculation results to the database for use by the central control component;
an information transmission module: rewriting the information as required and writing it to the producer end of the message queue.
Specifically, the data analysis component includes:
a data storage module: storing various information and indexes of nodes in the I2P network;
a data analysis module: analyzing the node information and indexes stored in the database of the data storage module to obtain recommended nodes and abnormal nodes.
The invention also provides a method for quickly acquiring and analyzing the full nodes of the I2P network, which is based on the system and comprises the following steps:
S1: the central control component obtains task information through a front end or an open interface, creates a task context, obtains the list of online detection instances through the distributed file system, and elects detection instances. Specifically, the method comprises the following steps:
the central control component obtains task information through a front end or an open interface and creates a task context; it obtains the list of online detection instances through the distributed file system, elects instances according to the number of detection instances the task requires, and writes the task number and a task-start status code into the path of each elected instance. Each elected detection instance, on observing the task-start status code, requests the task context from the central control component, constructs the task from the context, and writes the status code corresponding to the construction result back to the distributed file system. The central control component likewise writes the task number and the task-start status code into the computing component's path in the distributed file system; the master computing instance elects the instances that will execute the task, requests the task context from the central control component, constructs the task from the context, and writes the status code corresponding to the construction result back to the distributed file system. Meanwhile, the central control component gathers the task status of both components to decide whether to terminate and rerun the task, and monitors the distributed file system throughout to detect whether the task ends abnormally or correctly.
S2: each selected detection instance in the detection component obtains the task context, constructs and starts the task, and generates a status code when the task is complete. Specifically, the method comprises the following steps:
first, a connection resource is requested from the connection pool; once it is granted, a detection message is constructed from the rID and the flooding (floodfill) node information, and the message is sent. If the flooding node successfully returns node information, the node information is sent to the message queue and the rID is compared with the rID in the progress database; if it is larger than the rID in the progress database, a read-write lock is taken and the new value is written. If the request to the flooding node times out and the retries also time out, the request is discarded and the connection pool resource is released;
when the process or the instance is restarted for any reason, the detection instance pulls the number of its running task from the distributed file system, pulls the task context from the central control component according to that number, pulls the task progress from the progress database, and quickly resumes the task.
S3: the computing component acquires the task context, constructs and starts the task, and generates a status code when the task is complete. Specifically, the method comprises the following steps:
node information enters the computing component from the consumer end of the message queue, and the task context is broadcast to every node of the computing component. At the aggregation node, the IP address is extracted from the real-time information, and the node's historical information is looked up in the database with the IP address as the key and fed into the aggregation node. The real-time node information is copied: the original flows out of the main stream and is stored in the node information database, while the copy is aggregated with the historical node information, and aggregation is judged complete according to the number of historical records given in the historical summary information. Node information whose aggregation has completed or timed out enters the computing node, where index calculation is performed according to the configured indexes, and the results are stored in the node index database.
S4: after the central control component obtains the task-complete status codes of the computing component and the detection component, it notifies the data analysis component to start working, and the data analysis component finds the recommended nodes and the abnormal nodes. Specifically, the method comprises the following steps:
after the central control component obtains the task-complete status codes of the computing component and the detection component, it notifies the data analysis component to start working. First, nodes for which both the significance of the uniform distribution of rID and the proportion of online times longer than 24 hours are too low are identified and listed as suspected abnormal nodes; next, the distributions of historical online time and rID of the suspected abnormal nodes are analyzed, and nodes that are close to a suspected abnormal node are listed as abnormal nodes; finally, nodes that are not abnormal and have excellent performance indexes, including average capacity, average bandwidth, average delay and reachable proportion, are listed as recommended nodes.
S5: the central control component writes the task-end status code into the detection instance paths and the computing component path of the distributed file system, waits for each component to report a task-release status code, clears the task paths, and writes the task information and logs into the database; it then obtains the task result from the data analysis component and generates the netDB file.
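The selection logic of step S4 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not given in the text: the function name `analyze`, the field names, the threshold values, the interpretation of "close" as closeness in rID, and the ranking key used for recommended nodes.

```python
from typing import Dict, List, Set, Tuple


def analyze(indexes: Dict[str, dict],
            sig_min: float = 0.05,     # assumed threshold for rID uniformity significance
            ratio_min: float = 0.2,    # assumed threshold for the >24 h online proportion
            rid_window: int = 2,       # assumed meaning of "close": rID distance
            top_k: int = 2) -> Tuple[Set[str], List[str]]:
    """Return (abnormal nodes, recommended nodes) from per-node indexes."""
    # Step 1: both indexes too low -> suspected abnormal.
    suspected = {
        ip for ip, ix in indexes.items()
        if ix["rid_uniformity_sig"] < sig_min
        and ix["online_over_24h_ratio"] < ratio_min
    }
    # Step 2: nodes close to a suspected node are also listed as abnormal.
    abnormal = set(suspected)
    for ip, ix in indexes.items():
        if any(abs(ix["rid"] - indexes[s]["rid"]) <= rid_window for s in suspected):
            abnormal.add(ip)
    # Step 3: rank the remaining nodes by performance (assumed ordering).
    candidates = [ip for ip in indexes if ip not in abnormal]
    candidates.sort(
        key=lambda ip: (
            indexes[ip]["reachable_ratio"],
            indexes[ip]["avg_capacity"],
            indexes[ip]["avg_bandwidth"],
            -indexes[ip]["avg_delay"],
        ),
        reverse=True,
    )
    return abnormal, candidates[:top_k]
```

A real deployment would choose the thresholds and the ranking rule from measured data; the sketch only shows the order of the three steps.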
Compared with the prior art, the system and the method for rapidly acquiring and analyzing the full nodes of the I2P network, provided by the invention, at least have the following beneficial effects:
1) Existing schemes add some simple indexes (such as node delay) to the local database of an I2P node and use them to select nodes for tunnel building. Because the node database of the I2P network is a distributed network database, collecting all nodes is costly; if information on a large number of nodes cannot be collected, adding indexes does not effectively reduce the probability of selecting attacking nodes and may even leave no node available. The invention collects and analyzes all I2P nodes to actively identify recommended nodes and abnormal nodes. Full-node collection enlarges the set of nodes from which an I2P node can build tunnels, which raises the cost of a collusion attack for an attacker and gives users more choices and more flexibility in node selection.
2) To let the system complete collection of all nodes of the I2P network more quickly and remain operational at all times, the invention adopts multi-instance detection, streaming calculation and state maintenance. Multi-instance detection means that the detection component consists of several detection instances managed uniformly through the distributed file system, which increases detection speed. Streaming calculation means that the node information obtained by the detection component flows into the calculation component in real time for aggregation and index calculation, so that the detection component and the calculation component work in parallel and the calculation time is reduced. State maintenance means that the state maintenance modules of the components work together with the distributed file system, so that when an exception occurs anywhere in the task, the state and progress before the exception can be recovered on restart and the long-running full-node collection task does not have to be executed again.
Drawings
Fig. 1 is a schematic structural diagram of a full-node rapid acquisition and analysis system of an I2P network in an embodiment;
Fig. 2 is a schematic flow chart of a full-node rapid acquisition and analysis method of the I2P network in the embodiment.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
Examples
The invention provides a full-node rapid acquisition and analysis system of an I2P network, which can be used to collect the information and indexes of all nodes of the I2P network and to give recommended nodes and abnormal nodes. First, the central control component triggers a task; after receiving the task context, the detection nodes in the detection component start the task in parallel, which accelerates node collection; the collected node information enters the calculation component in real time for index calculation, which saves calculation time; and after a detection task is finished, the data analysis component analyzes the index information stored in the database and finally obtains the recommended nodes and the abnormal nodes.
In this patent, an abnormal node is a node that may be carrying out attack behavior: either its abnormality indexes stand out, or its characteristics are too close to those of such a node. A recommended node is a node whose performance indexes rank near the top among all nodes and which is not an abnormal node.
In detail, the system of the invention comprises a central control component, a detection component, a calculation component and a data analysis component. The central control component is responsible for managing the other components, interacting with the user and triggering the various tasks; the detection component, acting as a producer, is responsible for quickly collecting all node information; the calculation component is responsible for consuming the node information generated in real time together with the historical information in the database, and for calculating indexes; the data analysis component is responsible for data storage and for non-streaming computation. Communication between the central control component and each other component is carried out through a distributed application coordination service; the detection component and the calculation component are connected by a message queue that supports long-lived connections.
The central control component of the system comprises:
a task triggering module: triggers tasks such as node collection, returning recommended nodes and returning abnormal nodes, through user interaction or timed triggering;
a task management module: uses a distributed file management architecture to manage all task contexts uniformly, which ensures context consistency; it also cooperates with the state maintenance modules of the other components to make tasks recoverable, so that an interruption or other exception in a long-running task does not cause the whole task to fail or take excessively long;
a task result module: handles the use of task results, including presenting results and generating netDB files (files of the I2P network database, which holds router information) that I2P can use directly.
Further, the task triggering module triggers different types of tasks either through user interaction, such as a front end, or through timed triggering driven by the daily change of the rID (a value computed by hashing the node number together with the date, representing the position of the node on the hash ring). Triggering a node collection task requires scheduling the detection component, the calculation component and the data analysis component; triggering a return-recommended-nodes task or a return-abnormal-nodes task requires scheduling the data analysis component.
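A daily rID of the kind described above can be sketched in Python as follows. This is a minimal illustration, assuming SHA-256 over the node identifier concatenated with the date written as an ASCII yyyyMMdd string; the function names `daily_rid` and `rid_changed`, the hash choice and the byte layout are assumptions, not details given in the text.

```python
import hashlib
from datetime import date


def daily_rid(node_id: bytes, day: date) -> int:
    """Hash the node identifier with the date to get a position on the hash ring."""
    # Assumption: the date is appended as an ASCII yyyyMMdd string.
    material = node_id + day.strftime("%Y%m%d").encode("ascii")
    return int.from_bytes(hashlib.sha256(material).digest(), "big")


def rid_changed(node_id: bytes, previous: date, current: date) -> bool:
    """A timed trigger could fire when the rID for the new day differs."""
    return daily_rid(node_id, previous) != daily_rid(node_id, current)
```

Because the date is part of the hashed input, every node moves to a new position on the ring each day, which is why a collection task tied to the rID would naturally be scheduled daily.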
Further, the task management module maintains the task number and the running status code in the distributed file system and writes the entire task context into the database. When the central control component fails or restarts, it can retrieve the context from the database and continue the task; when another component fails or restarts, it can retrieve the task number and running state from the distributed file system and request the context again to continue the task.
Based on the above, the central control component, the calculation component and the detection component interact as follows. The central control component writes the task number and the task-start status code under the corresponding path of the distributed file system. When the computing component and the detection component observe a change in the leaf node of their path, they request the task context from the central control component according to the task number, construct the task from the context, and after successful construction write back a running status code to inform the central control component that operation is normal. The central control component monitors the task paths of the distributed file system for the whole duration of the task and watches the status codes; if another component writes back an abnormal status code, it decides whether to rerun or end the task. When the calculation component has finished, it reports its state to the central control component, which writes an end status code into the distributed file system to inform every component that the current task has ended. Each component separates its processing by task, and each task number yields a distinct path in the distributed file system for communication among the components, so that tasks remain consistent and do not interfere with one another.
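The start-of-task handshake described above can be sketched in Python with an in-memory stand-in for the coordination service. The class names, the status code values and the watch mechanism are all hypothetical; a real deployment would use the watch facility of whatever coordination service is chosen.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical status codes; the text does not give concrete values.
START, RUNNING, ABNORMAL, END = "START", "RUNNING", "ABNORMAL", "END"


class PathStore:
    """In-memory stand-in for the distributed coordination service."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Callable[[str, str], None]]] = {}

    def watch(self, prefix: str, callback: Callable[[str, str], None]) -> None:
        self._watchers.setdefault(prefix, []).append(callback)

    def write(self, path: str, code: str) -> None:
        self._data[path] = code
        # Notify every watcher whose prefix covers the changed path.
        for prefix, callbacks in list(self._watchers.items()):
            if path.startswith(prefix):
                for callback in list(callbacks):
                    callback(path, code)

    def read(self, path: str) -> str:
        return self._data[path]


class CentralControl:
    def __init__(self, store: PathStore) -> None:
        self.store = store
        self.contexts: Dict[str, dict] = {}

    def start_task(self, instance_id: str, task_id: str, context: dict) -> None:
        self.contexts[task_id] = context
        self.store.write(f"/task/{instance_id}/{task_id}", START)

    def request_context(self, task_id: str) -> dict:
        return self.contexts[task_id]


class DetectionInstance:
    def __init__(self, instance_id: str, store: PathStore, control: CentralControl) -> None:
        self.store, self.control = store, control
        self.context: Optional[dict] = None
        store.watch(f"/task/{instance_id}/", self.on_change)

    def on_change(self, path: str, code: str) -> None:
        if code != START:
            return  # only react to a task-start status code
        task_id = path.rsplit("/", 1)[-1]
        try:
            self.context = self.control.request_context(task_id)
            self.store.write(path, RUNNING)   # construction succeeded
        except KeyError:
            self.store.write(path, ABNORMAL)  # no context available for this task
```

Keying every exchange on a per-task path is what lets several tasks run side by side without touching one another's state.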
Further, the task result module gives the central control component several ways to output results after it has interacted with the data analysis component, including but not limited to presenting them at the front end, generating csv files containing the results, and generating local netDB files that the I2P network can use directly.
The detection assembly of the system of the present invention comprises:
a node detection module: generating a detection task according to the task context, and collecting node information of the I2P network;
a progress saving module: saves the task progress so that a task does not start from the beginning when it is brought up again after an exception;
an information transmission module: rewrites the information as needed and writes it to the producer end of the message queue.
Furthermore, the node detection module takes from the task context the rID range it is responsible for probing and probes in parallel using a thread pool; the thread pool size should be set according to the actual configuration of the detection instance so that neither task backlog nor memory overflow occurs. After obtaining an rID, each thread constructs an I2NP exploration message and sends it to the flooding (floodfill) nodes in the flooding node list of the context to obtain node information.
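The thread-pool probing described above might look like the following Python sketch. The function `send_probe` is a hypothetical callable standing in for building and sending an I2NP exploration message; no real I2P library call is used, and the names and default pool size are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, Sequence

# Hypothetical signature: send_probe(rid, floodfill) -> node info, or None on timeout.
ProbeFn = Callable[[int, str], Optional[dict]]


def probe_range(rids: Iterable[int],
                floodfills: Sequence[str],
                send_probe: ProbeFn,
                max_workers: int = 8) -> List[dict]:
    """Probe every rID in the range in parallel and collect the responses."""

    def probe_one(rid: int) -> Optional[dict]:
        # Try each flooding node in turn until one returns node information.
        for floodfill in floodfills:
            info = send_probe(rid, floodfill)
            if info is not None:
                return info
        return None  # every flooding node timed out; the request is dropped

    # A bounded pool keeps the number of in-flight probes under control.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(probe_one, rids))
    return [r for r in results if r is not None]
```

The `max_workers` value plays the role of the thread pool size that the text says should be tuned to the detection instance's actual configuration.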
Further, the progress saving module records the rID contained in each exploration response and writes it to the database. When the process restarts because of memory overflow or another exception, the detection instance finds the number of its running task in the distributed file system and pulls the task context again according to that number. The detection instance then reads from the database the rID that had been reached before the restart; the rID starts at the lower bound of the rID range given in the context and increases monotonically, so repeated detection is avoided. This is very important for such long-running tasks.
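The progress rule can be sketched in Python as follows. This is an illustration under stated assumptions: the class name `ProgressStore` is hypothetical, an in-memory dictionary stands in for the progress database, a plain mutex replaces the read-write lock mentioned in the method, and the rID range is treated as inclusive.

```python
import threading
from typing import Dict, Tuple


class ProgressStore:
    """Keeps the highest rID seen per task (in-memory stand-in for the progress database)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._last: Dict[str, int] = {}

    def record(self, task_id: str, rid: int) -> bool:
        """Store rid only if it is larger than the saved one; return True if stored."""
        with self._lock:
            if rid > self._last.get(task_id, -1):
                self._last[task_id] = rid
                return True
            return False

    def resume_range(self, task_id: str, lower: int, upper: int) -> Tuple[int, int]:
        """After a restart, continue just past the last recorded rID."""
        with self._lock:
            last = self._last.get(task_id)
        start = lower if last is None else max(lower, last + 1)
        return start, upper
```

Note that when probes complete out of order, resuming from the highest recorded rID can skip lower rIDs that were still in flight; the sketch follows the rule as described and does not address that case.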
Each detection instance in the detection component has a unique, unchanging number, which is used as part of a path in the distributed file system, so that each detection instance has its own path; when issuing tasks, the central control component can elect from the detection instances present under the path and send them different contexts. After an instance process restarts, it can obtain the task state from its path according to its unchanging number and continue executing the task. The full task path of a detection instance is /task/instance ID/task ID/, and the content of the file under that path is the status code of the task. The detection component has a set of status codes for the different exceptions that can occur at the different stages of task execution, and returns them through the status path of the distributed file system, so that the central control component can readily track the running state of every detection instance.
Furthermore, after a detection instance receives a response, the information transmission module unpacks and organizes the node information returned in the response and writes it into the message queue. The message queue should support long-lived connections and asynchronous point-to-point delivery, with the detection component acting as the producer of messages and the computing component as the consumer.
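The producer and consumer roles can be illustrated with Python's standard `queue` module as a stand-in for the message queue. The response layout (an `rid` plus a list of `nodes`) and the function names are assumptions made for the sketch; a production system would use a networked message queue.

```python
import queue
import threading
from typing import Iterable, List

_DONE = object()  # sentinel marking the end of the stream


def producer(responses: Iterable[dict], q: "queue.Queue") -> None:
    """Detection side: unpack each response and publish one message per node."""
    for response in responses:
        for node in response.get("nodes", []):
            q.put({"rid": response["rid"], **node})
    q.put(_DONE)


def consumer(q: "queue.Queue", sink: List[dict]) -> None:
    """Computing side: consume node messages as they arrive."""
    while True:
        item = q.get()
        if item is _DONE:
            break
        sink.append(item)


def run_pipeline(responses: Iterable[dict]) -> List[dict]:
    q: "queue.Queue" = queue.Queue()
    sink: List[dict] = []
    worker = threading.Thread(target=consumer, args=(q, sink))
    worker.start()
    producer(responses, q)
    worker.join()
    return sink
```

Decoupling the two sides through a queue is what allows detection and calculation to proceed in parallel rather than one after the other.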
The computing components of the system of the present invention include:
an information input module: connects to the consumer end of the message queue to receive streaming data, and connects to the database to receive batch data;
an information calculation module: aggregates the real-time node information with the historical node information, and then performs index calculation with multiple operators;
an information output module: writes the calculation results to the database for use by the central control component.
The computing component interacts with the central control component in the same way as the detection component does: it obtains the task number from the distributed file system and requests the context from the central control component. The difference is that the computing component is not managed through instance numbers; instead it is managed internally by the master-slave architecture of a distributed stream data engine, so its instances do not need to be numbered in a file path.
Further, the information input module has two sources, a message queue and a database. The message queue provides the streaming data of real-time node information, and the database provides the batch data of historical node information; the summary of historical node information in the database and the task context are also distributed as input streams.
Further, the information calculation module is mainly responsible for aggregating node information and calculating indexes. When real-time information flows into the aggregation node of the calculation module, the historical information and the historical information summary flow into the aggregation node at the same time, and the real-time information and historical information are aggregated, as directed by the information summary, into the input for index calculation. After aggregation, the real-time information is stored in the node information database, the aggregated information flows on to the computing node, and the results are stored in the node index database once the computing node has finished the index calculation.
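The aggregation step can be sketched in Python as a plain function. This is a simplified illustration, not a stream-engine job: the function name `aggregate`, the dictionary layout and the completeness rule (enough historical records have arrived compared with the summary count) are assumptions, and timeout handling is left out.

```python
from typing import Dict, List


def aggregate(realtime: dict,
              history_by_ip: Dict[str, List[dict]],
              expected_counts: Dict[str, int]) -> dict:
    """Join one real-time record with the node's history, keyed by IP address."""
    ip = realtime["ip"]
    history = list(history_by_ip.get(ip, []))
    # The summary tells the aggregator how many historical records to wait for.
    expected = expected_counts.get(ip, 0)
    return {
        "ip": ip,
        "records": history + [dict(realtime)],  # copy, so the original can be stored as-is
        "complete": len(history) >= expected,
    }
```

In a streaming engine the same join would typically be expressed as a keyed operator with state and a timer; the function above only shows the data that such an operator would emit.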
The data analysis component of the present invention comprises:
a data storage module: a database storing the information and indexes of the nodes in the I2P network, and a task database containing task history information, the I2P node information summary and the task progress of the detection component;
a data analysis module: analyzes the node information and indexes stored in the database of the data storage module to obtain recommended nodes and abnormal nodes.
In the invention, the node information comprises capacity, bandwidth, online time, reachability and delay. Capacity is the number of tunnels a routing node successfully establishes over a period of time. Bandwidth is the weighted result of a node's speed over different time periods. Online time is how long the node has been online. Reachability is whether the node qualifies as a tunnel participant. Delay is the measured latency of the node and also reflects whether the node is online.
The indexes of an I2P node include: average capacity, average bandwidth, average delay, the number and proportion of times the online time is more or less than 24 hours, the significance of the uniform distribution of rID, online frequency, online time, and reachable proportion. The averages, the reachable proportion and the online frequency reflect the long-term state of a node and whether it is stable; the number and proportion of times the online time is more or less than 24 hours, the significance of the uniform distribution of rID, and the online time mainly reflect whether the node behaves abnormally or is too close to a node that may behave abnormally.
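Several of the indexes listed above are simple averages and proportions and can be computed as in the following Python sketch. The record field names are assumptions; the significance of the uniform distribution of rID is omitted because the text does not specify which statistical test is used.

```python
from statistics import mean
from typing import Dict, List


def compute_indexes(records: List[dict]) -> Dict[str, float]:
    """Compute per-node indexes from its real-time and historical records."""
    if not records:
        raise ValueError("at least one record is required")
    n = len(records)
    # Count observations in which the node had been online for more than 24 hours.
    long_online = sum(1 for r in records if r["online_hours"] > 24)
    return {
        "avg_capacity": mean(r["capacity"] for r in records),
        "avg_bandwidth": mean(r["bandwidth"] for r in records),
        "avg_delay": mean(r["delay"] for r in records),
        "online_over_24h_count": long_online,
        "online_over_24h_ratio": long_online / n,
        "reachable_ratio": sum(1 for r in records if r["reachable"]) / n,
    }
```

Each of these values can be updated as new records arrive, which fits the streaming style of the calculation component.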
Preferred embodiments of the present invention are further described below.
As shown in FIG. 1, the full-node rapid acquisition and analysis system of the I2P network consists of a central control component, a detection component, a computing component and a data analysis component. The central control component is the command center of the whole distributed architecture; it schedules the other components of the system, issues task contexts, and presents and produces task results. The detection component consists of a plurality of detection instances, which are elected and scheduled by the central control component and are responsible for sending detection messages and obtaining node information. The computing component adopts a master-slave architecture and completes its election internally; the central control component schedules the master computing instance, and the component is responsible for the overall batch calculation of node indexes. The data analysis component also adopts a master-slave architecture, is scheduled directly by the central control component, and analyzes node information and node indexes to obtain recommended nodes and abnormal nodes.
The whole distributed system relies on a distributed file system and a message queue to connect its components. The distributed file system ensures task consistency, connects the central control component with the computing component and the detection component, and carries information such as task numbers, state codes and the election of detection instances. The message queue ensures that information produced by the detection component enters the computing component as a stream, so that node information is consumed in order and in time.
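By way of non-limiting illustration, the role of the message queue between the detection component and the computing component can be sketched with a bounded in-process buffer. The class `DroppingQueue` and its method names are hypothetical, Python's standard `queue.Queue` merely stands in for a real message queue, and the drop-on-overflow behavior follows the discarding strategy that the task information specifies for the message queue.

```python
import queue


class DroppingQueue:
    """Bounded buffer that discards new items instead of blocking the producer."""

    def __init__(self, maxsize):
        self._q = queue.Queue(maxsize=maxsize)
        self.dropped = 0  # number of items discarded because the buffer was full

    def publish(self, item):
        """Producer side (detection instance): returns False if the item was dropped."""
        try:
            self._q.put_nowait(item)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def consume(self, max_items):
        """Consumer side (computing component): drain up to max_items in arrival order."""
        items = []
        while len(items) < max_items:
            try:
                items.append(self._q.get_nowait())
            except queue.Empty:
                break
        return items
```

Dropping rather than blocking keeps a slow consumer from stalling the detection instances, at the price of losing some node information for the current task.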
The invention also provides a full-node rapid acquisition and analysis method for the I2P network, which applies the full-node rapid acquisition and analysis system of the I2P network described above. As shown in FIG. 2, the method comprises the following five steps:
Step one: the central control component obtains task information through a front end, an open interface or the like, and produces a task context. The central control component obtains the list of online detection instances from the distributed file system, elects instances according to the number of detection instances the task requires, and writes the task number and the task start state code into the path of each elected instance. Each elected detection instance observes the task start state code, requests the task context from the central control component, constructs the task according to the task context, and writes the state code of the construction result back to the distributed file system. The central control component also writes the task number and the task start state code into the computing component's path in the distributed file system; the master computing instance elects the instances that will execute the task, requests the task context from the central control component, constructs the task according to the task context, and writes the state code of the construction result back to the distributed file system. The central control component collects the task status of both components, decides whether to terminate and rerun the task, and from then on monitors the distributed file system throughout to detect whether the task ends abnormally or correctly.
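By way of non-limiting illustration, the hand-off in step one can be sketched with an in-memory stand-in for the distributed file system. The path layout (`/probes/online`, `/probes/tasks`, `/compute/task`), the state code names and the random choice of instances are all assumptions of this sketch; the invention does not specify them.

```python
import random

TASK_START = "TASK_START"   # hypothetical state code names
TASK_BUILT = "TASK_BUILT"


class CoordinationStore:
    """In-memory stand-in for the distributed file system (path -> value)."""

    def __init__(self):
        self._data = {}

    def write(self, path, value):
        self._data[path] = value

    def read(self, path):
        return self._data.get(path)

    def children(self, prefix):
        """Return the names registered directly under `prefix`."""
        prefix = prefix.rstrip("/") + "/"
        return sorted({p[len(prefix):].split("/")[0]
                       for p in self._data if p.startswith(prefix)})


def dispatch_task(store, task_id, instance_count, rng=random):
    """Elect detection instances and announce the task on their paths."""
    online = store.children("/probes/online")
    if len(online) < instance_count:
        raise RuntimeError("not enough online detection instances")
    elected = sorted(rng.sample(online, instance_count))  # assumed election rule
    for name in elected:
        store.write(f"/probes/tasks/{name}",
                    {"task_id": task_id, "state": TASK_START})
    # The computing component elects its own executors from this single path.
    store.write("/compute/task", {"task_id": task_id, "state": TASK_START})
    return elected


def acknowledge(store, name):
    """What a detection instance writes back after constructing its task."""
    path = f"/probes/tasks/{name}"
    entry = store.read(path)
    if entry and entry["state"] == TASK_START:
        store.write(path, {**entry, "state": TASK_BUILT})
```

The central control component would then read the instance paths back and decide, from the collected state codes, whether the task may continue or must be terminated and rerun.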
The task information includes:
task number;
detection range: the whole rID range of the I2P distributed database is divided by the number of detection instances;
flooding node list: all fetched flooding nodes are partitioned among the detection instances, so that the qps on any single flooding node does not become too large and slow its responses;
detection timeout: the timeout of each detection message, which prevents the whole task from running too slowly;
number of detection retries: whether to retry, and how many times, after a detection instance fails to receive a response;
detection connection pool size: the size of the connection pool maintained by a detection instance; too large a pool may cause memory overflow, and too small a pool may cause detection tasks to block excessively;
message queue qps: the rate at which the message queue writes node information into the computing component, which prevents a backlog in the computing component; when the message queue buffer overflows, a discarding strategy is adopted to keep the whole task running normally;
aggregation timeout: the time after which an information stream at the aggregation node times out, provided to the computing component; too long a timeout may cause a backlog in the computing component, and too short a timeout leaves a large amount of information unaggregated;
calculation indexes: the indexes the computing component needs to calculate and how to calculate them, including operators, default values and the like;
task timeout: the maximum time the whole task may run;
external parameter configuration: other parameters of the task that are passed through to the components of the system, such as specified detection instances, ignorable exception codes, and so on.
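By way of non-limiting illustration, the task information listed above can be gathered into one structure per detection instance. The class `TaskContext`, its field names and every default value are placeholders chosen for this sketch; the 256-bit key space is an assumption about the rID, and the round-robin split of flooding nodes is only one possible way to partition them.

```python
from dataclasses import dataclass, field

RID_SPACE = 2 ** 256  # assumed size of the rID key space


@dataclass
class TaskContext:
    task_id: str
    rid_range: tuple                  # (lower bound inclusive, upper bound exclusive)
    floodfills: list                  # flooding nodes assigned to this instance
    probe_timeout_s: float = 5.0      # placeholder defaults, not from the invention
    probe_retries: int = 1
    pool_size: int = 64
    queue_qps: int = 1000
    aggregation_timeout_s: float = 30.0
    task_timeout_s: float = 3600.0
    calc_indexes: dict = field(default_factory=dict)   # index name -> operator/default
    extra: dict = field(default_factory=dict)          # external parameter configuration


def split_range(instances, space=RID_SPACE):
    """Divide [0, space) into contiguous, non-overlapping ranges, one per instance."""
    if instances < 1:
        raise ValueError("at least one detection instance is required")
    step, rest = divmod(space, instances)
    bounds, low = [], 0
    for i in range(instances):
        high = low + step + (1 if i < rest else 0)  # spread the remainder
        bounds.append((low, high))
        low = high
    return bounds


def build_contexts(task_id, instance_names, floodfills):
    """Give each instance its own rID range and its own share of flooding nodes."""
    ranges = split_range(len(instance_names))
    count = len(instance_names)
    return {
        name: TaskContext(task_id, ranges[i], floodfills[i::count])
        for i, name in enumerate(instance_names)
    }
```

Because each flooding node appears in exactly one instance's share, the request rate seen by any single flooding node is bounded by the rate of one detection instance.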
Step two: each elected detection instance obtains the task context, then constructs and starts the task. It first applies to the connection pool for a connection resource, then constructs a detection message according to the rID and the flooding node information and sends the message. If the flooding node successfully returns node information, the node information is sent to the message queue and the rID is compared with the rID in the progress database; if the rID is larger than the one in the progress database, a read-write lock is taken and the data is written. If the request to the flooding node times out and the retries also time out, the request is discarded and the connection pool resource is released. The rID of the requests sent by a detection instance increases from the lower bound of the rID range in the task context. After a process or instance restarts, the detection instance pulls the number of its running task from the distributed file system, pulls the task context from the central control component according to that number, pulls the task progress from the progress database, and quickly resumes the task.
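By way of non-limiting illustration, the progress handling of step two can be sketched as a sequential loop. The names `ProgressStore` and `run_probe` are hypothetical, `query` stands in for sending a detection message to a flooding node, a plain mutex replaces the read-write lock, and the connection pool and concurrent sending are omitted for brevity.

```python
import threading


class ProgressStore:
    """In-memory stand-in for the progress database."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last_rid = {}

    def advance(self, task_id, rid):
        """Record `rid` only if it is larger than the stored value."""
        with self._lock:
            if rid > self._last_rid.get(task_id, -1):
                self._last_rid[task_id] = rid

    def last(self, task_id):
        with self._lock:
            return self._last_rid.get(task_id)


def run_probe(task_id, rid_range, query, emit, progress, retries=1, step=1):
    """Walk the rID range upward, resuming after the last recorded rID.

    `query(rid)` returns node information or raises TimeoutError;
    `emit(info)` hands the information to the message queue.
    """
    low, high = rid_range
    last = progress.last(task_id)
    rid = low if last is None else max(low, last + step)
    while rid < high:
        for _attempt in range(retries + 1):
            try:
                info = query(rid)
            except TimeoutError:
                continue                    # retry, or give up after the last attempt
            emit(info)
            progress.advance(task_id, rid)  # only successful requests move progress
            break
        rid += step                         # a request that kept timing out is discarded
```

After a restart, calling `run_probe` again with the same `ProgressStore` continues after the last rID that was answered instead of starting from the lower bound.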
Step three: the computing component obtains the task context, then constructs and starts the task. Node information enters the computing component from the consumer end of the message queue, and the task context is broadcast to every node of the computing component. In the aggregation node, the IP address in the real-time information is obtained, and the node's historical information is looked up in the database with the IP address as the key and enters the aggregation node. The real-time node information is copied: the original node information leaves through the main stream and falls into the node information base, while the copy and the historical node information are checked against the number of historical records given in the historical summary to decide whether aggregation is complete. Node information whose aggregation has completed or timed out enters the computing node, index calculation is performed according to the configured calculation indexes, and the result leaves through the side stream and falls into the node index base.
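By way of non-limiting illustration, the completion rule of the aggregation node can be sketched as a small keyed buffer. The class `Aggregator` and its method names are hypothetical, the historical summary is reduced to a mapping from IP address to the expected number of historical records, and a real deployment would use a stream processing framework rather than this single-process stand-in.

```python
import time


class Aggregator:
    """Joins one real-time record with the historical records of the same node."""

    def __init__(self, history_counts, timeout_s, clock=time.monotonic):
        self.history_counts = history_counts  # summary: ip -> number of historical records
        self.timeout_s = timeout_s
        self.clock = clock
        self.pending = {}                     # ip -> partially aggregated group

    def _slot(self, ip):
        return self.pending.setdefault(
            ip, {"live": None, "history": [], "since": self.clock()})

    def on_live(self, ip, record):
        # Keep a copy; the original record goes on to the node information base.
        self._slot(ip)["live"] = dict(record)
        return self.collect()

    def on_history(self, ip, record):
        self._slot(ip)["history"].append(record)
        return self.collect()

    def collect(self):
        """Return (ip, live, history) groups that are complete or have timed out."""
        now, ready = self.clock(), []
        for ip, slot in list(self.pending.items()):
            complete = (slot["live"] is not None and
                        len(slot["history"]) >= self.history_counts.get(ip, 0))
            if complete or now - slot["since"] >= self.timeout_s:
                ready.append((ip, slot["live"], slot["history"]))
                del self.pending[ip]
        return ready
```

Each group returned by `collect` corresponds to node information that has finished or timed out of aggregation and is ready for index calculation; a group released by timeout simply carries fewer historical records.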
The information from the detection component in step two enters the computing component in step three for index calculation; the node information and the calculated node indexes output by the computing component enter the data analysis component in step four to obtain the final result. Step three is stream computation, and step four is batch analysis of the data computed in step three.
Step four: after obtaining the task completion state codes of the computing component and the detection component, the central control component notifies the data analysis component to start working. First, nodes whose two indexes, the significance of the uniform distribution of rID and the proportion of online time longer than 24 hours, are too low are found and listed as suspected abnormal nodes. Setting the index threshold too high makes the analysis of suspected nodes too expensive, while setting it too low lets attack nodes slip through; based on the current number of I2P nodes and the estimated cost of a collusion attack, the threshold is set to the lowest 1% to 5%. Next, the historical online time and rID distribution of the suspected abnormal nodes are analyzed, and suspected abnormal nodes that are too close are listed as abnormal nodes; based on the efficiency curve of collusion attacks, too close means that more than 5 consecutive I2P nodes on the hash ring are separated by less than the average distance and have an online time of less than one hour. Finally, nodes that have excellent performance indexes such as average capacity, average bandwidth, average time delay and reachable proportion and are not abnormal nodes are listed as recommended nodes. Different election schemes can be chosen according to communication requirements: for example, nodes with larger bandwidth for file transfer, nodes with lower time delay for screen communication, and nodes with higher capacity for mixed communication.
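By way of non-limiting illustration, the three analysis passes of step four can be sketched over a dictionary of per-node indexes. The function names and dictionary keys are hypothetical. This sketch also assumes that a node is suspected only when both indexes lie in the lowest fraction, that closeness is checked among suspected nodes only, and that the wrap-around of the hash ring can be ignored; the invention does not fix these details.

```python
def lowest_fraction(values, fraction):
    """Return the keys whose value lies in the lowest `fraction` of all values."""
    count = max(1, int(len(values) * fraction))
    return set(sorted(values, key=values.get)[:count])


def suspected_nodes(nodes, fraction=0.05):
    """Nodes that are lowest in both rID uniformity and >24 h online proportion."""
    low_rid = lowest_fraction(
        {k: v["rid_uniformity"] for k, v in nodes.items()}, fraction)
    low_24h = lowest_fraction(
        {k: v["over_24h_ratio"] for k, v in nodes.items()}, fraction)
    return low_rid & low_24h


def abnormal_nodes(nodes, suspects, ring_size, min_run=6):
    """Runs of more than 5 short-lived suspects packed closer than the average gap."""
    avg_gap = ring_size / len(nodes)
    ordered = sorted((n for n in suspects if nodes[n]["online_hours"] < 1),
                     key=lambda n: nodes[n]["rid"])
    abnormal, run = set(), []
    for name in ordered:
        if run and nodes[name]["rid"] - nodes[run[-1]]["rid"] >= avg_gap:
            if len(run) >= min_run:
                abnormal.update(run)
            run = []                      # gap too wide: start a new run
        run.append(name)
    if len(run) >= min_run:
        abnormal.update(run)
    return abnormal


def recommend(nodes, abnormal, key, top_n=3, reverse=True):
    """Rank the non-abnormal nodes by one index, best first."""
    candidates = [n for n in nodes if n not in abnormal]
    return sorted(candidates, key=lambda n: nodes[n][key], reverse=reverse)[:top_n]
```

For file transfer one would call `recommend` with the bandwidth index; for delay-sensitive traffic one would rank by the time delay index with `reverse=False`, since a lower delay is better.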
Step five: the central control component writes the task end state code into the detection instance paths and the computing component path of the distributed file system, waits for each component to report the task release state code and clear its task path, and writes the task information and logs into the database. The central control component then obtains the task result from the data analysis component and obtains the generated netDB file.
Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus, and various modules thereof provided by the present invention in purely computer readable program code, the same procedures can be implemented entirely by logically programming method steps such that the systems, apparatus, and various modules thereof are provided in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, the device and the modules thereof provided by the present invention can be considered as a hardware component, and the modules included in the system, the device and the modules thereof for implementing various programs can also be considered as structures in the hardware component; modules for performing various functions may also be considered to be both software programs for performing the methods and structures within hardware components.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and those skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A full-node rapid acquisition and analysis system of an I2P network, comprising:
the central control component: managing all other components, acquiring task information through interaction with a user, and triggering various tasks;
a detection component: after receiving the context of the task, starting the task in parallel through each detection node, and quickly collecting all node information;
a computing component: consuming node information generated in real time and historical information in a library, and calculating indexes;
a data analysis component: after one detection task is finished, analyzing the index information falling into the database to obtain a recommended node and an abnormal node;
communication between the central control component and each of the detection component, the computing component and the data analysis component is carried out through a distributed application coordination service; the detection component and the computing component are connected by a message queue that supports long-lived connections.
2. The full-node rapid acquisition and analysis system of the I2P network of claim 1, wherein the central control component comprises:
a task triggering module: triggering node collection tasks, recommended-node return tasks and abnormal-node return tasks through user interaction or timed triggering;
a task management module: adopting a distributed file management architecture to uniformly manage all task contexts;
a task result module: handling the use of task results, including presenting the results and generating a netDB for direct use by I2P.
3. The full-node rapid acquisition analysis system of the I2P network of claim 1, wherein the probing component comprises:
a node detection module: generating a detection task according to the task context, and collecting node information of the I2P network;
a progress preservation module: saving the task progress, so that a task does not start from the beginning when it is restarted after an abnormal exit;
an information transmission module: rewriting the information as required and writing it to the producer end of the message queue.
4. The full-node rapid acquisition analysis system of the I2P network of claim 1, wherein the computation component comprises:
an information input module: connected to the consumer end of the message queue to receive streaming data, and connected to a database to receive batch data;
an information calculation module: aggregating the real-time node information and the historical node information, and performing index calculation with multiple operators;
an information output module: writing the calculation results into a database for the central control component to call;
an information transmission module: rewriting the information as required and writing it to the producer end of the message queue.
5. The full-node rapid acquisition analysis system of the I2P network of claim 4, wherein the data analysis component comprises:
a data storage module: storing various information and indexes of nodes in the I2P network;
a data analysis module: analyzing the node information and indexes that fall into the database of the data storage module to obtain recommended nodes and abnormal nodes.
6. A full-node fast-acquisition analysis method of an I2P network applying the full-node fast-acquisition analysis system of the I2P network according to any one of claims 1 to 5, comprising:
1) the central control component acquires task information and makes a task context through a front end and an open interface, acquires an online detection instance list through a distributed file system, and selects a detection instance;
2) the detection component obtains a task context for the selected detection instance, constructs and starts a task, and generates a state code for completing the task;
3) the computing component obtains the task context, constructs and starts a task, and generates a state code for completing the task;
4) after the central control component obtains the state codes of the computing component and the detection component for completing the tasks, the central control component informs the data analysis component to start working, and the data analysis component finds out the recommended nodes and the abnormal nodes;
5) the central control component writes the task end state code into the detection instance paths and the computing component path of the distributed file system, waits for each component to report the task release state code and clear its task path, and writes the task information and logs into a database; it then obtains the task result from the data analysis component and obtains the generated netDB file.
7. The method for full-node rapid acquisition and analysis of the I2P network according to claim 6, wherein the specific content of step 1) is as follows:
the central control component acquires task information through a front end or an open interface and produces a task context, acquires the list of online detection instances from the distributed file system, elects instances according to the number of detection instances the task requires, and writes the task number and the task start state code into the path of each elected instance; each elected detection instance observes the task start state code, requests the task context from the central control component, constructs the task according to the task context, and writes the state code of the construction result back to the distributed file system; the central control component writes the task number and the task start state code into the computing component's path in the distributed file system, and the master computing instance elects the instances that will execute the task, requests the task context from the central control component, constructs the task according to the task context, and writes the state code of the construction result back to the distributed file system; meanwhile, the central control component collects the task status of both components to decide whether to terminate and rerun the task, and monitors the distributed file system throughout to detect whether the task ends abnormally or correctly.
8. The method for full-node rapid acquisition and analysis of the I2P network according to claim 7, wherein the specific content of step 2) is as follows:
first, a connection resource is applied for from the connection pool; once it is obtained, a detection message is constructed according to the rID and the flooding node information, and the message is sent; if the flooding node successfully returns node information, the node information is sent to the message queue and the rID is compared with the rID in the progress database, and if the rID is larger than the one in the progress database, a read-write lock is taken and the data is written; if the request to the flooding node times out and the retries also time out, the request is discarded and the connection pool resource is released;
after a process or instance restarts for any reason, the detection instance pulls the number of its running task from the distributed file system, pulls the task context from the central control component according to that number, pulls the task progress from the progress database, and quickly resumes the task.
9. The method for full-node rapid acquisition and analysis of the I2P network according to claim 8, wherein the specific content in step 3) is as follows:
node information enters the computing component from the consumer end of the message queue, and the task context is broadcast to every node of the computing component; in the aggregation node, the IP address in the real-time information is obtained, and the node's historical information is looked up in the database with the IP address as the key and enters the aggregation node; the real-time node information is copied, the original node information leaves through the main stream and falls into the node information base, and the copy and the historical node information are checked against the number of historical records given in the historical summary to decide whether aggregation is complete; node information whose aggregation has completed or timed out enters the computing node, index calculation is performed according to the configured calculation indexes, and the result leaves through the side stream and falls into the node index base.
10. The method for full-node rapid acquisition and analysis of the I2P network according to claim 9, wherein the specific content of step 4) is as follows:
after the central control component obtains the task completion state codes of the computing component and the detection component, the central control component notifies the data analysis component to start working; first, nodes whose two indexes, the significance of the uniform distribution of rID and the proportion of online time longer than 24 hours, are too low are found and listed as suspected abnormal nodes; the historical online time and rID distribution of the suspected abnormal nodes are analyzed, and suspected abnormal nodes that are too close are listed as abnormal nodes; finally, nodes that have excellent performance indexes, including average capacity, average bandwidth, average time delay and reachable proportion, and are not abnormal nodes are listed as recommended nodes.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111262743.2A CN113986841A (en) 2021-10-28 2021-10-28 All-node rapid acquisition and analysis system and method for I2P network

Publications (1)

Publication Number Publication Date
CN113986841A true CN113986841A (en) 2022-01-28

Family

ID=79743373

Country Status (1)

Country Link
CN (1) CN113986841A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination