CN114301757A - Network asset processing method, device, equipment and storage medium - Google Patents

Network asset processing method, device, equipment and storage medium

Info

Publication number
CN114301757A
Authority
CN
China
Prior art keywords
network
traffic
asset
network asset
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111424684.4A
Other languages
Chinese (zh)
Inventor
周君宇
谭凌霄
于旸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Abstract

The application discloses a network asset processing method relating to cloud security and data protection technology. The method comprises the following steps: acquiring a first traffic feature set of each network asset from a traffic database; counting the first traffic feature set of each network asset according to K preset attribute types to obtain a first feature vector of each network asset; clustering the N network assets according to the first feature vector of each network asset until a clustering stop condition is met, to obtain at least one clustering result for a first time window; and marking the at least one clustering result for the first time window to obtain an asset identification tag of each network asset. A related apparatus, device, and storage medium are also provided. On the one hand, the application saves the labor cost required for network asset management and improves the real-time performance of network asset identification. On the other hand, it effectively avoids the situation in which network asset identification affects normal services.

Description

Network asset processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of information security technologies, and in particular, to a method, an apparatus, a device, and a storage medium for processing a network asset.
Background
In recent years, cyberspace has become the foundation on which various social systems exist and develop; once a network is attacked, the operation of various infrastructures is affected. Network assets are flexible and changeable, and many of them change in real time. It is therefore important to identify and map the assets in cyberspace.
Cyberspace mapping is a technology that obtains the relevant attributes of network assets through network detection, analysis, and similar means, and makes them available for analysis. Network assets are mainly the various devices used in computer networks, including hosts, routers, switches, firewalls, and the like. Traditional cyberspace mapping mostly relies on manual statistics or active detection. Active detection means actively sending probe messages to a target, obtaining a fingerprint from the returned messages, and comparing the fingerprint.
The inventors found at least the following problems in the existing schemes. Because the asset types in a network environment are numerous and change rapidly, manual management can hardly keep track of a complicated and variable network environment. Active detection requires sending a large amount of traffic, which may affect normal services.
Disclosure of Invention
The embodiments of the application provide a network asset processing method, apparatus, device, and storage medium. On the one hand, the application saves the labor cost required for network asset management and improves the real-time performance of network asset identification. On the other hand, it effectively avoids the situation in which network asset identification affects normal services.
In view of the above, an aspect of the present application provides a network asset processing method, including:
acquiring a first traffic characteristic set corresponding to each network asset in N network assets to be identified from a traffic database, wherein the first traffic characteristic set is obtained by extracting original traffic based on a first time window, and N is an integer greater than 1;
counting a first traffic feature set corresponding to each network asset according to K preset attribute types to obtain a first feature vector corresponding to each network asset, wherein the first feature vector comprises K first feature values, each first feature value corresponds to one preset attribute type, and K is an integer greater than or equal to 1;
clustering the N network assets according to the first feature vector corresponding to each network asset until a clustering stopping condition is met, and obtaining at least one clustering result aiming at a first time window;
and marking at least one clustering result aiming at the first time window to obtain an asset identification tag corresponding to each network asset.
Another aspect of the present application provides a network asset processing apparatus, including:
an acquisition module, used for acquiring, from a traffic database, a first traffic feature set corresponding to each network asset in N network assets to be identified, wherein the first traffic feature set is obtained by extracting original traffic based on a first time window, and N is an integer greater than 1;
the statistical module is used for carrying out statistics on a first traffic feature set corresponding to each network asset according to K preset attribute types to obtain a first feature vector corresponding to each network asset, wherein the first feature vector comprises K first feature values, each first feature value corresponds to one preset attribute type, and K is an integer greater than or equal to 1;
the clustering module is used for clustering the N network assets according to the first characteristic vector corresponding to each network asset until a clustering stopping condition is met, and obtaining at least one clustering result aiming at a first time window;
and the marking module is used for marking at least one clustering result aiming at the first time window to obtain an asset identification tag corresponding to each network asset.
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the network asset processing apparatus further includes a recording module;
the acquisition module is further used for acquiring original traffic from gateway equipment, wherein the gateway equipment comprises at least one of routing equipment, a firewall and a switch;
the acquisition module is also used for acquiring at least one message from the original flow;
the acquisition module is further used for extracting the characteristics of at least one message to obtain flow characteristics, wherein the flow characteristics comprise a timestamp, a source Internet Protocol (IP) address, a destination IP address, a source port, a destination port, message size and service type;
and the recording module is used for recording the flow characteristics into the flow database.
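As an illustration of the traffic features listed above, the following is a minimal sketch of one possible record layout and of the recording step. The names `FlowFeature` and `TrafficDatabase` are hypothetical, and the in-memory list merely stands in for the traffic database, whose actual storage is not specified in the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FlowFeature:
    """One traffic feature extracted from a message (field names are assumptions)."""
    timestamp: float   # seconds since the epoch
    src_ip: str        # source IP address
    dst_ip: str        # destination IP address
    src_port: int      # source port
    dst_port: int      # destination port
    size: int          # message size in bytes
    service_type: str  # service type, e.g. "https"


class TrafficDatabase:
    """In-memory stand-in for the traffic database (hypothetical)."""

    def __init__(self) -> None:
        self._rows: List[FlowFeature] = []

    def record(self, feature: FlowFeature) -> None:
        # Recording module: append the extracted traffic feature.
        self._rows.append(feature)

    def all(self) -> List[FlowFeature]:
        # Return a copy so callers cannot mutate the stored rows.
        return list(self._rows)


db = TrafficDatabase()
db.record(FlowFeature(1700000000.0, "10.0.0.5", "93.184.216.34",
                      51514, 443, 1200, "https"))
```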
In one possible design, in another implementation of another aspect of an embodiment of the present application,
the acquisition module is specifically used for acquiring a message set to be processed from original flow;
classifying each message in a message set to be processed to obtain at least one message belonging to the same protocol type;
the acquisition module is specifically used for performing message recombination on at least one message belonging to the same protocol type to obtain an application layer message;
and extracting the characteristics of the application layer message to obtain the flow characteristics.
In one possible design, in another implementation of another aspect of an embodiment of the present application,
the acquisition module is specifically used for acquiring a target traffic characteristic set of a timestamp in a first time window from a traffic database;
regarding each network asset, taking the traffic characteristics of the source IP address belonging to the asset IP address in the target traffic characteristic set as first traffic characteristics in a first traffic characteristic set, wherein the asset IP address is the IP address of the network asset;
and regarding the traffic characteristics of which the destination IP address belongs to the asset IP address in the target traffic characteristic set as the first traffic characteristics in the first traffic characteristic set for each network asset.
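The selection described above can be sketched as follows. This is only an illustration: the `Flow` tuple, the function name, and the half-open window `[window_start, window_end)` are assumptions, since the application does not state how window boundaries are handled.

```python
from typing import Dict, List, NamedTuple, Sequence


class Flow(NamedTuple):
    timestamp: float
    src_ip: str
    dst_ip: str
    size: int


def first_feature_sets(flows: Sequence[Flow], asset_ips: Sequence[str],
                       window_start: float, window_end: float) -> Dict[str, List[Flow]]:
    """Build the first traffic feature set of every asset for one time window."""
    # Target traffic feature set: features whose timestamp lies in the window.
    in_window = [f for f in flows if window_start <= f.timestamp < window_end]
    sets: Dict[str, List[Flow]] = {ip: [] for ip in asset_ips}
    for f in in_window:
        if f.src_ip in sets:                          # source IP is an asset IP
            sets[f.src_ip].append(f)
        if f.dst_ip in sets and f.dst_ip != f.src_ip:  # destination IP is an asset IP
            sets[f.dst_ip].append(f)
    return sets


flows = [
    Flow(10, "10.0.0.5", "8.8.8.8", 100),
    Flow(20, "8.8.8.8", "10.0.0.5", 300),
    Flow(99, "10.0.0.5", "8.8.8.8", 50),    # outside the window below
    Flow(15, "10.0.0.6", "10.0.0.5", 70),
]
sets = first_feature_sets(flows, ["10.0.0.5", "10.0.0.6"], 0, 60)
```

A feature between two assets is placed in both assets' sets, which follows from applying the source rule and the destination rule independently.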
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the K preset attribute types include at least one of an uplink traffic total, a downlink traffic total, an uplink and downlink traffic total, an uplink traffic proportion, and a downlink traffic proportion;
the statistical module is specifically used for, for each network asset, if the K preset attribute types include the uplink traffic total, summing the message sizes included in the first traffic features in the first traffic feature set whose source IP address belongs to the asset IP address, to obtain the first feature value corresponding to the uplink traffic total in the first feature vector, wherein the asset IP address is the IP address of the network asset;
for each network asset, if the K preset attribute types include the downlink traffic total, summing the message sizes included in the first traffic features in the first traffic feature set whose destination IP address belongs to the asset IP address, to obtain the first feature value corresponding to the downlink traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the uplink and downlink traffic total, summing the first feature value corresponding to the uplink traffic total and the first feature value corresponding to the downlink traffic total, to obtain the first feature value corresponding to the uplink and downlink traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the uplink traffic proportion, calculating the ratio between the first feature value corresponding to the uplink traffic total and the first feature value corresponding to the uplink and downlink traffic total, to obtain the first feature value corresponding to the uplink traffic proportion in the first feature vector;
for each network asset, if the K preset attribute types include the downlink traffic proportion, calculating the ratio between the first feature value corresponding to the downlink traffic total and the first feature value corresponding to the uplink and downlink traffic total, to obtain the first feature value corresponding to the downlink traffic proportion in the first feature vector.
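The uplink and downlink statistics can be illustrated with the short sketch below. It assumes that each proportion is taken relative to the combined uplink and downlink total, and that a proportion is 0.0 when the asset has no traffic; the `Flow` tuple and the dictionary keys are hypothetical names.

```python
from typing import Dict, NamedTuple, Sequence


class Flow(NamedTuple):
    src_ip: str
    dst_ip: str
    size: int


def direction_features(flows: Sequence[Flow], asset_ip: str) -> Dict[str, float]:
    """Compute the five uplink/downlink feature values for one asset."""
    up = sum(f.size for f in flows if f.src_ip == asset_ip)    # asset is the source
    down = sum(f.size for f in flows if f.dst_ip == asset_ip)  # asset is the destination
    total = up + down
    return {
        "uplink_total": up,
        "downlink_total": down,
        "updown_total": total,
        # Assumption: an idle asset gets proportion 0.0 instead of dividing by zero.
        "uplink_ratio": up / total if total else 0.0,
        "downlink_ratio": down / total if total else 0.0,
    }


features = direction_features(
    [Flow("10.0.0.5", "8.8.8.8", 100), Flow("8.8.8.8", "10.0.0.5", 300)],
    "10.0.0.5",
)
# uplink_total = 100, downlink_total = 300, uplink_ratio = 0.25
```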
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the K preset attribute types include at least one of an intranet traffic total, an extranet traffic total, an intranet and extranet traffic total, an intranet traffic ratio, and an extranet traffic ratio;
the statistical module is specifically used for, for each network asset, if the K preset attribute types include the intranet traffic total, summing the message sizes included in the first traffic features in the first traffic feature set whose source IP address and destination IP address both belong to intranet addresses, to obtain the first feature value corresponding to the intranet traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the extranet traffic total, summing the message sizes included in the first traffic features in the first traffic feature set whose source IP address or destination IP address belongs to an extranet address, to obtain the first feature value corresponding to the extranet traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the intranet and extranet traffic total, summing the first feature value corresponding to the intranet traffic total and the first feature value corresponding to the extranet traffic total, to obtain the first feature value corresponding to the intranet and extranet traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the intranet traffic proportion, calculating the ratio between the first feature value corresponding to the intranet traffic total and the first feature value corresponding to the intranet and extranet traffic total, to obtain the first feature value corresponding to the intranet traffic proportion in the first feature vector;
for each network asset, if the K preset attribute types include the extranet traffic proportion, calculating the ratio between the first feature value corresponding to the extranet traffic total and the first feature value corresponding to the intranet and extranet traffic total, to obtain the first feature value corresponding to the extranet traffic proportion in the first feature vector.
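The intranet and extranet statistics can be sketched in the same way. The application does not say how an address is judged to be an intranet address; the sketch assumes, purely for illustration, that private addresses (as reported by Python's standard `ipaddress` module) are intranet addresses and all others are extranet addresses. The `Flow` tuple and dictionary keys are hypothetical.

```python
import ipaddress
from typing import Dict, NamedTuple, Sequence


class Flow(NamedTuple):
    src_ip: str
    dst_ip: str
    size: int


def is_intranet(ip: str) -> bool:
    # Assumption: private address ranges stand in for "intranet addresses".
    return ipaddress.ip_address(ip).is_private


def network_side_features(flows: Sequence[Flow]) -> Dict[str, float]:
    """Compute the five intranet/extranet feature values from one feature set."""
    intranet = 0
    extranet = 0
    for f in flows:
        if is_intranet(f.src_ip) and is_intranet(f.dst_ip):
            intranet += f.size   # both ends are intranet addresses
        else:
            extranet += f.size   # source or destination is an extranet address
    total = intranet + extranet
    return {
        "intranet_total": intranet,
        "extranet_total": extranet,
        "inout_total": total,
        "intranet_ratio": intranet / total if total else 0.0,
        "extranet_ratio": extranet / total if total else 0.0,
    }


side = network_side_features([
    Flow("10.0.0.5", "10.0.0.9", 200),
    Flow("10.0.0.5", "8.8.8.8", 600),
    Flow("8.8.8.8", "10.0.0.5", 200),
])
```

In a real deployment the intranet ranges would more likely come from configuration than from a blanket private-address check.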
In one possible design, in another implementation manner of another aspect of this embodiment of the present application, the K preset attribute types include at least one of a maximum-proportion source port, a maximum-proportion destination port, a maximum-proportion source port proportion, and a maximum-proportion destination port proportion;
the statistical module is specifically used for, for each network asset, if the K preset attribute types include the maximum-proportion source port, determining the total traffic of the source ports and the traffic of each source port according to the first traffic feature set, and taking the source port with the largest traffic proportion as the first feature value corresponding to the maximum-proportion source port in the first feature vector;
for each network asset, if the K preset attribute types include the maximum-proportion destination port, determining the total traffic of the destination ports and the traffic of each destination port according to the first traffic feature set, and taking the destination port with the largest traffic proportion as the first feature value corresponding to the maximum-proportion destination port in the first feature vector;
for each network asset, if the K preset attribute types include the maximum-proportion source port proportion, calculating the ratio between the traffic corresponding to the maximum-proportion source port and the total traffic of the source ports, to obtain the first feature value corresponding to the maximum-proportion source port proportion in the first feature vector;
for each network asset, if the K preset attribute types include the maximum-proportion destination port proportion, calculating the ratio between the traffic corresponding to the maximum-proportion destination port and the total traffic of the destination ports, to obtain the first feature value corresponding to the maximum-proportion destination port proportion in the first feature vector.
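The port statistics can be illustrated with one helper that serves both source and destination ports. This is a sketch only: the function name is hypothetical, traffic is measured as summed message size, and ties between ports are resolved in favor of the port seen first, which the application does not specify.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def dominant_port(ports_and_sizes: Iterable[Tuple[int, int]]) -> Tuple[Optional[int], float]:
    """Return (port with the largest traffic share, that share).

    The input is a sequence of (port, message size) pairs taken from either
    the source ports or the destination ports of one traffic feature set.
    """
    traffic: Counter = Counter()
    for port, size in ports_and_sizes:
        traffic[port] += size            # traffic of each port
    total = sum(traffic.values())        # total traffic over all ports
    if total == 0:
        return None, 0.0                 # assumption: no traffic means no dominant port
    port, volume = max(traffic.items(), key=lambda kv: kv[1])
    return port, volume / total


dst_port, dst_share = dominant_port([(443, 700), (80, 200), (443, 100)])
# dst_port = 443, dst_share = 0.8
```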
In one possible design, in another implementation of another aspect of an embodiment of the present application,
the clustering module is specifically used for acquiring preset K weighted values, wherein each weighted value corresponds to a preset attribute type;
for each network asset, determining distances between the network asset and T clustering centers according to K weighted values and a first feature vector, and dividing the network asset into clustering clusters with the shortest distances, wherein T is an integer greater than 1;
if the clustering stop condition is met, obtaining at least one clustering result aiming at the first time window;
and if the clustering stop condition is not met, updating the T clustering centers.
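The clustering step above resembles a K-means loop with a per-attribute weight in the distance. The sketch below is one possible reading, not the application's definitive procedure: it assumes a weighted Euclidean distance, takes the first T feature vectors as initial cluster centers, and treats "assignments no longer change, or a maximum number of iterations is reached" as the clustering stop condition. All function and parameter names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def weighted_distance(x: Sequence[float], c: Sequence[float], w: Sequence[float]) -> float:
    """Weighted Euclidean distance (assumed form of the weighted distance)."""
    return math.sqrt(sum(wk * (xk - ck) ** 2 for xk, ck, wk in zip(x, c, w)))


def cluster_assets(vectors: Sequence[Sequence[float]], weights: Sequence[float],
                   t: int, max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Cluster feature vectors into t clusters; requires len(vectors) >= t."""
    centers = [list(v) for v in vectors[:t]]   # assumed initialisation
    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        # Assign each asset to the cluster whose center is nearest.
        new_labels = [
            min(range(t), key=lambda j: weighted_distance(v, centers[j], weights))
            for v in vectors
        ]
        if new_labels == labels:               # assumed stop condition
            break
        labels = new_labels
        # Stop condition not met: update the t cluster centers.
        for j in range(t):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


vectors = [[0.1, 0.9], [0.9, 0.1], [0.15, 0.85], [0.85, 0.2]]
labels, centers = cluster_assets(vectors, weights=[1.0, 1.0], t=2)
# labels = [0, 1, 0, 1]
```

Raising one weight makes differences in that attribute count more heavily when assets are assigned to clusters.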
In one possible design, in another implementation of another aspect of an embodiment of the present application,
the marking module is specifically used for displaying at least one clustering result of the first time window;
and responding to a labeling instruction aiming at each clustering result, and determining an asset identification tag corresponding to each network asset, wherein the labeling instruction carries the asset identification tag.
In one possible design, in another implementation of another aspect of an embodiment of the present application,
the marking module is specifically used for setting the asset identification tag corresponding to each network asset in the clustering result as a production network tag if the uplink flow ratio average value of all the network assets in the clustering result is less than or equal to a first ratio threshold value for each clustering result of a first time window;
for each clustering result of the first time window, if the average value of the external network flow proportion of all the network assets in the clustering result is greater than or equal to a second proportion threshold, setting the asset identification tag corresponding to each network asset in the clustering result as an office network tag;
and aiming at each clustering result of the first time window, if the intranet flow ratio average value of all the network assets in the clustering result is greater than or equal to a third ratio threshold, setting the asset identification tag corresponding to each network asset in the clustering result as a development network tag.
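The threshold rules above can be sketched as a small function. The threshold values, the dictionary keys, and the order in which the rules are tried are assumptions made for illustration; the application names the three thresholds but gives neither values nor a precedence between the rules in this passage.

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def label_cluster(members: Sequence[Dict[str, float]],
                  first_threshold: float = 0.3,    # hypothetical value
                  second_threshold: float = 0.6,   # hypothetical value
                  third_threshold: float = 0.8     # hypothetical value
                  ) -> Optional[str]:
    """Return the tag for every asset of one clustering result, or None."""
    avg_uplink = mean([m["uplink_ratio"] for m in members])
    avg_extranet = mean([m["extranet_ratio"] for m in members])
    avg_intranet = mean([m["intranet_ratio"] for m in members])
    # Assumption: the rules are tried in the order in which they are listed.
    if avg_uplink <= first_threshold:
        return "production"
    if avg_extranet >= second_threshold:
        return "office"
    if avg_intranet >= third_threshold:
        return "development"
    return None   # no rule matched; left for manual labeling


tag = label_cluster([
    {"uplink_ratio": 0.5, "extranet_ratio": 0.7, "intranet_ratio": 0.3},
    {"uplink_ratio": 0.6, "extranet_ratio": 0.9, "intranet_ratio": 0.1},
])
# tag = "office"
```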
In one possible design, in another implementation of another aspect of the embodiments of the present application, a network asset processing apparatus includes a culling module;
the acquisition module is further used for acquiring a second traffic characteristic set corresponding to each network asset from the traffic database, wherein the second traffic characteristic set is extracted based on the original traffic in a second time window;
the statistical module is further configured to perform statistics on a second traffic feature set corresponding to each network asset according to the K preset attribute types to obtain a second feature vector corresponding to each network asset, where the second feature vector includes K second feature values, and each second feature value corresponds to one preset attribute type;
and the removing module is used for removing the network assets from the clustering results corresponding to the first clustering center if the distance between the first characteristic vector corresponding to the network assets and the first clustering center is greater than the distance between the second characteristic vector corresponding to the network assets and the second clustering center.
In one possible design, in another implementation of another aspect of an embodiment of the present application,
and the clustering module is further used for clustering each network asset according to a second feature vector corresponding to the network asset if the network asset is a newly added network asset in a second time window until a clustering stop condition is met, and obtaining at least one clustering result aiming at the second time window.
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the network asset processing apparatus further includes an operation and maintenance module;
the operation and maintenance module is used for, after the at least one clustering result for the first time window is marked to obtain the asset identification tag corresponding to each network asset, triggering an operation and maintenance alarm operation when a network asset whose asset identification tag is a development network tag accesses a network asset with the office network tag;
the operation and maintenance module is further used for, after the at least one clustering result for the first time window is marked to obtain the asset identification tag corresponding to each network asset, triggering an operation and maintenance alarm operation when a network asset whose asset identification tag is an office network tag accesses a network asset with the production network tag;
and the operation and maintenance module is further used for, after the at least one clustering result for the first time window is marked to obtain the asset identification tag corresponding to each network asset, triggering an operation and maintenance alarm operation when a network asset whose asset identification tag is a production network tag accesses a network asset with the office network tag.
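The three alarm conditions above amount to a lookup over (accessing tag, accessed tag) pairs. A minimal sketch follows, with hypothetical names and tag strings, and with the tags held in a plain dictionary keyed by asset IP address:

```python
from typing import Dict

# (tag of the accessing asset, tag of the accessed asset) pairs that raise an alarm.
ALARM_PAIRS = {
    ("development", "office"),
    ("office", "production"),
    ("production", "office"),
}


def should_alarm(tags: Dict[str, str], src_ip: str, dst_ip: str) -> bool:
    """Return True if an access from src_ip to dst_ip should trigger an alarm."""
    # Assets without a tag never match a pair, so they raise no alarm here.
    return (tags.get(src_ip), tags.get(dst_ip)) in ALARM_PAIRS


tags = {"10.0.1.5": "development", "10.0.2.7": "office", "10.0.3.9": "production"}
alarm = should_alarm(tags, "10.0.2.7", "10.0.3.9")   # office asset accesses production asset
```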
Another aspect of the present application provides a computer device, comprising: a memory, a processor, and a bus system;
wherein, the memory is used for storing programs;
a processor for executing the program in the memory, the processor for performing the above-described aspects of the method according to instructions in the program code;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
Another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when executed on a computer, cause the computer to perform the method of the above-described aspects.
In another aspect of the application, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided by the above aspects.
According to the technical scheme, the embodiment of the application has the following advantages:
in the embodiment of the application, a network asset processing method is provided, and first traffic characteristic sets corresponding to each network asset are respectively obtained from a traffic database based on N network assets to be identified. Then, the first traffic feature set corresponding to each network asset may be counted according to the K preset attribute types to obtain a first feature vector corresponding to each network asset. Next, according to the first feature vector corresponding to each network asset, clustering the N network assets until a clustering stop condition is met, and obtaining at least one clustering result for the first time window. And finally, marking at least one clustering result aiming at the first time window to obtain an asset identification tag corresponding to each network asset. By the method, the purpose of automatically identifying the network assets can be achieved based on flow monitoring and feature clustering, on one hand, the problem of low manual management efficiency is solved, the labor cost required by network asset management is saved, and the real-time property of network asset identification is favorably improved. On the other hand, a large amount of flow does not need to be sent actively, so that the condition that normal service is influenced due to network asset identification can be effectively avoided.
Drawings
FIG. 1 is a schematic diagram of an architecture of an intrusion detection system according to an embodiment of the present application;
FIG. 2 is a block diagram of an embodiment of an intrusion prevention system;
FIG. 3 is a block diagram of a network asset processing system in an embodiment of the present application;
FIG. 4 is a schematic flow chart of a network asset processing method in an embodiment of the present application;
FIG. 5 is a network topology diagram for implementing port mirroring in an embodiment of the present application;
FIG. 6 is a schematic flow chart of flow characteristic extraction in the embodiment of the present application;
FIG. 7 is a diagram illustrating an application of a firewall to an internal network and an external network according to an embodiment of the present application;
FIG. 8 is a diagram illustrating an application of a firewall to an internal network according to an embodiment of the present application;
FIG. 9 is a schematic illustration of an interface for setting asset identification tags in an embodiment of the present application;
FIG. 10 is a schematic flow chart of the network asset tagging in the embodiment of the present application;
FIG. 11 is a schematic flow chart illustrating updating of a network asset tag according to an embodiment of the present application;
FIG. 12 is another schematic flow chart diagram illustrating a method for network asset processing according to an embodiment of the present application;
FIG. 13 is a schematic diagram of a network asset processing device in an embodiment of the present application;
FIG. 14 is a schematic structural diagram of a terminal device in an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a server in an embodiment of the present application.
Detailed Description
The embodiments of the application provide a network asset processing method, apparatus, device, and storage medium. On the one hand, the application saves the labor cost required for network asset management and improves the real-time performance of network asset identification. On the other hand, it effectively avoids the situation in which network asset identification affects normal services.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "corresponding" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Information security has been an important concern for businesses and institutions. With the change of enterprise business organization mode, the protection object and protection means of the information security protection product are continuously updated and upgraded. In an enterprise security operation system, the network asset security belongs to the security foundation, so the asset security management work is very important. In the network asset management, identifying the network assets is one of basic functions of risk management, mapping of the network assets is achieved through identification results, and enterprise personnel can be assisted to know safety conditions of internal equipment and systems in time. Meanwhile, as internet online services are increasingly complex, more and more devices, systems and services migrate to the cloud, and an open or semi-open platform is formed. Therefore, identification of network assets is also widely used in the field of cloud security (cloud security).
Cloud security refers to the generic name of security software, hardware, users, organizations, and security cloud platforms applied based on cloud computing business models. The cloud security integrates emerging technologies and concepts such as parallel processing, grid computing and unknown virus behavior judgment, abnormal monitoring of software behaviors in the network is achieved through a large number of meshed clients, the latest information of trojans and malicious programs in the internet is obtained and sent to the server for automatic analysis and processing, and then the virus and trojan solution is distributed to each client.
The main research directions of cloud security include: firstly, cloud computing security mainly researches how to guarantee the security of cloud and various applications on the cloud, including cloud computer system security, user data security storage and isolation, user access authentication, information transmission security, network attack protection, compliance audit and the like; secondly, cloud computing of the security infrastructure is mainly used for researching how to adopt cloud computing to newly build and integrate security infrastructure resources and optimize a security protection mechanism, and the cloud computing technology is used for constructing a super-large-scale security event and information acquisition and processing platform, so that acquisition and correlation analysis of mass information are realized, and the handling control capability and the risk control capability of the security event of the whole network are improved; thirdly, the cloud security service mainly researches various security services, such as anti-virus services and the like, provided for users based on the cloud computing platform.
Based on the above, the application provides a network asset processing method based on traffic features, which addresses asset mapping in network asset management scenarios. The method can be applied to various software and hardware products such as firewalls, asset management systems, Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS), to automatically identify the corresponding network assets and apply focused protection to core applications. With the network asset processing method provided by the application, network assets can be mapped automatically, which effectively improves the identification rate of network assets, eases the problems that network asset management is complex and that different network assets need different protection strategies, and improves the security of the user network.
For ease of understanding, the architecture of the IDS and IPS will be described below in conjunction with the figures.
Referring to fig. 1, fig. 1 is a schematic diagram of an intrusion detection system according to an embodiment of the present invention, in which an IDS is a system for locating and identifying malicious traffic by monitoring network traffic in real time, and in a network system, the IDS is usually deployed behind a firewall, and besides, a plurality of IDSs may be deployed to further improve the overall security of the network. In the application scene of IDS, the network information can be analyzed, and an alarm can be given in time when malicious activities are found. Referring to fig. 2, fig. 2 is a schematic diagram of an architecture of an intrusion prevention system in an embodiment of the present application, where as shown, an IPS is mainly in an online mode, which solves a problem that an IDS cannot block an intrusion, and can not only detect the intrusion but also intercept the intrusion. In an application scene of an IPS (intrusion prevention system), network information can be analyzed, and malicious activities can be intercepted in time when being found.
Taking the organization of an enterprise as an example, a firewall is a hardware device or a software system that is mainly deployed between an internal network and an external network. Internal networks include, but are not limited to, research and development departments, financial departments, and marketing departments, among others. The isolation area includes, but is not limited to, a mail server, a web server, a domain name resolution server, and the like.
In order to save the labor cost required by network asset management and improve the real-time performance of network asset identification, the application provides a network asset processing method, which is applied to the network asset processing system shown in fig. 3. The hardware implementation part of the network asset processing system includes a server and a terminal device, and a client is deployed on the terminal device. The client can run on the terminal device in the form of a browser or in the form of an independent application (APP); the specific presentation form of the client is not limited herein. The server related to the present application may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like. The terminal device may be a smart phone, a tablet computer, a notebook computer, a palm computer, a personal computer, a smart television, a smart watch, a vehicle-mounted device, a wearable device, and the like, but is not limited thereto. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein. The number of servers and terminal devices is not limited. The scheme provided by the application can be completed by the terminal device alone, by the server alone, or by the terminal device and the server in cooperation, which is not particularly limited in the application.
The software implementation part of the network asset processing system comprises a flow acquisition module, a historical flow database module, an asset marking module, an asset and tag module and a behavior monitoring module. The flow acquisition module is used for acquiring data and performing primary feature extraction. The historical flow database module is used for providing a storage function for the flow acquisition module and providing a query function for the asset marking module. The asset marking module is used for analyzing the network assets from the flow by taking the network assets as dimensions and marking the network assets. The asset and tag module is used for generating corresponding protection rules. And the behavior monitoring module monitors the flow in real time according to the protection rule of the asset marking module.
With reference to fig. 4, a method for identifying a network asset in the present application will be described below, and an embodiment of a method for processing a network asset in the present application includes:
110. acquiring a first traffic characteristic set corresponding to each network asset in N network assets to be identified from a traffic database, wherein the first traffic characteristic set is obtained by extracting original traffic based on a first time window, and N is an integer greater than 1;
in one or more embodiments, a network asset processing device may determine N network assets to identify. For example, all or a portion of the network assets in an enterprise are considered as network assets to be identified. The network asset processing device extracts a first traffic characteristic of each network asset in a first time window from a traffic database, wherein the first traffic characteristic is acquired based on original traffic. The N first flow characteristics constitute a first flow characteristic set.
It is understood that the network assets referred to in this application are objects of Internet Technology (IT) security management, including but not limited to information (or data), hardware, software, funds, services, personnel, and the like. This application focuses on network services, which cover information such as the Internet Protocol (IP) address, Media Access Control (MAC) address, open ports, services, vendor, model, and version of systems such as hosts, network devices, security devices, middleware, databases, big data platforms, and virtualization platforms.
It is understood that the size of the first time window can be flexibly set; generally, the first time window can be set to one week, half a month, and the like, which is not limited herein.
It should be noted that the network asset processing apparatus may be deployed in a server, or in a terminal device, or in a system composed of a server and a terminal device, and is not limited herein.
120. Counting a first traffic feature set corresponding to each network asset according to K preset attribute types to obtain a first feature vector corresponding to each network asset, wherein the first feature vector comprises K first feature values, each first feature value corresponds to one preset attribute type, and K is an integer greater than or equal to 1;
in one or more embodiments, after obtaining the first traffic feature set, the network asset processing apparatus needs to perform statistics on the first traffic features, that is, to perform statistics on feature vectors of each network asset according to different preset attribute types. Assuming that there are K preset attribute types, the feature vector obtained by statistics includes K first feature values.
130. Clustering the N network assets according to the first feature vector corresponding to each network asset until a clustering stopping condition is met, and obtaining at least one clustering result aiming at a first time window;
in one or more embodiments, after obtaining the N first feature vectors, the network asset processing apparatus may cluster the N first feature vectors by using a clustering algorithm, stop clustering once a clustering stop condition is satisfied, and obtain at least one clustering result for the first time window. In clustering, the N network assets are divided into different classes or clusters according to a specific criterion (e.g., a distance criterion), so that the network assets in the same cluster are as similar as possible, while the network assets in different clusters differ as much as possible. That is, after clustering, data of the same class are gathered together as much as possible, and data of different classes are separated as much as possible.
For example, in one case, when the clustering frequency reaches a frequency threshold, it may be determined that the clustering stop condition is satisfied. In another case, when the clustering distance is less than or equal to the distance threshold, it may be determined that the clustering stop condition is satisfied.
It is understood that the clustering algorithm employed in the present application includes, but is not limited to, the K-means clustering algorithm, the mean shift clustering algorithm, and the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) method, and is not limited herein.
140. And marking at least one clustering result aiming at the first time window to obtain an asset identification tag corresponding to each network asset.
In one or more embodiments, the network asset processing device marks each clustering result obtained after clustering for the first time window, where the clustering results may take the form of clusters. Specifically, assume that 5 clustering results are obtained after clustering and that there are 3 types of asset identification tags in total; based on this, each clustering result is labeled. For convenience of illustration, please refer to table 1, where table 1 is an illustration of a corresponding relationship between asset identification tags and clustering results.
TABLE 1
Clustering result | Network assets | Asset identification tag
Clustering result 1 | Network asset A, network asset K, network asset D … | Production net label
Clustering result 2 | Network asset B, network asset I, network asset L … | Office net label
Clustering result 3 | Network asset M, network asset C, network asset J … | Office net label
Clustering result 4 | Network asset F, network asset G, network asset N … | Development net label
Clustering result 5 | Network asset H, network asset K, network asset E … | Office net label
Based on the above, the asset identification tag corresponding to the clustering result is used as the asset identification tag of the belonged network asset. It is to be understood that the content shown in table 1 is only an illustration and should not be construed as a limitation of the present application.
In the embodiment of the application, a network asset processing method is provided. By the method, the purpose of automatically identifying the network assets can be achieved based on flow monitoring and feature clustering, on one hand, the problem of low manual management efficiency is solved, the labor cost required by network asset management is saved, and the real-time property of network asset identification is favorably improved. On the other hand, a large amount of flow does not need to be sent actively, so that the condition that normal service is influenced due to network asset identification can be effectively avoided.
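The clustering and marking steps described above can be sketched roughly as follows. This is a minimal K-means sketch, K-means being one of the algorithms the application lists. The function names, the reading of "clustering distance" as the largest centroid shift between two iterations, the sample feature vectors, and the operator-supplied mapping from cluster to tag are all assumptions made for illustration, not details given in the application.

```python
import math
import random

def kmeans(vectors, k, max_iters=100, distance_threshold=1e-4, seed=0):
    """Cluster feature vectors with K-means.

    Stops when the iteration count reaches max_iters (count threshold) or
    when the largest centroid shift is <= distance_threshold (assumed
    interpretation of the distance-based stop condition).
    """
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignments = [0] * len(vectors)
    for _ in range(max_iters):
        # Assign every vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        shift = max(math.dist(old, new) for old, new in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= distance_threshold:
            break
    return assignments, centroids

def tag_assets(asset_names, assignments, cluster_tags):
    """Give every asset the tag of the cluster it was assigned to."""
    return {name: cluster_tags[a] for name, a in zip(asset_names, assignments)}

# Hypothetical feature vectors: [uplink proportion, intranet proportion].
names = ["asset A", "asset B", "asset C", "asset D"]
vectors = [[0.06, 0.83], [0.05, 0.80], [0.94, 0.17], [0.90, 0.20]]
assignments, _ = kmeans(vectors, k=2)
# Hypothetical marking: an operator labels each cluster after inspecting it.
cluster_tags = {assignments[0]: "office network", assignments[2]: "production network"}
print(tag_assets(names, assignments, cluster_tags))
```

Every asset inherits the tag of its cluster, which mirrors the way table 1 propagates one asset identification tag to all network assets in a clustering result.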
Optionally, on the basis of each embodiment corresponding to fig. 4, another optional embodiment provided in the embodiments of the present application may further include:
acquiring original flow from gateway equipment, wherein the gateway equipment comprises at least one of routing equipment, a firewall and a switch;
obtaining at least one message from original flow;
performing feature extraction on at least one message to obtain flow features, wherein the flow features comprise a timestamp, a source Internet Protocol (IP) address, a destination IP address, a source port, a destination port, a message size and a service type;
and recording the flow characteristics into a flow database.
In one or more embodiments, a way to build a traffic database based on raw traffic is presented. As can be seen from the foregoing embodiments, in the flow collection process, the original flow can be obtained by using the mirror image technique. It is understood that the entry point of the present application is a traffic mirror, however, other ways of obtaining traffic data may be applied to the present application, and thus, the present application is not limited thereto. In the network, mirroring copies a message of a specified port or a message conforming to a specified rule to a destination port, and a user can perform network supervision and troubleshooting by using mirroring technology. The mirroring technique includes three modes, namely local port mirroring, remote port mirroring and stream mirroring.
Specifically, for convenience of understanding, please refer to fig. 5, where fig. 5 is a network topology diagram for implementing port mirroring in the embodiment of the present application, and as shown in the figure, taking a local port mirroring as an example, a packet on a source port of a gateway device may be copied to a destination port of the gateway device, so that a traffic monitoring device may record traffic without affecting normal network interaction and implement monitoring and analysis on the packet.
For convenience of understanding, please refer to fig. 6, where fig. 6 is a schematic flow diagram of traffic feature extraction in the embodiment of the present application, and as shown in the figure, first, original traffic needs to be obtained from a gateway device, where the gateway device includes, but is not limited to, a routing device, a firewall, and a switch, and the original traffic refers to traffic extracted through the gateway device by using a mirroring technique. After the original traffic is obtained, the packets in the original traffic may be preliminarily classified and filtered, for example, some packets that are not in the analysis range are filtered based on the source IP address, the destination IP address, the MAC address, and the like of the packet.
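The preliminary filtering described above might look like the following sketch. The packet representation (a dictionary with `src_ip`, `dst_ip`, `src_mac` and `dst_mac` keys), the analysis range, and the excluded MAC address are hypothetical; the application only states that packets outside the analysis range are filtered by source IP address, destination IP address, MAC address, and the like.

```python
import ipaddress

# Hypothetical analysis range: only traffic touching these networks is kept.
ANALYSIS_NETWORKS = [ipaddress.ip_network("192.168.0.0/16")]
# Hypothetical exclusion list: drop Ethernet broadcast frames.
EXCLUDED_MACS = {"ff:ff:ff:ff:ff:ff"}

def in_analysis_range(packet):
    """Return True if the packet should be kept for feature extraction."""
    if packet["src_mac"].lower() in EXCLUDED_MACS:
        return False
    if packet["dst_mac"].lower() in EXCLUDED_MACS:
        return False
    addresses = (
        ipaddress.ip_address(packet["src_ip"]),
        ipaddress.ip_address(packet["dst_ip"]),
    )
    # Keep the packet if either endpoint lies inside an analysis network.
    return any(addr in net for addr in addresses for net in ANALYSIS_NETWORKS)

def filter_packets(packets):
    """Drop packets that fall outside the analysis range."""
    return [p for p in packets if in_analysis_range(p)]
```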
Based on this, at least one message meeting the requirements can be taken out from the original traffic. Then, feature extraction is carried out on each message that meets the requirements. Illustratively, one traffic feature may be extracted from a single message, or one traffic feature may be extracted from a plurality of messages. Each traffic feature includes, but is not limited to, a timestamp, a source Internet Protocol (IP) address, a destination IP address, a source port, a destination port, a message size, and a service type. Finally, the traffic features are recorded into the traffic database for persistent storage. For easy understanding, please refer to table 2, where table 2 is an illustration of a traffic feature corresponding to a message.
TABLE 2
Message number | Timestamp | Source IP address | Source port | Destination IP address | Destination port | Message size
0 | 1637222266 | 192.168.0.1 | 80 | 128.11.3.31 | 2000 | 2048KB
As can be seen from table 2, the corresponding traffic characteristics can be extracted for each packet, and the content shown in table 2 is only an illustration and should not be construed as a limitation to the present application.
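As a rough illustration of recording such a traffic feature, the sketch below stores the row of table 2 in an SQLite database using the Python standard library. The table name, column names, the choice of SQLite, and the conversion of 2048KB to bytes with 1 KB = 1024 bytes are assumptions; the application does not specify the storage engine or schema.

```python
import sqlite3
from dataclasses import astuple, dataclass
from typing import Optional

@dataclass
class TrafficFeature:
    timestamp: int                      # Unix time in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    size_bytes: int
    service_type: Optional[str] = None  # unknown for single packets

def open_traffic_db(path=":memory:"):
    """Open (or create) a hypothetical traffic database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS traffic (
               timestamp INTEGER, src_ip TEXT, dst_ip TEXT,
               src_port INTEGER, dst_port INTEGER,
               size_bytes INTEGER, service_type TEXT)"""
    )
    return conn

def record_feature(conn, feature):
    """Persist one traffic feature."""
    conn.execute("INSERT INTO traffic VALUES (?, ?, ?, ?, ?, ?, ?)", astuple(feature))
    conn.commit()

conn = open_traffic_db()
# The row of table 2, with 2048KB converted under the 1 KB = 1024 bytes assumption.
record_feature(conn, TrafficFeature(1637222266, "192.168.0.1", "128.11.3.31",
                                    80, 2000, 2048 * 1024))
```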
Secondly, in the embodiment of the application, a method for constructing a traffic database based on original traffic is provided, and through the method, traffic mirror images are adopted to obtain input from gateway equipment, so that the operation of normal service is not influenced. Therefore, the method can perform primary feature extraction and then store the extracted flow features into the flow database, thereby facilitating subsequent processing.
Optionally, on the basis of each embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, the obtaining at least one packet from the original traffic may specifically include:
acquiring a message set to be processed from original flow;
classifying each message in a message set to be processed to obtain at least one message belonging to the same protocol type;
performing feature extraction on at least one packet to obtain a traffic feature, which may specifically include:
performing message recombination on at least one message belonging to the same protocol type to obtain an application layer message;
and extracting the characteristics of the application layer message to obtain the flow characteristics.
In one or more embodiments, a manner of traffic feature extraction in the case of reassembly is described. As can be seen from the foregoing embodiments, when the packets in the original traffic are preliminarily classified, the packets may be classified according to different protocol types, for example, into Transmission Control Protocol (TCP) packets and User Datagram Protocol (UDP) packets.
Specifically, taking the divided TCP packet as an example, a plurality of packets may be reassembled, and corresponding traffic characteristics may be extracted from the reassembled packets. For easy understanding, please refer to table 3, where table 3 is an illustration of a traffic characteristic corresponding to a message.
TABLE 3
Message number | Timestamp | Source IP address | Source port | Destination IP address | Destination port | Message size | Service type
0-9 | 1637222266 | 192.168.0.1 | 80 | 128.11.3.31 | 2000 | 1.8MB | HTTP
As can be seen from table 3, the messages from 0 to 9 are reassembled to obtain the application layer messages, and the corresponding traffic characteristics can be extracted for the application layer messages obtained after reassembly, and the content shown in table 3 is only one schematic and should not be construed as a limitation to the present application.
It is understood that the service type represents a network service. Taking a TCP type application layer packet as an example, the related service types include, but are not limited to, the Secure Shell (SSH) protocol, the File Transfer Protocol (FTP), the Remote Desktop Protocol (RDP), and the Hypertext Transfer Protocol (HTTP). Taking a UDP type application layer packet as an example, the related service types include, but are not limited to, the Domain Name System (DNS) and the distributed cache system memcached.
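One simple way to attach a service type to a traffic feature is a lookup on the transport protocol and a well-known port, sketched below. The application does not say how the service type is determined, so port-based identification is an assumption here (real deployments may inspect the application layer payload instead); the function name is hypothetical, and the port numbers are the conventional defaults of each service.

```python
# Conventional default ports for the services named in the text.
WELL_KNOWN_SERVICES = {
    ("TCP", 22): "SSH",
    ("TCP", 21): "FTP",
    ("TCP", 3389): "RDP",
    ("TCP", 80): "HTTP",
    ("UDP", 53): "DNS",
    ("UDP", 11211): "memcached",
}

def guess_service_type(protocol, src_port, dst_port):
    """Guess the service from either endpoint's port; 'unknown' if no match."""
    for port in (dst_port, src_port):
        service = WELL_KNOWN_SERVICES.get((protocol.upper(), port))
        if service is not None:
            return service
    return "unknown"
```

With the values of table 3 (TCP, source port 80, destination port 2000), the source port matches the HTTP entry.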
In the embodiment of the present application, a traffic feature extraction method under a reassembly condition is provided, and through the above method, the purpose of extracting application layer features is achieved based on the reassembly of multiple packets, so that the feature richness is increased.
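The reassembly step can be sketched as grouping packets of the same protocol type and connection, ordering them, and joining their payloads into one application layer message. The packet fields (`protocol`, `seq`, `payload`, and so on) and the grouping key are hypothetical, and the sketch ignores retransmissions, gaps, and overlapping segments that a real TCP reassembler has to handle.

```python
from collections import defaultdict

def reassemble(packets):
    """Merge packets of one connection into a single application layer message."""
    flows = defaultdict(list)
    for p in packets:
        key = (p["protocol"], p["src_ip"], p["src_port"], p["dst_ip"], p["dst_port"])
        flows[key].append(p)

    messages = []
    for (protocol, src_ip, src_port, dst_ip, dst_port), group in flows.items():
        ordered = sorted(group, key=lambda p: p["seq"])  # restore sending order
        payload = b"".join(p["payload"] for p in ordered)
        messages.append({
            "protocol": protocol,
            "src_ip": src_ip, "src_port": src_port,
            "dst_ip": dst_ip, "dst_port": dst_port,
            "timestamp": min(p["timestamp"] for p in ordered),  # first packet seen
            "size_bytes": len(payload),
            "payload": payload,
        })
    return messages
```

Traffic features such as the message size and service type would then be extracted from each reassembled message rather than from the individual packets.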
Optionally, on the basis of each embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, the obtaining the first traffic feature set corresponding to each network asset from the traffic database may specifically include:
acquiring a target traffic characteristic set of a timestamp in a first time window from a traffic database;
regarding each network asset, taking the traffic characteristics of the source IP address belonging to the asset IP address in the target traffic characteristic set as first traffic characteristics in a first traffic characteristic set, wherein the asset IP address is the IP address of the network asset;
and regarding the traffic characteristics of which the destination IP address belongs to the asset IP address in the target traffic characteristic set as the first traffic characteristics in the first traffic characteristic set for each network asset.
In one or more embodiments, a manner of extracting a first set of traffic characteristics based on an IP address is presented. As can be seen from the foregoing embodiment, the traffic database stores traffic characteristics corresponding to messages (or application layer messages), and the traffic characteristics include timestamps, so that the traffic characteristics of corresponding time windows can be screened out according to the timestamps.
Specifically, assuming that the first time window runs from 00:00:00 on November 11, 2021 to 00:00:00 on November 18, 2021, all traffic characteristics whose timestamps fall within the first time window can be obtained from the traffic database and taken as the target traffic characteristic set. Illustratively, taking any one of the network assets to be identified as an example, the asset IP address (e.g., 192.168.0.1) of the network asset is known, so the traffic characteristics whose source IP address or destination IP address is the asset IP address are screened from the target traffic characteristic set and taken as the first traffic characteristic set of the network asset.
It is understood that for other network assets to be identified, corresponding first traffic features are extracted in a similar manner, and therefore, a first traffic feature set corresponding to each network asset is obtained.
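The two-stage selection described above (first by time window, then by asset IP address) can be sketched as follows. The dictionary-based record layout, the function names, the half-open window, and the one-week window used in the example are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

def target_feature_set(features, window_start, window_end):
    """Keep features whose timestamp lies in [window_start, window_end)."""
    return [f for f in features if window_start <= f["timestamp"] < window_end]

def first_feature_set(target_features, asset_ip):
    """Keep features whose source or destination IP is the asset IP address."""
    return [f for f in target_features if asset_ip in (f["src_ip"], f["dst_ip"])]

# Hypothetical one-week window expressed as Unix timestamps (UTC).
start_dt = datetime(2021, 11, 11, tzinfo=timezone.utc)
window_start = int(start_dt.timestamp())
window_end = int((start_dt + timedelta(days=7)).timestamp())

features = [
    {"timestamp": window_start + 60, "src_ip": "192.168.0.1", "dst_ip": "128.11.3.31"},
    {"timestamp": window_start + 90, "src_ip": "192.168.0.7", "dst_ip": "192.168.0.1"},
    {"timestamp": window_start + 120, "src_ip": "192.168.0.7", "dst_ip": "128.11.3.31"},
    {"timestamp": window_end + 1, "src_ip": "192.168.0.1", "dst_ip": "128.11.3.31"},
]
per_asset = first_feature_set(
    target_feature_set(features, window_start, window_end), "192.168.0.1"
)
print(len(per_asset))
```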
Secondly, in the embodiment of the present application, a manner is provided for extracting a first traffic feature set based on an IP address, and in this manner, the IP address (i.e., the source IP address and the destination IP address) is used as a basis for aggregating the network asset features, thereby improving the feasibility and operability of the scheme.
Optionally, on the basis of each embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, the K preset attribute types include at least one of an uplink traffic total, a downlink traffic total, an uplink and downlink traffic total, an uplink traffic proportion, and a downlink traffic proportion;
counting the first traffic feature set corresponding to each network asset according to the K preset attribute types to obtain a first feature vector corresponding to each network asset, which may specifically include:
for each network asset, if the K preset attribute types comprise uplink flow sum, summing the sizes of messages included in first flow characteristics of source IP addresses belonging to asset IP addresses in a first flow characteristic set to obtain a first characteristic value corresponding to the uplink flow sum in a first characteristic vector, wherein the asset IP addresses are IP addresses of the network assets;
for each network asset, if the K preset attribute types comprise a downlink traffic total, summing the sizes of messages included in first traffic characteristics whose destination IP address belongs to the asset IP address in the first traffic characteristic set to obtain a first characteristic value corresponding to the downlink traffic total in the first characteristic vector;
for each network asset, if the K preset attribute types comprise uplink and downlink flow sums, summing a first characteristic value corresponding to the downlink flow sums and a first characteristic value corresponding to the uplink flow sums to obtain a first characteristic value corresponding to the uplink and downlink flow sums in a first characteristic vector;
for each network asset, if the K preset attribute types comprise an uplink traffic proportion, calculating a ratio between the first characteristic value corresponding to the uplink traffic total and the first characteristic value corresponding to the uplink and downlink traffic total to obtain a first characteristic value corresponding to the uplink traffic proportion in the first characteristic vector;
for each network asset, if the K preset attribute types comprise a downlink traffic proportion, calculating a ratio between the first characteristic value corresponding to the downlink traffic total and the first characteristic value corresponding to the uplink and downlink traffic total to obtain a first characteristic value corresponding to the downlink traffic proportion in the first characteristic vector.
In one or more embodiments, a manner of performing statistics based on uplink and downlink traffic characteristics is described. As can be seen from the foregoing embodiment, after the first traffic feature set of each network asset is obtained, the first traffic feature set is respectively counted according to different preset attribute types. The manner of counting the first characteristic value will be described below with reference to any one of the network assets to be identified as an example, from dimensions of the uplink traffic total, the downlink traffic total, the uplink and downlink traffic total, the uplink traffic proportion, and the downlink traffic proportion, respectively.
First, the uplink traffic total;
specifically, uplink traffic is traffic sent from the local device to the network. Assuming that, in the first traffic characteristic set of the network asset, the sizes of the messages sent from the IP address of the network asset (i.e., messages whose source IP address is the asset IP address) sum to 50 megabytes (MB), the first characteristic value corresponding to the uplink traffic total is 50.
Second, the downlink traffic total;
specifically, downlink traffic is traffic downloaded by the local device from the network. Assuming that, in the first traffic characteristic set of the network asset, the sizes of the messages downloaded through the IP address of the network asset (i.e., messages whose destination IP address is the asset IP address) sum to 800MB, the first characteristic value corresponding to the downlink traffic total is 800.
Third, the uplink and downlink traffic total;
specifically, assuming that, in the first traffic characteristic set of the network asset, the sizes of the messages sent from the IP address of the network asset (i.e., messages whose source IP address is the asset IP address) sum to 50MB, and the sizes of the messages downloaded through the IP address of the network asset (i.e., messages whose destination IP address is the asset IP address) sum to 800MB, the first characteristic value corresponding to the uplink and downlink traffic total is 850.
Fourth, the uplink traffic proportion;
specifically, assuming that the first characteristic value corresponding to the uplink traffic total of the network asset is 50 (i.e., representing 50MB) and the first characteristic value corresponding to the uplink and downlink traffic total of the network asset is 850 (i.e., representing 850MB), the ratio between the two is calculated, so that the first characteristic value corresponding to the uplink traffic proportion is approximately 0.06.
Fifth, the downlink traffic proportion;
specifically, assuming that the first characteristic value corresponding to the downlink traffic total of the network asset is 800 (i.e., representing 800MB) and the first characteristic value corresponding to the uplink and downlink traffic total of the network asset is 850 (i.e., representing 850MB), the ratio between the two is calculated, so that the first characteristic value corresponding to the downlink traffic proportion is approximately 0.94.
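The five uplink and downlink statistics above can be reproduced with a short sketch. The record layout (`src_ip`, `dst_ip`, `size_mb`), the function name, and the split of the 50MB and 800MB totals into individual records are assumptions; only the totals and ratios come from the worked example.

```python
def uplink_downlink_vector(features, asset_ip):
    """Return [uplink total, downlink total, combined total, uplink ratio, downlink ratio]."""
    uplink = sum(f["size_mb"] for f in features if f["src_ip"] == asset_ip)
    downlink = sum(f["size_mb"] for f in features if f["dst_ip"] == asset_ip)
    total = uplink + downlink
    # Guard against an asset with no traffic in the window.
    uplink_ratio = uplink / total if total else 0.0
    downlink_ratio = downlink / total if total else 0.0
    return [uplink, downlink, total, round(uplink_ratio, 2), round(downlink_ratio, 2)]

asset_ip = "192.168.0.1"
features = [
    {"src_ip": asset_ip, "dst_ip": "128.11.3.31", "size_mb": 20},
    {"src_ip": asset_ip, "dst_ip": "128.11.3.31", "size_mb": 30},
    {"src_ip": "128.11.3.31", "dst_ip": asset_ip, "size_mb": 300},
    {"src_ip": "128.11.3.31", "dst_ip": asset_ip, "size_mb": 500},
]
print(uplink_downlink_vector(features, asset_ip))  # [50, 800, 850, 0.06, 0.94]
```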
It can be understood that for other network assets to be identified, the features related to the traffic dimension are counted in a similar manner, and thus, a first feature vector corresponding to each network asset is obtained. Meanwhile, as for the first feature value, some feature engineering manners may also be used for processing (for example, normalization, one-hot coding, or the like), which is not limited herein.
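The feature engineering mentioned above matters for distance-based clustering because totals measured in MB and ratios bounded by 1 sit on very different scales. The sketch below shows min-max normalization across assets and one-hot coding of a categorical value; both function names and the sample inputs are hypothetical, and other scaling schemes would serve equally well.

```python
def min_max_normalize(vectors):
    """Rescale every feature column to [0, 1] across all assets."""
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
         for x, lo, hi in zip(vector, lows, highs)]
        for vector in vectors
    ]

def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over known categories."""
    return [1 if value == category else 0 for category in categories]
```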
Secondly, in the embodiment of the application, a mode of carrying out statistics according to uplink and downlink traffic characteristics is provided, and through the mode, characteristics related to uplink and downlink traffic dimensions are aggregated together to be used as a basis for generating a characteristic vector, so that the richness of the characteristic vector is increased, and the accuracy of characteristic clustering is favorably improved.
Optionally, on the basis of each embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, the K preset attribute types include at least one of an intranet traffic total, an extranet traffic total, an intranet and extranet traffic total, an intranet traffic ratio, and an extranet traffic ratio;
counting the first traffic feature set corresponding to each network asset according to the K preset attribute types to obtain a first feature vector corresponding to each network asset, which may specifically include:
for each network asset, if the K preset attribute types comprise an intranet flow total, summing the sizes of messages included in first flow characteristics of a source IP address and a destination IP address belonging to the intranet address in a first flow characteristic set to obtain a first characteristic value corresponding to the intranet flow total in a first characteristic vector;
for each network asset, if the K preset attribute types comprise the outer network traffic total, summing the sizes of messages included in the first traffic characteristics of the source IP address or the destination IP address belonging to the outer network address in the first traffic characteristic set to obtain a first characteristic value corresponding to the outer network traffic total in the first characteristic vector;
for each network asset, if the K preset attribute types comprise an intranet and extranet traffic total, summing the first characteristic value corresponding to the intranet traffic total and the first characteristic value corresponding to the extranet traffic total to obtain a first characteristic value corresponding to the intranet and extranet traffic total in the first characteristic vector;
for each network asset, if the K preset attribute types comprise an intranet traffic ratio, calculating a ratio between the first characteristic value corresponding to the intranet traffic total and the first characteristic value corresponding to the intranet and extranet traffic total to obtain a first characteristic value corresponding to the intranet traffic ratio in the first characteristic vector;
for each network asset, if the K preset attribute types comprise an extranet traffic ratio, calculating a ratio between the first characteristic value corresponding to the extranet traffic total and the first characteristic value corresponding to the intranet and extranet traffic total to obtain a first characteristic value corresponding to the extranet traffic ratio in the first characteristic vector.
In one or more embodiments, a manner of performing statistics based on intra-and extranet traffic characteristics is described. As can be seen from the foregoing embodiment, after the first traffic feature set of each network asset is obtained, the first traffic feature set is respectively counted according to different preset attribute types. The manner of counting the first eigenvalue will be described below, taking any one of the network assets to be identified as an example, from the dimensions of the inner network traffic total, the outer network traffic total, the inner and outer network traffic total, the inner network traffic ratio, and the outer network traffic ratio, respectively.
First, the difference between the internal network and the external network will be described with reference to the drawings. For easy understanding, please refer to fig. 7, fig. 7 is a schematic diagram illustrating an application of a firewall to an internal network and an external network according to an embodiment of the present application, where the firewall is used at an edge of the internal network and the external network to prevent intrusion of the internal network from the external network. Referring to fig. 8, fig. 8 is a schematic diagram illustrating an application of a firewall in an internal network according to an embodiment of the present invention, where the firewall is used in the internal network to mainly prevent attacks from the inside and ensure security of important data.
First, the intranet traffic total;
specifically, intranet traffic is traffic whose source IP address and destination IP address both belong to intranet addresses. Assuming that, in the first traffic characteristic set of the network asset, the sizes of the messages whose source IP address and destination IP address both belong to intranet addresses sum to 500MB, the first characteristic value corresponding to the intranet traffic total is 500.
Second, the extranet traffic total;
specifically, extranet traffic is traffic whose source IP address or destination IP address belongs to an extranet address. Assuming that, in the first traffic characteristic set of the network asset, the sizes of the messages whose source IP address or destination IP address belongs to an extranet address sum to 100MB, the first characteristic value corresponding to the extranet traffic total is 100.
Third, the intranet and extranet traffic total;
specifically, assuming that, in the first traffic characteristic set of the network asset, the sizes of the messages whose source IP address and destination IP address both belong to intranet addresses sum to 500MB, and the sizes of the messages whose source IP address or destination IP address belongs to an extranet address sum to 100MB, the first characteristic value corresponding to the intranet and extranet traffic total is 600.
Fourth, the intranet traffic ratio;
specifically, assuming that the first characteristic value corresponding to the intranet traffic total of the network asset is 500 (i.e., representing 500MB) and the first characteristic value corresponding to the intranet and extranet traffic total of the network asset is 600 (i.e., representing 600MB), the ratio between the two is calculated, so that the first characteristic value corresponding to the intranet traffic ratio is approximately 0.83.
Fifth, the extranet traffic ratio;
specifically, assuming that the first characteristic value corresponding to the extranet traffic total of the network asset is 100 (i.e., representing 100MB) and the first characteristic value corresponding to the intranet and extranet traffic total of the network asset is 600 (i.e., representing 600MB), the ratio between the two is calculated, so that the first characteristic value corresponding to the extranet traffic ratio is approximately 0.17.
It can be understood that for other network assets to be identified, features related to the intra-extranet traffic dimension are counted in a similar manner, and therefore a first feature vector corresponding to each network asset is obtained. Meanwhile, as for the first feature value, some feature engineering manners may also be used for processing (for example, normalization, one-hot coding, or the like), which is not limited herein.
Secondly, this embodiment of the application provides a manner of performing statistics according to intranet and extranet traffic features. In this manner, features related to the intranet and extranet traffic dimension are aggregated as a basis for generating the feature vector, which increases the richness of the feature vector and helps improve the accuracy of feature clustering.
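The five statistics above can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the record layout `(src_ip, dst_ip, size_mb)`, the function names, and the use of private address ranges to decide what counts as an intranet address are all assumptions made for the example.

```python
import ipaddress


def is_intranet(ip: str) -> bool:
    # Assumption: an address counts as "intranet" when it is a private address.
    return ipaddress.ip_address(ip).is_private


def intranet_extranet_features(records):
    """Compute the five features for one asset.

    records: iterable of (src_ip, dst_ip, size_mb) tuples (hypothetical layout).
    """
    intranet_total = 0.0
    extranet_total = 0.0
    for src_ip, dst_ip, size_mb in records:
        if is_intranet(src_ip) and is_intranet(dst_ip):
            intranet_total += size_mb  # both endpoints are intranet addresses
        else:
            extranet_total += size_mb  # source or destination is an extranet address
    total = intranet_total + extranet_total
    return {
        "intranet_total": intranet_total,
        "extranet_total": extranet_total,
        "total": total,
        "intranet_ratio": intranet_total / total if total else 0.0,
        "extranet_ratio": extranet_total / total if total else 0.0,
    }
```

With 500MB of intranet traffic and 100MB of extranet traffic, the function yields the totals 500, 100 and 600 and ratios of roughly 0.83 and 0.17, matching the worked example.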
Optionally, on the basis of each embodiment corresponding to fig. 4, in another optional embodiment provided in this embodiment of the present application, the K preset attribute types include at least one of a maximum-ratio source port, a maximum-ratio destination port, a maximum-ratio source port ratio, and a maximum-ratio destination port ratio;
counting the first traffic feature set corresponding to each network asset according to the K preset attribute types to obtain a first feature vector corresponding to each network asset, which may specifically include:
for each network asset, if the K preset attribute types include the maximum-ratio source port, determining the total source port traffic and the traffic of each source port according to the first traffic feature set, and taking the source port with the largest traffic ratio as the first feature value corresponding to the maximum-ratio source port in the first feature vector;
for each network asset, if the K preset attribute types include the maximum-ratio destination port, determining the total destination port traffic and the traffic of each destination port according to the first traffic feature set, and taking the destination port with the largest traffic ratio as the first feature value corresponding to the maximum-ratio destination port in the first feature vector;
for each network asset, if the K preset attribute types include the maximum-ratio source port ratio, calculating the ratio of the traffic of the maximum-ratio source port to the total source port traffic to obtain the first feature value corresponding to the maximum-ratio source port ratio in the first feature vector;
for each network asset, if the K preset attribute types include the maximum-ratio destination port ratio, calculating the ratio of the traffic of the maximum-ratio destination port to the total destination port traffic to obtain the first feature value corresponding to the maximum-ratio destination port ratio in the first feature vector.
In one or more embodiments, a way to perform statistics on port characteristics is presented. As can be seen from the foregoing embodiment, after the first traffic feature set of each network asset is obtained, the first traffic feature set is respectively counted according to different preset attribute types. The manner of counting the first characteristic value will be described below with reference to any network asset to be identified as an example, from dimensions of the maximum-ratio source port, the maximum-ratio destination port, the maximum-ratio source port ratio, and the maximum-ratio destination port ratio.
Firstly, the maximum-ratio source port;
specifically, the total traffic of all source ports in the first traffic feature set of the network asset is counted. Assume that the total source port traffic is 1000MB, where the traffic of source port "8080" is 800MB, giving a traffic ratio of 0.8, and the traffic of source port "6060" is 200MB, giving a traffic ratio of 0.2. The source port with the largest traffic ratio is therefore "8080", so the first feature value corresponding to the maximum-ratio source port is 8080.
Secondly, the maximum-ratio destination port;
specifically, the total traffic of all destination ports in the first traffic feature set of the network asset is counted. Assume that the total destination port traffic is 800MB, where the traffic of destination port "7070" is 500MB, giving a traffic ratio of 0.625, and the traffic of destination port "2020" is 300MB, giving a traffic ratio of 0.375. The destination port with the largest traffic ratio is therefore "7070", so the first feature value corresponding to the maximum-ratio destination port is 7070.
Thirdly, the maximum-ratio source port ratio;
specifically, the total traffic of all source ports in the first traffic feature set of the network asset is counted. Assume that the total source port traffic is 1000MB, where the traffic of source port "8080" is 800MB, giving a traffic ratio of 0.8, and the traffic of source port "6060" is 200MB, giving a traffic ratio of 0.2. The source port with the largest traffic ratio (i.e., the maximum-ratio source port) is therefore "8080", so the first feature value corresponding to the maximum-ratio source port ratio is 0.8.
Fourthly, the maximum-ratio destination port ratio;
specifically, the total traffic of all destination ports in the first traffic feature set of the network asset is counted. Assume that the total destination port traffic is 800MB, where the traffic of destination port "7070" is 500MB, giving a traffic ratio of 0.625, and the traffic of destination port "2020" is 300MB, giving a traffic ratio of 0.375. The destination port with the largest traffic ratio (i.e., the maximum-ratio destination port) is therefore "7070", so the first feature value corresponding to the maximum-ratio destination port ratio is 0.625.
It is understood that for other network assets to be identified, the port dimension related features are counted in a similar manner, and thus, a first feature vector corresponding to each network asset is obtained. Meanwhile, as for the first feature value, some feature engineering manners may also be used for processing (for example, normalization, one-hot coding, or the like), which is not limited herein.
Secondly, in the embodiment of the application, a mode of carrying out statistics according to port features is provided, and through the mode, features related to port dimensions are aggregated together to be used as a basis for generating feature vectors, so that the richness of the feature vectors is increased, and the accuracy of feature clustering is favorably improved.
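The port statistics above reduce to finding the port with the largest share of traffic and reporting that share. The following is a minimal sketch under the assumption that per-port traffic has already been summed into a mapping from port number to megabytes; the function name `dominant_port` and that input layout are hypothetical.

```python
def dominant_port(traffic_by_port):
    """Return (port, ratio) for the port carrying the largest share of traffic.

    traffic_by_port: dict mapping port number -> traffic in MB (assumed layout).
    Returns (None, 0.0) when there is no traffic at all.
    """
    total = sum(traffic_by_port.values())
    if total <= 0:
        return None, 0.0
    port = max(traffic_by_port, key=traffic_by_port.get)
    return port, traffic_by_port[port] / total


# Values taken from the worked example in the text.
source_port, source_ratio = dominant_port({8080: 800, 6060: 200})
destination_port, destination_ratio = dominant_port({7070: 500, 2020: 300})
```

Applied once to the source port traffic and once to the destination port traffic, the same helper produces all four port-related feature values.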
Optionally, on the basis of each embodiment corresponding to fig. 4, in another optional embodiment provided in this application embodiment, according to the first feature vector corresponding to each network asset, clustering N network assets until a clustering stop condition is met, and obtaining at least one clustering result for a first time window may specifically include:
acquiring preset K weight values, wherein each weight value corresponds to a preset attribute type;
for each network asset, determining distances between the network asset and T clustering centers according to K weighted values and a first feature vector, and dividing the network asset into clustering clusters with the shortest distances, wherein T is an integer greater than 1;
if the clustering stop condition is met, obtaining at least one clustering result aiming at the first time window;
and if the clustering stop condition is not met, updating the T clustering centers.
In one or more embodiments, a way to perform clustering based on preset weight values is introduced. As can be seen from the foregoing embodiment, in the clustering process, a weight value may also be set for each preset attribute type, and the distance between the first feature vector and each cluster center is calculated by combining the weight values.
Specifically, assume that the K preset attribute types include the intranet traffic total, the extranet traffic total, and the intranet and extranet traffic total. Suppose that the weight value corresponding to the intranet traffic total is 0.5, the weight value corresponding to the extranet traffic total is 0.8, and the weight value corresponding to the intranet and extranet traffic total is 0.1. Assume that the first feature vector is (500, 100, 600), that cluster center A is (200, 300, 500), and that cluster center B is (100, 200, 300). Based on this, the distances between the first feature vector and the two cluster centers are calculated as follows.
$$D_A = 0.5 \times (500-200)^2 + 0.8 \times (100-300)^2 + 0.1 \times (600-500)^2 = 78000$$
$$D_B = 0.5 \times (500-100)^2 + 0.8 \times (100-200)^2 + 0.1 \times (600-300)^2 = 97000$$
As can be seen, the first feature vector is closer to cluster center A, and therefore the network asset corresponding to the first feature vector is assigned to cluster A. It can be understood that the first feature vectors of the other network assets are processed in a similar way, and when the clustering stop condition is met, the clustering result is obtained. If the clustering stop condition is not met, the T cluster centers are updated.
Secondly, this embodiment of the application provides a manner of clustering based on preset weight values. In this manner, each preset attribute type can be given a weight value according to experimental data, and the weight values are used when partitioning the features during clustering, which helps improve the accuracy of clustering.
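The weighted distance used in the worked example can be sketched as below. The distance form (a weighted sum of squared differences) follows the two formulas in the text; the function names and the dictionary of named centers are assumptions for illustration.

```python
def weighted_sq_distance(vector, center, weights):
    # Weighted sum of squared per-feature differences, as in the worked example.
    return sum(w * (x - c) ** 2 for w, x, c in zip(weights, vector, center))


def nearest_center(vector, centers, weights):
    """Return the name of the closest center.

    centers: dict mapping a cluster name to its center vector (assumed layout).
    """
    return min(centers, key=lambda name: weighted_sq_distance(vector, centers[name], weights))


weights = (0.5, 0.8, 0.1)
first_feature_vector = (500, 100, 600)
centers = {"A": (200, 300, 500), "B": (100, 200, 300)}

d_a = weighted_sq_distance(first_feature_vector, centers["A"], weights)
d_b = weighted_sq_distance(first_feature_vector, centers["B"], weights)
assigned = nearest_center(first_feature_vector, centers, weights)
```

Here `d_a` and `d_b` reproduce the values 78000 and 97000 (up to floating-point rounding), so the asset is assigned to cluster A.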
Optionally, on the basis of each embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, marking at least one clustering result for a first time window to obtain an asset identification tag corresponding to each network asset may specifically include:
displaying at least one clustering result of the first time window;
and responding to a labeling instruction aiming at each clustering result, and determining an asset identification tag corresponding to each network asset, wherein the labeling instruction carries the asset identification tag.
In one or more embodiments, a manner of manually setting asset identification tags based on human experience is presented. As can be seen from the foregoing embodiments, the user may also mark asset identification tags for the clustered results in combination with practical experience.
Specifically, for ease of understanding, refer to fig. 9, which is a schematic diagram of an interface for setting an asset identification tag in the embodiment of the present application. As shown in (a) of fig. 9, assume that the N network assets are clustered in the first time window to obtain two clustering results, namely cluster A and cluster B. Based on this, the user can view information about the network assets in each cluster, where the information includes, but is not limited to, the location of the device, the IP address, the current user, and the historical asset tags. The user may select a cluster, for example cluster A, whereupon cluster A and its recommended asset identification tag (e.g., "production net") are highlighted. If the user needs to modify the asset identification tag of cluster A, the user can click the control corresponding to another asset identification tag; as shown in (b) of fig. 9, the user can click the control corresponding to "office net".
The interface shown in fig. 9 is only an illustration, and should not be construed as limiting the present application.
Secondly, in the embodiment of the application, a mode of manually setting the asset identification tag based on manual experience is provided, and through the mode, a corresponding channel is provided for the user-defined asset identification tag, so that the asset identification tag of the clustering result can be manually calibrated, and the flexibility of the scheme is improved.
Optionally, on the basis of each embodiment corresponding to fig. 4, in another optional embodiment provided in the embodiment of the present application, marking at least one clustering result for a first time window to obtain an asset identification tag corresponding to each network asset may specifically include:
for each clustering result of the first time window, if the average value of the uplink flow ratios of all the network assets in the clustering result is less than or equal to a first ratio threshold, setting the asset identification tag corresponding to each network asset in the clustering result as a production network tag;
for each clustering result of the first time window, if the average value of the external network flow proportion of all the network assets in the clustering result is greater than or equal to a second proportion threshold, setting the asset identification tag corresponding to each network asset in the clustering result as an office network tag;
and aiming at each clustering result of the first time window, if the intranet flow ratio average value of all the network assets in the clustering result is greater than or equal to a third ratio threshold, setting the asset identification tag corresponding to each network asset in the clustering result as a development network tag.
In one or more embodiments, a manner of automatically setting asset identification tags based on feature rules is presented. According to the foregoing embodiment, the clustering results can be labeled according to the features that characterize the different asset identification tags. In one case, the initial clustering centers may be set according to the asset identification tags; for example, a clustering center corresponding to the production network tag, a clustering center corresponding to the office network tag, and a clustering center corresponding to the development network tag may be set respectively. In another case, several initial clustering centers are set arbitrarily, and the asset identification tag to which each clustering result belongs is determined afterwards from the clustering result. This is described below with reference to examples.
Firstly, the production network tag;
specifically, the production network is mainly used to link the production environment more closely or to implement digital control. In general, a production network is characterized by a large amount of downlink traffic and a small amount of uplink traffic, so a corresponding feature rule can be set according to the uplink and downlink traffic. For example, if the average uplink traffic ratio of all the network assets in a clustering result is less than or equal to the first ratio threshold, the clustering result is considered to belong to the production network tag, that is, the asset identification tag corresponding to each network asset in the clustering result is set as the production network tag.
For example, assume that clustering result A includes 10 network assets, the average uplink traffic ratio of the 10 network assets is 0.05, and the first ratio threshold is 0.1. Based on this, the 10 network assets in clustering result A can be marked with the production network tag.
Secondly, the office network tag;
specifically, the office network is mainly the network used by employees in an enterprise for handling official business and completing work tasks. In general, an office network is characterized by frequent access to the external network, so a corresponding feature rule can be set according to the intranet and extranet traffic. For example, if the average extranet traffic ratio of all the network assets in a clustering result is greater than or equal to the second ratio threshold, the clustering result is considered to belong to the office network tag, that is, the asset identification tag corresponding to each network asset in the clustering result is set as the office network tag.
Illustratively, assume that clustering result A includes 10 network assets, the average extranet traffic ratio of the 10 network assets is 0.8, and the second ratio threshold is 0.7. Based on this, the 10 network assets in clustering result A can be marked with the office network tag.
Thirdly, the development network tag;
specifically, a development network is mainly a test network provided to developers. In general, a development network is characterized by very little interaction with the external network, with test tasks completed almost entirely within the internal network, so a corresponding feature rule can be set according to the intranet and extranet traffic. For example, if the average intranet traffic ratio of all the network assets in a clustering result is greater than or equal to the third ratio threshold, the clustering result is considered to belong to the development network tag, that is, the asset identification tag corresponding to each network asset in the clustering result is set as the development network tag.
Illustratively, assume that clustering result A includes 10 network assets, the average intranet traffic ratio of the 10 network assets is 0.98, and the third ratio threshold is 0.95. Based on this, the 10 network assets in clustering result A can be marked with the development network tag.
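The three feature rules can be sketched as a single labeling function. This is only an illustration: the default thresholds are the example values from the text, the dictionary keys are hypothetical names, and the order in which the rules are tried is an assumption, since the text does not state which rule takes precedence when more than one matches.

```python
from statistics import mean


def label_cluster(assets, first_threshold=0.1, second_threshold=0.7, third_threshold=0.95):
    """Return an asset identification tag for one clustering result, or None.

    assets: list of dicts with the hypothetical keys "uplink_ratio",
    "extranet_ratio" and "intranet_ratio". Rule order is an assumption.
    """
    if mean(a["uplink_ratio"] for a in assets) <= first_threshold:
        return "production"
    if mean(a["extranet_ratio"] for a in assets) >= second_threshold:
        return "office"
    if mean(a["intranet_ratio"] for a in assets) >= third_threshold:
        return "development"
    return None  # no rule matched; the cluster could be left for manual labeling
```

A cluster for which no rule fires returns `None`, which leaves room for the manual labeling described in the earlier embodiment.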
For convenience of understanding, please refer to fig. 10, where fig. 10 is a schematic flow diagram of network asset tagging in the embodiment of the present application, and as shown in the figure, a traffic feature set corresponding to each network asset is extracted from a traffic database, and each network asset is counted according to a preset attribute type, so that a feature vector of each network asset can be obtained, and based on this, clustering can be started based on the feature vectors of the network assets.
In the first round of clustering, several temporary clustering centers can be selected at random. The distance between each network asset and each clustering center is then calculated, and each network asset is assigned to its nearest clustering center to form temporary clusters. The clustering center of each temporary cluster is updated, and it is then determined whether the clustering centers have changed. If a clustering center has changed, the calculation is repeated with the new clustering centers. If the clustering centers have not changed (i.e., the clustering stop condition is met), at least one clustering result is output. Finally, an asset identification tag is automatically set for each clustering result based on the feature rules.
The above classifies the network assets with a clustering algorithm. It should be noted that other statistical or machine-learning-based methods may also be used to generate classification results from the extracted features.
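The loop just described (assign, update centers, stop when the centers no longer change) can be sketched as follows. This is a plain k-means style illustration with unweighted squared Euclidean distance; the function name, the input layout, and the `max_iter` safety cap are assumptions. The initial centers are passed in explicitly so that the result is reproducible, whereas the text selects them at random.

```python
def cluster_assets(vectors, initial_centers, max_iter=100):
    """vectors: dict asset_id -> feature tuple; initial_centers: list of tuples."""
    centers = [tuple(c) for c in initial_centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: put every asset into the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for asset_id, vec in vectors.items():
            dists = [sum((x - c) ** 2 for x, c in zip(vec, center)) for center in centers]
            clusters[dists.index(min(dists))].append(asset_id)
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for members, old in zip(clusters, centers):
            if not members:
                new_centers.append(old)  # keep the center of an empty cluster
                continue
            new_centers.append(tuple(
                sum(vectors[m][i] for m in members) / len(members)
                for i in range(len(old))
            ))
        if new_centers == centers:  # stop condition: centers unchanged
            break
        centers = new_centers
    return clusters, centers
```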
Secondly, in the embodiment of the application, a mode for automatically setting asset identification tags based on feature rules is provided, and through the mode, a user can classify and survey a large number of network assets only by importing historical traffic. The problems that a large amount of manual labels are needed for network asset mapping and asset changes cannot be found in real time are solved, and therefore real-time and efficient network asset mapping service is provided for users.
Optionally, on the basis of each embodiment corresponding to fig. 4, another optional embodiment provided in the embodiments of the present application may further include:
acquiring a second traffic characteristic set corresponding to each network asset from a traffic database, wherein the second traffic characteristic set is obtained by extracting original traffic based on a second time window;
counting a second flow characteristic set corresponding to each network asset according to K preset attribute types to obtain a second characteristic vector corresponding to each network asset, wherein the second characteristic vector comprises K second characteristic values, and each second characteristic value corresponds to one preset attribute type;
and if the distance between the first characteristic vector corresponding to the network asset and the first clustering center is greater than the distance between the second characteristic vector corresponding to the network asset and the second clustering center, removing the network asset from the clustering result corresponding to the first clustering center.
In one or more embodiments, a way to update the clustering results in connection with the next time window is presented. According to the foregoing embodiments, the network assets change dynamically, and the traffic features of the same network asset may differ between time windows, so that the same network asset may be assigned to different clusters in different time windows.
Specifically, according to the set time window size, a second traffic feature set corresponding to each network asset is obtained from a traffic database. Assuming that the second time window is from 20/18/20/00/20/2021 to 00/25/11/20/2021, all the flow characteristics with the time stamp in the second time window can be obtained from the flow database and used as the target flow characteristic set. Illustratively, taking any one of the network assets to be identified as an example, the asset IP address (e.g., 192.168.0.1) of the network asset can be known, and then, second traffic characteristics of the asset IP address belonging to the source IP address or the destination IP address are screened from the target traffic characteristic set, and are taken as the second traffic characteristic set of the network asset.
Then, according to the K preset attribute types, counting a second flow characteristic set corresponding to each network asset, and thus obtaining a second characteristic vector corresponding to each network asset. It can be understood that the manner of obtaining the second feature vector is similar to the manner of obtaining the first feature vector, and therefore, the description thereof is omitted here.
Next, the previous clustering result is imported, that is, the at least one clustering result for the first time window. If, for a network asset that originally belongs to the clustering result of the first clustering center, the distance between its first feature vector and the first clustering center is greater than the distance between its second feature vector and the second clustering center, the attributes of the network asset have changed. The network asset is therefore removed from the clustering result corresponding to the first clustering center and added to the clustering result corresponding to the second clustering center.
It is understood that the original asset identification tag may be retained for the cluster corresponding to the first cluster center and the cluster corresponding to the second cluster center.
Secondly, in the embodiment of the application, a mode of updating the clustering result in combination with the next time window is provided, and by the mode, considering that the network assets are dynamically changed, the asset marking module can periodically operate again to acquire the latest data, so that the real-time performance of the clustering result is improved.
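The comparison that decides whether an asset leaves its original cluster can be sketched as below. The condition follows the text (distance of the first feature vector to the first clustering center versus distance of the second feature vector to the second clustering center); the use of Euclidean distance, the function names, and the dictionary of clusters are assumptions for illustration.

```python
import math


def euclidean(u, v):
    # Assumption: plain Euclidean distance; the text does not fix the metric here.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def should_move(first_vec, first_center, second_vec, second_center):
    """True when the asset is now closer to the second center than it was to the first."""
    return euclidean(first_vec, first_center) > euclidean(second_vec, second_center)


def move_asset(clusters, asset_id, from_cluster, to_cluster):
    """clusters: dict cluster name -> list of asset ids (hypothetical layout)."""
    clusters[from_cluster].remove(asset_id)
    clusters[to_cluster].append(asset_id)
```

The cluster names themselves are untouched by the move, which is consistent with retaining the original asset identification tag of each cluster.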
Optionally, on the basis of each embodiment corresponding to fig. 4, another optional embodiment provided in the embodiments of the present application may further include:
and if the network assets are newly added network assets in the second time window, clustering each network asset according to the second characteristic vector corresponding to the network assets until the clustering stopping condition is met, and obtaining at least one clustering result aiming at the second time window.
In one or more embodiments, a way to update the clustering results in connection with the next time window is presented. As can be seen from the foregoing embodiments, the network assets are dynamically changing, new network assets may be added at different time windows, and therefore, re-clustering may be performed based on the newly added network assets.
It is understood that the foregoing embodiment has described how to obtain the second feature vector of the network asset, and details are not described here. Based on this, in one case, the clustering result corresponding to the original network asset (i.e., the network asset within the first time window) is retained, only the newly added network asset is clustered, and when the clustering stop condition is satisfied, at least one clustering result for the second time window is obtained. In another case, the second eigenvector of the original network asset (i.e., the network asset within the first time window) is obtained again, and clustering is performed again in combination with the second eigenvector of the new network asset, so as to obtain at least one clustering result for the second time window when the clustering stop condition is satisfied.
For ease of understanding, please refer to fig. 11, fig. 11 is a schematic flow chart illustrating updating the network asset tag according to an embodiment of the present application, and specifically as shown in the figure:
in step S1, the original traffic is acquired from the gateway device, and the traffic characteristics in the original traffic are extracted and stored in the traffic database. The flow database stores the flow characteristics and establishes data indexes correspondingly.
In step S2, the asset tagging module periodically obtains traffic characteristics within a time window from the traffic database, calculates and clusters the traffic characteristics in units of network assets, and tags the network assets.
In step S3, after the network assets are successfully classified, the asset tagging module integrates the computed asset identification tags, generates corresponding monitoring rules for different asset identification tags, and transmits the monitoring rules to the behavior monitoring module.
In step S4, the behavior monitoring module processes the monitoring rules and applies them to the traffic monitoring module, detects network behaviors such as network sniffing and attacks for the different types of network assets according to the monitoring rules, and performs operations such as alarming or blocking according to the type of behavior.
In step S5, the asset tagging module is periodically re-run to update the asset and its corresponding tag.
In the embodiment of the application, a mode of updating the clustering result in combination with the next time window is provided, and by the mode, considering that the network assets are dynamically changed, the asset marking module can be periodically re-operated to acquire the latest data, so that the real-time performance of the clustering result is improved.
Optionally, on the basis of the foregoing embodiments corresponding to fig. 4, in another optional embodiment provided in this embodiment of the present application, after marking at least one clustering result for a first time window and obtaining an asset identification tag corresponding to each network asset, the method may further include:
if the asset identification tag corresponding to the network asset is a development network tag, triggering operation and maintenance alarm operation when the network asset accesses the network asset with the office network tag;
if the asset identification tag corresponding to the network asset is an office network tag, triggering operation and maintenance alarm operation when the network asset accesses the network asset with the production network tag;
and if the asset identification tag corresponding to the network asset is a production network tag, triggering operation and maintenance alarm operation when the network asset accesses the network asset with the office network tag.
In one or more embodiments, a manner of triggering an operation and maintenance alarm operation based on an asset identification tag is presented. As can be seen from the foregoing embodiments, for a network asset in an enterprise or an organization, the corresponding asset identification tag may be obtained, and therefore, different operation and maintenance strategies may be adopted for different asset identification tags.
Illustratively, take the case where the asset identification tag corresponding to the network asset is the development network tag. One operation and maintenance strategy may be to trigger an operation and maintenance alarm operation when a network asset with the development network tag actively accesses a network asset with the office network tag.
Illustratively, take the case where the asset identification tag corresponding to the network asset is the office network tag. One operation and maintenance strategy may be to trigger an operation and maintenance alarm operation when a network asset with the office network tag actively accesses a network asset with the production network tag. Another operation and maintenance strategy may be to trigger an operation and maintenance alarm operation when the proportion of traffic that a network asset with the office network tag sends to the external network exceeds a threshold (e.g., 30%). Another operation and maintenance strategy may be to trigger an operation and maintenance alarm operation when the proportion of traffic that a network asset with the office network tag sends to network assets with the production network tag exceeds a threshold (e.g., 1%).
Illustratively, take the case where the asset identification tag corresponding to the network asset is the production network tag. One operation and maintenance strategy may be to trigger an operation and maintenance alarm operation when a network asset with the production network tag actively accesses a network asset with the office network tag. Another operation and maintenance strategy may be to trigger an operation and maintenance alarm operation when the proportion of traffic that a network asset with the production network tag sends to network assets with the office network tag exceeds a threshold (e.g., 1%).
It is understood that the operation and maintenance alarm operation includes, but is not limited to, automatically pushing an email or a short message prompt to the operation and maintenance staff, or directly blocking access to the network asset, or limiting access traffic of the network asset, and the like, and is not limited herein.
In the embodiment of the application, a manner of triggering an operation and maintenance alarm operation based on an asset identification tag is provided. In this manner, network assets can be monitored in real time in combination with their asset identification tags, and a corresponding operation and maintenance alarm operation can be taken for network assets that pose a risk, which helps to improve the security of the enterprise or organization.
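The example strategies above can be sketched as a small rule table. This is only an illustrative sketch: the tag strings, the `EXTERNAL` pseudo-tag, and the function names are hypothetical, and the 30% and 1% limits are simply the "e.g." values quoted in the text, not values fixed by the application.

```python
# Hypothetical tag names used only for this sketch.
DEV, OFFICE, PROD, EXTERNAL = "development", "office", "production", "external"

# (source tag, destination tag) pairs for which active access raises an alarm.
FORBIDDEN_ACCESS = {(DEV, OFFICE), (OFFICE, PROD), (PROD, OFFICE)}

# Traffic-share limits per (source tag, destination); example values from the text.
SHARE_LIMIT = {(OFFICE, EXTERNAL): 0.30, (OFFICE, PROD): 0.01, (PROD, OFFICE): 0.01}


def access_alarm(src_tag, dst_tag):
    """Return True if an active access from src_tag to dst_tag should alarm."""
    return (src_tag, dst_tag) in FORBIDDEN_ACCESS


def share_alarms(src_tag, bytes_by_dst, total_bytes):
    """Return (src_tag, dst, share) for every destination whose share exceeds its limit.

    bytes_by_dst maps a destination tag (or EXTERNAL) to the bytes sent there.
    """
    if total_bytes <= 0:
        return []
    alarms = []
    for dst, sent in bytes_by_dst.items():
        limit = SHARE_LIMIT.get((src_tag, dst))
        if limit is not None and sent / total_bytes > limit:
            alarms.append((src_tag, dst, sent / total_bytes))
    return alarms
```

The alarm operation itself (notification, blocking, or rate limiting) would be chosen by the caller based on the returned result.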
With reference to fig. 12, an overall flow of the network asset processing method in the present application will be described below, and another embodiment of the network asset processing method in the present application includes:
201. acquiring original flow from gateway equipment;
202. acquiring a message set to be processed from original flow;
203. classifying each message in a message set to be processed to obtain at least one message belonging to the same protocol type;
204. performing message recombination on at least one message belonging to the same protocol type to obtain an application layer message;
205. carrying out feature extraction on the application layer message to obtain flow features;
206. recording the flow characteristics into a flow database;
207. determining N network assets to be identified;
208. acquiring a target traffic characteristic set of a timestamp in a first time window from a traffic database;
209. for each network asset, taking the traffic characteristics of the source IP address belonging to the asset IP address in the target traffic characteristic set as first traffic characteristics in a first traffic characteristic set;
210. for each network asset, taking the traffic characteristics of the asset IP address belonging to the destination IP address in the target traffic characteristic set as first traffic characteristics in a first traffic characteristic set;
211. counting a first traffic feature set corresponding to each network asset according to K preset attribute types to obtain a first feature vector corresponding to each network asset;
212. clustering the N network assets according to the first feature vector corresponding to each network asset until a clustering stopping condition is met, and obtaining at least one clustering result aiming at a first time window;
213. for each clustering result of the first time window, if the average value of the uplink flow ratios of all the network assets in the clustering result is less than or equal to a first ratio threshold, setting the asset identification tag corresponding to each network asset in the clustering result as a production network tag;
214. for each clustering result of the first time window, if the average value of the external network flow proportion of all the network assets in the clustering result is greater than or equal to a second proportion threshold, setting the asset identification tag corresponding to each network asset in the clustering result as an office network tag;
215. for each clustering result of the first time window, if the average value of the intranet flow ratios of all the network assets in the clustering result is greater than or equal to a third ratio threshold, setting the asset identification tag corresponding to each network asset in the clustering result as a development network tag.
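Steps 213 to 215 can be sketched as a threshold check over the per-cluster averages. The sketch below makes several assumptions that the text does not fix: the three threshold values are placeholders, the dictionary keys are hypothetical, and the rules are evaluated in the order listed with the first match winning.

```python
from statistics import mean


def label_cluster(cluster, up_max=0.2, ext_min=0.5, intra_min=0.8):
    """Assign one tag to a cluster from its average traffic ratios.

    cluster: list of dicts with hypothetical keys "up_ratio", "ext_ratio",
    "intra_ratio". The thresholds are illustrative placeholders.
    Returns None when no rule matches.
    """
    up = mean([a["up_ratio"] for a in cluster])
    ext = mean([a["ext_ratio"] for a in cluster])
    intra = mean([a["intra_ratio"] for a in cluster])
    if up <= up_max:          # step 213: low uplink share
        return "production"
    if ext >= ext_min:        # step 214: high external-network share
        return "office"
    if intra >= intra_min:    # step 215: high intranet share
        return "development"
    return None
```

Every network asset in the cluster would then receive the returned tag; a `None` result could be routed to manual labeling.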
Referring to fig. 13, fig. 13 is a schematic diagram of an embodiment of a network asset processing device in an embodiment of the present application, where the network asset processing device 30 includes:
an obtaining module 310, configured to obtain, from a traffic database, a first traffic feature set corresponding to each network asset of N network assets to be identified, where the first traffic feature set is extracted based on an original traffic within a first time window, and N is an integer greater than 1;
the statistical module 320 is configured to perform statistics on a first traffic feature set corresponding to each network asset according to K preset attribute types to obtain a first feature vector corresponding to each network asset, where the first feature vector includes K first feature values, each first feature value corresponds to one preset attribute type, and K is an integer greater than or equal to 1;
the clustering module 330 is configured to perform clustering processing on the N network assets according to the first feature vector corresponding to each network asset until a clustering stop condition is met, so as to obtain at least one clustering result for a first time window;
and a marking module 340, configured to mark at least one clustering result for the first time window to obtain an asset identification tag corresponding to each network asset.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, the purpose of automatically identifying the network assets can be achieved based on flow monitoring and feature clustering, on one hand, the problem of low manual management efficiency is solved, the labor cost required by network asset management is saved, and the real-time property of network asset identification is favorably improved. On the other hand, a large amount of flow does not need to be sent actively, so that the condition that normal service is influenced due to network asset identification can be effectively avoided.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in the embodiment of the present application, the network asset processing device 30 includes a recording module 350;
an obtaining module 310, configured to obtain an original traffic from a gateway device, where the gateway device includes at least one of a routing device, a firewall, and a switch;
the obtaining module 310 is further configured to obtain at least one packet from the original traffic;
the obtaining module 310 is further configured to perform feature extraction on at least one packet to obtain a traffic feature, where the traffic feature includes a timestamp, a source internet protocol IP address, a destination IP address, a source port, a destination port, a packet size, and a service type;
a recording module 350, configured to record the flow characteristics into a flow database.
In the embodiment of the application, a network asset processing device is provided. With this device, the input is obtained from the gateway device by traffic mirroring, so the operation of normal services is not affected. Preliminary feature extraction is then performed and the extracted traffic features are stored in the traffic database, which facilitates subsequent processing.
Alternatively, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in the embodiment of the present application,
an obtaining module 310, specifically configured to obtain a set of messages to be processed from an original flow;
classifying each message in a message set to be processed to obtain at least one message belonging to the same protocol type;
an obtaining module 310, configured to perform packet reassembly on at least one packet belonging to the same protocol type to obtain an application layer packet;
and extracting the characteristics of the application layer message to obtain the flow characteristics.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, the aim of extracting the characteristics of the application layer is achieved based on the recombination of a plurality of messages, thereby being beneficial to increasing the richness of the characteristics.
Alternatively, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in the embodiment of the present application,
an obtaining module 310, specifically configured to obtain, from a traffic database, a target traffic feature set with a timestamp within a first time window;
regarding each network asset, taking the traffic characteristics of the source IP address belonging to the asset IP address in the target traffic characteristic set as first traffic characteristics in a first traffic characteristic set, wherein the asset IP address is the IP address of the network asset;
and regarding the traffic characteristics of which the destination IP address belongs to the asset IP address in the target traffic characteristic set as the first traffic characteristics in the first traffic characteristic set for each network asset.
In the embodiment of the application, a network asset processing device is provided. With the above device, the IP addresses (i.e., the source IP address and the destination IP address) are used as the basis for aggregating the traffic features of each network asset, thereby improving the feasibility and operability of the scheme.
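The selection described above can be sketched as a filter over traffic records. The `TrafficFeature` record type and its field names are hypothetical stand-ins for the traffic features listed earlier, and treating the time window as half-open (`start <= timestamp < end`) is an assumption of this sketch.

```python
from typing import NamedTuple


class TrafficFeature(NamedTuple):
    # Hypothetical record layout mirroring the traffic features in the text.
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    size: int
    service: str


def first_feature_sets(records, asset_ips, start, end):
    """Group in-window records by asset IP (as source or as destination)."""
    sets = {ip: [] for ip in asset_ips}
    for r in records:
        if not (start <= r.timestamp < end):
            continue  # outside the first time window
        if r.src_ip in sets:
            sets[r.src_ip].append(r)
        if r.dst_ip in sets and r.dst_ip != r.src_ip:
            sets[r.dst_ip].append(r)
    return sets
```

A record between two monitored assets lands in both assets' sets, once as uplink for the source and once as downlink for the destination.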
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in the embodiment of the present application, the K preset attribute types include at least one of an uplink traffic total, a downlink traffic total, an uplink and downlink traffic total, an uplink traffic proportion, and a downlink traffic proportion;
the statistics module 320 is specifically configured to sum, for each network asset, message sizes included in first traffic features of which source IP addresses belong to asset IP addresses in the first traffic feature set if the K preset attribute types include an uplink traffic total, to obtain a first feature value corresponding to the uplink traffic total in a first feature vector, where the asset IP address is an IP address of the network asset;
for each network asset, if the K preset attribute types include the downlink traffic total, summing the message sizes included in the first traffic features whose destination IP address belongs to the asset IP address in the first traffic feature set, to obtain a first feature value corresponding to the downlink traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the uplink and downlink traffic total, summing the first feature value corresponding to the downlink traffic total and the first feature value corresponding to the uplink traffic total, to obtain a first feature value corresponding to the uplink and downlink traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the uplink traffic proportion, calculating the ratio between the first feature value corresponding to the uplink traffic total and the first feature value corresponding to the uplink and downlink traffic total, to obtain a first feature value corresponding to the uplink traffic proportion in the first feature vector;
for each network asset, if the K preset attribute types include the downlink traffic proportion, calculating the ratio between the first feature value corresponding to the downlink traffic total and the first feature value corresponding to the uplink and downlink traffic total, to obtain a first feature value corresponding to the downlink traffic proportion in the first feature vector.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, the characteristics related to the uplink and downlink flow dimensions are aggregated together to be used as the basis for generating the characteristic vector, so that the richness of the characteristic vector is increased, and the accuracy of characteristic clustering is favorably improved.
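A minimal sketch of the uplink and downlink statistics follows. It assumes each traffic feature is reduced to a `(src_ip, dst_ip, size)` tuple, that both proportions are taken relative to the combined uplink-and-downlink total, and that a proportion is reported as 0.0 when the asset has no traffic; the function and key names are hypothetical.

```python
def up_down_features(asset_ip, features):
    """Compute uplink/downlink totals and proportions for one asset.

    features: iterable of (src_ip, dst_ip, size) tuples from the asset's
    first traffic feature set.
    """
    features = list(features)
    up = sum(size for src, dst, size in features if src == asset_ip)
    down = sum(size for src, dst, size in features if dst == asset_ip)
    total = up + down
    return {
        "up_total": up,
        "down_total": down,
        "up_down_total": total,
        "up_ratio": up / total if total else 0.0,
        "down_ratio": down / total if total else 0.0,
    }
```

For example, an asset that sends 300 bytes and receives 100 bytes has an uplink proportion of 0.75 and a downlink proportion of 0.25.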
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in this embodiment of the present application, the K preset attribute types include at least one of an internal network traffic total, an external network traffic total, an internal and external network traffic total, an internal network traffic ratio, and an external network traffic ratio;
the statistical module 320 is specifically configured to sum, for each network asset, message sizes included in first traffic characteristics, where the source IP address and the destination IP address in the first traffic characteristic set belong to an intranet address, if the K preset attribute types include an intranet traffic total, to obtain a first characteristic value corresponding to the intranet traffic total in a first characteristic vector;
for each network asset, if the K preset attribute types comprise the outer network traffic total, summing the sizes of messages included in the first traffic characteristics of the source IP address or the destination IP address belonging to the outer network address in the first traffic characteristic set to obtain a first characteristic value corresponding to the outer network traffic total in the first characteristic vector;
for each network asset, if the K preset attribute types include the internal and external network traffic total, summing the first feature value corresponding to the internal network traffic total and the first feature value corresponding to the external network traffic total, to obtain a first feature value corresponding to the internal and external network traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the internal network traffic ratio, calculating the ratio between the first feature value corresponding to the internal network traffic total and the first feature value corresponding to the internal and external network traffic total, to obtain a first feature value corresponding to the internal network traffic ratio in the first feature vector;
for each network asset, if the K preset attribute types include the external network traffic ratio, calculating the ratio between the first feature value corresponding to the external network traffic total and the first feature value corresponding to the internal and external network traffic total, to obtain a first feature value corresponding to the external network traffic ratio in the first feature vector.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, the characteristics related to the flow dimensions of the internal and external networks are aggregated together to be used as the basis for generating the characteristic vector, so that the richness of the characteristic vector is increased, and the accuracy of characteristic clustering is favorably improved.
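The internal and external network statistics can be sketched in the same way. How an address is judged to be an intranet address is not specified in this passage; the sketch assumes, purely for illustration, that Python's `ipaddress` `is_private` property is an acceptable test. Function and key names are hypothetical.

```python
import ipaddress


def is_intranet(ip):
    """Assumption: treat addresses flagged private by `ipaddress` as intranet."""
    return ipaddress.ip_address(ip).is_private


def in_out_features(features):
    """Compute intranet/external totals and ratios.

    features: iterable of (src_ip, dst_ip, size) tuples. A record counts as
    intranet when both ends are intranet addresses, and as external when
    either end is an external address.
    """
    intra = extra = 0
    for src, dst, size in features:
        if is_intranet(src) and is_intranet(dst):
            intra += size
        else:
            extra += size
    total = intra + extra
    return {
        "intra_total": intra,
        "extra_total": extra,
        "in_out_total": total,
        "intra_ratio": intra / total if total else 0.0,
        "extra_ratio": extra / total if total else 0.0,
    }
```

In a real deployment the intranet test would more likely be a lookup against the organization's own address ranges.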
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in this embodiment of the present application, the K preset attribute types include at least one of a maximum-ratio source port, a maximum-ratio destination port, a maximum-ratio source port proportion, and a maximum-ratio destination port proportion;
the statistics module 320 is specifically configured to, for each network asset, determine, according to the first traffic feature set, total traffic of the source ports and traffic of each source port if the K preset attribute types include the maximum-ratio source port, and use the source port with the largest traffic ratio as a first feature value corresponding to the maximum-ratio source port in the first feature vector;
for each network asset, if the K preset attribute types comprise a maximum-ratio destination port, determining total flow of the destination ports and flow of each destination port according to a first flow characteristic set, and taking the destination port with the maximum flow ratio as a first characteristic value corresponding to the maximum-ratio destination port in a first characteristic vector;
for each network asset, if the K preset attribute types comprise the maximum ratio source port proportion, calculating the ratio of the flow corresponding to the maximum ratio source port to the total flow of the source port to obtain a first characteristic value corresponding to the maximum ratio source port proportion in a first characteristic vector;
for each network asset, if the K preset attribute types include the maximum-ratio destination port proportion, calculating the ratio between the traffic corresponding to the maximum-ratio destination port and the total traffic of the destination ports, to obtain a first feature value corresponding to the maximum-ratio destination port proportion in the first feature vector.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, the features related to the port dimensionality are aggregated together to be used as the basis for generating the feature vector, so that the richness of the feature vector is increased, and the accuracy of feature clustering is favorably improved.
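The port statistics can be sketched with per-port byte counters. The sketch assumes each traffic feature is reduced to a `(src_port, dst_port, size)` tuple and that "traffic" is measured by message size; the function and key names are hypothetical.

```python
from collections import Counter


def top_port(bytes_by_port):
    """Return (port, share) for the port carrying the most traffic."""
    total = sum(bytes_by_port.values())
    if not total:
        return None, 0.0
    port, size = max(bytes_by_port.items(), key=lambda kv: kv[1])
    return port, size / total


def port_features(features):
    """Compute maximum-ratio source/destination ports and their proportions.

    features: iterable of (src_port, dst_port, size) tuples.
    """
    src, dst = Counter(), Counter()
    for sp, dp, size in features:
        src[sp] += size
        dst[dp] += size
    top_src, top_src_share = top_port(src)
    top_dst, top_dst_share = top_port(dst)
    return {
        "top_src_port": top_src,
        "top_dst_port": top_dst,
        "top_src_share": top_src_share,
        "top_dst_share": top_dst_share,
    }
```

Note that the port numbers themselves are categorical values, so a downstream distance measure may need to treat them differently from the proportions.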
Alternatively, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in the embodiment of the present application,
the clustering module 330 is specifically configured to obtain K preset weight values, where each weight value corresponds to a preset attribute type;
for each network asset, determining distances between the network asset and T clustering centers according to K weighted values and a first feature vector, and dividing the network asset into clustering clusters with the shortest distances, wherein T is an integer greater than 1;
if the clustering stop condition is met, obtaining at least one clustering result aiming at the first time window;
and if the clustering stop condition is not met, updating the T clustering centers.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, each preset attribute type can be endowed with a weighted value according to experimental data, and the characteristics are divided by combining the weighted values when clustering is carried out, so that the clustering accuracy is favorably improved.
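The weighted clustering loop can be sketched as a k-means variant. Several details are assumptions of this sketch rather than statements of the application: the distance is a weighted Euclidean distance, the stop condition is "assignments no longer change, or a maximum iteration count is reached", and the initial cluster centers are supplied by the caller.

```python
import math


def weighted_distance(x, c, w):
    """Weighted Euclidean distance between vector x and center c."""
    return math.sqrt(sum(wk * (xk - ck) ** 2 for xk, ck, wk in zip(x, c, w)))


def weighted_kmeans(vectors, weights, centers, max_iter=100):
    """Cluster feature vectors around T centers using K attribute weights.

    Returns (assignment, centers), where assignment[i] is the index of the
    cluster that vectors[i] belongs to.
    """
    centers = [list(c) for c in centers]
    assignment = None
    for _ in range(max_iter):
        # Assign each asset to the nearest center under the weighted distance.
        new_assignment = [
            min(range(len(centers)),
                key=lambda t: weighted_distance(v, centers[t], weights))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # stop condition met: assignments are stable
        assignment = new_assignment
        # Otherwise update each center to the mean of its members.
        for t in range(len(centers)):
            members = [v for v, a in zip(vectors, assignment) if a == t]
            if members:
                centers[t] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers
```

A larger weight makes differences in that attribute count more toward the distance, so attributes judged more discriminative from experimental data can dominate the grouping.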
Alternatively, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in the embodiment of the present application,
a marking module 340, specifically configured to display at least one clustering result of the first time window;
and responding to a labeling instruction aiming at each clustering result, and determining an asset identification tag corresponding to each network asset, wherein the labeling instruction carries the asset identification tag.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, a corresponding channel is provided for the user-defined asset identification tag, and the asset identification tag of the clustering result can be manually calibrated, so that the flexibility of the scheme is improved.
Alternatively, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in the embodiment of the present application,
the marking module 340 is specifically configured to, for each clustering result of the first time window, set an asset identification tag corresponding to each network asset in the clustering result as a production network tag if an uplink traffic proportion average value of all network assets in the clustering result is less than or equal to a first proportion threshold;
for each clustering result of the first time window, if the average value of the external network flow proportion of all the network assets in the clustering result is greater than or equal to a second proportion threshold, setting the asset identification tag corresponding to each network asset in the clustering result as an office network tag;
and aiming at each clustering result of the first time window, if the intranet flow ratio average value of all the network assets in the clustering result is greater than or equal to a third ratio threshold, setting the asset identification tag corresponding to each network asset in the clustering result as a development network tag.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, the user can classify and survey a large amount of network assets only by importing historical flow. The problems that a large amount of manual labels are needed for network asset mapping and asset changes cannot be found in real time are solved, and therefore real-time and efficient network asset mapping service is provided for users.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing apparatus 30 provided in the embodiment of the present application, the network asset processing apparatus 30 includes a culling module 360;
the obtaining module 310 is further configured to obtain a second traffic feature set corresponding to each network asset from the traffic database, where the second traffic feature set is extracted based on the original traffic in the second time window;
the statistical module 320 is further configured to perform statistics on a second traffic feature set corresponding to each network asset according to K preset attribute types to obtain a second feature vector corresponding to each network asset, where the second feature vector includes K second feature values, and each second feature value corresponds to one preset attribute type;
and the eliminating module 360 is configured to eliminate the network asset from the clustering result corresponding to the first clustering center if the distance between the first feature vector corresponding to the network asset and the first clustering center is greater than the distance between the second feature vector corresponding to the network asset and the second clustering center.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, the dynamic change of the network assets is considered, and the asset marking module can be periodically operated again to acquire the latest data, so that the real-time performance of the clustering result is improved.
Alternatively, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in the embodiment of the present application,
the clustering module 330 is further configured to, if the network asset is a network asset newly added in the second time window, perform clustering processing on each network asset according to a second feature vector corresponding to the network asset until a clustering stop condition is met, and obtain at least one clustering result for the second time window.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, the dynamic change of the network assets is considered, and the asset marking module can be periodically operated again to acquire the latest data, so that the real-time performance of the clustering result is improved.
Optionally, on the basis of the embodiment corresponding to fig. 13, in another embodiment of the network asset processing device 30 provided in the embodiment of the present application, the network asset processing device 30 further includes an operation and maintenance module 370;
the operation and maintenance module 370 is configured to, after at least one clustering result for the first time window is marked to obtain an asset identification tag corresponding to each network asset, trigger an operation and maintenance alarm operation when a network asset accesses a network asset with an office network tag if the asset identification tag corresponding to the network asset is a development network tag;
the operation and maintenance module 370 is further configured to, after marking at least one clustering result for the first time window to obtain an asset identification tag corresponding to each network asset, trigger an operation and maintenance alarm operation when the network asset accesses the network asset having the production network tag if the asset identification tag corresponding to the network asset is an office network tag;
the operation and maintenance module 370 is further configured to, after marking the at least one clustering result for the first time window to obtain the asset identification tag corresponding to each network asset, trigger an operation and maintenance alarm operation when the network asset accesses the network asset with the office network tag if the asset identification tag corresponding to the network asset is the production network tag.
In the embodiment of the application, a network asset processing device is provided. By adopting the device, the network assets can be monitored in real time by combining the asset identification tag, and corresponding operation and maintenance alarming operation is adopted for the network assets with risks, so that the security of enterprises or institutions is improved.
The embodiment of the application also provides another network asset processing device, which can be deployed in a terminal device. As shown in fig. 14, for convenience of explanation, only the parts related to the embodiments of the present application are shown; for specific technical details that are not disclosed, please refer to the method part of the embodiments of the present application. The terminal device may be any terminal device including a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a Point of Sales (POS) terminal, a vehicle-mounted computer, and the like. The following takes a mobile phone as an example of the terminal device:
Fig. 14 is a block diagram illustrating a partial structure of a mobile phone related to a terminal device provided in an embodiment of the present application. Referring to fig. 14, the mobile phone includes: a radio frequency (RF) circuit 410, a memory 420, an input unit 430, a display unit 440, a sensor 450, an audio circuit 460, a wireless fidelity (WiFi) module 470, a processor 480, and a power supply 490. Those skilled in the art will appreciate that the mobile phone structure shown in fig. 14 does not limit the mobile phone, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.
The following describes each component of the mobile phone in detail with reference to fig. 14:
The RF circuit 410 may be used for receiving and transmitting signals during information transmission and reception or during a call. In particular, after receiving downlink information from a base station, it passes the information to the processor 480 for processing; in addition, it transmits uplink data to the base station. In general, the RF circuit 410 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 410 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
The memory 420 may be used to store software programs and modules, and the processor 480 executes various functional applications and data processing of the mobile phone by operating the software programs and modules stored in the memory 420. The memory 420 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 420 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 430 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the cellular phone. Specifically, the input unit 430 may include a touch panel 431 and other input devices 432. The touch panel 431, also called a touch screen, may collect touch operations of a user on or near the touch panel 431 (e.g., operations of the user on or near the touch panel 431 using any suitable object or accessory such as a finger or a stylus) and drive the corresponding connection device according to a preset program. Alternatively, the touch panel 431 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 480, and receives and executes commands sent from the processor 480. In addition, the touch panel 431 may be implemented in various types, such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The input unit 430 may include other input devices 432 in addition to the touch panel 431. In particular, other input devices 432 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 440 may be used to display information input by the user or information provided to the user and various menus of the mobile phone. The display unit 440 may include a display panel 441, and optionally, the display panel 441 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 431 may cover the display panel 441. When the touch panel 431 detects a touch operation on or near it, the touch operation is transmitted to the processor 480 to determine the type of the touch event, and the processor 480 then provides a corresponding visual output on the display panel 441 according to the type of the touch event. Although the touch panel 431 and the display panel 441 are shown in fig. 14 as two separate components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 431 and the display panel 441 may be integrated to implement the input and output functions of the mobile phone.
The handset may also include at least one sensor 450, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor that adjusts the brightness of the display panel 441 according to the brightness of ambient light, and a proximity sensor that turns off the display panel 441 and/or the backlight when the mobile phone is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
The audio circuit 460, the speaker 461, and the microphone 462 may provide an audio interface between the user and the mobile phone. The audio circuit 460 may transmit the electrical signal converted from the received audio data to the speaker 461, which converts it into a sound signal for output; on the other hand, the microphone 462 converts the collected sound signal into an electrical signal, which is received by the audio circuit 460 and converted into audio data. The audio data is then output to the processor 480 for processing and transmitted via the RF circuit 410 to, for example, another mobile phone, or output to the memory 420 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 470, the mobile phone can help the user send and receive e-mails, browse web pages, access streaming media, and so on, providing the user with wireless broadband Internet access. Although fig. 14 shows the WiFi module 470, it is understood that the module is not an essential part of the mobile phone and may be omitted as needed without changing the essence of the invention.
The processor 480 is a control center of the mobile phone, connects various parts of the entire mobile phone by using various interfaces and lines, and performs various functions of the mobile phone and processes data by operating or executing software programs and/or modules stored in the memory 420 and calling data stored in the memory 420, thereby integrally monitoring the mobile phone. Optionally, processor 480 may include one or more processing units; optionally, the processor 480 may integrate an application processor and a modem processor, wherein the application processor mainly handles operating systems, user interfaces, application programs, and the like, and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 480.
The mobile phone also includes a power supply 490 (e.g., a battery) for powering the various components. Optionally, the power supply may be logically connected to the processor 480 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which are not described herein.
The steps performed by the terminal device in the above-described embodiment may be based on the terminal device structure shown in fig. 14.
The embodiment of the application also provides another network asset processing device, which can be deployed in a server. Fig. 15 is a schematic diagram of a server structure provided by an embodiment of the present application. The server 500 may vary considerably depending on its configuration or performance, and may include one or more Central Processing Units (CPUs) 522 (e.g., one or more processors), a memory 532, and one or more storage media 530 (e.g., one or more mass storage devices) for storing applications 542 or data 544. The memory 532 and the storage medium 530 may be transient storage or persistent storage. The program stored on the storage medium 530 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 522 may be configured to communicate with the storage medium 530 and to execute, on the server 500, the series of instruction operations in the storage medium 530.
The server 500 may also include one or more power supplies 526, one or more wired or wireless network interfaces 550, one or more input-output interfaces 558, and/or one or more operating systems 541, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps performed by the server in the above embodiment may be based on the server structure shown in fig. 15.
Embodiments of the present application also provide a computer-readable storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the method described in the foregoing embodiments.
Embodiments of the present application also provide a computer program product including a program, which, when run on a computer, causes the computer to perform the methods described in the foregoing embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network asset, etc.) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (17)

1. A method for network asset handling, comprising:
acquiring a first traffic characteristic set corresponding to each network asset in N network assets to be identified from a traffic database, wherein the first traffic characteristic set is obtained by extracting original traffic based on a first time window, and N is an integer greater than 1;
counting a first traffic feature set corresponding to each network asset according to K preset attribute types to obtain a first feature vector corresponding to each network asset, wherein the first feature vector comprises K first feature values, each first feature value corresponds to one preset attribute type, and K is an integer greater than or equal to 1;
clustering the N network assets according to the first characteristic vector corresponding to each network asset until a clustering stopping condition is met, and obtaining at least one clustering result aiming at the first time window;
and marking at least one clustering result aiming at the first time window to obtain an asset identification tag corresponding to each network asset.
2. The method of network asset handling according to claim 1, said method further comprising:
acquiring original flow from gateway equipment, wherein the gateway equipment comprises at least one of routing equipment, a firewall and a switch;
acquiring at least one message from the original flow;
performing feature extraction on the at least one message to obtain a traffic feature, wherein the traffic feature comprises a timestamp, a source Internet Protocol (IP) address, a destination IP address, a source port, a destination port, a message size and a service type;
recording the flow characteristics into the flow database.
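As an illustration of the record described in claim 2, the following is a minimal sketch of one traffic feature and of writing it to a traffic database. The `TrafficFeature` class, the `traffic` table, the helper names, and the use of SQLite are all assumptions made for this example; the claims do not name a storage engine or a schema.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class TrafficFeature:
    """One record holding the seven fields listed in claim 2 (names are hypothetical)."""
    timestamp: float   # seconds since the epoch
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    size: int          # message size in bytes
    service_type: str


def open_traffic_db(path=":memory:"):
    """Open (or create) a hypothetical traffic database with a single table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS traffic ("
        "timestamp REAL, src_ip TEXT, dst_ip TEXT, "
        "src_port INTEGER, dst_port INTEGER, size INTEGER, service_type TEXT)"
    )
    return conn


def record_features(conn, features):
    """Record extracted traffic features into the traffic database."""
    conn.executemany(
        "INSERT INTO traffic VALUES (?, ?, ?, ?, ?, ?, ?)",
        [astuple(f) for f in features],
    )
    conn.commit()
```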
3. The method according to claim 2, wherein said obtaining at least one packet from said original traffic comprises:
acquiring a message set to be processed from the original flow;
classifying each message in the message set to be processed to obtain the at least one message belonging to the same protocol type;
the extracting the feature of the at least one message to obtain the flow feature includes:
performing message recombination on the at least one message belonging to the same protocol type to obtain an application layer message;
and extracting the characteristics of the application layer message to obtain the flow characteristics.
4. The method according to claim 1, wherein the obtaining the first traffic feature set corresponding to each network asset from the traffic database comprises:
obtaining a target traffic feature set with a timestamp within the first time window from the traffic database;
for each network asset, taking the traffic features in the target traffic feature set whose source IP address belongs to the asset IP address as first traffic features in the first traffic feature set, wherein the asset IP address is the IP address of the network asset;
and for each network asset, taking the traffic features in the target traffic feature set whose destination IP address belongs to the asset IP address as first traffic features in the first traffic feature set.
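The selection in claim 4 can be sketched as two filters: one on the timestamp, one on the asset IP address. The dictionary keys and the half-open window `[window_start, window_end)` are assumptions for this example; the claim only requires that the timestamp fall within the first time window.

```python
def first_traffic_feature_set(features, asset_ip, window_start, window_end):
    """Return the features of one asset whose timestamp lies in the time window.

    `features` is a list of dicts with (hypothetical) keys
    "timestamp", "src_ip" and "dst_ip".
    """
    # Step 1: target traffic feature set, i.e. records inside the time window.
    in_window = [f for f in features if window_start <= f["timestamp"] < window_end]
    # Step 2: keep records whose source or destination IP is the asset IP.
    return [f for f in in_window if f["src_ip"] == asset_ip or f["dst_ip"] == asset_ip]
```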
5. The network asset processing method according to claim 1, wherein said K preset attribute types include at least one of an upstream traffic total, a downstream traffic total, an upstream and downstream traffic total, an upstream traffic proportion, and a downstream traffic proportion;
the counting the first traffic feature set corresponding to each network asset according to the K preset attribute types to obtain a first feature vector corresponding to each network asset includes:
for each network asset, if the K preset attribute types include the uplink traffic total, summing sizes of messages included in first traffic features, of which source IP addresses belong to asset IP addresses, in the first traffic feature set to obtain a first feature value corresponding to the uplink traffic total in the first feature vector, where the asset IP addresses are IP addresses of the network assets;
for each network asset, if the K preset attribute types include the downlink traffic total, summing sizes of messages included in first traffic characteristics of which destination IP addresses belong to asset IP addresses in the first traffic characteristic set to obtain a first characteristic value corresponding to the downlink traffic total in the first characteristic vector;
for each network asset, if the K preset attribute types include the uplink and downlink traffic totals, summing a first eigenvalue corresponding to the downlink traffic totals and a first eigenvalue corresponding to the uplink traffic totals to obtain a first eigenvalue corresponding to the uplink and downlink traffic totals in the first eigenvector;
for each network asset, if the K preset attribute types include the uplink traffic proportion, calculating a ratio between a first eigenvalue corresponding to the uplink traffic total and a first eigenvalue corresponding to the uplink and downlink traffic total to obtain a first eigenvalue corresponding to the uplink traffic proportion in the first eigenvector;
for each network asset, if the K preset attribute types include the downlink traffic proportion, calculating a ratio between a first eigenvalue corresponding to the downlink traffic total and a first eigenvalue corresponding to the uplink and downlink traffic total to obtain a first eigenvalue corresponding to the downlink traffic proportion in the first eigenvector.
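The five direction-based feature values of claim 5 can be sketched as follows. The dictionary keys are hypothetical, and the sketch assumes that both proportions are taken relative to the combined uplink and downlink total, with a proportion of 0.0 when the asset has no traffic.

```python
def direction_features(features, asset_ip):
    """Compute uplink/downlink totals and proportions for one network asset.

    `features` is a list of dicts with (hypothetical) keys
    "src_ip", "dst_ip" and "size" (message size in bytes).
    """
    # Uplink: messages sent by the asset (source IP is the asset IP).
    up = sum(f["size"] for f in features if f["src_ip"] == asset_ip)
    # Downlink: messages received by the asset (destination IP is the asset IP).
    down = sum(f["size"] for f in features if f["dst_ip"] == asset_ip)
    total = up + down
    return {
        "uplink_total": up,
        "downlink_total": down,
        "updown_total": total,
        # Assumption: proportions are relative to the combined total.
        "uplink_ratio": up / total if total else 0.0,
        "downlink_ratio": down / total if total else 0.0,
    }
```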
6. The method according to claim 1, wherein the K preset attribute types include at least one of an intranet traffic total, an extranet traffic total, an intranet and extranet traffic total, an intranet traffic proportion, and an extranet traffic proportion;
the counting the first traffic feature set corresponding to each network asset according to the K preset attribute types to obtain a first feature vector corresponding to each network asset includes:
for each network asset, if the K preset attribute types include the intranet traffic total, summing sizes of messages included in a first traffic feature of which a source IP address and a destination IP address belong to an intranet address in the first traffic feature set to obtain a first feature value corresponding to the intranet traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the outer network traffic total, summing sizes of messages included in first traffic features of which source IP addresses or destination IP addresses belong to outer network addresses in the first traffic feature set to obtain a first feature value corresponding to the outer network traffic total in the first feature vector;
for each network asset, if the K preset attribute types include the intranet and extranet traffic total, summing a first eigenvalue corresponding to the intranet traffic total and a first eigenvalue corresponding to the extranet traffic total to obtain a first eigenvalue corresponding to the intranet and extranet traffic total in the first eigenvector;
for each network asset, if the K preset attribute types include the intranet flow ratio, calculating a ratio between a first eigenvalue corresponding to the intranet flow sum and a first eigenvalue corresponding to the intranet and extranet flow sum, and obtaining a first eigenvalue corresponding to the intranet flow ratio in the first eigenvector;
for each network asset, if the K preset attribute types include the extranet traffic proportion, calculating a ratio between a first eigenvalue corresponding to the extranet traffic total and a first eigenvalue corresponding to the intranet and extranet traffic total to obtain a first eigenvalue corresponding to the extranet traffic proportion in the first eigenvector.
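A minimal sketch of the intranet and extranet feature values of claim 6 follows. It treats an address as an intranet address when Python's `ipaddress` module reports it as private; that test is an assumption for illustration, since the claims do not say how intranet addresses are recognised. The dictionary keys are likewise hypothetical, and the proportions are assumed to be relative to the combined intranet and extranet total.

```python
import ipaddress


def is_intranet(ip):
    """Assumption: private address ranges stand in for intranet addresses."""
    return ipaddress.ip_address(ip).is_private


def intranet_extranet_features(features):
    """Compute intranet/extranet totals and proportions for one asset's feature set.

    `features` is a list of dicts with (hypothetical) keys
    "src_ip", "dst_ip" and "size".
    """
    intranet = 0
    extranet = 0
    for f in features:
        if is_intranet(f["src_ip"]) and is_intranet(f["dst_ip"]):
            intranet += f["size"]   # both ends are intranet addresses
        else:
            extranet += f["size"]   # source or destination is an extranet address
    total = intranet + extranet
    return {
        "intranet_total": intranet,
        "extranet_total": extranet,
        "intra_extra_total": total,
        "intranet_ratio": intranet / total if total else 0.0,
        "extranet_ratio": extranet / total if total else 0.0,
    }
```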
7. The method of claim 1, wherein the K preset attribute types comprise at least one of a maximum proportion source port, a maximum proportion destination port, a maximum proportion source port ratio, and a maximum proportion destination port ratio;
the counting the first traffic feature set corresponding to each network asset according to the K preset attribute types to obtain a first feature vector corresponding to each network asset includes:
for each network asset, if the K preset attribute types include the maximum ratio source port, determining total flow of the source ports and flow of each source port according to the first flow feature set, and taking the source port with the largest ratio as a first feature value corresponding to the maximum ratio source port in the first feature vector;
for each network asset, if the K preset attribute types include the maximum ratio destination port, determining total traffic of the destination ports and traffic of each destination port according to the first traffic feature set, and using the destination port with the maximum traffic ratio as a first feature value corresponding to the maximum ratio destination port in the first feature vector;
for each network asset, if the K preset attribute types include the maximum ratio source port ratio, calculating a ratio between a flow corresponding to the maximum ratio source port and a total flow of the source ports to obtain a first eigenvalue corresponding to the maximum ratio source port ratio in the first eigenvector;
for each network asset, if the K preset attribute types include the maximum proportion destination port proportion, calculating a ratio between a flow corresponding to the maximum proportion destination port and a total flow of the destination ports to obtain a first eigenvalue corresponding to the maximum proportion destination port proportion in the first eigenvector.
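The port-based feature values of claim 7 amount to a per-port traffic tally followed by picking the port with the largest share. The sketch below uses `collections.Counter` for the tally; the dictionary keys and the handling of an empty feature set (port `None`, ratio 0.0) are assumptions for this example.

```python
from collections import Counter


def dominant_port(features, port_key):
    """Return (port with the largest traffic share, that share) for one port field.

    `features` is a list of dicts with (hypothetical) keys
    "src_port", "dst_port" and "size"; `port_key` selects the field.
    """
    per_port = Counter()
    for f in features:
        per_port[f[port_key]] += f["size"]   # traffic of each port
    total = sum(per_port.values())           # total traffic over all ports
    if total == 0:
        return None, 0.0
    port, traffic = per_port.most_common(1)[0]
    return port, traffic / total


def port_features(features):
    """Compute the four port-related feature values for one network asset."""
    src_port, src_ratio = dominant_port(features, "src_port")
    dst_port, dst_ratio = dominant_port(features, "dst_port")
    return {
        "max_src_port": src_port,
        "max_dst_port": dst_port,
        "max_src_port_ratio": src_ratio,
        "max_dst_port_ratio": dst_ratio,
    }
```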
8. The method according to claim 1, wherein the clustering the N network assets according to the first eigenvector corresponding to each network asset until a clustering stop condition is satisfied to obtain at least one clustering result for the first time window comprises:
acquiring preset K weight values, wherein each weight value corresponds to a preset attribute type;
for each network asset, determining distances between the network asset and T clustering centers according to the K weighted values and the first feature vector, and dividing the network asset into clustering clusters with the shortest distances, wherein T is an integer greater than 1;
if the clustering stopping condition is met, obtaining at least one clustering result aiming at the first time window;
and if the clustering stop condition is not met, updating the T clustering centers.
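The loop in claim 8 resembles k-means with a weighted distance. The sketch below is one possible reading, not the applicant's implementation: it assumes a weighted Euclidean distance, takes the first T feature vectors as initial clustering centers, and uses "assignments unchanged, or `max_iter` reached" as the clustering stop condition. None of these choices is specified in the claims.

```python
import math


def weighted_distance(x, center, weights):
    """Weighted Euclidean distance between a feature vector and a clustering center."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, center, weights)))


def weighted_kmeans(vectors, weights, t, max_iter=100):
    """Cluster feature vectors into t clusters; returns (assignment, centers)."""
    # Assumption: the first t vectors serve as the initial clustering centers.
    centers = [list(v) for v in vectors[:t]]
    assignment = None
    for _ in range(max_iter):
        # Assign each asset to the cluster whose center is nearest.
        new_assignment = [
            min(range(t), key=lambda j: weighted_distance(v, centers[j], weights))
            for v in vectors
        ]
        # Assumed stop condition: no asset changed cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Stop condition not met: update the t clustering centers.
        for j in range(t):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers
```

Raising a weight makes differences in that attribute count for more when assets are assigned to clusters, which is how the K preset weight values influence the result.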
9. The method according to claim 1, wherein said tagging at least one clustering result for the first time window to obtain an asset identification tag corresponding to each network asset comprises:
displaying at least one clustering result of the first time window;
and responding to the marking instruction aiming at each clustering result, and determining the asset identification tag corresponding to each network asset, wherein the marking instruction carries the asset identification tag.
10. The method according to claim 1, wherein said tagging at least one clustering result for the first time window to obtain an asset identification tag corresponding to each network asset comprises:
for each clustering result of the first time window, if the average value of the uplink flow ratios of all the network assets in the clustering result is less than or equal to a first ratio threshold, setting the asset identification tag corresponding to each network asset in the clustering result as a production network tag;
for each clustering result of the first time window, if the average value of the external network flow proportion of all the network assets in the clustering result is greater than or equal to a second proportion threshold, setting the asset identification tag corresponding to each network asset in the clustering result as an office network tag;
and aiming at each clustering result of the first time window, if the average value of the intranet flow ratios of all the network assets in the clustering result is greater than or equal to a third ratio threshold, setting the asset identification tag corresponding to each network asset in the clustering result as a development network tag.
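The threshold rules of claim 10 can be sketched as a function over the members of one cluster. The threshold values, the label strings, the dictionary keys, and the order in which the three rules are checked are all assumptions for this example; the claim does not state what happens when several rules match or when none does.

```python
def label_cluster(members, first_threshold=0.2, second_threshold=0.6, third_threshold=0.8):
    """Return an asset identification tag for one non-empty cluster, or None.

    `members` is a list of dicts with (hypothetical) keys
    "uplink_ratio", "extranet_ratio" and "intranet_ratio".
    """
    def mean(key):
        return sum(m[key] for m in members) / len(members)

    # Assumption: rules are checked in the order they appear in the claim.
    if mean("uplink_ratio") <= first_threshold:
        return "production"
    if mean("extranet_ratio") >= second_threshold:
        return "office"
    if mean("intranet_ratio") >= third_threshold:
        return "development"
    return None  # no rule matched; such a cluster could be labeled manually
```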
11. The method of network asset handling according to claim 1, said method further comprising:
acquiring a second traffic characteristic set corresponding to each network asset from the traffic database, wherein the second traffic characteristic set is extracted based on original traffic in a second time window;
counting a second traffic feature set corresponding to each network asset according to the K preset attribute types to obtain a second feature vector corresponding to each network asset, wherein the second feature vector comprises K second feature values, and each second feature value corresponds to one preset attribute type;
and if the distance between the first characteristic vector corresponding to the network asset and the first clustering center is greater than the distance between the second characteristic vector corresponding to the network asset and the second clustering center, removing the network asset from the clustering result corresponding to the first clustering center.
12. The method of network asset handling according to claim 11, said method further comprising:
and if the network assets are newly added network assets in the second time window, clustering each network asset according to a second feature vector corresponding to the network asset until the clustering stopping condition is met, and obtaining at least one clustering result aiming at the second time window.
13. The method according to any of claims 1 to 12, wherein after tagging the at least one clustering result for the first time window to obtain the asset identification tag corresponding to each network asset, the method further comprises:
if the asset identification tag corresponding to the network asset is a development network tag, triggering operation and maintenance alarm operation when the network asset accesses the network asset with the office network tag;
if the asset identification tag corresponding to the network asset is an office network tag, triggering operation and maintenance alarm operation when the network asset accesses the network asset with the production network tag;
and if the asset identification tag corresponding to the network asset is a production network tag, triggering operation and maintenance alarm operation when the network asset accesses the network asset with the office network tag.
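The three alarm rules of claim 13 can be expressed as a lookup over (accessing asset tag, accessed asset tag) pairs. The label strings and the `labels` mapping from IP address to tag are hypothetical names chosen for this sketch.

```python
# (tag of the accessing asset, tag of the accessed asset) pairs that raise an alarm.
ALARM_RULES = {
    ("development", "office"),
    ("office", "production"),
    ("production", "office"),
}


def should_alarm(labels, src_ip, dst_ip):
    """Return True if an access from src_ip to dst_ip should trigger an alarm.

    `labels` maps an asset IP address to its asset identification tag.
    Assets without a tag never match a rule.
    """
    return (labels.get(src_ip), labels.get(dst_ip)) in ALARM_RULES
```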
14. A network asset processing device, comprising:
an acquisition module, configured to acquire, from a traffic database, a first traffic characteristic set corresponding to each network asset in N network assets to be identified, wherein the first traffic characteristic set is obtained by extracting original traffic based on a first time window, and N is an integer greater than 1;
the statistical module is used for carrying out statistics on the first traffic feature set corresponding to each network asset according to K preset attribute types to obtain a first feature vector corresponding to each network asset, wherein the first feature vector comprises K first feature values, each first feature value corresponds to one preset attribute type, and K is an integer greater than or equal to 1;
the clustering module is used for clustering the N network assets according to the first characteristic vector corresponding to each network asset until a clustering stopping condition is met, and obtaining at least one clustering result aiming at the first time window;
and the marking module is used for marking at least one clustering result aiming at the first time window to obtain an asset identification tag corresponding to each network asset.
15. A computer device, comprising: a memory, a processor, and a bus system;
wherein the memory is used for storing programs;
the processor for executing the program in the memory, the processor for performing the network asset processing method of any of claims 1 to 13 according to instructions in program code;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
16. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the network asset processing method of any of claims 1 to 13.
17. A computer program product comprising a computer program and instructions, characterized in that the computer program/instructions, when executed by a processor, implement the network asset processing method according to any of claims 1 to 13.
CN202111424684.4A 2021-11-26 2021-11-26 Network asset processing method, device, equipment and storage medium Pending CN114301757A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111424684.4A CN114301757A (en) 2021-11-26 2021-11-26 Network asset processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114301757A (en) 2022-04-08

Family

ID=80965188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111424684.4A Pending CN114301757A (en) 2021-11-26 2021-11-26 Network asset processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114301757A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109167799A (en) * 2018-11-06 2019-01-08 北京华顺信安科技有限公司 A kind of vulnerability monitoring detection system for intelligent network information system
CN112260861A (en) * 2020-10-13 2021-01-22 上海奇甲信息科技有限公司 Network asset topology identification method based on flow perception
KR102244036B1 (en) * 2020-08-24 2021-04-23 주식회사 로그프레소 Method for Classifying Network Asset Using Network Flow data and Method for Detecting Threat to the Network Asset Classified by the Same Method
CN112929216A (en) * 2021-02-05 2021-06-08 深信服科技股份有限公司 Asset management method, device, equipment and readable storage medium
CN113554056A (en) * 2021-06-21 2021-10-26 杭州安恒信息技术股份有限公司 Network asset aggregation method, device, electronic device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李憧;刘鹏;蔡国庆;: "基于流量感知的动态网络资产监测研究", 信息安全研究, no. 06, 4 June 2020 (2020-06-04) *
王宸东;郭渊博;甄帅辉;杨威超;: "网络资产探测技术研究", 计算机科学, no. 12, 15 December 2018 (2018-12-15) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40070827; Country of ref document: HK)