CN113259402B - Method and device for determining abnormal network protocol address - Google Patents

Info

Publication number
CN113259402B
Authority
CN
China
Prior art keywords
target
sample
network protocol
protocol address
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110815253.4A
Other languages
Chinese (zh)
Other versions
CN113259402A (en
Inventor
王硕
徐凯波
孙泽懿
周星杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mininglamp Software System Co ltd filed Critical Beijing Mininglamp Software System Co ltd
Priority to CN202110815253.4A priority Critical patent/CN113259402B/en
Publication of CN113259402A publication Critical patent/CN113259402A/en
Application granted granted Critical
Publication of CN113259402B publication Critical patent/CN113259402B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Abstract

The application provides a method and a device for determining an abnormal network protocol (IP) address, and belongs to the technical field of intelligent marketing. The method comprises the following steps: parsing multiple types of characteristic fields from traffic logs through an advertisement traffic monitoring system, wherein each type of characteristic field comprises a plurality of user identifiers and the network protocol address corresponding to each user identifier; acquiring a plurality of target user identifiers that use a target network protocol address within the same time period; constructing a target map from the plurality of target user identifiers and the target network protocol address; inputting the target map into a target graph attention network to obtain tag data output by the target graph attention network; and, when the tag data indicates that the target network protocol address in the target map is abnormal, determining the target network protocol address to be an abnormal network protocol address. The method and the device help locate the network protocol addresses used by black-industry (organized fraud) groups.

Description

Method and device for determining abnormal network protocol address
Technical Field
The present application relates to the field of intelligent marketing technologies, and in particular, to a method and an apparatus for determining an abnormal network protocol address.
Background
As marketing and advertising services have developed, advertisers have frequently been attacked by the black industry (organized fraud operations), which seriously damages their interests. As the advertising anti-fraud industry improves, the black industry continually updates its cheating methods, for example by programmatically replacing device parameters, using proxy IPs, and refreshing cookies, which makes abnormal traffic harder to identify. However, because resources such as IPs and UUIDs (Universally Unique Identifiers) are limited, the black industry has to reuse IPs and UUIDs when fabricating fake traffic in bulk, and this reuse tends to form a cheating network.
In existing methods, a supervised machine learning model is generally trained on labeled traffic data to classify traffic as normal or abnormal. Such a model can only determine whether the traffic of a single user is abnormal, and has difficulty identifying a cheating group.
Disclosure of Invention
The embodiment of the application aims to provide a method and a device for determining an abnormal network protocol address, so as to solve the problem that a cheating group is difficult to determine. The specific technical scheme is as follows:
in a first aspect, a method for determining an abnormal network protocol address is provided, where the method includes:
analyzing a plurality of types of characteristic fields from a flow log through an advertisement flow monitoring system, wherein each type of characteristic field comprises a plurality of user identifications and network protocol addresses corresponding to the user identifications;
acquiring a plurality of target user identifications adopting target network protocol addresses in the same time period;
constructing a target map according to the target user identifications and the target network protocol addresses, wherein the target map comprises a plurality of target nodes, each target node indicates one target user identification, and each two target user identifications have an association relationship;
inputting the target map into a target graph attention network to obtain tag data output by the target graph attention network;
determining a target network protocol address in the target map as an abnormal network protocol address if the tag data indicates that the target network protocol address is abnormal.
Optionally, before inputting the target map into the target graph attention network, the method further comprises:
acquiring a sample map and an annotation result of the sample map, wherein the annotation result is used for indicating whether a sample network protocol address in the sample map is an abnormal network protocol address;
inputting the sample map into an initial map attention network to obtain an identification result output by the initial map attention network, wherein the identification result is used for indicating whether a sample network protocol address in the sample map is an abnormal network protocol address;
and under the condition that the labeling result is inconsistent with the identification result, adjusting the weight distributed by the initial graph attention network to different adjacent nodes to obtain the target graph attention network, wherein the identification result output by the target graph attention network is consistent with the labeling result.
Optionally, the adjusting the weights assigned by the initial graph attention network to different adjacent nodes includes:
obtaining a plurality of sample nodes in the sample graph, wherein each sample node indicates a sample user identifier, and a sample edge is arranged between any two sample nodes;
determining an association weight corresponding to each sample edge, wherein the association weight is used for indicating the degree of association between the sample nodes at the two ends of the sample edge;
adjusting the weights assigned by the initial graph attention network to different neighboring nodes based on a plurality of associated weights in the sample graph.
Optionally, the determining the associated weight corresponding to each sample edge includes:
determining a first sample node and a second sample node at two ends of the sample edge, wherein the first sample node corresponds to a first sample user identifier, and the second sample node corresponds to a second sample user identifier;
determining an average of the frequency with which the first sample user identifier uses traffic within a target time period and the frequency with which the second sample user identifier uses traffic within the target time period;
and taking the average value as an edge weight of the sample edge, wherein the edge weight is used for indicating an association weight between the first sample user identifier and the second sample user identifier.
Optionally, the obtaining of the labeling result of the sample map includes:
taking a sample node carrying abnormal traffic in the sample map as a first node, wherein each sample node in the sample map carries traffic;
labeling a second node in the sample graph by adopting a label propagation scheme, wherein the similarity between the second node and the first node is greater than a similarity threshold value;
and determining that the sample network protocol address in the sample map is an abnormal network protocol address when the sum of the number of the first nodes and the number of the second nodes is greater than a preset threshold value.
Optionally, the network protocol addresses include public network protocol addresses and personal network protocol addresses, and before acquiring the plurality of target user identifications using the target network protocol address in the same time period, the method further includes:
determining a public network protocol address in the network protocol addresses through stored public network protocol addresses in a preset database;
removing a public network protocol address in the network protocol address by using preset data;
and selecting one personal network protocol address from the plurality of personal network protocol addresses as the target network protocol address.
Optionally, after the advertisement traffic monitoring system analyzes the multiple types of feature fields from the traffic log, the method further includes:
traversing each type of feature field as follows:
identifying a field type of a target characteristic field in the traffic log and a missing field in the target characteristic field, wherein the target characteristic field corresponds to a target user identifier;
determining a target filling mode of the target characteristic field according to the field type;
and performing data filling on the missing field by adopting the target filling mode.
Optionally, after determining the target network protocol address as an abnormal network protocol address, the method further includes:
determining a target attribute type corresponding to the target node, wherein the target node corresponds to a plurality of attribute types, each attribute type comprises at least one sub-attribute type, the occurrence frequency of different sub-attribute types is not completely the same, and the target attribute type is one of the attribute types;
determining a sub-attribute type with the highest occurrence frequency in the target attribute types;
taking the target attribute corresponding to the sub-attribute type with the highest occurrence frequency as the target attribute corresponding to the target attribute type;
and taking all target attributes of the target node as the node attributes of the target node.
In a second aspect, an apparatus for determining an abnormal network protocol address is provided, the apparatus comprising:
the analysis module is used for analyzing a plurality of types of characteristic fields from a flow log through an advertisement flow monitoring system, wherein each type of characteristic field comprises a plurality of user identifications and network protocol addresses corresponding to the user identifications;
the acquisition module is used for acquiring a plurality of target user identifications adopting target network protocol addresses in the same time period;
a building module, configured to build a target graph according to the target user identifiers and the target network protocol addresses, where the target graph includes a plurality of target nodes, each target node indicates one target user identifier, and every two target user identifiers have an association relationship;
the input module is used for inputting the target map into a target graph attention network to obtain tag data output by the target graph attention network;
a determining module, configured to determine, when the tag data indicates that a target network protocol address in the target graph is abnormal, the target network protocol address as an abnormal network protocol address.
In a third aspect, an electronic device is provided, which includes a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing the steps of the method for determining the abnormal network protocol address when executing the program stored in the memory.
In a fourth aspect, a computer-readable storage medium is provided, in which a computer program is stored, which computer program, when being executed by a processor, carries out any of the above method steps for determining an abnormal network protocol address.
The embodiment of the application has the following beneficial effects:
the embodiment of the application is used for predicting and optimizing in the technical field of marketing intelligence, and provides a method for determining an abnormal network protocol address.
In the application, the server determines a plurality of target user identifiers that use the target network protocol address, constructs the target map from those identifiers and the address, and then obtains a label for the target map that indicates whether the target network protocol address in the map is abnormal. The server determines the abnormal network protocol address from this label. Because an abnormal network protocol address is highly likely to belong to a black-industry group, the method helps locate such addresses, combat black-industry groups, and safeguard the normal operation of marketing services.
Of course, not all of the above advantages need be achieved in the practice of any one product or method of the present application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
Fig. 1 is a flowchart of a method for determining an abnormal IP address according to an embodiment of the present disclosure;
Fig. 2 is a schematic structural diagram of an apparatus for determining an abnormal IP address according to an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only for the convenience of description of the present application, and have no specific meaning in themselves. Thus, "module" and "component" may be used in a mixture.
The embodiment of the application provides a method for determining an abnormal network protocol address, which can be applied to a server and is used for determining a target network protocol address as the abnormal network protocol address.
The following describes in detail a method for determining an abnormal network protocol address provided in an embodiment of the present application with reference to a specific embodiment, and as shown in fig. 1, the specific steps are as follows:
Step 101: Parse multiple types of characteristic fields from the traffic logs through an advertisement traffic monitoring system.
Each type of characteristic field comprises a plurality of user identifications and network protocol addresses corresponding to the user identifications.
In the embodiment of the application, a user generates traffic logs while using traffic, and each traffic log contains the characteristic fields of the user who generated it. Traffic logs are returned by media: the same medium may return multiple traffic logs, but the types of traffic logs returned by different media differ. Because the characteristic fields reside in the traffic logs, they are likewise divided into multiple categories. A characteristic field includes a user identifier and the network protocol address corresponding to that user identifier, where the user identifier uniquely identifies the user and may be a code allocated to the user by the server.
After the server obtains the traffic logs returned by different media, it parses multiple types of characteristic fields from them through the advertisement traffic monitoring system, and determines the user identifiers contained in each type of characteristic field and the network protocol address corresponding to each user identifier.
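As a minimal sketch of Step 101, the snippet below normalizes logs from two media into a common record shape. The JSON-lines format, the media names, and every field name (`uid`, `client_ip`, and so on) are hypothetical; the source does not specify any log format.

```python
import json

# Hypothetical per-medium field names; the source does not define log formats.
FIELD_ALIASES = {
    "media_a": {"user_id": "uid", "ip": "client_ip", "ts": "ts"},
    "media_b": {"user_id": "uuid", "ip": "ip", "ts": "time"},
}

def parse_traffic_log(lines, medium):
    """Extract user identifier, IP address, and timestamp from one medium's log."""
    aliases = FIELD_ALIASES[medium]
    records = []
    for line in lines:
        raw = json.loads(line)
        records.append({
            "user_id": raw.get(aliases["user_id"]),  # None if the field is missing
            "ip": raw.get(aliases["ip"]),
            "ts": raw.get(aliases["ts"]),
            "medium": medium,
        })
    return records

log_a = ['{"uid": "u1", "client_ip": "203.0.113.7", "ts": 1000}']
log_b = ['{"uuid": "u2", "ip": "203.0.113.7", "time": 1030}']
records = parse_traffic_log(log_a, "media_a") + parse_traffic_log(log_b, "media_b")
```

Missing fields come back as `None`, which leaves room for the filling step described later in the description.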
Step 102: a plurality of target user identifications using target network protocol addresses within the same time period are obtained.
In the embodiment of the application, a plurality of users can adopt the same network protocol address. After obtaining the multiple user identifications and the multiple network protocol addresses, the server determines a target network protocol address, and then selects the multiple target user identifications adopting the target network protocol address in the same time period from the multiple user identifications.
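Step 102 can be sketched as a grouping of user identifiers by IP address and time window. The record layout and the one-hour window length are assumptions made for illustration; the source only says "the same time period".

```python
from collections import defaultdict

def users_by_ip_and_window(records, window_seconds=3600):
    """Group user identifiers by (IP address, time-window index)."""
    groups = defaultdict(set)
    for r in records:
        window = r["ts"] // window_seconds  # integer index of the time window
        groups[(r["ip"], window)].add(r["user_id"])
    return groups

def target_users(records, target_ip, window, window_seconds=3600):
    """Return the user identifiers that used target_ip within the given window."""
    groups = users_by_ip_and_window(records, window_seconds)
    return groups.get((target_ip, window), set())

sample_records = [
    {"user_id": "u1", "ip": "203.0.113.7", "ts": 100},
    {"user_id": "u2", "ip": "203.0.113.7", "ts": 200},
    {"user_id": "u3", "ip": "203.0.113.7", "ts": 4000},   # falls in the next window
    {"user_id": "u4", "ip": "198.51.100.2", "ts": 150},   # different address
]
same_window_users = target_users(sample_records, "203.0.113.7", window=0)
```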
Step 103: Construct a target map from the plurality of target user identifications and the target network protocol address.
The target graph comprises a plurality of target nodes, each target node indicates a target user identifier, and every two target nodes have an association relation.
In the embodiment of the present application, every target user identifier in the target graph uses the same target network protocol address, which means that any two target user identifiers are associated. The target graph is constructed from the plurality of target user identifiers and the target network protocol address; illustratively, the target graph is a bipartite graph. Each target node in the target graph represents one target user identifier, and an edge between every two target nodes represents the association between the two nodes at its ends.
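The node-and-edge structure described for Step 103 can be sketched as follows: one node per target user identifier and one edge between every pair of nodes. The dictionary layout and the function name are illustrative assumptions, not a structure given in the source.

```python
from itertools import combinations

def build_target_graph(target_ip, user_ids):
    """Build a graph with one node per user identifier and an edge per node pair."""
    nodes = sorted(user_ids)
    edges = list(combinations(nodes, 2))  # every two nodes are connected
    return {"ip": target_ip, "nodes": nodes, "edges": edges}

graph = build_target_graph("203.0.113.7", {"u1", "u2", "u3"})
```

With n user identifiers this yields n(n-1)/2 edges, so very large groups may need sampling or pruning in practice.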
Step 104: Input the target map into the target graph attention network to obtain the tag data output by the target graph attention network.
The server inputs the target map into the target graph attention network to obtain tag data output by the target graph attention network, wherein the tag data indicates whether the target network protocol address in the target map is abnormal. Illustratively, tag data of 1 indicates that the target network protocol address in the target map is abnormal, and tag data of 0 indicates that it is normal.
Step 105: in the event that the tag data indicates that the target network protocol address in the target map is abnormal, determining the target network protocol address as an abnormal network protocol address.
If the server determines that the tag data indicates that the target network protocol address in the target map is abnormal, the server determines the target network protocol address as an abnormal network protocol address, and a network corresponding to the abnormal network protocol address is an abnormal network.
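Steps 104 and 105 amount to running each graph through a classifier and collecting the addresses tagged 1. In the sketch below, `stub_predict` is a hypothetical placeholder rule standing in for the trained graph attention network; it is not the model described in the source.

```python
def find_abnormal_ips(graphs, predict):
    """Return the IP address of every graph whose tag data is 1 (abnormal)."""
    abnormal = []
    for g in graphs:
        if predict(g) == 1:
            abnormal.append(g["ip"])
    return abnormal

def stub_predict(graph):
    # Placeholder rule (an assumption): tag a graph abnormal when it has
    # at least three user nodes. A real system would call the trained network.
    return 1 if len(graph["nodes"]) >= 3 else 0

candidate_graphs = [
    {"ip": "203.0.113.7", "nodes": ["u1", "u2", "u3"]},
    {"ip": "198.51.100.2", "nodes": ["u9"]},
]
abnormal_ips = find_abnormal_ips(candidate_graphs, stub_predict)
```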
In the application, the server determines a plurality of target user identifiers that use the target network protocol address, constructs the target map from those identifiers and the address, and then obtains a label for the target map that indicates whether the target network protocol address in the map is abnormal. The server determines the abnormal network protocol address from this label. Because an abnormal network protocol address is highly likely to belong to a black-industry group, the method helps locate such addresses, combat black-industry groups, and safeguard the normal operation of marketing services.
As an optional implementation, before inputting the target map into the target graph attention network, the method further includes: acquiring a sample map and a labeling result of the sample map, wherein the labeling result is used for indicating whether a sample network protocol address in the sample map is an abnormal network protocol address; inputting the sample map into the initial graph attention network to obtain an identification result output by the initial graph attention network; and under the condition that the labeling result is inconsistent with the identification result, adjusting the weight distributed by the initial graph attention network to different adjacent nodes to obtain a target graph attention network, wherein the identification result output by the target graph attention network is consistent with the labeling result.
In the embodiment of the application, the server acquires a sample map and a labeling result of the sample map, wherein the labeling result indicates whether the sample network protocol address in the sample map is an abnormal network protocol address. The server inputs the sample map into the initial graph attention network to obtain the identification result output by the initial graph attention network, which likewise indicates whether the sample network protocol address in the sample map is an abnormal network protocol address. If the server determines that the labeling result is inconsistent with the identification result, it adjusts the weights assigned by the initial graph attention network to different adjacent nodes until the two results are consistent, and takes the initial graph attention network with the adjusted weights as the target graph attention network.
As an alternative embodiment, adjusting the weights assigned by the initial graph attention network to different neighboring nodes includes: obtaining a plurality of sample nodes in a sample graph, wherein each sample node indicates a sample user identifier, and a sample edge is arranged between any two sample nodes; determining an association weight corresponding to each sample edge, wherein the association weight is used for indicating the association weight between sample nodes at two ends of each sample edge; the weights assigned by the initial graph attention network to the different neighboring nodes are adjusted based on the plurality of associated weights in the sample graph.
In the embodiment of the application, the sample graph comprises a sample network protocol address and a plurality of sample user identifiers corresponding to it. Each sample user identifier corresponds to one sample node, every two sample nodes are connected by one sample edge, and each sample edge has a weight. The weight of a sample edge is determined as follows: the server determines the first sample node and the second sample node at the two ends of the sample edge, determines the frequency with which the first sample user identifier (corresponding to the first sample node) uses traffic within the target time period and the frequency with which the second sample user identifier (corresponding to the second sample node) uses traffic within the same period, and takes the average of the two frequencies as the edge weight of the sample edge. This weight is the degree of association between the first and second sample user identifiers: the greater the weight, the higher the degree of association. The server obtains the association weight between any two sample user identifiers in this way. When the identification result output by the initial graph attention network is inconsistent with the labeling result, the server adjusts the weights between different adjacent nodes based on the plurality of association weights of the target graph, thereby obtaining the target graph attention network.
In this method, the server determines the association weights among the sample user identifiers and, based on them, assigns different weights to different neighbor nodes in the initial graph attention network. More attention can thus be given to the neighbor nodes with greater influence, which improves the accuracy of the node representations, yields a more accurate target graph attention network, and improves the accuracy of the output labels. In addition, because the weights of the initial graph attention network are adjusted based on the plurality of association weights in the target graph, the direction of adjustment fits the target graph more closely, which improves the efficiency of converting the initial graph attention network into the target graph attention network.
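The edge-weight rule above (the average of the two endpoint users' traffic frequencies in the target time period) can be sketched as follows. Representing the period's traffic as a flat list of user identifiers, one entry per traffic event, is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations

def edge_weights(events, user_ids):
    """Weight each edge by the mean traffic frequency of its two endpoint users.

    events: user identifiers, one entry per traffic event in the target period.
    """
    freq = Counter(uid for uid in events if uid in user_ids)
    return {
        (a, b): (freq[a] + freq[b]) / 2
        for a, b in combinations(sorted(user_ids), 2)
    }

period_events = ["u1", "u1", "u1", "u2", "u3", "u3"]
weights = edge_weights(period_events, {"u1", "u2", "u3"})
```

Here `u1` appears three times and `u2` once, so the edge between them gets weight 2.0.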
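To illustrate how association weights could translate into more attention for more influential neighbors, the sketch below normalizes a node's incident edge weights with a softmax. This is a simplified stand-in chosen for illustration: in a trained graph attention network the attention coefficients are learned from node features, and the source does not state the exact formula it uses.

```python
import math

def neighbor_attention(node, weights):
    """Softmax-normalize the association weights of the edges incident to node."""
    scores = {}
    for (a, b), w in weights.items():
        if a == node:
            scores[b] = w
        elif b == node:
            scores[a] = w
    if not scores:
        return {}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {n: math.exp(s - top) for n, s in scores.items()}
    total = sum(exps.values())
    return {n: e / total for n, e in exps.items()}

assoc = {("u1", "u2"): 2.0, ("u1", "u3"): 2.5, ("u2", "u3"): 1.5}
attention = neighbor_attention("u1", assoc)
```

The coefficients sum to 1, and the neighbor joined by the heavier edge (`u3`) receives the larger share.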
As an alternative embodiment, obtaining the labeling result of the sample map includes: taking a sample node carrying abnormal traffic in a sample map as a first node, wherein each sample node in the sample map carries traffic; labeling a second node in the sample map by adopting a label propagation scheme, wherein the similarity between the second node and the first node is greater than a similarity threshold value; and under the condition that the sum of the number of the first nodes and the number of the second nodes is larger than a preset threshold value, determining that the sample network protocol address in the sample map is an abnormal network protocol address.
In the field of advertisement anti-fraud, labeling data is costly and uncertain, and it is difficult to label all abnormal traffic manually. The embodiment of the application therefore uses a label propagation scheme for semi-supervised labeling. The specific process is as follows: the server takes the sample nodes carrying abnormal traffic in the sample graph as first nodes, of which there may be one or more, and then determines the second nodes in the sample graph whose similarity to a first node is greater than a similarity threshold. The similarity can be derived from the weight between the first node and the second node: the higher the weight, the higher the similarity.
The server determines the sum of the number of first nodes and the number of second nodes; if the sum is greater than a preset threshold, the sample network protocol address in the sample map is determined to be an abnormal network protocol address. For example, the preset threshold may be half the number of nodes in the sample graph. In addition, the server can apply a preset scheme to correct the labels, improving label accuracy.
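The labeling procedure can be sketched as a single propagation pass followed by the count check. Using the edge weight directly as the similarity, propagating only one hop, and defaulting the count threshold to half the node count are simplifying assumptions; a full label propagation algorithm would iterate until the labels stop changing.

```python
def label_sample_graph(nodes, weights, first_nodes, sim_threshold,
                       count_threshold=None):
    """Return (is_abnormal, second_nodes) for one sample graph."""
    first = set(first_nodes)
    second = set()
    for (a, b), w in weights.items():
        if w > sim_threshold:  # edge weight used as the similarity (assumption)
            if a in first and b not in first:
                second.add(b)
            elif b in first and a not in first:
                second.add(a)
    if count_threshold is None:
        count_threshold = len(nodes) / 2  # e.g. half the nodes in the graph
    is_abnormal = len(first) + len(second) > count_threshold
    return is_abnormal, second

sample_nodes = ["u1", "u2", "u3", "u4"]
sample_weights = {
    ("u1", "u2"): 0.9, ("u1", "u3"): 0.2, ("u1", "u4"): 0.8,
    ("u2", "u3"): 0.1, ("u2", "u4"): 0.5, ("u3", "u4"): 0.3,
}
is_abnormal, second_nodes = label_sample_graph(
    sample_nodes, sample_weights, first_nodes={"u1"}, sim_threshold=0.6
)
```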
In the application, the server adopts a label propagation scheme to carry out semi-supervised label marking, so that the marking cost can be saved, the marking quality is improved and the marking efficiency is improved compared with manual marking.
As an optional implementation manner, the network protocol address includes a public network protocol address and a personal network protocol address, before acquiring a plurality of target user identifications using the target network protocol address in the same time period, the method further includes: determining a public network protocol address in the network protocol address through a stored public network protocol address in a preset database; removing a public network protocol address in the network protocol address by using preset data; and selecting one personal network protocol address from the plurality of personal network protocol addresses as a target network protocol address.
In the embodiment of the application, the network protocol addresses include public network protocol addresses and personal network protocol addresses. A public network protocol address is one shared by many users, for example the address of a residential community or a business center. A plurality of public network protocol addresses are stored in a database in advance; the server compares the network protocol addresses obtained from the traffic logs with the public network protocol addresses in the database to identify which are public. The server then removes the public network protocol addresses using third-party data and its own accumulated data, leaving a plurality of personal network protocol addresses, and takes one of them as the target network protocol address.
A public network protocol address is very unlikely to be an abnormal network protocol address, and keeping such addresses would interfere with the server's judgment. Public network protocol addresses are therefore removed to prevent users from being wrongly associated and to improve the accuracy with which the server identifies abnormal network protocol addresses.
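The filtering step can be sketched as a set lookup against the stored public addresses. The in-memory set stands in for the preset database, and the example addresses are placeholders from documentation ranges.

```python
def split_addresses(addresses, public_db):
    """Separate addresses into (public, personal) using a set of known public IPs."""
    unique = list(dict.fromkeys(addresses))  # drop duplicates, keep first-seen order
    public = [ip for ip in unique if ip in public_db]
    personal = [ip for ip in unique if ip not in public_db]
    return public, personal

public_db = {"192.0.2.10"}  # hypothetical stored public addresses
seen = ["203.0.113.7", "192.0.2.10", "198.51.100.2", "203.0.113.7"]
public_ips, personal_ips = split_addresses(seen, public_db)

# Each remaining personal address can be taken in turn as the target address.
target_ip = personal_ips[0]
```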
As an optional implementation, after the advertisement traffic monitoring system analyzes the multiple types of feature fields from the traffic log, the method further includes: traversing each type of feature field as follows: identifying field types of target characteristic fields in the flow logs and missing fields in the target characteristic fields; determining a target filling mode of a target characteristic field according to the field type; and filling data in the missing field by adopting a target filling mode.
In the embodiment of the application, the types (formats) of the traffic logs returned by different media differ, that is, the types of characteristic fields in the traffic logs are not all the same. If all types of characteristic fields are parsed uniformly, missing fields often occur, and these must be filled so that the target map can be constructed later. The server traverses each type of characteristic field as follows: each type comprises a plurality of characteristic fields, and the server takes one of them as the target characteristic field and determines its field type and its missing fields. Different field types correspond to different filling modes, so the server determines the target filling mode corresponding to the field type of the target characteristic field and uses it to fill the missing fields. Exemplary filling modes include, but are not limited to: filling the os field with the mode or with "unk", and filling the number of seconds at which the request was received with the average.
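One reading of the filling modes mentioned above is sketched below: categorical fields such as `os` are filled with the mode (or `"unk"` when no value is present), and numeric fields with the mean. The field names and the two type labels are assumptions made for illustration.

```python
from statistics import mean, mode

def fill_missing(records, field, field_type):
    """Fill None values of one field using a type-dependent filling mode."""
    present = [r[field] for r in records if r.get(field) is not None]
    if field_type == "categorical":
        fill = mode(present) if present else "unk"
    elif field_type == "numeric":
        fill = mean(present) if present else None  # nothing to average
    else:
        raise ValueError(f"unknown field type: {field_type}")
    return [
        {**r, field: r.get(field) if r.get(field) is not None else fill}
        for r in records
    ]

rows = [
    {"os": "android", "seconds": 10},
    {"os": "android", "seconds": None},
    {"os": None, "seconds": 20},
    {"os": "ios", "seconds": 30},
]
rows = fill_missing(rows, "os", "categorical")
rows = fill_missing(rows, "seconds", "numeric")
```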
As an optional implementation manner, after determining the target network protocol address as the abnormal network protocol address, the method further includes: determining a target attribute type corresponding to a target node, wherein the target node corresponds to a plurality of attribute types, each attribute type comprises at least one sub-attribute type, the occurrence frequency of different sub-attribute types is not completely the same, and the target attribute type is one of the attribute types; determining a sub-attribute type with the highest occurrence frequency in the target attribute types; taking the target attribute corresponding to the sub-attribute type with the highest occurrence frequency as the target attribute corresponding to the target attribute type; and taking all target attributes of the target node as the node attributes of the target node.
Each target node corresponds to a target user identifier and to a plurality of attribute types. The attribute types may be the device model, device system type, device version, user position, advertisement position and the like corresponding to the user. Each attribute type also includes at least one sub-attribute type. For example, the device model includes model A and model B, and model A and model B are each a sub-attribute type.
The occurrence frequencies of different sub-attribute types are not completely the same, and the server takes the sub-attribute type with the highest occurrence frequency as the target attribute corresponding to the target attribute type. In this way, the server selects one target attribute for each attribute type and takes all the target attributes as the node attributes of the target node.
For example, the device model includes model A and model B, and the user position includes position C and position D, each of which is a sub-attribute type. If model A and position C occur most frequently, the server selects model A as the device model adopted by the user and position C as the position of the user, and then takes model A and position C as the node attributes of the target node corresponding to the user.
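The selection of node attributes by occurrence frequency can be sketched as below. The record layout (one dictionary per log entry of a user, keyed by attribute type) and the function name `node_attributes` are assumptions made for illustration.

```python
from collections import Counter


def node_attributes(records):
    """Pick, for each attribute type, the most frequent sub-attribute value.

    `records` is a list of dicts, one per log entry of the same user.
    """
    counts = {}
    for record in records:
        for attr_type, value in record.items():
            counts.setdefault(attr_type, Counter())[value] += 1
    return {
        attr_type: counter.most_common(1)[0][0]
        for attr_type, counter in counts.items()
    }


# Hypothetical log entries for one user.
records = [
    {"device_model": "A", "user_position": "C"},
    {"device_model": "A", "user_position": "D"},
    {"device_model": "B", "user_position": "C"},
]
attrs = node_attributes(records)
```

Here model A and position C each occur twice, so they become the node attributes.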
After determining that the tag data indicates that the target network protocol address in the target graph is abnormal, the server also determines the node attribute of each target node in the target graph, so that the feature information of the graph is enriched.
Optionally, an embodiment of the present application further provides a processing flow chart for determining an abnormal network protocol address, where the specific steps are as follows.
Step 1: The server takes the user identifiers that share the same sample network protocol address as sample user identifiers and establishes association relationships between the sample user identifiers.
Step 2: The server constructs a sample map based on the sample network protocol address and the sample user identifiers that have an association relationship.
Step 3: The server labels the sample map by adopting a label propagation scheme.
Step 4: The server trains the initial graph attention network with the labeled sample map, and adjusts the weight of each neighbor node in the initial graph attention network based on a plurality of association weights in the sample map to obtain the target graph attention network.
Step 5: The server inputs the target map into the target graph attention network to obtain the label output by the target graph attention network.
Step 6: The server determines, according to the label, that the target network protocol address in the target map is an abnormal network protocol address.
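Steps 1 and 2 of the flow above can be sketched as follows, assuming that the parsed log for one time period is available as `(user_id, address)` pairs. The function name `build_graphs` and the dictionary layout of a map are hypothetical; training and inference with the graph attention network (steps 4 to 6) are not shown.

```python
from collections import defaultdict
from itertools import combinations


def build_graphs(log_rows):
    """Group user identifiers by address and connect every pair of them.

    `log_rows` is an iterable of (user_id, address) pairs from one time period.
    Returns one map per address: its nodes and its pairwise edges.
    """
    users_by_address = defaultdict(set)
    for user_id, address in log_rows:
        users_by_address[address].add(user_id)

    graphs = {}
    for address, users in users_by_address.items():
        nodes = sorted(users)
        # Every two user identifiers on the same address are associated.
        edges = list(combinations(nodes, 2))
        graphs[address] = {"nodes": nodes, "edges": edges}
    return graphs


# Hypothetical log rows.
rows = [
    ("u1", "192.0.2.1"),
    ("u2", "192.0.2.1"),
    ("u3", "192.0.2.1"),
    ("u4", "192.0.2.2"),
    ("u1", "192.0.2.1"),
]
graphs = build_graphs(rows)
```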
Based on the same technical concept, an embodiment of the present application further provides an apparatus for determining an abnormal network protocol address, as shown in fig. 2, the apparatus includes:
the analysis module 201 is configured to analyze multiple types of feature fields from a traffic log through an advertisement traffic monitoring system, where each type of feature field includes multiple user identifiers and a network protocol address corresponding to each user identifier;
a first obtaining module 202, configured to obtain multiple target user identifiers using target network protocol addresses in the same time period;
a constructing module 203, configured to construct a target graph according to multiple target user identifiers and target network protocol addresses, where the target graph includes multiple target nodes, each target node indicates one target user identifier, and every two target user identifiers have an association relationship;
the first input/output module 204 is configured to input the target map into the target map attention network to obtain tag data output by the target map attention network;
a first determining module 205, configured to determine the target network protocol address as an abnormal network protocol address if the tag data indicates that the target network protocol address in the target graph is abnormal.
Optionally, the apparatus further comprises:
a second obtaining module, configured to obtain a sample map and a labeling result of the sample map, where the labeling result indicates whether a sample network protocol address in the sample map is an abnormal network protocol address;
a second input/output module, configured to input the sample map into the initial graph attention network to obtain an identification result output by the initial graph attention network, where the identification result indicates whether the sample network protocol address in the sample map is an abnormal network protocol address;
an adjusting module, configured to, when the labeling result is inconsistent with the identification result, adjust the weights assigned by the initial graph attention network to different neighboring nodes to obtain the target graph attention network, where the identification result output by the target graph attention network is consistent with the labeling result.
Optionally, the adjusting module comprises:
an obtaining unit, configured to obtain a plurality of sample nodes in the sample map, where each sample node indicates a sample user identifier and a sample edge is arranged between any two sample nodes;
a determining unit, configured to determine an association weight corresponding to each sample edge, where the association weight indicates the degree of association between the sample nodes at the two ends of the sample edge;
an adjusting unit, configured to adjust the weights assigned by the initial graph attention network to different neighboring nodes based on the plurality of association weights in the sample map.
Optionally, the determining unit includes:
the first determining subunit is configured to determine a first sample node and a second sample node at two ends of a sample edge, where the first sample node corresponds to a first sample user identifier, and the second sample node corresponds to a second sample user identifier;
a second determining subunit, configured to determine an average value of the frequency of using the traffic in the target time period by the first sample user identifier and the frequency of using the traffic in the target time period by the second sample user identifier;
a serving subunit, configured to take the average value as the edge weight of the sample edge, where the edge weight indicates the association weight between the first sample user identifier and the second sample user identifier.
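The edge-weight computation performed by these subunits amounts to a simple average, sketched below. The mapping `freq_by_user` (how often each sample user identifier used traffic within the target time period) and the function names are assumptions for illustration.

```python
def edge_weight(freq_by_user, user_a, user_b):
    """Average of the two users' traffic-usage frequencies in the period."""
    return (freq_by_user[user_a] + freq_by_user[user_b]) / 2


def edge_weights(edges, freq_by_user):
    """Compute the association weight for every sample edge."""
    return {(a, b): edge_weight(freq_by_user, a, b) for a, b in edges}


# Hypothetical frequencies for three sample user identifiers.
freq_by_user = {"u1": 4, "u2": 10, "u3": 6}
weights = edge_weights([("u1", "u2"), ("u1", "u3"), ("u2", "u3")], freq_by_user)
```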
Optionally, the second obtaining module includes:
a serving unit, configured to take a sample node carrying abnormal traffic in the sample map as a first node, where each sample node in the sample map carries traffic;
the labeling unit is used for labeling a second node in the sample map by adopting a label propagation scheme, wherein the similarity between the second node and the first node is greater than a similarity threshold value;
and the second determining unit is used for determining the sample network protocol address in the sample map as the abnormal network protocol address under the condition that the sum of the number of the first nodes and the number of the second nodes is greater than a preset threshold value.
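The labeling logic of these units can be sketched as below. This is a deliberately simplified, single-pass stand-in for a label propagation scheme: the application does not specify the similarity measure or the propagation procedure, so the `similarity` callable, both thresholds, and the function name `label_sample_graph` are all assumptions.

```python
def label_sample_graph(nodes, abnormal, similarity, sim_threshold, count_threshold):
    """Label a sample map and decide whether its address is abnormal.

    First nodes carry abnormal traffic; second nodes are other nodes whose
    similarity to some first node exceeds sim_threshold. The address is
    abnormal when the two counts together exceed count_threshold.
    """
    first = [n for n in nodes if n in abnormal]
    second = [
        n for n in nodes
        if n not in abnormal
        and any(similarity(n, f) > sim_threshold for f in first)
    ]
    is_abnormal = len(first) + len(second) > count_threshold
    return first, second, is_abnormal


# Hypothetical pairwise similarity scores.
scores = {frozenset(("u1", "u2")): 0.9, frozenset(("u1", "u3")): 0.2}


def similarity(a, b):
    return scores.get(frozenset((a, b)), 0.0)


first, second, flagged = label_sample_graph(
    ["u1", "u2", "u3"], {"u1"}, similarity, sim_threshold=0.5, count_threshold=1
)
```

With one first node and one second node, the count of 2 exceeds the threshold of 1, so the sample address is marked abnormal in this example.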
Optionally, the network protocol address includes a public network protocol address and a personal network protocol address, and the apparatus further includes:
the second determining module is used for determining the public network protocol address in the network protocol address through the stored public network protocol address in the preset database;
the removing module is used for removing the public network protocol address in the network protocol address by using the preset data;
and the selecting module is used for selecting one personal network protocol address from the plurality of personal network protocol addresses as a target network protocol address.
Optionally, the apparatus further comprises the following modules, which are applied while traversing each type of feature field:
the identification module is used for identifying the field type of a target characteristic field in the flow log and a missing field in the target characteristic field, wherein the target characteristic field corresponds to a target user identifier;
the third determining module is used for determining a target filling mode of the target characteristic field according to the field type;
and the filling module is used for filling data in the missing fields by adopting a target filling mode.
Optionally, the apparatus further comprises:
a fourth determining module, configured to determine a target attribute type corresponding to the target node, where the target node corresponds to multiple attribute types, each attribute type includes at least one sub-attribute type, the occurrence frequencies of different sub-attribute types are not completely the same, and the target attribute type is one of the multiple attribute types;
the fifth determining module is used for determining the sub-attribute type with the highest frequency of occurrence in the target attribute types;
a first serving module, configured to take the target attribute corresponding to the sub-attribute type with the highest occurrence frequency as the target attribute corresponding to the target attribute type;
a second serving module, configured to take all target attributes of the target node as the node attributes of the target node.
According to another aspect of the embodiments of the present application, there is provided an electronic device, as shown in fig. 3, including a memory 303, a processor 301, a communication interface 302, and a communication bus 304, where a computer program operable on the processor 301 is stored in the memory 303, the memory 303 and the processor 301 communicate with each other through the communication interface 302 and the communication bus 304, and the processor 301 implements the steps of the method when executing the computer program.
The memory and the processor in the electronic device communicate with the communication interface through the communication bus. The communication bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc.
The Memory may include a Random Access Memory (RAM) or a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
There is also provided, in accordance with yet another aspect of an embodiment of the present application, a computer-readable medium having non-volatile program code executable by a processor.
Optionally, in an embodiment of the present application, a computer readable medium is configured to store program code for the processor to execute the above method.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments, and this embodiment is not described herein again.
When the embodiments of the present application are specifically implemented, reference may be made to the above embodiments, and corresponding technical effects are achieved.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the Processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units configured to perform the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented by means of units performing the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative; the division of the modules is merely a logical division, and in actual implementation there may be other divisions: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the shown or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
It is noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

1. A method for determining an abnormal network protocol address, the method comprising:
analyzing a plurality of types of characteristic fields from a flow log through an advertisement flow monitoring system, wherein each type of characteristic field comprises a plurality of user identifications and network protocol addresses corresponding to the user identifications;
acquiring a plurality of target user identifications adopting target network protocol addresses in the same time period;
constructing a target map according to the target user identifications and the target network protocol addresses, wherein the target map comprises a plurality of target nodes, each target node indicates one target user identification, and each two target user identifications have an association relationship;
inputting the target map into a target map attention network to obtain tag data output by the target map attention network;
determining a target network protocol address in the target map as an abnormal network protocol address if the tag data indicates that the target network protocol address is abnormal;
wherein the target graph attention network is obtained by:
acquiring a sample map and an annotation result of the sample map, wherein the annotation result is used for indicating whether a sample network protocol address in the sample map is an abnormal network protocol address;
inputting the sample map into an initial map attention network to obtain an identification result output by the initial map attention network, wherein the identification result is used for indicating whether a sample network protocol address in the sample map is an abnormal network protocol address;
if the labeling result is inconsistent with the identification result, the weights distributed by the initial graph attention network to different adjacent nodes are adjusted until the labeling result is consistent with the identification result, and the initial graph attention network after the weights are adjusted is determined to be the target graph attention network.
2. The method of claim 1, wherein before inputting the target map into the target map attention network, the method further comprises:
acquiring a sample map and an annotation result of the sample map, wherein the annotation result is used for indicating whether a sample network protocol address in the sample map is an abnormal network protocol address;
inputting the sample map into an initial map attention network to obtain an identification result output by the initial map attention network, wherein the identification result is used for indicating whether a sample network protocol address in the sample map is an abnormal network protocol address;
and under the condition that the labeling result is inconsistent with the identification result, adjusting the weight distributed by the initial graph attention network to different adjacent nodes to obtain the target graph attention network, wherein the identification result output by the target graph attention network is consistent with the labeling result.
3. The method of claim 2, wherein the adjusting the weights assigned by the initial graph attention network to different neighboring nodes comprises:
obtaining a plurality of sample nodes in the sample graph, wherein each sample node indicates a sample user identifier, and a sample edge is arranged between any two sample nodes;
determining an association weight corresponding to each sample edge, wherein the association weight is used for indicating the association weight between sample nodes at two ends of the sample edge;
adjusting the weights assigned by the initial graph attention network to different neighboring nodes based on a plurality of associated weights in the sample graph.
4. The method of claim 3, wherein determining the associated weight corresponding to each sample edge comprises:
determining a first sample node and a second sample node at two ends of the sample edge, wherein the first sample node corresponds to a first sample user identifier, and the second sample node corresponds to a second sample user identifier;
determining an average value of the frequency with which the first sample user identifier uses traffic within a target time period and the frequency with which the second sample user identifier uses traffic within the target time period;
and taking the average value as an edge weight of the sample edge, wherein the edge weight is used for indicating an association weight between the first sample user identifier and the second sample user identifier.
5. The method of claim 2, wherein the obtaining the labeling result of the sample map comprises:
taking a sample node carrying abnormal traffic in the sample map as a first node, wherein each sample node in the sample map carries traffic;
labeling a second node in the sample graph by adopting a label propagation scheme, wherein the similarity between the second node and the first node is greater than a similarity threshold value;
and determining that the sample network protocol address in the sample map is an abnormal network protocol address when the sum of the number of the first nodes and the number of the second nodes is greater than a preset threshold value.
6. The method of claim 1, wherein the network protocol address comprises a public network protocol address and a personal network protocol address, and wherein before acquiring the plurality of target user identifications adopting the target network protocol address in the same time period, the method further comprises:
determining a public network protocol address in the network protocol addresses through stored public network protocol addresses in a preset database;
removing a public network protocol address in the network protocol address by using preset data;
and selecting one personal network protocol address from the plurality of personal network protocol addresses as the target network protocol address.
7. The method of claim 1, wherein after parsing the plurality of types of feature fields from the traffic log by the advertisement traffic monitoring system, the method further comprises:
traversing each type of feature field as follows:
identifying a field type of a target characteristic field in the traffic log and a missing field in the target characteristic field, wherein the target characteristic field corresponds to a target user identifier;
determining a target filling mode of the target characteristic field according to the field type;
and performing data filling on the missing field by adopting the target filling mode.
8. The method of claim 1, wherein after determining the target network protocol address as an abnormal network protocol address, the method further comprises:
determining a target attribute type corresponding to the target node, wherein the target node corresponds to a plurality of attribute types, each attribute type comprises at least one sub-attribute type, the occurrence frequency of different sub-attribute types is not completely the same, and the target attribute type is one of the attribute types;
determining a sub-attribute type with the highest occurrence frequency in the target attribute types;
taking the target attribute corresponding to the sub-attribute type with the highest occurrence frequency as the target attribute corresponding to the target attribute type;
and taking all target attributes of the target node as the node attributes of the target node.
9. An apparatus for determining an abnormal network protocol address, the apparatus comprising:
the analysis module is used for analyzing a plurality of types of characteristic fields from a flow log through an advertisement flow monitoring system, wherein each type of characteristic field comprises a plurality of user identifications and network protocol addresses corresponding to the user identifications;
the acquisition module is used for acquiring a plurality of target user identifications adopting target network protocol addresses in the same time period;
a building module, configured to build a target graph according to the target user identifiers and the target network protocol addresses, where the target graph includes a plurality of target nodes, each target node indicates one target user identifier, and every two target user identifiers have an association relationship;
the input module is used for inputting the target map into a target map attention network to obtain tag data output by the target map attention network;
a determining module, configured to determine, when the tag data indicates that a target network protocol address in the target map is abnormal, the target network protocol address as an abnormal network protocol address;
wherein the apparatus is further configured to:
acquiring a sample map and an annotation result of the sample map, wherein the annotation result is used for indicating whether a sample network protocol address in the sample map is an abnormal network protocol address;
inputting the sample map into an initial map attention network to obtain an identification result output by the initial map attention network, wherein the identification result is used for indicating whether a sample network protocol address in the sample map is an abnormal network protocol address;
if the labeling result is inconsistent with the identification result, the weights distributed by the initial graph attention network to different adjacent nodes are adjusted until the labeling result is consistent with the identification result, and the initial graph attention network after the weights are adjusted is determined to be the target graph attention network.
10. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1 to 8 when executing a program stored in the memory.
11. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-8.
CN202110815253.4A 2021-07-19 2021-07-19 Method and device for determining abnormal network protocol address Active CN113259402B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110815253.4A CN113259402B (en) 2021-07-19 2021-07-19 Method and device for determining abnormal network protocol address

Publications (2)

Publication Number Publication Date
CN113259402A CN113259402A (en) 2021-08-13
CN113259402B true CN113259402B (en) 2021-10-15

Family

ID=77180554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110815253.4A Active CN113259402B (en) 2021-07-19 2021-07-19 Method and device for determining abnormal network protocol address

Country Status (1)

Country Link
CN (1) CN113259402B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114143049A (en) * 2021-11-18 2022-03-04 北京明略软件系统有限公司 Abnormal flow detection method, abnormal flow detection device, storage medium and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108769001A (en) * 2018-04-11 2018-11-06 哈尔滨工程大学 Malicious code detecting method based on the analysis of network behavior feature clustering
CN109040000A (en) * 2017-06-12 2018-12-18 北京京东尚科信息技术有限公司 IP address-based user identification method and system
CN109714322A (en) * 2018-12-14 2019-05-03 中国科学院声学研究所 A kind of method and its system detecting exception flow of network

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9659085B2 (en) * 2012-12-28 2017-05-23 Microsoft Technology Licensing, Llc Detecting anomalies in behavioral network with contextual side information
US20180007578A1 (en) * 2016-06-30 2018-01-04 Alcatel-Lucent Usa Inc. Machine-to-Machine Anomaly Detection
CN109300028A (en) * 2018-09-11 2019-02-01 上海天旦网络科技发展有限公司 Real-time anti-fraud method and system and storage medium based on network data
CN112543168A (en) * 2019-09-20 2021-03-23 中移(苏州)软件技术有限公司 Network attack detection method, device, server and storage medium
CN111510434A (en) * 2020-03-24 2020-08-07 中国建设银行股份有限公司 Network intrusion detection method, system and related equipment
CN111666502A (en) * 2020-07-08 2020-09-15 腾讯科技(深圳)有限公司 Abnormal user identification method and device based on deep learning and storage medium
CN112435122A (en) * 2020-11-18 2021-03-02 东莞智盾信息安全科技有限公司 Network training method, abnormal transaction behavior identification method, device and medium

Also Published As

Publication number Publication date
CN113259402A (en) 2021-08-13

Similar Documents

Publication Publication Date Title
CN106992994B (en) Automatic monitoring method and system for cloud service
CN108366045B (en) Method and device for setting wind control scoring card
CN110166462B (en) Access control method, system, electronic device and computer storage medium
CN112559303B (en) Log analysis in vector space
CN110166344B (en) Identity identification method, device and related equipment
CN110648172B (en) Identity recognition method and system integrating multiple mobile devices
CN114598539B (en) Root cause positioning method and device, storage medium and electronic equipment
CN109063433B (en) False user identification method and device and readable storage medium
CN115278741A (en) Fault diagnosis method and device based on multi-mode data dependency relationship
CN106571933B (en) Service processing method and device
CN113259402B (en) Method and device for determining abnormal network protocol address
CN113313280A (en) Cloud platform inspection method, electronic equipment and nonvolatile storage medium
CN111193727A (en) Operation monitoring system and operation monitoring method
WO2017000817A1 (en) Method and device for acquiring matching relationship between data
CN111046082B (en) Report data source recommendation method and device based on semantic analysis
CN111935279B (en) Internet of things network maintenance method based on block chain and big data and computing node
CN107517474B (en) Network analysis optimization method and device
CN113869904B (en) Suspicious data identification method, device, electronic equipment, medium and computer program
CN114610372A (en) Processing method and device for review file, storage medium, processor and terminal
CN117194668A (en) Knowledge graph construction method, device, equipment and storage medium
CN109829713B (en) Mobile payment mode identification method based on common drive of knowledge and data
CN110098983B (en) Abnormal flow detection method and device
CN114157648B (en) Request matching rule generation method and device, website server and storage medium
CN110618906B (en) Missing detection interface detection method and device, network equipment and storage medium
CN113347021B (en) Model generation method, collision library detection method, device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant