CN111949803B - Knowledge graph-based network abnormal user detection method, device and equipment - Google Patents


Info

Publication number
CN111949803B
Authority
CN
China
Prior art keywords
access
user
network
behavior
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010850232.1A
Other languages
Chinese (zh)
Other versions
CN111949803A (en)
Inventor
孙强强
连耿雄
陈昊
丘惠军
陈霖
匡晓云
杨祎巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China South Power Grid International Co ltd
Shenzhen Power Supply Co ltd
Original Assignee
China South Power Grid International Co ltd
Shenzhen Power Supply Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China South Power Grid International Co ltd, Shenzhen Power Supply Co ltd
Priority to CN202010850232.1A
Publication of CN111949803A
Application granted
Publication of CN111949803B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application discloses a knowledge graph-based method, device and equipment for detecting abnormal network users. The method comprises: constructing a knowledge graph of user access behavior based on the acquired weblogs of access users to obtain a network behavior graph; extracting access behavior features of each access user based on the network behavior graph and the weblogs; and inputting the access behavior features of each access user into a preset random forest model for user type detection and outputting the access users whose user type is abnormal, the preset random forest model being a model that maps the access behavior features of an access user to a user type. This solves the technical problem in the prior art that analysis is performed on a single log, that is, on the attributes of a single access behavior, so that abnormal users are detected with low accuracy.

Description

Knowledge graph-based network abnormal user detection method, device and equipment
Technical Field
The present application relates to the field of network security technologies, and in particular, to a method, an apparatus, and a device for detecting a network abnormal user based on a knowledge graph.
Background
A weblog records users' clicks and other access behaviors on a website, together with the detailed attributes of those behaviors. After a website is attacked, a network administrator typically inspects the relevant network access logs, so the weblog is important evidence for discovering and defending against intruders' attacks. To avoid being traced, however, an intruder usually makes the log information generated by an attack resemble that generated by normal access as closely as possible, which makes it harder for the network administrator to find the intruder.
Existing weblog-based methods for analyzing abnormal behavior mainly build a model from the weblogs and try to find the features that distinguish normal log content from attack log content. However, these methods analyze a single log, that is, the attributes of a single access behavior, and therefore detect abnormal users with low accuracy.
Disclosure of Invention
The application provides a knowledge graph-based method, device and equipment for detecting abnormal network users, to solve the technical problem in the prior art that analysis is performed on a single log, that is, on the attributes of a single access behavior, so that abnormal users are detected with low accuracy.
In view of the foregoing, a first aspect of the present application provides a method for detecting a network anomaly user based on a knowledge graph, including:
constructing a knowledge graph of user access behavior based on the acquired weblogs of access users to obtain a network behavior graph;
extracting access behavior features of each access user based on the network behavior graph and the weblogs; and
inputting the access behavior features of each access user into a preset random forest model for user type detection, and outputting the access users whose user type is abnormal, wherein the preset random forest model is a model that maps the access behavior features of an access user to a user type.
Optionally, constructing the knowledge graph of user access behavior based on the acquired weblogs of access users to obtain the network behavior graph includes:
after the weblogs of the access users are acquired, taking the access addresses in the weblogs as nodes, obtaining the access relations between the nodes from the weblogs, and constructing the knowledge graph of user access behavior from the nodes and the access relations to obtain the network behavior graph;
wherein an edge connects every two nodes of the network behavior graph that have an access relation, and the weight of the edge is the number of accesses between the two nodes.
Optionally, extracting the access behavior features of each access user based on the network behavior graph and the weblogs includes:
extracting first network access features of each access user based on the network behavior graph, and extracting second network access features of each access user based on the weblogs, to obtain the access behavior features of each access user;
wherein the first network access features comprise a user path scale feature, a user log number feature, or a user access frequency feature, and the second network access features comprise a URL length feature, a request parameter number feature, a special character frequency feature, or a character entropy feature.
Optionally, extracting the user path scale feature of each access user based on the network behavior graph includes:
extracting the weights of all access paths of each access user in the network behavior graph, and calculating the ratio of the sum of the weights of all access paths of that access user to the sum of the weights of all access paths of all access users in the network behavior graph, to obtain the user path scale feature of that access user.
Optionally, the configuration process of the preset random forest model includes:
acquiring historical weblogs of normal access users and abnormal access users;
extracting the access behavior features of the normal access users and the abnormal access users based on the historical weblogs and the network behavior graph constructed from the historical weblogs;
labeling the access behavior features of the normal access users and the abnormal access users by category to obtain a training set; and
training a random forest on the training set until the random forest converges, to obtain the preset random forest model.
The second aspect of the present application provides a knowledge graph-based device for detecting abnormal network users, comprising:
a construction unit, configured to construct a knowledge graph of user access behavior based on the acquired weblogs of access users to obtain a network behavior graph;
a feature extraction unit, configured to extract access behavior features of each access user based on the network behavior graph and the weblogs; and
a detection unit, configured to input the access behavior features of each access user into a preset random forest model for user type detection and to output the access users whose user type is abnormal, wherein the preset random forest model is a model that maps the access behavior features of an access user to a user type.
Optionally, the construction unit is specifically configured to:
after the weblogs of the access users are acquired, take the access addresses in the weblogs as nodes, obtain the access relations between the nodes from the weblogs, and construct the knowledge graph of user access behavior from the nodes and the access relations to obtain the network behavior graph;
wherein an edge connects every two nodes of the network behavior graph that have an access relation, and the weight of the edge is the number of accesses between the two nodes.
Optionally, the feature extraction unit is specifically configured to:
extract first network access features of each access user based on the network behavior graph, and extract second network access features of each access user based on the weblogs, to obtain the access behavior features of each access user;
wherein the first network access features comprise a user path scale feature, a user log number feature, or a user access frequency feature, and the second network access features comprise a URL length feature, a request parameter number feature, a special character frequency feature, or a character entropy feature.
Optionally, the device further comprises a configuration unit;
the configuration unit is configured to:
acquire historical weblogs of normal access users and abnormal access users;
extract the access behavior features of the normal access users and the abnormal access users based on the historical weblogs and the network behavior graph constructed from the historical weblogs;
label the access behavior features of the normal access users and the abnormal access users by category to obtain a training set; and
train a random forest on the training set until the random forest converges, to obtain the preset random forest model.
The third aspect of the present application provides knowledge graph-based equipment for detecting abnormal network users, the equipment comprising a processor and a memory:
the memory is configured to store program code and to transmit the program code to the processor;
the processor is configured to execute, according to instructions in the program code, the knowledge graph-based method for detecting abnormal network users according to any implementation of the first aspect.
As can be seen from the above technical solution, the application has the following advantages:
The application provides a knowledge graph-based method for detecting abnormal network users, comprising: constructing a knowledge graph of user access behavior based on the acquired weblogs of access users to obtain a network behavior graph; extracting access behavior features of each access user based on the network behavior graph and the weblogs; and inputting the access behavior features of each access user into a preset random forest model for user type detection and outputting the access users whose user type is abnormal, the preset random forest model being a model that maps the access behavior features of an access user to a user type.
In this method, the network behavior graph is constructed from the acquired weblogs of access users, so the behavior of an access user is reflected jointly by multiple weblogs rather than by a single weblog, and more accurate access behavior features can be obtained by analyzing the network behavior graph. Moreover, the access behavior features are extracted from both the network behavior graph and the weblogs, which yields a more comprehensive and more accurate feature representation and improves the accuracy of abnormal network user detection. User type detection is performed automatically on the input access behavior features by the preset random forest model, which improves detection efficiency. This solves the technical problem in the prior art that analysis is performed on a single log, that is, on the attributes of a single access behavior, so that abnormal users are detected with low accuracy.
Drawings
To illustrate the embodiments of the application or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a knowledge graph-based method for detecting abnormal network users according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a knowledge graph-based device for detecting abnormal network users according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a network behavior graph according to an embodiment of the present application.
Detailed Description
The application provides a knowledge graph-based method, device and equipment for detecting abnormal network users, to solve the technical problem in the prior art that analysis is performed on a single log, that is, on the attributes of a single access behavior, so that abnormal users are detected with low accuracy.
In order to make the present application better understood by those skilled in the art, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In the prior art, machine learning methods are used to build models of abnormal or normal logs and to find the features that distinguish normal logs from abnormal logs, but these methods target a single log, that is, the attributes of a single access behavior. In the course of accessing a website, however, access behaviors are not completely independent: complex relationships exist among the access behaviors of the same user, among different users who access the same path, and between users and access paths. To solve the above problem, the present application therefore provides a knowledge graph-based method for detecting abnormal network users.
For easy understanding, referring to fig. 1, an embodiment of a method for detecting network abnormal users based on a knowledge graph provided by the present application includes:
Step 101: construct a knowledge graph of user access behavior based on the acquired weblogs of access users to obtain a network behavior graph.
The raw weblog data contains a lot of information, some of which has no value for abnormal user detection. The weblogs can therefore be preprocessed to remove the information that is useless for abnormal user detection and retain the valuable information, which improves data processing efficiency. The knowledge graph of user access behavior is then constructed from the processed weblogs to obtain the network behavior graph. A small network behavior graph can be constructed for each access user from that user's weblog data; a network behavior graph containing the access behaviors of all access users can also be constructed from the weblog data of all access users, and this graph can be regarded as the superposition of the network behavior graphs of the individual access users.
Further, the network behavior graph is constructed as follows: after the weblogs of the access users are acquired, the access addresses in the weblogs are taken as nodes, the access relations between the nodes are obtained from the weblogs, and the knowledge graph of user access behavior is constructed from the nodes and the access relations to obtain the network behavior graph. An edge connects every two nodes of the network behavior graph that have an access relation, and the weight of the edge is the number of accesses between the two nodes. Referring to Fig. 3, nodes 1, 2 and 3 represent three different access addresses, the edges between the nodes are access paths, and $w_{12}$ is the weight of the access path between node 1 and node 2, which equals the number of accesses between node 1 and node 2. The same applies to the other nodes and is not described in detail here.
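The construction described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the record layout `(user, source_address, target_address)`, the function name `build_behavior_graphs`, and the treatment of edges as undirected are all assumptions, since the text does not specify how the access relation is derived from a weblog.

```python
from collections import Counter, defaultdict


def build_behavior_graphs(records):
    """Build per-user and overall weighted graphs from access records.

    Each record is a hypothetical (user, source_address, target_address)
    tuple. Edges are treated as undirected (an assumption), and the edge
    weight is the number of accesses between the two addresses.
    """
    per_user = defaultdict(Counter)
    overall = Counter()
    for user, src, dst in records:
        edge = tuple(sorted((src, dst)))  # same key for both directions
        per_user[user][edge] += 1
        overall[edge] += 1  # the overall graph superposes the per-user graphs
    return dict(per_user), overall


# Hypothetical sample records, for illustration only.
records = [
    ("alice", "/index", "/login"),
    ("alice", "/login", "/index"),
    ("bob", "/index", "/login"),
    ("bob", "/login", "/admin"),
]
per_user, overall = build_behavior_graphs(records)
```

Here `overall` plays the role of the graph of all access users, and each entry of `per_user` is the small graph of a single access user.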
Step 102: extract access behavior features of each access user based on the network behavior graph and the weblogs.
After the network behavior graph is constructed, multiple features are extracted from the network behavior graph and the weblogs as the access behavior features of each access user, which yields a more comprehensive and more accurate feature representation and improves the accuracy of the subsequent abnormal user detection.
Further, the access behavior features of each access user are extracted as follows: first network access features of each access user are extracted based on the network behavior graph, and second network access features of each access user are extracted based on the weblogs, to obtain the access behavior features of each access user. The first network access features comprise a user path scale feature, a user log number feature, or a user access frequency feature, and the second network access features comprise a URL length feature, a request parameter number feature, a special character frequency feature, or a character entropy feature. In the embodiment of the present application, all of these features are preferably used as access behavior features.
The extraction process of each feature is as follows:
(1) User path scale feature $P_1$:
The weights of all access paths of each access user in the network behavior graph are extracted, and the ratio of the sum of the weights of all access paths of that access user to the sum of the weights of all access paths of all access users in the network behavior graph is calculated to obtain the user path scale feature of that access user. This feature measures the range of the access paths of an access user. Specifically, for access user $c$, the set of short paths visited by $c$ is denoted $SP_c$, the network behavior graph formed by the paths visited by $c$ is denoted $N_1^c$, and the network behavior graph formed by all access users is denoted $N_2$. The user path scale feature $P_1^c$ of access user $c$ is calculated as:
$$P_1^c = \frac{\sum_{e_{ij} \in N_1^c} w_{ij}^c}{\sum_{e_{ij} \in N_2} w_{ij}}$$
where $e_{ij}$ is the edge between node $i$ and node $j$, $w_{ij}^c$ is the weight of edge $e_{ij}$ in the network behavior graph formed by the paths visited by access user $c$, and $w_{ij}$ is the weight of edge $e_{ij}$ in the network behavior graph formed by the paths visited by all access users. When $P_1^c$ is large, access user $c$ is likely to be a scanner that intends to learn the overall architecture of the network application. An access user of this type may not launch an actual attack; most are probing the network in an attempt to discover vulnerable nodes in the network structure. Such vulnerabilities may arise from developers' insufficient security awareness, may be related to the network infrastructure, or may be related to weaknesses in other components on which the network application depends.
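As a minimal sketch of the ratio described in this item, the Python function below divides the summed edge weights of one user's graph by the summed edge weights of the graph of all users. The function name `path_scale_feature`, the dictionary representation of a graph, and the sample weights are hypothetical and chosen only for illustration.

```python
def path_scale_feature(user_edges, all_edges):
    """Return the user path scale feature as a ratio of weight sums.

    Both arguments map an edge (a pair of access addresses) to its weight.
    Returning 0.0 for an empty overall graph is an assumption made here to
    avoid division by zero; the text does not cover that case.
    """
    total = sum(all_edges.values())
    if total == 0:
        return 0.0
    return sum(user_edges.values()) / total


# Hypothetical edge weights, for illustration only.
all_edges = {
    ("/index", "/login"): 6,
    ("/admin", "/login"): 1,
    ("/index", "/search"): 3,
}
user_a_edges = {("/index", "/login"): 2}
user_b_edges = {
    ("/index", "/login"): 4,
    ("/admin", "/login"): 1,
    ("/index", "/search"): 3,
}

p1_a = path_scale_feature(user_a_edges, all_edges)  # 2 / 10
p1_b = path_scale_feature(user_b_edges, all_edges)  # 8 / 10
```

Because the overall graph is the superposition of the per-user graphs, the feature values of all users sum to 1 in this sketch.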
(2) User log number feature $P_2$:
This feature measures the access range of an access user from a different angle. The user path scale feature focuses on the breadth and depth of the user's access logs, whereas the user log number feature focuses on the number of weblogs generated by the user's accesses.
(3) User access frequency feature $P_3$:
The user access frequency feature can be used to effectively identify malicious software. To obtain the access frequency of an access user, a time interval is selected, the number of weblogs generated by the user's accesses in that interval is counted, and the count is divided by the length of the interval to obtain the user's access frequency in that interval. Because the user's access frequency varies within the interval, the calculated value is the average access frequency over the interval. The purpose of this feature is to find abnormal access users with a high access frequency. In theory, the smaller the time interval, the more accurate the result, but the amount of calculation also increases sharply. In the embodiment of the application, to balance accuracy against the amount of calculation, the time interval is preferably 100 seconds: for each access user, the access frequency in each 100-second interval is calculated, and the maximum of these access frequencies is used as the user access frequency feature of that user.
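The windowed calculation can be sketched as follows. This is an illustrative sketch under stated assumptions: timestamps are given in seconds, the windows are fixed and non-overlapping and aligned to time 0 (the text does not say how the intervals are aligned), and the function name `max_access_frequency` is hypothetical.

```python
from collections import Counter


def max_access_frequency(timestamps, window=100):
    """Return the highest per-window access frequency (logs per second).

    Each timestamp (in seconds) stands for one weblog entry of a single
    user. Logs are counted per fixed window, each count is divided by the
    window length, and the maximum is returned.
    """
    if not timestamps:
        return 0.0
    counts = Counter(int(t // window) for t in timestamps)
    return max(counts.values()) / window


# Hypothetical log times for one user: 3 logs in [0, 100),
# 5 logs in [100, 200) and 1 log in [400, 500).
timestamps = [0, 10, 20, 150, 160, 170, 180, 199, 450]
p3 = max_access_frequency(timestamps)  # 5 logs / 100 s
```

A sliding window would catch bursts that straddle a window boundary, at a higher computational cost; fixed windows are used here only to keep the sketch short.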
(4) Second network access features
The second network access features comprise a URL length feature, a request parameter number feature, a special character frequency feature, or a character entropy feature. The character entropy is calculated as:
$$E_i = -\sum_{k} \frac{n_k^i}{\sum_{m} n_m^i} \log \frac{n_k^i}{\sum_{m} n_m^i}$$
where $E_i$ is the character entropy of the $i$-th access user and $n_k^i$ is the number of occurrences of the $k$-th character in the requests of the $i$-th access user.
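A minimal Python sketch of the four weblog-based features follows. Several points are assumptions rather than details from the text: the features are computed per request URL instead of being aggregated per user, the character entropy is taken to be the Shannon entropy of the character counts with a base-2 logarithm, and the `SPECIAL_CHARS` set and the function name `url_features` are hypothetical.

```python
import math
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

# Hypothetical set of "special" characters; the text does not list them.
SPECIAL_CHARS = set("<>'\"();%")


def url_features(url):
    """Compute length, parameter count, special-character frequency and
    character entropy for a single request URL."""
    n = len(url)
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    counts = Counter(url)
    # Shannon entropy over character frequencies (base 2 is an assumption).
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
    special = sum(ch in SPECIAL_CHARS for ch in url) / n if n else 0.0
    return {
        "url_length": n,
        "param_count": len(params),
        "special_char_freq": special,
        "char_entropy": entropy,
    }


features = url_features("/search?q=1&id=2")
```

To obtain per-user values, these per-request values could be aggregated over all requests of a user, for example by taking the mean or maximum; the choice of aggregation is not stated in the text.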
Step 103: input the access behavior features of each access user into a preset random forest model for user type detection, and output the access users whose user type is abnormal, wherein the preset random forest model is a model that maps the access behavior features of an access user to a user type.
The preconfigured random forest model automatically performs user type detection on the input access behavior features: it determines from the input features whether the corresponding access user is abnormal or normal, and finally the access users whose user type is abnormal are output, which achieves the purpose of detecting abnormal network users. The weblogs accumulated over one hour can be treated as the object of one calculation: the corresponding access behavior features are extracted according to the above steps and then checked by the preset random forest model.
Further, the configuration process of the preset random forest model comprises the following steps:
1. acquiring historical weblogs of normal access users and abnormal access users;
2. extracting access behavior characteristics of normal access users and abnormal access users based on a network behavior map constructed by the historical weblog and the historical weblog;
3. performing category marking on the access behavior characteristics of the normal access user and the abnormal access user to obtain a training set;
4. training the random forest through the training set until the random forest converges, and obtaining a preset random forest model.
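The four configuration steps above can be illustrated with a small, dependency-free Python sketch. It is not a full random forest: each tree is a depth-one decision stump trained on a bootstrap sample with a random feature subset, a deliberately simplified stand-in that keeps the bagging and majority-voting structure. The feature layout `[P1, P2, P3]`, the synthetic training rows, and all function names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority label
        m = majority(labels)
        return (0, float("inf"), m, m)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples with random feature subsets."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feats = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Classify one feature row by majority vote over all stumps."""
    votes = [l_lab if row[f] <= t else r_lab for f, t, l_lab, r_lab in forest]
    return majority(votes)


# Synthetic, clearly separable training rows [P1, P2, P3], for illustration only.
normal_rows = [[0.01 * i, 10 + i, 0.1 * i] for i in range(1, 11)]
abnormal_rows = [[0.5 + 0.04 * i, 1000 + 10 * i, 5.0 + i] for i in range(1, 11)]
rows = normal_rows + abnormal_rows
labels = ["normal"] * 10 + ["abnormal"] * 10
forest = fit_forest(rows, labels)
```

In practice a library implementation with full decision trees would normally be used instead; the stump ensemble is chosen here only so that the example stays short and runs without third-party packages.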
In the knowledge graph-based method for detecting abnormal network users provided by this embodiment, the network behavior graph is constructed from the acquired weblogs of access users, so the behavior of an access user is reflected jointly by multiple weblogs rather than by a single weblog, and more accurate access behavior features can be obtained by analyzing the network behavior graph. Moreover, the access behavior features are extracted from both the network behavior graph and the weblogs, which yields a more comprehensive and more accurate feature representation and improves the accuracy of abnormal network user detection. User type detection is performed automatically on the input access behavior features by the preset random forest model, which improves detection efficiency. This solves the technical problem in the prior art that analysis is performed on a single log, that is, on the attributes of a single access behavior, so that abnormal users are detected with low accuracy.
The above is an embodiment of a method for detecting a network abnormal user based on a knowledge graph, and the following is an embodiment of a device for detecting a network abnormal user based on a knowledge graph.
For easy understanding, referring to fig. 2, an embodiment of a network anomaly user detection device based on a knowledge graph provided by the present application includes:
A construction unit 201, configured to construct a knowledge graph of user access behavior based on the acquired weblogs of access users to obtain a network behavior graph.
A feature extraction unit 202, configured to extract access behavior features of each access user based on the network behavior graph and the weblogs.
A detection unit 203, configured to input the access behavior features of each access user into a preset random forest model for user type detection and to output the access users whose user type is abnormal, wherein the preset random forest model is a model that maps the access behavior features of an access user to a user type.
As a further improvement, the construction unit 201 is specifically configured to:
after the weblogs of the access users are acquired, take the access addresses in the weblogs as nodes, obtain the access relations between the nodes from the weblogs, and construct the knowledge graph of user access behavior from the nodes and the access relations to obtain the network behavior graph;
wherein an edge connects every two nodes of the network behavior graph that have an access relation, and the weight of the edge is the number of accesses between the two nodes.
As a further improvement, the feature extraction unit 202 is specifically configured to:
extract first network access features of each access user based on the network behavior graph, and extract second network access features of each access user based on the weblogs, to obtain the access behavior features of each access user;
wherein the first network access features comprise a user path scale feature, a user log number feature, or a user access frequency feature, and the second network access features comprise a URL length feature, a request parameter number feature, a special character frequency feature, or a character entropy feature.
As a further improvement, the device further comprises a configuration unit 204;
the configuration unit 204 is configured to:
acquire historical weblogs of normal access users and abnormal access users;
extract the access behavior features of the normal access users and the abnormal access users based on the historical weblogs and the network behavior graph constructed from the historical weblogs;
label the access behavior features of the normal access users and the abnormal access users by category to obtain a training set; and
train a random forest on the training set until the random forest converges, to obtain the preset random forest model.
The embodiment of the application also provides knowledge graph-based equipment for detecting abnormal network users, the equipment comprising a processor and a memory:
the memory is configured to store program code and to transmit the program code to the processor;
the processor is configured to execute, according to instructions in the program code, the knowledge graph-based method for detecting abnormal network users described in the foregoing method embodiment.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus and units described above may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (3)

1. A knowledge-graph-based network abnormal user detection method, characterized by comprising the following steps:
constructing a knowledge graph of user access behavior based on acquired weblogs of access users to obtain a network behavior graph;
extracting access behavior features of each access user based on the network behavior graph and the weblogs;
inputting the access behavior features of each access user into a preset random forest model for user type detection, and outputting the access users whose user type is abnormal, wherein the preset random forest model is a model mapping the access behavior features of access users to user types;
wherein constructing the knowledge graph of user access behavior based on the acquired weblogs of the access users to obtain the network behavior graph comprises:
after the weblogs of the access users are acquired, taking the access addresses in the weblogs as nodes, obtaining the access relations between the nodes from the weblogs, and constructing the knowledge graph of user access behavior based on the nodes and the access relations to obtain the network behavior graph;
wherein an edge connects any two nodes in the network behavior graph that have an access relation, and the weight of the edge is the number of accesses between the two nodes;
the configuration process of the preset random forest model comprises:
acquiring historical weblogs of normal access users and abnormal access users;
extracting access behavior features of the normal access users and the abnormal access users based on the historical weblogs and a network behavior graph constructed from the historical weblogs;
labeling the access behavior features of the normal access users and the abnormal access users with their categories to obtain a training set;
training a random forest on the training set until the random forest converges, to obtain the preset random forest model;
wherein extracting the access behavior features of each access user based on the network behavior graph and the weblogs comprises:
extracting first network access features of each access user based on the network behavior graph, and extracting second network access features of each access user based on the weblogs, to obtain the access behavior features of each access user;
wherein the first network access features include: a user path scale feature, a user log number feature, or a user access frequency feature, and the second network access features include: a URL length feature, a request parameter number feature, a special character frequency feature, or a character entropy feature;
and extracting the user path scale feature of each access user based on the network behavior graph comprises:
after extracting the weights of all access paths of each access user in the network behavior graph, calculating the ratio of the sum of the weights of all access paths of that access user to the sum of the weights of all access paths of all access users in the network behavior graph, to obtain the user path scale feature of each access user.
2. A knowledge-graph-based network abnormal user detection device, characterized by comprising:
a construction unit, configured to construct a knowledge graph of user access behavior based on acquired weblogs of access users to obtain a network behavior graph;
a feature extraction unit, configured to extract access behavior features of each access user based on the network behavior graph and the weblogs;
a detection unit, configured to input the access behavior features of each access user into a preset random forest model for user type detection and to output the access users whose user type is abnormal, wherein the preset random forest model is a model mapping the access behavior features of access users to user types;
wherein the construction unit is specifically configured to:
after the weblogs of the access users are acquired, take the access addresses in the weblogs as nodes, obtain the access relations between the nodes from the weblogs, and construct the knowledge graph of user access behavior based on the nodes and the access relations to obtain the network behavior graph;
wherein an edge connects any two nodes in the network behavior graph that have an access relation, and the weight of the edge is the number of accesses between the two nodes;
a configuration unit, configured to acquire historical weblogs of normal access users and abnormal access users;
extract access behavior features of the normal access users and the abnormal access users based on the historical weblogs and a network behavior graph constructed from the historical weblogs;
label the access behavior features of the normal access users and the abnormal access users with their categories to obtain a training set;
and train a random forest on the training set until the random forest converges, to obtain the preset random forest model;
wherein the feature extraction unit is specifically configured to:
extract first network access features of each access user based on the network behavior graph, and extract second network access features of each access user based on the weblogs, to obtain the access behavior features of each access user;
wherein the first network access features include: a user path scale feature, a user log number feature, or a user access frequency feature, and the second network access features include: a URL length feature, a request parameter number feature, a special character frequency feature, or a character entropy feature;
and extracting the user path scale feature of each access user based on the network behavior graph comprises:
after extracting the weights of all access paths of each access user in the network behavior graph, calculating the ratio of the sum of the weights of all access paths of that access user to the sum of the weights of all access paths of all access users in the network behavior graph, to obtain the user path scale feature of each access user.
3. A knowledge-graph-based network abnormal user detection device, the device comprising a processor and a memory:
the memory is configured to store program code and transmit the program code to the processor;
the processor is configured to execute the knowledge-graph-based network abnormal user detection method of claim 1 according to instructions in the program code.
CN202010850232.1A 2020-08-21 2020-08-21 Knowledge graph-based network abnormal user detection method, device and equipment Active CN111949803B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010850232.1A CN111949803B (en) 2020-08-21 2020-08-21 Knowledge graph-based network abnormal user detection method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010850232.1A CN111949803B (en) 2020-08-21 2020-08-21 Knowledge graph-based network abnormal user detection method, device and equipment

Publications (2)

Publication Number Publication Date
CN111949803A CN111949803A (en) 2020-11-17
CN111949803B true CN111949803B (en) 2024-05-28

Family

ID=73359110

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010850232.1A Active CN111949803B (en) 2020-08-21 2020-08-21 Knowledge graph-based network abnormal user detection method, device and equipment

Country Status (1)

Country Link
CN (1) CN111949803B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112604297A (en) * 2020-12-29 2021-04-06 网易(杭州)网络有限公司 Game plug-in detection method and device, server and storage medium
CN113572752B (en) * 2021-07-20 2023-11-07 上海明略人工智能(集团)有限公司 Abnormal flow detection method and device, electronic equipment and storage medium
CN113726786B (en) * 2021-08-31 2023-05-05 上海观安信息技术股份有限公司 Abnormal access behavior detection method and device, storage medium and electronic equipment
CN113987206A (en) * 2021-10-29 2022-01-28 平安银行股份有限公司 Abnormal user identification method, device, equipment and storage medium
CN114422267B (en) * 2022-03-03 2024-02-06 北京天融信网络安全技术有限公司 Flow detection method, device, equipment and medium
CN114329455B (en) * 2022-03-08 2022-07-29 北京大学 User abnormal behavior detection method and device based on heterogeneous graph embedding
CN114710392B (en) * 2022-03-23 2024-03-12 阿里云计算有限公司 Event information acquisition method and device
CN115378988B (en) * 2022-10-25 2023-02-24 国网智能电网研究院有限公司 Data access abnormity detection and control method and device based on knowledge graph
CN116668192B (en) * 2023-07-26 2023-11-10 国网山东省电力公司信息通信公司 Network user behavior anomaly detection method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599160A (en) * 2016-12-08 2017-04-26 网帅科技(北京)有限公司 Content rule base management system and encoding method thereof
WO2017084362A1 (en) * 2015-11-18 2017-05-26 百度在线网络技术(北京)有限公司 Model generation method, recommendation method and corresponding apparatuses, device and storage medium
CN108600270A (en) * 2018-05-10 2018-09-28 北京邮电大学 A kind of abnormal user detection method and system based on network log
CN109460664A (en) * 2018-10-23 2019-03-12 北京三快在线科技有限公司 Risk analysis method, device, Electronic Design and computer-readable medium
CN109816397A (en) * 2018-12-03 2019-05-28 北京奇艺世纪科技有限公司 A kind of fraud method of discrimination, device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
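Survey of network threat security data visualization (网络威胁安全数据可视化综述); 张繁; 谢凡; 江颉; 网络与信息安全学报 (Chinese Journal of Network and Information Security), No. 02; full text *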

Also Published As

Publication number Publication date
CN111949803A (en) 2020-11-17

Similar Documents

Publication Publication Date Title
CN111949803B (en) Knowledge graph-based network abnormal user detection method, device and equipment
CN107888571B (en) Multi-dimensional webshell intrusion detection method and system based on HTTP log
CN112003870B (en) Network encryption traffic identification method and device based on deep learning
CN110099059B (en) Domain name identification method and device and storage medium
CN105072089B (en) A kind of WEB malice scanning behavior method for detecting abnormality and system
CN107579956B (en) User behavior detection method and device
CN112866023B (en) Network detection method, model training method, device, equipment and storage medium
CN107302547A (en) A kind of web service exceptions detection method and device
CN108924118B (en) Method and system for detecting database collision behavior
CN110351280A (en) A kind of method, system, equipment and readable storage medium storing program for executing for threatening information to extract
CN107612911B (en) Method for detecting infected host and C & C server based on DNS traffic
CN107395553A (en) A kind of detection method and device of network attack
CN110135162A (en) The recognition methods of the back door WEBSHELL, device, equipment and storage medium
CN107231383B (en) CC attack detection method and device
US20200342095A1 (en) Rule generaton apparatus and computer readable medium
CN110378115A (en) A kind of data layer system of information security attack-defence platform
CN112347457A (en) Abnormal account detection method and device, computer equipment and storage medium
CN111541687B (en) Network attack detection method and device
CN109309665A (en) A kind of access request processing method and processing device, a kind of calculating equipment and storage medium
CN116319089B (en) Dynamic weak password detection method, device, computer equipment and medium
CN116846644A (en) Unauthorized access detection method and device
WO2016173327A1 (en) Method and device for detecting website attack
CN115643044A (en) Data processing method, device, server and storage medium
CN115827379A (en) Abnormal process detection method, device, equipment and medium
CN111800409A (en) Interface attack detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant