CN113918795B - Method and device for determining target label, electronic equipment and storage medium - Google Patents

Info

Publication number
CN113918795B
Authority
CN
China
Prior art keywords
attack
information
data set
determining
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111529872.3A
Other languages
Chinese (zh)
Other versions
CN113918795A (en)
Inventor
童将
黄扬洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lianlian Hangzhou Information Technology Co ltd
Original Assignee
Lianlian Hangzhou Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lianlian Hangzhou Information Technology Co ltd
Priority to CN202111529872.3A
Publication of CN113918795A
Application granted
Publication of CN113918795B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The method comprises: obtaining an object set and an attack data set, where objects in the object set correspond one-to-one to attack data subsets in the attack data set; determining, according to the attack data set, an associated data set corresponding to the object set, where objects in the object set correspond one-to-one to associated data subsets in the associated data set, and the associated data subsets comprise social account information and network browsing information of each object; and determining target tag information of each object in the object set according to the attack data set and the associated data set. Based on the embodiment of the application, the target label of the object is determined by combining the attack data set, the social account information and the network browsing information, so that the label information of the object is enriched, the user portrait is made more complete, and the identity information of the object is determined more accurately.

Description

Method and device for determining target label, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer networks, and in particular, to a method and an apparatus for determining a target tag, an electronic device, and a storage medium.
Background
In the big data era, all the data of any system can serve as a resource, and correlations among the data can be discovered. Big data is now widely applied to Internet advertisement push, personalized user services and the like, and provides strong data support for Internet services.
User portrait, also called user persona, is an important application of big data technology: descriptive label attributes are established for users in multiple dimensions, and these label attributes are then used to outline the personal characteristics of the users.
There are two main existing methods for determining the target label. One builds the user portrait from the user's basic information, consumption behavior, network behavior and the like; however, the user portrait obtained in this way is incomplete. The other directly invokes a single user portrait model as the portrait model for different users; the resulting portrait often fails to match the user fully, so the inferences drawn from the big data are inaccurate.
Disclosure of Invention
The embodiment of the application provides a method and a device for determining a target label, electronic equipment and a storage medium, which can enrich label information of an object, improve portrait of a user and improve accuracy of determination of identity information of the object.
The embodiment of the application provides a method for determining a target label, which comprises the following steps:
acquiring an object set and an attack data set; the objects in the object set correspond one-to-one to the attack data subsets in the attack data set;
determining an associated data set corresponding to the object set according to the attack data set; the objects in the object set correspond one-to-one to the associated data subsets in the associated data set, and the associated data subsets comprise social account information and network browsing information of each object;
and determining the target label information of each object in the object set according to the attack data set and the associated data set.
Further, determining an associated data set corresponding to the object set according to the attack data set, including:
and determining an associated data set corresponding to the object set from the database according to the attack data set.
Further, the attack data subset comprises attack time information, attack protocol address information, attack count information, attack type information and attack method information of each object.
Further, determining an associated data set corresponding to the object set according to the attack data set, including:
determining attack domain name information and attack geographical position information of each object according to the attack protocol address information of each object in the attack data subset;
determining the communication account information of each object according to the attack domain name information and the attack geographical position information of each object;
and determining social account information and network browsing information of each object according to the communication account information of each object to obtain an associated data set corresponding to the object set.
Further, determining target tag information of each object in the object set according to the attack data set and the associated data set, including:
determining a data set to be processed according to the attack data set and the associated data set;
performing data processing on the data set to be processed to obtain a reference data set; the data processing comprises repeated value processing, missing value processing, abnormal value processing and standard unified processing, and objects in the object set correspond to reference data subsets in the reference data set one by one;
and determining the target label information of each object according to the reference data subset corresponding to each object.
Correspondingly, an embodiment of the present application provides an apparatus for determining a target tag, including:
the acquisition module is used for acquiring an object set and an attack data set; the objects in the object set correspond one-to-one to the attack data subsets in the attack data set;
the first determining module is used for determining an associated data set corresponding to the object set according to the attack data set; the objects in the object set correspond one-to-one to the associated data subsets in the associated data set, and the associated data subsets comprise social account information and network browsing information of each object;
and the second determining module is used for determining the target label information of each object in the object set according to the attack data set and the associated data set.
Further, the first determining module is configured to determine, according to the attack data set, an associated data set corresponding to the object set from the database.
Further, the attack data subset comprises attack time information, attack protocol address information, attack count information, attack type information and attack method information of each object.
Further, the first determining module includes:
the first determining submodule is used for determining attack domain name information and attack geographical position information of each object according to the attack protocol address information of each object in the attack data subset;
the second determining submodule is used for determining the communication account information of each object according to the attack domain name information and the attack geographical position information of each object;
and the third determining submodule is used for determining the social account information and the network browsing information of each object according to the communication account information of each object to obtain an associated data set corresponding to the object set.
Further, the second determining module includes:
the fourth determining submodule is used for determining a data set to be processed according to the attack data set and the associated data set;
the data processing submodule is used for carrying out data processing on the data set to be processed to obtain a reference data set; the data processing comprises repeated value processing, missing value processing, abnormal value processing and standard unified processing, and objects in the object set correspond to reference data subsets in the reference data set one by one;
and the fifth determining submodule is used for determining the target label information of each object according to the reference data subset corresponding to each object.
Accordingly, an embodiment of the present application further provides an electronic device, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the method for determining the target tag.
Accordingly, an embodiment of the present application further provides a computer-readable storage medium, in which at least one instruction, at least one program, a code set, or a set of instructions is stored, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the method for determining the target tag.
The embodiment of the application has the following beneficial effects:
According to the method, the device, the electronic equipment and the storage medium for determining the target tag disclosed by the embodiment of the application, an object set and an attack data set are obtained, where objects in the object set correspond one-to-one to attack data subsets in the attack data set; an associated data set corresponding to the object set is determined according to the attack data set, where objects in the object set correspond one-to-one to associated data subsets in the associated data set, and the associated data subsets comprise social account information and network browsing information of each object; and the target tag information of each object in the object set is then determined according to the attack data set and the associated data set. Based on the embodiment of the application, the target label of the object is determined by combining the attack data set, the social account information and the network browsing information, so that the label information of the object is enriched, the user portrait is made more complete, and the identity information of the object is determined more accurately. Moreover, crawling the social account information and network browsing information of the objects from the intelligence database reduces the complexity of collecting source-tracing information and saves a large amount of human resources and time.
Drawings
In order to illustrate the technical solutions and advantages of the embodiments of the present application or of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a method for determining a target tag according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a method for determining an associated data set corresponding to an object set according to an embodiment of the present application;
FIG. 4 is a structural diagram illustrating determination of a target tag according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings. It should be apparent that the described embodiment is only one embodiment of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
An "embodiment" as referred to herein relates to a particular feature, structure, or characteristic that may be included in at least one implementation of the present application. In the description of the embodiments of the present application, it should be understood that the terms "first", "second", "third" and "fourth" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicit indication of the number of technical features indicated. Thus, features defined as "first", "second", "third" and "fourth" may explicitly or implicitly include one or more of the features. Moreover, the terms "first," "second," "third," and "fourth," etc. are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in other sequences than described or illustrated herein. Furthermore, the terms "comprising," "having," and "being," as well as any variations thereof, are intended to cover non-exclusive inclusions.
Referring to fig. 1, which is a schematic diagram of an application environment provided by an embodiment of the present application, the environment includes a processor 101, a security device 103, and a database server 105, where the database server hosts an open source intelligence repository. The processor may obtain an object set and an attack data set from the security device, where objects in the object set correspond one-to-one to attack data subsets in the attack data set, and the attack data set may include the attack protocol address information, attack category information, and attack date information shown in fig. 1. An associated data set corresponding to the object set may then be determined according to the attack data set, where the objects in the object set correspond one-to-one to associated data subsets in the associated data set, and the associated data subsets include social account information and network browsing information of each object. Target tag information of each object in the object set is then determined according to the attack data set and the associated data set, for example from the attack protocol address information, attack category information, attack date information, attack geographical location information, attack domain name information, and communication account information shown in fig. 1. The target tag information indicates whether the identity of the object is an attacker or a white hat.
According to the method and the device, the target label of the object is determined by combining the attack data set, the social account information and the network browsing information, so that the label information of the object is enriched, the user portrait is made more complete, and the identity information of the object is determined more accurately. Moreover, crawling the social account information and network browsing information of the objects from the intelligence database reduces the complexity of collecting source-tracing information and saves a large amount of human resources and time.
The following describes a specific embodiment of a method for determining a target tag according to the present application. Fig. 2 is a schematic flowchart of a method for determining a target tag according to the embodiment of the present application. The present specification provides the method operation steps as shown in the embodiment or the flowchart, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is only one of many possible orders of execution and does not represent the only order of execution; in actual execution, the steps may be performed sequentially or in parallel as in the embodiments or methods shown in the figures (e.g., in the context of parallel processors or multi-threaded processing). Specifically, as shown in fig. 2, the method includes:
S201: acquiring an object set and an attack data set; the objects in the object set correspond one-to-one to the attack data subsets in the attack data set.
In the embodiment of the application, an object set and an attack data set may be collected from a security device. The security device may be a web application firewall (WAF), or it may be an open-source host intrusion monitoring system. The security device may store a plurality of attack cases; each attack case may include an object and the attack data subset corresponding to the object, and the specific identity of the object in each attack case, namely whether it is an attacker or an engineer, is not yet known. The security device may further store a plurality of historical attack cases; each historical attack case may include an object and a corresponding data subset of the object, and the specific identity of the object in each historical attack case, namely whether it is an attacker or an engineer, is known. Each object in the object set of the historical attack cases can be represented by a specific code. For example, the security device can store an attacker A and an attacker B, and can also store an attack organization A and an attack organization B. As a further example, the security device may store an engineer 001 and an engineer 002, and may store an engineer team 001 and an engineer team 002.
In the embodiment of the present application, the object set may include hackers, such as attackers and organizations that attack the internal systems of an enterprise to obtain its confidential information and plant viruses to extort ransom. The object set may also include white hats, such as engineers and their teams who discover vulnerabilities and report them. Since the specific identity of an object, namely whether it is an attacker or an engineer, is not yet known, each object in the object set may be represented by a specific code; for example, the security device may store object C and object D, or object 003 and object 004.
In this embodiment of the present application, the attack data set may include a plurality of attack data subsets, each of which corresponds to one object in the object set. Each attack data subset may include attack time information, attack protocol address information (IP address), attack count information, attack type information, and attack method information of the object. In an alternative embodiment, each attack data subset may be integrated into a static data set a of the object. For example, the attack data subset of object C may be integrated into a_C = {attack time information_C, IP address_C, attack count information_C, attack type information_C, attack method information_C}, and the attack data subset of object 003 may be integrated into a_003 = {attack time information_003, IP address_003, attack count information_003, attack type information_003, attack method information_003}. It should be noted that each attack data subset corresponds to exactly one object; when that object is a hacker organization comprising a plurality of hackers, the organization shares a single code. However, each type of attack data within an attack data subset need not be unique. For example, the IP address in the attack data subset of object 003 may include IP1_003 and IP2_003, and attack method information_003 may include attack method information 1_003 and attack method information 2_003. That is, a plurality of hackers in one hacker organization may attack the internal systems of different enterprises with different attack types and methods but from the same IP address, and these attacks are classified into the same attack case; likewise, they may attack with different IP addresses but with the same attack types and attack methods, and these attacks are also classified into the same attack case.
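The structure of an attack data subset described above can be sketched as a small record type. This is only an illustration: the class name, field names, and sample values are hypothetical and do not come from the patent; the IP addresses are taken from the reserved documentation range.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AttackDataSubset:
    """Attack data recorded for one object (hypothetical field names)."""
    object_code: str  # one code per object, shared by a whole organization
    attack_times: List[str] = field(default_factory=list)
    ip_addresses: List[str] = field(default_factory=list)   # may hold several
    attack_count: int = 0
    attack_types: List[str] = field(default_factory=list)
    attack_methods: List[str] = field(default_factory=list)  # may hold several


def build_attack_data_set(subsets: List[AttackDataSubset]) -> Dict[str, AttackDataSubset]:
    """Index subsets by object code, enforcing the one-to-one correspondence."""
    data_set: Dict[str, AttackDataSubset] = {}
    for subset in subsets:
        if subset.object_code in data_set:
            raise ValueError(f"duplicate object code: {subset.object_code}")
        data_set[subset.object_code] = subset
    return data_set


# Static data set a_003 for the object coded 003 (sample values are invented).
a_003 = AttackDataSubset(
    object_code="003",
    attack_times=["2021-01-01T00:00:00"],
    ip_addresses=["192.0.2.10", "192.0.2.11"],
    attack_count=2,
    attack_types=["sql_injection"],
    attack_methods=["union_based", "time_based"],
)
attack_data_set = build_attack_data_set([a_003])
```

Keying the set by object code keeps each object tied to exactly one subset, while the list-valued fields allow several IP addresses or attack methods within that subset.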
S203: determining an associated data set corresponding to the object set according to the attack data set; the objects in the object set correspond one-to-one to associated data subsets in the associated data set, and the associated data subsets comprise social account information and network browsing information of each object.
In the embodiment of the application, an associated data set corresponding to the object set may be determined from a database according to the attack data set, where the associated data set may include a plurality of associated data subsets, each of which corresponds to one object in the object set. The database may be an open-source intelligence (OSINT) library. Crawling the social account information and network browsing information of the objects from the intelligence database reduces the complexity of collecting source-tracing information and saves a large amount of human resources and time.
In an alternative embodiment, a honeypot may be built according to the attack data subset of the object, that is, a decoy that looks like an attack target, to entice the object to attack again; during the renewed attack, the associated data subset corresponding to the object, namely the social account information and the web browsing information of the object, may be obtained.
Fig. 3 is a schematic flowchart of a method for determining an associated data set corresponding to an object set according to an embodiment of the present application, and in an alternative implementation, the following steps may be adopted to determine the associated data set corresponding to the object set, and the specific steps are as follows:
S301: determining attack domain name information and attack geographical position information of each object according to the attack protocol address information of each object in the attack data subset.
In the embodiment of the application, the attack domain name information and the attack geographical position information of each object can be determined according to the IP address of each object in the attack data subset. Optionally, the domain name and geographic location information of each object in the object set may be obtained by reverse lookup of its IP address.
S303: determining the communication account information of each object according to the attack domain name information and the attack geographical position information of each object.
In the embodiment of the application, after the domain name and the geographic position information of the object are obtained, the communication account information of the object, such as its registered mailbox and its mobile contact, namely a mobile phone number, can be obtained by reverse lookup of the domain name and the geographic position of the object.
In an optional implementation manner, after obtaining the domain name, the geographic location information, and the communication account information of the object, the information may be integrated to obtain a data set b = { attack geographic location information, attack domain name information, registration mailbox, mobile phone number }.
S305: determining social account information and network browsing information of each object according to the communication account information of each object to obtain an associated data set corresponding to the object set.
In the embodiment of the application, the social account information and the network browsing information of each object can be determined according to the communication account information of each object. That is, information such as the Maimai account, QQ number, WeChat ID and website browsing records of the object is searched for based on the registered mailbox and the mobile phone number, and the associated data subset corresponding to each object in the object set is determined to obtain the associated data set.
In an optional implementation manner, after obtaining the social account information and the web browsing information of the object, the static data set a and the data set b may be integrated to obtain a data set c = {IP, attack geographical location information, attack domain name information, registered mailbox, mobile phone number, social account information, web browsing information}. That is, the open source intelligence repository is crawled based on the data set b to gather more information about the object.
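Steps S301 to S305 can be sketched as a chain of lookups. The sketch below is illustrative only: the three in-memory dictionaries are hypothetical stand-ins for the open source intelligence repository, and every function name, key, and sample value is an assumption rather than part of the patent.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-ins for the open source intelligence repository.
IP_INFO: Dict[str, Dict[str, str]] = {
    "192.0.2.10": {"domain": "attacker.example", "geo": "Hangzhou"},
}
CONTACTS: Dict[Tuple[Optional[str], Optional[str]], Dict[str, str]] = {
    ("attacker.example", "Hangzhou"): {
        "email": "user@attacker.example",
        "phone": "000-0000",
    },
}
PROFILES: Dict[str, Dict[str, List[str]]] = {
    "user@attacker.example": {
        "social_accounts": ["qq:10001"],
        "browsing": ["forum.example/thread/1"],
    },
}


def lookup_domain_and_location(ip: str) -> dict:
    """S301: reverse-look up domain name and location from the IP address."""
    record = IP_INFO.get(ip, {})
    return {"ip": ip, "domain": record.get("domain"), "geo": record.get("geo")}


def lookup_contacts(domain: Optional[str], geo: Optional[str]) -> dict:
    """S303: find registered mailbox and phone number for a domain/location."""
    return dict(CONTACTS.get((domain, geo), {"email": None, "phone": None}))


def lookup_profile(email: Optional[str], phone: Optional[str]) -> dict:
    """S305: find social accounts and browsing records for the contact data."""
    for key in (email, phone):
        if key in PROFILES:
            return dict(PROFILES[key])
    return {"social_accounts": [], "browsing": []}


def build_data_set_c(ip: str) -> dict:
    """Chain S301, S303 and S305 into one associated record (data set c)."""
    data_set_b = lookup_domain_and_location(ip)
    data_set_b.update(lookup_contacts(data_set_b["domain"], data_set_b["geo"]))
    data_set_c = dict(data_set_b)
    data_set_c.update(lookup_profile(data_set_b["email"], data_set_b["phone"]))
    return data_set_c


record = build_data_set_c("192.0.2.10")
unknown = build_data_set_c("198.51.100.1")
```

Each stage falls back to empty values when a lookup fails, so an object with no repository match still yields a record, which the later missing-value processing can deal with.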
S205: determining the target label information of each object in the object set according to the attack data set and the associated data set.
In the embodiment of the application, a data set to be processed may be determined according to the attack data set and the associated data set, and data processing may be performed on the data set to be processed to obtain a reference data set, where the data processing may include repeated value processing, missing value processing, abnormal value processing, and standard unified processing, the reference data set may include a plurality of reference data subsets, and each reference data subset corresponds to one object in the object set. That is, the static data set a and the data set c may be integrated to obtain a data set d = {attack time information, IP address, attack count information, attack type information, attack method information, attack geographical location information, attack domain name information, registered mailbox, mobile phone number, social account information, web browsing information}. Data cleaning and preprocessing may then be performed on the data set to be processed, which comes from multiple attack-case sources and is unordered; this includes statistical analysis, normalization and the like on the data from the plurality of attack cases.
Optionally, duplicate values can be handled with the drop_duplicates method in pandas, missing values can be detected with the notnull and isnull methods, and abnormal values can be detected with a box plot. Missing values can then be deleted with dropna. Once the duplicate, missing and abnormal values have been removed, the data are converted into feature vectors of the same dimension.
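The cleaning steps named above might look as follows in pandas. This is a minimal sketch on invented data: the column names and values are hypothetical, the box plot is replaced by its usual numerical rule (values beyond 1.5 times the interquartile range are treated as abnormal), and one-hot encoding is an assumed way of reaching feature vectors of equal dimension.

```python
import pandas as pd

# Invented sample of the data set to be processed (column names are hypothetical).
raw = pd.DataFrame({
    "object_code": ["003", "003", "004", "005", "006", "007", "008"],
    "attack_count": [12, 12, 15, None, 11, 14, 900],
    "attack_type": ["sqli", "sqli", "xss", "xss", "sqli", "xss", "sqli"],
})

# Repeated values: drop rows that are exact duplicates of an earlier row.
deduped = raw.drop_duplicates()

# Missing values: isnull() flags them, dropna() removes the affected rows.
missing_mask = deduped["attack_count"].isnull()
complete = deduped.dropna(subset=["attack_count"])

# Abnormal values: the numerical rule behind a box plot's whiskers.
q1 = complete["attack_count"].quantile(0.25)
q3 = complete["attack_count"].quantile(0.75)
iqr = q3 - q1
in_range = complete["attack_count"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = complete[in_range]

# Uniform standard: one-hot encode the categorical column so that every
# object ends up with a feature vector of the same dimension.
features = pd.get_dummies(
    cleaned[["attack_count", "attack_type"]], columns=["attack_type"]
)
```

In this sample the duplicated row for object 003, the row with the missing count, and the row with a count of 900 are all removed before encoding.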
In the embodiment of the present application, after determining the reference data subset corresponding to each object, the target tag information of each object may be determined according to the reference data subset corresponding to each object in the reference data set. That is, whether an object is an attacker or a white hat can be determined according to attack time information, an IP address, attack frequency information, attack type information, attack method information, attack geographical location information, attack domain name information, a registered mailbox, a mobile phone number, social account information, and network browsing information corresponding to each object.
In an alternative embodiment, a deep neural network model may be written with Keras on the TensorFlow backend to determine the target tag information for each object. Optionally, the deep neural network model may include an input layer, hidden layers, and an output layer. The input layer can have 4 neurons, there can be two hidden layers with 5 and 6 neurons respectively, and the output layer can have 2 neurons. The activation function can be the ReLU function, the loss function can be cross entropy, and the iterative optimizer can optionally be the Adam optimization algorithm. Initially, the connection weights and biases of each layer are randomly generated.
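To make the layer sizes concrete, the following is a framework-free sketch of a forward pass through a 4-5-6-2 network with ReLU hidden layers, random initial weights, and a cross-entropy loss. It is not the Keras model itself: the softmax output, the uniform initialization range, and the sample input are assumptions, and training with Adam is omitted.

```python
import math
import random
from typing import List, Tuple

LAYER_SIZES = [4, 5, 6, 2]  # input, two hidden layers, output (from the text)
Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def init_params(sizes: List[int], seed: int = 0) -> List[Layer]:
    """Randomly generate connection weights and biases for each layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        params.append((weights, biases))
    return params


def relu(values: List[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def softmax(values: List[float]) -> List[float]:
    # Assumed output activation; subtracting the max keeps exp() stable.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(params: List[Layer], inputs: List[float]) -> List[float]:
    """Propagate one feature vector through the network."""
    activation = list(inputs)
    for index, (weights, biases) in enumerate(params):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(params) - 1
        activation = softmax(z) if is_output else relu(z)
    return activation


def cross_entropy(probabilities: List[float], true_class: int) -> float:
    """Cross-entropy loss for a single sample with a known class index."""
    return -math.log(probabilities[true_class])


params = init_params(LAYER_SIZES)
probs = forward(params, [0.1, 0.5, 0.0, 1.0])  # four numerically encoded features
loss = cross_entropy(probs, true_class=0)
```

The two outputs can be read as the probabilities of the two candidate labels; an optimizer such as Adam would adjust the weights to reduce the loss over labelled historical cases.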
In an optional implementation manner, the IP address information, the attack frequency information, the attack type information, and the attack method information of the object may be input into the deep neural network model, which outputs candidate tag information for the object, that is, whether the attack operation of the object is manual or automated. For example, when the attack frequency information is 1 time/second, it may be determined that the attack operation of the object is automated, i.e., a scanning operation of a scanner. When it is determined that the attack operation of the object is manual, the target tag information of the object, that is, whether the identity of the object is an attacker or an engineer, and specifically which attacker organization or engineer team the object belongs to, can be determined according to attack time information, an IP address, attack frequency information, attack type information, attack method information, attack geographical location information, attack domain name information, a registered mailbox, a mobile phone number, social account information, and web browsing information.
Optionally, attack time information, an IP address, attack frequency information, attack type information, attack method information, attack geographical location information, attack domain name information, a registered mailbox, a mobile phone number, social account information, and web browsing information of the object may be matched against historical attack cases. If the degree of match with the attack data of attackers is higher than that with the attack data of white hats, the target tag information of the object may be determined to be attacker; if the degree of match with the attack data of attackers is lower than that with the attack data of white hats, the target tag information of the object may be determined to be white hat.
Optionally, the attack time information, the IP address, the attack frequency information, the attack type information, the attack method information, the attack geographical location information, the attack domain name information, the registered mailbox, and the mobile phone number of the object may be matched against historical attack cases. If the degree of match with the attack data of attackers is higher than that with the attack data of white hats, the tag information of the object may be determined to be candidate attacker; if the degree of match with the attack data of attackers is lower than that with the attack data of white hats, the tag information of the object may be determined to be candidate white hat. The target tag information of the object is then further determined according to the social account information and the web browsing information of the object. For example, when the QQ account of a candidate attacker includes a large number of friends whose identity has been determined to be attacker, the target tag information of the candidate attacker is determined to be attacker. Further, when the QQ account of the candidate attacker includes a large number of friends determined to belong to attacker organization A, the target tag information of the candidate attacker may be determined to be attacker belonging to attacker organization A.
By adopting the method for determining the target tag provided by the embodiment of the application, the target tag of the object is determined by combining the attack data set, the social account information, and the web browsing information, so that the tag information of the object is enriched, the user profile is improved, and the identity information of the object is determined more accurately. Moreover, crawling the social account information and web browsing information of the object on the basis of an intelligence database reduces the complexity of collecting tracing information and saves a large amount of human resources and time.
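The comparison of matching degrees described above can be sketched as follows. The text does not define how the degree of match is computed, so the measure used here (the fraction of an object's fields that agree with the closest historical case) is an assumption, as are the field names, the sample cases, and the "undetermined" result for a tie.

```python
def match_degree(record, cases):
    """Highest fraction of fields in `record` that agree with any single case."""
    if not record:
        return 0.0
    best = 0.0
    for case in cases:
        hits = sum(1 for key, value in record.items()
                   if key in case and case[key] == value)
        best = max(best, hits / len(record))
    return best


def label_object(record, attacker_cases, white_hat_cases):
    """Label an object by whichever group of historical cases it matches better."""
    attacker_score = match_degree(record, attacker_cases)
    white_hat_score = match_degree(record, white_hat_cases)
    if attacker_score > white_hat_score:
        return "attacker"
    if attacker_score < white_hat_score:
        return "white hat"
    return "undetermined"  # a tie is not covered by the text


# Hypothetical historical cases and one object to be labelled.
attacker_cases = [{"ip": "203.0.113.7", "attack_type": "sqli",
                   "attack_method": "union-based", "geo": "City Y"}]
white_hat_cases = [{"ip": "198.51.100.2", "attack_type": "sqli",
                    "attack_method": "time-based", "geo": "City X"}]
record = {"ip": "203.0.113.7", "attack_type": "sqli",
          "attack_method": "union-based", "geo": "City X"}

label = label_object(record, attacker_cases, white_hat_cases)
```

In this toy example the object agrees with the attacker case on three of four fields and with the white-hat case on two, so it is labelled as an attacker.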
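The second stage described above, in which a candidate label is confirmed from the identities of the object's contacts, might look like the sketch below. The text only says "a large number of friends", so the `min_share` fraction used as the threshold is an assumption, as are the account identifiers and the structure of the `known` lookup.

```python
from collections import Counter


def refine_label(candidate, friends, known, min_share=0.5):
    """Confirm a candidate attacker from the identities of its contacts.

    `known` maps a friend account to (identity, organization or None).
    `min_share` is an assumed stand-in for the text's "large number".
    Returns (label, organization or None).
    """
    if candidate != "candidate attacker" or not friends:
        return candidate, None
    attacker_orgs = [known[f][1] for f in friends
                     if f in known and known[f][0] == "attacker"]
    if len(attacker_orgs) / len(friends) < min_share:
        return candidate, None
    # Attribute the object to an organization only if enough friends share it.
    org_counts = Counter(org for org in attacker_orgs if org is not None)
    if org_counts:
        top_org, count = org_counts.most_common(1)[0]
        if count / len(friends) >= min_share:
            return "attacker", top_org
    return "attacker", None


# Hypothetical friend list and previously determined identities.
friends = ["q1", "q2", "q3", "q4"]
known = {
    "q1": ("attacker", "A"),
    "q2": ("attacker", "A"),
    "q3": ("attacker", "B"),
}

result = refine_label("candidate attacker", friends, known)
```

With three of four friends already identified as attackers, and two of them in organization A, the candidate is confirmed as an attacker attributed to organization A under the assumed threshold.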
Fig. 4 is a schematic structural diagram of an apparatus for determining a target tag provided in an embodiment of the present application. As shown in fig. 4, the apparatus may include:
the obtaining module 401 is configured to obtain an object set and an attack data set; the objects in the object set correspond to the attack data subsets in the attack data set one by one;
the first determining module 403 is configured to determine, according to the attack data set, an associated data set corresponding to the object set; the objects in the object set correspond to associated data subsets in the associated data set one by one, and the associated data subsets comprise social account information and network browsing information of each object;
the second determining module 405 is configured to determine target tag information of each object in the object set according to the attack data set and the associated data set.
In this embodiment of the application, the first determining module 403 may be configured to determine, according to the attack data set, the associated data set corresponding to the object set from a database.
In the embodiment of the application, the attack data subset comprises attack time information, attack protocol address information, attack times information, attack type information and attack method information of each object.
In this embodiment of the application, the first determining module 403 may include:
the first determining submodule is used for determining attack domain name information and attack geographical position information of each object according to the attack protocol address information of each object in the attack data subset;
the second determining submodule is used for determining the communication account information of each object according to the attack domain name information and the attack geographical position information of each object;
and the third determining submodule is used for determining the social account information and the network browsing information of each object according to the communication account information of each object to obtain an associated data set corresponding to the object set.
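The chain formed by the three submodules above (address, then domain name and location, then communication account, then social and browsing information) can be sketched as a series of lookups. The dictionaries below are hypothetical in-memory stand-ins for whatever database the apparatus actually queries, and every key, value, and function name is an assumption made for illustration.

```python
# Hypothetical stand-ins for the database queried by the submodules.
IP_INFO = {
    "203.0.113.7": {"domain": "example.test", "geo": "City X"},
}
ACCOUNT_BY_DOMAIN_GEO = {
    ("example.test", "City X"): {"mailbox": "user@example.test",
                                 "phone": "000-0000"},
}
PROFILE_BY_MAILBOX = {
    "user@example.test": {"social_accounts": ["qq:10001"],
                          "browsing": ["forum.example.test"]},
}


def build_associated_subset(ip):
    """Follow the three lookups for one object; None if any step fails."""
    # First submodule: address -> attack domain name and geographical location.
    info = IP_INFO.get(ip)
    if info is None:
        return None
    # Second submodule: domain name and location -> communication account.
    account = ACCOUNT_BY_DOMAIN_GEO.get((info["domain"], info["geo"]))
    if account is None:
        return None
    # Third submodule: communication account -> social and browsing info.
    profile = PROFILE_BY_MAILBOX.get(account["mailbox"])
    if profile is None:
        return None
    return {
        "social_account_info": profile["social_accounts"],
        "web_browsing_info": profile["browsing"],
    }


def build_associated_data_set(attack_data_set):
    """Keep the one-to-one mapping between objects and associated subsets."""
    return {obj: build_associated_subset(subset["ip"])
            for obj, subset in attack_data_set.items()}


attack_data_set = {"object-1": {"ip": "203.0.113.7"},
                   "object-2": {"ip": "192.0.2.9"}}
associated = build_associated_data_set(attack_data_set)
```

Returning `None` when a lookup fails keeps one entry per object, so the one-to-one correspondence between objects and associated data subsets is preserved even when no information is found.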
In this embodiment, the second determining module 405 may include:
the fourth determining submodule is used for determining a data set to be processed according to the attack data set and the associated data set;
the data processing submodule is used for carrying out data processing on the data set to be processed to obtain a reference data set; the data processing comprises repeated value processing, missing value processing, abnormal value processing and standard unified processing, and objects in the object set correspond to reference data subsets in the reference data set one by one;
and the fifth determining submodule is used for determining the target label information of each object according to the reference data subset corresponding to each object.
The device and method embodiments in the embodiments of the present application are based on the same application concept.
By adopting the device for determining the target tag, the target tag of the object is determined by combining the attack data set, the social account information, and the web browsing information, so that the tag information of the object is enriched, the user profile is improved, and the identity information of the object is determined more accurately. Moreover, crawling the social account information and web browsing information of the object on the basis of an intelligence database reduces the complexity of collecting tracing information and saves a large amount of human resources and time.
The present application further provides an electronic device, which may be disposed in a server and includes a processor and a memory. The memory stores at least one instruction, at least one program, a code set, or a set of instructions related to the method for determining a target tag in the method embodiments, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded from the memory and executed by the processor to implement the method for determining a target tag.
The present application further provides a storage medium, which may be disposed in a server to store at least one instruction, at least one program, a code set, or a set of instructions related to implementing the method for determining a target tag in the method embodiments, where the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the method for determining a target tag.
Optionally, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to, various media that can store program code, such as a USB flash drive, a Read-Only Memory (ROM), a removable hard disk, a magnetic disk, or an optical disk.
As can be seen from the embodiments of the target tag determination method, the target tag determination apparatus, the electronic device, and the storage medium provided by the present application, the method in the present application includes: acquiring an object set and an attack data set, where objects in the object set correspond one to one to attack data subsets in the attack data set; determining an associated data set corresponding to the object set according to the attack data set, where the objects in the object set correspond one to one to associated data subsets in the associated data set, and the associated data subsets include social account information and web browsing information of each object; and determining target tag information of each object in the object set according to the attack data set and the associated data set. Based on the embodiment of the application, the target tag of the object is determined by combining the attack data set, the social account information, and the web browsing information, so that the tag information of the object is enriched, the user profile is improved, and the identity information of the object is determined more accurately. Moreover, crawling the social account information and web browsing information of the object on the basis of an intelligence database reduces the complexity of collecting tracing information and saves a large amount of human resources and time.
In the present application, unless otherwise expressly stated or limited, the terms "connected" and "connection" are to be construed broadly: a connection may be fixed, removable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intervening medium, or internal between two elements. The specific meanings of the above terms in the present application can be understood by those skilled in the art according to specific situations.
It should be noted that the order of the foregoing embodiments of the present application is for description only and does not indicate that any embodiment is superior or inferior. Specific embodiments are described in the specification, and other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or a sequential order, to achieve the desired results; in some embodiments, multitasking and parallel processing are also possible or may be advantageous.
All the embodiments in the present specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device embodiment is based on an embodiment similar to the method, its description is relatively brief, and the relevant points can be found in the corresponding description of the method embodiment.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (7)

1. A method for determining a target tag, comprising:
acquiring an object set and an attack data set; the objects in the object set correspond to attack data subsets in the attack data set one by one, and the object set comprises hackers and white hats; the attack data subset comprises attack time information, attack protocol address information, attack frequency information, attack type information and attack method information of each object;
determining an associated data set corresponding to the object set according to the attack data set; the objects in the object set correspond to associated data subsets in the associated data set one by one, and the associated data subsets comprise social account information and network browsing information of each object;
determining target label information of each object in the object set according to the attack data set and the associated data set;
determining an associated data set corresponding to the object set according to the attack data set, including:
determining attack domain name information and attack geographical position information of each object according to the attack protocol address information of each object in the attack data subset;
determining communication account information of each object according to the attack domain name information and the attack geographical position information of each object;
and determining the social account information and the network browsing information of each object according to the communication account information of each object to obtain the associated data set corresponding to the object set.
2. The method according to claim 1, wherein the determining, according to the attack data set, an associated data set corresponding to the object set comprises:
and determining the associated data set corresponding to the object set from a database according to the attack data set.
3. The method of claim 1, wherein determining target tag information for each object in the object set according to the attack data set and the associated data set comprises:
determining a data set to be processed according to the attack data set and the associated data set;
performing data processing on the data set to be processed to obtain a reference data set; the data processing comprises repeated value processing, missing value processing, abnormal value processing and standard unified processing, and objects in the object set correspond to reference data subsets in the reference data set one by one;
and determining the target label information of each object according to the reference data subset corresponding to each object.
4. An apparatus for determining a target tag, comprising:
the acquisition module is used for acquiring an object set and an attack data set; the objects in the object set correspond to attack data subsets in the attack data set one by one, and the object set comprises hackers and white hats; the attack data subset comprises attack time information, attack protocol address information, attack frequency information, attack type information and attack method information of each object;
the first determining module is used for determining an associated data set corresponding to the object set according to the attack data set; the objects in the object set correspond to associated data subsets in the associated data set one by one, and the associated data subsets comprise social account information and network browsing information of each object;
the first determining module includes:
a first determining submodule, configured to determine attack domain name information and attack geographical location information of each object according to the attack protocol address information of each object in the attack data subset;
a second determining submodule, configured to determine, according to the attack domain name information and the attack geographical location information of each object, communication account information of each object;
a third determining submodule, configured to determine, according to the communication account information of each object, the social account information and the web browsing information of each object, to obtain the associated data set corresponding to the object set;
and the second determining module is used for determining the target label information of each object in the object set according to the attack data set and the associated data set.
5. The apparatus of claim 4,
the first determining module is configured to determine the associated data set corresponding to the object set from a database according to the attack data set.
6. An electronic device, comprising a processor and a memory, wherein at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and wherein the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the method for determining a target tag according to any one of claims 1-3.
7. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method for determining a target tag according to any one of claims 1-3.
CN202111529872.3A 2021-12-15 2021-12-15 Method and device for determining target label, electronic equipment and storage medium Active CN113918795B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111529872.3A CN113918795B (en) 2021-12-15 2021-12-15 Method and device for determining target label, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111529872.3A CN113918795B (en) 2021-12-15 2021-12-15 Method and device for determining target label, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113918795A CN113918795A (en) 2022-01-11
CN113918795B true CN113918795B (en) 2022-04-12

Family

ID=79248929

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111529872.3A Active CN113918795B (en) 2021-12-15 2021-12-15 Method and device for determining target label, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113918795B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113645253A (en) * 2021-08-27 2021-11-12 杭州安恒信息技术股份有限公司 Attack information acquisition method, device, equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020107446A1 (en) * 2018-11-30 2020-06-04 北京比特大陆科技有限公司 Method and apparatus for obtaining attacker information, device, and storage medium
US11146581B2 (en) * 2018-12-31 2021-10-12 Radware Ltd. Techniques for defending cloud platforms against cyber-attacks
CN109729095B (en) * 2019-02-13 2021-08-24 奇安信科技集团股份有限公司 Data processing method, data processing device, computing equipment and media
CN113055386B (en) * 2021-03-12 2023-03-24 安天科技集团股份有限公司 Method and device for identifying and analyzing attack organization

Also Published As

Publication number Publication date
CN113918795A (en) 2022-01-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant