CN112491864A - Method, device, equipment and medium for detecting phishing deep victim user - Google Patents

Info

Publication number
CN112491864A
Authority
CN
China
Prior art keywords
user
victim
domain name
information
website
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011322821.9A
Other languages
Chinese (zh)
Inventor
陈扬
侯立冬
王伟
常宁
梁彧
田野
傅强
王杰
杨满智
蔡琳
金红
陈晓光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eversec Beijing Technology Co Ltd
Original Assignee
Eversec Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eversec Beijing Technology Co Ltd
Priority to CN202011322821.9A
Publication of CN112491864A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • H04L63/0236 Filtering by address, protocol, port number or service, e.g. IP-address or URL
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/101 Access control lists [ACL]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general

Abstract

An embodiment of the invention relates to a method, an apparatus, an electronic device and a storage medium for detecting phishing deep victim users. The method comprises: generating a ticket file from a user's internet traffic data and extracting the domain name information in the ticket file; matching the domain name information against a domain name whitelist library to filter out tickets whose domain name information is normal; performing rule matching against the websites/APPs used to carry out phishing; for each successfully matched user, performing network access behavior restoration and log backtracking text analysis on that victim user's access behavior log for the phishing website/APP to determine the user's operation behavior in the course of being deceived; inputting the operation behavior into a pre-trained fraud event hierarchical classification model to obtain victim degree information; and determining deep victim users according to the victim degree information of each user, so that deep victim users can be located accurately.

Description

Method, device, equipment and medium for detecting phishing deep victim user
Technical Field
The embodiment of the invention relates to the technical field of Internet fraud prevention application, in particular to a method, a device, electronic equipment and a storage medium for detecting phishing deep victim users.
Background
To clean up the network environment, illegal and criminal activities such as hacking, telecommunications network fraud and infringement of citizens' personal privacy are being cracked down on severely in accordance with the law, the profit chain of cybercrime is being cut off, and sustained high pressure is being maintained so as to protect the legitimate rights and interests of the public.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method, an apparatus, an electronic device and a storage medium for detecting phishing deep victim users, so as to locate deep victim users accurately.
Additional features and advantages of embodiments of the invention will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of embodiments of the invention.
In a first aspect of the present disclosure, an embodiment of the present invention provides a method for detecting phishing deep victim users, comprising:
generating a ticket file according to the internet traffic data of the user, and extracting domain name information in the ticket file, wherein the ticket file at least comprises the domain name information and the MSISDN of the user;
matching the domain name information against a domain name whitelist library so as to filter out tickets whose domain name information is normal;
carrying out rule matching on the domain name information in the filtered ticket files according to the websites/APPs used to carry out phishing, and determining a phishing victim user set according to the MSISDN of the user in the ticket file corresponding to each successfully matched piece of domain name information;
for any victim user in the victim user set, performing network access behavior restoration and log backtracking text analysis on that user's access behavior log for the phishing website/APP to determine the user's operation behavior in the course of being deceived, inputting the operation behavior into a pre-trained fraud event hierarchical classification model, and obtaining the victim degree information of the user output by the fraud event hierarchical classification model;
and determining deep victim users according to the victim degree information of each user in the victim user set.
In an embodiment, after determining the deep victim users according to the victim degree information of each user in the victim user set, the method further includes: sending early warning reminder information to each deep victim user.
In an embodiment, the ticket file further includes the internet access time, the reporting operator, IMSI, IMEI, URL, top-level domain name, province of the mobile phone, city of the mobile phone, destination IP, country of the destination IP, and province of the destination IP.
In one embodiment, determining the user's operation behavior in the course of being deceived includes: determining, in the course of the user being deceived, the behavior of entering a bank card number, the behavior of visiting a malicious website/fraud website, the behavior of downloading a Trojan program, and the behavior of installing the Trojan program.
In one embodiment, before determining the behavior of entering a bank card number, the behavior of visiting a malicious website/fraud website, the behavior of downloading a Trojan program and the behavior of installing the Trojan program in the course of the user being deceived, the method further comprises: outputting a domain name sample from a known feature library, and performing website family information extraction, master control information extraction, control mailbox extraction, control-end IP extraction and control-end mobile phone number tracing analysis on the domain name sample to determine the malicious website/fraud website and the Trojan program.
In an embodiment, performing network access behavior restoration and log backtracking text analysis on the access behavior log of any victim user in the victim user set for the phishing website/APP comprises: for the website/APP internet access behavior logs within the time window of the fraud event, performing network access behavior restoration and log backtracking text analysis on the fraud event to determine all operation behaviors of each victim user in the course of being deceived.
In one embodiment, the fraud event classification model is trained by the following method:
acquiring a training sample set, wherein each training sample comprises the operation behavior of a user in the course of being deceived and annotation information representing the victim degree of that user;
determining an initialized fraud event classification model, wherein the initialized fraud event classification model comprises a target layer for outputting a user victimization degree;
and by utilizing a machine learning method, taking the operation behavior in the cheating process of the user in the training samples in the training sample set as the input of the initialized fraud event hierarchical classification model, taking the annotation information corresponding to the input operation behavior in the cheating process of the user as the expected output of the initialized fraud event hierarchical classification model, and training to obtain the fraud event hierarchical classification model.
In a second aspect of the present disclosure, an embodiment of the present invention further provides an apparatus for detecting a phishing deep victim user, including:
the domain name extraction unit is used for generating a ticket file according to the internet traffic data of the user and extracting domain name information in the ticket file, wherein the ticket file at least comprises the domain name information and the MSISDN of the user;
the whitelist filtering unit is used for matching the domain name information against a domain name whitelist library so as to filter out tickets whose domain name information is normal;
the rule matching unit is used for carrying out rule matching on the domain name information in the filtered ticket file according to the websites/APPs used to carry out phishing, and determining a phishing victim user set according to the MSISDN of the user in the ticket file corresponding to each successfully matched piece of domain name information;
the victim degree determining unit is used for performing, for any victim user in the victim user set, network access behavior restoration and log backtracking text analysis on that user's access behavior log for the phishing website/APP so as to determine the user's operation behavior in the course of being deceived, and inputting the operation behavior into a pre-trained fraud event hierarchical classification model to obtain the victim degree information of the user output by the fraud event hierarchical classification model;
and the deep victim user determining unit is used for determining deep victim users according to the victim degree information of each user in the victim user set.
In an embodiment, the deep victim user determining unit is further configured to send early warning reminder information to each deep victim user after determining the deep victim users according to the victim degree information of each user in the victim user set.
In an embodiment, the ticket file further includes the internet access time, the reporting operator, IMSI, IMEI, URL, top-level domain name, province of the mobile phone, city of the mobile phone, destination IP, country of the destination IP, and province of the destination IP.
In an embodiment, the victim degree determining unit being used for determining the user's operation behavior in the course of being deceived includes: determining, in the course of the user being deceived, the behavior of entering a bank card number, the behavior of visiting a malicious website/fraud website, the behavior of downloading a Trojan program, and the behavior of installing the Trojan program.
In an embodiment, the victim degree determining unit is further configured to, before determining the behavior of entering a bank card number, the behavior of visiting a malicious website/fraud website, the behavior of downloading a Trojan program and the behavior of installing the Trojan program in the course of the user being deceived: output a domain name sample from a known feature library, and perform website family information extraction, master control information extraction, control mailbox extraction, control-end IP extraction and control-end mobile phone number tracing analysis on the domain name sample to determine the malicious website/fraud website and the Trojan program.
In an embodiment, the victim degree determining unit being configured to perform network access behavior restoration and log backtracking text analysis on the access behavior log of any victim user in the victim user set for the phishing website/APP includes: being used for performing, for the website/APP internet access behavior logs within the time window of the fraud event, network access behavior restoration and log backtracking text analysis on the fraud event so as to determine all operation behaviors of each victim user in the course of being deceived.
In one embodiment, the fraud event classification model is trained by the following modules:
a sample acquisition module for acquiring a training sample set, wherein each training sample comprises the operation behavior of a user in the course of being deceived and annotation information representing the victim degree of that user;
a model determination module for determining an initialized fraud event classification model, wherein said initialized fraud event classification model comprises a target layer for outputting a user victimization degree;
and a model training module for using a machine learning method to take the operation behavior, in the course of being deceived, of the user in each training sample of the training sample set as the input of the initialized fraud event hierarchical classification model, to take the annotation information corresponding to the input operation behavior as the expected output of the initialized fraud event hierarchical classification model, and to train the model to obtain the fraud event hierarchical classification model.
In a third aspect of the disclosure, an electronic device is provided. The electronic device includes: a processor; and a memory for storing executable instructions that, when executed by the processor, cause the electronic device to perform the method of the first aspect.
In a fourth aspect of the disclosure, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the method in the first aspect.
The technical scheme provided by the embodiment of the invention has the beneficial technical effects that:
The embodiment of the invention generates a ticket file from a user's internet traffic data and extracts the domain name information in the ticket file; matches the domain name information against a domain name whitelist library to filter out tickets whose domain name information is normal; performs rule matching against the websites/APPs used to carry out phishing; for each successfully matched user, performs network access behavior restoration and log backtracking text analysis on that victim user's access behavior log for the phishing website/APP to determine the user's operation behavior in the course of being deceived; inputs the operation behavior into a pre-trained fraud event hierarchical classification model to obtain victim degree information; and determines deep victim users according to the victim degree information of each user, so that deep victim users can be located accurately.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly described below, and it is obvious that the drawings in the following description are only a part of the embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the contents of the embodiments of the present invention and the drawings without creative efforts.
FIG. 1 is a schematic flow chart diagram illustrating a method for detecting phishing deep victim users according to an embodiment of the present invention;
FIG. 1a is a schematic flow chart of a training method of a fraud event classification model according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart diagram illustrating another method for detecting phishing deep victim users provided in accordance with an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an apparatus for detecting phishing deep victim users provided according to an embodiment of the present invention;
FIG. 3a is a schematic structural diagram of a training apparatus of a fraud event classification model according to an embodiment of the present invention;
FIG. 4 shows a schematic diagram of an electronic device suitable for use in implementing embodiments of the present invention.
Detailed Description
In order to make the technical problems solved, the technical solutions adopted and the technical effects achieved by the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be described in further detail below with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments, but not all embodiments, of the embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, belong to the scope of protection of the embodiments of the present invention.
It should be noted that the terms "system" and "network" are often used interchangeably herein in embodiments of the present invention. Reference to "and/or" in embodiments of the invention is intended to include any and all combinations of one or more of the associated listed items. The terms "first", "second", and the like in the description and claims of the present disclosure and in the drawings are used for distinguishing between different objects and not for limiting a particular order.
It should be further noted that, in the embodiments of the present invention, each of the following embodiments may be executed alone, or may be executed in combination with each other, and the embodiments of the present invention are not limited in this respect.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
The technical solutions of the embodiments of the present invention are further described by the following detailed description with reference to the accompanying drawings.
FIG. 1 shows a flow chart of a method for detecting phishing deep victim users provided by an embodiment of the present invention. The method is applicable to detecting deep victim users of phishing in a network and can be executed by an apparatus for detecting phishing deep victim users configured in an electronic device. As shown in FIG. 1, the method for detecting phishing deep victim users described in this embodiment comprises:
In step S110, a ticket file is generated according to the internet traffic data of the user, and domain name information in the ticket file is extracted, where the ticket file at least includes the domain name information and the MSISDN of the user.
After the user internet traffic data is obtained, the internet traffic can be analyzed, cleaned, sorted and counted to generate a unified ticket file.
For example, the internet traffic data includes, or can be used to derive, information such as the time, reporting operator, IMSI, IMEI, URL, top-level domain name, province of the mobile phone, city of the mobile phone, destination IP, country of the destination IP, province of the destination IP, and mobile phone number.
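As an illustration of the unified ticket file described above, the following minimal sketch models one ticket as a Python dataclass and parses it from a delimited line. The field names, the pipe delimiter and the column order are assumptions made for this example only; the description lists which pieces of information a ticket contains but does not specify a file layout.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Ticket:
    # Field names and order are hypothetical; only the listed contents come from the text.
    access_time: str
    operator: str
    imsi: str
    imei: str
    url: str
    top_level_domain: str
    phone_province: str
    phone_city: str
    dest_ip: str
    dest_ip_country: str
    dest_ip_province: str
    msisdn: str


def parse_ticket(line: str, sep: str = "|") -> Ticket:
    """Parse one delimited line into a Ticket, rejecting malformed lines."""
    parts = [p.strip() for p in line.rstrip("\n").split(sep)]
    expected = len(fields(Ticket))
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}")
    return Ticket(*parts)
```

Rejecting lines with the wrong number of fields is one simple way to perform the "cleaning" mentioned above before any matching takes place.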
In step S120, the domain name information is matched against the domain name whitelist library to filter out tickets whose domain name information is normal.
The domain name information in the ticket file can be extracted and matched against the domain name whitelist library, and the data for normal domain names is filtered out.
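The whitelist step can be sketched as follows. This is a minimal example under stated assumptions: tickets are represented as dictionaries with a `url` key, the whitelist is an in-memory set, and a domain counts as whitelisted when it or any of its parent domains appears in the set. None of these details are specified in the description.

```python
from urllib.parse import urlsplit


def extract_domain(url: str) -> str:
    """Return the lower-cased host name of a URL (scheme optional)."""
    if "//" not in url:
        url = "//" + url  # let urlsplit treat a bare host as a network location
    host = urlsplit(url).hostname or ""
    return host.rstrip(".").lower()


def is_whitelisted(domain: str, whitelist: set) -> bool:
    """True if the domain or any parent domain is in the whitelist."""
    labels = domain.split(".")
    return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))


def filter_tickets(tickets, whitelist):
    """Keep only tickets whose domain is NOT whitelisted."""
    return [
        t for t in tickets
        if not is_whitelisted(extract_domain(t["url"]), whitelist)
    ]
```

Matching on whole labels from the right means a look-alike such as `bank.example.evil.test` is not mistaken for the whitelisted `bank.example`.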
In step S130, the domain name information in the filtered ticket file is subjected to rule matching according to the website/APP for performing phishing, and a phishing victim user set is determined according to the MSISDN of the user in the ticket file corresponding to the domain name information that is successfully matched.
This step performs website rule matching and APP rule matching, or only one of the two, on the domain name information that did not match the whitelist; if the matching succeeds, the MSISDN in the access log is extracted.
To collect more information about the deception, access content in the access log such as the IMSI, access time, number of accesses, URL and page text content, as well as the user's victim degree information, can also be extracted.
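A minimal sketch of the rule matching and victim set construction is given below. It assumes that website rules are regular expressions applied to the domain name and that APP rules are regular expressions applied to the URL; the rule format, the dictionary keys and the function name `match_victims` are all hypothetical, since the description does not define how rules are expressed.

```python
import re


def match_victims(tickets, site_rules, app_rules):
    """Return {msisdn: [matched tickets]} for tickets that hit any rule."""
    site_res = [re.compile(p) for p in site_rules]
    app_res = [re.compile(p) for p in app_rules]
    victims = {}
    for t in tickets:
        hit = any(r.search(t["domain"]) for r in site_res) or any(
            r.search(t["url"]) for r in app_res
        )
        if hit:
            # Group by MSISDN so each victim user appears once in the set.
            victims.setdefault(t["msisdn"], []).append(t)
    return victims
```

Keeping the matched tickets alongside each MSISDN preserves the access time, URL and similar details for the later behavior analysis.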
In step S140, for any victim user in the victim user set, network access behavior restoration and log backtracking text analysis are performed on that user's access behavior log for the phishing website/APP to determine the user's operation behavior in the course of being deceived; the operation behavior is input into a pre-trained fraud event hierarchical classification model, and the victim degree information of the user output by the fraud event hierarchical classification model is obtained.
The network access behavior restoration and log backtracking text analysis can be carried out in various ways. For example, they can be performed on the website/APP internet access behavior logs within the time window of the fraud event to determine all operation behaviors of each victim user in the course of being deceived. Specifically, this may include: for the internet access behavior logs of the fraud website within the time window of the fraud event, performing network access behavior restoration and log backtracking text analysis on the fraud event; retrieving information such as the bank card numbers used in the fraud, the malicious websites/fraud websites and the Trojan programs; determining all operation behaviors in the course of the deception; and sorting the operation behaviors in chronological order.
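The time-window restriction and chronological ordering described above can be sketched as follows. The log entry format, the URL markers in `ACTION_MARKERS` and the behavior names are purely illustrative assumptions; how the text analysis actually recognizes each behavior is not specified in the description.

```python
from datetime import datetime

# Hypothetical mapping from URL fragments to operation behaviors.
ACTION_MARKERS = [
    ("/submit_card", "enter_bank_card"),
    (".apk", "download_trojan"),
    ("/install_ok", "install_trojan"),
]


def restore_behaviors(log, start, end):
    """Return one user's behaviors inside [start, end], ordered by time."""
    events = []
    for entry in log:
        ts = datetime.fromisoformat(entry["time"])
        if not (start <= ts <= end):
            continue  # outside the fraud event time window
        action = "visit_fraud_site"  # default: a plain page visit
        for marker, name in ACTION_MARKERS:
            if marker in entry["url"]:
                action = name
                break
        events.append((ts, action))
    events.sort(key=lambda e: e[0])
    return [action for _, action in events]
```

The resulting ordered list is the kind of operation behavior sequence that could then be passed to the classification model.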
Fig. 1a is a schematic flow chart of a training method of a fraud event classification model provided according to an embodiment of the present invention. As shown in fig. 1a, the fraud event classification model is trained as follows:
In step S141, a training sample set is obtained, wherein each training sample includes the operation behavior of a user in the course of being deceived and annotation information indicating the victim degree of that user.
In step S142, an initialized fraud event classification model is determined, wherein the initialized fraud event classification model comprises a target layer for outputting a user victim level.
In step S143, using a machine learning method, the fraud event classification model is trained by taking the operation behavior in the fraud process of the user in the training samples in the training sample set as an input of the initialized fraud event classification model, and taking the annotation information corresponding to the input operation behavior in the fraud process of the user as an expected output of the initialized fraud event classification model.
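The training steps S141 to S143 can be illustrated with a deliberately simple stand-in. The description does not name a model type beyond a "target layer" that outputs the victim degree, so the nearest-neighbor classifier below is a swapped-in technique chosen only because it is short and self-contained; the behavior names, the binary encoding and the class name `NearestNeighborGrader` are assumptions.

```python
BEHAVIORS = ["visit_fraud_site", "enter_bank_card", "download_trojan", "install_trojan"]


def encode(behaviors):
    """Binary feature vector: 1 if the behavior occurred, else 0."""
    seen = set(behaviors)
    return tuple(1 if b in seen else 0 for b in BEHAVIORS)


class NearestNeighborGrader:
    """Stand-in classifier: predicts the label of the closest training sample."""

    def fit(self, samples, labels):
        # S141/S143: store each (behaviors, annotated victim degree) pair.
        self._data = [(encode(s), y) for s, y in zip(samples, labels)]
        return self

    def predict(self, behaviors):
        x = encode(behaviors)

        def hamming(item):
            return sum(a != b for a, b in zip(item[0], x))

        # Ties resolve to the earliest training sample.
        return min(self._data, key=hamming)[1]
```

For instance, after fitting on samples labelled 1, 3 and 4, a user who both downloaded and installed a Trojan program is assigned the label of the closest stored sample.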
In step S150, deep victim users are determined according to the victim degree information of each user in the victim user set.
The victim degree can be determined according to predetermined rules. For example, for a fraud website, the user's victim degree can be defined, from low to high, as: the user opens the fraud website and closes it within seconds; the user opens the fraud website and closes it after browsing for some time; the user opens the fraud website and closes it after entering non-sensitive information; the user opens the fraud website and closes it after entering sensitive information such as a bank card number but without entering a password; and the user opens the fraud website and enters both the bank card number and the password.
For a fraud APP, the user's victim degree can be defined as: the user downloads the fraud APP installation package but does not install it; the user installs the fraud APP but does not use it; the user uses the fraud APP but does not enter any information; the user uses the fraud APP and enters non-sensitive information; the user uses the fraud APP and enters sensitive information such as a bank card account number; and the user uses the fraud APP and enters sensitive information such as a bank card account number and a payment password.
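The website levels listed above can be written as a small rule function, followed by a selection of deep victim users. This is a sketch under stated assumptions: the numeric levels 1 to 5, the `quick_close_seconds` cut-off and the `threshold` that separates deep victims from other victims are all hypothetical values, since the description gives only the ordering of the levels.

```python
def website_victim_level(dwell_seconds, entered_info, entered_card,
                         entered_password, quick_close_seconds=5):
    """Map one user's behavior on a fraud website to a level from 1 to 5."""
    if entered_card and entered_password:
        return 5  # bank card number and password both entered
    if entered_card:
        return 4  # sensitive information entered, but no password
    if entered_info:
        return 3  # only non-sensitive information entered
    if dwell_seconds > quick_close_seconds:
        return 2  # browsed for a while, then closed
    return 1      # closed within seconds


def deep_victims(levels, threshold=4):
    """Return the users whose level is at or above the assumed threshold."""
    return sorted(user for user, lvl in levels.items() if lvl >= threshold)
```

An analogous function for the fraud APP levels would follow the same pattern with six levels instead of five.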
Furthermore, a domain name sample can be output from a known feature library, and tracing analysis can be performed on the sample's website family information, master control information, control mailbox, control-end IP and control-end mobile phone number. Through characteristic fingerprint analysis, for example of the registration/control mailbox or the contact mailbox used to register a domain name, other domain names and mailboxes registered with the same information can be traced; combined with other identity and location information acquired by a middling original library, this provides the capability to extract and restore website access behavior.
On this basis, correlation analysis can be performed on the various mutually independent data, and an analysis method for internet fraud "deep victims" is formed by establishing a fraud information identification model, a fraud event classification model and a victim user profile model.
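The idea of tracing other domain names through a shared registration mailbox can be sketched as a simple grouping. The input format, a list of `(domain, mailbox)` pairs, and the function name `related_domains` are assumptions for illustration; the description does not say where the registration data comes from or how it is stored.

```python
from collections import defaultdict


def related_domains(seed_domain, registrations):
    """Return domains that share a registration mailbox with the seed domain."""
    by_mailbox = defaultdict(set)
    for domain, mailbox in registrations:
        # Compare case-insensitively so trivially different spellings still group.
        by_mailbox[mailbox.lower()].add(domain.lower())
    seed = seed_domain.lower()
    related = set()
    for domains in by_mailbox.values():
        if seed in domains:
            related |= domains
    related.discard(seed)
    return sorted(related)
```

Starting from one known fraud domain, this yields candidate domains of the same "website family" that could then be added to the matching rules.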
This embodiment generates a ticket file from a user's internet traffic data and extracts the domain name information in the ticket file; matches the domain name information against a domain name whitelist library to filter out tickets whose domain name information is normal; performs rule matching against the websites/APPs used to carry out phishing; for each successfully matched user, performs network access behavior restoration and log backtracking text analysis on that victim user's access behavior log for the phishing website/APP to determine the user's operation behavior in the course of being deceived; inputs the operation behavior into a pre-trained fraud event hierarchical classification model to obtain victim degree information; and determines deep victim users according to the victim degree information of each user, so that deep victim users can be located accurately.
Fig. 2 is a flow chart of another method for detecting phishing deep victim users according to an embodiment of the present invention, which is based on the foregoing embodiment and is optimized. As shown in fig. 2, the method for detecting phishing deep victim users of the present embodiment comprises:
In step S201, reported data is acquired, and step S202 is executed.
In step S202, invalid data is filtered, and step S203 is executed.
In step S203, a domain name is extracted, and step S204 is executed.
In step S204, it is determined whether the domain name belongs to the domain name white list library, if so, step S205 is executed, otherwise, step S206 is executed.
In step S205, the call ticket is filtered.
In step S206, it is determined whether the domain name matches the URL rule information; if yes, step S208 is executed, otherwise step S207 is executed.
In step S207, it is determined whether the domain name matches the APP rule, if yes, step S208 is executed, otherwise, step S205 is executed to filter the ticket.
In step S208, the user depth information is extracted, and step S209 is executed.
Wherein the depth information refers to user victim information.
In step S209, the user mobile phone number segment is matched, and step S210 is executed.
In step S210, the matched mobile phone number segment is stored in the database, and step S211 is executed.
In step S211, the stored mobile phone number segment is subjected to platform analysis, and the process is ended.
In step S212, domain name and APP information is acquired, and step S213 is executed.
In step S213, the rules are analyzed and integrated for use in step S206 and step S207.
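The branching in steps S202 to S210 can be summarized in a short sketch. Substring matching on the domain name, the dictionary-based ticket format and the function names are assumptions for illustration; the flow chart specifies only the order of the decisions.

```python
def process_ticket(ticket, whitelist, url_rules, app_rules):
    """Return 'filtered' or 'victim' following the S204 to S208 branches."""
    domain = ticket["domain"]
    if domain in whitelist:                          # S204 -> S205
        return "filtered"
    if any(rule in domain for rule in url_rules):    # S206 -> S208
        return "victim"
    if any(rule in domain for rule in app_rules):    # S207 -> S208
        return "victim"
    return "filtered"                                # S207 -> S205


def run_flow(tickets, whitelist, url_rules, app_rules):
    """Return the MSISDNs that would be stored for platform analysis."""
    stored = []
    for t in tickets:
        if not t.get("domain"):                      # S202: drop invalid data
            continue
        if process_ticket(t, whitelist, url_rules, app_rules) == "victim":
            stored.append(t["msisdn"])               # S209 and S210
    return stored
```

In this sketch the rule lists passed to `run_flow` play the role of the output of steps S212 and S213.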
In this embodiment, domain name information is extracted after the reported data is acquired and is matched against the domain name whitelist library so as to filter out tickets whose domain name information is normal; rule matching is then carried out according to the websites and APPs used to carry out phishing; for each successfully matched user, network access behavior restoration and log backtracking text analysis are performed on that victim user's access behavior log for the phishing website or APP so as to determine the user's operation behavior in the course of being deceived; the operation behavior is input into a pre-trained fraud event hierarchical classification model to obtain victim degree information; and deep victim users are determined according to the victim degree information of each user, so that deep victim users can be located accurately.
As an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for detecting phishing deep victim users. FIG. 3 shows a schematic structural diagram of the apparatus provided by this embodiment, which corresponds to the method embodiments shown in FIGS. 1 and 2 and can be applied in various electronic devices. As shown in FIG. 3, the apparatus for detecting phishing deep victim users of this embodiment includes a domain name extraction unit 310, a white list filtering unit 320, a rule matching unit 330, a victim degree determination unit 340 and a deep victim user determination unit 350.
The domain name extracting unit 310 is configured to generate a ticket file according to internet traffic data of a user, and extract domain name information in the ticket file, where the ticket file at least includes the domain name information and an MSISDN of the user.
The white list filtering unit 320 is configured to check the domain name information against the domain name white list library so as to filter out tickets with normal domain name information.
The rule matching unit 330 is configured to perform rule matching on the domain name information in the filtered ticket file according to the websites/APPs used to carry out phishing, and to determine a phishing victim user set according to the MSISDN of the user in the ticket file corresponding to each successfully matched piece of domain name information.
The victim degree determination unit 340 is configured to perform internet access behavior restoration and log backtracking text analysis on the access behavior log, for the phishing website/APP, of any victim user in the victim user set, so as to determine the operation behavior of the user while being deceived, to input the operation behavior into a pre-trained fraud event hierarchical classification model, and to obtain the victim degree information of the user output by the fraud event hierarchical classification model.
The deep victim user determination unit 350 is configured to determine deep victim users according to the victim degree information of each user in the victim user set.
According to one or more embodiments of the present disclosure, the deep victim user determination unit 350 is further configured to, after determining the deep victim users according to the victim degree information of each user in the victim user set, send early warning reminder information to each deep victim user.
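As a minimal sketch of how unit 350 might select deep victim users and prepare reminders: the source does not say how victim degree information is encoded or where the cut-off lies, so the integer degree scale, the `threshold` parameter, the reminder text and both function names below are assumptions.

```python
def select_deep_victims(victim_degrees, threshold=2):
    """Return the MSISDNs whose victim degree is at or above the assumed threshold.

    victim_degrees maps MSISDN -> integer victim degree (assumed encoding).
    """
    return sorted(m for m, degree in victim_degrees.items() if degree >= threshold)


def build_warnings(deep_victims, text="Warning: you may be the target of a phishing fraud."):
    """Pair each deep victim user with a reminder message (delivery is out of scope)."""
    return [(msisdn, text) for msisdn in deep_victims]


# Illustrative degrees (assumed values, not from the source).
degrees = {"13800000000": 2, "13900000000": 0, "13700000000": 3}
warnings = build_warnings(select_deep_victims(degrees))
```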
According to one or more embodiments of the present disclosure, the ticket file further includes internet surfing time, a reporting operator, IMSI, IMEI, URL, top-level domain name, province of the mobile phone, city of the mobile phone, destination IP, country of the destination IP, and province of the destination IP.
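For illustration, a ticket record carrying the fields listed above could be modelled as follows. The field names, their string types and the helper `domain_of` are assumptions; the source only enumerates which pieces of information the ticket file contains, with the domain name information and the MSISDN being mandatory.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class TicketRecord:
    """Hypothetical layout of one ticket; only domain and msisdn are required."""
    domain: str
    msisdn: str
    access_time: str = ""
    reporting_operator: str = ""
    imsi: str = ""
    imei: str = ""
    url: str = ""
    top_level_domain: str = ""
    phone_province: str = ""
    phone_city: str = ""
    dest_ip: str = ""
    dest_ip_country: str = ""
    dest_ip_province: str = ""


def domain_of(url):
    """Extract the host name from a URL; returns an empty string if none is present."""
    return urlsplit(url).hostname or ""


record = TicketRecord(
    domain=domain_of("http://phish.example/login?id=1"),
    msisdn="13800000000",
    url="http://phish.example/login?id=1",
)
```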
According to one or more embodiments of the present disclosure, the victim degree determination unit 340 being configured to determine the operation behavior of the user while being deceived includes: determining, from the user's access while being deceived, the behavior of entering a bank card number, the behavior of accessing a malicious website/fraud website, the behavior of downloading a Trojan program, and the behavior of installing the Trojan program.
According to one or more embodiments of the present disclosure, the victim degree determination unit 340 is further configured to, before determining the behavior of entering a bank card number, the behavior of accessing a malicious website/fraud website, the behavior of downloading a Trojan program, and the behavior of installing the Trojan program:
output a domain name sample from a known feature library, and perform website family information extraction, main control information extraction, control mailbox extraction, control terminal IP extraction and control terminal mobile phone number tracing analysis on the domain name sample, so as to determine the malicious website/fraud website and the Trojan program.
According to one or more embodiments of the present disclosure, the victim degree determination unit 340 being configured to perform internet access behavior restoration and log backtracking text analysis on the access behavior log, for the phishing website/APP, of any victim user in the victim user set includes: performing, on the website/APP internet access behavior logs within the fraud event time window, internet access behavior restoration and log backtracking text analysis for the fraud event, so as to determine all operation behaviors of each victim user while being deceived.
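The time-window restriction described above can be sketched as follows. The log entry layout `(timestamp, msisdn, action)`, the action labels and the function name are assumptions for illustration; the source does not specify how the access behavior log is stored or how actions are recognized.

```python
from datetime import datetime


def behaviors_in_window(log_entries, window_start, window_end):
    """Group each user's actions that fall inside the fraud event time window.

    log_entries is an iterable of (timestamp, msisdn, action) tuples.
    Returns a dict mapping MSISDN -> list of actions in chronological order.
    """
    per_user = {}
    for timestamp, msisdn, action in sorted(log_entries):
        if window_start <= timestamp <= window_end:
            per_user.setdefault(msisdn, []).append(action)
    return per_user


# Illustrative log (assumed values, not from the source).
log = [
    (datetime(2020, 11, 1, 10, 5), "13800000000", "download_trojan"),
    (datetime(2020, 11, 1, 10, 0), "13800000000", "visit_fraud_site"),
    (datetime(2020, 11, 1, 10, 2), "13900000000", "visit_fraud_site"),
    (datetime(2020, 11, 3, 9, 0), "13800000000", "visit_fraud_site"),
]
window = (datetime(2020, 11, 1, 0, 0), datetime(2020, 11, 2, 0, 0))
restored = behaviors_in_window(log, *window)
```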
The apparatus for detecting phishing deep victim users provided by this embodiment can execute the method for detecting phishing deep victim users provided by the method embodiments of the present disclosure, and has the functional modules and beneficial effects corresponding to the executed method.
Fig. 3a is a schematic structural diagram of a training device of a fraud event classification model according to an embodiment of the present invention, and as shown in fig. 3a, the training device of a fraud event classification model according to the embodiment includes a sample acquisition module 341, a model determination module 342, and a model training module 343.
The sample acquisition module 341 is configured to acquire a training sample set, where each training sample includes the operation behavior of a user while being deceived and annotation information indicating the degree to which the user was deceived.
The model determining module 342 is configured for determining an initialized fraud event classification model, wherein the initialized fraud event classification model comprises a target layer for outputting a user victimization degree.
The model training module 343 is configured to, by using a machine learning method, take the operation behavior of the user while being deceived in each training sample of the training sample set as the input of the initialized fraud event hierarchical classification model, take the annotation information corresponding to the input operation behavior as the expected output of the initialized fraud event hierarchical classification model, and train to obtain the fraud event hierarchical classification model.
The training device for the fraud event classification model provided by this embodiment can execute the training method for the fraud event classification model provided by the method embodiments of the present disclosure, and has the functional modules and beneficial effects corresponding to the executed method.
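The source does not name the learning algorithm, so the following sketch substitutes a plain multi-class perceptron purely to illustrate the input/expected-output arrangement: operation behaviors are the input and annotated victim degrees are the expected output. The four binary behavior features, the three degree labels, the sample values and the function names are all assumptions.

```python
def predict(weights, features):
    """Return the victim degree whose weight vector scores highest."""
    extended = list(features) + [1.0]  # trailing 1.0 is the bias input
    scores = [sum(w * v for w, v in zip(row, extended)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train_perceptron(samples, labels, num_classes, epochs=20):
    """Train one weight vector per victim degree with the perceptron update rule."""
    size = len(samples[0]) + 1  # features plus bias
    weights = [[0.0] * size for _ in range(num_classes)]
    for _ in range(epochs):
        mistakes = 0
        for features, label in zip(samples, labels):
            guess = predict(weights, features)
            if guess != label:
                mistakes += 1
                # Move the expected class towards the sample, the wrong guess away.
                for j, value in enumerate(list(features) + [1.0]):
                    weights[label][j] += value
                    weights[guess][j] -= value
        if mistakes == 0:  # every training sample is classified as annotated
            break
    return weights


# Assumed features: visited fraud site, entered card number,
# downloaded Trojan, installed Trojan. Assumed labels: 0 light, 1 medium, 2 deep.
SAMPLES = [[1, 0, 0, 0], [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 1, 1]]
LABELS = [0, 1, 2, 2]
model = train_perceptron(SAMPLES, LABELS, num_classes=3)
```

Here the list of per-class weight vectors stands in for the target layer that outputs the user's victim degree; any other supervised classifier could take its place.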
Referring now to FIG. 4, a block diagram of an electronic device 400 suitable for use in implementing embodiments of the present invention is shown. The terminal device in the embodiment of the present invention is, for example, a mobile device, a computer, or a vehicle-mounted device built in a floating car, or any combination thereof. In some embodiments, the mobile device may include, for example, a cell phone, a smart home device, a wearable device, a smart mobile device, a virtual reality device, and the like, or any combination thereof. The electronic device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 4, electronic device 400 may include a processing device (e.g., central processing unit, graphics processor, etc.) 401 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)402 or a program loaded from a storage device 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the electronic apparatus 400 are also stored. The processing device 401, the ROM 402, and the RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408 including, for example, tape, hard disk, etc.; and a communication device 409. The communication means 409 may allow the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. While fig. 4 illustrates an electronic device 400 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as a computer software program. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 409, or from the storage device 408, or from the ROM 402. The computer program performs the above-described functions defined in the methods of embodiments of the invention when executed by the processing apparatus 401.
It should be noted that the computer readable medium mentioned above can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In yet another embodiment of the invention, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: generate a ticket file according to the internet traffic data of the user, and extract domain name information in the ticket file, wherein the ticket file at least comprises the domain name information and the MSISDN of the user; check the domain name information against a domain name white list library so as to filter out tickets with normal domain name information; perform rule matching on the domain name information in the filtered ticket file according to the websites/APPs used to carry out phishing, and determine a phishing victim user set according to the MSISDN of the user in the ticket file corresponding to each successfully matched piece of domain name information; perform internet access behavior restoration and log backtracking text analysis on the access behavior log, for the phishing website/APP, of any victim user in the victim user set, so as to determine the operation behavior of the user while being deceived, input the operation behavior into a pre-trained fraud event hierarchical classification model, and obtain the victim degree information of the user output by the fraud event hierarchical classification model; and determine deep victim users according to the victim degree information of each user in the victim user set.
Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or hardware. Where the name of a unit does not in some cases constitute a limitation of the unit itself, for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
The foregoing description is only a preferred embodiment of the invention and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure in the embodiments of the present invention is not limited to the specific combinations of the above-described features, but also encompasses other embodiments in which any combination of the above-described features or their equivalents is possible without departing from the spirit of the disclosure. For example, the above features and (but not limited to) the features with similar functions disclosed in the embodiments of the present invention are mutually replaced to form the technical solution.

Claims (10)

1. A method for detecting phishing deep victim users, comprising:
generating a ticket file according to the internet traffic data of the user, and extracting domain name information in the ticket file, wherein the ticket file at least comprises the domain name information and the MSISDN of the user;
checking the domain name information against a domain name white list library so as to filter out tickets with normal domain name information;
performing rule matching on the domain name information in the filtered ticket file according to the websites/APPs used to carry out phishing, and determining a phishing victim user set according to the MSISDN of the user in the ticket file corresponding to each successfully matched piece of domain name information;
performing internet access behavior restoration and log backtracking text analysis on the access behavior log, for the phishing website/APP, of any victim user in the victim user set, so as to determine the operation behavior of the user while being deceived, inputting the operation behavior into a pre-trained fraud event hierarchical classification model, and obtaining the victim degree information of the user output by the fraud event hierarchical classification model;
and determining deep victim users according to the victim degree information of each user in the victim user set.
2. The method of claim 1, further comprising, after determining the deep victim users according to the victim degree information of each user in the victim user set:
sending early warning reminder information to each deep victim user.
3. The method of claim 1, wherein the ticket file further comprises internet surfing time, a reporting operator, an IMSI, an IMEI, a URL, a top-level domain name, a province of the mobile phone, a city of the mobile phone, a destination IP, a country of the destination IP, and a province of the destination IP.
4. The method of claim 1, wherein determining the operation behavior of the user while being deceived comprises:
determining, from the user's access while being deceived, the behavior of entering a bank card number, the behavior of accessing a malicious website/fraud website, the behavior of downloading a Trojan program, and the behavior of installing the Trojan program.
5. The method of claim 4, further comprising, before determining the behavior of entering a bank card number, the behavior of accessing a malicious website/fraud website, the behavior of downloading a Trojan program, and the behavior of installing the Trojan program:
outputting a domain name sample from a known feature library, and performing website family information extraction, main control information extraction, control mailbox extraction, control terminal IP extraction and control terminal mobile phone number tracing analysis on the domain name sample to determine the malicious website/fraud website and the Trojan horse program.
6. The method of claim 1, wherein performing internet access behavior restoration and log backtracking text analysis on the access behavior log, for the phishing website/APP, of any victim user in the victim user set comprises:
performing, on the website/APP internet access behavior logs within the fraud event time window, internet access behavior restoration and log backtracking text analysis for the fraud event, so as to determine all operation behaviors of each victim user while being deceived.
7. The method according to one of claims 1 to 6, wherein said fraud event classification model is trained by:
acquiring a training sample set, wherein the training sample comprises operation behaviors in a process that a user is deceived and annotation information used for representing the deceived degree of the user;
determining an initialized fraud event classification model, wherein the initialized fraud event classification model comprises a target layer for outputting a user victimization degree;
and by utilizing a machine learning method, taking the operation behavior in the cheating process of the user in the training samples in the training sample set as the input of the initialized fraud event hierarchical classification model, taking the annotation information corresponding to the input operation behavior in the cheating process of the user as the expected output of the initialized fraud event hierarchical classification model, and training to obtain the fraud event hierarchical classification model.
8. An apparatus for detecting phishing deep victim users, characterized by comprising:
the domain name extraction unit is used for generating a ticket file according to the internet traffic data of the user and extracting domain name information in the ticket file, wherein the ticket file at least comprises the domain name information and the MSISDN of the user;
the white list filtering unit is used for adopting the domain name information to collide with a domain name white list library so as to filter the call ticket with normal domain name information;
the rule matching unit is used for carrying out rule matching on the domain name information in the filtered ticket file according to the website/APP for implementing phishing, and determining a phishing victim user set according to the MSISDN of the user in the ticket file corresponding to the successfully matched domain name information;
the victim degree determining unit is used for carrying out network access behavior restoration and log backtracking text analysis on access behavior logs of any victim user in the phishing implemented website/APP in the victim user set so as to determine an operation behavior of the user in a cheating process, and inputting the operation behavior into a pre-trained fraud event hierarchical classification model to obtain the victim degree information of the user output by the fraud event hierarchical classification model;
and the deep victim user determination unit is used for determining deep victim users according to the victim degree information of each user in the victim user set.
9. An electronic device, comprising:
one or more processors; and
a memory to store executable instructions that, when executed by the one or more processors, cause the electronic device to perform the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202011322821.9A 2020-11-23 2020-11-23 Method, device, equipment and medium for detecting phishing deep victim user Pending CN112491864A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011322821.9A CN112491864A (en) 2020-11-23 2020-11-23 Method, device, equipment and medium for detecting phishing deep victim user

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011322821.9A CN112491864A (en) 2020-11-23 2020-11-23 Method, device, equipment and medium for detecting phishing deep victim user

Publications (1)

Publication Number Publication Date
CN112491864A true CN112491864A (en) 2021-03-12

Family

ID=74933089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011322821.9A Pending CN112491864A (en) 2020-11-23 2020-11-23 Method, device, equipment and medium for detecting phishing deep victim user

Country Status (1)

Country Link
CN (1) CN112491864A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106791220A (en) * 2016-11-04 2017-05-31 国家计算机网络与信息安全管理中心 Prevent the method and system of telephone fraud
CN108243049A (en) * 2016-12-27 2018-07-03 中国移动通信集团浙江有限公司 Telecoms Fraud recognition methods and device
CN108449319A (en) * 2018-02-09 2018-08-24 秦玉海 A kind of method and device of identification swindle website and the evidence obtaining of long-range wooden horse
US20190158535A1 (en) * 2017-11-21 2019-05-23 Biocatch Ltd. Device, System, and Method of Detecting Vishing Attacks
CN110839216A (en) * 2018-08-17 2020-02-25 中国移动通信集团广东有限公司 Method and device for identifying communication information fraud

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113452670A (en) * 2021-04-30 2021-09-28 恒安嘉新(北京)科技股份公司 Phishing blocking method, device, equipment and medium based on SDN network
CN113452670B (en) * 2021-04-30 2023-07-28 恒安嘉新(北京)科技股份公司 Phishing blocking method, device, equipment and medium based on SDN network
CN113518075A (en) * 2021-05-14 2021-10-19 恒安嘉新(北京)科技股份公司 Phishing early warning method and device, electronic equipment and storage medium
CN113518075B (en) * 2021-05-14 2023-10-17 恒安嘉新(北京)科技股份公司 Phishing warning method, device, electronic equipment and storage medium
CN113177674A (en) * 2021-05-28 2021-07-27 恒安嘉新(北京)科技股份公司 Phishing early warning method, device, equipment and medium
CN113923011A (en) * 2021-09-30 2022-01-11 北京恒安嘉新安全技术有限公司 Phishing early warning method and device, computer equipment and storage medium
CN113923011B (en) * 2021-09-30 2023-10-17 北京恒安嘉新安全技术有限公司 Phishing early warning method, device, computer equipment and storage medium
CN113923669A (en) * 2021-11-10 2022-01-11 恒安嘉新(北京)科技股份公司 Anti-fraud early warning method, device, equipment and medium for multi-source cross-platform fusion
CN114363039A (en) * 2021-12-30 2022-04-15 恒安嘉新(北京)科技股份公司 Method, device, equipment and storage medium for identifying fraud websites
CN115022464A (en) * 2022-05-06 2022-09-06 中国联合网络通信集团有限公司 Number processing method, system, computing device and storage medium
CN117254983A (en) * 2023-11-20 2023-12-19 卓望数码技术(深圳)有限公司 Method, device, equipment and storage medium for detecting fraud-related websites

Similar Documents

Publication Publication Date Title
CN112491864A (en) Method, device, equipment and medium for detecting phishing deep victim user
CN113098870B (en) Phishing detection method and device, electronic equipment and storage medium
CN112468520B (en) Data detection method, device and equipment and readable storage medium
CN112685737A (en) APP detection method, device, equipment and storage medium
CN110008428B (en) News data processing method and device, blockchain node equipment and storage medium
CN112416730A (en) User internet behavior analysis method and device, electronic equipment and storage medium
CN112565250B (en) Website identification method, device, equipment and storage medium
CN113904861B (en) Encryption traffic safety detection method and device
CN113177205A (en) Malicious application detection system and method
CN113518075B (en) Phishing warning method, device, electronic equipment and storage medium
CN109818972B (en) Information security management method and device for industrial control system and electronic equipment
CN107172622A (en) The identification of pseudo-base station note and analysis method, apparatus and system
CN112685255A (en) Interface monitoring method and device, electronic equipment and storage medium
CN114445088A (en) Method and device for judging fraudulent conduct, electronic equipment and storage medium
CN114169456A (en) Data processing method, device, equipment and medium based on 5G terminal security
Riadi et al. Comparative Analysis of Forensic Software on Android-based MiChat using ACPO and DFRWS Framework
CN112667875A (en) Data acquisition method, data analysis method, data acquisition device, data analysis device, equipment and storage medium
CN112307464A (en) Fraud identification method and device and electronic equipment
CN110955890B (en) Method and device for detecting malicious batch access behaviors and computer storage medium
CN116049808A (en) Equipment fingerprint acquisition system and method based on big data
CN108322912B (en) Method and device for distinguishing short messages
CN115688107A (en) Fraud-related APP detection system and method
CN110868410B (en) Method and device for acquiring webpage Trojan horse connection password, electronic equipment and storage medium
CN114417397A (en) Behavior portrait construction method and device, storage medium and computer equipment
CN113822036A (en) Privacy policy content generation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210312
