CN112491864A - Method, device, equipment and medium for detecting phishing deep victim user - Google Patents
Method, device, equipment and medium for detecting phishing deep victim user
- Publication number
- CN112491864A (application number CN202011322821.9A)
- Authority
- CN
- China
- Prior art keywords
- user
- victim
- domain name
- information
- website
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H04L63/0236—Filtering by address, protocol, port number or service, e.g. IP-address or URL
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
- H04L63/101—Access control lists [ACL]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/20—Network architectures or network communication protocols for network security for managing network security; network security policies in general
Abstract
Embodiments of the invention relate to a method, an apparatus, an electronic device and a storage medium for detecting phishing deep victim users. The method comprises: generating a ticket file from the internet traffic data of a user, and extracting the domain name information in the ticket file; matching the domain name information against a domain name whitelist library to filter out tickets whose domain name information is normal; performing rule matching according to websites/APPs that carry out phishing; for each successfully matched user, performing network access behavior restoration and log backtracking text analysis on the user's access behavior log for the phishing website/APP to determine the user's operation behaviors while being deceived; inputting the operation behaviors into a pre-trained fraud event hierarchical classification model to obtain victim degree information; and determining deep victim users according to the victim degree information of each user, so that deep victim users can be accurately located.
Description
Technical Field
The embodiment of the invention relates to the technical field of Internet fraud prevention application, in particular to a method, a device, electronic equipment and a storage medium for detecting phishing deep victim users.
Background
To purify the network environment, criminal activities such as network hacking, telecommunication and network fraud, and infringement of citizens' personal privacy are being cracked down on severely in accordance with the law, so as to cut off the chain of interests behind cybercrime, maintain sustained high pressure, and protect the legitimate rights and interests of the public.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method, an apparatus, an electronic device, and a storage medium for detecting phishing deep victim users, so as to accurately locate deep victim users.
Additional features and advantages of embodiments of the invention will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of embodiments of the invention.
In a first aspect of the present disclosure, an embodiment of the present invention provides a method for detecting phishing deep victim users, comprising:
generating a ticket file according to the internet traffic data of the user, and extracting domain name information in the ticket file, wherein the ticket file at least comprises the domain name information and the MSISDN of the user;
matching the domain name information against a domain name whitelist library so as to filter out tickets whose domain name information is normal;
performing rule matching on the domain name information in the filtered ticket file according to websites/APPs that carry out phishing, and determining a phishing victim user set according to the MSISDN of the user in the ticket file corresponding to each successfully matched piece of domain name information;
for any victim user in the victim user set, performing network access behavior restoration and log backtracking text analysis on the user's access behavior log for the phishing website/APP to determine the user's operation behaviors while being deceived, inputting the operation behaviors into a pre-trained fraud event hierarchical classification model, and obtaining the victim degree information of the user output by the fraud event hierarchical classification model;
and determining deep victim users according to the victim degree information of each user in the victim user set.
In an embodiment, after determining deep victim users according to the victim degree information of each user in the victim user set, the method further includes: sending early-warning reminder information to each deep victim user.
In an embodiment, the ticket file further includes the internet access time, reporting operator, IMSI, IMEI, URL, top-level domain name, province of the mobile phone, city of the mobile phone, destination IP, country of the destination IP, and province of the destination IP.
In one embodiment, determining the operation behaviors of the user while being deceived includes: determining the behavior of entering a bank card number, the behavior of visiting a malicious/fraudulent website, the behavior of downloading a Trojan horse program, and the behavior of installing the Trojan horse program while the visiting user is being deceived.
In one embodiment, before determining the behaviors of entering a bank card number, visiting a malicious/fraudulent website, downloading a Trojan horse program and installing the Trojan horse program while the visiting user is being deceived, the method further comprises: outputting a domain name sample from a known feature library, and performing traceability analysis on the domain name sample, covering extraction of the website family information, the master control information, the control mailbox, the control-end IP and the control-end mobile phone number, to determine the malicious/fraudulent website and the Trojan horse program.
In an embodiment, performing network access behavior restoration and log backtracking text analysis on the access behavior log of any victim user in the victim user set for the phishing website/APP comprises: for the website/APP internet access behavior logs within the fraud event time window, performing network access behavior restoration and log backtracking text analysis on the fraud event to determine all operation behaviors of each victim user while being deceived.
In one embodiment, the fraud event hierarchical classification model is trained by the following method:
acquiring a training sample set, wherein each training sample comprises the operation behaviors of a user while being deceived and annotation information representing the degree to which the user was deceived;
determining an initialized fraud event hierarchical classification model, wherein the initialized fraud event hierarchical classification model comprises a target layer for outputting the user's victim degree;
and, using a machine learning method, training the initialized fraud event hierarchical classification model with the operation behaviors in the training samples as its input and the annotation information corresponding to the input operation behaviors as its expected output, to obtain the fraud event hierarchical classification model.
In a second aspect of the present disclosure, an embodiment of the present invention further provides an apparatus for detecting a phishing deep victim user, including:
the domain name extraction unit is used for generating a ticket file according to the internet traffic data of the user and extracting domain name information in the ticket file, wherein the ticket file at least comprises the domain name information and the MSISDN of the user;
the whitelist filtering unit is used for matching the domain name information against a domain name whitelist library so as to filter out tickets whose domain name information is normal;
the rule matching unit is used for performing rule matching on the domain name information in the filtered ticket file according to websites/APPs that carry out phishing, and determining a phishing victim user set according to the MSISDN of the user in the ticket file corresponding to each successfully matched piece of domain name information;
the victim degree determining unit is used for performing, for any victim user in the victim user set, network access behavior restoration and log backtracking text analysis on the user's access behavior log for the phishing website/APP so as to determine the user's operation behaviors while being deceived, and inputting the operation behaviors into a pre-trained fraud event hierarchical classification model to obtain the victim degree information of the user output by the fraud event hierarchical classification model;
and the deep victim user determining unit is used for determining deep victim users according to the victim degree information of each user in the victim user set.
In an embodiment, the deep victim user determining unit is further configured to send early-warning reminder information to each deep victim user after determining deep victim users according to the victim degree information of each user in the victim user set.
In an embodiment, the ticket file further includes the internet access time, reporting operator, IMSI, IMEI, URL, top-level domain name, province of the mobile phone, city of the mobile phone, destination IP, country of the destination IP, and province of the destination IP.
In an embodiment, the victim degree determining unit being configured to determine the operation behaviors of the user while being deceived includes: being configured to determine the behaviors of entering a bank card number, visiting a malicious/fraudulent website, downloading a Trojan horse program and installing the Trojan horse program while the visiting user is being deceived.
In an embodiment, the victim degree determining unit is further configured to, before determining the behaviors of entering a bank card number, visiting a malicious/fraudulent website, downloading a Trojan horse program and installing the Trojan horse program while the user is being deceived: output a domain name sample from a known feature library, and perform traceability analysis on the domain name sample, covering extraction of the website family information, the master control information, the control mailbox, the control-end IP and the control-end mobile phone number, to determine the malicious/fraudulent website and the Trojan horse program.
In an embodiment, the victim degree determining unit being configured to perform network access behavior restoration and log backtracking text analysis on the access behavior log of any victim user in the victim user set for the phishing website/APP includes: being configured to perform, for the website/APP internet access behavior logs within the fraud event time window, network access behavior restoration and log backtracking text analysis on the fraud event so as to determine all operation behaviors of each victim user while being deceived.
In one embodiment, the fraud event hierarchical classification model is trained by the following modules:
a sample acquisition module for acquiring a training sample set, wherein each training sample comprises the operation behaviors of a user while being deceived and annotation information representing the degree to which the user was deceived;
a model determination module for determining an initialized fraud event hierarchical classification model, wherein the initialized fraud event hierarchical classification model comprises a target layer for outputting the user's victim degree;
and a model training module for training, using a machine learning method, the initialized fraud event hierarchical classification model with the operation behaviors in the training samples as its input and the annotation information corresponding to the input operation behaviors as its expected output, to obtain the fraud event hierarchical classification model.
In a third aspect of the disclosure, an electronic device is provided. The electronic device includes: a processor; and a memory for storing executable instructions that, when executed by the processor, cause the electronic device to perform the method of the first aspect.
In a fourth aspect of the disclosure, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the method in the first aspect.
The technical scheme provided by the embodiment of the invention has the beneficial technical effects that:
the embodiment of the invention generates a ticket file from the internet traffic data of a user and extracts the domain name information in the ticket file; matches the domain name information against a domain name whitelist library to filter out tickets whose domain name information is normal; performs rule matching according to websites/APPs that carry out phishing; for each successfully matched user, performs network access behavior restoration and log backtracking text analysis on the user's access behavior log for the phishing website/APP to determine the user's operation behaviors while being deceived; inputs the operation behaviors into a pre-trained fraud event hierarchical classification model to obtain victim degree information; and determines deep victim users according to the victim degree information of each user, so that deep victim users can be accurately located.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly described below, and it is obvious that the drawings in the following description are only a part of the embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the contents of the embodiments of the present invention and the drawings without creative efforts.
FIG. 1 is a schematic flow chart diagram illustrating a method for detecting phishing deep victim users according to an embodiment of the present invention;
FIG. 1a is a schematic flow chart of a training method of a fraud event classification model according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart diagram illustrating another method for detecting phishing deep victim users provided in accordance with an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an apparatus for detecting phishing deep victim users provided according to an embodiment of the present invention;
FIG. 3a is a schematic structural diagram of a training apparatus of a fraud event classification model according to an embodiment of the present invention;
FIG. 4 shows a schematic diagram of an electronic device suitable for use in implementing embodiments of the present invention.
Detailed Description
In order to make the technical problems solved, the technical solutions adopted and the technical effects achieved by the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be described in further detail below with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments, but not all embodiments, of the embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, belong to the scope of protection of the embodiments of the present invention.
It should be noted that the terms "system" and "network" are often used interchangeably herein in embodiments of the present invention. Reference to "and/or" in embodiments of the invention is intended to include any and all combinations of one or more of the associated listed items. The terms "first", "second", and the like in the description and claims of the present disclosure and in the drawings are used for distinguishing between different objects and not for limiting a particular order.
It should be further noted that, in the embodiments of the present invention, each of the following embodiments may be executed alone, or may be executed in combination with each other, and the embodiments of the present invention are not limited in this respect.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
The technical solutions of the embodiments of the present invention are further described by the following detailed description with reference to the accompanying drawings.
FIG. 1 shows a flow chart of a method for detecting phishing deep victim users provided by an embodiment of the present invention. The embodiment is applicable to detecting phishing deep victim users in a network, and the method can be executed by an apparatus for detecting phishing deep victim users configured in an electronic device. As shown in FIG. 1, the method for detecting phishing deep victim users described in this embodiment comprises:
in step S110, a ticket file is generated according to the internet traffic data of the user, and domain name information in the ticket file is extracted, where the ticket file at least includes the domain name information and the MSISDN of the user.
After the user internet traffic data is obtained, the internet traffic can be analyzed, cleaned, sorted and counted to generate a unified ticket file.
For example, the internet traffic data includes, or can be used to generate, information such as the time, reporting operator, IMSI, IMEI, URL, top-level domain name, province of the mobile phone, city of the mobile phone, destination IP, country of the destination IP, province of the destination IP, and mobile phone number.
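As a purely illustrative sketch of step S110, and not the implementation of the embodiment, raw traffic records could be cleaned into ticket rows as follows. The `Ticket` field subset, the record keys (`msisdn`, `url`, `time`) and the function names are assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Ticket:
    """One row of the unified ticket file (hypothetical field subset)."""
    msisdn: str       # subscriber phone number
    access_time: str  # kept as the raw timestamp string
    url: str
    domain: str       # host name extracted from the URL


def extract_domain(url: str) -> str:
    """Return the lower-cased host name of a URL, or '' if none is found."""
    parts = urlsplit(url)
    if not parts.netloc:
        # Bare "host/path" strings carry no scheme, so parse them as "//host/path".
        parts = urlsplit("//" + url)
    return (parts.hostname or "").lower()


def build_tickets(raw_records):
    """Clean raw traffic records, dropping rows without an MSISDN or a host."""
    tickets = []
    for rec in raw_records:
        msisdn = (rec.get("msisdn") or "").strip()
        url = (rec.get("url") or "").strip()
        domain = extract_domain(url)
        if not msisdn or not domain:
            continue  # invalid row: cannot be tied to a user or a domain
        tickets.append(Ticket(msisdn, rec.get("time", ""), url, domain))
    return tickets
```

Only the two fields the method requires in every ticket (domain name information and MSISDN) are treated as mandatory here; the remaining fields listed above could be added to `Ticket` in the same way.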
In step S120, the domain name information is matched against the domain name whitelist library to filter out tickets whose domain name information is normal.
The domain name information in the ticket file can be extracted and matched against the domain name whitelist library, so that data with normal domain names is filtered out.
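The whitelist filtering of step S120 can be sketched as below. The text does not say how a domain is compared with the whitelist; matching a domain when it or any of its parent domains is listed is an assumption made for this example, as are the function names and the dict-shaped tickets.

```python
def is_whitelisted(domain, whitelist):
    """True if the domain or any of its parent domains is in the whitelist."""
    labels = domain.lower().rstrip(".").split(".")
    # For "a.b.example.com" check "a.b.example.com", "b.example.com",
    # "example.com" and "com" in turn (label-wise, so "evil-example.com"
    # is never matched by "example.com").
    return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))


def filter_whitelisted(tickets, whitelist):
    """Keep only the tickets whose domain is not covered by the whitelist."""
    return [t for t in tickets if not is_whitelisted(t["domain"], whitelist)]
```

Storing the whitelist as a `set` keeps each lookup constant-time, which matters when every ticket generated from the traffic data has to pass through this filter.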
In step S130, the domain name information in the filtered ticket file is subjected to rule matching according to the website/APP for performing phishing, and a phishing victim user set is determined according to the MSISDN of the user in the ticket file corresponding to the domain name information that is successfully matched.
This step performs website rule matching and APP rule matching, or only one of the two, on the domain name information that did not match the whitelist; if the matching succeeds, the MSISDN in the access log is extracted.
To collect more information about the deception, access details such as the IMSI, access time, number of accesses, URL and page text content, together with the user's victim degree information, can also be extracted from the access log.
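A minimal sketch of the rule matching in step S130 follows. Representing the website rules and APP rules as regular expressions over the domain name is an assumption, and the two example patterns are made-up placeholders rather than real phishing rules.

```python
import re

# Hypothetical rule sets; a real deployment would load them from a rule library.
WEBSITE_RULES = [re.compile(r"(^|\.)bank-verify\.example$")]
APP_RULES = [re.compile(r"(^|\.)loan-app-download\.example$")]


def match_rules(domain):
    """Return 'website', 'app', or None depending on which rule set matches."""
    if any(p.search(domain) for p in WEBSITE_RULES):
        return "website"
    if any(p.search(domain) for p in APP_RULES):
        return "app"
    return None


def collect_victims(tickets):
    """Map each matched MSISDN to the (domain, rule kind) pairs it accessed."""
    victims = {}
    for t in tickets:
        kind = match_rules(t["domain"])
        if kind is not None:
            victims.setdefault(t["msisdn"], []).append((t["domain"], kind))
    return victims
```

The keys of the returned dictionary form the phishing victim user set; the values keep the matched domains so that the later behavior analysis knows which access logs to inspect.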
In step S140, performing network access behavior restoration and log backtracking text analysis on the access behavior log of any victim user in the phishing implemented website/APP in the set of victim users to determine the operation behavior of the user in the process of being deceived, inputting the operation behavior to a pre-trained fraud event hierarchical classification model, and obtaining the victim degree information of the user output by the fraud event hierarchical classification model.
The network access behavior restoration and log backtracking text analysis of the access behavior log of any victim user in the phishing implemented website/APP in the victim user set can include various methods, for example, the network access behavior restoration and log backtracking text analysis of the phishing event can be performed on the website/APP internet access behavior log within the time window of the phishing event to determine all the operation behaviors of each victim user in the process of being cheated. Specifically, the performing network access behavior restoration and log backtracking text analysis on the access behavior log of any victim user in the phishing-implemented website/APP in the victim user set may include: aiming at the internet access behavior logs of the fraud websites in the fraud event time window, carrying out network access behavior restoration and log backtracking text analysis on the fraud events, retrieving information such as fraud bank card numbers, malicious websites/fraud websites, trojan programs and the like, determining all operation behaviors in the cheating process, and sorting the operation behaviors according to the time sequence.
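The time-window restoration described above can be illustrated with the following sketch. The log entry layout (`time`, `url`, optional `text`), the behavior labels, and the heuristics (a 16 to 19 digit run counts as a bank card number, a `.apk` URL counts as an APP download) are all assumptions chosen for the example, not rules given in the text.

```python
import re
from datetime import datetime

# A run of 16 to 19 digits not embedded in a longer digit run.
CARD_NUMBER = re.compile(r"(?<!\d)\d{16,19}(?!\d)")


def classify_entry(entry):
    """Label one log entry with a coarse operation behavior."""
    if entry["url"].lower().endswith(".apk"):
        return "download_app"
    if CARD_NUMBER.search(entry.get("text", "")):
        return "input_bank_card"
    return "visit_site"


def restore_behaviors(entries, start: datetime, end: datetime):
    """Return (time, behavior) pairs inside the window, in chronological order."""
    in_window = sorted(
        (e for e in entries if start <= e["time"] <= end),
        key=lambda e: e["time"],
    )
    return [(e["time"], classify_entry(e)) for e in in_window]
```

Sorting by timestamp mirrors the last step in the paragraph above, where the retrieved operation behaviors are arranged in time order before being passed on.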
Fig. 1a is a schematic flow chart of a training method of a fraud event classification model provided according to an embodiment of the present invention. As shown in Fig. 1a, the fraud event classification model is trained by the following method:
in step S141, a training sample set is obtained, wherein the training sample includes an operation behavior in which a user is deceived and annotation information indicating a fraud level of the user.
In step S142, an initialized fraud event classification model is determined, wherein the initialized fraud event classification model comprises a target layer for outputting a user victim level.
In step S143, using a machine learning method, the fraud event classification model is trained by taking the operation behavior in the fraud process of the user in the training samples in the training sample set as an input of the initialized fraud event classification model, and taking the annotation information corresponding to the input operation behavior in the fraud process of the user as an expected output of the initialized fraud event classification model.
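Steps S141 to S143 do not name a model family. As one possible reading, the sketch below trains a softmax (multinomial logistic) classifier by full-batch gradient descent, with the softmax output standing in for the "target layer". The binary feature encoding, the four victim levels and the toy samples are assumptions made for illustration only.

```python
import math

# Toy training set (assumed encoding):
# features = (visited, entered_nonsensitive, entered_card, entered_password)
# label    = annotated victim degree, 0 (lowest) to 3 (highest)
SAMPLES = [
    ((1, 0, 0, 0), 0),
    ((1, 1, 0, 0), 1),
    ((1, 1, 1, 0), 2),
    ((1, 1, 1, 1), 3),
]


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    # Each row holds one weight per feature followed by a bias term.
    scores = [sum(w * xi for w, xi in zip(row, x)) + row[-1] for row in weights]
    return softmax(scores)


def mean_loss(weights, samples):
    """Average cross-entropy between predictions and annotated degrees."""
    return -sum(math.log(predict_proba(weights, x)[y]) for x, y in samples) / len(samples)


def train(samples, n_classes, epochs=2000, lr=0.5):
    """S142: initialise the model with zeros; S143: fit it to the samples."""
    n_features = len(samples[0][0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for x, y in samples:
            probs = predict_proba(weights, x)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == y else 0.0)
                for j, xj in enumerate(x):
                    grads[k][j] += err * xj
                grads[k][-1] += err
        for k in range(n_classes):
            for j in range(n_features + 1):
                weights[k][j] -= lr * grads[k][j] / len(samples)
    return weights


def predict(weights, x):
    """Return the victim degree with the highest predicted probability."""
    probs = predict_proba(weights, x)
    return max(range(len(probs)), key=probs.__getitem__)
```

A call such as `train(SAMPLES, n_classes=4)` returns the weight matrix, and `predict` then maps a new behavior vector to a victim degree. Any other supervised classifier that accepts the same inputs and annotations would fit the description equally well.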
In step S150, deep victim users are determined according to the victim degree information of each user in the victim user set.
The victim degree can be determined according to predetermined rules. For example, for a fraud website, the degree of user victimization can be defined, from low to high, as: (1) the user opens the fraud website and closes it within seconds; (2) the user opens the fraud website and closes it after browsing for a certain time; (3) the user opens the fraud website and closes it after entering non-sensitive information; (4) the user opens the fraud website and closes it after entering sensitive information such as a bank card number, but without entering a password; and (5) the user opens the fraud website and enters both the bank card number and the password.
For a fraud APP, the degree of user victimization can similarly be graded as: (1) the user downloads the fraud APP installation package but does not install it; (2) the user installs the fraud APP but does not use it; (3) the user uses the fraud APP but does not enter any information; (4) the user uses the fraud APP and enters non-sensitive information; (5) the user uses the fraud APP and enters sensitive information such as a bank card account number; and (6) the user uses the fraud APP and enters sensitive information such as a bank card account number and a payment password.
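The rule-based grading above can be sketched as a lookup from observed behaviors to an ordinal level. The behavior labels, the numeric levels and the threshold at which a user counts as a deep victim are assumptions; the text lists the behaviors but does not fix a cut-off.

```python
# Ordinal levels mirroring the two lists above (labels are hypothetical).
WEBSITE_LEVELS = {
    "closed_immediately": 1,
    "browsed_then_closed": 2,
    "entered_nonsensitive_info": 3,
    "entered_card_number": 4,
    "entered_card_and_password": 5,
}
APP_LEVELS = {
    "downloaded_not_installed": 1,
    "installed_not_used": 2,
    "used_no_input": 3,
    "entered_nonsensitive_info": 4,
    "entered_card_number": 5,
    "entered_card_and_password": 6,
}


def victim_level(behaviors, levels):
    """Highest level reached among the observed behaviors (0 if none is known)."""
    return max((levels.get(b, 0) for b in behaviors), default=0)


def deep_victims(user_behaviors, levels, threshold):
    """Return the MSISDNs whose victim level reaches the assumed threshold."""
    return sorted(
        msisdn
        for msisdn, behaviors in user_behaviors.items()
        if victim_level(behaviors, levels) >= threshold
    )
```

For instance, with a threshold of 4 on `WEBSITE_LEVELS`, only users who entered at least a bank card number would be reported as deep victims and could then receive the early-warning reminder.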
Furthermore, a domain name sample can be output from a known feature library, and traceability analysis can be performed on the sample, covering the website family information, the master control information, the control mailbox, the control-end IP and the control-end mobile phone number. Through characteristic fingerprint analysis, for example of the registration/control mailbox and the contact mailbox of a registered domain name, other domain names and mailboxes registered with the same information can be traced; combined with other identity and location information obtained from an intermediate raw-data library, this provides the capability to extract and restore website access behavior.
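One way to picture the fingerprint tracing is as a pivot on the registration mailbox: starting from a known domain sample, find every other domain registered with the same mailbox. The record layout (`domain`, `email`) and the function name are hypothetical, and real registration data would need to be obtained and normalised separately.

```python
from collections import defaultdict


def related_domains(seed_domain, registrations):
    """Return the other domains registered with the seed domain's mailbox."""
    by_email = defaultdict(set)
    email_of = {}
    for rec in registrations:
        email = rec["email"].lower()  # mailboxes are compared case-insensitively
        by_email[email].add(rec["domain"])
        email_of[rec["domain"]] = email
    email = email_of.get(seed_domain)
    if email is None:
        return set()  # the seed is unknown, so nothing can be traced
    return by_email[email] - {seed_domain}
```

The same pivot could be repeated on other fingerprints mentioned above, such as the control-end IP or mobile phone number, to widen the set of related websites.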
On this basis, correlation analysis can be performed on the various mutually independent pieces of data, and an analysis method for "deep victims" of internet fraud is formed by establishing a fraud information identification model, a fraud event classification model and a victim user profile model.
This embodiment generates a ticket file from the internet traffic data of a user and extracts the domain name information in the ticket file; matches the domain name information against a domain name whitelist library to filter out tickets whose domain name information is normal; performs rule matching according to websites/APPs that carry out phishing; for each successfully matched user, performs network access behavior restoration and log backtracking text analysis on the user's access behavior log for the phishing website/APP to determine the user's operation behaviors while being deceived; inputs the operation behaviors into a pre-trained fraud event hierarchical classification model to obtain victim degree information; and determines deep victim users according to the victim degree information of each user, so that deep victim users can be accurately located.
Fig. 2 is a flow chart of another method for detecting phishing deep victim users according to an embodiment of the present invention, which is based on the foregoing embodiment and is optimized. As shown in fig. 2, the method for detecting phishing deep victim users of the present embodiment comprises:
in step S201, report data is acquired, and step S202 is executed.
In step S202, invalid data is filtered, and step S203 is executed.
In step S203, a domain name is extracted, and step S204 is executed.
In step S204, it is determined whether the domain name belongs to the domain name white list library, if so, step S205 is executed, otherwise, step S206 is executed.
In step S205, the call ticket is filtered.
In step S206, it is determined whether the domain name matches the URL rule information, if yes, step S208 is executed, otherwise, step S207 is executed.
In step S207, it is determined whether the domain name matches the APP rule, if yes, step S208 is executed, otherwise, step S205 is executed to filter the ticket.
In step S208, the user depth information is extracted, and step S209 is executed.
Wherein the depth information refers to user victim information.
In step S209, the user mobile phone number segment is matched, and step S210 is executed.
In step S210, the matched mobile phone number segment is stored in the database, and step S211 is executed.
In step S211, the stored mobile phone number segment is subjected to platform analysis, and the process is ended.
In step S212, the domain name and APP information used for phishing is acquired, and step S213 is executed.
In step S213, the acquired information is analyzed and integrated into rules, and the rules are provided to step S206 and step S207 for matching.
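The branching in steps S204 to S208 can be sketched as follows. This is only a minimal illustration, not the implementation of the embodiment: it assumes that the white list, the URL rules and the APP rules are plain sets of domain names matched exactly, and the names `Ticket` and `route_ticket` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    """Hypothetical ticket record holding only the fields used here."""
    msisdn: str
    domain: str


def route_ticket(ticket, whitelist, url_rules, app_rules):
    """Return "filtered" (step S205) or "extract" (step S208) for one ticket.

    Assumption: every rule library is a set of domain names and a match
    means exact membership. A real rule format is likely to be richer.
    """
    if ticket.domain in whitelist:   # S204: normal domain name
        return "filtered"
    if ticket.domain in url_rules:   # S206: URL rule matched
        return "extract"
    if ticket.domain in app_rules:   # S207: APP rule matched
        return "extract"
    return "filtered"                # no rule matched, back to S205


WHITELIST = {"www.example.com"}
URL_RULES = {"login.bank-example.test"}
APP_RULES = {"api.loan-app.test"}
```

A ticket whose domain name is in neither the white list nor any rule library is filtered, which mirrors the "otherwise, step S205 is executed" branch of step S207.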
According to the method, domain name information is extracted after the reported data is obtained, and the domain name information is matched against a domain name white list library so as to filter out tickets with normal domain name information. Rule matching is then performed according to the websites and APPs used for phishing. For each successfully matched user, network access behavior restoration and log backtracking text analysis are performed on that victim user's access behavior log in the websites and APPs used for phishing, so as to determine the operation behavior of the user while being deceived, and the operation behavior is input into a pre-trained fraud event hierarchical classification model to obtain victim degree information. Deep victim users are then determined according to the victim degree information of each user, so that deep victim users can be accurately located.
As an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for detecting phishing deep victim users, and FIG. 3 shows a schematic structural diagram of an apparatus for detecting phishing deep victim users provided by the present embodiment, which corresponds to the method embodiments shown in FIGS. 1 and 2, and which can be applied in various electronic devices. As shown in fig. 3, the apparatus for detecting phishing deep victim users of the present embodiment includes a domain name extraction unit 310, a white list filtering unit 320, a rule matching unit 330, a victim degree determination unit 340 and a deep victim user determination unit 350.
The domain name extracting unit 310 is configured to generate a ticket file according to internet traffic data of a user, and extract domain name information in the ticket file, where the ticket file at least includes the domain name information and an MSISDN of the user.
The white list filtering unit 320 is configured to match the domain name information against the domain name white list library so as to filter out tickets with normal domain name information.
The rule matching unit 330 is configured to perform rule matching on the domain name information in the filtered ticket files according to the websites/APPs used for phishing, and determine a set of phishing victim users according to the MSISDNs of the users in the ticket files corresponding to the successfully matched domain name information.
The victim degree determination unit 340 is configured to, for any victim user in the victim user set, perform network access behavior restoration and log backtracking text analysis on the access behavior log of the victim user in the website/APP used for phishing to determine the operation behavior of the user while being deceived, input the operation behavior into a pre-trained fraud event hierarchical classification model, and obtain the victim degree information of the user output by the fraud event hierarchical classification model.
The deep victim user determination unit 350 is configured to determine deep victim users according to the victim degree information of each user in the victim user set.
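The work of the domain name extraction, white list filtering and rule matching units can be sketched as below. The sketch is illustrative only and rests on assumptions that the embodiment does not state: each ticket is a dictionary with a `url` and an `msisdn` key, the rule library is a set of host names, and all MSISDNs and domain names shown are made up. The helper names `extract_domain` and `victim_user_set` are hypothetical.

```python
from urllib.parse import urlsplit


def extract_domain(url):
    """Return the lowercase host name of a URL, or "" if there is none."""
    # urlsplit only recognises the host when a scheme or "//" precedes it.
    parts = urlsplit(url if "://" in url else "//" + url)
    return (parts.hostname or "").lower()


def victim_user_set(tickets, whitelist, phishing_domains):
    """Collect the MSISDNs of tickets whose domain matches a phishing rule."""
    victims = set()
    for ticket in tickets:
        domain = extract_domain(ticket["url"])
        if domain in whitelist:
            continue  # ticket with normal domain name information
        if domain in phishing_domains:
            victims.add(ticket["msisdn"])
    return victims


# Made-up sample tickets for illustration only.
TICKETS = [
    {"msisdn": "10000000001", "url": "http://Login.Bank-Example.test/pay?id=1"},
    {"msisdn": "10000000002", "url": "https://www.example.com/news"},
    {"msisdn": "10000000003", "url": "login.bank-example.test/verify"},
    {"msisdn": "10000000001", "url": "http://unknown.example.org/"},
]
```

Using a set for the result means that a user who appears in several matching tickets is counted once, which fits the notion of a victim user set keyed by MSISDN.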
According to one or more embodiments of the present disclosure, the deep victim user determination unit 350 is further configured to, after determining deep victim users according to the victim degree information of each user in the victim user set, send early warning reminder information to each deep victim user.
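One possible way to select deep victim users and prepare the early warning reminders is sketched below. The embodiment does not specify how the victim degree is encoded, so the integer scale, the threshold value and the reminder text are all assumptions, and the function names are hypothetical.

```python
def deep_victims(victim_degrees, threshold=2):
    """Return the MSISDNs whose victim degree reaches the threshold.

    Assumption: the classification model outputs an integer degree where a
    larger value means a more deeply victimised user.
    """
    return sorted(m for m, degree in victim_degrees.items() if degree >= threshold)


def build_warnings(msisdns):
    """Pair each deep victim user with a placeholder reminder text."""
    text = "Warning: you may have been targeted by a phishing fraud."
    return [(msisdn, text) for msisdn in msisdns]


# Made-up degrees for illustration only.
DEGREES = {"10000000001": 2, "10000000002": 0, "10000000003": 3}
```

Delivery of the reminders (for example by short message) is outside the sketch, since the embodiment does not describe the sending channel.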
According to one or more embodiments of the present disclosure, the ticket file further includes internet surfing time, a reporting operator, IMSI, IMEI, URL, top-level domain name, province of the mobile phone, city of the mobile phone, destination IP, country of the destination IP, and province of the destination IP.
According to one or more embodiments of the present disclosure, the victim degree determination unit 340 is configured to determine the operation behavior of the user while being deceived, including: determining, for the visiting user while being deceived, the behavior of inputting a bank card number, the behavior of accessing a malicious website/fraud website, the behavior of downloading a Trojan program, and the behavior of installing the Trojan program.
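The four operation behaviors named above can be represented as boolean flags derived from log text, as in the following sketch. The log line format and the regular expressions are purely hypothetical, since the embodiment does not describe how the log backtracking text analysis recognises each behavior.

```python
import re

# Hypothetical patterns; real logs and detection rules will differ.
BEHAVIOR_PATTERNS = {
    "entered_bank_card": re.compile(r"card_no=\d{16,19}"),
    "visited_fraud_site": re.compile(r"^GET "),
    "downloaded_trojan": re.compile(r"\.apk(\?|\s|$)"),
    "installed_trojan": re.compile(r"/install_report"),
}


def operation_behaviors(log_lines):
    """Map each behavior name to True if any log line matches its pattern."""
    return {
        name: any(pattern.search(line) for line in log_lines)
        for name, pattern in BEHAVIOR_PATTERNS.items()
    }


# Made-up access behavior log of one victim user.
LOG = [
    "GET /login",
    "POST /submit card_no=6222020000000000",
    "GET /files/helper.apk",
]
```

The resulting dictionary can be flattened into a fixed-order feature vector before it is passed to a classification model.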
According to one or more embodiments of the present disclosure, the victim degree determination unit 340 is configured to perform the following before determining the behavior of inputting a bank card number, the behavior of accessing a malicious website/fraud website, the behavior of downloading a Trojan program, and the behavior of installing the Trojan program of the visiting user while being deceived:
outputting a domain name sample from a known feature library, and performing website family information extraction, main control information extraction, control mailbox extraction, control terminal IP extraction and control terminal mobile phone number tracing analysis on the domain name sample to determine the malicious website/fraud website and the Trojan horse program.
According to one or more embodiments of the present disclosure, the victim degree determination unit 340 is configured to perform network access behavior restoration and log backtracking text analysis on the access behavior log of any victim user in the victim user set in the website/APP used for phishing, including: performing network access behavior restoration and log backtracking text analysis on the fraud event using the website/APP internet access behavior logs within the fraud event time window, so as to determine all operation behaviors of each victim user while being deceived.
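Restricting the analysis to the fraud event time window can be sketched as a simple filter over timestamped log entries. The entry layout (a timestamp paired with the log text) and the inclusive window bounds are assumptions made for illustration, and `logs_in_window` is a hypothetical helper name.

```python
from datetime import datetime


def logs_in_window(entries, start, end):
    """Keep the (timestamp, text) entries with start <= timestamp <= end."""
    return [(ts, text) for ts, text in entries if start <= ts <= end]


# Made-up log entries for illustration only.
ENTRIES = [
    (datetime(2020, 11, 23, 9, 0), "GET /news"),
    (datetime(2020, 11, 23, 10, 5), "GET /login"),
    (datetime(2020, 11, 23, 10, 7), "POST /submit"),
    (datetime(2020, 11, 23, 12, 0), "GET /weather"),
]
WINDOW_START = datetime(2020, 11, 23, 10, 0)
WINDOW_END = datetime(2020, 11, 23, 11, 0)
```

Only the entries inside the window are then passed on to behavior restoration, so unrelated browsing before and after the fraud event does not affect the determined operation behaviors.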
The apparatus for detecting phishing deep victim users provided by this embodiment can execute the method for detecting phishing deep victim users provided by the method embodiments of the present disclosure, and has the functional modules and beneficial effects corresponding to the executed method.
Fig. 3a is a schematic structural diagram of a training device of a fraud event classification model according to an embodiment of the present invention, and as shown in fig. 3a, the training device of a fraud event classification model according to the embodiment includes a sample acquisition module 341, a model determination module 342, and a model training module 343.
The sample acquisition module 341 is configured to acquire a training sample set, where each training sample includes the operation behavior of a user while being deceived and annotation information indicating the degree to which the user was deceived.
The model determination module 342 is configured to determine an initialized fraud event hierarchical classification model, wherein the initialized fraud event hierarchical classification model comprises a target layer for outputting the victim degree of a user.
The model training module 343 is configured to train, by using a machine learning method, the fraud event hierarchical classification model by taking the operation behaviors of users while being deceived in the training samples of the training sample set as the input of the initialized fraud event hierarchical classification model, and taking the annotation information corresponding to the input operation behaviors as the expected output of the initialized fraud event hierarchical classification model.
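The supervised training described above can be illustrated with a very small stand-in model. The embodiment does not name a model type, so the multiclass perceptron below is a substitute chosen for brevity, not the model of the embodiment. The feature order (bank card entered, site visited, Trojan downloaded, Trojan installed), the three-level degree scale and the sample data are all assumptions.

```python
def predict(weights, features):
    """Return the class index with the highest score (first one on ties)."""
    x = list(features) + [1]  # trailing 1 acts as the bias input
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train(samples, labels, n_classes, max_epochs=50):
    """Train a multiclass perceptron on labelled behavior vectors."""
    n_inputs = len(samples[0]) + 1
    weights = [[0] * n_inputs for _ in range(n_classes)]
    for _ in range(max_epochs):
        errors = 0
        for features, label in zip(samples, labels):
            guess = predict(weights, features)
            if guess != label:
                # Move the expected class towards the sample and the
                # wrongly predicted class away from it.
                for i, v in enumerate(list(features) + [1]):
                    weights[label][i] += v
                    weights[guess][i] -= v
                errors += 1
        if errors == 0:
            break  # every training sample is classified as annotated
    return weights


# Made-up training set: behavior vector -> annotated victim degree (0..2).
SAMPLES = [(0, 1, 0, 0), (0, 1, 1, 0), (1, 1, 0, 0), (0, 1, 1, 1)]
LABELS = [0, 1, 2, 2]
MODEL = train(SAMPLES, LABELS, n_classes=3)
```

Here the annotated degree plays the role of the expected output, and the list of per-class weight rows plays the role of the target layer that outputs the victim degree.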
The training device for the fraud event classification model provided by this embodiment can execute the training method for the fraud event classification model provided by the method embodiments of the present disclosure, and has the functional modules and beneficial effects corresponding to the executed method.
Referring now to FIG. 4, a block diagram of an electronic device 400 suitable for use in implementing embodiments of the present invention is shown. The terminal device in the embodiment of the present invention is, for example, a mobile device, a computer, or a vehicle-mounted device built in a floating car, or any combination thereof. In some embodiments, the mobile device may include, for example, a cell phone, a smart home device, a wearable device, a smart mobile device, a virtual reality device, and the like, or any combination thereof. The electronic device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 4, electronic device 400 may include a processing device (e.g., central processing unit, graphics processor, etc.) 401 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)402 or a program loaded from a storage device 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the electronic apparatus 400 are also stored. The processing device 401, the ROM 402, and the RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408 including, for example, tape, hard disk, etc.; and a communication device 409. The communication means 409 may allow the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. While fig. 4 illustrates an electronic device 400 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as a computer software program. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 409, or from the storage device 408, or from the ROM 402. The computer program performs the above-described functions defined in the methods of embodiments of the invention when executed by the processing apparatus 401.
It should be noted that the computer readable medium mentioned above can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In yet another embodiment of the invention, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: generate a ticket file according to the internet traffic data of a user, and extract domain name information from the ticket file, wherein the ticket file comprises at least the domain name information and the MSISDN of the user; match the domain name information against a domain name white list library so as to filter out tickets with normal domain name information; perform rule matching on the domain name information in the filtered ticket files according to the websites/APPs used for phishing, and determine a set of phishing victim users according to the MSISDNs of the users in the ticket files corresponding to the successfully matched domain name information; for any victim user in the victim user set, perform network access behavior restoration and log backtracking text analysis on the access behavior log of the victim user in the website/APP used for phishing to determine the operation behavior of the user while being deceived, input the operation behavior into a pre-trained fraud event hierarchical classification model, and obtain the victim degree information of the user output by the fraud event hierarchical classification model; and determine deep victim users according to the victim degree information of each user in the victim user set.
Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or hardware. Where the name of a unit does not in some cases constitute a limitation of the unit itself, for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
The foregoing description is only a preferred embodiment of the invention and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure in the embodiments of the present invention is not limited to the specific combinations of the above-described features, but also encompasses other embodiments in which any combination of the above-described features or their equivalents is possible without departing from the spirit of the disclosure. For example, a technical solution may be formed by replacing the above features with features having similar functions that are disclosed in (but not limited to) the embodiments of the present invention.
Claims (10)
1. A method for detecting phishing deep victim users, comprising:
generating a ticket file according to the internet traffic data of a user, and extracting domain name information from the ticket file, wherein the ticket file comprises at least the domain name information and the MSISDN of the user;
matching the domain name information against a domain name white list library so as to filter out tickets with normal domain name information;
performing rule matching on the domain name information in the filtered ticket files according to websites/APPs used for phishing, and determining a set of phishing victim users according to the MSISDNs of the users in the ticket files corresponding to the successfully matched domain name information;
for any victim user in the victim user set, performing network access behavior restoration and log backtracking text analysis on the access behavior log of the victim user in the website/APP used for phishing to determine the operation behavior of the user while being deceived, inputting the operation behavior into a pre-trained fraud event hierarchical classification model, and obtaining the victim degree information of the user output by the fraud event hierarchical classification model;
and determining deep victim users according to the victim degree information of each user in the victim user set.
2. The method of claim 1, further comprising, after determining deep victim users according to the victim degree information of each user in the victim user set:
sending early warning reminder information to each deep victim user.
3. The method of claim 1, wherein the ticket file further comprises internet surfing time, a reporting operator, an IMSI, an IMEI, a URL, a top-level domain name, a province of the mobile phone, a city of the mobile phone, a destination IP, a country of the destination IP, and a province of the destination IP.
4. The method of claim 1, wherein determining the operation behavior of the user while being deceived comprises:
determining, for the visiting user while being deceived, the behavior of inputting a bank card number, the behavior of accessing a malicious website/fraud website, the behavior of downloading a Trojan horse program, and the behavior of installing the Trojan horse program.
5. The method of claim 4, further comprising, before determining the behavior of inputting a bank card number, the behavior of accessing a malicious website/fraud website, the behavior of downloading a Trojan horse program, and the behavior of installing the Trojan horse program of the visiting user while being deceived:
outputting a domain name sample from a known feature library, and performing website family information extraction, main control information extraction, control mailbox extraction, control terminal IP extraction and control terminal mobile phone number tracing analysis on the domain name sample to determine the malicious website/fraud website and the Trojan horse program.
6. The method of claim 1, wherein performing network access behavior restoration and log backtracking text analysis on the access behavior log of any victim user in the victim user set in the website/APP used for phishing comprises:
performing network access behavior restoration and log backtracking text analysis on the fraud event using the website/APP internet access behavior logs within the fraud event time window, so as to determine all operation behaviors of each victim user while being deceived.
7. The method according to one of claims 1 to 6, wherein said fraud event classification model is trained by:
acquiring a training sample set, wherein each training sample comprises the operation behavior of a user while being deceived and annotation information representing the degree to which the user was deceived;
determining an initialized fraud event classification model, wherein the initialized fraud event classification model comprises a target layer for outputting the victim degree of a user;
and, by using a machine learning method, taking the operation behaviors of users while being deceived in the training samples of the training sample set as the input of the initialized fraud event hierarchical classification model, taking the annotation information corresponding to the input operation behaviors as the expected output of the initialized fraud event hierarchical classification model, and training to obtain the fraud event hierarchical classification model.
8. An apparatus for detecting phishing deep victim users, comprising:
a domain name extraction unit configured to generate a ticket file according to the internet traffic data of a user and extract domain name information from the ticket file, wherein the ticket file comprises at least the domain name information and the MSISDN of the user;
a white list filtering unit configured to match the domain name information against a domain name white list library so as to filter out tickets with normal domain name information;
a rule matching unit configured to perform rule matching on the domain name information in the filtered ticket files according to websites/APPs used for phishing, and determine a set of phishing victim users according to the MSISDNs of the users in the ticket files corresponding to the successfully matched domain name information;
a victim degree determination unit configured to, for any victim user in the victim user set, perform network access behavior restoration and log backtracking text analysis on the access behavior log of the victim user in the website/APP used for phishing so as to determine the operation behavior of the user while being deceived, and input the operation behavior into a pre-trained fraud event hierarchical classification model to obtain the victim degree information of the user output by the fraud event hierarchical classification model;
and a deep victim user determination unit configured to determine deep victim users according to the victim degree information of each user in the victim user set.
9. An electronic device, comprising:
one or more processors; and
a memory storing executable instructions that, when executed by the one or more processors, cause the electronic device to perform the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011322821.9A CN112491864A (en) | 2020-11-23 | 2020-11-23 | Method, device, equipment and medium for detecting phishing deep victim user |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112491864A true CN112491864A (en) | 2021-03-12 |
Family
ID=74933089
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011322821.9A (Pending), CN112491864A (en) | 2020-11-23 | 2020-11-23 | Method, device, equipment and medium for detecting phishing deep victim user |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112491864A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106791220A (en) * | 2016-11-04 | 2017-05-31 | 国家计算机网络与信息安全管理中心 | Prevent the method and system of telephone fraud |
CN108243049A (en) * | 2016-12-27 | 2018-07-03 | 中国移动通信集团浙江有限公司 | Telecoms Fraud recognition methods and device |
CN108449319A (en) * | 2018-02-09 | 2018-08-24 | 秦玉海 | A kind of method and device of identification swindle website and the evidence obtaining of long-range wooden horse |
US20190158535A1 (en) * | 2017-11-21 | 2019-05-23 | Biocatch Ltd. | Device, System, and Method of Detecting Vishing Attacks |
CN110839216A (en) * | 2018-08-17 | 2020-02-25 | 中国移动通信集团广东有限公司 | Method and device for identifying communication information fraud |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113452670A (en) * | 2021-04-30 | 2021-09-28 | 恒安嘉新(北京)科技股份公司 | Phishing blocking method, device, equipment and medium based on SDN network |
CN113452670B (en) * | 2021-04-30 | 2023-07-28 | 恒安嘉新(北京)科技股份公司 | Phishing blocking method, device, equipment and medium based on SDN network |
CN113518075A (en) * | 2021-05-14 | 2021-10-19 | 恒安嘉新(北京)科技股份公司 | Phishing early warning method and device, electronic equipment and storage medium |
CN113518075B (en) * | 2021-05-14 | 2023-10-17 | 恒安嘉新(北京)科技股份公司 | Phishing warning method, device, electronic equipment and storage medium |
CN113177674A (en) * | 2021-05-28 | 2021-07-27 | 恒安嘉新(北京)科技股份公司 | Phishing early warning method, device, equipment and medium |
CN113923011A (en) * | 2021-09-30 | 2022-01-11 | 北京恒安嘉新安全技术有限公司 | Phishing early warning method and device, computer equipment and storage medium |
CN113923011B (en) * | 2021-09-30 | 2023-10-17 | 北京恒安嘉新安全技术有限公司 | Phishing early warning method, device, computer equipment and storage medium |
CN113923669A (en) * | 2021-11-10 | 2022-01-11 | 恒安嘉新(北京)科技股份公司 | Anti-fraud early warning method, device, equipment and medium for multi-source cross-platform fusion |
CN114363039A (en) * | 2021-12-30 | 2022-04-15 | 恒安嘉新(北京)科技股份公司 | Method, device, equipment and storage medium for identifying fraud websites |
CN115022464A (en) * | 2022-05-06 | 2022-09-06 | 中国联合网络通信集团有限公司 | Number processing method, system, computing device and storage medium |
CN117254983A (en) * | 2023-11-20 | 2023-12-19 | 卓望数码技术(深圳)有限公司 | Method, device, equipment and storage medium for detecting fraud-related websites |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210312 |