CN116095035A - Method and device for acquiring context information of mailbox - Google Patents

Info

Publication number
CN116095035A
Authority
CN
China
Prior art keywords
mailbox
information
entity
context information
entities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211686067.6A
Other languages
Chinese (zh)
Inventor
白敏�
黄朝文
李敏
汪列军
王胜利
万文杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianxin Technology Group Co Ltd
Secworld Information Technology Beijing Co Ltd
Original Assignee
Qianxin Technology Group Co Ltd
Secworld Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qianxin Technology Group Co Ltd and Secworld Information Technology Beijing Co Ltd
Priority to CN202211686067.6A
Publication of CN116095035A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/101 Access control lists [ACL]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/145 Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms

Abstract

The application provides a method and a device for acquiring mailbox context information. The method comprises the following steps: acquiring a mailbox entity to be identified; matching the mailbox entity to be identified against the mailbox entities in a mailbox reputation information library, wherein the mailbox reputation information library comprises various mailbox entities and their context information, and the context information that the library holds for a mailbox entity is richer than the context information that can be extracted from that mailbox entity alone; if the matching succeeds, outputting the mailbox entity in the mailbox reputation information library that matches the mailbox entity to be identified, together with its context information; if the matching fails, outputting null. An analyst can then track and trace the mailbox entity to be identified efficiently and accurately based on the rich context information obtained.

Description

Method and device for acquiring context information of mailbox
Technical Field
The present invention relates to the field of network security technologies, and in particular, to a method and an apparatus for acquiring mailbox context information, and a method and an apparatus for generating a mailbox reputation information library.
Background
In a security analysis scenario, an analyst needs to track and locate an attacking entity with a reputation research and judgment tool. The most typical attacking entities are mailbox entities, because most attackers use phishing mailbox entities, malicious inline mailbox entities, and the like to attack users and acquire their data. Reputation research and judgment on mailbox entities has therefore become an important link in security analysis.
At present, reputation research and judgment on mailbox entities is mainly carried out as follows: the analyst inputs the mailbox entity to be judged into the reputation research and judgment tool, and the tool analyzes the mailbox entity and outputs an analysis result. The analysis result typically states whether the mailbox entity is malicious and includes some basic context information extracted from the mailbox entity. If the analysis result shows that the mailbox entity is malicious, the analyst then needs to track and trace it based on this context information.
However, the context information provided by the reputation research and judgment tool is relatively basic, so an analyst who tracks and traces a malicious mailbox entity in this way cannot do so very accurately. Sometimes the analyst even has to manually correlate related information from more dimensions for the malicious mailbox entity, which reduces both the efficiency and the accuracy of tracking and tracing.
Disclosure of Invention
The embodiment of the application aims to provide a method and a device for acquiring mailbox context information and a method and a device for generating a mailbox reputation information library so as to improve the efficiency and accuracy of tracking and tracing by an analyst.
In order to solve the technical problems, the embodiment of the application provides the following technical scheme:
A first aspect of the present application provides a method for acquiring mailbox context information, where the method includes: acquiring a mailbox entity to be identified; matching the mailbox entity to be identified against the mailbox entities in a mailbox reputation information library, wherein the mailbox reputation information library comprises various mailbox entities and their context information, and the context information that the library holds for a mailbox entity is richer than the context information that can be extracted from that mailbox entity alone; if the matching succeeds, outputting the mailbox entity in the mailbox reputation information library that matches the mailbox entity to be identified, together with its context information; if the matching fails, outputting null.
A second aspect of the present application provides a method for generating a mailbox reputation information base, where the method includes: collecting a plurality of mailbox entities through a plurality of information sources; extracting context information from the mailbox entities respectively, wherein the context information is related to mailbox entity reputation research; integrating the context information of the same mailbox entity; reputation research and judgment are carried out on the integrated context information, and research and judgment results of all mailbox entities are generated; and adding the research and judgment results of each mailbox entity into the corresponding integrated context information and storing the integrated context information in a mailbox reputation information library.
A third aspect of the present application provides an apparatus for acquiring mailbox context information, where the apparatus includes: a receiving module, used for acquiring a mailbox entity to be identified; a matching module, used for matching the mailbox entity to be identified against the mailbox entities in the mailbox reputation information library, wherein the mailbox reputation information library comprises various mailbox entities and their context information, and the context information that the library holds for a mailbox entity is richer than the context information that can be extracted from that mailbox entity alone; if the matching succeeds, control passes to an output module, which is used for outputting the mailbox entity in the mailbox reputation information library that matches the mailbox entity to be identified, together with its context information; if the matching fails, control passes to an adding module, which is used for extracting the context information from the mailbox entity to be identified and storing the extracted context information and the mailbox entity to be identified in the mailbox reputation information library.
A fourth aspect of the present application provides a device for generating a mailbox reputation information base, where the device includes: the acquisition module is used for acquiring a plurality of mailbox entities through a plurality of information sources; the extraction module is used for respectively extracting context information from the mailbox entities, and the context information is related to the reputation research and judgment of the mailbox entities; the integrating module is used for integrating the context information of the same mailbox entity; the judging module is used for carrying out reputation judgment on the integrated context information and generating judging results of all mailbox entities; and the storage module is used for adding the research and judgment results of each mailbox entity into the corresponding integrated context information and storing the integrated context information in the mailbox reputation information library.
A fifth aspect of the present application provides a computer storage medium having stored thereon a computer program which when executed by a processor performs the method of the first or second aspect.
A sixth aspect of the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of the first or second aspect when executing the program.
Compared with the prior art, when the context information of a mailbox entity to be identified needs to be acquired, the method of the first aspect uses a mailbox reputation information library containing various mailbox entities and their rich context information: the mailbox entity identical to the mailbox entity to be identified is matched in the library, and its context information is provided to the analyst as the context information of the mailbox entity to be identified. The analyst thus obtains richer context information for the mailbox entity to be identified and can track and trace it efficiently and accurately based on that information.
The method for generating the mailbox reputation information base, the device for acquiring the mailbox context information, the device for generating the mailbox reputation information base, the computer storage medium and the electronic equipment have the same or similar beneficial effects as the method for acquiring the mailbox context information.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present application will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. Several embodiments of the present application are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings, in which like reference numerals refer to similar or corresponding parts and in which:
FIG. 1 is a flow chart of a method for acquiring mailbox context information in an embodiment of the present application;
FIG. 2 is a schematic diagram of a production processing architecture of mailbox reputation information in an embodiment of the present application;
FIG. 3 is a flow chart of a method for generating a mailbox reputation information library in an embodiment of the present application;
FIG. 4 is a schematic diagram of context information of a mailbox entity extracted in an embodiment of the present application;
FIG. 5 is a schematic diagram of context information of a mailbox entity output in an embodiment of the present application;
FIG. 6 is a schematic diagram of a production flow of a mailbox reputation information library in an embodiment of the present application;
FIG. 7 is a schematic diagram of a deployment flow of a mailbox reputation information library in an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an apparatus for acquiring mailbox context information in an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a device for generating a mailbox reputation information library in an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
Currently, an analyst generally obtains the context information of a target mailbox either manually or through an existing mailbox reputation service. Manual acquisition is limited by differences in the personal experience of analysts, so it is inefficient, and the accuracy and comprehensiveness of the acquired context information vary. The context information obtained through a mailbox reputation service is relatively basic and cannot help an analyst track and trace the target mailbox accurately. Acquiring mailbox context information in these ways, and tracking and tracing based on it, therefore reduces the analyst's efficiency and accuracy.
Through intensive research, the inventors found that the main reason for the analyst's low efficiency and accuracy in tracking and tracing is that the acquired mailbox context information is not rich enough, and that this in turn is because the context information held in the mailbox reputation information library that supplies it is not rich enough. If a mailbox reputation information library containing a large amount of context information for various mailboxes can be provided, rich context information of the target mailbox can be looked up in that library, which helps the analyst track and trace the target mailbox more efficiently and accurately.
In view of this, the embodiments of the present application provide a method and an apparatus for acquiring context information of a mailbox, and a method and an apparatus for generating a mailbox reputation information library. By establishing a mailbox reputation information library that includes rich context information for various mailboxes, an analyst who needs to track and trace a mailbox can acquire rich context information of that mailbox from the library and then track and trace the target mailbox based on it.
Next, a detailed description is first given of a method for acquiring mailbox context information provided in the embodiment of the present application.
FIG. 1 is a flow chart of a method for acquiring mailbox context information in an embodiment of the present application. Referring to FIG. 1, the method may include:
S101: Acquire a mailbox entity to be identified.
The mailbox entity to be identified is the mailbox entity that the analyst needs to track and trace, that is, the mailbox entity whose context information needs to be acquired. A mailbox entity is either a mailbox account, for example zhangsan@163.com, or a mail.
S102: Match the mailbox entity to be identified against the mailbox entities in the mailbox reputation information library. If the matching succeeds, step S103 is executed; if the matching fails, step S104 is executed.
The mailbox reputation information library comprises various mailbox entities and their context information, and the context information that the library holds for a mailbox entity is richer than the context information that can be extracted from that mailbox entity alone.
To acquire rich context information for a mailbox entity, the context information can be obtained from a mailbox reputation information library that contains various mailbox entities together with their rich context information.
Rich context information can be obtained from the mailbox reputation information library mainly because the library stores mailbox entities collected from various information sources, so the context information extracted for each mailbox entity is rich. For example, assume that mailbox entity 1 and mailbox entity 2 are collected from information source A, that context information a and context information b are extracted from mailbox entity 1, and that context information c is extracted from mailbox entity 2. Assume further that mailbox entity 1 is also collected from information source B, and that context information d is extracted from it there. The mailbox reputation information library then stores context information a, b and d for mailbox entity 1 and context information c for mailbox entity 2. When an analyst needs to track and trace mailbox entity 1 and acquires context information only from the copy of mailbox entity 1 at hand, the analyst may obtain only context information a and b. If the mailbox entity 1 at hand is instead matched against the mailbox entities in the mailbox reputation information library, the analyst obtains not only context information a and b but also context information d. The context information the analyst obtains for mailbox entity 1 is therefore richer, and tracking and tracing can be carried out more efficiently and accurately. The mailbox reputation information library may be obtained by integrating existing mailbox information libraries or may be created from scratch (the creation process is described in detail later); this is not limited herein.
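The example with information sources A and B can be written out as a minimal Python sketch. The tuple layout, the placeholder addresses and the function name `build_library` are assumptions made for illustration; the patent does not prescribe a data model.

```python
from collections import defaultdict

# Hypothetical observations: (information source, mailbox entity, context item).
observations = [
    ("A", "mailbox1@example.com", "a"),
    ("A", "mailbox1@example.com", "b"),
    ("A", "mailbox2@example.com", "c"),
    ("B", "mailbox1@example.com", "d"),
]


def build_library(observations):
    """Group context items by mailbox entity across all information sources."""
    library = defaultdict(list)
    for _source, entity, item in observations:
        # Keep each context item once per entity, in first-seen order.
        if item not in library[entity]:
            library[entity].append(item)
    return dict(library)


library = build_library(observations)
print(library["mailbox1@example.com"])  # ['a', 'b', 'd']
print(library["mailbox2@example.com"])  # ['c']
```

Looking up mailbox entity 1 in this structure returns context items a, b and d, whereas source A alone would yield only a and b.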
When matching the mailbox entity to be identified against the mailbox reputation information library, the mailbox entity to be identified may be compared with each mailbox entity in the library in turn. If it is found to be identical to a mailbox entity in the library, the matching is considered successful, and the remaining mailbox entities in the library are not compared. Stopping at the first match improves matching efficiency, which in turn improves the efficiency of acquiring mailbox context information and of the analyst's tracking and tracing. If the comparison reaches the last mailbox entity in the library without a match, the library is considered to contain no mailbox entity matching the mailbox entity to be identified. In that case the library holds no rich context information for the mailbox entity to be identified, and the analyst can only acquire its context information in other ways.
S103: Output the mailbox entity in the mailbox reputation information library that matches the mailbox entity to be identified, together with its context information.
When the mailbox entity to be identified is successfully matched with a mailbox entity in the mailbox reputation information library, the context information of that mailbox entity in the library is output to the analyst as the context information of the mailbox entity to be identified. Because the context information held in the library is more comprehensive and richer than the context information extracted from the mailbox entity to be identified alone, the analyst can track and trace the mailbox entity to be identified more efficiently and accurately based on it.
S104: The output is null.
When the mailbox entity to be identified does not match any mailbox entity in the mailbox reputation information library, the library cannot provide rich context information for that mailbox entity, so the output can only be null.
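Steps S101 to S104 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the text describes comparing the entity with each library entry in turn and stopping at the first match, while the sketch substitutes a dictionary lookup that gives the same result. The library contents, the field names and the `normalize` step (trimming and lower-casing before comparison) are all hypothetical.

```python
from typing import Optional

# Hypothetical library: mailbox entity -> integrated context information.
REPUTATION_LIBRARY = {
    "zhangsan@163.com": {"verdict": "malicious", "sources": ["IOC", "Whois"]},
}


def normalize(entity: str) -> str:
    # Assumption: entities are compared after trimming and lower-casing.
    return entity.strip().lower()


def get_context(entity: str, library: dict) -> Optional[dict]:
    """Return the matched entity with its context, or None when nothing matches."""
    key = normalize(entity)          # S101: the mailbox entity to be identified
    context = library.get(key)       # S102: match against the library
    if context is None:
        return None                  # S104: the output is null
    return {"entity": key, "context": context}  # S103: entity plus context


hit = get_context(" ZhangSan@163.com ", REPUTATION_LIBRARY)
miss = get_context("lisi@example.com", REPUTATION_LIBRARY)
```

Here `hit` carries the stored context for zhangsan@163.com and `miss` is `None`, which corresponds to the null output of S104.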
Meanwhile, to fill such gaps in the mailbox reputation information library, new mailbox reputation information needs to be produced continuously. When new mailbox reputation information has been added to the library, a mailbox entity for which the output was previously null can be matched against the library again. If the matching now succeeds, a prompt can be sent to the analyst who previously queried that mailbox entity, indicating that the mailbox reputation information library now holds context information for it.
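One way to picture this re-matching is the sketch below. It assumes that null-result queries are remembered as a mapping from mailbox entity to the analysts who asked for it; that bookkeeping, the analyst identifiers and the function name `recheck_pending` are illustrative assumptions, since the text does not say how earlier queries are tracked or how prompts are delivered.

```python
def recheck_pending(pending: dict, library: dict) -> list:
    """Re-match earlier null-result queries after the library has been updated.

    pending maps a mailbox entity to the set of analysts who queried it.
    Returns (analyst, entity) pairs to notify and removes the matched
    entities from pending.
    """
    notifications = []
    for entity in list(pending):          # copy keys: pending is mutated below
        if entity in library:
            for analyst in sorted(pending.pop(entity)):
                notifications.append((analyst, entity))
    return notifications


# Hypothetical state: two earlier queries returned null.
pending = {
    "phish@example.net": {"analyst-2", "analyst-1"},
    "unknown@example.org": {"analyst-3"},
}
# The library later gains an entry for one of them.
library = {"phish@example.net": {"verdict": "malicious"}}

to_notify = recheck_pending(pending, library)
```

After the call, `to_notify` lists both analysts who asked about phish@example.net, and only unknown@example.org is still pending.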
Here, the context information may refer to all information related to the mailbox entity. It may include not only the sender, the subject and the like, but also the reputation research and judgment result of the mailbox. The specific content of the context information is not limited herein.
As can be seen from the foregoing, in the method for acquiring mailbox context information provided in the embodiments of the present application, when the context information of a mailbox entity to be identified needs to be acquired, a mailbox reputation information library containing various mailbox entities and their rich context information is used: the mailbox entity identical to the mailbox entity to be identified is matched in the library, and its context information is provided to the analyst as the context information of the mailbox entity to be identified. The analyst thus obtains richer context information for the mailbox entity to be identified and can track and trace it efficiently and accurately based on that information.
Further, as a refinement and expansion of the above mailbox reputation information library, a description will be given in detail of a creation process of the mailbox reputation information library.
Before this, a description is given of a production processing architecture of each mailbox reputation information in the mailbox reputation information library. Fig. 2 is a schematic diagram of a production processing architecture of mailbox reputation information in an embodiment of the present application, and referring to fig. 2, the architecture may include: the system comprises a data access layer, a data storage layer, a data processing layer and an application layer.
1. Data access layer: accesses multiple types of information sources. At this stage the data from each source is not yet mailbox reputation information; it is the object of comprehensive research, judgment and analysis, that is, the object from which entities are to be extracted. The types of information sources may include, but are not limited to: open source data, domain name query protocol (Whois) records, indicator of compromise (IOC) information, threat intelligence (TI) data, web crawler data, and black/white lists.
The data access layer may specifically include: a Whois information module, an open source information module, a crawler information module, an IOC information module, a black/white list module, and a file authentication information module.
(1) Whois information module: for receiving Whois records.
(2) Open source information module: for receiving open source data.
(3) Crawler information module: for receiving web crawler data.
(4) IOC information module: for receiving IOC information.
(5) Black/white list module: for receiving a black/white list of mailboxes.
(6) File authentication information module: for receiving authentication information for a file in a mailbox.
2. Data storage layer: stores the various kinds of context information of mailbox entities and associates them with one another; the data enters the production flow and is written to storage after being processed by buffer-queue tasks.
The data storage layer may specifically include: a data storage module, a mailbox reputation level evaluation module, and a real-time dynamic reputation data module.
(1) Data storage module: for storing the various types of context information of mailbox entities.
(2) Mailbox reputation level evaluation module: for evaluating the reputation level of a mailbox entity according to its various context information.
(3) Real-time dynamic reputation data module: for dynamically updating the reputation level of a mailbox entity in real time.
3. Data processing layer: extracts data of different types and sources according to entity extraction rules, and performs operations such as integration, association and warehousing on it.
The data processing layer may specifically include: a task scheduling module, a data analysis module, a data extraction module, a data conversion module, and a rule definition module.
(1) Task scheduling module: for scheduling the data analysis module, the data extraction module and the data conversion module.
(2) Data analysis module: for parsing data of different types and sources.
(3) Data extraction module: for extracting context information of mailbox entities from data of different types and sources according to the entity extraction rules.
(4) Data conversion module: for converting the extracted context information of mailbox entities into a preset format.
(5) Rule definition module: for defining the entity extraction rules.
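The parse, extract and convert stages of the data processing layer can be sketched as a small pipeline. Everything concrete here is an assumption for illustration: the rule is a deliberately simple regular expression rather than a full address grammar, the "preset format" is a plain dictionary, and the function names do not come from the patent.

```python
import json
import re

# Hypothetical entity extraction rule (rule definition module):
# a deliberately simple e-mail pattern, not a complete address grammar.
EMAIL_RULE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def parse(raw: str, kind: str) -> str:
    """Data analysis module: turn a raw record into plain text."""
    if kind == "json":
        return " ".join(str(value) for value in json.loads(raw).values())
    return raw


def extract(text: str) -> list:
    """Data extraction module: apply the rule and de-duplicate the matches."""
    return sorted({match.lower() for match in EMAIL_RULE.findall(text)})


def convert(entity: str, source: str) -> dict:
    """Data conversion module: emit the (assumed) preset record format."""
    return {"entity": entity, "source": source}


def process(raw: str, kind: str, source: str) -> list:
    """Task scheduling module: run the three stages in order."""
    return [convert(entity, source) for entity in extract(parse(raw, kind))]


records = process(
    '{"registrant_email": "Admin@Example.com", "domain": "example.com"}',
    "json",
    "whois",
)
print(records)  # [{'entity': 'admin@example.com', 'source': 'whois'}]
```

Keeping the rule separate from the stages mirrors the separate rule definition module: the pattern can be replaced without touching parsing or conversion.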
4. Application layer: checks mailbox research and judgment results against a preset mailbox whitelist, builds reputation profiles of mailbox entities, provides reputation query services, and so on. It also encapsulates the interface fields of the data and provides an application program interface (API), exposing the API or application calls externally so that the data delivers value for threat analysis.
The application layer may specifically include: a whitelist pre-check module, a mailbox reputation profiling module, a mailbox reputation query module, and a mailbox reputation service module.
(1) Whitelist pre-check module: for storing a preset whitelist and checking mailbox research and judgment results against it.
(2) Mailbox reputation profiling module: for building a profile of the mailbox according to the context information and the reputation research and judgment result of the mailbox entity.
(3) Mailbox reputation query module: for querying and outputting all relevant information of a mailbox entity according to the mailbox entity input by the user.
(4) Mailbox reputation service module: for providing other services related to mailbox reputation queries.
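The whitelist check in the application layer might look like the sketch below. The text only says that judgment results are checked against a preset whitelist; the policy shown here (a whitelisted mailbox is never reported as malicious) and the verdict labels are assumptions made for illustration.

```python
# Hypothetical preset whitelist of trusted mailbox entities.
PRESET_WHITELIST = {"noreply@example.com"}


def check_verdict(entity: str, verdict: str, whitelist: set) -> str:
    """Check a judgment result against the whitelist.

    Assumed policy: a whitelisted entity judged malicious is treated as a
    false positive and reported as benign; all other verdicts pass through.
    """
    if entity in whitelist and verdict == "malicious":
        return "benign"
    return verdict


checked = [
    check_verdict("noreply@example.com", "malicious", PRESET_WHITELIST),
    check_verdict("phish@example.net", "malicious", PRESET_WHITELIST),
    check_verdict("noreply@example.com", "unknown", PRESET_WHITELIST),
]
```

With this policy `checked` evaluates to `["benign", "malicious", "unknown"]`; a deployment could equally flag such conflicts for manual review instead of overriding them.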
The above is an overview of the production architecture of the mailbox reputation information in the mailbox reputation information library; the creation process of the library is described in detail below. FIG. 3 is a flow chart of a method for generating a mailbox reputation information library in an embodiment of the present application. Referring to FIG. 3, the method may include:
S301: Collect a plurality of mailbox entities through a plurality of information sources.
In order to construct the mailbox reputation information library, so that mailbox entities in the mailbox reputation information library are richer, and context information corresponding to the mailbox entities is richer, all information related to the mailbox entities needs to be collected from multiple parties.
Specifically, the step S301 may include:
Step A: Collect a plurality of mailbox entities from open source data, domain name query protocol (Whois) records, indicator of compromise (IOC) information, threat intelligence (TI) data, large-scale web crawler data, and black/white lists.
The open source data generally refers to data that is published on the Internet and can be obtained directly by any user through the user's own terminal.
Whois, the domain name query protocol, is used to query a database for whether a domain name has been registered and for the registration details of the domain name. Data related to mailbox entities can also be obtained from Whois records.
IOC information: IOCs are one of the most effective means of discovering threats. The types of entities through which IOCs enable threat discovery include, but are not limited to, Internet Protocol (IP) addresses, domain names, file hashes (Hash) and mail addresses. IOCs provide indications for threat research and judgment and reveal signs of various kinds of attack activity. Various information about mailbox entities can therefore also be obtained from IOC information.
TI data: threat intelligence data contains various mailbox reputation information, so various information about mailbox entities can be obtained from the TI data.
Large-scale web crawler data: crawlers can collect all kinds of data on the Internet, so various information related to mailbox entities can be crawled.
The black/white list, i.e. the black/white list of mailboxes, is itself one type of information about mailbox entities.
Of course, mailbox entities may also be obtained from other sources other than the above sources, for example: a personally provided mailbox entity, etc. The specific information sources for collecting the plurality of mailbox entities are not limited herein.
According to the above, collecting mailbox entities from multiple information sources maximizes the number of mailbox entities obtained, so that the mailbox information in the mailbox reputation information library is ultimately richer and richer context information can be provided.
S302: context information is extracted from the plurality of mailbox entities, respectively.
Wherein the context information is related to mailbox entity reputation research.
After collecting a plurality of mailbox entities, the information related to the mailbox entities and capable of being used in mailbox reputation evaluation needs to be extracted, namely, the context information is extracted, so that the richness of the related information of the mailbox entities in the mailbox reputation information library is maximized.
Specifically, the step S302 may include:
Step B: Extract at least one of a sender, a mail label, an IP address, a mail subject list, an associated sample, a mailbox entity Whois registration information reverse lookup result, a malicious label, a domain name and an active time attribute from the plurality of mailbox entities as the context information respectively corresponding to the plurality of mailbox entities.
That is, for each mailbox entity, at least one of its sender, mail label, IP address, mail subject list, associated sample, mailbox entity Whois registration information reverse lookup result, malicious label, domain name and active time attribute needs to be extracted and used as the context information of the corresponding mailbox entity.
The sender refers to the party that sends the mail, for example: Zhang San.
Mail labels may specifically include: mailbox suffixes (i.e., free public mail services such as 163.com), recipients (which may be, in particular, recipient organization lists, industry lists, etc.), and the like.
The IP address may specifically include: the mail server IP address, the sender's real IP address, etc.
The mail subject list is the collection of mail subject lines.
The associated sample is a content sample that matches certain content of the mail. Specifically, it may include: samples in which the mailbox is the sender mailbox, samples in which the mailbox is the recipient mailbox, attachments in sender samples, and samples in which the mailbox is embedded in a Trojan horse.
The mailbox entity Whois registration information reverse lookup result is the information related to the mailbox that is found by querying Whois.
Malicious labels are labels carried by the mail that indicate the nature of the mail. Specifically: spam, phish (phishing), fraud, advertisement, APT and hacked.
The domain name may specifically include: the main MX server, the organization and the industry of the domain name.
The active time attribute may specifically include: first active time, last active time.
To show more clearly which context information is extracted, it is illustrated here in a diagram. Fig. 4 is a schematic diagram of the extracted context information of the mailbox entity in the embodiment of the present application. Referring to fig. 4, the extracted context information of the mailbox entity, that is, the mail information enrichment information, mainly includes the following major categories: sender, mail label, IP address, mail subject list, associated sample, Whois registration information reverse lookup result, malicious label, domain name and active time attribute. Subcategories are listed under some of the major categories; they are shown explicitly in fig. 4 and are not described again here.
According to the content, the context information related to the mailbox reputation research and judgment is extracted from the mailbox entity, so that the mailbox entity can be subjected to reputation evaluation based on the context information, and the context information is stored in the mailbox reputation information library, so that the information in the mailbox reputation information library is richer, richer information data is provided for the mailbox entity to be identified, and the efficiency and accuracy of an analyst in tracing and tracing the mailbox entity are improved.
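The categories of context information described above can be pictured as one record per mailbox entity. The following is a minimal sketch only: the class name `MailboxContext` and every field name are hypothetical, chosen to mirror the categories named in step B, and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MailboxContext:
    """Hypothetical record mirroring the context categories of step B."""
    email: str
    senders: List[str] = field(default_factory=list)
    mail_labels: List[str] = field(default_factory=list)      # e.g. mailbox suffix, recipients
    ip_addresses: List[str] = field(default_factory=list)     # mail server IP, sender real IP
    subjects: List[str] = field(default_factory=list)
    associated_samples: List[str] = field(default_factory=list)
    whois_reverse_lookup: List[str] = field(default_factory=list)
    malicious_labels: List[str] = field(default_factory=list)  # e.g. spam, phish, APT
    domain: Optional[str] = None
    first_active: Optional[str] = None
    last_active: Optional[str] = None

    def populated_categories(self) -> List[str]:
        """Return the names of the categories that hold at least one value."""
        return [name for name, value in vars(self).items()
                if name != "email" and value]
```

Since step B only requires at least one category to be extracted, every category in the sketch is optional or defaults to an empty list.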
Mailbox entities obtained from different information sources differ in form. To extract the corresponding sender, label, IP address, mail subject list, associated sample, Whois registration information reverse lookup result, malicious label, domain name and active time attribute from these mailbox entities, the information extraction method provided by the embodiment of the present application can be adopted.
Specifically, the step B may include:
Step B1: Extract the mail_info, target, file and type fields from the raw data of the identification reports of the plurality of mailbox entities, to obtain first sub-information respectively corresponding to the plurality of mailbox entities.
That is, the raw data field mail_info.sender is extracted from the mail-related reports produced during the identification of all files. The field information is a list; the items that contain the 'mail_info' field are selected, fields such as target, file and type are extracted, and the email and msg parts are processed separately.
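Step B1 can be sketched as a filter over report items. This is an illustration under stated assumptions: the report is assumed to be a list of dictionaries in which `mail_info` is a nested dictionary holding `sender`; the function name `extract_first_sub_info` and this exact layout are hypothetical, since the application does not show the raw report format.

```python
from typing import Any, Dict, List


def extract_first_sub_info(report_items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep items that carry 'mail_info' and pull out the fields named in step B1."""
    extracted = []
    for item in report_items:
        mail_info = item.get("mail_info")
        if not mail_info:
            continue  # not a mail-related report entry
        extracted.append({
            "sender": mail_info.get("sender"),
            "target": item.get("target"),
            "file": item.get("file"),
            "type": item.get("type"),  # e.g. "email" or "msg", handled separately later
        })
    return extracted
```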
Step B2: Extract the message digest algorithm (Message Digest Algorithm, MD5) information of the plurality of mailbox entities, to obtain second sub-information respectively corresponding to the plurality of mailbox entities.
By extracting MD5 information of the mailbox entity, quick research and judgment can be performed based on the content of the mail, so that information in the mailbox reputation information library is more accurate.
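As an illustration of step B2, an MD5 digest of the raw mail content can be computed with Python's standard `hashlib` module. The function name `md5_of_mail` is hypothetical, and treating the whole raw message as the hashed input is an assumption; the application does not say exactly which bytes are hashed.

```python
import hashlib


def md5_of_mail(raw_bytes: bytes) -> str:
    """Return the hexadecimal MD5 digest of the raw mail content."""
    return hashlib.md5(raw_bytes).hexdigest()
```

The digest serves here as a content fingerprint that can be looked up against known samples.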
Step B3: Extract the mail-related information from the sandbox data messages of the plurality of mailbox entities, to obtain third sub-information respectively corresponding to the plurality of mailbox entities.
A sandbox is a virtual system program that allows a browser or other programs to run in an isolated environment; the sandbox data message is the report message it produces after the mail is run in the sandbox environment. In this way the actual run-time results of the mail can be obtained and used for research and judgment, so that the information in the mailbox reputation information library is more accurate.
Step B4: Extract the mail exchange (Mail Exchange, MX) records, sender policy framework (Sender Policy Framework, SPF) protocol information and DMARC (Domain-based Message Authentication, Reporting and Conformance) information of the plurality of mailbox entities, to obtain fourth sub-information respectively corresponding to the plurality of mailbox entities.
An MX record points to a mail server and is used by the mail system, when sending mail, to locate the mail server according to the domain suffix of the recipient's address. SPF lets the mail recipient check whether the sending server is authorized to send mail for the sender's domain. DMARC, i.e. domain-based message authentication, reporting and conformance, is a DNS TXT record that a domain can publish to control what happens when message authentication fails. By acquiring the MX records, SPF protocol information and DMARC information of the mailbox entity, richer context information of the mailbox entity can be obtained, which in turn enriches the mailbox reputation information library.
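SPF and DMARC policies are published as DNS TXT records, so once the record text has been fetched it can be split into fields for storage as context information. The sketch below assumes the TXT strings have already been retrieved (the DNS lookup itself is not shown); the function names `parse_dmarc` and `parse_spf` are hypothetical.

```python
from typing import Dict, List, Optional, Union


def parse_dmarc(txt: str) -> Dict[str, str]:
    """Parse a DMARC TXT record such as 'v=DMARC1; p=reject' into tag/value pairs."""
    tags = {}
    for part in txt.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def parse_spf(txt: str) -> Dict[str, Union[Optional[str], List[str]]]:
    """Split an SPF TXT record into its version token and the list of remaining terms."""
    terms = txt.split()
    return {"version": terms[0] if terms else None, "terms": terms[1:]}
```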
Step B5: Detect the first sub-information, the second sub-information, the third sub-information and the fourth sub-information through a compromise detection module, to obtain detection results respectively corresponding to the plurality of mailbox entities.
That is, the mailbox field information is connected to the compromise detection interface, associated and converted according to the returned information such as IOC, malicious family (malpractice_family) and activity (campaign), and mapped into field information such as mail (email), portraits (portraits), labels (tags) and malicious activity (malpractice_activity).
The compromise detection module here is a tool capable of detecting whether entity information is malicious, for example, any of the various existing compromise detection tools. Connecting the entity information to the compromise detection interface enables malicious detection of the entity information.
Step B6: Based on the first sub-information, the second sub-information, the third sub-information, the fourth sub-information and the detection results, determine the sender, label, IP address, mail subject list, associated sample, mailbox entity Whois registration information reverse lookup result, malicious label, domain name and active time attribute respectively corresponding to the plurality of mailbox entities, and use them as the context information respectively corresponding to the plurality of mailbox entities.
After the sub-information is extracted, each piece is assigned to its corresponding category, namely sender, label, IP address, mail subject list, associated sample, mailbox entity Whois registration information reverse lookup result, malicious label, domain name or active time attribute, which finally yields the context information of the mailbox entity.
According to the content, various contextual information related to the mailbox entity can be obtained to the greatest extent through the information extraction mode, so that the information content in the mailbox credit information library is more enriched, and the tracking and tracing efficiency and accuracy of an analyst are improved.
When new extraction rules are generated in the follow-up process, new extraction modes can be added on the basis of the information extraction modes to extract the context information of the follow-up mailbox entities, and the previously collected mailbox entities are subjected to the extraction of the context information according to the newly added extraction modes. In this way, the content in the mailbox reputation information repository can be continually updated and further enriched.
Specifically, after the context information of the mailbox entities has been stored in the library, the method may further include:
Step C1: Receive a newly added extraction rule.
Here, a newly added extraction rule generally refers to an extraction rule for mailbox entity context information that was not covered by the earlier rules and was devised later.
Step C2: Extract context information from the plurality of mailbox entities according to the newly added extraction rule.
That is, for the mailbox entities already in the mailbox reputation information library, in addition to the context information extracted in steps B1 to B6, context information needs to be extracted again according to the newly added extraction rule. After integration and research and judgment, the result is stored under the corresponding mailbox entity in the mailbox reputation information library.
According to the above, the context information of the mailbox entity in the mailbox reputation information library can be continuously enriched by adding the extraction rule and adopting the extraction rule to extract the context information of the mailbox entity existing in the mailbox reputation information library.
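The re-extraction in steps C1 and C2 can be sketched as running a newly registered rule over every entity already stored. This is a simplified illustration: the library is modeled as an in-memory dictionary, a rule is modeled as a function from a mailbox address to a dictionary of new fields, and the name `apply_new_rule` is hypothetical. The subsequent integration and research and judgment are not shown.

```python
from typing import Callable, Dict

Context = Dict[str, object]
ExtractionRule = Callable[[str], Context]


def apply_new_rule(library: Dict[str, Context], rule: ExtractionRule) -> None:
    """Run a newly added rule over every stored mailbox entity and merge new fields in place."""
    for mailbox, context in library.items():
        for key, value in rule(mailbox).items():
            # Keep existing values; only add fields the earlier rules did not produce.
            context.setdefault(key, value)
```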
A specific newly added extraction rule is given below, in which step C2 may specifically include:
Step C21: Extract the domain names in the plurality of mailbox entities respectively.
The domain name (domain) can be extracted from the email field. The domain name portion of the email field (the part after "@") may be extracted with a regular expression. Of course, the domain name portion may also be extracted in other ways, for example by scanning. The specific manner of extracting the domain name portion is not limited here.
Step C22: Judge whether the domain name is associated with a known malicious domain name. If yes, go to step C23; if no, go to step C24.
Step C23: Take the extracted domain name together with the known malicious domain name as the context information of the corresponding mailbox entity.
Step C24: Take the extracted domain name as the context information of the corresponding mailbox entity.
If the domain name part extracted from the email field has an association relation with a known malicious domain name, which indicates that a mailbox entity corresponding to the domain name has a problem, the known malicious domain name needs to be added into the extracted context information, so that the malicious domain name related to the mailbox entity is provided when the context information of the mailbox entity is output later, an analyst is reminded of paying attention, and the tracing efficiency and accuracy of the analyst are improved.
If the domain name part extracted from the email field has no association relationship with the known malicious domain name, it is indicated that the mailbox entity corresponding to the domain name does not find any abnormality at present, and only the context information extracted from the mailbox entity is provided for an analyst to trace and trace.
From the above, it can be known that, by extracting the domain name part from the email field and performing association analysis with the known malicious domain name, the malicious domain name with association relationship is added to the context information of the corresponding mailbox entity, so that when the context information of the mailbox entity is output subsequently, the malicious domain name related to the mailbox entity is provided, the analyst is reminded of paying attention, and the efficiency and accuracy of the analyst in tracking and tracing are improved.
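Steps C21 to C24 can be sketched as follows. The application does not define what counts as an association with a known malicious domain name; the sketch assumes, purely for illustration, that a domain is associated if it equals a known malicious domain or is a subdomain of one. The function names and the `related_malicious_domains` key are hypothetical.

```python
import re
from typing import Dict, Iterable, Optional

# Captures everything after the "@" at the end of an address.
_DOMAIN_RE = re.compile(r"@([A-Za-z0-9.-]+)$")


def extract_domain(email: str) -> Optional[str]:
    """Step C21: extract the lower-cased domain part of an email address."""
    match = _DOMAIN_RE.search(email.strip())
    return match.group(1).lower() if match else None


def domain_context(email: str, known_malicious: Iterable[str]) -> Dict[str, object]:
    """Steps C22 to C24: attach related malicious domains when an association exists."""
    domain = extract_domain(email)
    context: Dict[str, object] = {"domain": domain}
    if domain is None:
        return context
    # Assumed association test: exact match or subdomain of a known malicious domain.
    related = [bad for bad in known_malicious
               if domain == bad or domain.endswith("." + bad)]
    if related:
        context["related_malicious_domains"] = related  # step C23
    return context  # step C24 when nothing is related
```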
S303: Integrate the context information of the same mailbox entity.
After the context information of each mailbox entity has been extracted, mailboxes collected from different information sources may turn out to be the same mailbox. To avoid the same mailbox entity in the mailbox reputation information library corresponding to repeated context information, the context information of the same mailbox entity needs to be integrated, i.e. merged and de-duplicated.
For example, the context information of mailbox entity 1 collected from information source A is a, b, c, and the context information of mailbox entity 1 collected from information source B is b, c, d. The context items b and c collected from the two information sources are repeated, so integration is needed to obtain the context information a, b, c, d of mailbox entity 1.
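The merge-and-deduplicate step in this example can be sketched as below. Treating addresses that differ only in letter case as the same mailbox entity is an assumption of the sketch, and the function name `integrate` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def integrate(records: Iterable[Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Merge context items per mailbox entity, dropping duplicates and keeping first-seen order."""
    merged: Dict[str, List[str]] = {}
    for mailbox, items in records:
        bucket = merged.setdefault(mailbox.lower(), [])  # assumed: case-insensitive identity
        for item in items:
            if item not in bucket:
                bucket.append(item)
    return merged
```

Feeding it the two source records from the example, (a, b, c) and (b, c, d) for the same mailbox, yields the single list a, b, c, d.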
S304: Perform reputation research and judgment on the integrated context information to generate the research and judgment result of each mailbox entity.
After obtaining the non-repeated context information of each mailbox entity, the reputation research and judgment are further required to be carried out on the context information to obtain the research and judgment result of each mailbox entity, so that the comprehensive research and judgment result of the mailbox entity can be provided while the context information of the mailbox entity is provided for an analyst in the follow-up process, and the analyst has an overall knowledge about the mailbox entity to be identified.
Specifically, the step S304 may include:
Step D1: Judge whether, in the integrated context information, a whitelist entry exists, label information exists, a compromise detection judgment level score exists, an advanced persistent threat (APT) family exists, an abnormal field exists, and a history score exists. If at least one exists, execute step D2; if none exists, execute step D3.
Step D2: and adding the score corresponding to the content judged to exist into the credit score corresponding to the corresponding mailbox entity.
Step D3: the reputation score of the corresponding mailbox entity is determined to be 0.
That is, the integrated context information of a given mailbox entity is judged item by item: first whether the mailbox is in the whitelist, then whether the context information contains a label, whether it contains an IOC score, whether it contains an APT mark, whether it contains other abnormal fields (such as abnormal IP address behavior or an abnormal active range), and finally whether it contains a historical research and judgment result.
In the above-described judging process, if a judgment result is yes, a score corresponding to the "yes" item is added on the basis of the basic score. If the "Yes" determination has not occurred from the beginning to the end, then the final score is the base score. The base score may be 0.
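The additive scoring of steps D1 to D3 can be sketched as below. The application does not state the score attached to each item, so every weight in `EXAMPLE_WEIGHTS` is a hypothetical value chosen for illustration; giving the whitelist a negative weight and clamping the result to the range 0 to 100 are likewise assumptions of this sketch.

```python
from typing import Dict

# Hypothetical weights; the source does not specify the score attached to each item.
EXAMPLE_WEIGHTS: Dict[str, int] = {
    "whitelist": -50,      # assumed: a whitelist hit lowers the score
    "label": 20,
    "ioc_score": 30,
    "apt_family": 40,
    "abnormal_field": 10,
    "history_score": 10,
}


def reputation_score(found: Dict[str, bool],
                     weights: Dict[str, int] = EXAMPLE_WEIGHTS,
                     base: int = 0) -> int:
    """Add the weight of every item that is present (step D2); with none present, return the base (step D3)."""
    score = base
    for check, weight in weights.items():
        if found.get(check, False):
            score += weight
    return max(0, min(100, score))  # assumed clamp to the 0-100 range
```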
For example, for each mailbox entity, the threat intelligence interface is queried and the returned types are extracted, such as phish (phishing), fraud and APT. The score calculation logic is then entered according to the different labeling results.
The code is as follows (provided as images in the original publication: Figure BDA0004021198790000151 and Figure BDA0004021198790000161).
In practical application, different mailbox entities can be classified into different reputation levels according to the reputation scores they obtain. For example:
Reputation score 0-30: secure. The mailbox reputation value is judged to be low; possible threat information and mailbox portrait information are provided, and the mailbox is judged to be a secure mailbox.
Reputation score 30-40: unknown. More information is needed for positioning and judgment; whether the mailbox is malicious is judged from its portrait information and detail fields.
Reputation score 40-80: suspicious. The mailbox is potentially a malicious threat and, depending on the context, may be a malicious mailbox.
Reputation score 80-100: malicious. Confidence is high; the basis for the research and judgment and the context information are provided, and most such entries are malicious mailbox addresses.
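The mapping from score to reputation level can be sketched as a simple threshold function. The ranges given above share their end points (30, 40 and 80), and the application does not say which level a boundary value belongs to; the sketch assumes each range includes its lower bound and excludes its upper bound, with 100 counted as malicious. The function name `reputation_level` is hypothetical.

```python
def reputation_level(score: float) -> str:
    """Map a reputation score in [0, 100] to one of the four reputation levels."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < 30:
        return "secure"
    if score < 40:
        return "unknown"
    if score < 80:
        return "suspicious"
    return "malicious"
```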
Based on the above reputation score standard, the reputation research and judgment result of a mailbox entity can be continuously updated through dynamic evaluation. This improves the accuracy of mailbox reputation evaluation, makes the information in the mailbox reputation information library more accurate, and helps analysts perform effective and accurate tracing and positioning based on the context information of the mailbox entity provided by the mailbox reputation information library.
S305: Add the research and judgment result of each mailbox entity to the corresponding integrated context information and store it in the mailbox reputation information library.
After the research and judgment results of each mailbox entity, namely the reputation evaluation are obtained, the research and judgment results and the context information of the corresponding mailbox can be stored in a mailbox reputation information library, so that when a subsequent analyst needs to obtain the related information of the mailbox entity, the research and judgment results and the context information of the mailbox entity can be directly provided for the analyst.
Specifically, the step S305 may include:
Step E: Add the research and judgment result of each mailbox entity to the corresponding integrated context information and store it in the mailbox reputation information library, classified by reputation level, basic information, malicious activity identification and mailbox behavior portrait.
That is, the information of each mailbox entity is stored in the mailbox reputation information library under the categories of reputation level, basic information, malicious activity identification and mailbox behavior portrait. When an analyst later needs to obtain the relevant information of a mailbox entity, the information is output under the same categories.
Reputation level, i.e., reputation evaluation, can be divided into: malicious, suspicious, unknown, secure.
The basic information refers to information about the mailbox that is relatively easy to obtain, and may include: mailbox type (whether it is an enterprise mailbox, judged from ICP filing information, e.g., info@reddit.com, info@qianxin.com; or a public/free mailbox, e.g., person@yahoo.com), MX record (whether the domain of the mailbox has a valid MX record), DMARC record (the data source needs to be confirmed with the research institute), earliest seen time (earliest registration/mailing time), and latest active time (latest registration/mailing time).
Malicious activity identification may include: spam, fake mail, malicious labels (e.g., APT, CS) of Whois information-associated domain names, custom operation labels.
The mailbox behavior portrait may include: applications (apps), whether the mailbox has been leaked (yes or no), and others (e.g., whether it appears on public websites). The application item may include: whether a registration/access record is detected at an application, for example general shopping websites such as JD.com and Taobao; social accounts (whether a registration record is detected for a social account, for example mainstream social websites such as Twitter and Weibo); and mainstream site registration information (whether an account is registered at a mainstream website, for example Gmail or Google).
To show more clearly which specific information is output, it is illustrated here in a diagram. Fig. 5 is a schematic diagram of the output context information of the mailbox entity in the embodiment of the present application. Referring to fig. 5, the output context information of the mailbox entity, that is, the mail information enrichment information, mainly includes: reputation level, basic information, malicious activity identification, and mailbox behavior portrait. Subcategories are listed under some of the major categories; they are shown explicitly in fig. 5 and are not described again here.
The code for field extraction, production processing and enrichment of the mailbox reputation information is provided as images in the original publication (Figure BDA0004021198790000181, Figure BDA0004021198790000191 and Figure BDA0004021198790000201).
According to the above, the related information of the mailbox entity is output under categories such as reputation level, basic information, malicious activity identification and mailbox behavior portrait, so that the enriched context information is clearer and more convenient for an analyst to view. A mailbox reputation information library established in this way can provide analysts with richer context information of mailbox entities, so that they can trace and locate sources efficiently and accurately based on the enriched context information.
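The classified output described above can be sketched as grouping a flat set of fields into the four categories. All field names in `CATEGORY_FIELDS` are hypothetical, loosely based on the items listed for each category in the text; the actual field names used by the application are not given in this section.

```python
from typing import Any, Dict, List

# Hypothetical field names per output category.
CATEGORY_FIELDS: Dict[str, List[str]] = {
    "basic_info": ["mailbox_type", "mx_record", "dmarc_record", "first_seen", "last_active"],
    "malicious_activity": ["spam", "forged_mail", "whois_domain_labels", "custom_labels"],
    "behavior_portrait": ["apps", "leaked", "other"],
}


def classify_output(email: str, level: str, flat: Dict[str, Any]) -> Dict[str, Any]:
    """Group flat context fields under reputation level, basic info, malicious activity and behavior portrait."""
    out: Dict[str, Any] = {"email": email, "reputation_level": level}
    for category, names in CATEGORY_FIELDS.items():
        out[category] = {name: flat[name] for name in names if name in flat}
    return out
```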
Finally, the production of the mailbox reputation information library and the deployment process of the mailbox reputation information library in the embodiment of the application are described again by using two specific examples.
Fig. 6 is a schematic production flow diagram of a mailbox reputation information library in the embodiment of the present application. Referring to fig. 6, mailbox entities are collected from open source data, Whois records, IOC information, TI data, crawler data, black/white lists and other information sources. The collected mailboxes are processed in real time, that is, parsed, extracted, converted and normalized to a unified format, then passed through kafka, processed for warehousing, and finally stored in db.
Fig. 7 is a schematic diagram of a deployment flow of a mailbox reputation information library in an embodiment of the present application. Referring to fig. 7, requests are received through a unified interface, i.e., api. The received data entities are distributed through lvs load balancing to one of the gateways kong1, kong2 or kong3, and then to the different mailbox reputation services, i.e., email-playback-service, which retrieve and output the corresponding context information from the mailbox reputation information library.
Based on the same inventive concept, as an implementation of the above-mentioned acquiring method, the embodiment of the application further provides an acquiring device of the mailbox context information. Fig. 8 is a schematic structural diagram of an apparatus for acquiring mailbox context information in an embodiment of the present application, and referring to fig. 8, the apparatus may include:
A receiving module 801, configured to obtain a mailbox entity to be identified;
the matching module 802 is configured to match the mailbox entity to be identified with a mailbox entity in a mailbox reputation information library, where the mailbox reputation information library includes various mailbox entities and context information thereof, and the content of context information of one mailbox entity in the mailbox reputation information library is greater than that of context information extracted from the one mailbox entity alone;
if the matching is successful, a first output module 803 is entered, wherein the first output module 803 is configured to output a mailbox entity and context information thereof in the mailbox reputation information library, the mailbox entity being matched with the mailbox entity to be identified;
if the matching fails, a second output module 804 is entered, where the second output module 804 is configured to extract context information in the mailbox entity to be identified, and store the extracted context information and the mailbox entity to be identified in the mailbox reputation information library.
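The interaction of modules 801 to 804 can be sketched as a lookup with a fallback. This is a simplified illustration: the mailbox reputation information library is modeled as an in-memory dictionary, matching is modeled as an exact lookup on the normalized address, and returning the freshly extracted context on a miss is an assumption. The names `get_context` and `extract` are hypothetical.

```python
from typing import Callable, Dict, Tuple

Context = Dict[str, object]


def get_context(mailbox: str,
                library: Dict[str, Context],
                extract: Callable[[str], Context]) -> Tuple[Context, bool]:
    """Return (context, matched). On a miss, extract context and store it in the library."""
    key = mailbox.strip().lower()           # receiving module: normalize the entity
    if key in library:                      # matching module: match succeeded
        return library[key], True           # first output module
    context = extract(key)                  # second output module: extract from the entity itself
    library[key] = context                  # and store it together with the entity
    return context, False
```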
Further, the mailbox reputation information library refining and expanding device further comprises a mailbox reputation information library generating device. Fig. 9 is a schematic structural diagram of a device for generating a mailbox reputation information base in an embodiment of the present application, and referring to fig. 9, the device may include:
The acquisition module 901 is used for acquiring a plurality of mailbox entities through a plurality of information sources;
an extracting module 902, configured to extract context information from the plurality of mailbox entities, where the context information is related to mailbox entity reputation research;
an integrating module 903, configured to integrate context information of the same mailbox entity;
the research module 904 is configured to perform reputation research on the integrated context information, and generate a research result of each mailbox entity;
and the storage module 905 is used for adding the research and judgment result of each mailbox entity into the corresponding integrated context information and storing the integrated context information in the mailbox reputation information library.
In other embodiments of the present application, the extracting module is specifically configured to extract at least one of a sender, a mail label, an Internet Protocol (IP) address, a mail subject list, an associated sample, a mailbox entity Whois registration information reverse lookup result, a malicious label, a domain name and an active time attribute from the plurality of mailbox entities as the context information respectively corresponding to the plurality of mailbox entities.
In other embodiments of the present application, the extracting module is specifically configured to: extract the mail_info, target, file and type fields from the raw data of the identification reports of the plurality of mailbox entities, to obtain first sub-information respectively corresponding to the plurality of mailbox entities; extract the message digest algorithm (MD5) information of the plurality of mailbox entities, to obtain second sub-information respectively corresponding to the plurality of mailbox entities; extract the mail-related information from the sandbox data messages of the plurality of mailbox entities, to obtain third sub-information respectively corresponding to the plurality of mailbox entities; extract the mail exchange (MX) records, sender policy framework (SPF) protocol information and DMARC information of the plurality of mailbox entities, to obtain fourth sub-information respectively corresponding to the plurality of mailbox entities; detect the first sub-information, the second sub-information, the third sub-information and the fourth sub-information through a compromise detection module, to obtain detection results respectively corresponding to the plurality of mailbox entities; and, based on the first sub-information, the second sub-information, the third sub-information, the fourth sub-information and the detection results, determine the sender, label, Internet Protocol (IP) address, mail subject list, associated sample, mailbox entity Whois registration information reverse lookup result, malicious label, domain name and active time attribute respectively corresponding to the plurality of mailbox entities, and use them as the context information respectively corresponding to the plurality of mailbox entities.
In other embodiments of the present application, the generating device further includes: a dynamic updating module; the dynamic updating module is used for receiving the newly added extraction rule; and respectively extracting the context information from the mailbox entities according to the newly added extraction rule.
In other embodiments of the present application, the dynamic update module is specifically configured to: extract the domain name of each of the plurality of mailbox entities; judge whether the domain name is associated with a known malicious domain name; if so, use the domain name together with the known malicious domain name as context information of the corresponding mailbox entity; and if not, use the extracted domain name alone as context information of the corresponding mailbox entity.
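The domain-association rule described above can be sketched as follows. The text does not define what counts as an association relationship, so this sketch assumes the simplest reading: the extracted domain is associated with a known malicious domain when it is identical to it or is one of its subdomains. All names are hypothetical.

```python
def extract_domain(mailbox: str) -> str:
    """Return the lower-cased part after the last '@', without a trailing dot."""
    return mailbox.rsplit("@", 1)[-1].strip().lower().rstrip(".")


def is_associated(domain: str, malicious: str) -> bool:
    """Assumed association test: same domain, or a subdomain of the malicious one."""
    return domain == malicious or domain.endswith("." + malicious)


def domain_context(mailbox: str, known_malicious: set[str]) -> dict[str, object]:
    """Build the domain part of the context information for one mailbox entity."""
    domain = extract_domain(mailbox)
    related = sorted(m for m in known_malicious if is_associated(domain, m))
    if related:
        # Associated: keep the domain together with the known malicious domain(s).
        return {"domain": domain, "related_malicious_domains": related}
    # Not associated: keep the extracted domain alone.
    return {"domain": domain}
```

Other association tests (shared registrant, shared name servers, shared IP address) would fit the same interface by replacing `is_associated`.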
In other embodiments of the present application, the judging module is specifically configured to: judge whether the integrated context information contains a whitelist entry, tag information, a compromise detection verdict level score, an advanced persistent threat (APT) family, an abnormal field, and a historical score; if at least one of these exists, add the score corresponding to each item judged to exist to the reputation score of the corresponding mailbox entity; and if none exists, determine that the reputation score of the corresponding mailbox entity is 0.
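The additive scoring described above can be illustrated with a short sketch. The six item names mirror the items listed in the text, but the keys, the weights, and the sign convention (a negative weight for a whitelist hit) are assumptions made for this example only; the application does not give concrete values.

```python
# Hypothetical weights: one score per item that may appear in the context information.
SCORE_WEIGHTS = {
    "whitelist": -50,        # assumption: a whitelist hit lowers the score
    "tags": 10,
    "compromise_level": 30,
    "apt_family": 40,
    "abnormal_fields": 15,
    "history_score": 5,
}


def reputation_score(context: dict) -> int:
    """Sum the weights of every item present in the context; 0 when none is present."""
    score = 0
    for key, weight in SCORE_WEIGHTS.items():
        # An item counts as present when its value is truthy (non-empty, non-zero).
        if context.get(key):
            score += weight
    return score
```

Because presence is tested by truthiness, an empty tag list or a historical score of 0 is treated the same as a missing item, which matches the rule that the score is 0 when nothing is present.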
In other embodiments of the present application, the storage module is specifically configured to add the assessment result of each mailbox entity to the corresponding integrated context information and to store it in the mailbox reputation information library, classified by reputation level, basic information, malicious activity identifier, and mailbox behavior profile.
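A minimal in-memory sketch of such a library is shown below, grouping each record under the four classifications named above and returning nothing when a lookup fails, as in the acquiring method. The class name, the record keys, and the score thresholds are all hypothetical; a production library would presumably sit on a database rather than a dictionary.

```python
from typing import Any, Optional


def reputation_level(score: int) -> str:
    """Map a score to a level; the thresholds are illustrative assumptions."""
    if score >= 60:
        return "high_risk"
    if score >= 20:
        return "suspicious"
    return "low_risk"


class ReputationLibrary:
    """Toy mailbox reputation information library keyed by normalised address."""

    def __init__(self) -> None:
        self._records: dict[str, dict[str, Any]] = {}

    def store(self, mailbox: str, context: dict[str, Any], score: int) -> None:
        key = mailbox.strip().lower()
        # Group the integrated context under the four classifications.
        self._records[key] = {
            "reputation_level": reputation_level(score),
            "basic_info": {"mailbox": key, "domain": context.get("domain")},
            "malicious_activity": {"tags": context.get("malicious_tags", []), "score": score},
            "behavior_profile": {
                "active_time": context.get("active_time"),
                "subjects": context.get("subjects", []),
            },
        }

    def lookup(self, mailbox: str) -> Optional[dict[str, Any]]:
        """Return the stored record, or None when the mailbox is not in the library."""
        return self._records.get(mailbox.strip().lower())
```

For example, after `lib.store("A@Example.com", {"domain": "example.com"}, 70)`, a call to `lib.lookup("a@example.com")` returns the stored record, while a lookup of an unknown address returns `None`.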
In other embodiments of the present application, the receiving module is specifically configured to collect the plurality of mailbox entities from open source data, Whois (domain name query protocol) records, indicator of compromise (IOC) information, threat intelligence (TI) data, web crawler data, and blacklists/whitelists.
It should be noted here that the above description of the embodiments of the acquisition apparatus and the generation apparatus is similar to the description of the embodiments of the acquisition method and the generation method described above, with similar advantageous effects as the embodiments of the acquisition method and the generation method. For technical details not disclosed in the embodiments of the acquiring apparatus and the generating apparatus of the present application, please refer to the description of the embodiments of the acquiring method and the generating method of the present application for understanding.
Based on the same inventive concept, as an implementation of the above method, the embodiment of the application further provides a computer storage medium. The computer storage medium has stored thereon a computer program which, when executed by a processor, performs the method described above.
It should be noted here that the description of the computer storage medium embodiments above is similar to the description of the method embodiments above, with similar advantageous effects as the method embodiments. For technical details not disclosed in the embodiments of the computer storage medium of the present application, please refer to the description of the method embodiments of the present application.
Based on the same inventive concept, as an implementation of the above method, an embodiment of the present application further provides an electronic device. The electronic device may comprise a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method described above when executing the program.
It should be noted here that the description of the above embodiments of the electronic device is similar to the description of the above embodiments of the method, with similar advantageous effects as the embodiments of the method. For technical details not disclosed in the embodiments of the electronic device of the present application, please refer to the description of the method embodiments of the present application for understanding.
The foregoing describes merely specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed in the present application is intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (14)

1. A method for acquiring context information of a mailbox, the method comprising:
acquiring a mailbox entity to be identified;
matching the mailbox entity to be identified against mailbox entities in a mailbox reputation information library, wherein the mailbox reputation information library comprises a plurality of mailbox entities and their context information, and the context information stored in the mailbox reputation information library for any one mailbox entity contains more content than the context information extracted from that mailbox entity alone;
if the matching succeeds, outputting the mailbox entity in the mailbox reputation information library that matches the mailbox entity to be identified, together with its context information;
if the matching fails, outputting null.
2. The method of claim 1, wherein prior to obtaining the mailbox entity to be identified, the method further comprises:
collecting a plurality of mailbox entities through a plurality of information sources;
extracting context information from the plurality of mailbox entities respectively, wherein the context information is related to reputation assessment of the mailbox entities;
integrating the context information of the same mailbox entity;
performing reputation assessment on the integrated context information to generate an assessment result for each mailbox entity;
and adding the assessment result of each mailbox entity to the corresponding integrated context information and storing the integrated context information in a mailbox reputation information library.
3. The method of claim 2, wherein the extracting context information from the plurality of mailbox entities, respectively, comprises:
extracting at least one of a sender, a tag, an Internet Protocol (IP) address, a mail subject list, associated samples, a Whois registration reverse-lookup result of the mailbox entity, a malicious tag, a domain name, and an active time attribute from the plurality of mailbox entities as the context information corresponding to each of the plurality of mailbox entities.
4. The method of claim 3, wherein extracting at least one of a sender, a tag, an Internet Protocol (IP) address, a mail subject list, associated samples, a Whois registration reverse-lookup result of the mailbox entity, a malicious tag, a domain name, and an active time attribute from the plurality of mailbox entities as the context information corresponding to each of the plurality of mailbox entities comprises:
extracting the mail_info, target, file, and type fields from the raw data of the identification reports of the plurality of mailbox entities, to obtain first sub-information corresponding to each of the plurality of mailbox entities;
extracting message digest algorithm (MD5) information of the plurality of mailbox entities, to obtain second sub-information corresponding to each of the plurality of mailbox entities;
extracting mail-related information from the sandbox data messages of the plurality of mailbox entities, to obtain third sub-information corresponding to each of the plurality of mailbox entities;
extracting the mail exchange (MX) records, Sender Policy Framework (SPF) information, and DMARC information of the plurality of mailbox entities, to obtain fourth sub-information corresponding to each of the plurality of mailbox entities;
detecting the first sub-information, the second sub-information, the third sub-information, and the fourth sub-information through a compromise detection module, to obtain detection results corresponding to each of the plurality of mailbox entities;
and, based on the first sub-information, the second sub-information, the third sub-information, the fourth sub-information, and the detection results, determining the sender, tag, Internet Protocol (IP) address, mail subject list, associated samples, Whois registration reverse-lookup result of the mailbox entity, malicious tag, domain name, and active time attribute corresponding to each of the plurality of mailbox entities, and using these as the context information corresponding to each of the plurality of mailbox entities.
5. The method of claim 2, wherein after adding the assessment result of each mailbox entity to the corresponding integrated context information and storing the integrated context information in the mailbox reputation information library, the method further comprises:
receiving a newly added extraction rule;
and extracting context information from the plurality of mailbox entities respectively according to the newly added extraction rule.
6. The method of claim 5, wherein extracting context information from the plurality of mailbox entities respectively according to the newly added extraction rule comprises:
extracting the domain name of each of the plurality of mailbox entities;
judging whether the domain name is associated with a known malicious domain name;
if so, using the domain name together with the known malicious domain name as context information of the corresponding mailbox entity;
if not, using the extracted domain name alone as context information of the corresponding mailbox entity.
7. The method of claim 2, wherein performing reputation assessment on the integrated context information to generate an assessment result for each mailbox entity comprises:
judging whether the integrated context information contains a whitelist entry, tag information, a compromise detection verdict level score, an advanced persistent threat (APT) family, an abnormal field, and a historical score;
if at least one of these exists, adding the score corresponding to each item judged to exist to the reputation score of the corresponding mailbox entity;
and if none exists, determining that the reputation score of the corresponding mailbox entity is 0.
8. The method of claim 2, wherein adding the assessment result of each mailbox entity to the corresponding integrated context information and storing the integrated context information in the mailbox reputation information library comprises:
adding the assessment result of each mailbox entity to the corresponding integrated context information and storing it in the mailbox reputation information library, classified by reputation level, basic information, malicious activity identifier, and mailbox behavior profile.
9. The method of claim 2, wherein the collecting a plurality of mailbox entities via a plurality of information sources comprises:
collecting the plurality of mailbox entities from open source data, Whois (domain name query protocol) records, indicator of compromise (IOC) information, threat intelligence (TI) data, web crawler data, and blacklists/whitelists.
10. A method for generating a mailbox reputation information base, the method comprising:
collecting a plurality of mailbox entities through a plurality of information sources;
extracting context information from the plurality of mailbox entities respectively, wherein the context information is related to reputation assessment of the mailbox entities;
integrating the context information of the same mailbox entity;
performing reputation assessment on the integrated context information to generate an assessment result for each mailbox entity;
and adding the assessment result of each mailbox entity to the corresponding integrated context information and storing the integrated context information in a mailbox reputation information library.
11. An apparatus for acquiring context information of a mailbox, the apparatus comprising:
the receiving module is used for acquiring a mailbox entity to be identified;
a matching module, configured to match the mailbox entity to be identified against mailbox entities in a mailbox reputation information library, wherein the mailbox reputation information library comprises a plurality of mailbox entities and their context information, and the context information stored in the mailbox reputation information library for any one mailbox entity contains more content than the context information extracted from that mailbox entity alone;
a first output module, invoked if the matching succeeds and configured to output the mailbox entity in the mailbox reputation information library that matches the mailbox entity to be identified, together with its context information;
a second output module, invoked if the matching fails and configured to extract the context information from the mailbox entity to be identified and to store the extracted context information and the mailbox entity to be identified in the mailbox reputation information library.
12. A device for generating a mailbox reputation information base, the device comprising:
the acquisition module is used for acquiring a plurality of mailbox entities through a plurality of information sources;
the extraction module is used for extracting context information from the plurality of mailbox entities respectively, wherein the context information is related to reputation assessment of the mailbox entities;
the integrating module is used for integrating the context information of the same mailbox entity;
the judging module is used for performing reputation assessment on the integrated context information to generate an assessment result for each mailbox entity;
and the storage module is used for adding the assessment result of each mailbox entity to the corresponding integrated context information and storing the integrated context information in the mailbox reputation information library.
13. A computer storage medium having stored thereon a computer program, which when executed by a processor, is adapted to carry out the method of any of claims 1-10.
14. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor is operable to implement a method as claimed in any one of claims 1 to 10 when the program is executed by the processor.
CN202211686067.6A 2022-12-27 2022-12-27 Method and device for acquiring context information of mailbox Pending CN116095035A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211686067.6A CN116095035A (en) 2022-12-27 2022-12-27 Method and device for acquiring context information of mailbox

Publications (1)

Publication Number Publication Date
CN116095035A true CN116095035A (en) 2023-05-09

Family

ID=86211379

Country Status (1)

Country Link
CN (1) CN116095035A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination