CN113746814A - Mail processing method and device, electronic equipment and storage medium - Google Patents

Mail processing method and device, electronic equipment and storage medium

Info

Publication number
CN113746814A
Authority
CN
China
Prior art keywords
mail
weight
character string
domain name
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110946078.2A
Other languages
Chinese (zh)
Other versions
CN113746814B (en)
Inventor
徐治钦
陈树卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hard Link Network Technology Co ltd
Original Assignee
Shanghai Hard Link Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hard Link Network Technology Co ltd filed Critical Shanghai Hard Link Network Technology Co ltd
Priority to CN202110946078.2A priority Critical patent/CN113746814B/en
Publication of CN113746814A publication Critical patent/CN113746814A/en
Application granted granted Critical
Publication of CN113746814B publication Critical patent/CN113746814B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/42 Mailbox-related aspects, e.g. synchronisation of mailboxes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application discloses a mail processing method, a mail processing device, electronic equipment and a storage medium. The method comprises the following steps: matching the sender domain name against each first preset character string in a first preset character string set, and acquiring the sending time of the mail when none of the matching results meets a first preset condition; acquiring a time weight corresponding to the time interval between the sending time and the current time; determining a target weight of the mail according to the time weight and the text weight of the mail body; and sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight, so that the terminal triggers the corresponding mail processing flow according to the prompt information. The method and the device reduce the cases in which a malicious mail is mistakenly identified as an important mail during word segmentation recognition, ensure timely replies, and improve mail processing efficiency.

Description

Mail processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for processing an email, an electronic device, and a storage medium.
Background
In a customer service system, email is usually used to handle client complaints, fault reports (obstacle declarations), business consultation, business push messages and the like. The urgency of these emails usually differs; for example, a fault report is usually more urgent than a business push message. Therefore, to make it easier for customer service staff to process mails, the related art extracts keywords from the mail body to identify the mail level, so that mails are processed in a differentiated manner according to their level: the higher the mail level, the earlier the mail needs to be processed.
However, determining the mail level only from keywords in the mail body easily causes some mails to be overlooked for a long time, so timely replies cannot be ensured. Moreover, when the keywords appear in a junk mail, the junk mail is easily misidentified as an important mail, which reduces mail processing efficiency.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the prior art, and provides a method, an apparatus and an electronic device for processing a mail, so as to improve the recognition accuracy of the mail and improve the processing efficiency of the mail.
In a first aspect, an embodiment of the present application provides a mail processing method, including:
extracting a sender domain name from the mail;
matching the domain name of the sender with each first preset character string in the first preset character string set, and acquiring the mail sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected not to meet a first preset condition;
acquiring a time weight corresponding to a time interval according to the time interval between the sending time and the current time, wherein the duration of the time interval is in direct proportion to the time weight;
determining a target weight of the mail according to the time weight and the text weight of the mail body in the mail, wherein the text weight is obtained according to the weighting result of the preset weight of each participle in the mail body;
and sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight so that the terminal triggers a corresponding mail processing flow according to the prompt information.
Further, after detecting the domain name of the sender, the method further comprises:
and when detecting that the matching result of the domain name of the sender and any first preset character string meets a first preset condition, marking the mail as a malicious mail.
Further, the first preset condition is that a character string which is the same as the first preset character string exists in the domain name of the sender.
Further, the preset weight is determined by the part of speech of the participle and the semantic meaning of the participle.
Further, determining the target weight of the mail according to the time weight and the text weight of the mail body in the mail, comprising:
acquiring the initial weight of the mail according to the time weight and the text weight of the mail body in the mail;
matching the domain name of the sender with each second preset character string in a second preset character string set, and acquiring the corresponding domain name weight when detecting that the matching result of the domain name of the sender and any second preset character string meets a second preset condition;
and determining the target weight of the mail according to the initial weight and the domain name weight.
Further, the method also comprises the following steps:
and when the matching result of the domain name of the sender and any second preset character string is detected not to meet a second preset condition, determining the initial weight as the target weight of the mail.
Further, the second preset condition is that a character string which is the same as the second preset character string exists in the domain name of the sender.
In a second aspect, in an embodiment of the present application, there is also provided a mail processing apparatus, including:
the mail acquisition module is used for extracting a sender domain name from a mail;
the domain name detection module is used for matching the domain name of the sender with each first preset character string in the first preset character string set and acquiring the mail sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected not to meet a first preset condition;
the weight acquisition module is used for acquiring time weight corresponding to a time interval according to the time interval between the sending time and the current time, wherein the duration of the time interval is in direct proportion to the time weight;
the weight determining module is used for determining the target weight of the mail according to the time weight and the text weight of the mail body in the mail, wherein the text weight is obtained according to the weighting result of the preset weight of each participle in the mail body;
and the mail processing module is used for sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight so that the terminal triggers a corresponding mail processing flow according to the prompt information.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the mail processing method as described in the above embodiments when executing the program.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing computer-executable instructions for causing a computer to execute the mail processing method according to the foregoing embodiments.
The domain name is checked in advance against the preset character strings so that normal mails are selected by domain name. This lowers the probability that a mail subsequently subjected to word segmentation recognition is a malicious mail, and therefore reduces the cases in which a malicious mail is mistakenly recognized as an important mail. The grade of a normal mail is then determined by combining its word segmentation result with its sending time, which ensures timely replies and improves mail processing efficiency.
Drawings
The present application is further described with reference to the following figures and examples;
FIG. 1 is a diagram of an application environment of a mail processing method in one embodiment;
FIG. 2 is a schematic flow chart diagram illustrating a method for processing mail in one embodiment;
FIG. 3 is a schematic flow chart diagram of a method for target weight determination in one embodiment;
FIG. 4 is a block diagram showing the structure of a mail processing apparatus according to an embodiment;
FIG. 5 is a block diagram of a computer device in one embodiment.
Detailed Description
Reference will now be made in detail to the embodiments of the present application, preferred embodiments of which are illustrated in the accompanying drawings. The drawings supplement the written description with figures so that a person skilled in the art can intuitively understand each feature and technical solution of the present application; they are not to be construed as limiting the scope of the present application.
The following describes an embodiment of the present application in detail with reference to the drawings. The mail processing method provided by the embodiment of the present application is applied to an application environment including a terminal device 110 and a server 120, as shown in FIG. 1. The terminal device 110 and the server 120 are connected via a network. The terminal device 110 may be a desktop terminal or a mobile terminal, wherein the mobile terminal may be one of a mobile phone, a tablet computer, a notebook computer, a wearable device, and the like. The server 120 may be implemented by an independent server or a server cluster composed of a plurality of servers, and may also be a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a web service, cloud communication, middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform.
The terminal device 110 is used to send each unprocessed mail to the server periodically, or to send each unprocessed mail to the server when it receives a new mail. After receiving an unprocessed mail, the server extracts the sender domain name from the mail and matches it against the first preset character strings. When the matching result shows that the similarity between the sender domain name and the first preset character strings is low, the server obtains the sending time of the mail, determines the time weight of the mail according to the time interval between the sending time and the current time, determines the mail grade according to the time weight and the text weight of the mail body, and sends the mail grade to the terminal device 110, so that the terminal device 110 triggers the corresponding flow to process the mail according to the mail grade.
Considering that the sender domain name of a malicious mail usually contains a specific character string, the domain name is checked in advance against the preset character strings so that normal mails are selected by domain name. This lowers the probability that a mail subsequently subjected to word segmentation recognition is a malicious mail, and therefore reduces the cases in which a malicious mail is mistakenly recognized as an important mail. The grade of a normal mail is then determined by combining its word segmentation result with its sending time, which ensures timely replies and improves mail processing efficiency.
Hereinafter, the mail processing method provided by the embodiment of the present application will be described and explained in detail through several specific embodiments.
In one embodiment, as shown in FIG. 2, a mail processing method is provided. The embodiment is mainly illustrated by applying the method to computer equipment. The computer device may specifically be the server 120 in fig. 1 described above.
Referring to fig. 2, the mail processing method specifically includes the following steps:
S11, extracting the sender domain name from the mail.
In one embodiment, the server receives unprocessed mails from the terminal device at regular intervals; or the terminal device sends each unprocessed mail to the server when it receives a new mail; or, when the number of unprocessed mails in the terminal device reaches a threshold, for example 20 mails, the terminal device sends those 20 mails to the server; or, when the server detects that at least one mail in the terminal device has not been processed within a preset time period, such as 24 hours, the server acquires the unprocessed mail from the terminal device. After acquiring an unprocessed mail, the server extracts the sender domain name, namely the From header field, from the mail. The sender domain name is directly associated with the mail content and the identity of the sending party, and is the only field immediately visible to the server and the recipient, e.g., bj@10086.com.
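The extraction in step S11 can be sketched with Python's standard `email` package. This is a minimal illustration, not the patented implementation: the function name `extract_sender_domain` and the choice to return only the part after the "@" are assumptions made here.

```python
from email import message_from_string
from email.utils import parseaddr


def extract_sender_domain(raw_mail: str) -> str:
    """Return the lower-cased domain part of the From header, or "" if absent."""
    message = message_from_string(raw_mail)
    # parseaddr splits 'Name <user@host>' into (display name, address).
    _, address = parseaddr(message.get("From", ""))
    # Keep everything after the last "@"; an address without "@" yields "".
    _, separator, domain = address.rpartition("@")
    return domain.lower() if separator else ""


raw = "From: Beijing Service <bj@10086.com>\nSubject: test\n\nbody"
print(extract_sender_domain(raw))  # 10086.com
```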
S12, matching the domain name of the sender with each first preset character string in the first preset character string set, and acquiring the mail sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected not to meet a first preset condition.
Considering that malicious or counterfeit domain names usually have a fixed format, such as "-peoples", in one embodiment the first preset character string consists of a letter string or an English word together with preset symbols before and after it. Alternatively, the first preset character string can be obtained by training on a large number of malicious or counterfeit domain names and a large number of normal domain names, based on the features that the malicious or counterfeit domain names have in common. Alternatively, the first preset character string may be preset based on manual experience.
In an embodiment, when performing matching, the server may extract feature character strings from the sender domain name by using the TextRank algorithm, or use a corpus-based feature character string extraction method, that is, construct a corpus containing a plurality of preset character strings, for example "-peoples", ".gov", etc. The sender domain name is matched against each preset character string in the corpus, and the part of the sender domain name that corresponds to a preset character string in the corpus is intercepted and taken as the feature character string. After the feature character string is obtained, it is matched against each first preset character string in the first preset character string set, and the mail is judged to be a normal mail when no matching result meets the first preset condition. The first preset character string set is a subset of the corpus; that is, the first preset character strings are the preset character strings marked as abnormal in the corpus.
In an embodiment, when it is detected that the matching result of the sender domain name and any first preset character string meets the first preset condition, the mail corresponding to the sender domain name is marked as a malicious mail. Illustratively, an identifier of a malicious mail, such as the text "malicious mail", is added to the header of the mail, and the mail is sent to the malicious mail directory of the terminal device so that the user can recognize it easily.
The feature character string of a normal sender domain name may be highly similar to a first preset character string, differing by only one symbol; for example, the feature character string of the normal sender domain name is ".peoples" and the first preset character string is ".peoples.". In order to avoid misrecognition as much as possible, in an embodiment, the first preset condition is that a character string identical to the first preset character string exists in the sender domain name. That is, when the feature character string of the sender domain name is completely the same as a first preset character string, the mail corresponding to the sender domain name is marked as a malicious mail. Otherwise, the mail is judged to be a normal mail, and the sending time of the mail is obtained.
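The exact-match form of the first preset condition can be illustrated as a substring test. This is a hedged sketch: the set contents and the function name `is_malicious_domain` are hypothetical, and only "-peoples" comes from the example in the text.

```python
# Hypothetical first preset character string set. "-peoples" is the example
# format mentioned in the text; "-secure-login" is invented for illustration.
FIRST_PRESET_STRINGS = {"-peoples", "-secure-login"}


def is_malicious_domain(sender_domain: str, preset_strings=FIRST_PRESET_STRINGS) -> bool:
    """True when any preset string occurs verbatim in the sender domain."""
    domain = sender_domain.lower()
    # Exact substring match: a domain differing by one symbol does not match.
    return any(preset in domain for preset in preset_strings)


print(is_malicious_domain("mail-peoples.example"))  # True
print(is_malicious_domain("mail.peoples.example"))  # False
```

Because the test is an exact substring match rather than a similarity score, a normal domain that differs from a preset string by a single symbol is not flagged.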
S13, acquiring time weight corresponding to the time interval according to the time interval between the sending time and the current time, wherein the duration of the time interval is in direct proportion to the time weight.
In one embodiment, the server may store a mapping relationship table in advance, where the mapping relationship table records the mapping between time intervals and time weights, and the time interval is proportional to the time weight. That is, the longer the time interval, the higher the time weight. For example, a time interval of 1 to 2 hours corresponds to a time weight of 2; a time interval of 2 to 3 hours corresponds to a time weight of 4; and a time interval of 4 to 5 hours corresponds to a time weight of 6. The specific mapping between time intervals and time weights can be set according to the actual situation.
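A minimal sketch of such a lookup table follows, using the three example rows from the text. The text does not say how intervals outside those rows are handled, so this sketch assumes that an interval falls back to the nearest lower row and that intervals under 1 hour get a weight of 0; the name `time_weight` is likewise hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Lower bounds (in hours) and weights taken from the example rows in the text.
LOWER_BOUNDS_HOURS = [1, 2, 4]
TIME_WEIGHTS = [2, 4, 6]


def time_weight(sent_at: datetime, now: datetime) -> int:
    """Map the interval between sending time and current time to a weight."""
    hours = (now - sent_at).total_seconds() / 3600
    # Index of the last row whose lower bound is <= hours (assumed fallback).
    index = bisect_right(LOWER_BOUNDS_HOURS, hours)
    return TIME_WEIGHTS[index - 1] if index else 0


now = datetime(2021, 8, 18, 12, 0)
print(time_weight(now - timedelta(hours=1.5), now))  # 2
print(time_weight(now - timedelta(hours=4.5), now))  # 6
```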
In an embodiment, the mapping table may instead record the mapping between the rank of a time interval and the time weight; for example, the largest time interval corresponds to a time weight of 6, the second-largest time interval corresponds to a time weight of 4, the third-largest time interval corresponds to a time weight of 2, and so on. The specific mapping between time interval rank and time weight can be set according to the actual situation. After the time intervals of all unprocessed mails are obtained, the time intervals are compared and the mails are sorted from largest to smallest, so that each mail is given the time weight corresponding to its rank. The larger the time interval, the higher the corresponding time weight. For example, if there are three unprocessed mails in total, with a time interval of 1 hour for mail A, 2 hours for mail B, and 3 hours for mail C, then the time weight of mail C is 6, the time weight of mail B is 4, and the time weight of mail A is 2.
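The rank-based variant can be sketched as follows, reproducing the three-mail example. The function name `rank_time_weights` is hypothetical, and the handling of mails ranked beyond the table (they reuse the last listed weight) is an assumption, since the text leaves it open.

```python
from typing import Dict

# Weights for the largest, second-largest and third-largest interval (from the text).
RANK_WEIGHTS = [6, 4, 2]


def rank_time_weights(intervals_hours: Dict[str, float]) -> Dict[str, int]:
    """Assign time weights by rank, largest interval first."""
    ordered = sorted(intervals_hours, key=intervals_hours.get, reverse=True)
    last = len(RANK_WEIGHTS) - 1
    # Assumption: mails ranked beyond the table reuse the last listed weight.
    return {mail_id: RANK_WEIGHTS[min(rank, last)] for rank, mail_id in enumerate(ordered)}


print(rank_time_weights({"A": 1, "B": 2, "C": 3}))  # {'C': 6, 'B': 4, 'A': 2}
```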
S14, determining the target weight of the mail according to the time weight and the text weight of the mail body in the mail, wherein the text weight is obtained according to the weighting result of the preset weight of each participle in the mail body.
In one embodiment, the mail body can be segmented into words (participles) by using the TextRank algorithm, or by a corpus-based word segmentation method, that is, a corpus containing a plurality of grouped entries is constructed. For example, the corpus contains the grouped entries "lasso", "login", "payment", etc. The grouped entries in the corpus can be collected from entries that already exist on the network or set manually. The mail body is matched against each grouped entry in the corpus, and the words of the mail body that have corresponding grouped entries in the corpus are intercepted; the intercepted words are the participles.
In one embodiment, the server 120 stores a preset weight of each word, such as 6 for "lasso", 5 for "login", 4 for "payment", and so on. The specific value of the preset weight can be adjusted according to the actual situation. After the preset weight of each participle is obtained, the preset weight of each participle can be weighted to obtain the text weight. The preset weight can be preset according to the actual situation.
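The corpus-based segmentation and weighting can be sketched as below, using the example entries and weights from the text. Several points are assumptions: the body is treated as space-separated English text, "weighting" is taken to mean a plain sum over every matched occurrence, and the names `segment` and `text_weight` are hypothetical.

```python
import re

# Grouped entries and preset weights taken from the examples in the text.
PRESET_WEIGHTS = {"lasso": 6, "login": 5, "payment": 4}


def segment(body: str, corpus=PRESET_WEIGHTS) -> list:
    """Keep only the words of the body that have an entry in the corpus."""
    words = re.findall(r"[a-z]+", body.lower())
    return [word for word in words if word in corpus]


def text_weight(body: str, weights=PRESET_WEIGHTS) -> int:
    """Sum the preset weights of all matched words (assumed weighting rule)."""
    return sum(weights[word] for word in segment(body, weights))


print(segment("Login failed after payment"))      # ['login', 'payment']
print(text_weight("Login failed after payment"))  # 9
```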
Different participles may have different parts of speech, and different parts of speech influence the described content of the mail body to different degrees. For example, the auxiliary word "is" has no influence on the described content, whereas nouns, entity words and verbs have a larger influence. In order to highlight the participles that have a key influence on the described content of the mail body and to improve the accuracy of the subsequently obtained text weight, in an embodiment the preset weight of each participle is determined according to its part of speech. For example, the preset weight of a content word (real word) is 2, and the preset weight of stop words and auxiliary words is 0.
However, given the diversity and ambiguity of language, the same word has different meanings in different contexts, so the same participle may influence the described content of different mail bodies differently. Therefore, in one embodiment, the preset weight of a participle is determined by both its part of speech and its semantic meaning in the mail body. After the preset part-of-speech weight of the participle is determined according to its part of speech, the participle and the described content are input into a trained NLP (natural language processing) model to determine the preset feature weight of the participle, and the preset weight of the participle is then determined according to the preset part-of-speech weight and the preset feature weight. For example, adding the preset part-of-speech weight to the preset feature weight gives the preset weight of the participle. To determine the preset feature weight through the NLP model, the feature vector of the participle may be determined by the NLP model and the corresponding preset feature weight then matched from a plurality of preset feature weights according to the feature vector; the preset feature weight may also be determined in other conventional manners, which are not described herein again.
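The additive combination of part-of-speech weight and feature weight can be sketched as follows. The trained NLP model is not specified in the text, so it is replaced here by a toy stand-in function; `toy_feature_weight`, its rule, and the part-of-speech labels are all hypothetical and serve only to make the sketch runnable.

```python
from typing import Callable

# Part-of-speech weights from the example in the text; labels are assumed.
POS_WEIGHTS = {"content": 2, "stop": 0, "auxiliary": 0}


def toy_feature_weight(word: str, context: str) -> float:
    """Stand-in for the trained NLP model: NOT a real semantic model."""
    return 3.0 if "urgent" in context.lower() else 0.0


def preset_weight(word: str, pos: str, context: str,
                  feature_weight_fn: Callable[[str, str], float]) -> float:
    """Preset weight = part-of-speech weight + context-dependent feature weight."""
    return POS_WEIGHTS.get(pos, 0) + feature_weight_fn(word, context)


print(preset_weight("payment", "content", "Urgent: payment failed", toy_feature_weight))  # 5.0
print(preset_weight("is", "auxiliary", "payment is done", toy_feature_weight))            # 0.0
```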
In an embodiment, after the time weight and the text weight are obtained, the time weight and the text weight are added, so that the target weight of the mail can be determined.
S15, sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight, so that the terminal triggers a corresponding mail processing flow according to the prompt information.
In an embodiment, the server may store a mapping relation table between the target weight interval and the mail level in advance, wherein the larger the target weight interval is, the higher the mail level is. If the target weight interval is 31-40, the corresponding mail level is 1; the target weight interval is 41-50, and the corresponding mail level is 2. And determining the mail grade of the corresponding mail according to the target weight interval to which the target weight belongs. The mapping relationship between the specific target weight interval and the mail level can be set according to the actual situation.
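The interval-to-level lookup can be sketched with the two example rows from the text. Returning `None` for a target weight outside every listed interval is an assumption, as is the function name `mail_level`; the text only says the table can be set according to the actual situation.

```python
from typing import Optional

# (lowest weight, highest weight, mail level) rows from the example in the text.
LEVEL_TABLE = [(31, 40, 1), (41, 50, 2)]


def mail_level(target_weight: float) -> Optional[int]:
    """Return the mail level whose interval contains the target weight."""
    for low, high, level in LEVEL_TABLE:
        if low <= target_weight <= high:
            return level
    return None  # assumption: no level is defined outside the listed intervals


print(mail_level(35))  # 1
print(mail_level(41))  # 2
```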
In an embodiment, the server may store in advance a mapping relationship table between the target weight rank and the mail level, where the smallest target weight corresponds to mail level 1, the second-smallest target weight corresponds to mail level 2, and so on. After the target weights of all unprocessed mails are obtained, they are compared and sorted from smallest to largest, and each mail is given the mail level corresponding to its position in the sorted order. The larger the target weight, the higher the corresponding mail level.
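The rank-based level assignment can be sketched as a sort followed by numbering from 1. The function name `rank_mail_levels` and the sample weights are hypothetical; the text does not address ties, so this sketch simply gives tied mails consecutive levels in their original order.

```python
from typing import Dict


def rank_mail_levels(target_weights: Dict[str, float]) -> Dict[str, int]:
    """Smallest target weight gets level 1, the next smallest level 2, and so on."""
    # sorted() is stable, so tied mails keep their original relative order.
    ordered = sorted(target_weights, key=target_weights.get)
    return {mail_id: level for level, mail_id in enumerate(ordered, start=1)}


print(rank_mail_levels({"A": 12, "B": 30, "C": 21}))  # {'A': 1, 'C': 2, 'B': 3}
```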
In one embodiment, after the mail level is determined, prompt information corresponding to the mail level is sent to the terminal according to the mail level, and the prompt information is used for prompting the importance of the mail to the terminal equipment. Wherein the higher the mail level, the higher the importance of the mail. After receiving the prompt information for prompting the importance of the mails, the terminal equipment sorts the mails on the display interface according to the importance of each mail, and the higher the importance of the mails is, the higher the priority of the mails is.
The domain name is checked in advance against the preset character strings so that normal mails are selected by domain name. This lowers the probability that a mail subsequently subjected to word segmentation recognition is a malicious mail, and therefore reduces the cases in which a malicious mail is mistakenly recognized as an important mail. The grade of a normal mail is then determined by combining its word segmentation result with its sending time, which ensures timely replies and improves mail processing efficiency.
Considering that the sender domain names of most important mails also have a fixed format, for example the sender domain name of a government mail contains ".gov.", in order to further improve the accuracy of the subsequent mail level identification, in one embodiment, as shown in FIG. 3, step S14 includes:
S21, acquiring the initial weight of the mail according to the time weight and the text weight of the mail body in the mail.
In an embodiment, after the time weight and the text weight are obtained, the time weight and the text weight are added, so that the initial weight of the mail can be determined.
S22, matching the domain name of the sender with each second preset character string in the second preset character string set, and acquiring the corresponding domain name weight when detecting that the matching result of the domain name of the sender and any second preset character string meets a second preset condition.
In one embodiment, the second preset character string consists of a letter string or an English word together with preset symbols placed before and after it. Alternatively, the second preset character string may be derived from the common features of a large number of important domain names used as training data, or it may be preset based on manual experience.
In an embodiment, when performing similarity detection, the server may extract feature character strings from the sender domain name using the TextRank algorithm, or using a corpus-based extraction method, that is, by constructing a corpus containing a plurality of preset character strings. For example, the corpus may contain preset character strings such as "-peoples" and ".gov". The sender domain name is matched against each preset character string in the corpus, and the portion of the sender domain name that corresponds to a preset character string is extracted as the feature character string. The feature character string is then matched against each second preset character string in the second preset character string set, and when the similarity matching result satisfies the second preset condition, the corresponding domain name weight is assigned to the mail.
Because the feature character string of a malicious sender domain name can be highly similar to a second preset character string, the two may differ by only one symbol, for example, the feature character string of the malicious sender domain name is "-peoples. To avoid such false recognition as far as possible, in an embodiment, the second preset condition is that a character string identical to the second preset character string exists in the sender domain name. When the feature character string of the sender domain name is exactly the same as a second preset character string, the domain name weight corresponding to the sender domain name is assigned. Otherwise, the mail is treated as an ordinary mail and no domain name weight is assigned. The domain name weight can be preset according to actual requirements.
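The exact-match form of the second preset condition can be sketched as a plain substring test. The function name `domain_weight`, the mapping `SECOND_PRESET`, and the weight value 20 are assumptions made for illustration; the application only states that the weight is preset according to the actual situation.

```python
# Hypothetical second preset character string set, mapped to domain name weights.
SECOND_PRESET = {".gov.": 20}


def domain_weight(sender_domain, preset=SECOND_PRESET):
    """Return the weight of the first preset string found verbatim in the domain.

    Returns 0 when no preset string occurs in the domain, so a near miss that
    differs by a single symbol earns no domain name weight.
    """
    for pattern, weight in preset.items():
        if pattern in sender_domain:
            return weight
    return 0


exact = domain_weight("mail.tax.gov.cn")          # contains ".gov." verbatim
near_miss = domain_weight("mail.gov-cn.example")  # "-" instead of "." after "gov"
```

A substring test is deliberately strict: similarity scoring would give the near miss a high score, whereas an exact match rejects it outright.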
S23, determining the target weight of the mail according to the initial weight and the domain name weight.
In an embodiment, after the domain name weight is obtained, the domain name weight is added to the initial weight to obtain the target weight of the mail.
In an embodiment, when the domain name weight is obtained, the initial weight may be further increased, so as to determine the target weight of the mail according to the domain name weight and the increased initial weight.
Illustratively, if the domain name weight is obtained, the initial weight is increased by a preset value, for example by 10, and the domain name weight is then added to the increased initial weight to obtain the target weight of the mail.
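The two ways of combining the weights in steps S21 to S23 can be written as one small function. This is a sketch under stated assumptions: the function name `target_weight`, the use of `None` to signal that no domain name weight was obtained, and the sample numbers are all hypothetical; only the additive structure and the example increase of 10 come from the description.

```python
def target_weight(time_weight, text_weight, domain_weight=None, boost=0):
    """Combine the weights of a mail into its target weight.

    initial weight = time weight + text weight            (step S21)
    If no domain name weight was obtained, the initial weight is the target.
    Otherwise the initial weight is optionally increased by `boost` and the
    domain name weight is added to it                     (step S23).
    """
    initial = time_weight + text_weight
    if domain_weight is None:
        return initial
    return (initial + boost) + domain_weight


plain = target_weight(5, 12)                                # no domain match
added = target_weight(5, 12, domain_weight=20)              # simple addition
boosted = target_weight(5, 12, domain_weight=20, boost=10)  # increased initial weight
```

With these sample inputs the three calls give 17, 37, and 47 respectively, which shows how the optional increase widens the gap between mails from recognized domains and all other mails.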
The mail level is determined by combining the domain name weight of the sender domain name, the text weight of the mail body and the time weight of the mail, which further improves the accuracy of identifying important mails.
In one embodiment, as shown in fig. 4, there is provided a mail processing apparatus including:
The mail obtaining module 101 is configured to extract a sender domain name from a mail.
The domain name detection module 102 is configured to match the sender domain name against each first preset character string in the first preset character string set, and to obtain the sending time of the mail when it is detected that the matching result of the sender domain name with every first preset character string does not satisfy the first preset condition.
The weight obtaining module 103 is configured to obtain a time weight corresponding to the time interval between the sending time and the current time, where the duration of the time interval is proportional to the time weight.
The weight determining module 104 is configured to determine a target weight of the mail according to the time weight and a text weight of the mail body in the mail, where the text weight is obtained by weighting the preset weights of the segmented words (participles) in the mail body.
The mail processing module 105 is configured to send, according to the mail level corresponding to the target weight, prompt information corresponding to the mail level to at least one terminal, so that the terminal triggers a corresponding mail processing flow according to the prompt information.
In an embodiment, the domain name detection module 102 is further configured to mark the mail as a malicious mail when it is detected that the matching result of the sender domain name with any first preset character string satisfies the first preset condition.
In an embodiment, the first preset condition is that a character string identical to the first preset character string exists in the domain name of the sender.
In an embodiment, the weight determining module 104 is specifically configured to: acquire the initial weight of the mail according to the time weight and the text weight of the mail body in the mail; match the sender domain name against each second preset character string in a second preset character string set, and acquire the corresponding domain name weight when it is detected that the matching result of the sender domain name with any second preset character string satisfies a second preset condition; and determine the target weight of the mail according to the initial weight and the domain name weight.
In an embodiment, the weight determining module 104 is further configured to determine the initial weight as the target weight of the mail when it is detected that the matching result of the sender domain name with every second preset character string does not satisfy the second preset condition.
In an embodiment, the second preset condition is that a character string identical to the second preset character string exists in the domain name of the sender.
In an embodiment, the weight determining module 104 is specifically configured to increase the initial weight when the domain name weight is obtained, so that the target weight of the mail is determined according to the domain name weight and the increased initial weight.
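The cooperation of modules 101 to 105 can be sketched end to end as follows. Every constant, function name, and threshold below is a hypothetical placeholder: the preset strings, the per-word weights, the linear time weight, and the mapping from weight to level are assumptions, and whitespace splitting merely stands in for a real word segmentation step. Sending the prompt information to a terminal is omitted; the sketch only returns the level that such a prompt would carry.

```python
from datetime import datetime, timedelta

FIRST_PRESET = ["-lottery."]           # hypothetical strings marking malicious domains
SECOND_PRESET = {".gov.": 20}          # hypothetical trusted strings -> domain name weight
WORD_WEIGHTS = {"urgent": 8, "contract": 5, "deadline": 6}  # hypothetical preset weights
WEIGHT_PER_HOUR = 0.5                  # hypothetical proportionality constant
LEVEL_THRESHOLDS = [(30, 3), (15, 2)]  # hypothetical (minimum weight, level) pairs


def sender_domain(address):
    """Module 101: take the part after the last '@' as the sender domain name."""
    return address.rsplit("@", 1)[-1].lower()


def is_malicious(domain):
    """Module 102: first preset condition as an exact substring match."""
    return any(s in domain for s in FIRST_PRESET)


def time_weight(sent_at, now):
    """Module 103: weight proportional to the time elapsed since sending."""
    hours = max((now - sent_at).total_seconds(), 0) / 3600
    return WEIGHT_PER_HOUR * hours


def text_weight(body):
    """Sum the preset weights of the words found in the mail body."""
    return sum(WORD_WEIGHTS.get(w.strip(".,:;!?").lower(), 0) for w in body.split())


def process(address, body, sent_at, now):
    """Modules 104 and 105: None for a malicious mail, else (target weight, level)."""
    domain = sender_domain(address)
    if is_malicious(domain):
        return None
    weight = time_weight(sent_at, now) + text_weight(body)
    weight += next((w for s, w in SECOND_PRESET.items() if s in domain), 0)
    level = next((lvl for floor, lvl in LEVEL_THRESHOLDS if weight >= floor), 1)
    return weight, level


now = datetime(2021, 8, 17, 12, 0)
sent = now - timedelta(hours=4)
important = process("clerk@tax.gov.cn", "Urgent: contract deadline!", sent, now)
ordinary = process("bob@example.com", "hello", sent, now)
blocked = process("promo@win-lottery.example", "Urgent prize", sent, now)
```

In this sketch the malicious check runs before any weight is computed, mirroring the order in which module 102 gates the work of modules 103 and 104.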
In one embodiment, a computer device is provided, as shown in fig. 5, comprising a processor, a memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the mail processing method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the mail processing method. Those skilled in the art will appreciate that the architecture shown in fig. 5 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, the mail processing apparatus provided in the present application may be implemented in the form of a computer program that is executable on a computer device such as that shown in fig. 5. The memory of the computer device may store therein the respective program modules constituting the mail processing apparatus. The computer program constituted by the respective program modules causes the processor to execute the steps in the mail processing method of the respective embodiments of the present application described in the present specification.
In one embodiment, there is provided an electronic device including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to perform the steps of the mail processing method described above. The steps of the mail processing method herein may be steps in the mail processing methods of the respective embodiments described above.
In one embodiment, a computer-readable storage medium is provided, which stores computer-executable instructions for causing a computer to perform the steps of the above-described mail processing method. The steps of the mail processing method herein may be steps in the mail processing methods of the respective embodiments described above.
The foregoing is a preferred embodiment of the present application. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present application, and these modifications and refinements are also regarded as falling within the protection scope of the present application.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Claims (10)

1. A method for processing mail, comprising:
extracting a sender domain name from the mail;
matching the domain name of the sender with each first preset character string in the first preset character string set, and acquiring the mail sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected not to meet a first preset condition;
acquiring a time weight corresponding to a time interval according to the time interval between the sending time and the current time, wherein the duration of the time interval is in direct proportion to the time weight;
determining a target weight of the mail according to the time weight and the text weight of the mail body in the mail, wherein the text weight is obtained according to the weighting result of the preset weight of each participle in the mail body;
and sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight so that the terminal triggers a corresponding mail processing flow according to the prompt information.
2. The mail processing method according to claim 1, further comprising, after detecting the sender's domain name:
and when detecting that the matching result of the domain name of the sender and any first preset character string meets a first preset condition, marking the mail as a malicious mail.
3. The mail processing method according to claim 1 or 2, wherein the first preset condition is that a character string identical to the first preset character string exists in the sender's domain name.
4. The mail processing method according to claim 1, wherein the predetermined weight is determined by a part of speech of the participle and a semantic meaning of the participle.
5. The mail processing method of claim 1, wherein determining the target weight of the mail based on the time weight and the text weight of the body of the mail in the mail comprises:
acquiring the initial weight of the mail according to the time weight and the text weight of the mail body in the mail;
matching the domain name of the sender with each second preset character string in a second preset character string set, and acquiring the corresponding domain name weight when detecting that the matching result of the domain name of the sender and any second preset character string meets a second preset condition;
and determining the target weight of the mail according to the initial weight and the domain name weight.
6. The mail processing method according to claim 5, further comprising:
and when the matching result of the domain name of the sender and any second preset character string is detected not to meet a second preset condition, determining the initial weight as the target weight of the mail.
7. The mail processing method according to claim 5 or 6, wherein the second preset condition is that a character string identical to the second preset character string exists in the sender's domain name.
8. A mail processing apparatus, comprising:
the mail acquisition module is used for extracting a sender domain name from a mail;
the domain name detection module is used for matching the domain name of the sender with each first preset character string in the first preset character string set and acquiring the mail sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected not to meet a first preset condition;
the weight acquisition module is used for acquiring time weight corresponding to a time interval according to the time interval between the sending time and the current time, wherein the duration of the time interval is in direct proportion to the time weight;
the weight determining module is used for determining the target weight of the mail according to the time weight and the text weight of the mail body in the mail, wherein the text weight is obtained according to the weighting result of the preset weight of each participle in the mail body;
and the mail processing module is used for sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight so that the terminal triggers a corresponding mail processing flow according to the prompt information.
9. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the mail processing method according to any of claims 1 to 7 when executing the program.
10. A computer-readable storage medium, in which a computer program is stored which is adapted to be loaded and executed by a processor to cause a computer device having said processor to carry out the method of any one of claims 1 to 7.
CN202110946078.2A 2021-08-17 2021-08-17 Mail processing method, mail processing device, electronic equipment and storage medium Active CN113746814B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110946078.2A CN113746814B (en) 2021-08-17 2021-08-17 Mail processing method, mail processing device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113746814A true CN113746814A (en) 2021-12-03
CN113746814B CN113746814B (en) 2024-01-09

Family

ID=78731589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110946078.2A Active CN113746814B (en) 2021-08-17 2021-08-17 Mail processing method, mail processing device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113746814B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114520797A (en) * 2022-02-14 2022-05-20 广州拓波软件科技有限公司 Intelligent control method and device for mails

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2443201A1 (en) * 2003-09-29 2005-03-29 Tiki Technologies Corp. Probabalistic email intrusion identification methods and systems
US20130185230A1 (en) * 2012-01-12 2013-07-18 Microsoft Corporation Machine-learning based classification of user accounts based on email addresses and other account information
CN105915440A (en) * 2016-04-19 2016-08-31 乐视控股(北京)有限公司 Mail recognition method and device
CN106230867A (en) * 2016-09-29 2016-12-14 北京知道创宇信息技术有限公司 Prediction domain name whether method, system and the model training method thereof of malice, system
CN106992926A (en) * 2017-06-13 2017-07-28 深信服科技股份有限公司 A kind of method and system for forging mail-detection
CN110149266A (en) * 2018-07-19 2019-08-20 腾讯科技(北京)有限公司 Spam filtering method and device
CN111404805A (en) * 2020-03-12 2020-07-10 深信服科技股份有限公司 Junk mail detection method and device, electronic equipment and storage medium
CN111835622A (en) * 2020-07-10 2020-10-27 腾讯科技(深圳)有限公司 Information interception method and device, computer equipment and storage medium
WO2020253388A1 (en) * 2019-06-19 2020-12-24 深圳壹账通智能科技有限公司 Machine learning-based e-mail message processing method, apparatus, medium, and electronic device
CN112686631A (en) * 2020-12-29 2021-04-20 平安普惠企业管理有限公司 Task item processing method and device, computer equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114520797A (en) * 2022-02-14 2022-05-20 广州拓波软件科技有限公司 Intelligent control method and device for mails
CN114520797B (en) * 2022-02-14 2024-02-09 广州拓波软件科技有限公司 Intelligent mail management and control method and device

Also Published As

Publication number Publication date
CN113746814B (en) 2024-01-09

Similar Documents

Publication Publication Date Title
RU2708508C1 (en) Method and a computing device for detecting suspicious users in messaging systems
CN110149266B (en) Junk mail identification method and device
EP2803031B1 (en) Machine-learning based classification of user accounts based on email addresses and other account information
CN103336766B (en) Short text garbage identification and modeling method and device
US8112484B1 (en) Apparatus and method for auxiliary classification for generating features for a spam filtering model
CN112487149B (en) Text auditing method, model, equipment and storage medium
CN111538929B (en) Network link identification method and device, storage medium and electronic equipment
WO2017173093A1 (en) Method and device for identifying spam mail
CN113055386A (en) Method and device for identifying and analyzing attack organization
RU2763921C1 (en) System and method for creating heuristic rules for detecting fraudulent emails attributed to the category of bec attacks
US12021896B2 (en) Method for detecting webpage spoofing attacks
Balim et al. Automatic detection of smishing attacks by machine learning methods
CN111259207A (en) Short message identification method, device and equipment
CN111680161A (en) Text processing method and device and computer readable storage medium
Hamisu et al. Detecting advance fee fraud using nlp bag of word model
CN111835622A (en) Information interception method and device, computer equipment and storage medium
US20190372998A1 (en) Exchange-type attack simulation device, exchange-type attack simulation method, and computer readable medium
US20220210188A1 (en) Message phishing detection using machine learning characterization
CN113746814B (en) Mail processing method, mail processing device, electronic equipment and storage medium
CN112039874B (en) Malicious mail identification method and device
CN109905359B (en) Communication message processing method, device, computer equipment and readable access medium
US10163005B2 (en) Document structure analysis device with image processing
US11936686B2 (en) System, device and method for detecting social engineering attacks in digital communications
CN113556347B (en) Detection method, device and equipment for phishing mails and storage medium
Manek et al. ReP-ETD: A Repetitive Preprocessing technique for Embedded Text Detection from images in spam emails

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant