CN113746814B - Mail processing method, mail processing device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113746814B
Authority
CN
China
Prior art keywords
mail
weight
domain name
character string
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110946078.2A
Other languages
Chinese (zh)
Other versions
CN113746814A (en)
Inventor
徐治钦
陈树卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hard Link Network Technology Co ltd
Original Assignee
Shanghai Hard Link Network Technology Co ltd

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/42 Mailbox-related aspects, e.g. synchronisation of mailboxes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic

Abstract

The application discloses a mail processing method, a mail processing device, electronic equipment and a storage medium. The mail processing method comprises the following steps: matching the sender domain name against each first preset character string in a first preset character string set, and acquiring the sending time of the mail when the matching result for every first preset character string fails to meet a first preset condition; acquiring a time weight corresponding to the time interval between the sending time and the current time; determining a target weight of the mail according to the time weight and the text weight of the mail body in the mail; and sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight, so that the terminal triggers a corresponding mail processing flow according to the prompt information. The method and the device reduce the cases in which malicious mail is mistakenly identified as important mail when word segmentation recognition is used, help ensure timely replies, and improve mail processing efficiency.

Description

Mail processing method, mail processing device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a mail processing method, a device, an electronic device, and a storage medium.
Background
In customer service systems, e-mail is generally used to handle customer complaints, fault reports, business consultations, business push messages and the like. The urgency of these mails often differs; for example, a fault report is usually more urgent than a business push message. Therefore, to make it easier for customer service staff to process mail, the related art extracts keywords from the mail body to identify the mail grade and processes mails differently according to their grade; for example, a mail with a higher grade needs to be processed earlier.
However, when the mail grade is determined only from keywords in the mail body, some mails are easily ignored for a long time and the timeliness of replies cannot be ensured. Moreover, when junk mail contains such keywords, it is easily misidentified as important mail, which reduces mail processing efficiency.
Disclosure of Invention
The purpose of the application is to solve at least one of the technical problems existing in the prior art by providing a mail processing method, a mail processing device and electronic equipment that improve the recognition accuracy of mails and the efficiency of mail processing.
In a first aspect, an embodiment of the present application provides a mail processing method, including:
extracting a sender domain name from a mail;
matching the domain name of the sender with each first preset character string in the first preset character string set, and acquiring the sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected to not meet the first preset condition;
acquiring time weight corresponding to the time interval according to the time interval between the sending time and the current time, wherein the duration of the time interval is in direct proportion to the time weight;
determining a target weight of the mail according to the time weight and the text weight of the mail text in the mail, wherein the text weight is obtained according to a weighting result of a preset weight of each word in the mail text;
and sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight, so that the terminal triggers a corresponding mail processing flow according to the prompt information.
Further, after detecting the domain name of the sender, the method further comprises:
and marking the mail as malicious mail when the matching result of the domain name of the sender and any one of the first preset character strings is detected to meet the first preset condition.
Further, the first preset condition is that the same character string as the first preset character string exists in the domain name of the sender.
Further, the preset weight is determined by the part of speech of the word and the semantics of the word.
Further, determining the target weight of the mail according to the time weight and the text weight of the mail body in the mail comprises the following steps:
acquiring initial weight of the mail according to the time weight and text weight of the mail text in the mail;
matching the sender domain name with each second preset character string in the second preset character string set, and acquiring corresponding domain name weight when the matching result of the sender domain name and any second preset character string is detected to meet the second preset condition;
and determining the target weight of the mail according to the initial weight and the domain name weight.
Further, the method further comprises the following steps:
and when the matching result of the domain name of the sender and any second preset character string is detected to not meet the second preset condition, determining the initial weight as the target weight of the mail.
Further, the second preset condition is that the same character string as the second preset character string exists in the domain name of the sender.
In a second aspect, in an embodiment of the present application, there is further provided a mail processing apparatus, including:
the mail acquisition module is used for extracting the domain name of the sender from the mail;
the domain name detection module is used for matching the domain name of the sender with each first preset character string in the first preset character string set, and acquiring the sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected to not meet the first preset condition;
the weight acquisition module is used for acquiring time weight corresponding to the time interval according to the time interval between the sending time and the current time, wherein the duration of the time interval is in direct proportion to the time weight;
the weight determining module is used for determining the target weight of the mail according to the time weight and the text weight of the mail text in the mail, wherein the text weight is obtained according to the weighting result of the preset weight of each word in the mail text;
and the mail processing module is used for sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight, so that the terminal triggers a corresponding mail processing flow according to the prompt information.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, which when executed implements the mail processing method as described in the above embodiments.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the mail processing method according to the above embodiments.
Because the domain name is checked in advance against the preset character strings, normal mails are selected by domain name before word segmentation. This lowers the probability that a mail undergoing word segmentation recognition is malicious, and so reduces the cases in which malicious mail is mistakenly recognized as important mail. The grade of each normal mail is then determined by combining its word segments with its sending time, which helps ensure timely replies and improves mail processing efficiency.
Drawings
The present application is further described below with reference to the drawings and examples;
FIG. 1 is an application environment diagram of a mail processing method in one embodiment;
FIG. 2 is a flow diagram of a mail processing method in one embodiment;
FIG. 3 is a flow chart of a method of determining target weights in one embodiment;
FIG. 4 is a block diagram showing the structure of a mail processing apparatus in one embodiment;
FIG. 5 is a block diagram of a computer device in one embodiment.
Detailed Description
Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings. The drawings supplement the written description with figures so that each technical feature and the overall technical scheme of the present application can be understood intuitively; they do not limit the scope of protection of the present application.
The following describes embodiments of the present application in detail with reference to the accompanying drawings, where the mail processing method of the application provided in the embodiments of the present application is applied to an application environment including a terminal device 110 and a server 120 as shown in fig. 1. Wherein the terminal device 110 is connected to the server 120 via a network. The terminal device 110 may be a desktop terminal or a mobile terminal, wherein the mobile terminal may be one of a cell phone, a tablet computer, a notebook computer, a wearable device, etc. The server 120 may be implemented by a stand-alone server or a server group formed by a plurality of servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, basic cloud computing services such as big data and artificial intelligence platforms, and the like.
The terminal device 110 is configured to send each unprocessed mail to the server at regular intervals, or to send each unprocessed mail to the server when a new mail is received. After receiving an unprocessed mail, the server extracts the sender domain name from the mail and matches it against the first preset character strings. When the matching result shows low similarity between the sender domain name and the first preset character strings, the server obtains the sending time of the mail, determines the time weight of the mail according to the time interval between the sending time and the current time, determines the grade of the mail according to the time weight and the text weight of the mail body, and sends the grade to the terminal device 110, so that the terminal device 110 triggers the corresponding flow to process the mail according to its grade.
The sender domain name of a malicious mail usually contains a specific character string. The domain name is therefore checked in advance against the preset character strings, so that normal mails are selected by domain name before word segmentation. This lowers the probability that a mail undergoing word segmentation recognition is malicious, and so reduces the cases in which malicious mail is mistakenly recognized as important mail. The grade of each normal mail is then determined by combining its word segments with its sending time, which helps ensure timely replies and improves mail processing efficiency.
The mail processing method provided in the embodiment of the present application will be described and illustrated in detail by means of several specific embodiments.
As shown in FIG. 2, in one embodiment, a mail processing method is provided. The embodiment is mainly exemplified by the method applied to computer equipment. The computer device may be specifically the server 120 of fig. 1 described above.
Referring to FIG. 2, the mail processing method specifically includes the following steps:
S11, extracting the domain name of the sender from the mail.
In one embodiment, the server periodically receives unprocessed mails from the terminal device; or the terminal device sends each unprocessed mail to the server when it receives new mail; or, when the number of unprocessed mails in the terminal device reaches a threshold, for example 20, those 20 mails are sent to the server; or, when the server detects that at least one mail in the terminal device has remained unprocessed for a preset period, such as 24 hours, it acquires the unprocessed mail from the terminal device. After the server obtains an unprocessed mail, it extracts the sender domain name, i.e., the sending domain, from the mail (the domain of the From header). The sender domain name is directly associated with the mail content and with the identity of the party responsible for sending it, and is the only domain immediately visible to the server and the recipient, for example the domain in bj@10086.com.
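As an illustration of step S11, the following is a minimal sketch, assuming the mail is available as raw RFC 5322 text and that the domain is everything after the last "@" of the From address. The function name `extract_sender_domain` and the sample message are hypothetical and are not part of the application.

```python
from email import message_from_string
from email.utils import parseaddr


def extract_sender_domain(raw_mail: str) -> str:
    """Return the lower-cased domain of the From header, or "" if there is none."""
    message = message_from_string(raw_mail)
    # parseaddr splits 'Name <user@domain>' into (display name, address).
    _, address = parseaddr(message.get("From", ""))
    # Keep what follows the last "@"; an address without "@" has no domain.
    _, separator, domain = address.rpartition("@")
    return domain.lower() if separator else ""


# Hypothetical sample mail that reuses the address from the text.
raw = "From: Customer <bj@10086.com>\nSubject: Fault report\n\nThe line is down."
print(extract_sender_domain(raw))
```

Lower-casing the result is a design choice of this sketch, so that later string matching does not depend on letter case.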
S12, matching the domain name of the sender with each first preset character string in the first preset character string set, and acquiring the sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected to not meet the first preset condition.
Considering that malicious or counterfeit domain names typically follow a fixed format, such as "-scope", in one embodiment the first preset character string consists of a letter string or an English word followed by preset symbols. Alternatively, the first preset character string may be derived, through training on a large number of domain names, from the common properties of a large number of malicious or counterfeit domain names. Alternatively, the first preset character string may be preset according to human experience.
In an embodiment, when matching is performed, the server may use the TextRank algorithm to extract feature character strings from the sender domain name, or may use a corpus-based extraction method, i.e., constructing a corpus that contains a plurality of preset character strings, for example "-people", ".gov", and so on. The sender domain name is matched against each preset character string in the corpus, and the substrings of the sender domain name that correspond to preset character strings in the corpus are extracted as feature character strings. After the feature character strings are obtained, they are matched against each first preset character string in the first preset character string set, and when the matching result does not meet the first preset condition, the mail is judged to be normal mail. The first preset character string set is a subset of the corpus; that is, the first preset character strings are the preset character strings marked as abnormal in the corpus.
In an embodiment, when the matching result of the sender domain name and any one of the first preset character strings is detected to meet the first preset condition, the mail corresponding to the sender domain name is marked as malicious mail. For example, an identifier marking the mail as malicious is added to the header of the mail, or the mail is sent to a malicious mail directory of the terminal equipment, so that the user can easily recognize malicious mail.
The feature character string of a normal sender domain name may be highly similar to a first preset character string, differing perhaps by only one symbol; for example, the feature character string of the normal sender domain name is ".people" and the first preset character string is "-people". In this case, in order to avoid misrecognition as far as possible, in an embodiment the first preset condition is that a character string identical to the first preset character string exists in the sender domain name. When the feature character string of the sender domain name is identical to the first preset character string, the mail corresponding to the sender domain name is marked as malicious mail. If it is not identical, the mail is judged to be normal mail and its sending time is acquired.
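The exact-match form of the first preset condition can be sketched as a substring test. This is only an illustration: the set `FIRST_PRESET_STRINGS` reuses the two example strings from the text, and the function name and sample domains are hypothetical.

```python
# Illustrative first preset character string set, using the examples in the text.
FIRST_PRESET_STRINGS = {"-people", "-scope"}


def is_malicious_domain(sender_domain, preset_strings=FIRST_PRESET_STRINGS):
    """True when any first preset character string occurs verbatim in the domain."""
    domain = sender_domain.lower()
    # Exact substring match: ".people" does not trigger the "-people" rule.
    return any(preset in domain for preset in preset_strings)


print(is_malicious_domain("mail-people.example"))   # contains "-people"
print(is_malicious_domain("mail.people.example"))   # differs by one symbol
```

Requiring an identical substring, rather than a similarity score, is what keeps a normal domain such as "mail.people.example" from being flagged in this sketch.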
S13, according to the time interval between the sending time and the current time, obtaining time weight corresponding to the time interval, wherein the duration of the time interval is in direct proportion to the time weight.
In an embodiment, the server may store a mapping table in advance that records the mapping between time intervals and time weights, where the time interval is proportional to the time weight; that is, the larger the time interval, the higher the time weight. For example, a time interval of 1-2 hours corresponds to a time weight of 2; a time interval of 2-3 hours corresponds to a time weight of 4; and a time interval of 4-5 hours corresponds to a time weight of 6. The specific mapping between time intervals and time weights can be set according to actual conditions.
In an embodiment, the mapping table may instead record a mapping between the rank of the time interval and the time weight; for example, the largest time interval corresponds to a time weight of 6, the second largest to a time weight of 4, the third largest to a time weight of 2, and so on. The specific mapping between the rank of the time interval and the time weight can be set according to actual conditions. After the time intervals of all unprocessed mails are acquired, they are compared and sorted from largest to smallest, and each mail is given the time weight corresponding to its rank; the larger the time interval, the higher the time weight. For example, if there are three unprocessed mails, where the time interval of mail A is 1 hour, that of mail B is 2 hours and that of mail C is 3 hours, then the time weight of mail C is 6, that of mail B is 4 and that of mail A is 2.
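The interval-based lookup can be sketched as follows. The table rows reuse the example values from the text; treating each row as a half-open range and returning 0 for intervals that no row covers are assumptions of this sketch, as are the names `TIME_WEIGHT_TABLE` and `time_weight`.

```python
from datetime import datetime, timedelta

# (lower bound in hours, inclusive; upper bound in hours, exclusive; time weight)
TIME_WEIGHT_TABLE = [
    (1, 2, 2),
    (2, 3, 4),
    (4, 5, 6),
]


def time_weight(sent_at, now, table=TIME_WEIGHT_TABLE, default=0):
    """Look up the time weight for the interval between sending time and now."""
    hours = (now - sent_at).total_seconds() / 3600
    for lower, upper, weight in table:
        if lower <= hours < upper:
            return weight
    # Assumption: intervals outside every row fall back to a default weight.
    return default


now = datetime(2021, 8, 18, 12, 0)
print(time_weight(now - timedelta(hours=2.5), now))
```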
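The rank-based variant can be sketched in the same spirit. The weights 6, 4 and 2 come from the example above; reusing the last weight for any further ranks is an assumption of this sketch, and the names `RANK_WEIGHTS` and `rank_time_weights` are hypothetical.

```python
# Time weights by rank: largest interval first, as in the example above.
RANK_WEIGHTS = [6, 4, 2]


def rank_time_weights(intervals, rank_weights=RANK_WEIGHTS):
    """Map each mail id to a time weight based on the rank of its time interval.

    intervals: dict of mail id -> time interval in hours.
    """
    # Sort mail ids from the largest interval to the smallest.
    ordered = sorted(intervals, key=intervals.get, reverse=True)
    weights = {}
    for rank, mail_id in enumerate(ordered):
        # Assumption: ranks beyond the table reuse the last listed weight.
        weights[mail_id] = rank_weights[min(rank, len(rank_weights) - 1)]
    return weights


print(rank_time_weights({"A": 1, "B": 2, "C": 3}))
```

With the three mails from the example, mail C receives 6, mail B receives 4 and mail A receives 2.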
S14, determining the target weight of the mail according to the time weight and the text weight of the mail body in the mail, wherein the text weight is obtained according to the weighting result of the preset weight of each word in the mail body.
In one embodiment, the mail body may be segmented using the TextRank algorithm, or using a corpus-based segmentation method, i.e., constructing a corpus that contains several grouping entries, for example "extortion", "log in", "pay" and so on. The grouping entries in the corpus may be collected from existing entries on the network or set manually. Each mail body is matched against each grouping entry in the corpus, and the words of the mail body that have corresponding grouping entries in the corpus are extracted; the extracted words are the word segments.
In an embodiment, the preset weight of each word segment is pre-stored in the server 120; for example, the preset weight of "extortion" is 6, the preset weight of "log in" is 5, the preset weight of "pay" is 4, and so on. The specific values of the preset weights can be adjusted according to actual conditions. After the preset weights of the word segments are obtained, they are combined by weighting to obtain the text weight.
Different word segments may have different parts of speech, and different parts of speech influence the descriptive content of the mail body differently. For example, an auxiliary word has no influence on the descriptive content of the mail body, whereas a noun, an entity word or a verb has a larger influence. Therefore, to highlight the word segments that have a key influence on the descriptive content of the mail body and to improve the accuracy of the text weight acquired subsequently, in an embodiment the preset weight of each word segment is determined according to its part of speech. For example, the preset weight of an entity word is 2, and the preset weights of stop words and auxiliary words are 0.
However, because language is diverse and ambiguous, the same word can have different meanings in different contexts, so the same word segment may influence the descriptive content of different mail bodies differently. Thus, in one embodiment, the preset weight of a word segment is determined by both its part of speech and its semantics in the mail body. After the preset part-of-speech weight of the word segment is determined according to its part of speech, the word segment and the descriptive content are input into a trained NLP (natural language processing) model to determine the preset feature weight of the word segment, and the preset weight is then determined from the preset part-of-speech weight and the preset feature weight; for example, the preset weight is obtained by adding the preset part-of-speech weight to the preset feature weight. The preset feature weight may be determined by using the NLP model to obtain the feature vector of the word segment and then matching a corresponding preset feature weight from a plurality of preset feature weights according to that feature vector, or in other conventional ways, which are not described here.
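A minimal sketch of the corpus-based variant follows. It assumes that a grouping entry counts once if it appears anywhere in the mail body and that "weighting" means a plain sum; both are simplifications made for illustration. The weights for "log in" and "pay" are the example values from the text, and the names `PRESET_WEIGHTS` and `text_weight` are hypothetical.

```python
# Grouping entries with their preset weights (example values from the text).
PRESET_WEIGHTS = {"log in": 5, "pay": 4}


def text_weight(body, preset_weights=PRESET_WEIGHTS):
    """Sum the preset weights of the grouping entries found in the mail body."""
    text = body.lower()
    # Corpus-based segmentation: keep the entries that occur in the body.
    segments = [entry for entry in preset_weights if entry in text]
    # Assumption: each matched entry contributes its weight exactly once.
    return sum(preset_weights[entry] for entry in segments)


print(text_weight("Please log in and pay the invoice."))
```

A plain substring test also matches inside longer words (for example "pay" in "payment"); a production segmenter would need word-boundary handling.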
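The additive combination described above can be sketched as follows. The part-of-speech weights are the example values from the text; the feature weight is passed in as a plain number because the trained NLP model is not specified, so any value used here is hypothetical, as are the names `POS_WEIGHTS` and `preset_weight`.

```python
# Part-of-speech weights from the example in the text.
POS_WEIGHTS = {"entity": 2, "stop": 0, "auxiliary": 0}


def preset_weight(part_of_speech, feature_weight, pos_weights=POS_WEIGHTS):
    """Preset weight = part-of-speech weight + feature weight.

    feature_weight stands in for the output of the trained NLP model.
    Assumption: parts of speech missing from the table get a weight of 0.
    """
    return pos_weights.get(part_of_speech, 0) + feature_weight


print(preset_weight("entity", 3))
```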
In one embodiment, after the time weight and the text weight are obtained, the time weight and the text weight are added, so that the target weight of the mail can be determined.
S15, sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight, so that the terminal triggers a corresponding mail processing flow according to the prompt information.
In an embodiment, the server may store in advance a mapping table between target weight intervals and mail grades, wherein the higher the target weight interval, the higher the mail grade. For example, a target weight interval of 31-40 corresponds to mail grade 1, and a target weight interval of 41-50 corresponds to mail grade 2. The mail grade of a mail is determined according to the target weight interval to which its target weight belongs. The specific mapping between target weight intervals and mail grades can be set according to actual conditions.
In an embodiment, the server may instead store in advance a mapping table between the rank of the target weight and the mail grade; for example, the smallest target weight corresponds to mail grade 1, the second smallest to mail grade 2, and so on. After the target weights of all unprocessed mails are obtained, they are compared and sorted from smallest to largest, and each mail is assigned the mail grade corresponding to its rank; the larger the target weight, the higher the mail grade.
In one embodiment, after the mail grade is determined, prompt information corresponding to the mail grade is sent to the terminal; the prompt information indicates the importance of the mail to the terminal device, and the higher the mail grade, the higher the importance of the mail. After receiving the prompt information, the terminal device sorts the mails on its display interface according to the importance of each mail: the more important a mail is, the earlier it appears in the list, so that more important mails are processed first.
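The interval-to-grade lookup can be sketched as below, together with the addition of time weight and text weight that yields the target weight. The two table rows are the example values from the text; treating the bounds as inclusive and returning `None` for weights outside every row are assumptions of this sketch, and the function and table names are hypothetical.

```python
# (lowest target weight, highest target weight, mail grade), bounds inclusive.
GRADE_TABLE = [
    (31, 40, 1),
    (41, 50, 2),
]


def target_weight(time_weight, text_weight):
    """Target weight as the sum of time weight and text weight."""
    return time_weight + text_weight


def mail_grade(weight, table=GRADE_TABLE):
    """Return the mail grade whose interval contains the target weight."""
    for lower, upper, grade in table:
        if lower <= weight <= upper:
            return grade
    # Assumption: weights outside every interval have no grade in this sketch.
    return None


print(mail_grade(target_weight(6, 29)))
```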
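The rank-based grading can be sketched as follows, assuming grades are simply the 1-based positions after sorting from the smallest target weight to the largest. The function name and the sample weights are hypothetical.

```python
def rank_grades(target_weights):
    """Assign grade 1 to the smallest target weight, grade 2 to the next, and so on.

    target_weights: dict of mail id -> target weight.
    Ties keep the dict's insertion order (Python's sort is stable).
    """
    ordered = sorted(target_weights, key=target_weights.get)
    return {mail_id: grade for grade, mail_id in enumerate(ordered, start=1)}


print(rank_grades({"A": 12, "B": 30, "C": 21}))
```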
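The terminal-side ordering can be sketched as a sort on the grade carried by each prompt. Representing the prompt information as a dict with `mail_id` and `grade` keys is an assumption made only for this illustration; the application does not specify the message format.

```python
def order_for_display(prompts):
    """Return mail ids ordered so that higher-grade mails come first.

    prompts: list of dicts with hypothetical keys "mail_id" and "grade".
    """
    ranked = sorted(prompts, key=lambda prompt: prompt["grade"], reverse=True)
    return [prompt["mail_id"] for prompt in ranked]


prompts = [
    {"mail_id": "A", "grade": 1},
    {"mail_id": "B", "grade": 3},
    {"mail_id": "C", "grade": 2},
]
print(order_for_display(prompts))
```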
Because the domain name is checked in advance against the preset character strings, normal mails are selected by domain name before word segmentation. This lowers the probability that a mail undergoing word segmentation recognition is malicious, and so reduces the cases in which malicious mail is mistakenly recognized as important mail. The grade of each normal mail is then determined by combining its word segments with its sending time, which helps ensure timely replies and improves mail processing efficiency.
Considering that the sender domain names of most important mails also follow a fixed format, for example the domain names of government mail contain ".gov.", in order to further improve the accuracy of the subsequent mail grade identification, in one embodiment, as shown in FIG. 3, step S14 includes:
S21, acquiring the initial weight of the mail according to the time weight and the text weight of the mail body in the mail.
In one embodiment, after the time weight and the text weight are obtained, the time weight and the text weight are added to determine the initial weight of the mail.
S22, matching the sender domain name with each second preset character string in the second preset character string set, and acquiring corresponding domain name weight when the matching result of the sender domain name and any second preset character string is detected to meet the second preset condition.
In one embodiment, the second preset character string is composed of a letter string or an english word and preset symbols before and after the letter string or the english word. Alternatively, the second preset string may be obtained from a commonality of a plurality of important domain names after training the plurality of important domain names. Alternatively, the second preset string may be preset according to human experience.
In an embodiment, when similarity detection is performed, the server may use TextRank algorithm to extract the feature strings from the domain name of the sender, or use a feature string extraction method based on a corpus, i.e. by constructing a corpus, where there are a plurality of preset strings in the corpus. For example, the corpus has a predetermined string "-people", ". Gov", and so forth. The method comprises the steps of carrying out matching processing on a sender domain name and each preset character string in a corpus, and intercepting out character strings of the sender domain name, which correspond to the preset character strings, in the corpus to serve as characteristic character strings. After the characteristic character string is obtained, the characteristic character string is matched with each second preset character string in the second preset character string set, and when the similarity matching result meets the second preset condition, the corresponding domain name weight of the mail is given.
The feature string of a malicious sender domain name can be highly similar to a second preset character string and may differ from it by only one symbol; for example, the feature string of a malicious sender domain name may be "-scope". In this case, in order to avoid misrecognition as far as possible, in an embodiment the second preset condition is that a character string identical to the second preset character string exists in the sender domain name. When the feature string of the sender domain name is identical to the second preset character string, the domain name weight corresponding to the sender domain name is assigned. If they are not identical, the mail is judged to be a normal mail and no domain name weight corresponding to the sender domain name is assigned. The domain name weight can be preset according to actual conditions.
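A minimal sketch of the exact-match form of the second preset condition follows. The mapping `SECOND_PRESETS`, its weight values, and the rule of returning the weight of the first matching feature string are all hypothetical choices made for illustration; the description only states that an identical string earns a preset domain name weight and a near match does not.

```python
def domain_name_weight(feature_strings, second_presets):
    """Return the preset weight of the first feature string that is
    identical to a second preset character string, or None if none is."""
    for feature in feature_strings:
        if feature in second_presets:  # exact equality, not similarity
            return second_presets[feature]
    return None


# Hypothetical second preset character strings and their weights.
SECOND_PRESETS = {"-people": 20, ".gov": 30}

print(domain_name_weight(["-people"], SECOND_PRESETS))  # identical string
print(domain_name_weight(["-scope"], SECOND_PRESETS))   # look-alike string
```

Requiring equality rather than a similarity threshold means a look-alike feature string receives no domain name weight.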
S23, determining the target weight of the mail according to the initial weight and the domain name weight.
In one embodiment, after obtaining the domain name weight, the domain name weight is added to the initial weight to obtain the target weight of the mail.
In an embodiment, when the domain name weight is obtained, the initial weight may additionally be raised, and the target weight of the mail is then determined according to the domain name weight and the raised initial weight.
For example, if the domain name weight is obtained, the initial weight is increased by a preset value, for example by 10, and the domain name weight is then added to the raised initial weight to obtain the target weight of the mail.
The grade of the mail is determined by combining the domain name weight of the sender domain name, the text weight of the mail body and the time weight of the mail, which further improves the accuracy of identifying important mails.
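Steps S21 to S23 reduce to a small amount of arithmetic, sketched below under stated assumptions. The function name `target_weight` and the sample numbers are illustrative; the default `boost` of 10 mirrors the example value in the text, and `boost=0` corresponds to the embodiment in which the domain name weight is simply added to the initial weight.

```python
def target_weight(time_weight, text_weight, domain_weight=None, boost=10):
    """Combine the weights as in S21 to S23.

    domain_weight is None when no second preset character string matched,
    in which case the initial weight is used as the target weight.
    """
    initial = time_weight + text_weight      # S21: initial weight
    if domain_weight is None:                # S22 found no match
        return initial
    return initial + boost + domain_weight   # S23 with the raised initial weight


print(target_weight(5, 12))               # no domain name weight
print(target_weight(5, 12, 20))           # initial weight raised by 10
print(target_weight(5, 12, 20, boost=0))  # plain addition
```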
In one embodiment, as shown in FIG. 4, a mail processing apparatus is provided, including:
The mail obtaining module 101 is configured to extract the sender domain name from a mail.
The domain name detection module 102 is configured to match the sender domain name with each first preset character string in the first preset character string set, and to obtain the sending time of the mail when it is detected that the matching result of the sender domain name with each first preset character string does not meet the first preset condition.
The weight obtaining module 103 is configured to obtain the time weight corresponding to the time interval between the sending time and the current time, where the duration of the time interval is proportional to the time weight.
The weight determining module 104 is configured to determine the target weight of the mail according to the time weight and the text weight of the mail body in the mail, where the text weight is obtained from a weighted result of the preset weights of the words in the mail body.
The mail processing module 105 is configured to send prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight, so that the terminal triggers a corresponding mail processing flow according to the prompt information.
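How the five modules fit together can be sketched as a single pipeline. Everything concrete here is an assumption made for illustration: the blacklist entry in `FIRST_PRESETS`, the word weights, the `per_hour` factor, the grade thresholds, and the use of a plain sum as the "weighted result" of the word weights. The sketch returns the grade instead of sending prompt information to a terminal.

```python
from datetime import datetime, timedelta

FIRST_PRESETS = ["-scam"]                                # hypothetical first preset strings
WORD_WEIGHTS = {"urgent": 5.0, "contract": 3.0}          # hypothetical preset word weights
GRADE_THRESHOLDS = [(20.0, "high"), (10.0, "medium")]    # hypothetical grade boundaries


def sender_domain(address):
    """Module 101: extract the sender domain name from the address."""
    return address.rsplit("@", 1)[-1].lower()


def is_malicious(domain):
    """Module 102: first preset condition, a preset string occurs in the domain."""
    return any(preset in domain for preset in FIRST_PRESETS)


def time_weight(sent_at, now, per_hour=0.5):
    """Module 103: weight proportional to the time elapsed since sending."""
    hours = max((now - sent_at).total_seconds(), 0.0) / 3600.0
    return per_hour * hours


def text_weight(body):
    """Module 104 (part): sum the preset weights of the words in the body."""
    return sum(WORD_WEIGHTS.get(w.strip(".,!?").lower(), 0.0) for w in body.split())


def grade(weight):
    """Module 105 (part): map the target weight to a mail grade."""
    for threshold, name in GRADE_THRESHOLDS:
        if weight >= threshold:
            return name
    return "low"


def process(address, sent_at, body, now):
    domain = sender_domain(address)
    if is_malicious(domain):
        return "malicious"
    return grade(time_weight(sent_at, now) + text_weight(body))


now = datetime(2021, 8, 17, 12, 0)
print(process("a@example.com", now - timedelta(hours=10), "Urgent contract review", now))
```

In this sketch a mail that has waited longer accumulates a larger time weight, so an unanswered mail rises in grade over time, which is consistent with the proportionality stated for module 103.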
In one embodiment, the domain name detection module 102 is further configured to mark the mail as a malicious mail when it is detected that the matching result of the sender domain name with any one of the first preset character strings meets the first preset condition.
In an embodiment, the first preset condition is that a character string identical to the first preset character string exists in the sender domain name.
In one embodiment, the weight determining module 104 is specifically configured to: acquire the initial weight of the mail according to the time weight and the text weight of the mail body in the mail; match the sender domain name with each second preset character string in the second preset character string set, and acquire the corresponding domain name weight when it is detected that the matching result of the sender domain name with any second preset character string meets the second preset condition; and determine the target weight of the mail according to the initial weight and the domain name weight.
In an embodiment, the weight determining module 104 is further configured to determine the initial weight as the target weight of the mail when it is detected that the matching result of the sender domain name with any second preset character string does not meet the second preset condition.
In an embodiment, the second preset condition is that a character string identical to the second preset character string exists in the sender domain name.
In one embodiment, the weight determining module 104 is specifically configured to raise the initial weight when the domain name weight is obtained, and to determine the target weight of the mail according to the domain name weight and the raised initial weight.
In one embodiment, a computer device is provided, as shown in FIG. 5, comprising a processor, a memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system, and may also store a computer program that, when executed by the processor, causes the processor to implement the mail processing method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the mail processing method. Those skilled in the art will appreciate that the structure shown in FIG. 5 is merely a block diagram of the parts relevant to the present application and does not limit the computer device to which the present application may be applied; a particular computer device may include more or fewer components than shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, the mail processing apparatus provided herein may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 5. The memory of the computer device may store the various program modules that make up the mail processing apparatus. The computer program constituted by the respective program modules causes the processor to execute the steps in the mail processing method of the respective embodiments of the present application described in the present specification.
In one embodiment, an electronic device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor performs the steps of the mail processing method when executing the computer program. The steps of the mail processing method here may be the steps in the mail processing method of each of the above embodiments.
In one embodiment, a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the steps of the mail processing method described above is provided. The steps of the mail processing method here may be the steps in the mail processing method of each of the above embodiments.
The foregoing describes preferred embodiments of the present application. Those skilled in the art will appreciate that changes and modifications may be made without departing from the principles of the present application, and such changes and modifications are also intended to fall within the scope of the present application.
Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program stored on a computer-readable storage medium, and that the program, when executed, may include the steps of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

Claims (9)

1. A mail processing method, characterized by comprising:
receiving each unprocessed mail from the terminal equipment, and extracting the domain name of the sender from the mail; matching the domain name of the sender with each first preset character string in the first preset character string set, and acquiring the sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected to not meet the first preset condition; after detecting the domain name of the sender, the method further comprises: marking the mail as malicious mail when the matching result of the domain name of the sender and any first preset character string is detected to meet the first preset condition;
acquiring time weight corresponding to the time interval according to the time interval between the sending time and the current time, wherein the duration of the time interval is in direct proportion to the time weight;
determining a target weight of the mail according to the time weight and the text weight of the mail text in the mail, wherein the text weight is obtained according to a weighting result of a preset weight of each word in the mail text;
and sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight, so that the terminal triggers a corresponding mail processing flow according to the prompt information, wherein the prompt information is used for prompting the importance of the mail to the terminal equipment, and the higher the mail grade is, the higher the importance of the mail is.
2. The mail processing method according to claim 1, wherein the first preset condition is that the same character string as the first preset character string exists in the sender domain name.
3. The mail processing method according to claim 1, wherein the preset weight is determined by the part of speech of the word and the semantics of the word.
4. The mail processing method according to claim 1, wherein determining the target weight of the mail based on the time weight and the text weight of the mail body in the mail comprises:
acquiring initial weight of the mail according to the time weight and text weight of the mail text in the mail;
matching the sender domain name with each second preset character string in the second preset character string set, and acquiring corresponding domain name weight when the matching result of the sender domain name and any second preset character string is detected to meet the second preset condition;
and determining the target weight of the mail according to the initial weight and the domain name weight.
5. The mail processing method as set forth in claim 4, further comprising:
and when the matching result of the domain name of the sender and any second preset character string is detected to not meet the second preset condition, determining the initial weight as the target weight of the mail.
6. The mail processing method according to claim 4 or 5, wherein the second preset condition is that the same character string as the second preset character string exists in the sender domain name.
7. A mail processing apparatus, characterized by comprising:
the mail acquisition module is used for receiving each unprocessed mail from the terminal equipment and extracting the domain name of the sender from the mail;
the domain name detection module is used for matching the domain name of the sender with each first preset character string in the first preset character string set, and acquiring the sending time of the mail when the matching result of the domain name of the sender and each first preset character string is detected to not meet the first preset condition; marking the mail as malicious mail when the matching result of the domain name of the sender and any first preset character string is detected to meet the first preset condition;
the weight acquisition module is used for acquiring time weight corresponding to the time interval according to the time interval between the sending time and the current time, wherein the duration of the time interval is in direct proportion to the time weight;
the weight determining module is used for determining the target weight of the mail according to the time weight and the text weight of the mail text in the mail, wherein the text weight is obtained according to the weighting result of the preset weight of each word in the mail text;
and the mail processing module is used for sending prompt information corresponding to the mail grade to at least one terminal according to the mail grade corresponding to the target weight, so that the terminal triggers a corresponding mail processing flow according to the prompt information, and the prompt information is used for prompting the importance of the mail to the terminal equipment, wherein the higher the mail grade is, the higher the importance of the mail is.
8. An electronic device, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the mail processing method of any one of claims 1 to 6 when executing the program.
9. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program adapted to be loaded and executed by a processor to cause a computer device having the processor to perform the method of any of claims 1-6.
CN202110946078.2A 2021-08-17 2021-08-17 Mail processing method, mail processing device, electronic equipment and storage medium Active CN113746814B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110946078.2A CN113746814B (en) 2021-08-17 2021-08-17 Mail processing method, mail processing device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110946078.2A CN113746814B (en) 2021-08-17 2021-08-17 Mail processing method, mail processing device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113746814A CN113746814A (en) 2021-12-03
CN113746814B true CN113746814B (en) 2024-01-09

Family

ID=78731589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110946078.2A Active CN113746814B (en) 2021-08-17 2021-08-17 Mail processing method, mail processing device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113746814B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114520797B (en) * 2022-02-14 2024-02-09 广州拓波软件科技有限公司 Intelligent mail management and control method and device

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2443201A1 (en) * 2003-09-29 2005-03-29 Tiki Technologies Corp. Probabalistic email intrusion identification methods and systems
CN105915440A (en) * 2016-04-19 2016-08-31 乐视控股(北京)有限公司 Mail recognition method and device
CN106230867A (en) * 2016-09-29 2016-12-14 北京知道创宇信息技术有限公司 Prediction domain name whether method, system and the model training method thereof of malice, system
CN106992926A (en) * 2017-06-13 2017-07-28 深信服科技股份有限公司 A kind of method and system for forging mail-detection
CN110149266A (en) * 2018-07-19 2019-08-20 腾讯科技(北京)有限公司 Spam filtering method and device
CN111404805A (en) * 2020-03-12 2020-07-10 深信服科技股份有限公司 Junk mail detection method and device, electronic equipment and storage medium
CN111835622A (en) * 2020-07-10 2020-10-27 腾讯科技(深圳)有限公司 Information interception method and device, computer equipment and storage medium
WO2020253388A1 (en) * 2019-06-19 2020-12-24 深圳壹账通智能科技有限公司 Machine learning-based e-mail message processing method, apparatus, medium, and electronic device
CN112686631A (en) * 2020-12-29 2021-04-20 平安普惠企业管理有限公司 Task item processing method and device, computer equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9189746B2 (en) * 2012-01-12 2015-11-17 Microsoft Technology Licensing, Llc Machine-learning based classification of user accounts based on email addresses and other account information

Also Published As

Publication number Publication date
CN113746814A (en) 2021-12-03

Similar Documents

Publication Publication Date Title
US10785176B2 (en) Method and apparatus for classifying electronic messages
CN108519970B (en) Method for identifying sensitive information in text, electronic device and readable storage medium
CN110149266B (en) Junk mail identification method and device
JP5759228B2 (en) A method for calculating semantic similarity between messages and conversations based on extended entity extraction
EP2803031B1 (en) Machine-learning based classification of user accounts based on email addresses and other account information
CN108092963B (en) Webpage identification method and device, computer equipment and storage medium
US20060149820A1 (en) Detecting spam e-mail using similarity calculations
CN105792152B (en) Pseudo base station short message identification method and device
US20240031481A1 (en) Dynamically providing safe phone numbers for responding to inbound communications
Singh et al. Email spam classification by support vector machine
CN111614543B (en) URL-based spear phishing mail detection method and system
CN104184653A (en) Message filtering method and device
CN113746814B (en) Mail processing method, mail processing device, electronic equipment and storage medium
CN114818705A (en) Method, electronic device and computer program product for processing data
CN112039874B (en) Malicious mail identification method and device
CN110955796B (en) Case feature information extraction method and device based on stroke information
US20230104884A1 (en) Method for detecting webpage spoofing attacks
CN109510904B (en) Method and system for detecting call center outbound record
CN111083705A (en) Group-sending fraud short message detection method, device, server and storage medium
CN113472686B (en) Information identification method, device, equipment and storage medium
CN113556347B (en) Detection method, device and equipment for phishing mails and storage medium
CN113420549B (en) Abnormal character string identification method and device
US11681966B2 (en) Systems and methods for enhanced risk identification based on textual analysis
Manek et al. ReP-ETD: A Repetitive Preprocessing technique for Embedded Text Detection from images in spam emails
CN112989838B (en) Text contact entity extraction method, device and equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant