CN112101917A - Mail outgoing processing method, device, system and storage medium - Google Patents
- Publication number: CN112101917A
- Application number: CN202011039188.2A
- Authority: CN (China)
- Prior art keywords: outgoing, data, sensitive data, mails
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/107—Computer-aided management of electronic mailing [e-mailing]
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L51/00—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
- H04L51/42—Mailbox-related aspects, e.g. synchronisation of mailboxes
Abstract
The embodiments of this specification provide a mail outgoing processing method, device, system and storage medium. The method comprises: extracting the mail content of an outgoing mail; performing sensitive data matching on the mail content; and, for an outgoing mail in which sensitive data is matched, blocking the outgoing mail from being sent and triggering an audit of the outgoing mail. The embodiments of this specification can prevent or reduce the leakage of sensitive data via outgoing mail.
Description
Technical Field
The present disclosure relates to the field of mail information security technologies, and in particular, to a method, a device, a system, and a storage medium for processing outgoing mails.
Background
With the advance of the information technology revolution represented by artificial intelligence, big data and the Internet of Things, the value of data has become increasingly prominent. Data is now a significant asset for enterprises and a driving force for their continuous innovation. The importance of keeping data secure at every stage, including collection, transmission, utilization and sharing, is therefore self-evident.
Research shows that, with the rapid development of information technology, electronic mail has become an important channel for data exchange between enterprises and the outside world. At the same time, because sensitive data can be leaked through outgoing mail, electronic mail has also become one of the main channels of enterprise data leakage. How to prevent sensitive data from being leaked through outgoing mail is therefore a technical problem that urgently needs to be solved.
Disclosure of Invention
The embodiments of the present specification aim to provide a method, device, system and storage medium for processing outgoing mails, so as to prevent or reduce leakage of sensitive data by outgoing mails.
In order to achieve the above object, in one aspect, an embodiment of the present specification provides a method for processing outgoing mails, including:
extracting the mail content of the outgoing mail;
performing sensitive data matching on the mail content;
and for an outgoing mail in which sensitive data is matched, blocking the outgoing mail from being sent and triggering an audit of the outgoing mail.
In an embodiment of the present specification, the method further includes:
and carrying out outgoing processing on outgoing mails which are not matched with the sensitive data.
In an embodiment of this specification, the performing sensitive data matching on the mail content includes:
extracting feature data of the mail content;
and inputting the characteristic data into a sensitive mail prediction model obtained based on machine learning model training so as to identify whether the outgoing mail corresponding to the mail content is the outgoing mail containing the sensitive data.
In an embodiment of this specification, the performing sensitive data matching on the mail content includes:
dividing the mail content into a plurality of data segments according to a preset fragmentation rule;
performing a hash calculation on each data segment to generate a digital fingerprint of each data segment;
matching the digital fingerprints with the digital fingerprints in a digital fingerprint database to identify whether the outgoing mail corresponding to the mail content is an outgoing mail containing sensitive data;
and the digital fingerprints in the digital fingerprint database are digital fingerprints corresponding to the sensitive data.
In an embodiment of this specification, the digital fingerprints in the digital fingerprint database are obtained in advance by:
when the file containing the sensitive data is structured data, performing hash calculation on each row of the sensitive data of the file to generate a digital fingerprint of that row of the sensitive data.
In an embodiment of this specification, the digital fingerprints in the digital fingerprint database are obtained in advance by:
when the file containing the sensitive data is unstructured data, the sensitive data is divided into a plurality of data fragments according to the fragmentation rule, and hash calculation is carried out on each data fragment to generate a digital fingerprint of each data fragment.
In an embodiment of this specification, the triggering of an audit of the outgoing mail includes:
and sending the outgoing mail matched with the sensitive data to an OA examination and approval system for auditing so that the OA examination and approval system determines a mail sending strategy according to an examination and approval result.
In an embodiment of this specification, the performing outgoing processing includes:
and adding the outgoing mails which are not matched with the sensitive data into a mail outgoing queue to send the outgoing mails.
In an embodiment of the present specification, the mail content includes: the receiver, the sender, the mail subject, the mail body and the mail attachment.
On the other hand, the embodiment of the present specification further provides a mail outgoing processing device, including:
the extraction module is used for extracting the mail content of the outgoing mail;
the matching module is used for matching sensitive data of the mail content;
and the blocking module is used for blocking the sending of the outgoing mails matched with the sensitive data and triggering the outgoing mails to be audited.
In an embodiment of this specification, the device for processing outgoing mail further includes:
and the outgoing module is used for carrying out outgoing processing on the outgoing mails which are not matched with the sensitive data.
On the other hand, the embodiment of the present specification further provides a mail outgoing processing system, including a mail outgoing processing device and a mail server, where the mail outgoing processing device is a next hop of an outgoing mail route of the mail server;
the mail outgoing processing device is configured to:
extracting the mail content of the outgoing mail;
performing sensitive data matching on the mail content;
and for an outgoing mail in which sensitive data is matched, blocking the outgoing mail from being sent and triggering an audit of the outgoing mail.
In an embodiment of the present specification, the mail outgoing processing device is in a software and hardware integrated form or a virtualized form.
On the other hand, the embodiments of the present specification also provide another mail outgoing processing device, which includes a memory, a processor, and a computer program stored on the memory, and when the computer program is executed by the processor, the computer program executes the instructions of the above method.
In another aspect, the present specification further provides a computer storage medium, on which a computer program is stored, and the computer program is executed by a processor of a computer device to execute the instructions of the method.
As can be seen from the technical solutions provided by the embodiments of the present specification, sensitive data matching may be performed on the mail content of an outgoing mail. If sensitive data is matched, the outgoing mail is a sensitive mail, so its sending may be blocked and an audit process may be triggered. This prevents or reduces the leakage of sensitive data through outgoing mail and thereby improves the security of the sensitive data.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present specification, and for those skilled in the art, other drawings can be obtained according to the drawings without any creative effort. In the drawings:
FIG. 1 is a flow chart of a mail outgoing processing method in some embodiments provided herein;
FIG. 2 is a block diagram of an outgoing mail processing system in some embodiments provided herein;
FIG. 3 is a block diagram of an outgoing mail processing device in some embodiments provided herein;
FIG. 4 is a block diagram of an outgoing mail processing device in further embodiments provided herein.
[Description of reference numerals]
31. an extraction module;
32. a matching module;
33. a blocking module;
34. an outgoing module;
402. a computer device;
404. a processor;
406. a memory;
408. a drive mechanism;
410. an input/output module;
412. an input device;
414. an output device;
416. a presentation device;
418. a graphical user interface;
420. a network interface;
422. a communication link;
424. a communication bus.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step should fall within the scope of protection of the present specification.
The method for processing outgoing mail in the embodiments of the present specification may be applied on the side of the mail outgoing processing device, where the mail outgoing processing device is the next hop of the outgoing mail route of the mail server; that is, in the outgoing direction, the next hop of the mail server's mail route is configured to point to the mail outgoing processing device. In this arrangement, every outgoing mail first passes through the mail server and then reaches the mail outgoing processing device, so that the mail outgoing processing device can perform mail outgoing processing on each outgoing mail.
Referring to fig. 1, the method for processing outgoing mail in the embodiment of the present specification may include the following steps:
S101, extracting the mail content of the outgoing mail.
S102, performing sensitive data matching on the mail content.
S103, for an outgoing mail in which sensitive data is matched, blocking the outgoing mail from being sent and triggering an audit of the outgoing mail.
Therefore, in the embodiments of this specification, sensitive data matching can be performed on the mail content of an outgoing mail. If sensitive data is matched, the outgoing mail is a sensitive mail, so its sending can be blocked and the audit process can be triggered. This prevents or reduces the leakage of sensitive data through outgoing mail and helps improve the security of the sensitive data.
Of course, if the sensitive data is not matched, the outgoing mail is indicated to be a normal mail, so that the outgoing processing can be normally carried out on the outgoing mail which is not matched with the sensitive data. That is, outgoing mail that does not match the sensitive data may be added to the mail outgoing queue to facilitate sending the outgoing mail.
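The control flow of steps S101 to S103, together with the normal outgoing path, can be sketched as follows. This is only a minimal illustration under stated assumptions: the class name `OutgoingProcessor`, the two in-memory queues, and the keyword-based matcher are hypothetical stand-ins and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class OutgoingProcessor:
    """Hypothetical sketch of steps S101 to S103; all names are illustrative."""
    matcher: Callable[[str], bool]  # returns True when sensitive data is matched
    send_queue: List[str] = field(default_factory=list)
    audit_queue: List[str] = field(default_factory=list)

    def process(self, raw_mail: str) -> str:
        content = raw_mail.strip()             # S101: stand-in for content extraction
        if self.matcher(content):              # S102: sensitive data matching
            self.audit_queue.append(raw_mail)  # S103: block sending, trigger audit
            return "blocked"
        self.send_queue.append(raw_mail)       # no match: normal outgoing processing
        return "queued"


# Assumed toy matcher: flags any mail that mentions a password.
processor = OutgoingProcessor(matcher=lambda text: "password" in text.lower())
results = [
    processor.process("Quarterly lunch on Friday"),
    processor.process("Server password: hunter2"),
]
```

In a real deployment the matcher would be one of the matching strategies described below, and the queues would be backed by the mail system rather than Python lists.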
In the embodiments of the present specification, outgoing mail is mail sent to the outside, and is generally sent to an external mailbox (e.g., a client mailbox) through an external network (e.g., the internet).
In the embodiments of this specification, the mail content can comprise the recipients (including carbon copy and blind carbon copy recipients), the sender, the mail subject, the mail body and the mail attachment information. When an outgoing mail transmitted by the mail server is received, this content can be extracted from the outgoing mail for subsequent processing. In some embodiments of the present description, which parts of the mail content are extracted can be chosen according to actual needs. In most cases, however, sensitive data is contained in the mail body and/or the mail attachments, so the mail body and the mail attachments generally both need to be extracted.
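As one possible illustration of the extraction step, the sketch below uses Python's standard `email` package to pull out the fields listed above. The function name `extract_mail_content`, the returned dictionary layout, and the sample message are assumptions made for this example; the specification does not prescribe a parser.

```python
from email import message_from_string, policy
from email.utils import getaddresses


def extract_mail_content(raw: str) -> dict:
    """Extract recipients, sender, subject, body and attachment names."""
    msg = message_from_string(raw, policy=policy.default)
    # Recipients include To, Cc (carbon copy) and Bcc (blind carbon copy).
    recipient_headers = []
    for name in ("To", "Cc", "Bcc"):
        recipient_headers.extend(msg.get_all(name, []))
    recipients = [addr for _, addr in getaddresses(recipient_headers)]
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    attachments = [part.get_filename() for part in msg.iter_attachments()]
    return {
        "recipients": recipients,
        "sender": str(msg.get("From", "")),
        "subject": str(msg.get("Subject", "")),
        "body": body,
        "attachments": attachments,
    }


raw = (
    "From: alice@example.com\n"
    "To: bob@example.org\n"
    "Cc: carol@example.org\n"
    "Subject: Contract draft\n"
    "\n"
    "Please find the draft attached.\n"
)
content = extract_mail_content(raw)
```

Only attachment file names are collected here; a fuller implementation would also decode the attachment payloads so that they can be matched.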
In the embodiments of the present specification, sensitive data generally refers to data that is not suitable for external publication and that may cause serious harm to enterprises or individuals if leaked. For an enterprise or social organization, examples include the business situation (including financial statements and statistical statements), internal data (including technical data, business planning data, etc.), contract information, server usernames and passwords, IP address lists, and the like.
In some embodiments of the present specification, the matching sensitive data to the mail content may include:
1) Extracting the feature data of the mail content.
The purpose of extracting the feature data of the mail content is to convert the mail content into feature matrix data, that is, to convert the textual mail content into picture-like matrix data that can then be fed into the sensitive mail prediction model. In an embodiment of the present specification, the mail content may be converted into feature matrix data by encoding it. For example, in an exemplary embodiment, the mail content may be divided into data fragments and a hash calculation may be performed on each fragment, so that the resulting hash value serves as the encoding of the corresponding content. In an exemplary application scenario, each sender may serve as one data fragment; each recipient (including carbon copy and blind carbon copy recipients) may serve as one data fragment; the mail body may be divided into a plurality of data fragments according to a preset fragmentation rule; and, similarly, each file in the mail attachments may be divided into a plurality of data fragments according to a preset fragmentation rule.
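The encoding described above can be illustrated with a small sketch. The choices made here are assumptions for the example only: SHA-256 as the hash function, fixed-length character slices as the fragmentation rule, scaling each digest byte to the range 0 to 1, and zero-padding to a fixed number of rows. The specification leaves all of these open.

```python
import hashlib
from typing import List

DIGEST_BYTES = 32  # SHA-256 yields 32 bytes per data fragment


def slice_text(text: str, size: int) -> List[str]:
    """Assumed fragmentation rule: consecutive slices of `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def encode_fragments(fragments: List[str], rows: int) -> List[List[float]]:
    """Hash each fragment and stack the digests into a rows x 32 matrix."""
    matrix = []
    for piece in fragments[:rows]:  # truncate when there are too many fragments
        digest = hashlib.sha256(piece.encode("utf-8")).digest()
        matrix.append([byte / 255.0 for byte in digest])
    while len(matrix) < rows:       # pad with zero rows to a fixed height
        matrix.append([0.0] * DIGEST_BYTES)
    return matrix


# One fragment per sender and recipient, plus slices of the mail body.
fragments = ["alice@example.com", "bob@example.org"]
fragments += slice_text("Please review the attached contract.", 16)
features = encode_fragments(fragments, rows=8)
```

The fixed matrix shape is what allows mails of different lengths to be passed to the same model input.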
2) Inputting the feature data into a sensitive mail prediction model obtained by training a machine learning model, so as to identify whether the outgoing mail corresponding to the mail content is an outgoing mail containing sensitive data.
In some embodiments of the present description, a training set, a validation set, and a test set may be constructed from historical outgoing mail data. The selected initial machine learning model is then trained on samples from the training set to obtain a sensitive mail prediction model, i.e., a mail classifier that can recognize whether an outgoing mail is a sensitive mail. During training, each intermediate model can be verified against samples from the validation set so that it is continuously updated and refined. After training, the sensitive mail prediction model can be tested on samples from the test set to check whether it meets preset model evaluation indexes (such as accuracy, recall, and the confusion matrix). When it meets the preset evaluation conditions, it can be used as the final sensitive mail prediction model for subsequent sensitive mail prediction.
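The evaluation indexes mentioned above can be computed from test-set labels and model predictions as in the following sketch. The label convention (1 for sensitive mail, 0 for normal mail), the function names, and the sample labels are assumptions chosen for illustration.

```python
from typing import Dict, List


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 = sensitive)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 1 and pred == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def accuracy(c: Dict[str, int]) -> float:
    """Fraction of all mails that were classified correctly."""
    return (c["tp"] + c["tn"]) / sum(c.values())


def recall(c: Dict[str, int]) -> float:
    """Fraction of truly sensitive mails that the model caught."""
    return c["tp"] / (c["tp"] + c["fn"])


# Hypothetical test-set labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred)
```

For a leak-prevention classifier, recall is usually the index to watch most closely, because a false negative corresponds to a sensitive mail that is sent without audit.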
After the feature data is input into the sensitive mail prediction model obtained based on machine learning model training, the sensitive mail prediction model can calculate and output a prediction result. For example, an output of 1 indicates that the outgoing mail is a sensitive mail, and an output of 0 indicates that the outgoing mail is a normal mail. Because the sensitive mail prediction model is obtained based on machine learning model training, the sensitive mail prediction model can accurately predict new data which do not appear in the sample.
In some embodiments of the present description, the initial machine learning model used for the sensitive mail prediction model may be, for example, a neural network (e.g., a convolutional neural network, a recurrent neural network, a fully connected neural network, or another deep learning model), a support vector machine (SVM), naive Bayes, etc.
In other embodiments of the present specification, the matching sensitive data to the mail content may include:
1) Dividing the mail content into a plurality of data segments according to a preset fragmentation rule.
2) Performing a hash calculation on each data segment to generate a digital fingerprint of each data segment.
3) Matching the digital fingerprints with the digital fingerprints in a digital fingerprint database to identify whether the outgoing mail corresponding to the mail content is an outgoing mail containing sensitive data.
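The three steps above can be sketched as follows. The fragmentation rule used here (overlapping windows of four lower-cased words) and the use of SHA-256 are assumptions for illustration; the specification only requires that the same preset rule be applied to the mail content and to the sensitive data.

```python
import hashlib
from typing import Iterator, Set


def fragments(text: str, k: int = 4) -> Iterator[str]:
    """Assumed fragmentation rule: every window of k consecutive words."""
    words = text.lower().split()
    for i in range(max(1, len(words) - k + 1)):
        yield " ".join(words[i:i + k])


def fingerprint(fragment: str) -> str:
    """Digital fingerprint of one data segment."""
    return hashlib.sha256(fragment.encode("utf-8")).hexdigest()


def build_database(sensitive_text: str, k: int = 4) -> Set[str]:
    """Fingerprints of the sensitive data, computed ahead of time."""
    return {fingerprint(f) for f in fragments(sensitive_text, k)}


def contains_sensitive(mail_text: str, database: Set[str], k: int = 4) -> bool:
    """True if any fingerprint of the mail content is in the database."""
    return any(fingerprint(f) in database for f in fragments(mail_text, k))


database = build_database(
    "The acquisition price for Project Falcon is 42 million dollars"
)
leaked = contains_sensitive(
    "FYI, the acquisition price for project falcon is confidential", database
)
clean = contains_sensitive(
    "Lunch menu for Friday is attached to this message", database
)
```

Because the windows overlap, a mail that quotes only part of a sensitive passage, or quotes it with edits elsewhere, can still share at least one fingerprint with the database.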
The digital fingerprints in the digital fingerprint database are digital fingerprints corresponding to the sensitive data, and the digital fingerprints in the digital fingerprint database can be obtained in advance in the following way:
when the file containing the sensitive data is structured data (such as a database table and the like), performing hash calculation on each row of the sensitive data of the file to generate a digital fingerprint of the row of the sensitive data. For structured data (such as a customer information table), a database administrator is not required to assign special authority or intervene in database service, and only read authority is required to grab data from a database table and generate a digital fingerprint to be input into a digital fingerprint database.
When the file containing the sensitive data is unstructured data (such as a text file, a word file, a picture and the like), the file containing the sensitive data is divided into a plurality of data fragments according to the fragmentation rule, and hash calculation is carried out on each data fragment to generate a digital fingerprint of each data fragment to be recorded into a digital fingerprint database.
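Building the fingerprint database for the two kinds of source data might look like the sketch below. The row separator, the fixed chunk size, SHA-256, and the sample rows and text are all assumptions for illustration; in practice the rows would be read from a database table with read-only access.

```python
import hashlib
from typing import Iterable, List, Sequence


def fingerprint(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def fingerprints_from_rows(rows: Iterable[Sequence[str]]) -> List[str]:
    """Structured data: one fingerprint per row of the table."""
    # The unit-separator character keeps ("ab", "c") distinct from ("a", "bc").
    return [fingerprint("\x1f".join(row)) for row in rows]


def fingerprints_from_text(text: str, size: int) -> List[str]:
    """Unstructured data: one fingerprint per fixed-size fragment."""
    return [fingerprint(text[i:i + size]) for i in range(0, len(text), size)]


# Hypothetical customer information table and internal document.
customer_rows = [
    ("Alice", "alice@example.com", "555-0100"),
    ("Bob", "bob@example.org", "555-0101"),
]
roadmap = "Internal roadmap: launch the new billing service in the third quarter."

row_prints = fingerprints_from_rows(customer_rows)
text_prints = fingerprints_from_text(roadmap, size=20)
fingerprint_db = set(row_prints) | set(text_prints)
```

Only the fingerprints are stored, so the database itself does not need to hold the sensitive data in readable form.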
Therefore, by matching the digital fingerprints of the mail content against the digital fingerprints in the digital fingerprint database, the original file content can be identified very accurately, and file content that has been modified within a certain range can also be identified, which helps reduce the number of sensitive mails that escape interception. Of course, the digital fingerprint database needs to be updated periodically, since sensitive data may change over time.
In addition, considering that only a part of the content of a file containing sensitive data belongs to the sensitive data, when the digital fingerprint database is constructed, no matter structured data or unstructured data, data fragmentation can be selectively performed, that is, only the sensitive data part in the file can be fragmented, so that the processing amount is reduced, and the efficiency of mail outgoing processing is improved.
Those skilled in the art will appreciate that the foregoing is merely illustrative, and that in other embodiments of the present description other approaches, such as regular expression matching, may also be employed for sensitive data matching of the mail content. Of course, if regular expression matching is employed, the regular expressions need to be defined in advance. A regular expression is a pattern built from predefined ordinary and special characters; this "rule string" expresses the interception and filtering logic to be applied to character strings.
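A minimal sketch of regular expression matching is shown below, using Python's `re` module. The two patterns (an IPv4-like address and a password assignment) are hypothetical examples of predefined expressions, loosely based on the kinds of sensitive data listed earlier; they are deliberately simple and would need tightening for production use.

```python
import re
from typing import Dict, List, Pattern

# Predefined expressions; both patterns are illustrative assumptions.
PATTERNS: Dict[str, Pattern[str]] = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "credential": re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*\S+"),
}


def match_patterns(text: str) -> List[str]:
    """Return the names of all predefined patterns found in the text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


hits = match_patterns("Login at 10.0.0.12 with password: s3cret")
no_hits = match_patterns("See you at lunch")
```

An outgoing mail would be treated as matched when the returned list is non-empty.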
In some embodiments of the present specification, the triggering of auditing may include:
Sending the outgoing mail in which sensitive data is matched to an OA approval system for auditing, so that the OA approval system determines a mail sending policy according to the approval result. The OA approval system can implement an approval flow comprising one or more stages of anti-disclosure approval. At each stage, an auditor verifies the mail content and then records the corresponding approval decision in the OA approval system. By integrating the approval decisions of all stages, the OA approval system obtains a final approval result and determines the mail sending policy accordingly.
For example, in some exemplary embodiments, when the final approval result is "no transmission", the OA approval system may automatically archive and isolate the corresponding outgoing mail. When the final approval result is "desensitization sending", the OA approval system can call a data desensitization tool to desensitize the mail content, form a new outgoing mail from which the sensitive data has been removed, and add the new outgoing mail to the outgoing mail queue. When the final approval result is "encrypted sending", the OA approval system may encrypt the mail content (at least the sensitive data portion of the mail content) to form a new outgoing mail, and add the encrypted mail to the outgoing mail queue.
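The mapping from the final approval result to a mail sending policy can be sketched as a simple dispatch. Everything named here is hypothetical: the `Approval` enum, the `apply_policy` function, and the two callables, which are placeholders for a real desensitization tool and a real encryption routine.

```python
import re
from enum import Enum
from typing import Callable, List


class Approval(Enum):
    NO_SEND = "no transmission"
    DESENSITIZE = "desensitization sending"
    ENCRYPT = "encrypted sending"


def apply_policy(
    result: Approval,
    mail: str,
    outgoing_queue: List[str],
    archive: List[str],
    desensitize: Callable[[str], str],
    encrypt: Callable[[str], str],
) -> None:
    """Apply the mail sending policy that corresponds to the approval result."""
    if result is Approval.NO_SEND:
        archive.append(mail)                      # archive and isolate
    elif result is Approval.DESENSITIZE:
        outgoing_queue.append(desensitize(mail))  # new mail without sensitive data
    elif result is Approval.ENCRYPT:
        outgoing_queue.append(encrypt(mail))      # new, encrypted mail
    else:
        raise ValueError(f"unknown approval result: {result!r}")


def mask_digits(mail: str) -> str:
    """Placeholder desensitization: mask every digit."""
    return re.sub(r"\d", "*", mail)


def fake_encrypt(mail: str) -> str:
    """Placeholder only; this is NOT real encryption."""
    return "<encrypted payload>"


queue: List[str] = []
archive: List[str] = []
apply_policy(Approval.DESENSITIZE, "Card 1234", queue, archive, mask_digits, fake_encrypt)
apply_policy(Approval.NO_SEND, "Secret plan", queue, archive, mask_digits, fake_encrypt)
apply_policy(Approval.ENCRYPT, "Contract terms", queue, archive, mask_digits, fake_encrypt)
```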
Of course, for each outgoing mail that is intercepted and blocked, an alert may also be sent to the corresponding sender after the mail sending policy is determined.
In an embodiment of the present specification, the hash calculation may be a hash calculation based on a specified hash function. The specified hash function may include, for example, but is not limited to, a Message-Digest (MD) algorithm or a Secure Hash Algorithm (SHA). Typical MD algorithms include MD4, MD5, and the like. Typical SHA algorithms include SHA-1 and the SHA-2 family (e.g., SHA-256), among others.
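As a brief illustration of a hash calculation with a configurable hash function, the sketch below uses Python's standard `hashlib`. The function name `digital_fingerprint` and the default of SHA-256 are assumptions for this example; which other algorithms are available depends on the Python build and can be checked via `hashlib.algorithms_available`.

```python
import hashlib


def digital_fingerprint(segment: bytes, algorithm: str = "sha256") -> str:
    """Hash one data segment with the specified hash function."""
    return hashlib.new(algorithm, segment).hexdigest()


sha256_print = digital_fingerprint(b"customer list, row 1")
sha1_print = digital_fingerprint(b"customer list, row 1", algorithm="sha1")
```

The same algorithm must be used when building the digital fingerprint database and when fingerprinting mail content, since fingerprints from different hash functions cannot be compared.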
While the process flows described above include operations that occur in a particular order, it should be appreciated that the processes may include more or fewer operations, which may be performed sequentially or in parallel (e.g., using parallel processors or a multi-threaded environment).
Corresponding to the mail outgoing processing method, the embodiments of this specification further provide a mail outgoing processing system. Referring to FIG. 2, the mail outgoing processing system may include a mail outgoing processing device and a mail server, the mail outgoing processing device being the next hop of the outgoing mail route of the mail server; that is, in the outgoing direction, the next hop of the mail server's mail route is configured to point to the mail outgoing processing device. In this arrangement, every outgoing mail first passes through the mail server and then reaches the mail outgoing processing device, so that the mail outgoing processing device can perform mail outgoing processing on each outgoing mail. The mail server can receive the outgoing mail provided by the user terminal and transmit it to the mail outgoing processing device for processing. The mail outgoing processing device may be configured to:
extracting the mail content of the outgoing mail;
performing sensitive data matching on the mail content;
and for an outgoing mail in which sensitive data is matched, blocking the outgoing mail from being sent and triggering an audit of the outgoing mail.
In some embodiments of this specification, the outgoing mail processing device may monitor and process an outgoing mail provided by the mail server by using a Secure Socket Layer (SSL). Of course, the mail server needs to be configured accordingly for this purpose.
In the outgoing mail processing system of some embodiments of the present specification, the outgoing mail processing device may be further configured to:
and carrying out outgoing processing on outgoing mails which are not matched with the sensitive data.
In the outgoing mail processing system according to some embodiments of the present specification, the performing sensitive data matching on the mail content may include:
extracting feature data of the mail content;
and inputting the characteristic data into a sensitive mail prediction model obtained based on machine learning model training so as to identify whether the outgoing mail corresponding to the mail content is the outgoing mail containing the sensitive data.
In the outgoing mail processing system according to some embodiments of the present specification, the performing sensitive data matching on the mail content may include:
dividing the mail content into a plurality of data segments according to a preset fragmentation rule;
performing a hash calculation on each data segment to generate a digital fingerprint of each data segment;
matching the digital fingerprints with the digital fingerprints in a digital fingerprint database to identify whether the outgoing mail corresponding to the mail content is an outgoing mail containing sensitive data;
and the digital fingerprints in the digital fingerprint database are digital fingerprints corresponding to the sensitive data.
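The fingerprint-matching steps above can be sketched as follows. The fragmentation rule (a sliding window of consecutive lowercased words) and the use of SHA-256 are assumptions chosen for the example; the specification only requires a preset fragmentation rule and a hash calculation. The function names are hypothetical.

```python
import hashlib
from typing import List, Set


def split_into_fragments(text: str, words_per_fragment: int = 4) -> List[str]:
    """Assumed rule: every run of N consecutive lowercased words is a fragment."""
    words = text.lower().split()
    if len(words) <= words_per_fragment:
        return [" ".join(words)] if words else []
    return [
        " ".join(words[i:i + words_per_fragment])
        for i in range(len(words) - words_per_fragment + 1)
    ]


def fingerprint(fragment: str) -> str:
    """Digital fingerprint of one fragment (SHA-256 hex digest)."""
    return hashlib.sha256(fragment.encode("utf-8")).hexdigest()


def contains_sensitive_data(mail_content: str, fingerprint_db: Set[str]) -> bool:
    """True if any fragment fingerprint of the mail appears in the database."""
    return any(
        fingerprint(fragment) in fingerprint_db
        for fragment in split_into_fragments(mail_content)
    )
```

Because only hashes are compared, the database never needs to hold the sensitive text itself, provided it was built with the same fragmentation rule.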
In the outgoing mail processing system according to some embodiments of the present specification, the digital fingerprint in the digital fingerprint database is obtained in advance by:
when the file containing the sensitive data is structured data, performing a hash calculation on each row of sensitive data in the file to generate a digital fingerprint of that row.
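For structured data, a minimal sketch of per-row fingerprinting might look like the following. It assumes the structured file is CSV text, that cells are normalised by trimming and lowercasing, and that SHA-256 is the hash; none of these details come from the specification, and `fingerprint_rows` is a hypothetical name.

```python
import csv
import hashlib
import io
from typing import Set


def fingerprint_rows(csv_text: str, has_header: bool = True) -> Set[str]:
    """Return one SHA-256 fingerprint per non-empty data row."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # skip the column names
    fingerprints = set()
    for row in reader:
        if not row:
            continue  # ignore blank lines
        # Assumed normalisation: trim and lowercase each cell, then join the
        # cells with a separator that is unlikely to occur in the data.
        canonical = "\x1f".join(cell.strip().lower() for cell in row)
        fingerprints.add(hashlib.sha256(canonical.encode("utf-8")).hexdigest())
    return fingerprints
```

The resulting set can be stored directly as (part of) the digital fingerprint database.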
In the outgoing mail processing system according to some embodiments of the present specification, the digital fingerprint in the digital fingerprint database is obtained in advance by:
when the file containing the sensitive data is unstructured data, dividing the sensitive data into a plurality of data fragments according to the fragmentation rule, and performing a hash calculation on each data fragment to generate a digital fingerprint of each data fragment.
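Building the database from unstructured files could be sketched as below. The rule used here (collapse whitespace, then cut fixed-size character chunks) is an assumption for illustration, as is recording the source document name next to each fingerprint; `fragment_text` and `build_fingerprint_db` are hypothetical names.

```python
import hashlib
import re
from typing import Dict, List


def fragment_text(text: str, size: int = 32) -> List[str]:
    """Assumed rule: collapse whitespace, then cut fixed-size character chunks."""
    normalized = re.sub(r"\s+", " ", text).strip()
    return [normalized[i:i + size] for i in range(0, len(normalized), size)]


def build_fingerprint_db(documents: Dict[str, str], size: int = 32) -> Dict[str, str]:
    """Map each fragment fingerprint to the name of the document it came from."""
    db: Dict[str, str] = {}
    for name, text in documents.items():
        for fragment in fragment_text(text, size):
            digest = hashlib.sha256(fragment.encode("utf-8")).hexdigest()
            db[digest] = name
    return db
```

Fixed-offset chunks only match when the same text falls on the same chunk boundaries, so a deployment would likely prefer a more robust rule. The point of the sketch is that the database and the outgoing mail must be fragmented by the same rule for the fingerprints to be comparable.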
In the outgoing mail processing system according to some embodiments of the present specification, the triggering to audit the outgoing mail may include:
sending the outgoing mail matched with sensitive data to an office automation (OA) approval system for auditing, so that the OA approval system determines a mail sending strategy according to the approval result.
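The hand-off to an approval system might be modelled as in the sketch below. The specification does not describe the OA approval system's interface or the available sending strategies, so `OAApprovalStub`, `ApprovalTicket`, and the two policies (release or discard) are hypothetical stand-ins.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class ApprovalResult(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"


class SendPolicy(Enum):
    RELEASE = "release"  # hand the mail to the outgoing queue
    DISCARD = "discard"  # never send the mail


@dataclass
class ApprovalTicket:
    mail_id: str
    matched_rule: str  # why the mail was blocked, shown to the approver


class OAApprovalStub:
    """Hypothetical in-memory stand-in for an OA approval system."""

    def __init__(self) -> None:
        self.pending: Dict[str, ApprovalTicket] = {}

    def submit(self, ticket: ApprovalTicket) -> None:
        # Called by the processing device when a mail is blocked.
        self.pending[ticket.mail_id] = ticket

    def resolve(self, mail_id: str, result: ApprovalResult) -> SendPolicy:
        # Called when the approver decides; maps the result to a send policy.
        self.pending.pop(mail_id)
        if result is ApprovalResult.APPROVED:
            return SendPolicy.RELEASE
        return SendPolicy.DISCARD
```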
In the outgoing mail processing system according to some embodiments of the present specification, the performing outgoing processing may include:
adding the outgoing mails that are not matched with sensitive data to a mail outgoing queue for sending.
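A mail outgoing queue can be as simple as a FIFO that a sender drains. In this sketch the queue holds mail identifiers and the actual delivery is delegated to a caller-supplied `send` function; both choices, and the function names, are assumptions for illustration.

```python
import queue
from typing import Callable


def enqueue_clean_mail(outbox: "queue.Queue[str]", mail_id: str) -> None:
    """Add a mail that did not match sensitive data to the outgoing queue."""
    outbox.put(mail_id)


def drain(outbox: "queue.Queue[str]", send: Callable[[str], None]) -> int:
    """Send every queued mail in FIFO order and return how many were sent."""
    sent = 0
    while True:
        try:
            mail_id = outbox.get_nowait()
        except queue.Empty:
            return sent
        send(mail_id)  # delivery itself is outside the scope of this sketch
        outbox.task_done()
        sent += 1
```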
In the outgoing mail processing system of some embodiments of the present specification, the mail content may include: the recipient, the sender, the mail subject, the mail body and the mail attachments.
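Extracting these fields from a raw message can be sketched with Python's standard `email` package. The function name `extract_mail_content` and the shape of the returned dictionary are hypothetical; the specification does not prescribe a parsing library.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import Any, Dict


def extract_mail_content(raw: bytes) -> Dict[str, Any]:
    """Parse a raw RFC 5322 message into the fields used for matching."""
    # With policy.default the parser returns an EmailMessage instance.
    msg: EmailMessage = message_from_bytes(raw, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain", "html"))
    attachments = [
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]
    return {
        "recipients": [str(value) for value in msg.get_all("To", [])],
        "sender": str(msg.get("From", "")),
        "subject": str(msg.get("Subject", "")),
        "body": body_part.get_content() if body_part is not None else "",
        "attachments": attachments,  # (filename, decoded bytes) pairs
    }
```

Attachment bytes would normally need a further text-extraction step (for example for office documents) before matching; that step is omitted here.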
In an embodiment of this specification, the user terminal may be a desktop computer, a tablet computer, a notebook computer, a smart phone, a digital assistant, or the like. Of course, the user terminal is not limited to a physical electronic device; it may also be software running on an electronic device.
In an embodiment of this specification, the mail server may be an electronic device with computing and network interaction functions; it may also be software that runs on such an electronic device and provides business logic for data processing and network interaction. The mail server can receive communication messages sent by the user terminal and send communication messages to the user terminal.
In an embodiment of the present specification, the mail server may be a mail sending server, and in another embodiment of the present specification, the mail server may also be a mail receiving/sending server (i.e., a server having a mail sending/receiving management function).
Referring to fig. 3, in some embodiments of the present specification, the mail outgoing processing device may be in a virtualized form, that is, the mail outgoing processing device may be a software module. In this case, the mail outgoing processing apparatus may include:
an extracting module 31, which can be used to extract the mail content of the outgoing mail;
a matching module 32, which can be used for performing sensitive data matching on the mail content;
a blocking module 33, which can be used to block the outgoing mail matched with sensitive data from being sent and to trigger an audit of the outgoing mail.
In other embodiments of the present disclosure, as shown in fig. 3, the mail outgoing processing device in the virtualized form may further include an outgoing module 34. The outbound module 34 may be used to outbound process outbound mail that does not match sensitive data.
For convenience of description, the above device is described as being divided into various modules by function, each described separately. Of course, when implementing this specification, the functions of the modules may be implemented in the same one or more pieces of software and/or hardware.
In some embodiments of the present disclosure, the mail outgoing processing device may also be a combination of hardware and software, that is, the mail outgoing processing device may be a computer device combining hardware and software. For example, as shown in fig. 4, the computer device 402 may include one or more processors 404, such as one or more Central Processing Units (CPUs) or Graphics Processors (GPUs), each of which may implement one or more hardware threads. The computer device 402 may also comprise any memory 406 for storing any kind of information, such as code, settings, data, etc., and in a particular embodiment a computer program stored in the memory 406 and executable on the processor 404, which computer program, when executed by the processor 404, may perform the instructions of the above-described method. For example, and without limitation, memory 406 may include any one or more of the following in combination: any type of RAM, any type of ROM, flash memory devices, hard disks, optical disks, etc. More generally, any memory may use any technology to store information. Further, any memory may provide volatile or non-volatile retention of information. Further, any memory may represent fixed or removable components of computer device 402. In one case, when the processor 404 executes the associated instructions, which are stored in any memory or combination of memories, the computer device 402 can perform any of the operations of the associated instructions. The computer device 402 also includes one or more drive mechanisms 408, such as a hard disk drive mechanism, an optical disk drive mechanism, etc., for interacting with any memory.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The embodiments of this specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The described embodiments may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner; the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and for the relevant points, reference may be made to the corresponding description of the method embodiment. In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the specification. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined by one skilled in the art without contradiction.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.
Claims (15)
1. A mail outgoing processing method is characterized by comprising the following steps:
extracting the mail content of the outgoing mail;
sensitive data matching is carried out on the mail content;
and for the outgoing mails matched with the sensitive data, blocking the outgoing mails from being sent and triggering the outgoing mails to be audited.
2. The mail outgoing processing method according to claim 1, further comprising:
and carrying out outgoing processing on outgoing mails which are not matched with the sensitive data.
3. The method of claim 1, wherein the matching sensitive data to the mail content comprises:
extracting feature data of the mail content;
and inputting the characteristic data into a sensitive mail prediction model obtained based on machine learning model training so as to identify whether the outgoing mail corresponding to the mail content is the outgoing mail containing the sensitive data.
4. The method of claim 1, wherein the matching sensitive data to the mail content comprises:
dividing the mail content into a plurality of data segments according to a preset fragmentation rule;
performing a hash calculation on each data segment to generate a digital fingerprint of each data segment;
matching the digital fingerprint with a digital fingerprint in a digital fingerprint database to identify whether an outgoing mail corresponding to the mail content is an outgoing mail containing sensitive data;
and the digital fingerprints in the digital fingerprint database are digital fingerprints corresponding to the sensitive data.
5. The outgoing mail processing method as set forth in claim 4, wherein the digital fingerprint in the digital fingerprint database is obtained in advance by:
when the file containing the sensitive data is structured data, performing hash calculation on each line of the sensitive data of the file to generate a digital fingerprint of the line of the sensitive data.
6. The outgoing mail processing method as set forth in claim 4, wherein the digital fingerprint in the digital fingerprint database is obtained in advance by:
when the file containing the sensitive data is unstructured data, the sensitive data is divided into a plurality of data fragments according to the fragmentation rule, and hash calculation is carried out on each data fragment to generate a digital fingerprint of each data fragment.
7. The method of claim 1, wherein the triggering of the audit process comprises:
and sending the outgoing mail matched with the sensitive data to an Office Automation (OA) approval system for auditing so that the OA approval system determines a mail sending strategy according to an approval result.
8. The method of outgoing mail processing according to claim 2, wherein said outgoing processing includes:
and adding the outgoing mails which are not matched with the sensitive data into a mail outgoing queue to send the outgoing mails.
9. The mail outgoing processing method according to claim 1, wherein the mail contents include: the receiver, the sender, the mail subject, the mail body and the mail attachment.
10. An outgoing mail processing apparatus, comprising:
the extraction module is used for extracting the mail content of the outgoing mail;
the matching module is used for matching sensitive data of the mail content;
and the blocking module is used for blocking the sending of the outgoing mails matched with the sensitive data and triggering the outgoing mails to be audited.
11. The mail outgoing processing device according to claim 10, wherein said mail outgoing processing device further comprises:
and the outgoing module is used for carrying out outgoing processing on the outgoing mails which are not matched with the sensitive data.
12. The mail outgoing processing system is characterized by comprising a mail outgoing processing device and a mail server, wherein the mail outgoing processing device is the next hop of an outgoing mail route of the mail server;
the mail outgoing processing device is configured to:
extracting the mail content of the outgoing mail;
sensitive data matching is carried out on the mail content;
and for the outgoing mails matched with the sensitive data, blocking the outgoing mails from being sent and triggering the outgoing mails to be audited.
13. The mail outgoing processing system of claim 12, wherein said mail outgoing processing device is in a software and hardware integrated form or a virtualized form.
14. A mail outgoing processing device comprising a memory, a processor, and a computer program stored on the memory, characterized in that the computer program, when executed by the processor, executes the instructions of the method according to any one of claims 1-9.
15. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor of a computer device, executes instructions of a method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011039188.2A CN112101917A (en) | 2020-09-28 | 2020-09-28 | Mail outgoing processing method, device, system and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011039188.2A CN112101917A (en) | 2020-09-28 | 2020-09-28 | Mail outgoing processing method, device, system and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112101917A (en) | 2020-12-18 |
Family
ID=73783524
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011039188.2A Pending CN112101917A (en) | 2020-09-28 | 2020-09-28 | Mail outgoing processing method, device, system and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112101917A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113872852A (en) * | 2021-09-29 | 2021-12-31 | 平安科技(深圳)有限公司 | Outgoing mail monitoring method and device, electronic equipment and storage medium |
CN113992621A (en) * | 2021-09-08 | 2022-01-28 | 厦门天锐科技股份有限公司 | System and method for mail outgoing examination and approval |
CN114840871A (en) * | 2022-04-06 | 2022-08-02 | 北京蓝海在线科技有限公司 | Data desensitization method and device, electronic equipment and storage medium |
CN115834524A (en) * | 2022-11-18 | 2023-03-21 | 中国建设银行股份有限公司湖南省分行 | System and method for sending out bank intranet mails |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160301693A1 (en) * | 2015-04-10 | 2016-10-13 | Maxim Nikulin | System and method for identifying and protecting sensitive data using client file digital fingerprint |
CN107911277A (en) * | 2017-09-29 | 2018-04-13 | 北京明朝万达科技股份有限公司 | A kind of outgoing mail auditing method and system based on machine learning |
CN108600081A (en) * | 2018-03-26 | 2018-09-28 | 北京明朝万达科技股份有限公司 | A kind of method and device that mail outgoing achieves, Mail Gateway |
CN111310205A (en) * | 2020-02-11 | 2020-06-19 | 平安科技(深圳)有限公司 | Sensitive information detection method and device, computer equipment and storage medium |
2020-09-28: CN application CN202011039188.2A, patent/CN112101917A/en, active, Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160301693A1 (en) * | 2015-04-10 | 2016-10-13 | Maxim Nikulin | System and method for identifying and protecting sensitive data using client file digital fingerprint |
CN107911277A (en) * | 2017-09-29 | 2018-04-13 | 北京明朝万达科技股份有限公司 | A kind of outgoing mail auditing method and system based on machine learning |
CN108600081A (en) * | 2018-03-26 | 2018-09-28 | 北京明朝万达科技股份有限公司 | A kind of method and device that mail outgoing achieves, Mail Gateway |
CN111310205A (en) * | 2020-02-11 | 2020-06-19 | 平安科技(深圳)有限公司 | Sensitive information detection method and device, computer equipment and storage medium |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113992621A (en) * | 2021-09-08 | 2022-01-28 | 厦门天锐科技股份有限公司 | System and method for mail outgoing examination and approval |
CN113872852A (en) * | 2021-09-29 | 2021-12-31 | 平安科技(深圳)有限公司 | Outgoing mail monitoring method and device, electronic equipment and storage medium |
CN113872852B (en) * | 2021-09-29 | 2022-11-22 | 平安科技(深圳)有限公司 | Outgoing mail monitoring method and device, electronic equipment and storage medium |
CN114840871A (en) * | 2022-04-06 | 2022-08-02 | 北京蓝海在线科技有限公司 | Data desensitization method and device, electronic equipment and storage medium |
CN115834524A (en) * | 2022-11-18 | 2023-03-21 | 中国建设银行股份有限公司湖南省分行 | System and method for sending out bank intranet mails |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||