KR102259789B1

KR102259789B1 - Method and apparatus for filtering of outgoing and incoming spam mail

Info

Publication number: KR102259789B1
Application number: KR1020200022369A
Authority: KR
Inventors: 김성환; 오충용
Original assignee: 삼정데이타서비스 주식회사
Priority date: 2020-02-24
Filing date: 2020-02-24
Publication date: 2021-06-02

Abstract

The present invention relates to a method of operating a server operated by at least one processor. The method includes: receiving, from a blockchain network, filtering rules for determining spam mail; receiving a mail from an arbitrary mail server; determining whether the mail corresponds to spam mail by using an outgoing mail filtering rule or an incoming mail filtering rule included in the filtering rules; and, when the mail corresponds to spam mail, blocking the sending or receiving of the mail and storing information of the mail in the blockchain network, wherein the information of the mail is provided to a plurality of mail servers connected to the blockchain network. According to the present invention, illegal use of an account and leakage of company secrets can be prevented by checking, before a mail is sent, whether its content is appropriate from a security standpoint.

Description

Spam filtering method and device for outgoing mail and incoming mail {METHOD AND APPARATUS FOR FILTERING OF OUTGOING AND INCOMING SPAM MAIL}

The present invention relates to a technique for filtering spam in outgoing mail and incoming mail.

Spam refers to advertising letters or messages sent to an unspecified large number of people by e-mail, text message, telephone, and the like. To block spam, conventional spam filtering systems inspect the components of an incoming mail, such as its subject, body, and attachments, and block advertising mail, phishing mail, and malicious programs such as viruses and malware, thereby protecting the user's system.

However, because conventional spam filtering technology filters based on the text of a mail, it is limited in how accurately it can filter mail that contains spam images.

In addition, because the spam filtering system is applied only to incoming mail, it cannot prevent the damage that occurs when an e-mail account is stolen and internal information is leaked. Nor can it stop a spammer from sending advertising mail to a large number of recipients in violation of the regulations governing such mail.

Therefore, in addition to a filtering method that analyzes images contained in incoming mail, a method is required that checks, at the sending stage, whether the mail to be sent is spam, whether it is being sent from a stolen account, and whether it is an inappropriate mail that leaks corporate information, so that only appropriate mail is sent.

Furthermore, if spam filtering is performed by a single server, the spam filtering service may be interrupted when the server fails or cannot be recovered, and filtering may be inadequate for spam mail that the server has not previously encountered. Accordingly, spam filtering rules need to be shared with the member nodes of a blockchain network so that the spam filtering service can be provided safely and continuously, even for spam mail for which the server has not generated a filtering rule.

An object to be achieved is to provide a method that determines, before a mail is sent, whether the mail is inappropriate from a security standpoint or is spam, and blocks the sending of inappropriate mail in advance.

Another object is to provide a method of filtering incoming mail by checking whether an image included in the mail is spam.

Another object is to provide a method of sharing spam filtering rules through a blockchain network.

According to one embodiment, a method of operating a server operated by at least one processor includes: receiving, from a blockchain network, filtering rules for determining spam mail; receiving a mail from an arbitrary mail server; determining whether the mail corresponds to spam mail by using an outgoing mail filtering rule or an incoming mail filtering rule included in the filtering rules; and, if it does, blocking the sending or receiving of the mail and storing information of the mail in the blockchain network, wherein the information of the mail is provided to a plurality of mail servers connected to the blockchain network.

The outgoing mail filtering rule may include at least one of: a first filtering rule that checks whether the mail contains personal information that is not permitted to be leaked; a second filtering rule that checks whether the recipient's mail server blocks the account of the mail; a third filtering rule that checks whether an image included in the body or an attachment of the mail contains a spam phrase; and a fourth filtering rule that checks whether the mail differs from a pre-stored mail sending pattern.

The third filtering rule may extract an image included in the body or an attachment of the mail, extract a text region from the image, calculate the similarity between the extracted text and the spam phrase included in the filtering rules, and determine the mail to be spam mail when the similarity exceeds a preset reference value.
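The similarity comparison in the third filtering rule can be illustrated with a minimal Python sketch. The patent does not specify how text is extracted from an image or which similarity measure is used, so this sketch assumes the text has already been extracted (for example by an OCR step that is not shown) and uses the standard library's `difflib.SequenceMatcher` ratio as a stand-in measure. The function names and the default threshold of 0.8 are hypothetical.

```python
from difflib import SequenceMatcher


def phrase_similarity(extracted_text: str, spam_phrase: str) -> float:
    """Return a similarity ratio in [0, 1] after normalizing case and whitespace."""
    a = " ".join(extracted_text.lower().split())
    b = " ".join(spam_phrase.lower().split())
    return SequenceMatcher(None, a, b).ratio()


def is_image_spam(image_texts, spam_phrases, threshold=0.8):
    """Flag the mail when any extracted text exceeds the reference value.

    image_texts: strings already extracted from the mail's images (assumed input).
    spam_phrases: spam phrases taken from the shared filtering rules.
    threshold: the preset reference value (hypothetical default).
    """
    return any(
        phrase_similarity(text, phrase) > threshold
        for text in image_texts
        for phrase in spam_phrases
    )
```

The comparison is strict (`>`), matching the wording that the mail is treated as spam when the similarity exceeds the reference value.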

The preset reference value may be a value provided by any one of the plurality of mail servers.

The incoming mail filtering rule may include at least one of a filtering rule that checks whether an image included in the body or an attachment of the mail contains a spam phrase and a filtering rule that checks whether the mail contains a malicious program.

The receiving of the filtering rules may include receiving filtering rules used by the plurality of mail servers, or mails determined to be spam mail by the plurality of mail servers.

The storing may include transmitting, to an administrator of the server, at least one of the sender's mail address, the recipient's mail address, and the reason the mail was determined to be spam mail.

According to another embodiment, a method of operating a server operated by at least one processor includes: receiving, from a blockchain network, a filtering rule for determining spam mail or dangerous mail that violates a security policy; receiving an outgoing mail from an arbitrary mail server; determining, by using the filtering rule, whether the outgoing mail corresponds to spam mail or dangerous mail; and, if it does, storing information of the outgoing mail in the blockchain network so that the information is provided to a plurality of mail servers connected to the blockchain network.

The method may further include: checking whether the number of recipients of the outgoing mail exceeds a predetermined reference value; if it does, checking whether the subject or body of the outgoing mail includes first information indicating that the outgoing mail is advertising mail, second information including a method of refusing to receive the outgoing mail, and third information including the fact that the recipient has consented to receive the outgoing mail; and determining the outgoing mail to be dangerous mail if any one of the first to third information is not included.
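The bulk-sending check above can be sketched as follows. How the three pieces of information are detected in the subject or body is not described here, so the sketch assumes they have already been detected and are available as boolean flags. The class name, field names, and the default limit of 100 recipients are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class OutgoingMail:
    recipients: list = field(default_factory=list)
    has_ad_notice: bool = False          # first information: marked as advertising
    has_opt_out_method: bool = False     # second information: how to refuse receipt
    has_recipient_consent: bool = False  # third information: recipient consented


def is_dangerous_bulk_mail(mail: OutgoingMail, max_recipients: int = 100) -> bool:
    """Return True when a bulk mail lacks any of the three required items."""
    if len(mail.recipients) <= max_recipients:
        return False  # not bulk, so this rule does not apply
    return not (
        mail.has_ad_notice
        and mail.has_opt_out_method
        and mail.has_recipient_consent
    )
```

A mail below the recipient limit is never flagged by this rule, regardless of the flags.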

The determining may include: extracting text from the subject, body, and attachments of the outgoing mail; extracting from the text personal information including at least one of a resident registration number, a mobile phone number, a landline phone number, and an e-mail address; determining whether the personal information corresponds to an item whose leakage is prohibited under the security policy; and determining the outgoing mail to be dangerous mail if the personal information corresponds to such a prohibited item.
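A minimal sketch of the personal-information step, using Python's `re` module, is shown below. The regular expressions are illustrative assumptions about common Korean number formats (hyphen-separated) and are not taken from the patent; a production system would need broader patterns. The function names and the category keys are hypothetical.

```python
import re

# Illustrative patterns only; real-world formats vary more widely.
PATTERNS = {
    "resident_registration_number": re.compile(r"\b\d{6}-\d{7}\b"),
    "mobile_phone": re.compile(r"\b01[016789]-\d{3,4}-\d{4}\b"),
    "landline_phone": re.compile(r"\b0(?:2|[3-6]\d)-\d{3,4}-\d{4}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def extract_personal_info(text: str) -> dict:
    """Return {category: [matches]} for each category found in the text."""
    found = {}
    for kind, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[kind] = matches
    return found


def is_dangerous_mail(text: str, prohibited_kinds: set) -> bool:
    """Flag the mail if it contains any category the security policy prohibits."""
    return any(kind in prohibited_kinds for kind in extract_personal_info(text))
```

Here `text` stands for the concatenated subject, body, and attachment text, and `prohibited_kinds` stands for the leak-prohibited items in the security policy.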

The determining may include: sending a test mail from the account of the outgoing mail to a test account on the recipient mail server of the outgoing mail; and, if the test mail is sent successfully, transmitting the outgoing mail to the recipient's mail account, or, if sending the test mail fails, requesting the recipient mail server to unblock the account of the outgoing mail.
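The test-mail step can be sketched in Python as below. The transport and the unblock request are passed in as callables because the patent does not specify how either is performed. The names `send`, `request_unblock`, and `test_account`, and the dictionary layout of a mail, are hypothetical.

```python
def deliver_with_test_mail(mail, send, request_unblock, test_account):
    """Probe the recipient server with a test mail before sending the real one.

    mail: dict with "sender", "recipient", and "body" keys (assumed layout).
    send: callable(mail_dict) -> bool, True when delivery succeeds.
    request_unblock: callable(sender_account), asks the recipient server to unblock.
    test_account: address of the test account on the recipient mail server.
    """
    test_mail = {"sender": mail["sender"], "recipient": test_account, "body": "test"}
    if send(test_mail):
        send(mail)  # the sender account is not blocked, so deliver the real mail
        return "sent"
    request_unblock(mail["sender"])
    return "unblock_requested"
```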

The determining may include checking whether the outgoing mail was sent in a pattern different from a pre-stored mail sending pattern, wherein the pre-stored mail sending pattern may be registered by the sender of the outgoing mail or may be extracted from patterns of access to the arbitrary mail server.

The checking may include: receiving a mail server access condition that includes at least one of location information, an IP address, and a MAC address used to access the arbitrary mail server; extracting the location information, IP address, and MAC address at the time the outgoing mail is received; and determining the outgoing mail to be dangerous mail if the extracted information differs from the mail server access condition.
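The access-condition comparison can be sketched as follows. Because the registered condition includes "at least one of" the three items, this sketch assumes that items left unset in the registered condition are ignored; that handling, and all names, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessCondition:
    location: Optional[str] = None
    ip_address: Optional[str] = None
    mac_address: Optional[str] = None


def violates_access_condition(registered: AccessCondition,
                              observed: AccessCondition) -> bool:
    """Return True if any registered item differs from the observed value."""
    for name in ("location", "ip_address", "mac_address"):
        expected = getattr(registered, name)
        # Items the sender did not register are skipped (assumption).
        if expected is not None and expected != getattr(observed, name):
            return True
    return False
```

A mail for which this returns `True` would be treated as dangerous mail under the rule described above.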

According to one embodiment, a computing device includes a memory and at least one processor that executes instructions of a program loaded into the memory, wherein the program includes instructions written to execute: receiving a mail from an arbitrary mail server; determining, according to a filtering rule, whether the mail corresponds to spam mail; and, if it does, storing the mail or information of the mail in a blockchain network, and wherein the information of the mail is provided to a plurality of mail servers connected to the blockchain network and applied to the spam filtering rules of each mail server.

The storing may include, when a transaction generated from a hash value of the information of the mail is verified by the plurality of mail servers, storing the verified transaction in the blockchain network.

According to the present invention, mail is filtered not only when it is received but also before it is sent, to check whether its content is appropriate from a security standpoint. This prevents the damage caused when an account is stolen and company secrets are leaked, and, when mail is sent in bulk, allows a check in advance of whether the mail violates the relevant laws and regulations.

In addition, according to the present invention, member nodes connected to the blockchain network share information about spam filtering rules, so spam filtering performance can be improved even for a server that has little training data or has been newly created.

FIG. 1 is an explanatory diagram of a spam filtering system according to an embodiment.
FIG. 2 is an explanatory diagram of a method of operating a spam filtering server according to an embodiment.
FIG. 3 is a flowchart of a method of operating a spam filtering server according to an embodiment.
FIG. 4 is a configuration diagram of a spam filtering server according to an embodiment.
FIG. 5 is a flowchart of a spam filtering method according to an embodiment.
FIG. 6 is a flowchart of a spam filtering method according to another embodiment.
FIG. 7 is a flowchart of a spam filtering method according to another embodiment.
FIG. 8 is a flowchart of a spam filtering method according to another embodiment.
FIG. 9 is an explanatory diagram of a spam filtering method according to another embodiment.
FIG. 10 is a flowchart of a spam filtering method according to another embodiment.
FIG. 11 is a hardware configuration diagram of a computing device according to an embodiment.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice it. However, the present invention may be implemented in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted in order to describe the present invention clearly, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise. In addition, terms such as "...unit", "...er", and "module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.

In this specification, spam mail means advertising mail that is sent to, or received by, an unspecified large number of people. Accordingly, an outgoing mail may be spam mail, and an incoming mail may be spam mail. For convenience, it may be referred to as "spam".

In this specification, dangerous mail means mail suspected of leaking information about the user or about the company to which the user belongs, mail that violates message transmission rules set by an administrator, or mail judged to have been sent by a thief using a stolen user account. An outgoing mail may be a dangerous mail.

In this specification, a filtering rule may mean a threshold or reference value used in each filtering method. For example, where the value is calculated as a similarity, the rule may mean a threshold; it may also mean specific text or a specific image preset by the administrator of the spam filtering server 100.

FIG. 1 is an explanatory diagram of a spam filtering system according to an embodiment.

Referring to FIG. 1, the spam filtering system 1000 includes a spam filtering server 100, which determines the appropriateness of outgoing mail sent from a user to an external mail server and of incoming mail received from an external mail server, and a blockchain network 200, which shares spam filtering rules.

The spam filtering system 1000 blocks the sending of spam mail or dangerous mail and the receiving of spam mail. The spam filtering system 1000 may be implemented in any mail server.

The spam filtering server 100 includes an outgoing filtering unit 110 that determines whether an outgoing mail is spam or inappropriate, an incoming filtering unit 120 that determines whether an incoming mail is spam, and a filtering rule management unit 130 that stores filtering rules updated according to filtering results in the blockchain network 200.

For convenience of description these are named the outgoing filtering unit 110, the incoming filtering unit 120, and the filtering rule management unit 130, but they are computing devices operated by at least one processor. The outgoing filtering unit 110, the incoming filtering unit 120, and the filtering rule management unit 130 may be implemented in a single computing device or distributed across separate computing devices. When distributed across separate computing devices, they may communicate with one another through a communication interface. Any device capable of executing a software program written to carry out the present invention suffices as the computing device; for example, it may be a device capable of sending and receiving e-mail, such as a server, a laptop computer, or a device with a built-in Internet browser.

The spam filtering server 100 determines whether a mail the user is about to send is spam regardless of the user's intention, whether it violates the confidentiality policy of the company if the user belongs to one, and whether it differs from the user's usual mail sending pattern and is therefore suspected of account theft, and blocks the sending of inappropriate mail in advance.

The outgoing filtering unit 110 checks outgoing mail against an RBL filtering rule, a security filtering rule, a bulk sending filtering rule, a body filtering rule, an image filtering rule, and an account theft check rule. If the mail matches any one of the rules, it is determined to be spam mail or dangerous mail and its sending is blocked.
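The "block if any rule matches" behavior of the outgoing filtering unit 110 can be sketched as a short loop over named rules. The rule predicates below are placeholders, since the individual filtering methods are described with reference to the later figures; the function name and the dictionary layout of a mail are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def check_outgoing(mail: dict,
                   rules: List[Tuple[str, Callable[[dict], bool]]]) -> Optional[str]:
    """Return the name of the first rule the mail matches, or None if it passes.

    rules: ordered (name, predicate) pairs; each predicate returns True when
    the mail should be blocked under that rule.
    """
    for name, matches in rules:
        if matches(mail):
            return name  # a single match is enough to block sending
    return None


# Placeholder predicates standing in for the real filtering methods.
example_rules = [
    ("bulk_sending", lambda m: len(m["to"]) > 2),
    ("body", lambda m: "free prize" in m["body"].lower()),
]
```

Returning the matched rule name, rather than a bare boolean, keeps the reason for the block available, which the text elsewhere says may be reported to the administrator.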

Mail whose sending is blocked may be moved to the user's spam mailbox. The filtering method for each rule is described in detail with reference to FIGS. 4 to 10.

The incoming filtering unit 120 checks incoming mail against a body filtering rule, an image filtering rule, and a malicious program filtering rule. If the mail matches any one of the rules, the incoming mail is determined to be spam mail and may be moved to the user's spam mailbox.

The incoming filtering unit 120 may also register the account of an incoming mail in the spam filtering server 100.

The incoming filtering unit 120 requests authentication from the account of the incoming mail; if authentication succeeds, it registers the account and then proceeds with the filtering rules. If authentication fails, the incoming mail is determined to be spam mail without going through the spam filtering rules and may be moved to the user's spam mailbox. The account information and authentication history of incoming mail that has been successfully authenticated may be stored in the blockchain network 200.
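The authenticate-then-filter flow for incoming mail can be sketched as below. The authentication mechanism is not specified, so it is passed in as a callable. Skipping authentication for accounts that are already registered is an assumption of this sketch, as are all names.

```python
def handle_incoming(sender, mail, registered, authenticate, matches_spam_rule):
    """Return "spam" or "inbox" for an incoming mail.

    registered: set of sender accounts that have already been authenticated.
    authenticate: callable(sender) -> bool (mechanism not specified in the text).
    matches_spam_rule: callable(mail) -> bool, True if any filtering rule matches.
    """
    if sender not in registered:  # assumption: registered accounts are not re-checked
        if not authenticate(sender):
            return "spam"         # failed authentication bypasses the filtering rules
        registered.add(sender)
    return "spam" if matches_spam_rule(mail) else "inbox"
```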

In this way, mail arriving from an unspecified large number of unauthenticated senders can be treated as spam all at once, which reduces the efficiency of those who send spam. The method of filtering incoming mail under each rule is described in detail with reference to FIGS. 4 to 10.

Each rule used by the outgoing filtering unit 110 and the incoming filtering unit 120 may be implemented as a rule-based system or as a machine learning or deep learning model.

The filtering rule management unit 130 stores the rules used by the outgoing filtering unit 110 and the incoming filtering unit 120 in the blockchain network 200, and may receive rules or filtering results that were generated by other spam filtering servers 100 and stored in the blockchain network 200. It may also store or receive mail determined to be spam. In this case, the mail may be the original text itself or a form transformed by a hash algorithm.

Therefore, even a general mail server whose filtering data is not shared and whose spam filtering rules are fixed, or a mail server with a low volume of sent and received mail, can extend its spam filtering rules, because all mail servers participating as nodes of the blockchain network 200 share data.

When a new spam filtering server 100 is connected to the blockchain network 200, the filtering rule management unit 130 may provide it with the filtering rules of the existing spam filtering servers 100. The new spam filtering server 100 may filter outgoing or incoming mail according to the provided filtering rules or according to filtering rules set by its administrator. It may apply all of the provided filtering rules, or only some filtering rules selected by the administrator. This may be determined according to the characteristics of each filtering method.

When a filtering rule is updated by the administrator of the spam filtering server 100, the filtering rule management unit 130 may store the changed filtering rule in the blockchain network 200. A filtering rule may be updated by a change in the administrator's settings, but the cases in which a rule changes are not limited to this.

When each filtering rule is implemented as a rule-based system, the plurality of spam filtering servers 100 connected to the blockchain network 200 may update their own filtering rules to reflect the newly stored filtering rule. They may also generate new filtering rules.

When each filtering rule is implemented by machine learning or deep learning, the filtering rule management unit 130 stores filtering results in the blockchain network 200. Each spam filtering server 100 may then learn from the filtering results and change its rules on its own.

A user or administrator may choose whether to use the function by which the filtering rule management unit 130 receives filtering rules from other spam filtering server 100 nodes. For example, the filtering rules of the spam filtering server 100 in use may be stored in the blockchain network 200 while filtering rules provided by other nodes are not applied to that spam filtering server 100.

The blockchain network 200 is a kind of distributed database composed of a plurality of blocks and is connected to a plurality of spam filtering servers 100.

The blockchain network 200 stores the filtering rules by which each spam filtering server 100 determines spam mail or dangerous mail. The member nodes constituting the blockchain network 200 may share the stored filtering rules so that the outgoing filtering unit 110 or the incoming filtering unit 120 of any spam filtering server 100 can use them. In this way, a spam filtering server 100 with a low volume of sent or received mail, or a newly created spam filtering server 100, can also apply the filtering rules.

The spam filtering server 100 and the blockchain network 200 may be connected through an application programming interface (API) that serves as a gateway.

The filtering rules stored in the blockchain network 200 are data that can be shared among nodes and may be referred to as on-chain, while the filtering rules stored in each spam filtering server 100 may be referred to as a side chain.

FIG. 2 is an explanatory diagram of a method of operating a spam filtering server according to an embodiment.

Referring to FIG. 2, when a filtering rule of any of the spam filtering servers 100-1, 100-2, and 100-3 connected to the blockchain network 200 is changed, either by the server's administrator or by the server's artificial-intelligence implementation of the rule, the server transmits the change to adjacent nodes, and the adjacent nodes generate a transaction for the change and carry out an endorsement procedure for it.

For example, when a filtering rule changes, the filtering rule management unit 130 of spam filtering server 1 (100-1) generates a transaction from the updated content and delivers it to spam filtering server 2 (100-2), the nearest node. Spam filtering server 2 (100-2) endorses the transaction and forwards it to spam filtering server 3 (100-3), an adjacent node. When spam filtering server 3 (100-3) completes its endorsement, the transaction is recorded in the blockchain network 200, and the updated filtering rule may be distributed to all nodes connected to the blockchain network 200.
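The hop-by-hop endorsement in this example can be sketched as below. This is only an in-memory illustration of the ordering of steps; it is not a blockchain implementation, and the endorsement criterion, the list used as a ledger, and all names are hypothetical.

```python
def propagate_rule_update(update, origin, peers, endorse, ledger):
    """Pass a rule-update transaction through each peer for endorsement.

    update: the changed filtering rule content.
    origin: identifier of the server that created the transaction.
    peers: peer identifiers in forwarding order (nearest node first).
    endorse: callable(peer, transaction) -> bool (criterion not specified in the text).
    ledger: list standing in for the shared blockchain record.
    Returns True if every peer endorsed and the transaction was recorded.
    """
    transaction = {"origin": origin, "update": update, "endorsed_by": []}
    for peer in peers:
        if not endorse(peer, transaction):
            return False  # endorsement failed, so nothing is recorded
        transaction["endorsed_by"].append(peer)
    ledger.append(transaction)
    return True
```

In the example above, `origin` would correspond to spam filtering server 1 and `peers` to spam filtering servers 2 and 3, in that order.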

Meanwhile, a mail classified as spam mail or dangerous mail by the spam filtering server 100 may itself be stored in the blockchain network 200. For an outgoing mail, a hash value may be generated from the sender's mail address, the time the mail was sent, the sender's IP address, and the node ID of the spam filtering server 100. The generated hash value may be stored in each node constituting the blockchain network 200.

For an incoming mail, a hash value is likewise generated from the mail address of the sender of the incoming mail, the time the mail was sent, the sender's IP address, and the node ID of the spam filtering server 100, and the hash value may be stored in each node constituting the blockchain network 200.

The stored hash values may be used when each spam filtering server 100 creates or updates filtering rules.
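One way to derive such a hash value is sketched below. The description does not name a hash function or a field encoding, so SHA-256 and the `|` separator are assumptions, as are the function name `mail_hash` and the sample values.

```python
import hashlib


def mail_hash(sender_address, sent_time, sender_ip, node_id):
    """Hash the four fields named in the description into one hex digest."""
    # Assumption: fields are joined with "|" and hashed with SHA-256.
    payload = "|".join([sender_address, sent_time, sender_ip, node_id])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


digest = mail_hash("sender@example.com", "2020-02-24T09:00:00Z",
                   "203.0.113.7", "node-100-1")
```

Because the node ID is part of the input, the same mail reported by two different spam filtering servers yields two different hash values.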

FIG. 3 is a flowchart of a method of operating a spam filtering server according to an embodiment.

Referring to FIG. 3, the spam filtering server 100 receives a mail from a user or from an external mail server (S101).

The spam filtering server 100 classifies the received mail as an incoming mail or an outgoing mail (S102).

For an incoming mail, the incoming filtering unit 120 filters the mail according to each rule (S103). The body filtering rule, the image filtering rule, and the malicious program filtering rule are applied, and the incoming mail may be determined to be spam if any one of them is matched.

Whether the mail is spam is determined by checking it against the reference value set for each rule (S104). The reference value of each rule may be set by an administrator or may be set based on information received from the blockchain network 200. As an example, when the incoming filtering unit 120 is implemented with machine learning or deep learning, the reference value may be a decision boundary or a function rather than a number.

If the mail is determined to be normal, it is moved to the user's inbox; if it is determined to be spam, it is moved to the user's spam mailbox (S105, S106).

The spam filtering server 100 stores the spam filtering result in the blockchain network 200 (S107). As an example, when the outgoing filtering unit 110 or the incoming filtering unit 120 is implemented as a rule-based system, the similarity value calculated by each filtering rule, or the text or image determined to be spam, may be stored.

As another example, when the outgoing filtering unit 110 or the incoming filtering unit 120 is implemented with machine learning or deep learning, the mail itself, whether determined to be spam or normal, may be stored in the blockchain network 200.

Meanwhile, when an administrator changes the threshold or settings of a rule, the change may be stored in the blockchain network 200.

If the mail is classified as an outgoing mail in step S102, the server checks whether the number of recipients is equal to or greater than a predetermined reference number in order to determine whether the mail is a bulk mail (S108). The reference value for identifying a bulk mail may be changed by the administrator.

If the mail is a bulk mail, the outgoing filtering unit 110 filters it according to the bulk-sending filtering rule (S109). The bulk-sending filtering rule may include the laws or regulations of the country in which the mail is sent or received.

If the mail is not a bulk mail, the outgoing filtering unit 110 filters it according to the RBL filtering rule, the security filtering rule, the body filtering rule, the image filtering rule, and the account theft check rule (S110). The details of each rule are described with reference to FIGS. 4 to 10.

Whether the mail is a dangerous mail or spam is determined by checking it against the reference value set for each rule (S111).

If the mail is neither dangerous nor spam, it is sent to the recipient and then moved to the sent mailbox. If it is determined to be dangerous or spam, it is moved to the spam mailbox and the sender is notified that transmission failed (S112, S113).

Thereafter, the filtering rule management unit 130 stores the spam filtering result in the blockchain network 200 (S107).
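The branching of FIG. 3 can be summarized in a short sketch. The function names, the string labels for rules and mailboxes, and the default threshold of 30 (the example value the description gives for bulk mail) are illustrative assumptions; only the step numbers come from the description.

```python
BULK_RECIPIENT_THRESHOLD = 30  # example value; adjustable by the administrator


def select_rules(direction, recipient_count,
                 bulk_threshold=BULK_RECIPIENT_THRESHOLD):
    """Return the names of the filtering rules to apply to a mail."""
    if direction == "incoming":
        return ["body", "image", "malicious_program"]            # S103
    if recipient_count >= bulk_threshold:                        # S108
        return ["bulk_sending"]                                  # S109
    return ["rbl", "security", "body", "image", "account_theft"]  # S110


def route(direction, flagged):
    """Return what happens to the mail once the rules have been evaluated."""
    if direction == "incoming":
        return "spam_mailbox" if flagged else "inbox"            # S105, S106
    if flagged:
        return "spam_mailbox_and_notify_sender"                  # S113
    return "send_then_sent_mailbox"                              # S112
```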

FIG. 4 is a configuration diagram of a spam filtering server according to an embodiment.

Referring to FIG. 4, the spam filtering server 100 includes an outgoing filtering unit 110, an incoming filtering unit 120, and a filtering rule management unit 130. The outgoing filtering unit 110 and the incoming filtering unit 120 each include a plurality of filtering rules and apply the rules according to the type of mail to determine whether the mail is a spam mail or a dangerous mail.

As described with reference to FIG. 1, the outgoing filtering unit 110 and the incoming filtering unit 120 may be implemented as a rule-based system or with machine learning or deep learning.

The filtering rule management unit 130 stores the plurality of filtering rules in the blockchain network 200, obtains the filtering rules of other spam filtering servers 100 that are stored in the blockchain network 200, and applies them to the rules of the outgoing filtering unit 110 and the incoming filtering unit 120.

The following describes the RBL filtering rule, security filtering rule, bulk-sending filtering rule, body filtering rule, image filtering rule, and account theft check rule included in the outgoing filtering unit 110, and the body filtering rule, image filtering rule, and malicious program filtering rule included in the incoming filtering unit 120.

The security filtering rule, the account theft check rule, and the malicious program filtering rule are described first. The body filtering rule and the image filtering rule, which apply to both outgoing and incoming mail, are described next, followed by the RBL filtering rule and the bulk-sending filtering rule.

The security filtering rule checks whether the subject, body, or attachments of an outgoing mail contain personal information or content that must be kept secure. The items to be checked may be registered in advance by an administrator in order to prevent a user belonging to a company from leaking company information in an outgoing mail.

The rule may also check whether the mail contains content that constitutes a violation under the laws of the recipient's country. In this case, the security filtering rule may include criteria for evaluating the illegality of a mail for each country.

First, the spam filtering server 100 extracts the text of the subject, body, and attachments of the outgoing mail. It then checks whether the text contains character sequences that may carry personal information, such as a resident registration number or a mobile phone number, or security-sensitive content such as a URL address. For example, the detection codes in Table 1 may be used to detect resident registration numbers, mobile phone numbers, landline phone numbers, email addresses, and URL addresses in the extracted text. The items detected and the detection codes may vary depending on the recipient's country, and the implementation may vary depending on the tool used.

[Table 1]
Detected content                Detection code
Resident registration number    \b(?:[0-9]{2}(?:0[1-9]|1[0-2])(?:0[1-9]|[1,2][0-9]|3[0,1]))-[1-4][0-9]{6}\b
Mobile phone number             /^01([0|1|6|7|8|9]?)-?([0-9]{3,4})-?([0-9]{4})$/
Landline phone number           /^\d{3}-\d{3,4}-\d{4}$/
Email address                   /^[0-9a-zA-Z]([-_\.]?[0-9a-zA-Z])*@[0-9a-zA-Z]([-_\.]?[0-9a-zA-Z])*\.[a-zA-Z]{2,3}$/i
URL address                     ^(https?):\/\/([^:\/\s]+)(:([^\/]*))?((\/[^\s/\/]+)*)?\/?([^#\s\?]*)(\?([^#\s]*))?(#(\w*))?$

The spam filtering server 100 then determines whether the extracted resident registration numbers, mobile phone numbers, and so on correspond to content whose leakage is prohibited. The leak-prohibited content may be set in advance by an administrator or may be received from the blockchain network 200 by the filtering rule management unit 130.

If the outgoing mail contains leak-prohibited content, the spam filtering server 100 blocks the mail from being sent. It may send the administrator a report that includes the sender's mail address, the time at which sending was attempted, the recipient's mail address, and the reason for blocking. It may send the sender a delivery-failure notice that includes the reason for blocking.
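The detection step can be sketched with Python's `re` module. The patterns below are adapted from Table 1 rather than copied: the `^` and `$` anchors are replaced by word boundaries so that matches are found inside running text, and the stray commas and pipes inside the character classes are dropped. The dictionary keys and the function name `detect_sensitive` are illustrative assumptions.

```python
import re

# Patterns adapted from Table 1 (see the lead-in for the changes made).
PATTERNS = {
    "resident_registration_number": re.compile(
        r"\b\d{2}(?:0[1-9]|1[0-2])(?:0[1-9]|[12]\d|3[01])-[1-4]\d{6}\b"),
    "mobile_phone_number": re.compile(
        r"\b01[016789]-?\d{3,4}-?\d{4}\b"),
    "email_address": re.compile(
        r"[0-9A-Za-z](?:[-_.]?[0-9A-Za-z])*"
        r"@[0-9A-Za-z](?:[-_.]?[0-9A-Za-z])*\.[A-Za-z]{2,3}"),
}


def detect_sensitive(text):
    """Return every match found in text, grouped by pattern name."""
    found = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)  # no capture groups: full matches
        if matches:
            found[name] = matches
    return found


sample = "Contact 010-1234-5678 or kim@example.com, ID 900101-1234567"
hits = detect_sensitive(sample)
```

Whether a hit actually blocks the mail still depends on the leak-prohibited list; the sketch covers only the extraction step.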

The account theft check rule allows a mail to be sent only when the server access conditions set in advance by the user, such as location information, IP address, and MAC address, are satisfied.

First, the user registers server access conditions with the spam filtering server 100, including the location from which the user usually accesses the mail account and the IP and MAC addresses used to access it. The user may register these conditions directly, or the server may analyze the user's access pattern and automatically register the most frequent location, IP address, and MAC address as the server access pattern. The server access pattern may further include the system environment from which the mail is sent, such as the type of terminal, the operating system, and the browser.

Thereafter, when the spam filtering server 100 receives an outgoing mail, it compares the environment from which the mail is being sent with the user's registered server access conditions. Specifically, it compares the server access conditions set in advance by the user, such as location, IP address, and MAC address, with the information of the current send attempt to determine whether the access environment has changed.

If the access environment has changed, the spam filtering server 100 may restrict sending of the mail for a preset time and may deliver the reason for blocking to the user or the administrator.
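A minimal sketch of this comparison follows. The `AccessEnvironment` type, the function names, and the 600-second hold are hypothetical; the description says only that sending is restricted "for a preset time".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEnvironment:
    """Hypothetical container for one server access condition."""
    location: str
    ip: str
    mac: str


def environment_changed(registered, current):
    """True when the current environment matches no registered condition."""
    return current not in registered


def handle_outgoing(registered, current, hold_seconds=600):
    """Hold the mail if the access environment changed, else allow sending."""
    if environment_changed(registered, current):
        return {"action": "hold", "hold_seconds": hold_seconds,
                "reason": "access environment changed"}
    return {"action": "send"}


office = AccessEnvironment("Seoul", "203.0.113.7", "00:1A:2B:3C:4D:5E")
unknown = AccessEnvironment("Unknown", "198.51.100.9", "66:77:88:99:AA:BB")
registered = {office}
```

Calling `handle_outgoing(registered, unknown)` returns a hold decision, while `handle_outgoing(registered, office)` allows the mail to be sent.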

The malicious program filtering rule is intended to prevent the receipt or sending of programs that perform malicious acts without the user's consent, such as exposing personal information or causing fraudulent charges.

For a mail that includes an attachment, the spam filtering server 100 may run a known malicious program detection tool on the attachment, or may extract the header of the attached program and check whether it matches the characteristics of a malicious program. The characteristics of malicious programs may be set in advance by an administrator or received from the blockchain network 200. Since methods for detecting malicious programs are already known, a detailed description is omitted.

FIG. 5 is a flowchart of a spam filtering method according to an embodiment, FIG. 6 is a flowchart of a spam filtering method according to another embodiment, FIG. 7 is a flowchart of a spam filtering method according to yet another embodiment, FIG. 8 is a flowchart of a spam filtering method according to yet another embodiment, FIG. 9 is an explanatory diagram of a spam filtering method according to yet another embodiment, and FIG. 10 is a flowchart of a spam filtering method according to yet another embodiment.

Referring to FIG. 5, the spam filtering server 100 extracts the body text of the mail (S210).

The spam filtering server 100 calculates the similarity between the extracted text and the texts of a preset spam corpus (S220). A corpus is a collection of language samples gathered for a specific purpose in language research. It includes a large-scale language database, or natural-language sentences stored in computer-readable form together with information about them.

A spam corpus is a corpus that collects specific words found in spam mail. For example, it may include sentences such as "Designated driver service, we will serve you with care" (대리운전 정성껏 모시겠습니다), "interest-free loan" (무이자 대출), and "call us if you need cash" (현금 필요시 전화주세요). The spam corpus may be stored in the blockchain network 200 or in an external database (not shown).

The method of determining the similarity between words or sentences is not limited to any one method. For example, words or sentences may be converted into vectors and the similarity of the vectors calculated. Specifically, embedding vectors may be generated by a word embedding method, and the cosine similarity between the generated vectors may be calculated. The cosine similarity is a real value between -1 and 1. Since similarity calculation is a known technique, a detailed description is omitted.

If the calculated similarity is equal to or greater than a threshold, the spam filtering server 100 determines that the mail is spam or a dangerous mail (S230). The threshold may be a value set in advance by an administrator or a value obtained from the blockchain network 200 by the filtering rule management unit 130. Spam grades may also be distinguished according to the range in which the similarity value falls.
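Steps S220 and S230 can be sketched as follows. For brevity the sketch swaps the word embeddings mentioned above for simple word-count vectors, which is an assumption; with non-negative counts the cosine similarity falls between 0 and 1. The threshold of 0.8 and the function names are also hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def is_spam(body, corpus, threshold=0.8):
    """True if body is at least threshold-similar to any corpus sentence."""
    body_vec = Counter(body.lower().split())
    return any(
        cosine_similarity(body_vec, Counter(s.lower().split())) >= threshold
        for s in corpus
    )


corpus = ["interest-free loan", "call us if you need cash"]
```

Grading spam by similarity range, as the description suggests, would replace the single threshold with several cut-off values.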

In FIGS. 6 and 7, the image filtering rule checks whether an image included in an outgoing mail is a spam image, or whether the text contained in the image is spam.

Referring to FIG. 6, to determine whether the image itself is spam, the spam filtering server 100 first extracts the images included in the mail (S310). All images may be extracted, whether they are embedded in the mail body or included as attachments.

The spam filtering server 100 extracts image features (S320). An image feature is a characteristic by which an object can be identified regardless of its shape, size, or position. Methods for extracting image features include SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features), FAST (Features from Accelerated Segment Test), and ORB (Oriented FAST and Rotated BRIEF), and are not limited to any one of these.

The spam filtering server 100 calculates the similarity to previously learned spam image features (S330). The same features may be extracted from spam image data stored in the blockchain network 200 or in an external database (not shown), and the similarity between the extracted features may be calculated.

The similarity may be calculated using the Euclidean distance, the Minkowski distance, cosine similarity, or the like, and is not limited to any one of these.

If the similarity is equal to or greater than a threshold, the mail is determined to be spam (S340). The threshold may be set by an administrator, or the filtering rule management unit 130 may use the threshold of another spam filtering server 100 stored in the blockchain network 200.
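The Euclidean and Minkowski measures named above are distances, where a smaller value means a closer match, whereas step S340 compares a similarity against a threshold. The sketch below shows the Minkowski distance (p = 2 gives the Euclidean distance, p = 1 the L1 distance) together with one possible distance-to-similarity mapping. The mapping 1 / (1 + d) and the function names are assumptions, not something the description specifies.

```python
def minkowski_distance(u, v, p=2.0):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def distance_to_similarity(distance):
    """Map a distance in [0, inf) to a similarity in (0, 1] (assumed form)."""
    return 1.0 / (1.0 + distance)


# Toy two-dimensional feature vectors; real descriptors are much longer.
euclidean = minkowski_distance([0.0, 0.0], [3.0, 4.0], p=2.0)
manhattan = minkowski_distance([0.0, 0.0], [3.0, 4.0], p=1.0)
```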

Meanwhile, the process of extracting features from an image and comparing their similarity may be implemented with a machine learning or deep learning model. In this case, the features extracted from the image and the similarity values calculated may vary depending on the type of model.

Referring to FIG. 7, the spam filtering server 100 extracts the images included in the mail body or in the attachments (S410).

The spam filtering server 100 preprocesses the image and extracts only the text region (S420). As a specific example of the preprocessing, OpenCV may be used to convert the image to grayscale, binarize it so that pixels at or below a threshold become black and pixels above the threshold become white, and extract contours from the resulting black-and-white image. Table 2 shows this process in code.

import cv2
import numpy as np
import pytesseract
from PIL import Image


def extract_text(image_path):
    # Convert to grayscale
    img = cv2.imread(image_path, cv2.IMREAD_COLOR)
    copy_img = img.copy()
    img2 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Smooth with a Gaussian filter before edge detection
    blur = cv2.GaussianBlur(img2, (3, 3), 0)
    cv2.imwrite('blur.jpg', blur)

    # Detect edges
    canny = cv2.Canny(blur, 100, 200)
    cv2.imwrite('canny.jpg', canny)

    # Find contours (OpenCV 4.x returns two values; OpenCV 3.x returns three)
    contours, hierarchy = cv2.findContours(canny, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

    # Keep bounding boxes whose aspect ratio and area look like a character
    box1 = []
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        rect_area = w * h  # area size
        aspect_ratio = float(w) / h  # ratio = width/height
        if 0.2 <= aspect_ratio <= 1.0 and 100 <= rect_area <= 700:
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 1)
            box1.append((x, y, w, h))
    if not box1:
        return ''

    # Sort the boxes from left to right (the source uses a bubble sort)
    box1.sort(key=lambda box: box[0])

    # Measure the text area size and select the box that starts it.
    # NOTE: the source omits the loop that computes count and delta_x for each
    # candidate box m. The loop below is an ASSUMED reconstruction: it counts
    # the boxes to the right of box m that lie on roughly the same line. The
    # limits 150 and 0.25 are illustrative values, not taken from the source.
    select, f_count, plate_width = 0, 0, 0
    for m in range(len(box1)):
        count, delta_x = 0, 0
        for n in range(m + 1, len(box1)):
            delta_x = abs(box1[n][0] - box1[m][0])
            if delta_x > 150:
                break
            delta_y = abs(box1[n][1] - box1[m][1])
            if delta_y / max(delta_x, 1) < 0.25:
                count += 1
        if count > f_count:
            select = m
            f_count = count
            plate_width = delta_x
    cv2.imwrite('snake.jpg', img)

    # Crop the text area, enlarge it, and binarize it
    x, y, w, h = box1[select]
    number_plate = copy_img[max(y - 10, 0):y + h + 20, max(x - 10, 0):x + 140]
    resize_plate = cv2.resize(number_plate, None, fx=1.8, fy=1.8, interpolation=cv2.INTER_CUBIC)
    plate_gray = cv2.cvtColor(resize_plate, cv2.COLOR_BGR2GRAY)
    ret, th_plate = cv2.threshold(plate_gray, 150, 255, cv2.THRESH_BINARY)
    cv2.imwrite('plate_th.jpg', th_plate)

    # Erode the white background to thicken dark strokes, then run OCR
    kernel = np.ones((3, 3), np.uint8)
    er_plate = cv2.erode(th_plate, kernel, iterations=1)
    cv2.imwrite('er_plate.jpg', er_plate)
    result = pytesseract.image_to_string(Image.open('er_plate.jpg'), lang='kor')
    return result.replace(" ", "")

As another example, an artificial intelligence model called Inception, an improved CNN (Convolutional Neural Network), may be used.

Text may then be extracted from the preprocessed image by optical character recognition (OCR). For example, the OCR engine Tesseract may be used.

The spam filtering server 100 calculates the similarity between the extracted text and the texts of a preset spam corpus (S430). For example, the texts may be converted into vectors and the similarity between two vectors calculated.

Specifically, words or sentences may be vectorized using one-hot vectors or a word2vec model, and the L1 distance, the Euclidean distance (L2 distance), or the cosine similarity between two vectors may be calculated. The method of calculating the similarity between texts is not limited to any one method.

If the similarity is equal to or greater than a threshold, the mail is determined to be spam (S440). The threshold may be set by an administrator, or the filtering rule management unit 130 may use the threshold of another spam filtering server 100 stored in the blockchain network 200.
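The one-hot option can be illustrated with a small sketch. It assumes a fixed vocabulary and represents a sentence by marking which vocabulary words occur in it (the element-wise maximum of the one-hot vectors of its words). The vocabulary, the sample sentences, and the function names are illustrative only.

```python
import math


def one_hot_sentence(tokens, vocabulary):
    """Binary vector marking which vocabulary words occur in tokens."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


def l1_distance(u, v):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def l2_distance(u, v):
    """L2 (Euclidean) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


vocabulary = ["call", "cash", "loan", "need"]
ocr_text = one_hot_sentence("need cash call".split(), vocabulary)
corpus_entry = one_hot_sentence("cash loan".split(), vocabulary)
```

With binary vectors the L1 distance simply counts the vocabulary words that appear in one text but not the other.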

Referring to FIGS. 8 and 9, the RBL (Real-time Blocking List) filtering rule identifies the recipient's mail server for an outgoing mail and checks whether that mail server has blocked the sender's mail address. In other words, it checks in advance whether the sender's account has been blacklisted on the recipient's mail server.

The spam filtering server 100 creates test accounts on a plurality of mail servers (S510). This is to check whether each mail server has registered the sender's mail address in its RBL.

The sender enters his or her account on the screen and clicks the account check button (S520). For example, the sender's mail address may be entered on a screen such as that shown in (a) of FIG. 9.

The spam filtering server 100 sends a test mail from the sender's account to the test accounts on the different portals (S530).

The spam filtering server 100 checks the mail reception result in each test account and displays on the screen whether the mail was received (S540). For example, a screen such as that shown in (b) of FIG. 9 may be provided.

The user may then send a request to each mail server shown as blocking the sender, asking it to remove the sender's mail address from its RBL.

In this way, the user can learn the blocking status of his or her mail address and, if it is blocked, request removal from the RBL so that mail can be delivered reliably to the recipient.

Meanwhile, if any site has blocked the sender, the user is notified that, if sending proceeds, the mail may not be delivered to recipients whose accounts are on that mail service.
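Steps S530 and S540 can be sketched as a loop over the test accounts. How the test mail is actually sent and how reception is checked depend on each mail service, so both are passed in as callables. `check_blocking`, `fake_send`, `fake_was_received`, and the `*.example` hosts are hypothetical stand-ins used only to make the sketch runnable.

```python
def check_blocking(sender, test_accounts, send_test_mail, was_received):
    """Send a test mail to each test account and report delivery status."""
    status = {}
    for server, account in test_accounts.items():
        message_id = send_test_mail(sender, account)      # S530
        received = was_received(account, message_id)      # S540
        status[server] = "delivered" if received else "blocked"
    return status


# In-memory stand-ins: portal-b is assumed to block the sender.
delivered = set()


def fake_send(sender, account):
    message_id = f"{sender}->{account}"
    if not account.endswith("@portal-b.example"):
        delivered.add(message_id)
    return message_id


def fake_was_received(account, message_id):
    return message_id in delivered


accounts = {
    "portal-a.example": "test@portal-a.example",
    "portal-b.example": "test@portal-b.example",
}
result = check_blocking("user@company.example", accounts,
                        fake_send, fake_was_received)
```

The resulting dictionary corresponds to the per-server status shown to the user on the result screen.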

Referring to FIG. 10, the bulk-sending filtering rule checks, when the number of recipients is equal to or greater than a predetermined number, whether the outgoing mail contains the notices that must be included in order to protect recipients from harm caused by spam mail.

A bulk mail is a mail whose number of recipients is equal to or greater than a preset reference value. For example, the spam filtering server 100 may treat a mail with 30 or more recipients as a bulk mail. In this specification, a bulk mail is regarded as being sent for commercial purposes. The case of sending a bulk mail is described below.

The spam filtering server 100 extracts the text of the subject and body of the outgoing mail (S610).

The spam filtering server 100 checks whether the subject contains a phrase identifying the mail as an advertisement (S620). Specifically, the subject of a bulk mail must contain a phrase indicating that the mail is a commercial advertisement. The server therefore checks whether the subject contains a phrase such as "(광고)" (advertisement) or "(성인광고)" (adult advertisement).

The spam filtering server 100 checks whether the body of the outgoing mail explains how the recipient can refuse further mail (S630). For example, it may check whether the body contains a link for unsubscribing from mail sent from the sender's mail address.

The spam filtering server 100 checks whether the mail contains a notice that the recipient has consented to receive it (S640). For example, the notice may include the date on which the recipient consented to receive the mail.

스팸 필터링 서버(100)는 S620 내지 S640의 내용이 모두 포함되어 있는 경우, 해당 메일을 발송한다(S650).When the spam filtering server 100 includes all of the contents of S620 to S640, the spam filtering server 100 sends the corresponding mail (S650).

S620 내지 S640의 내용 중 어느 하나라도 만족하지 않는 경우, 스팸 필터링 서버(100)는 해당 대량 발송 메일을 위험 메일이라고 판단하고, 해당 메일의 발송을 차단한다(S660). 추가로, 발신자 메일 주소, 발송을 시도한 시간, 수신자 메일 주소, 차단 사유를 포함한 리포트를 관리자에게 전송할 수 있다. 발신자에게는 차단 사유를 포함한 발송 실패 알림을 전송할 수 있다.If any one of the contents of S620 to S640 is not satisfied, the spam filtering server 100 determines that the mass sent mail is a dangerous mail, and blocks the sending of the corresponding mail (S660). In addition, a report including the sender's e-mail address, the time the transmission was attempted, the recipient's e-mail address, and the reason for blocking can be sent to the administrator. The sender can be sent a notification of failure to send including the reason for blocking.
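Steps S610 to S660 can be sketched roughly as follows. This is an illustrative sketch only, not the implementation of the spam filtering server 100. The threshold of 30 recipients and the subject labels (광고) and (성인광고) come from the description. The opt-out and consent patterns, the class names `OutgoingMail` and `BlockReport`, and the function `check_bulk_mail` are hypothetical assumptions.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

BULK_THRESHOLD = 30  # example value given in the description

# Subject labels named in the description (S620).
AD_LABEL = re.compile(r"\((?:광고|성인광고)\)")
# Hypothetical patterns for the opt-out notice (S630) and consent notice (S640).
OPT_OUT = re.compile(r"수신\s*거부|unsubscribe", re.IGNORECASE)
CONSENT = re.compile(r"수신\s*동의|consented to receive", re.IGNORECASE)


@dataclass
class OutgoingMail:
    sender: str
    recipients: list
    subject: str
    body: str


@dataclass
class BlockReport:
    sender: str
    attempted_at: datetime
    recipients: list
    reasons: list


def check_bulk_mail(mail, threshold=BULK_THRESHOLD):
    """Return (allowed, report); report is None when the mail may be sent."""
    if len(mail.recipients) < threshold:
        return True, None  # not mass-sent mail, so this rule does not apply
    reasons = []
    if not AD_LABEL.search(mail.subject):
        reasons.append("S620: subject lacks an advertisement label")
    if not OPT_OUT.search(mail.body):
        reasons.append("S630: body lacks a way to refuse receipt")
    if not CONSENT.search(mail.body):
        reasons.append("S640: body lacks a statement of recipient consent")
    if not reasons:
        return True, None  # S650: all required items are present
    # S660: block sending and prepare the report for the administrator.
    report = BlockReport(mail.sender, datetime.now(timezone.utc),
                         list(mail.recipients), reasons)
    return False, report
```

In this sketch the list of reasons doubles as the blocking reason that could be included in the send-failure notification to the sender. Simple pattern matching is only one possible way to detect the required notices.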

FIG. 11 is a hardware configuration diagram of a computing device according to an embodiment.

Referring to FIG. 11, the outgoing filtering unit 110, the incoming filtering unit 120, and the filtering rule management unit 130 execute, on a computing device 300 operated by at least one processor, a program containing instructions written to carry out the operations of the present invention.

The hardware of the computing device 300 may include at least one processor 310, a memory 320, a storage 330, and a communication interface 340, which may be connected through a bus. Other hardware, such as input and output devices, may also be included. The computing device 300 may be loaded with various software, including an operating system capable of running the program.

The processor 310 is a device that controls the operation of the computing device 300 and may be any of various types of processor that process the instructions contained in a program, for example a CPU (Central Processing Unit), an MPU (Micro Processor Unit), an MCU (Micro Controller Unit), or a GPU (Graphic Processing Unit). The memory 320 loads the program so that the instructions written to carry out the operations of the present invention are processed by the processor 310. The memory 320 may be, for example, ROM (read only memory) or RAM (random access memory). The storage 330 stores the various data, programs, and the like required to carry out the operations of the present invention. The communication interface 340 may be a wired or wireless communication module.

The embodiments of the present invention described above are not implemented only through an apparatus and a method; they may also be implemented through a program that realizes functions corresponding to the configuration of the embodiments, or through a recording medium on which that program is recorded.

Although the embodiments of the present invention have been described in detail above, the scope of rights of the present invention is not limited to them. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention defined in the following claims also fall within the scope of rights of the present invention.

Claims

A method of operating a server operated by at least one processor, the method comprising:
receiving, from a blockchain network to which a plurality of mail servers are connected, filtering rules for determining spam mail;
receiving an outgoing mail;
determining whether the outgoing mail corresponds to the spam mail according to an outgoing mail filtering rule included in the filtering rules; and
when the outgoing mail corresponds to the spam mail, blocking transmission of the outgoing mail to a recipient mail server and storing information of the outgoing mail in the blockchain network as spam mail,
wherein the information of the outgoing mail stored in the blockchain network is provided to the plurality of mail servers connected to the blockchain network, and
wherein, when a filtering rule is changed at any one of the plurality of mail servers, the filtering rules are updated through a transaction and guarantee procedure among the plurality of mail servers for the changed filtering rule, and are recorded and distributed in the blockchain network.

The method of claim 1,
wherein the outgoing mail filtering rule includes at least one of:
a first filtering rule for checking whether the outgoing mail contains personal information whose leakage is prohibited, a second filtering rule for checking whether the recipient mail server of the outgoing mail blocks the account of the outgoing mail, a third filtering rule for checking whether an image included in the body or an attachment of the outgoing mail contains a spam phrase, and a fourth filtering rule for checking whether the outgoing mail differs from a pre-stored mail sending pattern.

The method of claim 2,
wherein the third filtering rule
extracts an image included in the body or an attachment of the outgoing mail, extracts a text area from the image, calculates a similarity between the extracted text and the spam phrase included in the filtering rules, and determines the outgoing mail to be the spam mail when the similarity exceeds a preset reference value.

The method of claim 3,
wherein the preset reference value is
a value provided from any one of the plurality of mail servers.

(deleted)

The method of claim 1,
wherein the receiving step comprises
receiving, from the plurality of mail servers, filtering rules used in the plurality of mail servers or mails determined to be the spam mail.

The method of claim 1,
wherein the storing step comprises
storing at least one of a sender mail address of the outgoing mail, a recipient mail address, and a reason why the outgoing mail corresponds to the spam mail.

A method of operating a server operated by at least one processor, the method comprising:
receiving, from a blockchain network to which a plurality of mail servers are connected, a filtering rule for determining spam mail or dangerous mail that violates a security policy;
receiving an outgoing mail;
determining whether the outgoing mail corresponds to the spam mail or the dangerous mail according to the filtering rule; and
when the outgoing mail corresponds to the spam mail or the dangerous mail, storing information of the outgoing mail in the blockchain network so that the information of the outgoing mail is provided to the plurality of mail servers connected to the blockchain network,
wherein, when a filtering rule is changed at any one of the plurality of mail servers, the filtering rule is updated through a transaction and guarantee procedure among the plurality of mail servers for the changed filtering rule, and is recorded and distributed in the blockchain network.

The method of claim 8, further comprising:
checking whether the number of recipients of the outgoing mail exceeds a predetermined reference value;
when the number exceeds the reference value, checking whether the subject or body of the outgoing mail contains first information indicating that the outgoing mail is an advertisement mail, second information including a method for refusing receipt of the outgoing mail, and third information including the fact that the recipient of the outgoing mail consented to receive the outgoing mail; and
determining the outgoing mail to be the dangerous mail when any one of the first information to the third information is not included.

The method of claim 8,
wherein the determining step comprises:
extracting text of the subject, body, and attached file of the outgoing mail;
extracting, from the text, personal information including at least one of a resident registration number, a mobile phone number, a landline phone number, and an email address;
determining whether the personal information falls under items whose leakage is prohibited by the security policy; and
determining the outgoing mail to be the dangerous mail when the personal information falls under the prohibited items.

The method of claim 8,
wherein the determining step comprises:
sending a test mail from the account of the outgoing mail to a test account on the recipient mail server of the outgoing mail; and
sending the outgoing mail to the recipient's mail account when sending of the test mail succeeds, and requesting the recipient mail server to unblock the account of the outgoing mail when sending of the test mail fails.

The method of claim 8,
wherein the determining step comprises
checking whether the outgoing mail is sent in a pattern different from a pre-stored mail sending pattern, and
wherein the pre-stored mail sending pattern is registered by the sender of the outgoing mail or is extracted from an access pattern to the arbitrary mail server.

The method of claim 12,
wherein the checking step comprises:
receiving an input of a mail server access condition including at least one of location information, an IP address, and a MAC address for accessing the arbitrary mail server;
extracting location information, an IP address, and a MAC address when the outgoing mail is received; and
determining the outgoing mail to be the dangerous mail when the extracted information differs from the mail server access condition.

A computing device comprising:
a memory; and
at least one processor that executes instructions of a program loaded into the memory,
wherein the program includes instructions written to execute:
storing a filtering rule provided from a blockchain network to which a plurality of mail servers are connected;
receiving an outgoing mail;
determining whether the outgoing mail corresponds to spam mail according to the filtering rule; and
when the outgoing mail corresponds to the spam mail, storing the outgoing mail or information of the outgoing mail in the blockchain network,
wherein, when the transaction is verified by the plurality of mail servers, the information of the outgoing mail is stored in the blockchain network and distributed to the plurality of mail servers connected to the blockchain network, and
wherein, when a filtering rule is changed at any one of the plurality of mail servers, the filtering rule is updated through a transaction and guarantee procedure among the plurality of mail servers for the changed filtering rule, and is recorded and distributed in the blockchain network.

(deleted)