KR102136999B1

KR102136999B1 - Unrecogized character reading system and authentication method using the same

Info

Publication number: KR102136999B1
Application number: KR1020190127445A
Authority: KR
Inventors: 엄요한
Original assignee: 엄요한
Priority date: 2019-10-15
Filing date: 2019-10-15
Publication date: 2020-07-23

Abstract

The present invention relates to a media character reading system. According to one embodiment, the media character reading system extracts, as media files, the portions of computer-digitized media that need to be read, uses the media files to generate problems and present them to a plurality of users, collects correct answer candidate data for the portions that need to be read, and derives a correct answer in order to verify or correct the reading. According to another embodiment, the media files that need to be read may be used for security authentication in operation permission control, such as automatic-input prevention and automatic-download prevention.

Description

Media character reading system and security authentication method utilizing the same {UNRECOGIZED CHARACTER READING SYSTEM AND AUTHENTICATION METHOD USING THE SAME}

The present invention relates to a media character reading system and a security authentication method using the same. More specifically, it relates to a system that extracts, verifies, and corrects the portions that need to be read when media (images, voice, and the like) are converted into digital characters, and to a security authentication method that uses this system for operation permission control.

To preserve human knowledge and make information more accessible worldwide, physical books and texts written before the computer age are being digitized. Pages are photographically scanned as images and then converted into machine-readable digital text using optical character recognition (OCR). One of the biggest obstacles in this process, however, is that the words in a scanned image are not easy to read. Because computer recognition of characters in images (printed or handwritten) is not highly accurate, people have had to inspect and correct the results by hand, which costs time and money. Even when printed business cards are digitized, accuracy is low enough that typists are known to enter about 20% of the characters manually. Moreover, for surveys that ask for handwritten free-text answers, or for the population census conducted by the national statistical office, a typist must re-inspect and correct the free-text answers recognized by the computer.

Meanwhile, CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a type of HIP (Human Interaction Proof) technology used to distinguish whether a user is a real person or a computer program. A commonly used approach presents a picture that has been deliberately twisted or painted over, so that a person can make it out but a computer cannot easily do so, and asks what is written in it. In this test, existing text or an image is distorted, and the test checks whether the subject can recover the original from the distorted image. A person can easily recognize the distorted image, but a computer program cannot, so a subject that fails the test can be judged not to be human. CAPTCHA is often used in places such as the automatic sign-up prevention prompts shown during website registration.

The background art of the present invention is disclosed in Korean Patent Application Publication No. 10-2015-0025452.

The present invention provides a media character reading system that extracts and reads characters from media (images, voice, and the like) recognized by a computer, and a security authentication method using the same.

The present invention provides a media character reading system that performs security authentication by using media files that need to be read for operation permission control, such as automatic-input prevention characters, automatic-download prevention characters, and voice recognition, and a security authentication method using the same.

The present invention provides a media character reading system that presents a media file that needs to be read and grants a reward when a recognized value for the presented image is entered, and a security authentication method using the same.

According to one aspect of the present invention, a media character reading system is provided.

A media character reading system according to an embodiment of the present invention may include a management unit that stores and manages media files that need to be read, a problem generation unit that generates and presents problems using the media files, a data collection unit that receives answers entered for the problems presented by the problem generation unit and stores them in a correct answer candidate group, and a recommendation unit that determines the correct answer candidate entered by the majority in the correct answer candidate group.

According to another aspect of the present invention, a security authentication method using the media character reading system, and a computer-readable recording medium on which a computer program for executing the method is recorded, are provided.

In a security authentication method using the media character reading system according to an embodiment of the present invention, and in the recording medium storing a computer program that executes it, the method by which the media character reading system reads media characters may include: selecting a media file that needs to be read; generating a problem using the media file; presenting the problem on a user terminal; receiving the answer entered on the user terminal and storing it in a correct answer candidate group; and determining, among the correct answer candidates stored in the correct answer candidate group, the candidate with the highest input frequency.

According to an embodiment of the present invention, the media character reading system and the security authentication method using the same verify media files that are being digitized by computer. By collecting data recognized by many people and determining the correct answer candidate with the highest input frequency, they can increase the recognition accuracy for the media files.

According to an embodiment of the present invention, the media character reading system and the security authentication method using the same can authenticate that a user is human by using handwriting or cursive script that a computer cannot recognize in its original form, keeping deformation or distortion of the characters to a minimum.

FIGS. 1 to 4 are diagrams illustrating a media character reading system according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a media character reading method according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating the reward method of an input reward system using the media character reading system according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a security authentication method for operation permission control using the media character reading system according to an embodiment of the present invention.
FIGS. 8 to 17 are example screens in which problems are generated using character images among the media files that need to be read.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to particular embodiments, and it should be understood to cover all modifications, equivalents, and substitutes that fall within the spirit and technical scope of the invention. In describing the present invention, detailed descriptions of related known technologies are omitted where they could unnecessarily obscure the gist of the invention. In addition, singular expressions used in this specification and the claims should generally be construed to mean "one or more" unless stated otherwise.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In the description, identical or corresponding components are given the same reference numerals, and redundant descriptions of them are omitted.

FIGS. 1 to 4 are diagrams illustrating a media character reading system according to an embodiment of the present invention.

Referring to FIG. 1, the media character reading system 10 is a system for verifying and correcting character images 110 that have been extracted from media files and need to be read. A media file may be an image, a voice file, or the like, but for ease of understanding the description below is based on character images.

Digital characters 30 generated by conventional optical character recognition must be inspected and corrected again by a person. The media character reading system 10 can also be used to inspect digital characters 30 generated by OCR. For example, when a computer converts characters (printed or handwritten) in various images into digital characters, the media character reading system 10 can inspect the character images 110 that need to be read and recommend a correct answer candidate. For voice files as well, the portions that need to be read can be extracted, verified, and corrected in the same way as for character images. The system can be used not only for alphabetic letters and numerals but also for other scripts, and it can be used to transcribe free-text answers handwritten by people or to inspect the transcribed data.

When inspecting a character image 110 that needs to be read, the media character reading system 10 can present the image to a plurality of user terminals 20 and evaluate the values entered in order to recommend a correct answer candidate. The user terminal 20 may be any of various terminals, such as a PC, tablet, laptop, or mobile phone. The media character reading system 10 and the user terminals 20 may be connected by wired or wireless communication.

Referring to FIG. 2, the media character reading system 10 can increase reading accuracy for character images that confuse a computer. For example, as shown in FIG. 2, a computer cannot distinguish the Korean syllable "나" from the Arabic numeral "4" in handwriting or cursive script. Likewise, the uppercase letter I and the lowercase letter l (el) are often hard to tell apart in handwriting or cursive script. Such characters are hard for a computer to identify, but a person distinguishes individual characters by looking at the whole paragraph. The media character reading system raises recognition accuracy by verifying the character images 110 that confuse the computer and have a low recognition rate.

Referring to FIG. 3, the media character reading system 10 can perform security authentication by using character images 110 that need to be read for operation permission control. For example, a character image 110 that needs to be read can be presented, as is or with minimal distortion, as the security authentication problem for operation permission control. The value entered by the party requesting authentication is then evaluated to decide whether to grant authentication, and through this security authentication method correct answer candidate data for the character image 110 can be collected.

The media character reading system 10 can also collect correct answer candidate data for character images 110 with a low recognition rate by using an input reward method, which presents a character image 110 that needs to be read and rewards users who enter an answer for it. For example, an input point system that awards points as the reward for input can be used.

The media character reading system 10 may produce a recommended answer by combining the correct answer candidate data collected through both the security authentication method and the input reward method, or it may collect the candidate data and produce a recommended answer using only one of the two methods.

Referring to FIG. 4, the media character reading system 10 includes a management unit 100, a problem generation unit 200, a data collection unit 300, and a recommendation unit 400.

The management unit 100 stores and manages the character images 110 that need to be read from among the extracted character images. For example, the character images 110, which correspond to media files, may include images that a computer attempted to read but read with low accuracy or failed to recognize. The management unit 100 separately manages character images that still need to be read, character images whose reading is complete because a correct answer candidate has already been recommended, and character images in which an error occurred. Each character image 110 has a correct answer candidate group 120 that stores the correct answer candidates entered by an administrator or from user terminals 20. The management unit 100 stores and manages the characters extracted or read from a character image 110 in its correct answer candidate group 120. The management unit 100 may receive character images 110 automatically from an external source (for example, a connected OCR system), or an administrator may add character images that need to be read directly.
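
As an illustration only, the three-way classification kept by the management unit could be modeled as below. This is a minimal sketch, not the patented implementation: the names `Status`, `CharacterImage`, and `ManagementUnit` are hypothetical, and the candidate group is assumed to be a plain list of strings.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    NEEDS_READING = "needs_reading"  # still presented as a problem
    COMPLETED = "completed"          # a correct answer has been recommended
    ERROR = "error"                  # the image or its problem is faulty


@dataclass
class CharacterImage:
    image_id: str
    status: Status = Status.NEEDS_READING
    # Correct answer candidate group (assumed here to be a list of strings).
    candidates: list = field(default_factory=list)


class ManagementUnit:
    """Hypothetical store that keeps images separated by status."""

    def __init__(self) -> None:
        self._images = {}

    def add(self, image_id: str, initial_guess: Optional[str] = None) -> CharacterImage:
        # An administrator or an OCR system may supply a first candidate.
        image = CharacterImage(image_id)
        if initial_guess is not None:
            image.candidates.append(initial_guess)
        self._images[image_id] = image
        return image

    def by_status(self, status: Status) -> list:
        return [img for img in self._images.values() if img.status is status]
```

Keeping the status on each image lets the other units ask only for images that still need reading.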

The problem generation unit 200 generates and manages problems for character recognition using the character images 110 stored in the management unit 100. How problems are generated is described in detail later using examples. The problem generation unit 200 presents problems to the user terminals 20. With high probability it presents a problem for a character image 110 whose correct answer candidate group 120 has accumulated enough candidates, and with low probability it presents a problem for a character image 110 whose correct answer candidate group 120 has received few candidates.
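
The probabilistic choice described above could be sketched as follows. The function name `pick_image`, the cut-off `min_candidates`, and the probability `p_well_populated` are assumptions made for illustration; the description does not give concrete values.

```python
import random


def pick_image(candidate_counts, min_candidates=5, p_well_populated=0.8, rng=random):
    """Pick an image id to turn into a problem.

    candidate_counts maps image id -> number of stored answer candidates.
    Images with at least min_candidates entries are chosen with probability
    p_well_populated; sparsely answered images are chosen otherwise.
    Both parameter values are hypothetical.
    """
    well = [i for i, n in candidate_counts.items() if n >= min_candidates]
    sparse = [i for i, n in candidate_counts.items() if n < min_candidates]
    if not well and not sparse:
        raise ValueError("no images available")
    # Fall back to whichever pool is non-empty when the other one is empty.
    if well and (not sparse or rng.random() < p_well_populated):
        return rng.choice(well)
    return rng.choice(sparse)
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible in tests.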

The data collection unit 300 receives the values entered on the user terminal 20 for the problems that the problem generation unit 200 presented, and stores them in the correct answer candidate group 120. In more detail, the media character reading system 10 can present the same character image repeatedly or to multiple people, collect one or more correct answer candidates, and store and manage them in the correct answer candidate group 120.

The recommendation unit 400 calculates, for the stored correct answer candidate group 120, the proportion of entries that gave the same answer. If the proportion for a particular correct answer candidate 33 is at or above a predetermined threshold, the recommendation unit 400 recommends that candidate as the correct answer. The recommendation unit 400 then tells the management unit 100 to classify the character image 110 for which an answer has been recommended as a completed image, so that no further problems are generated from it. For example, the character image 110 and its recommended answer may be provided to the party that requested the reading and then deleted from the management unit 100.
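
The ratio test performed by the recommendation unit can be illustrated with a short sketch. The function name `recommend` and the default threshold of 0.7 are assumptions; the description only says the threshold is predetermined.

```python
from collections import Counter
from typing import Optional


def recommend(candidates: list, threshold: float = 0.7) -> Optional[str]:
    """Return the most frequent answer if its share meets the threshold."""
    if not candidates:
        return None
    answer, count = Counter(candidates).most_common(1)[0]
    # Share of all entries that gave this same answer.
    ratio = count / len(candidates)
    return answer if ratio >= threshold else None
```

For instance, three entries of "4" and one of "나" give a share of 0.75 for "4", which meets the assumed threshold, while an even split does not.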

FIG. 5 is a diagram illustrating a media character reading method according to an embodiment of the present invention.

According to an embodiment of the present invention, the media character reading system 10 can be used by presenting character images 110 that need to be read to a plurality of user terminals 20, having users enter the characters they recognize, and rewarding them for doing so. In this case, the media character reading system 10 can raise recognition accuracy, because each character image 110 that a computer failed to recognize can be checked many more times than the one or two checks a proofreader typically performs.

Referring to FIG. 5, in step S510 the media character reading system 10 receives the collected media files that need to be read from another system, or an administrator stores them directly. For example, a media file may be a character image 110 that needs to be read because its recognition rate is low.

In step S520, the media character reading system 10 selects one of the media files that need to be read.

In step S530, a problem is generated using the selected media file. With high probability the media character reading system 10 generates a problem for a character image 110 whose correct answer candidate group 120 has accumulated enough candidates, and with low probability it generates a problem for a character image 110 whose group has received few candidates.

The media character reading system 10 may generate a problem using the character image 110 as is, or, to prevent leakage of sensitive information, it may edit the character image 110 as shown in FIG. 8, described later. To keep the recognition rate high, the media character reading system 10 keeps deformation of the characters themselves to a minimum.

In step S540, the media character reading system 10 presents the generated problem on the user terminal.

In step S550, the media character reading system 10 receives the value entered on the user terminal 20 and stores it in the correct answer candidate group 120.

In step S560, the media character reading system 10 determines which of the correct answer candidates stored in the correct answer candidate group 120 has the highest input frequency. In more detail, once a plurality of users have entered at least a preset number of candidates into the correct answer candidate group 120, the system determines the most frequently entered candidate in the group. If the number of candidates entered exceeds the preset range and the input rate of the most frequently entered candidate exceeds a predetermined threshold, that candidate can be recommended as the correct answer.

If, at this point, no candidate among those stored in the correct answer candidate group 120 has an overwhelming majority and the number of candidates is excessive, or if users frequently request a different problem without entering an answer (for example, by using a problem refresh button), the problem itself is judged to be faulty. In that case, the image is classified as a character image in which an error occurred, and the management unit 100 manages it separately so that no further problems are generated for that media file and so that the media file and the problem generation pattern can be examined.
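
The decision made around step S560, including the faulty-problem check, might be sketched as below. All limits (`min_answers`, `threshold`, `max_distinct`, `max_refreshes`) are hypothetical values, and reading "the number of candidates is excessive" as "too many distinct answers" is an assumption of this sketch rather than something the description states.

```python
from collections import Counter
from enum import Enum


class Decision(Enum):
    KEEP_COLLECTING = "keep_collecting"
    RECOMMEND = "recommend"
    FAULTY = "faulty"


def evaluate(candidates, refresh_count=0, *, min_answers=10, threshold=0.7,
             max_distinct=5, max_refreshes=20):
    """Classify one image's candidate group; returns (Decision, answer or None)."""
    # Many skipped problems suggest the problem itself is unreadable.
    if refresh_count > max_refreshes:
        return Decision.FAULTY, None
    # Too few answers so far: no verdict yet.
    if len(candidates) < min_answers:
        return Decision.KEEP_COLLECTING, None
    counts = Counter(candidates)
    answer, top = counts.most_common(1)[0]
    # Step S560 speaks of the input rate exceeding the threshold, hence ">".
    if top / len(candidates) > threshold:
        return Decision.RECOMMEND, answer
    # No clear majority and answers scattered over many values (assumed reading).
    if len(counts) > max_distinct:
        return Decision.FAULTY, None
    return Decision.KEEP_COLLECTING, None
```

An image marked `FAULTY` would be moved to the error category so that it is no longer used for problem generation, and one marked `KEEP_COLLECTING` simply stays in rotation.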

In step S570, the media character reading system 10 sends the media file for which a correct answer has been recommended, together with the recommended answer, to the requester.

In step S580, the media character reading system 10 ensures that the media file and its recommended answer are no longer used for problem generation, then returns to step S520 to select a new media file and continue reading character images.

If, back in step S560, no correct answer candidate exceeds the predetermined threshold, the media character reading system 10 returns to step S520, selects a new media file, and continues reading character images.

FIG. 6 is a diagram illustrating the reward method of an input reward system using the media character reading system according to an embodiment of the present invention.

Referring to FIG. 6, in step S610 the media character reading system 10 receives the collected media files that need to be read from another system, or an administrator stores them directly. For example, a media file may be a character image 110 or a voice file that needs to be read.

In step S620, the media character reading system 10 selects a media file that needs to be read.

In step S630, a problem is generated using the selected media file. The media character reading system 10 may generate the problem using the media file as is, or, to prevent leakage of sensitive information, it may edit the media file as shown in FIG. 8. To keep the recognition rate high, the media character reading system 10 keeps deformation of the characters themselves to a minimum.

In step S640, the media character reading system 10 presents the generated problem on the user terminal.

In step S650, the media character reading system 10 receives the value entered on the user terminal 20 and stores it in the correct answer candidate group 120.

In step S660, the media character reading system 10 provides the user terminal 20 with a reward for the input. For example, the reward may be provided in a form such as points.
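
Steps S650 and S660 together amount to storing the answer and crediting the user. A minimal sketch follows, assuming a fixed number of points per answer and assuming that blank input earns nothing; the class name `PointLedger` and both of those rules are hypothetical and not taken from the description.

```python
from collections import defaultdict


class PointLedger:
    """Hypothetical per-user point balance for the input reward method."""

    def __init__(self, points_per_answer: int = 1) -> None:
        self.points_per_answer = points_per_answer
        self._balances = defaultdict(int)

    def submit(self, user_id: str, candidates: list, answer: str) -> int:
        """Store the answer (S650), award points (S660), return the new balance."""
        answer = answer.strip()
        if not answer:
            # Assumption: empty input is neither stored nor rewarded.
            return self._balances[user_id]
        candidates.append(answer)
        self._balances[user_id] += self.points_per_answer
        return self._balances[user_id]
```

A flat reward per answer is the simplest choice; a real deployment would likely also need safeguards against users entering arbitrary text just to earn points.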

In step S670, the media character reading system 10 determines which of the correct answer candidates stored in the correct answer candidate group 120 has the highest input ratio. In more detail, the most frequently entered candidate is determined by majority vote over the correct answer candidate group 120 entered by a plurality of users and stored. If the input rate of that candidate exceeds a predetermined threshold, it can be recommended as the correct answer.

If, at this point, no candidate among those stored in the correct answer candidate group 120 has an overwhelming majority and the number of candidates is excessive, or if users frequently request a different problem without entering an answer (for example, by using a problem refresh button), the problem itself is judged to be faulty. In that case, the management unit 100 manages the media file separately so that no further problems are generated for it and so that the media file and the problem generation pattern can be examined.

In step S680, the media character reading system 10 transmits the media file for which a correct answer has been recommended, together with the recommended answer, to the requester.

In step S690, the media character reading system 10 has the management unit 100 manage the media file for which a correct answer has been recommended, and its recommended answer, separately so that they are no longer used for problem generation. It then returns to step S620, selects a new media file, and continues reading character images.

If, back in step S670, no correct answer candidate exceeds the predetermined threshold, the media character reading system 10 returns to step S620, selects a new media file, and continues reading character images.

FIG. 7 is a diagram illustrating a security authentication method for operation permission control using the media character reading system 10 according to an embodiment of the present invention.

Operation permission control authentication prevents information leakage through bulk downloads, macros, bots, agents, and the like by requiring authentication via automatic-input-prevention characters, automatic-download-prevention characters, or voice. When performing this security authentication, the present invention presents a media file that needs to be read instead of a distorted image, so that read data is collected and verified at the same time as the security authentication is carried out.

Referring to FIG. 7, upon receiving an operation permission request, the media character reading system 10 selects, in step S710, a media file that needs to be read and generates a problem for security authentication. For example, media files that need to be read may include character images and voice files with a low recognition rate.

In step S720, the media character reading system 10 presents the generated problem to the user terminal that requested the operation permission control security authentication.

In step S730, the media character reading system 10 receives the data that was entered at the user terminal 20 and transmitted as the answer to the presented problem.

In step S740, the media character reading system 10 checks whether correct answer candidates are stored in the correct answer candidate group 120. If the number of correct answer candidates in the correct answer candidate group 120 is equal to or less than a preset candidate count, the data received in step S730 is stored in the correct answer candidate group 120 and the process returns to step S710 to generate a new problem. For example, if the preset candidate count is 0 and there are 0 correct answer candidates, the media character reading system 10 returns to step S710 and, when generating the new problem, may use a media file that already has enough correct answer candidates stored in the correct answer candidate group 120. In this case the user of the user terminal 20 will assume that the previous answer was wrong or contained a typo. The preset candidate count may be set to 0, but it can be adjusted to improve accuracy in the early stages.

In step S750, the media character reading system 10 checks whether the correct answer candidate group contains a value identical to the answer transmitted in step S730.

If the correct answer candidate group 120 contains a candidate identical to the transmitted answer, the transmitted answer is added to the correct answer candidate group 120 (step S760), and the authentication for operation permission control is completed in step S770. The completion condition may also be set so that authentication succeeds only when the candidate identical to the transmitted answer accounts for at least a specific ratio of the inputs in the correct answer candidate group. When a file has been verified many times, for example, the group may hold one candidate that was entered 49 times and another that was entered once. In that situation the media character reading system 10 may refuse authentication if the minority candidate is entered again. In other words, the media character reading system 10 may complete authentication only when the entered answer accounts for at least a specific ratio of the correct answer candidate group. If authentication is not completed, the media character reading system 10 returns to step S710 and generates a new problem.

If step S750 finds no value identical to the transmitted answer in the correct answer candidate group 120, the transmitted answer is added to the correct answer candidate group 120 and the process returns to step S710 to generate a new problem. The media character reading system 10 then generates the problem from a media file whose correct answer candidate group 120 holds enough candidates. Specifically, if the correct answer candidate group 120 of the presented problem is empty (a null value) or lacks sufficient data (for example, it holds fewer candidates than the preset candidate count), the next problem may be generated from a media file that supports security authentication because its correct answer candidate group 120 holds enough candidates, many of them identical.
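The decision flow of steps S740 to S770 can be summarized in a short sketch. It is an illustration under stated assumptions rather than the patented implementation: the `MediaFile` class, the `authenticate` function, and the parameters `min_candidates` (standing in for the preset candidate count) and `min_ratio` (standing in for the "specific ratio") are hypothetical names, and the ratio is computed over the candidates stored before the new answer is added.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class MediaFile:
    file_id: str
    # Correct answer candidate group 120 for this media file.
    candidates: List[str] = field(default_factory=list)


def authenticate(media: MediaFile, answer: str,
                 min_candidates: int = 0, min_ratio: float = 0.5) -> bool:
    """Return True if authentication completes (S770).

    False means the caller should return to S710 and present a new problem.
    """
    # S740: too few stored candidates, so only collect the answer.
    if len(media.candidates) <= min_candidates:
        media.candidates.append(answer)
        return False
    # S750: compare against the candidates stored before this answer.
    total = len(media.candidates)
    matches = Counter(media.candidates)[answer]
    # The answer is stored in both the match (S760) and no-match branches.
    media.candidates.append(answer)
    if matches == 0:
        return False
    # Optional stricter condition: the matching candidate must account
    # for at least `min_ratio` of the stored inputs.
    return matches / total >= min_ratio
```

In the 49-to-1 example from the description, entering the majority reading passes (49/50), while repeating the minority reading fails under the assumed ratio, although it is still recorded as a candidate.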

The media character reading system 10 may collect data for the correct answer candidate group 120 by repeatedly performing the operation permission control authentication process of FIG. 7 with a plurality of users.

In step S780, the media character reading system 10 determines which of the correct answer candidates stored in the correct answer candidate group 120 has the highest input ratio. Specifically, it takes a majority vote over the correct answer candidate group 120, which holds the answers entered by a plurality of users, and identifies the most frequently entered candidate. If the input ratio of that candidate exceeds a predetermined threshold, the candidate may be recommended as the correct answer.

In step S790, the media character reading system 10 may provide the data of a sufficiently populated correct answer candidate group 120 to the requester who asked for the reading or to an administrator, or may classify the media file as fully read and manage it so that it is not used for problem generation again.

The data provided may be the media file together with the data of its correct answer candidate group 120, or the media file together with the correct answer candidate that the media character reading system 10 has judged and recommended.

However, if no candidate in the correct answer candidate group 120 holds an overwhelming majority and the number of distinct candidates is excessive, or if user terminals frequently request a different problem without entering an answer (for example, by using a problem refresh button), the problem itself is judged to be defective. In that case, the management unit 100 may manage the media file separately so that no further problems are generated from it, and the media file and the problem generation pattern can be inspected.

FIGS. 8 to 17 show example screens in which problems are generated from character images, one type of media file that needs to be read.

The problem generation methods of FIGS. 8 to 17 keep modification of the characters themselves to a minimum, so that a user at the user terminal has no difficulty recognizing the characters in the character image, while remaining hard for macros, bots, automatic agents, and the like to pass. Authentication is possible with as little as one character or one word of input. Conventional authentication methods based on characters or character images distort the characters heavily to make them hard for a computer to read, which sometimes makes them hard for people to read as well. The present invention minimizes artificial distortion and can use character images of words exactly as a person wrote them, which makes them more readable for humans and raises the human recognition rate. A person can tell identical-looking characters apart from context, but this is not easy for a computer. Handwriting in particular has a low computer recognition rate even without any deliberate distortion, and accuracy drops further for scripts other than the alphabet.

Referring to FIG. 8, to reduce leakage of the entered information, the character image may be divided into several images (see FIG. 8A), and the divided character images may be repeated or reordered (see FIG. 8B) when presenting the problem. In addition, the position and angle may be varied so that an attacker's software perceives different character images and cannot authenticate (see FIG. 8C), and the character image may be presented together with traps that make it hard for macros, bots, and the like to recognize (see FIG. 4D).
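The dividing and reordering described for FIGS. 8A and 8B can be illustrated with a toy example. The sketch assumes an image is a list of pixel rows and that the pieces are vertical strips; the source does not specify how the image is cut, and the names `split_columns` and `shuffle_pieces` are hypothetical.

```python
import random
from typing import List, Tuple

Image = List[List[int]]  # rows of pixel values; a stand-in for a real bitmap


def split_columns(image: Image, parts: int) -> List[Image]:
    """Divide an image into `parts` vertical strips of near-equal width."""
    width = len(image[0])
    bounds = [width * i // parts for i in range(parts + 1)]
    return [[row[a:b] for row in image] for a, b in zip(bounds, bounds[1:])]


def shuffle_pieces(pieces: List[Image], seed: int) -> Tuple[List[Image], List[int]]:
    """Reorder the pieces; also return the original index of each piece
    so that the server can relate answers back to the source image."""
    order = list(range(len(pieces)))
    random.Random(seed).shuffle(order)
    return [pieces[i] for i in order], order
```

Keeping the `order` list on the server side lets the system map each entered answer back to the correct region of the original character image, while the client only ever sees the reordered pieces.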

Referring to FIG. 9, the problem generator 200 may present contextually meaningless doodles, pictures, and characters together. The problem exploits the fact that computers such as macros, bots, and automatic agents recognize these as characters, whereas a person does not.

Referring to FIG. 10, the problem generator 200 may generate a problem by inserting characters that are not in the original, or proofreading marks that a person recognizes but a computer does not, thereby lowering the recognition rate of computers such as macros, bots, and automatic agents.

Referring to FIG. 11, the problem generator 200 may present parts of a sentence, word, or character alternately with deletion marks, so that the whole must be understood to answer. The problem exploits the fact that this is hard for computers such as macros, bots, and automatic agents to understand.

Referring to FIG. 12, the problem generator 200 may generate a problem that asks for only part of the whole to be entered, identified by underline, color, marking, position, order (the n-th item), or the like.

Referring to FIG. 13, the problem generator 200 may present part or all of the problem instruction as an image, generating a problem that computers such as macros, bots, and automatic agents cannot recognize.

Referring to FIG. 14, the problem generator 200 may present the problem instruction and the characters to be entered together as an image, generating a problem that computers such as macros, bots, and automatic agents cannot recognize. Several characters to be read may also be added, with the characters to enter selected according to their position.

Referring to FIG. 15, the problem generator 200 may use substitution rules to generate a problem that computers such as macros, bots, and automatic agents cannot recognize, for example replacing the instruction with alphabet letters and symbols, writing vowels as dots and lines following the principle of the Cheonjiin input method, or substituting only the consonants or only the vowels and leaving the rest unchanged.

Referring to FIG. 16, the problem generator 200 may include in the problem content that a person would ignore given the context, or characters too small for the human eye to read. Computers such as macros, bots, and automatic agents then recognize characters that a person ignores or cannot perceive, which confuses them.

Referring to FIG. 17, the problem generator 200 may minimize modification of the media file that needs to be read and repeat the same characters with partial enlargement, reduction, rotation, or spacing adjustments, generating a problem that is a single problem to a person but that computers such as macros, bots, and automatic agents are likely to perceive as several different problems.

The media character reading method, the reward method of the input reward system using it, and the security authentication method of the operation permission control authentication system using it, all described above, may be implemented as computer-readable code on a computer-readable medium. The computer-readable recording medium may be, for example, a removable recording medium (CD, DVD, Blu-ray disc, USB storage device, removable hard disk) or a fixed recording medium (ROM, RAM, hard disk built into a computer). The computer program recorded on the computer-readable recording medium may be transmitted to another computing device over a network such as the Internet and installed on it, and thereby used on that other computing device.

Although all the components constituting the embodiments of the present invention have been described above as being combined into one or operating in combination, the present invention is not necessarily limited to these embodiments. That is, within the scope of the object of the present invention, any one or more of the components may be selectively combined and operated.

Although the operations are shown in a specific order in the drawings, this should not be understood to mean that the operations must be performed in that specific order or in sequence, or that all illustrated operations must be performed, to obtain the desired result. In certain situations, multitasking and parallel processing may be advantageous. Moreover, the separation of the various components in the embodiments described above should not be understood as a requirement, and the described program components and systems may generally be integrated into a single software product or packaged into multiple software products.

The present invention has been described above with a focus on its embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that it can be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered illustrative rather than limiting. The scope of the present invention is set out in the claims rather than in the foregoing description, and all differences within the equivalent scope should be construed as being included in the present invention.

10: Media character reading system
100: Management unit
110: Character image
120: Correct answer candidate group
200: Problem generator
300: Data collection unit
400: Recommendation unit

Claims

A media character reading system for reading a character image that a computer recognizes with low accuracy, comprising:
a management unit that stores and manages the character image;
a problem generator that generates and presents a problem using the character image;
a data collection unit that receives an answer entered for the problem presented by the problem generator and stores it in a correct answer candidate group; and
a recommendation unit that calculates the most frequently entered correct answer candidate in the correct answer candidate group,
wherein the problem generator
generates the problem from handwritten text, and generates the problem by dividing the character image into multiple images and repeating the divided images or changing their order,
wherein the recommendation unit
recommends a specific correct answer candidate as the correct answer if the ratio of that candidate in the stored correct answer candidate group is equal to or greater than a predetermined threshold, and
wherein the management unit
completes the reading of the character image when the specific correct answer candidate is recommended, and classifies the character image as an error character image when the specific correct answer candidate is not recommended.

delete

A media character reading method for reading a character image that a computer recognizes with low accuracy, comprising:
selecting the character image that needs to be read;
generating a problem using the character image;
presenting the problem to a user terminal;
receiving an answer entered at the user terminal and storing it in a correct answer candidate group; and
calculating, as a recommended correct answer, the correct answer candidate with the highest input frequency among the correct answer candidates stored in the correct answer candidate group,
wherein the generating of the problem
generates the problem from handwritten text, and generates the problem by dividing the character image into multiple images and repeating the divided images or changing their order,
wherein the calculating of the recommended correct answer
recommends a specific correct answer candidate as the correct answer if the ratio of that candidate in the stored correct answer candidate group is equal to or greater than a predetermined threshold, and
wherein reading of the character image is completed when the specific correct answer candidate is recommended, and the character image is classified as an error character image when the specific correct answer candidate is not recommended.

delete

A computer program, recorded on a computer-readable recording medium, for executing the media character reading method of claim 4.