KR102524124B1

KR102524124B1 - Metadata generation apparatus and method for verifying transformation and plagiarism of the image object in documents

Info

Publication number: KR102524124B1
Application number: KR1020220155463A
Authority: KR
Inventors: 임헌영; 연창균
Original assignee: 주식회사 무하유
Priority date: 2022-11-18
Filing date: 2022-11-18
Publication date: 2023-04-20
Also published as: JP2024074239A; CN118057488A

Abstract

Provided is a method for generating metadata, which includes the steps of: receiving document input; converting the input document into an image file; extracting bibliographic information of the input document; extracting format information from the input document; extracting an image area within the input document and at least one image object within the image area; extracting structural feature points of at least one image object; extracting contextual feature points of at least one image object; and generating integrated metadata based on the bibliographic information, format information, structural feature points, and contextual feature points.

Description

Apparatus and Method for Generating Metadata for Transformation and Plagiarism Verification of Image Objects in Documents

The present disclosure relates to a metadata generation apparatus and method. More specifically, it relates to an apparatus and method for generating metadata used to verify transformation and plagiarism of image objects in a document.

In the field of computer vision, similarity between images may be determined pixel by pixel. Specifically, the determination may be based on the degree to which the entire pixel arrays of the original image and the comparison target image match. However, when the comparison target image is a version of the original image in which some pixels, the scale, the angle, the brightness, the saturation, or the like have been altered, pixel-unit similarity determination has difficulty verifying such transformations. It also has difficulty verifying the case in which the comparison target image is only a part of the original image. To compensate for these drawbacks, various techniques that determine similarity by extracting feature points from images have been devised.
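The limitation described above can be illustrated with a minimal sketch. It is not part of the disclosure: the function name `pixel_match_ratio` and the toy 4x4 grayscale images are hypothetical, and the match ratio is just one simple way to express a pixel-unit comparison.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def pixel_match_ratio(original: Image, candidate: Image) -> float:
    """Return the fraction of positions where both images hold the same value."""
    if len(original) != len(candidate) or any(
        len(a) != len(b) for a, b in zip(original, candidate)
    ):
        # A pixel-unit comparison is undefined for images of different sizes,
        # which is one reason scaled or cropped copies are hard to verify.
        raise ValueError("images must have identical dimensions")
    total = sum(len(row) for row in original)
    same = sum(
        1
        for row_a, row_b in zip(original, candidate)
        for a, b in zip(row_a, row_b)
        if a == b
    )
    return same / total


original = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
# The same 2x2 square shifted one pixel to the right.
shifted = [
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
]
# Every pixel brightened by 1.
brightened = [[p + 1 for p in row] for row in original]

identical_score = pixel_match_ratio(original, original)    # 1.0
shifted_score = pixel_match_ratio(original, shifted)       # 0.75
brightened_score = pixel_match_ratio(original, brightened)  # 0.0
```

In this toy case a uniform brightness change drops the score to zero even though the content is unchanged, while a shifted copy still matches on most background pixels, so the score tracks neither kind of transformation reliably.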

The comparison target image of the present disclosure may be an image object included in a document (e.g., a paper or a contributed article). In this case, both the types of image objects (e.g., photographs, drawings, illustrations, tables, charts) and the types of transformations that may be applied to them can vary widely. Therefore, similarity determination based only on existing pixel and feature-point approaches may have difficulty verifying transformation and plagiarism of image objects in a document.

KR 10-2019-0064288; KR 10-2020-0046182; KR 10-2021-0086836

As described above, in order to verify transformation and plagiarism of an image area in a document, the metadata generation apparatus disclosed herein aims to build a database by processing structural information and contextual information of image objects into integrated metadata.

The problems to be solved by the present disclosure are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the description below.

A metadata generation method according to one aspect of the present disclosure for achieving the above technical object may include: receiving a document; converting the input document into an image file; extracting bibliographic information of the input document; extracting format information of the input document; extracting an image area within the input document and at least one image object within the image area; extracting structural feature points of the at least one image object; extracting contextual feature points of the at least one image object; and generating integrated metadata based on the bibliographic information, the format information, the structural feature points, and the contextual feature points.

In addition, the bibliographic information may include at least one of the title, type, journal name, author, publisher, place of publication, keywords, DOI (Digital Object Identifier), full-text link, information link, search link, registration date, status code, volume, issue, and page information of the input document.

In addition, extracting the format information of the input document may include: extracting metadata of the input document; classifying the format of the input document; extracting byte code from the input document; analyzing the structure of the input document; and generating final format information. Analyzing the structure of the input document may include classifying the at least one image object as at least one of a picture, a figure, an illustration, a chart, and a table.

In addition, extracting the structural feature points of the at least one image object may include extracting robust key points of the at least one image object using at least one of energy thresholding, contrast enhancement, SIFT (Scale-Invariant Feature Transform), and FAST (Features from Accelerated Segment Test) techniques.

In addition, extracting the contextual feature points of the at least one image object may include: identifying, based on OCR (Optical Character Recognition) technology, an image object that contains text among the at least one image object; and extracting detailed context information based on the identified text information.

In addition, a metadata generation apparatus according to another aspect of the present disclosure for achieving the above technical object includes: a memory; a document input unit for receiving a document; and a control unit that controls the operation of a metadata generation unit. The metadata generation unit includes: a document conversion unit that converts the input document into an image file; a bibliographic information extraction unit that extracts bibliographic information of the input document; a format information extraction unit that extracts format information of the input document; an image area extraction unit that extracts an image area within the input document and at least one image object within the image area; a structural feature extraction unit that extracts structural feature points of the at least one image object; a contextual feature extraction unit that extracts contextual feature points of the at least one image object; and an integrated metadata generation unit that generates integrated metadata based on the bibliographic information, the format information, the structural feature points, and the contextual feature points.

In addition, a computer program stored on a computer-readable recording medium for executing a method for implementing the present disclosure may be further provided.

In addition, a computer-readable recording medium on which a computer program for executing a method for implementing the present disclosure is recorded may be further provided.

According to the above-described solution of the present disclosure, metadata specialized for image objects in a document is generated, which makes it easy to verify or detect image transformation and plagiarism by a third party.

The effects of the present disclosure are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

FIG. 1 is a block diagram of a metadata generation apparatus of the present disclosure.
FIG. 2 is a block diagram illustrating a metadata generation unit according to an embodiment of the present disclosure.
FIG. 3 is a flowchart illustrating the operation of an integrated metadata generation apparatus according to an embodiment of the present disclosure.
FIG. 4 is a flowchart illustrating a format information extraction operation according to an embodiment of the present disclosure.
FIG. 5 illustrates the result of extracting image areas from an input document according to an embodiment of the present disclosure.

Like reference numerals refer to like elements throughout the present disclosure. The present disclosure does not describe every element of the embodiments, and content that is general in the technical field to which the present disclosure pertains, or that overlaps between embodiments, is omitted. The terms 'unit', 'module', 'member', and 'block' used in the specification may be implemented in software or hardware. Depending on the embodiment, a plurality of units, modules, members, or blocks may be implemented as a single component, or a single unit, module, member, or block may include a plurality of components.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only a direct connection but also an indirect connection, and an indirect connection includes a connection through a wireless communication network.

In addition, when a part is said to "include" a component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise.

Throughout the specification, when a member is said to be located "on" another member, this includes not only the case in which the member is in contact with the other member but also the case in which a further member exists between the two.

Terms such as first and second are used to distinguish one component from another, and the components are not limited by these terms.

A singular expression includes the plural unless the context clearly indicates otherwise.

The identification codes of the steps are used for convenience of description and do not prescribe the order of the steps. The steps may be performed in an order different from the stated one unless the context clearly specifies a particular order.

Hereinafter, the working principle and embodiments of the present disclosure will be described with reference to the accompanying drawings.

In this specification, the 'apparatus according to the present disclosure' includes any of various devices capable of performing computation and providing results to a user. For example, the apparatus according to the present disclosure may include all of a computer, a server device, and a portable terminal, or may take the form of any one of them.

Here, the computer may include, for example, a notebook, desktop, laptop, tablet PC, or slate PC equipped with a web browser.

The server device is a server that processes information by communicating with external devices, and may include an application server, a computing server, a database server, a file server, a game server, a mail server, a proxy server, a web server, and the like.

The portable terminal is, for example, a wireless communication device that ensures portability and mobility. It may include all kinds of handheld wireless communication devices, such as PCS (Personal Communication System), GSM (Global System for Mobile communications), PDC (Personal Digital Cellular), PHS (Personal Handyphone System), PDA (Personal Digital Assistant), IMT (International Mobile Telecommunication)-2000, CDMA (Code Division Multiple Access)-2000, W-CDMA (W-Code Division Multiple Access), and WiBro (Wireless Broadband Internet) terminals and smartphones, as well as wearable devices such as watches, rings, bracelets, anklets, necklaces, glasses, contact lenses, or head-mounted devices (HMDs).

FIG. 1 is a block diagram of a metadata generation apparatus of the present disclosure.

Referring to FIG. 1, the metadata generation apparatus 100 may include a control unit 110, a memory 120, a document input unit 122, and/or a metadata generation unit 130.

In one embodiment, the control unit 110 may be implemented with a memory 120 that stores data for an algorithm for controlling the operation of the components in the apparatus, or for a program that implements the algorithm, and at least one processor (not shown) that performs the above-described operations using the data stored in the memory 120. The memory 120 and the processor may be implemented as separate chips. Alternatively, the memory 120 and the processor may be implemented as a single chip.

In addition, the control unit may control any one or a combination of the components described above in order to implement, on the apparatus, the various embodiments of the present disclosure described below with reference to FIGS. 2 to 5.

In one embodiment, the memory 120 may store data supporting various functions of the apparatus and programs for the operation of the control unit 110, may store input/output data (e.g., music files, still images, videos), and may store a plurality of application programs (applications) run on the apparatus as well as data and instructions for the operation of the apparatus. At least some of these application programs may be downloaded from an external server via wireless communication.

The memory 120 may include at least one type of storage medium among a flash memory type, a hard disk type, an SSD (Solid State Disk) type, an SDD (Silicon Disk Drive) type, a multimedia card micro type, a card-type memory (e.g., SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, and an optical disk. The memory 120 may also be a database that is separate from the apparatus but connected to it by wire or wirelessly.

In one embodiment, the document input unit 122 may receive a document (e.g., a paper or an academic journal) from a user. For example, the document may include at least one text area and/or image area.

In one embodiment, the metadata generation unit 130 may extract bibliographic information and format information of the input document. In one embodiment, the metadata generation unit 130 may extract at least one image object from the input document and extract structural features and contextual features of the extracted image object. In one embodiment, the metadata generation unit 130 may generate integrated metadata for each image object based on the document information (e.g., bibliographic information and format information), the structural features, and the contextual features.

FIG. 2 is a block diagram illustrating a metadata generation unit according to an embodiment of the present disclosure.

Referring to FIG. 2, the metadata generation unit 130 may include a document conversion unit 200, a bibliographic information extraction unit 210, a format information extraction unit 220, an image area extraction unit 230, a structural feature extraction unit 240, a contextual feature extraction unit 250, and/or an integrated metadata generation unit 260. Hereinafter, the operations of the metadata generation unit 130 and of each of its components may be understood as being performed by the control unit 110 of the metadata generation apparatus 100.

In one embodiment, the document conversion unit 200 may receive the input document from the document input unit 122. For example, the control unit 110 may receive a document from a user through the document input unit 122. The document conversion unit 200 may convert the input document into an image file.

In one embodiment, the bibliographic information extraction unit 210 may extract bibliographic information of the input document.

In one embodiment, the format information extraction unit 220 may extract format information of the input document.

In one embodiment, the image area extraction unit 230 may extract an image area within the input document. The image area may include at least one image object. The control unit 110 may identify the at least one image object included in the image area.

In one embodiment, the structural feature extraction unit 240 may extract structural feature points of the at least one image object.

In one embodiment, the contextual feature extraction unit 250 may identify text information included in the at least one image object and extract detailed context information based on the identified text information.

In one embodiment, the integrated metadata generation unit 260 may merge the document information (e.g., bibliographic information and format information of the input document), the structural features (e.g., feature point information and pixel area information of the image object), and the contextual features (e.g., context information contained in the image object) extracted by the above components into a single piece of metadata (e.g., integrated metadata). When there are a plurality of image objects, the integrated metadata generation unit 260 may generate integrated metadata for each image object.

At least one component may be added or removed depending on the performance of the components shown in FIGS. 1 and 2. It will also be readily understood by those of ordinary skill in the art that the relative positions of the components may be changed according to the performance or structure of the system.

Meanwhile, each component shown in FIGS. 1 and 2 refers to a software component and/or a hardware component such as a Field Programmable Gate Array (FPGA) or an Application-Specific Integrated Circuit (ASIC).

FIG. 3 is a flowchart illustrating the operation of an integrated metadata generation apparatus according to an embodiment of the present disclosure.

In operation 300, the control unit 110 of the metadata generation apparatus 100 may receive a document through the document input unit 122. In one embodiment, the document may include a text area and/or an image area.

In operation 305, the control unit 110 may convert the input document into an image file through the document conversion unit 200. For example, the conversion of the input document into an image file may be performed page by page.

In operation 310, the control unit 110 may extract bibliographic information of the input document through the bibliographic information extraction unit 210. For example, the bibliographic information may include the title, type, journal name, author, publisher, place of publication, keywords, DOI (Digital Object Identifier), full-text link, information link, search link, registration date, status code, volume, issue, and/or page information of the input document.

In operation 320, the control unit 110 may extract format information of the input document through the format information extraction unit 220.

Referring to FIG. 4, operation 320 may include operations 400 to 440. In operation 400, the format information extraction unit 220 may extract metadata recorded in the input document. For example, the metadata extracted in operation 400 may include the document type, document structure, document length, document author, and/or creation date. In operation 410, the format information extraction unit 220 may classify the format of the input document. For example, the format of a document may refer to its extension, such as doc, docx, hwp, or pdf. In operation 420, the format information extraction unit 220 may extract byte code from the input document. In operation 430, the format information extraction unit 220 may analyze the document structure. In one embodiment, the format information extraction unit 220 may classify the input document into text areas and/or image areas. The format information extraction unit 220 may analyze the classified image area and further classify each image object included in it as at least one of a picture, a figure, an illustration, a chart, and a table. In operation 440, the format information extraction unit 220 may generate final format information based on the results of operations 400 to 430.
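Operations 400 to 440 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names are hypothetical, "byte code" is assumed here to mean the leading bytes of the file, and the recorded metadata (operation 400) and the image object classes (operation 430) are assumed to be supplied by earlier steps.

```python
from pathlib import PurePath
from typing import Dict, List

# File extensions named as examples in the description.
KNOWN_FORMATS = {"doc", "docx", "hwp", "pdf"}
# Image object classes named for the structure analysis step.
IMAGE_OBJECT_CLASSES = {"picture", "figure", "illustration", "chart", "table"}


def classify_format(filename: str) -> str:
    """Operation 410: classify the document format by its extension."""
    extension = PurePath(filename).suffix.lower().lstrip(".")
    return extension if extension in KNOWN_FORMATS else "unknown"


def extract_byte_code(data: bytes, length: int = 8) -> str:
    """Operation 420, assuming 'byte code' means the leading bytes of the file."""
    return data[:length].hex()


def build_format_info(
    filename: str,
    data: bytes,
    recorded_metadata: Dict[str, str],
    object_classes: List[str],
) -> Dict[str, object]:
    """Operation 440: combine the results of operations 400 to 430."""
    unknown = [c for c in object_classes if c not in IMAGE_OBJECT_CLASSES]
    if unknown:
        raise ValueError(f"unsupported image object classes: {unknown}")
    return {
        "recorded_metadata": dict(recorded_metadata),  # operation 400
        "format": classify_format(filename),           # operation 410
        "byte_code": extract_byte_code(data),          # operation 420
        "image_object_classes": list(object_classes),  # operation 430
    }


format_info = build_format_info(
    "paper.PDF",
    b"%PDF-1.7 example bytes",
    {"author": "example author"},
    ["chart", "table"],
)
```

Keeping each operation in its own function mirrors the flowchart and lets the final step stay a plain merge of earlier results.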

Referring again to FIG. 3, in operation 330, the control unit 110 may extract an image area within the input document through the image area extraction unit 230. In one embodiment, the image area may include at least one image object. The control unit 110 may identify the at least one image object within the image area.

Referring to FIG. 5, the image area extraction unit 230 may identify text areas and/or image areas within the input document 500. In one embodiment, the text areas and/or image areas may be identified by bounding boxes. For example, the image areas may be identified by bounding boxes 510 and 540, and the text areas may be identified by bounding boxes 520 and 530. Each bounding box may be identified based on its identification coordinates (e.g., text0, text1, figure0, figure1). The identification coordinates may be assigned based on the characteristic (e.g., text, image) of the identified area.
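One way to assign labels of the form text0, text1, figure0, figure1 is sketched below. The `Region` type, the `label_regions` function, and the box coordinates are hypothetical; the sketch assumes a detector has already produced each area's kind and bounding box.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in page pixels


@dataclass(frozen=True)
class Region:
    region_id: str  # e.g. "text0" or "figure1"
    kind: str       # characteristic of the area: "text" or "figure"
    box: Box


def label_regions(detections: List[Tuple[str, Box]]) -> List[Region]:
    """Assign a per-kind running index to each detected bounding box."""
    counters: Counter = Counter()
    regions = []
    for kind, box in detections:
        regions.append(Region(f"{kind}{counters[kind]}", kind, box))
        counters[kind] += 1
    return regions


# Hypothetical detections for a page laid out like FIG. 5.
detections = [
    ("figure", (40, 30, 560, 300)),
    ("text", (40, 320, 560, 480)),
    ("text", (40, 500, 560, 640)),
    ("figure", (40, 660, 560, 900)),
]
regions = label_regions(detections)
region_ids = [r.region_id for r in regions]
```

The identifier encodes the characteristic of the area, so later stages can select image areas by prefix without inspecting the pixels again.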

In one embodiment, the image area 510 may include an image object. For example, the image object 512 of the image area 510 may be understood as an illustration. The control unit 110 may identify the image object from the image area 510.

In one embodiment, the image area 540 may include an image object. For example, the image object of the image area 540 may be understood as a chart. The control unit 110 may identify the image object from the image area 540.

Referring again to FIG. 3, after extracting the image area in operation 330, the control unit 110 may proceed to operation 340 and/or operation 350. Operations 340 and 350 may be performed sequentially or simultaneously, and the order in which they are performed may be chosen arbitrarily.
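Because operations 340 and 350 do not depend on each other, they can be run in either order or in parallel. The sketch below uses Python's standard `concurrent.futures` module; the two extractor functions are hypothetical stand-ins that only read precomputed fields from a dictionary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def extract_structural(image_object: Dict) -> List[Tuple[int, int]]:
    """Stand-in for operation 340 (structural feature points)."""
    return list(image_object.get("corners", []))


def extract_contextual(image_object: Dict) -> str:
    """Stand-in for operation 350 (contextual information)."""
    return image_object.get("text", "")


def extract_features(image_object: Dict, parallel: bool = True):
    """Run both extractions sequentially or concurrently; the result is the same."""
    if not parallel:
        return extract_structural(image_object), extract_contextual(image_object)
    with ThreadPoolExecutor(max_workers=2) as pool:
        structural = pool.submit(extract_structural, image_object)
        contextual = pool.submit(extract_contextual, image_object)
        return structural.result(), contextual.result()


image_object = {"corners": [(0, 0), (10, 0)], "text": "Figure 1"}
sequential_result = extract_features(image_object, parallel=False)
parallel_result = extract_features(image_object, parallel=True)
```

Both call paths return the same pair, which is what allows the order to be chosen freely.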

In operation 340, the control unit 110 may extract structural feature points from the at least one image object through the structural feature extraction unit 240. For example, the structural feature extraction unit 240 may extract robust key points of the at least one image object using at least one of energy thresholding, contrast enhancement, SIFT (Scale-Invariant Feature Transform), and FAST (Features from Accelerated Segment Test) techniques. For example, the robust key points may include corner points of the at least one image object. The control unit 110 may extract the robust key points as the structural feature points of the at least one image object. In one embodiment, to preserve the robust character of the feature points, the structural feature extraction unit 240 may convert the extracted structural feature points into relative coordinates rather than absolute coordinates before storing them.
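The disclosure does not state how relative coordinates are computed. The sketch below assumes one plausible choice, normalizing each key point by the bounding box of its image object; the function name `to_relative` and the sample coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def to_relative(points: List[Point], box: Box) -> List[Point]:
    """Map absolute page coordinates to [0, 1] coordinates within the box."""
    left, top, right, bottom = box
    width, height = right - left, bottom - top
    if width <= 0 or height <= 0:
        raise ValueError("bounding box must have positive width and height")
    return [((x - left) / width, (y - top) / height) for x, y in points]


# Key points of an image object on the original page.
original_points = to_relative(
    [(100, 200), (300, 600), (200, 400)], (100, 200, 300, 600)
)
# The same object moved and scaled to half size elsewhere.
copied_points = to_relative(
    [(10, 20), (110, 220), (60, 120)], (10, 20, 110, 220)
)
```

Under this assumption, a copy that has only been moved or uniformly rescaled yields the same stored coordinates as the original, so the two records can be compared directly.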

In operation 350, the controller 110 may extract, through the contextual feature extraction unit 250, context information of an image object that includes text among the at least one image object. In one embodiment, the contextual feature extraction unit 250 may identify text information from an image object that includes text among the extracted at least one image object (e.g., the image objects 510 and 540 of FIG. 5). The contextual feature extraction unit 250 may extract detailed context information based on the identified text information. For example, the detailed context information may refer to text contained in charts, tables, images, and the like that were not created within the document program but were produced externally and attached in image form. In one embodiment, the contextual feature extraction unit 250 may use optical character recognition (OCR) to identify the text information.
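A rough Python sketch of this step is given below. Since the disclosure does not name a specific OCR engine, the engine is passed in as a hypothetical callable `ocr` that takes image bytes and returns recognized text; the function name `extract_context` and the rule that an object "includes text" when OCR returns at least one non-empty line are assumptions made for illustration.

```python
from typing import Callable, Dict, List


def extract_context(
    image_objects: Dict[str, bytes],
    ocr: Callable[[bytes], str],
) -> Dict[str, List[str]]:
    """Return the recognized text lines for each image object that contains text."""
    context: Dict[str, List[str]] = {}
    for object_id, image_bytes in image_objects.items():
        raw_text = ocr(image_bytes)
        # Keep only non-empty lines, with surrounding whitespace removed.
        lines = [line.strip() for line in raw_text.splitlines() if line.strip()]
        if lines:
            # Objects without recognizable text are skipped entirely.
            context[object_id] = lines
    return context
```

In practice `ocr` would wrap whatever OCR engine the implementation uses; a stub that returns fixed strings is enough to exercise the filtering logic.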

In operation 360, the controller 110 may generate integrated metadata for the at least one image object through the integrated metadata generation unit 260. In one embodiment, when there are a plurality of image objects, the integrated metadata generation unit 260 may generate metadata for each image object.

In one embodiment, the integrated metadata generation unit 260 may merge the document information extracted by the above components and the extracted structural feature points and contextual feature points of the at least one image object into a single piece of metadata (e.g., the integrated metadata). For example, the document information may include bibliographic information and format information of the input document. For example, the structural feature points may include feature point information and pixel area information of the image object. For example, the contextual feature points may include detailed context information of text included in the image object.
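The merge described above can be sketched in Python as follows. The record layout, the key names (`document`, `structural`, `contextual`, and so on), and the function name `build_integrated_metadata` are all hypothetical; the disclosure states only which kinds of information are combined, not how they are laid out.

```python
from typing import Any, Dict, List


def build_integrated_metadata(
    bibliographic: Dict[str, Any],
    format_info: Dict[str, Any],
    image_objects: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return one integrated metadata record per image object."""
    records = []
    for obj in image_objects:
        records.append(
            {
                # Document-level information is shared by every object in the document.
                "document": {
                    "bibliographic": dict(bibliographic),
                    "format": dict(format_info),
                },
                # Structural part: feature points plus the object's pixel region.
                "structural": {
                    "feature_points": list(obj["feature_points"]),
                    "pixel_region": obj["pixel_region"],
                },
                # Contextual part: text found inside the object (may be empty).
                "contextual": {"detailed_context": list(obj.get("text", []))},
            }
        )
    return records
```

Producing one record per image object, rather than one per document, mirrors the embodiment in which metadata is generated for each of a plurality of image objects.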

In one embodiment, the metadata generated by the integrated metadata generation unit 260 may be stored in the memory 120 in the form of a database. The metadata stored in the memory 120 for each image object may be used to detect transformation and plagiarism of the image object.
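One way to keep per-object metadata in database form is sketched below using Python's standard `sqlite3` and `json` modules. The choice of SQLite, the table name `image_metadata`, the JSON payload column, and the function names are assumptions made for illustration; the disclosure does not specify a database engine or schema.

```python
import json
import sqlite3
from typing import Any, Dict, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store holding one metadata row per image object."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_metadata ("
        "object_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def save_metadata(conn: sqlite3.Connection, object_id: str, metadata: Dict[str, Any]) -> None:
    """Insert or overwrite the metadata of a single image object."""
    conn.execute(
        "INSERT OR REPLACE INTO image_metadata (object_id, payload) VALUES (?, ?)",
        (object_id, json.dumps(metadata, ensure_ascii=False)),
    )
    conn.commit()


def load_metadata(conn: sqlite3.Connection, object_id: str) -> Optional[Dict[str, Any]]:
    """Return the stored metadata for `object_id`, or None if it is absent."""
    row = conn.execute(
        "SELECT payload FROM image_metadata WHERE object_id = ?", (object_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

A later verification stage could load stored records by object identifier and compare them against the metadata of a newly submitted image object.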

Meanwhile, the disclosed embodiments may be implemented in the form of a recording medium storing computer-executable instructions. The instructions may be stored in the form of program code and, when executed by a processor, may generate program modules to perform the operations of the disclosed embodiments. The recording medium may be implemented as a computer-readable recording medium.

The computer-readable recording medium includes all types of recording media storing instructions that can be read by a computer. Examples include read-only memory (ROM), random access memory (RAM), magnetic tape, magnetic disks, flash memory, and optical data storage devices.

The disclosed embodiments have been described above with reference to the accompanying drawings. Those of ordinary skill in the art to which the present disclosure pertains will understand that the present disclosure may be practiced in forms other than the disclosed embodiments without changing its technical spirit or essential features. The disclosed embodiments are illustrative and should not be construed as limiting.

Claims

A method of generating metadata, performed by a metadata generating apparatus, the method comprising:
receiving a document input;
converting the input document into an image file;
extracting an image area within the input document and at least one image object within the image area;
extracting bibliographic information of the input document;
extracting format information of the input document;
extracting structural feature points of the at least one image object;
extracting contextual feature points of the at least one image object; and
generating integrated metadata based on the bibliographic information, the format information, the structural feature points, and the contextual feature points,
wherein the bibliographic information includes a link to the original text of the input document, an information link, and a search link,
wherein the extracting of the format information of the input document includes:
extracting metadata of the input document;
classifying the format of the input document;
extracting byte code from the input document;
analyzing the structure of the input document, and classifying the at least one image object in detail into a picture, a figure, an illustration, a chart, and a table; and
generating final format information based on the extracted metadata, the classified format, the extracted byte code, and the result of the detailed classification,
wherein the extracting of the structural feature points of the at least one image object includes:
extracting robust key points of the at least one image object using at least one of energy threshold, contrast enhancement, scale-invariant feature transform (SIFT), and features from accelerated segment test (FAST) techniques; and
converting the extracted robust key points into relative coordinates and storing them,
wherein the extracting of the contextual feature points of the at least one image object includes:
identifying, based on optical character recognition (OCR), text information of an image object that includes text among the at least one image object; and
extracting detailed context information based on the identified text information,
wherein the detailed context information is text information contained in charts, tables, and images that were not created within the document program but were produced externally and attached in image form,
wherein the generating of the integrated metadata includes:
when a plurality of image objects are included in the image area, generating integrated metadata for each of the plurality of image objects, and
wherein the integrated metadata generated for each of the plurality of image objects is used to determine transformation and plagiarism of each image object.

(Deleted)

A program stored in a computer-readable recording medium to execute, in combination with a computer, the method of generating metadata according to claim 1.

A metadata generating apparatus comprising:
a memory;
a document input unit configured to receive a document; and
a controller configured to control operation of a metadata generation unit,
wherein the metadata generation unit includes:
a document conversion unit configured to convert the input document into an image file;
an image area extraction unit configured to extract an image area within the input document and at least one image object within the image area;
a bibliographic information extraction unit configured to extract bibliographic information of the input document;
a format information extraction unit configured to extract format information of the input document;
a structural feature extraction unit configured to extract structural feature points of the at least one image object;
a contextual feature extraction unit configured to extract contextual feature points of the at least one image object; and
an integrated metadata generation unit configured to generate integrated metadata based on the bibliographic information, the format information, the structural feature points, and the contextual feature points,
wherein the bibliographic information includes a link to the original text of the input document, an information link, and a search link,
wherein the controller, through the format information extraction unit,
extracts the format information of the input document, wherein the controller extracts metadata of the input document, classifies the format of the input document, extracts byte code from the input document, analyzes the structure of the input document while classifying the at least one image object in detail into a picture, a figure, an illustration, a chart, and a table, and generates final format information based on the extracted metadata, the classified format, the extracted byte code, and the result of the detailed classification,
wherein the controller, through the structural feature extraction unit,
extracts robust key points of the at least one image object using at least one of energy threshold, contrast enhancement, scale-invariant feature transform (SIFT), and features from accelerated segment test (FAST) techniques, and converts the extracted robust key points into relative coordinates and stores them,
wherein the controller, through the contextual feature extraction unit,
identifies, based on optical character recognition (OCR), text information of an image object that includes text among the at least one image object, and extracts detailed context information based on the identified text information,
wherein the detailed context information is text information contained in charts, tables, and images that were not created within the document program but were produced externally and attached in image form,
wherein the controller, through the integrated metadata generation unit,
generates, when a plurality of image objects are included in the image area, integrated metadata for each of the plurality of image objects, and
wherein the integrated metadata generated for each of the plurality of image objects is used to determine transformation and plagiarism of each image object.

(Deleted)