WO2023149618A1 - System and method for providing digital reference book according to copyright ownership - Google Patents

System and method for providing digital reference book according to copyright ownership

Info

Publication number
WO2023149618A1
Authority
WO
WIPO (PCT)
Prior art keywords
area
reference book
digital reference
question
answer
Prior art date
Application number
PCT/KR2022/016591
Other languages
French (fr)
Korean (ko)
Inventor
최현욱
Original Assignee
주식회사 테스트뱅크
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 테스트뱅크
Publication of WO2023149618A1 publication Critical patent/WO2023149618A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30Authentication, i.e. establishing the identity or authorisation of security principals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/18Legal services; Handling legal documents
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/18Legal services; Handling legal documents
    • G06Q50/184Intellectual property management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/413Classification of content, e.g. text, photographs or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/13Type of disclosure document
    • G06V2201/131Book

Definitions

  • The present invention relates to a system and method for providing a digital reference book according to copyright ownership, and more particularly, to a system and method for creating a digital reference book that enables interaction with a user through different procedures depending on whether the system holds the copyright for the reference book.
  • OCR technology is used to recognize characters in images and extract them as text strings. When a font that the OCR model has not been trained on is used, the text string is extracted in an inaccurate form, and using a specific font may require a license agreement for that font. In addition, the accuracy of the image-to-text-string conversion is not guaranteed to be 100%. For passages containing mathematical formulas, (1) additional development of OCR technology that recognizes formulas and (2) development of technology such as LaTeX conversion of the recognized formula format are further required.
  • An object of the present invention, which addresses the above problems, is to provide a digital reference book providing system and method that enable smooth interaction with the user by applying semantic grouping, using machine learning, to the elements included in a digital reference book.
  • Another object of the present invention is to provide a digital reference book providing system and method that enable smooth interaction with the user by hierarchically classifying and extracting the elements included in the commentary area of a digital reference book and applying a more accurate mapping to the problem area.
  • Another object of the present invention is to provide a digital reference book providing system and method that first determine whether the system holds the copyright for the reference book to be digitized, create the digital reference book through different procedures depending on ownership, and provide it to the user.
  • The digital reference book providing system performs semantic grouping using machine learning. Although the number and order of the specific steps may vary, a digital reference book that enables smooth interaction with the user can be created through the process below.
  • To implement a method without loss of styling, the digital reference book providing system does not spend resources on restoring styling; it discovers only the areas that require interaction and implements multi-layer interaction on top of them, through the following procedure: (1) detect problem areas within the page, (2) discover the areas in each problem area where answers need to be recorded.
  • The digital reference book providing system decomposes the commentary area item by item through the following procedure: (1) determine the center line of the page, (2) create a new page by converting the two-column structure into a one-column structure based on the determined center line, (3) detect separators within the page, (4) detect the area of each separator, (5) extract the area to be mapped, (6) map the area to its problem.
  • If the system owns the copyright of the reference book with which the user wants to interact, it creates a digital reference book as described above and provides it to the user; for a reference book whose copyright the system does not hold, only an HTML layer and a handwriting layer are provided to the user terminal, without extracting elements from the problem area and the commentary area.
  • The present invention can provide a digital reference book providing system and method that enable smooth interaction with the user by applying semantic grouping, using machine learning, to the elements included in a digital reference book.
  • The present invention can provide a digital reference book providing system and method that enable smooth interaction with the user by hierarchically classifying and extracting the elements included in the commentary area of a digital reference book and applying a more accurate mapping to the problem area.
  • The present invention can provide a digital reference book providing system and method that first determine whether the system holds the copyright for the reference book to be digitized, generate the digital reference book through different procedures depending on ownership, and provide it to the user.
  • FIG. 1 is a diagram showing the configuration of a digital reference book providing system according to an embodiment of the present invention.
  • FIG. 2 is a diagram showing a sequence of a digital reference book providing method according to an embodiment of the present invention.
  • FIGS. 3 to 18 show specific examples for explaining the present invention.
  • Terms such as first, second, A, B, (a), and (b) may be used to describe components of an embodiment of the present invention. These terms are only used to distinguish one component from another, and the nature, sequence, or order of the corresponding component is not limited by the term.
  • In this specification, the x coordinate means a horizontal coordinate value within a page, and the y coordinate means a vertical coordinate value within a page.
  • FIG. 1 is a diagram showing the configuration of a digital reference book providing system according to copyright ownership (hereinafter referred to as "the present digital reference book providing system") according to an embodiment of the present invention.
  • this digital reference book providing system 1000 includes an external server 10 and/or a digital conversion server 100 .
  • the external server 10 may indicate a server that stores PDF files requiring conversion, and the digital conversion server 100 may receive PDF files requiring conversion from the external server 10 .
  • The digital conversion server 100 separates and extracts the elements included in the PDF (Portable Document Format) file, stores and manages them, and semantically groups the extracted elements to create an interactive digital reference book, which can be provided to the user terminal.
  • Each component of the digital conversion server 100 may be driven by known libraries. For example, fitz, a PDF library used for converting PDF pages into images and extracting text page by page; opencv (Open Source Computer Vision), a computer vision library used for image transformation and border extraction; pytorch and tensorflow for training and running object detection models; and MathPix, an OCR API for MathML and LaTeX syntax, may be used, and they may run on Python, but the invention is not limited thereto.
  • In addition, libraries such as numpy, sklearn, re, and glob may be used and run on Python, and text can be detected in the PDF using the FontForge library, matched with a suitable font, and extracted in a renderable form. The font format may be produced as TTF, TTF2, OTF, WOFF, etc., but is not limited thereto.
  • the digital conversion server 100 may include a file generator 110, a region detector 120, and/or an element extractor 130.
  • the file generator 110 extracts a plurality of pages from the PDF file of the digital reference book and creates a page file. That is, the file generator 110 creates a page file by dividing a plurality of pages of the PDF file into single pages.
  • In this case, each page file may be a pixel-based bitmap file such as BMP, GIF, JPEG, or PNG.
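For illustration only (not part of the original disclosure), the following is a minimal sketch of how the file generator 110 might split a PDF into single-page bitmap files using the fitz (PyMuPDF) library mentioned above; the function name, output path pattern, and DPI value are assumptions, and the dpi argument assumes a recent PyMuPDF version.

```python
import fitz  # PyMuPDF

def split_pdf_into_page_files(pdf_path, out_dir, dpi=150):
    """Render every page of the PDF into a separate bitmap (PNG) page file."""
    doc = fitz.open(pdf_path)
    page_files = []
    for page_index in range(doc.page_count):
        pix = doc[page_index].get_pixmap(dpi=dpi)         # rasterize the page
        out_path = f"{out_dir}/page_{page_index + 1:04d}.png"
        pix.save(out_path)                                # one bitmap file per page
        page_files.append(out_path)
    doc.close()
    return page_files
```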
  • the PDF file from which the file creation unit 110 extracts pages may be a PDF file that has been created and stored in the database unit 150 .
  • the PDF file from which the file generator 110 extracts pages may be a PDF file generated by scanning a reference book by a user terminal such as a smartphone, scanner, or multifunction device.
  • the PDF file from which the file generator 110 extracts pages may be received from another external server 10 connected through a communication network.
  • a PDF file created by scanning a reference book contains multiple pages.
  • The file generator 110 splits the plurality of pages into single pages and creates the page files.
  • the PDF file from which the file creation unit 110 extracts pages may be a structured PDF file or a scanned PDF file depending on how it is created.
  • Structured PDF is a PDF file created electronically using computer software
  • Scanned PDF is a PDF file created in the form of an image using a specific book in the real world.
  • Referring to FIG. 3 below, the big difference between a structured PDF and a scanned PDF is that in a structured PDF each element included in each page, such as text, an image, or a table, can be identified, selected, and extracted individually. The blue highlights in FIG. 3 show the selected elements.
  • In a scanned PDF, referring to FIG. 4 below, it is impossible to select and extract each element individually.
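As an illustrative sketch only (the patent does not specify this logic), a structured PDF can be told apart from a scanned PDF by checking whether its pages expose selectable text; the threshold below is an assumption.

```python
import fitz  # PyMuPDF

def classify_pdf(pdf_path, min_chars=20):
    """Label a PDF as 'structured' (selectable text present) or 'scanned' (image only)."""
    doc = fitz.open(pdf_path)
    has_text = any(len(page.get_text("text").strip()) >= min_chars for page in doc)
    doc.close()
    return "structured" if has_text else "scanned"
```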
  • In the case of a scanned PDF, the file creation unit 110 restores its structure through an OCR process. At this time, the OCR process may be performed by the optical character recognition unit 140 described later.
  • Alternatively, without going through the OCR process, the question area, answer area, and commentary area may be extracted by the area detection unit 120 described later, and images of the extracted areas may be stored and managed in the database unit 150.
  • the digital reference book according to the present invention includes a question paper area and a commentary area.
  • the problem area is usually composed of two columns regardless of subject, grade, book type, etc.
  • The number of questions observed on one page generally ranges from a minimum of about 4 up to 8 questions.
  • Most of the questions are in the form of multiple choice, and depending on the case, there may be short answer and subjective questions, but they are not frequent.
  • the most important element to find in the problem sheet area is the problem area.
  • the problem area consists of a problem number, problem title, and options, and has a similar form regardless of the type of subject.
  • Unlike the problem area of the problem sheet, the commentary area does not share geometric features across subjects or book types; instead, all pages and elements within one commentary follow the same layout design and element shape. There is no overall trend found across commentary areas: each book has many individual characteristics, there is no commonly found pattern such as the two-column structure, no pattern corresponding to the multiple-choice form, and each book (especially each book series) frequently uses its own fonts, styling, and layout. For commentary areas with these characteristics, machine learning for semantic binding aimed at implementing interaction on questions is very likely not to work well.
  • From the page file generated by the file creation unit 110, the area detection unit 120 extracts elements such as the entire image of the page file, the text included in the page file, HTML, CSS, fonts, and background vector images. After that, the area detection unit 120 converts each page file into an HTML page.
  • When a structured PDF is converted into an HTML page, each element remains detectable (clickable) after conversion, whereas for a scanned PDF the entire page is extracted as a single image and each element is undetectable. Therefore, as described above, the scanned PDF needs to be converted through OCR processing into a form in which the characters and numbers in the image can be detected.
  • When the region detection unit 120 converts each page file into an HTML page, an id value is assigned to each identified element within the page.
  • The selectable elements include each element such as text, an image, or a table, and may include, for example, a problem area, a question area and an answer area within the problem area, and a circled-number option within the answer area.
  • the area detection unit 120 converts the structured PDF file produced during the book production process at the publishing house into HTML, which is a form usable for the service.
  • the PDF file used at this time has the characteristics that text can be copied, font information is accurately expressed, and object files can be selected individually.
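For illustration, one possible way to obtain a positioned HTML rendering of a structured PDF page is fitz's built-in HTML export; this sketch is an assumption about how the conversion step could be started and is not the exact conversion pipeline of the area detection unit 120.

```python
import fitz  # PyMuPDF

def page_to_html(pdf_path, page_number):
    """Export one page of a structured PDF as HTML with positioned text spans."""
    doc = fitz.open(pdf_path)
    html = doc[page_number].get_text("html")   # <p>/<span> markup with absolute positions
    doc.close()
    return html
```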
  • After extracting all the elements included in the PDF file, the area detection unit 120 combines the images into one large background image, and places the text at the same position as in the original PDF by creating a new font. Specifically, this is performed in the procedure shown in FIG. 5.
  • The region detector 120 may use an image-based PDF file obtained through a scan procedure when a typeset PDF file is not available.
  • In a structured PDF file, objects can be selected: a text block can be designated and highlighted, an image or table can be selected, and a selector tag can be declared by accessing an object.
  • In a scanned PDF file, the entire page is implemented as an image, so it is impossible to highlight by designating a text block or to select an image or table, and tags must be declared using machine-learning selector detection technology.
  • When converting a scanned PDF into HTML, the area detection unit 120 discovers text areas, designates text blocks, and implements the text block areas so that highlights can be applied. Specifically, the following procedure is performed: (1) determine whether a specific character is text based on the OCR model, (2) detect text boxes so that they can be designated, selected, and used for interaction, (3) overlay a text box at the same position for each text character.
  • The area detection unit 120 sets location-based tag points so that an event can be implemented on a selector within a question. Specifically, the following procedure is performed: (1) discriminate the questions within a page, (2) discriminate the choice area within a question, (3) determine the position of each circled number within the choices, (4) assign a tag capable of setting a JavaScript event to each circled-number area.
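The following is a hypothetical sketch of step (4): generating absolutely positioned, clickable tags from detected circled-number boxes. The dictionary keys and the recordAnswer JavaScript handler are illustrative names only, not part of the original disclosure.

```python
def make_choice_tags(circle_boxes):
    """Build positioned, clickable tags for each detected circled-number box.

    circle_boxes: list of dicts such as {"question_id": "q35", "choice": 3,
    "x": 120, "y": 480, "w": 18, "h": 18}, in rendered-page pixel coordinates."""
    spans = []
    for box in circle_boxes:
        spans.append(
            '<span class="choice" '
            f'data-question="{box["question_id"]}" data-choice="{box["choice"]}" '
            f'style="position:absolute; left:{box["x"]}px; top:{box["y"]}px; '
            f'width:{box["w"]}px; height:{box["h"]}px;" '
            'onclick="recordAnswer(this.dataset.question, this.dataset.choice)"></span>'
        )
    return "\n".join(spans)
```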
  • HTML that is not semantically implemented because a meaningful tag is not assigned cannot be used for interaction.
  • Tags based on location values reproduce the same screen configuration that users see on the web page, but because they are not semantically grouped, it is difficult to connect them in semantic units; therefore, it is difficult to implement events such as scoring and answer connection.
  • Tags based on semantic values also reproduce the same screen configuration that users see on the web page, but because they are semantically grouped, it is easy to connect them in semantic units; therefore, it is easy to implement events such as scoring and answer connection.
  • The class id value (tag) given to a selectable element is generated based on location information within each page during the PDF-to-HTML conversion, and that id value carries no meaning by itself. Therefore, in order to group these id values into semantic units such as pages, problems, and options, the following machine learning process is performed on the image of the page file. In this case, grouping may mean, for example, grouping the problems, the options, and the multiple-choice questions attached to one passage.
  • the region detection unit 120 groups classes for each problem region and assigns an id. Specifically, the corresponding work is performed in the following procedure. (1) Detection of problem areas within the page, (2) Detection of areas requiring interaction application within the problem area, (3) Detection of elements present in each area
  • the area detection unit 120 first detects a problem area in each page file, assigns an ID, separates the problem area into a question area and an answer area requiring interaction, and detects it.
  • An ID (Sub-id) is also assigned to the detected question area and/or answer area.
  • The areas requiring interaction within the problem area may differ depending on the subject of the reference book. For example, Korean and English reference books contain bundled reading passages, and for the convenience of learners it is desirable that these bundled passages can be viewed on one screen together with the questions.
  • the area detection unit 120 may detect an answer area within the problem area and classify an area other than the answer area as a question area.
  • the area detection unit 120 detects a problem area (FIG. 9, A) in the page file extracted by the file creation unit 110.
  • the area detection unit 120 may extract object information in a JSON (JavaScript Object Notation) format using the fitz library.
  • the fitz library can extract text from a page as shown in FIG. 8 below.
  • the area detection unit 120 detects text from the page through the fitz library.
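As a sketch of this extraction step (assuming PyMuPDF's JSON text output; the helper name is illustrative), the text blocks of a page and their bounding boxes can be obtained as follows.

```python
import json
import fitz  # PyMuPDF

def extract_text_blocks(pdf_path, page_number):
    """Return the text blocks of one page with their bounding boxes, via fitz."""
    doc = fitz.open(pdf_path)
    raw = doc[page_number].get_text("json")    # JSON string: blocks -> lines -> spans
    doc.close()
    blocks = json.loads(raw)["blocks"]
    return [{"bbox": block["bbox"],
             "text": " ".join(span["text"]
                              for line in block["lines"]
                              for span in line["spans"])}
            for block in blocks if block.get("type") == 0]   # type 0 = text block
```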
  • The area detection unit 120 may detect each problem area based on boundaries between text characters, or vertical gaps, that are longer than those between other text characters.
  • the area detection unit 120 may generate a bounding box surrounding each problem area using machine learning.
  • the region detection unit 120 may add a unified background color to the problem region within the bounding box generated by machine learning.
  • the region detection unit 120 determines whether other problem regions exist by expanding adjacent pixels in the remaining region except for the bounding box through opencv.
  • a bounding box may be generated based on a machine learning model based on Fast RCNN, which will be described later.
  • the area detection unit 120 detects and stores information about the position and size of the created bounding box.
  • the region detection unit 120 detects a problem region in each page file
  • When the x-coordinate (x1) of the leftmost point and the x-coordinate (x2) of the rightmost point of a bounding box are both larger than the x-coordinate (x half) of the horizontal midpoint of the page, it can be determined that the layout of the page is two columns.
  • the midpoint coordinates in the horizontal direction of the page may be coordinate values corresponding to the center point of the entire page or the center of a line segment connecting the left and right sides of the page.
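A minimal reading of this two-column heuristic, written as a sketch (the function and argument names are assumptions):

```python
def is_two_column_layout(problem_boxes, page_width):
    """Treat the page as two-column if some problem bounding box lies entirely
    to the right of the page's horizontal midpoint.

    problem_boxes: iterable of (x1, y1, x2, y2) tuples in page coordinates."""
    x_half = page_width / 2
    return any(x1 > x_half and x2 > x_half for (x1, _y1, x2, _y2) in problem_boxes)
```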
  • the area detection unit 120 separates the detected problem area into a question area and an answer area (B, FIG. 10) as shown in FIG. 10 .
  • the area detector 120 first selects and extracts the answer area, and selects the remaining problem areas excluding the selected answer area as the question area.
  • The options may be arranged horizontally or vertically. When the answer area is aligned vertically, the selection of the answer area proceeds as follows.
  • The area detection unit 120 may detect the rows that contain circled numbers.
  • The area detection unit 120 selects, as the answer area, the area containing the sentences with the circled numbers detected in the problem area.
  • When detecting answer areas in the page file, the area detection unit 120 may detect the lines such that the circled numbers increase consecutively.
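A hypothetical sketch of the circled-number rule described above, assuming the choices appear as the circled digits ①–⑩ and that the answer area is the run of lines whose circled numbers increase consecutively:

```python
import re

CIRCLED = "①②③④⑤⑥⑦⑧⑨⑩"              # circled choice markers
CIRCLE_RE = re.compile(f"[{CIRCLED}]")

def find_answer_lines(lines):
    """Return indices of lines whose circled numbers increase consecutively (1, 2, 3, ...)."""
    picked, expected = [], 1
    for index, line in enumerate(lines):
        matched = False
        for mark in CIRCLE_RE.findall(line):
            if CIRCLED.index(mark) + 1 == expected:
                expected += 1
                matched = True
        if matched:
            picked.append(index)
    return picked
```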
  • The element extraction unit 130 extracts elements C to F (FIG. 11) from the question area and the answer area separated by the area detection unit 120 and stores the extracted elements in the database unit 150.
  • the elements include a problem index (C), text (D), image (E), text box (not shown), and table (not shown).
  • the element extraction unit 130 selects and extracts the index located at the leftmost side of the question area.
  • the element extraction unit 130 may select and extract the image E from the problem area.
  • the problem index extracted by the element extractor 130 can be used for scoring automation by being connected to an answer area of the commentary area, which will be described later.
  • the rest may be text (D).
  • each text may be selected and extracted based on a region having a different bounding box according to each position of the text.
  • the element extraction unit 130 may select and extract a series of options (F) from text in the selected question area.
  • A series of options (F) may be marked with sequential Korean markers such as 'ㄱ. ㄴ. ㄷ.' or '가. 나. 다.'. The element extractor 130 may select and extract, as the series of options (F), the sentences or the area in which such consonants are detected sequentially in order, or in which consonants combined with vowels are detected sequentially in order.
  • the element extraction unit 130 stores the selected and extracted elements in the database unit 150.
  • Each element selected and extracted by the element extraction unit 130 has a bounding box, and the area within the bounding box is stored in the database unit 150 in the form of an image.
  • Because the above-described area detection unit 120 selects, as the answer area, the part of the problem area containing sentences with circled numbers, the element extraction unit 130 determines that a sentence including a circled number in the selected answer area is an option.
  • the element extraction unit 130 extracts the aforementioned elements from the answer area including the options, and stores the extracted elements in the database unit 150.
  • A bundled passage refers to a reading passage with which a plurality of questions are associated.
  • The area detection unit 120 may detect a bundled passage number (G, FIG. 12) at the top of a problem area. When a bundled passage number is detected, the area detection unit 120 performs a merging operation so that the problem areas corresponding to the bundled passage number are included (see the commentary area described later), and creates a bounding box in the area including the bundled passage number. At this time, the area detection unit 120 may create the bounding box so as to include the problem areas indicated by the bundled passage numbers. For example, as shown in FIG. 12, when '35 to 37' is the bundled passage number, the area detection unit 120 sets the bounding box so that all problem areas having problem indexes 35 to 37 (C) are included together with the bundled passage.
  • When the element extraction unit 130 extracts elements from the question area and the answer area and stores them in the database unit 150, it also extracts elements from the aforementioned bundled passage and stores them in the database unit 150.
  • The elements extracted from the question area and the answer area and the elements extracted from the bundled passage may be stored in the database unit 150 so as to be connected to each other.
  • The detection of bundled passages is not limited to cases where the subject of the reference book is Korean or English; whenever the region detection unit 120 detects a number array in the form of 'number to number', such as the aforementioned bundled passage numbers, bundled passage elements can be extracted and stored.
  • The element extraction unit 130 groups the extracted elements, that is, the problem index (C), text (D), image (E), text box (not shown), table (not shown), series of options (F), page information, etc., into one problem and stores them in the database unit 150.
  • the grouped elements may be provided to the user terminal by the problem providing unit 160 to be described later.
  • The problems corresponding to one bundled passage are provided simultaneously on one screen. Therefore, the element extraction unit 130 determines whether there is a bundled passage in the page file, which questions (that is, which problem indexes) correspond to the bundled passage, and, when the bundled passage continues onto the next page, the page information at that point, and extracts and stores these in the database unit 150.
  • the element extraction unit 130 may assign a specific class id value to the elements when grouping the extracted elements.
  • The optical character recognition unit 140 may extract text from the elements extracted and stored in the database unit 150 through Optical Character Recognition (OCR) processing or the fitz library, and store the text in the database unit 150 again.
  • the optical character recognition unit 140 may perform optical character recognition processing on an element using a known optical character recognition algorithm.
  • the problem providing unit 160 provides the elements grouped by the element extraction unit 130 to the screen of the user terminal.
  • The provided elements may be presented as an HTML layer, arranged at positions corresponding to their respective position values.
  • the problem providing unit 160 may provide a multi-layer structure to the user terminal. That is, a writing layer for writing may be further provided on the HTML layer in which the grouped problems are provided to the user terminal. The user terminal may proceed with writing and problem solving in the writing layer provided on the HTML layer.
  • The problem providing unit 160 may recognize an answer check through a known library capable of recognizing handwriting written in the handwriting layer. At this time, if an answer is checked by handwriting in the handwriting layer at the same position as a circled number included in the answer area of the HTML layer, the problem providing unit 160 can automate scoring of the problem by comparing the user terminal's answer check with the correct answer, based on the problem index of the answer area of the commentary described later and the corresponding correct answer.
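For illustration only, automatic scoring based on the mapping described above could look like the following sketch; the data shapes are assumptions.

```python
def score_answers(user_answers, answer_key):
    """Compare choices recognized from the handwriting layer with the correct answers
    mapped from the commentary area.

    user_answers / answer_key: dicts mapping a problem index to a choice number."""
    results = {}
    for problem_index, correct in answer_key.items():
        chosen = user_answers.get(problem_index)
        results[problem_index] = {"chosen": chosen,
                                  "correct": correct,
                                  "is_correct": chosen == correct}
    return results
```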
  • The area detection unit 120 decomposes the commentary area item by item through the following procedure: (1) determine the center line of the page, (2) create a new page by converting the two-column structure into a one-column structure based on the determined center line, (3) detect separators within the page, (4) detect the area of each separator, (5) extract the area to be mapped, (6) map the area to its problem.
  • the area detection unit 120 extracts the entire image file from the page file.
  • the area detection unit 120 extracts a commentary area from the extracted image file.
  • the extraction of the commentary area may be performed by the same process as the detection of the problem area described above.
  • When extracting the commentary area, the region detection unit 120 can first detect, in the entire image file extracted from each page file with an n-column configuration, the dividing line (center line, H, FIG. 14), referring to FIG. 14.
  • The area detection unit 120 may also detect the outer lines (I, FIG. 14), which are the dividing lines bounding the content within each page.
  • The area detector 120 may cut the images containing the content based on the center line and the outer lines, and merge the pieces cut at the center line into one image. An example of merging into one image is shown in FIG. 15. Such a merged image may be obtained by merging the images corresponding to all pages of the commentary area into a single one-column image.
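A minimal sketch of this column-merging step using opencv (cv2) and numpy; the pixel coordinates are assumed to come from the center-line and outer-line detection described above, and the padding/stacking details are illustrative.

```python
import cv2
import numpy as np

def merge_two_columns(page_image_paths, center_x, left_margin, right_margin):
    """Cut each two-column commentary page at its center line and stack the left
    and right columns vertically into one long single-column image."""
    strips = []
    for path in page_image_paths:
        img = cv2.imread(path)
        strips.append(img[:, left_margin:center_x])    # left column
        strips.append(img[:, center_x:right_margin])   # right column
    width = max(strip.shape[1] for strip in strips)
    padded = [cv2.copyMakeBorder(strip, 0, 0, 0, width - strip.shape[1],
                                 cv2.BORDER_CONSTANT, value=(255, 255, 255))
              for strip in strips]                      # pad to a common width
    return np.vstack(padded)
```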
  • the element extraction unit 130 divides the images merged into one column into chapter areas.
  • A first delimiter for dividing the chapters is selected; referring to FIG. 15, feature points of the selected first delimiter are extracted, and, referring to FIG. 16, each chapter can be distinguished by detecting second delimiters based on the extracted first delimiter.
  • The element extraction unit 130 separates the merged commentary area images by chapter and section through the above-mentioned feature points (detecting the area of each separator), and extracts the page information and problem indexes for each chapter (extracting the area to be mapped).
  • the element extraction unit 130 extracts the correct answers and explanations for each problem index and stores them in the database unit 150.
  • The element extraction unit 130 connects the problem index in the aforementioned problem area with the problem index in the commentary area (mapping the area to the problem). Therefore, as described above, the problem providing unit 160 compares the location information of the user terminal's answer check with the location information of the answer area of the question to score the problem automatically.
  • The area detection unit 120 detects a problem area within the page, detects an area requiring interaction within the problem area (e.g., an answer area), and uses a pre-trained machine learning model to detect the elements within the problem area.
  • The machine learning model used at this time was based on Fast R-CNN, and its slow speed could be compensated for by a structure that combines the CNN-based classification and bounding box regression with an RPN (Region Proposal Network).
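As an illustrative stand-in (not the model actually trained for the patent), torchvision's Faster R-CNN, which combines a CNN backbone, an RPN, and classification/box-regression heads, can be loaded and used for area detection roughly as follows; this assumes a recent torchvision version and hypothetical fine-tuned weights.

```python
import torch
import torchvision

def load_area_detector(num_classes, weights_path=None):
    """Detection model for problem / question / answer areas (R-CNN family)."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
        weights=None, num_classes=num_classes)
    if weights_path:
        model.load_state_dict(torch.load(weights_path, map_location="cpu"))
    model.eval()
    return model

def detect_areas(model, page_tensor, score_threshold=0.7):
    """page_tensor: 3xHxW float image in [0, 1]. Returns boxes and labels above threshold."""
    with torch.no_grad():
        output = model([page_tensor])[0]
    keep = output["scores"] >= score_threshold
    return output["boxes"][keep], output["labels"][keep]
```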
  • The digital conversion server 100 can access the external server 10, which contains information on reference books for which the system according to the present invention holds the copyright or a copyright license, as well as reference books for which it holds neither the copyright nor a license.
  • the digital reference book providing system 1000 may further include a copyright determination unit (not shown).
  • the copyright judging unit may determine whether or not the present system has the copyright of the reference file (PDF file) uploaded to this system by the user terminal or the reference file to be provided to the user terminal.
  • The database unit 150 stores all reference PDF files, dividing them into PDF files for which the present system holds the copyright and PDF files for which it does not.
  • The copyright determination unit can determine whether the system holds the copyright for the reference file uploaded from the user terminal, or for the reference file to be provided to the user terminal, based on the copyright status information for each reference file stored in the database unit 150.
  • When the system holds the copyright, the digital reference book is provided to the user terminal through the process of the above-described area detection unit 120 and element extraction unit 130.
  • When the system does not hold the copyright, it can generate only additional layers unrelated to the copyright, such as automatic scoring, and provide them to the user terminal. That is, the present system can provide only the HTML layer and/or the handwriting layer to the user terminal without extracting elements from the problem area and the commentary area. If necessary, only the handwriting layer may be provided without area detection.
  • The problem providing unit 160 provides the user terminal with an HTML layer and/or a handwriting layer for presenting the problems of the reference book. After the user checks an answer through the handwriting layer of the user terminal, that is, at the same position as the circled number in the HTML layer answer area as described above, and the answer is automatically scored, if it does not match the correct answer to the question, the problem providing unit 160 displays the corresponding part of the commentary area on the screen of the user terminal.
  • the present digital reference book providing system 1000 may further include a purchase authentication unit 170 that verifies that the user has legally purchased a book corresponding to the digital reference book.
  • the purchase authentication unit 170 transmits a specific random number generated together with a specific page to the user terminal.
  • the user terminal writes the received specific random number on the same page as the specific page received from the purchase authentication unit 170 and transmits an image of the page with the random number to the purchase authentication unit 170 .
  • The purchase authentication unit 170 compares the elements included in the page on which the user terminal has written the specific random number with the elements included in the same page of the reference book stored in the database unit 150, and when the same elements are detected, it can be judged that the user has legally purchased the book. In that case, the elements extracted and grouped as described above are combined into an HTML layer and provided to the user terminal.
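A hypothetical sketch of the challenge/verification exchange described above; the code format, the element comparison, and the function names are assumptions, not the disclosed implementation.

```python
import secrets

def issue_challenge(page_count):
    """Issue a random code together with a randomly chosen page number."""
    return {"page": secrets.randbelow(page_count) + 1,
            "code": f"{secrets.randbelow(10**6):06d}"}

def verify_submission(challenge, submitted_code, page_elements, stored_elements):
    """Accept the purchase proof when the handwritten code matches the issued code
    and the photographed page shares elements with the stored page."""
    shared = set(page_elements) & set(stored_elements)
    return submitted_code == challenge["code"] and len(shared) > 0
```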
  • The present digital reference book providing system 1000 may conclude a copyright licensing agreement with reference book publishers and/or their distributors for distributing specific reference books, receive books for distribution, or receive a PDF file of the specific reference book from an external server 10 operated by the reference book publishers and/or their distributors.
  • The digital conversion server 100 may further include an application providing unit (not shown) that builds an application capable of operating a website- and/or app-based sales channel and the digitally converted reference book, and provides the built application to the user terminal.
  • the application built by the application providing unit is installed in the user terminal.
  • the user terminal purchases a reference book such as a book by generating an ID in a sales channel established by the application providing unit.
  • the user terminal can use the digitally converted reference book by authenticating the ID on the application.
  • the user terminal acquires the right to read and use the digitally converted reference book with the same contents as the purchased reference book.
  • the application built by the application providing unit according to the present invention may have the following safety devices implemented.
  • a procedure for verifying whether the purchaser of the reference book and the holder of the logged-in user terminal are the same may be implemented through the login and channel authentication procedures.
  • a procedure for requesting an irrevocable expression of intent to confirm purchase from the user terminal that purchased the reference book and providing access to the digital reference book to the user terminal only when there is an expression of intent to confirm purchase may be implemented. This is to prevent users who have not purchased reference books from using digital reference books, which are an additional function of purchasing reference books.
  • When the application provides a digital reference book to the user terminal, it can be provided as fragmented pages over a CDN/security-enhanced protocol rather than in the form of a PDF file.
  • FIG. 2 is a diagram showing a sequence of a digital reference book providing method according to an embodiment of the present invention.
  • A method for providing a digital reference book includes determining whether the copyright of a PDF (Portable Document Format) file of a digital reference book to be provided to a user terminal is held (S10); detecting a problem area and a commentary area in the PDF file and separating the problem area into a question area and an answer area (S20); extracting the elements from the question area, the answer area, and the commentary area and grouping the elements (S30); providing the elements grouped by the element extraction unit to the user terminal (S40); and/or, when the digital reference book providing system does not hold the copyright for the digital reference book, providing the commentary area to the user terminal without extracting the elements from the question area, the answer area, and the commentary area (S50).
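Purely as an orientation aid, the steps S10 to S50 could be orchestrated as in the sketch below; every component object and method name here is hypothetical.

```python
def provide_digital_reference_book(pdf_path, user_terminal, system):
    """High-level flow of the providing method (S10-S50), with hypothetical components."""
    if system.copyright_unit.owns_copyright(pdf_path):                 # S10
        pages = system.file_generator.split_into_page_files(pdf_path)
        areas = system.area_detector.detect(pages)                     # S20
        grouped = system.element_extractor.extract_and_group(areas)    # S30
        system.problem_provider.send(user_terminal, grouped)           # S40
    else:
        # No copyright: provide only the HTML/handwriting layers,
        # without extracting elements from the areas (S50).
        system.problem_provider.send_layers_only(user_terminal, pdf_path)
```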
  • The file generation unit, the area detection unit, the element extraction unit, the optical character recognition unit, the database unit, the problem providing unit, and/or the purchase authentication unit may be processors that execute successive processes stored in a memory, or may operate as software modules driven and controlled by a processor. Further, the processor may be a hardware device.
  • the digital reference book providing method may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer readable medium.
  • the computer readable medium may include program instructions, data files, data structures, etc. alone or in combination.
  • Program instructions recorded on the medium may be those specially designed and configured for the present invention or those known and usable to those skilled in computer software.
  • Examples of computer readable media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.
  • Examples of program instructions include not only machine code generated by a compiler but also high-level language codes that can be executed by a computer using an interpreter or the like.
  • the hardware device described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

Abstract

The present invention relates to a system for providing a digital reference book according to copyright ownership. The system for providing a digital reference book according to copyright ownership, according to the present invention, comprises a digital conversion server, wherein the digital conversion server comprises: a copyright determination unit for determining whether there is copyright on a portable document format (PDF) file of a digital reference book to be provided to a user terminal; a region detection unit that detects a problem region and an explanation region from the PDF file and divides the problem region into a question region and an answer region; an element extraction unit that extracts elements from the question region, the answer region, and the explanation region, and groups the elements; and a problem provision unit for providing, to the user terminal, the elements grouped by the element extraction unit, wherein, when the system for providing a digital reference book does not have the copyright on the digital reference book, the element extraction unit does not extract the elements from the question region, the answer region, and the explanation region, and the problem provision unit provides the explanation region to the user terminal.

Description

Digital reference book providing system and method according to copyright ownership
The present invention relates to a system and method for providing a digital reference book according to copyright ownership, and more particularly, to a system and method for creating a digital reference book that enables interaction with a user through different procedures depending on whether the system holds the copyright for the reference book.
In conventional book production, the process proceeds in the order of planning, manuscript, illustration, and typesetting, and after each step is completed a PDF file for the book is produced immediately before printing. Such a PDF file is designed for printing, and it is difficult to use it to implement the copyright protection, solution data recording, and various interactions required for digital content.
In order to implement the various functions that users want, the following steps are required: (1) systematically extract and store the data included in each book PDF, (2) restore the stored data by assembling it in the original font, styling, and layout, (3) plan a service so that the stored and extracted data operate in an appropriate form, and (4) develop an application that fits the service plan.
Due to the time and cost incurred by these complicated procedures, a PDF file is created for the publication of each book, but it is not converted into an application form and distributed, and a book-centered distribution structure is maintained. Such complex conventional procedures are causing a shortage in the supply of digital content such as digital reference books.
In the step of extracting data from the book PDF, OCR technology is used to recognize characters in images and extract them as text strings. When a font that the OCR model has not been trained on is used, the text string is extracted in an inaccurate form. Furthermore, when a specific font is used during data extraction with OCR technology, a license agreement for that font is required. In addition, the accuracy of the image-to-text-string conversion is not guaranteed to be 100%. Moreover, for passages containing mathematical formulas, (1) additional development of OCR technology that recognizes formulas and (2) development of technology such as LaTeX that converts the format of the recognized formulas are required.
In the step of storing the extracted data, learning questions rarely consist of simple text; they are implemented with images in passages, images in questions, circled choice numbers, and so on, so they must be classified and stored systematically. If text strings are extracted by processing this with simple OCR, all texts in a question are merged and extracted as a single text, regardless of whether they belong to detailed categories such as question number, question area, circled numbers, options, or illustrations. That is, data extracted as text strings without being classified according to context must be separated again to fit the context, but this work is very complicated and consumes much time and many resources. In order for the data extracted from each question to be assembled and restored in accordance with its context, sub-categories within the question, such as the question description, circled numbers, and illustrations, must be tagged and decomposed via OCR to facilitate grouping, but simple OCR does not support this.
In the step of restoring the stored data, the systematically stored data must be restored using the font used in the original data and reassembled in consideration of the styling (font size, letter spacing, layout, etc.). The previously used font must be employed in this assembly procedure, which may raise the issue of securing a font license. Meanwhile, assembling the original styling based on text data extracted using OCR is very difficult; not only text extraction using OCR but also systematic extraction of each styling element is required during extraction, which is very difficult.
An object of the present invention, which addresses the above problems, is to provide a digital reference book providing system and method that enable smooth interaction with the user by applying semantic grouping, using machine learning, to the elements included in a digital reference book.
Another object of the present invention is to provide a digital reference book providing system and method that enable smooth interaction with the user by hierarchically classifying and extracting the elements included in the commentary area of a digital reference book and applying a more accurate mapping to the problem area.
Another object of the present invention is to provide a digital reference book providing system and method that first determine whether the system holds the copyright for the reference book to be digitized, create the digital reference book through different procedures depending on ownership, and provide it to the user.
In order to solve the above problems, the digital reference book providing system according to the present invention performs semantic grouping using machine learning. Although the number and order of the specific steps may vary, a digital reference book that enables smooth interaction with the user can be created through the process below.
In order to systematically build a database of each question, column classification is performed using machine learning before OCR processing, through the following procedure: (1) detect problem areas within the page, (2) detect the circled-number area in each problem area, (3) decompose each problem area into the problem number, question text, images, sub-question areas, sub-areas, and circled numbers as individual number and text areas, and (4) calculate a semantic binding box for each question.
In order to implement a method without loss of styling, the digital reference book providing system according to the present invention does not spend resources on restoring styling; it discovers only the areas that require interaction and implements multi-layer interaction on top of them, through the following procedure: (1) detect problem areas within the page, (2) discover the areas in each problem area where answers need to be recorded.
Furthermore, when the target of digital conversion is the commentary area, which contains the correct answers and explanations for the problems in the problem sheet area, the layout, fonts, and styling are used repeatedly within the book, so data classification using these internal rules is possible. Specifically, the digital reference book providing system according to the present invention decomposes the commentary area item by item through the following procedure: (1) determine the center line of the page, (2) create a new page by converting the two-column structure into a one-column structure based on the determined center line, (3) detect separators within the page, (4) detect the area of each separator, (5) extract the area to be mapped, and (6) map the area to its problem.
On the other hand, when the system owns the copyright of the reference book with which the user wants to interact, it creates a digital reference book as described above and provides it to the user; for a reference book whose copyright the system does not hold, only an HTML layer and a handwriting layer are provided to the user terminal, without extracting elements from the problem area and the commentary area.
The present invention can provide a digital reference book providing system and method that enable smooth interaction with the user by applying semantic grouping, using machine learning, to the elements included in a digital reference book.
The present invention can provide a digital reference book providing system and method that enable smooth interaction with the user by hierarchically classifying and extracting the elements included in the commentary area of a digital reference book and applying a more accurate mapping to the problem area.
The present invention can provide a digital reference book providing system and method that first determine whether the system holds the copyright for the reference book to be digitized, generate the digital reference book through different procedures depending on ownership, and provide it to the user.
FIG. 1 is a diagram showing the configuration of a digital reference book providing system according to an embodiment of the present invention.
[Correction under Rule 91, 29.12.2022]
FIG. 2 is a diagram showing the sequence of a digital reference book providing method according to an embodiment of the present invention. FIGS. 3 to 18 show specific examples for explaining the present invention.
Hereinafter, some embodiments of the present invention will be described in detail with reference to exemplary drawings. In assigning reference numerals to the components of each drawing, it should be noted that the same components are given the same numerals as far as possible, even when they are shown in different drawings.
In describing the embodiments of the present invention, when it is determined that a detailed description of a related known configuration or function would hinder understanding of the embodiments, the detailed description is omitted.
In describing the components of the embodiments of the present invention, terms such as first, second, A, B, (a), and (b) may be used. These terms are only used to distinguish one component from another, and the nature, sequence, or order of the corresponding component is not limited by the term.
In this specification, the x coordinate means a horizontal coordinate value within a page, and the y coordinate means a vertical coordinate value within a page.
Referring to FIG. 1, the configuration and operation of a digital reference book providing system according to copyright ownership according to an embodiment of the present invention will be described below.
FIG. 1 is a diagram showing the configuration of a digital reference book providing system according to copyright ownership according to an embodiment of the present invention (hereinafter referred to as "the present digital reference book providing system").
Referring to FIG. 1, the present digital reference book providing system 1000 includes an external server 10 and/or a digital conversion server 100.
The external server 10 may be a server that stores PDF files requiring conversion, and the digital conversion server 100 may receive the PDF files requiring conversion from the external server 10.
The digital conversion server 100 separates and extracts the elements included in a PDF (Portable Document Format) file, stores and manages them, semantically groups the extracted elements to generate an interactive digital reference book, and can provide the digital reference book to a user terminal.
Each component of the digital conversion server 100 may be driven by known libraries. For example, the PDF library fitz may be used for converting PDF pages into images and extracting text, the computer vision library opencv (Open Source Computer Vision) may be used for image transformation and boundary extraction, pytorch and tensorflow may be used for training and running object detection models, and MathPix may be used as a MathML and LaTeX OCR API; these may run on Python, but the present invention is not limited thereto. In addition, libraries such as numpy, sklearn, re, and glob may be used and run on Python. The Font-forge library may be used to detect text in a PDF, match it with a suitable font, and extract it in a renderable form; the font format may be produced as TTF, TTF2, OTF, woff, or the like, but is not limited thereto.
The digital conversion server 100 may include a file generation unit 110, an area detection unit 120, and/or an element extraction unit 130.
The file generation unit 110 extracts a plurality of pages from the PDF file of the digital reference book and generates page files. That is, the file generation unit 110 generates page files by separating the plurality of pages of the PDF file into individual pages. Each page file may be a pixel-based bitmap file such as BMP, GIF, JPEG, or PNG.
The PDF file from which the file generation unit 110 extracts pages may be a PDF file previously generated and stored in the database unit 150. It may also be a PDF file generated by scanning a reference book with a user terminal such as a smartphone, scanner, or multifunction printer, or it may be received from another external server 10 connected through a communication network. A PDF file generated by scanning a reference book contains a plurality of pages, and the file generation unit 110 extracts each of them as a single page to generate the page files.
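Purely for illustration, the following sketch shows one way the page-splitting step of the file generation unit 110 could be performed with the fitz (PyMuPDF) library mentioned above. The function name, output directory layout, and rendering scale are assumptions of this description, not part of the original disclosure.

```python
import fitz  # PyMuPDF, the "fitz" PDF library referred to above


def split_pdf_into_page_images(pdf_path: str, out_dir: str) -> list:
    """Render every page of the PDF as a separate bitmap (PNG) page file."""
    doc = fitz.open(pdf_path)
    page_files = []
    for page_index in range(doc.page_count):
        page = doc.load_page(page_index)
        # rasterize at 2x scale so small glyphs stay legible
        pixmap = page.get_pixmap(matrix=fitz.Matrix(2, 2))
        out_path = f"{out_dir}/page_{page_index + 1:04d}.png"
        pixmap.save(out_path)
        page_files.append(out_path)
    doc.close()
    return page_files
```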
[Correction under Rule 91 29.12.2022]
Meanwhile, the PDF file from which the file generation unit 110 extracts pages may be a Structured PDF file or a Scanned PDF file, depending on how it was created. A Structured PDF is a PDF file generated electronically using computer software, and a Scanned PDF is a PDF file generated in image form by scanning a physical book with a scanner or the like. The major difference between them is that, in a Structured PDF, each element of a page such as text, an image, or a table can be identified (selected) and extracted accordingly, as shown in FIG. 3 below; the blue highlights in FIG. 3 show the selected elements. In a Scanned PDF, as shown in FIG. 4 below, individual elements cannot be selected or extracted. Therefore, when the prepared PDF file is a Scanned PDF, the file generation unit 110 restores its structure through an OCR process, which may be performed by the optical character recognition unit 140 described later. Alternatively, when the prepared PDF file is a Scanned PDF, the question area, answer area, commentary area, and the like may be extracted by the area detection unit 120 described later without going through the OCR process, and images of the extracted areas may be stored and managed in the database unit 150.
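As a minimal sketch, and assuming PyMuPDF, the decision of whether a file is a Structured PDF or a Scanned PDF that needs OCR could be made by checking how much selectable text the pages actually expose; the threshold below is an illustrative assumption.

```python
import fitz


def needs_ocr(pdf_path: str, min_chars_per_page: int = 20) -> bool:
    """Heuristic: if almost no extractable text exists, treat the file as a Scanned PDF."""
    doc = fitz.open(pdf_path)
    page_count = doc.page_count
    # Structured PDFs expose selectable text objects; scanned PDFs are page-sized images.
    total_chars = sum(len(page.get_text("text").strip()) for page in doc)
    doc.close()
    return total_chars < min_chars_per_page * max(page_count, 1)
```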
[Correction under Rule 91 29.12.2022]
<FIG. 3: Structured PDF example>
[Correction under Rule 91 29.12.2022]
<FIG. 4: Scanned PDF example>
The digital reference book according to the present invention includes a question paper area and a commentary area. The question paper area is usually laid out in a two-column format regardless of subject, grade, or book type. The number of questions observed on one page typically ranges from four to eight. Most questions are multiple choice; short-answer and open-ended questions may appear, but not frequently. The most important element to find in the question paper area is the problem area, which consists of a problem index, a problem title, and choices, and has a similar form regardless of the subject.
Unlike the problem areas of the question paper area, the commentary area has no geometric features shared across subjects or book types, but within a given commentary all pages and elements use the same layout design and element shapes. No overall trend is found in the commentary area: each book has many individual characteristics, there is no commonly found pattern such as the two-column structure, there is no multiple-choice pattern, and each book (especially each book series) frequently uses its own fonts, styling, and layout. For a commentary area with these characteristics, it is very likely that machine learning for the semantic binding needed to implement interaction at the question level will not work well.
For the question paper area, the area detection unit 120 according to the present invention extracts, from the page files generated by the file generation unit 110, elements such as the full page image, the text contained in the page, HTML, CSS, fonts, and background vector images. The area detection unit 120 then converts each page file into an HTML page. When a Structured PDF is converted into an HTML page, each element remains detectable (clickable) after conversion; for a Scanned PDF, the whole page is extracted as a single image and individual elements cannot be detected. Therefore, as described above, a Scanned PDF needs to be converted through OCR processing into a form in which the characters and numbers in the image can be detected. After converting each page file into an HTML page, the area detection unit 120 assigns an id value to each identified (selectable) element within the page. A selectable element includes elements such as text, images, or tables, and may include, for example, a problem area, the question area and answer area within the problem area, and the circled-number choices within the answer area.
More specifically, the area detection unit 120 converts the Structured PDF files produced during the publisher's book production process into HTML, a form usable by the service. The PDF files used here have the characteristics that text can be copied, font information is accurately represented, and object elements can be selected individually.
[Correction under Rule 91 29.12.2022]
After extracting all the elements contained in the PDF file, the area detection unit 120 merges the images into one large background image, and for the text it generates a new font and places the text at the same position as in the original PDF. Specifically, the following procedure is performed (see FIG. 5): (1) extract all objects (elements) inside the PDF, together with their position and size values, using a PDF rendering library; (2) among the extracted objects, merge the background image and the object images into one image (placing the object images at their positions within the background image based on their position and size values); (3) for the extracted text objects, generate a new font matching the form of the extracted text; (4) using the texts and the generated font, place the texts in a page-sized matrix at the same positions and sizes as in the original and add styling; (5) internalize the completed image and the text matrix into HTML in tag form.
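The following is a simplified sketch of steps (1) to (5) under stated assumptions: it uses PyMuPDF, reuses the span's original font name instead of generating a new font (step (3) is omitted), and rasterizes the whole page as the merged background image rather than merging only the non-text objects. It is meant only to show the shape of the resulting HTML, not the disclosed implementation.

```python
import base64
from html import escape

import fitz


def page_to_html(page: "fitz.Page") -> str:
    """Background image plus absolutely positioned text spans, embedded in HTML tags."""
    # (1)-(2): rasterize the page once and use it as the merged background image
    pix = page.get_pixmap(matrix=fitz.Matrix(2, 2))
    bg = base64.b64encode(pix.tobytes("png")).decode("ascii")

    # (3)-(4): place every text span at its original position with basic styling
    spans_html = []
    for block in page.get_text("dict")["blocks"]:
        for line in block.get("lines", []):
            for span in line["spans"]:
                x0, y0, _, _ = span["bbox"]
                style = (f"position:absolute; left:{x0:.1f}pt; top:{y0:.1f}pt; "
                         f"font-size:{span['size']:.1f}pt; font-family:'{span['font']}';")
                spans_html.append(f'<span style="{style}">{escape(span["text"])}</span>')

    # (5): internalize the merged image and the text matrix into HTML in tag form
    return (f'<div style="position:relative; width:{page.rect.width}pt; '
            f'height:{page.rect.height}pt;">'
            f'<img src="data:image/png;base64,{bg}" '
            f'style="position:absolute; left:0; top:0; width:100%;"/>'
            + "".join(spans_html) + "</div>")
```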
[Correction under Rule 91 29.12.2022]
<FIG. 5: Method of converting a Structured PDF into HTML>
Meanwhile, when a typeset PDF file cannot be obtained, the area detection unit 120 may use an image-based PDF file produced through a scanning procedure.
In a Structured PDF file, objects can be selected, so a text block can be designated and highlighted, images and tables can be selected, and a selector tag can be declared by accessing an object. In a Scanned PDF file, on the other hand, the whole page is implemented as an image, so text blocks cannot be designated and highlighted, images and tables cannot be selected, and tags must be declared using machine-learning selector detection.
In a Structured PDF file, all elements of each page are implemented in a recognizable form and classes are assigned individually or collectively, so javascript events can be implemented for a specific class. In a Scanned PDF file, however, the whole image is converted into a single class, so the parts of each image that require interaction must first be found and the points requiring javascript events must be constructed as an overlay.
When converting a Scanned PDF into HTML, the area detection unit 120 discovers the text areas and implements text block areas so that text blocks can be designated and highlighted. Specifically, the following procedure is performed: (1) determine whether a given character is text based on an OCR model; (2) detect the text box so that the text can be designated, selected, and made interactive; (3) overlay a text box at the same position for each detected text.
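As one possible sketch of steps (1) to (3), the word boxes returned by an off-the-shelf OCR engine could be turned into absolutely positioned overlay elements. The original text only says "OCR model"; pytesseract is used here purely for illustration, and the class name and attribute are assumptions.

```python
import pytesseract
from PIL import Image
from pytesseract import Output


def text_box_overlays(image_path: str) -> str:
    """Detect word boxes with OCR and emit overlay divs at the same positions."""
    image = Image.open(image_path)
    data = pytesseract.image_to_data(image, output_type=Output.DICT)
    overlays = []
    for i, word in enumerate(data["text"]):
        # skip empty detections and non-text regions (negative confidence)
        if not word.strip() or float(data["conf"][i]) < 0:
            continue
        left, top = data["left"][i], data["top"][i]
        width, height = data["width"][i], data["height"][i]
        overlays.append(
            f'<div class="text-box" style="position:absolute; left:{left}px; top:{top}px; '
            f'width:{width}px; height:{height}px;" data-text="{word}"></div>')
    return "\n".join(overlays)
```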
The area detection unit 120 also sets position-based tag points so that events can be implemented on the choices (selectors) within a question. Specifically, the following procedure is performed: (1) identify the questions within the page; (2) identify the choice area within each question; (3) determine the position of each circled number within the choices; (4) assign to each circled-number area a tag on which a javascript event can be set.
However, even if a Structured PDF file or a Scanned PDF file is effectively converted into HTML as described above, HTML that has not been given meaningful tags, and is therefore not semantically structured, cannot be used for interaction.
[Correction under Rule 91 29.12.2022]
Referring to FIG. 6, with tags based on position values, the screen the user sees on the web page is the same, but the elements are not semantically grouped, so they are difficult to connect in semantic units; events such as scoring and linking to answers are therefore difficult to implement.
[Correction under Rule 91 29.12.2022]
<FIG. 6: Tags based on position values>
[Correction under Rule 91 29.12.2022]
Referring to FIG. 7, on the other hand, with tags based on semantic values, the screen the user sees on the web page is the same as in the position-based case, but the elements are semantically grouped and easy to connect in semantic units; events such as scoring and linking to answers are therefore easy to implement.
[Correction under Rule 91 29.12.2022]
<FIG. 7: Tags based on semantic values>
That is, the class id values (tags) assigned to the selectable elements are generated from position information within each page during the PDF-to-HTML conversion, and the id values themselves carry no meaning. Therefore, to group these id values into semantic units such as pages, problems, and choices, the machine learning process described below is applied to the page file images. Grouping here may mean, for example, grouping the problems, choices, and multiple-choice items attached to one passage.
In other words, the area detection unit 120 groups the classes for each problem area and assigns an id. Specifically, this is done by the following procedure: (1) detect the problem areas within the page; (2) detect the areas within each problem area where interaction needs to be applied; (3) detect the elements present in each area.
The area detection unit 120 first detects a problem area in each page file and assigns it an ID, and then separates the problem area into a question area and an answer area that require interaction. IDs (sub-ids) are also assigned to the detected question area and/or answer area. The areas requiring interaction within a problem area may differ by subject; for example, Korean and English reference books contain bundled passages, and for the learner's convenience it is desirable that such a passage and its questions can be viewed on one screen at the same time. According to an embodiment, the area detection unit 120 may detect the answer area within the problem area and classify the remaining area as the question area.
[Correction under Rule 91 29.12.2022]
The area detection unit 120 according to the present invention detects the problem area (A, FIG. 9) in the page files extracted by the file generation unit 110. The area detection unit 120 may extract object information in JSON (JavaScript Object Notation) format using the fitz library; the fitz library can extract text from a page as shown in FIG. 8 below.
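A minimal sketch of this kind of object extraction with PyMuPDF follows; the record fields kept here (bounding box and concatenated text) are an assumption chosen for illustration, not the exact schema of FIG. 8.

```python
import json

import fitz


def extract_objects_as_json(page: "fitz.Page") -> list:
    """Extract the page's text objects with positions from fitz's JSON output."""
    page_dict = json.loads(page.get_text("json"))   # blocks with bbox, lines, spans
    text_blocks = []
    for block in page_dict["blocks"]:
        if block.get("type") != 0:                  # 0 = text block, 1 = image block
            continue
        text = " ".join(span["text"]
                        for line in block["lines"] for span in line["spans"])
        text_blocks.append({"bbox": block["bbox"], "text": text})
    return text_blocks
```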
[Correction under Rule 91 29.12.2022]
<FIG. 8: Example of object extraction using the fitz library>
[Correction under Rule 91 29.12.2022]
The area detection unit 120 detects the text in a page through the fitz library. The area detection unit 120 may detect each problem area by using, as a boundary, a portion where the spacing between texts, horizontally or vertically, is longer than the spacing between the other texts.
When detecting the problem areas in each page file, the area detection unit 120 may generate a bounding box surrounding each problem area using machine learning. The area detection unit 120 may add a unified background color to the problem area within the bounding box generated by machine learning. For the remaining area excluding the bounding boxes, the area detection unit 120 determines whether another problem area exists by dilating adjacent pixels using opencv. In this specification, the bounding boxes may be generated based on a Fast RCNN-based machine learning model, which will be described later.
Only the information within the bounding boxes generated by the area detection unit 120 is allocated as the working area, so processing according to the present invention can be faster than processing the whole page. The area detection unit 120 also detects and stores information on the position and size of each generated bounding box.
Meanwhile, when detecting the problem areas in each page file, the area detection unit 120 may determine that the layout of the page is two columns when both the leftmost x coordinate (x1) and the rightmost x coordinate (x2) of a bounding box are greater than the x coordinate of the horizontal midpoint of the page (x half). Here, the horizontal midpoint coordinate of the page may be the coordinate value corresponding to the center point of the whole page or to the center of the line segment connecting the left and right sides of the page.
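The two-column test described above reduces to a simple coordinate comparison; the sketch below assumes (x1, y1, x2, y2) bounding boxes, with the function name chosen for illustration.

```python
def is_two_column_layout(page_width: float, box: tuple) -> bool:
    """A bounding box lying entirely to the right of the page's horizontal midpoint
    (both x1 and x2 greater than x_half) indicates a two-column layout."""
    x_half = page_width / 2
    x1, _, x2, _ = box  # (x1, y1, x2, y2) of a detected problem-area bounding box
    return x1 > x_half and x2 > x_half
```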
[Correction under Rule 91 29.12.2022]
<FIG. 9: Example of problem area extraction using bounding boxes>
[Correction under Rule 91 29.12.2022]
The area detection unit 120 separates a detected problem area into a question area and an answer area (B, FIG. 10), as shown in FIG. 10. When separating the problem area into a question area and an answer area, the area detection unit 120 first selects and extracts the answer area, and then selects the remaining problem area, excluding the selected answer area, as the question area.
In the answer area, the choices may be arranged horizontally or vertically. When the answer area is arranged vertically, the answer area is selected as follows.
When detecting the answer area in a page file, the area detection unit 120 may detect rows containing circled numbers. When a circled number of a consecutive series is detected at the leftmost position of a row, and circled numbers increasing in series are detected downward in the same column, the area detection unit 120 selects the area of the problem area containing the sentences with the detected circled numbers as the answer area.
When the answer area is arranged horizontally, the area detection unit 120 may detect a row in which the circled numbers increase consecutively. When a series of consecutive circled numbers is detected along a row, the area detection unit 120 selects the area of the problem area containing the sentences with the detected circled numbers as the answer area.
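The following sketch illustrates both cases on plain extracted text lines: it checks for a consecutive run of circled numbers either within one line (horizontal choices) or at the start of successive lines (vertical choices). The minimum run length of three is an illustrative assumption.

```python
CIRCLED = "①②③④⑤⑥⑦⑧⑨⑩"  # circled numbers used as answer choices


def looks_like_answer_area(lines: list) -> bool:
    """Detect a run of consecutively increasing circled numbers."""
    # horizontal layout: e.g. "① ... ② ... ③ ... ④ ... ⑤" within one line
    for line in lines:
        found = [CIRCLED.index(ch) for ch in line if ch in CIRCLED]
        if len(found) >= 3 and found == list(range(found[0], found[0] + len(found))):
            return True
    # vertical layout: each line starts with the next circled number
    leading = [CIRCLED.index(line.lstrip()[0]) for line in lines
               if line.lstrip() and line.lstrip()[0] in CIRCLED]
    return len(leading) >= 3 and leading == list(range(leading[0], leading[0] + len(leading)))
```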
[Correction under Rule 91 29.12.2022]
<FIG. 10: Example of extracting the answer area within a problem area>
[Correction under Rule 91 29.12.2022]
As shown in FIG. 11, the element extraction unit 130 extracts elements (C to F, FIG. 11) from the question area and the answer area separated by the area detection unit 120, and stores the extracted elements in the database unit 150. The elements include a problem index (C), text (D), images (E), text boxes (not shown), and tables (not shown). When extracting the problem index (C), the element extraction unit 130 selects and extracts the index located at the leftmost side of the question area. The element extraction unit 130 may select and extract an image (E) from the problem area. The problem index extracted by the element extraction unit 130 can be linked to the answer area of the commentary area, described later, and used for scoring automation.
What remains in the question area, apart from the problem index (C) and the images (E) extracted by the element extraction unit 130, may be text (D). Each text can be selected and extracted based on the area of its own bounding box according to its position. The element extraction unit 130 may also select and extract a series of options (F) from the text of the selected question area. A series of options (F) is selected and extracted when markers such as 'ㄱ. ㄴ. ㄷ.', '(ㄱ) (ㄴ) (ㄷ)', '가. 나. 다.' or '(가) (나) (다)' are detected in sequence; the corresponding sentences or area are then selected and extracted as the series of options (F). That is, the element extraction unit 130 can select and extract, as a series of options (F), the sentences in an area where consonants are detected in sequential order, or where consonant-vowel combinations are detected in sequential order.
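A simple pattern-matching sketch of this option-series detection is shown below. The regular expression and the requirement of at least two markers in series are illustrative assumptions; in practice the markers would typically be taken from the per-text bounding boxes rather than from raw strings.

```python
import re

# Marker sequences that indicate a series of options inside a question:
# "ㄱ. ㄴ. ㄷ.", "(ㄱ) (ㄴ) (ㄷ)", "가. 나. 다." or "(가) (나) (다)"
CONSONANT_SERIES = ["ㄱ", "ㄴ", "ㄷ", "ㄹ", "ㅁ"]
SYLLABLE_SERIES = ["가", "나", "다", "라", "마"]


def option_series_markers(text: str) -> list:
    """Return option markers found in the text, in order of appearance."""
    pattern = r"(?:\(([ㄱ-ㅎ가-마])\))|(?:([ㄱ-ㅎ가-마])\.)"
    return [a or b for a, b in re.findall(pattern, text)]


def is_option_series(text: str) -> bool:
    markers = option_series_markers(text)
    return len(markers) >= 2 and (
        markers == CONSONANT_SERIES[: len(markers)]
        or markers == SYLLABLE_SERIES[: len(markers)])
```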
The element extraction unit 130 stores each selected and extracted element in the database unit 150. Each element selected and extracted by the element extraction unit 130 has a bounding box, and the area within that bounding box is stored in the database unit 150 in the form of an image.
In addition, in the area that the area detection unit 120 selected as the answer area because it contains sentences with circled numbers, the element extraction unit 130 determines the sentences containing the circled numbers to be the choices. The element extraction unit 130 extracts the aforementioned elements from the answer area containing the choices and stores the extracted elements in the database unit 150.
[Correction under Rule 91 29.12.2022]
<FIG. 11: Example of element extraction within a problem area>
Meanwhile, the format of the questions may differ depending on the subject of the reference book. For example, when the subject of the reference book is mathematics, formulas may exist within the problem area. The elements extracted by the element extraction unit 130 may further include formulas, and when extracting a formula element in the problem area, the element extraction unit 130 may use a math OCR service (for example, Mathpix) to extract the formula in MathML or LaTeX form. Extraction of formula elements is not limited to the case where the subject of the reference book is mathematics; a formula element can be extracted whenever a formula is detected in the problem area.
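For illustration only, a call to such a math OCR service might look like the sketch below. The endpoint URL, header names, and request fields follow the publicly documented Mathpix v3 API as understood by this description and are assumptions that should be verified against the current Mathpix documentation before use.

```python
import base64

import requests

MATHPIX_URL = "https://api.mathpix.com/v3/text"  # assumed endpoint, verify against docs


def extract_formula(image_path: str, app_id: str, app_key: str) -> dict:
    """Send a cropped formula image to the OCR service and receive LaTeX back."""
    with open(image_path, "rb") as f:
        src = "data:image/png;base64," + base64.b64encode(f.read()).decode("ascii")
    response = requests.post(
        MATHPIX_URL,
        headers={"app_id": app_id, "app_key": app_key},
        json={"src": src, "formats": ["latex_styled"]},  # MathML can also be requested per the docs
        timeout=30,
    )
    response.raise_for_status()
    return response.json()  # contains the recognized formula
```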
[Correction under Rule 91 29.12.2022]
When the subject of the reference book is Korean or English, a bundled passage may be included, as shown in FIG. 12. A bundled passage is a passage to which a plurality of questions belong.
[Correction under Rule 91 29.12.2022]
The area detection unit 120 can detect a bundled passage number (G, FIG. 12) at the top of a problem area. When a bundled passage number is detected, the area detection unit 120 performs a document merging operation so that the problem areas corresponding to the bundled passage number are included (see the commentary area described later), and generates a bounding box for the area containing the bundled passage number. The area detection unit 120 may generate the bounding box so that it also covers the problem areas corresponding to the bundled passage number. For example, when '35~37' is the bundled passage number as in FIG. 12, the area detection unit 120 generates the bounding box so that the bundled passage and all the problem areas with problem indexes (C) 35 to 37 are included. When extracting elements from the question area and the answer area and storing them in the database unit 150, the element extraction unit 130 also extracts elements from the bundled passage and stores them in the database unit 150; the elements extracted from the question area and answer area may be stored in the database unit 150 so as to be linked with the elements extracted from the bundled passage.
Detection of a bundled passage is not limited to the case where the subject of the reference book is Korean or English; whenever the area detection unit 120 detects a number sequence in the 'number~number' format, like the bundled passage number described above, the bundled passage element can be extracted and stored.
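A small sketch of this 'number~number' detection is given below; the accepted tilde variants and the upper bound on digits are illustrative assumptions.

```python
import re
from typing import Optional, Tuple

# "number~number" patterns such as "35~37" mark a passage shared by several questions.
BUNDLE_PATTERN = re.compile(r"\[?\s*(\d{1,3})\s*[~∼〜-]\s*(\d{1,3})\s*\]?")


def find_bundle_range(header_text: str) -> Optional[Tuple[int, int]]:
    """Return (first, last) question numbers of a bundled passage header, if present."""
    match = BUNDLE_PATTERN.search(header_text)
    if not match:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    return (first, last) if first < last else None
```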
[Correction under Rule 91 29.12.2022]
On the other hand, unlike FIG. 12, a bundled passage in a particular area may not be given a bundle number. In that case, the area remaining after each problem area has been detected is defined as the passage area, and the remaining passage is mapped to the related problem areas through a per-page review procedure and post-correction.
[Correction under Rule 91 29.12.2022]
<FIG. 12: Bundled passage number within a bundled passage>
The element extraction unit 130 groups the extracted elements, i.e., the problem index (C), text (D), images (E), text boxes (not shown), tables (not shown), the series of options (F), page information, and the like, into one problem and stores them in the database unit 150. The grouped elements can be provided to the user terminal by the problem providing unit 160 described later. When there is a bundled passage, it is desirable that the bundled passage and its corresponding questions be provided on one screen at the same time. The element extraction unit 130 therefore extracts and stores in the database unit 150 whether a bundled passage exists in the page file, which questions (i.e., which problem indexes) the bundled passage corresponds to, and, if the bundled passage runs over onto the next page, the page information at that point. When grouping the extracted elements, the element extraction unit 130 may assign a specific class id value to the elements.
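One possible shape of such a grouped record is sketched below. The field names and values are illustrative assumptions of this description; the disclosure does not prescribe a particular schema.

```python
# Illustrative shape of one grouped problem record stored in the database unit 150.
grouped_problem = {
    "class_id": "p-035",                  # id assigned to the group
    "problem_index": 35,                  # C: problem index
    "page": 12,                           # page information
    "bundle": {                           # bundled passage, if any
        "range": [35, 37],
        "passage_image": "bundle_35_37.png",
    },
    "question": {
        "texts": ["text_035_01.png"],     # D: text elements (stored as bounding-box images)
        "images": ["figure_035_01.png"],  # E: image elements
        "options_series": ["ㄱ", "ㄴ", "ㄷ"],  # F: series of options, if any
    },
    "answer_area": {"choices": ["①", "②", "③", "④", "⑤"]},
}
```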
The optical character recognition unit 140 may extract text from the elements extracted and stored in the database unit 150, through optical character recognition (OCR) processing or the fitz library, and store the text back in the database unit 150. The optical character recognition unit 140 may apply a known optical character recognition algorithm to the elements.
The problem providing unit 160 provides the elements grouped by the element extraction unit 130 on the screen of the user terminal. The provided elements may be delivered as an HTML layer, each arranged at the position corresponding to its position value.
Meanwhile, the problem providing unit 160 may provide a multi-layer structure to the user terminal. That is, a writing layer for handwriting may be provided on top of the HTML layer in which the grouped problems are provided to the user terminal. The user terminal can write and solve problems on the writing layer provided over the HTML layer. The problem providing unit 160 may recognize an answer check through a known library capable of recognizing handwriting written on the writing layer. When an answer is checked by handwriting on the writing layer at the same position as the position value of one of the circled numbers contained in the answer area of the HTML layer, the problem providing unit 160 can automate scoring of the problem by comparing the user terminal's answer check with the correct answer, based on the problem index and correct answer of the answer area described later.
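The position comparison between the writing layer and the HTML layer can be sketched as follows; handwriting recognition itself is assumed to be handled by a separate library, and the data shapes used here are illustrative assumptions.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in page coordinates


def detect_checked_choice(mark_center: Tuple[float, float],
                          choice_boxes: Dict[int, Box]) -> Optional[int]:
    """Map a handwriting mark on the writing layer to the circled number whose
    HTML-layer box contains the mark's position."""
    x, y = mark_center
    for choice, (x1, y1, x2, y2) in choice_boxes.items():
        if x1 <= x <= x2 and y1 <= y <= y2:
            return choice
    return None


def grade(problem_index: int, mark_center: Tuple[float, float],
          choice_boxes: Dict[int, Box], answer_key: Dict[int, int]) -> Optional[bool]:
    """Compare the checked choice with the correct answer linked via the problem index."""
    checked = detect_checked_choice(mark_center, choice_boxes)
    if checked is None:
        return None  # no recognizable answer check
    return checked == answer_key.get(problem_index)
```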
[Correction under Rule 91 29.12.2022]
When the target of conversion is the commentary area, which contains the correct answers and explanations for the problems of the question paper area, the layout, fonts, and styling are used repeatedly within the book, so data classification using the book's internal rules is possible. Specifically, referring to FIG. 13, the area detection unit 120 decomposes the commentary area into individual items through the following procedure: (1) determine the center line of the page; (2) convert the two-column structure into a one-column structure based on the determined center line to generate a page; (3) detect the separators within the page; (4) detect the area of each separator; (5) extract the areas to be mapped; (6) map each area to its problem.
[Correction under Rule 91 29.12.2022]
<FIG. 13: Item decomposition procedure for the commentary area>
More specifically, the area detection unit 120 extracts the whole image file from each page file and extracts the commentary area from the extracted image file. The extraction of the commentary area may proceed by the same process as the detection of the problem area described above.
[Correction under Rule 91 29.12.2022]
When extracting the commentary area, the area detection unit 120 first detects, in the whole image file extracted from each page file of an n-column layout, the center line that serves as the dividing line (midline, H, FIG. 14), as shown in FIG. 14. The area detection unit 120 can also detect the outer line (outline, I, FIG. 14), the innermost dividing line bounding the content within each page. The area detection unit 120 may cut the image containing the inner content at the center line, based on the center line and the outer line, and merge the parts into one image. An example merged into one image is shown in FIG. 15. Such a merged image may be one in which the images corresponding to all pages of the commentary area are merged into a single one-column image.
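Once the center line and outer lines have been found, the two-column-to-one-column merge for a single page can be sketched with opencv and numpy as below. The function assumes a color page image and already-detected x coordinates; detection of those coordinates is outside this sketch.

```python
import cv2
import numpy as np


def merge_two_columns(page_image: np.ndarray, midline_x: int,
                      left_outline: int, right_outline: int) -> np.ndarray:
    """Cut a two-column commentary page at the detected center line and stack the
    left and right columns vertically into a single one-column image."""
    left_col = page_image[:, left_outline:midline_x]
    right_col = page_image[:, midline_x:right_outline]

    # pad the narrower column with white so both columns have the same width
    width = max(left_col.shape[1], right_col.shape[1])

    def pad(col: np.ndarray) -> np.ndarray:
        return cv2.copyMakeBorder(col, 0, 0, 0, width - col.shape[1],
                                  cv2.BORDER_CONSTANT, value=(255, 255, 255))

    return np.vstack([pad(left_col), pad(right_col)])
```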
[Correction under Rule 91 29.12.2022]
<FIG. 14: Example of detecting the center line and outer line in a page>
[Correction under Rule 91 29.12.2022]
<FIG. 15: Example of a single one-column image>
The element extraction unit 130 divides the image merged into one column into chapter areas. When dividing the merged image into chapter areas, the element extraction unit 130 selects a first delimiter that separates chapters (see FIG. 16), extracts the feature points of the selected first delimiter (see FIG. 17), and detects second delimiters (other chapters) based on the extracted first delimiter to separate the chapters (see FIG. 18).
[Correction under Rule 91 29.12.2022]
<FIG. 16: Example of selecting the first delimiter that separates chapters>
[Correction under Rule 91 29.12.2022]
<FIG. 17: Example of feature point extraction for the first delimiter>
[Correction under Rule 91 29.12.2022]
<FIG. 18: Example of detecting second delimiters (other chapters) based on the first delimiter>
Using the feature points described above, the element extraction unit 130 separates the merged commentary area image by chapter and by section (detecting the separator areas), and extracts the page information and the problem indexes for each chapter (extracting the areas to be mapped). The element extraction unit 130 extracts the correct answer and explanation for each problem index and stores them in the database unit 150. The element extraction unit 130 then links the problem index of the problem area in the question paper area with the problem index in the commentary area (mapping the areas to the problems). The problem providing unit 160 can therefore automatically score a problem by comparing the position information of the answer check on the user terminal with the position information of the answer area of the problem, as described above.
The area detection unit 120 uses a pre-trained machine learning model for detecting the problem areas within a page, detecting the areas within a problem area that require interaction (for example, the answer area), and detecting the elements within a problem area. The machine learning model used here is based on a Fast RCNN machine learning model, in which an RPN (Region Proposal Network) is combined with the post-CNN classification and bounding box regression, compensating for the slow speed. Building on a project that digitizes book-form data (A Unified Toolkit for Deep Learning Based Document Image Analysis), an initial model of the same architecture, Faster-RCNN, was implemented to apply transfer learning to the data; the initial model was set up by pre-training on 350,000 pages of the IBM PubLayNet dataset, as single-class object detection, with hyperparameter tuning performed to suit book image data. The model was then further pre-trained on 20,000 pages of commercial reference book and workbook images using grayscale and image augmentation techniques to improve performance.
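For illustration, a single-foreground-class Faster R-CNN of the kind described here could be set up as sketched below. The torchvision implementation is a choice of this description (the original only names pytorch and tensorflow as possible toolkits), and the weights argument and class count are assumptions; fine-tuning on PubLayNet-style pages and on augmented workbook scans would follow separately.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor


def build_problem_area_detector(num_classes: int = 2) -> torch.nn.Module:
    """Faster R-CNN detector for single-class region detection:
    class 0 is background, class 1 is the problem/answer/commentary region."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the box predictor head for the single foreground class
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model
```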
The database unit 150 according to the present invention may store information on both the reference books for which the system according to the present invention holds the copyright or has obtained a copyright license, and the reference books for which it holds no copyright and has obtained no license. Alternatively, the digital conversion server 100 according to the present invention can access an external server 10 that contains such information.
Meanwhile, the present digital reference book providing system 1000 may further include a copyright determination unit (not shown). The copyright determination unit can determine whether the present system holds the copyright of a reference book file (PDF file) uploaded to the system by a user terminal, or of a reference book file to be provided to a user terminal. The database unit 150 stores all reference book PDF files while distinguishing the PDF files for which the present system holds the copyright from those for which it does not. Based on the per-file copyright information stored in the database unit 150, the copyright determination unit can then determine whether the present system holds the copyright of the reference book file uploaded from the user terminal or of the reference book file to be provided to the user terminal.
When the present system generates and provides a digital reference book for a reference book whose copyright it holds, it provides the digital reference book to the user terminal through the processes of the area detection unit 120 and the element extraction unit 130 described above. On the other hand, when the present system must provide a digital reference book for a reference book whose copyright it does not hold, the system can generate and provide to the user terminal only additional layers unrelated to the copyright, for example layers that allow nothing more than automatic scoring. That is, the present system can provide only the HTML layer and/or the writing layer to the user terminal, without extracting elements from the question paper area and the commentary area. If necessary, only the writing layer may be provided, without even performing area detection. In other words, when a user stores and reads, on the user terminal, the PDF file of a reference book whose copyright the present system does not hold and for which no license has been obtained, the problem providing unit 160 provides to the user terminal an HTML layer presenting the problems of the reference book and/or a writing layer. When the user checks an answer on the writing layer of the user terminal, i.e., at the same position as the position value of a circled number in the answer area of the HTML layer as described above, and automatic scoring finds the checked answer to differ from the correct answer for that question, the problem providing unit 160 displays the commentary area of the answer book corresponding to that question on the screen of the user terminal.
The present digital reference book providing system 1000 may further include a purchase authentication unit 170 that verifies that the user has lawfully purchased the book corresponding to the digital reference book. The purchase authentication unit 170 transmits a specific random number, generated together with a specific page, to the user terminal.
The user terminal writes the received random number on the same page as the specific page received from the purchase authentication unit 170, photographs the page on which the random number is written, and transmits the image to the purchase authentication unit 170. The purchase authentication unit 170 compares the elements contained in the page on which the user terminal wrote the random number with the elements contained in the same page of the reference book stored in the database unit 150, and when the same elements are detected, it can determine that the user has lawfully purchased the book. The elements extracted and grouped as described above are then combined into the HTML layer and provided to the user terminal.
The present digital reference book providing system 1000 may conclude a copyright license agreement with reference book publishers and/or their distributors to distribute specific reference books, receive the books for distribution, or receive the PDF files of those specific reference books from an external server 10 operated by the publishers and/or their distributors.
The digital conversion server 100 according to the present invention may further include an application providing unit (not shown) that builds a website- and/or application-based sales channel and an application in which the digitally converted reference books can operate, and provides the built application to the user terminal. The application built by the application providing unit is installed on the user terminal.
The user terminal creates an ID in the sales channel built by the application providing unit and purchases a reference book such as a printed book. The user terminal can use the digitally converted reference book by authenticating the ID in the application.
When the purchase confirmation procedure is completed after the purchased reference book has been received, the user terminal acquires the right to view and use the digitally converted reference book having the same contents as the purchased reference book.
The application built by the application providing unit according to the present invention may implement the following safeguards. First, the application may implement a procedure that verifies, through login and channel authentication, that the purchaser of the reference book and the holder of the logged-in user terminal are the same person. The application may also implement a procedure that requests an irrevocable declaration of purchase confirmation from the user terminal that purchased the reference book and grants the user terminal access to the digital reference book only when the purchase confirmation has been declared. This is intended to prevent users who have not purchased the reference book from using the digital reference book, which is an additional feature of purchasing the reference book. When the application provides the digital reference book to the user terminal, it may provide it as fragmented pages over a CDN/security-enhanced protocol rather than as a PDF file.
FIG. 2 illustrates the sequence of a digital reference book providing method according to an embodiment of the present invention.
Referring to FIG. 2, a digital reference book providing method according to an embodiment of the present invention may include: determining whether the copyright in the PDF (Portable Document Format) file of a digital reference book to be provided to a user terminal is held (S10); detecting a problem area and a commentary area in the PDF file and separating the problem area into a question area and an answer area (S20); extracting elements from the question area, the answer area, and the commentary area and grouping the elements (S30); providing the elements grouped by the element extraction unit to the user terminal (S40); and/or, when the digital reference book providing system does not hold the copyright in the digital reference book, providing the commentary area to the user terminal without extracting the elements from the question area, the answer area, and the commentary area (S50).
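The control flow of steps S10 to S50 can be summarized in the following sketch. Every helper function is a placeholder for the corresponding unit described above (copyright determination, area detection, element extraction, provision); their names and signatures are assumptions made for this example.

# Control-flow sketch of steps S10-S50; all helpers are placeholders.
from typing import Any, List, Tuple

def holds_copyright(pdf_path: str) -> bool: ...            # S10 decision, e.g. license lookup
def detect_areas(pdf_path: str) -> Tuple[Any, Any]: ...    # S20: (problem_area, commentary_area)
def split_problem_area(problem_area: Any) -> Tuple[Any, Any]: ...  # question / answer areas
def extract_elements(*areas: Any) -> List[Any]: ...        # S30: text, image, numbering elements
def group_elements(elements: List[Any]) -> List[Any]: ...  # S30: group per question item

def provide(pdf_path: str, send_to_terminal) -> None:
    if holds_copyright(pdf_path):                                    # S10
        problem_area, commentary_area = detect_areas(pdf_path)       # S20
        question_area, answer_area = split_problem_area(problem_area)
        elements = extract_elements(question_area, answer_area, commentary_area)  # S30
        send_to_terminal(group_elements(elements))                   # S40
    else:
        # S50: copyright not held - no extraction; only the commentary area is provided
        _, commentary_area = detect_areas(pdf_path)
        send_to_terminal(commentary_area)

In practice the S50 branch means an unlicensed title is still viewable, but only as its commentary pages, without the interactive question elements.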
A detailed description of each step of the above method is omitted here, as it corresponds to the description of the operation of the digital reference book providing system given above.
The embodiments described in this specification belong to the same technical field, and components of one embodiment may be combined with components of another embodiment to form a new embodiment.
In this specification, the file generation unit, area detection unit, element extraction unit, optical character recognition unit, database unit, problem providing unit, and/or purchase authentication unit may be processors that execute sequences of operations stored in memory, or they may operate as software modules driven and controlled by a processor. The processor itself may be a hardware device.
For reference, the digital reference book providing method according to an embodiment of the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.
The scope of protection of the present invention is not limited to the wording and expressions of the embodiments explicitly described above. It is further noted that the scope of protection of the present invention is not limited by changes or substitutions that are obvious in the technical field to which the present invention belongs.

Claims (5)

  1. A digital reference book providing system according to copyright ownership, the system comprising a digital conversion server, wherein
    the digital conversion server comprises:
    a copyright determination unit that determines whether the copyright in a PDF (Portable Document Format) file of a digital reference book to be provided to a user terminal is held;
    an area detection unit that detects a problem area and a commentary area in the PDF file and separates the problem area into a question area and an answer area;
    an element extraction unit that extracts elements from the question area, the answer area, and the commentary area and groups the elements; and
    a problem providing unit that provides the elements grouped by the element extraction unit to the user terminal,
    wherein, when the digital reference book providing system does not hold the copyright in the digital reference book, the element extraction unit does not extract the elements from the question area, the answer area, and the commentary area, and
    the problem providing unit provides the commentary area to the user terminal.
  2. The system of claim 1, wherein,
    when the digital reference book providing system does not hold the copyright in the digital reference book, the problem providing unit provides an HTML layer and a handwriting layer for the question-sheet area of the digital reference book.
  3. The system of claim 2, wherein
    the problem providing unit compares the position values of the circled answer numbers in the answer area on the HTML layer with the position value of the answer checked on the handwriting layer and, when the checked answer does not match the correct answer for a question, provides the commentary area corresponding to that question to the user terminal (this check is sketched in code after the claims).
  4. The system of claim 1, wherein
    the area detection unit, when detecting the problem area,
    generates a bounding box surrounding the problem area using machine learning, and
    the area detection unit and the element extraction unit
    process data using only the content inside the bounding box, thereby improving the data processing speed.
  5. A method for providing a digital reference book according to copyright ownership, the method comprising:
    determining whether the copyright in a PDF (Portable Document Format) file of a digital reference book to be provided to a user terminal is held;
    detecting a problem area and a commentary area in the PDF file, and separating the problem area into a question area and an answer area;
    extracting elements from the question area, the answer area, and the commentary area, and grouping the elements;
    providing the elements grouped by an element extraction unit to the user terminal; and
    when a digital reference book providing system does not hold the copyright in the digital reference book, providing the commentary area to the user terminal without extracting the elements from the question area, the answer area, and the commentary area.
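The answer check recited in claim 3 can be illustrated with the following sketch: the mark detected on the handwriting layer is matched to the nearest circled answer number on the HTML layer, and the commentary area for the item is returned only when that choice differs from the correct answer. The data structures and the nearest-neighbor matching rule are illustrative assumptions, not the claimed implementation.

# Sketch of the answer-checking flow in claim 3; all names are illustrative.
import math
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class AnswerChoice:
    number: int                     # circled answer number rendered on the HTML layer
    position: Tuple[float, float]   # its (x, y) position value on the page

def nearest_choice(mark: Tuple[float, float], choices: list[AnswerChoice]) -> AnswerChoice:
    # Match the checked mark on the handwriting layer to the closest circled number.
    return min(choices, key=lambda c: math.dist(mark, c.position))

def check_item(mark: Optional[Tuple[float, float]],
               choices: list[AnswerChoice],
               correct_number: int,
               commentary_html: str) -> Optional[str]:
    """Return the commentary to show, or None when the answer is correct or unanswered."""
    if mark is None:
        return None
    chosen = nearest_choice(mark, choices)
    return None if chosen.number == correct_number else commentary_html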
PCT/KR2022/016591 2022-02-04 2022-10-27 System and method for providing digital reference book according to copyright ownership WO2023149618A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020220014772 2022-02-04
KR10-2022-0014772 2022-02-04

Publications (1)

Publication Number Publication Date
WO2023149618A1 true WO2023149618A1 (en) 2023-08-10

Family

ID=87552428

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/016591 WO2023149618A1 (en) 2022-02-04 2022-10-27 System and method for providing digital reference book according to copyright ownership

Country Status (2)

Country Link
KR (1) KR20230118522A (en)
WO (1) WO2023149618A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110118498A (en) * 2010-04-23 2011-10-31 (주)드리밍텍 Apparatus, system and method for rights management
KR101575840B1 (en) * 2014-12-08 2015-12-08 주식회사 디알엠인사이드 Protecting system and method for electronic book with supporting individual copy
KR20170007106A (en) * 2015-07-09 2017-01-18 주식회사 이앤아이월드 System for electronic book service
JP2020513589A (en) * 2016-12-05 2020-05-14 リイイド インク Terminal learning content display method and application program according to the method
KR20210001412A (en) * 2019-06-28 2021-01-06 한양대학교 에리카산학협력단 System and method for providing learning service

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101985612B1 (en) 2018-01-16 2019-06-03 김학선 Method for manufacturing digital articles of paper-articles

Also Published As

Publication number Publication date
KR20230118522A (en) 2023-08-11

Similar Documents

Publication Publication Date Title
WO2023075434A1 (en) Machine learning-based digital reference book provision system using bounding boxes
US5960448A (en) System and method for displaying a graphically enhanced view of a region of a document image in which the enhanced view is correlated with text derived from the document image
JP4461769B2 (en) Document retrieval / browsing technique and document retrieval / browsing device
CN107656922A (en) A kind of interpretation method, device, terminal and storage medium
JP2014102669A (en) Information processor, information processing method and program
JP5661663B2 (en) Information extraction device
CN101661465A (en) Image processing apparatus, image processing method and image processing program
JP2021043478A (en) Information processing device, control method thereof and program
KR101638511B1 (en) Computer readable medium recording program for authoring online learning contents and d method of authoring online learning contents
JP2008129793A (en) Document processing system, apparatus and method, and recording medium with program recorded thereon
WO2023149618A1 (en) System and method for providing digital reference book according to copyright ownership
WO2023149617A1 (en) System for providing digital reference book based on page edition and delimiter detection, and method thereof
JP2008108114A (en) Document processor and document processing method
JP4807618B2 (en) Image processing apparatus and image processing program
KR102542174B1 (en) Digital reference book provision system
JP2008027133A (en) Form processor, form processing method, program for executing form processing method, and recording medium
US20170264790A1 (en) Image processing device, image processing method, and non-transitory computer readable medium
CN114387600A (en) Text feature recognition method and device, computer equipment and storage medium
JP3171626B2 (en) Character recognition processing area / processing condition specification method
Li et al. A platform for creating Smartphone apps to enhance Chinese learning using augmented reality
JP2022090469A (en) Format defining device, format defining method, and program
Lawrie et al. Building OCR/NER test collections
Kim et al. The mapKurator System: A Complete Pipeline for Extracting and Linking Text from Historical Maps
WO2024090615A1 (en) System for providing digital reference book through purchase authentication of physical reference book
KR102591757B1 (en) Methods and devices for converting pdf files to semantic html format for producing digital reference books

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22925085

Country of ref document: EP

Kind code of ref document: A1