KR20170042145A - Apparatus and method for generating application and method for zing the same - Google Patents
- Publication number
- KR20170042145A (application number KR1020150141726A)
- Authority
- KR
- South Korea
- Prior art keywords
- document
- unit
- distribution
- pattern
- sentence
Classifications
- G06F17/21
- G06F17/211
- G06F17/2705
Abstract
Description
The present invention relates to document format-based document reduction, and more particularly to a document format-based document reduction apparatus, and a document summarizing method using the same, that make the content and flow of a document easy to understand by summarizing the document using its objects, format information, and paragraph information.
For modern users, the web is one of the fastest and most convenient ways to obtain needed information, and as terminals such as PCs, smartphones, and tablet PCs become more widespread, the use of such services is increasing.
However, existing terminals do not provide a summary of the original document. They either present the original document to the user as-is, or separately store the main parts that the user has checked in the original document and present those.
Therefore, when a document retrieved on a terminal runs to hundreds of pages, the user must read all of those pages to obtain the desired information.
In addition, the document the user has read may turn out not to contain the desired information. There is therefore a need for an alternative that lets the user grasp the contents of a document more quickly by providing a summary document that condenses the original.
SUMMARY OF THE INVENTION The present invention has been made to solve the above problems of the prior art, and its object is to provide a document format-based document reduction apparatus capable of easily summarizing a document using its objects, format information, and paragraph information, and a document summarizing method using the same.
According to an aspect of the present invention, there is provided a document format-based document reduction apparatus including: a document loading unit that loads a document to be summarized; a document pattern/form distribution analyzer that analyzes the pattern and form distribution of the loaded document by analyzing the number and size of its paragraphs, analyzing the frequency and distribution of the forms (formatting) included in its sentences, and identifying the distribution and number of the objects used in the document; an analysis type determination unit that determines the type of the document analyzed by the document pattern/form distribution analyzer; a form/paragraph classifier that classifies forms and paragraphs according to the analysis type determined by the analysis type determination unit; an important syntax selection unit that selects and marks, as important sentences, the sentences, outlines, and objects of the document designated for selection by the document pattern/form distribution analyzer; and a summary document generation unit that generates a summary document using the marked sentences, outlines, and objects selected by the important syntax selection unit.
Here, analyzing the form distribution of the document means analyzing the frequency and distribution of the forms (formatting) included in each sentence, and designating a sentence as a selection target according to how often the formatted words are used and what format information they carry.
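The designation rule above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `Word` and `Sentence` types, the ratio-based `format_score`, and the `threshold` value are all assumptions introduced here, since the source does not state how frequency is turned into a decision.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    # Hypothetical representation: names of format attributes on this word,
    # e.g. frozenset({"bold", "underline"}).
    formats: frozenset = frozenset()


@dataclass
class Sentence:
    words: list


def format_score(sentence):
    """Fraction of words in the sentence that carry at least one format."""
    if not sentence.words:
        return 0.0
    formatted = sum(1 for w in sentence.words if w.formats)
    return formatted / len(sentence.words)


def select_candidates(sentences, threshold=0.25):
    """Designate sentences whose format usage meets the assumed threshold."""
    return [s for s in sentences if format_score(s) >= threshold]
```

A ratio is used here so that long sentences are not favored merely for containing more words; a raw count would be an equally plausible reading of the source.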
Also, the format information is at least one of bold, italic, color, and underline applied to a word, and the objects used in the document are identified by the distribution and number of tables, pictures, and charts.
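As a rough sketch of this bookkeeping, the snippet below tallies the four format kinds and records where tables, pictures, and charts occur. The input shapes (pairs of text and format sets, and a list of block kinds in document order) are assumptions made for illustration; the source does not describe a data model.

```python
from collections import Counter

FORMAT_KINDS = ("bold", "italic", "color", "underline")
OBJECT_KINDS = ("table", "picture", "chart")


def count_formats(words):
    """Count format usage; `words` is an iterable of (text, formats) pairs."""
    counts = Counter()
    for _text, formats in words:
        for fmt in formats:
            if fmt in FORMAT_KINDS:
                counts[fmt] += 1
    return counts


def object_distribution(blocks):
    """Map each object kind to the block positions where it appears.

    `blocks` is a list of block kind strings in document order, so the
    length of each position list is the object count and the positions
    themselves describe the distribution.
    """
    positions = {kind: [] for kind in OBJECT_KINDS}
    for index, kind in enumerate(blocks):
        if kind in positions:
            positions[kind].append(index)
    return positions
```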
Meanwhile, the type of document determined by the analysis type determination unit covers .hwp and doc files as well as markup-based documents including OWPML, OOXML, HTML, and PDF, so that the sentences, outlines, and objects of the document can be handled according to that type.
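One simple way to picture the type determination is a lookup by file extension. The source does not say how the type is detected, so the extension mapping below (including treating `.hwpx` as OWPML and `.docx` as OOXML) and the `determine_type` name are assumptions for illustration only.

```python
from pathlib import Path

# Hypothetical extension-to-type table; the source lists the types but not
# how they are recognized.
EXTENSION_TO_TYPE = {
    ".hwp": "HWP",
    ".doc": "DOC",
    ".hwpx": "OWPML",
    ".docx": "OOXML",
    ".html": "HTML",
    ".htm": "HTML",
    ".pdf": "PDF",
}


def determine_type(filename):
    """Return the document type for a file name, or "UNKNOWN"."""
    suffix = Path(filename).suffix.lower()
    return EXTENSION_TO_TYPE.get(suffix, "UNKNOWN")
```

A production system would more likely inspect file signatures or container contents, since extensions can be wrong.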
According to another aspect of the present invention, there is provided a document summarizing method using the document format-based document reduction apparatus, the method including: loading, in a document loading unit, a document to be summarized; analyzing, in a document pattern/form distribution analyzer, the document pattern and form distribution of the loaded document by analyzing the number and size of its paragraphs, analyzing the frequency and distribution of the forms included in its sentences, and identifying the distribution and number of the objects used in the document; determining, in an analysis type determination unit, the type of the document analyzed by the document pattern/form distribution analyzer; determining, in a form/paragraph classifier, a summary method according to the analysis type determined by the analysis type determination unit; selecting and marking, in an important syntax selection unit, the sentences, outlines, and objects of the document designated for selection by the document pattern/form distribution analyzer as important sentences; and generating, in a summary document generation unit, a summary document using the marked sentences, outlines, and objects selected by the important syntax selection unit.
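The flow of these steps can be sketched as a small pipeline. This is a simplified illustration under stated assumptions: the `Block` type and all function names are hypothetical, the document is taken as already loaded and parsed into blocks, and the type determination and form/paragraph classification steps are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str                # "sentence", "outline", "table", "picture" or "chart"
    text: str
    formatted: bool = False  # True if any word carries bold/italic/color/underline
    marked: bool = False


def analyze(blocks):
    """Designate selection targets: outlines, objects, and formatted sentences."""
    return [i for i, b in enumerate(blocks) if b.kind != "sentence" or b.formatted]


def mark_important(blocks, candidates):
    """Mark the designated blocks as important."""
    for i in candidates:
        blocks[i].marked = True
    return blocks


def generate_summary(blocks):
    """Keep only the marked blocks, preserving document order."""
    return [b.text for b in blocks if b.marked]


def summarize(blocks):
    """Run analysis, marking, and summary generation in sequence."""
    return generate_summary(mark_important(blocks, analyze(blocks)))
```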
Here, the step of analyzing the pattern and form distribution of the loaded document may include analyzing the paragraph structure, analyzing the forms in the loaded document, and identifying the distribution and number of the objects used in the document.
In the step of analyzing the paragraph structure, the number of paragraphs and the size of the paragraphs are analyzed, and the paragraphs are reduced in units of sentences of a predetermined length according to the number and size of the analyzed paragraphs. If the outline is used beyond the preset length and the content is large, the sub-outlines are reduced. If the outline is used less than the preset length, the analysis is performed so that the first paragraph under each outline becomes the main selection target.
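The branching in the paragraph-structure analysis can be sketched as below. The thresholds, the strategy names, and the fallback for cases the source does not cover (for example, many outlines but little content) are all assumptions introduced for illustration; `leading_sentences` is likewise a hypothetical stand-in for "reducing in units of sentences".

```python
import re


def choose_strategy(outline_count, total_chars,
                    outline_threshold=5, size_threshold=2000):
    """Pick a reduction strategy from outline usage and content size."""
    if outline_count > outline_threshold and total_chars > size_threshold:
        # Heavy outline use and large content: reduce the sub-outlines.
        return "reduce_sub_outlines"
    if 0 < outline_count <= outline_threshold:
        # Light outline use: favor the first paragraph under each outline.
        return "first_paragraph_per_outline"
    # Assumed fallback: plain sentence-unit reduction.
    return "sentence_units"


def leading_sentences(paragraph, max_sentences=2):
    """Reduce a paragraph to its first few sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return " ".join(sentences[:max_sentences])
```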
Meanwhile, in the step of determining the document type in the analysis type determination unit, the type is determined for the sentences, outlines, and objects of the document designated for selection by the document pattern/form distribution analyzer, so that the method can be applied to .hwp and doc files and to markup-based documents including OWPML, OOXML, HTML, and PDF.
In addition, the step of determining the summary method in the form/paragraph classifier classifies the forms and paragraphs in a manner suited to the type of document determined by the analysis type determination unit.
Here, in the step of selecting and marking important sentences in the important syntax selection unit, the non-marked areas are set to be hidden.
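For an HTML target, hiding the non-marked areas could look like the sketch below, which uses the standard HTML `hidden` attribute. The `(text, marked)` input shape and the `render_blocks` name are assumptions; the source does not say how hiding is realized for any given document type.

```python
from html import escape


def render_blocks(blocks):
    """Render (text, marked) pairs as paragraphs, hiding non-marked ones."""
    parts = []
    for text, marked in blocks:
        # Non-marked blocks stay in the output but are hidden from view.
        attr = "" if marked else " hidden"
        parts.append(f"<p{attr}>{escape(text)}</p>")
    return "\n".join(parts)
```

Keeping the hidden blocks in the output, rather than deleting them, would let a viewer reveal the full text again, which is one plausible reason to hide instead of remove.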
The present invention has the following effects.
First, long texts can be summarized and organized using only the document structure and format, so users can quickly recognize and process large amounts of information.
Second, the user can easily understand the contents and the flow of the document because the document is summarized using its objects, format information, and paragraph information.
Third, it is applicable to markup-based documents such as OWPML, OOXML, HTML, and PDF.
FIG. 1 is a block diagram illustrating a document format-based document reduction apparatus according to the present invention.
FIG. 2 is a diagram illustrating an example of a summary document using a document format-based document reduction apparatus according to the present invention.
FIG. 3 is a flowchart illustrating a document summarizing method using a document format-based document reduction apparatus according to the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.
In addition, although the terms used in the present invention are, as far as possible, general terms in wide current use, some terms have been selected arbitrarily by the applicant. Because the meaning of such terms is described in detail in the relevant part of the description, the present invention should be understood through the meaning of each term rather than its mere name. Further, in describing the embodiments, descriptions of technical contents that are well known in the technical field to which the present invention belongs and that are not directly related to the present invention will be omitted. Omitting unnecessary explanation is intended to convey the gist of the present invention more clearly.
FIG. 1 is a block diagram illustrating a document format-based document reduction apparatus according to the present invention, and FIG. 2 is a diagram illustrating an example of a summary document using a document format-based document reduction apparatus according to the present invention.
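Referring to FIG. 1, the document format-based document reduction apparatus according to the present invention includes a document loading unit (10), a document pattern/form distribution analyzer (20), an analysis type determination unit (30), a form/paragraph classifier (40), an important syntax selection unit (50), and a summary document generation unit (60).
The document loading unit (10) loads a document to be summarized.
The document pattern/form distribution analyzer (20) analyzes the pattern and form distribution of the loaded document, analyzing the number of paragraphs and the size of the paragraphs.
On the other hand, the document pattern/form distribution analyzer (20) analyzes the frequency and distribution of the forms included in the sentences.
Also, the document pattern/form distribution analyzer (20) identifies the distribution and number of the objects used in the document.
The analysis type determination unit (30) determines the type of the document analyzed by the document pattern/form distribution analyzer (20).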
The form / paragraph classifier (40) classifies forms and paragraphs in the analysis type determined by the analysis type determination unit (30).
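The important syntax selection unit (50) selects and marks, as important sentences, the sentences, outlines, and objects of the document designated for selection by the document pattern/form distribution analyzer (20).
The summary document generation unit (60) generates a summary document using the marked sentences, outlines, and objects selected by the important syntax selection unit (50).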
Meanwhile, an example of a document summarized using the document format-based document reduction apparatus of the present invention can be configured as shown in FIG. 2.
For example, at the top of the summary document, the title of the document is displayed, and the important sentences and figures or tables of the document are summarized.
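The layout described above (title first, then the important sentences and the figures or tables) can be sketched as plain-text assembly. The `(kind, content)` input shape, the bullet and bracket notation, and the `build_summary` name are illustrative assumptions, not the format shown in FIG. 2.

```python
def build_summary(title, marked_items):
    """Assemble a plain-text summary with the title at the top.

    `marked_items` is a list of (kind, content) pairs in document order,
    where kind is "sentence", "table", "picture", or "chart".
    """
    lines = [title, ""]
    for kind, content in marked_items:
        if kind == "sentence":
            lines.append(f"- {content}")
        else:
            # Objects are shown as labeled placeholders in this sketch.
            lines.append(f"[{kind}: {content}]")
    return "\n".join(lines)
```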
FIG. 3 is a flowchart illustrating a document summarizing method using a document format-based document reduction apparatus according to the present invention.
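As shown in FIG. 3, the document loading unit (10) loads a document to be summarized.
Next, the document pattern/form distribution analyzer (20) analyzes the paragraph structure of the loaded document, that is, the number of paragraphs and the size of the paragraphs.
Then, the document pattern/form distribution analyzer (20) analyzes the frequency and distribution of the forms included in the sentences of the loaded document.
In addition, the document pattern/form distribution analyzer (20) identifies the distribution and number of the objects used in the document.
The analysis type determination unit (30) determines the type of the document analyzed by the document pattern/form distribution analyzer (20).
Then, the form/paragraph classifier (40) determines a summary method according to the analysis type determined by the analysis type determination unit (30).
Then, the important syntax selection unit (50) selects and marks, as important sentences, the sentences, outlines, and objects of the document designated for selection by the document pattern/form distribution analyzer (20).
The summary document generation unit (60) generates a summary document using the marked sentences, outlines, and objects selected by the important syntax selection unit (50).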
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it should be understood that various changes and modifications will be apparent to those skilled in the art, and the invention is not limited to the embodiments described above. Accordingly, the scope of protection of the present invention should be construed according to the following claims, and all technical ideas within the scope of equivalence, including those reached by alteration, substitution, and the like, should be construed as falling within the scope of the present invention. In addition, some elements in the drawings are intended to explain the configuration more clearly and may be shown exaggerated or reduced relative to their actual size.
10: document loading section 20: document pattern / form distribution analyzing section
30: analysis type determination unit 40: form / paragraph classification unit
50: important syntax selection unit 60: summary document generation unit
Claims (10)
A document format-based document reduction apparatus comprising:
A document loading unit that loads a document to be summarized;
A document pattern / form distribution analyzer that analyzes the pattern and form distribution of the loaded document by analyzing the number and size of the paragraphs, analyzing the frequency and distribution of the forms included in the sentences, and identifying the distribution and number of the objects used in the document;
An analysis type determination unit that determines the type of the document analyzed by the document pattern / form distribution analyzer;
A form / paragraph classifier that classifies forms and paragraphs according to the analysis type determined by the analysis type determination unit;
An important syntax selection unit that selects and marks, as important sentences, the sentences, outlines, and objects of the document designated for selection by the document pattern / form distribution analyzer; and
A summary document generation unit that generates a summary document using the marked sentences, outlines, and objects selected by the important syntax selection unit.
The document format-based document reduction apparatus, wherein the analysis of the form distribution of the document analyzes the frequency and distribution of the forms included in each sentence and designates a sentence as a selection target according to the frequency of use of the words carrying a form and their format information.
The document format-based document reduction apparatus, wherein the format information is at least one of bold, italic, color, and underline applied to a word, and
Wherein the objects used in the document are identified by the distribution and number of tables, figures, and charts.
The document format-based document reduction apparatus, wherein the type of document determined by the analysis type determination unit
Applies to .hwp and doc files and to markup-based documents including OWPML, OOXML, HTML, and PDF, for the sentences, outlines, and objects of the document.
A document summarizing method using a document format-based document reduction apparatus, comprising:
Loading, in a document loading unit, a document to be summarized;
Analyzing, in a document pattern / form distribution analyzer, the document pattern and form distribution of the loaded document by analyzing the number and size of the paragraphs, analyzing the frequency and distribution of the forms included in the sentences, and identifying the distribution and number of the objects used in the document;
Determining, in an analysis type determination unit, the type of the document analyzed by the document pattern / form distribution analyzer;
Determining, in a form / paragraph classifier, a summary method according to the analysis type determined by the analysis type determination unit;
Selecting and marking, in an important syntax selection unit, the sentences, outlines, and objects of the document designated for selection by the document pattern / form distribution analyzer as important sentences; and
Generating, in a summary document generation unit, a summary document using the marked sentences, outlines, and objects selected by the important syntax selection unit.
The document summary method according to claim 1, wherein the step of analyzing the pattern and form distribution of the loaded document in the document pattern / form distribution analyzer comprises: analyzing the paragraph structure; analyzing the forms in the loaded document; and analyzing the distribution and number of the objects used in the document.
Wherein analyzing the paragraph structure comprises:
Analyzing the number and size of the paragraphs and reducing the paragraphs in units of sentences of a predetermined length according to the number and size of the analyzed paragraphs; if the outline is used beyond the preset length and the content is large, reducing the sub-outlines; and if the outline is used less than the preset length, performing the analysis so that the first paragraph under each outline becomes the main selection target.
Wherein the step of determining the document type in the analysis type determination unit determines the type for the sentences, outlines, and objects of the document designated for selection by the document pattern / form distribution analyzer, so that the method can be applied to the corresponding type among .hwp, doc, OWPML, OOXML, HTML, and PDF.
Wherein the step of determining the summary method in the form / paragraph classifier
Classifies forms and paragraphs suitable for the type of document determined by the analysis type determination unit.
Wherein, in the step of selecting and marking important sentences in the important syntax selection unit, the non-marked areas are set to be hidden.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150141726A KR101740926B1 (en) | 2015-10-08 | 2015-10-08 | Apparatus and method for generating application and method for zing the same |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150141726A KR101740926B1 (en) | 2015-10-08 | 2015-10-08 | Apparatus and method for generating application and method for zing the same |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170042145A (en) | 2017-04-18 |
KR101740926B1 (en) | 2017-05-29 |
Family
ID=58704100
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150141726A KR101740926B1 (en) | Apparatus and method for generating application and method for zing the same | 2015-10-08 | 2015-10-08 |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101740926B1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20200114604A (en) * | 2019-03-29 | 2020-10-07 | 주식회사 한글과컴퓨터 | Electronic device capable of generating a summary image through merging of objects inserted in an electronic document and operating method thereof |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3597697B2 (en) | 1998-03-20 | 2004-12-08 | 富士通株式会社 | Document summarizing apparatus and method |
- 2015-10-08: KR application KR1020150141726A, patent KR101740926B1 (en), active IP Right Grant
Also Published As
Publication number | Publication date |
---|---|
KR101740926B1 (en) | 2017-05-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | A201 | Request for examination | |
 | E902 | Notification of reason for refusal | |
 | E701 | Decision to grant or registration of patent right | |
 | GRNT | Written decision to grant | |