KR101754473B1 - Method and system for automatically summarizing documents to images and providing the image-based contents - Google Patents
- Publication number
- KR101754473B1
- Authority
- KR
- South Korea
- Prior art keywords
- document
- sentence
- image
- sentences
- Prior art date
Classifications
- G06F17/211
- G06F17/2705
- G06F17/277
Abstract
A method and system for providing a summary of a document as image-based content are disclosed. A computer-implemented method comprises: extracting the sentences included in a given document; calculating similarity scores and diversity scores for the sentences and summarizing the document into at least one key sentence using the scores; selecting an image associated with the key sentence from at least one of the images included in the document and images in a database; and generating summary content for the document by combining the selected image with the key sentence.
Description
The description below refers to a technique for automatically summarizing a document.
Recently, with the spread of computers and the development of the Internet, the volume of electronic documents has increased rapidly, and accordingly it takes a comparatively long time to find desired documents among the many electronic documents available.
A document retrieval system is generally a retrieval system based on keywords. When documents are retrieved by keyword, the user can find desired information simply by entering a keyword.
However, the search results for a given search term are now not only vast in number, but it also cannot be determined accurately whether a given search result is relevant. The user therefore has to check the documents in the search results directly.
Techniques have been developed to automatically summarize the original document so that the user can grasp the contents of the document more quickly and easily.
A document summary is, simply put, a reduction of the contents of a document to a certain size; in other words, the document content is compressed.
For example, Korean Patent Registration No. 10-0435442 (registered June 1, 2004), "Document summary method and system," discloses a technique that identifies the structural characteristics of a document, structures the document according to a certain rule, extracts the patterns that occur, and automatically summarizes the document using natural language processing (NLP) technology.
Provided are a document summarization method and system that can deliver the key information of a document to a user quickly and effectively by summarizing the contents of the original document into a small number of images and representative sentences.
A computer-implemented method is provided, comprising: extracting the sentences included in a given document; calculating similarity scores and diversity scores for the sentences and summarizing the document into at least one key sentence using the scores; selecting an image associated with the key sentence from at least one of the images included in the document and images in a database; and generating summary content for the document by combining the selected image with the key sentence.
A system is provided, comprising: a document summarizing unit for extracting the sentences included in a given document, calculating a similarity score and a diversity score for the sentences, and summarizing the document into at least one key sentence using the similarity score and the diversity score; an image selection unit for selecting an image associated with the key sentence from at least one of the images included in the document and images in a database; and a content generation unit for generating summary content for the document by combining the selected image with the key sentence.
By summarizing the body of a given document into key sentences and combining each key sentence with images that are highly relevant to its content, the document can be summarized and generated as image-based content such as short posts.
Considering the limited display of the mobile terminal environment, source documents such as Internet news articles, blogs, or posts in online communities and social networks can be summarized as a small number of images and representative sentences, enabling a new type of service that delivers key information effectively.
FIG. 1 is a diagram illustrating an example of an operating environment of a system according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an internal configuration of an electronic device and a server according to an embodiment of the present invention.
FIG. 3 is a diagram schematically illustrating a process of summarizing and providing a document in a server according to an exemplary embodiment of the present invention.
FIG. 4 is a block diagram illustrating a processor included in a server according to an exemplary embodiment of the present invention.
FIGS. 5 to 7 are diagrams for explaining a process of extracting a key sentence from a document according to an embodiment of the present invention.
FIGS. 8 to 10 are diagrams for explaining a process of selecting an image having high relevance to a document in an embodiment of the present invention.
FIGS. 11 to 13 are diagrams illustrating a process of providing summary contents of a document in an embodiment of the present invention.
FIG. 14 illustrates a learning module for document summarization in one embodiment of the present invention.
FIG. 15 illustrates an execution module for document summarization in one embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The following description relates to a technique for automatically summarizing a document, and more particularly to a document summarizing method and system for summarizing a document into image-based content.
In this specification, a 'document' refers to data in which content such as text is expressed in a logical structure, and can mean not only structured data such as a database (DB) but also unstructured data such as web data, for example blogs and cafes. Documents include, but are not limited to, documents with online multimodal features such as Internet news articles, blogs, or posts in online communities and social networks. Here, a multimodal document means a document in which one or more different modes of expression, such as text, image, and voice, are used to convey the meaning of the document.
FIG. 1 is a diagram illustrating an example of an operating environment of a system for summarizing and providing documents in an embodiment of the present invention. FIG. 1 shows an example of an operating environment, which shows a plurality of
The communication method is not limited, and may include a communication method using a communication network (for example, a mobile communication network, a wired Internet, a wireless Internet, a broadcasting network) that the
Each of the
Hereinafter, various embodiments of the present invention will be described in terms of one
FIG. 2 is a block diagram illustrating an internal configuration of an electronic device and a server according to an embodiment of the present invention.
The input /
Also, in other embodiments, the
FIG. 3 is a diagram schematically showing a process of summarizing a document in an embodiment of the present invention. FIG. 3 shows a
In step (1), the
In step (2), the
In step (3), the
In step (4), the
FIG. 4 is a block diagram for explaining a processor included in a server according to an embodiment of the present invention, and FIGS. 5, 8, and 11 are flowcharts illustrating a method performed by the server in an embodiment of the present invention.
FIG. 5 is a flowchart for explaining a process of extracting a key sentence from a document according to an embodiment of the present invention.
A word-frequency histogram, term frequency/inverse document frequency (TF-IDF), or a learned language model (e.g., word2vec, phrase2vec, document2vec) can be used to generate a text descriptor.
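As one illustration of the simplest option named above, the following is a minimal sketch of a TF-IDF text descriptor. It assumes whitespace tokenization and a smoothed IDF; the function name `tfidf_descriptors` and both of those choices are assumptions made for this example and are not taken from the patent, which does not specify how the descriptor is computed.

```python
import math
from collections import Counter


def tfidf_descriptors(sentences):
    """Return (vocabulary, vectors): one TF-IDF vector per sentence."""
    # Assumption: lowercase whitespace tokenization stands in for real
    # morphological analysis.
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for tokens in tokenized for w in tokens})
    n = len(tokenized)
    # Document frequency: number of sentences that contain each word.
    df = Counter(w for tokens in tokenized for w in set(tokens))
    # Smoothed IDF, so words found in every sentence keep a small weight.
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty sentences
        vectors.append([tf[w] / total * idf[w] for w in vocab])
    return vocab, vectors


if __name__ == "__main__":
    vocab, vectors = tfidf_descriptors(["the cat sat", "the dog ran"])
    print(vocab)
    print(vectors[0])
```

Each sentence becomes a vector with one component per vocabulary word, so words that are rare across sentences weigh more than words shared by all of them.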
Specifically, for example, if the following example sentence is defined as a 512-dimensional real-valued vector s, it can be expressed as follows.
Example sentence: <Kim, Yeon-ae Let it go
→ s = {s_1 = 0.032, s_2 = -0.595, ..., s_512 = 1.22} (s_n is the n-th component of the vector s)
The real-valued vector of a sentence is calculated through learning that uses the whole corpus as training data; the model is trained so that the more similar two sentences are in meaning, the closer they lie in the 512-dimensional space. Through this learning, not only sentences but also individual words are expressed as high-dimensional real-valued vectors, and the number of dimensions of the sentence and word vectors can be set to any integer greater than zero.
Scores can be calculated using similarity and diversity to extract key sentences in the document.
For example, for sentences represented by the vector s = {s_1, s_2, ..., s_512}, the score of each sentence is the weighted sum of its similarity (S) and diversity (D), which is defined as Equation (1).

$$F(s) = w_1 \cdot S(s, t) + w_2 \cdot D(s, C) \qquad (1)$$

Here, s is a sentence in the body of the document, t is the title of the document, C is a group of sentences containing s, and w_1 and w_2 are the weights of the similarity (S) and the diversity (D), respectively. C may be a single cluster consisting of all the sentences in the document, or one of several clusters into which the sentences are grouped by similar meaning.
At this time, the similarity between sentences is calculated from the similarity scores between the words constituting the sentences. For example, the similarity between words can be calculated from a model that has learned word semantics from the co-occurrence patterns of words across all documents in an online document database.
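The patent does not state how word-level information is aggregated into a sentence-level value. One common stand-in, shown here purely as an assumption, is to average the vectors of the words in a sentence; `sentence_vector` and the `word_vectors` lookup table are hypothetical names for this sketch.

```python
def sentence_vector(tokens, word_vectors):
    """Average the vectors of the known words in a sentence.

    Assumption: mean pooling is a stand-in for the learned sentence model;
    words missing from `word_vectors` are skipped.
    """
    known = [word_vectors[w] for w in tokens if w in word_vectors]
    if not known:
        return None  # no known word, so no descriptor can be built
    dim = len(known[0])
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]


if __name__ == "__main__":
    toy_vectors = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
    print(sentence_vector(["cat", "dog", "unknown"], toy_vectors))
```

Sentences that share many semantically close words then end up with nearby vectors, which is the property the similarity score relies on.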
For example, given the title t = {t_1, t_2, ...} and one sentence s = {s_1, s_2, ...} in the body, if cosine similarity is used, the similarity between the sentence and the title is obtained as S(s, t) = cosine similarity({s_1, s_2, ...}, {t_1, t_2, ...}).
The cosine similarity between vectors A and B is defined by Equation (2).

$$\text{cosine similarity}(A, B) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{\sum_{n} A_n B_n}{\sqrt{\sum_{n} A_n^2} \, \sqrt{\sum_{n} B_n^2}} \qquad (2)$$
The measure used to determine the similarity between sentences is not limited to cosine similarity; various measures, such as Euclidean distance and Hamming distance, can be applied according to the characteristics of the text descriptor.
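The measures named above can be sketched in a few lines. The function names are illustrative only, and returning 0.0 for a zero vector is a convention chosen for this example rather than something the patent specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for empty descriptors
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def title_similarities(sentence_vectors, title_vector):
    """Similarity S(s, t) of every sentence vector to the title vector."""
    return [cosine_similarity(s, title_vector) for s in sentence_vectors]


if __name__ == "__main__":
    title = [1.0, 0.0]
    sentences = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(title_similarities(sentences, title))
```

Note that cosine similarity grows with closeness while Euclidean distance shrinks, so a distance has to be inverted or negated before it is used as a similarity score.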
The diversity among sentences can be calculated based on the uncertainty computed from the occurrence probabilities of the words composing each sentence. Sentences whose words have similar occurrence probabilities can be regarded as similar sentences, and sentences whose words have completely different occurrence probabilities can be understood as sentences with different meanings.
To extract key sentences with diversity from a given document, the sentence vectors are grouped by a clustering method into a suitable number of clusters of semantically similar sentences, and the sentence with the highest uncertainty is extracted from each cluster, so that key sentences with diverse content can be extracted.
The number of clusters K is given as an integer greater than 0; when K = 1, the entire body text of the document is used as a single cluster. Examples of clustering methods include K-means clustering, mean-shift clustering, and hierarchical clustering.
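Of the clustering methods listed, K-means is the easiest to sketch. The version below is a plain Lloyd iteration with seeded random initialization; the function name, the fixed iteration count, and the initialization strategy are assumptions for illustration, not details from the patent.

```python
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Group vectors into k clusters; returns one cluster label per vector."""
    if k < 1 or k > len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(v, centroids[c])),
            )
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    points = [[0, 0], [0, 1], [10, 10], [10, 11]]
    print(kmeans(points, 2))
```

With K = 1 every sentence receives the same label, which matches the case where the whole body text is treated as one cluster.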
If the uncertainty of sentence s with respect to cluster C is Entropy(s, C), the key sentence s' selected from the sentences belonging to the m-th cluster C_m is the sentence s that maximizes Entropy(s, C_m), that is, s' = argmax_s Entropy(s, C_m).
As an example of uncertainty, the uncertainty computed from the sentence vectors in a particular cluster can use the Shannon (Boltzmann) entropy calculation, which is often used in information theory.
The sentence with the highest uncertainty in each cluster is the sentence expressed with the most diverse words in that cluster.
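The per-cluster selection can be sketched as follows. The patent does not give the exact form of Entropy(s, C), so this example makes an assumption: word probabilities are estimated from word counts inside the cluster, and a sentence's entropy is the sum of -p log2 p over its distinct words. The function names are hypothetical.

```python
import math
from collections import Counter


def sentence_entropy(sentence_tokens, cluster_sentences):
    """Assumed form of Entropy(s, C): -sum of p*log2(p) over distinct words of s."""
    counts = Counter(w for tokens in cluster_sentences for w in tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for word in set(sentence_tokens):
        p = counts[word] / total  # occurrence probability within the cluster
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def key_sentence_per_cluster(tokenized_sentences, labels):
    """Return {cluster label: index of the highest-entropy sentence}."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    selected = {}
    for label, indices in clusters.items():
        members = [tokenized_sentences[i] for i in indices]
        selected[label] = max(
            indices,
            key=lambda i: sentence_entropy(tokenized_sentences[i], members),
        )
    return selected


if __name__ == "__main__":
    sentences = [["a", "a"], ["a", "b", "c"], ["x", "y"], ["x"]]
    print(key_sentence_per_cluster(sentences, [0, 0, 1, 1]))
```

Under this definition a sentence that uses more distinct words of its cluster scores higher, which agrees with the description of the selected sentence as the one expressed with the most diverse words.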
FIGS. 6 and 7 illustrate an example of a process of extracting a core sentence from a document.
Referring to FIG. 7, the similarity (S) between the title of the document and each sentence included in the document is calculated. At this time, the sentence most similar in meaning to the title is selected from the sentences included in the document, and the diversity (D) is calculated between the sentences.
A weighted sum F = w_1 * S + w_2 * D of the similarity S and the diversity D is then calculated for each sentence, and the sentences whose weighted sum F meets a threshold value (for example, 1) can be selected as the key sentences.
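A minimal sketch of the weighted-sum selection step follows. The weights, the threshold, and the fallback to the top-scoring sentences when no threshold is given are illustrative assumptions; the patent only states that a weighted sum of S and D is compared against a threshold.

```python
def select_key_sentences(similarity, diversity, w1=0.5, w2=0.5,
                         threshold=None, top_n=1):
    """Rank sentences by F = w1*S + w2*D and return the selected indices."""
    scores = [w1 * s + w2 * d for s, d in zip(similarity, diversity)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if threshold is not None:
        # Keep every sentence whose weighted sum meets the threshold.
        return [i for i in ranked if scores[i] >= threshold]
    # Assumed fallback: keep the top_n highest-scoring sentences.
    return ranked[:top_n]


if __name__ == "__main__":
    S = [0.9, 0.2, 0.5]  # similarity of each sentence to the title
    D = [0.1, 0.8, 0.5]  # diversity score of each sentence
    print(select_key_sentences(S, D, w1=0.7, w2=0.3, top_n=2))
```

Raising w_1 favors sentences that restate the title, while raising w_2 favors sentences that add content the title does not cover.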
FIG. 8 is a flowchart illustrating a process of selecting an image having high relevance to a document according to an exemplary embodiment of the present invention.
As another example, the
To learn images efficiently, the category or label of an image can be set automatically using at least one word included in the title of the document. In other words, an automated labeling technique can be applied in which important words from the document's title serve as the category or label information of the images included in the document.
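The automated labeling idea can be sketched as below. How "important" title words are chosen is not specified at this point in the text, so this example assumes a simple stop-word filter; the function name, the stop-word set, and the `max_labels` limit are all hypothetical.

```python
def label_images_from_title(title, image_ids, stopwords=frozenset(), max_labels=3):
    """Assign the first few non-stop-word title words as labels to each image."""
    labels = []
    for raw in title.split():
        word = raw.strip(".,!?\"'()").lower()
        # Assumption: importance is approximated by dropping stop words.
        if word and word not in stopwords and word not in labels:
            labels.append(word)
    labels = labels[:max_labels]
    # Every image in the document inherits the same title-derived labels.
    return {image_id: list(labels) for image_id in image_ids}


if __name__ == "__main__":
    stop = frozenset({"the", "a", "an", "of", "in", "on"})
    print(label_images_from_title(
        "The Mayor Opens a New Bridge", ["img1.jpg", "img2.jpg"], stop))
```

A weighting such as TF-IDF could replace the stop-word filter when ranking title words, at the cost of needing corpus statistics.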
To improve the quality of the image descriptor, supervised learning can be conducted by using important words from the title of the document to which the image belongs (e.g., persons, concepts, or words with high statistical values such as TF-IDF (Term Frequency - Inverse Document Frequency)) as factors in generating the relationships between images.
An image is represented as a descriptor in the form of a high-dimensional real-valued vector through a preprocessing technique, in order to learn the common semantic patterns between a document's title and the images it contains. For this purpose, features such as convolutional neural network (CNN) features, SIFT (Scale-Invariant Feature Transform), HOG (Histogram of Oriented Gradients), and SURF (Speeded-Up Robust Features) can be used. For example, if one image is defined as a 1024-dimensional real-valued vector i, it can be expressed as i = {i_1 = 0.000, i_2 = 0.859, ..., i_1024 = 1.245}. The descriptor is defined such that the more similar two images are, the closer their vectors lie in the 1024-dimensional space. The number of dimensions is exemplary and is not limited to 1024.
In the present invention, to retrieve images associated with a key sentence of a document, a modeling technique can be applied that represents text information (for example, the text of a title or body) and image information in the same semantic space, for documents that include images.
For example, a text represented by a 512-dimensional real-valued vector and an image represented by a 1024-dimensional real-valued vector may both be represented in another, shared 200-dimensional real-valued vector space, and the model is trained so that text and images with similar meanings lie close together in this semantic space. The semantic space represented by the model can be expressed as shown in FIG. 10, in which text and images with similar meanings lie close together.
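The retrieval step in a shared semantic space can be sketched with two linear projections. This is only an assumption about the model's form: the patent does not say the mapping is linear, and the random matrices produced by `make_projection` are placeholders for parameters that would be learned. All names here are hypothetical.

```python
import math
import random


def make_projection(in_dim, out_dim, seed):
    """Random stand-in for a learned projection matrix (out_dim x in_dim)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) / math.sqrt(in_dim) for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(matrix, vector):
    """Map a descriptor into the shared space by matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank_images(text_vec, image_vecs, text_proj, image_proj):
    """Return image indices ordered from most to least similar to the text."""
    query = project(text_proj, text_vec)
    scored = [(cosine(query, project(image_proj, v)), idx)
              for idx, v in enumerate(image_vecs)]
    return [idx for _, idx in sorted(scored, reverse=True)]


if __name__ == "__main__":
    text_proj = make_projection(512, 200, seed=1)    # text: 512 -> 200
    image_proj = make_projection(1024, 200, seed=2)  # image: 1024 -> 200
    sentence = [0.01] * 512
    images = [[0.02] * 1024, [-0.02] * 1024]
    print(rank_images(sentence, images, text_proj, image_proj))
```

With trained projections in place of the random ones, the first index returned would be the image to combine with the key sentence.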
FIG. 11 is a flowchart illustrating a process of providing summary contents of a document in an embodiment of the present invention.
Referring to FIG. 12, the
In order to increase the readability of the
For example, when a query term for a search request is received through a user terminal, the
13, a post-image 1310 including each of the
Furthermore, the summary content exposed through the service channel may include a user interface for receiving feedback information, such as a user's evaluation of the content.
Figure 14 illustrates a
14, the
In particular, the
At this time, text can be represented by a real-valued vector descriptor through learning that uses the entire corpus as training data. An image can be expressed as a real-valued vector descriptor using the characteristics of the image itself or the words included in the title of the document to which the image belongs.
Figure 15 illustrates an
According to the embodiments of the present invention, the body text of a given document is summarized into key sentences, and images highly relevant to the content of each key sentence are combined with that sentence, so that the document can be summarized and generated as image-based content. Therefore, by summarizing the original document with a small number of images and representative sentences, it is possible to provide a new type of service that delivers the key information of a document to users more quickly and effectively. Furthermore, various personalization services can be implemented by applying the technique of summarizing a document into image-based content not only to image-text document data but also to data mining on data such as multi-sensor personal logs.
The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware components and software components. For example, the apparatus and components described in the embodiments may be implemented within a computer system, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may execute an operating system (OS) and one or more software applications running on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device may be described as being used singly, but those skilled in the art will recognize that the processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may comprise a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.
The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may command the processing device independently or collectively. The software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, computer storage medium, or device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.
The method according to an embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in the art of computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. For example, appropriate results may be achieved even if the described techniques are performed in a different order than the described method, and/or components of the described systems, structures, devices, circuits, and the like are combined in a different form, or are replaced or substituted by other components or equivalents.
Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.
Claims (13)
Extracting sentences included in the document from a given document;
Summarizing the document into at least one key sentence using, as information on the sentences, a similarity score indicating a degree of similarity between the words constituting a sentence and a diversity score indicating an entropy calculated from the probability that the words constituting the sentence appear in the document;
Selecting an image associated with the core sentence with respect to at least one of an image included in the document and an image on the database; And
Generating summary content for the document by combining the selected image with the core sentence
The summarizing step comprises:
Obtaining a weight of the similarity score and the diversity score; And
Selecting a sentence having a high weight as the key sentence
The summarizing step comprises:
A step of converting a sentence included in the document into a text descriptor represented by a numerical vector using a previously learned text descriptor learning model in which a similarity between sentences or words is expressed by a numerical vector
The summarizing step comprises:
Calculating a similarity score according to the degree of similarity between the title of the document and the words constituting the sentence for each of the sentences included in the document;
Calculating a diversity score between the sentences based on the sentence having the highest similarity score among the sentences included in the document for each of the sentences included in the document; And
Selecting at least one sentence among the sentences included in the document based on the similarity score and the diversity score as the core sentence
The summarizing step comprises:
A step of converting a sentence included in the document into a text descriptor represented by a numerical vector using a first learned learning model in advance;
Wherein the selecting comprises:
Generating a numeric vector corresponding to a text descriptor of the core sentence using an image descriptor using a second learning model in which text and an image are expressed in a same semantic space as a numeric vector; And
Selecting at least one of an image on the document and an image on a database using the image descriptor and selecting an image associated with the core sentence
Wherein the second learning model comprises:
Learning by setting the label information of the image included in the document to at least one word contained in the title of the document
Wherein the generating comprises:
Determining at least one of a color and a position of the key sentence to be combined with the selected image according to the pattern of the selected image
Providing an image combining the core sentence as summary content of the document to a user terminal over a network; And
Collecting user feedback information on the summary content from the user terminal
Further comprising:
Wherein the user feedback information is used for the first learning model and the second learning model
An image selection unit for selecting an image associated with the core sentence on at least one of the image included in the document and the image on the database; And
Generating a summary content for the document by combining the selected image with the core sentence,
The document summarizing unit,
Calculating a similarity score and a weight of the diversity score, and then selecting the sentence having a higher weight as the key sentence
The document summarizing unit,
A sentence included in the document is converted into a text descriptor represented by a numerical vector using a previously learned text descriptor learning model,
Calculating the similarity score and the diversity score using the text descriptor,
Selecting at least one sentence among the sentences included in the document as the key sentence using the similarity score and the diversity score
The document summarizing unit,
A sentence included in the document is converted into a text descriptor represented by a numerical vector using a first learned learning model,
Wherein the image selection unit comprises:
A numerical vector corresponding to a text descriptor of the core sentence is generated as an image descriptor using a second learning model in which text and an image are expressed by a numerical vector in the same semantic space,
Retrieving at least one of the image included in the document and the image on the database using the image descriptor to select an image associated with the core sentence
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150094112A KR101754473B1 (en) | 2015-07-01 | 2015-07-01 | Method and system for automatically summarizing documents to images and providing the image-based contents |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170004154A KR20170004154A (en) | 2017-01-11 |
KR101754473B1 true KR101754473B1 (en) | 2017-07-05 |
Family
ID=57832667
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150094112A KR101754473B1 (en) | 2015-07-01 | 2015-07-01 | Method and system for automatically summarizing documents to images and providing the image-based contents |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101754473B1 (en) |
-
2015
- 2015-07-01 KR KR1020150094112A patent/KR101754473B1/en active IP Right Grant
Also Published As
Publication number | Publication date |
---|---|
KR20170004154A (en) | 2017-01-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right |