KR101753762B1 - Robot Journalism Method and System for Automatic Article Generation - Google Patents

Robot Journalism Method and System for Automatic Article Generation

Info

Publication number
KR101753762B1
Authority
KR
South Korea
Prior art keywords
article
data
generating
sentence
event
Prior art date
Application number
KR1020150162833A
Other languages
Korean (ko)
Other versions
KR20170058785A (en)
Inventor
이준환
김동환
Original Assignee
서울대학교산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 서울대학교산학협력단
Priority to KR1020150162833A
Publication of KR20170058785A
Application granted
Publication of KR101753762B1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06F17/21
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/101 Collaborative creation, e.g. joint development of products or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Tourism & Hospitality (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The present invention seeks to implement a fully automated article generation system, and an article generation method performed by that system, by having a machine make all the judgments necessary for article creation. The article generation method includes: collecting data; extracting one or more events from the collected data; selecting at least one important event by assigning a weight to the extracted events; determining the mood (atmosphere) of the article by interpreting the data of the selected important event; and automatically generating an article based on the determined atmosphere.

Figure R1020150162833

Description

Robot Journalism System and Method for Automated Article Generation

The present invention relates to a robot journalism system and method for automatically generating articles from data, and more particularly to an article generating method and an article generating system that can be widely used in the media industry and in the general information and communication field.

In the past, every step of journalism, from collecting and evaluating data to writing articles and distributing them to subscribers, was carried out by people. Computers could be used only as auxiliary tools.

In recent years, the explosive growth of media, including the Internet, has increased competition among media companies. As a result, media companies need to produce information faster and more accurately even as they become smaller. There is therefore a growing need for a robot journalism framework that enables more productive work with less manpower.

Previously, as described in Registration No. 10-1377114, the available technology went only as far as taking newspaper articles already written by various media companies and adjusting their layout. The present invention goes further and attempts to implement a fully automated news article creation system by replacing all the judgments necessary for article creation.

The background art described above is technical information that the inventor possessed in order to derive the present invention or acquired in the process of deriving it, and is not necessarily known technology disclosed to the general public before the filing of the present application.

An object of an embodiment of the present invention is to provide rich articles that go beyond simple information transmission by applying a mood when generating an article.

In addition, an embodiment of the present invention aims to gain an advantage in the competition for speed in collecting and processing data and publishing articles.

An embodiment of the present invention also aims to save manpower and time by having a machine perform the repetitive journalism tasks that a person would otherwise have to handle.

In addition, an embodiment of the present invention is directed to providing personalized content to subscribers.

In addition, an embodiment of the present invention aims to provide narratives suited to a user's situation and environment by varying the length and type of information according to the device the user uses.

In addition, an embodiment of the present invention is intended to present news articles in various languages, including foreign languages such as English, Chinese, and Japanese as well as Korean.

As a technical means for achieving the above technical objects, a first aspect of the present invention is an article generating method performed by an article generating system, the method comprising: collecting data; extracting one or more events from the collected data; selecting at least one important event by assigning a weight to the extracted one or more events; determining an atmosphere of the article through interpretation of the data of the selected important event; and generating an article based on the determined atmosphere of the article.

According to a second aspect of the present invention, an article generating system includes a data collecting unit for collecting data through a data crawling algorithm, an event extracting unit for extracting an event from the collected data, an event selection unit for selecting at least one important event by assigning a weight to the extracted event, an atmosphere determination unit for determining an atmosphere of the article through interpretation of the data of the selected important event, and an article generation unit for generating an article based on the determined atmosphere of the article.

According to a third aspect of the present invention, there is provided a computer program stored on a recording medium for performing an article generating method, the method comprising: collecting data; extracting one or more events from the collected data; selecting at least one important event; determining an atmosphere of the article through interpretation of the data of the selected important event; and generating an article based on the determined atmosphere of the article.

According to a fourth aspect of the present invention, there is provided a computer-readable recording medium having recorded thereon a program for performing an article generating method, the method comprising: collecting data; extracting one or more events from the collected data; selecting at least one important event; determining an atmosphere of the article through interpretation of the data of the selected important event; and generating an article based on the determined atmosphere of the article.

According to any one of the above-mentioned means of the present invention, an embodiment of the present invention can provide a subscriber with an article richer than mere information transmission by applying a mood when generating the article.

Further, an embodiment of the present invention can automatically collect and process data and publish articles without human intervention, and can thereby gain an advantage in the competition for speed.

Further, an embodiment of the present invention can save manpower and time by having a machine perform the repetitive tasks that a person would otherwise have to perform.

In addition, personalized information can be provided to a subscriber by generating customized information based on the individual's interests and tastes.

In addition, a narrative suited to a user's situation and environment can be generated by varying the length and type of information according to the device the user uses.

In addition, internationalized journalism can be performed by presenting articles in various languages, including English, Chinese, and Japanese as well as Korean.

It should be understood, however, that the effects obtainable from the present invention are not limited to those mentioned above, and that other effects not mentioned will be clearly understood by those skilled in the art to which the present invention belongs.

FIG. 1 is a block diagram illustrating an automated journalism environment in accordance with one embodiment of the present invention.
FIG. 2 is a block diagram illustrating an article generating system in accordance with an embodiment of the present invention.
FIG. 3 is a flowchart illustrating an article generating method according to an embodiment of the present invention.
FIGS. 4 and 5 are exemplary diagrams for explaining an article generating method according to an embodiment of the present invention.
FIG. 6 is a flowchart of an article generating method according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can readily practice them. The present invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. In order to clearly illustrate the present invention, parts not related to the description are omitted, and similar parts are denoted by like reference characters throughout the specification.

Throughout the specification, when a part is referred to as being "connected" to another part, this includes not only being "directly connected" but also being "electrically connected" with another element in between. Also, when a part is referred to as "comprising" an element, this means that it may further include other elements, rather than excluding them, unless specifically stated otherwise.

Hereinafter, the present invention will be described in detail with reference to the accompanying drawings.

Before that, the terms used below are first defined.

Journalism is a broad range of activities that provide information and entertainment through mass media, including activities that provide current information and opinions to the public. Recently, electronic journalism, in which electronic documents containing various information are distributed via a network, has become increasingly active.

The article generating method implemented as an embodiment of the present invention refers to journalism performed automatically by a computer algorithm. Each step of the method, including collecting and analyzing the data needed for an article, extracting events from the collected data, determining the article's atmosphere for the extracted events, and generating the article, is carried out by an algorithm, and a device capable of implementing this is referred to as the article generating system 100.

In an embodiment of the present invention, the environment that includes the article generating system 100, an external DB (database) 200 from which the article generating system 100 can collect data, and an article providing apparatus 300 is referred to as the automated journalism environment 1000.

FIG. 1 is a block diagram illustrating an automated journalism environment 1000 including an article generating system 100 in accordance with an embodiment of the present invention.

The automated journalism environment 1000 includes an article generating system 100, an external DB (external database) 200, and an article providing apparatus 300. In an embodiment of the present invention, the external DB 200 is an information providing means connected to the article generating system 100 through the network N1, and may be any means that provides information on the web to the article generating system 100.

In an embodiment of the present invention, the article providing apparatus 300 is a device capable of distributing an article generated by the article generating system 100 to a subscriber, and is connected to the article generating system 100 via a network N2. For example, the article providing apparatus 300 may include a terminal capable of outputting an electronic document.

The terminal may be implemented as a computer, a portable terminal, a television, a wearable device, or the like that can connect to a remote server, or to other terminals and servers, through a network. Here, the computer includes, for example, a notebook computer, a desktop computer, or a laptop computer equipped with a web browser. The portable terminal may be a wireless communication device, including any kind of handheld wireless communication device such as a Personal Communication System (PCS), Personal Digital Cellular (PDC), Personal Handyphone System (PHS), Personal Digital Assistant (PDA), Global System for Mobile communications (GSM), International Mobile Telecommunication (IMT), W-CDMA, WiBro (Wireless Broadband Internet), smartphone, or Mobile WiMAX (Mobile Worldwide Interoperability for Microwave Access) terminal. The television may include an Internet Protocol Television (IPTV), an Internet television, a terrestrial TV, a cable TV, and the like. The wearable device is an information processing device that can be worn directly on the human body, for example a watch, glasses, an accessory, a garment, or shoes, and can be connected to a remote server via a network.

As another embodiment of the present invention, the article providing apparatus 300 may be an apparatus for publishing articles in tangible form. For example, the article providing apparatus 300 may be a device for printing paper newspapers.

In an embodiment of the present invention, the network N1 connecting the article generating system 100 and the external DB 200 and the network N2 connecting the article generating system 100 and the article providing apparatus 300 may each be implemented as any kind of wired or wireless network, such as a local area network (LAN), a wide area network (WAN), a value added network (VAN), a personal area network (PAN), a mobile radio communication network, WiBro (Wireless Broadband Internet), Mobile WiMAX, HSDPA (High Speed Downlink Packet Access), or a satellite communication network.

In an embodiment of the present invention, the network N1 connecting the article generating system 100 of the automated journalism environment 1000 and the external DB 200 and the network N2 connecting the article generating system 100 and the article providing apparatus 300 may be the same as or different from each other. For example, when the network N1 is a wide area network, the network N2 may be an intranet, or vice versa.

FIG. 2 is a block diagram illustrating an article generating system 100 according to an embodiment of the present invention.

The article generating system 100 includes a data collecting unit 101, an event extracting unit 102, an event selecting unit 103, an atmosphere determining unit 104, and an article generating unit 105. In one embodiment of the present invention, the article generating system 100 may further include a database 106, a personalization unit 107, and an article distribution unit 108, and may include a control unit 109 that allows the components of the article generating system 100 to function in relation to one another.

As one embodiment of the present invention, the data collection unit 101 may collect data through a data collection algorithm. For example, the data collection unit 101 can crawl data in the external DB 200 through the network N1.

In one embodiment of the present invention, the data collection unit 101 may convert the collected data into an analytical data format. For example, the collected data can be parsed by decomposing the collected series of data into meaningful tokens and building a parse tree from the decomposed tokens.
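
The tokenize-then-parse step can be illustrated with a minimal sketch. The `key=value` record format, the token classes, and the function names below are assumptions made for illustration; the patent does not specify a data format.

```python
import re

# Hypothetical token classes: numbers, words, and single-character symbols.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]+)|(.))")

def tokenize(text):
    """Decompose raw text into (kind, value) tokens."""
    tokens = []
    for number, word, symbol in TOKEN_RE.findall(text):
        if number:
            tokens.append(("NUM", int(number)))
        elif word:
            tokens.append(("WORD", word))
        elif symbol.strip():  # ignore stray whitespace
            tokens.append(("SYM", symbol))
    return tokens

def parse_record(text):
    """Build a small parse tree: a record node holding key/value field nodes."""
    tokens = tokenize(text)
    if len(tokens) % 3 != 0:
        raise ValueError("expected a sequence of key=value fields")
    fields = []
    for i in range(0, len(tokens), 3):
        (key_kind, key), separator, (_, value) = tokens[i:i + 3]
        if key_kind != "WORD" or separator != ("SYM", "="):
            raise ValueError(f"malformed field near token {i}")
        fields.append(("field", key, value))
    return ("record", fields)

tree = parse_record("inning=7 batter=Kim rbi=1")
```

A real crawler would more likely receive HTML or JSON, in which case an existing parser would replace the hand-written tokenizer.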

In one embodiment of the present invention, the event extraction unit 102 may extract one or more events from the data collected by the data collection unit 101. The event extracting unit 102 can reconstruct the collected data as an event corresponding to the event context according to an interpretation rule.
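
As a rough sketch of rule-based event extraction, each interpretation rule can be modeled as an event name paired with a predicate over one parsed record. The rule set, the record fields, and the event names are hypothetical and serve only to show the shape of the step.

```python
# Hypothetical interpretation rules: (event name, predicate over a record).
RULES = [
    ("home_run", lambda r: r.get("result") == "home_run"),
    ("scoring_hit", lambda r: r.get("result") in {"single", "double", "triple"}
                              and r.get("rbi", 0) > 0),
]

def extract_events(records):
    """Reconstruct raw records as events using the first matching rule."""
    events = []
    for record in records:
        for name, matches in RULES:
            if matches(record):
                events.append({"type": name, "data": record})
                break  # at most one event per record in this sketch
    return events

records = [
    {"batter": "Kim", "result": "double", "rbi": 1},
    {"batter": "Lee", "result": "strikeout"},
    {"batter": "Park", "result": "home_run", "rbi": 2},
]
events = extract_events(records)
```

Records that match no rule (the strikeout here) simply produce no event.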

In one embodiment of the present invention, the event selection unit 103 may assign a weight to each of the one or more events extracted by the event extraction unit 102 in order to select at least one important event.

The following is an embodiment of the present invention for calculating weights. The event selection unit 103 may assign a weight to the words included in the extracted event. The weight for a word can be determined by calculating at least one of a repetition value for the same word (including its synonyms), an adjective value for the adjectives that modify the word, and a position value, within the whole article, of the sentence that contains the word.

In an embodiment of the present invention, the repetition value can be calculated from the number of repetitions of the same word, including synonyms, and the adjective value can be calculated from the number of adjectives modifying the word and the strength of each adjective. To this end, the database 106 may store a strength value matched to each adjective.

In an embodiment of the present invention, the position value of a sentence can be calculated according to the inverted pyramid rule of article writing. Under the inverted pyramid rule, a reporter arranges the news information or facts to be conveyed in an article in descending order of importance. Therefore, when the crawled data is a newspaper article, a sentence containing the word to be weighted can be given a higher weight the closer it is to the beginning of the article.
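
The three components of a word weight (repetition, adjective, and position values) might be combined as in the sketch below. Everything specific here is an assumption: the synonym and adjective tables, the rule that an adjective modifies the word immediately following it, the plain sum of the three values, and the position formula. The position formula rewards sentences nearer the start of the article, following the conventional inverted pyramid; invert it if the opposite ordering is intended.

```python
# Hypothetical lookup tables; in the described system these would live in database 106.
ADJECTIVE_STRENGTH = {"huge": 3.0, "big": 2.0, "small": 1.0}
SYNONYMS = {"win": {"win", "victory"}}

def word_weight(word, sentences):
    """Sum of repetition, adjective, and position values for one word."""
    forms = SYNONYMS.get(word, {word})
    total_sentences = len(sentences)
    repetition = adjective = position = 0.0
    for index, sentence in enumerate(sentences):
        tokens = sentence.lower().split()
        hits = [i for i, token in enumerate(tokens) if token in forms]
        if not hits:
            continue
        repetition += len(hits)
        for i in hits:
            if i > 0:  # assumed: the preceding token is the modifying adjective
                adjective += ADJECTIVE_STRENGTH.get(tokens[i - 1], 0.0)
        # Assumed position value: 1.0 for the first sentence, decreasing linearly.
        position += (total_sentences - index) / total_sentences
    return repetition + adjective + position

article = ["a huge victory for seoul", "the win came late", "fans left"]
weight = word_weight("win", article)
```

In practice the three values would probably be normalized or given separate coefficients before being combined.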

As another embodiment for calculating the weights, the article generating system 100 may further include a personalization unit 107 that receives, from a subscriber of the article, personal setting data corresponding to at least one of the subscriber's interests and tastes. The event selection unit 103 can calculate a weight based on the related words corresponding to the personal setting data input through the personalization unit 107.

As another example of calculating the weight, the article generating system 100 may regard a peak in time series data as an important event. For example, the extracted event-related data may be classified by time point, and a weight may be assigned according to the amount of data at each time point or according to the increase in the amount of data at each time point. With this method, the event assigned the highest weight among the extracted events can be selected as the important event.
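
Both time-series weighting options can be sketched as follows, assuming the event-related data has already been bucketed into per-time-point counts. The function names and the choice to ignore decreases are assumptions.

```python
def volume_weights(counts):
    """Weight each time point by its share of the total data volume."""
    total = sum(counts)
    return [count / total if total else 0.0 for count in counts]

def increase_weights(counts):
    """Weight each time point by how much the volume rose since the previous one."""
    if not counts:
        return []
    return [0] + [max(later - earlier, 0)
                  for earlier, later in zip(counts, counts[1:])]

def peak_index(weights):
    """Index of the time point with the highest weight (first one on ties)."""
    return max(range(len(weights)), key=weights.__getitem__)

counts = [5, 8, 40, 42, 10]  # hypothetical amount of data per time point
busiest = peak_index(volume_weights(counts))
sharpest_rise = peak_index(increase_weights(counts))
```

The two criteria can disagree, as they do here: the third time point has the sharpest rise, while the fourth has the largest volume.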

In one embodiment of the present invention, the event selection unit 103 uses the weights obtained by calculation methods such as those in the above embodiments as the weight score of each event, and can select at least one important event based on these weight scores.

In one embodiment of the present invention, the weight score for each extracted event may be calculated as the total or the average of the weights of the words or sentences included in the event data.
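
A minimal sketch of scoring and selection, assuming each event arrives as a list of already computed word or sentence weights. The event names and the `k` parameter are illustrative.

```python
from statistics import mean

def event_score(weights, mode="total"):
    """Weight score of one event: total or average of its word/sentence weights."""
    if not weights:
        return 0.0
    return sum(weights) if mode == "total" else mean(weights)

def select_important(events, k=1, mode="total"):
    """Return the names of the k events with the highest weight score."""
    ranked = sorted(events, key=lambda name: event_score(events[name], mode),
                    reverse=True)
    return ranked[:k]

events = {
    "home_run": [3.0, 2.0],
    "walk": [1.0],
    "strikeout": [2.0, 2.0, 2.0],
}
by_total = select_important(events, mode="total")
by_average = select_important(events, mode="average")
```

The total favors events described by many words, while the average favors events described by a few heavily weighted ones, so the two modes can select different events.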

In one embodiment of the present invention, the atmosphere determining unit 104 can determine the mood of the article by analyzing the data of the important event selected by the event selecting unit 103.

In one embodiment of the present invention, the atmosphere determination unit 104 may extract context information about important events while analyzing them. Context information takes into account the relationships between the important events derived from the weighting process and the various situations related to those events, and the mood of the entire article can be set on the basis of it.

For example, an incident that occurs frequently tends to be described as "habitual", and something that rarely happens as "surprising". In this case, the expression "habitual" or "surprisingly" can be extracted as context information. That is, by considering the relationship between an important event and the various events related to it, the context information "habitual" can be derived when a specific event occurs frequently, and the context information "surprisingly" when it occurs rarely.

As another embodiment of the present invention, context information can be obtained by analyzing social data. For example, by analyzing social media data about a particular match, the time period with the most activity can be identified and the highlights of the game found. Likewise, by analyzing the posts that viewers leave while watching a drama, it is possible to find out who appears in the video and when. In other words, context information can be analyzed by performing at least one of quantitative analysis and qualitative analysis on social media data related to important events. Here, quantitative analysis may include analyzing the amount of social media data, and qualitative analysis may include analyzing whether social media data occurred, when it occurred, or when the maximum amount of data occurred.
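
The social media analysis can be sketched in a few lines, under the assumption that each post has been reduced to the minute of the broadcast or match in which it was written. The summary field names are hypothetical.

```python
from collections import Counter

def analyze_posts(post_minutes):
    """Summarize social media activity: total volume and the busiest minute."""
    if not post_minutes:
        return None
    counts = Counter(post_minutes)
    # Highest volume wins; on ties, prefer the earliest minute.
    minute, volume = max(counts.items(), key=lambda item: (item[1], -item[0]))
    return {"total_posts": len(post_minutes),
            "peak_minute": minute,
            "peak_volume": volume}

# Hypothetical minutes at which viewers posted about a match.
summary = analyze_posts([12, 12, 47, 47, 47, 80])
```

The total corresponds to the quantitative side of the analysis, and the peak minute to the time at which the maximum amount of data occurred, which could then be matched against the event log to locate a highlight.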

As another example, in the case of a baseball game article, the fact that the home team beat the away team may be selected as the important event, and one of the situations related to that event may be extracted as context information. The atmosphere determining unit 104 can then set the atmosphere as, for example, "precious victory after three consecutive defeats".

In one embodiment of the present invention, the atmosphere determination unit 104 may set specific conditions for the context information analysis in advance. For example, in the case of a baseball game article, when a large score difference of 10 or more is found as context information, the atmosphere of a "great victory" can be determined, and when a slight score difference is found, the atmosphere of a close "victory" can be determined. In other words, if the event in which the home team beat the away team is selected as the important event, a viewpoint such as "the home team's victory" may be set. The mood of the article is thus determined from the most important event, and the overall tone of the news article is determined from that mood.
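
Preset conditions of this kind amount to a small rule table. In the sketch below, the 10-run threshold and the losing-streak example come from the text, while the narrow-margin threshold, the label strings, and the order in which the rules are checked are assumptions.

```python
def determine_atmosphere(margin, losing_streak=0):
    """Pick an atmosphere label for a win from preset context conditions."""
    if margin <= 0:
        raise ValueError("this sketch only covers wins")
    if losing_streak >= 3:
        # Context taken from earlier games outranks the score margin here.
        return "precious victory after consecutive defeats"
    if margin >= 10:  # threshold given in the text
        return "great victory"
    if margin <= 2:   # hypothetical threshold for a slight difference
        return "narrow victory"
    return "victory"

blowout = determine_atmosphere(12)
close_game = determine_atmosphere(1)
streak_breaker = determine_atmosphere(4, losing_streak=3)
```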

In one embodiment of the present invention, the atmosphere determination unit 104 can determine the intensity or level of the atmosphere according to the weight score of the important event. The adjective expressing the atmosphere can then be selected according to the adjective strengths stored in the database 106.
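
One way to map a weight score onto a stored adjective strength is to scale the score to the strength range and pick the closest adjective. The adjective table, the linear scaling, and the `max_score` parameter are assumptions made for this sketch.

```python
# Hypothetical adjective strengths, standing in for values stored in database 106.
ADJECTIVES = {"solid": 1.0, "impressive": 2.0, "overwhelming": 3.0}

def pick_adjective(score, max_score, table=ADJECTIVES):
    """Choose the adjective whose strength is closest to the scaled weight score."""
    ratio = min(max(score / max_score, 0.0), 1.0)  # clamp to [0, 1]
    target = max(table.values()) * ratio
    return min(table, key=lambda adjective: abs(table[adjective] - target))

strong = pick_adjective(9, 10)
moderate = pick_adjective(6, 10)
mild = pick_adjective(2, 10)
```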

As in this embodiment, the context information analyzed by the atmosphere determination unit 104 and the weight score of an important event produce a consistent tone or perspective throughout the article. Describing an event in a specific tone or from a specific perspective conveys the story more richly, and generating articles with an atmosphere gives the subscriber a viewpoint on the event. The viewpoint is an important reference point for defining the overall atmosphere of an article, and the same event may be described positively or negatively depending on the viewpoint.

In an embodiment of the present invention, the article generating unit 105 can generate an article based on the atmosphere of the article determined by the atmosphere determining unit 104. The article generating unit 105 generates the actual article by combining data and interpretations of data, including at least one of the data collected and converted into analytical form, the important events to which weight scores are assigned, and the atmosphere determined for an event.

In an embodiment of the present invention, the article generating unit 105 may generate an article by selecting and placing sentence templates in a drag-and-drop manner. In other words, a number of sentences are prepared in advance, and an article is written by selecting sentences that fit the context. These sentences can be thought of as templates that follow the general structure of an article.

In an embodiment of the present invention, a sentence template may be a sentence in which the subject, object, and so on are left empty, and the article generating unit 105 may generate an article by combining the sentence template with the data of the important event. For example, in the case of sports news, given a sentence template such as "OOO hit a timely hit that brought a runner home", the extracted data can be used to fill in the blanks. This example covers a hit, and other sentence templates can be applied according to the situation, such as a home run or a run forced in by a bases-loaded walk.
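
Filling a sentence template with event data maps naturally onto named placeholders. The template wording, event types, and field names below are hypothetical.

```python
# Hypothetical sentence templates with named blanks for subject, object, and so on.
TEMPLATES = {
    "scoring_hit": "{batter} hit a timely {result} that brought {runner} home.",
    "home_run": "{batter} hit a {rbi}-run home run.",
}

def fill_template(event):
    """Combine the template for the event type with the event's data."""
    return TEMPLATES[event["type"]].format(**event["data"])

sentence = fill_template({
    "type": "scoring_hit",
    "data": {"batter": "Kim", "result": "double", "runner": "Lee"},
})
```

Fields in the event data that a template does not use are ignored by `str.format`, so the same event record can be passed to several alternative templates.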

In an embodiment of the present invention, when the determined atmosphere is applied to a sentence template, expressions such as "unfortunately" failed to score or a "refreshing" double can be produced. The atmosphere is determined during the data processing of the atmosphere determining unit 104. The article generating unit 105 generates completed sentences to which the atmosphere has been applied in this way, and then selects and arranges the completed sentences according to the situation in which the important event occurred. For example, the completed sentences can be arranged according to the weight scores of the important events. Under the inverted pyramid method, a completed sentence corresponding to an important event with a high weight score can be placed near the beginning of the article. Alternatively, the article can be written according to the narrative journalism method, in which case sentences focused on a person, or conversation-centered sentences, can be arranged in preference to sentences containing statistical information. In this way, the article generating unit 105 reflects an effort to convey a genuinely rich story.

In an embodiment of the present invention, the following algorithm may be applied when the article generating unit 105 arranges the completed sentences. For example, for each completed sentence, the fixed importance that its sentence template inherently has and a dynamic importance that depends on the selected important event and the determined atmosphere are calculated, and the completed sentences can be arranged based on the calculated importance.

In one embodiment of the present invention, the fixed importance may be stored in the database 106 in advance for each sentence template. The dynamic importance can be calculated according to the fixed importance as well as the weight score given to the event.
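
The arrangement step can be sketched as a sort on dynamic importance. The text says only that the dynamic importance depends on the fixed importance and the event's weight score; multiplying the two, and listing the most important sentence first, are assumptions of this sketch.

```python
def dynamic_importance(fixed_importance, event_weight):
    """Assumed combination rule: fixed template importance times event weight score."""
    return fixed_importance * event_weight

def arrange(sentences):
    """Order completed sentences from most to least important."""
    ranked = sorted(
        sentences,
        key=lambda s: dynamic_importance(s["fixed"], s["event_weight"]),
        reverse=True,
    )
    return [s["text"] for s in ranked]

completed = [
    {"text": "Seoul took the series opener.", "fixed": 1.0, "event_weight": 2.0},
    {"text": "Park hit a 2-run home run.", "fixed": 0.5, "event_weight": 6.0},
    {"text": "The game lasted three hours.", "fixed": 2.0, "event_weight": 0.5},
]
ordered = arrange(completed)
```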

In an embodiment of the present invention, the article generating unit 105 may adjust the length or type of the article according to the article providing apparatus 300.

For example, completed sentences can be generated flexibly by varying the number of sentence templates used according to the type of article. That is, a breaking news article can use fewer sentence templates. Likewise, an article displayed on a website for PCs may be generated with more sentences, or with longer sentences, than an article provided on a website for small-screen mobile terminals. In this case, the number of important events can be increased to match the appropriate number of sentences.
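
Adjusting article length per device can be as simple as capping the number of ranked sentences. The device names and limits below are hypothetical; the text states only that mobile and breaking news articles use fewer sentences.

```python
# Hypothetical sentence limits per article providing apparatus.
MAX_SENTENCES = {"mobile": 3, "pc": 8}
DEFAULT_LIMIT = 5
BREAKING_LIMIT = 2

def trim_for_device(ranked_sentences, device, breaking=False):
    """Keep only as many top-ranked sentences as the device and article type allow."""
    limit = MAX_SENTENCES.get(device, DEFAULT_LIMIT)
    if breaking:
        limit = min(limit, BREAKING_LIMIT)
    return ranked_sentences[:limit]

ranked = [f"sentence {n}" for n in range(1, 11)]
mobile_article = trim_for_device(ranked, "mobile")
pc_article = trim_for_device(ranked, "pc")
```

Because the input is already ordered by importance, truncation keeps the most important sentences for the smaller screen.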

Also, in an embodiment of the present invention, the article generating unit 105 may vary the form in which articles are provided by adjusting their length or type. For example, a short article and a detailed article can be created together. After the brief article is provided, if an input selecting some part of it is received from the subscriber, the detailed article can be provided, diversifying the way articles are served. As another example, the form of provision can be diversified by providing an article together with multimedia information such as video data. If a user input selecting a certain point during video playback is received, an article related to that point in time can be provided. Here, the method described above of analyzing social media data to obtain context information can be used. For example, the social media data generated during each time slot of a drama broadcast can be analyzed to generate an article for each time slot.

Also, to reduce the impression that a computer wrote the text, it is better to reduce the repetition of the same sentence. It is therefore advantageous to have as many sentence templates as possible for the same situation. In an embodiment of the present invention, each sentence template has several attribute values, and the article creation algorithm can calculate a matching score between the attribute values of each sentence template and the given data in order to select the sentence template that best fits the situation.
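
Template selection by matching score might look like the sketch below. Counting exactly matching attributes as the score, and skipping templates already used in the article, are assumptions; the attribute names and template texts are illustrative.

```python
def match_score(template_attrs, data):
    """Number of template attributes whose values agree with the given data."""
    return sum(1 for key, value in template_attrs.items() if data.get(key) == value)

def choose_template(templates, data, used=()):
    """Pick the best-matching template, preferring ones not used yet."""
    candidates = [t for t in templates if t["text"] not in used] or templates
    return max(candidates, key=lambda t: match_score(t["attrs"], data))

templates = [
    {"text": "{batter} delivered late in the game.",
     "attrs": {"type": "hit", "inning": "late"}},
    {"text": "{batter} singled.", "attrs": {"type": "hit"}},
    {"text": "{batter} went deep.", "attrs": {"type": "home_run"}},
]
situation = {"type": "hit", "inning": "late"}
first_choice = choose_template(templates, situation)
second_choice = choose_template(templates, situation, used={first_choice["text"]})
```

Tracking used templates is one simple way to avoid repeating the same sentence when the same situation occurs twice in one article.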

As an embodiment of the present invention, the article generating unit 105 may generate an article with at least one of Korean and a foreign language. The foreign language may include, for example, one or more of Japanese, Chinese, and English.

As an embodiment of the present invention, the article creation system 100 may further include an article distribution unit 108 that distributes the generated articles to article subscribers. The article distributing unit 108 may provide the generated article to the article providing apparatus 300 via the network.

As an embodiment of the present invention, the article generating system 100 may further include a database 106. The database 106 can store the data collected by the data collection unit 101 and the data converted into an analyzable form, and can store personal setting data received from an article subscriber. In addition, it can store the sentence templates, the fixed importance set for each sentence template, and various other data that the article generating system 100 needs to generate an article.

On the other hand, the article generating method according to the embodiment shown in FIG. 3 includes steps that are processed in time series in the article generating system 100 shown in FIG. 2. Therefore, even if omitted from the following description, the above description of the article generating system 100 shown in FIG. 2 can also be applied to the article generating method according to the embodiment shown in FIG. 3.

According to an embodiment of the present invention, an article generating method performed by the article generating system 100 may include collecting data (S101). Step S101 of collecting the data may further include collecting the data and converting it into an analyzable form, and storing the converted data in the database 106.
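As a rough illustration of step S101, the sketch below converts raw JSON records into rows and stores them in a database. The field names (`inning`, `batter`, `result`), the `plays` table, and the use of an in-memory SQLite database are all assumptions for this example; the specification does not describe the storage schema.

```python
import json
import sqlite3


def to_analyzable(raw_json):
    """Convert one raw JSON record into a flat row (hypothetical field names)."""
    record = json.loads(raw_json)
    return (record["inning"], record["batter"], record["result"])


def store(conn, rows):
    """Store the converted rows in a table standing in for the database 106."""
    conn.execute("CREATE TABLE IF NOT EXISTS plays (inning INTEGER, batter TEXT, result TEXT)")
    conn.executemany("INSERT INTO plays VALUES (?, ?, ?)", rows)
    conn.commit()


raw_records = [
    '{"inning": 1, "batter": "Player 1", "result": "HR"}',
    '{"inning": 1, "batter": "Player 2", "result": "K"}',
]
conn = sqlite3.connect(":memory:")
store(conn, [to_analyzable(r) for r in raw_records])
stored_count = conn.execute("SELECT COUNT(*) FROM plays").fetchone()[0]
```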

According to an embodiment of the present invention, an article generating method may further include extracting one or more events from collected data (S102). In step S102 of extracting the event, the collected data may be reconstructed into an event corresponding to the context of the event according to an interpretation rule.
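One way to picture step S102 is a set of interpretation rules, each of which turns a matching data record into a named event. The rule names, the result codes, and the record layout below are hypothetical and serve only to illustrate the idea.

```python
def extract_events(records, rules):
    """Apply each interpretation rule to each record and collect the matching events."""
    events = []
    for record in records:
        for event_type, predicate in rules.items():
            if predicate(record):
                events.append({"type": event_type, "data": record})
    return events


# Hypothetical interpretation rules keyed by event type.
rules = {
    "home_run": lambda r: r.get("result") == "HR",
    "strikeout": lambda r: r.get("result") == "K",
}
records = [
    {"batter": "Player 1", "result": "HR"},
    {"batter": "Player 2", "result": "1B"},
    {"batter": "Player 3", "result": "K"},
]
events = extract_events(records, rules)
```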

According to an embodiment of the present invention, the article generating method may further include a step (S103) of assigning a weight to one or more extracted events to select at least one important event.

In one embodiment of the present invention, the weight can be determined by calculating at least one of a repetition value of the same word including a synonym, an adjective value to be searched, and a position value of the sentence in the entire article. In yet another embodiment, the weights may be computed based on association terms corresponding to personalization data. The personalization data may be input from the user as data corresponding to at least one of the interest and taste of the subscriber of the article. In yet another embodiment, the weights can be made by analyzing social media data.
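The three weight components named above could be combined as in the following sketch. The specification does not give formulas, so the simple sum, the position formula, the synonym table, and the adjective scores are all assumptions chosen for illustration.

```python
def normalize(word, synonyms):
    """Map a word to its canonical form so that synonyms count as the same word."""
    return synonyms.get(word, word)


def repetition_value(keyword, words, synonyms):
    """Number of occurrences of the keyword, counting its synonyms."""
    target = normalize(keyword, synonyms)
    return sum(1 for w in words if normalize(w, synonyms) == target)


def adjective_value(words, adjective_scores):
    """Sum of the scores of the searched adjectives found in the words."""
    return sum(adjective_scores.get(w, 0.0) for w in words)


def position_value(sentence_index, total_sentences):
    """Earlier sentences score higher (assumed formula)."""
    return 1.0 - sentence_index / total_sentences


def event_weight(keyword, words, sentence_index, total_sentences, synonyms, adjective_scores):
    """Assumed combination: plain sum of the three component values."""
    return (repetition_value(keyword, words, synonyms)
            + adjective_value(words, adjective_scores)
            + position_value(sentence_index, total_sentences))


synonyms = {"homerun": "homer"}          # hypothetical synonym table
adjective_scores = {"dramatic": 2.0}     # hypothetical adjective scores
words = ["dramatic", "homer", "homerun", "win"]
weight = event_weight("homer", words, 0, 4, synonyms, adjective_scores)
```

Here the repetition value is 2, the adjective value is 2.0, and the position value is 1.0, so `weight` is 5.0.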

According to an embodiment of the present invention, the article generating method may further include determining (S104) an atmosphere of the article by analyzing data of the selected important event.

In one embodiment of the present invention, interpretation of the data can be accomplished by extracting context information from the important event. The step of determining the atmosphere of the article may include analyzing at least one of the weight score of the selected important event and the extracted context information related to the important event, and determining the atmosphere of the article based on the analysis.
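A minimal sketch of atmosphere determination, assuming the weight score is scaled by a context factor and then bucketed by thresholds. The `media_volume_ratio` key, the thresholds, and the three atmosphere labels are hypothetical; the specification only states that the weight score and context information are analyzed.

```python
def determine_atmosphere(weight_score, context):
    """Combine the weight score with context information and bucket the result."""
    # Assumed rule: a larger volume of related media data amplifies the score.
    intensity = weight_score * (1.0 + context.get("media_volume_ratio", 0.0))
    if intensity >= 8.0:
        return "intense"
    if intensity >= 4.0:
        return "moderate"
    return "calm"


busy_context = {"media_volume_ratio": 1.0}   # hypothetical context information
atmosphere = determine_atmosphere(5.0, busy_context)
```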

According to an embodiment of the present invention, an article generating method may further include generating an article based on the determined atmosphere of the article (S105). Step S105 of generating an article may include selecting one or more sentence templates corresponding to the selected important event. The step of selecting a sentence template may select a sentence template by calculating a matching score between the attribute value of the sentence template and the related data.

In one embodiment of the present invention, step S105 of generating an article may further include combining the selected sentence template with the data of the important event, and generating a completed sentence by applying the determined atmosphere of the article to the sentence template combined with the data of the important event. Step S105 of generating an article may further include arranging the generated completed sentences according to the weight score.
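Combining a template with event data and applying the atmosphere can be illustrated as below. The choice to express atmosphere through an adjective follows the claims, but the adjective table, the atmosphere labels, and the placeholder names are assumptions for this example.

```python
# Hypothetical mapping from atmosphere to adjective strength.
ADJECTIVE_BY_ATMOSPHERE = {"calm": "narrow", "moderate": "clear", "intense": "crushing"}


def complete_sentence(template, event_data, atmosphere):
    """Fill the template with the event data and an adjective matching the atmosphere."""
    adjective = ADJECTIVE_BY_ATMOSPHERE[atmosphere]
    return template.format(adjective=adjective, **event_data)


template = "{winner} earned a {adjective} win over {loser}."
event_data = {"winner": "Team A", "loser": "Team B"}
sentence = complete_sentence(template, event_data, "intense")
```

With the values above, `sentence` is "Team A earned a crushing win over Team B."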

Further, in one embodiment of the present invention, arranging the completed sentences may include calculating a dynamic importance of each completed sentence according to the fixed importance of the sentence template, the selected important event, and the determined atmosphere, and arranging the completed sentences based on the calculated dynamic importance.
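The arrangement by dynamic importance might be sketched as follows. The formula (fixed importance plus event weight scaled by an atmosphere factor) and the factor values are assumptions; the specification names the three inputs but not how they are combined.

```python
from dataclasses import dataclass


@dataclass
class CompletedSentence:
    text: str
    fixed_importance: float   # importance set in advance for the sentence template
    event_weight: float       # weight score of the important event behind the sentence


# Hypothetical scaling factors per atmosphere.
ATMOSPHERE_FACTOR = {"calm": 1.0, "moderate": 1.2, "intense": 1.5}


def dynamic_importance(sentence, atmosphere):
    """Assumed formula combining fixed importance, event weight, and atmosphere."""
    return sentence.fixed_importance + sentence.event_weight * ATMOSPHERE_FACTOR[atmosphere]


def arrange(sentences, atmosphere):
    """Order the completed sentences from highest to lowest dynamic importance."""
    return sorted(sentences, key=lambda s: dynamic_importance(s, atmosphere), reverse=True)


result_line = CompletedSentence("Team A won the game.", fixed_importance=5.0, event_weight=1.0)
highlight = CompletedSentence("Player 1 hit a home run.", fixed_importance=1.0, event_weight=4.0)
calm_order = arrange([result_line, highlight], "calm")
intense_order = arrange([result_line, highlight], "intense")
```

In a calm atmosphere the result line scores 6.0 against 5.0 and comes first; in an intense atmosphere the highlight scores 7.0 against 6.5 and moves to the top, which shows how the same sentences can be ordered differently.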

According to one embodiment of the present invention, step S105 of generating an article may generate the article so that at least one of its length and type differs according to the article providing apparatus 300, and may generate the article in at least one of Korean and a foreign language.

As an embodiment of the present invention, the article generating method may further include a step (S106) of distributing the article generated through the above steps to the subscriber. According to an embodiment of the present invention, the step of distributing to the subscriber may be performed through a terminal capable of outputting an electronic document including the generated article. Alternatively, the article may be published as printed matter such as a paper newspaper and distributed to the subscriber.

FIG. 4 is an exemplary diagram illustrating an embodiment of the present invention, showing an actual example of an article 400 about a professional baseball game result generated by the article generating system 100, and FIG. 5 is an example of JSON data 500 of the character (text) relay produced when the game is relayed.

As an embodiment of the present invention, the article generating system 100 may invoke the character relay data 500 of a professional baseball game to generate a news article as shown in FIG. 4 that is actually provided to the subscriber.

An article (400) on the results of the professional baseball game shown in FIG. 4 is an article about the games of Nexen and Lotte on May 12, 2015. As described above, the article generating method performed by the article generating system 100 can generate a completely new article based on the professional baseball information collected on-line.

According to an article generating method according to an embodiment of the present invention, the article generating system 100 can play the role of a journalist in various journalism fields, such as sports articles including professional baseball and financial articles including stock market information.

In addition, the article generating system 100 according to an embodiment of the present invention can generate various news articles related to politics, the economy, and society, and its applications can be extended from generating documents such as enterprise analysis reports to intelligent systems that periodically generate health reports from health records collected by Internet of Things (IoT) devices and provide consultation.

FIG. 6 is a flow chart of an article generating method according to an embodiment of the present invention. The article generating method according to an embodiment of the present invention can be performed through steps such as data collection, event extraction, important event selection, mood (atmosphere) determination, and article creation. Between the event extraction and important event selection steps, event scoring may be performed, that is, a step of giving a weight score to each extracted event. Further, between the important event selection and mood determination steps, context information may additionally be considered, and the mood of the article may be determined using that context information.
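The overall flow can be condensed into a single function for illustration. Everything concrete here is assumed: the result codes, the per-event scores, the mood threshold, the templates, and the adjectives are placeholders, and each stage is far simpler than what the embodiments describe.

```python
# Hypothetical lookup tables standing in for the scoring rules and sentence templates.
EVENT_SCORES = {"HR": 3.0, "K": 1.0}
TEMPLATES = {"HR": "{batter} hit a {adjective} home run.", "K": "{batter} struck out."}
ADJECTIVES = {"calm": "timely", "intense": "dramatic"}


def generate_article(records, top_n=2):
    # 1. Event extraction: keep only records that match a known event type.
    events = [r for r in records if r["result"] in EVENT_SCORES]
    # 2. Event scoring: give each extracted event a weight score.
    scored = [(EVENT_SCORES[r["result"]], r) for r in events]
    # 3. Important event selection: keep the highest-scoring events.
    important = sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_n]
    if not important:
        return ""
    # 4. Mood determination: assumed threshold on the top weight score.
    mood = "intense" if important[0][0] >= 3.0 else "calm"
    # 5. Article creation: fill a template per event; unused fields are ignored by format().
    sentences = [TEMPLATES[r["result"]].format(adjective=ADJECTIVES[mood], **r)
                 for _, r in important]
    return " ".join(sentences)


records = [
    {"batter": "Player 1", "result": "K"},
    {"batter": "Player 2", "result": "1B"},
    {"batter": "Player 3", "result": "HR"},
]
article = generate_article(records)
```

With these inputs the home run outranks the strikeout, the mood is "intense", and `article` is "Player 3 hit a dramatic home run. Player 1 struck out."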

The article generating method according to the embodiment described with reference to FIG. 3 may also be implemented in the form of a recording medium including instructions executable by a computer, such as a program module executed by a computer. Computer-readable media can be any available media that can be accessed by a computer and include both volatile and nonvolatile media, and removable and non-removable media. The computer-readable medium may also include computer storage media and communication media. Computer storage media include both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave, or other transport mechanism, and include any information delivery media.

Also, the article generating method according to an embodiment of the present invention may be implemented as a computer program (or a computer program product) including instructions executable by a computer. A computer program includes programmable machine instructions that are processed by a processor and can be implemented in a high-level programming language, an object-oriented programming language, an assembly language, or a machine language. The computer program may also be recorded on a tangible computer-readable recording medium (e.g., memory, hard disk, magnetic/optical medium, or solid-state drive).

Thus, an article generating method according to an embodiment of the present invention can be implemented by a computer program as described above being executed by a computing device. The computing device may include a processor, a memory, a storage device, a high-speed interface connected to the memory and a high-speed expansion port, and a low-speed interface connected to the low-speed bus and the storage device. Each of these components is connected to each other using a variety of buses and can be mounted on a common motherboard or mounted in any other suitable manner.

Here, the processor may process instructions within the computing device, such as instructions stored in the memory or the storage device for displaying graphical information to provide a graphical user interface (GUI) on an external input/output device, such as a display connected to the high-speed interface. As another example, multiple processors and/or multiple buses may be used, as appropriate, with multiple memories and memory types. The processor may also be implemented as a chipset composed of chips comprising multiple independent analog and/or digital processors.

The memory also stores information within the computing device. In one example, the memory may comprise volatile memory units or a collection thereof. In another example, the memory may be comprised of non-volatile memory units or a collection thereof. The memory may also be another type of computer readable medium such as, for example, a magnetic or optical disk.

The storage device can provide a large amount of storage space to the computing device. The storage device may be a computer-readable medium or a configuration including such a medium and may include, for example, devices in a SAN (Storage Area Network) or other configurations, and may be a floppy disk device, a hard disk device, a tape device, flash memory, or another similar semiconductor memory device or device array.

The foregoing description of the present invention is for illustrative purposes only, and those of ordinary skill in the art will readily understand that various changes and modifications may be made without departing from the spirit or essential characteristics of the present invention. It is therefore to be understood that the above-described embodiments are illustrative in all aspects and not restrictive. For example, each component described as a single entity may be implemented in a distributed manner, and components described as being distributed may also be implemented in a combined form.

The scope of the present invention is defined by the appended claims rather than by the detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents are to be construed as being included within the scope of the present invention.

1000: Automated Journalism Environment N1, N2: Network
100: Article generating system 200: External DB 300: Article providing apparatus
101: data collecting unit 102: event extracting unit 103:
104: atmosphere determining unit 105: article generating unit 106: database 107: personal setting unit 108: article distributing unit 109:

Claims (20)

An article generating method performed by an article generating system,
Collecting data, by a data collecting unit;
Extracting one or more events from the collected data;
Selecting, by an event selecting unit, at least one important event by assigning a weight to the one or more extracted events;
Determining an atmosphere of an article through interpretation of data of the selected important event;
Generating, by an article generating unit, an article based on the determined atmosphere of the article; And
Receiving personalization data corresponding to at least one of interest and taste from a subscriber of the article,
Wherein the weight is calculated
Based on at least one of a repetition value of the same word including a synonym, an adjective value to be searched, a position value of a sentence in the entire article, and an association word corresponding to the personalization data,
Wherein collecting the data comprises:
Collecting and transforming the data into a form that can be analyzed; And
And storing the data converted into an analyzable form in a database,
In the data interpretation,
And extracting context information from a temporal element of the data, wherein the time element includes extraction of context information including a generation time of the media data related to the important event, an occurrence time period, and,
The step of determining the atmosphere of the article comprises:
Analyzing at least one of a weight score of the selected important event and the extracted context information; And
Determining the severity or the altitude of the atmosphere of the article according to the analysis,
The step of generating the article comprises:
Selecting one or more sentence templates corresponding to the selected important event;
Combining the selected sentence template with the data of the important event;
Generating a completed sentence by applying the strength of the adjective, according to the determined atmosphere of the article, to the sentence template combined with the data of the important event; And
Arranging the completed sentence,
Wherein the arranging comprises:
Calculating a dynamic importance of the completed sentence according to the fixed importance of the sentence template, the selected important event, and the determined atmosphere; And
Arranging the completed sentence based on the calculated dynamic importance.
delete delete delete delete delete delete
The method according to claim 1,
The step of generating the article comprises:
Wherein at least one of length and type of an article is generated differently according to an article providing apparatus for providing an article generated from the article generating system to a subscriber.
The method according to claim 1,
The step of generating the article comprises:
And generating an article with at least one of Korean and a foreign language.
In an article generating system,
Database;
A data collection unit for collecting data and converting the data into a form that can be analyzed and storing the data converted into an analytical form into the database;
An event extracting unit for extracting one or more events from the collected data;
An event selection unit for assigning a weight to the extracted one or more events and selecting at least one important event;
An atmosphere decision unit for determining an atmosphere of the article through interpretation of data of the selected important event;
An article generating unit for generating an article based on the determined atmosphere of the article; And
And a personal setting unit that receives personal setting data corresponding to at least one of interest and taste from a subscriber of the article,
The event selection unit
Calculating a weight based on at least one of a repetition value of the same word including a synonym, an adjective value to be searched, a position value of the sentence in the entire article, and an association word corresponding to the personalization data,
The atmosphere decision unit
Extracting context information from a temporal element of the data, wherein the temporal element includes a data amount of media data related to the selected important event, a generation trend of the media data, an occurrence time period, and time zone information in which data is generated in a maximum amount,
Analyzing at least one of a weight score of the important event and the extracted context information to determine the severity or the altitude of the atmosphere of the article according to the analysis,
The article generating unit,
Selecting one or more sentence templates corresponding to the selected important event and combining them with the data of the important event, generating a completed sentence by applying the strength of the adjective, according to the determined atmosphere of the article, to the combined sentence template, and arranging the completed sentence, wherein the dynamic importance of the completed sentence is calculated according to the fixed importance of the selected sentence template, the selected important event, and the determined atmosphere, and the completed sentence is arranged based on the calculated dynamic importance.
delete delete delete delete delete delete
The article generating system according to claim 10,
The article generating unit,
Wherein the article generation system generates at least one of an article length and a type differently according to an article providing apparatus that provides an article generated from the article generation system to a subscriber.
The article generating system according to claim 10,
The article generating unit,
An article generating system for generating an article with at least one of Korean and a foreign language.
A computer program stored on a recording medium for performing the method recited in claim 1, performed by an article generating system.
A computer-readable recording medium on which a program for carrying out the method according to claim 1 is recorded.
KR1020150162833A 2015-11-19 2015-11-19 Robot Journalism Method and System for Automatic Article Generation KR101753762B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150162833A KR101753762B1 (en) 2015-11-19 2015-11-19 Robot Journalism Method and System for Automatic Article Generation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020150162833A KR101753762B1 (en) 2015-11-19 2015-11-19 Robot Journalism Method and System for Automatic Article Generation

Publications (2)

Publication Number Publication Date
KR20170058785A KR20170058785A (en) 2017-05-29
KR101753762B1 true KR101753762B1 (en) 2017-07-04

Family

ID=59053643

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150162833A KR101753762B1 (en) 2015-11-19 2015-11-19 Robot Journalism Method and System for Automatic Article Generation

Country Status (1)

Country Link
KR (1) KR101753762B1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190068005A (en) 2017-12-08 2019-06-18 (주)엠로보 Method and apparatus for investment information curation using robot journalism
KR102214136B1 (en) 2019-08-22 2021-02-09 백석대학교산학협력단 goods image searching method based social networks

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102135712B1 (en) * 2017-12-29 2020-07-20 (주)엠로보 Apparatus and method for generating stock article
KR102020012B1 (en) 2018-06-08 2019-09-11 (주)에이피케이어플킹 System method for writing sports article based on bigdata analysis
KR102026907B1 (en) * 2019-04-29 2019-09-30 이원세 Robot journalism device and method
CN112232035A (en) * 2019-07-15 2021-01-15 北京字节跳动网络技术有限公司 Article generation method and device, electronic equipment and storage medium
KR102488878B1 (en) * 2020-08-14 2023-01-13 김지완 System for auto-creation news articles by robot journalism
WO2024085718A1 (en) * 2022-10-20 2024-04-25 주식회사 아이팩토리 Document creation device having function of automatically generating text by using sentence template, method, computer program, computer-readable recording medium, server and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101377114B1 (en) * 2012-10-11 2014-03-24 한양대학교 에리카산학협력단 News snippet generation system and method for generating news snippet

Also Published As

Publication number Publication date
KR20170058785A (en) 2017-05-29

Similar Documents

Publication Publication Date Title
KR101753762B1 (en) Robot Journalism Method and System for Automatic Article Generation
US20180253173A1 (en) Personalized content from indexed archives
CN102708174B (en) Method and device for displaying rich media information in browser
US10423664B2 (en) Method and system for providing recommended terms
CN106354861B (en) Film label automatic indexing method and automatic indexing system
US20180039627A1 (en) Creating a content index using data on user actions
CN101000627B (en) Method and device for issuing correlation information
CN104391999B (en) Information recommendation method and device
US20090063984A1 (en) Customized today module
CN106339398A (en) Pre-reading method and device for webpage and intelligent terminal device
US20160364373A1 (en) Method and apparatus for extracting webpage information
CN1761972A (en) A method of determining an intention of internet user, and a method of advertising via internet by using the determining method and a system thereof
CN111479169A (en) Video comment display method, electronic equipment and computer storage medium
CN103885987A (en) Music recommendation method and system
CN107566906B (en) Video comment processing method and device
CN102541892A (en) Method for recording and analyzing user behavior characteristic
CN104090923A (en) Method and device for displaying rich media information in browser
CN111343467A (en) Live broadcast data processing method and device, electronic equipment and storage medium
JP2022043273A (en) Method, apparatus, device, storage medium, and computer program product for generating caption
CN106326261A (en) Pre-reading method and device for webpage and intelligent terminal device
JP5988345B1 (en) Evaluation device, evaluation method, evaluation program, recommendation device, recommendation method, and recommendation program
CN105893584A (en) Method, client and system for displaying website label of favorites
CN111859973A (en) Method and device for generating commentary
CN107085573A (en) The acquisition methods and device of hot information
CN110309415B (en) News information generation method and device and readable storage medium of electronic equipment

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
AMND Amendment
E601 Decision to refuse application
AMND Amendment
X701 Decision to grant (after re-examination)
GRNT Written decision to grant