KR101924642B1 - Apparatus and method for tagging topic to contents - Google Patents

Apparatus and method for tagging topic to contents Download PDF

Info

Publication number
KR101924642B1
KR101924642B1 (application KR1020160009774A)
Authority
KR
South Korea
Prior art keywords
topic
keyword
content
unit
viewer
Prior art date
Application number
KR1020160009774A
Other languages
Korean (ko)
Other versions
KR20170027253A (en)
Inventor
손정우
김선중
박원주
이상윤
류원
김상권
김승희
정우석
Original Assignee
한국전자통신연구원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국전자통신연구원 filed Critical 한국전자통신연구원
Priority to US15/253,233 priority Critical patent/US10372742B2/en
Publication of KR20170027253A publication Critical patent/KR20170027253A/en
Application granted granted Critical
Publication of KR101924642B1 publication Critical patent/KR101924642B1/en

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25866Management of end-user data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25866Management of end-user data
    • H04N21/25891Management of end-user data being end-user preferences

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Graphics (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An apparatus and method for tagging topics to content are disclosed. The topic tagging apparatus includes an unstructured data-based topic generation unit for generating a topic model including unstructured data-based topics based on content and unstructured data; a viewer group analysis unit for analyzing characteristics of a viewer group; a multi-lateral topic generation unit for generating multi-lateral topics based on the topic model and the characteristics of the viewer group; a content division unit for dividing the content into a plurality of scenes; and a tagging unit for tagging the multi-lateral topics to the scenes.

Description

APPARATUS AND METHOD FOR TAGGING TOPIC TO CONTENTS

The present invention relates to broadcast communication technology, and more particularly to an apparatus and method that tag broadcast content, divided into predetermined units, with topics obtained by combining the results of associated-data analysis with information about the viewers watching the broadcast content.

Personalized content recommendation, search, and content-related advertisement services are offered to viewers. Automatic tagging technology for broadcast content is one of the technologies required to realize such services.

Existing technology attaches information such as the broadcast date, producer, compression format, and some additional information (such as actors or filming locations) to content. Much of this information is attached manually, requiring human effort.

Although automatic tagging technology exists for some information, the sources from which information is extracted are limited to material generated within the broadcast content, such as subtitles and dialogue, and the range of information tagged is likewise limited to persons or objects.

This conventional technology cannot provide viewers with diverse information about broadcast content: only information about the content itself reaches viewers, and the content provider cannot diversify its profit model.

The present invention provides a method and apparatus for providing users with diverse information related to content by tagging the content with multi-lateral topics based on viewing-situation information and unstructured data.

A topic tagging apparatus according to an embodiment of the present invention is a topic tagging apparatus for a content based on a viewing situation, comprising: an atypical data-based topic generating unit for generating a topic model including atypical data-based topics based on contents and unstructured data; A viewer group analyzer for analyzing the characteristics of the viewer group including the viewer based on the social network of the viewer and the viewing condition information of the viewer; A multiple side topic generation unit for generating multiple side topics based on the topic model and the characteristics of the viewer group; A content divider for dividing the content into a plurality of scenes; And a tagging unit for tagging the multi-lateral topic in the scene.

The atypical data-based topic generation unit may include: a content-related unstructured data collection unit for collecting content-related unstructured data associated with the content; a keyword extraction unit for extracting a first keyword and a second keyword from the content-related unstructured data; and a topic model generation unit that generates an atypical data-based topic for the content using the first keyword and the second keyword and generates a topic model based on the atypical data-based topic, wherein the second keyword may be determined from among the first keywords based on the frequency of the first keyword.

The atypical data-based topic generation unit may include: an external unstructured data analysis unit for extracting a third keyword from external unstructured data; And a model extension unit that extends the topic model based on the third keyword.

Wherein the viewer group analysis unit comprises: a social network generating unit for generating the social network based on the online information of the viewer; A proximity network generation unit for generating a proximity network from the viewing situation information; A network integration unit for integrating the social network and the proximity network; And a group feature extraction unit for extracting a common feature of the group of viewers based on the integrated network.

And a viewer group extracting unit for extracting the group of viewers from the integrated network.

Wherein the multi-lateral topic generator comprises: a relevance analyzer for analyzing the association between the atypical data-based topic and the feature of the viewer group; And a weight calculation unit for calculating a weight for each viewer group corresponding to each of the atypical data-based topics based on the association and reflecting the weight to the topic model.

The multi-lateral topic generation unit may further include a topic model re-learning unit that changes the topic model based on the association.

The tagging unit may analyze the association of the viewer group with the scene and the association of the multi-lateral topic with the scene, and tag the multi-lateral topic to the scene based on these two associations.

The association of the multi-lateral topic to the scene is analyzed based on the association of the first keyword to the scene, and the first keyword may be extracted from the content-related unstructured data associated with the content.

A topic tagging method according to an embodiment includes: generating a topic related to a topic of broadcast content; Extracting characteristics of a viewer group based on audience information of a viewer; Generating a multi-lateral topic based on the topic of the broadcast content and the characteristics of the audience group; And tagging the multi-lateral topic in the segmented broadcast content.

According to an embodiment of the present invention, there is provided a topic tagging method for content based on a viewing situation, the method comprising: generating a topic model including atypical data based topics based on contents and unstructured data; Analyzing a characteristic of a viewer group including the viewer based on the social network of the viewer of the contents and the viewing condition information of the viewer; Creating multiple side topics based on the topic model and characteristics of the audience group; Dividing the content into a plurality of scenes; And tagging the multi-lateral topic in the scene.

The recording medium according to an embodiment may be a computer-readable recording medium in which a program for executing the topic tagging method is recorded.

According to an exemplary embodiment of the present invention, various side information regarding a content can be provided to a user by tagging multiple side topics on the content based on the viewing situation information and unstructured data.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram illustrating the components of a topic tagging apparatus according to one embodiment.
FIG. 2 is a diagram illustrating the components of an unstructured data-based topic generation unit according to an embodiment.
FIG. 3 is a diagram illustrating the components of a viewer group analysis unit according to an exemplary embodiment of the present invention.
FIG. 4 is a diagram illustrating the components of a multi-lateral topic generation unit according to an exemplary embodiment of the present invention.
FIG. 5 is a diagram illustrating the components of a viewer- and topic-based scene unit tagging unit according to an embodiment.
FIG. 6 is a flowchart of a topic tagging method according to one embodiment.

It should be understood that the specific structural and functional descriptions below are merely illustrative of the embodiments and are not to be construed as limiting the scope of the patent application described herein. Various modifications and variations may be made by those skilled in the art to which the present invention pertains. Reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment; references to "an embodiment" do not necessarily all refer to the same embodiment.

The specific structural or functional descriptions for the embodiments disclosed herein are presented only to illustrate the embodiments; the embodiments may be implemented in various forms and are not limited to those described herein.

The embodiments disclosed herein are capable of various modifications and may take various forms, so that the embodiments are illustrated in the drawings and described in detail herein. It is not intended to be exhaustive or to limit the invention to the specific forms disclosed, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the disclosure.

The terms first, second, and the like may be used to describe various elements, but the elements should not be limited by these terms. The terms serve only to distinguish one element from another; for example, without departing from the scope of the present disclosure, a first element may be referred to as a second element, and similarly a second element may be referred to as a first element.

When an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In contrast, when an element is referred to as being "directly connected" or "directly coupled" to another element, there are no intervening elements. Other expressions describing relationships between components, such as "between" versus "immediately between" or "adjacent to" versus "directly adjacent to," should be interpreted in the same manner.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the rights. Singular expressions include plural expressions unless the context clearly dictates otherwise. In this specification, terms such as "comprises" or "having" specify the presence of stated features, numbers, steps, operations, elements, parts, or combinations thereof, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art. Terms such as those defined in commonly used dictionaries are to be interpreted as having a meaning consistent with their meaning in the context of the relevant art and, unless explicitly defined herein, are not to be interpreted in an idealized or overly formal sense.

Embodiments to be described below can be applied to identify and determine the type of motion of an object included in a moving image.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings. In the following description, the same components are denoted by the same reference numerals regardless of the figure, and duplicate descriptions are omitted.

FIG. 1 is a diagram illustrating the components of a topic tagging apparatus according to one embodiment.

The topic tagging apparatus according to an exemplary embodiment may tag content, at the level of scenes, with multi-lateral topics derived not only from the content itself but also from unstructured data related to the content, the viewer's social network, and viewing-situation information. Hereinafter, tagging may also be referred to as indexing. The content may include broadcast content.

The topic tagging apparatus 100 according to an exemplary embodiment includes an unstructured data-based topic generation unit 110, a viewer group analysis unit 120, a multiple side topic generation unit 130, a content division unit 140, and a tagging unit 150.

The topic tagging apparatus 100 receives content from the content storage device 180 and receives unstructured data related to the content from the content-related unstructured data storage device 190. The unstructured data-based topic generation unit 110 may generate an unstructured data-based topic based on the unstructured data. Here, the unstructured data may include content-related unstructured data such as scripts and subtitles, and external unstructured data such as blogs and news published on websites.

The topic tagging apparatus 100 may collect the social network and viewing-situation information of viewers watching the content from the social network and viewing situation storage device 170. The viewer group analysis unit 120 can generate viewer groups based on the social network and the viewing-situation information, and extract information about each group. The information about a viewer group may include the characteristics of that group.

The multi-lateral topic generation unit 130 may compute multi-lateral topics and per-viewer-group weights based on the viewer group information and the unstructured data-based topics. The content division unit 140 may divide the input content into units such as scenes; specifically, the unit of division may be a scene or a set of scenes. The tagging unit 150 may tag the multi-lateral topics to the divided scenes.
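The scene-unit division performed by the content division unit 140 is not specified further here; a minimal Python sketch is given below under the assumption that shot start times are available and that a gap in the timeline begins a new scene. The 5-second threshold and the function name are illustrative assumptions, not part of the disclosed apparatus.

```python
def split_into_scenes(shot_times, gap_sec=5.0):
    """Group shot start times into scenes: a gap larger than
    gap_sec between consecutive shots starts a new scene.
    (Hypothetical helper; thresholds are illustrative.)"""
    scenes, current = [], [shot_times[0]]
    for prev, cur in zip(shot_times, shot_times[1:]):
        if cur - prev > gap_sec:
            scenes.append(current)
            current = []
        current.append(cur)
    scenes.append(current)
    return scenes
```

A real content division unit would operate on visual shot-boundary detection rather than timestamps alone, but the grouping step has this shape.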

FIG. 2 is a diagram illustrating the components of an unstructured data-based topic generation unit according to an embodiment.
The atypical data-based topic generation unit 110 according to an embodiment may generate an unstructured data-based topic based on unstructured data. The atypical data-based topic generation unit 110 may include a content-related unstructured data collection unit 211, a keyword extraction unit 212, a topic model generation unit 213, an external unstructured data analysis unit 214, and a model extension unit 215.
The atypical data-based topic generation unit 110 may first collect two types of unstructured data from the content-related unstructured data storage device 190: the content-related unstructured data collection unit 211 may collect the content-related unstructured data 222, and the external unstructured data analysis unit 214 may collect the external unstructured data 221.
Here, the content-related unstructured data 222 may refer to data directly related to the content other than the content itself, such as scripts and subtitles. The external unstructured data 221 may refer to data indirectly associated with the content, such as blogs and news related to the content published on websites.

The keyword extraction unit 212 may extract first keywords from the collected data based on regional characteristics. Here, a regional characteristic means a characteristic appearing in a specific time domain along the time axis: for example, if a keyword A appears at high frequency in the script or subtitles within a specific time domain, A is a regional characteristic of that time domain. The keyword extraction unit 212 can then extract second keywords based on the first keywords; a second keyword is determined from among the first keywords based on their local frequency of appearance in the script or subtitles, in order to prevent the generation of duplicate or noise topics. That is, the probability that a first keyword is selected as a second keyword is higher when its occurrences are concentrated in a specific time period, its frequency is high, and it is semantically representative. The second keywords may be referred to as the seed words of the topics.
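The two-stage keyword selection above can be sketched in Python. This is a minimal illustration, assuming timestamped tokens from a subtitle track; the window size, count, and concentration thresholds are invented for the example and are not values from the disclosure.

```python
from collections import Counter, defaultdict

def extract_keywords(timed_tokens, window_sec=60, min_count=3, concentration=0.6):
    """First keywords: tokens frequent within some time window (the
    'regional characteristic').  Second keywords (topic seed words):
    first keywords whose occurrences concentrate in one window."""
    per_window = defaultdict(Counter)
    total = Counter()
    for ts, tok in timed_tokens:
        per_window[int(ts // window_sec)][tok] += 1
        total[tok] += 1

    # First keywords: high local frequency in at least one window.
    first = {tok for win in per_window.values()
             for tok, n in win.items() if n >= min_count}

    # Second keywords: concentrated in a specific time period.
    second = set()
    for tok in first:
        peak = max(win[tok] for win in per_window.values())
        if peak / total[tok] >= concentration:
            second.add(tok)
    return first, second
```

A semantic-representativeness check, which the text also mentions, would require an external resource (e.g. embeddings) and is omitted from this sketch.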

The topic model generation unit 213 may generate an unstructured data-based topic for the content using the first keyword and the second keyword, and may generate the topic model based on the unstructured data-based topic. Where the topic model may include atypical data based topics. Here, the generation of the model can be referred to as the learning of the model.

According to one embodiment, the model extension unit 215 may extend the learned topic model based on the external unstructured data 221. First, the external unstructured data analysis unit 214 may receive the external unstructured data 221 from the content-related unstructured data storage device 190 and extract third keywords from it.

The model extension unit 215 can extend the topic model based on the extracted keywords. Specifically, the model extension unit 215 determines whether a third keyword extracted from the external unstructured data is highly related to the keywords (first or second keywords) belonging to an existing topic. If the relatedness is high, it adds the third keyword to the existing topic; if the relatedness is low, it generates a new topic.
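The extend-or-create decision of the model extension unit can be sketched as follows. The relatedness function and the 0.5 threshold are assumptions: any scoring function (for instance embedding similarity) could be plugged in.

```python
def extend_topic_model(topics, third_keywords, relatedness, threshold=0.5):
    """topics: topic id -> set of keywords.
    relatedness(kw, topic_keywords) -> score in [0, 1]
    (hypothetical signature; the patent does not fix one)."""
    next_id = max(topics, default=-1) + 1
    for kw in third_keywords:
        best_topic, best_score = None, 0.0
        for tid, kws in topics.items():
            score = relatedness(kw, kws)
            if score > best_score:
                best_topic, best_score = tid, score
        if best_topic is not None and best_score >= threshold:
            topics[best_topic].add(kw)   # relatedness high: extend topic
        else:
            topics[next_id] = {kw}       # relatedness low: new topic
            next_id += 1
    return topics
```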

FIG. 3 is a diagram illustrating the components of a viewer group analysis unit according to an exemplary embodiment of the present invention.

The viewer group analysis unit 120 according to an exemplary embodiment may generate viewer groups based on the social network and the viewing-situation information, and then extract information about each group. The viewer group analysis unit 120 may include a social network generation unit 311, a proximity network generation unit 312, a network integration unit 313, a viewer group extraction unit 314, and a group feature extraction unit 315.

The viewer group analysis unit 120 can receive the online information and viewing-situation information of viewers from the social network and viewing situation storage device 170. The social network generation unit 311 can generate an online network of viewers based on their online information; this online network is called a social network.

The proximity network generation unit 312 may generate a proximity network from the viewing situation information. Specifically, the proximity network can be generated on the basis of proximity calculated through information such as the viewer's position, age, sex, viewing device, and the like.

Thereafter, the network integration unit 313 can integrate the social network and the proximity network into one network. Some of the viewers belong to both networks at the same time, and the network integration unit 313 integrates the two networks using these shared viewers.
For example, the social network N consists of {Vn, En}, where Vn is the set of nodes and En the set of edges representing relationships between nodes. The proximity network P consists of {Vp, Ep}, with nodes Vp and edges Ep. Unlike N, whose relationships are explicit, Ep is generated through a proximity function Dp(). Here |Vn ∩ Vp| > 0. The network integration unit 313 combines nearby viewers, anchored on the users belonging to both N and P, to generate a fused social network, also referred to as the integrated network.
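A minimal sketch of the fusion step: explicit social edges En are kept, and proximity edges Ep are derived by applying a proximity function over viewing-situation attributes. The proximity function and its threshold are placeholders for Dp(), whose exact form the text leaves open.

```python
def fuse_networks(social_edges, situations, proximity, threshold=0.5):
    """social_edges: explicit edges En of the social network N.
    situations: viewer id -> viewing-situation attributes (Vp).
    proximity: stand-in for Dp(); returns a score in [0, 1]."""
    fused = set(social_edges)
    ids = sorted(situations)
    # Derive proximity edges Ep: connect viewers whose viewing
    # situations are close enough.
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if proximity(situations[a], situations[b]) >= threshold:
                fused.add((a, b))
    return fused
```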

The viewer group extraction unit 314 can extract viewer groups from the integrated network; specifically, it separates the fused social network into k subgraphs. The group feature extraction unit 315 can then extract the features and feature values common to each extracted viewer group. For example, the group feature extraction unit 315 may find that "age" is a feature appearing commonly in a viewer group, with the common value "20s".
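The group-extraction and common-feature steps can be sketched as below. Connected components stand in for the k-subgraph separation (a real system would likely use community detection), and the profile attributes are invented for illustration.

```python
def extract_groups(edges, nodes):
    """Split the fused network into subgraphs via union-find
    connected components (a simple stand-in for k subgraphs)."""
    parent = {n: n for n in nodes}
    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n
    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())

def common_features(group, profiles):
    """Attributes shared by every member of a group, e.g. the
    common value '20s' for the feature 'age_band'."""
    members = [profiles[m] for m in group]
    return {k: v for k, v in members[0].items()
            if all(p.get(k) == v for p in members[1:])}
```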

FIG. 4 is a diagram illustrating the components of a multi-lateral topic generation unit according to an exemplary embodiment of the present invention.

The multi-lateral topic generator 130 according to an exemplary embodiment may calculate multi-lateral topics and weights for each viewer group on the basis of the information on the viewer group and the unstructured data-based topic. As described above, the multi-lateral topic generation unit 130 can combine the viewer group and the feature information acquired from the viewer group analysis unit 120 with the topic model.

The multiple side topic generation unit 130 may include a relevance analysis unit 411, a weight calculation unit 412, and a topic re-learning unit 413. Here, the multiple facets may also be referred to as multiple domains.

The relevance analysis unit 411 can analyze the association between the atypical data-based topic and the characteristics of the viewer group. The weight calculation unit 412 may calculate a weight for each viewer group corresponding to each atypical data-based topic based on the association, and reflect the weight to the topic model. Specifically, the weight calculator 412 may calculate a topic weight by calculating a weight for each viewer group in units of keywords, and then integrating the weights for each viewer group. As described above, the multiple side topic generator 130 can generate a connection between a viewer group and a topic.
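The weight-integration step described above (keyword-level weights per viewer group aggregated into a topic weight) can be sketched as follows. The affinity function is a hypothetical stand-in for the association computed by the relevance analysis unit 411, and averaging is one simple choice of integration.

```python
def topic_weights(topics, keyword_affinity, group_ids):
    """topics: topic id -> set of keywords.
    keyword_affinity(group, keyword) -> score (assumed signature).
    Returns a weight per (topic, viewer group) pair by averaging
    the keyword-level weights, as one possible integration."""
    weights = {}
    for tid, kws in topics.items():
        for g in group_ids:
            weights[(tid, g)] = sum(keyword_affinity(g, k) for k in kws) / len(kws)
    return weights
```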

According to one embodiment, the topic re-learning unit 413 may remove a topic having low connectivity with a viewer group, allocate a keyword belonging to the topic to an existing topic, or assign it to a new topic.

FIG. 5 is a diagram illustrating the components of a viewer-based and topic-based scene unit tagging unit according to an embodiment.
According to one embodiment, the tagging unit 150 may also be referred to as a viewer- and topic-based scene unit tagging unit. The tagging unit may include a scene-viewer group relevance measurement unit 511, a scene-keyword relevance measurement unit 512, a scene-topic relevance measurement unit 513, and a scene-unit multi-lateral topic tagging unit 514.
The scene-viewer group relevance measurement unit 511 can measure the relevance between a scene and a viewer group; specifically, it measures the relevance between a scene and the viewer groups combined with topics in the multi-lateral topics. Given a specific scene of the broadcast content, it can measure how related a viewer group is to the scene from the group's characteristics, such as how much group members watched the corresponding section and their activity in online communities.
The scene-keyword relevance measurement unit 512 may measure the relevance between scenes and keywords. Specifically, it determines how relevant each keyword constituting a specific topic is to the scene; the relevance can be measured from keywords associated with the timeline, using time information appearing in closed captions or tag information that includes the time axis.
The scene-topic relevance measurement unit 513 can measure the relevance between scenes and topics by combining the scene-keyword relevances.
The scene-unit multi-lateral topic tagging unit 514 may tag multi-lateral topics in scene units by combining the scene-viewer group and scene-topic relevances.
The tagging unit 150 may store the scene-unit multi-lateral topic tagging information in the multi-lateral topic-based scene unit metadata storage 160 of FIG. 1. The stored contents may include viewer characteristics, topic information, topic weights per viewer, scene segmentation information, and topic weights per scene-unit viewer group. The storage format, such as JSON or XML, can be freely selected according to the implementation.
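The combination of scene-keyword, scene-topic, and scene-viewer-group relevances into per-scene tags can be sketched as below. Every scoring function and the multiplicative combination are illustrative assumptions; the disclosure does not fix a formula.

```python
def tag_scenes(scenes, topics, kw_relevance, group_relevance, weights, top_n=2):
    """Tag each scene with its top-scoring multi-lateral topics.
    Scene-topic relevance = mean of keyword relevances (units
    512 -> 513), combined with the scene-viewer-group relevance
    (unit 511) and the per-group topic weight."""
    tagged = {}
    for s in scenes:
        scored = []
        for tid, kws in topics.items():
            topic_rel = sum(kw_relevance(s, k) for k in kws) / len(kws)
            # Combine over viewer groups using the learned weights.
            score = max(topic_rel * group_relevance(s, g) * w
                        for (t, g), w in weights.items() if t == tid)
            scored.append((score, tid))
        tagged[s] = [tid for _, tid in sorted(scored, reverse=True)[:top_n]]
    return tagged
```

The returned mapping corresponds to the scene-unit metadata that would be serialized (e.g. to JSON) into the storage 160.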

FIG. 6 is a flowchart of a topic tagging method according to one embodiment.
In step 610, the topic tagging device 100 may generate a topic model that includes atypical data based topics based on content and unstructured data. In step 620, the topic tagging device 100 may analyze the characteristics of the viewer group including the viewer based on the social network of the viewer of the contents and the viewing condition information of the viewer. At step 630, the topic tagging device 100 may generate multiple side topics based on the topic model and characteristics of the viewer group. In step 640, the topic tagging device 100 may split the content into a plurality of scenes. At step 650, the topic tagging device 100 may tag multiple side topics in the scene.

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications on that operating system, and may access, store, manipulate, process, and generate data in response to the execution of the software. For ease of understanding, the processing device is sometimes described as a single device, but those skilled in the art will recognize that it may comprise a plurality of processing elements and/or a plurality of types of processing elements; for example, the processing device may comprise a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or command the processing device independently or collectively. The software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to it. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the scope of the invention. For example, suitable results may be achieved even if the described techniques are performed in an order different from the described method, and/or even if components of the described systems, structures, devices, or circuits are combined or coupled in a form different from the described method, or are replaced or substituted by other components or their equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.

100: Topic tagging device
110: Unstructured data-based topic generation unit
120: Viewer group analysis unit
130: Multi-lateral topic generation unit
140: Content division unit
150: Tagging unit
160: Multi-lateral topic-based scene-unit metadata storage
170: Social network and viewing situation storage device
180: Content storage device
190: Content-related unstructured data storage

Claims (15)

1. A topic tagging apparatus for tagging a topic to content based on a viewing situation, the apparatus comprising:
an unstructured data-based topic generation unit for generating a topic model including an unstructured data-based topic, based on content and unstructured data;
a viewer group analysis unit for analyzing characteristics of a viewer group including a viewer of the content, based on a social network of the viewer and viewing situation information of the viewer;
a multi-lateral topic generation unit for generating a multi-lateral topic based on the topic model and the characteristics of the viewer group;
a content division unit for dividing the content into a plurality of scenes; and
a tagging unit for tagging the multi-lateral topic to the scenes,
wherein the unstructured data-based topic generation unit comprises:
a content-related unstructured data collection unit for collecting content-related unstructured data associated with the content;
a keyword extraction unit for extracting a first keyword and a second keyword from the content-related unstructured data; and
a topic model generation unit for generating the unstructured data-based topic for the content using the first keyword and the second keyword, and generating the topic model based on the unstructured data-based topic,
wherein the second keyword is determined from the first keyword based on a frequency of the first keyword,
wherein the unstructured data-based topic generation unit further comprises:
an external unstructured data analysis unit for extracting a third keyword from external unstructured data; and
a model extension unit for expanding the topic model based on the third keyword,
wherein the model extension unit determines an association of the third keyword with the first keyword or the second keyword, expands the generated unstructured data-based topic if the association is high, and generates a new topic different from the generated topic if the association is low.
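The model extension recited in claim 1 can be illustrated with a short sketch. Nothing below appears in the patent itself: the Jaccard overlap used as the association measure, the 0.3 threshold, and all function and variable names are hypothetical choices made only for illustration of the "expand if association is high, otherwise create a new topic" decision.

```python
# Illustrative sketch of the claimed model extension step: a third keyword
# (taken from external unstructured data) either expands an existing topic
# or seeds a new one, depending on its association with known keywords.
# The Jaccard association measure and the 0.3 threshold are hypothetical.

def keyword_association(contexts_a, contexts_b):
    """Association between two keywords via Jaccard overlap of the
    documents (contexts) in which they occur."""
    a, b = set(contexts_a), set(contexts_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def extend_topic_model(topics, third_keyword, contexts, threshold=0.3):
    """topics: {topic_id: {keyword: [context ids]}}.
    Expands the best-matching topic if the association is high,
    otherwise creates a new topic distinct from the generated ones."""
    best_topic, best_score = None, 0.0
    for topic_id, keywords in topics.items():
        for ctx in keywords.values():
            score = keyword_association(contexts, ctx)
            if score > best_score:
                best_topic, best_score = topic_id, score
    if best_topic is not None and best_score >= threshold:
        topics[best_topic][third_keyword] = contexts            # expand existing topic
    else:
        topics[f"new:{third_keyword}"] = {third_keyword: contexts}  # new topic
    return topics
```

Any corpus-level association statistic (co-occurrence counts, embedding similarity) could replace the Jaccard overlap without changing the shape of the decision.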



delete

delete

4. The topic tagging apparatus according to claim 1, wherein the viewer group analysis unit comprises:
a social network generation unit for generating the social network based on online information of the viewer;
a proximity network generation unit for generating a proximity network from the viewing situation information;
a network integration unit for integrating the social network and the proximity network; and
a group feature extraction unit for extracting common characteristics of the viewer group based on the integrated network.
5. The topic tagging apparatus according to claim 4, further comprising a viewer group extraction unit for extracting the viewer group from the integrated network.
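The network integration and viewer-group extraction of claims 4 and 5 can be sketched minimally. The patent does not specify how the networks are merged or how groups are found; the edge-set union and the use of connected components below, like all names in the code, are hypothetical simplifications.

```python
# Hypothetical sketch of the claimed network integration and viewer-group
# extraction: the online social network and the physical proximity network
# are merged edge-wise, and connected components of the merged graph are
# taken as viewer groups. Plain union of edge sets is one simple choice.

def integrate_networks(social_edges, proximity_edges):
    """Each network is a set of frozenset({a, b}) edges; integration here
    is the union of the two edge sets."""
    return set(social_edges) | set(proximity_edges)

def extract_viewer_groups(edges):
    """Connected components of the integrated network, via union-find."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for edge in edges:
        a, b = tuple(edge)
        parent[find(a)] = find(b)        # union the two components
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(groups.values(), key=lambda g: sorted(g))
```

A weighted merge (e.g. keeping edge weights from both networks) would fit the same interface; only `integrate_networks` would change.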
6. The topic tagging apparatus according to claim 1, wherein the multi-lateral topic generation unit comprises:
a relevance analysis unit for analyzing an association between the unstructured data-based topic and the characteristics of the viewer group; and
a weight calculation unit for calculating a weight for each viewer group corresponding to each unstructured data-based topic based on the association, and reflecting the weight in the topic model.
7. The topic tagging apparatus according to claim 6, wherein the multi-lateral topic generation unit further comprises a topic model re-learning unit that changes the topic model based on the association.
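The per-viewer-group weighting of claim 6 admits a small sketch. The patent does not define the association measure; treating both a topic and a group's common characteristics as keyword sets, and scoring by keyword overlap, is an assumption made here purely for illustration, as are all names.

```python
# Hypothetical sketch of the claimed weight calculation: each unstructured
# data-based topic receives a weight per viewer group, derived from the
# association between the topic's keywords and the group's common
# characteristics (both modeled here as plain keyword collections).

def group_topic_weight(topic_keywords, group_features):
    """Fraction of the topic's keywords shared with the group's features."""
    topic, feats = set(topic_keywords), set(group_features)
    return len(topic & feats) / len(topic) if topic else 0.0

def weight_topics(topic_model, viewer_groups):
    """topic_model: {topic_id: [keywords]};
    viewer_groups: {group_id: [features]}.
    Returns {topic_id: {group_id: weight}}, to be reflected into the
    topic model (e.g. when re-learning it, as in claim 7)."""
    return {
        topic_id: {
            group_id: group_topic_weight(kws, feats)
            for group_id, feats in viewer_groups.items()
        }
        for topic_id, kws in topic_model.items()
    }
```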
8. The topic tagging apparatus according to claim 1, wherein the tagging unit analyzes a relevance of the viewer group to the scene and a relevance of the multi-lateral topic to the scene, and tags the multi-lateral topic to the scene based on the relevance of the viewer group to the scene and the relevance of the multi-lateral topic to the scene.
9. The topic tagging apparatus according to claim 8, wherein the relevance of the multi-lateral topic to the scene is analyzed based on a relevance of the first keyword to the scene, and the first keyword is extracted from the content-related unstructured data associated with the content.
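The scene tagging of claims 8 and 9 can be illustrated as combining two relevance signals. The keyword-overlap score, the multiplicative combination, and the 0.25 threshold below are hypothetical; the patent only requires that both the viewer group's and the topic's relevance to the scene inform the tag.

```python
# Illustrative sketch of the claimed scene tagging: a multi-lateral topic
# is tagged to a scene when the topic's relevance to the scene (keyword
# overlap with the scene's text, e.g. subtitles) combined with the viewer
# group's relevance clears a threshold. Scoring and threshold are hypothetical.

def scene_relevance(scene_terms, topic_keywords):
    """Share of the topic's keywords that appear among the scene's terms."""
    kws = set(topic_keywords)
    return len(kws & set(scene_terms)) / len(kws) if kws else 0.0

def tag_scenes(scenes, topics, group_weights, min_score=0.25):
    """scenes: {scene_id: [terms]}; topics: {topic_id: [keywords]};
    group_weights: {topic_id: weight of the current viewer group}.
    Tags each scene with the topics whose combined score clears min_score."""
    tags = {}
    for scene_id, terms in scenes.items():
        tags[scene_id] = [
            topic_id
            for topic_id, kws in topics.items()
            if scene_relevance(terms, kws) * group_weights.get(topic_id, 0.0)
               >= min_score
        ]
    return tags
```

The per-topic `group_weights` here stand in for the output of the weight calculation unit of claim 6.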


delete

delete

delete

13. A topic tagging method for tagging a topic to content based on a viewing situation, the method comprising:
generating a topic model including an unstructured data-based topic, based on content and unstructured data;
analyzing characteristics of a viewer group including a viewer of the content, based on a social network of the viewer and viewing situation information of the viewer;
generating a multi-lateral topic based on the topic model and the characteristics of the viewer group;
dividing the content into a plurality of scenes; and
tagging the multi-lateral topic to the scenes,
wherein generating the topic model comprises:
collecting content-related unstructured data associated with the content;
extracting a first keyword and a second keyword from the content-related unstructured data; and
generating the unstructured data-based topic for the content using the first keyword and the second keyword, and generating the topic model based on the unstructured data-based topic,
wherein generating the topic model further comprises:
extracting a third keyword from external unstructured data; and
expanding the topic model based on the third keyword,
wherein expanding the topic model comprises determining an association of the third keyword with the first keyword or the second keyword, expanding the generated unstructured data-based topic if the association is high, and generating a new topic different from the generated topic if the association is low.
delete

delete
KR1020160009774A 2015-09-01 2016-01-27 Apparatus and method for tagging topic to contents KR101924642B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/253,233 US10372742B2 (en) 2015-09-01 2016-08-31 Apparatus and method for tagging topic to content

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR20150123717 2015-09-01
KR1020150123717 2015-09-01

Publications (2)

Publication Number Publication Date
KR20170027253A KR20170027253A (en) 2017-03-09
KR101924642B1 true KR101924642B1 (en) 2019-02-27

Family

ID=58402910

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020160009774A KR101924642B1 (en) 2015-09-01 2016-01-27 Apparatus and method for tagging topic to contents

Country Status (1)

Country Link
KR (1) KR101924642B1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102024933B1 (en) 2017-01-26 2019-09-24 한국전자통신연구원 apparatus and method for tracking image content context trend using dynamically generated metadata

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010288024A (en) * 2009-06-10 2010-12-24 Univ Of Electro-Communications Moving picture recommendation apparatus
JP2012155695A (en) * 2011-01-07 2012-08-16 Kddi Corp Program for imparting keyword tag to scene of interest in motion picture contents, terminal, server, and method
JP2014006844A (en) * 2012-06-27 2014-01-16 Sony Corp Video recording apparatus, information processing method, and recording medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101997224B1 (en) * 2012-11-01 2019-07-05 주식회사 케이티 Apparatus for generating metadata based on video scene and method thereof

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010288024A (en) * 2009-06-10 2010-12-24 Univ Of Electro-Communications Moving picture recommendation apparatus
JP2012155695A (en) * 2011-01-07 2012-08-16 Kddi Corp Program for imparting keyword tag to scene of interest in motion picture contents, terminal, server, and method
JP2014006844A (en) * 2012-06-27 2014-01-16 Sony Corp Video recording apparatus, information processing method, and recording medium

Also Published As

Publication number Publication date
KR20170027253A (en) 2017-03-09

Similar Documents

Publication Publication Date Title
AU2011326430B2 (en) Learning tags for video annotation using latent subtags
CN109460512B (en) Recommendation information processing method, device, equipment and storage medium
KR102068790B1 (en) Estimating and displaying social interest in time-based media
US20180007409A1 (en) Video recommending method, server, and storage media
Wu et al. Crowdsourced time-sync video tagging using temporal and personalized topic modeling
US11574145B2 (en) Cross-modal weak supervision for media classification
Tran et al. Exploiting character networks for movie summarization
CN110287375B (en) Method and device for determining video tag and server
KR20190063352A (en) Apparatus and method for clip connection of image contents by similarity analysis between clips
US20210126945A1 (en) Illegal content search device, illegal content search method, and program
KR101924642B1 (en) Apparatus and method for tagging topic to contents
US10372742B2 (en) Apparatus and method for tagging topic to content
KR101780412B1 (en) Apparatus for extracting scene keywords from video contents and keyword weighting factor calculation apparatus
US11947635B2 (en) Illegal content search device, illegal content search method, and program
JP6499763B2 (en) Method and apparatus for verifying video information
JP2018500696A5 (en)
Jung Discovering social bursts by using link analytics on large-scale social networks
Do et al. Movie indexing and summarization using social network techniques
Koźbiał et al. Collection, analysis and summarization of video content
US20210011982A1 (en) Illegal content search device, illegal content search method, and program
YM et al. Analysis on Exposition of Speech Type Video Using SSD and CNN Techniques for Face Detection
US20210026930A1 (en) Illegal content search device, illegal content search method, and program
US20190272297A1 (en) Native object identification method and apparatus
KR102381132B1 (en) Method for providing session replay service using session information storage and rendering
Diallo et al. Gradual network sparsification and georeferencing for location-aware event detection in microblogging services

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant