KR101602898B1

KR101602898B1 - Data visualization method and system using comment data for objects

Info

Publication number: KR101602898B1
Application number: KR1020140154168A
Authority: KR
Inventors: 이경원; 김기남; 하효지
Original assignee: 아주대학교산학협력단
Priority date: 2014-11-07
Filing date: 2014-11-07
Publication date: 2016-03-11
Also published as: WO2016072769A3; WO2016072769A2

Abstract

The present invention relates to a visualization method, and a system therefor, that uses expression elements collected from comment data about an object. For example, when the object is a piece of content, the invention visualizes the expression elements that appear in comments in which consumers of that content express their emotions or opinions about it. In addition to the objective information that existing object information provides, such as the manufacturer and price, the invention analyzes the expression elements through which users express the emotions or opinions they had while using the object. It can therefore provide information that serves as a basis for selecting an object to a user who intends to use the object for the first time.

Description

The present invention relates to a data visualization method and system using comment data of an object, and more particularly to a technique for visualizing elements that express the emotions or opinions of a user or consumer.

The present invention was derived from research carried out as part of the Humanities and Social Sciences Basic Research Program of the Ministry of Education and the National Research Foundation of Korea [Project management number: S-2013-A0403-00010, Project title: Visualization of a movie recommendation system using a situation-specific emotion vocabulary distribution map].

In general, users who have consumed content such as movies, music, or literary works, or who have used a product or service, express the emotions or opinions arising from that content, product, or service (hereinafter the "object") in the form of comments (reviews). Users who have not yet used the object, or who want information about it, obtain that information by consulting the comments (reviews) left by users who have already used it.

Users want to obtain information about an object, but because comment data about an object is text-based, it takes a user considerable time to obtain that information from the comment data. In particular, when the volume of comment data is large, when many users have left comments, or when comment data has accumulated over a long period, merely reading through the comments demands considerable effort.

To address this problem, research has been conducted on techniques that search for comments or objects based on the vocabulary in comment data, thereby shortening the time users spend searching for comments and objects.

One example of a method of searching for content using comment information about the content is described in Korean Patent No. 10-0917784, "Method and system for retrieving collective sensibility information based on comments on content."

That prior art aims to provide a search method and system that collects comments attached to various content on the Internet, builds a search database (hereinafter "DB"), and uses this search DB to present objective and reliable ranking results for emotional queries. In particular, for a query containing emotional words, it adjusts the recommendation priority of objects to reflect how frequently those emotional words appear in the comments.

However, although the above prior art describes a technique for retrieving emotional words from comments about an object, it does not effectively present the overall emotions or opinions a user can expect of the object, and it has limitations such as preferentially recommending objects that have many comments.

These limitations arise because the above prior art adopts text-based emotional word search. A technique is therefore needed that can effectively show the overall emotions or opinions expected of a single object (content, product, or service).

Korean Patent No. 10-0917784 (registered September 10, 2009)

The present invention has been devised to solve the above problems of the prior art. Its object is to visualize the elements expressing users' emotions or opinions that appear in existing user comments about an object (content, product, or service), and thereby to analyze not only the objective information that existing object information provides, such as the manufacturer and price, but also the emotions or opinions users express after using the object, so as to provide a user who intends to use the object for the first time with information that can serve as a basis for selecting it.

Another object of the present invention is to provide a method and system that intuitively visualize the overall distribution of the emotions or opinions expressed about an object, by visualizing a plurality of elements expressing emotions or opinions about that object on the basis of semantic distance.

Although the present invention may visualize a single representative emotion or opinion expressed about an object, a further object is to provide a means of intuitively recognizing the relative distances between, and the distribution of, the plurality of expression elements expressed about the object, by visualizing those expression elements on the basis of relative semantic distance.

A further object of the present invention is to provide a means of visualizing a semantic-distance-based distribution that reflects not only text but also the various non-verbal elements capable of expressing emotions or opinions, such as emoticons and icons. Because the invention is freed from the constraint of text, it can also provide a means of visualizing, within a single frame, opinions or emotions expressed in various foreign languages.

A further object of the present invention is to calculate, from comment data obtainable through various channels (for example, comment data collected from a website), the frequency of each expression element expressing an emotion or opinion, and to present the result as a visual graph that is easy to understand.

To achieve the above objects, a method of visualizing expression elements according to an embodiment of the present invention includes extracting a plurality of expression elements from comment data collected for an object selected by a user, and visualizing the extracted expression elements based on a distribution derived from the semantic distances between them.

The method may further include measuring the frequency with which each extracted expression element is extracted from the comment data, in which case the visualizing step visualizes the extracted expression elements according to their measured frequencies.

After the extracting step, the method may further include comparing the extracted expression elements with previously extracted expression elements, and checking whether a new expression element has been added among the extracted expression elements.

When a new expression element has been added among the extracted expression elements, the visualizing step includes determining, among the previously extracted expression elements, one or more neighboring expression elements whose semantic distance from the new expression element is within a given threshold, and determining the semantic position of the new expression element based on its semantic distances from the determined neighboring expression elements.
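The placement of a new expression element relative to its neighbors can be sketched as follows. This is a minimal illustration under stated assumptions, not the method fixed by the patent: the function name `place_new_element`, the inverse-distance weighting rule, and the sample coordinates are all hypothetical, since the text only says that the position is determined from the semantic distances to neighbors within a threshold.

```python
def place_new_element(new_distances, positions, max_distance):
    """Return an (x, y) position for a new expression element, or None.

    new_distances: {existing_element: semantic distance to the new element}
    positions:     {existing_element: (x, y) on the existing map}
    max_distance:  neighbours farther away than this are ignored
    """
    # Step 1: keep only neighbours within the distance threshold.
    neighbours = {
        name: d for name, d in new_distances.items()
        if d <= max_distance and name in positions
    }
    if not neighbours:
        return None  # no sufficiently close element to anchor on

    # A distance of zero means the new element coincides with a neighbour.
    for name, d in neighbours.items():
        if d == 0:
            return positions[name]

    # Step 2 (assumed rule): inverse-distance weighted centroid, so that
    # semantically closer neighbours pull the new element harder.
    weights = {name: 1.0 / d for name, d in neighbours.items()}
    total = sum(weights.values())
    x = sum(w * positions[name][0] for name, w in weights.items()) / total
    y = sum(w * positions[name][1] for name, w in weights.items()) / total
    return (x, y)
```

With two equally distant neighbors the new element lands midway between them; neighbors beyond `max_distance` have no influence.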

After the extracting step, the method may also include determining the validity of each extracted expression element and, if an extracted expression element is not valid for the object selected by the user, removing the invalid expression element. The method may further include, after the extracting step, measuring the frequency with which each extracted expression element is extracted from the comment data, in which case the validity of an extracted expression element may be determined by taking its measured frequency into account.

After the extracting step, the method may further include identifying the frequency with which an extracted expression element is extracted from objects other than the object selected by the user, and determining whether the extracted expression element has been extracted at or above a given frequency in at least a given number of those other objects. The method may further include, after the extracting step, measuring the frequency with which the expression element was extracted, and adjusting the measured frequency by applying a weight to it according to the measured frequency, in which case the visualizing step visualizes the expression elements using the adjusted frequency.

After the extracting step, the method may further include measuring the frequency with which the expression element was extracted from the comment data, comparing the frequency with which the expression element appears in the object selected by the user against the measured frequency, and adjusting the measured frequency by applying a weight to it according to the result of that comparison.

The extracting step may include searching a database in which standardized expression elements are stored in advance to determine whether the extracted expression element is stored there and, if it is not, identifying the standardized expression element in the database that is closest in semantic distance to the extracted expression element as its representative expression element. In that case, the frequency-measuring step adds the frequency with which the extracted expression element was extracted from the comment data to the frequency with which the identified representative expression element was extracted from the comment data, and the visualizing step visualizes the representative expression element using the summed frequency.
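The mapping of non-standard expression elements onto representative elements, with their frequencies summed, can be sketched as below. This is only an illustrative sketch: the function `merge_into_representatives`, the `distance` callback, and the toy vocabulary and distances are hypothetical, because the text does not specify how semantic distance is computed.

```python
from collections import Counter


def merge_into_representatives(raw_counts, standard_elements, distance):
    """Fold each extracted element's count into its representative element.

    raw_counts:        {extracted_element: frequency in the comment data}
    standard_elements: list of standardized elements stored in the database
    distance:          function (element, standard_element) -> semantic distance
    """
    merged = Counter()
    for element, count in raw_counts.items():
        if element in standard_elements:
            representative = element  # already a standardized element
        else:
            # Nearest standardized element becomes the representative.
            representative = min(
                standard_elements, key=lambda s: distance(element, s)
            )
        merged[representative] += count
    return merged


# Toy data (hypothetical): "glad" is not in the database and is closest to "happy".
STANDARD = ["happy", "sad"]
TOY_DISTANCE = {("glad", "happy"): 0.2, ("glad", "sad"): 0.9}


def toy_distance(element, standard_element):
    return TOY_DISTANCE[(element, standard_element)]


merged = merge_into_representatives(
    {"happy": 3, "glad": 2, "sad": 1}, STANDARD, toy_distance
)
```

Here the two occurrences of "glad" are counted toward "happy", so only standardized elements remain to be drawn.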

The visualizing step may also visualize the expression elements against the background of a multi-dimensional scaling map (MDS map) that contains the expression elements.

A system for visualizing expression elements according to an embodiment of the present invention includes a storage device that stores comment data about an object selected by a user, an expression element extraction unit that extracts a plurality of expression elements from the stored comment data, and a visualization unit that visualizes the extracted expression elements based on a distribution derived from the semantic distances between them.

According to the present invention, the expression elements that people who have already used an object felt while using it can be checked through a visualization graph, so that these expression elements can be analyzed intuitively before the object is used. A person about to use the object can thus readily see what emotions people have about it, and a user choosing among objects can easily select the one they want.

In addition, the visualization graph generated by the present invention can be provided on a website as a script program, and can thus be served to many users simultaneously.

In addition, because the present invention can be provided through a web page in a browser without installing a separate program, users can receive analysis results in real time whenever the comment data is updated, without the developer having to carry out any new data management or distribution procedure.

In addition, when a government or public agency announces a policy or plan and people express their views on it over the Internet, the present invention makes it possible to check the public reaction to the policy intuitively.

In addition, public opinion about incidents arising within a company, or about the company arising externally, can be collected from the Internet and analyzed so that changes in public reaction can be tracked in real time, and a company can use this information to respond through its internal crisis-management protocol.

In addition, by visualizing a plurality of elements expressing emotions or opinions about an object on the basis of semantic distance, the overall distribution of the emotions or opinions expressed about the object can be presented to the user intuitively.

In addition, although a single representative emotion or opinion expressed about an object may be visualized, visualizing a plurality of expression elements on the basis of relative semantic distance makes it possible to recognize intuitively the relative distances between, and the distribution of, the expression elements expressed about the object.

In addition, a semantic-distance-based distribution can be visualized that reflects not only text but also the various non-verbal elements capable of expressing emotions or opinions, such as emoticons and icons. Because the invention is freed from the constraint of text, opinions or emotions expressed in various foreign languages can also be visualized within a single frame.

FIG. 1 shows the emotion vocabulary selected for building an emotion vocabulary distribution map according to an embodiment of the present invention.
FIG. 2 shows the maximum TF-IDF score of each emotion word shown in FIG. 1.
FIG. 3 shows the 36 emotion words finally selected from the emotion words shown in FIG. 1.
FIG. 4 shows an emotion vocabulary distribution map according to an embodiment of the present invention.
FIGS. 5 to 8 show, in heat-map form, expression elements extracted from the comment data of an object according to an embodiment of the present invention.
FIG. 9 is a flowchart of a method of visualizing expression elements according to an embodiment of the present invention.
FIG. 10 is a flowchart of a method of visualizing expression elements according to their measured frequencies according to an embodiment of the present invention.
FIG. 11 is a flowchart for checking whether new vocabulary has been added according to an embodiment of the present invention.
FIG. 12 is a flowchart for the case in which a new expression element is added according to an embodiment of the present invention.
FIG. 13 shows a process of determining the validity of an expression element according to an embodiment of the present invention.
FIG. 14 shows a process of determining the validity of an expression element based on its frequency according to an embodiment of the present invention.
FIG. 15 is a flowchart showing in detail the process of determining the validity of an expression element according to an embodiment of the present invention.
FIG. 16 is a flowchart of a method of adjusting the influence of an expression element when a particular expression element is concentrated, according to an embodiment of the present invention.
FIG. 17 is a flowchart of a method of applying a weight when the frequency with which a particular expression element actually appears in a particular object is low, according to an embodiment of the present invention.
FIG. 18 is a flowchart of a method of mapping an expression element to a pre-stored standard-form expression element and measuring its frequency according to an embodiment of the present invention.
FIG. 19 shows a system for visualizing expression elements according to an embodiment of the present invention.
FIG. 20 shows a system that checks for new expression elements and visualizes expression elements according to an embodiment of the present invention.
FIG. 21 shows a system that measures and adjusts the frequency of expression elements and visualizes them according to an embodiment of the present invention.
FIG. 22 shows the expression element extraction unit in detail according to an embodiment of the present invention.
FIGS. 23 to 27 show different visualization methods according to an embodiment of the present invention.
FIG. 28 shows the heat-map visualization method applied in three dimensions according to an embodiment of the present invention.
FIG. 29 shows the expression element visualization method rendered as contour lines according to an embodiment of the present invention.
FIG. 30 shows the contour map of FIG. 29 in three dimensions.
FIGS. 31 to 33 show ways of using the invention based on a semantic map according to an embodiment of the present invention.

Objects and features of the present invention other than those stated above will become apparent from the following description of embodiments with reference to the accompanying drawings.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. In describing the present invention, detailed descriptions of related known configurations or functions are omitted where they could obscure the gist of the invention.

However, the present invention is not restricted or limited by the embodiments. The same reference numerals in the drawings denote the same members.

FIG. 1 shows the emotion vocabulary selected for building an emotion vocabulary distribution map according to an embodiment of the present invention.

The present invention relates to a visualization method and system using expression elements collected from the comment data of an object. The object is something chosen by the user that involves human emotion, such as a movie, product, novel, game, or trip, and the emotions that appear in comments or reviews about such an object can be visualized.

In one embodiment of the present invention, the object is limited to movies, and the invention is described as a visualization method and system using comment data about movies.

Comment data for a movie may be data collected through a web service built by the user, or comment data accumulated on major portals and fan-community bulletin boards may be collected individually using a program.

In one embodiment of the present invention, a web crawler capable of collecting data may be used to automate the collection of emotion vocabulary, which carries users' emotions, from comment data about movies. The crawler collects the replies and comments on a particular movie from the movie pages of major portals (Naver, Daum, etc.) as unrefined data, processes the collected data into data usable for research, and analyzes the refined data to extract emotion vocabulary. The emotion vocabulary collected through the crawler can be linked to the situation in which a movie is watched, so that movies matching a user's motivation for watching can later be recommended.

To visualize the frequency of the emotion vocabulary appearing for a movie, the position of each emotion word on a two-dimensional plane must be specified. To this end, two-dimensional position coordinates can be derived from the correlations between emotion words. To build the emotion vocabulary distribution map, only the emotion words that can be felt when watching a movie were sorted out of 834 emotion terms, with reference to the study by Han Dukwoong and Kang Hyeja (2000) on the appropriateness and frequency of experience of Korean emotion terms. One expert holding a doctorate from the Department of Korean Language and Literature at Ajou University and two of the inventors of the present invention jointly chose only those emotion words on which they could reach agreement, selecting a final 100 emotion words.

In addition to the expert-based selection, a survey based on the 100 selected emotion words was conducted to choose the final emotion vocabulary, that is, the emotion words users feel most when watching movies. Thirty students of the Department of Media at Ajou University were given a brief explanation of the emotions that can be felt when watching a movie and were then asked, for each of the 100 emotion words obtained through the expert analysis, to what degree that emotion can be felt when watching a movie. The questionnaire began: 'Think of the stories of the movies of various genres you have seen so far, and answer to what degree you feel each of the following emotion words when watching such movies.' Each emotion word was rated on a 7-point Likert-type scale, where 1 means 'not related at all' and 7 means 'very related.'

In this study, expert analysis and a user survey were carried out to collect the emotion vocabulary most readily felt when watching a movie, in keeping with the aim of recommending movies based on the user's motivation for watching. To select the highly relevant emotion words from the respondents' 7-point Likert ratings, a mean analysis was performed and the 32 emotion words with relatively low means (at or below 4.00, the value meaning 'neutral') were additionally removed, leaving 68 emotion words suitable for movie recommendation.
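The mean-based screening described above can be sketched as follows. The function name `select_by_mean_rating` and the sample ratings are hypothetical and serve only to illustrate the rule that words whose mean rating is at or below 4.00 are removed.

```python
from statistics import mean


def select_by_mean_rating(ratings, threshold=4.0):
    """Keep the words whose mean 7-point Likert rating exceeds the threshold.

    ratings: {emotion_word: list of ratings from 1 to 7, one per respondent}
    """
    return [word for word, scores in ratings.items() if mean(scores) > threshold]


# Illustrative ratings only; these are not the survey data from the study.
sample_ratings = {
    "moved": [6, 7, 5],
    "indifferent": [4, 4, 4],   # mean exactly 4.00, so it is removed
    "irritated": [2, 3, 4],
}
selected = select_by_mean_rating(sample_ratings)
```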

FIG. 1 shows the 68 emotion words thus selected as suitable for movie recommendation.

FIG. 2 shows the maximum TF-IDF score of each emotion word shown in FIG. 1.

To further remove emotion words with negligible influence, the 68 emotion words described with reference to FIG. 1 were compared against actual movie data: the TF-IDF score of each emotion word appearing in the comments or reviews of movies was derived, and FIG. 2 shows the maximum TF-IDF score that each emotion word attains.

Here, TF (term frequency) is a value indicating how often a particular word appears in a document, DF (document frequency) is the number of documents in which the word appears, and the reciprocal of DF is called IDF (inverse document frequency).
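As an illustration of these definitions, the sketch below computes a TF-IDF score per emotion word and per movie. It is only a sketch under assumptions: the function name `tf_idf` and the sample data are hypothetical, TF is normalized by the number of emotion words found for the movie, and IDF uses the common logarithmic form log(N / DF) rather than the plain reciprocal mentioned in the text.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: {movie: list of emotion words found in its comments}.

    Returns {movie: {word: tf * log(N / df)}} where N is the number of movies.
    The logarithmic IDF is an assumption, not taken from the source.
    """
    n_docs = len(docs)
    df = Counter()
    for words in docs.values():
        df.update(set(words))  # count each word once per movie
    scores = {}
    for movie, words in docs.items():
        tf = Counter(words)
        total = len(words)
        scores[movie] = {
            w: (c / total) * math.log(n_docs / df[w]) for w, c in tf.items()
        }
    return scores

# Hypothetical data: "fun" occurs in every movie, so its score drops to zero.
example = tf_idf({"A": ["sad", "sad", "fun"], "B": ["fun", "scary"]})
```

A word that occurs for every movie gets a score of zero under this form, which is the intended effect of down-weighting words that do not discriminate between movies.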

FIG. 3 is a diagram showing the 36 emotion words finally selected from among the emotion words shown in FIG. 1.

FIG. 2 indicates that, among the emotion words for which TF-IDF scores were derived, 'astonished' (경악하다) accounted for no more than 0.8% of the TF-IDF score in every movie, whereas 'sweet' (달콤하다) accounted for as much as 42% of the TF-IDF score in at least one movie.

FIG. 3 shows the 36 emotion words finally selected after removing the emotion words whose TF-IDF score ratio was below 10%.
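The 10% screening can be sketched as below. The text does not define the "ratio" precisely, so this sketch assumes it is a word's share of the summed TF-IDF scores of one movie; the function names and the sample scores are hypothetical.

```python
def max_share_per_word(scores):
    """scores: {movie: {word: tf-idf}}. Returns each word's largest share
    of a movie's total TF-IDF score (assumed meaning of the ratio)."""
    best = {}
    for word_scores in scores.values():
        total = sum(word_scores.values())
        if total == 0:
            continue  # a movie with no scored words contributes nothing
        for word, score in word_scores.items():
            best[word] = max(best.get(word, 0.0), score / total)
    return best

def keep_influential(scores, min_share=0.10):
    """Keep words that reach at least `min_share` in at least one movie."""
    shares = max_share_per_word(scores)
    return sorted(w for w, share in shares.items() if share >= min_share)

# Hypothetical scores echoing the 42% and 0.8% figures in the text.
sample_scores = {
    "m1": {"sweet": 42.0, "astonished": 0.8, "sad": 57.2},
    "m2": {"astonished": 0.5, "sad": 99.5},
}
kept = keep_influential(sample_scores)
```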

FIG. 4 is a diagram showing an emotion word distribution map according to an embodiment of the present invention.

To derive the semantic distances among the 36 finally clustered emotion words shown in FIG. 3 and place them on a two-dimensional plane, the distances between similar and dissimilar emotion words may be measured, their correlations analyzed, and multi-dimensional scaling (MDS) then applied.

Here, multi-dimensional scaling is a statistical technique that computes the relative distances between items and represents them as relative distances on a plane that a person can perceive. In information visualization it serves as a background technique for measuring similarity and dissimilarity within data.

An advantage of multi-dimensional scaling is that a semantic map can be built for items for which only relative distances are known, and that the map can be based on psychological distance as well as physical distance.

For the multi-dimensional scaling analysis according to an embodiment of the present invention, a semantic distance survey on the 36 emotion words was conducted with 20 university students in their twenties (11 male, 9 female) from universities in Gyeonggi Province and Seoul. The questionnaire arranged the 36 emotion words along both the horizontal and vertical axes (68x68), and respondents rated each pair on a Likert scale, giving 3 points when the two words felt closest in meaning and -3 points when they felt farthest apart. The data recorded by the 20 respondents were analyzed with the UCINET program, which supports a variety of network analysis techniques, and FIG. 4 shows the resulting metric MDS of the semantic distances among the 36 movie emotion words, based on the 68 selected emotion words.
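Before MDS can be applied, the pairwise ratings have to be turned into a dissimilarity matrix. The sketch below shows one possible conversion. The text does not state the conversion actually used, so the rule `distance = 3 - mean rating` (0 for the closest pairs, 6 for the farthest) is an assumption, and the function name and sample responses are hypothetical.

```python
def ratings_to_distances(responses):
    """responses: one dict per respondent, {(word_a, word_b): rating in [-3, 3]}.

    Returns {(a, b): distance} keyed by alphabetically sorted pairs, with
    distance = 3 - mean rating (assumed rule, not from the source).
    """
    sums, counts = {}, {}
    for response in responses:
        for (a, b), rating in response.items():
            key = tuple(sorted((a, b)))  # treat (a, b) and (b, a) as one pair
            sums[key] = sums.get(key, 0) + rating
            counts[key] = counts.get(key, 0) + 1
    return {key: 3 - sums[key] / counts[key] for key in sums}

# Two hypothetical respondents.
distances = ratings_to_distances([
    {("happy", "sad"): -3, ("happy", "joyful"): 3},
    {("sad", "happy"): -1, ("happy", "joyful"): 2},
])
```

The resulting symmetric distances could then be passed to any metric MDS implementation to obtain two-dimensional coordinates.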

As a result, emotion words related to the representative words “Happy” and “Surprise” were distributed in the positive (+) direction of the X axis, and emotion words related to “Anger” and “Disgust” in the negative (-) direction of the X axis. Emotion words related to “Fear” and “Surprise” were distributed in the positive (+) direction of the Y axis, and emotion words related to “Sad” and “Boring” in the negative (-) direction of the Y axis.

Accordingly, by the nature of the emotion words, positive emotion words are distributed in the positive (+) direction of the X axis and negative emotion words in the negative (-) direction of the X axis.

Likewise, dynamic emotion words (emotions that may be accompanied by relatively large gestures) are distributed in the positive (+) direction of the Y axis, and static emotion words (emotions that may be accompanied by relatively small gestures) in the negative (-) direction of the Y axis.

The words related to the representative words ‘Happy’, ‘Sad’, ‘Anger’, ‘Fear’, ‘Disgust’, and ‘Boring’ each form a distinct cluster, whereas the words related to ‘Surprise’ are split between the ‘Happy’ cluster and the ‘Fear’ cluster. This can be interpreted as reflecting that, when users watch a movie, surprise arises predominantly either from overwhelming joy or from the sudden appearance of something frightening.

FIGS. 5 to 8 are diagrams showing, in heat-map form, expression elements extracted from comment data of an object according to an embodiment of the present invention.

To visualize the emotion words extracted from the movie comment data described with reference to FIGS. 1 to 4, the frequencies of the emotion words that make up the MDS map are needed. The comment data selected through the preceding process are compared with the emotion word dictionary to measure the frequency of each emotion word for each movie.

In addition, to lower the weight of particular words that appear frequently regardless of the character of the movie, TF-IDF scores are calculated to adjust the values. The visualization can then be produced using the TF-IDF score of each finally selected emotion word.

The final visualization graph may be drawn as a heat map of small square cells over the MDS map of the emotion words. All cells are initialized to 0, and the value of a cell increases according to the TF-IDF score of the emotion word located in that cell. A cell changes color as its value rises, so that high and low TF-IDF scores of the emotion words can be distinguished. A cell whose value has risen also affects the values of the surrounding cells, so that the graph takes on the appearance of a topographic map.
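The cell-update rule described above can be sketched as follows. This is a simplified illustration: the grid size, the `spread` factor applied to the eight neighboring cells, and the function name `build_heatmap` are assumptions, since the text does not specify how strongly a cell affects its surroundings.

```python
def build_heatmap(positions, scores, size=5, spread=0.5):
    """positions: {word: (row, col)} cell of each word on the MDS map.
    scores: {word: tf-idf score}.

    Every cell starts at 0. A word adds its full score to its own cell and
    `spread` times its score to the adjacent cells (assumed rule).
    """
    grid = [[0.0] * size for _ in range(size)]
    for word, score in scores.items():
        row, col = positions[word]
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                r, c = row + d_row, col + d_col
                if 0 <= r < size and 0 <= c < size:
                    weight = 1.0 if (d_row, d_col) == (0, 0) else spread
                    grid[r][c] += score * weight
    return grid

# One hypothetical word placed at the centre of a 5x5 grid.
grid = build_heatmap({"sad": (2, 2)}, {"sad": 4.0})
```

Mapping the cell values to a color scale then yields the topographic appearance described in the text.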

FIG. 5 is a graph visualizing the distribution of the emotion words appearing in viewers' comment data for the movie 'Snowpiercer' (설국열차). As shown in FIG. 5, viewers responded that the movie was fun and impressive, while feelings of regret and boredom also appear with high frequency. Indeed, the actual comments on 'Snowpiercer' include many reviews from viewers who were disappointed by the movie.

FIG. 6 visualizes the movie 'Paradise Murdered' (극락도 살인사건) in heat-map form. Among the emotions of viewers of this horror movie, the most prominent emotion word is 'surprised', and other fear-related emotion words also appear with high frequency.

FIG. 7 visualizes the movie 'Don't Cry Mommy' (돈 크라이 마미) in heat-map form. For this movie, which was based on an actual criminal case, the viewers' emotion words are concentrated on 'angry' and 'enraged'.

FIG. 8 visualizes the movie 'Old Partner' (워낭소리) in heat-map form. For 'Old Partner', the viewers' emotions appear with high frequency at 'sad' and 'moving'.

These examples show that comment data collected from comments written after watching a movie exhibit emotion word patterns that match the genre and story characteristics of the movie.

Although a method of extracting emotion words from movie comment data and visualizing them has been described as an embodiment of the present invention, the invention may be applied not only to movies but also to the cognitive activities (such as thoughts, intentions, evaluations, opinions, arguments, and rebuttals) and the affective responses (such as feelings, emotions, desires, and attitudes) that people hold or reveal in human and social relationships.

The subjects to which the present invention can be applied include, in the affective domain, human feelings, emotions, desires, and attitudes, and, in the cognitive domain, thoughts, intentions, evaluations, opinions, arguments, and rebuttals. In the relational domain, they include cultural content, human relationships (communication, conflict), social relationships (multiculturalism, Homo hundred, etc.), and relationships with technology (cultural lag, etc.).

FIG. 9 is a flowchart of a method of visualizing expression elements according to an embodiment of the present invention.

In the method of visualizing expression elements, a plurality of expression elements are extracted from comment data collected for an object selected by a user (S910), and the extracted expression elements are visualized on the basis of a distribution based on the semantic distances among them (S920).

Here, the comment data for an object means any comment data containing people's feelings, such as movie reviews, product reviews, novel reviews, game reviews, travel reviews, and evaluations of services.

The expression elements include words, paragraphs, emoticons, and the like that express people's emotions and are extracted from the comment data.

The visualization based on the distribution of semantic distances among the expression elements may take the form of a heat map, contour lines, or the like, built on the multi-dimensional scaling map (MDS map) described with reference to FIGS. 1 to 4.

FIG. 10 is a flowchart of a method of visualizing expression elements according to their measured frequencies, according to an embodiment of the present invention.

In this method, a plurality of expression elements are extracted from comment data collected for an object selected by a user (S910), the frequency with which each extracted expression element occurs in the comment data is measured (S930), and the extracted expression elements may be visualized on the basis of a distribution based on the semantic distances among them (S920). The extracted expression elements may be visualized according to their measured frequencies as a heat map, contour lines, or the like, over a multi-dimensional scaling map (MDS map) containing the expression elements.

In addition, when the expression elements are extracted and their frequencies measured, an expression element that is not in standard form may be mapped to the standard-form expression element stored in a dictionary, and the frequency in each object's comment data may be measured with respect to the mapped standard-form expression element.

FIG. 11 is a flowchart of checking whether a new word has been added, according to an embodiment of the present invention.

In the method of visualizing expression elements shown in FIG. 9, a plurality of expression elements are extracted from comment data collected for an object selected by a user (S910), and the extracted expression elements are compared with previously extracted expression elements (S1110). It is then checked whether a new expression element has been added among the extracted expression elements (S1120), and the extracted expression elements may be visualized on the basis of a distribution based on the semantic distances among them (S920). The meaning of a new expression element may be determined through a technique such as context-based analysis.

FIG. 12 is a flowchart of the case where a new expression element is added, according to an embodiment of the present invention.

Following FIG. 11, it is checked whether a new expression element has been added (S921). If a new expression element has been added, one or more adjacent expression elements whose semantic distance from the new expression element is within a given criterion are determined from among the previously extracted expression elements (S922).

The criterion may be the N previously extracted expression elements closest to the new expression element, or the previously extracted expression elements whose semantic distance from the new expression element is within r.
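The two criteria for choosing adjacent expression elements can be sketched as follows. The function names and the sample distances are hypothetical; the input is assumed to be the semantic distance from the new expression element to each previously extracted one.

```python
def nearest_n(distances, n):
    """Return the n existing elements closest to the new element.

    distances: {existing_element: semantic distance to the new element}
    """
    ranked = sorted(distances.items(), key=lambda item: item[1])
    return [element for element, _ in ranked[:n]]

def within_radius(distances, r):
    """Return the existing elements whose distance to the new element is <= r."""
    return sorted(e for e, d in distances.items() if d <= r)

# Hypothetical distances from a new word to three existing words.
sample = {"sad": 0.4, "gloomy": 0.2, "happy": 2.5}
by_count = nearest_n(sample, 2)
by_radius = within_radius(sample, 0.5)
```

The count-based rule always yields N neighbors, while the radius-based rule may yield none when the new element is far from everything already on the map.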

The semantic position of the new expression element is then determined on the basis of its semantic distances from the one or more determined adjacent expression elements (S923), and the new expression element is visualized at the determined position (S924).

The position may be determined with weighting such that the more similar the meaning of the new expression element is to that of an adjacent expression element, the shorter the semantic distance between them. That is, when a first adjacent expression element is closer in meaning to the new expression element than a second adjacent expression element is, the position of the new expression element may be determined so that its distance to the first adjacent expression element is shorter than its distance to the second adjacent expression element. The semantic similarity between expression elements may be obtained through context-based analysis, or by various other methods such as a survey of multiple respondents.
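One simple way to realize such a weighting is an inverse-distance weighted average of the neighbors' map coordinates, sketched below. This particular formula is an assumption chosen for illustration and is not stated in the text; the function name `place_new_element` and the sample coordinates are hypothetical.

```python
def place_new_element(neighbors):
    """neighbors: list of ((x, y), semantic_distance) with distance > 0.

    Returns the (x, y) position of the new element as the average of the
    neighbor positions weighted by 1 / distance, so that more similar
    neighbors pull the new element closer (assumed rule).
    """
    weights = [1.0 / distance for _, distance in neighbors]
    total = sum(weights)
    x = sum(w * pos[0] for w, (pos, _) in zip(weights, neighbors)) / total
    y = sum(w * pos[1] for w, (pos, _) in zip(weights, neighbors)) / total
    return x, y

# The first neighbor is three times more similar than the second.
new_x, new_y = place_new_element([((0.0, 0.0), 1.0), ((4.0, 0.0), 3.0)])
```

In this example the new element lands at roughly x = 1, which is closer to the first neighbor than to the second, as the text requires.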

FIG. 13 is a diagram showing a process of determining the validity of an expression element according to an embodiment of the present invention.

In the method of visualizing expression elements shown in FIG. 9, a plurality of expression elements are extracted from comment data collected for an object selected by a user (S910), and the validity of each extracted expression element is determined (S1310). If an extracted expression element is not valid for the object selected by the user, the invalid expression element is removed (S1320).

The expression elements remaining after the invalid ones have been removed may then be visualized on the basis of a distribution based on the semantic distances among the extracted expression elements (S920). Criteria for judging a particular expression element invalid include the following: its meaning differs markedly from that of the other expression elements; its frequency is markedly low, below a reference value; or it appears at a constant rate across many pieces of content rather than in particular content only, without discriminating among them (in which case it may be mechanically repeated promotion, a notice, or the like rather than a genuine review).

FIG. 14 is a diagram showing a process of determining the validity of an expression element on the basis of its frequency, according to an embodiment of the present invention.

In the method of visualizing expression elements described with reference to FIG. 13, a plurality of expression elements are extracted from comment data collected for an object selected by a user (S910), and the frequency with which each extracted expression element occurs in the comment data is measured (S1410).

The validity of each extracted expression element is then determined using its frequency (S1310). If an extracted expression element is not valid for the object selected by the user, the invalid expression element is removed (S1320).

FIG. 15 is a flowchart showing the process of determining the validity of an expression element in more detail, according to an embodiment of the present invention.

In the method of visualizing expression elements shown in FIG. 9, a plurality of expression elements are extracted from comment data collected for an object selected by a user (S910), and the frequency with which each extracted expression element is extracted from objects other than the one selected by the user is identified (S1510). It is then determined whether the extracted expression element has been extracted at or above a given frequency from at least a given number of the other objects (S1520), and the weight of an expression element so extracted is adjusted (S1530). The extracted expression elements are then visualized on the basis of a distribution based on the semantic distances among them (S920).

Accordingly, an expression element that appears equally across all objects (content), without discriminating among them, may be regarded as invalid.
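The check in steps S1510 to S1530 can be sketched as follows. The thresholds `min_objects` and `min_freq`, the `penalty` factor, and the function name are hypothetical parameters introduced for illustration; the text only speaks of "a given number" and "a given frequency".

```python
def adjust_weights(freq_by_object, selected, min_objects=2, min_freq=5, penalty=0.5):
    """freq_by_object: {object: {element: count}}; selected: the chosen object.

    An element of the selected object that occurs at least `min_freq` times
    in at least `min_objects` other objects receives the reduced weight
    `penalty`; every other element keeps weight 1.0.
    """
    weights = {}
    for element in freq_by_object[selected]:
        others = sum(
            1
            for obj, freqs in freq_by_object.items()
            if obj != selected and freqs.get(element, 0) >= min_freq
        )
        weights[element] = penalty if others >= min_objects else 1.0
    return weights

# Hypothetical counts: "ad" is frequent everywhere, "fun" only in movie A.
data = {
    "A": {"fun": 10, "ad": 8},
    "B": {"ad": 9},
    "C": {"ad": 7, "fun": 1},
}
weights = adjust_weights(data, "A")
```

Setting `penalty` to 0 would correspond to treating such an element as invalid and removing it altogether.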

FIG. 16 is a flowchart of a method of adjusting the influence of an expression element when a particular expression element is concentrated, according to an embodiment of the present invention.

In the method of visualizing expression elements shown in FIG. 9, a plurality of expression elements are extracted from comment data collected for an object selected by a user (S910), and the frequency with which each expression element was extracted is measured (S1610). The measured frequency is then adjusted by applying a weight that depends on the measured frequency (S1620). The expression elements may be visualized reflecting the adjusted frequencies (S920).

Accordingly, when a particular expression element appears with excessive concentration in particular content, its influence can be adjusted by adjusting the weight. That is, the influence of a particular expression element is adjusted because, when it appears with excessive concentration, the influence of the other expression elements may be unduly underestimated.
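A frequency-dependent weighting of this kind can be sketched with logarithmic scaling, which compresses very large counts more than small ones. The text does not name a specific weighting function, so the use of `log(1 + f)` and the function name `dampen` are assumptions.

```python
import math

def dampen(freqs):
    """freqs: {element: raw count}. Returns {element: log(1 + count)}.

    Logarithmic scaling is an assumed choice; it preserves the ranking of
    the elements while reducing the dominance of a concentrated element.
    """
    return {element: math.log1p(count) for element, count in freqs.items()}

# Hypothetical counts where one element dominates.
raw = {"angry": 99, "sad": 1}
adjusted = dampen(raw)
raw_share = raw["angry"] / sum(raw.values())
adjusted_share = adjusted["angry"] / sum(adjusted.values())
```

After the adjustment the dominant element still ranks first, but the other elements are no longer negligible on the map.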

FIG. 17 is a flowchart of a method of applying a weight when the frequency with which a particular expression element actually appears in a particular object is low, according to an embodiment of the present invention.

In the method of visualizing expression elements shown in FIG. 9, a plurality of expression elements are extracted from comment data collected for an object selected by a user (S910), and the frequency with which each expression element was extracted from the comment data is measured (S1710). The frequency with which the expression element appears in the object selected by the user is then compared with the measured frequency (S1720), and, according to the result of the comparison, the measured frequency is adjusted by applying a weight to it (S1730).

Accordingly, the frequency with which a particular expression element actually appears in a particular object (content/movie) is compared with the frequency with which it appears in the comment data, and a low weight may be applied when the frequency in the comment data is low.

FIG. 18 is a flowchart of a method of mapping an expression element to a pre-stored standard-form expression element and measuring its frequency, according to an embodiment of the present invention.

In the method of visualizing expression elements shown in FIG. 10, when the plurality of expression elements are extracted from the comment data collected for the object selected by the user (S910), a database in which standardized expression elements are stored in advance is searched to determine whether an extracted expression element is stored in it (S911). If the extracted expression element is not stored in the database, the standardized expression element in the database that is closest to it in semantic distance is identified as the representative expression element of the extracted expression element (S912).

Then, when the frequency with which each extracted expression element occurs in the comment data is measured (S930), the frequency with which the extracted expression element was extracted from the comment data is added to the frequency with which the identified representative expression element was extracted from the comment data. When the extracted expression elements are visualized on the basis of a distribution based on the semantic distances among them (S920), the representative expression element is visualized reflecting the summed frequency.

Accordingly, an expression element that is not in standard form may be mapped to the standard-form expression element stored in the emotion word dictionary (the pre-stored database), and the frequency in each object's comment data may be measured with respect to the mapped standard-form expression element.
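The mapping and summation can be sketched as follows. For brevity the sketch replaces the nearest-by-semantic-distance lookup of step S912 with a hypothetical precomputed table `variants`; the function name and the sample tokens are likewise assumptions.

```python
from collections import Counter

def count_canonical(tokens, dictionary, variants):
    """Count occurrences per standard-form expression element.

    tokens: expression elements extracted from the comment data
    dictionary: set of standard-form elements (the pre-stored database)
    variants: {non-standard form: standard form}, standing in for the
              semantic-distance lookup described in the text
    """
    counts = Counter()
    for token in tokens:
        if token in dictionary:
            counts[token] += 1
        elif token in variants:
            counts[variants[token]] += 1  # add to the representative element
        # tokens with no known standard form are ignored in this sketch
    return counts

# Hypothetical tokens: "saaad" is a non-standard spelling of "sad".
counts = count_canonical(
    ["sad", "saaad", "sad", "lol"],
    dictionary={"sad"},
    variants={"saaad": "sad"},
)
```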

FIG. 19 is a diagram showing a system for visualizing expression elements according to an embodiment of the present invention.

The system 1900 for visualizing expression elements may be, for example, a computing system, and includes a storage device 1910 and a processor 1920. The processor 1920 includes an expression element extraction unit 1930, a frequency measurement unit 1940, a validity determination unit 1950, and a visualization unit 1960.

The storage device 1910 stores comment data for the object selected by the user, the expression element extraction unit 1930 extracts a plurality of expression elements from the comment data stored in the storage device 1910, and the visualization unit 1960 visualizes the extracted expression elements on the basis of a distribution based on the semantic distances among them.

When the frequency measurement unit 1940 measures the frequency with which each extracted expression element occurs in the comment data, the visualization unit 1960 may visualize the extracted expression elements according to their measured frequencies. The validity determination unit 1950 may determine the validity of each extracted expression element and, if an extracted expression element is not valid for the object selected by the user, remove the invalid expression element.

The frequency measurement unit 1940 may also identify the frequency with which an expression element extracted by the expression element extraction unit 1930 is extracted from objects other than the one selected by the user, and determine whether the extracted expression element has been extracted at or above a given frequency from at least a given number of the other objects.

The expression elements include words, paragraphs, emoticons, and the like that express people's emotions and are extracted from the comment data. On the basis of a distribution based on the semantic distances among the expression elements, they may be visualized in the form of a heat map, contour lines, or the like, built on the multi-dimensional scaling map (MDS map) described with reference to FIGS. 1 to 4.

도 20은 본 발명의 일 실시예에 따른 신규 표현요소를 확인하여 표현요소를 시각화 하는 시스템을 나타낸 도면이다.Figure 20 illustrates a system for identifying new presentation elements and visualizing presentation elements in accordance with an embodiment of the present invention.

신규 표현요소를 확인하여 표현요소를 시각화 하는 시스템(1900)은 스토리지 장치(1910) 및 프로세서(1920)를 포함한다. 이때, 프로세서(1920)는 표현요소 추출부(1930), 표현요소 비교부(1970), 신규 표현요소 확인부(1980), 시각화부(1960)를 포함한다.A system 1900 for identifying new presentation elements and visualizing presentation elements includes a storage device 1910 and a processor 1920. The processor 1920 includes an expression element extraction unit 1930, an expression element comparison unit 1970, a new expression element verification unit 1980, and a visualization unit 1960.

The expression element comparison unit 1970 compares the expression elements extracted by the expression element extraction unit 1930 with previously extracted expression elements, and the new expression element checking unit 1980 checks whether a new expression element has been added among the extracted expression elements.
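The comparison and new-element check can be illustrated with a simple membership test. The function name and sample data are hypothetical; the sketch only shows the idea of comparing a fresh extraction against what was extracted before.

```python
def find_new_elements(extracted, previously_extracted):
    """Return elements not seen in earlier extractions, in first-seen order."""
    known = set(previously_extracted)
    seen = set()
    new_elements = []
    for element in extracted:
        if element not in known and element not in seen:
            new_elements.append(element)
            seen.add(element)
    return new_elements

new_elements = find_new_elements(
    ["funny", "lit", "funny", "lit"],  # hypothetical current extraction
    ["funny", "sad"],                  # hypothetical earlier extraction
)
```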

FIG. 21 illustrates a system that visualizes expression elements by measuring and adjusting their frequencies according to an embodiment of the present invention.

The system 1900 for visualizing expression elements includes a storage device 1910 and a processor 1920. The processor 1920 includes an expression element extraction unit 1930, an expression element comparison unit 1970, a new expression element checking unit 1980, and a visualization unit 1960.

The frequency measurement unit 1940 may identify the frequency with which an expression element extracted by the expression element extraction unit 1930 is extracted from objects other than the object selected by the user, and determine whether the extracted expression element has been extracted at or above a certain frequency from at least a certain number of those other objects.

The frequency adjustment unit 1990 may then adjust the frequency of an expression element by applying a weight to it according to the measured frequency.

The visualization unit 1960 may visualize the expression elements by reflecting the frequencies adjusted by the frequency adjustment unit 1990.

Accordingly, even when a specific expression element appears excessively concentrated in specific content, its influence can be moderated by lowering its weight.
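One way to picture this weighting is a damping rule that leaves ordinary counts alone and shrinks the weight of counts above a cap. The logarithmic formula and the `cap` parameter below are assumptions chosen for illustration; the patent does not specify how the weight is computed.

```python
import math

def adjust_frequency(freq, cap=10):
    """Damp frequencies above `cap`; leave smaller ones unchanged."""
    if freq <= cap:
        return float(freq)
    # The weight falls below 1 as the count grows past the cap.
    weight = 1.0 / (1.0 + math.log(freq / cap))
    return freq * weight

adjusted_small = adjust_frequency(5)
adjusted_large = adjust_frequency(100)
```

With this rule an element counted 100 times still ranks above one counted 10 times, but by a much smaller margin, so it no longer dominates the visualization.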

In addition, the frequency measurement unit 1940 may measure the frequency with which an expression element is extracted from the stored comment data, and compare the frequency with which the expression element appears for the object selected by the user against the identified frequency. The frequency adjustment unit 1990 may then adjust the extraction frequency of the expression element by applying a weight to it according to the result of that comparison.
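A comparison-based weight could, for example, shrink an element's frequency in proportion to how common the element is in other objects. The formula below (selected frequency divided by the sum of the selected frequency and the average frequency elsewhere) is a hypothetical choice for illustration only.

```python
def adjust_by_comparison(selected_freq, other_freqs):
    """Down-weight an element that is also frequent in other objects."""
    baseline = sum(other_freqs) / len(other_freqs) if other_freqs else 0.0
    if selected_freq == 0:
        return 0.0
    weight = selected_freq / (selected_freq + baseline)
    return selected_freq * weight

# Element equally frequent elsewhere: its frequency is halved.
generic = adjust_by_comparison(10, [10, 10])
# Element absent elsewhere: its frequency is kept as measured.
distinctive = adjust_by_comparison(10, [0, 0])
```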

FIG. 22 illustrates the expression element extraction unit in detail according to an embodiment of the present invention.

The expression element extraction unit 1930 described in FIGS. 19 to 21 includes an expression element search unit 1931 and an expression element identification unit 1932.

The expression element search unit 1931 searches whether an extracted expression element is stored in a database in which standardized expression elements are stored in advance. If the extracted expression element is not stored in the database, the expression element identification unit 1932 identifies the standardized expression element in the database that has the closest semantic distance to the extracted expression element as its representative expression element.

The frequency measurement unit 1940 then adds the frequency with which the extracted expression element was extracted from the comment data to the frequency with which the identified representative expression element was extracted from the comment data, and the visualization unit 1960 visualizes the representative expression element by reflecting the summed frequency.

Accordingly, when an expression element is not in standard form, it can be mapped to a standard-form expression element stored in the emotion vocabulary dictionary (that is, in the pre-stored database), and the frequency in each object's comment data can be measured on the basis of the mapped standard-form expression element.
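The mapping to a representative expression element and the summing of frequencies might look like the sketch below. It assumes that semantic distance can be computed as the Euclidean distance between vectors; the two-dimensional vectors, the element names, and the function names are made up for illustration and stand in for whatever semantic distance measure the system actually uses.

```python
import math
from collections import Counter

# Hypothetical vectors for standardized elements in the database.
STANDARD = {"funny": (1.0, 0.0), "sad": (-1.0, 0.0), "scary": (0.0, 1.0)}
# Hypothetical vectors for non-standard elements found in comments.
NONSTANDARD = {"lol": (0.9, 0.1), "creepy": (0.1, 0.8)}

def representative(element, vectors, standard):
    """Return the element itself if standardized, else its nearest standard element."""
    if element in standard:
        return element
    v = vectors[element]
    return min(standard, key=lambda s: math.dist(v, standard[s]))

def merged_frequencies(raw_counts, vectors, standard):
    """Add each element's count to the count of its representative element."""
    merged = Counter()
    for element, n in raw_counts.items():
        merged[representative(element, vectors, standard)] += n
    return merged

raw_counts = {"funny": 4, "lol": 3, "creepy": 2}
merged = merged_frequencies(raw_counts, NONSTANDARD, STANDARD)
```

In this toy data "lol" is folded into "funny" and "creepy" into "scary", so the visualization shows only standardized elements with their combined counts.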

FIGS. 23 to 27 illustrate different visualization methods according to an embodiment of the present invention.

FIG. 23 is a visualization graph other than the heat map form, showing the present invention as a scatter plot. In this case, the higher the frequency of an expression word, the redder its color may be rendered. FIG. 24 is a graph in the form of small multiples.

FIG. 25 shows the present invention in the form of contour lines. In this case, the higher the frequency of an expression word, the higher the value it takes and the greater the height with which it may be rendered. FIG. 26 shows a choropleth map. In this case, the present invention is not confined to a regular rectangular form and may also be implemented on the shape of natural terrain or natural objects, such as part of a map.

FIG. 27 shows the present invention as a cartogram. In this case, when expression words occur by city, province, or county according to the comment data or opinions selected by the user, high-frequency expression words may be provided to the user so that the high-frequency expression words selected by the user are displayed on the map.

FIG. 28 illustrates a three-dimensional application of the heat map visualization method according to an embodiment of the present invention.

Although the heat map visualization described in the present invention is shown on a two-dimensional plane, it can also be transformed into a three-dimensional form while retaining the same properties.

The dimensional transformation, angle, pixel size, and color can be adjusted according to the frequency of the expression elements. FIG. 28 shows the heat map described in the present invention in three-dimensional form.

FIG. 29 illustrates the expression element visualization method as contour lines according to an embodiment of the present invention.

FIG. 29 shows two-dimensional contour lines drawn according to the frequency of the expression words. The color and size of the contour lines can be adjusted according to that frequency.

FIG. 30 shows the contour map of FIG. 29 in three dimensions.

FIG. 30 renders the two-dimensional contour map of FIG. 29 as three-dimensional contours according to the frequency of the expression words. The color, height, and size of the contours can be adjusted according to that frequency.

FIGS. 31 to 33 illustrate applications based on the semantic map according to an embodiment of the present invention.

FIG. 31 shows an embodiment in which the multi-dimensional scaling map (MDS map) used in the present invention is applied to positioning. Compared with conventional positioning, which uses two axes to represent four attributes, MDS map positioning enables multidimensional positioning based on the various attributes that appear on the MDS map.

Taking automobile companies as an example, Audi, located in the second quadrant, lies in the same quadrant as BMW but is positioned closer to a more future-oriented image. Likewise, SM, located in the fourth quadrant, is closer to a relaxed image than KIA.

Positioning with the MDS map can be applied not only to corporate images but also, depending on the features that appear on the MDS map, to image positioning of products, people, and characters, as shown in FIGS. 32 and 33.

The embodiments described above have centered on expression elements (elements including vocabulary, emoticons, emotions, evaluations, and opinions) extracted from comment (review) data for a single object (content). However, the idea of the present invention is not limited to intuitively presenting the expression elements of the comment data for one object within a single emotion map.

That is, according to another embodiment of the present invention, an editing menu for the user or a menu providing comparative analysis of two or more objects may be provided. The user may select a first object and a second object and compare the expression elements in the reviews of the first object with those in the reviews of the second object. The comparison menu may compare the two sets by performing set operations (union, intersection, difference) between them, and a re-draw menu may also be provided to run the visualization again on the resulting intersection, union, or difference.
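The set operations behind such a comparison menu map directly onto Python's built-in set type. The function name, the result keys, and the sample elements below are hypothetical.

```python
def compare_objects(first_elements, second_elements):
    """Compare the expression elements of two objects with set operations."""
    a, b = set(first_elements), set(second_elements)
    return {
        "union": a | b,
        "intersection": a & b,
        "only_first": a - b,
        "only_second": b - a,
    }

result = compare_objects(
    ["funny", "touching", "long"],  # hypothetical elements for the first object
    ["funny", "scary", "long"],     # hypothetical elements for the second object
)
```

Any of the resulting sets could then be passed back to the visualization step, which corresponds to the re-draw menu described above.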

Moreover, a single object need not have only one set of visualization data. Under time-series version management, visualization versions corresponding to two or more time versions (or time layers) may exist, and changes in the position and attributes of the nodes over time may be tracked. The attributes of each node (expression element) over time may be represented by area, color, and the like, and may reflect frequency, concentration, and so on. As explained above, the heat map is one example of this.

The expression element visualization method according to an embodiment of the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

As described above, the present invention has been explained with reference to specific matters such as particular components, limited embodiments, and drawings, but these are provided only to aid a more general understanding of the invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the art to which the invention pertains can make various modifications and variations from this description.

Therefore, the idea of the present invention should not be confined to the described embodiments; not only the claims set out below but also everything equal or equivalent to those claims falls within the scope of the inventive idea.

1910: Storage device
1920: Processor

Claims

A data visualization method comprising:
extracting a plurality of expression elements from comment data collected for an object selected by a user; and
visualizing the extracted plurality of expression elements based on a distribution derived from the semantic distances between the extracted plurality of expression elements,
wherein the distance in a plane or space between the visualized expression elements is determined based on the semantic distance between the respective expression elements.

The method according to claim 1, further comprising:
measuring the frequency with which the extracted expression elements are extracted from the comment data,
wherein the step of visualizing the expression elements comprises visualizing the extracted expression elements according to their measured frequencies.

The method according to claim 1, further comprising, after extracting the expression elements:
comparing the extracted expression elements with previously extracted expression elements; and
determining whether a new expression element has been added among the extracted expression elements.

The method of claim 3, wherein the step of visualizing the expression elements comprises:
when a new expression element has been added among the extracted expression elements, determining, among the previously extracted expression elements, one or more adjacent expression elements whose semantic distance from the new expression element is within a certain criterion; and
determining a semantic position of the new expression element based on its semantic distance from the determined adjacent expression elements,
wherein determining the semantic position of the new expression element comprises, when the semantic distance between a first adjacent expression element and the new expression element is smaller than the semantic distance between a second adjacent expression element and the new expression element, determining the visualized position of the new expression element to be closer to the visualized position of the first adjacent expression element than to the visualized position of the second adjacent expression element.

The method according to claim 1, further comprising, after extracting the expression elements:
determining the validity of an extracted expression element with respect to the object selected by the user; and
removing the expression element if it is invalid for the object selected by the user.

The method of claim 5, further comprising, after extracting the expression elements:
measuring the frequency with which the extracted expression elements are extracted from the comment data,
wherein the step of determining the validity of the extracted expression element determines the validity by reflecting the measured frequency of the extracted expression element.

The method according to claim 1, further comprising, after extracting the expression elements:
identifying the frequency with which an extracted expression element is extracted from objects other than the object selected by the user; and
determining whether the extracted expression element has been extracted at or above a certain frequency from at least a certain number of the objects other than the object selected by the user.

The method according to claim 1, further comprising, after extracting the expression elements:
measuring the frequency with which the expression elements are extracted; and
adjusting the measured frequency by applying a weight to it according to the measured frequency,
wherein the step of visualizing the expression elements comprises visualizing the expression elements by reflecting the adjusted frequency.

The method according to claim 1, further comprising, after extracting the expression elements:
measuring the frequency with which the expression elements are extracted from the comment data;
comparing the frequency with which an expression element appears for the object selected by the user with the measured frequency; and
adjusting the measured frequency by applying a weight to the extraction frequency of the expression element according to the result of the comparison between the frequency with which the expression element appears for the object selected by the user and the measured frequency.

The method of claim 2, wherein the step of extracting the expression elements comprises:
searching whether an extracted expression element is stored in a database in which standardized expression elements are stored in advance; and
if the extracted expression element is not stored in the database, identifying the standardized expression element in the database that has the closest semantic distance to the extracted expression element as the representative expression element of the extracted expression element,
wherein the step of measuring the frequency comprises adding the frequency with which the extracted expression element is extracted from the comment data to the frequency with which the identified representative expression element is extracted from the comment data, and
wherein the step of visualizing the expression elements comprises visualizing the representative expression element by reflecting the summed frequency.

The method of claim 2, wherein the step of visualizing the expression elements comprises visualizing the expression elements using a multi-dimensional scaling map (MDS map) that includes the expression elements and reflects the semantic distances between them.

A computer-readable recording medium having recorded therein a program for executing the method according to any one of claims 1 to 11.

A data visualization system comprising:
a storage device that stores comment data for an object selected by a user;
an expression element extraction unit that extracts a plurality of expression elements from the stored comment data; and
a visualization unit that visualizes the extracted plurality of expression elements based on a distribution derived from the semantic distances between the extracted plurality of expression elements,
wherein the distance in a plane or space between the visualized expression elements is determined based on the semantic distance between the respective expression elements.

The data visualization system of claim 13, further comprising:
a frequency measurement unit that measures the frequency with which the extracted expression elements are extracted from the comment data,
wherein the visualization unit visualizes the extracted expression elements according to their measured frequencies.

The data visualization system of claim 13, further comprising:
an expression element comparison unit that compares the extracted expression elements with previously extracted expression elements; and
a new expression element checking unit that checks whether a new expression element has been added among the extracted expression elements.

The data visualization system of claim 13, further comprising:
a validity determination unit that determines the validity of an extracted expression element with respect to the object selected by the user and removes the expression element if it is invalid for the object selected by the user.

The data visualization system of claim 16, further comprising:
a frequency measurement unit that measures the frequency with which the extracted expression elements are extracted from the comment data,
wherein the validity determination unit determines the validity of the extracted expression element by reflecting the measured frequency of the extracted expression element.

The data visualization system of claim 13, further comprising:
a frequency measurement unit that identifies the frequency with which an extracted expression element is extracted from objects other than the object selected by the user, and determines whether the extracted expression element has been extracted at or above a certain frequency from at least a certain number of the objects other than the object selected by the user.

The data visualization system of claim 13, further comprising:
a frequency adjustment unit that measures the frequency with which the expression elements are extracted, applies a weight to the frequency of the expression elements according to the measured frequency, and adjusts the frequency of the expression elements,
wherein the visualization unit visualizes the expression elements by reflecting the adjusted frequency.

The data visualization system of claim 13, further comprising:
a frequency measurement unit that measures the frequency with which the expression elements are extracted from the stored comment data and compares the frequency with which an expression element appears for the object selected by the user with the measured frequency; and
a frequency adjustment unit that applies a weight to the extraction frequency of the expression element according to the result of the comparison between the frequency with which the expression element appears for the object selected by the user and the measured frequency.

The data visualization system of claim 14, wherein the expression element extraction unit comprises:
an expression element search unit that searches whether an extracted expression element is stored in a database in which standardized expression elements are stored in advance; and
a representative expression element identification unit that, if the extracted expression element is not stored in the database, identifies the standardized expression element in the database that has the closest semantic distance to the extracted expression element as the representative expression element of the extracted expression element,
wherein the frequency measurement unit adds the frequency with which the extracted expression element is extracted from the comment data to the frequency with which the identified representative expression element is extracted from the comment data, and
wherein the visualization unit visualizes the representative expression element by reflecting the summed frequency.