KR20150077957A

KR20150077957A - Method of determining a trend and a turning point of a stock price index using sentiment based indexes according to an analysis of social data and system thereof

Info

Publication number: KR20150077957A
Application number: KR1020130166947A
Authority: KR
Inventors: 김영대; 고경훈; 이동진
Original assignee: 주식회사 코스콤
Priority date: 2013-12-30
Filing date: 2013-12-30
Publication date: 2015-07-08
Also published as: KR101540322B1

Abstract

Provided are a method and a system for determining the trend and turning points of a stock price index using sentiment-based indexes derived from an analysis of social data. The method comprises the steps of: when the date is updated, collecting a plurality of documents related to a stock group from social media data and the like over a first period ending at the updated date and over a second period longer than the first period, and generating and storing, by date, sentiment evaluation data for the documents as a whole; calculating a first moving average of the daily sentiment evaluation data within the first period, and generating accumulated first moving average data comprising the first moving average before the update and the first moving average after the update; calculating a second moving average of the daily sentiment evaluation data within the second period, and generating accumulated second moving average data comprising the second moving average before the update and the second moving average after the update; and determining the short-term trend of the stock price index related to the stock group to be rising or falling when the accumulated first moving average data rises above or falls below the accumulated second moving average data, respectively.

Description

Method and system for determining the trend and turning points of a stock price index using sentiment-based indexes according to an analysis of social data

The present invention relates to a method and system for determining the trend and turning points of a stock price index, and more particularly to such a method and system that use sentiment-based indexes according to an analysis of social data.

Because of the stock market's uniquely complex pricing mechanism, fluctuations in a stock price index, such as the price of a stock group containing at least one stock or a specific index computed from the group's market capitalization, frequently cannot be explained by changes in market fundamentals. Prices can be seen to move sharply even when no clear change in fundamentals has occurred, and in such cases the appearance of new news often acts as an important cause of the price movement. This is because news contains explanations of phenomena occurring in the real world and information about what changes will arise and unfold in politics, the economy, society, companies, and so on. News and stock prices are therefore closely related, and news allows market participants to predict stock market volatility at least in part.

Meanwhile, in addition to news provided by securities firms and media companies, the rapid development of mobile devices has recently brought explosive growth in information provided through social media data, such as Twitter, personal blogs about the stock market, Facebook, and the social data services of various portal sites. Such data circulates among market participants in far greater volume than news and is referred to as big data.

Because social media data is written from individuals' subjective viewpoints, it is less reliable than news. However, since it is available at big-data scale, it not only allows market participants' reactions to the stock market, and in particular to a stock group, to be derived with a considerable degree of objectivity, but has also reached the point where outlooks for the stock group derived from it can be valid.

However, the fundamental factors that affect stock prices are extremely diverse and complex, and a cycle can arise in which these factors influence social media data, news, and stock prices, and social media data and the like in turn influence the stock price index.

Ultimately, social media data can be both a factor that influences the stock price index and a leading indicator that shows the movement of the index in advance. However, countless news items appear and disappear every day, so analyzing them one by one to determine their effect on stock prices is nearly impossible.

Moreover, many types of social media data and news, from macro-level policy and outlook news to daily market conditions, earnings, and corporate news, are produced in real time, and it is not easy to tell clearly whether their content is positive or negative for the market. In addition, social media data and news by their nature often present both positive and negative views of the stock market in a fairly neutral tone, so grasping the underlying intent is not simple either, and there is a risk that the interpretation will vary with the subjectivity of each person analyzing them.

For this reason, existing studies have mainly analyzed how the stock price index responds to specific, easily identified events and news, or have worked backward to check whether news existed that caused a large move in the index. However, news and similar sources mostly consist of text with no fixed format or attributes, and countless items are produced every day.

Accordingly, various attempts have recently been made to predict the trend of a stock price index, or to determine its turning points, by analyzing big data such as news and personalized social media data.

The technical problem to be solved by the present invention is to provide a method and system for determining the trend and turning points of a stock price index that help predict the short-term trend of a stock group by calculating and comparing accumulated moving average data over different periods for sentiment evaluation data analyzed from a large amount of data, including social data and news related to the stock group.

A further object is to provide a method and system for determining the trend and turning points of a stock price index that anticipate the point at which the future stock price index of a stock group will reverse, by analyzing the sentiment evaluation data of the stock group from a large amount of data, calculating a sentiment impact index, and comparing it with an overheating index or a slump index.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

According to one aspect of the present invention for achieving the above technical object, a method of determining the trend and turning points of a stock price index using sentiment-based indexes according to an analysis of social data comprises the steps of: when the date is updated, collecting, over a first period ending at the updated date and over a second period longer than the first period, a plurality of documents related to a stock group containing at least one stock from social media data and stock-market-related web data, and generating and storing, by date, sentiment evaluation data for the plurality of documents as a whole; calculating a first moving average of the daily sentiment evaluation data belonging to the first period, and generating accumulated first moving average data composed of the first moving average before the update and the first moving average after the update; calculating a second moving average of the daily sentiment evaluation data belonging to the second period, and generating accumulated second moving average data composed of the second moving average before the update and the second moving average after the update; and determining the short-term trend of the stock price index related to the stock group to be rising when the accumulated first moving average data crosses above the accumulated second moving average data, and determining the short-term trend of the stock price index to be falling when the accumulated first moving average data falls below the accumulated second moving average data.
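The determination step can be illustrated with a minimal Python sketch. It assumes the two accumulated moving average series are already available as lists with one value per date, and it reads "crossing above" and "falling below" as a sign change in the difference between the last two points; the function name `short_term_trend` and that reading are illustrative assumptions, not taken from the specification.

```python
from typing import Optional, Sequence


def short_term_trend(first_ma: Sequence[float],
                     second_ma: Sequence[float]) -> Optional[str]:
    """Compare the last two points of each accumulated series.

    first_ma  : accumulated first (shorter-period) moving averages
    second_ma : accumulated second (longer-period) moving averages
    Returns "rise" on an upward crossing, "fall" on a downward crossing,
    and None when no crossing happened at the latest date update.
    (Hypothetical helper; the crossing rule is an assumed interpretation.)
    """
    if len(first_ma) < 2 or len(second_ma) < 2:
        return None  # need one value before and one after the update
    prev_diff = first_ma[-2] - second_ma[-2]
    curr_diff = first_ma[-1] - second_ma[-1]
    if prev_diff <= 0 < curr_diff:
        return "rise"
    if prev_diff >= 0 > curr_diff:
        return "fall"
    return None
```

A caller would invoke this once per date update, after appending the new moving averages to both series.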

Specific details of other embodiments are included in the detailed description and the drawings.

According to the present invention, accumulated moving average data over different periods are calculated and compared for sentiment evaluation data analyzed from a large amount of data, including social data and news related to a stock group. This makes it possible to predict the short-term trend of the stock price index related to the stock group and to provide a leading indicator of changes in the group's trend.

In addition, when the short-term trend is determined to be rising or falling and the trading volume of the stock group, obtained from stock market indicator data, is confirmed to be increasing or decreasing, a transition of the stock group's market to a bull market or a bear market can be predicted.

According to the present invention, the sentiment evaluation data of a stock group is analyzed from a large amount of data, and a sentiment impact index serving as a turning-point indicator is calculated and compared with an overheating index or a slump index. This makes it possible to assess the market situation of the stock group and to identify more accurately the point at which the stock price index will reverse. Further, by comparison with a turning index corresponding to the median value of the sentiment impact index, the point at which the stock price index of the stock group will rise or fall can be predicted.

FIG. 1 is a configuration diagram of a stock price prediction system that includes a system for determining the trend and turning points of a stock price index according to an embodiment of the present invention.
FIG. 2 is a configuration diagram of the keyword database.
FIG. 3 is a configuration diagram of the document storage unit.
FIG. 4 is a configuration diagram of the data analysis unit.
FIG. 5 is a configuration diagram of the sentiment dictionary database.
FIG. 6 is a configuration diagram of the sentiment-based index unit.
FIG. 7 is a configuration diagram of the correlation analysis/determination unit.
FIG. 8 is a flowchart of a stock price prediction method that includes a process of generating sentiment evaluation data.
FIG. 9 is a flowchart of the processing performed by the trend prediction unit to implement a method of determining the trend and turning points of a stock price index according to an embodiment of the present invention.
FIG. 10 is a diagram showing the collection of sentiment evaluation data while the set first and second periods are shifted.
FIG. 11 is a diagram showing the accumulated first and second moving average data and the short-term trend result for the stock price index as displayed on the display unit.
FIG. 12 is a flowchart of the processing performed by the turning prediction unit to implement a method of determining the trend and turning points of a stock price index according to another embodiment of the present invention.
FIG. 13 is a flowchart of the processing performed by the turning prediction unit to implement a method of determining the trend and turning points of a stock price index according to yet another embodiment of the present invention.
FIGS. 14A and 14B are diagrams showing, respectively, a process of calculating the daily net increase and net decrease, and a process of calculating the daily actual increase and actual decrease.
FIG. 15 is a diagram showing the accumulated index data, the market situation of the stock group, and the upward/downward turns of the stock group's price index as displayed on the display unit.
FIG. 16 is a flowchart showing the determination of the collection period and delay period for sentiment evaluation data and the process of selecting evaluation data.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings and the description below. However, the present invention is not limited to the embodiments described here and may be embodied in other forms. Rather, the embodiments introduced here are provided so that the disclosure will be thorough and complete and will fully convey the spirit of the invention to those skilled in the art. Like reference numerals denote like elements throughout the specification. The terminology used in this specification is for describing the embodiments and is not intended to limit the invention. In this specification, the singular form includes the plural unless specifically stated otherwise. As used in the specification, "comprises" and/or "comprising" do not exclude the presence or addition of one or more components, steps, operations, and/or elements other than those mentioned.

Hereinafter, a system for determining the trend and turning points of a stock price index according to an embodiment of the present invention is described with reference to FIGS. 1 to 7. FIG. 1 is a configuration diagram of a stock price prediction system that includes the determination system according to an embodiment of the present invention. FIG. 2 is a configuration diagram of the keyword database, FIG. 3 is a configuration diagram of the document storage unit, and FIG. 4 is a configuration diagram of the data analysis unit. FIG. 5 is a configuration diagram of the sentiment dictionary database, FIG. 6 is a configuration diagram of the sentiment-based index unit, and FIG. 7 is a configuration diagram of the correlation analysis/determination unit.

Turning first to the system according to the present invention: within the stock price prediction system 100, the document collection/extraction unit 110, the document storage unit 130, the morpheme analysis unit 140, the data analysis unit 150, the evaluation data storage unit 165, and the sentiment-based index unit 200 make up the system for determining the trend and turning points of a stock price index. The configuration of the determination system is described in detail below together with the other elements of the stock price prediction system 100.

The stock price prediction system 100 predicts the stock price index of a stock group on the basis of sentiment evaluation data generated by rating each keyword extracted from the social media data 10 and the stock-market-related web data 20 as either positive or negative. It also generates moving averages over different periods, accumulated moving average data, and a sentiment impact index as price trend indicators for the stock group, and from these produces data for predicting the short-term trend of the group's stock price index and the group's market situation. Here, a stock group contains at least one stock (that is, an individual company). When the group contains a single stock, the stock price index is that stock's price; when it contains multiple stocks, the stock price index is an index computed from the stocks' market capitalization, for example the KOSPI 200 index.

A moving average is the arithmetic mean of the sentiment evaluation data over a specific period that shifts over time, and the accumulated moving average data is the data obtained by linking and accumulating the successive moving averages. The sentiment impact index is a leading indicator that expresses a trend pattern or turning pattern in the stock price index of a stock group; it is obtained by converting the group's accumulated sentiment evaluation data into an index using a predetermined formula.
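As a minimal sketch of these two definitions, the Python class below keeps the most recent daily sentiment scores in a fixed-size window and appends each new moving average to an accumulated series. The class name `MovingAverageAccumulator` is hypothetical, and averaging over fewer days while the window is still filling is an assumption made for the example; the specification does not say how the first days are handled.

```python
from collections import deque
from typing import Deque, List


class MovingAverageAccumulator:
    """Accumulates one moving average per date update (illustrative only)."""

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        # Holds at most `window` most recent daily sentiment scores.
        self._recent: Deque[float] = deque(maxlen=window)
        # Accumulated moving average data: one entry per date update.
        self.accumulated: List[float] = []

    def update(self, daily_score: float) -> float:
        """Record the score for the newly updated date and return the average."""
        self._recent.append(daily_score)
        # Assumption: before the window is full, average the days seen so far.
        average = sum(self._recent) / len(self._recent)
        self.accumulated.append(average)
        return average
```

Two instances with different `window` values would yield the shorter-period and longer-period series that the method compares.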

Specifically, the stock price prediction system 100 includes: a document collection/extraction unit 110 that collects a large number of documents from the social media data 10 and the stock-market-related web data 20 each time the date is updated; a document storage unit 130 that stores the collected documents by individual company; a morpheme analysis unit 140 that performs morphological analysis, company by company, on the expressions and sentences contained in the documents; and a data analysis unit 150 that rates each keyword extracted from the analyzed morphemes as either positive or negative with respect to the stock group (an individual company or a combination of companies), thereby evaluating the sentiment of the documents as a whole and analyzing their data.

The stock price prediction system 100 also includes an evaluation data storage unit 165 that stores, by date, the sentiment evaluation data generated by the data analysis unit 150, and a sentiment-based index unit 200 that calculates the moving averages over different periods, the accumulated moving average data, and the sentiment impact index, generates from them data for predicting the short-term trend of the stock group's price index and its market situation, and displays that data on the display unit 190.

The stock price prediction system 100 may further include: a correlation analysis/determination unit 170 that generates analysis data from the correlations among stock market indicator data, economic indicator data, and the sentiment evaluation data selected from the accumulated sentiment evaluation data under a predetermined condition; a stock price prediction unit 180 that estimates the stock price index of the stock group on the basis of the selected sentiment evaluation data and the analysis data; and a display unit 190 that displays the prediction results produced by the stock price prediction unit 180.

Each time the date is updated, the document collection/extraction unit 110 collects a large number of documents related to the stock group from the social media data 10 and the stock-market-related web data 20, and receives the stock market indicator data 30. Here, the stocks in a stock group are companies listed on the stock market, and the collected documents may take at least one of the following forms: HTML, PDF (Portable Document Format), image, and video.

The social media data 10 is media data entered through a desktop computer or mobile device connected to a network such as the Internet, and can be shared with other users connected to the network. For example, the social media data 10 may come from social media sites 12 operated by social media servers and from blog sites 14 that are run on various portal sites and contain personalized content. The social media sites 12 are so-called SNS services, such as Twitter, Facebook, and the social media services offered by various portal sites.

The stock-market-related web data 20 is web data provided by media companies, terrestrial broadcasters, cable broadcasters, portal site news, financial companies, stock-market-related institutions, and the like, and is more specialized or authoritative than the social media data 10. The stock-market-related web data 20 may come from stock-market news sites 22 served by media companies, broadcasters, portal site news, and the market information pages of portal sites; financial company portal sites 24 through which banks, securities firms, insurers, and other financial companies provide stock-market services; and stock-market statistics sites 26 through which public or private stock-market institutions provide market analysis.

The stock market indicator data 30 is the trading information for each individual listed stock, and may include, for example, the opening price, high, low, closing price, quoted prices, execution status, trading volume, trading value, trading members, upper price limit, lower price limit, new highs, and new lows.

When collecting a large number of documents from the social media data 10 and the stock-market-related web data 20, the document collection/extraction unit 110 does not collect every document; it refers to the keyword database 120 and collects the documents related to the stock group.

The keyword database 120 may contain groups of keywords categorized by the company corresponding to each individual stock. Specifically, as shown in FIG. 2, it may store a main keyword 122 related to the company name of the individual stock, together with sub-keywords including product/service keywords 124 for the goods and services the company offers, people keywords 126 for the company's management and the like, and company situation keywords 128 for words and contexts that can affect the individual stock. The sub-keywords are words, contexts, and the like that are specific to a company, and may be classified and categorized company by company.

As an example, the main keyword 122 may be the name of a company listed on the stock market, such as Samsung Electronics, LG Electronics, or KT. In the case of Samsung Electronics, the product/service keywords 124 may be "Galaxy", "smartphone", "Hauzen", "tablet", "app market", and so on; the people keywords 126 may be the key executives of Samsung Electronics, the executives of companies that do business with Samsung Electronics, and so on; and the company situation keywords 128 are words that can affect the stock price of Samsung Electronics, and may include a variety of words such as "all-time high", "earnings", "strong", "Apple", "complaint", and "deterioration".

From the collected documents, the document collection/extraction unit 110 extracts those whose text contains the main keyword 122, the product/service keywords 124, or the people keywords 126 described above, and can thereby efficiently select document data suitable for sentiment evaluation.
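The keyword-based extraction can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the `CompanyKeywords` structure and `extract_relevant` function are hypothetical names, documents are treated as plain strings, matching is a simple substring test, and a document is kept when it contains any one of the main, product/service, or people keywords.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class CompanyKeywords:
    """Keyword group for one company, loosely modeled on FIG. 2 (hypothetical)."""
    main: Set[str]
    product_service: Set[str] = field(default_factory=set)
    people: Set[str] = field(default_factory=set)
    situation: Set[str] = field(default_factory=set)  # not used for extraction

    def collection_terms(self) -> Set[str]:
        # Only the three keyword types named in the extraction step.
        return self.main | self.product_service | self.people


def extract_relevant(documents: Iterable[str],
                     keywords: CompanyKeywords) -> List[str]:
    """Keep documents that mention at least one collection keyword (assumed rule)."""
    terms = keywords.collection_terms()
    return [doc for doc in documents if any(term in doc for term in terms)]
```

For example, with `main={"Samsung Electronics"}` and `product_service={"Galaxy"}`, a post that mentions only "Galaxy" is kept, while a post that mentions neither term is dropped.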

The document storage unit 130 may store the extracted documents in a form suitable for morphological analysis. For example, as shown in FIG. 3, the documents extracted for each individual stock group 131 may be stored separately by format, that is, as HTML 132, PDF 133, image 134, video 135, and so on.

As preprocessing that puts the data in a form suitable for sentiment evaluation, the morpheme analysis unit 140 analyzes the stored documents, in their respective formats, into morphemes, the smallest meaningful units of language, and identifies each part of speech. In doing so, the morpheme analysis unit 140 can apply processing suited to each format shown in FIG. 3 and carry out the morphological analysis for the formats in parallel.

In addition, the morpheme analysis unit 140 may divide the sentences, contexts, and the like found in a document into word units (eojeol) and parse the keywords adjacent to keywords related to individual stocks. For example, when a person's blog contains both sentences about Samsung Electronics and sentences about LG Electronics, the morpheme analysis unit 140 divides the blog text into word units with regard to sentence structure, conjunction structure, syntax, and so on; it then searches for keywords such as the names, products/services, and people of Samsung Electronics or LG Electronics, parses the adjacent words and phrases, and classifies and stores them as keywords for Samsung Electronics and for LG Electronics respectively.
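A deliberately simplified sketch of the "adjacent keyword" idea follows. It assumes whitespace tokenization and a fixed window of neighboring tokens, both of which are stand-ins chosen for the example: the function name `neighbors_by_company` and the `window` parameter are hypothetical, and real Korean morphological analysis with attention to sentence and conjunction structure, as the text describes, is well beyond this.

```python
from typing import Dict, List


def neighbors_by_company(text: str,
                         company_terms: Dict[str, List[str]],
                         window: int = 2) -> Dict[str, List[str]]:
    """Collect tokens within `window` positions of each company keyword.

    company_terms maps a company label to the keywords that identify it.
    (Illustrative stand-in for parsing adjacent words per company.)
    """
    tokens = text.split()  # assumption: whitespace-separated word units
    result: Dict[str, List[str]] = {company: [] for company in company_terms}
    for index, token in enumerate(tokens):
        for company, terms in company_terms.items():
            if token in terms:
                start = max(0, index - window)
                before = tokens[start:index]
                after = tokens[index + 1:index + 1 + window]
                result[company].extend(before + after)
    return result
```

A fixed window can attribute a word to the wrong company when two company names appear close together, which is one reason the text takes sentence structure into account before assigning neighbors.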

Referring to FIG. 4, the data analysis unit 150 may include a data sentiment evaluation unit 152, which rates each keyword processed by the morpheme analysis unit 140 as either positive or negative and thereby calculates the sentiment evaluation data of the stock group across the documents as a whole, and a keyword analysis unit 154, which performs statistical processing on the keywords processed by the morpheme analysis unit 140.

The data sentiment evaluation unit 152 refers to the sentiment dictionary database 160, which stores, for each keyword, an evaluation as positive, neutral, or negative together with a score associated with that evaluation, and evaluates and scores each keyword extracted by the morpheme analysis unit 140 as positive, neutral, or negative on a date-by-date basis. The scoring algorithm may be a Naive Bayes algorithm, a simple voter algorithm, k-nearest neighbors (KNN), or a support vector machine (SVM). Taking the simple voter algorithm as an example, the sentiment dictionary database 160 may store keywords in table form for each of the positive, neutral, and negative evaluations, as shown in FIG. 5. Most keywords relevant to sentiment evaluation are nouns or adjectives. For example, the positive evaluation table 162 contains keywords such as "rise", "all-time high", and "go up", each assigned a score of "1". The negative evaluation table 166 contains keywords such as "recession", "fall", and "complaint", each assigned a score of "-1". Keywords stored in the neutral evaluation table 164 are assigned a score of "0". The scores shown in FIG. 5 serve only to distinguish positive from negative; alternatively, the scores associated with positive or negative evaluations may differ from keyword to keyword, with each keyword weighted according to the intensity of sentiment that market participants attach to it.
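The dictionary lookup described above can be illustrated with a minimal sketch. The dictionary contents are English glosses of the example keywords of FIG. 5, and the function name `score_keyword` and the treatment of unknown keywords as neutral are assumptions made for illustration, not details given in the specification.

```python
# Hypothetical sentiment dictionary modeled on the example tables of FIG. 5.
# Keys are English glosses of the example keywords; scores are +1 / -1.
SENTIMENT_DICTIONARY = {
    "rise": 1, "all-time high": 1, "go up": 1,      # positive table 162
    "recession": -1, "fall": -1, "complaint": -1,   # negative table 166
}


def score_keyword(keyword, dictionary=SENTIMENT_DICTIONARY):
    """Return (label, score) for one keyword.

    Keywords missing from the dictionary are treated as neutral with
    score 0; this fallback is an assumption of the sketch.
    """
    score = dictionary.get(keyword, 0)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return label, score
```

A weighted variant, as the paragraph above allows, would only need different score magnitudes in the dictionary (for example 2 for a strongly positive keyword); the lookup itself would stay the same.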

The data sentiment evaluation unit 152 may calculate the sentiment evaluation data of the stock group for the entire set of documents by summing, for each date, the scores assigned to the keywords that the sentiment dictionary database 160 classifies as positive, neutral, or negative. Here, after evaluating the keywords of all documents, the data sentiment evaluation unit 152 does not go on to evaluate each document as positive, neutral, or negative. If sentiment were evaluated document by document in order to capture each document's overall tone, one document could receive the same negative score as another even though it contains far more negatively evaluated keywords. The proportion of positive and negative elements for the stock group across all documents extracted from the social media data 10 and the stock-market-related web data 20 would then be distorted. In this embodiment, therefore, sentiment evaluation is performed on the morphologically analyzed keywords from the entire set of documents without grouping them by document, which prevents this distortion.
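The distortion argument above can be made concrete with a small sketch. It compares the corpus-level daily sum used in this embodiment with a per-document vote that first collapses each document to -1, 0, or +1. The function names, the input layout of (date, keyword list) pairs, and the dictionary contents are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical dictionary; English glosses of the example keywords of FIG. 5.
DICTIONARY = {"rise": 1, "go up": 1, "recession": -1, "fall": -1, "complaint": -1}


def daily_sentiment(documents, dictionary=DICTIONARY):
    """Sum keyword scores per date over all documents, with no per-document grouping."""
    totals = defaultdict(int)
    for date, keywords in documents:
        for keyword in keywords:
            totals[date] += dictionary.get(keyword, 0)
    return dict(totals)


def per_document_vote(documents, dictionary=DICTIONARY):
    """Contrast case: reduce each document to -1, 0, or +1 before summing per date."""
    totals = defaultdict(int)
    for date, keywords in documents:
        doc_score = sum(dictionary.get(keyword, 0) for keyword in keywords)
        totals[date] += (doc_score > 0) - (doc_score < 0)
    return dict(totals)


# One strongly negative document and one mildly positive document on the same date.
docs = [
    ("2013-12-30", ["fall", "recession", "complaint", "fall"]),
    ("2013-12-30", ["rise"]),
]
corpus_level = daily_sentiment(docs)      # {"2013-12-30": -3}
document_level = per_document_vote(docs)  # {"2013-12-30": 0}
```

In this example the per-document vote reports a balanced day, whereas the corpus-level sum preserves the fact that negative keywords outnumber positive ones four to one.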

The keyword analysis unit 154 may perform statistical analyses on the keywords analyzed by the morpheme analysis unit 140, such as the number of keywords collected per period and correlation analysis between keywords, and provide the results to the display unit 190. The keyword analysis unit 154 also selects, from the analyzed keywords, those not yet registered in the keyword database 120. The newly selected keywords are added to the keyword database 120, which can improve the accuracy of the document collection performed by the document collection/extraction unit 110. An administrator may additionally store in the sentiment dictionary database 160 any new keywords that should be reflected in sentiment evaluation.

The evaluation data storage unit 165 stores, by date, the sentiment evaluation data of the stock group calculated by the data sentiment evaluation unit 152 and passes the stored sentiment evaluation data to the sentiment-based index unit 200.

The sentiment-based index unit 200 may include a trend prediction unit 210 that calculates and analyzes moving averages and moving average cumulative data over different periods to determine the short-term trend of the stock price index related to the stock group, together with the future market outlook for the stock group in terms of a bull or bear market. The sentiment-based index unit 200 may also include a turning-point prediction unit 220 that calculates and analyzes, from the sentiment evaluation data accumulated over a predetermined period, a sentiment impact index serving as a turning-point indicator of the stock price index, to determine market conditions in terms of overheating or stagnation of the stock group, together with the future rise or fall of its stock price index.

The trend prediction unit 210 may include a first moving average calculation unit 212, a second moving average calculation unit 214, a cross analysis unit 216, a market phase prediction unit 217, and a trend indicator unit 218.

The first moving average calculation unit 212 calculates a first moving average of the per-date sentiment evaluation data belonging to a first period, out of the sentiment evaluation data of the stock group stored by date in the evaluation data storage unit 165, and generates first moving average cumulative data consisting of the first moving average before the date update and the first moving average after the date update. The first moving average is calculated to identify the short-term trend. As shown in FIG. 10, for example, it is initially the sum of the sentiment evaluation data falling within the first period divided by the total number of days in the first period. The first period may be set to a length suited to identifying short-term trends, for example 12 business days.

In addition, as shown in FIG. 10, when the date is updated, the first moving average is the average of the sentiment evaluation data over the first period ending on the updated date. The first moving average cumulative data is therefore built up by successively accumulating the first moving averages before and after each update, and can be generated as the line shown at 1110 in FIG. 11.

The second moving average calculation unit 214 calculates a second moving average of the per-date sentiment evaluation data belonging to a second period longer than the first period, out of the sentiment evaluation data of the stock group stored by date in the evaluation data storage unit 165, and generates second moving average cumulative data consisting of the second moving average before the date update and the second moving average after the date update. The second moving average is calculated to identify the medium-term trend. As shown in FIG. 10, for example, it is initially the sum of the sentiment evaluation data falling within the second period divided by the total number of days in the second period. The second period may be set to a length suited to identifying medium-term trends, for example 26 business days.

In addition, like the first moving average and as shown in FIG. 10, when the date is updated, the second moving average is the average of the sentiment evaluation data over the second period ending on the updated date. The second moving average cumulative data is therefore built up by successively accumulating the second moving averages before and after each update, and can be generated as the line shown at 1120 in FIG. 11.
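The two moving average series described above can be sketched as a trailing simple moving average that advances one date at a time. The function name `moving_average_series` and the list-based input are assumptions of this sketch; the windows of 12 and 26 business days are the example values given in the text.

```python
FIRST_PERIOD = 12    # example short-term window from the text (business days)
SECOND_PERIOD = 26   # example medium-term window from the text (business days)


def moving_average_series(daily_scores, window):
    """Return the trailing simple moving averages of daily_scores.

    One average is emitted per date once `window` dates are available, so
    the returned list plays the role of the moving average cumulative data:
    each date update appends one new average to it.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    series = []
    running_sum = 0.0
    for i, score in enumerate(daily_scores):
        running_sum += score
        if i >= window:
            # Drop the date that has just left the window.
            running_sum -= daily_scores[i - window]
        if i >= window - 1:
            series.append(running_sum / window)
    return series


# Small worked example with a window of 3 instead of 12 or 26.
example = moving_average_series([1, 2, 3, 4, 5], 3)  # [2.0, 3.0, 4.0]
```

Because the two series start on different dates (after 12 and 26 dates respectively), they have to be aligned on a common date axis before they are compared.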

The cross analysis unit 216 receives the first and second moving average cumulative data from the moving average calculation units 212 and 214. When the first moving average cumulative data rises above the second moving average cumulative data in a golden cross, as at 1130 in FIG. 11, the cross analysis unit 216 determines that the short-term trend of the stock price index related to the stock group is upward. When the first moving average cumulative data falls below the second moving average cumulative data in a dead cross, as at 1140 in FIG. 11, it determines that the short-term trend of the stock price index is downward.
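A minimal sketch of the cross detection follows. It assumes the two moving average series have already been aligned date by date; the function name `detect_crosses` and the handling of exact ties (a cross is reported on the date the short series moves strictly past the long one) are choices of this sketch, not of the specification.

```python
def detect_crosses(short_ma, long_ma):
    """Return a list of (index, kind) events, kind being "golden" or "dead".

    short_ma and long_ma are date-aligned sequences of equal meaning per index.
    A golden cross is reported when the short average moves from at-or-below
    the long average to strictly above it; a dead cross is the mirror case.
    """
    events = []
    for i in range(1, min(len(short_ma), len(long_ma))):
        previous_gap = short_ma[i - 1] - long_ma[i - 1]
        current_gap = short_ma[i] - long_ma[i]
        if previous_gap <= 0 < current_gap:
            events.append((i, "golden"))   # short-term trend judged upward
        elif previous_gap >= 0 > current_gap:
            events.append((i, "dead"))     # short-term trend judged downward
    return events


crosses = detect_crosses([1, 2, 4, 3, 1], [2, 2, 3, 3, 2])
# crosses == [(2, "golden"), (4, "dead")]
```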

When the cross analysis unit 216 identifies a golden cross and the short-term trend of the stock price index related to the stock group is therefore predicted to be upward, the market phase prediction unit 217 receives the trading volume of the stock group from the stock market indicator data 30 and, if the trading volume is confirmed to be increasing, may predict that the stock group is turning into a bull market. Likewise, when the cross analysis unit 216 identifies a dead cross and the short-term trend of the stock price index related to the stock group is therefore predicted to be downward, the market phase prediction unit 217 receives the trading volume of the stock group from the stock market indicator data 30 and, if the trading volume is confirmed to be decreasing, may predict that the stock group is turning into a bear market.
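The volume confirmation step can be sketched as a simple rule. The specification does not state how an increase or decrease in trading volume is measured, so comparing two volume figures (a previous and a current one) is an assumption of this sketch, as is the function name `predict_market_phase`.

```python
def predict_market_phase(cross, previous_volume, current_volume):
    """Combine a cross signal with a trading volume comparison.

    cross: "golden", "dead", or None.
    Returns "bull", "bear", or None when the volume does not confirm the cross.
    """
    if cross == "golden" and current_volume > previous_volume:
        return "bull"
    if cross == "dead" and current_volume < previous_volume:
        return "bear"
    return None
```

A golden cross on falling volume, or a dead cross on rising volume, yields no phase prediction in this sketch, since the text only describes the confirming cases.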

The trend indicator unit 218 generates data that presents as indicators the first and second moving average cumulative data, the result of the short-term trend determination for the stock price index related to the stock group, and the bull/bear market prediction for the stock group, and transmits the resulting data to the display unit 190.

Meanwhile, the turning-point prediction unit 220 calculates a sentiment impact index as a turning-point indicator for the stock price index of the stock group and analyzes the sentiment impact index to predict whether the market is overheated or stagnant and to determine the future rise or fall of the stock group's stock price index. Specifically, the turning-point prediction unit 220 may include a sentiment impact index unit 222, a situation analysis unit 224, and a turning-point indicator unit 226.

For the per-date sentiment evaluation data belonging to a third period, out of the sentiment evaluation data of the stock group stored by date in the evaluation data storage unit 165, the sentiment impact index unit 222 may calculate, for each date, a rise score aggregated from positive evaluations and a fall score aggregated from negative evaluations. Specifically, as shown in FIG. 14A, after each keyword has been evaluated as positive or negative, the scores of the keywords evaluated as positive during the third period are aggregated by date to calculate the rise score 1412, and the scores of the keywords evaluated as negative during the third period are aggregated by date to calculate the fall score 1414. The third period may be determined independently of the first and second periods, based on the correlation between the accumulated sentiment evaluation data and the actual stock price index of the individual stock group, and this correlation may be determined by the correlation analysis/determination unit 170. The third period used in this embodiment may be, for example, 20 business days, to suit a market whose fluctuations are sluggish.

The sentiment impact index unit 222 may also generate the sentiment impact index, serving as a turning-point indicator of the stock group's stock price index, based on the ratio between the mean of the daily increases and the mean of the daily decreases in at least one of the rise score and the fall score calculated during the third period. In this case, the daily increases and daily decreases may be the daily actual increase (r) and daily actual decrease (d) based on the difference 1416 between the rise score and the fall score calculated for each date, as shown in FIG. 14B. Alternatively, the daily increases and daily decreases may be the daily net increase (u) and daily net decrease (f) based on the rise score calculated for each date, as shown in FIG. 14A.
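The exact formula for the sentiment impact index is described elsewhere in the specification (FIGS. 12 to 14B) and is not reproduced in this passage. The sketch below is therefore only an illustration under a stated assumption: it normalizes the mean daily increase and mean daily decrease onto a 0 to 100 scale in the manner of the relative strength index, which is consistent with the ratio described above but is not confirmed by this passage. The function name and the input, a per-date series of rise score minus fall score, are likewise assumptions.

```python
def sentiment_impact_index(net_scores):
    """Illustrative 0-100 index from a per-date series of (rise score - fall score).

    ASSUMPTION: RSI-style normalization, 100 * U / (U + D), where U is the
    mean daily increase and D is the mean daily decrease over the period.
    This form is borrowed for illustration, not taken from the specification.
    """
    changes = [later - earlier for earlier, later in zip(net_scores, net_scores[1:])]
    if not changes:
        raise ValueError("at least two dates are needed")
    mean_increase = sum(c for c in changes if c > 0) / len(changes)
    mean_decrease = sum(-c for c in changes if c < 0) / len(changes)
    if mean_increase + mean_decrease == 0:
        return 50.0  # no movement at all: treated as balanced in this sketch
    return 100.0 * mean_increase / (mean_increase + mean_decrease)


# Changes are +3, -1, +2, -1, so U = 1.25 and D = 0.5.
index_value = sentiment_impact_index([0, 3, 2, 4, 3])
```

Under this assumed form, a series that only rises gives 100, a series that only falls gives 0, and equal average increases and decreases give 50.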

In addition, like the first and second moving averages, when the date is updated the sentiment impact index may be calculated by a predetermined formula from the values in the third period ending on the updated date. The sentiment impact index unit 222 can therefore keep accumulating the sentiment impact indexes before and after each update to generate index cumulative data.

The process by which the sentiment impact index unit 222 generates the sentiment impact index is described in detail later with reference to FIGS. 12 to 14B.

The situation analysis unit 224 receives the sentiment impact index for the third period generated by the sentiment impact index unit 222. When the sentiment impact index is at or above an overheating threshold, it may determine that the market for the stock group is overheated, and when the sentiment impact index is at or below a stagnation threshold, it may determine that the market for the stock group is stagnant. As shown in FIG. 15, the overheating and stagnation thresholds may be 70 and 30, respectively.

In addition, when the index cumulative data, which is another value derived by the sentiment impact index unit 222, rises above the turning threshold of 50 shown in FIG. 15, the situation analysis unit 224 may determine that the stock price index of the stock group is turning upward, and when the index cumulative data falls below the turning threshold of 50, it may determine that the stock price index of the stock group is turning downward.
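The two checks performed by the situation analysis unit 224 can be sketched as follows, using the example thresholds 70, 30, and 50 of FIG. 15. The function names, the "normal" label for values between the thresholds, and the tie handling at exactly 50 are assumptions of this sketch.

```python
OVERHEAT_THRESHOLD = 70    # example value from FIG. 15
STAGNATION_THRESHOLD = 30  # example value from FIG. 15
TURNING_THRESHOLD = 50     # example value from FIG. 15


def market_condition(index_value):
    """Classify a single sentiment impact index value."""
    if index_value >= OVERHEAT_THRESHOLD:
        return "overheated"
    if index_value <= STAGNATION_THRESHOLD:
        return "stagnant"
    return "normal"  # label chosen for this sketch; the text names only two states


def turning_points(index_series, threshold=TURNING_THRESHOLD):
    """Return (index, kind) events where the accumulated index series crosses the threshold."""
    events = []
    for i in range(1, len(index_series)):
        previous, current = index_series[i - 1], index_series[i]
        if previous <= threshold < current:
            events.append((i, "upturn"))
        elif previous >= threshold > current:
            events.append((i, "downturn"))
    return events


turns = turning_points([45, 55, 72, 48])  # [(1, "upturn"), (3, "downturn")]
```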

The turning-point indicator unit 226 generates data that presents as indicators the index cumulative data, the result of the market condition determination for the stock group, and the upward/downward turn prediction for the stock group, and transmits the resulting data to the display unit 190.

Meanwhile, the correlation analysis/determination unit 170 may provide sentiment evaluation data selected from the accumulated sentiment evaluation data according to a predetermined condition, together with analysis data derived from the correlation between stock market indicator data and economic indicator data. Referring to FIG. 7, the correlation analysis/determination unit 170 may include a first correlation table unit 172, an evaluation data collection period determination unit 173, an evaluation data selection unit 174, a delay period determination unit 175, an economic indicator database 176, and a second correlation table unit 177.

The first correlation table unit 172 receives the sentiment evaluation data accumulated by date in the evaluation data storage unit 171 and analyzes its correlation with the externally input stock market indicator data 30. The analyzed correlations between the past stock market indicator data 30 of individual stocks and the corresponding evaluation data are recorded in the first correlation table unit 172.

In addition, the first correlation table unit 172 may determine the rising or falling trend of the stock group indicated by the sentiment impact index and, when the determined trend disagrees with the actual trend of the stock group's stock price index, give notice that a trend reversal is occurring in the actual stock price of an individual stock.

The evaluation data collection period determination unit 173 determines, based on the past correlations stored in the first correlation table unit 172, the collection period of the evaluation data that affects the stock price index of the stock group, and the evaluation data selection unit 174 may select, from the sentiment evaluation data accumulated in the evaluation data storage unit 171, the evaluation data that falls within the collection period and provide it to the stock price prediction unit 180. This collection period is different from the third period used by the sentiment impact index unit 222.

In addition, the delay period determination unit 175 determines, based on the past correlations in the first correlation table unit 172, the delay period that elapses before the sentiment evaluation data is reflected in the stock price index of the stock group, and provides the delay period to the stock price prediction unit 180 when the stock price index of the stock group is predicted, so that the stock price index after the delay period can be predicted.
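The specification says only that the delay period is determined from past correlations; it does not give a procedure. One plausible reading, shown here purely as an assumption, is to shift the sentiment series against the index series by each candidate lag and keep the lag with the highest Pearson correlation. The function names `pearson` and `best_delay` and the minimum overlap of three dates are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return covariance / sqrt(var_x * var_y)


def best_delay(sentiment, index, max_lag):
    """Return (lag, correlation) for the lag in days that best aligns the two series.

    For a given lag, sentiment on date t is paired with the index on date t + lag.
    ASSUMPTION: "best" means the highest Pearson correlation.
    """
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        paired_index = index[lag:]
        overlap = min(len(sentiment), len(paired_index))
        if overlap < 3:
            break  # too few dates left to correlate meaningfully
        r = pearson(sentiment[:overlap], paired_index[:overlap])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


# Synthetic data in which the index repeats the sentiment two dates later.
sentiment_series = [1, 3, 2, 5, 4, 6, 3, 7]
index_series = [0, 0, 1, 3, 2, 5, 4, 6]
lag, r = best_delay(sentiment_series, index_series, max_lag=3)
```

With this synthetic input the sketch recovers a delay of two dates. On real data the correlation would be far weaker and a validation step would be needed before trusting any single lag.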

Providing the collection period and the delay period to the stock price prediction unit 180 in this way makes it possible to use more relevant sentiment evaluation data and to specify more accurately the point in time to which the stock price prediction applies.

In addition, the second correlation table unit 177 may provide the stock price prediction unit 180 with analysis data derived from the correlation between the stock market indicator data 30 and the economic indicator data related to macroeconomic indexes accumulated in the economic indicator database 176. Here, the economic indicator data are economic indicators that affect essentially all individual stocks in common, such as interest rates, exchange rates, expected growth rates, price indexes, and the balance of payments.

Referring again to FIG. 1, the stock price prediction unit 180 may predict the stock price index of the stock group based on the sentiment evaluation data selected by the correlation analysis/determination unit 170, the delay period, and the analysis data generated by the second correlation table unit 177. The stock price index prediction is grounded in a time series analysis of the stock market indicator data 30 and the economic indicator data, and the evaluation data analyzed from the social media data 10 and from the news in the stock-market-related web data 20 may be incorporated as a term that corrects the predicted stock price index produced by the time series analysis. To further improve prediction accuracy, a weight calculated from the correlations in the first correlation table unit 172 may be applied to the sentiment evaluation data, so that the weighted sentiment evaluation data is reflected in the stock price index prediction. The predicted stock prices of individual stocks calculated by the stock price prediction unit 180, and statistics on them, are displayed on the display unit 190.

According to the stock price prediction system 100 described above, reflecting sentiment evaluation data for a large volume of data including social data and news makes it possible to extract the market mood and information about a stock group from the diverse views of market participants more objectively and meaningfully, so that the stock price index of the stock group can be predicted more reliably. In particular, stock price prediction based on sentiment evaluation of social media data, including news analysis, is more accurate and reliable than prediction based solely on analysis of the news in the stock-market-related web data 20, because social media data is produced in far greater volume than news and therefore supports an analysis that is statistically closer to the population.

Hereinafter, with reference to FIG. 1 and FIGS. 8 to 11, the stock price index prediction method and the processing performed by the trend prediction unit to implement the method for determining the trend and turning point of a stock price index according to this embodiment are described in detail.

FIG. 8 is a flowchart of a stock price prediction method that includes the process of generating sentiment evaluation data, and FIG. 9 is a flowchart of the processing performed by the trend prediction unit to implement the method for determining the trend and turning point of a stock price index according to an embodiment of the present invention. FIG. 10 illustrates the collection of sentiment evaluation data as the first and second periods are moved forward, and FIG. 11 shows the first and second moving average cumulative data and the short-term trend result for the stock price index as displayed on the display unit.

When the date is updated, the document collection/extraction unit 110 collects, from the social media data 10 and the stock-market-related web data 20, a large volume of documents related to a stock group containing at least one stock, in the form of at least one of HTML, PDF, images, and videos, and receives the stock market indicator data 30 (S810).

In this case, the social media data 10 may come from so-called SNS services: social media sites 12 such as Twitter, Facebook, and the social media services offered by various portal sites, and blog sites 14 that are operated by various portal sites and the like and contain personalized content. The stock-market-related web data 20 may come from stock market news sites 22 served by newspapers, broadcasters, and portal sites; financial company portal sites 24 through which financial companies such as banks, securities firms, and insurers provide stock-market-related services; and stock market statistics sites 26 through which public or private institutions provide analytical information about the stock market.

Next, the document collection/extraction unit 110 refers to the keyword database 120 to collect documents related to the stock group, and the document storage unit 130 may store the extracted documents in a form suitable for morphological analysis (S820). From the collected documents, the document collection/extraction unit 110 extracts those whose text contains keywords stored in the keyword database 120 shown in FIG. 2, namely main keywords 122, product/service-related keywords 124, and person-related keywords 126, so that document data suitable for sentiment evaluation can be selected efficiently.
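The keyword-based extraction in step S820 can be sketched as a simple filter. The grouping of keywords into three categories follows FIG. 2, but the function name `select_documents`, the plain substring matching, and the sample keywords and documents are all hypothetical.

```python
# Hypothetical keyword database entries, grouped as in FIG. 2.
KEYWORD_DATABASE = {
    "main": {"KOSPI"},                  # main keywords 122
    "product_service": {"smartphone"},  # product/service-related keywords 124
    "person": {"chairman"},             # person-related keywords 126
}


def select_documents(documents, keyword_database=KEYWORD_DATABASE):
    """Keep the documents whose text contains at least one registered keyword."""
    keywords = set().union(*keyword_database.values())
    return [doc for doc in documents if any(keyword in doc for keyword in keywords)]


collected = [
    "KOSPI rises on strong exports",
    "Weather forecast for the weekend",
    "New smartphone sales hit a record",
]
selected = select_documents(collected)
```

Substring matching is used only to keep the sketch short; for Korean text, matching would more plausibly be done on morphemes after analysis.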

In addition, as shown in FIG. 3, for example, the document storage unit 130 may store the extracted documents for each individual stock group 131 in a distributed manner by format, that is, HTML 132, PDF 133, image 134, video 135, and so on.

Next, as preprocessing to put the data into a form suitable for sentiment evaluation, the morpheme analysis unit 140 performs morphological analysis on the stored documents in their respective formats (S830). In this case, the morpheme analysis unit 140 may apply processing suited to each of the formats shown in FIG. 3 and carry out morphological analysis on the formats in parallel. The morpheme analysis unit 140 may also divide the sentences, context, and the like in the text of each document into word-phrase (eojeol) units and parse the keywords adjacent to keywords associated with an individual stock or a combination of stocks. A detailed description is omitted here because it was given for the morpheme analysis unit 140 of the stock price prediction system 100.

Next, the data sentiment evaluation unit 152 of the data analysis unit 150 refers to the sentiment dictionary database 160 shown in FIG. 5 and evaluates each keyword processed by the morpheme analysis unit 140 as either positive or negative, thereby evaluating sentiment across the entire set of documents (S840). As a result, the data sentiment evaluation unit 152 generates, for each date, sentiment evaluation data related to the stock group for the entire set of documents.

More specifically, the data analysis unit 150 refers to the sentiment dictionary database 160, which stores, for each keyword, an evaluation as positive, neutral, or negative together with a score associated with that evaluation, and evaluates and scores each keyword extracted by the morpheme analysis unit 140 as positive, neutral, or negative. The data sentiment evaluation unit 152 may then calculate the sentiment evaluation data of the stock group for the entire set of documents by summing the scores assigned to the keywords that the sentiment dictionary database 160 classifies as positive, neutral, or negative.

Next, the evaluation data storage unit 165 continuously stores, by date, the sentiment evaluation data of the stock group calculated by the data sentiment evaluation unit 152 and passes the stored sentiment evaluation data to the first and second moving average calculation units 212 and 214 of the sentiment-based index unit 200 (S850).

Next, referring to FIG. 9, the first moving average calculation unit 212 calculates a first moving average of the per-date sentiment evaluation data belonging to the first period, out of the sentiment evaluation data of the stock group stored by date in the evaluation data storage unit 165, and generates first moving average cumulative data consisting of the first moving average before the date update and the first moving average after the date update (S910).

The first moving average is calculated to identify the short-term trend. As shown in FIG. 10, for example, it is initially the sum of the sentiment evaluation data falling within the first period divided by the total number of days in the first period. The first period may be set to a length suited to identifying short-term trends, for example 12 business days.

In addition, as shown in FIG. 10, when the date is updated, the first moving average is the average of the sentiment evaluation data over the first period ending on the updated date. As shown in FIG. 11, the first moving average cumulative data 1110 can therefore be generated as a line formed by successively accumulating the first moving averages before and after each update.

Next, the second moving average calculation unit 214 calculates a second moving average over the by-date sentiment evaluation data that belongs to a second period longer than the first period, out of the stock group's sentiment evaluation data stored by date in the evaluation data storage unit 165, and generates second moving average accumulated data composed of the second moving average before the update and the second moving average after the update (S920).

The second moving average is calculated to identify the medium-term trend. For example, as shown in FIG. 10, it is initially the sum of the sentiment evaluation data in the second period divided by the total number of days in the second period. The second period may be, for example, 26 business days.

In addition, like the first moving average and as shown in FIG. 10, when the date is updated, the second moving average is the average of the sentiment evaluation data in the second period ending on the updated date. Accordingly, as shown in FIG. 11, the second moving average accumulated data 1120 can be generated as a line in which the second moving averages before and after each update are accumulated continuously.
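The two accumulated moving-average lines can be illustrated with a minimal Python sketch. It assumes the daily sentiment evaluation data is already available as one number per business day; the function name `moving_average_series` and the sample values are hypothetical and do not come from the specification.

```python
from typing import List, Optional


def moving_average_series(scores: List[float], window: int) -> List[Optional[float]]:
    """Trailing simple moving average for each day.

    Entry i is the mean of the `window` scores ending on day i, or None while
    fewer than `window` days are available. Appending one value per date
    update yields the "accumulated" line described in the text.
    """
    series: List[Optional[float]] = []
    running_sum = 0.0
    for i, score in enumerate(scores):
        running_sum += score
        if i >= window:
            # Drop the day that just left the window.
            running_sum -= scores[i - window]
        series.append(running_sum / window if i >= window - 1 else None)
    return series


# Made-up daily sentiment evaluation scores (illustrative only).
daily_scores = [float(x) for x in range(1, 31)]
first_ma = moving_average_series(daily_scores, 12)   # first period: 12 business days
second_ma = moving_average_series(daily_scores, 26)  # second period: 26 business days
```

Each date update appends one new score, so both lines grow by one point per business day.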

Then, the cross analysis unit 216 receives the first and second moving average accumulated data 1110 and 1120 from the moving average calculation units 212 and 214. As shown in FIG. 11, when the first moving average accumulated data 1110 crosses upward above the second moving average accumulated data 1120 (a golden cross 1130), the cross analysis unit 216 determines that the short-term trend of the stock price index related to the stock group is rising; when the first moving average accumulated data 1110 falls below the second moving average accumulated data 1120 (a dead cross 1140), it determines that the short-term trend of the stock price index is falling (S930).
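The golden cross and dead cross determination can be sketched as follows. This is an illustrative sketch, not the implementation of the cross analysis unit 216: the function name `detect_crosses` is hypothetical, and treating a day on which the two lines are exactly equal as "not yet crossed" is an assumption.

```python
from typing import List, Optional, Tuple


def detect_crosses(
    short_ma: List[Optional[float]], long_ma: List[Optional[float]]
) -> List[Tuple[int, str]]:
    """Return (day_index, kind) pairs, where kind is "golden" or "dead".

    A golden cross is recorded on the day the short-period line moves from
    at-or-below the long-period line to strictly above it; a dead cross is
    the mirror case. Days without both averages are skipped.
    """
    events: List[Tuple[int, str]] = []
    for i in range(1, min(len(short_ma), len(long_ma))):
        prev_s, prev_l = short_ma[i - 1], long_ma[i - 1]
        cur_s, cur_l = short_ma[i], long_ma[i]
        if prev_s is None or prev_l is None or cur_s is None or cur_l is None:
            continue
        if prev_s <= prev_l and cur_s > cur_l:
            events.append((i, "golden"))  # short-term trend judged rising
        elif prev_s >= prev_l and cur_s < cur_l:
            events.append((i, "dead"))    # short-term trend judged falling
    return events


# Made-up example lines (illustrative only).
example_events = detect_crosses([1.0, 2.0, 4.0, 3.0, 1.0], [2.0, 2.0, 3.0, 3.0, 2.0])
```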

Accordingly, by calculating and comparing moving average accumulated data over different periods of the sentiment evaluation data, the short-term trend of the stock price index related to the stock group can be predicted, which provides a leading indicator of trend changes in the stock group.

Next, when the cross analysis unit 216 identifies a golden cross 1130 and the short-term trend of the stock price index related to the stock group is therefore predicted to rise, the market trend prediction unit 217 receives the trading volume of the stock group from the stock market indicator data 30 and, if the trading volume is confirmed to be increasing, can predict that the stock group is turning into a bull market (S940). Likewise, when the cross analysis unit 216 identifies a dead cross 1140 and the short-term trend of the stock price index related to the stock group is therefore predicted to fall, the market trend prediction unit 217 receives the trading volume of the stock group from the stock market indicator data 30 and, if the trading volume is confirmed to be decreasing, can predict that the stock group is turning into a bear market (S940).

In this way, when the short-term trend is identified as rising or falling and the trading volume related to the stock group, obtained from the stock market indicator data, is confirmed to be increasing or decreasing, it can be predicted that the market for the stock group is turning into a bull market or a bear market, respectively.
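The volume confirmation step can be expressed as a small decision function. The sketch below is illustrative only; the specification does not say how an "increase" or "decrease" in trading volume is measured, so comparing the current volume against a single reference volume (for example, the previous day's) is an assumption, and the name `predict_market_turn` is hypothetical.

```python
from typing import Optional


def predict_market_turn(
    cross: Optional[str], volume_now: float, volume_reference: float
) -> Optional[str]:
    """Combine the cross result with a volume check.

    cross is "golden", "dead", or None (no cross). Returns "bull", "bear",
    or None when the volume does not confirm the short-term trend.
    """
    if cross == "golden" and volume_now > volume_reference:
        return "bull"   # rising trend confirmed by increasing volume
    if cross == "dead" and volume_now < volume_reference:
        return "bear"   # falling trend confirmed by decreasing volume
    return None
```

A golden cross on falling volume, or a dead cross on rising volume, returns no prediction in this sketch.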

Next, the trend indicator unit 218 generates data that presents, as indicators, the first and second moving average accumulated data, the short-term trend determination for the stock price index related to the stock group, and the bull/bear market prediction for the stock group, and transmits the resulting data to the display unit 190 (S950).

Referring again to FIG. 8, while the data sentiment evaluation unit 152 performs the sentiment evaluation, the keyword analysis unit 154 can perform statistical analysis on the keywords analyzed by the morphological analysis unit 140, such as the number of items collected per period and correlation analysis between keywords, and provide the results to the display unit 190. In addition, the keyword analysis unit 154 adds to the keyword database 120 any analyzed keywords not yet registered there, and an administrator can store in the sentiment dictionary database 160 those new keywords that should be reflected in the sentiment evaluation.

Next, the correlation analysis/decision unit 170 can generate analysis data from the correlation between the stock market indicator data and economic indicator data, together with sentiment evaluation data selected under a predetermined condition from the accumulated sentiment evaluation data.

The process of selecting sentiment evaluation data under the predetermined condition is described with reference to FIGS. 7 and 16. FIG. 16 is a flowchart showing how the collection period of the sentiment evaluation data and the delay period are determined and how the evaluation data is selected.

Based on the correlation analysis results of the first correlation table unit 172, which stores the past correlation between the stock market indicator data 30 and the sentiment evaluation data accumulated daily for each stock group in the evaluation data storage unit 165, the evaluation data collection period determination unit 173 determines the collection period of the evaluation data that affects the price of an individual stock (S852).

Next, based on the past correlation in the first correlation table unit 172, the delay period determination unit 175 determines the delay period that elapses before the sentiment evaluation data is reflected in the price of an individual stock (S854).

Subsequently, the evaluation data selection unit 174 selects, from the evaluation data accumulated in the evaluation data storage unit 165, the sentiment evaluation data that falls within the collection period (S856). The correlation analysis/decision unit 170 then provides the selected sentiment evaluation data and the delay period to the stock price prediction unit 180 (S858).

Referring again to FIG. 8, the stock price prediction unit 180 predicts the stock price index of the stock group based on the selected sentiment evaluation data and the delay period received from the correlation analysis/decision unit 170 and on the analysis data generated by the second correlation table unit 177 (S860).

Meanwhile, the first correlation table unit 172 determines the rising or falling trend of an individual stock from the sentiment impact index generated by the sentiment impact index unit 222. When the determined trend disagrees with the trend of the stock group's actual stock price index, it can display on the display unit 190, together with the predicted stock price index, a prediction that a trend reversal will occur in the stock group's actual stock price index.

Hereinafter, with reference to FIGS. 1, 8, and 12 to 15, the stock price index prediction method and the processing performed by the turning prediction unit to implement the method of determining the trend and turning point of a stock price index according to this embodiment are described in detail.

FIG. 12 is a flowchart showing the processing performed by the turning prediction unit to implement a method of determining the trend and turning point of a stock price index according to another embodiment of the present invention.

In FIG. 8, the evaluation data storage unit 165 continuously stores, by date, the sentiment evaluation data of the stock group calculated by the data sentiment evaluation unit 152, and passes the stored sentiment evaluation data to the sentiment impact index unit 222 of the sentiment-based index unit 200 (S850).

Subsequently, for the by-date sentiment evaluation data that belongs to a third period, out of the stock group's sentiment evaluation data stored by date in the evaluation data storage unit 165, the sentiment impact index unit 222 can calculate, on a per-date basis, a rising score aggregated from positive evaluations and a falling score aggregated from negative evaluations. Specifically, as shown in FIG. 14A, after each keyword is labeled positive or negative, the scores of the keywords evaluated as positive during the third period are aggregated by date to calculate the rising score 1412, and the scores of the keywords evaluated as negative during the third period are aggregated by date to calculate the falling score 1414. The third period used in this embodiment is independent of the first and second periods and may be, for example, 20 business days, so that it suits a market whose fluctuations are sluggish.

After the rising score 1412 and the falling score 1414 are aggregated by date, the sentiment impact index unit 222 can, as shown in FIG. 14B, calculate the difference between the rising score 1412 and the falling score 1414 for each date and derive from it the daily actual increase (r) and the daily actual decrease (d) (S1210). FIG. 14B illustrates the process of calculating the daily actual increase and the daily actual decrease.

Subsequently, the sentiment impact index unit 222 can calculate the average of the daily actual increases (r) and the average of the daily actual decreases (d) over the aggregation period (S1220).

Next, the sentiment impact index unit 222 can calculate the sentiment impact index of the stock group by entering the ratio of the average of the daily actual increases (r) to the average of the daily actual decreases (d) into Equation 1 below (S1230). The closer the sentiment impact index obtained from Equation 1 is to 100, the more the trend pattern of the individual stock is rising; conversely, a value far from 100 indicates that the trend pattern of the individual stock is falling.

[Equation 1]

Sentiment impact index = 100 - (100 / (1 + ES_1))

(where ES_1 (Effective Score) = (average of the daily actual increases in the third period) / (average of the daily actual decreases in the third period))
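As a worked illustration of Equation 1, note that 100 - 100/(1 + ES) equals 100 x ES/(1 + ES), so the index is 50 when the two averages are equal and approaches 100 as increases dominate. The Python sketch below evaluates the formula from two already computed averages. The function name is hypothetical, and the handling of a zero average decrease (where ES is undefined) is an assumption that the specification does not address.

```python
def sentiment_impact_index(avg_increase: float, avg_decrease: float) -> float:
    """Evaluate 100 - 100 / (1 + ES), with ES = avg_increase / avg_decrease.

    Both arguments are non-negative magnitudes (average daily increase and
    average daily decrease over the third period).
    """
    if avg_increase < 0 or avg_decrease < 0:
        raise ValueError("averages must be non-negative magnitudes")
    if avg_decrease == 0:
        # Assumption: ES is undefined here. Use the limiting value 100 when
        # there were only increases, and the midpoint 50 when nothing moved.
        return 100.0 if avg_increase > 0 else 50.0
    es = avg_increase / avg_decrease
    return 100.0 - 100.0 / (1.0 + es)
```

For example, an average increase three times the average decrease gives ES = 3 and an index of 75.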

At the same time, as with the first and second moving averages, when the date is updated the sentiment impact index unit 222 can calculate, by Equation 1, the value for the third period ending on the updated date. Accordingly, the sentiment impact index unit 222 continuously accumulates the sentiment impact indexes before and after each update to generate index accumulated data, as shown in FIG. 15 (S1230). FIG. 15 shows the index accumulated data, the market condition of the stock group, and the upward or downward turn of the stock group's stock price index as displayed on the display unit.

Next, the situation analysis unit 224 receives the sentiment impact index for the third period generated by the sentiment impact index unit 222. When the sentiment impact index is at or above the overheating index, it determines that the market condition of the stock group is overheated (see 1510 in FIG. 15); when the sentiment impact index is at or below the stagnation index, it can determine that the market condition of the stock group is stagnant (S1240). As shown in FIG. 15, the overheating index and the stagnation index may be 70 and 30, respectively.

Then, when the index accumulated data, another value derived by the sentiment impact index unit 222, crosses upward above the turning index of 50 shown in FIG. 15, the situation analysis unit 224 determines that the stock price index of the stock group has turned upward (see 1520 in FIG. 15); when the index accumulated data falls below the turning index of 50, it can determine that the stock price index of the stock group has turned downward (see 1530 in FIG. 15) (S1240).
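The two determinations made by the situation analysis unit 224 can be sketched as follows, using the example thresholds from the text (70, 30, and 50). The function names and the "neutral" label for values between the thresholds are hypothetical, and treating a value exactly equal to 50 as "not yet crossed" is an assumption.

```python
from typing import List, Optional, Tuple


def classify_market_condition(
    index: float, overheating: float = 70.0, stagnation: float = 30.0
) -> str:
    """Label one sentiment impact index value."""
    if index >= overheating:
        return "overheated"
    if index <= stagnation:
        return "stagnant"
    return "neutral"  # hypothetical label for values between the thresholds


def detect_index_turns(
    index_series: List[Optional[float]], turning: float = 50.0
) -> List[Tuple[int, str]]:
    """Return (day_index, "up" | "down") where the accumulated index crosses
    the turning index. Days without an index value are skipped."""
    events: List[Tuple[int, str]] = []
    for i in range(1, len(index_series)):
        prev, cur = index_series[i - 1], index_series[i]
        if prev is None or cur is None:
            continue
        if prev <= turning < cur:
            events.append((i, "up"))    # upward turn of the stock price index
        elif prev >= turning > cur:
            events.append((i, "down"))  # downward turn of the stock price index
    return events
```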

Subsequently, the turning indicator unit 226 generates data that presents, as indicators, the index accumulated data, the determination of the stock group's market condition, and the prediction of the stock group's upward or downward turn, and transmits the resulting data to the display unit 190 (S1260).

In the embodiment described above, the averages used to calculate the sentiment impact index include all of the daily actual increases (r) and daily actual decreases (d) in the third period. As a variant, however, the sentiment impact index unit 222 can select, according to a specific condition, which daily actual increases (r) and daily actual decreases (d) are used to compute the averages.

To elaborate, before the average calculation (S1220), the sentiment impact index unit 222 can determine whether each daily actual increase (r) rose by no more than a threshold ratio relative to the immediately preceding date, or whether each daily actual decrease (d) fell by no more than the threshold ratio relative to the immediately preceding date. The threshold ratio is the day-over-day rate of rise or fall that triggers a trading halt for an individual stock in the stock market, and may be, for example, 15% relative to the previous day.

After this determination, when the sentiment impact index unit 222 performs the average calculation (S1220), it can compute the averages using only the daily actual increases (r) and daily actual decreases (d) that rose or fell within the threshold ratio. The process then continues with step S1230 described above to generate the sentiment impact index and the index accumulated data.
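One possible reading of this selection step is sketched below. The specification does not spell out exactly how the day-over-day ratio is formed, so the sketch assumes it is the relative change of each daily value against the previous day's value. The function name, the choice to keep the first day (which has no predecessor), and the choice to drop a day whose predecessor is zero are all assumptions made for illustration.

```python
from typing import List


def select_within_threshold(daily_values: List[float], threshold: float = 0.15) -> List[float]:
    """Keep only the daily values whose relative change from the previous day
    is at most `threshold` (15% in the example given in the text)."""
    kept: List[float] = []
    for i, current in enumerate(daily_values):
        if i == 0:
            kept.append(current)  # assumption: no previous day, so keep
            continue
        previous = daily_values[i - 1]
        if previous == 0:
            continue  # assumption: ratio undefined, so exclude this day
        if abs(current - previous) / abs(previous) <= threshold:
            kept.append(current)
    return kept


# Made-up daily values (illustrative only): the jump from 110 to 140 is about
# 27%, so 140 is excluded before averaging.
selected = select_within_threshold([100.0, 110.0, 140.0, 150.0])
```

The averages of step S1220 would then be taken over the selected values only.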

The embodiment of FIGS. 12 and 14B calculates the sentiment impact index using the ratio of the daily actual increase (r) to the daily actual decrease (d), both of which are based on the difference 1416 between the rising score 1412 and the falling score 1414 calculated for each day. The other embodiment, shown in FIGS. 13 and 14A, calculates the sentiment impact index using the daily net increase (u) and the daily net decrease (f), which are based on the rising score 1412.

FIG. 13 is a flowchart showing the processing performed by the turning prediction unit to implement a method of determining the trend and turning point of a stock price index according to yet another embodiment of the present invention. FIG. 14A illustrates the process of calculating the daily net increase and the daily net decrease.

The embodiment of FIGS. 13 and 14A is substantially the same as that of FIGS. 12 and 14B with respect to steps S1240, S1250, and S1260, so only the differences are described.

The sentiment impact index unit 222 can calculate the daily net increase (u) and the daily net decrease (f) of the rising score 1412 during the third period (S1210a).

Subsequently, the sentiment impact index unit 222 can calculate the average of the daily net increases (u) and the average of the daily net decreases (f) in the third period (S1220a).

Next, the sentiment impact index unit 222 can calculate the sentiment impact index by entering the ratio of the average of the daily net increases (u) to the average of the daily net decreases (f) into Equation 2 below (S1230a). The closer the sentiment impact index obtained from Equation 2 is to 100, the more the trend pattern of the individual stock is rising; conversely, a value far from 100 indicates that the trend pattern of the individual stock is falling.

[Equation 2]

Sentiment impact index = 100 - (100 / (1 + ES_2))

(where ES_2 = (average of the daily net increases in the third period) / (average of the daily net decreases in the third period))

At the same time, as in the description of FIG. 12, when the date is updated the sentiment impact index unit 222 can calculate, by Equation 2, the value for the third period ending on the updated date. Accordingly, the sentiment impact index unit 222 continuously accumulates the sentiment impact indexes before and after each update to generate index accumulated data, as shown in FIG. 15 (S1230a).
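The accumulation of the Equation 2 index over successive date updates can be sketched as below. Several points are assumptions rather than statements from the specification: the daily net increase (u) and net decrease (f) are taken to be the positive and negative day-over-day changes of the rising score; both averages divide by the full window length (as in a conventional relative-strength calculation); and a window with no decreases is mapped to 100, or to 50 when nothing moved. The function name and sample data are hypothetical.

```python
from typing import List, Optional


def impact_index_series(rising_scores: List[float], window: int = 20) -> List[Optional[float]]:
    """Index value per day from day-over-day changes of the rising score.

    Entry i uses the `window` daily changes ending on day i and is None while
    fewer than `window` changes are available.
    """
    changes = [b - a for a, b in zip(rising_scores, rising_scores[1:])]
    series: List[Optional[float]] = [None] * len(rising_scores)
    for day in range(window, len(rising_scores)):
        recent = changes[day - window:day]                    # changes for the window ending on `day`
        avg_u = sum(c for c in recent if c > 0) / window      # average daily net increase
        avg_f = sum(-c for c in recent if c < 0) / window     # average daily net decrease
        if avg_f == 0:
            # Assumption: ES_2 is undefined; use 100 for pure increases, 50 for no movement.
            series[day] = 100.0 if avg_u > 0 else 50.0
        else:
            es_2 = avg_u / avg_f
            series[day] = 100.0 - 100.0 / (1.0 + es_2)
    return series


# Made-up rising scores with a short window so the arithmetic is easy to follow.
example_index = impact_index_series([10.0, 12.0, 14.0, 16.0, 15.0, 14.0], window=4)
```

With the sample data, the last two entries correspond to ES_2 = 6 and ES_2 = 2, that is, index values of about 85.7 and 66.7.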

The embodiments described with reference to FIGS. 12 to 15 generate the sentiment impact index by entering into Equation 1 or 2 the ratio of the average increase to the average decrease, based either on the difference 1416 between the rising score 1412 and the falling score 1414 or on the rising score 1412 alone. However, the way the increases and decreases are obtained is not limited to these embodiments; they can be obtained from various combinations of the rising score 1412 and the falling score 1414. For example, the averages of the increases and decreases based on the falling score 1414 alone may be used, or the averages of the increases and decreases of the rising score 1412 and those of the falling score 1414 may be calculated separately and then combined to obtain the ratio of increase to decrease.

According to the embodiments described with reference to FIGS. 12 to 15, the sentiment evaluation data of the stock group is analyzed, and the sentiment impact index is calculated as a turning point indicator and compared with the overheating or stagnation index. This makes it possible to assess the market condition of the stock group and to identify more accurately when the stock price index will reverse. In addition, comparison with the turning index, which corresponds to the midpoint of the sentiment impact index, makes it possible to predict when the stock price index of the stock group will rise or fall.

The components of the stock price prediction system 100 shown in FIG. 1, which includes the system for determining the trend and turning point of a stock price index, and the methods shown in FIGS. 8, 9, 12, and 13 for determining the short-term trend of the stock group and predicting its market condition, can be recorded on a computer-readable recording medium in the form of a program that implements their functions. Here, a computer-readable recording medium is a recording medium that stores information such as data and programs by electrical, magnetic, optical, mechanical, or chemical action and that can be read by a computer. Examples of such recording media that are removable from the computer include flexible disks, magneto-optical disks, CD-ROMs, CD-R/Ws, DVDs, DATs, and memory cards. Recording media fixed in the computer include hard disks and ROMs.

Although the present invention has been described in detail above through representative embodiments, those of ordinary skill in the art to which the invention pertains will understand that various modifications can be made to the embodiments described above without departing from the scope of the invention. Therefore, the scope of rights of the present invention should not be limited to the described embodiments, but should be determined by the claims below and by all changes or modifications derived from concepts equivalent to those claims.

100: stock price prediction system 110: document collection/extraction unit
120: keyword database 130: document storage unit
140: morphological analysis unit 150: data analysis unit
160: sentiment dictionary database 165: evaluation data storage unit
170: correlation analysis/decision unit 180: stock price prediction unit
190: display unit 200: sentiment-based index unit

Claims

A method of determining a trend and a turning point of a stock price index using sentiment-based indexes according to an analysis of social data, the method comprising:
Collecting, when a date update occurs, a plurality of documents related to a stock group including at least one stock from social media data and stock-market-related web data during a first period up to the updated date and a second period longer than the first period, and generating and storing, by date, sentiment evaluation data for all of the plurality of documents;
Calculating a first moving average of the by-date sentiment evaluation data belonging to the first period, and generating first moving average accumulated data composed of the first moving average before the update and the first moving average after the update;
Calculating a second moving average of the by-date sentiment evaluation data belonging to the second period, and generating second moving average accumulated data composed of the second moving average before the update and the second moving average after the update; and
Determining that the short-term trend of the stock price index related to the stock group is rising when the first moving average accumulated data crosses upward above the second moving average accumulated data, and determining that the short-term trend of the stock price index is falling when the first moving average accumulated data falls below the second moving average accumulated data.

The method according to claim 1,
Wherein, when the short-term trend of the stock price index is determined to be rising, the stock group is predicted to turn into a bull market if the trading volume related to the stock group increases, and when the short-term trend of the stock price index is determined to be falling, the stock group is predicted to turn into a bear market if the trading volume decreases.

The method according to claim 1,
Wherein the stock price index is the stock price of the stock when the stock group includes one stock, and, when the stock group includes a plurality of stocks, the stock price index is an index computed based on the market capitalization of those stocks.

The method according to claim 1,
Wherein the generating and storing of the sentiment evaluation data comprises:
Collecting, when the date update occurs, a plurality of documents related to the stock group from the social media data and the stock-market-related web data during the first period and the second period;
Analyzing morphemes of the plurality of documents; and
Evaluating the sentiment of the stock group by date over all of the plurality of documents by evaluating each keyword extracted from the analyzed morphemes as either positive or negative.

The method according to claim 4,
Wherein the collecting of the plurality of documents further comprises collecting, when the date update occurs, a plurality of documents related to the stock group from the social media data and the stock-market-related web data during a third period, and wherein the method further comprises, after the step of evaluating the sentiment of the stock group by date over all of the plurality of documents:
Calculating a rising score aggregated from the positive evaluations and a falling score aggregated from the negative evaluations;
Generating a sentiment impact index, as a turning point indicator of the stock price index of the stock group, based on the ratio of the average of the daily increases to the average of the daily decreases in at least one of the rising score and the falling score calculated during the third period; and
Determining that the market condition of the stock group is overheated when the sentiment impact index is at or above an overheating index, and determining that the market condition of the stock group is stagnant when the sentiment impact index is at or below a stagnation index.

The method according to claim 5,
Wherein the overheating index is 70 and the stagnation index is 30.

The method according to claim 5,
Wherein the generating of the sentiment impact index further comprises generating index accumulated data composed of the sentiment impact index before the update and the sentiment impact index after the update, and
Wherein the method further comprises, after the step of determining that the market condition of the stock group is overheated or stagnant,
Determining that the stock price index of the stock group has turned upward when the index accumulated data crosses upward above a turning index of 50, and determining that the stock price index of the stock group has turned downward when the index accumulated data falls below the turning index of 50.

6. The method of claim 5,
Wherein the daily increment and the daily decrement of the score are a daily net increment and a daily net decrement based on the difference between the rising score and the falling score calculated for each day,
Wherein generating the emotion influence index comprises:
Calculating the difference between the rising score and the falling score for each date in the third period, and calculating the daily net increment and the daily net decrement from that difference;
Calculating the average value of the daily net increment and the average value of the daily net decrement over the third period; and
Calculating the emotion influence index by substituting the ratio of the average value of the daily net increment to the average value of the daily net decrement into Equation 1 below.
[Equation 1]
Emotion Influence Index = 100 - (100 / (1 + ES_1))
(where ES_1 (Effective Score) = (average value of the daily net increment in the third period) / (average value of the daily net decrement in the third period))
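The calculation in Equation 1 can be sketched as follows. This is an illustrative reading, not the claimed implementation: the function names are hypothetical, the averages are taken over all day-to-day changes in the period (as in a conventional relative-strength calculation), and the return values used when there is no decrement at all are assumptions, since the claim does not cover that case.

```python
def index_from_daily_values(values):
    """Compute 100 - 100 / (1 + ES) from a date-ordered list of daily values."""
    changes = [b - a for a, b in zip(values, values[1:])]
    if not changes:
        raise ValueError("need at least two daily values")
    n = len(changes)
    avg_increment = sum(c for c in changes if c > 0) / n
    avg_decrement = sum(-c for c in changes if c < 0) / n
    if avg_decrement == 0:
        # Assumption: ES is undefined here; use 100 if anything rose, else 50.
        return 100.0 if avg_increment > 0 else 50.0
    es = avg_increment / avg_decrement
    return 100 - 100 / (1 + es)


def emotion_influence_index(rising_scores, falling_scores):
    """Equation 1: apply the index to the daily (rising - falling) difference."""
    net = [r - f for r, f in zip(rising_scores, falling_scores)]
    return index_from_daily_values(net)
```

With hypothetical rising scores 10, 12, 11, 15 and falling scores 5, 4, 6, 5, the daily differences are 5, 8, 5, 10, so ES_1 = 8/3 and the index is about 72.7.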

6. The method of claim 5,
Wherein the daily increment and the daily decrement of the score are a daily increment and a daily decrement based on the rising score calculated for each day,
Wherein generating the emotion influence index comprises:
Calculating the daily increment and the daily decrement of the rising score during the third period;
Calculating the average value of the daily increment and the average value of the daily decrement over the third period; and
Calculating the emotion influence index by substituting the ratio of the average value of the daily increment to the average value of the daily decrement into Equation 2 below.
[Equation 2]
Emotion Influence Index = 100 - (100 / (1 + ES_2))
(where ES_2 = (average value of the daily increment in the third period) / (average value of the daily decrement in the third period))
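As a worked illustration of Equation 2, assume hypothetical rising scores of 10, 12, 11 and 15 on four consecutive days, and assume the averages are taken over all three day-to-day changes (+2, -1, +4). Neither the figures nor the averaging convention comes from the claims.

```latex
\begin{aligned}
\text{average daily increment} &= \frac{2 + 0 + 4}{3} = 2, \qquad
\text{average daily decrement} = \frac{0 + 1 + 0}{3} = \frac{1}{3},\\
ES_2 &= \frac{2}{1/3} = 6,\\
\text{Emotion Influence Index} &= 100 - \frac{100}{1 + 6} \approx 85.7 .
\end{aligned}
```

Under the thresholds of 70 and 30, a value of about 85.7 would fall in the overheated range.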

A system for determining a trend and a turning point of a stock price index using a sentiment-based index according to an analysis of social data, the system comprising:
An evaluation data storage unit that, when the date is updated, stores by date emotional evaluation data for all of a plurality of documents related to a stock group, the documents being collected from social media data and stock-market-related web data during a first period ending on the updated date and during a second period longer than the first period;
A first moving average calculating unit that calculates a first moving average of the evaluation data by date belonging to the first period and generates first moving average cumulative data composed of the first moving average before the update and the first moving average after the update;
A second moving average calculating unit that calculates a second moving average of the evaluation data by date belonging to the second period and generates second moving average cumulative data composed of the second moving average before the update and the second moving average after the update; and
A cross analysis unit that determines the short-term trend of the stock price index related to the stock group to be a rise when the first moving average cumulative data crosses above the second moving average cumulative data, and to be a fall when the first moving average cumulative data crosses below the second moving average cumulative data.
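The interplay of the two moving average calculating units and the cross analysis unit can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `cross_signals`, the use of a simple arithmetic mean, and the handling of exact equality between the two averages are choices made here and are not specified in the claim.

```python
def cross_signals(daily_scores, short_window, long_window):
    """Return (position, trend) wherever the short average crosses the long one.

    `daily_scores` is a date-ordered list of emotional evaluation values, and
    `position` is the index in `daily_scores` of the day of the crossing.
    """
    if not 0 < short_window < long_window:
        raise ValueError("short_window must be positive and below long_window")
    diffs = []  # short average minus long average, one entry per update
    for end in range(long_window, len(daily_scores) + 1):
        short_ma = sum(daily_scores[end - short_window:end]) / short_window
        long_ma = sum(daily_scores[end - long_window:end]) / long_window
        diffs.append(short_ma - long_ma)
    signals = []
    for i in range(1, len(diffs)):
        position = i + long_window - 1
        if diffs[i - 1] <= 0 < diffs[i]:
            signals.append((position, "rise"))
        elif diffs[i - 1] >= 0 > diffs[i]:
            signals.append((position, "fall"))
    return signals
```

Each loop iteration corresponds to one date update: the averages computed before and after the update are accumulated, and only a change in their relative order produces a rise or fall determination.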