KR20200061144A

KR20200061144A - Stocks selection apparatus for constructing stock portfolio and method thereof

Info

Publication number: KR20200061144A
Application number: KR1020180146556A
Authority: KR
Inventors: 이기훈; 김동현; 조만재; 신희민; 김홍지
Original assignee: 광운대학교 산학협력단
Priority date: 2018-11-23
Filing date: 2018-11-23
Publication date: 2020-06-02
Also published as: KR102161256B1

Abstract

Disclosed are a stock item selection device for forming a stock portfolio and a method thereof. The stock item selection device can increase a rate of profits by collecting stock price data for each stock item, analyzing a correlation for each stock item from the collected stock price data, predicting stock prices for correlation-analyzed stock items by using a stock price prediction model generated by learning from the stock price data for the correlation-analyzed stock items, and comparing the predicted stock prices to select a stock item of which a stock price is expected to rise among the correlation-analyzed stock items.

Description

Stock item selection apparatus for constructing a stock portfolio and method thereof {STOCKS SELECTION APPARATUS FOR CONSTRUCTING STOCK PORTFOLIO AND METHOD THEREOF}

The present invention relates to a stock item selection device and method for constructing a stock portfolio using a deep learning technique.

The content described in this section merely provides background information on the present embodiment and does not constitute prior art.

Stock investment is one of the personal investment methods used by many people. When investing in stocks, an investor must assess the value of a stock and predict whether its price will rise or fall. To support such predictions, securities firms provide a variety of stock indicators, such as technical indicators and financial data.

A typical individual investor invests in stock items without a careful analysis of these stock indicators and, with a trading tendency closer to speculation than to investment, seeks to maximize profit by concentrating on a small number of stock items. Investing without analyzing stock indicators, however, raises the investment risk and ultimately causes large losses for individual investors.

Such losses can be avoided by portfolio investment, which spreads investment across many stock items to reduce the risk from volatility. However, conventional stock portfolio construction does not make use of the stock price data of individual stock items. In addition, the fundamental factors that affect stock prices are so diverse and complex that it is difficult to predict highly volatile stock prices using only the stock indicators described above.

An object of the present invention is to provide a stock item selection device for constructing a stock portfolio, and a method thereof, that raise the rate of return by applying deep-learning-based stock price prediction to a portfolio investment method based on stock network analysis. The device collects stock price data for each stock item, analyzes the correlation between stock items from the collected data, uses a stock price prediction model to predict the stock price on a prediction date from the stock price data of the correlation-analyzed stock items, and compares the predicted stock prices to select, from among the correlation-analyzed stock items, a stock item whose price is predicted to rise.

To achieve the above object, a stock item selection device for constructing a stock portfolio according to another embodiment of the present invention may include: a stock price data collection unit that collects, for each stock item, stock price data related to the stock price of that stock item; a correlation analysis unit that analyzes the correlation between the stock items using the collected stock price data; a stock price prediction unit that predicts the stock price of each correlation-analyzed stock item on a prediction date, using a stock price prediction model generated by learning from the stock price data collected for the correlation-analyzed stock items; and a stock item selection unit that compares the stock prices predicted for the correlation-analyzed stock items with one another and selects, from among them, a stock item whose price is predicted to rise.

To achieve the above object, a stock item selection method for constructing a stock portfolio, performed by a computing device according to an embodiment of the present invention, may include: collecting, for each stock item, stock price data related to the stock price of that stock item; analyzing the correlation between the stock items using the collected stock price data; predicting the stock price of each correlation-analyzed stock item on a prediction date, using a stock price prediction model generated by learning from the stock price data collected for the correlation-analyzed stock items; and comparing the stock prices predicted for the correlation-analyzed stock items with one another and selecting, from among them, a stock item whose price is predicted to rise.

To achieve another object of the present invention, a storage medium may be provided on which a computer-readable program is recorded for executing, on a computer, the method of selecting stock items for constructing a stock portfolio.

According to an embodiment of the present invention, investment can be diversified across stock items whose prices are predicted to rise, so that a stock portfolio with a lower investment risk can be constructed.

In addition, according to an embodiment of the present invention, stock prices are predicted only for the extracted stock items with low correlation, which can improve the speed of predicting stock prices with a deep learning technique.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is a block diagram schematically showing the configuration of a stock item selection device for constructing a stock portfolio according to an embodiment of the present invention.
FIG. 2 is a block diagram showing in detail the configuration of a correlation analysis unit according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining the correlation coefficients calculated between stock items according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining a stock network graph generated according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining a minimum spanning tree constructed according to an embodiment of the present invention.
FIG. 6 is a block diagram showing in detail the configuration of a stock item selection device for constructing a stock portfolio according to another embodiment of the present invention.
FIG. 7 is a diagram for explaining a method of predicting the stock price of a stock item according to an embodiment of the present invention.
FIG. 8 is a flowchart for explaining a stock item selection method for constructing a stock portfolio according to an embodiment of the present invention.
FIG. 9 is a flowchart for explaining in detail a method of analyzing the correlation between stock items according to an embodiment of the present invention.
FIG. 10 is a flowchart for explaining a stock item selection method for constructing a stock portfolio according to another embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. The present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only to make the disclosure of the present invention complete and to fully convey the scope of the invention to those of ordinary skill in the art to which the invention pertains, and the invention is defined only by the scope of the claims. Like reference numerals refer to like components throughout the specification.

Unless otherwise defined, all terms used in this specification (including technical and scientific terms) have the meanings commonly understood by those of ordinary skill in the art to which the present invention pertains. Terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are clearly and specifically defined.

In this specification, terms such as "learning" are not intended to refer to a mental activity such as human education. They are to be interpreted as referring to performing machine learning through computing according to a procedure.

In this specification, terms such as "first" and "second" are used to distinguish one component from another, and the scope of rights is not to be limited by these terms. For example, a first component may be named a second component, and similarly a second component may be named a first component.

In this specification, identification symbols for steps (for example, a, b, c) are used for convenience of description and do not describe the order of the steps. Unless a specific order is clearly stated in context, the steps may occur in an order different from the one specified. That is, the steps may occur in the specified order, may be performed substantially simultaneously, or may be performed in the reverse order.

In this specification, expressions such as "have", "may have", "include", or "may include" indicate the presence of the corresponding feature (for example, a numerical value, function, operation, or component such as a part) and do not exclude the presence of additional features.

FIG. 1 is a block diagram schematically showing the configuration of a stock item selection device for constructing a stock portfolio according to an embodiment of the present invention.

Referring to FIG. 1, a stock item selection device 100 for constructing a stock portfolio according to an embodiment of the present invention may include a stock price data collection unit 110, a correlation analysis unit 120, a stock price prediction unit 130, and a stock item selection unit 140. The stock item selection device 100 may omit some of the components shown by way of example in FIG. 1 or may additionally include other components.

The stock item selection device 100 according to an embodiment of the present invention may predict the stock price of each stock item and select the stock items whose prices are predicted to rise, in order to set up a stock portfolio for buying and selling the stock items to be traded.
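The way the four units of FIG. 1 hand data to one another can be sketched as a simple chain of callables. This is only an illustrative skeleton: the function name `run_selection`, the `PriceData` shape, and the callable signatures are assumptions introduced here and do not come from the specification.

```python
from typing import Callable, Dict, List

# Hypothetical data shape: ticker -> chronological list of prices.
PriceData = Dict[str, List[float]]


def run_selection(
    collect: Callable[[], PriceData],
    analyze: Callable[[PriceData], List[str]],
    predict: Callable[[PriceData, List[str]], Dict[str, float]],
    select: Callable[[PriceData, Dict[str, float]], List[str]],
) -> List[str]:
    """Chain the four units: collection (110), correlation analysis (120),
    price prediction (130), and stock item selection (140)."""
    data = collect()                          # unit 110: gather price data
    candidates = analyze(data)                # unit 120: correlation-analyzed items
    forecasts = predict(data, candidates)     # unit 130: predicted prices
    return select(data, forecasts)            # unit 140: pick the expected risers
```

Passing the units in as callables keeps the skeleton neutral about how each unit is actually implemented, which the specification leaves open at this point.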

The stock price data collection unit 110 according to an embodiment of the present invention may collect, for each stock item, stock price data related to the stock price of that stock item.

The stock price data may include the values commonly used for stock items: the opening price (open), high price (high), low price (low), closing price (close), adjusted closing price (adjusted close), and trading volume (volume).

The opening price is the price of the stock formed on the stock exchange at the market open. The high price is the highest price of the stock during that day's trading session, and the low price is the lowest price during that day's trading session. The closing price is the last price formed when that day's session ends. The adjusted closing price reflects an adjustment of past prices when an event at the issuing company, such as a capital increase or a stock split, changes the stock price. The trading volume is the number of shares traded (sold and bought) during a given period (hour, day, week, month, or year).

The stock price data according to an embodiment of the present invention may be the opening price, high price, low price, closing price, adjusted closing price, and trading volume of the 200 stock items in the KOSPI 200 for the most recent five years relative to the prediction date. This is only an example for describing an embodiment of the present invention, and the invention is not limited thereto.
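As a minimal sketch of how one day of such data might be held in memory, the record below carries the six fields named above. The class name `DailyBar`, the field names, and the `is_consistent` check are assumptions made for illustration. The specification does not prescribe a storage format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DailyBar:
    """One trading day of price data for one stock item (hypothetical record)."""
    ticker: str
    day: date
    open: float
    high: float
    low: float
    close: float
    adj_close: float
    volume: int


def is_consistent(bar: DailyBar) -> bool:
    """Sanity check: low and high must bracket open and close, volume >= 0."""
    return (
        bar.low <= min(bar.open, bar.close)
        and max(bar.open, bar.close) <= bar.high
        and bar.volume >= 0
    )
```

A check of this kind is a cheap way to reject malformed rows before they reach the correlation analysis.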

The correlation analysis unit 120 according to an embodiment of the present invention may analyze the correlation between the stock items using the stock price data collected by the stock price data collection unit 110.

Specifically, using the stock price data collected by the stock price data collection unit 110, the correlation analysis unit 120 may identify which stock items are highly correlated with one another and which stock items have low correlation with one another.

A specific method for analyzing the correlation between the stock items is described with reference to FIG. 2.

FIG. 2 is a block diagram showing in detail the configuration of a correlation analysis unit according to an embodiment of the present invention.

Referring to FIG. 2, the correlation analysis unit 120 according to an embodiment of the present invention may include a correlation coefficient calculation unit 121, a distance calculation unit 122, a stock network graph generation unit 123, a clustering unit 124, and a stock item extraction unit 125. The correlation analysis unit 120 may omit some of the components shown by way of example in FIG. 2 or may additionally include other components.

The correlation coefficient calculation unit 121 according to an embodiment of the present invention may use the returns of the stock price of each stock item to calculate a correlation coefficient that quantifies the correlation between stock items.

The correlation coefficient calculation unit 121 may calculate the log return of each stock item from its stock price data using Equation 1. That is, the correlation coefficient calculation unit 121 may calculate the log return of the stock price of each stock item and then use the calculated log returns to calculate the correlation coefficient between stock items.

In Equation 1, S_i denotes the stock price of stock item i, and G_i(t) denotes the log return of the stock price of stock item i at time t.

The correlation coefficient calculation unit 121 may use Equation 1 to calculate the daily log returns of each stock item over the most recent six months. This is only an example for describing an embodiment of the present invention, and the invention is not limited thereto.
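The log return computation can be sketched as follows. Equation 1 itself is not reproduced in this text, so the sketch assumes the common one-step definition G_i(t) = ln S_i(t+1) - ln S_i(t) over consecutive daily prices. The function name `log_returns` is hypothetical.

```python
import math


def log_returns(prices):
    """Return the one-step log returns ln S(t+1) - ln S(t) of a price series.

    Assumes the common definition of a log return; the exact form of
    Equation 1 is not visible in the text.
    """
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]
```

A series of n prices yields n - 1 returns, and the returns sum to the log of the overall price ratio, which is one reason log returns are convenient for this kind of analysis.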

The correlation coefficient calculation unit 121 according to an embodiment of the present invention may use Equation 2 to calculate the correlation coefficient C_i,j, which quantifies the correlation between stock item i and stock item j.

In Equation 2, i and j each denote a stock item, G_i denotes the log return of the stock price of stock item i, and <G_i> denotes the mean of G_i. The indices i and j may be the same or different.
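A correlation coefficient built from these quantities can be sketched as below. Equation 2 is not reproduced in this text, so the sketch assumes the standard Pearson form, C_i,j = (<G_i G_j> - <G_i><G_j>) / sqrt((<G_i^2> - <G_i>^2)(<G_j^2> - <G_j>^2)), which uses exactly the means <.> defined above. The function names are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean, written <x> in the text."""
    return sum(values) / len(values)


def correlation(gi, gj):
    """Pearson correlation of two equally long log-return series.

    Assumed form: (<GiGj> - <Gi><Gj>) / sqrt(var(Gi) * var(Gj)).
    """
    if len(gi) != len(gj) or len(gi) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mi, mj = mean(gi), mean(gj)
    cov = mean([a * b for a, b in zip(gi, gj)]) - mi * mj
    var_i = mean([a * a for a in gi]) - mi * mi
    var_j = mean([b * b for b in gj]) - mj * mj
    if var_i <= 0 or var_j <= 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_i * var_j)
```

With this form the coefficient is 1 when i and j are the same stock item and lies between -1 and 1 otherwise, which matches the range stated later in the description.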

The correlation coefficients calculated between stock items by the above method are described with reference to FIG. 3.

FIG. 3 is a diagram for explaining the correlation coefficients calculated between stock items according to an embodiment of the present invention.

Referring to FIG. 3, the correlation coefficient calculation unit 121 according to an embodiment of the present invention may calculate a correlation coefficient quantifying the correlation between each pair of the first to fifth stock items.

Specifically, analyzing the correlation coefficients calculated between the first to fifth stock items by the above method shows that the coefficient between the third and fourth stock items is 0.52236, so this pair has the highest correlation among the pairs. The coefficient between the second and fourth stock items is -0.0795, so this pair has the lowest correlation among the pairs.

Referring again to FIG. 2, the clustering unit 124 according to an embodiment of the present invention may cluster stock items with similar correlation into a preset number of stock item groups according to the calculated correlation coefficients.

The stock item extraction unit 125 according to an embodiment of the present invention may extract, from the stock items with similar correlation in each clustered stock item group, at least one stock item whose correlation coefficient is lower than a threshold as a candidate stock item for stock price prediction.

The correlation coefficient may take a value from -1 to 1, and the threshold may be 0. Accordingly, the stock item extraction unit 125 may extract, from the stock items with similar correlation in each clustered stock item group, at least one stock item whose correlation coefficient is less than 0 as a candidate stock item for stock price prediction. This is only an example for describing an embodiment of the present invention, and the invention is not limited thereto.
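The threshold-based extraction can be sketched as a simple filter over each group. The text does not state how a single coefficient is assigned to one stock item, so this sketch assumes, purely for illustration, that an item is scored by its mean correlation with the other members of its group. The function name `extract_candidates` and the `corr` lookup keyed by `frozenset` pairs are likewise hypothetical.

```python
def extract_candidates(groups, corr, threshold=0.0):
    """Pick, from each group, the items whose score is below the threshold.

    groups: list of lists of tickers (the clustered stock item groups).
    corr:   dict mapping frozenset({a, b}) to the pairwise coefficient.
    Score (an assumption): mean correlation with the rest of the group.
    """
    candidates = []
    for group in groups:
        for item in group:
            others = [o for o in group if o != item]
            if not others:
                continue  # a single-item group has no pairwise coefficient
            score = sum(corr[frozenset((item, o))] for o in others) / len(others)
            if score < threshold:
                candidates.append(item)
    return candidates
```

Other scoring rules, such as the minimum pairwise coefficient, would fit the same filter structure.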

In addition, the distance calculation unit 122 according to another embodiment of the present invention may use the correlation coefficients calculated between stock items by the correlation coefficient calculation unit 121 to calculate a distance that quantifies how far apart each pair of stock items is.

The distance calculation unit 122 may use Equation 3 to convert the correlation between stock items calculated by the correlation coefficient calculation unit 121 into a distance that quantifies the distance relationship between the stock items.

In Equation 3, when C_i,j takes a value between -1 and 1, the distance d_i,j between stock item i and stock item j takes a value between 0 and 2. Accordingly, the larger the correlation coefficient C_i,j between stock items i and j, the smaller the distance d_i,j between them, and the smaller C_i,j, the larger d_i,j.

Therefore, the higher the correlation between stock item i and stock item j, the closer the two stock items are, and the lower the correlation between them, the farther apart they are.
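The conversion from correlation to distance can be sketched as below. Equation 3 is not reproduced in this text. The sketch assumes d_i,j = sqrt(2 (1 - C_i,j)), a form widely used in stock network analysis that maps the range -1 to 1 onto 2 to 0 and decreases as the correlation grows, as the description requires. Other mappings, such as 1 - C_i,j, would satisfy the same stated properties.

```python
import math


def correlation_to_distance(c):
    """Map a correlation coefficient in [-1, 1] to a distance in [0, 2].

    Assumed form: sqrt(2 * (1 - c)); the patent's Equation 3 may differ.
    """
    if not -1.0 <= c <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return math.sqrt(2.0 * (1.0 - c))
```

For example, perfectly correlated items are at distance 0 and perfectly anti-correlated items are at distance 2.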

The correlation analysis unit 120 according to an embodiment of the present invention may analyze the degree of correlation between stock items using the distances calculated by the distance calculation unit 122.

Specifically, the correlation analysis unit 120 may determine that the correlation between two stock items is high if the distance calculated by the distance calculation unit 122 is short, and that the correlation is low if the distance is long.

The correlation analysis unit 120 may generate a stock network graph so that the distances between stock items calculated by the distance calculation unit 122 can be checked visually.

Specifically, the stock network graph generation unit 123 according to an embodiment of the present invention may generate, from the calculated distances, a stock network graph representing the distance relationships between the stock items.

The stock network graph generation unit 123 may generate the stock network graph based on the distance d_i,j between stock items i and j calculated by the above method.

FIG. 4 is a diagram for explaining a stock network graph generated according to an embodiment of the present invention.

Referring to FIG. 4, the stock network graph generation unit 123 according to an embodiment of the present invention may generate a stock network graph representing the correlation between each pair of 50 stock items.

Specifically, the stock network graph generation unit 123 may convert the correlations between the 50 stock items into distance relationships and generate a stock network graph representing the converted distance relationships.

That is, the stock network graph generation unit 123 may generate the stock network graph based on the distances d_i,j between the 50 stock items calculated with Equation 3.
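One simple in-memory form of such a stock network graph is a complete weighted graph whose edge weights are the pairwise distances. The sketch below is an assumption-laden illustration: the function name `build_stock_network`, the `corr` lookup keyed by `frozenset` pairs, and the distance form sqrt(2 (1 - C_i,j)) are not taken from the specification.

```python
import math
from itertools import combinations


def build_stock_network(tickers, corr):
    """Return a complete weighted graph as {ticker: {other: distance}}.

    corr maps frozenset({a, b}) to the pairwise correlation coefficient.
    The edge weight uses the assumed distance sqrt(2 * (1 - c)).
    """
    graph = {t: {} for t in tickers}
    for a, b in combinations(tickers, 2):
        d = math.sqrt(2.0 * (1.0 - corr[frozenset((a, b))]))
        graph[a][b] = d  # store both directions: the graph is undirected
        graph[b][a] = d
    return graph
```

For 50 stock items this yields 50 nodes and 1,225 undirected edges, which a plotting or graph library can then lay out for visual inspection.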

다시 도 2를 참조하면, 본 발명의 또 다른 일 실시 예에 따른 클러스터링부(124)는 주식 네트워크 그래프 생성부(123)에서 생성된 주식 네트워크 그래프로부터 각 주식 종목별 간 거리 관계에 따라 상관관계의 유사성을 주식 종목들을 미리 설정된 개수의 주식 종목 그룹으로 군집화할 수 있다.Referring to FIG. 2 again, the clustering unit 124 according to another embodiment of the present invention has similarity of correlation according to the distance relationship between each stock item from the stock network graph generated by the stock network graph generator 123 The stock stocks can be clustered into a preset number of stock stock groups.

본 발명의 일 실시 예에 따른 클러스터링부(124)는 모든 주식 종목별 간의 상관계수 값을 가지고 있는 상관계수 행렬에 군집화 알고리즘을 적용하여 주식 네트워크 그래프로부터 미리 설정된 개수의 주식 종목 그룹들로 군집화할 수 있다.The clustering unit 124 according to an embodiment of the present invention may cluster a predetermined number of stock item groups from a stock network graph by applying a clustering algorithm to a correlation coefficient matrix having a correlation coefficient value between all stock items. .

The K-means algorithm or the K-medoid algorithm may be applied as the clustering algorithm; however, these are merely examples for describing an embodiment of the present invention, and the invention is not limited thereto.

The K-means algorithm randomly sets as many centroid values as the preset number, and clusters the stock items into the preset number of stock item groups by repeating a process of assigning the stock items close to each centroid to its cluster and moving the randomly set centroid to the center of that cluster.

The K-medoid algorithm finds a preset number of clusters among a plurality of stock items by arbitrarily choosing a representative object (medoid) stock item for each of the preset number of stock item groups. Specifically, the K-medoid algorithm designates the preset number of representative stock items from among the plurality of stock items, and then assigns each remaining stock item to the representative stock item with which it has the highest similarity. Here, similarity follows the pairwise distance calculated by the distance calculation unit 122: the shorter the distance, the higher the similarity.

Next, a stock item other than the representative stock items is arbitrarily designated, and the total cost is calculated for the original representative stock item and for the arbitrarily designated stock item. The total cost is the sum, taken after the preset number of clusters has been formed, of the distances between each stock item assigned to a cluster and the center of that cluster.

The total cost is compared across all cases, and the representative stock items and clusters that yield the smallest total cost are finally selected. In this way, stock items that are similar according to the pairwise distance relationships in the stock network graph generated by the stock network graph generation unit 123 may be clustered into the preset number of stock item groups.
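
By way of a non-limiting illustration, the selection of representative stock items by minimum total cost can be sketched as follows. The sketch literally enumerates every candidate set of medoids, which is only practical for small inputs; the function names and the six-item distance matrix are hypothetical, and practical K-medoid implementations typically use iterative swaps instead of full enumeration.

```python
from itertools import combinations

def total_cost(dist, medoids):
    """Sum of each item's distance to its nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))

def k_medoids_exhaustive(dist, k):
    """Try every set of k medoids and keep the one with the lowest total cost."""
    n = len(dist)
    best = min(combinations(range(n), k), key=lambda ms: total_cost(dist, ms))
    clusters = {m: [] for m in best}
    for i in range(n):
        nearest = min(best, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return best, clusters

# Hypothetical pairwise distances: items 0-2 are close, items 3-5 are close.
dist = [
    [0.0, 0.3, 0.4, 1.5, 1.5, 1.5],
    [0.3, 0.0, 0.2, 1.5, 1.5, 1.5],
    [0.4, 0.2, 0.0, 1.5, 1.5, 1.5],
    [1.5, 1.5, 1.5, 0.0, 0.2, 0.4],
    [1.5, 1.5, 1.5, 0.2, 0.0, 0.3],
    [1.5, 1.5, 1.5, 0.4, 0.3, 0.0],
]
medoids, clusters = k_medoids_exhaustive(dist, k=2)
```

In this example, items 1 and 4 are selected as the representatives because each lies closest, in total, to the other members of its group.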

Specifically, the clustering unit 124 according to an embodiment of the present invention may set the representative stock item of each of the finally selected clusters as a center point, and may cluster into one stock item group those stock items whose distance to that center point, as calculated by the distance calculation unit 122, is less than 1.4. By repeating this process, the stock items may be clustered into the preset number of stock item groups.

Accordingly, the clustering unit 124 according to an embodiment of the present invention may apply the K-medoid algorithm to designate the preset number of representative stock items, form the preset number of tentative stock item groups in consideration of the distance-based similarity between the designated representative stock items and the remaining stock items, and compare the distances between the stock items assigned to each cluster and the center of that cluster. In this way, stock items that are similar according to the pairwise distance relationships in the stock network graph generated by the stock network graph generation unit 123 may be clustered into the preset number of stock item groups.

The stock item extraction unit 125 according to another embodiment of the present invention may construct a minimum spanning tree (MST) from the stock network graph, in which the correlation coefficient between each pair of stock items serves as the edge weight.

Specifically, the stock item extraction unit 125 according to an embodiment of the present invention may construct, from the stock network graph generated by the stock network graph generation unit 123 based on the pairwise distances calculated by the distance calculation unit 122, a minimum spanning tree that minimizes the correlation coefficients serving as the weights of the edges connecting the stock items.

Specifically, when each edge of the stock network graph is given a weight, namely the correlation coefficient between the two stock items it connects, the minimum spanning tree is the spanning tree with the smallest total weight among the possible spanning trees. The stock item extraction unit 125 according to an embodiment of the present invention may construct a minimum spanning tree that connects all stock items such that the total weight of the connecting edges is minimized and the edges form no cycle. That is, when constructing the minimum spanning tree from the stock network graph, the stock item extraction unit 125 may link the stock items corresponding to the nodes into chains such that no closed loop is formed.

The stock item extraction unit 125 according to an embodiment of the present invention may construct the minimum spanning tree from the stock network graph by applying Kruskal's algorithm or Prim's algorithm.

Kruskal's algorithm sorts the edges connecting the stock items in ascending order of weight, that is, of correlation coefficient, and selects the sorted edges one at a time in that order. If the two stock items joined by the selected edge are not yet connected, the edge is added, provided that doing so forms no closed loop. Repeating this process yields the minimum spanning tree.
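
By way of a non-limiting illustration, Kruskal's algorithm can be sketched with a union-find structure that detects whether an edge would close a loop. The function name, the four-node graph, and its weights are hypothetical; the weights may be whichever pairwise measure the embodiment assigns to the edges.

```python
def kruskal_mst(n, edges):
    """Build a minimum spanning tree over nodes 0..n-1.

    edges: list of (weight, u, v) tuples. Returns the chosen tree edges.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the set representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):      # ascending order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                   # endpoints not yet connected: no loop
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

# Hypothetical weighted graph over four stock items.
edges = [(0.4, 0, 1), (0.9, 0, 2), (0.5, 1, 2), (1.2, 1, 3), (0.7, 2, 3)]
mst = kruskal_mst(4, edges)
```

The edge (0.9, 0, 2) is skipped because nodes 0 and 2 are already connected through node 1 when it is considered.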

Prim's algorithm selects a reference stock item, and then, among all stock items that can be reached by an edge from the selected stock item, selects the one reached by the edge with the smallest correlation coefficient. A stock item that has already been selected cannot be selected again. Repeating this process yields the minimum spanning tree.
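
By way of a non-limiting illustration, Prim's algorithm can be sketched with a priority queue of candidate edges leaving the set of already selected stock items. The function name, the adjacency structure, and the weights are hypothetical.

```python
import heapq

def prim_mst(adj, start=0):
    """Grow a minimum spanning tree from a starting node.

    adj: {node: [(weight, neighbor), ...]}. Returns tree edges (weight, u, v).
    """
    visited = {start}
    heap = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(heap)
    mst = []
    while heap and len(visited) < len(adj):
        w, u, v = heapq.heappop(heap)   # cheapest edge leaving the tree
        if v in visited:                # already selected: cannot pick again
            continue
        visited.add(v)
        mst.append((w, u, v))
        for w2, nxt in adj[v]:
            if nxt not in visited:
                heapq.heappush(heap, (w2, v, nxt))
    return mst

# Hypothetical weighted graph over four stock items.
adj = {
    0: [(0.4, 1), (0.9, 2)],
    1: [(0.4, 0), (0.5, 2), (1.2, 3)],
    2: [(0.9, 0), (0.5, 1), (0.7, 3)],
    3: [(1.2, 1), (0.7, 2)],
}
mst = prim_mst(adj, start=0)
```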

The stock item extraction unit 125 according to an embodiment of the present invention may use the minimum spanning tree constructed as described above to extract, from among the stock items with similar correlations in each stock item group clustered by the clustering unit 124, at least one stock item whose correlation coefficient is below a threshold as a candidate stock item for stock price prediction.

The at least one stock item whose correlation coefficient is below the threshold may be a stock item of degree 1 in the minimum spanning tree constructed as described above, that is, a stock item with exactly one edge.
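
By way of a non-limiting illustration, the degree-1 stock items of a tree can be found by counting the edges incident to each node. The function name and the small five-node tree are hypothetical and do not correspond to the tree of FIG. 5.

```python
from collections import Counter

def leaf_nodes(tree_edges):
    """Return the nodes that appear in exactly one edge of the tree."""
    degree = Counter()
    for u, v in tree_edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d == 1)

# Hypothetical spanning tree over nodes 1..5.
tree_edges = [(1, 2), (2, 3), (2, 4), (4, 5)]
leaves = leaf_nodes(tree_edges)
```

Here nodes 1, 3, and 5 each touch a single edge and would be the extracted candidates.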

The stock item extraction unit 125 according to an embodiment of the present invention may construct the minimum spanning tree from the stock network graph independently of the process by which the clustering unit 124 clusters the stock network graph generated by the stock network graph generation unit 123 into the preset number of stock item groups.

FIG. 5 is a diagram illustrating a minimum spanning tree constructed according to an embodiment of the present invention.

Specifically, FIG. 5 shows a minimum spanning tree that the stock item extraction unit 125 according to an embodiment of the present invention constructed by applying Prim's algorithm to the stock network graph generated by the stock network graph generation unit 123 according to the distance relationships among 50 stock items.

Referring to FIG. 5, among the stock items represented by nodes 1 to 50 in the minimum spanning tree constructed by the stock item extraction unit 125 according to an embodiment of the present invention, the stock items with exactly one edge are those at nodes 1, 3, 8, 5, 10, 13, 16, 18, 22, 23, 25, 31, 33, 34, 35, 36, 38, 41, 43, 45, and 46; these may represent stock items with low correlation to one another.

Referring again to FIG. 1, the stock price prediction unit 130 according to an embodiment of the present invention may predict, for each of the stock items whose correlations were analyzed by the correlation analysis unit 120, the stock price on the prediction date, using a stock price prediction model generated by learning from the stock price data collected by the stock price data collection unit 110.

A specific method by which the stock price prediction unit 130 predicts the stock price of each correlation-analyzed stock item using the stock price prediction model is described with reference to FIGS. 6 and 7.

FIG. 6 is a block diagram illustrating in detail the configuration of a stock item selection device for constructing a stock portfolio according to another embodiment of the present invention.

Referring to FIG. 6, the stock item selection device 100 for constructing a stock portfolio according to another embodiment of the present invention may further include a stock price data conversion unit 150.

The stock price data conversion unit 150 according to an embodiment of the present invention may process the stock price data collected by the stock price data collection unit 110, which includes, for each date, the stock price, trading volume, and trading time of the stock items whose correlations were analyzed by the correlation analysis unit 120, and may convert it into technical analysis data used for stock analysis.

The technical analysis data may include at least one of the following ten technical analysis indicators: Bollinger Bands (Bband), Moving Average (MA), Moving Average Convergence & Divergence (MACD), Double Exponential Moving Average (DEMA), Stochastic (STOCH), Triple smoothed Moving Averages (TRIX), Average Directional Movement Index (ADX), On Balance Volume (OBV), Stop And Reverse (SAR), and midpoint (MIDPOINT). However, these are merely examples for describing an embodiment of the present invention, and the invention is not limited thereto; the technical analysis data may include various other technical analysis indicators capable of analyzing stock price data.

The Bollinger Bands (Bband) indicator reflects the degree to which the stock price changes over time. Specifically, as the stock price moves through repeated contraction and expansion, the width of the band is determined by the price movement: when price volatility is high the band widens, and when price volatility is low the band narrows. The width of the Bollinger Bands may be calculated using an upper band, a middle band, and a lower band.
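
By way of a non-limiting illustration, the three bands can be sketched under the conventional definition, in which the middle band is a simple moving average and the upper and lower bands lie k standard deviations above and below it. The window of 20 days and k = 2 are conventional defaults assumed here, not values stated in this description; the function name and the sample prices are hypothetical.

```python
from statistics import mean, pstdev

def bollinger_bands(closes, window=20, k=2.0):
    """Return (upper, middle, lower) lists, one value per full window.

    ASSUMPTION: middle = simple moving average, bands = middle +/- k * population
    standard deviation of the same window.
    """
    upper, middle, lower = [], [], []
    for end in range(window, len(closes) + 1):
        chunk = closes[end - window:end]
        mid = mean(chunk)
        spread = k * pstdev(chunk)
        upper.append(mid + spread)
        middle.append(mid)
        lower.append(mid - spread)
    return upper, middle, lower

# Hypothetical closing prices: one calm series and one volatile series.
calm = [100, 101, 100, 101, 100, 101]
volatile = [100, 110, 95, 108, 92, 111]
up_c, mid_c, lo_c = bollinger_bands(calm, window=5)
up_v, mid_v, lo_v = bollinger_bands(volatile, window=5)
```

The band computed from the volatile series is wider than the band computed from the calm series, matching the behavior described above.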

The Moving Average Convergence & Divergence (MACD) indicator is a momentum indicator that uses the crossover of two moving averages, one long-term and one short-term.

The moving average (MA) is an indicator of the average price of a security over a predetermined period. In an embodiment of the present invention, MA(40), the 40-day moving average, MA(80), the 80-day moving average, and MA(120), the 120-day moving average, may be used as technical analysis data; however, these are merely examples for describing an embodiment of the present invention, and the invention is not limited thereto.
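
By way of a non-limiting illustration, the 40-day, 80-day, and 120-day moving averages can be sketched as follows, assuming a simple (unweighted) moving average. The function name and the synthetic price series are hypothetical.

```python
def moving_average(prices, window):
    """Simple moving average: one value per full window of `window` days."""
    return [sum(prices[i - window:i]) / window
            for i in range(window, len(prices) + 1)]

# Synthetic 200-day price series (illustrative values only).
prices = [float(p) for p in range(1, 201)]
ma = {w: moving_average(prices, w) for w in (40, 80, 120)}
```

A longer window yields fewer values for the same history, since the first value is available only once a full window of days has elapsed.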

The double exponential moving average (DEMA) is a moving average of the exponential moving average.

The stochastic oscillator (STOCH) indicates, as a percentage, where the current stock price lies within the range of price movement over a given period.
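
By way of a non-limiting illustration, the position of the current price within the recent range can be sketched as the conventional %K value. The 14-day default window and the fallback of 50.0 for a flat range are assumptions; the function name and the sample prices are hypothetical.

```python
def stochastic_k(highs, lows, closes, window=14):
    """Percent position of each close within the high-low range of its window.

    ASSUMPTION: returns 50.0 when the range is flat to avoid division by zero.
    """
    out = []
    for end in range(window, len(closes) + 1):
        hi = max(highs[end - window:end])
        lo = min(lows[end - window:end])
        close = closes[end - 1]
        out.append(50.0 if hi == lo else 100.0 * (close - lo) / (hi - lo))
    return out

# Hypothetical three-day price history.
highs = [12.0, 13.0, 14.0]
lows = [8.0, 9.0, 10.0]
closes = [10.0, 12.0, 13.0]
k_values = stochastic_k(highs, lows, closes, window=3)
```

With a range from 8 to 14 and a latest close of 13, the close sits five sixths of the way up the range, or roughly 83 percent.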

TRIX is an indicator that marks trading points using a triple exponential moving average.

The Average Directional Movement Index (ADX) is an average directional movement indicator for judging trend reversals in stock prices and exchange rates.

The On Balance Volume (OBV) indicator is a chart showing the strength of executed trades, and is used to judge stock prices on the premise that trading volume always precedes price.

The SAR indicator identifies the point at which a trend reverses.

The midpoint (MIDPOINT) indicator represents on-balance volume computed using the midpoint value.

The stock price prediction unit 130 according to an embodiment of the present invention may generate input data having a preset dimension from the stock price data collected by the stock price data collection unit 110 for the correlation-analyzed stock items, and may train the stock price prediction model 131 on the generated input data so that the model predicts the stock price on the prediction date for each of those stock items.

In addition, the stock price prediction unit 130 according to another embodiment of the present invention may generate input data having a preset dimension, in consideration of the time-series characteristics of stock prices, from the stock price data collected by the stock price data collection unit 110 for the correlation-analyzed stock items and from the technical analysis data into which the stock price data conversion unit 150 converted that stock price data for each date.

The stock price prediction unit 130 according to another embodiment of the present invention may label, among the stock price data collected by the stock price data collection unit 110 for the correlation-analyzed stock items, the stock price data of the prediction date, and may train the stock price prediction model using the input data generated from the stock price data and the per-date technical analysis data so that the stock price predicted for each of those stock items matches the labeled stock price of the prediction date.

Specifically, for the at least one stock item whose correlation coefficient is below the threshold, extracted by the stock item extraction unit 125 from among the stock items with similar correlations in each stock item group clustered by the clustering unit 124, the stock price prediction unit 130 according to an embodiment of the present invention may generate input data having a preset dimension, in consideration of the time-series characteristics of stock prices, from the stock price data collected by the stock price data collection unit 110 and from the technical analysis data into which the stock price data conversion unit 150 converted that stock price data for each date.

A specific method of generating the input data is described below with reference to FIG. 7.

The stock price prediction unit 130 according to an embodiment of the present invention may, for each of the correlation-analyzed stock items, generate as input data the stock price data and technical analysis data of M consecutive days (M being a natural number) while shifting one day at a time, and may use the generated input data to train the stock price prediction model to predict the stock price of each of those stock items for the N consecutive days (N being a natural number) following the M days.

The stock price prediction unit 130 according to an embodiment of the present invention may split the input data of the preset dimension for the M days into M per-day pieces, reflecting the time-series characteristics of stock prices, and may use the M separated pieces of input data to train the stock price prediction model to predict the stock price on the prediction date.

Specifically, the stock price prediction unit 130 according to another embodiment of the present invention may label the stock price data of each correlation-analyzed stock item for the N consecutive days following the M days, and may use the input data generated as described above to train the stock price prediction model so that the predicted stock price matches the stock price of the labeled date.
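
By way of a non-limiting illustration, the one-day sliding window that pairs M days of input with the following N days of labels can be sketched as follows. The function name, the choice of column 0 as the label column, and the synthetic rows are hypothetical.

```python
def make_windows(rows, m, n, label_col=0):
    """Slide one day at a time over `rows` (one feature list per day).

    Each sample pairs M consecutive days of features with the label column
    of the N days that immediately follow.
    """
    samples = []
    for start in range(len(rows) - m - n + 1):
        x = rows[start:start + m]                      # M per-day inputs
        y = [row[label_col] for row in rows[start + m:start + m + n]]
        samples.append((x, y))
    return samples

# Synthetic 10-day history with two features per day (illustrative only).
rows = [[float(day), float(day) * 10] for day in range(10)]
samples = make_windows(rows, m=3, n=2)
```

With ten days of history, M = 3, and N = 2, six samples are produced; the first uses days 0 to 2 as input and days 3 and 4 as labels.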

The opening, high, low, and closing prices may be used as labels; however, the stock price data used as labels is merely an example for describing an embodiment of the present invention, and the invention is not limited thereto.

To increase the accuracy of stock price prediction, the stock price prediction unit 130 according to another embodiment of the present invention may apply min-max normalization to the input data generated from the technical analysis data and the stock price data, and may use the normalized input data to train the stock price prediction model to predict the stock price on the prediction date for each of the correlation-analyzed stock items.

Specifically, the stock price prediction unit 130 according to another embodiment of the present invention may normalize the current input data value using the maximum and minimum values of the time-series input data, as in Equation 4 below, and may use the normalized input data to train the stock price prediction model to predict the stock price on the prediction date for each of the correlation-analyzed stock items.
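
By way of a non-limiting illustration, min-max normalization can be sketched in its standard form, (x - min) / (max - min), which maps the smallest value to 0 and the largest to 1. Equation 4 is not reproduced in this passage, so the standard form is assumed here; the function name, the handling of a constant series, and the sample prices are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly so that min -> 0.0 and max -> 1.0.

    ASSUMPTION: a constant series is mapped to all zeros.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical closing prices for one input window.
closes = [50.0, 55.0, 60.0, 52.5]
scaled = min_max_normalize(closes)
```

Applying the same scaling to each feature column keeps large-valued features such as volume from dominating small-valued ones.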

The stock price prediction model 131 according to an embodiment of the present invention may consist of layers based on a recurrent neural network (RNN), which is capable of time-series data analysis; the layers receive each of the M separated pieces of input data reflecting the time-series characteristics of M days and output a predicted stock price on the prediction date for one stock item.

A recurrent neural network (RNN) has a loop in which the hidden layers form a chain so that the output of a hidden layer is fed back in as an input to the same hidden layer, giving a structure in which past data influences the future. In other words, in an RNN the hidden nodes are connected by directed edges to form a recurrent structure.

Specifically, the stock price prediction model 131 according to an embodiment of the present invention may consist of RNN-based layers built from long short-term memory (LSTM) cells, which include a forget gate that determines whether the previous memory cell state is reflected in the current memory cell, so as to receive the M pieces of input data separated in temporal order and output a predicted stock price on the prediction date.

The long short-term memory (LSTM) model is a structure in which a cell connected to several gates is added to the hidden layer to address the long-term dependency problem. The hidden layer has a memory block that includes an input gate, an output gate, and a forget gate. The forget gate is for forgetting past information, and the input gate is for remembering current information. Each gate has a strength and a direction. The cell acts like a conveyor belt, so the gradient can continue to propagate relatively well even after the state has passed through many steps.
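
By way of a non-limiting illustration, one step of an LSTM cell can be sketched with the standard gate equations, reduced to scalar input and scalar state for readability. The function names, the parameter layout, and the saturated example parameters are hypothetical; a practical model would use vector states and learned weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step. p maps a gate name to (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    g = math.tanh(pre("candidate"))   # candidate value to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    c = f * c_prev + i * g            # the "conveyor belt" update
    h = o * math.tanh(c)
    return h, c

# Hypothetical parameters: gates saturated to keep the old state ...
keep = {"forget": (0.0, 0.0, 50.0), "input": (0.0, 0.0, -50.0),
        "candidate": (1.0, 0.0, 0.0), "output": (0.0, 0.0, 50.0)}
# ... and to overwrite it with the current input.
overwrite = {"forget": (0.0, 0.0, -50.0), "input": (0.0, 0.0, 50.0),
             "candidate": (1.0, 0.0, 0.0), "output": (0.0, 0.0, 50.0)}

h_keep, c_keep = lstm_step(0.7, 0.0, 0.3, keep)
h_new, c_new = lstm_step(0.7, 0.0, 0.3, overwrite)
```

With the forget gate open and the input gate closed, the previous cell state of 0.3 passes through essentially unchanged; with the gates reversed, the cell state is replaced by the candidate derived from the current input.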

The LSTM model according to an embodiment of the present invention may include M LSTM cells and may be implemented as a many-to-one model that considers only the output of the last LSTM cell; however, this is merely an example for describing an embodiment of the present invention, and the invention is not limited thereto.

In the LSTM model according to an embodiment of the present invention, to increase the accuracy of the output stock price prediction, the plurality of output values from the last LSTM cell may be passed to a fully connected layer, and the fully connected layer may output a single predicted stock price.

The stock price prediction model 131 according to an embodiment of the present invention may consist of RNN-based layers built from bidirectional LSTM (Bi-LSTM) cells, formed by combining two LSTM cells so that state propagates in both directions, so as to receive the M pieces of input data separated in temporal order and output a predicted stock price on the prediction date.

Specifically, the Bi-LSTM cell may be formed by combining two LSTM models: one in the direction in which day k (k being a natural number, 1≤k<M) influences the result of day k+1, and one in the direction in which day k+1 influences the result of day k.

Accordingly, when the stock price prediction unit 130 predicts the stock price on the prediction date for each of the correlation-analyzed stock items using the stock price prediction model 131 implemented with Bi-LSTM cells, it can predict the stock price with higher accuracy than when using the stock price prediction model 131 implemented with LSTM cells.

Specifically, when the stock price prediction unit 130 according to an embodiment of the present invention predicts the stock price on the prediction date for the at least one stock item extracted by the stock item extraction unit 125 using the stock price prediction model 131 implemented with Bi-LSTM cells, it can predict the stock price with higher accuracy than when using the stock price prediction model 131 implemented with LSTM cells for the same stock item.

A specific method of receiving the input data and predicting the stock price on the prediction date is described with reference to FIG. 7.

FIG. 7 is a diagram illustrating a method of predicting the stock price of a stock item according to an embodiment of the present invention.

Referring to FIG. 7, the stock price prediction model 131 according to an embodiment of the present invention may be composed of recurrent neural network (RNN)-based layers made up of bidirectional long short-term memory (Bi-LSTM) cells.

The stock price prediction unit 130 according to an embodiment of the present invention may generate input data 132 of a preset dimension from (i) 60 days of stock price data for each stock item collected by the stock price data collection unit 110, namely the opening price (open), high price (high), low price (low), closing price (close), adjusted closing price (adjust close) and trading volume (volume), and (ii) 60 days of technical analysis data for each stock item into which the stock price data conversion unit 150 has converted that stock price data, namely the technical analysis indicators Bollinger Bands (Bband), Stochastic (STOCH), double exponential moving average (DEMA), moving averages (MA 40, 80, 120), moving average convergence/divergence (MACD), average directional index (ADX), TRIX, on-balance volume (OBV), SAR and midpoint (MIDPOINT). However, the above-described example is merely illustrative of an embodiment of the present invention, and the invention is not limited thereto.

The input data 132 of the preset dimension may be two-dimensional input data 132 of 60 days × 18, and the stock price prediction unit 130 may use the input data 132 to train the stock price prediction model 131 to predict the stock price on the prediction date.

The batch size, which indicates the number of data items that the stock price prediction unit 130 according to an embodiment of the present invention trains on at a time, may be 60 days × 18.

In an embodiment of the present invention, when the above-described input data 132 is input to the stock price prediction model 131, the prediction date to be output is described as the date corresponding to the fifth day after the 60 days. However, the above-described example is merely illustrative of an embodiment of the present invention, and the invention is not limited thereto.

The stock price prediction unit 130 according to an embodiment of the present invention may split the above-described 60 × 18 two-dimensional input data 132 into 60 one-dimensional input data items, and may train the stock price prediction model 131 using the 60 separated one-dimensional data items of size 1 × 18.
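As a minimal sketch of the windowing described above, the following code cuts a 60 × 18 window out of a longer history and treats it as 60 separate 1 × 18 vectors. The function name `make_window`, the constants and the placeholder data are assumptions introduced for illustration and do not appear in the embodiment.

```python
WINDOW_DAYS = 60    # days per input window, as in the example above
NUM_FEATURES = 18   # price fields plus technical indicators per day


def make_window(rows, end_index):
    """Return the WINDOW_DAYS daily feature vectors ending at end_index (inclusive)."""
    start = end_index - WINDOW_DAYS + 1
    if start < 0:
        raise ValueError("not enough history for a full window")
    window = [list(row) for row in rows[start:end_index + 1]]
    if any(len(row) != NUM_FEATURES for row in window):
        raise ValueError("each day must carry NUM_FEATURES values")
    return window


# Hypothetical history: 100 days, each with 18 placeholder feature values.
history = [[float(day)] * NUM_FEATURES for day in range(100)]

# The 2-D window is simply a list of 60 one-dimensional 1 x 18 vectors,
# one per recurrent time step.
window = make_window(history, end_index=99)
```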

Specifically, the third stock price prediction model 431 according to an embodiment of the present invention may be composed of a plurality of bidirectional long short-term memory cells (Bi-LSTM cells) and a fully connected layer, and inside the recurrent neural network (RNN) the 60 separated one-dimensional input data items of size 1 × 18 may each be input to one of 60 Bi-LSTM cells. However, the above-described example is merely illustrative of an embodiment of the present invention, and the invention is not limited thereto.

The stock price prediction model 131 according to an embodiment of the present invention may be composed of first to third layers 131a to 131c.

The first layer 131a according to an embodiment of the present invention may be composed of m bidirectional long short-term memory (Bi-LSTM) cells.

The first layer 131a according to an embodiment of the present invention may receive input data 132 composed of M days of stock price data and technical indicators, and may convert the received input data into a preset number of first feature values.

The first layer 131a according to an embodiment of the present invention may repeat N times the process of converting the above-described input data 132 into the preset number of first feature values and outputting them, so that the stock prices for N days are predicted day by day.

Specifically, each of the Bi-LSTM cells constituting the first layer 131a according to an embodiment of the present invention may receive, inside the recurrent neural network (RNN) and in consideration of their time-series characteristics, the 60 one-dimensional input data items of size 1 × 18 into which the two-dimensional input data for 60 days has been split. The first layer 131a may convert the received input data into the preset number of first feature values and output them, and may repeat this conversion and output process five times. The preset number of first feature values may be 1024.

However, the above-described example is merely illustrative of an embodiment of the present invention, and the invention is not limited thereto.

The second layer 131b according to an embodiment of the present invention may be composed of N cells, each made up of n bidirectional long short-term memory (Bi-LSTM) cells.

Each of the N cells constituting the second layer 131b according to an embodiment of the present invention may receive the preset number of first feature values that the first layer 131a outputs N times, and may convert the received first feature values into a preset number of second feature values for each day and output them.

For example, the second layer 131b according to an embodiment of the present invention may have five cells, each made up of n Bi-LSTM cells. The preset number of first feature values, output five times, may be input to the five cells respectively, and each of the five cells may convert the input first feature values into the preset number of second feature values for its own day among the five days and output them.

The preset number of first feature values may be 1024 and the preset number of second feature values may be 512. However, the above-described example is merely illustrative of an embodiment of the present invention, and the invention is not limited thereto.

The third layer 131c according to an embodiment of the present invention may be composed of N fully connected layers.

The third layer 131c according to an embodiment of the present invention may output stock price data of a stock item for each of N days from the preset number of second feature values that the second layer 131b outputs for each of the N days. The N days of stock price data output in this way represent the predicted stock prices of the stock item over the N days.

Accordingly, the third layer 131c according to an embodiment of the present invention may output stock price data 133 of a stock item for each of five days from the preset number of second feature values that the second layer 131b outputs for each of the five days. However, the above-described example is merely illustrative of an embodiment of the present invention, and the invention is not limited thereto.
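To make the data flow through the three layers easier to follow, the sketch below only tracks array shapes; it does not implement any Bi-LSTM or fully connected computation. The function name `trace_shapes` and its parameters are hypothetical, and `outputs_per_day=4` is an assumption based on FIG. 7 showing opening, high, low and closing prices.

```python
def trace_shapes(window_days=60, num_features=18, first_features=1024,
                 second_features=512, horizon_days=5, outputs_per_day=4):
    """Return the (rows, columns) shape of the data after each stage."""
    shapes = {}
    # Input data 132: one feature vector per day of the window.
    shapes["input"] = (window_days, num_features)
    # First layer 131a: the window is converted into first feature values,
    # and that output is repeated once per forecast day.
    shapes["layer1"] = (horizon_days, first_features)
    # Second layer 131b: per-day second feature values.
    shapes["layer2"] = (horizon_days, second_features)
    # Third layer 131c: per-day predicted price values.
    shapes["layer3"] = (horizon_days, outputs_per_day)
    return shapes


shapes = trace_shapes()
```

With the example values of the embodiment, the shapes run from 60 × 18 at the input to 5 × 1024, 5 × 512 and finally one row of predicted prices for each of the five days.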

The above-described stock price data may be at least one of the opening price, high price, low price, closing price, adjusted closing price and trading volume.

For example, FIG. 7 shows the opening, high, low and closing prices 133 predicted for January 29, 2014 through February 4, 2014 by the stock price prediction model 131 composed of the first to third layers 131a to 131c according to an embodiment of the present invention.

The stock price prediction unit 130 according to an embodiment of the present invention may label the actual stock price of the fifth day, which is the prediction date, among the stock price data collected by the stock price data collection unit 110, and may train the stock price prediction model 131 so that the predicted stock price 133 for the fifth day output by the above-described method matches the labeled actual stock price of the fifth day.

The stock price prediction unit 130 according to an embodiment of the present invention may train the stock price prediction model 131 until the mean squared error between the opening, high, low and closing prices of each stock item for N days output by the above-described method and the N days of stock price data used as labels converges.
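The stopping criterion described above can be sketched as follows. The embodiment does not state how convergence is detected, so the `tolerance` and `patience` parameters and the function names `mean_squared_error` and `has_converged` are assumptions made for illustration.

```python
def mean_squared_error(predicted, actual):
    """Mean squared error over N days of per-day price vectors."""
    flat_p = [value for day in predicted for value in day]
    flat_a = [value for day in actual for value in day]
    if not flat_p or len(flat_p) != len(flat_a):
        raise ValueError("predicted and actual must be non-empty and the same size")
    return sum((p - a) ** 2 for p, a in zip(flat_p, flat_a)) / len(flat_p)


def has_converged(loss_history, tolerance=1e-4, patience=3):
    """Assumed rule: the loss changed by less than tolerance for `patience` steps in a row."""
    if len(loss_history) <= patience:
        return False
    recent = loss_history[-(patience + 1):]
    return all(abs(recent[i] - recent[i + 1]) < tolerance for i in range(patience))


# Hypothetical one-day example with two price fields.
loss = mean_squared_error([[1.0, 2.0]], [[1.0, 4.0]])
```

A training loop would append the loss of each epoch to `loss_history` and stop once `has_converged` returns true.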

After training the stock price prediction model 131 by the above-described method, the stock price prediction unit 130 according to an embodiment of the present invention may, when new input data is subsequently input, retrain the already trained stock price prediction model 131 on the newly input data only.

Referring again to FIG. 1, the stock item selection unit 140 according to an embodiment of the present invention may compare the stock prices predicted by the stock price prediction unit 130 for each of the stock items whose correlations were analyzed by the correlation analysis unit 120, and may select, from among the correlation-analyzed stock items, a stock item whose price is predicted to rise.

Specifically, the clustering unit 124 clusters the stock items into groups of similarly correlated stock items in consideration of the correlations between stock items analyzed by the correlation analysis unit 120, and the stock item extraction unit 125 extracts from each group at least one stock item whose correlation coefficient is lower than a threshold. The stock item selection unit 140 according to an embodiment of the present invention may then select a stock item whose price is predicted to rise by considering the stock prices that the stock price prediction unit 130 predicts for the extracted stock items.

The at least one stock item with a correlation coefficient lower than the threshold, extracted by the stock item extraction unit 125 according to an embodiment of the present invention, may be a terminal stock item, that is, a stock item of degree 1 having exactly one edge in the minimum spanning tree described with reference to FIG. 2. Such stock items are extracted so that one stock item can be selected for each cluster that the clustering unit 124 forms from the stock network graph 123.
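As an illustration of extracting the terminal stock items, the sketch below counts the degree of every node in a tree given as an edge list and keeps the nodes of degree 1. The function name `leaf_stocks`, the edge-list representation and the ticker names are assumptions; the embodiment does not specify a data structure.

```python
from collections import Counter


def leaf_stocks(mst_edges):
    """Return the stock items that have exactly one edge in the spanning tree."""
    degree = Counter()
    for a, b in mst_edges:
        # Every edge contributes one to the degree of both endpoints.
        degree[a] += 1
        degree[b] += 1
    return sorted(stock for stock, d in degree.items() if d == 1)


# Hypothetical minimum spanning tree over five stock items.
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")]
leaves = leaf_stocks(edges)
```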

That is, for each of the preset number of stock item groups clustered by the clustering unit 124, the stock item selection unit 140 according to an embodiment of the present invention may compare the stock prices predicted by the stock price prediction unit 130 for the at least one stock item extracted by the stock item extraction unit 125, and may select from each group the one stock item with the highest cumulative return.

Accordingly, in each group of clustered stock items formed by the clustering unit 124, the stock item selection unit 140 according to an embodiment of the present invention may select, from among the stock items having one edge in the minimum spanning tree constructed by the stock item extraction unit 125, the stock item whose price is predicted to rise the most according to the stock prices predicted for each stock item by the stock price prediction unit 130.

Accordingly, the stock item selection apparatus 100 for constructing a stock portfolio according to an embodiment of the present invention may construct a stock portfolio from the stock items selected by the above-described method.

For example, the stock item selection unit 140 according to an embodiment of the present invention may select, by the above-described method, one stock item from each of 10 stock item groups clustered by the clustering unit 124, each selected stock item being weakly correlated with the other stock items and predicted to rise in price. However, the above-described example is merely illustrative of an embodiment of the present invention, and the invention is not limited thereto.

Specifically, the stock item selection apparatus 100 for constructing a stock portfolio according to an embodiment of the present invention may use the portfolio recommendation algorithm of Table 1 below to select the stock item with the highest cumulative return and add it to the stock portfolio.

The portfolio recommendation algorithm of Table 1 may select, from each cluster CS(i) among the plurality of clustered stock item groups formed by the clustering unit 124, the one stock item with the highest predicted cumulative return and add it to the portfolio P.

Specifically, in the portfolio recommendation algorithm of Table 1, the stock price prediction unit 130 predicts the stock price on the prediction date for the plurality of stock items (sj) of degree 1 in the minimum spanning tree that belong to each cluster CS(i). The stock items (sj) with predicted prices are then compared, and the one stock item with the highest predicted cumulative return in each cluster CS(i) is selected and added to the portfolio P.

The predicted cumulative return according to an embodiment of the present invention may be calculated on the basis of the predicted closing prices for N days.
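The selection described in the preceding paragraphs can be sketched as follows. Table 1 is not reproduced in this text, so this is a reading of the prose rather than the algorithm of Table 1 itself. In particular, the formula for the predicted cumulative return (final predicted close relative to the last observed close) is an assumption, and the names `predicted_cumulative_return`, `recommend_portfolio` and the sample data are hypothetical.

```python
def predicted_cumulative_return(last_close, predicted_closes):
    """Assumed definition: return from the last observed close to the final predicted close."""
    return predicted_closes[-1] / last_close - 1.0


def recommend_portfolio(clusters, last_close, predicted_closes):
    """Pick, per cluster, the candidate with the highest predicted cumulative return."""
    portfolio = []
    for candidates in clusters:
        if not candidates:
            continue  # a cluster with no degree-1 candidates contributes nothing
        best = max(
            candidates,
            key=lambda s: predicted_cumulative_return(last_close[s], predicted_closes[s]),
        )
        portfolio.append(best)
    return portfolio


# Hypothetical input: two clusters, each listing its degree-1 stock items.
clusters = [["A", "C"], ["E", "F"]]
last_close = {"A": 100.0, "C": 50.0, "E": 20.0, "F": 10.0}
predicted_closes = {
    "A": [101.0, 102.0, 103.0, 104.0, 105.0],
    "C": [50.0, 51.0, 52.0, 53.0, 55.0],
    "E": [20.0, 20.0, 19.0, 19.0, 19.0],
    "F": [10.0, 10.0, 10.0, 10.0, 10.2],
}
portfolio = recommend_portfolio(clusters, last_close, predicted_closes)
```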

Table 2 below evaluates the performance of the stock item selection apparatus 100 for constructing a stock portfolio according to an embodiment of the present invention. The simulated investment periods, which proceeded in 6-day units, were grouped by month from January 3, 2014 to December 22, 2014. For each month, the table compares the cumulative return of a simulated investment that used a portfolio of stock items selected by the stock item selection apparatus 100 with the cumulative return of a simulated investment that used a conventional stock network analysis method.

The experimental data for Table 2 consist of the 175 stock items, among the KOSPI 200 constituents as of May 2018, for which at least one year of training data could be secured before the start of the simulated investment. In addition, because the K-medoids algorithm selects its initial center points at random, the average return used in Table 2 is the average over 100 runs of the simulated investment.

The stock item selection apparatus 100 for constructing a stock portfolio according to an embodiment of the present invention may predict the stock prices for the five days following a 60-day period by using the stock price prediction model 131 trained on input data generated from 60 days of stock price data and technical analysis data for each stock item.

The simulated investment in Table 2 using the stock item selection apparatus 100 for constructing a stock portfolio according to an embodiment of the present invention proceeded as follows. Based on the five-day stock price predictions, the portfolio of stock items recommended by the stock item selection apparatus 100 was held for five days. One day was then spent updating the trained stock price prediction model 131: the model was retrained to reflect the actual stock prices of the preceding five days, the portfolio was reconstructed, and a new five-day investment began. The simulated investment results for the stock item selection apparatus 100 in Table 2 therefore cover the period from January 2014 to December 2014.
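The 6-day cycle described above (five days of investment followed by one day of model update) can be sketched as a simple schedule generator. The function name `simulation_schedule` and the phase labels are hypothetical and are introduced only for illustration.

```python
def simulation_schedule(num_trading_days, invest_days=5, update_days=1):
    """Label each trading day as an investment day or a model-update day."""
    cycle = invest_days + update_days
    schedule = []
    for day in range(num_trading_days):
        # The first invest_days of every cycle hold the portfolio; the rest of the
        # cycle is used to retrain the model and rebuild the portfolio.
        phase = "invest" if day % cycle < invest_days else "update"
        schedule.append(phase)
    return schedule


# Two full 6-day cycles.
schedule = simulation_schedule(12)
```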

In addition, the period-average prediction precision in Table 2 is the precision of the stock prices predicted by the stock item selection apparatus 100 for constructing a stock portfolio according to an embodiment of the present invention, averaged over each monthly period from January 3, 2014 to December 22, 2014.

Specifically, the period-average prediction precision in Table 2 is the result of calculating the period-average prediction precision for the 175 stock items using Equation 5 below, in order to verify the prediction performance of the stock price prediction model 131 according to an embodiment of the present invention.

There are periods in which the precision of the stock price prediction model 131 according to an embodiment of the present invention is low. However, because stock network analysis spread the investment across stock items with low correlation coefficients, the investment risk was reduced, and in the low-precision periods the cumulative return was similar to, or only slightly lower than, that of the conventional stock network analysis method.

As a result, the total cumulative return obtained using the stock item selection apparatus 100 for constructing a stock portfolio according to an embodiment of the present invention was about 10.56 percentage points higher than the total cumulative return obtained using conventional stock network analysis alone.

Although the components of the stock item selection apparatus for constructing a stock portfolio are shown separately in FIGS. 1, 2 and 6, a plurality of components may be combined with each other and implemented as at least one module. The components are connected to a communication path that links software modules or hardware modules inside the apparatus, and operate organically with one another. These components communicate using one or more communication buses or signal lines.

The stock item selection apparatus for constructing a stock portfolio may be implemented in a logic circuit by hardware, firmware, software or a combination thereof, and may also be implemented using a general-purpose or special-purpose computer. The apparatus may be implemented using a hardwired device, a field programmable gate array (FPGA), an application specific integrated circuit (ASIC) or the like. In addition, the apparatus may be implemented as a system on chip (SoC) including one or more processors and controllers.

The stock item selection apparatus for constructing a stock portfolio may be mounted, in the form of software, hardware or a combination thereof, on a computing device provided with hardware elements. The computing device may refer to any of various devices that include all or some of the following: a communication device, such as a communication modem, for communicating with various devices or with wired or wireless communication networks; a memory for storing data for executing a program; and a microprocessor for executing the program to perform operations and issue commands.

FIG. 8 is a flowchart illustrating a stock item selection method for constructing a stock portfolio according to an embodiment of the present invention. The stock item selection method for constructing a stock portfolio may be performed by a computing device, and operates in the same manner as the stock item selection apparatus for constructing a stock portfolio.

Referring to FIG. 8, the computing device collects, for each stock item, stock price data related to the price of that stock item (S810). The stock price data may include the opening price, high price, low price, closing price, adjusted closing price and trading volume that are commonly used for stock items.

The computing device analyzes the correlations between the stock items using the collected stock price data (S820). Specifically, the computing device according to an embodiment of the present invention may use the collected stock price data to analyze which stock items are highly correlated with one another and which stock items are weakly correlated with one another.

A specific method of analyzing the correlations between the stock items will be described later with reference to FIG. 9.

The computing device predicts the stock price of each of the correlation-analyzed stock items on the prediction date by using a stock price prediction model generated by learning from the stock price data that the stock price data collection unit 110 collected for the correlation-analyzed stock items (S830).

Specifically, the computing device according to an embodiment of the present invention may generate input data having a preset dimension from the stock price data collected for the correlation-analyzed stock items, and may train the stock price prediction model on the generated input data so that the model predicts the stock price of each of the correlation-analyzed stock items on the prediction date.

In addition, the computing device according to an embodiment of the present invention may label the stock price data of the prediction date among the stock price data collected for the correlation-analyzed stock items, and may train the stock price prediction model, using input data generated from the stock price data and the technical analysis data converted for each date, so that the stock price predicted for each of the correlation-analyzed stock items matches the labeled stock price of the prediction date.

The stock price prediction model according to an embodiment of the present invention may be composed of recurrent neural network (RNN)-based layers capable of analyzing time-series data.

In addition, the stock price prediction model according to another embodiment of the present invention may be composed of RNN-based layers made up of long short-term memory (LSTM) cells that include a forget gate, which determines whether the state of the previous memory cell is reflected in the current memory cell.

In addition, the stock price prediction model according to yet another embodiment of the present invention may be composed of RNN-based layers made up of bidirectional long short-term memory (Bi-LSTM) cells, each formed by combining two of the above-described LSTM cells so that state propagates in both directions.

The stock price prediction model and the specific method of predicting stock prices with it have been described above with reference to FIGS. 6 and 7, so a detailed description is omitted here.

The computing device compares the stock prices predicted by the stock price prediction unit 130 for each of the correlation-analyzed stock items, and selects, from among the correlation-analyzed stock items, a stock item whose price is predicted to rise (S840). Specifically, the computing device according to an embodiment of the present invention may cluster the correlation-analyzed stock items into groups of similarly correlated stock items, extract from each group at least one stock item having a relatively low correlation, compare the stock prices predicted for the extracted stock items, and select from the group a stock item whose price is predicted to rise.

상술한 주식 종목을 선택하는 구체적인 방법은 전술하였으므로 자세한 설명은 생략하도록 한다.Since the detailed method of selecting the above-mentioned stock item has been described above, a detailed description will be omitted.

도 9는 본 발명의 일 실시 예에 따른 각 주식 종목별 간 상관관계를 분석하는 방법을 구체적으로 설명하기 위한 흐름도이다.9 is a flowchart illustrating in detail a method of analyzing a correlation between each stock item according to an embodiment of the present invention.

Referring to FIG. 9, the computing device calculates a log return for each stock item from its stock price data (S821). Specifically, the computing device according to an embodiment of the present invention may use Equation 1 described above to calculate the daily log return of each stock item over a preset period.

The specific method of calculating the log return was described above, so a detailed description is omitted.
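As an illustrative sketch only: Equation 1 is not reproduced in this passage, so the following Python snippet assumes it is the conventional daily log return, ln(P_t / P_(t-1)), computed from consecutive closing prices. The function name `daily_log_returns` and the sample prices are hypothetical.

```python
import math


def daily_log_returns(prices):
    """Return ln(P_t / P_(t-1)) for each pair of consecutive prices.

    Assumes `prices` is a chronologically ordered list of positive
    closing prices for one stock item.
    """
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


closes = [100.0, 110.0, 99.0]  # hypothetical closing prices
returns = daily_log_returns(closes)
```

Log returns are additive over time, so the sum of the daily values equals the log return over the whole period.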

The computing device calculates correlation coefficients between stock items using the calculated log returns of the stock prices (S822). Specifically, the computing device according to an embodiment of the present invention may use Equation 2 described above to calculate, from the calculated log returns, a correlation coefficient that quantifies the correlation between each pair of stock items.

The specific method of calculating the correlation coefficient was described above, so a detailed description is omitted.
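Step S822 can be sketched as follows, under the assumption that Equation 2 (not reproduced in this passage) is the Pearson correlation coefficient of two log-return series. The helper names `pearson` and `correlation_matrix` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(returns_by_stock):
    """Map every ordered pair of stock names to its correlation coefficient."""
    names = sorted(returns_by_stock)
    return {
        (a, b): pearson(returns_by_stock[a], returns_by_stock[b])
        for a in names
        for b in names
    }
```

The result always lies between -1 and 1, which matches the value range stated for the correlation coefficient in the description.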

The computing device uses the correlation coefficients calculated between stock items to calculate the distance between stock items, which quantifies their distance relationship (S823).

The computing device according to an embodiment of the present invention may calculate the distance between stock items using Equation 3; the higher the correlation between two stock items, the shorter the calculated distance between them.
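Equation 3 is not reproduced in this passage. The sketch below therefore assumes one mapping commonly used for stock networks, d = sqrt(2(1 - rho)), purely to illustrate the stated property that a higher correlation yields a shorter distance. The patent's actual formula may differ, and the function name is hypothetical.

```python
import math


def correlation_to_distance(rho):
    """Map a correlation coefficient in [-1, 1] to a distance in [0, 2].

    Assumed formula: d = sqrt(2 * (1 - rho)). Values slightly outside
    [-1, 1] caused by floating-point error are clamped.
    """
    rho = max(-1.0, min(1.0, rho))
    return math.sqrt(2.0 * (1.0 - rho))
```

Under this assumption, perfectly correlated items sit at distance 0 and perfectly anti-correlated items at distance 2.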

The computing device generates a stock network graph representing the distance relationships between stock items according to the calculated distances (S824).

From the generated stock network graph, the computing device clusters the stock items into a preset number of stock item groups whose members are similar in terms of their distance relationships (S825).

The computing device according to an embodiment of the present invention may apply a clustering algorithm to the correlation coefficient matrix, which holds the correlation coefficients between all pairs of stock items, to cluster the stock items of the stock network graph into the preset number of stock item groups.

The specific method of clustering into the preset number of stock item groups was described above, so a detailed description is omitted.
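The claims name the K-medoid algorithm as the clustering method. A minimal sketch of step S825 over a precomputed distance matrix might look like this. The deterministic initialisation (the first `k` items serve as initial medoids) and the function names are assumptions made for illustration, not details taken from the patent.

```python
def assign(dist, medoids):
    """Assign every item index to its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        nearest = min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(dist, k, max_iter=100):
    """Cluster items into k groups given a symmetric distance matrix."""
    medoids = list(range(k))  # assumed initialisation: first k items
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        # New medoid of each group: the member with the smallest total
        # distance to the other members (unchanged if the group is empty).
        updated = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if set(updated) == set(medoids):
            break
        medoids = updated
    return assign(dist, medoids)


# Hypothetical example: four items placed on a line at 0, 1, 10 and 11.
positions = [0, 1, 10, 11]
dist = [[abs(a - b) for b in positions] for a in positions]
groups = k_medoids(dist, k=2)
```

Because each medoid is itself a stock item, every group's representative is a real item rather than an averaged point, which suits data where only pairwise distances are available.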

From among the stock items with similar correlations contained in each clustered stock item group, the computing device extracts at least one stock item whose correlation coefficient is lower than a threshold as a candidate stock item for stock price prediction (S826).

The correlation coefficient may take a value from -1 to 1 and the threshold may be 0, but this is only an example for describing an embodiment of the present invention, and the invention is not limited thereto.

The computing device according to an embodiment of the present invention may construct a minimum spanning tree (MST) from the stock network graph, in which the correlation coefficient between stock items serves as the edge weight, and use the constructed minimum spanning tree to extract stock items with low mutual correlation from within the preset number of stock item groups into which stock items similar in distance relationship have been clustered.

The computing device according to an embodiment of the present invention may construct the minimum spanning tree from the stock network graph by applying Kruskal's algorithm or Prim's algorithm.
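As a hedged illustration of the Kruskal variant, the sketch below builds a minimum spanning tree from a weighted edge list with a union-find structure. The edge weights are treated as generic numbers. Whether they are correlation coefficients or the distances derived from them is left to the caller, and the function name and sample graph are hypothetical.

```python
def kruskal_mst(nodes, edges):
    """Return the MST as a list of (u, v, weight) tuples.

    `edges` is an iterable of (weight, u, v) tuples for an undirected,
    connected graph over `nodes`.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):  # lightest edges first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


# Hypothetical stock network with four items.
edges = [
    (0.4, "A", "B"),
    (0.9, "A", "C"),
    (0.5, "B", "C"),
    (1.2, "C", "D"),
    (1.3, "B", "D"),
]
tree = kruskal_mst(["A", "B", "C", "D"], edges)
```

For n stock items the resulting tree always has n - 1 edges, so it keeps only the strongest backbone of the network.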

Specifically, the computing device according to an embodiment of the present invention may extract a stock item that has degree 1 in the minimum spanning tree (MST), that is, a stock item with exactly one edge, as a stock item with low correlation to the others within the preset number of stock item groups into which stock items similar in distance relationship have been clustered.
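The degree-1 rule can be sketched as follows, assuming the tree is given as a list of (u, v, weight) edges. The function name `leaf_stocks` and the sample tree are hypothetical.

```python
from collections import Counter


def leaf_stocks(tree_edges):
    """Return the stock items that have exactly one edge in the tree."""
    degree = Counter()
    for u, v, *_ in tree_edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(name for name, d in degree.items() if d == 1)


# Hypothetical MST over four stock items forming a chain A-B-C-D.
mst = [("A", "B", 0.4), ("B", "C", 0.5), ("C", "D", 1.2)]
candidates = leaf_stocks(mst)
```

In this chain only the two end items, A and D, have a single edge, so they are the ones returned as candidates.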

For example, if the above method yields 10 stock item groups of mutually similar stock items, stock items with low mutual correlation may be extracted from each of the 10 groups.

The specific method of extracting stock items with low correlation from among the stock items with similar correlations contained in each clustered stock item group was described above, so a detailed description is omitted.

FIG. 10 is a flowchart illustrating a stock item selection method for constructing a stock portfolio according to yet another embodiment of the present invention. The method may be performed by a computing device, and descriptions of steps that operate in the same way as in the stock item selection method of FIG. 8 are omitted.

Referring to FIG. 10, the computing device collects, for each stock item, stock price data related to the price of that stock item (S1010).

The stock price data were described in detail above, so a detailed description is omitted.

The computing device analyzes the correlation between stock items using the collected stock price data (S1020).

The specific method of analyzing the correlation between stock items was described above, so a detailed description is omitted.

The computing device processes the stock price data collected for each date, which include the price, trading volume, and trading time of the correlation-analyzed stock items, and converts them into technical analysis data used for stock analysis (S1030).

The technical analysis data may include at least one of the following 10 technical analysis indicators, without being limited thereto: Bollinger Bands (Bband), moving average (MA), moving average convergence/divergence (MACD), double exponential moving average (DEMA), stochastic oscillator (STOCH), TRIX, average directional index (ADX), on-balance volume (OBV), SAR, and midpoint (MIDPOINT).
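To make the conversion in step S1030 concrete, here is a minimal sketch of two of the listed indicators, the simple moving average and Bollinger Bands, computed from a list of closing prices. The window length, the band width of two standard deviations, the use of the population standard deviation, and the function names are conventional choices assumed for illustration. The patent does not state its parameters.

```python
from statistics import mean, pstdev


def moving_average(values, window):
    """Simple moving average; one output per full window."""
    return [
        mean(values[i - window + 1:i + 1])
        for i in range(window - 1, len(values))
    ]


def bollinger_bands(values, window=20, width=2.0):
    """Return (lower, middle, upper) tuples, one per full window."""
    bands = []
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1:i + 1]
        middle = mean(chunk)
        spread = width * pstdev(chunk)  # population standard deviation
        bands.append((middle - spread, middle, middle + spread))
    return bands
```

Each indicator yields one value (or tuple) per trading day once a full window is available, so the outputs can be joined with the raw price data by date.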

Using a stock price prediction model generated by learning from the stock price data collected for the correlation-analyzed stock items and from the technical analysis data into which those data were converted for each date, the computing device predicts, for each correlation-analyzed stock item, the stock price on the prediction date (S1040).

The computing device according to an embodiment of the present invention may generate input data of a preset dimension from the stock price data and the per-date technical analysis data, taking the time-series characteristics of stock prices into account, and may train the stock price prediction model on the generated input data so that it predicts the stock price on the prediction date for each correlation-analyzed stock item.

The computing device according to yet another embodiment of the present invention may label the stock price data of the prediction date among the stock price data collected for the correlation-analyzed stock items, and may train the stock price prediction model on the input data generated from the stock price data and the per-date technical analysis data so that the stock price predicted for each correlation-analyzed stock item matches the labeled stock price of the prediction date.
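One plausible way to build such labeled training samples is a sliding window: each sample holds the feature vectors of `m` consecutive days, and its label is the price on the following day. The sketch below assumes that layout. The function name, the one-day-ahead prediction horizon, and the toy data are hypothetical.

```python
def make_windows(rows, prices, m):
    """Build (window, label) training samples from daily data.

    rows   -- one feature vector per day (price and indicator values)
    prices -- the closing price of the same days, in the same order
    m      -- number of consecutive days in each input window
    """
    if len(rows) != len(prices):
        raise ValueError("rows and prices must cover the same days")
    samples = []
    for end in range(m, len(rows)):
        window = rows[end - m:end]  # days end-m .. end-1
        label = prices[end]         # price on the day to be predicted
        samples.append((window, label))
    return samples


# Hypothetical data: five days with a single feature per day.
rows = [[1], [2], [3], [4], [5]]
prices = [10, 20, 30, 40, 50]
samples = make_windows(rows, prices, m=3)
```

Each window can then be fed to a recurrent model one day at a time, which preserves the temporal order of the data.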

The specific method of predicting the stock price of each stock item using the stock price prediction model was described above, so a detailed description is omitted.

The computing device compares the stock prices predicted for the correlation-analyzed stock items with one another and selects, from among those stock items, a stock item whose price is predicted to rise (S1050).

The specific method of selecting a stock item was described above, so a detailed description is omitted.
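The selection step can be sketched as follows. The description says only that predicted prices are compared and a rising item is selected. Ranking the candidates of each group by predicted return relative to the last known price, and skipping groups with no predicted rise, are assumptions made for this illustration, as are all names and figures.

```python
def select_rising(last_prices, predicted_prices, groups):
    """Pick, per group, the candidate with the largest predicted rise.

    Groups in which no candidate is predicted to rise are left out.
    """
    picks = {}
    for group_id, candidates in groups.items():
        best, best_return = None, 0.0
        for stock in candidates:
            expected = predicted_prices[stock] / last_prices[stock] - 1.0
            if expected > best_return:
                best, best_return = stock, expected
        if best is not None:
            picks[group_id] = best
    return picks


# Hypothetical candidates extracted from two stock item groups.
last = {"A": 100.0, "B": 50.0, "C": 20.0, "D": 10.0}
predicted = {"A": 105.0, "B": 56.0, "C": 19.0, "D": 9.5}
picks = select_rising(last, predicted, {0: ["A", "B"], 1: ["C", "D"]})
```

Comparing returns rather than raw prices keeps a high-priced item from being preferred merely because of its price level.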

Although FIGS. 8 to 10 describe the processes as being executed sequentially, this is merely illustrative. A person skilled in the art may apply various modifications and variations, such as changing the order shown in FIGS. 8 to 10, executing one or more processes in parallel, or adding other processes, without departing from the essential characteristics of the embodiments of the present invention.

With the stock item selection method for constructing a stock portfolio according to an embodiment of the present invention, stock items whose prices are predicted to rise can be selected, and a stock portfolio that lowers investment risk by adding the selected stock items can be recommended.

The operations according to the present embodiments may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. A computer-readable medium is any medium that participates in providing instructions to a processor for execution. A computer-readable medium may include program instructions, data files, data structures, or combinations thereof; examples include magnetic media, optical recording media, and memory. The computer program may also be distributed over networked computer systems so that the computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the present embodiments can be readily inferred by programmers skilled in the art to which the embodiments belong.

The above description merely illustrates the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention pertains may make various modifications, changes, and substitutions without departing from its essential characteristics. Accordingly, the embodiments disclosed herein and the accompanying drawings are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments or drawings. The scope of protection of the present invention should be interpreted according to the claims below, and all technical ideas within a scope equivalent thereto should be interpreted as falling within the scope of the rights of the present invention.

100: stock item selection device
110: stock price data collection unit
120: correlation analysis unit
130: stock price prediction unit
140: stock item selection unit

Claims

A stock item selection device for constructing a stock portfolio, comprising:
a stock price data collection unit that collects, for each stock item, stock price data related to the price of that stock item;
a correlation analysis unit that analyzes the correlation between stock items using the collected stock price data;
a stock price prediction unit that predicts, for each of the correlation-analyzed stock items, the stock price on a prediction date, using a stock price prediction model generated by learning from the stock price data collected for the correlation-analyzed stock items; and
a stock item selection unit that compares the stock prices predicted for the correlation-analyzed stock items and selects, from among those stock items, a stock item whose price is predicted to rise.

The stock item selection device of claim 1,
wherein the correlation analysis unit comprises:
a correlation coefficient calculation unit that calculates a correlation coefficient quantifying the correlation between stock items using the returns of the stock price of each stock item;
a clustering unit that clusters stock items with similar correlations into a preset number of stock item groups according to the calculated correlation coefficients; and
a stock item extraction unit that extracts, from among the stock items with similar correlations contained in each clustered stock item group, at least one stock item whose correlation coefficient is lower than a threshold as a candidate stock item for stock price prediction,
and wherein the stock price prediction unit predicts the stock price of the stock item extracted as the candidate.

The stock item selection device of claim 2,
wherein the correlation analysis unit further comprises:
a distance calculation unit that uses the calculated correlation coefficients to calculate a distance quantifying the distance relationship between stock items; and
a stock network graph generation unit that generates a stock network graph representing the distance relationships between stock items according to the calculated distances,
and wherein the clustering unit clusters, from the generated stock network graph, stock items with similar correlations into the preset number of stock item groups according to the distance relationships between stock items.

The stock item selection device of claim 3,
wherein the stock item extraction unit
constructs, from the stock network graph and based on the calculated distances between stock items, a minimum spanning tree that minimizes the weights of the edges connecting the stock items, and uses the constructed minimum spanning tree to extract, from among the stock items with similar correlations contained in each clustered stock item group, at least one stock item whose correlation coefficient is lower than the threshold as a candidate stock item for stock price prediction.

The stock item selection device of claim 4,
wherein the at least one stock item whose correlation coefficient is lower than the threshold is a stock item that has degree 1, that is, exactly one edge, in the constructed minimum spanning tree.

The stock item selection device of claim 4,
wherein the stock item extraction unit
applies Prim's algorithm to select one stock item from the stock network graph, selects, among all stock items that can be connected to the selected stock item by an edge, the stock item reached by the edge with the minimum correlation coefficient value, and constructs the minimum spanning tree by repeating this process without re-selecting stock items that have already been selected.

The stock item selection device of claim 3,
wherein the clustering unit
applies the K-medoid algorithm to designate the preset number of arbitrary stock items as representatives, forms the preset number of provisional stock item groups in consideration of the similarity, according to the distance relationship, between the designated stock items and the remaining stock items, and compares the distances between the stock items assigned to each provisional group and the center of each group, thereby clustering, from the generated stock network graph, stock items similar in distance relationship into the preset number of stock item groups.

The stock item selection device of claim 2,
wherein the stock item selection unit
compares the stock prices predicted for the at least one stock item extracted from each clustered stock item group and selects, in each of the preset number of stock item groups, a stock item whose price is predicted to rise.

The stock item selection device of claim 2,
wherein the correlation coefficient calculation unit
calculates the log return of the stock price of each stock item and calculates the correlation coefficient between stock items using the average value of the calculated log returns.

The stock item selection device of claim 1,
further comprising a stock price data conversion unit that processes the stock price data collected for each date, which include the price, trading volume, and trading time of the correlation-analyzed stock items, and converts them into technical analysis data used for stock analysis,
wherein the stock price prediction unit
generates, for the correlation-analyzed stock items, input data from the collected stock price data and the converted technical analysis data, and predicts from the generated input data the stock price on the prediction date for each of the correlation-analyzed stock items.

The stock item selection device of claim 10,
wherein the input data
have a preset dimension covering M days (M being a natural number),
wherein the stock price prediction unit
divides the generated input data into M pieces that reflect the time-series characteristics of the M days,
and wherein the stock price prediction model
is composed of a layer based on a recurrent neural network (RNN) capable of time-series data analysis, which receives each of the M divided pieces of input data reflecting the time-series characteristics of the M days and outputs a predicted stock price on the prediction date for each of the correlation-analyzed stock items.

The stock item selection device of claim 11,
wherein the stock price prediction model
is composed of a recurrent neural network (RNN)-based layer made up of long short-term memory (LSTM) cells, each including a forget gate that determines whether the previous memory cell state is reflected in the current memory cell, so as to receive the M divided pieces of input data in temporal order and output a predicted stock price on the prediction date for each of the correlation-analyzed stock items.

The stock item selection device of claim 12,
wherein the stock price prediction model
is composed of a recurrent neural network (RNN)-based layer made up of bidirectional long short-term memory (Bi-LSTM) cells, each formed by combining two of the long short-term memory (LSTM) cells so that state is propagated in both directions, so as to receive the M divided pieces of input data in temporal order and output a predicted stock price on the prediction date for each of the correlation-analyzed stock items.

A stock item selection method for constructing a stock portfolio, performed by a computing device, the method comprising:
collecting, for each stock item, stock price data related to the price of that stock item;
analyzing the correlation between stock items using the collected stock price data;
predicting, for each of the correlation-analyzed stock items, the stock price on a prediction date, using a stock price prediction model generated by learning from the stock price data collected for the correlation-analyzed stock items; and
comparing the stock prices predicted for the correlation-analyzed stock items and selecting, from among those stock items, a stock item whose price is predicted to rise.

The method of claim 14,
wherein analyzing the correlation comprises:
calculating a correlation coefficient quantifying the correlation between stock items using the returns of the stock price of each stock item;
clustering stock items with similar correlations into a preset number of stock item groups according to the calculated correlation coefficients; and
extracting, from among the stock items with similar correlations contained in each clustered stock item group, at least one stock item whose correlation coefficient is lower than a threshold as a candidate stock item for stock price prediction,
and wherein predicting the stock price
comprises predicting the stock price of the at least one stock item extracted as the candidate.

The method of claim 15,
wherein selecting a stock item whose price is predicted to rise
comprises comparing the stock prices predicted for the at least one stock item extracted from each clustered stock item group and selecting, in each of the preset number of stock item groups, a stock item whose price is predicted to rise.

A storage medium on which is recorded a computer-readable program for executing, on a computer, the stock item selection method for constructing a stock portfolio according to any one of claims 14 to 16.