KR20190064749A

KR20190064749A - Method and device for intelligent decision support in stock investment

Info

Publication number: KR20190064749A
Application number: KR1020170163906A
Authority: KR
Inventors: 윤병운; 정유진; 노태연
Original assignee: 신한금융투자 주식회사
Priority date: 2017-12-01
Filing date: 2017-12-01
Publication date: 2019-06-11
Also published as: KR102105744B1

Abstract

The present invention relates to an intelligent stock investment decision support method including: a step (a) of collecting at least one of news, SNS, reports, disclosure data, financial statement data, and posts and comments on finance-related community bulletin boards relating to a company to be analyzed; a step (b) of screening risk companies using at least one of the collected disclosure data, financial statement data, and news; a step (c) of deriving sentiment values for each word and each document using the news and SNS data, and deriving a credit risk level from the trend of change in the word-level and document-level sentiment values; and a step (d) of deriving signals based on the derived credit risk level and predicting the probability of a credit event using the derived signals.

Description

The present invention relates to a method and apparatus for intelligent securities investment decision support.

With the recent surge of interest in artificial intelligence, research on machine learning, deep learning, and related techniques is being actively pursued. In the fintech industry in particular, innovations that could substantially change existing financial services are emerging around payment systems, financial services, bank lending, investment and asset management, insurance, and market infrastructure.

In addition, the fintech industry is becoming increasingly active, led by advanced global ICT (information and communication technology) companies and numerous start-ups overseas. In particular, these companies are raising their revenue growth by analyzing posts on Twitter, a social network service, to identify customer tendencies and by providing customized information and services on that basis.

In Korea, however, such success stories are rare. Although the domestic mobile environment and technology are excellent, the strict regulatory environment for domestic financial services, including low autonomy and advance security reviews, hinders innovation in the domestic fintech industry.

Machine learning, a field of artificial intelligence, can make predictions based on attributes learned from training data. Many attempts have therefore been made to predict stock prices by applying machine learning to stock price data or financial statements, which accumulate in large volume over time, but these attempts have stopped at the level of prediction and there are almost no real success stories.

The present invention provides an intelligent securities investment decision support method and apparatus that can support investment decisions through opinion mining and machine learning based on data collected from news and SNS.

The present invention also provides an intelligent securities investment decision support method and apparatus that predict credit risk through opinion mining, sentiment analysis, and machine learning using data collected from news, SNS, and similar sources, define risk signals from that prediction, and ultimately predict the likelihood that a credit event will occur at a company.

According to one aspect of the present invention, there is provided an intelligent securities investment decision support method that can support investment decisions through opinion mining and machine learning based on data collected from news and SNS.

According to an embodiment of the present invention, there may be provided an intelligent securities investment decision support method comprising: (a) collecting at least one of news, SNS, reports, disclosure data, financial statement data, and posts and comments on finance-related community bulletin boards relating to a company to be analyzed; (b) screening risk companies using at least one of the collected disclosure data, financial statement data, and news; (c) deriving sentiment values for each word and each document using the news and SNS data, and deriving a credit risk level from the trend of change in the derived word-level and document-level sentiment values; and (d) deriving signals based on the derived credit risk level and predicting the probability of a credit event using the derived signals.

When the company to be analyzed is a listed company, step (b) may include: deriving disclosure indicators and financial indicators from the disclosure data and the financial statement data, respectively; applying the derived disclosure indicators and financial indicators to a trained support vector machine (SVM) to make a primary classification into normal companies and risk companies; analyzing the frequency of predefined credit event keywords in the collected data to make a secondary derivation of risk companies; and selecting final risk candidate companies by combining the results of the primary classification and the secondary derivation.

The credit event may be at least one of: a paid-in capital increase, capital impairment, corporate bonds, disposal of treasury shares, a reverse stock split (par value consolidation), accounting fraud, a capital reduction without compensation, private equity funds, a sale by the largest or major shareholders, a debt guarantee for others, default, corporate rehabilitation proceedings, overdue loan principal and interest, non-payment of bond principal and interest, an inquiry disclosure, and a delayed audit report.

When the company to be analyzed is an unlisted company, step (b) may screen risk companies by deriving the frequency of credit event keywords in the collected news and in the collected disclosure data, respectively.
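The keyword-frequency screening described above can be sketched roughly as follows. This is only an illustration: the keyword list, the function names, and the `min_hits` threshold are assumptions made here, since the text does not specify how frequencies are turned into a screening decision.

```python
from collections import Counter

# Hypothetical keyword list: English stand-ins for a few of the credit events named in the text.
CREDIT_EVENT_KEYWORDS = ["capital impairment", "default", "rehabilitation", "accounting fraud"]


def keyword_frequencies(documents, keywords=CREDIT_EVENT_KEYWORDS):
    """Count how often each credit event keyword appears across a company's documents."""
    counts = Counter()
    for doc in documents:
        text = doc.lower()
        for kw in keywords:
            counts[kw] += text.count(kw)
    return counts


def is_risk_candidate(documents, min_hits=2):
    """Flag a company when total keyword hits reach min_hits (the threshold is an assumption)."""
    return sum(keyword_frequencies(documents).values()) >= min_hits


docs = [
    "Company A disclosed capital impairment for the second year.",
    "Creditors fear a default; capital impairment deepens.",
]
freqs = keyword_frequencies(docs)
flagged = is_risk_candidate(docs)
```

A real system would presumably match Korean keywords after morphological analysis rather than raw substrings; plain substring counting is used here only to keep the sketch self-contained.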

When the company to be analyzed is a listed company, step (c) may include: assigning a sentiment polarity value to each opinion contained in the collected data using the stock price; selecting positive and negative core keywords based on the sentiment values; selecting meaningful words from the collected data to build a data set, and deriving a vector value for each word in the data set to embed the words in a vector space; deriving, in the vector space, the distance between each word and the core keywords from their vector values, and deriving a sentiment value for each word based on that distance; summing the sentiment values of the words in a document to derive a document-level sentiment value; and calculating the credit risk level using the trend of change in the derived document-level sentiment values.
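A minimal sketch of the word-level and document-level scoring might look like this. The text only says that a word's sentiment is derived from its distance to the core keywords; the use of cosine similarity, the "positive minus negative" scoring rule, and the toy two-dimensional vectors are all assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_sentiment(word_vec, positive_seeds, negative_seeds):
    """Mean similarity to positive core keywords minus mean similarity to negative ones."""
    pos = sum(cosine(word_vec, s) for s in positive_seeds) / len(positive_seeds)
    neg = sum(cosine(word_vec, s) for s in negative_seeds) / len(negative_seeds)
    return pos - neg


def document_sentiment(tokens, embeddings, positive_seeds, negative_seeds):
    """Sum the sentiment values of all tokens that have an embedding."""
    return sum(
        word_sentiment(embeddings[t], positive_seeds, negative_seeds)
        for t in tokens
        if t in embeddings
    )


# Toy embeddings (hypothetical); real vectors would come from a trained embedding model.
embeddings = {
    "growth": (1.0, 0.0),
    "loss": (0.0, 1.0),
    "surge": (0.9, 0.1),
    "lawsuit": (0.1, 0.9),
}
positive_seeds = [embeddings["growth"]]
negative_seeds = [embeddings["loss"]]
score = document_sentiment(["surge", "surge", "lawsuit"], embeddings, positive_seeds, negative_seeds)
```

Tracking `score` per document over time would then give the trend from which the credit risk level is computed.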

When the company to be analyzed is an unlisted company, step (c) may include: defining a sentiment dictionary; selecting meaningful words from the collected data to build a data set, and deriving a vector value for each word in the data set to embed the words in a vector space; deriving a sentiment value for each word in the vector space using the sentiment dictionary; summing the sentiment values of the words in a document to derive a document-level sentiment value; and calculating the credit risk level using the trend of change in the derived document-level sentiment values.

Defining the sentiment dictionary may include: collecting data in which sentiment is expressed for each stock-investment-related opinion; and deriving a sentiment value for each word by applying a naive Bayes classifier to the collected data.
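One way to read the naive Bayes step is as a per-word log-likelihood ratio between the positive and negative classes. The sketch below follows that reading; the labels, the Laplace smoothing parameter `alpha`, and the function name are assumptions, not details given in the text.

```python
import math
from collections import Counter


def build_sentiment_dictionary(labeled_docs, alpha=1.0):
    """Return {word: log P(word | positive) - log P(word | negative)} with Laplace smoothing."""
    pos_counts, neg_counts = Counter(), Counter()
    for tokens, label in labeled_docs:
        (pos_counts if label == "positive" else neg_counts).update(tokens)
    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values()) + alpha * len(vocab)
    neg_total = sum(neg_counts.values()) + alpha * len(vocab)
    return {
        w: math.log((pos_counts[w] + alpha) / pos_total)
        - math.log((neg_counts[w] + alpha) / neg_total)
        for w in vocab
    }


# Tiny hypothetical training set of tokenized opinions with sentiment labels.
labeled_docs = [
    (["buy", "strong", "earnings"], "positive"),
    (["sell", "weak", "earnings"], "negative"),
]
lexicon = build_sentiment_dictionary(labeled_docs)
```

Words that occur equally often in both classes, such as "earnings" here, receive a value of zero and therefore do not move a document's score.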

In step (d), the signals may be derived using the daily trend of change in the document-level sentiment values.

The derived signals may be at least one of: a signal indicating that the document-level sentiment value has remained positive or negative for at least a certain period (for example, three days); an increase/decrease signal of the daily average sentiment value; a crossing signal of the daily average sentiment value; a ratio signal of document-level sentiment values; and a signal of the average weekly growth rate of document-level sentiment values.
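Three of these signals can be sketched from a list of daily average sentiment values. The exact definitions are not given in the text, so the rules below (sign streak over the last `days` values, day-over-day change, and a zero crossing) are one plausible interpretation rather than the patented definitions.

```python
def sustained_signal(daily_avg, days=3):
    """Return 'positive' or 'negative' if the last `days` values share that sign, else None."""
    if len(daily_avg) < days:
        return None
    window = daily_avg[-days:]
    if all(v > 0 for v in window):
        return "positive"
    if all(v < 0 for v in window):
        return "negative"
    return None


def change_signal(daily_avg):
    """Day-over-day change of the latest value: +1 rising, -1 falling, 0 flat or unknown."""
    if len(daily_avg) < 2:
        return 0
    delta = daily_avg[-1] - daily_avg[-2]
    return (delta > 0) - (delta < 0)


def crossing_signal(daily_avg):
    """True when the latest value changed sign relative to the previous day (assumed reading)."""
    if len(daily_avg) < 2:
        return False
    return daily_avg[-2] * daily_avg[-1] < 0


# Hypothetical daily average sentiment values, oldest first.
series = [0.4, -0.1, -0.3, -0.6]
signals = {
    "sustained": sustained_signal(series),
    "change": change_signal(series),
    "crossing": crossing_signal(series),
}
```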

Step (d) may include: collecting data on companies in the same industry and performing logistic regression analysis, checking whether a predefined credit event occurred within a certain period, to derive the effective signals for each industry; building a regression equation using the industry-specific effective signals; and predicting the probability of a credit event by applying the derived signals to the regression equation of the corresponding industry.
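Once an industry-specific regression equation exists, prediction reduces to evaluating the logistic function on the company's signals. The sketch below assumes the coefficients have already been fitted; the industry name, signal names, and coefficient values are invented for illustration and do not come from the text.

```python
import math

# Hypothetical fitted models: one intercept and one coefficient per effective signal, per industry.
INDUSTRY_MODELS = {
    "construction": {
        "intercept": -2.0,
        "coefficients": {"sustained_negative": 1.5, "weekly_growth": -0.8},
    },
}


def credit_event_probability(signals, model):
    """Logistic regression: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = model["intercept"] + sum(
        coef * signals.get(name, 0.0) for name, coef in model["coefficients"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


model = INDUSTRY_MODELS["construction"]
p_calm = credit_event_probability({"sustained_negative": 0.0, "weekly_growth": 0.0}, model)
p_alert = credit_event_probability({"sustained_negative": 1.0, "weekly_growth": 0.0}, model)
```

Keeping one model per industry mirrors the text's point that the set of effective signals is derived separately for each industry.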

According to another aspect of the present invention, there may be provided an apparatus that executes an intelligent securities investment decision support method capable of supporting investment decisions through opinion mining and machine learning based on data collected from news and SNS.

According to an embodiment of the present invention, there may be provided a computer device on which an intelligent securities investment decision support method is executed, the computer device comprising: a communication unit; a memory storing at least one instruction; and a processor executing the instructions stored in the memory, wherein the instructions executed by the processor perform: (a) collecting at least one of news, SNS, reports, disclosure data, financial statement data, and posts and comments on finance-related community bulletin boards relating to a company to be analyzed; (b) screening risk companies using at least one of the collected disclosure data, financial statement data, and news; (c) deriving sentiment values for each word and each document using the news and SNS data, and deriving a credit risk level from the trend of change in the derived word-level and document-level sentiment values; and (d) deriving signals based on the derived credit risk level and predicting the probability of a credit event using the derived signals.

By providing the intelligent securities investment decision support method and apparatus according to an embodiment of the present invention, investment decisions can be supported through opinion mining and machine learning based on data collected from news and SNS.

In addition, the present invention can predict credit risk through opinion mining, sentiment analysis, and machine learning using data collected from news, SNS, and similar sources, define risk signals from that prediction, and ultimately predict the likelihood that a credit event will occur at a company.

FIG. 1 is a block diagram schematically illustrating the internal configuration of an intelligent securities investment decision support apparatus according to an embodiment of the present invention.
FIG. 2 illustrates the internal configuration of a data collection module according to an embodiment of the present invention.
FIG. 3 illustrates the internal configuration of a risk company screening module according to an embodiment of the present invention.
FIG. 4 is a flowchart of a method for screening risk companies by evaluating the financial soundness of listed companies according to an embodiment of the present invention.
FIG. 5 illustrates financial indicators according to an embodiment of the present invention.
FIG. 6 illustrates disclosure indicators according to an embodiment of the present invention.
FIG. 7 shows a comparison between risk companies predicted by evaluating financial soundness with an SVM, based on the disclosure and financial indicators of test companies, and the actual credit outcomes of those companies, according to an embodiment of the present invention.
FIG. 8 illustrates an example in which the frequency of credit events in the data collected for each company is tabulated by keyword, according to an embodiment of the present invention.
FIG. 9 illustrates a method for evaluating the financial soundness of unlisted companies according to an embodiment of the present invention.
FIG. 10 illustrates the internal configuration of a credit risk calculation module according to an embodiment of the present invention.
FIG. 11 illustrates a method for evaluating the credit risk of a listed company according to an embodiment of the present invention.
FIG. 12 illustrates a method for evaluating the credit risk of an unlisted company according to an embodiment of the present invention.
FIG. 13 illustrates the assignment of sentiment values to opinions according to an embodiment of the present invention.
FIG. 14 illustrates a sentiment dictionary according to an embodiment of the present invention.
FIG. 15 illustrates the detailed configuration of a credit event likelihood prediction module according to an embodiment of the present invention.
FIG. 16 is a flowchart of a method for predicting the likelihood of a credit event according to an embodiment of the present invention.
FIG. 17 illustrates signals according to an embodiment of the present invention.

As used herein, singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "consists of" or "comprises" should not be construed as necessarily including all of the elements or steps described in the specification; some of the elements or steps may be omitted, and additional elements or steps may be included. Terms such as "unit" and "module" used in the specification denote a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software. Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

The present invention first screens a group of risk companies using disclosure data, financial indicators, and text-based methods, then detects risk signals through opinion mining on the selected companies, and finally predicts credit events.

Here, the credit events that can be identified from disclosure data may be at least one of: a paid-in capital increase, capital impairment, corporate bonds, disposal of treasury shares, a reverse stock split (par value consolidation), accounting fraud, a capital reduction without compensation, private equity funds, a sale by the largest or major shareholders, a debt guarantee for others, default, corporate rehabilitation proceedings, overdue loan principal and interest, non-payment of bond principal and interest, an inquiry disclosure, and a delayed audit report.

Each credit event is briefly described below. Because the credit event types are economic terms, each is explained briefly for ease of understanding.

A paid-in capital increase (rights offering) means issuing additional new shares in exchange for funds and selling them through the stock market. Because the number of traded shares increases, the stakes of existing shareholders tend to be diluted; that is, more shares mean that ownership of the company is divided among more people, which puts downward pressure on the share price.

Capital impairment is a state in which paid-in capital has been eroded: surplus is exhausted through losses and the company's equity shrinks, so that, because of accumulated deficits, total equity falls below paid-in capital. Capital impairment is generally a condition for delisting. Impairment of 50% or more is grounds for designation as an administrative issue, and a company that has been impaired by 50% or more for two consecutive years, or that is in complete capital impairment, is considered for immediate removal and delisting. The easiest way to escape capital impairment is a capital reduction without compensation.
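The thresholds above can be turned into a small check. The text does not give a formula, so the ratio used below, (paid-in capital minus total equity) divided by paid-in capital, is the commonly used definition and is an assumption here, as are the function names and the status labels.

```python
def impairment_ratio(paid_in_capital, total_equity):
    """Share of paid-in capital that has been eroded; 0.0 when equity covers capital."""
    if paid_in_capital <= 0:
        raise ValueError("paid_in_capital must be positive")
    return max(0.0, (paid_in_capital - total_equity) / paid_in_capital)


def impairment_status(yearly):
    """Classify from a list of (paid_in_capital, total_equity) pairs, oldest year first."""
    ratios = [impairment_ratio(capital, equity) for capital, equity in yearly]
    latest = ratios[-1]
    if latest >= 1.0:
        return "complete impairment"
    if len(ratios) >= 2 and ratios[-2] >= 0.5 and latest >= 0.5:
        return "two consecutive years at 50% or more"
    if latest >= 0.5:
        return "administrative issue"
    if latest > 0.0:
        return "partial impairment"
    return "no impairment"


# Hypothetical figures: equity falls from 80 to 40 against paid-in capital of 100.
status = impairment_status([(100, 80), (100, 40)])
```

Under the rules stated in the text, the first two statuses are the cases in which delisting is considered.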

Corporate bonds are classified by type into BW (bond with warrants) and CB (convertible bond). A BW is divided into the warrant and the bond with warrants. The warrant is the right to be allotted new shares preferentially when new shares are issued for a capital increase; the bond with warrants is a corporate bond to which that right is attached and which, unlike an ordinary bond, includes the opportunity to purchase additionally listed shares.

With a BW, the number of shares increases by the number of new shares issued, while the company's earnings are unchanged. Because only the share count rises without any improvement in earnings, earnings per share fall, which is a factor in share price decline.

If the volume of warrants issued is on the order of one day's trading volume, it is not a risk factor. If the issued volume is 10% or more of the outstanding shares, however, it weighs on the share price, works against existing shareholders, and leads to a price decline.

Issuance of convertible bonds is not good news for existing shareholders. Leaving aside cases of abuse, in the normal case the issuance of convertible bonds means that the company needs funds. This may be because its financial condition has deteriorated and it needs cash, or the purpose may be investment that benefits the company's development.

Treasury shares are shares issued by a company that the company has reacquired, under certain conditions or for certain reasons, and is holding. Disposal of treasury shares means that the company is raising operating funds. Because it increases the volume of shares in the market, it has a negative effect on the share price. A company that secures funds this way may nevertheless raise its value through new investment, so the disposal may be bad news in the short term but good news in the long term.

A reverse stock split (par value consolidation) combines shares of low par value to raise the par value; it reduces the number of shares and can raise the share price. Companies commonly use it when they want to lift their share price. It also keeps the company's image from looking cheap because of a low share price, and although turnover falls, the share price becomes relatively easier to manage.

Accounting fraud (window dressing) is the act of artificially manipulating the figures in financial statements or accounting reports. Whether it overstates or understates results, it is difficult to detect from the financial statements, but once it occurs it can lead directly to corporate default, delisting, and similar outcomes.

A capital reduction without compensation takes and cancels shares in proportion to the reduction ratio without any compensation to shareholders; it is usually carried out by consolidating several shares and reissuing them as a smaller number of shares. The gain on capital reduction is used to cover the deficit while paid-in capital is reduced. This is the worst case for shareholders, although the share price sometimes rises when the company follows up with a paid-in capital increase. Because the company in most cases pays no compensation to shareholders, total assets do not change, but the share price may fall when the reduction takes place.

A private equity fund is a fund that pools money from a small number of investors and invests it in stocks, bonds, and the like. Private equity refers to a strategy of raising funds privately from a small number of long-term investors, acquiring and restructuring companies and financial institutions, and then selling or relisting them to recover the invested funds. Because such funds can buy the shares of a particular company without restriction, they can be abused as a means of supporting affiliates or moving funds internally among conglomerates.

A sale by the largest or major shareholders means that a company's shareholders sell their shares in that company. Frequent changes of the largest shareholder indicate that the company's business is unstable. There is also a possibility of insider trading or unfair trading, which can act as bad news.

A debt guarantee for others means that, when a third party (the largest shareholder, management, another corporation, etc.) lacks the collateral to borrow as much as it wants, the company acts as guarantor to help it obtain the loan. This is bad news because a liability that was not recorded as debt can suddenly arise and incur incidental costs.

Default (dishonor) means that a note or check presented at maturity is not settled normally by the close of business that day. A company in default lacks the ability to repay its debts, and there is almost no chance of recovering an investment in it. Default directly affects the stock market and investment, and a defaulting company is subject to penalties such as suspension of checking-account transactions and designation as an administrative issue.

Corporate rehabilitation proceedings are a system that gives a company with excessive debt an opportunity to recover. A company enters the proceedings when its business is worth continuing but, because of problems such as overinvestment or financial accidents, its operating profit cannot cover its debts. If rehabilitation proves impossible, the result may be delisting.

An inquiry disclosure is a disclosure in which the exchange requests a response, and the company responds within half a day, when there are rumors or reports about material management matters or a marked change in the price or trading volume of the issued shares. An inquiry disclosure may have only a slight effect on the stock market, but it deserves attention because the exchange itself demands the response when a specific event occurs.

An audit report delay mainly occurs when material uncertainty has arisen within a company and the company cannot present a remedy, or when the company refuses to correct a material misstatement. A disclaimer of opinion in the audit report can likewise be regarded as a related credit event, and even for a company with an unqualified opinion, financial unsoundness can be identified by also examining operating losses and similar items.

According to an embodiment of the present invention: FIG. 1 is a block diagram schematically illustrating the internal configuration of an intelligent securities investment decision support apparatus; FIG. 2 illustrates the internal configuration of a data collection module; FIG. 3 illustrates the internal configuration of a risk company screening module; FIG. 4 is a flowchart of a method for screening risk companies by evaluating the financial soundness of listed companies; FIG. 5 illustrates financial indicators; FIG. 6 illustrates disclosure indicators; FIG. 7 shows a comparison between risk companies predicted by evaluating financial soundness with an SVM, based on the disclosure and financial indicators of test companies, and the actual credit outcomes of those companies; FIG. 8 illustrates an example in which the frequency of credit events in the data collected for each company is tabulated by keyword; FIG. 9 illustrates a method for evaluating the financial soundness of unlisted companies; FIG. 10 illustrates the internal configuration of a credit risk calculation module; FIG. 11 illustrates a method for evaluating the credit risk of a listed company; FIG. 12 illustrates a method for evaluating the credit risk of an unlisted company; FIG. 13 illustrates the assignment of sentiment values to opinions; FIG. 14 illustrates a sentiment dictionary; FIG. 15 illustrates the detailed configuration of a credit event likelihood prediction module; FIG. 16 is a flowchart of a method for predicting the likelihood of a credit event; and FIG. 17 illustrates signals.

Referring to FIG. 1, the intelligent securities investment decision support apparatus 100 according to an embodiment of the present invention includes a data collection module 110, a risk company screening module 115, a credit risk calculation module 120, and a credit event occurrence probability prediction module 125.

The data collection module 110 is a means for collecting online and SNS data on an analysis target company and storing it in a database.

The data collection module 110 can collect various information related to the analysis target company online and from SNS. Because the purpose of the present invention is to support securities investment decisions, the module collects data relevant to that purpose. It can collect and store such data through web crawling. Web crawling itself is well known to those skilled in the art, so a separate description is omitted.

For example, the data collection module 110 may collect news, reports, SNS data, posts and comments on finance-related community bulletin boards, disclosure data, financial statement data, and the like related to the analysis target company, and store them in a database.

FIG. 2 schematically shows the internal configuration of the data collection module 110. Referring to FIG. 2, the data collection module 110 according to an embodiment collects news related to the analysis target company and stores it in a news database, collects SNS data and stores it in an SNS database, collects financial statement data and stores it in a financial statement database, and collects disclosure data and stores it in a disclosure database.

The risk company screening module 115 is a means for screening risk companies after evaluating financial soundness by analyzing the data collected for each company (for example, disclosure data and financial statement data).

Depending on the analysis target company, disclosure data and financial statement data may or may not be available. That is, they are available for listed companies but not for unlisted companies.

Accordingly, the two cases are described separately.

FIG. 3 shows the internal configuration of the risk company screening module 115. As shown in FIG. 3, the module screens risk companies using a different financial soundness evaluation method depending on whether the company is listed or unlisted.

First, a method of screening risk companies by evaluating the financial soundness of listed companies is described in more detail with reference to FIG. 4.

In step 410, the risk company screening module 115 structures the data collected for each company.

In the data structuring step, financial indicators are first derived by calculating, from the financial statement data collected for each company, the detailed items that affect credit events.

In addition, using the financial statements, business viability, profitability, and stability indicators may each be derived based on an SPL rating model, and the final financial indicator may be derived by combining these indicators at specified ratios. That is, the risk company screening module 115 can screen risk companies by evaluating the financial soundness of the analysis target company based on the derived financial indicator.

The method of deriving each detailed indicator used to derive the financial indicator, and the weight given to each detailed indicator, are as shown in FIG. 5.

With reference to FIG. 5, the method of deriving the detailed indicators used to derive each financial indicator is briefly described.

As shown in FIG. 5, deriving the financial indicator requires deriving a business viability indicator, a profitability indicator, and a stability indicator. The final financial indicator is derived by weighting the business viability indicator at 30%, the profitability indicator at 35%, and the stability indicator at 35%.

The method of deriving each indicator is described below.

The business viability indicator is derived by deriving an asset size indicator and an asset quality indicator and weighting them at 45% and 55%, respectively. The asset size indicator and the asset quality indicator may each be derived by any one of the methods defined in FIG. 5.

The profitability indicator is derived by deriving a business performance indicator and a performance quality indicator and weighting them at 60% and 40%, respectively. The formulas for deriving the business performance indicator and the performance quality indicator are as shown in FIG. 5.

The stability indicator is derived by deriving a financial burden indicator and a repayment capacity indicator using the formulas defined in FIG. 5 and weighting them at 55% and 45%, respectively.
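The weighting scheme described above can be sketched as a two-level weighted sum. The weights are the ones stated in the description; the sub-indicator names, the `financial_indicator` function, and the example scores are hypothetical, and the sketch assumes the six sub-indicators have already been computed on a common scale using the formulas of FIG. 5, which are not reproduced here.

```python
# Weights are taken from the description; everything else is illustrative.
WEIGHTS = {
    "business_viability": (0.30, {"asset_size": 0.45, "asset_quality": 0.55}),
    "profitability": (0.35, {"business_performance": 0.60, "performance_quality": 0.40}),
    "stability": (0.35, {"financial_burden": 0.55, "repayment_capacity": 0.45}),
}


def financial_indicator(scores):
    """Combine six sub-indicator scores into a single financial indicator."""
    total = 0.0
    for top_weight, sub_weights in WEIGHTS.values():
        # Weighted score of one indicator group (e.g. profitability).
        group = sum(w * scores[name] for name, w in sub_weights.items())
        total += top_weight * group
    return total


# Hypothetical sub-indicator scores on a 0-100 scale.
example = {
    "asset_size": 80.0, "asset_quality": 60.0,
    "business_performance": 70.0, "performance_quality": 50.0,
    "financial_burden": 40.0, "repayment_capacity": 90.0,
}
print(financial_indicator(example))
```

Because the weights at each level sum to 1, the combined indicator stays on the same scale as its inputs.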

The risk company screening module 115 also derives disclosure indicators. For example, it may analyze disclosure data for a past period (for example, two or four years) that includes the year in which the credit event occurred, and use the result as an indicator. In doing so, it may derive the disclosure indicators by counting the occurrences of the items shown in FIG. 6.

FIG. 6 shows an example summarizing the disclosure data that affect the occurrence of credit events in an embodiment of the present invention.

Once data structuring has produced the financial indicator with its detailed items and the disclosure indicators (numbers of disclosures), in step 415 the risk company screening module 115 applies the structured data to a support vector machine (SVM) to classify companies as normal or at risk. The SVM algorithm itself is well known, so a separate description is omitted.

The SVM is assumed to have been trained on collected historical company data separated into a credit event group and a normal repayment group. The trained SVM uses the financial indicators (the detailed items evaluating business viability, profitability, and stability) and non-financial indicators (disclosure indicators, that is, counts of disclosures judged to affect creditworthiness) as independent variables and the occurrence of a credit event as the dependent variable, providing a first-pass screening of normal and risk companies.
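As an illustration of this classification step, the following is a minimal sketch of a linear SVM trained by full-batch subgradient descent on the regularized hinge loss. The description does not specify the kernel, the training procedure, or the feature scaling, so the linear model, the hyperparameters `lam`, `lr`, and `epochs`, the two centered features, and the toy data are all assumptions. In practice an existing library implementation would normally be used instead.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularized hinge loss.

    X: list of feature vectors; y: labels, +1 (risk) or -1 (normal).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # this point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (risk company) or -1 (normal company)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, already-centered features:
# [financial indicator, number of adverse disclosures]
X = [[-2.0, 2.0], [-3.0, 3.0], [2.0, -2.0], [3.0, -3.0]]
y = [1, 1, -1, -1]  # +1 = credit event group, -1 = normal repayment group
w, b = train_linear_svm(X, y)
print(predict(w, b, [-1.0, 1.0]), predict(w, b, [1.0, -1.0]))
```

In this toy data a low financial indicator combined with many adverse disclosures falls on the risk side of the separating hyperplane.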

FIG. 7 illustrates a comparison between the risk companies predicted by evaluating financial soundness through the SVM, based on the disclosure and financial indicators of test companies, and the actual credit outcomes of those companies.

As shown in FIG. 7, applying each company's disclosure and financial indicators to the SVM classifies companies as normal or at risk. However, as FIG. 7 also shows, a company whose financial soundness is evaluated as good by the SVM may in fact be a risk company.

As can be seen in FIG. 7, for Woojeon (우전), which the SVM analysis predicted to be a defaulting company, the prediction matches the actual outcome of delisting. However, the companies predicted to be normal by the SVM analysis include risk companies that were actually delisted or placed in a workout.

Therefore, in step 420, the risk company screening module 115 can select risk companies by determining whether predefined credit event keywords occur in the collected data.

For example, the risk company screening module 115 may count the occurrences of each predefined credit event keyword in the collected data and analyze whether a credit event has occurred, thereby screening risk companies.
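The keyword counting in step 420 might look like the sketch below. The keyword list, the function names, and the `threshold` parameter are hypothetical, since the predefined keywords and the decision rule are not given in this part of the description; simple substring matching is also an assumption.

```python
from collections import Counter

# Hypothetical keyword list, for illustration only.
CREDIT_EVENT_KEYWORDS = ["delisting", "workout", "default", "court receivership"]


def count_credit_event_keywords(documents, keywords=CREDIT_EVENT_KEYWORDS):
    """Count how often each keyword occurs across one company's documents."""
    counts = Counter({kw: 0 for kw in keywords})
    for text in documents:
        lowered = text.lower()
        for kw in keywords:
            counts[kw] += lowered.count(kw)  # substring match, case-insensitive
    return counts


def is_risk_candidate(counts, threshold=1):
    """Flag a company when the total keyword count reaches the threshold."""
    return sum(counts.values()) >= threshold


docs = ["Company A faces delisting after a default.", "No issues reported."]
counts = count_credit_event_keywords(docs)
print(dict(counts), is_risk_candidate(counts))
```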

FIG. 8 shows an example in which the frequency of credit event occurrences in the data collected for each company is organized by keyword and risk companies are then selected.

The method of screening risk companies by evaluating the financial soundness of listed companies has been described above with reference to FIGS. 4 to 8.

A method of evaluating the financial soundness of unlisted companies is described with reference to FIG. 9.

In step 910, the risk company screening module 115 defines credit events by company and by article based on the collected news and disclosure reports.

Because financial statement data are not available for unlisted companies, their financial soundness must be evaluated using only news and disclosure reports. Accordingly, specific keywords related to credit events are extracted from the news and disclosure reports to define which credit events have occurred for each unlisted company.

In step 915, the risk company screening module 115 determines, article by article, whether credit event keywords appear.

In step 920, the risk company screening module 115 calculates the credit event occurrence frequency (keyword occurrence frequency) and its growth rate for each point in time, and performs a first-pass screening of risk companies among unlisted companies.
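Steps 915 and 920 can be sketched as a per-period keyword tally followed by a period-over-period growth rate. The period labels, the function names, and the handling of a zero baseline are assumptions made for illustration; the description does not define the growth-rate formula.

```python
from collections import Counter


def keyword_frequency_by_period(articles, keywords):
    """articles: iterable of (period, text); returns {period: keyword hits}."""
    freq = Counter()
    for period, text in articles:
        lowered = text.lower()
        freq[period] += sum(lowered.count(kw) for kw in keywords)
    return dict(sorted(freq.items()))


def growth_rates(freq):
    """Period-over-period growth; None when the previous count is zero."""
    periods = list(freq)
    rates = {}
    for prev, curr in zip(periods, periods[1:]):
        base = freq[prev]
        rates[curr] = (freq[curr] - base) / base if base else None
    return rates


articles = [
    ("2017-01", "Default rumor surfaces"),
    ("2017-02", "Default and workout discussed"),
    ("2017-02", "Default again in the news"),
]
freq = keyword_frequency_by_period(articles, ["default", "workout"])
print(freq, growth_rates(freq))
```

A sharp rise in the growth rate for an unlisted company could then serve as the first-pass screening criterion.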

As shown in FIG. 9, because financial statement data are unavailable for unlisted companies, the risk company screening module 115 selects risk companies based on whether credit events occur in the news and disclosure data.

Referring again to FIG. 1, the credit risk calculation module 120 is a means for deriving the credit risk of the companies selected as risk companies.

The credit risk calculation module 120 derives the current credit risk of the companies selected as risk companies. Here, credit risk means an indicator that uses news and SNS data to draw on sentiment values at the text and document level and on the trend in those sentiment values.

FIG. 10 shows the detailed structure of the credit risk calculation module 120. As shown in FIG. 10, part of the configuration of the module may differ between listed and unlisted companies.

First, for ease of understanding and explanation, a method of deriving the credit risk of a listed company is described with reference to FIG. 11.

In step 1110, the credit risk calculation module 120 preprocesses the data collected for the analysis target company. Unnecessary information, advertisements, and the like are removed from the data collected from news or SNS, and the data may be parsed for sentiment analysis and word embedding analysis.

In step 1115, the credit risk calculation module 120 assigns a sentiment value to each opinion using the stock price.

The credit risk calculation module 120 may assign a positive or negative sentiment value to each item of opinion data in the preprocessed data using the stock price.

To reflect the fact that opinions from the past three days affect the rise or fall of the current day's stock price, the credit risk calculation module 120 according to an embodiment may compare the current day's stock price with the stock prices one, two, and three days earlier, and assign sentiment values to the opinion data from one, two, and three days earlier accordingly. Here, the sentiment value may be positive or negative.
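One possible reading of this labeling rule is sketched below: opinions posted d days before the current day are labeled positive if the current price is higher than the price d days earlier, and negative if it is lower. The function name, the use of calendar days rather than trading days, and the choice to skip unchanged prices are assumptions.

```python
from datetime import date, timedelta


def label_opinions_by_price(prices, opinions, today, lookback_days=3):
    """prices: {date: closing price}; opinions: list of (date, text).

    Returns (text, label) pairs, label +1 (positive) or -1 (negative).
    """
    labeled = []
    for lag in range(1, lookback_days + 1):
        past = today - timedelta(days=lag)
        if past not in prices or today not in prices:
            continue  # no price available, e.g. a non-trading day
        if prices[today] == prices[past]:
            continue  # assumption: an unchanged price yields no label
        label = 1 if prices[today] > prices[past] else -1
        labeled += [(text, label) for d, text in opinions if d == past]
    return labeled


prices = {date(2017, 12, 1): 100, date(2017, 11, 30): 90, date(2017, 11, 29): 110}
opinions = [
    (date(2017, 11, 30), "earnings look strong"),
    (date(2017, 11, 29), "debt is a concern"),
]
print(label_opinions_by_price(prices, opinions, today=date(2017, 12, 1)))
```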

After sentiment values have been assigned to the opinion data, the credit risk calculation module 120 assigns a sentiment polarity value to each word by applying a naive Bayes classifier to the labeled opinion data. Through naive Bayes classification, words that appear relatively more often in positive data receive positive values, and words that appear relatively more often in negative data receive negative values.

Naive Bayes classification itself is a known technique, so a detailed description of deriving each word's polarity value from it is omitted.

In step 1120, the credit risk calculation module 120 selects, from the words assigned sentiment values by naive Bayes classification, the n (a natural number) positive words and the n negative words with the largest absolute sentiment values as core keywords.
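The description leaves the polarity formula open. A common choice, assumed here, is the difference of the smoothed class-conditional log-likelihoods from a multinomial naive Bayes model, which is positive for words seen mostly in positive opinions and negative for words seen mostly in negative ones. The function names and the smoothing constant `alpha` are hypothetical.

```python
import math
from collections import Counter


def word_polarities(labeled_docs, alpha=1.0):
    """labeled_docs: list of (tokens, label), label +1 or -1.

    Polarity = log P(word | positive) - log P(word | negative),
    with add-alpha (Laplace) smoothing.
    """
    pos, neg = Counter(), Counter()
    for tokens, label in labeled_docs:
        (pos if label > 0 else neg).update(tokens)
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + alpha * len(vocab)
    neg_total = sum(neg.values()) + alpha * len(vocab)
    return {
        w: math.log((pos[w] + alpha) / pos_total)
        - math.log((neg[w] + alpha) / neg_total)
        for w in vocab
    }


def core_keywords(polarity, n):
    """Top-n positive and top-n negative words by absolute polarity."""
    ranked = sorted(polarity.items(), key=lambda kv: kv[1])
    negative = [w for w, v in ranked[:n] if v < 0]
    positive = [w for w, v in ranked[::-1][:n] if v > 0]
    return positive, negative


docs = [
    (["profit", "growth", "profit"], 1),
    (["loss", "default", "loss"], -1),
    (["profit", "loss"], 1),
]
print(core_keywords(word_polarities(docs), n=1))
```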

In step 1125, the credit risk calculation module 120 extracts meaningful words from the collected data, derives vector values for the extracted words, and embeds them.

For example, the credit risk calculation module 120 removes noise such as symbols and numbers, which are unnecessary information, from the collected data. It then extracts each word from the collected data and tags each word with its part of speech. Data other than nouns and verbs are removed as unnecessary to create a word data set.

The credit risk calculation module 120 may generate a vector value for each word in the word data set constructed in this way and embed it. The vector may be n-dimensional.

The credit risk calculation module 120 may generate the vector values for the word data set based on the co-occurrence frequency of words.
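A minimal sketch of co-occurrence-based vectors follows. It builds raw count vectors over a sliding window, which is a simpler stand-in for a learned embedding; the `window` parameter, the function name, and the toy sentences are assumptions.

```python
from collections import defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Return (vocab, vectors), where vectors[w][i] counts how often
    vocab[i] appears within `window` tokens of w."""
    vocab = sorted({tok for sent in sentences for tok in sent})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = defaultdict(lambda: [0] * len(vocab))
    for sent in sentences:
        for i, tok in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # skip the word itself
                    vectors[tok][index[sent[j]]] += 1
    return vocab, dict(vectors)


sentences = [["debt", "default", "risk"], ["debt", "risk"]]
vocab, vectors = cooccurrence_vectors(sentences, window=1)
print(vocab, vectors)
```

With this representation the vector dimension equals the vocabulary size, so words that share contexts end up with similar vectors.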

For example, the credit risk calculation module 120 may embed each word using word2vec. word2vec is well known to those skilled in the art, so a separate description is omitted.

In step 1130, the credit risk calculation module 120 derives word-level and document-level sentiment values using the distance between each embedded word and the core keywords.

For example, because each word and each core keyword has been converted to a vector and embedded, the distance between a core keyword and a word can be derived from the difference between their vectors in the vector space, and the sentiment value of each word can be derived from that distance.

The document-level sentiment value can then be calculated by multiplying the occurrence frequency of each word in the document by that word's sentiment value and summing the products.
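The document-level rule is stated explicitly (frequency times word sentiment, summed), but the mapping from distance to word sentiment is not. The sketch below assumes one plausible rule: a word's sentiment is the cosine-similarity-weighted average of the core keywords' polarity values. All function names and the toy vectors are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_sentiment(vec, keyword_vecs, keyword_polarity):
    """Similarity-weighted average of core-keyword polarities (assumed rule)."""
    sims = {k: max(0.0, cosine(vec, kv)) for k, kv in keyword_vecs.items()}
    total = sum(sims.values())
    if total == 0:
        return 0.0
    return sum(sims[k] * keyword_polarity[k] for k in sims) / total


def document_sentiment(tokens, sentiment):
    """Sum over words of (frequency in document) x (word sentiment)."""
    counts = Counter(tokens)
    return sum(n * sentiment.get(w, 0.0) for w, n in counts.items())


keyword_vecs = {"profit": [1.0, 0.0], "default": [0.0, 1.0]}
keyword_polarity = {"profit": 1.0, "default": -1.0}
print(word_sentiment([0.0, 2.0], keyword_vecs, keyword_polarity))
print(document_sentiment(["a", "a", "b"], {"a": 0.5, "b": -2.0}))
```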

In step 1135, the credit risk calculation module 120 calculates credit risk based on the trend in document-level sentiment values and uses it to detect risk signals before a credit event occurs. The sentiment value for each date can be derived as the average of the document-level sentiment values calculated above.

For example, the credit risk calculation module 120 may calculate credit risk from the trend in document-level sentiment values, such as whether a given sentiment persists for three days or more.

Credit risk according to an embodiment of the present invention is an integrated monitoring indicator that uses collected news and social data to draw on sentiment scores (sentiment values) at the text and document level and on the trend in those scores. The credit risk indicator therefore makes it possible to define signals indicating that a risk to future investment may arise.

The method of evaluating the credit risk of a listed company has been described above with reference to FIG. 11.

Below, a method of evaluating the credit risk of an unlisted company is described in detail with reference to FIG. 12.

In step 1210, the credit risk calculation module 120 defines a sentiment dictionary.

For unlisted companies, which have no market price information, it is difficult to assign sentiment values from the stock price, so a sentiment dictionary is defined for terms commonly used in the financial market and used instead.

For example, based on all company-related opinions registered on a portal finance site such as Naver, a sentiment polarity value is first assigned to each opinion, and a sentiment value is then assigned to each word through naive Bayes classification. If a document is negative, a negative sentiment value is assigned to the words in it, and repeating this process over all documents yields a sentiment score for each word.

FIG. 13 shows an example of assigning sentiment values by opinion. FIG. 14 shows part of the sentiment dictionary defined on that basis.

As shown in FIG. 14, each word in the sentiment dictionary is classified as positive or negative and has a sentiment value.

In step 1215, the credit risk calculation module 120 extracts meaningful words from the collected data, derives vector values for the extracted words, and embeds them.

In step 1220, the credit risk calculation module 120 calculates word-level and document-level sentiment values using the sentiment dictionary.

For example, the credit risk calculation module 120 may propagate word-level sentiment values using the predefined sentiment dictionary, and then derive document-level sentiment values from the word-level values.

In step 1225, the credit risk calculation module 120 calculates credit risk based on the trend in document-level sentiment values and uses it to detect risk signals before a credit event occurs. The sentiment value for each date can be derived as the average of the document-level sentiment values calculated above.

Referring again to FIG. 1, the credit event occurrence probability prediction module 125 predicts the probability of each individual credit event through logistic regression analysis for companies with high credit risk.

FIG. 15 shows the detailed configuration of the credit event occurrence probability prediction module 125, and FIG. 16 shows the method of predicting the probability of a credit event on that basis.

This is described in more detail with reference to FIG. 16.

In step 1610, the credit event occurrence probability prediction module 125 derives signals from a credit risk perspective using the trend (increase or decrease) in daily sentiment values. FIG. 17 defines the signals according to an embodiment of the present invention.

Using the document-level sentiment values, the following signals can each be defined: a signal that sentiment stays positive or negative for a given period (for example, three days) or longer, an increase/decrease signal in the daily average sentiment value, a crossover signal in the daily average sentiment value, a ratio signal of document-level sentiment values, and a signal for the growth rate of the weekly average of document-level sentiment values.
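Two of these signals, persistence and day-over-day change, can be sketched directly from the daily averages. The exact definitions in FIG. 17 are not reproduced here, so the encodings below (+1, 0, -1) and the function names are assumptions; the crossover, ratio, and weekly growth signals are left out because their definitions cannot be inferred from the text alone.

```python
def daily_average(doc_scores_by_day):
    """doc_scores_by_day: {day: [document sentiment values]} -> {day: mean}."""
    return {d: sum(v) / len(v) for d, v in sorted(doc_scores_by_day.items()) if v}


def persistence_signal(daily, min_days=3):
    """+1 / -1 if the last `min_days` daily averages are all positive /
    all negative, otherwise 0."""
    recent = list(daily.values())[-min_days:]
    if len(recent) < min_days:
        return 0
    if all(x > 0 for x in recent):
        return 1
    if all(x < 0 for x in recent):
        return -1
    return 0


def change_signal(daily):
    """Sign of the latest day-over-day change in the daily average."""
    values = list(daily.values())
    if len(values) < 2:
        return 0
    diff = values[-1] - values[-2]
    return (diff > 0) - (diff < 0)


scores = {
    "2017-12-01": [-1.0, -3.0],
    "2017-12-02": [-0.5],
    "2017-12-03": [-1.0, -2.0],
}
daily = daily_average(scores)
print(daily, persistence_signal(daily), change_signal(daily))
```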

In step 1615, the credit event occurrence probability prediction module 125 defines effective signals at the industry level using the derived signals.

Data on companies in the same industry are collected and logistic regression analysis is performed, and a regression equation is built using only the influential signals (effective signals). For example, the effective signals that affect credit events, identified through logistic regression analysis, are defined differently for each credit event, and can be used to build a separate prediction model for each industry and each credit event.

In step 1620, the credit event occurrence probability prediction module 125 predicts the probability of a credit event using the signals derived for the analysis target company.

For example, the credit event occurrence probability prediction module 125 can predict the probability of a credit event by inputting the signals derived for the analysis target company into the credit event prediction model (that is, the regression equation) corresponding to that company.
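Applying a fitted logistic regression equation to a company's signals amounts to evaluating the logistic function of a weighted sum. The sketch below shows only this prediction step; the signal names, coefficient values, and intercept are hypothetical and would in practice come from fitting the model on data for companies in the same industry.

```python
import math


def credit_event_probability(signals, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum_i coef_i * signal_i)))."""
    z = intercept + sum(coefficients[name] * value for name, value in signals.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for one industry and one credit event type.
coefficients = {"negative_persistence": 1.2, "sentiment_decline": 0.8}
signals = {"negative_persistence": 1, "sentiment_decline": 1}
print(credit_event_probability(signals, coefficients, intercept=-2.0))
```

Keeping a separate coefficient set per industry and per credit event type mirrors the per-industry, per-event models described above.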

As described above, the intelligent securities investment decision support apparatus 100 according to an embodiment of the present invention collects online and SNS data on an analysis target company and then uses opinion mining and machine learning to detect securities investment risk signals for that company, enabling prediction in a highly volatile securities market.

The intelligent securities investment decision support apparatus 100 of FIG. 1 may be a computer. In that case, it may include a memory and a processor, and each component described with reference to FIG. 1 may be a logical component executed by the processor.

본 발명의 실시 예에 따른 장치 및 방법은 다양한 컴퓨터 수단을 통하여 수행될 수 있는 프로그램 명령 형태로 구현되어 컴퓨터 판독 가능 매체에 기록될 수 있다. 컴퓨터 판독 가능 매체는 프로그램 명령, 데이터 파일, 데이터 구조 등을 단독으로 또는 조합하여 포함할 수 있다. 컴퓨터 판독 가능 매체에 기록되는 프로그램 명령은 본 발명을 위하여 특별히 설계되고 구성된 것들이거나 컴퓨터 소프트웨어 분야 통상의 기술자에게 공지되어 사용 가능한 것일 수도 있다. 컴퓨터 판독 가능 기록 매체의 예에는 하드 디스크, 플로피 디스크 및 자기 테이프와 같은 자기매체(magnetic media), CD-ROM, DVD와 같은 광기록 매체(optical media), 플롭티컬 디스크(floptical disk)와 같은 자기-광 매체(magneto-optical media) 및 롬(ROM), 램(RAM), 플래시 메모리 등과 같은 프로그램 명령을 저장하고 수행하도록 특별히 구성된 하드웨어 장치가 포함된다. 프로그램 명령의 예에는 컴파일러에 의해 만들어지는 것과 같은 기계어 코드뿐만 아니라 인터프리터 등을 사용해서 컴퓨터에 의해서 실행될 수 있는 고급 언어 코드를 포함한다.The apparatus and method according to embodiments of the present invention may be implemented in the form of program instructions that can be executed through various computer means and recorded in a computer-readable medium. The computer readable medium may include program instructions, data files, data structures, and the like, alone or in combination. Program instructions to be recorded on a computer-readable medium may be those specially designed and constructed for the present invention or may be known and available to those of ordinary skill in the computer software arts. Examples of computer-readable media include magnetic media such as hard disks, floppy disks and magnetic tape; optical media such as CD-ROMs and DVDs; magnetic media such as floppy disks; Includes hardware devices specifically configured to store and execute program instructions such as magneto-optical media and ROM, RAM, flash memory, and the like. Examples of program instructions include machine language code such as those produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like.

The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

The present invention has been described above with a focus on its embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that the invention may be implemented in modified forms without departing from its essential characteristics. Therefore, the disclosed embodiments should be considered in an illustrative rather than a restrictive sense. The scope of the present invention is defined by the appended claims rather than by the foregoing description, and all differences within the scope of equivalents thereof should be construed as being included in the present invention.

Claims

1. An intelligent securities investment decision support method, comprising:
(a) collecting, for an analysis target company, at least one of news, SNS data, reports, disclosure data, financial statement data, and postings and comments on a finance-related community bulletin board;
(b) screening a risk company using at least one of the collected disclosure data, financial statement data, and news;
(c) deriving emotion values for each word and each document using the news and SNS data, and deriving a credit risk from the variation trend of the emotion values for each word and each document; and
(d) deriving a signal based on the derived credit risk, and predicting a credit event occurrence probability using the derived signal.

2. The method according to claim 1, wherein step (b) comprises:
deriving, if the analysis target company is a listed company, a disclosure index and a financial index using the disclosure data and the financial statement data;
applying the derived disclosure index and financial index to a trained support vector machine (SVM) to perform a first classification of companies into normal companies and risk companies;
analyzing the occurrence frequency of predetermined credit event keywords in the collected data to secondarily derive risk companies; and
selecting a final analysis candidate company based on the results of the first classification and the second derivation.
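
The first classification in claim 2 can be illustrated with a minimal sketch. The claim does not specify the SVM kernel, the training procedure, or how the disclosure and financial indices are scaled, so everything below is an assumption for illustration: a linear SVM trained by sub-gradient descent on the hinge loss, two hypothetical index features in the range 0 to 1, and labels of +1 for a risk company and -1 for a normal company.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Sub-gradient descent on the L2-regularised hinge loss.

    X: list of [disclosure_index, financial_index] rows (hypothetical features).
    y: labels, +1 for a risk company and -1 for a normal company.
    """
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The regularisation term shrinks the weights on every step.
            w = [wj * (1 - lr * lam) for wj in w]
            if margin < 1:  # point is inside the margin or misclassified
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def classify(x, w, b):
    """Return +1 (risk company) or -1 (normal company)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy training data, invented for this sketch only.
X_train = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y_train = [1, 1, -1, -1]
w, b = train_linear_svm(X_train, y_train)
```

In practice a library SVM with a tuned kernel would likely replace this hand-written trainer. The sketch only shows the shape of the step: index features in, normal or risk label out.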

3. The method of claim 2,
wherein the credit event includes capital increase, capital erosion, corporate bond, disposal of treasury stock, liquidation merger, segregation accounting, free capital reduction, private equity fund, sale of largest shareholder or lump sum shareholder, debt guarantee for others, non-payment of interest on the bond, the disclosure of the inquiry, and the delayed report of the reduction of taxes.

4. The method according to claim 1, wherein step (b) comprises:
extracting, if the analysis target company is a non-listed company, the occurrence frequency of credit event keywords from the collected news and disclosure data to screen the risk company.
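
The keyword-frequency screening of claim 4 can be sketched as follows. The keyword list and the threshold are hypothetical. The claim states only that the occurrence frequency of credit event keywords is extracted, not how that frequency is turned into a screening decision.

```python
from collections import Counter

# Hypothetical keyword list; the source only refers to "credit event keywords".
CREDIT_EVENT_KEYWORDS = ("capital erosion", "debt guarantee", "treasury stock")


def keyword_frequency(documents, keywords=CREDIT_EVENT_KEYWORDS):
    """Count case-insensitive occurrences of each keyword across all documents."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for kw in keywords:
            counts[kw] += lowered.count(kw)
    return counts


def is_risk_company(documents, threshold=2, keywords=CREDIT_EVENT_KEYWORDS):
    """Flag the company when total keyword hits reach the assumed threshold."""
    return sum(keyword_frequency(documents, keywords).values()) >= threshold
```

For example, `is_risk_company(["Capital erosion reported; debt guarantee extended."])` counts two keyword hits and flags the company under the default threshold of 2.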

5. The method of claim 1, wherein step (c) comprises:
assigning, if the analysis target company is a listed company, an emotion polarity value to each opinion included in the collected data using the stock price;
selecting positive and negative core keywords based on the emotion polarity value;
selecting meaningful words from the collected data to create a data set, and deriving a vector value of each word included in the data set to embed the word in a vector space;
deriving a distance between each core keyword and each word in the vector space using the vector values, and deriving an emotion value of each word based on the derived distance;
deriving an emotion value for each document by summing the emotion values of the words included in the document; and
calculating a credit risk using the variation of the derived emotion value for each document.
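
The distance-based scoring of claim 5 can be sketched as below. The claim does not name the distance measure or the embedding model, so this sketch assumes cosine similarity and takes the word vectors as given. The scoring rule (mean similarity to positive core keywords minus mean similarity to negative ones) and all function names are likewise assumptions for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_emotion(word_vec, positive_vecs, negative_vecs):
    """Mean similarity to positive core keywords minus mean similarity to negative ones."""
    pos = sum(cosine_similarity(word_vec, p) for p in positive_vecs) / len(positive_vecs)
    neg = sum(cosine_similarity(word_vec, n) for n in negative_vecs) / len(negative_vecs)
    return pos - neg


def document_emotion(tokens, embeddings, positive_vecs, negative_vecs):
    """Sum word emotion values over the tokens that have an embedding."""
    return sum(
        word_emotion(embeddings[t], positive_vecs, negative_vecs)
        for t in tokens
        if t in embeddings
    )
```

A word lying close to the positive core keywords receives a value near +1 and a word close to the negative ones a value near -1. Tokens without an embedding are skipped rather than scored.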

6. The method of claim 1, wherein step (c) comprises:
defining an emotion dictionary if the analysis target company is a non-listed company;
selecting meaningful words from the collected data to create a data set, and deriving a vector value of each word included in the data set to embed the word in a vector space;
deriving an emotion value for each word in the vector space using the emotion dictionary;
deriving an emotion value for each document by summing the emotion values of the words included in the document; and
calculating a credit risk using the variation of the derived emotion value for each document.

7. The method according to claim 6, wherein defining the emotion dictionary comprises:
collecting data in which each opinion expresses a clear emotion; and
deriving an emotion value of each word by applying a naive Bayes classifier to the collected data.
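
One way to read the naive Bayes step of claim 7 is to take each word's emotion value as its log-likelihood ratio between the positive and negative classes. This is the amount a word contributes to a multinomial naive Bayes decision. That reading, the Laplace smoothing constant `alpha`, and the `'pos'`/`'neg'` labels are assumptions of this sketch, not details stated in the claim.

```python
import math
from collections import Counter


def build_emotion_dictionary(labeled_docs, alpha=1.0):
    """Map each word to log P(word | pos) - log P(word | neg).

    labeled_docs: iterable of (tokens, label) pairs, label being 'pos' or 'neg'.
    alpha: Laplace smoothing constant (assumed value).
    """
    counts = {"pos": Counter(), "neg": Counter()}
    for tokens, label in labeled_docs:
        counts[label].update(tokens)
    vocab = set(counts["pos"]) | set(counts["neg"])
    # Smoothed total token count per class.
    totals = {c: sum(counts[c].values()) + alpha * len(vocab) for c in counts}
    return {
        w: math.log((counts["pos"][w] + alpha) / totals["pos"])
        - math.log((counts["neg"][w] + alpha) / totals["neg"])
        for w in vocab
    }
```

Words that appear mostly in clearly positive opinions receive positive values, and words that appear mostly in clearly negative opinions receive negative values.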

8. The method according to claim 1,
wherein, in step (d), the signal is derived using the daily variation trend of the emotion value of each document.

9. The method of claim 8,
wherein the derived signal includes a signal indicating that a positive or negative state is maintained for a predetermined period (for example, three days), a signal indicating an increase or decrease in the daily average emotion value, a signal indicating a crossing of the daily average emotion value, a value ratio signal, and a signal indicating the rate of increase of the weekly average emotion value of each document.
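
Three of the signals listed in claim 9 can be sketched from a series of daily average emotion values. The claim does not give exact definitions, so the encodings below are assumptions: +1/-1/0 for persistence and direction, and a relative change between two consecutive 7-day windows for the weekly rate. The crossing signal and the value ratio signal are left out because the claim text does not make their definitions recoverable.

```python
def persistence_signal(daily_values, days=3):
    """Return +1 / -1 if the last `days` values are all positive / all negative, else 0."""
    if len(daily_values) < days:
        return 0
    recent = daily_values[-days:]
    if all(v > 0 for v in recent):
        return 1
    if all(v < 0 for v in recent):
        return -1
    return 0


def direction_signal(daily_values):
    """Return +1, -1 or 0 for a rise, fall or no change between the last two days."""
    if len(daily_values) < 2:
        return 0
    diff = daily_values[-1] - daily_values[-2]
    return (diff > 0) - (diff < 0)


def weekly_growth_rate(daily_values, week=7):
    """Relative change of the latest week's mean against the previous week's mean.

    Returns None when there are fewer than two full weeks of data or the
    previous week's mean is zero.
    """
    if len(daily_values) < 2 * week:
        return None
    prev = sum(daily_values[-2 * week:-week]) / week
    last = sum(daily_values[-week:]) / week
    return None if prev == 0 else (last - prev) / abs(prev)
```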

10. The method according to claim 1, wherein step (d) comprises:
collecting data on companies in the same industry and performing a logistic regression analysis on whether a predetermined credit event occurs within a predetermined period, thereby deriving effective signals for each industry;
constructing a regression equation using the industry-specific effective signals; and
estimating a credit event occurrence probability by applying the derived signal to the regression equation of the corresponding industry.
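
The regression step of claim 10 can be sketched with a small logistic regression fitted by batch gradient descent. The signal features, the toy data, and the hyperparameters are invented for illustration. Selecting which signals count as "effective" for an industry (for example, by a significance test on the coefficients) is not shown.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def event_probability(signals, w, b):
    """Estimated credit event probability for one company's signal vector."""
    return sigmoid(sum(wj * sj for wj, sj in zip(w, signals)) + b)


# Toy data for one hypothetical industry: [persistence signal, weekly growth rate];
# y = 1 if a credit event occurred within the observation period.
X_train = [[-1, 0.0], [-1, -0.2], [1, 0.3], [0, 0.1], [-1, -0.5], [1, 0.4]]
y_train = [1, 1, 0, 0, 1, 0]
w, b = fit_logistic(X_train, y_train)
```

With these toy data, a company with persistently negative emotion values receives a higher estimated probability than one with persistently positive values. Per the claim, a separate regression equation would be fitted for each industry.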

11. A computer-readable recording medium having recorded thereon program code for performing the method according to any one of claims 1 to 10.

12. A computer device implementing an intelligent securities investment decision support method, comprising:
a communication unit;
a memory in which at least one instruction is stored; and
a processor for executing the instructions stored in the memory,
wherein the instructions, when executed by the processor, perform:
(a) collecting, for an analysis target company, at least one of news, SNS data, reports, disclosure data, financial statement data, and postings and comments on a finance-related community bulletin board;
(b) screening a risk company using at least one of the collected disclosure data, financial statement data, and news;
(c) deriving emotion values for each word and each document using the news and SNS data, and deriving a credit risk from the variation trend of the emotion values for each word and each document; and
(d) deriving a signal based on the derived credit risk, and predicting a credit event occurrence probability using the derived signal.