KR20180122275A

KR20180122275A - Method for generating big data analysis report automatically and device performing the same

Info

Publication number: KR20180122275A
Application number: KR1020180043764A
Authority: KR
Inventors: 이동연; 장석호; 유재명
Original assignee: 비씨카드(주); 주식회사 퀀트랩
Priority date: 2017-05-02
Filing date: 2018-04-16
Publication date: 2018-11-12
Also published as: KR102022944B1

Abstract

The present invention relates to a method for automatically generating a big data analysis report and an apparatus for performing the same. According to an embodiment of the present invention, the method comprises the steps of: (a) when a report generation signal is obtained, analyzing data, or supporting the analysis of data, according to at least one preset analysis process; (b) setting, or supporting the setting of, importance for at least one piece of analysis result data included in a data structure generated as the analysis result, in order of data level (position in the data structure) or degree of association; (c) selecting at least one piece of analysis result data as report contents based on the set importance; and (d) generating and providing a report by reflecting the selected analysis result data in a predetermined report template.

Description

Method for automatically generating a big data analysis report and apparatus for performing the same

The present invention relates to a method for automatically generating a big data analysis report and an apparatus for performing the same. More particularly, it relates to a method in which big data is analyzed through specific learning, importance is set for the analysis results, and predetermined analysis results are selected based on the set importance, so that a big data analysis report containing only the information the user will need can be generated automatically, quickly, and accurately.

Recently, research has been actively conducted on big data analysis technology that derives and provides the information users need through learning based on big data. In particular, development has focused on technology that automatically generates a big data analysis report through machine learning, so that users need not analyze vast amounts of information and write the report themselves.

However, in the automatic big data analysis report generation technologies developed so far, only the analysis method and the report template are set in advance. All analyzed content is therefore output to the report regardless of the analysis results, which has the drawback of producing excessively long reports.

In addition, if a user, after reviewing the generated report, wants to specify only the kinds of information needed according to the analysis results, in other words, to change the contents of the report according to the analysis results, a separate and complex report generation program must be prepared.

Accordingly, there is a growing demand for a method in which importance is calculated for the analysis results, the analysis results to be included in the report are adopted in descending order of importance, and the report is written on that basis, so that a big data analysis report containing only the information the user needs can be generated automatically, quickly, and accurately. A solution to the problems described above is urgently needed.

The present invention is intended to solve the above problems of the prior art. Its object is to enable a big data analysis report containing only the information the user will need to be generated automatically, quickly, and accurately, without analyst intervention such as manual analysis of big data or report writing, by having big data analyzed through specific learning, importance set for the analysis results, and predetermined analysis results selected based on the set importance.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood from the description below.

According to an embodiment of the present invention for achieving the above object, there is provided a method by which an apparatus automatically generates a big data analysis report, the method comprising: (a) when a report generation signal is obtained, analyzing data, or supporting the analysis of data, according to at least one preset analysis process; (b) setting, or supporting the setting of, importance for at least one piece of analysis result data included in a data structure generated as the analysis result, in order of data level (position in the data structure) or degree of association; (c) selecting one or more pieces of analysis result data as report contents based on the set importance; and (d) generating and providing a report by reflecting the selected one or more pieces of analysis result data in a predetermined report template.

When the data structure includes first analysis result data and second analysis result data that is (i) sub-analysis result data of the first analysis result data or (ii) has a predetermined degree of association with the first analysis result data, in step (b) the importance of the second analysis result data may be set after the importance of the first analysis result data is set.

When third analysis result data at a level corresponding to that of the first analysis result data exists in the data structure, in step (b) the expected change in a target indicator when the analysis variable is adjusted by a fixed ratio may be calculated for each of the first analysis result data and the third analysis result data, and the importance of each may then be set based on the degree of change in the target indicator.

When fourth analysis result data at a level corresponding to that of the second analysis result data exists in the data structure, the importance of the second analysis result data and of the fourth analysis result data may each be set based on the ratio of the statistical influence that each has on the first analysis result data.

The sum of the importance of the second analysis result data and the fourth analysis result data may be less than or equal to the importance of the first analysis result data.

When third analysis result data at a level corresponding to that of the first analysis result data exists in the data structure, in step (b) each of the first analysis result data and the third analysis result data may be compared with the mean or distribution of the entire analysis result data, and the importance of each may be set on that basis.

In step (a), when data analysis is performed according to a plurality of analysis processes, the plurality of analysis processes may form hierarchical or associative relationships with one another, and data analysis may be performed by preferentially using a higher-level analysis process or by preferentially using an analysis process that is highly associated with the analysis process used first.

After the data analysis is performed, if a predetermined execution rule is satisfied, data analysis may be performed according to a new analysis process different from the previous analysis process. The execution rule may be a predetermined conditional rule for deciding whether to analyze data according to a new analysis process that is semantically related to the result of the analysis under the previous analysis process.

Step (c) may select a predetermined number of pieces of analysis result data in descending order of the set importance.

Step (c) may select analysis result data whose set importance is greater than or equal to a preset lower limit.

Step (d) may generate and provide the report by converting the selected one or more pieces of analysis result data into at least one of natural language, a table, and a graph according to the predetermined report template.

Meanwhile, according to another embodiment of the present invention, there is provided an apparatus comprising: a data analysis unit that, when a report generation signal is obtained, analyzes data, or supports the analysis of data, according to at least one preset analysis process; an importance setting unit that sets, or supports the setting of, importance for at least one piece of analysis result data included in a data structure generated as the analysis result, in order of data level (position in the data structure) or degree of association; a content selection unit that selects one or more pieces of analysis result data as report contents based on the set importance; and a report generation unit that generates and provides a report by reflecting the selected one or more pieces of analysis result data in a predetermined report template.

When the data structure includes first analysis result data and second analysis result data that is (i) sub-analysis result data of the first analysis result data or (ii) has a predetermined degree of association with the first analysis result data, in step (b) the importance of the second analysis result data may be set after the importance of the first analysis result data is set.

When third analysis result data at a level corresponding to that of the first analysis result data exists in the data structure, the importance setting unit may calculate, for each of the first analysis result data and the third analysis result data, the expected change in a target indicator when the analysis variable is adjusted by a fixed ratio, and may then set the importance of each based on the degree of change in the target indicator.

When third analysis result data at a level corresponding to that of the first analysis result data exists in the data structure, the importance setting unit may compare each of the first analysis result data and the third analysis result data with the mean or distribution of the entire analysis result data, and may set the importance of each on that basis.

In the data analysis unit, when data analysis is performed according to a plurality of analysis processes, the plurality of analysis processes may form hierarchical or associative relationships with one another, and data analysis may be performed by preferentially using a higher-level analysis process or by preferentially using an analysis process that is highly associated with the analysis process used first.

The content selection unit may select a predetermined number of pieces of analysis result data in descending order of the set importance.

The content selection unit may select analysis result data whose set importance is greater than or equal to a preset lower limit.

The report generation unit may generate and provide the report by converting the selected one or more pieces of analysis result data into at least one of natural language, a table, and a graph according to the predetermined report template.

According to an embodiment of the present invention, in generating a report, importance is set for each piece of analysis result data in order of data level or degree of association, and the analysis result data to be included in the report is selected based on the set importance, so that a big data analysis report containing only the information the user needs can be generated automatically, quickly, and accurately.

The effects of the present invention are not limited to those described above, and should be understood to include all effects that can be inferred from the configuration of the invention described in the detailed description or the claims.

FIG. 1 is a block diagram illustrating the configuration of an apparatus for automatically generating a big data analysis report according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a process of automatically generating a big data analysis report according to an embodiment of the present invention.
FIG. 3 is a diagram schematically showing one embodiment of importance setting for analysis result data.
FIG. 4 is a diagram schematically showing another embodiment of importance setting for analysis result data.
FIG. 5 is a diagram illustrating a screen provided when a big data analysis report is automatically generated according to an embodiment of the present invention.

Hereinafter, the present invention is described with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is therefore not limited to the embodiments described here. In the drawings, parts unrelated to the description are omitted for clarity, and similar parts are denoted by similar reference numerals throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only being "directly connected" but also being "indirectly connected" with another member in between. Likewise, when a part is said to "include" a component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating the configuration of an apparatus for automatically generating a big data analysis report according to an embodiment of the present invention.

First, the apparatus 100 may include any kind of handheld wireless communication device equipped with a touch screen panel, such as a mobile phone, smartphone, PDA (Personal Digital Assistant), PMP (Portable Multimedia Player), or tablet PC. It may also include devices that provide a platform on which applications can be installed and run, such as a desktop PC, tablet PC, laptop PC, or IPTV including a set-top box.

According to an embodiment of the present invention, when the apparatus 100 is a device on which applications can be installed and run as described above, the apparatus 100 may install an application that provides an automatic big data analysis report generation service and store it in memory. For example, the apparatus 100 may connect to an app store server (not shown) to which various applications have been uploaded, and then download and install the application, which provides various services such as a report form upload service and an analysis process setting service.

Referring to FIG. 1, the apparatus 100 according to an embodiment of the present invention includes a data analysis unit 110, an importance setting unit 120, a content selection unit 130, and a report generation unit 140.

When a report generation signal is obtained, the data analysis unit 110 may perform data analysis, or cause data analysis to be performed, according to at least one preset analysis process. The data to be analyzed may be big data stored in the apparatus 100, or big data separately received from an external device or server (not shown).

According to an embodiment of the present invention, when data analysis is performed according to a plurality of analysis processes, the data analysis unit 110 may form hierarchical or associative relationships among the data, and may analyze the data by preferentially using a higher-level analysis process or by preferentially using an analysis process that is highly associated with the analysis process used first.

Specifically, after analyzing the data, if a predetermined execution rule is satisfied, the data analysis unit 110 may perform data analysis according to a new analysis process different from the previous analysis process.

The execution rule may be a predetermined conditional rule for deciding whether to analyze data according to a new analysis process that is semantically related to the result of the analysis under the previous analysis process. In other words, it is a condition for carrying out the next analysis process, which depends on the analysis result, and is used to determine whether that next analysis process should be performed.

For example, suppose the result of a gender ratio analysis process for the customers contributing to a merchant's sales is 99% female and 1% male. Further analysis of the items preferred by male customers would be meaningless, so there may be a predetermined execution rule that requires the share of either gender to be 20% or more, so that data analysis under the next analysis process is not performed for analysis results below 20%.
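The execution rule in this example can be sketched as a simple share filter. The Python sketch below is illustrative only: the function name `categories_passing_rule`, the dictionary layout, and the choice to apply the 20% threshold to each category separately are assumptions, not details taken from the disclosure.

```python
def categories_passing_rule(shares, min_share=0.20):
    """Return the categories whose share meets the execution rule.

    shares: mapping of category name to its share (0.0 to 1.0) in the
    previous analysis result. min_share is the assumed rule threshold.
    """
    return [name for name, share in shares.items() if share >= min_share]


# Gender ratio result from the example: 99% female, 1% male.
gender_ratio = {"female": 0.99, "male": 0.01}

# Only categories in this list would proceed to the next analysis
# process (for example, a preferred-item analysis).
eligible = categories_passing_rule(gender_ratio)
```

With the ratio from the example, only the female category passes, so a follow-up analysis for male customers would be skipped.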

The importance setting unit 120 may set, or support the setting of, importance for at least one piece of analysis result data included in the data structure generated as the analysis result by the data analysis unit 110, in order of data level (position in the data structure) or degree of association.

Specifically, when the data structure generated as the analysis result includes first analysis result data and second analysis result data that is either sub-analysis result data of the first analysis result data or has a predetermined degree of association with it, the importance setting unit 120 may set the importance of the second analysis result data after setting the importance of the first analysis result data.
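One way to obtain such an order, assuming the data structure is a tree in which sub-analysis result data are children of their parent result, is a level-order traversal. The following Python sketch is a hypothetical illustration; the `ResultNode` class and the `level_order` function are names introduced here and do not appear in the disclosure.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ResultNode:
    """Hypothetical node holding one piece of analysis result data."""
    name: str
    children: list = field(default_factory=list)


def level_order(roots):
    """Return node names so that every parent precedes its children."""
    queue = deque(roots)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node.name)
        queue.extend(node.children)  # children are visited after this level
    return order


# "second" and "fourth" are sub-results of "first"; "third" sits beside "first".
first = ResultNode("first", [ResultNode("second"), ResultNode("fourth")])
third = ResultNode("third")
order = level_order([first, third])
```

In this sketch the importance of "first" and "third" would be set before that of "second" and "fourth".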

Importance may be allocated by the following methods.

First, the relationship between a target indicator (for example, sales) and the variables to be analyzed in the analysis procedures (for example, customer gender ratio) is learned through machine learning. Then, for the subject of the report currently being written, the expected change in the target indicator is measured when the variable of each analysis procedure is adjusted by a fixed ratio. The importance of the analysis procedure for each variable can then be allocated, so that the importances sum to 100%, in proportion to the resulting change for each variable or to a mathematical transformation of it, such as its square.

Second, without a target indicator, the variable to be analyzed in an analysis procedure, or the result of the analysis procedure, is compared with the overall mean or overall distribution to obtain a difference. The importance of each analysis procedure can then be allocated, so that the importances sum to 100%, in proportion to this difference or to a mathematical transformation of it, such as its square.

According to an embodiment of the present invention, when third analysis result data at a level corresponding to that of the first analysis result data exists in the data structure generated as the analysis result, the importance setting unit 120 may calculate, for each of the first analysis result data and the third analysis result data, the expected change in the target indicator when the analysis variable is adjusted by a fixed ratio, and may set the importance of each based on the calculated degree of change in the target indicator.

For example, suppose the target indicator is "sales", the first analysis result data is the analysis result for "customer gender ratio", the third analysis result data is the analysis result for "customer age group", and the first and third analysis result data are at corresponding levels in the data structure. The importance setting unit 120 may calculate the expected change in sales when the analysis result for "customer gender ratio" is adjusted by a fixed ratio, and the expected change in sales when the analysis result for "customer age group" is adjusted by a fixed ratio. The importance setting unit 120 may then set the importance of "customer gender ratio" and of "customer age group" by applying a mathematical transformation, such as squaring, to the calculated change in sales for each.
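The target-indicator approach can be sketched as a small sensitivity calculation. The Python code below is a minimal sketch under stated assumptions: `predict_sales` is a hypothetical hand-written stand-in for a model learned by machine learning, the variable names and coefficients are invented for illustration, the adjustment ratio is taken to be 10%, and squaring is used as the mathematical transformation.

```python
def allocate_importance(predict, baseline, ratio=0.10, transform=lambda d: d * d):
    """Allocate importance (summing to 100%) from target-indicator changes.

    predict:  callable mapping a dict of variables to the target indicator.
    baseline: current variable values for the subject of the report.
    ratio:    fixed ratio by which each variable is adjusted (assumed 10%).
    """
    base_value = predict(baseline)
    scores = {}
    for name, value in baseline.items():
        adjusted = dict(baseline)
        adjusted[name] = value * (1.0 + ratio)  # adjust one variable at a time
        scores[name] = transform(predict(adjusted) - base_value)
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: 100.0 * score / total for name, score in scores.items()}


def predict_sales(x):
    """Hypothetical stand-in for a learned sales model (illustrative only)."""
    return 1000.0 + 300.0 * x["female_share"] + 10.0 * x["mean_age"]


baseline = {"female_share": 0.5, "mean_age": 30.0}
importance = allocate_importance(predict_sales, baseline)
```

Because the transformed changes are normalized by their total, the resulting importances always sum to 100% whenever at least one variable changes the target indicator.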

According to another embodiment of the present invention, when third analysis result data at a level corresponding to that of the first analysis result data exists in the data structure generated as the analysis result, the importance setting unit 120 may compare each of the first analysis result data and the third analysis result data with the mean or distribution of the entire analysis result data, and may set the importance of each on that basis.

For example, suppose the first analysis result data is the analysis result for "customer gender ratio" and the third analysis result data is the analysis result for "customer age group". The importance setting unit 120 may compare the analysis result for "customer gender ratio" with the distribution of the entire analysis result data, and likewise compare the analysis result for "customer age group" with that distribution. The importance setting unit 120 may then express quantitatively, for example as a number, how much each analysis result differs from the distribution of the entire analysis result data, and may set the importance of "customer gender ratio" and of "customer age group" by applying a mathematical transformation, such as squaring, to each difference value.
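The comparison-based approach can be sketched in the same way. In the Python sketch below, the difference from the overall distribution is measured as half the sum of absolute differences between category shares; this particular measure, the function names, and the sample figures are all assumptions made for illustration, since the text only requires that the difference be expressed quantitatively.

```python
def distribution_gap(subject, overall):
    """Half the sum of absolute share differences (an assumed gap measure)."""
    keys = set(subject) | set(overall)
    return 0.5 * sum(abs(subject.get(k, 0.0) - overall.get(k, 0.0)) for k in keys)


def allocate_by_gap(subject_results, overall_results, transform=lambda d: d * d):
    """Allocate importance (summing to 100%) from gaps to the overall data."""
    scores = {
        name: transform(distribution_gap(subject_results[name], overall_results[name]))
        for name in subject_results
    }
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: 100.0 * score / total for name, score in scores.items()}


# Illustrative figures only: one merchant compared with all merchants.
subject = {
    "gender_ratio": {"female": 0.8, "male": 0.2},
    "age_group": {"20s": 0.4, "30s": 0.4, "40s": 0.2},
}
overall = {
    "gender_ratio": {"female": 0.5, "male": 0.5},
    "age_group": {"20s": 0.3, "30s": 0.4, "40s": 0.3},
}
gap_importance = allocate_by_gap(subject, overall)
```

In this sample the gender ratio deviates more from the overall distribution than the age groups do, so it receives the larger share of importance.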

When the data structure generated as the analysis result contains fourth analysis result data at a level corresponding to that of the second analysis result data, where the second analysis result data is sub-analysis result data of the first analysis result data or has a predetermined degree of association with it, the importance setting unit 120 may set the importance of the second analysis result data and of the fourth analysis result data based on the ratio of the statistical influence that each has on the first analysis result data.

In addition, the importance setting unit 120 may set the importance of each of the second analysis result data and the fourth analysis result data so that their sum is less than or equal to the importance of the first analysis result data.

For example, suppose the importance of the first analysis result data is set to 60% and that of the third analysis result data, which is at a level corresponding to that of the first, is set to 40%. For the second analysis result data, which is sub-analysis result data of the first analysis result data, and the fourth analysis result data, which is at a level corresponding to that of the second, the importance setting unit 120 may set the importance of each so that their sum equals, or is less than, the 60% importance of the first analysis result data, which is their parent.
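The constraint in this example can be sketched as splitting the parent's importance among its sub-results in proportion to their statistical influence. In the Python sketch below, the function name `distribute_to_children`, the `share` parameter, and the influence values 3.0 and 1.0 are hypothetical; the text does not specify how statistical influence is measured.

```python
def distribute_to_children(parent_importance, influences, share=1.0):
    """Split a parent's importance among children by influence ratio.

    influences: mapping of child name to its (assumed non-negative)
                statistical influence on the parent result.
    share:      fraction of the parent's importance to hand down; any
                value in [0, 1] keeps the children's sum <= the parent.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    total = sum(influences.values())
    if total == 0:
        return {name: 0.0 for name in influences}
    budget = parent_importance * share
    return {name: budget * value / total for name, value in influences.items()}


# Parent (first analysis result data) has importance 60%.
children = distribute_to_children(60.0, {"second": 3.0, "fourth": 1.0})
```

With `share=1.0` the children's importances sum exactly to the parent's 60%; a smaller `share` yields a sum below 60%, which the text also permits.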

The content selection unit 130 may select one or more pieces of analysis result data as report content based on the importance set by the importance setting unit 120.

According to an embodiment of the present invention, the content selection unit 130 may select a predetermined number of pieces of analysis result data, in descending order of their set importance, from the full set of analysis result data.

For example, assume that the importance values are set to 60% for the first analysis result data, 40% for the second, 15% for the third, 25% for the fourth, and 20% for the fifth, and that the predetermined number of pieces to select is three. The content selection unit 130 may then select the three pieces with the highest importance, namely the first, second, and fourth analysis result data, as report content.

According to another embodiment of the present invention, the content selection unit 130 may select, from the full set of analysis result data, those pieces whose set importance is greater than or equal to a preset lower limit.

For example, assume that the importance values are set to 60% for the first analysis result data, 40% for the second, 15% for the third, 25% for the fourth, and 20% for the fifth, and that the preset lower limit of importance is 30%. The content selection unit 130 may then select only the pieces whose importance is 30% or higher, namely the first and second analysis result data, as report content.
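The two selection embodiments above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function names `select_top_n` and `select_above_floor` and the dictionary keys are hypothetical, while the importance values are the ones used in the examples above.

```python
from typing import Dict, List


def select_top_n(importance: Dict[str, float], n: int) -> List[str]:
    """Return the n items with the highest importance, highest first."""
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    return ranked[:n]


def select_above_floor(importance: Dict[str, float], floor: float) -> List[str]:
    """Return every item whose importance is at least the lower limit."""
    return [name for name, value in importance.items() if value >= floor]


# Importance values (in percent) taken from the examples in the description.
importance = {
    "first": 60.0,
    "second": 40.0,
    "third": 15.0,
    "fourth": 25.0,
    "fifth": 20.0,
}

top_three = select_top_n(importance, 3)            # first, second, fourth
above_30 = select_above_floor(importance, 30.0)    # first, second
```

The count-based variant always yields a report of fixed size, whereas the threshold variant lets the report grow or shrink with the number of results that clear the lower limit.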

The report generation unit 140 may compose the report content by converting the one or more pieces of analysis result data selected by the content selection unit 130 into natural language, tables, graphs, and the like, according to a predetermined report template.

The report generation unit 140 may generate and provide the composed report content as a file in a format such as PPTX, PDF, or HTML.

FIG. 2 is a flowchart illustrating a process by which a big data analysis report is automatically generated according to an embodiment of the present invention.

First, when a report generation signal is obtained (S201), the device 100 may analyze big data according to at least one preset analysis process (S202).

As shown in FIGS. 3 and 4, the data analysis may be performed according to a plurality of analysis processes.

When data analysis is performed according to a plurality of analysis processes, the analysis result may be derived in the form of a tree, that is, a data structure in which hierarchical or associative relationships are formed among the results.

For example, as shown in FIGS. 3 and 4, when a target indicator is analyzed according to an analysis process, first analysis result data and third analysis result data at the same level as the first may be derived. Then, according to a different or the same analysis process, second analysis result data, which is subordinate to the first analysis result data or has a predetermined degree of association with it, and fourth analysis result data at the same level as the second may be derived.

In this data analysis, a higher-level analysis process may be used preferentially, or an analysis process with a high degree of association with the first-used analysis process may be used preferentially.

According to an embodiment of the present invention, after data analysis has been performed, if a predetermined execution rule is satisfied, data analysis may be performed according to a new analysis process different from the previous one. Here, the execution rule may be a predetermined conditional rule for deciding whether to perform data analysis according to a new analysis process that is semantically related to the result obtained under the previous analysis process.
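One way to picture execution rules is as predicates attached to an analysis process: when a predicate holds for the result just produced, a follow-up process runs and its result becomes a child node in the result tree. The sketch below is an assumption-laden illustration only. The classes `AnalysisProcess` and `ResultNode`, the function `analyze`, and the toy "mean" and "spread" processes are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Result = Dict[str, float]


@dataclass
class AnalysisProcess:
    name: str
    run: Callable[[List[float]], Result]
    # Each entry pairs an execution rule with the follow-up process it triggers.
    follow_ups: List[Tuple[Callable[[Result], bool], "AnalysisProcess"]] = field(
        default_factory=list
    )


@dataclass
class ResultNode:
    process: str
    result: Result
    children: List["ResultNode"] = field(default_factory=list)


def analyze(process: AnalysisProcess, data: List[float]) -> ResultNode:
    """Run a process, then run each follow-up whose execution rule is satisfied."""
    result = process.run(data)
    node = ResultNode(process.name, result)
    for rule, next_process in process.follow_ups:
        if rule(result):
            node.children.append(analyze(next_process, data))
    return node


# Hypothetical processes: compute the mean, and look at the spread only
# when the mean exceeds an assumed threshold of 100.
spread = AnalysisProcess("spread", lambda xs: {"range": max(xs) - min(xs)})
mean = AnalysisProcess(
    "mean",
    lambda xs: {"mean": sum(xs) / len(xs)},
    follow_ups=[(lambda r: r["mean"] > 100, spread)],
)

tree = analyze(mean, [90.0, 120.0, 150.0])
```

Because follow-up results are stored as children of the result that triggered them, the output naturally takes the tree form described above.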

When the data analysis is complete, the device 100 may set importance for at least one piece of analysis result data included in the data structure generated as the analysis result, in order of data level (position in the hierarchy) or degree of association (S203).

That is, the device 100 may set the importance of the first and third analysis result data first, and then set the importance of the second and fourth analysis result data.

According to an embodiment of the present invention, for each of the first and third analysis result data, the device 100 may calculate the change in the target indicator expected when the corresponding analysis variable is adjusted by a fixed ratio, and may set the importance of each based on the calculated degree of change.
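The sensitivity-based embodiment above can be illustrated with a small sketch. Several points here are assumptions rather than anything the patent specifies: the target indicator is modeled as a plain Python function, the variable names `visits` and `price` are hypothetical, the adjustment ratio defaults to 10%, and the importance values are normalized so that they sum to 100%.

```python
from typing import Callable, Dict

Variables = Dict[str, float]


def sensitivity_importance(
    target: Callable[[Variables], float],
    baseline: Variables,
    ratio: float = 0.1,
) -> Dict[str, float]:
    """Importance (in percent) from the target change when one variable is scaled."""
    base_value = target(baseline)
    changes: Dict[str, float] = {}
    for name in baseline:
        adjusted = dict(baseline)
        adjusted[name] = baseline[name] * (1 + ratio)  # adjust one variable only
        changes[name] = abs(target(adjusted) - base_value)
    total = sum(changes.values())
    if total == 0:
        raise ValueError("the target indicator did not respond to any variable")
    # Assumed normalization: shares of the total change, expressed in percent.
    return {name: 100 * change / total for name, change in changes.items()}


# Hypothetical target indicator that depends linearly on two analysis variables.
def sales(v: Variables) -> float:
    return 3 * v["visits"] + 2 * v["price"]


importance = sensitivity_importance(sales, {"visits": 100.0, "price": 100.0})
```

With this toy model the two variables receive roughly 60% and 40%, which happens to mirror the split used in the description's examples. A real target indicator would come from the fitted analysis model rather than a hand-written formula.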

According to another embodiment of the present invention, the device 100 may compare each of the first and third analysis result data with the mean or distribution of the full set of analysis result data, and may set the importance of each based on the comparison.

For the second and fourth analysis result data, the device 100 may set the importance of each based on the ratio of the statistical influence that each exerts on the first analysis result data, which is the higher-level analysis result data.

In addition, the device 100 may set the importance of each of the second and fourth analysis result data such that the sum of their importance values is less than or equal to the importance of the first analysis result data.

According to an embodiment of the present invention, as shown in FIG. 3, the device 100 may set the importance of the second and fourth analysis result data to 42% and 18%, respectively, so that their sum equals the importance of the first analysis result data. Here, the importance values of the second and fourth analysis result data may have been allocated according to the ratio of their statistical influence on the first analysis result data.

According to another embodiment of the present invention, as shown in FIG. 4, the device 100 may set the importance of the second and fourth analysis result data to 37.8% and 16.2%, respectively, so that their sum is smaller than the importance of the first analysis result data by a predetermined ratio. Here too, the importance values of the second and fourth analysis result data may have been allocated according to the ratio of their statistical influence on the first analysis result data.
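The allocation in FIGS. 3 and 4 can be reproduced with a short sketch. The description gives only the resulting percentages, so two inputs are inferred here and should be read as assumptions: an influence ratio of 7:3 between the second and fourth analysis result data (42:18 and 37.8:16.2 both reduce to 7:3), and a reduction factor of 0.9 for FIG. 4 (37.8 + 16.2 = 54, which is 90% of 60). The function name `allocate_child_importance` is hypothetical.

```python
from typing import Dict


def allocate_child_importance(
    parent_importance: float,
    influence: Dict[str, float],
    scale: float = 1.0,
) -> Dict[str, float]:
    """Split (a share of) the parent's importance in proportion to influence.

    scale=1.0 makes the children sum to the parent's importance; a value
    below 1.0 keeps the sum smaller than the parent's importance.
    """
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1] so children never exceed the parent")
    total = sum(influence.values())
    if total <= 0:
        raise ValueError("total influence must be positive")
    budget = parent_importance * scale
    return {name: budget * value / total for name, value in influence.items()}


# Assumed influence ratio of 7:3, inferred from the percentages in the figures.
influence = {"second": 7.0, "fourth": 3.0}

fig3 = allocate_child_importance(60.0, influence)             # about 42 and 18
fig4 = allocate_child_importance(60.0, influence, scale=0.9)  # about 37.8 and 16.2
```

Bounding `scale` at 1 enforces the constraint that the children's importance values never sum to more than the importance of their parent.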

When importance has been set for the one or more pieces of analysis result data, the device 100 may select, based on the set importance, the one or more pieces of analysis result data to be included in the report content (S204).

According to an embodiment of the present invention, the device 100 may select, based on the set importance, a predetermined number of pieces of analysis result data in descending order of importance for inclusion in the report.

For example, referring to FIG. 3, if the predetermined number of pieces of analysis result data is three, the device 100 may select the three pieces with the highest allocated importance, namely the first, second, and third analysis result data, for inclusion in the report content.

According to another embodiment of the present invention, the device 100 may select, based on the set importance, the analysis result data whose importance is greater than or equal to a preset lower limit for inclusion in the report.

For example, referring to FIG. 3, if the preset lower limit of importance is 20%, the device 100 may select the analysis result data whose importance is 20% or higher, namely the first, second, and third analysis result data, for inclusion in the report content.

Thereafter, the device 100 may compose a report by reflecting the selected one or more pieces of analysis result data in a predetermined report template and converting them into at least one of natural language, a table, and a graph, and may generate and provide the composed report as a file (S205).
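As a rough illustration of step S205, the sketch below renders selected results into natural-language sentences and a table inside an HTML document. It is a minimal sketch under stated assumptions: the sentence wording, the HTML layout, and the function name `render_report` are hypothetical, and only the standard-library `string.Template` and `html.escape` are used. PPTX or PDF output would need additional libraries and is not shown.

```python
from html import escape
from string import Template
from typing import List, Tuple

# Hypothetical natural-language template for one selected result.
SENTENCE = Template("$name has an importance of $importance%.")


def render_report(title: str, selected: List[Tuple[str, float]]) -> str:
    """Render selected (name, importance) pairs as sentences plus a table."""
    paragraphs = []
    rows = []
    for name, importance in selected:
        value = f"{importance:g}"  # 60.0 -> "60", 37.8 -> "37.8"
        sentence = SENTENCE.substitute(name=name, importance=value)
        paragraphs.append(f"<p>{escape(sentence)}</p>")
        rows.append(f"<tr><td>{escape(name)}</td><td>{value}%</td></tr>")
    table = "<table>" + "".join(rows) + "</table>"
    body = f"<h1>{escape(title)}</h1>" + "".join(paragraphs) + table
    return f"<html><body>{body}</body></html>"


# Results selected in the FIG. 3 example: first, second, and third.
report = render_report(
    "Sample report",
    [("first", 60.0), ("second", 42.0), ("third", 40.0)],
)
```

Keeping the wording in a template separate from the data means the same selected results can be re-rendered with a different uploaded template without touching the analysis steps.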

FIG. 5 is a diagram illustrating screens provided during automatic generation of a big data analysis report according to an embodiment of the present invention.

As shown in (a) and (b) of FIG. 5, the device 100 may provide a setting information input interface for automatic generation of a big data analysis report.

The setting information input interface for automatic generation of a big data analysis report may include a report template upload interface, a font setting interface, a report file format setting interface, and a report type setting interface, and may additionally include an interface for uploading separate big data material.

When report generation by the device 100 is complete, the resulting report file may be provided as shown in (c) of FIG. 5. The report file may be in any one of the PPTX, PDF, and HTML formats, and this may be the file format chosen through the setting information input interface described above.

As described above, according to an embodiment of the present invention, importance is set for each piece of analysis result data in order of data level or degree of association when a report is generated, and the analysis result data to be included in the report is selected based on the set importance. As a result, a big data analysis report containing only the information the user needs can be generated automatically, quickly, and accurately.

The foregoing description of the present invention is provided for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can readily be modified into other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in combined form.

The scope of the present invention is defined by the claims that follow, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: Device
110: Data analysis unit
120: Importance setting unit
130: Content selection unit
140: Report generation unit

Claims

1. A method by which a device automatically generates a big data analysis report, the method comprising:
(a) when a report generation signal is obtained, analyzing data, or supporting the analysis of data, according to at least one predetermined analysis process;
(b) setting, or supporting the setting of, importance for at least one piece of analysis result data included in a data structure generated as a result of the analysis, in order of data level or degree of association;
(c) selecting one or more pieces of analysis result data as report content based on the set importance; and
(d) generating and providing a report by reflecting the selected one or more pieces of analysis result data in a predetermined report template.

2. The method of claim 1,
wherein, when the data structure includes second analysis result data that is (i) subordinate analysis result data for first analysis result data or (ii) analysis result data having a predetermined degree of association with the first analysis result data,
in step (b),
importance is set for the second analysis result data after importance is set for the first analysis result data.

3. The method of claim 2,
wherein, when third analysis result data at the same level as the first analysis result data exists in the data structure,
in step (b),
for each of the first analysis result data and the third analysis result data, the change in a target indicator expected when an analysis variable is adjusted by a predetermined ratio is calculated, and the importance of the first analysis result data and the importance of the third analysis result data are each set based on the calculated degree of change.

4. The method of claim 3,
wherein, when fourth analysis result data at the same level as the second analysis result data exists in the data structure,
the importance of the second analysis result data and the importance of the fourth analysis result data are each set based on the ratio of the statistical influence that each of the second analysis result data and the fourth analysis result data exerts on the first analysis result data.

5. The method of claim 4,
Wherein the sum of the importance of the second analysis result data and the fourth analysis result data is less than or equal to the importance of the first analysis result data.

6. The method of claim 2,
wherein, when third analysis result data at the same level as the first analysis result data exists in the data structure,
in step (b),
each of the first analysis result data and the third analysis result data is compared with the mean or distribution of the full set of analysis result data, and the importance of the first analysis result data and the importance of the third analysis result data are each set based on the comparison.

7. The method of claim 6,
wherein, when fourth analysis result data at the same level as the second analysis result data exists in the data structure,
the importance of the second analysis result data and the importance of the fourth analysis result data are each set based on the ratio of the statistical influence that each of the second analysis result data and the fourth analysis result data exerts on the first analysis result data.

8. The method of claim 7,
Wherein the sum of the importance of the second analysis result data and the fourth analysis result data is less than or equal to the importance of the first analysis result data.

9. The method of claim 1,
wherein, in step (a),
when data analysis is performed according to a plurality of analysis processes,
the plurality of analysis processes form hierarchical or associative relationships with one another, and either a higher-level analysis process is used preferentially or an analysis process having a high degree of association with the first-used analysis process is used preferentially.

10. The method of claim 9,
wherein, after the data analysis is performed, if a predetermined execution rule is satisfied, data analysis is performed according to a new analysis process different from the previous analysis process, and
the execution rule is a predetermined conditional rule for determining whether to perform data analysis according to a new analysis process that is semantically related to the result of analyzing the data according to the previous analysis process.

11. The method of claim 1,
wherein step (c) comprises
selecting a predetermined number of pieces of analysis result data in descending order of the set importance.

12. The method of claim 1,
wherein step (c) comprises
selecting analysis result data whose set importance is greater than or equal to a predetermined lower limit.

13. The method of claim 1,
wherein step (d) comprises
generating and providing the report by converting the selected one or more pieces of analysis result data into at least one of natural language, a table, and a graph according to the predetermined report template.

14. A device for automatically generating a big data analysis report, the device comprising:
a data analysis unit that, when a report generation signal is obtained, analyzes data, or supports the analysis of data, according to at least one predetermined analysis process;
an importance setting unit that sets, or supports the setting of, importance for at least one piece of analysis result data included in a data structure generated as a result of the analysis, in order of data level or degree of association;
a content selection unit that selects one or more pieces of analysis result data as report content based on the set importance; and
a report generation unit that generates and provides a report by reflecting the selected one or more pieces of analysis result data in a predetermined report template.

15. The device of claim 14,
wherein, when the data structure includes second analysis result data that is (i) subordinate analysis result data for first analysis result data or (ii) analysis result data having a predetermined degree of association with the first analysis result data,
the importance setting unit
sets importance for the second analysis result data after setting importance for the first analysis result data.

16. The device of claim 15,
wherein, when third analysis result data at the same level as the first analysis result data exists in the data structure,
the importance setting unit,
for each of the first analysis result data and the third analysis result data, calculates the change in a target indicator expected when an analysis variable is adjusted by a predetermined ratio, and sets the importance of the first analysis result data and the importance of the third analysis result data based on the calculated degree of change.

17. The device of claim 16,
wherein, when fourth analysis result data at the same level as the second analysis result data exists in the data structure,
the importance of the second analysis result data and the importance of the fourth analysis result data are each set based on the ratio of the statistical influence that each of the second analysis result data and the fourth analysis result data exerts on the first analysis result data.

18. The device of claim 17,
wherein the sum of the importance of the second analysis result data and the importance of the fourth analysis result data is less than or equal to the importance of the first analysis result data.

19. The device of claim 15,
wherein, when third analysis result data at the same level as the first analysis result data exists in the data structure,
the importance setting unit
compares each of the first analysis result data and the third analysis result data with the mean or distribution of the full set of analysis result data, and sets the importance of the first analysis result data and the importance of the third analysis result data based on the comparison.

20. The device of claim 19,
wherein, when fourth analysis result data at the same level as the second analysis result data exists in the data structure,
the importance of the second analysis result data and the importance of the fourth analysis result data are each set based on the ratio of the statistical influence that each of the second analysis result data and the fourth analysis result data exerts on the first analysis result data.

21. The device of claim 20,
wherein the sum of the importance of the second analysis result data and the importance of the fourth analysis result data is less than or equal to the importance of the first analysis result data.

22. The device of claim 14,
wherein, in the data analysis unit,
when data analysis is performed according to a plurality of analysis processes,
the plurality of analysis processes form hierarchical or associative relationships with one another, and either a higher-level analysis process is used preferentially or an analysis process having a high degree of association with the first-used analysis process is used preferentially.

23. The device of claim 22,
wherein, after the data analysis is performed, if a predetermined execution rule is satisfied, data analysis is performed according to a new analysis process different from the previous analysis process, and
the execution rule is a predetermined conditional rule for determining whether to perform data analysis according to a new analysis process that is semantically related to the result of analyzing the data according to the previous analysis process.

24. The device of claim 14,
wherein the content selection unit
selects a predetermined number of pieces of analysis result data in descending order of the set importance.

25. The device of claim 14,
wherein the content selection unit
selects analysis result data whose set importance is greater than or equal to a predetermined lower limit.

26. The device of claim 14,
wherein the report generation unit
generates and provides the report by converting the selected one or more pieces of analysis result data into at least one of natural language, a table, and a graph according to the predetermined report template.