KR102422325B1

KR102422325B1 - Method and apparatus for managing knowledge graph

Info

Publication number: KR102422325B1
Application number: KR1020210145290A
Authority: KR
Inventors: 박민우; 임철수; 손지성; 심형섭; 강지순; 최기석; 이행곤
Original assignee: 한국과학기술정보연구원
Priority date: 2021-10-28
Filing date: 2021-10-28
Publication date: 2022-07-19

Abstract

Provided are a method and device for managing a knowledge graph. The method for managing the knowledge graph according to one embodiment of the present invention comprises: a step of identifying an associative relation strength between a first entity and a second entity in a knowledge graph generated using one or more datasets; a step of displaying the knowledge graph, and visualizing and displaying a connection line representing the associative relation between the first entity and the second entity as a first graphic representation indicating the identified associative relation strength; and a step of displaying the first raw data corresponding to the first entity and the second raw data corresponding to the second entity in response to a selection of the visualized connection line.

Description

Knowledge graph management method and apparatus {METHOD AND APPARATUS FOR MANAGING KNOWLEDGE GRAPH}

The present invention relates to a knowledge graph management method. More specifically, it relates to a knowledge graph management method and apparatus that display a knowledge graph containing various graphic representations so that a user can intuitively check the strength of the association relationships between entities.

A knowledge graph represents structured knowledge about a subject as a graph. Knowledge graphs have been developed and applied in various fields, such as interactive search and voice command execution.

Previously, knowledge graphs were created through a series of manual tasks, such as extracting entities from the original data and establishing relationships between the entities. More recently, they are also generated automatically by algorithms running on computing devices.

However, in a knowledge graph generated automatically by an algorithm of a computing device, the association relationships between entities may be inaccurate. In that case, an administrator (i.e., an expert) who can correct the graph must examine the association relationships one by one and fix the incorrect ones. To judge whether an association relationship is accurate, the administrator searches the database for the raw data item by item and determines whether the relationship established between the entities from that raw data is correct.

This search-based workflow is slow and labor-intensive. Moreover, if the administrator retrieves the wrong data from the database when looking up the data referenced in establishing an association relationship, the administrator may judge the relationship incorrectly. In addition, existing knowledge graphs did not let the administrator intuitively check the degree of association between entities.

Korean Patent Laid-Open Publication No. 10-2011-0064833 (published June 15, 2011)

The technical problem to be solved by the present invention is to provide a knowledge graph management method and apparatus that allow the degree and type of the association relationships between entities in a knowledge graph to be checked intuitively.

Another technical problem to be solved by the present invention is to provide a knowledge graph management method and apparatus that make the knowledge graph easy to modify.

Yet another technical problem to be solved by the present invention is to provide a knowledge graph management method and apparatus that readily provide detailed information on the raw data referenced when a relationship in the knowledge graph was established.

The technical problems of the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the following description.

To solve the above technical problems, a knowledge graph management method according to an embodiment of the present invention may include: identifying the association strength between a first entity and a second entity in a knowledge graph generated using one or more datasets; displaying the knowledge graph, wherein a connecting line representing the association relationship between the first entity and the second entity is visualized and displayed as a first graphic representation indicating the identified association strength; and, in response to the visualized connecting line being selected, displaying first raw data corresponding to the first entity and second raw data corresponding to the second entity.

In an embodiment, displaying the first raw data corresponding to the first entity and the second raw data corresponding to the second entity may include displaying a numerical value of the association strength.

In an embodiment, visualizing and displaying as the first graphic representation may include identifying the type of the association relationship between the first entity and the second entity, and determining the kind of the connecting line based on the identified type.

In an embodiment, determining the kind of the connecting line may include: determining the connecting line to be a first line when the type of the association relationship between the first entity and the second entity is a spatial association; determining the connecting line to be a second line when the type is a temporal association; and determining the connecting line to be a third line when the type is an observational association.

In an embodiment, visualizing and displaying as the first graphic representation may include determining the thickness of the connecting line so that the line becomes thicker in proportion to the association strength, and visualizing and displaying the determined thickness of the connecting line as the first graphic representation.

In an embodiment, the knowledge graph management method may further include: receiving modification information for the association relationship between the first entity and the second entity; modifying the association relationship between the first entity and the second entity based on the modification information; and displaying a knowledge graph that reflects the modified association relationship.

In an embodiment, the first entity may indicate a first column of the dataset, and the second entity may indicate a second column of the dataset.

In an embodiment, identifying the association strength between the first entity and the second entity may include calculating the association strength using the similarity between one or more items of the first raw data included in the first column and one or more items of the second raw data included in the second column.

In an embodiment, each of the first raw data and the second raw data may include a plurality of location data items. In this case, identifying the association strength between the first entity and the second entity may include measuring the distance between the median location of the first raw data and the median location of the second raw data, and quantifying the association strength between the first entity and the second entity based on the measured distance.

In an embodiment, each of the first raw data and the second raw data may include a plurality of time data items. In this case, identifying the association strength between the first entity and the second entity may include calculating the time difference between the median or mean of the first raw data and the median or mean of the second raw data, and quantifying the association strength between the first entity and the second entity based on the calculated time difference.

In an embodiment, each of the first raw data and the second raw data may include a plurality of observation data items. In this case, identifying the association strength between the first entity and the second entity may include calculating the difference between the median or mean of the first raw data and the median or mean of the second raw data, and quantifying the association strength between the first entity and the second entity based on the calculated difference.

To solve the above technical problems, a computing device according to another embodiment of the present invention includes one or more processors, a memory into which a program executed by the processor is loaded, and a storage in which the program is stored. The program may include instructions for performing: an operation of identifying the association strength between a first entity and a second entity in a knowledge graph generated using one or more datasets; an operation of displaying the knowledge graph, wherein a connecting line representing the association relationship between the first entity and the second entity is visualized and displayed as a first graphic representation indicating the identified association strength; and an operation of displaying, in response to the visualized connecting line being selected, first raw data corresponding to the first entity and second raw data corresponding to the second entity.

To solve the above technical problems, a non-transitory computer-readable storage medium according to yet another embodiment of the present invention contains instructions that, when executed by a processor, may cause the processor to perform operations including: identifying the association strength between a first entity and a second entity in a knowledge graph generated using one or more datasets; displaying the knowledge graph, wherein a connecting line representing the association relationship between the first entity and the second entity is visualized and displayed as a first graphic representation indicating the identified association strength; and, in response to the visualized connecting line being selected, displaying first raw data corresponding to the first entity and second raw data corresponding to the second entity.

FIG. 1 is a flowchart illustrating a knowledge graph management method according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a knowledge graph.
FIG. 3 is a diagram for explaining step S400 of FIG. 1 in detail.
FIGS. 4 to 6 are diagrams illustrating knowledge graphs that include various graphic representations.
FIG. 7 is a diagram for explaining step S430 of FIG. 3 in detail.
FIGS. 8 to 10 are diagrams illustrating graphic representations displayed according to the type of association relationship between entities.
FIG. 11 is a diagram for explaining step S600 of FIG. 1 in detail.
FIG. 12 is a diagram illustrating a knowledge graph modified according to an input.
FIG. 13 is an exemplary hardware configuration diagram of a computing device in various embodiments.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the technical idea of the present invention is not limited to the following embodiments and may be implemented in various different forms. The following embodiments are provided only to make the technical idea of the present invention complete and to fully convey the scope of the present invention to those of ordinary skill in the art to which it belongs; the technical idea of the present invention is defined only by the scope of the claims.

In assigning reference numerals to the components of each drawing, note that the same components are given the same numerals wherever possible, even when they appear in different drawings. In addition, in describing the present invention, a detailed description of a related known configuration or function is omitted when it is judged that it could obscure the gist of the present invention.

Unless otherwise defined, all terms (including technical and scientific terms) used in this specification have the meaning commonly understood by those of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries are not to be interpreted in an idealized or excessive sense unless they are clearly and specifically defined. The terminology used in this specification is for describing the embodiments and is not intended to limit the present invention. In this specification, the singular form also includes the plural unless the phrase specifically states otherwise.

In addition, terms such as first, second, A, B, (a), and (b) may be used in describing the components of the present invention. These terms serve only to distinguish one component from another and do not limit the nature, sequence, or order of the components. When a component is described as being "connected", "coupled", or "joined" to another component, the component may be directly connected or joined to that other component, but it should be understood that yet another component may also be "connected", "coupled", or "joined" between them.

As used in this specification, "comprises" and/or "comprising" does not exclude the presence or addition of one or more components, steps, operations, and/or elements other than those mentioned.

Hereinafter, some embodiments of the present invention are described with reference to the drawings.

FIG. 1 is a flowchart illustrating a knowledge graph management method according to an embodiment of the present invention.

Each step of the method illustrated in FIG. 1 may be performed by a computing device. In other words, each step may be implemented as one or more instructions executed by a processor of the computing device. First steps of the method may be performed by a first computing device, and second steps of the method may be performed by a second computing device. The following description assumes that each step is performed by a computing device. However, the subject performing each step is only an example, the present invention is not limited by the following description, and for convenience the subject of some steps may be omitted.

Referring to FIG. 1, the computing device may collect one or more datasets by interworking with one or more of a database and an external server (S100). Each dataset may be organized in rows and columns, such as a CSV (Comma-Separated Values) file or an Excel file. In an embodiment, when a collected dataset does not conform to the row-and-column format, the computing device may change the format, layout, data ordering, and so on of the dataset so that it conforms. In some embodiments, when data in a collected dataset does not conform to a preset data format, the computing device may convert the data to that format. For example, when the data in a particular column are latitude and longitude coordinates and the standard format for location data is an administrative address, the coordinates may be converted to an administrative address that includes the road name. As another example, when the data in a particular column are date values recorded in month-day-year order and the standard format for date values is year-month-day order, the values may be converted from month-day-year order to year-month-day order.
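The date conversion in step S100 can be sketched in Python as follows. This is only an illustration: the specification names the two orderings but no concrete patterns, so the input pattern `%m/%d/%Y`, the output pattern `%Y-%m-%d`, and the function names `normalize_date` and `normalize_column` are assumptions.

```python
from datetime import datetime

# Assumed concrete patterns; the specification only states
# "month-day-year order" and "year-month-day order".
SOURCE_FORMAT = "%m/%d/%Y"
STANDARD_FORMAT = "%Y-%m-%d"


def normalize_date(value: str) -> str:
    """Return the value in the standard format, converting it if needed."""
    try:
        # A value that already parses in the standard format is left unchanged.
        datetime.strptime(value, STANDARD_FORMAT)
        return value
    except ValueError:
        pass
    # Otherwise parse it as month-day-year and re-emit it as year-month-day.
    return datetime.strptime(value, SOURCE_FORMAT).strftime(STANDARD_FORMAT)


def normalize_column(values: list) -> list:
    """Normalize every date value of one dataset column."""
    return [normalize_date(v) for v in values]
```

A value that matches neither pattern raises `ValueError`, which leaves the decision about unparseable data to the caller.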

Next, the computing device may select, from among customized artificial intelligence models that have already been built, the model corresponding to the domain of the one or more datasets (S200). In an embodiment, a plurality of artificial intelligence models with different domains may be built in advance, and the model corresponding to the domain of the dataset may be selected from among them. Here, a domain is the field or topic in which an artificial intelligence model is used and may be, for example, a disaster type such as flooding, fine dust, or earthquake. For example, when the domain of the dataset is earthquake, the artificial intelligence model for the earthquake domain may be selected, and when the domain of the dataset is flooding, the model for the flooding domain may be selected. Each artificial intelligence model may be trained and built in advance.

Next, the computing device may use the selected artificial intelligence model to generate a knowledge graph in which entities are connected to form relationships (S300). FIG. 2 illustrates a knowledge graph that includes entities A through E and represents the association relationships between the entities as connecting lines. Each entity may be represented as a node in the knowledge graph. Here, the computing device may select the column names included in each dataset as entities and connect associated entities with connecting lines. Entities connected by a connecting line may be selected from the same dataset or from different datasets.
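One possible in-memory shape for such a graph is sketched below in Python. The class names `Entity` and `KnowledgeGraph`, the methods, and the sample dataset and column names are all hypothetical. The sketch only shows column names becoming nodes and connecting lines joining entities within or across datasets; it does not model how the artificial intelligence model decides which entities are associated.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A node of the graph: one column name of one dataset."""
    dataset: str
    column: str


@dataclass
class KnowledgeGraph:
    nodes: set = field(default_factory=set)
    # Undirected connecting lines, keyed by the unordered pair they join.
    edges: dict = field(default_factory=dict)

    def add_dataset(self, dataset: str, columns: list) -> None:
        """Register every column name of a dataset as an entity."""
        for column in columns:
            self.nodes.add(Entity(dataset, column))

    def connect(self, a: Entity, b: Entity, **attributes) -> None:
        """Draw a connecting line between two registered entities."""
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both entities must be added before connecting them")
        self.edges[frozenset((a, b))] = dict(attributes)


# Hypothetical datasets: the line below joins entities from different datasets.
graph = KnowledgeGraph()
graph.add_dataset("rainfall.csv", ["station_address", "precipitation"])
graph.add_dataset("flooding.csv", ["flood_location", "flood_depth"])
graph.connect(
    Entity("rainfall.csv", "station_address"),
    Entity("flooding.csv", "flood_location"),
    relation="spatial",
)
```

Keying each line by a `frozenset` makes the connection undirected, so looking it up does not depend on the order of the two entities.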

The computing device may calculate and quantify the strength of the association relationship between associated entities by using the artificial intelligence model or an algorithm.

In an embodiment, the computing device may use the artificial intelligence model to cluster one or more of the dataset names, column names, and column data, and determine the association relationships between entities from the clustering result. In an embodiment, when a first column name and a second column name are names related to space (e.g., address, place), the first entity representing the first column name and the second entity representing the second column name may be associated with each other. The computing device may measure the distance between the spatial median (i.e., the median location) of the plurality of first column data corresponding to the first column name and the spatial median of the plurality of second column data corresponding to the second column name, and quantify the association strength between the first entity and the second entity based on the measured distance. Here, the spatial median may be the median of a plurality of column data (i.e., a plurality of spatial data). The computing device may quantify the association strength so that a shorter measured distance yields a higher numerical value.
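A minimal Python sketch of this spatial scoring follows. Several choices are assumptions rather than part of the specification: location data are taken to be (latitude, longitude) pairs in degrees, the spatial median is taken coordinate-wise, distance is the great-circle (haversine) distance in kilometers, and the score is `1 / (1 + distance)`, which is just one function that grows as the distance shrinks.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0


def spatial_median(points):
    """Coordinate-wise median of (latitude, longitude) pairs (an assumption)."""
    return median(p[0] for p in points), median(p[1] for p in points)


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def spatial_strength(first_column, second_column):
    """Score in (0, 1]: the shorter the distance, the higher the value."""
    distance = haversine_km(spatial_median(first_column),
                            spatial_median(second_column))
    return 1.0 / (1.0 + distance)
```

Two columns whose median locations coincide score exactly 1.0, and the score falls toward 0 as the median locations move apart.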

In an embodiment, when a third column name and a fourth column name are names related to time (e.g., date and time, date), the third entity representing the third column name and the fourth entity representing the fourth column name may be associated with each other. The computing device may calculate the time difference between the temporal median (or mean) of the plurality of third column data corresponding to the third column name and the temporal median (or mean) of the plurality of fourth column data corresponding to the fourth column name, and quantify the association strength between the third entity and the fourth entity based on the calculated time difference. The computing device may quantify the association strength so that a shorter time difference yields a higher numerical value.
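The temporal case can be sketched in the same way. The Python below assumes that the column values are naive `datetime` objects, that the gap is measured in hours, and that the score is `1 / (1 + gap_hours)`. These are illustrative choices; the specification only requires that a shorter time difference give a higher value.

```python
from datetime import datetime
from statistics import median

EPOCH = datetime(1970, 1, 1)


def median_seconds(values):
    """Median of the timestamps, expressed as seconds since EPOCH."""
    return median((v - EPOCH).total_seconds() for v in values)


def temporal_strength(first_column, second_column):
    """Score in (0, 1]: the smaller the time gap, the higher the value."""
    gap_hours = abs(median_seconds(first_column)
                    - median_seconds(second_column)) / 3600.0
    return 1.0 / (1.0 + gap_hours)
```

For instance, two columns whose median timestamps are one hour apart score 0.5 under this assumed formula, while identical medians score 1.0.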

In another embodiment, when a fifth column name and a sixth column name are names related to the same type of observation (e.g., precipitation, flood depth), the fifth entity representing the fifth column name and the sixth entity representing the sixth column name may be associated with each other. The computing device may calculate the difference between the median (or mean) observation of the plurality of fifth column data corresponding to the fifth column name and the median (or mean) observation of the plurality of sixth column data corresponding to the sixth column name, and quantify the association strength between the fifth entity and the sixth entity based on the calculated difference. The computing device may quantify the association strength so that a smaller difference between the observations yields a higher numerical value.
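For observation columns, the choice between median and mean can be exposed as a parameter. In the Python sketch below, the function name `observation_strength`, the `use_mean` flag, and the `1 / (1 + difference)` scoring are assumptions for illustration, and both columns are assumed to use the same unit.

```python
from statistics import mean, median


def observation_strength(first_column, second_column, use_mean=False):
    """Score in (0, 1]: the smaller the difference, the higher the value."""
    center = mean if use_mean else median
    difference = abs(center(first_column) - center(second_column))
    return 1.0 / (1.0 + difference)
```

The median is less sensitive to a single extreme reading than the mean, so the two settings can give noticeably different scores for skewed data.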

In addition, the computing device may quantify the association relationships between entities in various other ways.

Next, the computing device may express the association relationships between entities using various graphic representations and then display the knowledge graph in which those relationships are so expressed (S400). Step S400 is described in more detail with reference to FIG. 3.

While the knowledge graph is displayed, the computing device may receive modification information from the administrator (S500). Here, the modification information may be information for modifying an association relationship between entities. The administrator may select a connecting line in the knowledge graph and enter modification information for the selected line, in which case the computing device receives that modification information from the administrator.

When modification information is entered, the computing device may change the knowledge graph based on the modification information and display it (S600). Step S600 is described in more detail with reference to FIG. 11.

The computing device may then monitor whether additional correction information for the knowledge graph is entered and, when it is, modify and display the knowledge graph again based on that additional correction information.

Hereinafter, step S400 of FIG. 1 is described in detail with reference to FIGS. 3 to 10.

The computing device may identify entities that are associated with each other and identify the strength of their association (S410). In one embodiment, the association strength may be quantified, and the computing device may identify this quantified association strength.

Next, the computing device may visualize the association strength between entities as a first graphic representation (S420). In one embodiment, the computing device may determine the thickness of a connection line so that the line becomes thicker in proportion to the association strength, and may visualize the determined thickness as the first graphic representation. In another embodiment, the computing device may highlight connection lines having a certain association strength.
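The thickness rule of step S420 can be sketched as a simple mapping from strength to line width. The function below is illustrative only: the name `line_width`, the width range of 1 to 8, and the choice of a linear mapping with a non-zero minimum (so that weak associations stay visible) are assumptions, not details from the specification.

```python
def line_width(strength, min_width=1.0, max_width=8.0):
    """Map an association strength in [0, 1] to a connection-line width."""
    # Clamp out-of-range inputs so the width stays within the assumed range.
    clamped = max(0.0, min(1.0, strength))
    # Linear interpolation: stronger associations get thicker lines.
    return min_width + (max_width - min_width) * clamped


thin = line_width(0.10)
thick = line_width(0.72)
```

A renderer would pass the returned width to whatever drawing routine it uses for the connection line.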

FIG. 4 illustrates a knowledge graph visualized with the first graphic representation; as shown in FIG. 4, the association strength is visualized as the thickness of the connection line. In FIG. 4, the association between entity A and entity B is "strong", the association between entity B and entity D is "strong", and the association between entity B and entity C is "medium". The association between entity A and entity C is "weak", and the association between entity A and entity E is "weak".

Next, the computing device may group the entities (S430). In one embodiment, the computing device may group entities selected from the same dataset into the same group. As described above, each entity may be a column name of a dataset.

Next, the computing device may visualize entities belonging to the same group as a second graphic representation having the same shape or the same color (S440).

FIG. 5A illustrates entities belonging to the same group visualized in the same color. In FIG. 5A, a particular color is indicated by hatching. According to FIG. 5A, entities A, B, and C belong to a first group and may be visualized in a first color, while entities D and E belong to a second group and may be visualized in a second color.

FIG. 5B illustrates entities belonging to the same group visualized by shape. According to FIG. 5B, entities A, B, and C belong to a first group and may be visualized as circles, and entities D and E belong to a second group and may be visualized as circles.
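Steps S430 and S440 can be sketched as grouping entities by their source dataset and giving every member of a group the same color. The function name, the palette, and the dataset names below are hypothetical; only the grouping of A, B, C and D, E follows FIG. 5A.

```python
from itertools import cycle


def assign_group_colors(entity_to_dataset, palette=("blue", "orange", "green")):
    """Give entities from the same dataset the same color."""
    dataset_color = {}
    colors = cycle(palette)  # colors repeat if there are more datasets than colors
    styles = {}
    for entity, dataset in entity_to_dataset.items():
        if dataset not in dataset_color:
            # First time this dataset is seen: reserve the next palette color.
            dataset_color[dataset] = next(colors)
        styles[entity] = dataset_color[dataset]
    return styles


# Grouping as in FIG. 5A; the dataset names are placeholders.
entities = {"A": "dataset_1", "B": "dataset_1", "C": "dataset_1",
            "D": "dataset_2", "E": "dataset_2"}
colors_by_entity = assign_group_colors(entities)
```

The same structure works for shapes: replacing the palette with a tuple of marker shapes yields the shape-based variant of the second graphic representation.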

Next, the computing device may identify causal relationships between entities (S450). In one embodiment, the causal relationship of each entity may be determined by the artificial intelligence model selected in step S200. That is, the computing device may use the artificial intelligence model to identify entities that are causally related. For example, a rainfall entity whose rainfall measurements are at or above a certain value and an inundation entity may be causally related.

Next, the computing device may visualize causally related entities as a third graphic representation (S460).

FIG. 6 illustrates causally related entities visualized with the third graphic representation. According to FIG. 6, the connection line between entity A and entity C, which are causally related, may be visualized as a double line. Here, the double line corresponds to the third graphic representation. In some embodiments, another type of line, or any of various other graphic representation methods, may be used as the third graphic representation.

Next, the connection line may be visualized with a separate graphic representation based on a preset association type (S470).

Hereinafter, step S470 is described in detail with reference to FIGS. 7 to 10.

When entities are associated, the computing device may identify the type of the association (S431). In one embodiment, the computing device may identify which type the association between entities belongs to, such as a spatial association, a temporal association, or an observation association. Here, a spatial association is one in which the entities are related because their spatial values (i.e., location values) are similar, a temporal association is one in which the entities are related because their time values are similar, and an observation association is one in which the entities are related because their observed values are similar.

Next, the computing device may change the connection line between entities having a spatial association to a first line and visualize it, thereby displaying that those entities have a spatial association (S432).

Next, the computing device may change the connection line between entities having a temporal association to a second line and visualize it, thereby displaying that those entities have a temporal association (S433).

Next, the computing device may change the connection line between entities having an observation association to a third line and visualize it, thereby displaying that those entities have an observation association (S434).

FIG. 8 illustrates a knowledge graph visualized according to association type. According to FIG. 8, entity A and entity E are connected by a first line (i.e., a dotted line) indicating a spatial association. Entity A and entity B are connected by a second line (i.e., a square-dotted line) indicating a temporal association, and entity B and entity C are connected by a third line (i.e., a dashed line) indicating an observation association.
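The type-to-line mapping of steps S432 to S434 amounts to a lookup table. The sketch below encodes the three association types and the line styles named for FIG. 8; the enum, the style strings, and the function name are illustrative assumptions rather than part of the specification.

```python
from enum import Enum


class AssociationType(Enum):
    SPATIAL = "spatial"
    TEMPORAL = "temporal"
    OBSERVATION = "observation"


# First, second, and third lines as described for FIG. 8.
LINE_STYLE = {
    AssociationType.SPATIAL: "dotted",
    AssociationType.TEMPORAL: "square-dotted",
    AssociationType.OBSERVATION: "dashed",
}


def style_connection_lines(typed_edges):
    """Return the line style for each (entity, entity) pair based on its type."""
    return {pair: LINE_STYLE[kind] for pair, kind in typed_edges.items()}


# Edges and their association types as illustrated in FIG. 8.
fig8_edges = {
    ("A", "E"): AssociationType.SPATIAL,
    ("A", "B"): AssociationType.TEMPORAL,
    ("B", "C"): AssociationType.OBSERVATION,
}
fig8_styles = style_connection_lines(fig8_edges)
```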

Referring back to FIG. 3, the computing device may receive, from the administrator, a selection of a specific connection line representing an association (S480).

Next, the computing device may display detailed information about the selected connection line (S490). In one embodiment, the computing device may include in the detailed information the raw data of the entities used to establish the selected association. The raw data may include the dataset name, the column name, and the column data. The computing device may also include in the detailed information a numerical value for the association strength. The administrator can thus check the raw data on which the association was based and verify, from that raw data, whether the relationship between the entities was set appropriately.

FIG. 9 illustrates the detailed information displayed when the connection line 11 between entity A and entity B is selected. As shown in FIG. 9, when the connection line 11 is selected, detailed information may be displayed that includes the numerical value of the association strength between entity A and entity B (70%), the raw data of entity A (i.e., the raw data of the first entity), and the raw data of entity B (i.e., the raw data of the second entity). Here, the numerical value may indicate the degree of similarity between the entities, with a higher number meaning a stronger association.

FIG. 10 illustrates the detailed information displayed when the connection line 12 between entity B and entity D is selected. As shown in FIG. 10, when the connection line 12 is selected, detailed information may be displayed that includes the numerical value of the association strength between entity B and entity D (72%), the raw data of entity B (i.e., the raw data of the second entity), and the raw data of entity D (i.e., the raw data of the fourth entity).
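The detail view of steps S480 and S490 can be sketched as a lookup that, for a selected connection line, returns the strength value and the raw data of both entities. The class names, field names, dataset names, and column values below are hypothetical; only the 70% strength between entity A and entity B and the three raw-data fields (dataset name, column name, column data) come from the description.

```python
from dataclasses import dataclass


@dataclass
class RawData:
    dataset_name: str
    column_name: str
    column_data: list


@dataclass
class ConnectionLine:
    first: RawData
    second: RawData
    strength: float  # association strength as a fraction between 0 and 1


def line_details(line):
    """Build the detailed information shown when a connection line is selected."""
    return {
        "strength": f"{line.strength:.0%}",  # e.g. 0.70 -> "70%"
        "first_raw_data": line.first,
        "second_raw_data": line.second,
    }


# Placeholder raw data for entities A and B; only the 70% value is from FIG. 9.
entity_a = RawData("dataset_1", "column_a", [1.0, 2.0, 3.0])
entity_b = RawData("dataset_1", "column_b", [1.5, 2.5, 3.5])
details = line_details(ConnectionLine(entity_a, entity_b, 0.70))
```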

According to this embodiment, the administrator can intuitively check the associations between entities in the knowledge graph through the various graphic representations. In addition, by selecting a connection line, the administrator can check the raw data of each entity used to establish the association and the numerical value of the association strength.

Hereinafter, step S600 of FIG. 1 is described in detail with reference to FIGS. 11 and 12.

In response to receiving correction information about an association from the administrator, the computing device may change the association strength based on the entered correction information (S610). The administrator may modify data about the association while viewing detailed information such as that shown in FIGS. 9 and 10. In doing so, the administrator may modify the numerical value of the association strength. If there is no association at all, the administrator may enter '0%' as the numerical value. The administrator may also modify other data through the detailed information.

Next, based on the corrected numerical value, the computing device may change the association strength value for the selected entities and change the corresponding connection line in the knowledge graph accordingly (S620). In one embodiment, the computing device may change the thickness of the connection line as the numerical value changes. In addition, when the numerical value is corrected to '0%' (i.e., when the administrator indicates that there is no association), the computing device may remove the connection line connecting those entities.

FIG. 12 illustrates the knowledge graph after it has been modified in response to a change in the association strength value. According to FIG. 12, in response to the association strength between entity B and entity D being changed to 10%, the knowledge graph may be changed so that the connection line 13 between entity B and entity D becomes thinner.
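Steps S610 and S620 can be sketched as an update on a mapping from entity pairs to strength values, with removal of the connection line when the corrected value is 0%. The dictionary representation and function name are assumptions; the 70% and 72% values and the change of the B-D strength to 10% follow FIGS. 9, 10, and 12, while the A-E value is a placeholder.

```python
def apply_correction(strengths, pair, corrected_strength):
    """Apply an administrator's corrected strength to one connection line."""
    if corrected_strength <= 0.0:
        # A value of 0% means no association: remove the connection line.
        strengths.pop(pair, None)
    else:
        # Otherwise store the new value; the renderer redraws the line width.
        strengths[pair] = corrected_strength
    return strengths


graph = {("A", "B"): 0.70, ("B", "D"): 0.72, ("A", "E"): 0.20}
apply_correction(graph, ("B", "D"), 0.10)  # line becomes thinner, as in FIG. 12
apply_correction(graph, ("A", "E"), 0.0)   # line is removed
```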

Next, the computing device may separately tag the raw data of the entities related to the corrected association strength so that the corrected association strength can be used as training data for the artificial intelligence model (S630). For example, when the association strength between entity B and entity D has been corrected, the corrected association strength value, the raw data of entity B, and the raw data of entity D may be separately tagged as additional training data. The tagged additional training data may then be fed to the artificial intelligence model for training, which can improve the model's accuracy.
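The tagging of step S630 can be sketched as bundling the corrected value with the two entities' raw data into one labeled record. The record layout, the tag string, and the sample column values are assumptions for illustration; the specification does not define a storage format.

```python
def tag_training_example(corrected_strength, first_raw_data, second_raw_data):
    """Bundle a corrected strength with both entities' raw data as one record."""
    return {
        "tag": "additional_training_data",  # hypothetical tag name
        "label": corrected_strength,        # the administrator's corrected value
        "first_raw_data": first_raw_data,
        "second_raw_data": second_raw_data,
    }


# Entity B and entity D after the correction to 10%; the column values are placeholders.
example = tag_training_example(0.10, [3.0, 4.0, 5.0], [30.0, 40.0, 50.0])
training_queue = [example]
```

Records collected this way could later be passed to whatever training routine the artificial intelligence model uses.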

As described above, embodiments of the present invention can provide a knowledge graph in which the associations between entities can be understood intuitively. In addition, because the administrator can obtain the raw data used to establish an association simply by selecting a connection line, the knowledge graph can be corrected more quickly and easily. Furthermore, by providing the administrator, through a simple input, with the raw data used to establish an association, errors in correcting the knowledge graph can be minimized.

FIG. 13 is an exemplary hardware configuration diagram of a computing device according to various embodiments.

The computing device 1000 according to this embodiment may include one or more processors 1100, a system bus 1600, a communication interface 1200, a memory 1400 into which a computer program 1500 executed by the processor 1100 is loaded, and a storage 1300 that stores the computer program 1500. FIG. 13 shows only the components related to the embodiment. Accordingly, a person of ordinary skill in the art to which the embodiments of this specification pertain will understand that other general-purpose components may be included in addition to those shown in FIG. 13.

The processor 1100 controls the overall operation of each component of the computing device 1000. The processor 1100 may include at least one of a CPU (Central Processing Unit), an MPU (Micro Processor Unit), an MCU (Micro Controller Unit), a GPU (Graphics Processing Unit), or any type of processor well known in the art of this specification. The processor 1100 may also perform operations for at least one application or program for executing the methods/operations according to various embodiments. The computing device 1000 may include two or more processors.

The memory 1400 stores various data, commands, and/or information. The memory 1400 may load one or more programs 1500 from the storage 1300 in order to execute the methods/operations according to various embodiments of this specification. An example of the memory 1400 is RAM, but the memory is not limited thereto.

The communication interface 1200 may communicate with external communication devices such as mobile communication terminals, personal computers, and servers over a network such as a mobile communication network or a wired Internet network. The communication interface 1200 may receive input information from such a communication device.

The system bus 1600 provides communication between the components of the computing device 1000. The system bus 1600 may be implemented as various types of buses, such as an address bus, a data bus, and a control bus.

The storage 1300 may store one or more computer programs 1500 non-transitorily. The storage 1300 may include a non-volatile memory such as a flash memory, a hard disk, a removable disk, or any form of computer-readable recording medium well known in the art to which the embodiments of this specification pertain.

The computer program 1500 may include one or more instructions that implement the methods/operations according to various embodiments of this specification. When the computer program 1500 is loaded into the memory 1400, the processor 1100 may perform the methods/operations according to various embodiments of this specification by executing the one or more instructions.

In one embodiment, the computer program 1500 may include instructions for performing: an operation of identifying an association strength between a first entity and a second entity in a knowledge graph generated using one or more datasets; an operation of displaying the knowledge graph, in which a connection line indicating the association between the first entity and the second entity is visualized and displayed as a first graphic representation indicating the identified association strength; and an operation of displaying, in response to selection of the visualized connection line, first raw data corresponding to the first entity and second raw data corresponding to the second entity.

Various embodiments of the present invention and their effects have been described above with reference to FIGS. 1 to 13. The effects of the technical idea of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

The technical idea of the present invention described above with reference to FIGS. 1 to 13 may be implemented as computer-readable code on a computer-readable medium. The computer-readable recording medium may be, for example, a removable recording medium (CD, DVD, Blu-ray disc, USB storage device, removable hard disk) or a fixed recording medium (ROM, RAM, hard disk built into a computer). The computer program recorded on the computer-readable recording medium may be transmitted to another computing device over a network such as the Internet and installed on that computing device, and may thereby be used on that computing device.

Although all the components constituting an embodiment of the present invention have been described above as being combined into one or operating in combination, the technical idea of the present invention is not necessarily limited to such an embodiment. That is, within the scope of the object of the present invention, all of the components may operate by being selectively combined into one or more.

Although operations are shown in a particular order in the drawings, this should not be understood to mean that the operations must be performed in the particular order shown or in sequential order, or that all illustrated operations must be performed, to obtain a desired result. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various components in the embodiments described above should not be understood as requiring such separation; the described program components and systems may generally be integrated together into a single software product or packaged into multiple software products.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, a person of ordinary skill in the art to which the present invention pertains will understand that the invention may be practiced in other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of protection of the present invention should be interpreted according to the claims below, and all technical ideas within their equivalent scope should be interpreted as falling within the scope of rights of the technical idea defined by the present invention.

Claims

A knowledge graph management method performed by a computer device, comprising:
identifying an association strength between a first entity and a second entity in a knowledge graph generated using one or more datasets;
displaying the knowledge graph, wherein a connection line indicating an association between the first entity and the second entity is visualized and displayed as a first graphic representation indicating the identified association strength;
receiving correction information on a type of the association between the first entity and the second entity;
modifying the association between the first entity and the second entity based on the correction information; and
generating training data for a knowledge graph generation artificial intelligence model by tagging the correction information, first raw data corresponding to the first entity, and second raw data corresponding to the second entity,
wherein the visualizing and displaying as the first graphic representation comprises:
identifying the type of the association between the first entity and the second entity; and
determining a type of the connection line based on the identified type,
wherein the type of the association is one of a spatial association indicating that the positions of the first entity and the second entity are associated, a temporal association indicating that the time values of the first entity and the second entity are associated, and an observation association indicating that the observed values of the first entity and the second entity are associated, and
wherein the determining of the type of the connection line comprises:
determining the connection line to be a first line when the type of the association between the first entity and the second entity is the spatial association;
determining the connection line to be a second line when the type of the association between the first entity and the second entity is the temporal association; and
determining the connection line to be a third line when the type of the association between the first entity and the second entity is the observation association.

The method of claim 1,
wherein the displaying of the first raw data corresponding to the first entity and the second raw data corresponding to the second entity comprises
displaying a numerical value for the association strength.

delete

The method of claim 1,
wherein the first entity refers to a first column of the dataset, and the second entity refers to a second column of the dataset.

10. The method of claim 9,
wherein the identifying of the association strength between the first entity and the second entity comprises
calculating the association strength between the first entity and the second entity using a similarity between one or more first raw data included in the first column and one or more second raw data included in the second column.

11. The method of claim 10,
wherein each of the first raw data and the second raw data includes a plurality of location data, and
wherein the identifying of the association strength between the first entity and the second entity comprises:
measuring a distance between a median position value of the first raw data and a median position value of the second raw data; and
quantifying the association strength between the first entity and the second entity based on the measured distance.

11. The method of claim 10,
wherein each of the first raw data and the second raw data includes a plurality of time data, and
wherein the identifying of the association strength between the first entity and the second entity comprises:
calculating a time difference between the median or average value of the first raw data and the median or average value of the second raw data; and
quantifying the association strength between the first entity and the second entity based on the calculated time difference.

11. The method of claim 10,
wherein each of the first raw data and the second raw data includes a plurality of observation data, and
wherein the identifying of the association strength between the first entity and the second entity comprises:
calculating a difference between the median or average value of the first raw data and the median or average value of the second raw data; and
quantifying the association strength between the first entity and the second entity based on the calculated difference.

The method of claim 1,
wherein the visualizing and displaying comprises
visualizing and displaying a plurality of entities generated from the same dataset as a second graphic representation having the same shape or the same color.

The method of claim 1,
wherein the visualizing and displaying comprises:
identifying a third entity that has a causal relationship with the first entity in the knowledge graph; and
visualizing and displaying the causal relationship between the first entity and the third entity as a third graphic representation.

delete