KR101916934B1

KR101916934B1 - Data searching apparatus

Info

Publication number: KR101916934B1
Application number: KR1020180011486A
Authority: KR
Inventors: 최진혁
Original assignee: 주식회사 인포리언스
Priority date: 2018-01-30
Filing date: 2018-01-30
Publication date: 2018-11-08
Also published as: KR20180014803A

Abstract

A data search apparatus according to an embodiment of the present invention includes a memory that stores time series data, and a processor capable of accessing the memory. The processor can assign an externally input comment to a partial interval or a point in time of the time series data, and can classify the comment according to a classification tag included in the comment.

Description

DATA SEARCHING APPARATUS

The present invention relates to a data search apparatus.

Because anyone can collect data through the web, smartphones, IoT sensors, and the like, data sources are becoming more diverse and more personal. To support this, data analysis algorithms are being released as open source and services are being turned into platforms. It has also become possible to apply algorithms without specialized technical knowledge.

However, even when data and algorithms are available, not everyone can easily make use of the data. Processing data, searching for key information contained in it, and applying data mining or machine learning algorithms all require technical knowledge and experience, which not everyone has.

Furthermore, in the future, empirical knowledge of the environment and circumstances in which data is generated, individual inclinations, and know-how about which algorithm and which parameters to apply to which data will become as important as expert knowledge of data or algorithms.

In addition, carrying out the process of collecting such data in large volumes, together with the data itself, is a very important factor in implementing artificial intelligence services.

Therefore, for everyone to make the most of their own data, it should be easy to borrow the abilities of experienced persons who have empirical knowledge of the data, or of experts who have specialized skills in data analysis. At the same time, those experienced persons and experts need to be given the opportunity to earn revenue from their knowledge and experience through this process.

[Patent Document 1] Korean Laid-Open Patent Publication No. 10-2007-0108294 (publication date: November 9, 2007)

A data search apparatus according to an embodiment of the present invention is intended to search time series data to be searched for data having a pattern desired by a user.

A data search apparatus according to an embodiment of the present invention is intended to classify comments assigned to a matching interval.

A data search apparatus according to an embodiment of the present invention is intended to calculate prices for comments, set intervals, and classification tag lists.

A data search apparatus according to an embodiment of the present invention is intended to classify newly input time series data to be analyzed according to classification tags.

A data search apparatus according to an embodiment of the present invention is intended to assign a comment to an interval or point in time selected by a user.

A data search apparatus according to an embodiment of the present invention is intended to calculate prices for comments, selected intervals or selected points in time, and classification tag lists.

The problems addressed by the present application are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the description below.

According to one aspect of the present invention, there is provided a data search apparatus including a memory that stores time series data, and a processor capable of accessing the memory. The processor can assign an externally input comment to a partial interval or a point in time of the time series data, and can classify the comment according to a classification tag included in the comment.

The processor can generate a comment list for the comments, associate it with the classification tag, and generate a classification tag list for the classification tags.

The processor can receive, from one or more user terminals, a score for at least one of the classification tag and the comment and assign it; count the number of times the comment is cited when it is cited in other comments; and calculate a price for the comment according to the score and the citation count.

The processor can generate a data vector composed of features of the time series data to which the comment is assigned, and can classify other time series data by classification tag by applying that other time series data to a machine learning model based on the data vector.

The processor can apply the other time series data to the machine learning model without assigning comments to it.

A data search apparatus according to an embodiment of the present invention can find the data a user wants by searching the search target time series data for portions that match or resemble the patterns in a set interval of a plurality of different time series data.

A data search apparatus according to an embodiment of the present invention assigns comments that include classification tags, and can therefore classify the comments assigned to a matching interval according to those tags.

A data search apparatus according to an embodiment of the present invention can calculate a price according to scores reflecting the appropriateness of comments, set intervals, and classification tag lists, or according to citation counts.

A data search apparatus according to an embodiment of the present invention can classify newly input time series data to be analyzed according to classification tags by means of a machine learning model.

A data search apparatus according to an embodiment of the present invention can calculate a price from scores reflecting the appropriateness of comments, selected intervals or selected points in time, and classification tag lists, or from citation counts.

A data search apparatus according to an embodiment of the present invention can classify newly input time series data to be analyzed according to classification tags by means of a machine learning model.

The effects of the present application are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

FIG. 1 shows a data search apparatus according to an embodiment of the present invention.
FIG. 2 shows an example of first time series data, second time series data, first search target time series data, and second search target time series data.
FIG. 3 shows an example of a comment assigned to a matching interval.
FIG. 4 shows an example of a classification tag list.
FIGS. 5 and 6 illustrate a process of generating a data vector.
FIGS. 7 to 9 illustrate processing through a machine learning model.
FIGS. 10 to 12 show comments assigned to an interval selected by a user.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. The accompanying drawings are provided only to make the content of the present invention easier to disclose, and those of ordinary skill in the art will readily understand that the scope of the present invention is not limited to the scope of the drawings.

The terms used in the present application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In the present application, terms such as "include" or "have" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

FIG. 1 shows a data search apparatus according to an embodiment of the present invention. As shown in FIG. 1, the data search apparatus includes a memory 106 and a processor 104.

The data search apparatus may include a bus 102 or another communication mechanism for transferring information. The bus 102 or other communication mechanism interconnects the processor 104, a computer-readable recording medium RM, a network interface 112 (for example, a modem or Ethernet card), a display unit 114 (for example, a CRT or LCD), an input unit 118 (for example, a keyboard, keypad, virtual keyboard, mouse, trackball, stylus, or touch sensing means), and/or subsystems.

The computer-readable recording medium RM includes, but is not limited to, the memory 106 (for example, RAM), a static storage unit 108 (for example, ROM), and a disk drive 110 (for example, an HDD, SSD, optical disc, or flash memory drive). The disk drive may be a non-transitory recording medium. The optical disc may be, but is not limited to, a CD, DVD, or Blu-ray disc.

The data search apparatus may have one or more disk drives 110. As shown in FIG. 1, the disk drive 110 may be housed in a housing 120 together with the processor 104, or it may instead be installed remotely and communicate with the processor 104 remotely. The apparatus may also include a database comprising one or more disk drives.

The recording medium RM can store the operating system, drivers, application programs, data, databases, and the like required for the operation of the data search apparatus.

The display unit 114 can display the operation of the data search apparatus and a user interface.

The processor 104 may be, but is not limited to, a CPU, a microcontroller, a digital signal processor (DSP), or the like, and controls the operation of the data search apparatus.

The processor 104 can access the recording medium RM and execute one or more sequences of instructions stored on it, thereby performing the data search, comment assignment, classification tag processing, machine learning, and other operations described below.

These instructions may be read into the memory 106 from another computer-readable medium, such as the static storage unit 108 or the disk drive 110. In other embodiments, hard-wired circuitry may be used in place of, or in combination with, software instructions to implement the present disclosure.

Logic may be encoded in a computer-readable recording medium RM, which may refer to any medium that participates in providing instructions to the processor 104 for execution. Such a recording medium RM may take many forms, including but not limited to non-volatile recording media and volatile recording media.

The processor 104 can communicate with a hardware controller for the display unit 114 to display the operation of the data search apparatus and user interfacing operations on the display unit 114.

In one embodiment, the computer-readable recording medium RM may be non-transitory. In various implementations, non-volatile recording media include optical or magnetic disks, such as the disk drive 110; volatile recording media include dynamic recording media, such as the system memory 106; and transmission media, which include the wires that make up the bus 102, include coaxial cables, copper wire, and optical fibers.

In one example, transmission media may take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.

Some common forms of computer-readable recording media RM include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, RAM, PROM, EPROM, FLASH-EPROM, any other memory chip or cartridge, a carrier wave, or any other medium from which a computer is adapted to read.

In various embodiments of the present disclosure, the instruction sequences for practicing the present disclosure may be executed by the data search apparatus according to an embodiment of the present invention. In various other embodiments, a plurality of computing devices 100 coupled by a communication link 124 to a network (for example, a LAN, WLAN, PSTN, and/or other wired or wireless networks, including telecommunications, mobile, and cellular phone networks) may execute the instruction sequences in coordination with one another.

The data search apparatus can transmit and receive messages, data, information, and instructions, including one or more programs (that is, application code), through the communication link 124 and the network interface 112.

The network interface 112 may include a separate or integrated antenna to enable transmission and reception over the communication link 124. Received program code may be executed by the processor 104 as it is received, and/or stored in the disk drive 110 or some other non-volatile storage for later execution.

Next, the operation of the data search apparatus according to an embodiment of the present invention is described with reference to the drawings.

The memory 106 stores first time series data and second time series data that differ from each other. The first and second time series data may include information about various data values over time.

For example, the first and second time series data may be sensing values output by a sensor over time, or the stock price of a particular company or stock market over time.

The first and second time series data may be output by different sensors that sense the same factor, or may concern different factors (for example, sea surface temperature and typhoon track, or air temperature and crop growth).

The processor 104 is capable of accessing the memory 106. Accordingly, the processor 104 can read the first and second time series data.

FIG. 2 shows an example of first time series data, second time series data, first search target time series data, and second search target time series data. In FIG. 2, the horizontal axis corresponds to time and the vertical axis to the value of each time series.

The processor 104 derives first matching data, which is a portion of first search target time series data that matches a first pattern of the first time series data within a set interval.

The processor 104 also derives second matching data, which is a portion of second search target time series data, different from the first search target time series data, that matches a second pattern of the second time series data within the same set interval.

In other words, the processor 104 can search the first and second search target time series data for data matching the patterns of different time series data that lie in the same set interval.

To do so, the processor 104 can slide the plurality of patterns from the set interval as a window across the entire span of the plurality of different search target time series data, searching for data identical or similar to the patterns, and thereby derive the first and second matching data. A default value may be provided for the permissible similarity (error rate), and the default can be changed.

To this end, the processor 104 can sample data in the set interval and search for data that matches or resembles the order and values of the sampled data.
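The windowed search described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes that similarity is measured as the mean absolute difference between the sampled pattern and each window, which the text does not specify, and the names `find_matches` and `max_error` are hypothetical.

```python
from typing import List, Sequence


def find_matches(pattern: Sequence[float],
                 target: Sequence[float],
                 max_error: float = 0.1) -> List[int]:
    """Return start indices of windows in `target` that resemble `pattern`.

    Assumption: a window matches when its mean absolute difference from
    the pattern is at most `max_error` (a changeable default tolerance).
    """
    n = len(pattern)
    if n == 0 or n > len(target):
        return []
    hits = []
    # Slide a window of the pattern's length over the whole target series.
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        error = sum(abs(p - w) for p, w in zip(pattern, window)) / n
        if error <= max_error:
            hits.append(start)
    return hits


pattern = [1.0, 2.0, 3.0]
target = [0.0, 1.0, 2.0, 3.0, 0.0, 1.05, 2.0, 2.95, 9.0]
print(find_matches(pattern, target))  # [1, 5]
```

To search several series, the same function could be called once per (pattern, search target) pair, for example the first pattern against the first search target series and the second pattern against the second.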

The processor 104 can perform the search operation on the first and second search target time series data, but is not limited to this; as shown in FIG. 2, it can perform the search operation on three or more different search target time series data.

As described above, the data search apparatus according to an embodiment of the present invention can search different time series data for data similar or identical to the patterns in a set interval.

For example, suppose the first to third time series data are temperature, humidity, and crop growth over time, respectively. When a user enters into the data search apparatus, as the set interval, an interval in which temperature, humidity, and crop growth are strongly related, the apparatus can search the plurality of search target data for portions that match or resemble the temperature, humidity, and crop growth patterns in that set interval. The user can thus easily find strongly related intervals.

Meanwhile, as shown in FIG. 2, the first and second search target time series data may be at least part of the first and second time series data, respectively.

Alternatively, unlike FIG. 2, the first and second search target time series data may differ from the first and second time series data, respectively.

For example, the first and second time series data may be data output by temperature sensor 1 and humidity sensor 1, respectively, while the first and second search target time series data are data output by temperature sensor 2 and humidity sensor 2.

The user can thus check whether the data output by temperature sensor 2 and humidity sensor 2 contains a portion similar or identical to the pattern in the set interval of the data from temperature sensor 1 and humidity sensor 1.

In the above description, the set interval can be specified through a query.

Before searching, the processor 104 can smooth the plurality of search target time series data to some degree with a normalization filter or a mean filter, and then perform the search operation.
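As a rough illustration of this preprocessing step, the sketch below shows a centered moving-average (mean) filter and a min-max scaling function. The text does not define what the normalization filter does, so treating it as min-max scaling is an assumption, and the function names and the `width` parameter are hypothetical.

```python
from typing import List, Sequence


def mean_filter(values: Sequence[float], width: int = 3) -> List[float]:
    """Centered moving average; the window is truncated at both edges."""
    if width < 1:
        raise ValueError("width must be at least 1")
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1] (assumed meaning of 'normalization')."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(mean_filter([0, 3, 0, 3, 0]))   # [1.5, 1.0, 2.0, 1.0, 1.5]
print(min_max_normalize([2, 4, 6]))   # [0.0, 0.5, 1.0]
```

Scaling each series to a common range before matching is one way to compare patterns from sensors with different units, which is why it is paired with smoothing here.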

Meanwhile, as shown in FIG. 3, the processor 104 can assign an externally input comment to the matching interval in which the first and second matching data exist. A user can enter a comment for the matching interval through the user's own terminal or the input unit 118. The comment may be, but is not limited to, the user's interpretation of, opinion on, or note about the matching interval.

The terminal can connect to and communicate with the data search apparatus according to an embodiment of the present invention, and may be, but is not limited to, a PC, tablet, smartphone, or laptop.

The comment may include a classification tag, and the processor 104 can classify the comment according to the classification tag it contains. For example, a user can enter the comment 'Caution required if another matching interval occurs within #1minute after the first matching interval.'

Here the comment includes a classification tag such as #1minute. In embodiments of the present invention, the classification tag may be, but is not limited to, a hashtag.
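Extracting hashtag-style classification tags from a comment can be sketched as below. This assumes a tag is a `#` followed by a run of letters, digits, or underscores; the exact tag syntax is not given in the text, and `extract_tags` is a hypothetical name.

```python
import re
from typing import List

# Assumed tag syntax: '#' followed by one or more word characters.
TAG_PATTERN = re.compile(r"#(\w+)")


def extract_tags(comment: str) -> List[str]:
    """Return the classification tags found in a comment, without the '#'."""
    return TAG_PATTERN.findall(comment)


comment = ("Caution required if another matching interval occurs "
           "within #1minute after the first matching interval.")
print(extract_tags(comment))  # ['1minute']
```

Because `\w` is Unicode-aware for Python strings, the same pattern also picks up tags written in Korean, such as `#1분`.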

As shown in FIG. 4, the processor 104 can generate a classification tag list sorted by classification tag and store it in the recording medium RM.

That is, the processor 104 can generate a comment list for the comments, associate it with a classification tag, and generate a classification tag list for the classification tags. A comment list can be generated for each classification tag and can contain comment-related information along with the comments.

For example, comments 1 to 3 may include the classification tag #ABCD, and comments 4 to 6 may include the classification tag #WXYZ.

The comment-related information may include, but is not limited to, the comment's title, the comment author's ID, the time of writing, the start and end positions of the matching interval to which the comment is assigned, and the maximum, minimum, and average values of the data in that matching interval.
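One possible in-memory shape for the classification tag list and its comment-related information is sketched below. The field names, the `CommentEntry` record, and the use of a dictionary keyed by tag are all assumptions made for illustration; the text only lists which pieces of information may be kept.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Sequence


@dataclass
class CommentEntry:
    """A comment plus the comment-related information listed in the text."""
    title: str
    author_id: str
    created_at: str   # time of writing, e.g. an ISO 8601 string
    start: int        # start position of the matching interval
    end: int          # end position (exclusive)
    text: str
    max_value: float
    min_value: float
    mean_value: float


def make_entry(title: str, author_id: str, created_at: str,
               start: int, end: int, text: str,
               series: Sequence[float]) -> CommentEntry:
    """Build an entry, computing max/min/mean over the matching interval."""
    if not 0 <= start < end <= len(series):
        raise ValueError("invalid matching interval")
    window = series[start:end]
    return CommentEntry(title, author_id, created_at, start, end, text,
                        max(window), min(window), mean(window))


def build_tag_list(entries: List[CommentEntry]) -> Dict[str, List[CommentEntry]]:
    """Group comments by classification tag, with tags in sorted order."""
    by_tag: Dict[str, List[CommentEntry]] = defaultdict(list)
    for entry in entries:
        for tag in re.findall(r"#(\w+)", entry.text):
            by_tag[tag].append(entry)
    return dict(sorted(by_tag.items()))


series = [1.0, 4.0, 2.0, 8.0, 6.0]
c1 = make_entry("comment 1", "user-a", "2018-01-30T09:00:00", 0, 3,
                "Sudden rise #ABCD", series)
c4 = make_entry("comment 4", "user-b", "2018-01-30T09:05:00", 2, 5,
                "Slow drift #WXYZ", series)
tag_list = build_tag_list([c4, c1])
print(list(tag_list))  # ['ABCD', 'WXYZ']
```

Each tag maps to its own comment list, which mirrors the per-tag comment lists described for FIG. 4.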

In FIG. 4, comments 1 to 6 may consist of characters or symbols, or may be codes assigned to each of comments 1 to 6.

Meanwhile, the processor 104 can receive, from one or more user terminals, a score for at least one of the set interval, the classification tag, and the comment, and assign it.

That is, a given user can use the input unit 118 or a terminal to score the appropriateness of set intervals, classification tags, and comments made by other users. The score for a set interval may be a score for the query that contains information about the set interval.

The processor 104 can also count the number of times a comment is cited when it is cited in other comments. For example, if comment 3 reads 'Caution required, per comment 1, if another matching interval occurs within #1minute after the first matching interval,' then comment 3 cites comment 1 once.
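Citation counting of this kind can be sketched as follows. The sketch assumes that a citation appears in the text as the literal phrase "comment N" and that each citing comment counts at most once per cited comment; the text does not say how citations are detected, and `count_citations` is a hypothetical name.

```python
import re
from collections import Counter
from typing import Dict

# Assumed citation syntax: the word "comment" followed by a number.
CITATION_PATTERN = re.compile(r"\bcomment\s+(\d+)\b", re.IGNORECASE)


def count_citations(comments: Dict[int, str]) -> Counter:
    """Map each comment ID to the number of other comments that cite it."""
    counts: Counter = Counter()
    for comment_id, text in comments.items():
        cited = {int(n) for n in CITATION_PATTERN.findall(text)}
        # Ignore self-references and references to unknown comments.
        cited = {i for i in cited if i in comments and i != comment_id}
        counts.update(cited)
    return counts


comments = {
    1: "Watch for a second spike within #1minute",
    3: ("Caution required, per comment 1, if another matching interval "
        "occurs within #1minute after the first matching interval"),
    4: "See comment 1 and comment 3; comment 1 applies here too",
}
counts = count_citations(comments)
print(counts[1], counts[3], counts[4])  # 2 1 0
```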

The processor 104 can then calculate a price for the comment according to the scores and the comment's citation count. The data search apparatus according to an embodiment of the present invention can thereby provide appropriate compensation to the user who wrote the comment.

Although the price above is calculated from the scores and the comment's citation count, a price may also be calculated from the score for an appropriately chosen set interval, so that compensation is provided to the user who set that interval.
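The text states only that the price depends on the scores and the citation count, without giving a formula. Purely as an illustration, the sketch below assumes a linear combination of the average score and the citation count; the weights, their default values, and the name `calculate_price` are all hypothetical.

```python
from typing import Sequence


def calculate_price(scores: Sequence[float],
                    citation_count: int,
                    score_weight: float = 100.0,
                    citation_weight: float = 50.0) -> float:
    """Illustrative price: weighted average score plus weighted citations.

    This linear form is an assumption; the source does not define how
    scores and citation counts are combined.
    """
    average_score = sum(scores) / len(scores) if scores else 0.0
    return score_weight * average_score + citation_weight * citation_count


# Three users scored the comment 4, 5, and 3; it was cited twice.
print(calculate_price([4, 5, 3], citation_count=2))  # 500.0
```

The same function could be applied to a set interval by passing the scores given to its query and a citation count of zero.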

Meanwhile, the processor 104 may generate a data vector composed of a first feature of first matching data and a second feature of second matching data. FIGS. 5 and 6 illustrate processes for generating a data vector.

The data vector generation process shown in FIG. 5 is described first, followed by that of FIG. 6.

In FIG. 5, Data#1 to Data#4 are first to fourth matching data, which differ from one another. The data generation periods of the respective search-target time series data may differ.

For example, Data#1 may be generated every unit time (for example, every 1 second), Data#2 every 2 seconds, and Data#3 and Data#4 every 4 seconds and 6 seconds, respectively. Although the unit time is described here as 1 second, it is not limited thereto and may vary from case to case.

Here, the first to fourth features of the first to fourth matching data may be the data values of the respective matching data. The data vector may be composed of these data values.

A data vector may be generated for each unit time. As described above, however, the data generation periods of the matching data may differ, so at a given unit time the first matching data may have a data value while the second matching data does not.

Because a data vector cannot be formed when a data value of the matching data is missing, the missing data value may be virtually generated according to a set method.

For example, as shown in FIG. 5, when there is no data value between the n-th data value and the (n+T)-th data value of a given matching data (n is a natural number of 1 or more, and T is the period), the processor 104 may virtually generate copies of the n-th data value to fill the gap between the n-th data value and the (n+T)-th data value.

That is, for the second matching data, the n-th data value is #1 and T is 2, so the (n+2)-th (= n+T) data value is #2, and there is no data value between the n-th and (n+2)-th data values.

Accordingly, the processor 104 may virtually fill the gap between the n-th data value and the (n+2)-th (= n+T) data value with #1 (= the n-th data value). The processor 104 may fill in the data values of the remaining third and fourth matching data in the same way.

As shown in FIG. 5, there may be intervals in which no data vector can be generated. When the very first data vector is generated, Data#2 has no data at the current time point, so a value must be taken from past data, but no past data may exist.

The same situation arises for Data#3 in the first and second data vectors, so the first and second data vectors cannot be completely filled. Accordingly, the first and second data vectors may not be generated.

Alternatively, an empty data value may be filled with a virtually generated average of the n-th and (n+T)-th data values, and data values may be virtually generated by various other methods.

In this way, a data vector can be generated for each unit time.
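The gap-filling procedure of FIG. 5 can be illustrated with a short Python sketch. It assumes, purely for illustration, that each matching data stream is a mapping from an integer unit-time index to a value; the function name `build_data_vectors` and the sample values are hypothetical. The sketch repeats the most recent value of each stream and skips time steps at which some stream has no past value yet, as described above for the first data vectors.

```python
from typing import Dict, List, Optional, Tuple


def build_data_vectors(
    streams: List[Dict[int, float]], num_steps: int
) -> List[Tuple[int, Tuple[float, ...]]]:
    """Build one data vector per unit time, repeating the last known value."""
    last_values: List[Optional[float]] = [None] * len(streams)
    vectors = []
    for t in range(num_steps):
        for i, stream in enumerate(streams):
            if t in stream:
                last_values[i] = stream[t]  # a real sample arrived at time t
        # Skip this time step if any stream still has no past value to reuse.
        if all(value is not None for value in last_values):
            vectors.append((t, tuple(last_values)))
    return vectors


# Hypothetical sample data.
data1 = {0: 1.0, 1: 1.1, 2: 1.2, 3: 1.3}  # sampled every unit time
data2 = {1: 10.0, 3: 20.0}                # sampled every 2 unit times
vectors = build_data_vectors([data1, data2], num_steps=4)
print(vectors)
```

Replacing the repeated value with the average of the neighbouring real samples would give the alternative filling method mentioned in the text; that variant needs the next sample as well, so it cannot be computed in a single forward pass like this one.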

Meanwhile, the processor 104 may generate data vectors by sampling a portion of all the data in the matching interval. That is, the first feature and the second feature may be data values sampled from the first matching data and the second matching data, respectively, at the same time point. By adjusting the sampling rate, reliable data vectors can be generated while reducing the computational load on the processor 104.

Next, a method of generating a data vector is described with reference to FIG. 6.

As shown in FIG. 6, the mutually different first and second matching data may be displayed on the display unit 114 as connected straight lines. Here, the first feature of the first matching data and the second feature of the second matching data may include the slopes of the segmented first matching data and second matching data, respectively, within the same interval.

The processor 104 may perform segmentation on the time series data, the search-target time series data, or the time series data of the matching data, and may use a piecewise linear segmentation technique to do so. The segmentation technique is not limited to piecewise linear segmentation; various segmentation techniques may be applied to the present invention.

As a result, the time series data, the search-target time series data, or the matching data may be composed of straight-line segments.

As shown in FIG. 6, a data vector may be generated for each interval, so it must be determined whether the intervals are set with reference to the first matching data or the second matching data.

In the embodiment of the present invention, the processor 104 may set the intervals with reference to the matching data that has the larger number of segments. Accordingly, more data vectors can be generated than when the second matching data is used as the reference.

As shown in FIG. 6, the first matching data has more segments than the second matching data, so the first matching data may serve as the reference matching data for setting the intervals.

The processor 104 may set an interval each time the slope of the segments constituting the first matching data changes, and may generate a data vector from the slopes of the first and second matching data in each interval.

In interval A, however, the second matching data has more segments than the first matching data, which is the reference matching data. Interval A may therefore contain a plurality of segment slopes, and the processor 104 may set a representative slope that represents these segment slopes as the second feature.

In the embodiment of the present invention, the representative slope may be the average of the plurality of segment slopes, but it is not limited thereto and may be set by various methods.

In addition to the slopes, the first and second features of the data vector may include the data values at the interval boundaries. In FIG. 6, the data values at the black dots may be the data values at the interval boundaries.
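The slope-based construction of FIG. 6 can be sketched in Python as below. The sketch assumes that segmentation has already been performed and that each matching data is given as a list of (time, value) breakpoints covering the same time range; the function names and sample breakpoints are hypothetical. It uses the plain average of the overlapping segment slopes as the representative slope, which is the option named in the text, and it leaves out the boundary data values for brevity.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (time, value) breakpoint of a piecewise-linear series


def segment_slopes(points: List[Point]) -> List[Tuple[float, float, float]]:
    """Return (start, end, slope) for each straight-line segment."""
    return [
        (t0, t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    ]


def representative_slope(points: List[Point], start: float, end: float) -> float:
    """Average the slopes of all segments that overlap the interval [start, end]."""
    slopes = [s for t0, t1, s in segment_slopes(points) if t0 < end and t1 > start]
    return sum(slopes) / len(slopes)


def slope_vectors(first: List[Point], second: List[Point]) -> List[Tuple[float, float]]:
    """Build one (first slope, second slope) vector per reference interval."""
    # The series with more segments is the reference that defines the intervals.
    reference = first if len(first) >= len(second) else second
    return [
        (representative_slope(first, start, end),
         representative_slope(second, start, end))
        for (start, _), (end, _) in zip(reference, reference[1:])
    ]


# Hypothetical breakpoints: the first series has 4 segments, the second has 3.
first = [(0, 0), (2, 2), (4, 2), (6, 5), (8, 5)]
second = [(0, 0), (1, 2), (2, 2), (8, 8)]
vectors = slope_vectors(first, second)
print(vectors)
```

In the first interval of this example the second series has two segments (slopes 2 and 0), so its feature is their average, mirroring the situation described for interval A.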

The generation of data vectors is not limited to the methods shown in FIGS. 5 and 6; data vectors may be generated by various methods.

Meanwhile, the machine learning model is described in detail below with reference to FIGS. 7 to 9.

As described above with reference to FIGS. 5 and 6, data vectors are generated in a matching interval, and a comment containing a classification tag is assigned to the matching interval. As shown in FIG. 7, a data vector can therefore be associated with a classification tag.

The machine learning model of FIG. 7 uses decision tree learning. That is, the relationships among Data#1, Data#2, and Data#3 of the data vectors associated with the classification tags #ABCD, #UYTR, and #NBVC may be established as a decision tree.

For example, as shown in FIG. 7, a data vector associated with the classification tag #ABCD may satisfy Data#1 < 0.4, Data#2 < 30, and Data#3 > 150. FIG. 7 shows further relationships among Data#1, Data#2, and Data#3 associated with the classification tag #ABCD, but their description is omitted.

Likewise, a data vector associated with the classification tag #UYTR may satisfy Data#1 > 0.8, Data#3 > 100, and Data#3 = 180. FIG. 7 shows further relationships among Data#1, Data#2, and Data#3 associated with the classification tag #UYTR, but their description is omitted.

Similarly, a data vector associated with the classification tag #NBVC may satisfy Data#1 = 1.2, Data#1 < 10, and Data#2 > 50. FIG. 7 shows further relationships among Data#1, Data#2, and Data#3 associated with the classification tag #NBVC, but their description is omitted.
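To make the decision-tree reading concrete, the one path quoted above for the classification tag #ABCD can be written as a simple rule check in Python. This is a hand-written illustration of that single path only, not a learned tree; the function name and the dictionary representation of a data vector are assumptions.

```python
from typing import Dict


def matches_abcd_path(vector: Dict[str, float]) -> bool:
    """Check the #ABCD path quoted from FIG. 7: Data#1 < 0.4, Data#2 < 30, Data#3 > 150."""
    return (
        vector["Data#1"] < 0.4
        and vector["Data#2"] < 30
        and vector["Data#3"] > 150
    )


# Hypothetical data vectors.
inside = {"Data#1": 0.2, "Data#2": 12.0, "Data#3": 180.0}
outside = {"Data#1": 0.9, "Data#2": 12.0, "Data#3": 180.0}
print(matches_abcd_path(inside), matches_abcd_path(outside))
```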

Meanwhile, the machine learning model of FIG. 8 uses clustering in a vector space. That is, the data vectors associated with one classification tag may lie closer to one another in the vector space than to the data vectors associated with another classification tag, and can therefore be clustered into one group.
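One simple way to exploit this grouping is a nearest-centroid rule, sketched below in Python. The document does not name a specific clustering algorithm, so the nearest-centroid choice, the Euclidean distance, the function names, and the sample vectors are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def tag_centroids(labeled: List[Tuple[str, Vector]]) -> Dict[str, Vector]:
    """Compute the mean data vector (centroid) of each classification tag."""
    groups: Dict[str, List[Vector]] = defaultdict(list)
    for tag, vector in labeled:
        groups[tag].append(vector)
    return {
        tag: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for tag, vectors in groups.items()
    }


def nearest_tag(centroids: Dict[str, Vector], vector: Vector) -> str:
    """Return the tag whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda tag: math.dist(centroids[tag], vector))


# Hypothetical tagged data vectors.
labeled = [
    ("#ABCD", (0.0, 0.0)),
    ("#ABCD", (2.0, 0.0)),
    ("#UYTR", (10.0, 10.0)),
    ("#UYTR", (12.0, 10.0)),
]
centroids = tag_centroids(labeled)
print(nearest_tag(centroids, (1.5, 0.5)))
```

A new, uncommented data vector is assigned the tag of the closest group, which corresponds to classifying analysis-target data by classification tag without assigning a comment.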

Further, the machine learning model of FIG. 9 may be formed from the relationships among the components of sequentially generated data vectors belonging to one classification tag. The data vectors are formed sequentially, and machine learning may be performed on the state changes between the components of two consecutive data vectors.

For example, as shown in FIG. 9, the data vectors belonging to the classification tag #ABCD may be (D11, D21, D31), (D12, D22, D32), (D13, D23, D33), (D14, D24, D34), (D15, D25, D35), and (D16, D26, D36).

The state changes between D11 and D12, D12 and D13, D13 and D14, D14 and D15, and D15 and D16 may be calculated.

The same state change calculation may be performed between D21 and D22, D22 and D23, D23 and D24, D24 and D25, and D25 and D26, and likewise between D31 and D32, D32 and D33, D33 and D34, D34 and D35, and D35 and D36.

The criteria for a state change may be set in various ways depending on the case. For example, if the difference between two consecutive components is greater than 20, the state may be defined as having changed from State#1 to State#2. If the ratio between two consecutive components is greater than 1, the state may be defined as having changed from State#2 to State#3. Criteria for a state change from State#2 to State#1 and from State#3 to State#1 may likewise be set.

Based on these state change criteria, the ratio of the count of each state change to the total count of state changes may be calculated, and these ratios may serve as the machine learning model.
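The state-change ratios can be sketched in Python as follows. The two forward criteria (difference greater than 20, ratio greater than 1) come from the example above; the rule for returning to State#1 (a decrease in value) is not given in this document and is a purely hypothetical placeholder, as are the function names and the sample sequence.

```python
from collections import Counter
from typing import Dict, List, Tuple


def next_state(state: str, previous: float, current: float) -> str:
    """Apply the example state change criteria to one pair of consecutive components."""
    if state == "State#1" and current - previous > 20:
        return "State#2"
    if state == "State#2" and previous != 0 and current / previous > 1:
        return "State#3"
    # Hypothetical return rule (not specified in the text): any decrease resets.
    if state in ("State#2", "State#3") and current < previous:
        return "State#1"
    return state


def transition_ratios(
    values: List[float], start: str = "State#1"
) -> Dict[Tuple[str, str], float]:
    """Return the share of each state change among all observed state changes."""
    counts: Counter = Counter()
    state = start
    for previous, current in zip(values, values[1:]):
        new_state = next_state(state, previous, current)
        if new_state != state:
            counts[(state, new_state)] += 1
        state = new_state
    total = sum(counts.values())
    return {change: count / total for change, count in counts.items()} if total else {}


# Hypothetical component sequence, e.g. D11..D16 of consecutive data vectors.
ratios = transition_ratios([10, 40, 50, 30, 60, 70])
print(ratios)
```

The resulting mapping from state change to ratio plays the role of the model; the same calculation would be repeated for each component position of the data vectors.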

The processor 104 may apply first analysis-target time series data and second analysis-target time series data to a machine learning model based on the data vectors, thereby classifying the first and second analysis-target time series data by classification tag.

That is, as shown in FIGS. 7 to 9, various machine learning models may be generated according to the classification tags, and first and second analysis-target time series data may be newly input to the data searching apparatus according to the embodiment of the present invention.

The processor 104 may apply the newly input first and second analysis-target time series data to the machine learning model to classify them by classification tag.

That is, the processor 104 may apply the first and second analysis-target time series data to the machine learning model without deriving matching data or assigning comments, whereby the first and second analysis-target time series data may be classified under a specific classification tag.

Next, a data searching apparatus according to another embodiment of the present invention is described with reference to the drawings.

The data searching apparatus according to another embodiment of the present invention includes a memory 106 that stores time series data and a processor 104 capable of accessing the memory 106.

The processor 104 may assign an externally input comment to a partial interval or a time point of the time series data, and may classify the comment according to a classification tag included in the comment.

FIGS. 10 to 12 show comments assigned to intervals selected by a user.

As shown in FIG. 10, a user may enter a comment on a self-selected interval of the time series data through the input unit 118 or a terminal. The processor 104 may then assign the entered comment to the interval selected by the user.

Here, the comment may include a classification tag, and the processor 104 may generate a classification tag list. The classification tag list has been described in detail above, so its description is omitted.

In FIG. 10 the comment is assigned to a selected interval, but as shown in FIG. 11 a comment may instead be assigned to a selected time point. The comment is again classified according to the classification tag it contains, and a classification tag list may be generated.

Furthermore, as shown in FIG. 12, a comment may be assigned to at least one of a selected interval and a selected time point; the comment is classified according to the classification tag assigned to it, and a classification tag list may be generated.

That is, as shown in FIGS. 10 to 12, the processor 104 may generate a comment list for the comments, associate it with the classification tags, and generate a classification tag list for the classification tags.
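The comment list and classification tag list can be sketched as a small index in Python. The sketch assumes, for illustration only, that a classification tag is written inside the comment text as "#" followed by a letter and further word characters (so that a placeholder such as "#1 minute" is not treated as a tag); the function name and the sample comments are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumed tag syntax: '#' followed by a letter, then letters, digits, or underscores.
TAG_PATTERN = re.compile(r"#[A-Za-z]\w*")


def build_tag_index(comments: List[str]) -> Dict[str, List[str]]:
    """Map each classification tag to the list of comments that contain it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for comment in comments:
        # dict.fromkeys removes duplicate tags while keeping their order.
        for tag in dict.fromkeys(TAG_PATTERN.findall(comment)):
            index[tag].append(comment)
    return dict(index)


# Hypothetical comments.
comments = [
    "Bearing wear suspected #ABCD",
    "Check the sensor within #1 minute #UYTR",
    "Repeat of the earlier fault #ABCD",
]
tag_index = build_tag_index(comments)
tag_list = sorted(tag_index)  # the classification tag list
print(tag_list)
```

Here `tag_index` holds the per-tag comment lists and `tag_list` is the classification tag list derived from them.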

A user may select a specific interval or time point of specific time series data by dragging with the terminal's mouse, stylus, or touch screen, and may then write and save a comment on that interval.

The processor 104 may receive, from one or more user terminals, a score for at least one of the classification tag and the comment and assign it, and may count the citations of a comment when it is cited in other comments. The processor 104 may then calculate a price for the comment according to the score and the comment's citation count.

This has been described above for the data searching apparatus according to the embodiment of the present invention, so its description is omitted.

Meanwhile, the processor 104 may generate a data vector composed of features of the time series data to which a comment is assigned, and may apply another time series data to a machine learning model based on the data vector to classify the other time series data by classification tag.

This has been described in detail above for the data searching apparatus according to the embodiment of the present invention, so its description is omitted.

Meanwhile, the processor 104 may apply the other time series data to the machine learning model without assigning a comment. This has been described in detail above for the data searching apparatus according to the embodiment of the present invention, so its description is omitted.

The machine learning models described above may also be rated by users, and the processor 104 may assign the resulting scores to a machine learning model; based on such scores, the processor 104 may calculate a price for the machine learning model.

The processor 104 may control the process of trading a priced machine learning model and, when a machine learning model is sold, may also control the process of compensating the user who built it.

Embodiments according to the present invention have been described above, and it will be apparent to those of ordinary skill in the art that, beyond the embodiments described, the present invention may be embodied in other specific forms without departing from its spirit or scope. The embodiments described above are therefore to be regarded as illustrative rather than restrictive, and the present invention is accordingly not limited to the above description but may be modified within the scope of the appended claims and their equivalents.

Processor (104)
Memory (106)
Display unit (114)
Input unit (118)

Claims

1. A data searching apparatus comprising:
a memory that stores time series data; and
a processor capable of accessing the memory,
wherein the processor
assigns a comment input by a user to a partial interval or a time point of the time series data, and
classifies the comment according to a classification tag included in the comment,
wherein the comment is the user's interpretation, comment, or memo regarding the partial interval or time point of the time series data, and
wherein the processor
generates a data vector composed of features of the time series data to which the comment is assigned, and
classifies another time series data by the classification tag by applying the other time series data to a machine learning model based on the data vector.

2. The data searching apparatus of claim 1,
wherein the processor
generates a comment list for the comment and associates the comment list with the classification tag, and
generates a classification tag list for the classification tag.

3. The data searching apparatus of claim 2,
wherein the processor
receives, from one or more user terminals, a score for at least one of the classification tag and the comment and assigns the score,
counts the citations of the comment when the comment is cited in another comment, and
calculates a price for the comment according to the score and the citation count of the comment.

4. (Deleted)

5. The data searching apparatus of claim 1,
wherein the processor
applies the other time series data to the machine learning model without assigning a comment.