KR20200052449A - Connected data architecture system to support artificial intelligence service and control method thereof

Info

Publication number: KR20200052449A
Application number: KR1020180131164A
Authority: KR
Inventors: 차병래
Original assignee: 제노테크주식회사
Priority date: 2018-10-30
Filing date: 2018-10-30
Publication date: 2020-05-15
Also published as: KR102302631B1

Abstract

The present invention relates to a connected data architecture system for an artificial intelligence (AI) service and a control method thereof. According to the present invention, the connected data architecture system for an AI service comprises: a data lake (10) for storing collected data; a data storage (20); cloud computing (30) for transmitting learning data to the data lake (10); and a micro storage (40), wherein the data lake (10) includes a data lake communication unit (11), a data collection unit (12), a learning request unit (13), and a data providing unit (14). According to the present invention, a customized AI service can be provided.

Description

CONNECTED DATA ARCHITECTURE SYSTEM TO SUPPORT ARTIFICIAL INTELLIGENCE SERVICE AND CONTROL METHOD THEREOF

The present invention relates to a connected data architecture system for an artificial intelligence service and a control method therefor. More specifically, it relates to a connected data architecture system for an artificial intelligence service, and a control method therefor, that manages data structurally through a software-defined network, stably receives data learned through deep learning for each interface, and provides the learned data according to the user's environment or state.

Artificial intelligence has recently become a core element of the Fourth Industrial Revolution, as the rapid development of cloud computing environments and the availability of big data have made it possible to implement deep learning and various learning policies.

Artificial intelligence is a technology that realizes human abilities such as learning, reasoning, perception, and natural-language understanding in computer programs. In other words, it is a field of computer science and information technology that studies how to enable computers to perform the thinking, learning, and self-development that human intelligence is capable of, so that computers can imitate intelligent human behavior.

Neural networks are one branch of artificial intelligence, and the learning policies of neural networks include supervised learning, unsupervised learning, reinforcement learning, and transfer learning. Learning methods in artificial intelligence include machine learning and deep learning.

Machine learning is a technology in which a computer analyzes vast amounts of data on its own to predict the future: it groups similar items among the data, recognizes the hierarchical structure of related items, and predicts future behavior on that basis.

Deep learning is a machine learning technology built on artificial neural networks (ANNs) so that a computer can learn on its own from various data, much as a person does. It classifies large amounts of data, groups items belonging to the same set, and identifies the hierarchical relationships among them.

Transfer learning uses deep learning only for feature extraction and trains another model on the extracted features; when a new model is built from an existing model, it speeds up training and improves prediction accuracy. In other words, transfer learning takes a model whose training is already complete and fine-tunes it for the desired task, which makes it a convenient way to apply deep learning.

As an example of technology using artificial intelligence, Laid-open Patent Publication No. 10-2015-0047803 (published May 6, 2015) discloses an artificial intelligence audio device comprising: a sensing unit that detects the location of the place where the device is currently situated, at least one of the temperature and humidity of that place, and the presence of a person; a first communication unit that communicates with an external server over a network; a storage unit that stores greeting data, including at least one of a plurality of situation-dependent greeting voices and a plurality of greeting texts, and music data including a plurality of pieces of music; a voice processing unit that processes and outputs voice and music; a display processing unit that processes and outputs text; and a control unit. The control unit collects weather information for the current place from the external server using the location information detected by the sensing unit. When the sensing unit detects a person at the current place, the control unit determines whether it is a predetermined greeting output time and, if so, extracts from the stored greeting data at least one of a greeting voice and a greeting text that matches the current time, at least one of the temperature and humidity detected by the sensing unit, and the weather information collected from the external server, and outputs it through at least one of the voice processing unit and the display processing unit.

Although technology using artificial intelligence, as in the above publication, is used in various fields, most of it depends on high-specification hardware, large-capacity storage, and the cloud. Accordingly, there is demand for various artificial intelligence applications that can run on edge devices in constrained environments, either independently without the help of the cloud or with minimal cloud dependence and Internet connectivity.

The present invention has been proposed to solve the above problems. Its object is to provide a connected data architecture system for an artificial intelligence service, and a control method therefor, that manages data structurally through a software-defined network, stably trains on the data for each interface using various learning policies including deep learning, receives the learned data, and provides the learned data according to the user's environment or state as an artificial intelligence service using the software-defined network.

Another object is to provide a connected data architecture system for an artificial intelligence service, and a control method therefor, that processes the learning of the data in a distributed and parallel manner so that it can be applied to low-specification, low-power devices.

A further object is to provide a connected data architecture system for an artificial intelligence service, and a control method therefor, that allows large volumes of data to be learned using cloud computing or a data lake, thereby reducing the amount of learning computation, the storage space, and the learning time.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.

To achieve the above objects, the connected data architecture system using artificial intelligence according to the present invention comprises: a data lake (10) for storing collected data generated by collecting data; a data storage (20) for receiving the collected data from the data lake (10), performing transfer learning on it, and sharing the resulting result data with the data lake (10); cloud computing (30) for receiving the collected data from the data lake (10), performing transfer learning on it, and transmitting the resulting learning data to the data lake (10); and a micro storage (40) for receiving and applying the learning data, or receiving and using the result data, through the data lake (10), and for inputting the data. The data lake (10) comprises: a data lake communication unit (11) for communicating with the data storage (20), the cloud computing (30), and the micro storage (40); a data collection unit (12) for collecting the data received from the micro storage (40) through the data lake communication unit (11) and generating collected data; a learning request unit (13) for determining which of the data storage (20) and the cloud computing (30) is to receive the collected data generated by the data collection unit (12), and requesting learning accordingly; and a data providing unit (14) for providing the micro storage (40) with the result data received from the data storage (20), or the learning data received from the cloud computing (30), following the request made via the learning request unit (13).

The data storage (20) may comprise: a data storage communication unit (21) for communicating with the data lake (10); a transfer learning unit (22) for performing transfer learning on the collected data received from the data lake (10) through the data storage communication unit (21); a learning completion unit (23) for generating result data from the transfer learning performed by the transfer learning unit (22); and a result sharing unit (24) for sharing the result data generated by the learning completion unit (23).

The cloud computing (30) preferably comprises: a cloud communication unit (31) for communicating with the data lake (10); a backup unit (32) for backing up the collected data received from the data lake (10) through the cloud communication unit (31); an AI learning unit (33) for performing transfer learning on the collected data backed up in the backup unit (32) to generate learning data; and a data delivery unit (34) for delivering the learning data generated by the AI learning unit (33) to the data lake (10).

The micro storage (40) may comprise: an information management unit (41) for storing and managing identification information that identifies the micro storage; a micro storage communication unit (42) for communicating with the data lake (10); a data input unit (43) for inputting the data to be transmitted to the data lake (10) through the micro storage communication unit (42); a data storage unit (44) for storing the result data or learning data received from the data lake (10); a data application unit (45) for applying the learning data stored in the data storage unit (44); and a data use unit (46) for using the learning data applied by the data application unit (45) or the result data stored in the data storage unit (44).

The control method according to the present invention comprises: a data collection step in which the data lake (10) collects and cumulatively stores data to generate collected data; a determination step in which the data lake (10) compares the capacity of the collected data with a preset reference capacity to decide whether to request learning from the data storage (20), which processes transfer learning of the collected data in parallel, or from the cloud computing (30), which backs up the collected data and performs transfer learning on it; a learning request step in which the data lake (10) sends the collected data to the data storage (20) with a learning request when the capacity of the collected data is at or below the preset reference capacity, and sends it to the cloud computing (30) with a learning request when the capacity exceeds the preset reference capacity; a transfer learning step in which the data storage (20) performs transfer learning on the collected data received from the data lake (10); a sharing step in which the data storage (20) generates result data from the transfer learning and shares it with the data lake (10); a backup step in which the cloud computing (30) backs up the collected data received from the data lake (10); an AI learning step in which the cloud computing (30) performs transfer learning on the backed-up collected data to generate learning data; a delivery step in which the cloud computing (30) delivers the generated learning data to the data lake (10); a providing step in which the data lake (10) receives the result data from the data storage (20) or the learning data from the cloud computing (30) and provides it to the micro storage (40); an application step in which the micro storage (40) applies the learning data when it receives the learning data; and a use step in which the micro storage (40) uses the applied learning data, or uses the result data when it receives the result data.

As described above, according to the present invention, data is managed structurally and trained stably through deep learning for each interface, and the learned data is received and provided according to the user's environment or state. As a result, artificial intelligence can operate independently on low-specification devices such as healthcare devices, wearable devices, and IoT devices, and AI services customized to the user can be provided.

In addition, the interface between the data lake, which collects the data to be learned, and the data storage allows learning to be processed in parallel, securing sharing and scalability. This reduces the volume of data consolidated in any one place and enables optimal learning and data classification, so that AI services can be provided efficiently.

In addition, because the interface between the data lake, which collects the data to be learned, and cloud computing provides scalability, the learning time can be shortened even when the volume of data to be learned is large.

FIG. 1 shows a connected data architecture system for an artificial intelligence service according to an embodiment of the present invention.
FIG. 2 shows the data lake of the connected data architecture system for an artificial intelligence service according to an embodiment of the present invention.
FIG. 3 shows the data storage of the connected data architecture system for an artificial intelligence service according to an embodiment of the present invention.
FIG. 4 shows the cloud computing of the connected data architecture system for an artificial intelligence service according to an embodiment of the present invention.
FIG. 5 shows the micro storage of the connected data architecture system for an artificial intelligence service according to an embodiment of the present invention.
FIG. 6 shows a control method of the connected data architecture system for an artificial intelligence service according to an embodiment of the present invention.

Hereinafter, the connected data architecture system for an artificial intelligence service according to the present invention, and the control method therefor, will be described in detail with reference to the accompanying drawings.

FIG. 1 shows a connected data architecture system for an artificial intelligence service according to an embodiment of the present invention, FIG. 2 shows the data lake of that system, and FIG. 3 shows the data storage of that system.

FIG. 4 shows the cloud computing of the connected data architecture system for an artificial intelligence service according to an embodiment of the present invention, FIG. 5 shows the micro storage of that system, and FIG. 6 shows the control method of that system.

In assigning reference numerals to the components in the drawings, the same components are given the same numerals wherever possible, even when they appear in different drawings, and detailed descriptions of well-known functions and configurations that might unnecessarily obscure the gist of the present invention are omitted. Directional terms such as 'top', 'bottom', 'front', 'back', 'leading end', 'forward', and 'trailing end' are used with reference to the orientation of the drawing(s) described. Because the components of embodiments of the present invention can be positioned in various orientations, the directional terms are used for illustration and are not limiting.

As shown in FIG. 1, the connected data architecture system for an artificial intelligence service according to a preferred embodiment of the present invention comprises: a data lake (10) for storing collected data generated by collecting data; a data storage (20) for receiving the collected data from the data lake (10), performing transfer learning on it, and sharing the resulting result data with the data lake (10); cloud computing (30) for receiving the collected data from the data lake (10), performing transfer learning on it, and transmitting the resulting learning data to the data lake (10); and a micro storage (40) for receiving and applying the learning data, or receiving and using the result data, through the data lake (10).

As shown in FIG. 2, the data lake (10) may comprise: a data lake communication unit (11) for communicating with the data storage (20), the cloud computing (30), and the micro storage (40); a data collection unit (12) for collecting the data received from the micro storage (40) through the data lake communication unit (11) and generating collected data; a learning request unit (13) for determining which of the data storage (20) and the cloud computing (30) is to receive the collected data generated by the data collection unit (12), and requesting learning accordingly; and a data providing unit (14) for providing the micro storage (40) with the result data received from the data storage (20), or the learning data received from the cloud computing (30), following the request made via the learning request unit (13).

To build an enterprise-wide data lake that can capture, process, and analyze large volumes of data and provide it to users or to the data-consuming micro storage (40), the data lake (10) may comprise a physical layer, a distributed storage layer, a security layer, a data acquisition layer, a messaging layer, an ingestion layer, a Lambda architecture, and a serving layer.

In addition, the data lake (10) provides cloud bursting with the cloud computing (30) and cloud spanning with the micro storage (40).

Cloud bursting is an application deployment model used in hybrid cloud environments: when demand exceeds the capacity of the data lake (10), the excess is automatically transferred to a public cloud so that the application can continue to run.
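The overflow behavior of cloud bursting can be illustrated with a small sketch. The source does not specify an admission policy, so the greedy in-order placement below, the function name `burst`, and the use of abstract job sizes are all assumptions made for illustration only.

```python
from typing import List, Tuple


def burst(jobs: List[int], local_capacity: int) -> Tuple[List[int], List[int]]:
    """Split job sizes between local resources and a public cloud.

    Hypothetical policy: jobs are admitted locally, in order, while they fit;
    any job that would push usage past local_capacity overflows ("bursts")
    to the public cloud so that work can continue.
    """
    local: List[int] = []
    public: List[int] = []
    used = 0
    for size in jobs:
        if used + size <= local_capacity:
            local.append(size)
            used += size
        else:
            public.append(size)
    return local, public
```

A real deployment would more likely burst on measured load than on declared job sizes; the sketch only shows the "stay local until capacity is exceeded" rule.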

Cloud spanning is a delivery model in which application components that require large amounts of computing resources are deployed simultaneously across multiple cloud environments, allowing multiple computers to be connected and to cooperate with one another.

The data lake communication unit (11) communicates with the data storage (20), the cloud computing (30), and the micro storage (40) over a wireless network. Wi-Fi, the Internet, or the like may be used as the wireless network.

The data collection unit (12) receives data from the micro storage (40), collects it while storing it cumulatively, and generates the collected data.

Here, the data collection unit (12) may receive the data of the micro storage (40) periodically.
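The accumulation performed by the data collection unit (12) can be sketched as follows. This is a minimal illustration under assumptions: the class name `DataCollector`, the use of raw `bytes` payloads, and measuring capacity as total bytes are not taken from the source, and the periodic schedule itself is left to the caller.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataCollector:
    """Hypothetical stand-in for the data collection unit (12)."""

    records: List[bytes] = field(default_factory=list)

    def receive(self, payload: bytes) -> None:
        # Cumulatively store each payload received from a micro storage (40).
        self.records.append(payload)

    def collected_size(self) -> int:
        # Capacity of the collected data, measured here as total bytes.
        return sum(len(record) for record in self.records)
```

Calling `receive` once per reporting interval for each micro storage would model the periodic reception described above.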

The learning request unit (13) sends the collected data generated by the data collection unit (12) to either the data storage (20) or the cloud computing (30) with a learning request, and may choose between them according to the capacity of the collected data.

That is, the learning request unit (13) determines the target of the learning request, either the data storage (20) or the cloud computing (30), according to the capacity of the collected data, and sends the collected data to the chosen target with a learning request.

To decide between the data storage (20) and the cloud computing (30), the learning request unit (13) compares the capacity of the collected data with a preset reference capacity.

When the capacity of the collected data is at or below the preset reference capacity, the collected data is sent to the data storage (20) with a learning request; when the capacity of the collected data exceeds the preset reference capacity, the collected data is sent to the cloud computing (30) with a learning request.
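The comparison rule above reduces to a single threshold test. The sketch below is illustrative only: the names `LearningTarget` and `choose_learning_target` and the choice of bytes as the unit are assumptions, since the source states only that the collected-data capacity is compared with a preset reference capacity.

```python
from enum import Enum


class LearningTarget(Enum):
    DATA_STORAGE = "data storage (20)"
    CLOUD_COMPUTING = "cloud computing (30)"


def choose_learning_target(collected_bytes: int, reference_bytes: int) -> LearningTarget:
    """Pick where the learning request unit (13) sends the collected data."""
    if collected_bytes < 0 or reference_bytes < 0:
        raise ValueError("capacities must be non-negative")
    # At or below the reference capacity: request learning from the data storage.
    if collected_bytes <= reference_bytes:
        return LearningTarget.DATA_STORAGE
    # Above the reference capacity: request learning from cloud computing.
    return LearningTarget.CLOUD_COMPUTING
```

Note that the boundary case, where the capacity equals the reference capacity, goes to the data storage (20), matching the "at or below" wording.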

Because the learning request unit (13) sends the collected data, with a learning request, to whichever of the data storage (20) and the cloud computing (30) suits the situation, transfer learning is performed efficiently and a rapid AI service can be provided.

When the learning request unit (13) has sent the collected data to the data storage (20), the data providing unit (14) receives result data from the data storage (20); when the learning request unit (13) has sent the collected data to the cloud computing (30), the data providing unit (14) receives learning data from the cloud computing (30).

The data providing unit (14) provides the result data received from the data storage (20), or the learning data received from the cloud computing (30), to the corresponding micro storage (40).

Here, the data providing unit (14) preferably stores, in advance, identification information of the micro storages (40) according to the type of result data or learning data, so that it can identify the micro storage (40) corresponding to the type of the result data or learning data and provide the corresponding data to it.

The data providing unit (14) may also provide the result data or learning data to the micro storage (40) that requested it.
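The two delivery modes of the data providing unit (14), lookup by data type and delivery to a requester, can be sketched as a small selection function. The function name `select_recipients`, the dictionary-based registry, and the string identifiers are hypothetical; the source says only that identification information is stored in advance per data type.

```python
from typing import Dict, List, Optional


def select_recipients(
    data_type: str,
    registry: Dict[str, List[str]],
    requester_id: Optional[str] = None,
) -> List[str]:
    """Return the IDs of the micro storages (40) that should receive the data."""
    # A micro storage that explicitly requested the data is served directly.
    if requester_id is not None:
        return [requester_id]
    # Otherwise use the identification info registered in advance for this type.
    return list(registry.get(data_type, []))
```

For example, a registry such as `{"heart_rate": ["ms-01", "ms-02"]}` would route heart-rate results to both registered devices unless a specific requester is given.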

As shown in FIG. 3, the data storage (20) may comprise: a data storage communication unit (21) for communicating with the data lake (10); a transfer learning unit (22) for performing transfer learning on the collected data received from the data lake (10) through the data storage communication unit (21); a learning completion unit (23) for generating result data from the transfer learning performed by the transfer learning unit (22); and a result sharing unit (24) for sharing the result data generated by the learning completion unit (23).

The data storage communication unit (21) may use a wireless network to communicate with the data lake (10). In other words, it may communicate over the same wireless network as the data lake (10).

The transfer learning unit (22) supports transfer learning in order to provide a distributed AI service using FPGA or GPU resources.

Transfer learning is a branch of machine learning that transfers a previously trained model to a new domain without requiring excessive computing power. In transfer learning, the underlying distribution of the embedding vectors in the new domain differs from that of the original domain; because the existing model and domain knowledge can be reused, repeated training on the old data set can be avoided.

The transfer learning unit (22) performs transfer learning on the collected data received from the data lake (10) using techniques such as a fixed feature extractor, fine-tuning, or a pretrained model.
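As a rough illustration of the fixed feature extractor approach named above, the sketch below freezes a stand-in "pretrained" feature function and trains only a new linear head on new-domain data. Everything here is an assumption for illustration: `pretrained_features`, `train_head`, `predict`, the learning rate, and the epoch count are hypothetical, and a real system would use a pretrained neural network rather than this toy function.

```python
def pretrained_features(x: float) -> list[float]:
    # Stand-in for a frozen, pretrained feature extractor (assumption).
    # Its parameters are never updated during transfer learning.
    return [x, x * x]


def train_head(samples, labels, epochs=500, lr=0.05):
    """Train only a new linear head on top of the frozen features (SGD)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = pretrained_features(x)
            err = (w[0] * f[0] + w[1] * f[1] + b) - y
            # Gradient step on the head weights only; the extractor is fixed.
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b


def predict(w, b, x: float) -> float:
    f = pretrained_features(x)
    return w[0] * f[0] + w[1] * f[1] + b
```

Because only the small head is trained, the cost of adapting to the new data stays low, which matches the motivation given for transfer learning. Fine-tuning would differ in that the extractor's own parameters would also be updated, typically with a smaller learning rate.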

The learning completion unit (23) generates the result data once the transfer learning unit (22) has completed transfer learning on the collected data.

The result data is the outcome of completed transfer learning on the collected data, and is preferably an AI service. The result data can therefore be used directly by the micro storage (40).

The result sharing unit (24) shares the result data with the data lake (10), and the result data may also be shared with and used by other data lakes.

Providing the data storage (20) makes it possible to offer the same level of collaboration and scalability as the data lake (10). For the AI service, the data storage (20) inherits this collaboration and scalability by sharing physical network, storage, and computing resources with the data lake (10), so that transfer learning can be performed efficiently.

Meanwhile, as shown in FIG. 4, the cloud computing (30) preferably includes a cloud communication unit (31) for communicating with the data lake (10); a backup unit (32) for backing up the collected data received from the data lake (10) through the cloud communication unit (31); an AI learning unit (33) for generating learning data by performing transfer learning on the collected data backed up in the backup unit (32); and a data delivery unit (34) for delivering the learning data generated by the AI learning unit (33) to the data lake (10).

The cloud communication unit (31) communicates with the data lake (10) over a wireless network such as Wi-Fi or the Internet.

The backup unit (32) receives the collected data from the data lake (10) through the cloud communication unit (31) and backs it up.

Because the backup unit (32) backs up the collected data of the data lake (10), later AI learning is not affected even if the original is damaged or deleted because of insufficient capacity or an error in the data lake (10).

The AI learning unit (33) generates learning data by performing transfer learning on the backed-up collected data.

The learning data is the product of transfer learning on the collected data; once it is later delivered to the micro storage (40), it is applied there and can then be used.

Here, to perform transfer learning on the collected data, the AI learning unit (33) supports Fast Training on large-volume training data, using the computing resources of a public cloud together with AI based on a GPU Cluster or GPU Cloud.

Because the AI learning unit (33) supports Fast Training, transfer learning on the collected data can be carried out quickly, and learning errors can be prevented as far as possible even when the collected data is large.

The data delivery unit (34) delivers the learning data generated by the AI learning unit (33) to the data lake (10).

With the cloud computing (30) in place, the limits of the physical resources of the data lake (10) are overcome by the storage scalability of the public cloud, and the computing resources of the public cloud shorten the time needed to learn the collected data.

Meanwhile, rather than having a developer, user, or service provider choose at random or by intuition, the algorithm for applying transfer learning in the data storage (20) and the cloud computing (30) measures the correlation between previously learned collected data and newly received collected data. This makes it possible to assess whether transfer learning is appropriate and to use the time and computing resources for transfer learning efficiently.

Correlation measurement is divided into parametric and non-parametric methods. For parametric correlation, the Pearson product-moment correlation coefficient is used; for non-parametric correlation, Kendall's tau or Spearman's rank-order correlation coefficient may be used.

The correlation coefficient resulting from the measurement takes a value between -1 and +1, where 0 means there is no correlation. A value close to -1 or +1 indicates a strong relationship; the + sign indicates a positive (same-direction) relationship and the - sign a negative (inverse) relationship.
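The three coefficients named above can be computed with a few lines of standard-library Python. The sketch below is illustrative only: the function names are hypothetical, the Kendall variant shown is tau-a (which assumes no tied values), and the specification does not say how the old and new collected data are paired into the two sequences `x` and `y`.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation (parametric).

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)
```

All three functions return values in the range -1 to +1, consistent with the interpretation given above. Pearson measures linear association, while the two rank-based coefficients only require a monotonic relationship.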

As shown in FIG. 5, the micro storage (40) may include an information management unit (41) for storing and managing identification information that identifies the micro storage; a micro storage communication unit (42) for communicating with the data lake (10); a data input unit (43) for inputting data to be transmitted to the data lake (10) through the micro storage communication unit (42); a data storage unit (44) for storing the result data or learning data received from the data lake (10); a data application unit (45) for applying the learning data stored in the data storage unit (44); and a data use unit (46) for using the learning data applied by the data application unit (45) or the result data stored in the data storage unit (44).

The micro storage (40) is a low-spec, low-power device such as a healthcare device, a wearable device, or an IoT device.

The information management unit (41) generates, stores, and manages identification information so that the micro storage (40) can be distinguished from other micro storages.

The identification information may include an identification number for identifying the micro storage (40) and the type of the micro storage (40), and may further include information about the user of the micro storage (40).
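The identification information described above maps naturally onto a small record type. The following is a minimal sketch; the field names, the use of a random UUID as the identification number, and the helper `new_identification` are assumptions made for illustration and are not specified in the source.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdentificationInfo:
    storage_id: str                  # identification number of the micro storage
    storage_type: str                # e.g. "wearable", "iot" (example values)
    user_info: Optional[str] = None  # optional information about the user


def new_identification(storage_type: str,
                       user_info: Optional[str] = None) -> IdentificationInfo:
    # Assumption: a random UUID is unique enough to tell one
    # micro storage apart from another.
    return IdentificationInfo(uuid.uuid4().hex, storage_type, user_info)
```

Making the record immutable (`frozen=True`) reflects that the identification of a device is expected to stay stable once generated.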

The micro storage communication unit (42) may use a wireless network such as the Internet or Wi-Fi to communicate with the data lake (10).

The data input unit (43) takes in data for learning and transmits it to the data lake (10) through the micro storage communication unit (42).

The data for learning may include the user's environment, state, and the like, obtained through analysis of the user's patterns.

This data is the initial data that is learned through the data storage (20) or the cloud computing (30) in order to provide the AI service later, and may be entered directly by the user.

The data input unit (43) preferably transmits the data to the data lake (10) at a preset transmission time.
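One way to read "preset transmission time" is that the device buffers its inputs and sends them as a batch once the scheduled time of day is reached. The sketch below follows that reading; the once-per-day policy, the class name `DataInputUnit`, and the method names are assumptions, since the specification does not define the schedule.

```python
from datetime import date, datetime, time
from typing import Optional


class DataInputUnit:
    def __init__(self, send_at: time):
        self.send_at = send_at                      # preset transmission time
        self.buffer: list = []                      # inputs waiting to be sent
        self.last_sent_date: Optional[date] = None  # day of the last batch

    def add(self, record) -> None:
        """Buffer one piece of input data."""
        self.buffer.append(record)

    def flush_if_due(self, now: datetime) -> list:
        """Return the batch to transmit, or an empty list if not due.

        Assumption: at most one batch per calendar day, sent once the
        clock has reached the preset time.
        """
        due = now.time() >= self.send_at and self.last_sent_date != now.date()
        if not due or not self.buffer:
            return []
        batch, self.buffer = self.buffer, []
        self.last_sent_date = now.date()
        return batch
```

Batching at a fixed time keeps radio use predictable on a low-power device; the returned batch would then be handed to the micro storage communication unit (42).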

The data storage unit (44) receives the result data or learning data from the data lake (10) to which the data was transmitted, and stores it.

When learning data is stored in the data storage unit (44), the data application unit (45) applies the learning data.

The data application unit (45) applies the learning data, that is, the result of the learning, so that the user can use the AI service.

The data use unit (46) uses the learning data applied by the data application unit (45), or uses the result data when result data is stored in the data storage unit (44). That is, the data use unit (46) can provide the AI service by using the applied learning data or the result data according to the environment, state, and the like of the user of the micro storage (40).
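The difference between the two payload kinds on the device side can be sketched as follows: learning data must pass through an apply step before use, whereas result data is usable as received. The representation chosen here is purely an assumption for illustration: learning data is modeled as a parameter dictionary with a hypothetical `threshold` key, and result data as a ready-to-call function.

```python
class MicroStorage:
    def __init__(self):
        self.stored = None        # data storage unit (44): last payload received
        self.active_model = None  # what the data use unit (46) will call

    def receive(self, kind: str, payload) -> None:
        """Store a payload and make it usable."""
        if kind not in ("learning", "result"):
            raise ValueError(f"unknown payload kind: {kind}")
        self.stored = (kind, payload)
        if kind == "learning":
            # Data application unit (45): learning data is applied first.
            self.active_model = self._apply(payload)
        else:
            # Result data can be used directly.
            self.active_model = payload

    @staticmethod
    def _apply(params: dict):
        # Hypothetical apply step: turn learned parameters into a callable.
        threshold = params["threshold"]
        return lambda reading: reading > threshold

    def use(self, reading):
        """Data use unit (46): evaluate the active model on a reading."""
        if self.active_model is None:
            raise RuntimeError("no learning data or result data received yet")
        return self.active_model(reading)
```

In this sketch a device that receives `{"threshold": 100}` as learning data would flag a reading of 120, while a device that receives a ready-made function as result data simply calls it.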

The micro storage (40) configured as described above can, through the data lake (10), overcome the inherent constraints of micro storage and the drawbacks of low-spec computing, and provide the AI service to the user.

In addition, the link between the micro storage (40) and the data lake (10) forms an embedded AI suited to low-spec embedded systems, and can serve as an independent artificial intelligence system or one that minimizes connections to cloud services. As a result, products used in everyday life can use a learning function that minimizes the use of external communication data.

The embedded AI supports personalized services such as smart objects, natural language processing, and personalization.

In the control method of the connected data architecture system using artificial intelligence configured as described above, as shown in FIG. 6, the data lake (10) first collects the data and stores it cumulatively to generate collected data (S10).

To decide whether to transmit the collected data to the data storage (20) or to the cloud computing (30), the data lake (10) compares the volume of the collected data with a preset reference capacity (S20).

That is, the data lake (10) measures the volume of the collected data and determines whether it exceeds the preset reference capacity.

If the data lake (10) determines that the volume of the collected data is at or below the preset reference capacity, it transmits the collected data to the data storage (20) and requests learning, so that transfer learning on the collected data can be processed in parallel. If, on the other hand, the data lake (10) determines that the volume of the collected data exceeds the preset reference capacity, it transmits the collected data to the cloud computing (30) and requests learning, so that the collected data can be backed up and transfer learning performed (S30).
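The decision in steps S20 and S30 reduces to a single threshold comparison. The sketch below shows it with hypothetical names (`Target`, `choose_target`); measuring capacity in bytes is also an assumption, as the specification does not state the unit.

```python
from enum import Enum


class Target(Enum):
    DATA_STORAGE = "data storage (20)"
    CLOUD_COMPUTING = "cloud computing (30)"


def choose_target(collected_bytes: int, reference_bytes: int) -> Target:
    """Pick where the data lake sends the collected data for learning.

    At or below the reference capacity -> data storage (20).
    Above the reference capacity       -> cloud computing (30).
    """
    if collected_bytes > reference_bytes:
        return Target.CLOUD_COMPUTING
    return Target.DATA_STORAGE
```

Note that the boundary case, where the volume equals the reference capacity, goes to the data storage (20), matching the "at or below" wording of the description.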

When the data lake (10) transmits the collected data to the data storage (20), the data storage (20) performs transfer learning on the collected data (S40).

The data storage (20) generates result data from the transfer learning on the collected data and shares the result data with the data lake (10) (S41).

Through this collaboration between the data storage (20) and the data lake (10) at the center, the steps of collecting data for learning, performing transfer learning, and applying the learning results gain sharing and scalability.

Using the data storage (20) overcomes the limits of physical resources for data collection, since SDI technology provides sharing and scalability. SDI and virtualization technology also allow transfer learning to be processed in parallel, and the learned result data can be shared and offered as an AI service.

When the data lake (10) transmits the collected data to the cloud computing (30), the cloud computing (30) backs up the collected data (S50).

The cloud computing (30) generates learning data by performing transfer learning on the backed-up collected data (S51).

The learning data generated by the cloud computing (30) is delivered to the data lake (10) (S52).

Because AI learning is performed through the cloud computing (30) with the data lake (10) at the center, the storage and computing resources of the public cloud of the cloud computing (30) provide scalability for the data collection and transfer learning steps. In other words, the public cloud storage of the cloud computing (30) compensates for the limited physical resources of the data lake (10), and the public cloud computing resources of the cloud computing (30) shorten the time needed for transfer learning on the collected data of the data lake (10).

The data lake (10), having received the result data from the data storage (20) or the learning data from the cloud computing (30), provides the result data or learning data to the micro storage (40) (S60). For this, the identification information of the micro storage (40) must be stored in the data lake (10) in advance, and the data lake (10) must identify the micro storage (40) corresponding to the result data or learning data before providing it.

When the micro storage (40) receives the learning data, it can apply the result of the learning data so that the AI service can be used through the micro storage (40) (S70).

When the micro storage (40) receives the result data, it can use the result data as is. When the learning data has been applied in the micro storage (40), the applied learning data can likewise be used (S80).

Because the micro storage (40) receives the learning data or result data through the data lake (10), the steps of collecting data and performing transfer learning are carried out via the data lake (10), which allows the micro storage (40) to overcome the limitations of low-spec computing.
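The order of steps S10 through S80 can be summarized as a short trace function. This is a sketch of the branching only: the function name `control_flow`, measuring volume as the summed length of byte strings, and the returned labels are assumptions, and no actual learning, backup, or transmission is performed.

```python
def control_flow(inputs: list[bytes], reference_capacity: int):
    """Return the kind of data delivered and the sequence of steps taken."""
    trace = ["S10"]                          # collect and accumulate data
    volume = sum(len(item) for item in inputs)
    trace.append("S20")                      # compare volume with reference
    trace.append("S30")                      # request learning from one side
    if volume <= reference_capacity:
        trace += ["S40", "S41"]              # data storage: learn, share result
        kind = "result"
    else:
        trace += ["S50", "S51", "S52"]       # cloud: back up, learn, deliver
        kind = "learning"
    trace.append("S60")                      # data lake provides to micro storage
    if kind == "learning":
        trace.append("S70")                  # learning data is applied first
    trace.append("S80")                      # data is used on the device
    return kind, trace
```

The trace makes the asymmetry visible: the cloud path adds a backup step before learning and an apply step on the device, while the data storage path delivers result data that is used directly.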

The connected data architecture system using artificial intelligence, controlled as described above, manages data structurally while reliably training on the data through deep learning for each interface, receives the resulting learning data or result data, and provides it according to the user's environment or state. As a result, artificial intelligence can operate independently on low-spec devices such as healthcare devices, wearable devices, and IoT devices, and the AI service can be tailored to the user.

In addition, the interface between the data lake (10), which collects the data to be learned, and the data storage (20) allows learning to be processed in parallel and provides sharing and scalability. This reduces the volume of data concentrated in any one place and enables optimal learning and data classification, so that the AI service can be provided efficiently.

In addition, because the interface between the data lake (10), which collects the data to be learned, and the cloud computing (30) provides scalability, the learning time can be shortened even when the data to be learned is large.

The embodiments of the present invention described above and illustrated in the drawings should not be construed as limiting the technical idea of the present invention. The scope of protection of the present invention is limited only by the matters set forth in the claims, and a person of ordinary skill in the art may improve and modify the technical idea of the present invention in various forms. Such improvements and modifications therefore fall within the scope of protection of the present invention insofar as they are obvious to a person of ordinary skill in the art.

10: data lake 11: data lake communication unit
12: data collection unit 13: learning request unit
14: data providing unit 20: data storage
21: data storage communication unit 22: transfer learning unit
23: learning completion unit 24: result sharing unit
30: cloud computing 31: cloud communication unit
32: backup unit 33: AI learning unit
34: data delivery unit 40: micro storage
41: information management unit 42: micro storage communication unit
43: data input unit 44: data storage unit
45: data application unit 46: data use unit

Claims

A connected data architecture system for an artificial intelligence service, comprising: a data lake (10) for storing collected data generated by collecting data; a data storage (20) for receiving the collected data from the data lake (10), performing transfer learning on it, and sharing the generated result data with the data lake (10); cloud computing (30) for receiving the collected data from the data lake (10) and delivering learning data generated by transfer learning to the data lake (10); and a micro storage (40) for receiving and applying the learning data from the data lake (10), or receiving and using the result data, and for inputting the data,
wherein the data lake (10) includes: a data lake communication unit (11) for communicating with the data storage (20), the cloud computing (30), and the micro storage (40); a data collection unit (12) for collecting the data received from the micro storage (40) through the data lake communication unit (11) and generating the collected data; a learning request unit (13) for determining whether to transmit the collected data generated by the data collection unit (12) to the data storage (20) or to the cloud computing (30); and a data providing unit (14) for providing to the micro storage (40) the result data received from the data storage (20), or the learning data received from the cloud computing (30), in response to the request made through the learning request unit (13).

The connected data architecture system for an artificial intelligence service according to claim 1,
wherein the data storage (20) includes: a data storage communication unit (21) for communicating with the data lake (10); a transfer learning unit (22) for performing transfer learning on the collected data received from the data lake (10) through the data storage communication unit (21); a learning completion unit (23) for generating result data from the transfer learning performed by the transfer learning unit (22); and a result sharing unit (24) for sharing the result data generated by the learning completion unit (23).

The connected data architecture system for an artificial intelligence service according to claim 1 or claim 2,
wherein the cloud computing (30) includes: a cloud communication unit (31) for communicating with the data lake (10); a backup unit (32) for backing up the collected data received from the data lake (10) through the cloud communication unit (31); an AI learning unit (33) for generating learning data by performing transfer learning on the collected data backed up in the backup unit (32); and a data delivery unit (34) for delivering the learning data generated by the AI learning unit (33) to the data lake (10).

The connected data architecture system for an artificial intelligence service according to claim 3,
wherein the micro storage (40) includes: an information management unit (41) for storing and managing identification information for identification; a micro storage communication unit (42) for communicating with the data lake (10); a data input unit (43) for inputting data to be transmitted to the data lake (10) through the micro storage communication unit (42); a data storage unit (44) for storing the result data or learning data received from the data lake (10); a data application unit (45) for applying the learning data stored in the data storage unit (44); and a data use unit (46) for using the learning data applied by the data application unit (45) or the result data stored in the data storage unit (44).

A control method of a connected data architecture system for an artificial intelligence service, comprising: a data collection step in which a data lake (10) for collecting data collects the data and stores it cumulatively to generate collected data;
a determination step in which the data lake (10) compares the volume of the collected data with a preset reference capacity in order to request learning of the collected data from either a data storage (20), which processes transfer learning on the collected data in parallel, or cloud computing (30), which backs up the collected data and performs transfer learning on it;
a learning request step in which the data lake (10) transmits the collected data to the data storage (20) and requests learning when the volume of the collected data is at or below the preset reference capacity, and transmits the collected data to the cloud computing (30) and requests learning when the volume of the collected data exceeds the preset reference capacity;
a transfer learning step in which the data storage (20) performs transfer learning on the collected data received from the data lake (10);
a sharing step in which the data storage (20) generates result data from the transfer learning and shares it with the data lake (10);
a backup step in which the cloud computing (30) backs up the collected data received from the data lake (10);
an AI learning step in which the cloud computing (30) performs transfer learning on the backed-up collected data to generate learning data;
a delivery step in which the learning data generated by the cloud computing (30) is delivered to the data lake (10);
a providing step in which the data lake (10) receives the result data of the data storage (20) or the learning data of the cloud computing (30) and provides it to a micro storage (40);
an application step in which the micro storage (40) applies the learning data when it receives the learning data; and
a use step in which the micro storage (40) uses the applied learning data, or uses the result data when it receives the result data.