WO2018216828A1 - Energy big data management system and method therefor

Info

Publication number
WO2018216828A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
big data
energy big
energy
node
Prior art date
Application number
PCT/KR2017/005385
Other languages
English (en)
Korean (ko)
Inventor
송민구
최중인
Original Assignee
재단법인차세대융합기술연구원
Priority date
2017-05-24
Filing date
2017-05-24
Publication date
2018-11-29
Application filed by 재단법인차세대융합기술연구원
Publication of WO2018216828A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2219 Large Object storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present disclosure relates generally to an energy big data management system and a management method thereof, and in particular to a system and method for managing energy big data collected in real time in a Spark Streaming-based cloud system.
  • Cluster computing frameworks are gaining popularity in the face of the ever-increasing volume of big data in the modern computing era.
  • Hadoop and Spark are growing rapidly, and many Internet service companies, such as Google, Facebook, and Amazon, use these cluster computing platforms to provide real-time services built on their own technologies, such as machine learning.
  • Spark is a real-time distributed computing framework for big data, and can execute big data processing at high speed in a distributed cluster environment.
  • Unlike Hadoop, a typical distributed framework, Spark is usually associated with the term 'real time'.
  • Hadoop passes intermediate data through storage in the Hadoop Distributed File System (HDFS), which results in frequent storage interaction and slow processing.
  • Spark is based on in-memory processing, which enables faster, lower-latency analysis, and it is expected to become the framework for next-generation big data processing.
  • Spark can read and write the big data to be processed via HDFS, but subsequent processing is done in memory by default, which can be faster than Hadoop for tasks with many iterations, such as machine learning or graph processing. That is why Spark is said to be able to perform data analysis tasks up to 100 times faster than Hadoop MapReduce.
  • MapReduce has been pointed out as a performance bottleneck in Hadoop clusters because it runs jobs in batch mode. Spark, by contrast, is an alternative to MapReduce because it performs analysis in short batches of less than five seconds.
  • Big data refers to almost all kinds of modern digital data that are large in volume, varied in form, and generated or updated at high speed. Such data are difficult to process because structured and unstructured forms coexist.
  • Structured data is data stored in fixed fields, for example in a relational database or a spreadsheet.
  • Unstructured data is data not stored in fixed fields.
  • Semi-structured data is not stored in fixed fields but includes metadata, schemas, and the like, for example XML or HTML text. Big data is sometimes referred to as large data.
  • According to one aspect of the present disclosure, there is provided a method for managing energy big data using a Spark cluster server and a web client, comprising: processing and managing, on the cluster server side, energy big data collected in real time; and transmitting, by the cluster server, information to be displayed on the screen of the web client to the web client according to a request of the web client.
  • In the energy big data management method, the energy big data processing step comprises: classifying energy big data including structured data and unstructured data; analyzing the classified energy big data; and storing the analyzed energy big data in the cluster server.
  • According to another aspect, there is provided an energy big data management system using a cluster server and a web client, comprising: a management node controlling the cluster server; and a data node for processing and storing energy big data collected in real time, wherein the management node, according to a request of the web client, causes information stored in the data node to be transmitted or new energy big data to be received at the data node.
  • FIG. 1 is a diagram illustrating an example of the overall configuration of an energy big data management system according to the present disclosure.
  • FIG. 2 is a diagram illustrating the concept of an RDD abstraction process according to the present disclosure.
  • FIG. 3 is a diagram illustrating an example of a cluster server according to the present disclosure.
  • FIG. 4 is a diagram illustrating an example of a data node according to the present disclosure.
  • The energy big data management system 1 includes a cluster server 10 and a web client 20.
  • The cluster server 10 processes, stores, and manages energy big data collected in real time.
  • The cluster server 10 is preferably a cluster server based on the Spark framework, but is not limited thereto.
  • Data analysis frameworks are essential for processing large amounts of energy big data on distributed servers, and such frameworks must meet various requirements such as safety, data security, timeliness, reliability, and resistance to aging.
  • Among existing data analysis frameworks, Spark Streaming, which supports second-scale processing through in-memory processing, is in the spotlight.
  • Spark Streaming runs on Spark, which was proposed by UC Berkeley in 2012 to improve the execution time of jobs that are slow because of frequent storage accesses in the existing Apache Hadoop. Spark saves execution time by keeping in memory the intermediate results that are used repeatedly in a job, which reduces frequent storage accesses.
  • The data structure used to store and manage intermediate results in memory is the resilient distributed dataset (RDD), which provides methods such as transformations and actions.
  • An RDD supports parallel processing and is fault tolerant, so that big data can be utilized and analyzed through an operation process as shown in FIG. 2.
  • Spark Streaming was developed in response to the demand for stream processing in various industries. Spark Streaming periodically processes live stream data delivered as input to Spark in micro-batch form. Spark Streaming accesses storage in two cases: when the input data is stored for fault recovery, and when data that has been moved out of an RDD is read back.
  • The cluster server 10 includes a management node 100, a data node 120, and an edge node 140.
  • The management node 100 controls the cluster server 10 so that, at the request of the web client 20, information stored in the data node 120 is transmitted to the web client 20 or new energy big data is transmitted in real time to the data node 120.
  • The management node 100 generates a control signal, that is, an external energy data input signal, and transmits it to the data node 120 so that external energy data generated from the external environment 22 is received by the data node 120.
  • Likewise, the management node 100 generates a control signal, that is, a field energy data input signal, and transmits it to the data node 120 so that field energy data generated from the field environment 24 is received by the data node 120.
  • The field environment 24 may include, for example, a business, a public agency, a home, and the like.
  • The management node 100 generates a control signal, that is, a search request signal, and transmits it to the data node 120 so that the corresponding energy big data stored in the data node 120 is retrieved and transmitted to the web client 20 for display on the screen 21 of the web client 20.
  • The management node 100 generates a control signal for a failure of the cluster server 10 or the web client 20, that is, a fault recovery signal, and transmits the generated control signal to the data node 120 or the web client 20.
  • The cluster server 10 may perform failure diagnosis and abnormality prediction for the web client 20. In this way, the cluster server 10 controls the operation state of the web client 20 based on the accumulated energy big data, so that the web client 20 operates smoothly, which has the effect of enabling active and voluntary participation in the energy saving market.
  • Two management nodes 100 are illustrated, but the present disclosure is not limited thereto.
  • Depending on the environment of the cluster server 10 or the environment of the web client 20, the first management node 110 may be used as the main management node.
  • The remaining second management node 120 may be used as a secondary management node. Alternatively, only one of the two management nodes may be used, or both may be used.
  • The data node 120 receives a control signal from the management node 100 and manages energy big data received from the web client 20.
  • Referring to FIG. 3, six data nodes 120 are illustrated, but the present disclosure is not limited thereto.
  • The energy big data may include external energy data generated directly or indirectly from an external environment and field energy data generated from the web client 20.
  • The external energy data may include terrain information, weather information, social information, and the like according to the external environment 22.
  • The field energy data may include energy consumption, remaining energy, and the like according to the field environment 24.
  • Such energy big data, consisting of external energy data and field energy data, may be classified into structured data and unstructured data.
  • Although energy big data may also be classified as semi-structured data, this disclosure classifies and describes it as structured data and unstructured data.
  • The data node 120 includes a data receiver 1210, a data analyzer 1220, a data storage unit 1230, and a data manager 1240.
  • The data receiver 1210 receives external energy data transmitted in real time and field energy data received in response to a control signal of the management node 100.
  • The structured data included in the energy big data may be collected through Kafka, and the unstructured data included in the energy big data may be collected through Flume.
  • The data analyzer 1220 may convert the collected energy big data into a form suitable for analysis so that it can be analyzed for each field.
  • For example, the structured data collected through Kafka and the unstructured data collected through Flume may be converted, using MLlib and Sqoop, into a form that can be handled in a programming language and then analyzed.
  • Sqoop transfers structured data from relational database management systems (RDBMS) to HDFS and HBase. The machine learning library (MLlib) is limited to certain supervised and unsupervised learning algorithms, but its machine learning implementations are supported in several programming languages, including Python, Scala, and Java. GraphX is a library for graph computation. Here, Sqoop is preferably linked with Kafka and Flume for real-time data transmission.
  • The data storage unit 1230 classifies and stores the analyzed data for each application field.
  • The data manager 1240 includes a first management unit 1242 for providing information stored in the data storage unit 1230 at the request of the web client 20, a second management unit 1244 for visualizing the information stored in the data storage unit 1230, and a third management unit 1246 for managing the cluster server 10 and the web client 20.
  • The first management unit 1242 may include a Hadoop database (HBase) unit that retrieves the corresponding information so that information stored in the data node 120 can be transmitted at the request of the web client 20, and may support SQL search of the structured data. It may also include SparkR, which links Spark with R, a statistical tool useful for data science.
  • The second management unit 1244 may include Oozie, which runs a plurality of jobs in sequence and performs workflow scheduling and monitoring.
  • The third management unit 1246 may include ZooKeeper, which helps resolve the various kinds of failures and exceptions that occur in the field.
  • According to the request of the web client 20, the data node 120 searches for the energy big data arranged for each field and transmits the information so that it is displayed on the screen 21 of the web client 20.
  • The data node 120 receives energy data from the web client 20 in real time.
  • The external energy data may be input to the data node 120 in real time without a separate external energy data input signal.
  • The edge node 140 connects the cluster server 10 and the web client 20 so that they interact with each other directly or indirectly, for example via a network.
  • The network may be any operational network, such as a Local Area Network (LAN), a Wide Area Network (WAN), a Virtual Private Network (VPN), or the Internet.
  • Two edge nodes 140 are illustrated, but the present disclosure is not limited thereto.
  • The web client 20 is preferably a computer (e.g., a PC) capable of network communication.
  • The web client 20 may be a conventional computer, but it will be understood that it may be any form of machine or computing device, including, for example, a desktop computer, a laptop computer, a tablet computer, a personal digital assistant (PDA), or a smartphone.
  • The plurality of web clients 20 may be the same as or different from one another, and each may include a transmitter/receiver (not shown) for transmitting and receiving data to enable communication via a network.
  • A method for managing energy big data using a Spark cluster server and a web client, comprising: processing and managing energy big data collected in real time on the cluster server side; and transmitting, by the cluster server, information to be displayed on the screen of the web client to the web client according to a request of the web client.
  • The energy big data processing step comprises: classifying energy big data including structured data and unstructured data; analyzing the classified energy big data; and storing the analyzed energy big data in the cluster server.
  • A representative example of a web client is a PC, but the web client is not limited thereto, and any computing means (e.g., a mobile phone) capable of displaying information received from the cluster server on a screen may be used.
  • This series of steps is an internal process of the server-side computer and is performed by software.
  • An energy big data management method in which the energy big data collected in real time includes external energy data and field energy data.
  • An energy big data management method in which the information transmission step includes retrieving the energy big data arranged for each field corresponding to the request when the web client's request includes information on energy big data required for site control.
  • An energy big data management method in which the classified energy big data is analyzed or retrieved in consideration of priority according to a request of the web client.
  • An energy big data management system using a cluster server and a web client, comprising: a management node controlling the cluster server; and a data node for processing and storing energy big data collected in real time, wherein the management node, according to a request of the web client, causes information stored in the data node to be transmitted or new energy big data to be received at the data node.
  • An energy big data management system further comprising an edge node for transmitting information and signals between the cluster server and the web client.
  • An energy big data management system in which the cluster server consists of two management nodes, six data nodes, and two edge nodes.
  • An energy big data management system in which, when a command signal for receiving energy big data is input from the management node, the data node receives external energy data and field energy data through the edge node, classifies the data into structured data and unstructured data, analyzes it, and arranges and stores it by field.
  • An energy big data management system in which the data node searches for the energy big data arranged for each field corresponding to the request and sends this information to the web client via the edge node.
  • According to the energy big data management system and method of the present disclosure, the cluster server manages and controls the operation state of the web client that provides energy big data, so that energy big data can be managed efficiently according to its purpose.
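
The classify, analyze, store, and search flow described above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the disclosed implementation: the names `ManagementNode`, `DataNode`, `classify`, and `analyze`, the signal names `"input"` and `"search"`, the `kwh` field, and the rule that a dict counts as structured data are all hypothetical. A real deployment would presumably run on Spark Streaming with Kafka and Flume rather than on in-process lists.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

# One collected item: (field name, raw payload). Both are illustrative.
Record = Tuple[str, Any]


def classify(payload: Any) -> str:
    """Assumption: a dict has fixed fields (structured); anything else is unstructured."""
    return "structured" if isinstance(payload, dict) else "unstructured"


def analyze(kind: str, payload: Any) -> Dict[str, Any]:
    """Toy analysis step: normalise a reading, or measure free text."""
    if kind == "structured":
        return {"kind": kind, "kwh": float(payload["kwh"])}
    return {"kind": kind, "length": len(str(payload))}


@dataclass
class DataNode:
    # Analyzed records are arranged and stored per field.
    storage: Dict[str, List[Dict[str, Any]]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def ingest(self, batch: List[Record]) -> None:
        """Process one micro-batch: classify, analyze, then store by field."""
        for field_name, payload in batch:
            kind = classify(payload)
            self.storage[field_name].append(analyze(kind, payload))

    def search(self, field_name: str) -> List[Dict[str, Any]]:
        """Return a copy of everything stored for one field (empty if unknown)."""
        return list(self.storage.get(field_name, []))


@dataclass
class ManagementNode:
    data_node: DataNode

    def handle(self, signal: str, payload: Any) -> Any:
        """Route a control signal to the data node."""
        if signal == "input":    # energy data input signal
            self.data_node.ingest(payload)
            return None
        if signal == "search":   # search request signal
            return self.data_node.search(payload)
        raise ValueError(f"unknown control signal: {signal}")


manager = ManagementNode(DataNode())
manager.handle("input", [("consumption", {"kwh": "1.5"}), ("weather", "cloudy, 12C")])
print(manager.handle("search", "consumption"))
```

Keeping the control-signal routing in the management node and the classify, analyze, and store steps in the data node mirrors the division of roles in the description; the edge node and fault recovery signal are omitted for brevity.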

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Primary Health Care (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to a method for managing energy big data using a Spark cluster server and a web client, the method comprising: a step of enabling a cluster server to process and manage energy big data collected in real time; and an information transmission step of enabling the cluster server to transmit, to the web client, information to be displayed on a screen of the web client according to a request of the web client, wherein the energy big data processing step comprises the steps of: classifying energy big data including structured data and unstructured data; analyzing the classified energy big data; and storing the analyzed energy big data in the cluster server.
PCT/KR2017/005385 2017-05-24 2017-05-24 Energy big data management system and method therefor WO2018216828A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020170063987A KR101878291B1 (ko) 2017-05-24 2017-05-24 Energy big data management system and method therefor
KR10-2017-0063987 2017-05-24

Publications (1)

Publication Number Publication Date
WO2018216828A1 (fr) 2018-11-29

Family

ID=63251991

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2017/005385 WO2018216828A1 (fr) 2017-05-24 2017-05-24 Energy big data management system and method therefor

Country Status (2)

Country Link
KR (1) KR101878291B1 (fr)
WO (1) WO2018216828A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
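CN112328569A (zh) * 2020-07-31 2021-02-05 山东云缦智能科技有限公司 Construction method based on a Flume distributed data collection architecture
CN118413867A (zh) * 2024-07-02 2024-07-30 西安羚控电子科技有限公司 Cluster data synchronization method and apparatus based on service data degradation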

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100113686A (ko) * 2009-04-14 2010-10-22 서울시립대학교 산학협력단 Real-time energy management system through monitoring
JP2014067205A (ja) * 2012-09-26 2014-04-17 Hitachi Systems Ltd Usage management server, program, usage management method, and information management server
WO2015023100A1 (fr) * 2013-08-12 2015-02-19 주식회사 인코어드 테크놀로지스 Apparatus and system for providing energy information
KR101648401B1 (ko) * 2015-04-17 2016-08-16 (주)모아데이타 Database apparatus, storage unit, and method for data management and analysis
KR20160128498A (ko) * 2015-04-28 2016-11-08 세림티에스지(주) Energy management system using a cloud system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101568654B1 (ko) * 2014-01-03 2015-11-16 주식회사 테크인모션 Big data service system based on a web server and a big data cluster using an API driver
KR101700327B1 (ko) * 2016-01-05 2017-01-26 (주)미소정보기술 Method, server, and computer-readable recording medium for providing big data analysis results

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KIM, YOUNG-SUN ET AL.: "A Study for Big Data Analytics Platform with Raspberry Pi Cluster and Apache Spark", PROCEEDINGS OF THE KIPS( KOREA INFORMATION PROCESSING SOCIETY) FALL CONFERENCE, vol. 22, no. 2, October 2015 (2015-10-01), pages 1272 - 1275 *

Also Published As

Publication number Publication date
KR101878291B1 (ko) 2018-08-07

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 17910816; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: PCT application non-entry in European phase
    Ref document number: 17910816; Country of ref document: EP; Kind code of ref document: A1