WO2017190757A1 - System and method for distributed data analysis

Info

Publication number
WO2017190757A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
measurement data
storage device
analysis
computing
Prior art date
Application number
PCT/EP2016/000713
Other languages
English (en)
Inventor
Tobias ABTHOFF
Original Assignee
Norcom Information Technology Ag
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Norcom Information Technology Ag
Priority to EP16720348.8A (EP3420451A1)
Priority to PCT/EP2016/000713 (WO2017190757A1)
Publication of WO2017190757A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061: Partitioning or combining of resources

Definitions

  • A distributed data analysis system for analyzing collected measurement data comprises a data input device configured to receive measurement data, a first storage device associated with the data input device and configured to store the measurement data input via the data input device, and a first computing device associated with the first storage device.
  • A second storage device is configured to store measurement data previously stored on the first storage device.
  • A second computing device is associated with the second storage device, and a data distribution device is configured to distribute the measurement data between the first storage device and the second storage device based on at least one predetermined criterion.
  • A data management device is configured to store a location of the measurement data and to update the stored location based on the distribution by the data distribution device.
  • Fig. 3 shows a schematic overview of a distributed data analysis system in accordance with another embodiment of the present invention.
  • Each service point 104 includes a data input device 10 for receiving the measurement data generated during the test drive.
  • the amount of data typically will be several GB or several tens of GB.
  • Each service point 104 includes a storage device 12, for example, one or more hard disks for storing the
  • Main computing system 102 includes a main storage device 16 and an associated main computing device 18.
  • Storage device 16 has a much larger storage capacity than, for example, storage devices 12, 13.
  • storage device 16 serves as a "data lake" storing the bulk of the available measurement data collected by one or more vehicles.
  • storage devices 13, 16 and computing devices 15, 18 may be part of a large computing cluster 31, for example, a Hadoop cluster.
  • the measurement data offloaded at service point 106 can be easily transferred to main storage device 16 because service point 106 is co-located with main computing system 102.
  • service point 104, which may be located anywhere around the globe, for example, in areas with poor internet connection. The data offloaded at service point 104 may therefore not be available for analysis on main computing system 102 in a timely manner. For this reason, in the present embodiment, part of the analysis of the measurement data is performed at local service point 104, in combination with an analysis that is performed in parallel on the measurement data available at the headquarters. This will be described in more detail below.
  • distributed data analysis system 100 includes a data management device 22 that is in communication with computing device 14 of each service point 104, as well as with main computing system 102.
  • data management device 22 may include a web server running the DaSense software.
  • Data management device 22 is configured to store a location of the measurement data that is offloaded at each service point 104 and that is stored on main storage device 16, for example, in an appropriate database.
  • data management device 22 receives the meta data generated by service point 104 during the data ingest process and forwards the same to main computing system 102. In this manner, the location of all the data that is available for performing a particular analysis is stored, for example, on main computing system 102.
  • data movement planner 68 may determine that data needs to be moved from service point 104 to main computing system 102, e.g., because the maximum local data size on storage device 12 has been exceeded.
  • Data movement planner 68 is configured to choose between moving the data online or offline, i.e., by transferring it via data link 108 or by physical mail, for example, on portable hard drives 32 sent via DHL or a similar courier service.
  • the corresponding determinations are forwarded to an online move queue 70 and an offline move queue 72.
  • the locations of the moved data are continuously updated and stored by data management device 22 to be used in subsequent queries.
  • specific queries may be automatically generated by distributed data analysis system 100, for example, standard queries for certain car behaviors, car locations, data types and the like, which may be generated on a regular basis, and the results of the queries may be stored for future reference without immediately being reported to a user.
  • a predefined report may be generated after a predetermined time period has elapsed, for example, on a weekly basis, and the available data may be retrieved by engineers in a web client or as a PDF document at a later time.
  • Data analysis system 200 is suitable for use in an autonomous driving application.
  • a large number of algorithms have to be developed to interpret incoming sensory data from, e.g., cameras, radar or lidar systems or the like in order to maintain an accurate representation of the vehicle state and its environment.
  • the sensory data, which normally has to be analyzed in real time, is recorded, such that new versions of an algorithm can be tested on the same data set.
  • the data rate is extremely high, for example, around 2 GB per second. Clearly, this requires a large available storage space. Therefore, typically, the test drives are performed in the vicinity of main computing system 202 at the headquarters.
  • Second server node 117 may have an intermediate amount of computing power for data that might not be accessed in the immediate future, but perhaps in the foreseeable future, or perhaps less frequently than the "hot" data (referred to herein as "warm" data). It will be appreciated that additional server nodes for data that is even less likely to be accessed, having even less computing power and considerably higher storage capacity, may also be provided (for "cold" data). In addition, an object store 140 that has practically no computing power is provided for data that is outdated, but has to be kept for various reasons ("frozen" data). It should be noted that, in some embodiments, at least some of the nodes having data with different temperatures may also be provided at geographically different locations, instead of being co-located with each other, for example, at the headquarters.
  • Data distribution device 120 is configured to classify the measurement data stored on the respective storage devices into data having different priorities, for example, based on one or more predetermined criteria, and to transfer data having a low priority to a server node that has lower computing power.
  • data distribution device 120 may be configured to classify some measurement data stored on first storage device 112 as having a lower priority and transfer the same to second storage device 116.
  • data that is stored on, for example, second storage device 116 may be transferred to first storage device 112, if necessary.
  • Data classification can be based on, for example, access times, creation dates, or other meta data or content-related criteria.
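
The classification of measurement data into "hot", "warm", "cold" and "frozen" tiers described above can be illustrated with a minimal sketch. It assumes that the only criterion is the time since the last access; the age thresholds, the `Measurement` record and the function names are hypothetical and do not appear in the source, which names the tiers but gives no numeric cut-offs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical age limits per tier (assumption, not from the source).
TIER_THRESHOLDS = [
    ("hot", timedelta(days=7)),
    ("warm", timedelta(days=90)),
    ("cold", timedelta(days=365)),
]


@dataclass
class Measurement:
    name: str
    last_access: datetime
    tier: str = "hot"  # tier on which the data currently resides


def classify(item: Measurement, now: datetime) -> str:
    """Return the target tier for an item based on time since last access."""
    age = now - item.last_access
    for tier, limit in TIER_THRESHOLDS:
        if age <= limit:
            return tier
    return "frozen"  # older than every limit: keep on the object store


def plan_transfers(items, now):
    """Return (name, current_tier, target_tier) for items that should move."""
    moves = []
    for item in items:
        target = classify(item, now)
        if target != item.tier:
            moves.append((item.name, item.tier, target))
    return moves
```

In this sketch a transfer is planned in either direction, so data that is accessed again would be moved back to a tier with more computing power, in line with the statement that data on the second storage device may be transferred to the first storage device if necessary. Other criteria mentioned above, such as creation dates or content-related metadata, would replace or extend `classify`.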

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a distributed data analysis system (100) for analyzing large amounts of collected measurement data, for example, data accumulated during test drives of vehicles (2) in the field of automotive engineering. Upon receiving a query for an analysis to be performed on the collected measurement data, an analysis device (26) determines which of a plurality of different storage devices (12, 16) located at geographically different locations contain measurement data relevant to the query, and performs the analysis on the measurement data stored on the appropriate storage devices. The partial results of the analysis are combined and returned to a user. Data is transferred between the storage devices (12, 16) based on, for example, a remaining storage capacity of said devices or a priority of the measurement data.
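
As a rough illustration of the query flow summarized in the abstract, the following sketch looks up which storage location holds each requested measurement file, computes a partial result at each location, and combines the partial results. The catalog, the store contents, the file identifiers and all function names are hypothetical; the mean of recorded samples is used only as a simple example of an analysis that can be combined from partial results.

```python
from collections import defaultdict

# Hypothetical catalog: measurement-file id -> location that stores it.
CATALOG = {
    "drive_001": "service_point",
    "drive_002": "service_point",
    "drive_003": "headquarters",
}

# Hypothetical per-location stores: file id -> recorded samples.
STORES = {
    "service_point": {"drive_001": [80.0, 100.0], "drive_002": [60.0]},
    "headquarters": {"drive_003": [120.0, 90.0, 50.0]},
}


def locate(file_ids):
    """Group the requested files by the location that stores them."""
    by_location = defaultdict(list)
    for file_id in file_ids:
        by_location[CATALOG[file_id]].append(file_id)
    return dict(by_location)


def partial_mean(location, file_ids):
    """Run next to the data: return (sum, count) instead of raw samples."""
    store = STORES[location]
    samples = [value for fid in file_ids for value in store[fid]]
    return sum(samples), len(samples)


def analyze_mean(file_ids):
    """Combine the partial results from every relevant location."""
    partials = [partial_mean(loc, ids) for loc, ids in locate(file_ids).items()]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else None
```

Only the small `(sum, count)` pairs cross the link between locations in this sketch, which reflects the motivation of analyzing data where it is stored rather than moving it first. For example, `analyze_mean(["drive_001", "drive_003"])` combines one partial result from each location.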
PCT/EP2016/000713 2016-05-02 2016-05-02 System and method for distributed data analysis WO2017190757A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP16720348.8A EP3420451A1 (fr) 2016-05-02 2016-05-02 System and method for distributed data analysis
PCT/EP2016/000713 WO2017190757A1 (fr) 2016-05-02 2016-05-02 System and method for distributed data analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2016/000713 WO2017190757A1 (fr) 2016-05-02 2016-05-02 System and method for distributed data analysis

Publications (1)

Publication Number Publication Date
WO2017190757A1 (fr) 2017-11-09

Family

ID=55910911

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2016/000713 WO2017190757A1 (fr) 2016-05-02 2016-05-02 System and method for distributed data analysis

Country Status (2)

Country Link
EP (1) EP3420451A1 (fr)
WO (1) WO2017190757A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110187829A (zh) * 2019-04-22 2019-08-30 上海蔚来汽车有限公司 Data processing method, apparatus, system and electronic device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140040575A1 (en) * 2012-08-01 2014-02-06 Netapp, Inc. Mobile hadoop clusters
US20140195558A1 (en) * 2013-01-07 2014-07-10 Raghotham Murthy System and method for distributed database query engines

Also Published As

Publication number Publication date
EP3420451A1 (fr) 2019-01-02

Similar Documents

Publication Publication Date Title
US20220300273A1 (en) Over-the-air (ota) mobility services platform
EP2752779B1 (fr) System and method for distributed database query engines
US20190377816A1 (en) Tool for Creating and Deploying Configurable Enrichment Pipelines
US20160012107A1 (en) Mapping query operations in database systems to hardware based query accelerators
US20190377817A1 (en) Tool for Creating and Deploying Configurable Pipelines
US20170085661A1 (en) Computer Systems and Methods for Sharing Asset-Related Information Between Data Platforms Over a Network
US20210373914A1 (en) Batch to stream processing in a feature management platform
US20220188194A1 (en) Cloud-based database backup and recovery
US20200065405A1 (en) Computer System & Method for Simplifying a Geospatial Dataset Representing an Operating Environment for Assets
JP6501675B2 (ja) Configurable onboard information processing
US11797527B2 (en) Real time fault tolerant stateful featurization
CN112019605A (zh) Data distribution method and system for data streams
US11907913B2 (en) Maintaining an aircraft with automated acquisition of replacement aircraft parts
Killeen Knowledge-based predictive maintenance for fleet management
WO2017190757A1 (fr) System and method for distributed data analysis
US11775864B2 (en) Feature management platform
RU2718215C2 (ru) Data processing system and method for detecting congestion in a data processing system
Hilgendorf Efficient industrial big data pipeline for lossless transfer of vehicular data
Matesanz et al. Demand-driven data acquisition for large scale fleets
US20210374637A1 (en) Analyzing and managing production and supply chain

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase

Ref document number: 2016720348

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2016720348

Country of ref document: EP

Effective date: 20180928

NENP Non-entry into the national phase

Ref country code: DE

121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 16720348

Country of ref document: EP

Kind code of ref document: A1