WO2022145524A1 - Method and apparatus for structuring different types of data - Google Patents

Method and apparatus for structuring different types of data

Info

Publication number
WO2022145524A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
unstructured
heterogeneous
preset
classifying
Prior art date
Application number
PCT/KR2020/019348
Other languages
English (en)
Korean (ko)
Inventor
김준오
Original Assignee
(주)누리텔레콤
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by (주)누리텔레콤
Publication of WO2022145524A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Definitions

  • the present invention relates to a method and apparatus for shaping heterogeneous data.
  • Energy data is organized around supply and demand, but it spans various energy types such as electricity, gas, and heat.
  • Embodiments of the present invention provide a heterogeneous data standardization method and apparatus that classify data related to energy transactions into structured data and unstructured data and combine them, so that the data related to energy transactions can be managed as integrated data.
  • a heterogeneous data formalization method performed by a heterogeneous data formalization apparatus, the method comprising: classifying structured data from input data according to a preset standardized item rule; classifying the unstructured data in the input data according to a preset basic item classification system; and attaching the unstructured data to the unstructured data items classified according to the basic item classification system and re-processing the unstructured data into standardized data.
  • the unstructured data may be classified according to the basic item classification system extracted through at least one of the Internet, a pre-stored database, and a comparison process between unstructured data.
  • the method may further include classifying the unstructured data that is not classified according to the preset basic item classification system into other preset items.
  • the method may further include, when summed data obtained by adding input additional data to the unstructured data classified into the preset other items can be classified according to the basic item classification system, creating a new item and classifying the summed data into standardized data through a process of re-mapping the summed data and related data.
  • the method may include: arranging the structured data according to the preset standardized item rule; imaging the unstructured data; and combining the arranged structured data and the imaged unstructured data and outputting them on a screen.
  • a data input unit for receiving data; a data classification unit for classifying structured data from the input data according to a preset standardized item rule and classifying unstructured data from the input data according to a preset basic item classification system; and an unstructured data management unit for reprocessing the unstructured data into standardized data by attaching unstructured data to items of unstructured data classified according to the basic item classification system.
  • the data classification unit may classify the unstructured data according to a basic item classification system extracted through at least one of the Internet, a pre-stored database, and a comparison process between the unstructured data.
  • the data classification unit may classify the unstructured data that is not classified according to the preset basic item classification system into other preset items.
  • the data classification unit may, when summed data obtained by adding the additional data input through the data input unit to the unstructured data classified into the preset other items can be classified according to the basic item classification system, generate a new item and classify the summed data into standardized data through a process of re-mapping the summed data and related data.
  • the device may further include a data output unit that arranges the structured data according to the preset standardized item rule, images the unstructured data, and combines the arranged structured data and the imaged unstructured data for output to a screen.
  • the disclosed technology may have the following effects. However, this does not mean that a specific embodiment should include all of the following effects or only the following effects, so the scope of the disclosed technology should not be understood as being limited thereby.
  • Embodiments of the present invention can classify data related to energy transaction into formalized data and unstructured data, combine them, and manage as integrated data.
  • this heterogeneous data standardization method makes the embodiments of the present invention advantageous for big data processing and, in addition, can increase system response speed by making frequently occurring data easy to access.
  • FIG. 1 is a block diagram of an apparatus for shaping heterogeneous data according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a data classification method in a heterogeneous data shaping method according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of an additional data classification method in the heterogeneous data shaping method according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of a method of outputting data in the heterogeneous data shaping method according to an embodiment of the present invention.
  • FIG. 1 is a block diagram of an apparatus for shaping heterogeneous data according to an embodiment of the present invention.
  • the heterogeneous data shaping apparatus 100 includes a data input unit 110 , a data classification unit 120 , an unstructured data management unit 130 , and a data output unit 140 .
  • However, not all of the illustrated components are essential components.
  • the heterogeneous data shaping apparatus 100 may be implemented by more elements than the illustrated elements, or the heterogeneous data shaping apparatus 100 may be implemented by fewer elements than the illustrated elements.
  • the data input unit 110 receives data.
  • the data classification unit 120 classifies structured data in the data input from the data input unit 110 according to a preset standardized item rule, and classifies unstructured data in the input data according to a preset basic item classification system.
  • the unstructured data management unit 130 attaches the unstructured data to the items of the unstructured data classified according to the basic item classification system in the data classification unit 120 and reprocesses the unstructured data into standardized data.
  • the data output unit 140 outputs data classified by the data classification unit 120 or data managed by the unstructured data management unit 130 .
  • the data output unit 140 combines the structured data and the unstructured data and outputs them on the screen.
  • the data classification unit 120 may classify the unstructured data according to a basic item classification system extracted through at least one of the Internet, a pre-stored database, and a comparison process between unstructured data.
  • the data classification unit 120 may classify unstructured data that is not classified according to a preset basic item classification system into other preset items.
  • the data classification unit 120 may, when summed data obtained by adding the additional data input through the data input unit 110 to the unstructured data classified into the other items can be classified according to the basic item classification system, generate a new item and classify the summed data into standardized data through a process of re-mapping the summed data and related data.
  • the data output unit 140 may arrange the structured data according to a preset standardized item rule, image the unstructured data, combine the arranged structured data and the imaged unstructured data, and output the result to the screen.
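
The division of work among units 110 to 140 described above can be sketched roughly as follows. This is only an illustrative Python sketch: the class name `HeterogeneousDataShaper`, the callable signatures, and the sample records are assumptions made for the example and are not taken from the patent, which does not specify an implementation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class HeterogeneousDataShaper:
    """Hypothetical wiring of units 110-140; each unit is an injected callable."""

    read_input: Callable[[], Iterable[object]]            # data input unit 110
    classify: Callable[[object], Tuple[str, str]]         # data classification unit 120
    manage_unstructured: Callable[[str, object], object]  # unstructured data management unit 130
    output: Callable[[list, list], None]                  # data output unit 140

    def run(self):
        structured, managed = [], []
        for record in self.read_input():
            kind, item = self.classify(record)
            if kind == "structured":
                structured.append(record)
            else:
                # Attach the unstructured record to the item it was classified under.
                managed.append(self.manage_unstructured(item, record))
        self.output(structured, managed)
        return structured, managed


# Assumed toy behaviour: dicts count as structured, everything else as a "name" item.
shaper = HeterogeneousDataShaper(
    read_input=lambda: [{"kwh": 1.0}, "Name: Kim"],
    classify=lambda r: ("structured", "") if isinstance(r, dict) else ("unstructured", "name"),
    manage_unstructured=lambda item, r: {"item": item, "attachment": r},
    output=lambda s, m: print(s, m),
)
structured, managed = shaper.run()
```

Passing each unit in as a callable keeps the sketch neutral about how classification or output is actually performed, which the description leaves open.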
  • FIG. 2 is a flowchart of a data classification method in a heterogeneous data shaping method according to an embodiment of the present invention.
  • In step S101, the heterogeneous data shaping apparatus 100 receives heterogeneous data for standardization.
  • In step S102, the heterogeneous data shaping apparatus 100 classifies the input heterogeneous data into structured data that can be standardized and unstructured data that cannot be standardized. That is, the heterogeneous data shaping apparatus 100 checks whether the input heterogeneous data is structured data or unstructured data.
  • In step S103, the heterogeneous data shaping apparatus 100 separates and stores the standardized data according to the corresponding item rule.
  • the heterogeneous data shaping apparatus 100 separately manages unstructured data among heterogeneous data.
  • the heterogeneous data shaping apparatus 100 classifies the unstructured data among the heterogeneous data according to a basic item classification system (e.g., name, address, etc.) obtained through at least one of the Internet, a pre-stored database, and a comparison between the unstructured data.
  • the heterogeneous data shaping apparatus 100 may reprocess the data classified under an item into a single piece of data by attaching the unstructured data to that item.
  • the heterogeneous data shaping apparatus 100 may initially assign data that cannot be classified to the preset other items.
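
The flow of FIG. 2 can be illustrated with a minimal Python sketch. The field names in `STANDARD_ITEM_RULE`, the keywords in `BASIC_ITEMS`, and the keyword-matching strategy are all hypothetical; the source only says that a preset rule and a basic item classification system (e.g., name, address) are used, not how matching is done.

```python
from dataclasses import dataclass, field

# Hypothetical standardized item rule: field name -> expected Python type.
STANDARD_ITEM_RULE = {"meter_id": str, "energy_type": str, "kwh": float}

# Hypothetical basic item classification system: item -> trigger keywords.
BASIC_ITEMS = {"name": ("name:",), "address": ("address:", "street")}

OTHER = "other"  # preset catch-all item for data that cannot be classified


@dataclass
class Store:
    structured: list = field(default_factory=list)
    unstructured: dict = field(default_factory=dict)  # item -> attached texts


def is_structured(record) -> bool:
    """S102: treat a record as structured if it matches the item rule exactly."""
    return (
        isinstance(record, dict)
        and record.keys() == STANDARD_ITEM_RULE.keys()
        and all(isinstance(record[k], t) for k, t in STANDARD_ITEM_RULE.items())
    )


def classify_item(text: str) -> str:
    """Return the first basic item whose keyword occurs in the text, else OTHER."""
    lowered = text.lower()
    for item, keywords in BASIC_ITEMS.items():
        if any(k in lowered for k in keywords):
            return item
    return OTHER


def classify(records, store: Store) -> Store:
    """S101-S103: split the input, store structured rows, attach the rest to items."""
    for record in records:
        if is_structured(record):
            store.structured.append(record)
        else:
            text = str(record)
            store.unstructured.setdefault(classify_item(text), []).append(text)
    return store


store = classify(
    [
        {"meter_id": "M-1", "energy_type": "electricity", "kwh": 12.5},
        "Name: Hong Gildong",
        "Address: 12 Example Street",
        "call back after 3pm",
    ],
    Store(),
)
print(store.unstructured)
```

In this sketch the last input matches no basic item, so it lands under the catch-all `other` item, mirroring the handling of unclassifiable data described above.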
  • FIG. 3 is a flowchart of an additional data classification method in the heterogeneous data shaping method according to an embodiment of the present invention.
  • When data of the corresponding type later accumulates and becomes classifiable, the heterogeneous data shaping apparatus 100 may classify it through a process of creating a new item and re-mapping the related data. This will be described with reference to FIG. 3 .
  • In step S201, the heterogeneous data shaping apparatus 100 according to an embodiment of the present invention receives additional data. It is assumed that data has already been received and classified into structured data and unstructured data as in FIG. 2 , and that the unstructured data is being managed.
  • In step S202, the heterogeneous data shaping apparatus 100 compares the received additional data with the unstructured data managed as described for FIG. 2 .
  • In step S203, the heterogeneous data shaping apparatus 100 checks whether the received additional data is similar to the unstructured data.
  • In step S204, if the input additional data is similar to the unstructured data, the heterogeneous data shaping apparatus 100 creates a new item for the data classified as similar.
  • In step S205, the heterogeneous data shaping apparatus 100 classifies and stores the data related to the newly created item as standardized data through a process of re-mapping.
  • In step S206, if the input additional data is not similar to the unstructured data, the heterogeneous data shaping apparatus 100 classifies and stores it as unstructured data. The heterogeneous data shaping apparatus 100 then repeats the procedure from step S201 of receiving additional data.
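
Steps S201 to S206 can be sketched as follows. The source does not say how similarity is measured or how a new item is named, so this sketch assumes a string-similarity ratio from Python's standard `difflib` module, an arbitrary threshold of 0.6, and a caller-supplied item name; all three are assumptions for illustration only.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.6  # assumed cut-off; the source gives no value


def similar(a: str, b: str) -> bool:
    """S203 stand-in: compare two texts by difflib's similarity ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= SIMILARITY_THRESHOLD


def add_data(text, other, items, new_item_name):
    """S201-S206: compare additional data with the texts held under 'other'.

    Returns the new item name if one was created, otherwise None.
    """
    matches = [t for t in other if similar(text, t)]  # S202, S203
    if not matches:
        other.append(text)  # S206: keep as unstructured data
        return None
    items[new_item_name] = matches + [text]  # S204: create the new item
    other[:] = [t for t in other if t not in matches]  # S205: re-map related data
    return new_item_name


other = ["outage reported at substation 4", "invoice #5521 pending"]
items = {}
created = add_data("outage reported at substation 7", other, items, "outage_report")
skipped = add_data("hello", other, items, "unused_name")
print(created, items, other)
```

Here the second outage message is nearly identical to one already held under `other`, so both are moved into a new item, while the unrelated text simply accumulates until similar data arrives.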
  • FIG. 4 is a flowchart of a method of outputting data in the heterogeneous data shaping method according to an embodiment of the present invention.
  • the heterogeneous data shaping apparatus 100 displays the standardized data according to the corresponding item on the screen, and the unclassified unstructured data is imaged and displayed as other items. This will be described with reference to FIG. 4 .
  • In step S301, the heterogeneous data shaping apparatus 100 according to an embodiment of the present invention receives structured data.
  • In step S302, the heterogeneous data shaping apparatus 100 receives unstructured data.
  • In step S303, the heterogeneous data shaping apparatus 100 generates imaged unstructured data by imaging the received unstructured data.
  • In step S304, the heterogeneous data shaping apparatus 100 combines the received structured data and the imaged unstructured data.
  • In step S305, the heterogeneous data shaping apparatus 100 displays the combined structured and unstructured data on the screen.
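
The output flow of FIG. 4 might look like the following sketch. The source does not define what "imaging" the unstructured data involves, so the code substitutes a simple stand-in: each character becomes one pixel of a plain-text PGM (P2) grayscale image. The field order in `FIELD_ORDER` and the shape of the returned screen dictionary are likewise hypothetical.

```python
FIELD_ORDER = ("meter_id", "energy_type", "kwh")  # hypothetical item-rule order


def arrange(record: dict) -> str:
    """Lay out one structured record in the item-rule field order."""
    return " | ".join(f"{name}={record[name]}" for name in FIELD_ORDER)


def image_text(text: str, width: int = 8) -> str:
    """S303 stand-in: encode text as a plain-text PGM (P2) grayscale image."""
    pixels = [min(ord(ch), 255) for ch in text]
    height = max(1, -(-len(pixels) // width))  # ceiling division, at least one row
    pixels += [0] * (width * height - len(pixels))  # pad the last row with black
    rows = [
        " ".join(str(p) for p in pixels[r * width:(r + 1) * width])
        for r in range(height)
    ]
    return "\n".join(["P2", f"{width} {height}", "255", *rows])


def compose_screen(structured, unstructured):
    """S304: combine arranged structured rows with imaged unstructured entries."""
    return {
        "rows": [arrange(r) for r in structured],
        "other_images": [image_text(t) for t in unstructured],
    }


screen = compose_screen(
    [{"kwh": 12.5, "meter_id": "M-1", "energy_type": "electricity"}],
    ["hi"],
)
print(screen["rows"][0])          # S305: what would be shown on screen
print(screen["other_images"][0])
```

Keeping the arranged rows and the images in separate lists leaves the actual screen layout to whatever display layer consumes the result.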
  • a non-transitory computer-readable storage medium for storing instructions that, when executed by a processor, cause the processor to execute a method, the method comprising: classifying structured data in input data according to a preset standardized item rule; classifying the unstructured data in the input data according to a preset basic item classification system; and attaching unstructured data to items of unstructured data classified according to the basic item classification system and reprocessing them into structured data.
  • the various embodiments described above may be implemented as software including instructions stored in a storage medium readable by a machine (e.g., a computer).
  • the device is a device capable of calling a stored instruction from a storage medium and operating according to the called instruction, and may include an electronic device (e.g., the electronic device A) according to the disclosed embodiments.
  • When the instruction is executed by the processor, the processor may perform the function corresponding to the instruction either directly or by using other components under the control of the processor.
  • Instructions may include code generated or executed by a compiler or interpreter.
  • the device-readable storage medium may be provided in the form of a non-transitory storage medium.
  • 'non-transitory' means that the storage medium is tangible and does not include a signal; it does not distinguish whether data is stored in the storage medium semi-permanently or temporarily.
  • the methods according to the various embodiments described above may be provided by being included in a computer program product.
  • Computer program products may be traded between sellers and buyers as commodities.
  • the computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)) or online through an application store (e.g., Play Store™).
  • at least a portion of the computer program product may be temporarily stored or temporarily generated in a storage medium such as a memory of a server of a manufacturer, a server of an application store, or a relay server.
  • the various embodiments described above may be implemented in a recording medium readable by a computer or a similar device, using software, hardware, or a combination thereof. In some cases, the embodiments described herein may be implemented by the processor itself. According to the software implementation, embodiments such as the procedures and functions described in this specification may be implemented as separate software modules. Each of the software modules may perform one or more functions and operations described herein.
  • a non-transitory computer-readable medium refers to a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only for a short moment, such as a register, cache, or memory.
  • Specific examples of the non-transitory computer-readable medium may include a CD, DVD, hard disk, Blu-ray disk, USB, memory card, ROM, and the like.
  • each of the components may be composed of a single entity or a plurality of entities, and in various embodiments some of the above-described sub-components may be omitted, or other sub-components may be further included.
  • some components eg, a module or a program
  • operations performed by a module, program, or other component may be executed sequentially, in parallel, repetitively, or heuristically, or at least some operations may be executed in a different order or omitted, or other operations may be added.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method and apparatus for structuring different types of data. The method for structuring different types of data according to an embodiment of the present invention comprises the steps of: classifying structured data from input data according to a preset structured item rule; classifying unstructured data from the input data according to a preset default item classification system; and attaching the unstructured data to items of the unstructured data that are classified according to the default item classification system and reprocessing them into structured data.
PCT/KR2020/019348 2020-12-30 2020-12-31 Method and apparatus for structuring different types of data WO2022145524A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020200187873A KR20220095893A (ko) 2020-12-30 2020-12-30 Heterogeneous data structuring method and apparatus
KR10-2020-0187873 2020-12-30

Publications (1)

Publication Number Publication Date
WO2022145524A1 (fr) 2022-07-07

Family

ID=82260862

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2020/019348 WO2022145524A1 (fr) 2020-12-30 2020-12-31 Method and apparatus for structuring different types of data

Country Status (2)

Country Link
KR (1) KR20220095893A (fr)
WO (1) WO2022145524A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102549640B1 (ko) * 2022-09-30 2023-06-30 국민건강보험공단 Method, apparatus and system for identifying specific non-covered items based on keywords

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080263029A1 (en) * 2007-04-18 2008-10-23 Aumni Data, Inc. Adaptive archive data management
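KR101637504B1 (ko) * 2015-01-16 2016-07-07 주식회사 솔트룩스 System and method for processing unstructured data
KR20180079222A (ko) * 2016-12-31 2018-07-10 이재규 Structured information providing system for structuring unstructured information, and method of using and applying the system
KR20190063978A (ko) * 2017-11-30 2019-06-10 굿모니터링 주식회사 Method for automatically classifying categories of unstructured data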

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KIM, SUNGHYUN ET AL.: "Consumer Trend Platform Development for Combination Analysis of Structured and Unstructured Big Data", JOURNAL OF DIGITAL CONVERGENCE, vol. 15, no. 6, 28 June 2017 (2017-06-28), pages 133-143, XP009537846 *

Also Published As

Publication number Publication date
KR20220095893A (ko) 2022-07-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20968073

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20968073

Country of ref document: EP

Kind code of ref document: A1