KR101024494B1 - Extraction method of modified data using meta data - Google Patents

Publication number
KR101024494B1
Authority
KR
South Korea
Prior art keywords
data
information
metadata
change
extraction
Prior art date
Application number
KR1020080070153A
Other languages
Korean (ko)
Other versions
KR20100009314A (en)
Inventor
김대환
이병호
Original Assignee
(주)디에프아이비즈
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by (주)디에프아이비즈
Priority to KR1020080070153A
Publication of KR20100009314A
Application granted
Publication of KR101024494B1

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method for extracting, from a source system where data is generated, only the data that a target system needs. When the extraction target table to be applied to the target system is defined, metadata that explicitly defines each attribute is registered. Change transactions and change information are then extracted from the database of the source system, the transaction data is compared with the metadata in effect when the transaction occurred and regenerated into the form valid at the time of extraction, and the regenerated data is reprocessed into a form applicable to the target system. The method thereby minimizes the load that data retrieval or extraction places on the source system and provides change data extraction using metadata that is not affected by schema changes.

Metadata, Data Extraction

Description

Extraction method of modified data using meta data

The present invention relates to a change data extraction method using metadata. When an extraction target table to be applied to a target system is defined on the source system where the data is generated, each attribute is explicitly defined and registered as metadata. The change transactions and the change information of the source table are then retrieved from the database of the source system, the transaction data is compared with the metadata in effect when the transaction occurred and regenerated into data as of the time of extraction, and the regenerated data is reprocessed into a form applicable to the target system. This minimizes the load that extraction places on the source system and keeps the extraction unaffected by schema changes and the like.

In general, two methods are used to extract change data from the database of a source system. The time stamp method adds to the table (schema) a column that records the time at which each transaction was processed, where a transaction is the basic unit of work that completes a requested operation while the integrity of the database is guaranteed; changed records are recognized from that column and extracted. The transaction log method extracts a series of transactions from the transaction log of the database.

When change data is extracted by the time stamp method, only the last version of each record reflected in the current table of the source system can be extracted. If a record has been deleted and no longer exists, its change history cannot be managed.

In addition, extracting data from each unit table stored in the source system database by the time stamp method places an excessive load on the relational database management system (hereinafter, RDBMS), which causes problems such as delays in regular processing.

The method of extracting a series of transactions from the transaction log of the database, by contrast, captures every logged change as of the time of the transaction, including delete transactions for records that no longer exist in the source system database. Because the data is read from the transaction log, change data can be extracted with minimal load on the source system database even while work is in progress on the source system.

However, when data is extracted from the transaction log as described above, Data Definition Language (DDL) statements, which define data and the relationships between data, may change the structure of a table. The structure of the extracted data then differs from the structure of the data stored in the source system database, and the target system that consumes the extracted data cannot interpret it.

Accordingly, an object of the present invention is to solve the problems of the conventional data extraction methods described above by stably extracting change data from a source system in a structured form and applying it to a target system.

Another object is to enable efficient linkage between the source system and the target system by stably extracting the transactions of the backbone systems and providing them to data-consuming target systems, such as an enterprise data warehouse (DW; a data system that consolidates the data collected from each business division of a company) or a customer relationship management system (CRM; the methodology or software needed to manage customer relationships).

A further object is to minimize the load on the source system, and to remain unaffected by schema changes and the like, by using metadata to extract from the source system database the change data applicable to the target system.

In order to achieve the above objects, the present invention provides a change data extraction method using metadata, comprising: a metadata registration step of acquiring and registering metadata that explicitly defines each attribute of an extraction target table stored in the database of a source system; a change information extraction step of extracting change transaction information for the extraction target table from the database of the source system; a data regeneration step of regenerating the change information into data as of the time of extraction by comparing the metadata in effect when the extracted change transaction occurred with the registered metadata; and a step of reconstructing the regenerated data into the form requested by the target system.

As described above, the present invention extracts change transactions and change information from the database of a source system, compares them with the metadata, and extracts the change data to be applied to the target system, so that the change data can be applied to the target system stably.

In addition, using the metadata as described above minimizes the load placed on the source system while change data is extracted and keeps the extraction unaffected by schema changes in the source system.

In addition, because the data of the source system is converted on extraction into a form applicable to the target system, the source system and the target system can be linked efficiently.

The present invention relates to a change data extraction method using metadata that extracts data from the database of a source system according to explicitly defined attributes and reshapes it for the processes of a target system. The change data must be extracted stably, must contain only valid transactions, and must be provided in a form that the target system can use.

The transaction log used in the present invention to extract change data is generated by the database for recovery when a problem occurs. The present invention uses this log to extract change data in a structured form that the processes of the target system can use.

However, the transaction log is stored for DBMS recovery. It records the signal information of the DBMS together with DML (Data Manipulation Language; the set of commands used to search, store, modify, and delete data in the database) and DDL statements, which are stored with dependencies on one another. When a DDL statement changes the structure of a table in the DBMS, every later DML statement on that table is affected, and because the structure of the source data has changed by the time the transaction log is extracted, linking with other systems is difficult.

In addition, because the database maintains only the current catalog, which it needs to guarantee the validity of transactions at each point in time, there was a limit to providing change data from a past point in time in a standardized form.

Accordingly, the present invention was devised to overcome the problems of using the transaction log and the limits on providing past change data in a standardized form, and to provide change data in a form that can be delivered stably to linked systems.

In addition, when extracting the transaction log, the present invention delivers the changed data to the process of the target system at the point where a commit (the actual update of records) or a rollback (the DBMS function that invalidates the current transaction and returns the data to its original state) occurs. As a result, not only the information currently valid in the database but also information that has since disappeared can be traced and extracted when the target system requests data.

Hereinafter, the present invention will be described in more detail with reference to the accompanying drawing.

FIG. 1 shows, in order, the change data extraction method using metadata according to the present invention. To extract change data from the individual databases of a source system, an extraction target table is first defined, and metadata (250) that explicitly defines each attribute of that table is registered. The schema information of the extraction target table is acquired and registered in the metadata in snapshot form (a stored copy of the state at a given moment), as of the time of data extraction.
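The snapshot registration described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes SQLite as a stand-in for the source DBMS (the patent names no product), and the `customer` table, the `register_schema_snapshot` function, and the dictionary layout are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def register_schema_snapshot(conn, table_name):
    """Capture the column definitions of one table as a snapshot dict."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    rows = conn.execute(f"PRAGMA table_info({table_name})").fetchall()
    if not rows:
        raise ValueError(f"table not found: {table_name}")
    return {
        "table": table_name,
        # Time at which the snapshot was taken (assumed to be the extraction time).
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "columns": [
            {"order": cid, "name": name, "type": col_type}
            for cid, name, col_type, _notnull, _default, _pk in rows
        ],
    }


# Hypothetical extraction target table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
snapshot = register_schema_snapshot(conn, "customer")
column_names = [c["name"] for c in snapshot["columns"]]
```

Storing the snapshot separately from the live catalog is what lets later steps interpret old log records even after the table has been altered.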

The metadata manages domain information, target table information, target column information, table change history information, extraction execution information, and the like. The domain information is a table that represents a target DBMS and a unit of extraction target tables; it manages the execution results of the target domain and the logical SCN (System Change Number) of the processed transactions.

The target table information registers the object number, tablespace, table name, record storage length, and character set information used in each unit database for the target table.

The target column information manages the columns to be extracted from the extraction target table; the type, length, scale, position within the table, and creation date of each column are looked up in the database catalog and registered. The table change history information keeps a history of the changes made to the target table by DDL.

The extraction execution information is a table that manages the history of extraction runs for each domain; it manages the start time, end time, SCN, CSCN (confirmed SCN), and temporary storage location of each unit of work.
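The five kinds of metadata listed above could be modelled roughly as below. The field lists follow the description, but every class name, field name, and sample value is an assumption made for illustration; the patent does not specify a storage layout.

```python
from dataclasses import dataclass, field


@dataclass
class TargetColumn:
    name: str
    data_type: str
    length: int
    scale: int
    order: int    # position of the column within the table
    created: str  # creation date looked up in the catalog


@dataclass
class TargetTable:
    object_number: int
    tablespace: str
    table_name: str
    record_length: int
    character_set: str
    columns: list = field(default_factory=list)  # list of TargetColumn


@dataclass
class TableChange:
    table_name: str
    scn: int  # SCN at which the DDL took effect
    ddl: str


@dataclass
class ExtractionRun:
    start_time: str
    end_time: str
    scn: int
    cscn: int           # confirmed SCN
    temp_location: str  # where the unit of work was staged


@dataclass
class Domain:
    name: str
    dbms: str
    last_scn: int = 0  # logical SCN of the last processed transaction
    tables: list = field(default_factory=list)   # list of TargetTable
    changes: list = field(default_factory=list)  # list of TableChange
    runs: list = field(default_factory=list)     # list of ExtractionRun


# Hypothetical registration of one target table with one column.
domain = Domain(name="crm_feed", dbms="source-db")
address = TargetColumn("address", "VARCHAR", 200, 0, 3, "2008-01-01")
domain.tables.append(
    TargetTable(1001, "USERS", "customer", 256, "UTF-8", columns=[address])
)
```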

Once the metadata has been registered from the source system as described above, the transactions and the DBMS catalog information for the extraction target table are extracted from the log file of the source system database (100) and stored temporarily. A temporary storage area (150) is provided for this purpose.

Next, the validity of the data is determined from the commit or rollback of the temporarily stored transactions and DBMS catalog information (200), and changes to the extraction target table are detected by comparison with the target table and column information in the previously registered metadata (300).
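The commit/rollback validity check can be sketched as follows. This is a simplified illustration under stated assumptions: the log record layout (`scn`, `tx`, `op`, `row`) is hypothetical, and treating the SCN of the COMMIT record as the confirmed SCN is an interpretation rather than something the description states.

```python
from collections import defaultdict


def committed_changes(log_records):
    """Yield (commit_scn, change) only for changes whose transaction committed."""
    pending = defaultdict(list)  # staged changes per open transaction
    for rec in log_records:
        tx = rec["tx"]
        if rec["op"] == "COMMIT":
            # The transaction is valid: release everything it staged.
            for change in pending.pop(tx, []):
                yield rec["scn"], change
        elif rec["op"] == "ROLLBACK":
            # The transaction was invalidated: discard its staged changes.
            pending.pop(tx, None)
        else:
            pending[tx].append(rec)


# Hypothetical log: transaction B is rolled back, transaction A commits.
log = [
    {"scn": 1, "tx": "A", "op": "UPDATE", "row": {"customer_no": 233211, "address": "Daechi-dong"}},
    {"scn": 2, "tx": "B", "op": "UPDATE", "row": {"customer_no": 201945, "address": "Guro-dong"}},
    {"scn": 3, "tx": "B", "op": "ROLLBACK"},
    {"scn": 4, "tx": "A", "op": "COMMIT"},
]
valid = list(committed_changes(log))
```

Changes belonging to transactions that never reach a COMMIT or ROLLBACK record stay in `pending` and are not emitted, which matches the idea of staging extracted records in a temporary area until their validity is known.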

Once the comparison of the metadata with the detected information has established whether the extraction target table has changed, the binary information is restored to its original form using the target table and column information and the table change history information in the metadata (400).

Next, time-series characteristics based on the SCN or CSCN are assigned to the restored information, and it is regenerated and provided as data in the form to be applied to the target system (400). At this point, the extracted transaction data is processed into the data form as of metadata registration by combining the metadata with the schema change information in a streaming format (450).
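One way to picture the regeneration step is to look up the table layout that was in effect at a record's SCN and then project the record onto the layout fixed at metadata registration. The sketch below assumes this interpretation; the schema history, the added `phone` column, and all function names are hypothetical, and real log records would be binary rather than Python lists.

```python
import bisect

# Hypothetical table change history: (SCN at which a DDL took effect, column layout after it).
SCHEMA_HISTORY = [
    (0, ["customer_no", "name", "address"]),
    (4, ["customer_no", "name", "address", "phone"]),  # a column was added at SCN 4
]
# Layout registered in the metadata; the target system expects exactly these columns.
REGISTERED_COLUMNS = ["customer_no", "name", "address"]


def columns_at(scn):
    """Return the column layout in effect when a transaction with this SCN occurred."""
    scns = [s for s, _ in SCHEMA_HISTORY]
    idx = bisect.bisect_right(scns, scn) - 1
    if idx < 0:
        raise ValueError(f"no schema known for SCN {scn}")
    return SCHEMA_HISTORY[idx][1]


def regenerate(scn, raw_values):
    """Name positional values using the layout at `scn`, then project to the registered form."""
    named = dict(zip(columns_at(scn), raw_values))
    return {"scn": scn, **{col: named.get(col) for col in REGISTERED_COLUMNS}}


before_ddl = regenerate(2, [201945, "Daehwan Kim", "Daelim 2-dong"])
after_ddl = regenerate(5, [321232, "Hyoshin Wang", "Wansan-gu", "000-0000"])
```

Both records come out with the same keys, so the target system sees a stable structure regardless of when the DDL ran, and keeping the SCN on each record preserves the time-series ordering.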

As described above, the present invention extracts the desired transaction and catalog information from the transaction log file of the source database and, using the metadata obtained from the extraction target table, converts it into data applicable to the processes of the target system. Because change data is extracted stably in this way, the extracted data can be put to use in the target system.

In one embodiment, customer information in the financial sector is managed as master data, with one record per unique key. When a CRM system is built, the customer's address information and the like is important for target marketing, so its history must be managed. Because the customer master ledger holds only current information, the history cannot be obtained by the conventional time stamp method and must be extracted from the transaction log.

If the customer information is stored in the source system database as shown in Table 1 and the address information changed in 2008 is extracted based on the last update date in the source system, a total of three records are extracted.

Table 1

Customer number | Name         | Resident registration number | Address                                            | Last update date
201945          | Daehwan Kim  | 710606-XXXXXXX               | Daelim 2-dong, Yeongdeungpo, Seoul                 | 2008/01/20
645028          | Kang Soo Kim | 480615-XXXXXXX               | Pyeonghwa-dong, Wansan-gu, Jeonju-si, Jeollabuk-do | 2007/12/08
321232          | Hyoshin Wang | 460701-XXXXXXX               | Wansan-gu, Jeollabuk-do                            | 2008/05/12
233211          | Sehwan Park  | 740605-XXXXXXX               | 5, Sin-gil, Yeongdeungpo-gu, Seoul                 | 2008/06/05

In this case, if an address was changed more than once in 2008, the earlier values cannot be retrieved. The changes actually made to the customer address information are shown in Table 2.

Table 2

Customer number | Name         | Resident registration number | Address                                            | Last update date
201945          | Daehwan Kim  | 710606-XXXXXXX               | Daelim 2-dong, Yeongdeungpo, Seoul                 | 2008/01/20
645028          | Kang Soo Kim | 480615-XXXXXXX               | Pyeonghwa-dong, Wansan-gu, Jeonju-si, Jeollabuk-do | 2007/12/08
321232          | Hyoshin Wang | 460701-XXXXXXX               | Manan-gu, Anyang-si, Gyeonggi-do                   | 2008/02/02
321232          | Hyoshin Wang | 460701-XXXXXXX               | Wansan-gu, Jeollabuk-do                            | 2008/05/12
233211          | Sehwan Park  | 740605-XXXXXXX               | Daechi-dong, Gangnam-gu, Seoul                     | 2008/01/03
233211          | Sehwan Park  | 740605-XXXXXXX               | Guro-dong, Guro-gu, Seoul                          | 2008/03/02
233211          | Sehwan Park  | 740605-XXXXXXX               | 5, Sin-gil, Yeongdeungpo-gu, Seoul                 | 2008/06/05

In other words, the address was actually changed once for customer 201945, twice for customer 321232, and three times for customer 233211, but the three address records other than the final ones cannot be detected by the CRM system. If the data is extracted using the transaction log, the following information is obtained.

SCN | Customer number | Name         | Resident registration number | Address                            | Last update date
1   | 233211          | Sehwan Park  | 740605-XXXXXXX               | Daechi-dong, Gangnam-gu, Seoul     | 2008/01/03
2   | 201945          | Daehwan Kim  | 710606-XXXXXXX               | Daelim 2-dong, Yeongdeungpo, Seoul | 2008/01/20
3   | 321232          | Hyoshin Wang | 460701-XXXXXXX               | Manan-gu, Anyang-si, Gyeonggi-do   | 2008/02/02
4   | 233211          | Sehwan Park  | 740605-XXXXXXX               | Guro-dong, Guro-gu, Seoul          | 2008/03/02
5   | 321232          | Hyoshin Wang | 460701-XXXXXXX               | Wansan-gu, Jeollabuk-do            | 2008/05/12
6   | 233211          | Sehwan Park  | 740605-XXXXXXX               | 5, Sin-gil, Yeongdeungpo-gu, Seoul | 2008/06/05
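The difference between the two extraction methods in this embodiment can be reproduced with a few lines of code. The sketch below is illustrative only: the log rows are taken from the embodiment with addresses shortened and dates written as YYYY/MM/DD, and the function names and tuple layout are assumptions.

```python
# Change log in SCN order: (SCN, customer number, address, last update date).
LOG = [
    (1, 233211, "Daechi-dong", "2008/01/03"),
    (2, 201945, "Daelim 2-dong", "2008/01/20"),
    (3, 321232, "Manan-gu", "2008/02/02"),
    (4, 233211, "Guro-dong", "2008/03/02"),
    (5, 321232, "Wansan-gu", "2008/05/12"),
    (6, 233211, "Sin-gil", "2008/06/05"),
]
# Customer whose record was not changed in 2008.
UNCHANGED = {645028: ("Pyeonghwa-dong", "2007/12/08")}


def current_table(log, base):
    """Replay the log so that only the latest version of each record remains."""
    table = dict(base)
    for _scn, customer, address, updated in log:
        table[customer] = (address, updated)
    return table


def timestamp_extract(table, year):
    """Time stamp method: select current rows by their last update date."""
    return {c: row for c, row in table.items() if row[1].startswith(year)}


def log_extract(log, year):
    """Transaction log method: select every logged change in the period."""
    return [rec for rec in log if rec[3].startswith(year)]


by_timestamp = timestamp_extract(current_table(LOG, UNCHANGED), "2008")
by_log = log_extract(LOG, "2008")
missed = len(by_log) - len(by_timestamp)
```

With this data the time stamp method returns three rows and the log-based method returns six, so `missed` equals the three intermediate addresses that the CRM system could not otherwise see.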

When this information is supplied to the CRM system, it can be analyzed together with service usage information according to each customer's address in each month.

The extracted information is compared with the items registered in the metadata and reprocessed so that it can be used by the processes of the target CRM system before it is provided.

While the present invention has been described with reference to the embodiment shown in the drawing, this is merely illustrative, and those skilled in the art to which the present invention pertains will appreciate that various modifications and equivalent embodiments are possible from the detailed description. Accordingly, the true scope of the present invention should be determined by the technical idea of the claims.

FIG. 1 is a view sequentially showing the change data extraction method using metadata according to the present invention.

Claims (6)

1. A change data extraction method using metadata, comprising: a metadata registration step of acquiring and registering metadata that explicitly defines each attribute of an extraction target table stored in the database of a source system; a change information extraction step of extracting change transaction information for the extraction target table from the database of the source system; a data regeneration step of regenerating the change information into data as of the time of extraction by comparing the metadata in effect when the extracted change transaction occurred with the registered metadata; and a step of reconstructing the regenerated data into the form requested by a target system.

2. The method of claim 1, wherein, in the metadata registration step, the schema information of the extraction target table is acquired and registered in snapshot form at the time of data extraction.

3. The method of claim 1, wherein domain information, target table information, target column information, table change history information, and extraction execution information are registered and managed in the metadata.

4. The method of claim 1, wherein the information extracted in the change information extraction step is the transaction and DBMS catalog information of the extraction target table extracted from the log file of the database.

5. The method of claim 1, wherein the data regeneration step converts the extracted change transaction information according to a commit or rollback signal.

6. The method of claim 1, wherein the data regeneration step extracts the change transaction information in the form of data as of metadata registration by combining the metadata and the schema change information in a streaming format.
Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020080070153A KR101024494B1 (en) 2008-07-18 2008-07-18 Extraction method of modified data using meta data

Publications (2)

Publication Number Publication Date
KR20100009314A KR20100009314A (en) 2010-01-27
KR101024494B1 true KR101024494B1 (en) 2011-03-31

Family

ID=41817756

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020080070153A KR101024494B1 (en) 2008-07-18 2008-07-18 Extraction method of modified data using meta data

Country Status (1)

Country Link
KR (1) KR101024494B1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120135782A (en) * 2011-06-07 2012-12-17 한국과학기술정보연구원 Method for transferring meta-data and apparatus thereof
JP6050917B2 (en) * 2013-07-09 2016-12-21 デルフィクス コーポレーション Virtual database rewind

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20040088397A (en) * 2003-04-01 2004-10-16 마이크로소프트 코포레이션 Transactionally consistent change tracking for databases
JP2004532480A (en) * 2001-05-24 2004-10-21 オラクル・インターナショナル・コーポレイション Synchronous change data capture in a relational database
JP2005524909A (en) * 2002-05-09 2005-08-18 オラクル・インターナショナル・コーポレイション Method and apparatus for change data collection in a database system
JP2007179347A (en) * 2005-12-28 2007-07-12 Exa Corp Program verification support system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014051348A2 (en) * 2012-09-28 2014-04-03 삼성에스디에스 주식회사 Apparatus and method for transforming data object
WO2014051348A3 (en) * 2012-09-28 2014-05-22 삼성에스디에스 주식회사 Apparatus and method for transforming data object

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20140317

Year of fee payment: 4

FPAY Annual fee payment

Payment date: 20160317

Year of fee payment: 6

FPAY Annual fee payment

Payment date: 20170317

Year of fee payment: 7

FPAY Annual fee payment

Payment date: 20190318

Year of fee payment: 9