CN107291926B - Binlog analysis method - Google Patents

Binlog analysis method Download PDF

Info

Publication number
CN107291926B
CN107291926B CN201710516504.2A CN201710516504A CN107291926B CN 107291926 B CN107291926 B CN 107291926B CN 201710516504 A CN201710516504 A CN 201710516504A CN 107291926 B CN107291926 B CN 107291926B
Authority
CN
China
Prior art keywords
data
binlog
program
odb
sdb
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710516504.2A
Other languages
Chinese (zh)
Other versions
CN107291926A (en
Inventor
尹参
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Souyidai Beijing Finance Information Service Co ltd
Original Assignee
Souyidai Beijing Finance Information Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Souyidai Beijing Finance Information Service Co ltd filed Critical Souyidai Beijing Finance Information Service Co ltd
Priority to CN201710516504.2A priority Critical patent/CN107291926B/en
Publication of CN107291926A publication Critical patent/CN107291926A/en
Application granted granted Critical
Publication of CN107291926B publication Critical patent/CN107291926B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a binlog analysis method, which comprises the following steps: step one, creating an SDB and an ODB, and realizing the separation of structure and data change; step two, reading and executing configuration for the synchronous program; running a program, synchronizing data and changing the data; and step four, the program exits, and the binlog file and the binlog recording point are recorded. The binlog analysis method provided by the invention aims at ODS requirements, realizes automatic coupling of data structures to ODS structures, realizes data chain stretching during data synchronization, and finally constructs ODS basic data.

Description

Binlog analysis method
Technical Field
The invention belongs to the field of computers, and particularly relates to a binlog analysis method.
Background
The business analysis department requires basic Data of an operation Data storage ODS (operational Data store):
1. there is a need to record data change history in real time
2. Data structure coupling into ODS data structure
3. Data synchronization needs to be performed by chain stretching
In the prior art, only one support data and structure is unified and synchronized with a master library, the original table structure is not supported to be modified, and data chain stretching operation is simultaneously carried out on the basis of the original synchronization, so that the requirements are not met.
Disclosure of Invention
The invention aims to realize the change of a data structure automatic coupling ODS structure and the data chain stretching during data synchronization aiming at ODS requirements, and construct ODS basic data.
In order to achieve the technical effects, the invention provides a binlog analysis method, which comprises the following steps:
mode library (SDB), ODS slave library (ODB), master library (MDB)
Step one, creating the SDB and the ODB, and realizing the separation of structure and data change.
The MDB structure is automatically synchronized to the SDB, and the relationship between the ODB and the MDB table metadata structure is isolated. ODB data structures. The method prevents the MDB table structure from being changed when the MDB table structure is read, and further prevents the table mode which is not corresponding from being read when the data is synchronized. The data table mode is read with SDB instead of MDB. The ODB initializes the database table schema at some point according to binlog, coupling ODS embedded structures.
And step two, configuring and reading the synchronous program, and taking effect immediately.
And configuring SDB, ODB and MDB links.
Configuration table filter, DB filter, inverse filter.
Configuration value converters including md5 converter, id encryption converter, special converter, idle converter.
And configuring a read point configuration file path and a read point. The read point contains the binlog file, which currently reads the location.
And step three, running a program, synchronizing data and changing the data.
The event starting processing synchronous data and structure are divided into four categories:
tablemap,write,update,query
the tablemap maps tableeid and table mode, triggers tablemap events before data synchronization each time, acquires and analyzes a table structure, processes table structure contraction, and caches mapping.
write creates a record event. Filtering, field value and type conversion, acquiring data, assembling and converting ODS structure data (time chain head, version initial value, primary key migration), and generating mysql insert statement.
An event is triggered when update updates data. And inquiring a tablemap cache table structure, solving a chain tail according to the main key and the time, updating the current chain tail time, converting the chain tail into a secondary chain tail, replacing the new data with the new chain tail (including the update of the record version), recording the deletion state and the like.
The query event is a data structure change event. Including modification indexes, modification fields, modification tables and libraries. The modification mode comprises creation, deletion and updating. And filtering, correcting and structurally modifying the obtained query key sql by using an sql parser of the dry. Ultimately, the database-related schema of the ODB and SDB is modified.
Step four, the program exits, and the binlog file and the binlog recording point are recorded
When the program needs manual intervention, the program can be quitted cleanly, the binlog points are recorded, and data analysis is resumed when the program is restarted.
The binlog analysis method provided by the invention is based on the binlog analysis of mysql-binlog-connector-java to construct a specific database data and structure synchronization system, and has the advantages that:
A. including sql parsing and synchronous changes. And analyzing and reconstructing the sql statement by using a standard sql statement analyzer, and adding a custom structure change synchronization item. And the problems of abnormal statement analysis and execution are avoided. As an sql statement translation proxy layer. Can be resolved into different database compatible statements, can automatically modify field types, and the like.
B. And (5) carrying out desensitization processing on the real-time data. The needed desensitization data comprises reversible desensitization, erasure desensitization and type replacement desensitization.
C. A snapshot of the historical change version of the data. The method comprises the steps of primary key migration, updating time, creating time and historical version comparison encoding.
D. And alarming the abnormal data or structure. When the operation statement of the specific database is contained, the alarm is notified, and the operation is safely stopped, so that the manual intervention processing is facilitated.
E. And (5) breakpoint resuming and analyzing functions. The binlog reading position is recorded in real time, and when the processing needs to be stopped, the analysis can be restarted from the last recording point.
F. Configurable filtering, extraction, conversion functions. Tables, libraries, fields are filtered by configuration files.
Drawings
Fig. 1 is a schematic flow chart of a binlog analysis method provided by the present invention.
Detailed Description
The invention is further described below in conjunction with the following description.
According to the binlog analysis method, the SDB and the ODB are deployed on the same machine, a database structure is initialized, a mysql system library is filtered through file configuration, data fields needing desensitization (including user names, telephone numbers, addresses, user identifications, user identity numbers and the like) are designated, NULL values and NULL string fields are respectively processed (NULL-N, EMPTY-E and OTHERS-O), data types are converted during data desensitization, and the user identifications are modified from long types to varchar types as required to be encrypted. And removing the constraint of the unavailable key and the constraint of the index key, and migrating the primary key to the ods _ id. The same is done when automatically synchronizing the data structure. A binlog start point is specified. To speed up the query, index correlations are automatically added based on the modified table structure.
And monitoring and recording a log in operation, wherein the log comprises a next execution point of the binlog, a current execution point and an sql statement conversion result, including a structural statement and relative to the CURD data. The reason for the abnormal log.
When special modification structures including sensitive information are met, partial conditions need to be manually processed, automatic exit is realized at a check point, and the information is notified by mails and short messages. Subsequent tasks are performed at the checkpoint after restart.
In the operation process, in order to realize read-write separation, double masters and one slave are adopted, wherein one master is used for inserting ODS basic data, the other master is used for reading the slave library for data analysis, and an intermediate analysis result and a final analysis result are inserted.
1. The test data started at binlog point 7 months 25 and data No. 8 months 15 was retrieved.
2. The real data were retrieved from the binlog point of 25 days 7 months at 29 months 8, and all the historical data were retrieved 10 days.
3. The related library comprises a user library, an account library, a business database (related to the target), and a financial related library.
The technical implementation purpose is to support the business intelligence analysis department to process and analyze the change condition of the historical data and provide basic data for generating an ODS layer. The binlog analysis method provided by the embodiment supports data analysis work of business intelligence departments, including user-related data analysis (registration behavior, activity and the like) and fund account change-related data analysis. Financial reconciliation, etc.

Claims (4)

1. A binlog analysis method, comprising the steps of:
step one, creating an SDB and an ODB, realizing the separation of structure and data change, and embedding an ODS structure; the first step is to create an SDB and an ODB, and realize the separation of structure and data change, wherein the ODS structure embedding comprises the following steps: automatically synchronizing the change of the MDB structure to the SDB, isolating the relationship between the ODB and the MDB table metadata structure, replacing the MDB with the SDB to read a data table mode, initializing a database table mode by the ODB according to some point of binlog, and coupling an ODS embedded structure;
step two, configuring and reading a synchronous program;
running a program, synchronizing data and structure change, creating a binlog file and recording a binlog recording point in real time;
and step four, the program is terminated, and the program is restarted according to the real-time binlog recording point.
2. The binlog analysis method of claim 1, wherein configuring and reading for a synchronization procedure in step two comprises:
configuring the SDB, ODB, MDB links,
a configuration table filter, a DB filter, a reverse filter,
configuration value converters, including md5 converters, id encryption converters, special converters, idle converters,
and configuring a read point configuration file path and a read point, wherein the read point comprises a binlog file, and the current read position of the binlog file.
3. The binlog analysis method of claim 1, wherein the third step comprises running a program, synchronizing data, changing data, and processing synchronizing data and structure for event departure, and the third step comprises four categories:
tablemap,write,update,query,
the tablemap maps tableeid and table mode, triggers tablemap event before data synchronization each time, acquires and analyzes table structure, processes sub-table structure contraction, caches mapping relation,
write is a technique for creating log events, filtering, field value and type conversion, fetching data, assembling and converting ODS structure data,
update is an event triggered when data is updated, a tablemap cache table structure is inquired, a chain tail is obtained according to a main key and time, the current chain tail time is updated, the chain tail is converted into a secondary chain tail, new data is replaced by the new chain tail, the deletion state is recorded and the like,
the query event is a data structure change event and comprises a modification index, a modification field, a modification table and a database, the modification mode comprises creating, deleting and updating, and filtering, modifying and structure changing operations are carried out on the obtained query key sql by using a dry sql parser, and finally the modification of database related modes of ODB and SDB is reflected.
4. The binlog analysis method of claim 1, wherein the fourth step, the process exiting, restarting according to the real-time binlog entry point comprises:
when the program requires manual intervention, it can be exited and binlog points recorded, and data analysis resumed when the program was restarted.
CN201710516504.2A 2017-06-29 2017-06-29 Binlog analysis method Active CN107291926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710516504.2A CN107291926B (en) 2017-06-29 2017-06-29 Binlog analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710516504.2A CN107291926B (en) 2017-06-29 2017-06-29 Binlog analysis method

Publications (2)

Publication Number Publication Date
CN107291926A CN107291926A (en) 2017-10-24
CN107291926B true CN107291926B (en) 2020-08-18

Family

ID=60099273

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710516504.2A Active CN107291926B (en) 2017-06-29 2017-06-29 Binlog analysis method

Country Status (1)

Country Link
CN (1) CN107291926B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145060B (en) * 2018-07-20 2020-09-04 腾讯科技(深圳)有限公司 Data processing method and device
CN110750410B (en) * 2018-07-24 2024-04-12 北京京东尚科信息技术有限公司 Method and device for monitoring database logs
CN111046434A (en) * 2019-12-19 2020-04-21 济南东驰网络科技有限公司 Method for realizing data desensitization based on canal
CN113051265A (en) * 2019-12-27 2021-06-29 中信百信银行股份有限公司 Method, device, computer equipment and readable storage medium for reducing loss caused by relational database table structure change
CN112035464B (en) * 2020-07-22 2024-03-15 武汉达梦数据库股份有限公司 Data synchronization filtering method and synchronization device based on log analysis
CN114491580B (en) * 2021-12-30 2022-10-04 深圳市恒创智达信息技术有限公司 Database sensitive information encryption method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102402596A (en) * 2011-11-07 2012-04-04 北京搜狗科技发展有限公司 Reading and writing method and system of master slave separation database
CN104298779A (en) * 2014-11-04 2015-01-21 中国银行股份有限公司 Processing method and system for massive data processing
CN105956207A (en) * 2016-07-01 2016-09-21 杭州帕拉迪网络科技有限公司 Binlog-based configurable mysql database real-time synchronization method
CN106484869A (en) * 2016-10-12 2017-03-08 北京集奥聚合科技有限公司 A kind of distributed caching method based on mysql binlog and system
CN106874281A (en) * 2015-12-11 2017-06-20 北京新媒传信科技有限公司 Realize the method and apparatus that data base read-write is separate

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102402596A (en) * 2011-11-07 2012-04-04 北京搜狗科技发展有限公司 Reading and writing method and system of master slave separation database
CN104298779A (en) * 2014-11-04 2015-01-21 中国银行股份有限公司 Processing method and system for massive data processing
CN106874281A (en) * 2015-12-11 2017-06-20 北京新媒传信科技有限公司 Realize the method and apparatus that data base read-write is separate
CN105956207A (en) * 2016-07-01 2016-09-21 杭州帕拉迪网络科技有限公司 Binlog-based configurable mysql database real-time synchronization method
CN106484869A (en) * 2016-10-12 2017-03-08 北京集奥聚合科技有限公司 A kind of distributed caching method based on mysql binlog and system

Also Published As

Publication number Publication date
CN107291926A (en) 2017-10-24

Similar Documents

Publication Publication Date Title
CN107291926B (en) Binlog analysis method
US11461294B2 (en) System for importing data into a data repository
US11360950B2 (en) System for analysing data relationships to support data query execution
US10678810B2 (en) System for data management in a large scale data repository
EP3602341B1 (en) Data replication system
US10671642B2 (en) Copying data changes to a target database
US9218377B2 (en) Failure recovery and error correction techniques for data loading in information warehouses
RU2599538C2 (en) Methods and systems for loading data into temporal data warehouse
CN102567453B (en) Halloween protection in a multi-version database system
US8805777B2 (en) Data record collapse and split functionality
CN105205053A (en) Method and system for analyzing database incremental logs
CN105224527A (en) Be applicable to the general ETL method of multiple object table update mode
CN113204571A (en) SQL execution method and device related to write-in operation and storage medium
CN111078719A (en) Data recovery method and device, storage medium and processor
CN114281757A (en) Database migration method and system and computer readable storage medium
CN112015463B (en) Configuration version management device and working method thereof
CN116821098A (en) Data warehouse management method, service system and storage medium
CN111881323B (en) Table separation method based on sequencing field and time routing
CN115827172A (en) Method and device for executing database transaction
CN117971835A (en) Metadata association method and system
CN117453460A (en) Method and device for realizing redis second-level disaster recovery
CN118245277A (en) Method, equipment and medium for recovering human resource system data
CN117149180A (en) Application replication method, device, equipment and medium for low-code development
CN116302206A (en) Presto data source hot loading method based on MQ
Pesado A Comparison of DBMSs for Mobile Devices

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant