CN107291926B - Binlog analysis method - Google Patents
Binlog analysis method Download PDFInfo
- Publication number
- CN107291926B CN107291926B CN201710516504.2A CN201710516504A CN107291926B CN 107291926 B CN107291926 B CN 107291926B CN 201710516504 A CN201710516504 A CN 201710516504A CN 107291926 B CN107291926 B CN 107291926B
- Authority
- CN
- China
- Prior art keywords
- data
- binlog
- program
- odb
- sdb
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a binlog analysis method, which comprises the following steps: step one, creating an SDB and an ODB, and realizing the separation of structure and data change; step two, reading and executing configuration for the synchronous program; running a program, synchronizing data and changing the data; and step four, the program exits, and the binlog file and the binlog recording point are recorded. The binlog analysis method provided by the invention aims at ODS requirements, realizes automatic coupling of data structures to ODS structures, realizes data chain stretching during data synchronization, and finally constructs ODS basic data.
Description
Technical Field
The invention belongs to the field of computers, and particularly relates to a binlog analysis method.
Background
The business analysis department requires basic Data of an operation Data storage ODS (operational Data store):
1. there is a need to record data change history in real time
2. Data structure coupling into ODS data structure
3. Data synchronization needs to be performed by chain stretching
In the prior art, only one support data and structure is unified and synchronized with a master library, the original table structure is not supported to be modified, and data chain stretching operation is simultaneously carried out on the basis of the original synchronization, so that the requirements are not met.
Disclosure of Invention
The invention aims to realize the change of a data structure automatic coupling ODS structure and the data chain stretching during data synchronization aiming at ODS requirements, and construct ODS basic data.
In order to achieve the technical effects, the invention provides a binlog analysis method, which comprises the following steps:
mode library (SDB), ODS slave library (ODB), master library (MDB)
Step one, creating the SDB and the ODB, and realizing the separation of structure and data change.
The MDB structure is automatically synchronized to the SDB, and the relationship between the ODB and the MDB table metadata structure is isolated. ODB data structures. The method prevents the MDB table structure from being changed when the MDB table structure is read, and further prevents the table mode which is not corresponding from being read when the data is synchronized. The data table mode is read with SDB instead of MDB. The ODB initializes the database table schema at some point according to binlog, coupling ODS embedded structures.
And step two, configuring and reading the synchronous program, and taking effect immediately.
And configuring SDB, ODB and MDB links.
Configuration table filter, DB filter, inverse filter.
Configuration value converters including md5 converter, id encryption converter, special converter, idle converter.
And configuring a read point configuration file path and a read point. The read point contains the binlog file, which currently reads the location.
And step three, running a program, synchronizing data and changing the data.
The event starting processing synchronous data and structure are divided into four categories:
tablemap,write,update,query
the tablemap maps tableeid and table mode, triggers tablemap events before data synchronization each time, acquires and analyzes a table structure, processes table structure contraction, and caches mapping.
write creates a record event. Filtering, field value and type conversion, acquiring data, assembling and converting ODS structure data (time chain head, version initial value, primary key migration), and generating mysql insert statement.
An event is triggered when update updates data. And inquiring a tablemap cache table structure, solving a chain tail according to the main key and the time, updating the current chain tail time, converting the chain tail into a secondary chain tail, replacing the new data with the new chain tail (including the update of the record version), recording the deletion state and the like.
The query event is a data structure change event. Including modification indexes, modification fields, modification tables and libraries. The modification mode comprises creation, deletion and updating. And filtering, correcting and structurally modifying the obtained query key sql by using an sql parser of the dry. Ultimately, the database-related schema of the ODB and SDB is modified.
Step four, the program exits, and the binlog file and the binlog recording point are recorded
When the program needs manual intervention, the program can be quitted cleanly, the binlog points are recorded, and data analysis is resumed when the program is restarted.
The binlog analysis method provided by the invention is based on the binlog analysis of mysql-binlog-connector-java to construct a specific database data and structure synchronization system, and has the advantages that:
A. including sql parsing and synchronous changes. And analyzing and reconstructing the sql statement by using a standard sql statement analyzer, and adding a custom structure change synchronization item. And the problems of abnormal statement analysis and execution are avoided. As an sql statement translation proxy layer. Can be resolved into different database compatible statements, can automatically modify field types, and the like.
B. And (5) carrying out desensitization processing on the real-time data. The needed desensitization data comprises reversible desensitization, erasure desensitization and type replacement desensitization.
C. A snapshot of the historical change version of the data. The method comprises the steps of primary key migration, updating time, creating time and historical version comparison encoding.
D. And alarming the abnormal data or structure. When the operation statement of the specific database is contained, the alarm is notified, and the operation is safely stopped, so that the manual intervention processing is facilitated.
E. And (5) breakpoint resuming and analyzing functions. The binlog reading position is recorded in real time, and when the processing needs to be stopped, the analysis can be restarted from the last recording point.
F. Configurable filtering, extraction, conversion functions. Tables, libraries, fields are filtered by configuration files.
Drawings
Fig. 1 is a schematic flow chart of a binlog analysis method provided by the present invention.
Detailed Description
The invention is further described below in conjunction with the following description.
According to the binlog analysis method, the SDB and the ODB are deployed on the same machine, a database structure is initialized, a mysql system library is filtered through file configuration, data fields needing desensitization (including user names, telephone numbers, addresses, user identifications, user identity numbers and the like) are designated, NULL values and NULL string fields are respectively processed (NULL-N, EMPTY-E and OTHERS-O), data types are converted during data desensitization, and the user identifications are modified from long types to varchar types as required to be encrypted. And removing the constraint of the unavailable key and the constraint of the index key, and migrating the primary key to the ods _ id. The same is done when automatically synchronizing the data structure. A binlog start point is specified. To speed up the query, index correlations are automatically added based on the modified table structure.
And monitoring and recording a log in operation, wherein the log comprises a next execution point of the binlog, a current execution point and an sql statement conversion result, including a structural statement and relative to the CURD data. The reason for the abnormal log.
When special modification structures including sensitive information are met, partial conditions need to be manually processed, automatic exit is realized at a check point, and the information is notified by mails and short messages. Subsequent tasks are performed at the checkpoint after restart.
In the operation process, in order to realize read-write separation, double masters and one slave are adopted, wherein one master is used for inserting ODS basic data, the other master is used for reading the slave library for data analysis, and an intermediate analysis result and a final analysis result are inserted.
1. The test data started at binlog point 7 months 25 and data No. 8 months 15 was retrieved.
2. The real data were retrieved from the binlog point of 25 days 7 months at 29 months 8, and all the historical data were retrieved 10 days.
3. The related library comprises a user library, an account library, a business database (related to the target), and a financial related library.
The technical implementation purpose is to support the business intelligence analysis department to process and analyze the change condition of the historical data and provide basic data for generating an ODS layer. The binlog analysis method provided by the embodiment supports data analysis work of business intelligence departments, including user-related data analysis (registration behavior, activity and the like) and fund account change-related data analysis. Financial reconciliation, etc.
Claims (4)
1. A binlog analysis method, comprising the steps of:
step one, creating an SDB and an ODB, realizing the separation of structure and data change, and embedding an ODS structure; the first step is to create an SDB and an ODB, and realize the separation of structure and data change, wherein the ODS structure embedding comprises the following steps: automatically synchronizing the change of the MDB structure to the SDB, isolating the relationship between the ODB and the MDB table metadata structure, replacing the MDB with the SDB to read a data table mode, initializing a database table mode by the ODB according to some point of binlog, and coupling an ODS embedded structure;
step two, configuring and reading a synchronous program;
running a program, synchronizing data and structure change, creating a binlog file and recording a binlog recording point in real time;
and step four, the program is terminated, and the program is restarted according to the real-time binlog recording point.
2. The binlog analysis method of claim 1, wherein configuring and reading for a synchronization procedure in step two comprises:
configuring the SDB, ODB, MDB links,
a configuration table filter, a DB filter, a reverse filter,
configuration value converters, including md5 converters, id encryption converters, special converters, idle converters,
and configuring a read point configuration file path and a read point, wherein the read point comprises a binlog file, and the current read position of the binlog file.
3. The binlog analysis method of claim 1, wherein the third step comprises running a program, synchronizing data, changing data, and processing synchronizing data and structure for event departure, and the third step comprises four categories:
tablemap,write,update,query,
the tablemap maps tableeid and table mode, triggers tablemap event before data synchronization each time, acquires and analyzes table structure, processes sub-table structure contraction, caches mapping relation,
write is a technique for creating log events, filtering, field value and type conversion, fetching data, assembling and converting ODS structure data,
update is an event triggered when data is updated, a tablemap cache table structure is inquired, a chain tail is obtained according to a main key and time, the current chain tail time is updated, the chain tail is converted into a secondary chain tail, new data is replaced by the new chain tail, the deletion state is recorded and the like,
the query event is a data structure change event and comprises a modification index, a modification field, a modification table and a database, the modification mode comprises creating, deleting and updating, and filtering, modifying and structure changing operations are carried out on the obtained query key sql by using a dry sql parser, and finally the modification of database related modes of ODB and SDB is reflected.
4. The binlog analysis method of claim 1, wherein the fourth step, the process exiting, restarting according to the real-time binlog entry point comprises:
when the program requires manual intervention, it can be exited and binlog points recorded, and data analysis resumed when the program was restarted.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710516504.2A CN107291926B (en) | 2017-06-29 | 2017-06-29 | Binlog analysis method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710516504.2A CN107291926B (en) | 2017-06-29 | 2017-06-29 | Binlog analysis method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107291926A CN107291926A (en) | 2017-10-24 |
CN107291926B true CN107291926B (en) | 2020-08-18 |
Family
ID=60099273
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710516504.2A Active CN107291926B (en) | 2017-06-29 | 2017-06-29 | Binlog analysis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107291926B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109145060B (en) * | 2018-07-20 | 2020-09-04 | 腾讯科技(深圳)有限公司 | Data processing method and device |
CN110750410B (en) * | 2018-07-24 | 2024-04-12 | 北京京东尚科信息技术有限公司 | Method and device for monitoring database logs |
CN111046434A (en) * | 2019-12-19 | 2020-04-21 | 济南东驰网络科技有限公司 | Method for realizing data desensitization based on canal |
CN113051265A (en) * | 2019-12-27 | 2021-06-29 | 中信百信银行股份有限公司 | Method, device, computer equipment and readable storage medium for reducing loss caused by relational database table structure change |
CN112035464B (en) * | 2020-07-22 | 2024-03-15 | 武汉达梦数据库股份有限公司 | Data synchronization filtering method and synchronization device based on log analysis |
CN114491580B (en) * | 2021-12-30 | 2022-10-04 | 深圳市恒创智达信息技术有限公司 | Database sensitive information encryption method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102402596A (en) * | 2011-11-07 | 2012-04-04 | 北京搜狗科技发展有限公司 | Reading and writing method and system of master slave separation database |
CN104298779A (en) * | 2014-11-04 | 2015-01-21 | 中国银行股份有限公司 | Processing method and system for massive data processing |
CN105956207A (en) * | 2016-07-01 | 2016-09-21 | 杭州帕拉迪网络科技有限公司 | Binlog-based configurable mysql database real-time synchronization method |
CN106484869A (en) * | 2016-10-12 | 2017-03-08 | 北京集奥聚合科技有限公司 | A kind of distributed caching method based on mysql binlog and system |
CN106874281A (en) * | 2015-12-11 | 2017-06-20 | 北京新媒传信科技有限公司 | Realize the method and apparatus that data base read-write is separate |
-
2017
- 2017-06-29 CN CN201710516504.2A patent/CN107291926B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102402596A (en) * | 2011-11-07 | 2012-04-04 | 北京搜狗科技发展有限公司 | Reading and writing method and system of master slave separation database |
CN104298779A (en) * | 2014-11-04 | 2015-01-21 | 中国银行股份有限公司 | Processing method and system for massive data processing |
CN106874281A (en) * | 2015-12-11 | 2017-06-20 | 北京新媒传信科技有限公司 | Realize the method and apparatus that data base read-write is separate |
CN105956207A (en) * | 2016-07-01 | 2016-09-21 | 杭州帕拉迪网络科技有限公司 | Binlog-based configurable mysql database real-time synchronization method |
CN106484869A (en) * | 2016-10-12 | 2017-03-08 | 北京集奥聚合科技有限公司 | A kind of distributed caching method based on mysql binlog and system |
Also Published As
Publication number | Publication date |
---|---|
CN107291926A (en) | 2017-10-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107291926B (en) | Binlog analysis method | |
US11461294B2 (en) | System for importing data into a data repository | |
US11360950B2 (en) | System for analysing data relationships to support data query execution | |
US10678810B2 (en) | System for data management in a large scale data repository | |
EP3602341B1 (en) | Data replication system | |
US10671642B2 (en) | Copying data changes to a target database | |
US9218377B2 (en) | Failure recovery and error correction techniques for data loading in information warehouses | |
RU2599538C2 (en) | Methods and systems for loading data into temporal data warehouse | |
CN102567453B (en) | Halloween protection in a multi-version database system | |
US8805777B2 (en) | Data record collapse and split functionality | |
CN105205053A (en) | Method and system for analyzing database incremental logs | |
CN105224527A (en) | Be applicable to the general ETL method of multiple object table update mode | |
CN113204571A (en) | SQL execution method and device related to write-in operation and storage medium | |
CN111078719A (en) | Data recovery method and device, storage medium and processor | |
CN114281757A (en) | Database migration method and system and computer readable storage medium | |
CN112015463B (en) | Configuration version management device and working method thereof | |
CN116821098A (en) | Data warehouse management method, service system and storage medium | |
CN111881323B (en) | Table separation method based on sequencing field and time routing | |
CN115827172A (en) | Method and device for executing database transaction | |
CN117971835A (en) | Metadata association method and system | |
CN117453460A (en) | Method and device for realizing redis second-level disaster recovery | |
CN118245277A (en) | Method, equipment and medium for recovering human resource system data | |
CN117149180A (en) | Application replication method, device, equipment and medium for low-code development | |
CN116302206A (en) | Presto data source hot loading method based on MQ | |
Pesado | A Comparison of DBMSs for Mobile Devices |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |