CN115905313A - MySQL big table association query system and method - Google Patents

MySQL big table association query system and method Download PDF

Info

Publication number
CN115905313A
CN115905313A CN202211168253.0A CN202211168253A CN115905313A CN 115905313 A CN115905313 A CN 115905313A CN 202211168253 A CN202211168253 A CN 202211168253A CN 115905313 A CN115905313 A CN 115905313A
Authority
CN
China
Prior art keywords
mysql
data
binlog
canal
processing module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211168253.0A
Other languages
Chinese (zh)
Inventor
陈武
梅欣
陈亨通
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Kejia Information Technology Co ltd
Original Assignee
Shanghai Kejia Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Kejia Information Technology Co ltd filed Critical Shanghai Kejia Information Technology Co ltd
Priority to CN202211168253.0A priority Critical patent/CN115905313A/en
Publication of CN115905313A publication Critical patent/CN115905313A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a MySQL big table association query system and a MySQL big table association query method, which comprise Canal, MQ, a data processing module and an elastic search, wherein Mysql records data update in real time by using binlog, the real-time data of the binlog analyzed by Canal is synchronized into MQ, the data processing module monitors MQ messages and processes the data to be synchronized into ES in real time, the ES is used for screening and querying the data, and the acquired data id set is further transmitted to MySQL to query a corresponding table to acquire detailed information to be assembled and returned to a client; the system synchronizes data change in real time through the Binlog of MySQL, avoids intrusion operation on service codes and a database, improves the expansibility of the system by introducing MQ in a loose coupling mode, uses ES for searching, solves the performance problem of MySQL related query, and improves user experience.

Description

MySQL large table association query system and method
Technical Field
The invention relates to the field of computer data processing, in particular to a MySQL big table association query system and a MySQL big table association query method.
Background
General system development requires data to be stored in a database. Commonly used databases are classified as Relational databases (Relational database) such as MySQL, oracle, SQLServer, etc., or non-Relational databases (Not Only SQL) such as MongoDB, HBase, dynamo, etc. Relational databases are widely used because of their row and column storage, which is easy to understand and good support for transactions ensures data consistency. More and more businesses under the present stream of internet use MySQL as the preferred database for open source and free reasons.
The elastic search (hereinafter referred to as ES) is a highly extended, open-source full-text retrieval and analysis engine, which can quickly store and search massive data in near-real time.
Query exposure of data in a database is an operation that is not normal, and this often involves the associated query exposure among a plurality of tables. The relational database consumes a long time when a plurality of tables are subjected to correlation query, and particularly, the correlation of the tables with large data volume can cause the exhaustion of the physical machine resources of the database, thereby causing the unavailability of an upper application system.
The existing solutions are:
1) The read-write separation is carried out, the fields to be inquired form a new wide table and are put into a read database, but the scheme relates to code reconstruction and data migration, the reconstruction cost is high, and the new inquiry conditions need to be changed. Relational databases also have performance bottlenecks for storage and querying of large data volumes. Under the condition that the database and the table of the service library are divided, the difficulty of realizing the scheme is further increased.
2) And performing data increment synchronization into the ES by using a third-party data synchronization tool. The scheme can cause data to have certain delay, and cannot meet the query with higher data real-time requirement.
Therefore, how to achieve multi-table association query that is non-intrusive to the original service code and can meet the performance requirement is a technical problem to be solved by technicians at present.
Disclosure of Invention
The invention aims to solve the technical problem of a MySQL big table association query system and a MySQL big table association query method.
The invention is realized by the following technical scheme: a MySQL big table association query system comprises a Canal, an MQ, a data processing module and an ES, wherein real-time data of Canal analysis binlog are synchronized into the MQ, the data processing module monitors MQ messages and processes the data to be synchronized into the ES in real time, the ES is used for screening and querying the data, and according to an acquired data id set, the table corresponding to the query in MySQL is queried to acquire detailed information, and the detailed information is assembled and returned to a client.
As a preferred technical scheme, the method comprises the steps that an interactive protocol of MySQL slave is simulated by the Canal, the Canal is disguised as a MySQL slave and sends a dump protocol to the MySQL, the MySQL pushes binlog to the Canal after receiving a dump request, the Canal receives a binlog change event and then is packaged into an internal event object which is divided into three types of adding INSERT, modifying UPDATE and deleting DELETE, and the internal event object is sent to a message queue MQ.
As a preferred technical solution, after monitoring the message in the message queue, the data processing module finds the corresponding ES index configuration according to the table name to determine the index to be changed, and operates the corresponding index.
As an optimized technical scheme, the data processing module comprises the functions of mapping relation maintenance of database fields and ES indexes, message queue MQ monitoring, ES operation command construction and ES changing operation.
As the preferred technical scheme, the event objects are divided into three types of new addition, modification and deletion.
The invention relates to a MySQL big table association query method, which comprises the following specific steps:
step one, a binlog switch of the MySQL server is started, and a Row format is set to reflect data change of a Row level in real time, all DDL and DML operations are recorded in the binlog, and master-slave copy and data recovery are realized by using the binlog;
simulating a slave node of MySQL by using a Canal tool, monitoring and acquiring binlog incremental change in real time, analyzing and packaging the monitored MySQL operation event into an event object, and sending the event object to a message queue MQ;
step three, after a consumer using the data processing module as a message queue monitors binlog event information in the queue in real time, corresponding logic processing is carried out, the data information is back-checked in MySQL, and finally the correlation query screening conditions and the id of each table are synchronized into ES;
and step four, screening and inquiring the data by using the ES, and inquiring a corresponding table in MySQL according to the acquired data id set to acquire detailed information, assembling the detailed information and returning the assembled information to the client.
The invention has the beneficial effects that: the invention synchronizes data change in real time through the Binlog of MySQL, avoids intrusion operation to service codes and a database, improves the expansibility of the system by introducing MQ in a loose coupling mode, uses ES for searching, solves the performance problem of MySQL related query, and improves user experience.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic flow chart of the present invention;
fig. 2 is a system architecture diagram of the present invention.
Detailed Description
All of the features disclosed in this specification, or all of the steps in any method or process so disclosed, may be combined in any combination, except combinations of features and/or steps that are mutually exclusive.
As shown in fig. 2, the MySQL large table association query system of the present invention includes Canal, MQ, a data processing module, and ES, where real-time data of Canal analysis binlog is synchronized to MQ, the data processing module monitors MQ messages and processes data and synchronizes to ES in real time, the ES is used to perform screening query on data, and according to the acquired data id set, the MySQL queries a corresponding table to acquire detailed information, assembles the detailed information, and returns the detailed information to a client.
The method comprises the steps that a client simulates an interactive protocol of MySQL slave, pretends to be MySQL slave and sends a dump protocol to MySQL, the MySQL receives the dump request and then pushes binlog to the client, the client receives a binlog change event and then encapsulates the binlog change event into an internal event object which is divided into three types of adding INSERT, modifying UPDATE and deleting DELETE, the internal event object is sent to a message queue MQ, and a data processing module is responsible for monitoring the message in the message queue, then finds out the corresponding ES index configuration according to the table name, judges the index to be changed, and operates the corresponding index.
The data processing module comprises the functions of mapping relation maintenance of database fields and ES indexes, message queue MQ monitoring, ES operation command construction and ES change operation.
The event objects in the embodiment are divided into three types, namely adding, modifying and deleting.
As shown in fig. 1, in the MySQL large table association query method of the present invention, during query, a binlog switch of a MySQL server is turned on, and is set to a Row format to reflect data changes at a Row level in real time, all DDL and DML operations are recorded in the binlog, and master-slave replication and data recovery are implemented by using the binlog; then, simulating a slave node of MySQL by using a Canal tool, monitoring and acquiring binlog increment change in real time, analyzing and encapsulating the monitored MySQL operation event into an event object, and sending the event object to a message queue MQ; after a consumer using a data processing module as a message queue monitors binlog event information in the queue in real time, corresponding logic processing is carried out, the data information is back-checked in MySQL, and finally the correlation query screening conditions and the id of each table are synchronized into ES; and finally, screening and inquiring the data by using the ES, and acquiring detailed information from a corresponding table inquired in MySQL according to the acquired data id set, assembling the detailed information and returning the assembled detailed information to the client.
In this embodiment, the method and system for MySQL large table associative query of the present invention are mainly used to solve the performance problem of MySQL associative query and implement no code intrusion in a loose coupling manner.
Specifically, (1) MySQL binding records all DDL and DML operations, and stores the DDL and DML operations in a database server disk in a file form, which can be used to view information such as SQL change operations and time consumption of a database, and implement master-slave copy, data recovery, and the like. The Row mode needs to be started for better supporting the Canal operation;
(2) The method comprises the steps that an interactive protocol of a MySQL slave is simulated by the Canal, the Canal is disguised as the MySQL slave and sends a dump protocol to the MySQL, the MySQL pushes a binlog to the Canal after receiving a dump request, the Canal encapsulates an internal event object after receiving a binlog change event into three types, namely adding INSERT, modifying UPDATE and deleting DELETE, and sends the three types of the internal event object into a message queue MQ (the queue needs to be distinguished according to schemas of a database, namely different schemas are distributed into different queues).
The message content comprises operation type, field value after change, field value before change and table name:
Figure BDA0003862356850000051
(3) And the data processing module is responsible for finding out the corresponding ES index configuration according to the table name to judge the index to be changed after monitoring the message in the message queue, and operating the corresponding index. The module should include the following functions: the method comprises the steps of maintaining mapping relation of database fields and ES indexes, monitoring a message queue MQ, constructing an ES operation command and performing change operation on an ES.
Taking the table user and the company joint query as an example, the table user and the company are in a many-to-one relationship, that is, one company record corresponds to multiple user records, the associated field is the company _ id field in the user table, the query conditions are company.
Corresponding ES index structure:
"mappings":{
"_doc":{
"properties":{
"user_id":{
"type":"long"
},
"company_id":{
"type":"long"
},
"user_name":{
"type":"keyword"
},
"compay_name":{
"type":"keyword"
}
}
}
}。
if the mapping relationship between the database field and the ES index comprises: database table fields, ES index fields, SQL mapping, index name, reference table, and master table:
Figure BDA0003862356850000061
Figure BDA0003862356850000071
/>
the fields of the database table and the index fields need to be in one-to-one correspondence, SQL mapping needs to contain all associated query conditions and ids when generating ES instructions and reversely checking data, and a left join statement is needed to designate a main table.
After monitoring MQ messages changed by a user or a company table, matching indexes, namely user _ company _ index, needing to be changed according to the maintained mapping data, and generating corresponding operation instructions;
Figure BDA0003862356850000072
Figure BDA0003862356850000081
according to the maintained SQL mapping primary and secondary table relation, if the INSERT operation of the company table is required to be ignored and no operation is required to be carried out; the INSERT and UPDATE operations in the ES instruction can perform reverse check on a database according to the SQL mapping statement to obtain all field values, and then the field values are converted into newly-added or updateByQuery operations at the bottom layer; the DELETE instruction of ES generates a deleteByQuery operation according to the id field value changed by the USER table.
(4) When in query, the index of the user _ company _ index in the ES is queried, the queried user _ id set is obtained, then the corresponding user data is queried in the user table, and then the corresponding information is queried in the company table according to the user and the company associated field company _ id.
The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that are not thought of through the inventive work should be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope defined by the claims.

Claims (6)

1. A MySQL large table correlation query system is characterized in that: the system comprises a Canal, an MQ, a data processing module and an ES, wherein real-time data of Canal analysis binlog are synchronized into the MQ, the data processing module monitors MQ messages and processes the data to be synchronized into the ES in real time, the ES is used for screening and inquiring the data, and the acquired data id set is transmitted to a MySQL to inquire a corresponding table to acquire detailed information to be assembled and returned to a client.
2. The MySQL big table correlation query system according to claim 1, characterized in that: the method comprises the steps that an interactive protocol of MySQL slave is simulated by the Canal, the Canal is disguised as MySQL slave and sends a dump protocol to the MySQL, the MySQL pushes binlog to the Canal after receiving a dump request, the Canal encapsulates an internal event object after receiving a binlog change event, the internal event object is divided into three types of adding INSERT, modifying UPDATE and deleting DELETE, and the three types of adding INSERT, modifying UPDATE and deleting DELETE are sent to a message queue MQ.
3. The MySQL large-table associative query system according to claim 1, characterized in that: and the data processing module is responsible for finding out the corresponding ES index configuration according to the table name to judge the index to be changed after monitoring the message in the message queue, and operating the corresponding index.
4. The MySQL large-table associative query system according to claim 1 or 3, characterized in that: the data processing module comprises the functions of mapping relation maintenance of database fields and ES indexes, message queue MQ monitoring, ES operation command construction and ES change operation.
5. The MySQL large-table associative query system according to claim 1, characterized in that: the event objects are divided into three types of new addition, modification and deletion.
6. A MySQL big table association query method is characterized in that the query system of any one of claims 1 to 5 is used, and the specific steps are as follows:
step one, a binlog switch of a MySQL server is started and set to be in a Row format so as to reflect data change of a Row level in real time, all DDL and DML operations are recorded in the binlog, and master-slave copy and data recovery are realized by using the binlog;
simulating a slave node of MySQL by using a Canal tool, monitoring and acquiring binlog incremental change in real time, analyzing and packaging the monitored MySQL operation event into an event object, and sending the event object to a message queue MQ;
step three, after a data processing module is used as a consumer of the message queue to monitor binlog event information in the queue in real time, corresponding logic processing is carried out, data information is back-checked in MySQL, and finally the correlation query screening conditions and the id of each table are synchronized into an ES;
and step four, screening and inquiring the data by using the ES, and inquiring a corresponding table in MySQL according to the acquired data id set to acquire detailed information, assembling the detailed information and returning the assembled detailed information to the client.
CN202211168253.0A 2022-09-23 2022-09-23 MySQL big table association query system and method Pending CN115905313A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211168253.0A CN115905313A (en) 2022-09-23 2022-09-23 MySQL big table association query system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211168253.0A CN115905313A (en) 2022-09-23 2022-09-23 MySQL big table association query system and method

Publications (1)

Publication Number Publication Date
CN115905313A true CN115905313A (en) 2023-04-04

Family

ID=86482459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211168253.0A Pending CN115905313A (en) 2022-09-23 2022-09-23 MySQL big table association query system and method

Country Status (1)

Country Link
CN (1) CN115905313A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116150211A (en) * 2023-04-18 2023-05-23 北京江融信科技有限公司 Multi-data source query method, platform and application system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116150211A (en) * 2023-04-18 2023-05-23 北京江融信科技有限公司 Multi-data source query method, platform and application system
CN116150211B (en) * 2023-04-18 2023-08-18 北京江融信科技有限公司 Multi-data source query method, platform and application system

Similar Documents

Publication Publication Date Title
US11816126B2 (en) Large scale unstructured database systems
US11755575B2 (en) Processing database queries using format conversion
TWI710919B (en) Data storage device, translation device and data inventory acquisition method
CN107402963B (en) Search data construction method, incremental data pushing device and equipment
CN109906448B (en) Method, apparatus, and medium for facilitating operations on pluggable databases
US8886617B2 (en) Query-based searching using a virtual table
US8745029B2 (en) Method for generating indexes for downloading data
US7690000B2 (en) Metadata journal for information technology systems
CN109656958B (en) Data query method and system
CN101901242A (en) Federated configuration data management
CN111046036A (en) Data synchronization method, device, system and storage medium
CN114116716A (en) Hierarchical data retrieval method, device and equipment
CN114218218A (en) Data processing method, device and equipment based on data warehouse and storage medium
CN115905313A (en) MySQL big table association query system and method
CN114661832A (en) Multi-mode heterogeneous data storage method and system based on data quality
CN109165262A (en) Fragmentation clustering system and fragmentation method of relational large table
CN113934713A (en) Order data indexing method, system, computer equipment and storage medium
CN110245037B (en) Hive user operation behavior restoration method based on logs
Serrano et al. From relations to multi-dimensional maps: A SQL-to-hbase transformation methodology
CN114003614A (en) Data synchronization device, method, online transaction system, computer equipment and storage medium
Cerpa Pre-physical data base design heuristics
Porshnev et al. The concupt of automated system of steel production storage data structure
Gomes et al. An object mapping for the Cassandra distributed database
Le et al. Cloud Database
US20220164349A1 (en) Distributed queries through dynamic views

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination