CN115905313A - MySQL big table association query system and method - Google Patents
MySQL big table association query system and method Download PDFInfo
- Publication number
- CN115905313A CN115905313A CN202211168253.0A CN202211168253A CN115905313A CN 115905313 A CN115905313 A CN 115905313A CN 202211168253 A CN202211168253 A CN 202211168253A CN 115905313 A CN115905313 A CN 115905313A
- Authority
- CN
- China
- Prior art keywords
- mysql
- data
- binlog
- canal
- processing module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 20
- 238000012545 processing Methods 0.000 claims abstract description 22
- 230000008859 change Effects 0.000 claims abstract description 18
- 238000012216 screening Methods 0.000 claims abstract description 10
- 230000001360 synchronised effect Effects 0.000 claims abstract description 10
- 230000008569 process Effects 0.000 claims abstract description 5
- 238000012544 monitoring process Methods 0.000 claims description 12
- 238000013507 mapping Methods 0.000 claims description 11
- 230000002452 interceptive effect Effects 0.000 claims description 4
- 238000011084 recovery Methods 0.000 claims description 4
- 238000010276 construction Methods 0.000 claims description 3
- 238000012423 maintenance Methods 0.000 claims description 3
- 238000007792 addition Methods 0.000 claims description 2
- 238000012217 deletion Methods 0.000 claims description 2
- 230000037430 deletion Effects 0.000 claims description 2
- 238000012986 modification Methods 0.000 claims description 2
- 230000004048 modification Effects 0.000 claims description 2
- 238000004806 packaging method and process Methods 0.000 claims description 2
- 230000008878 coupling Effects 0.000 abstract description 3
- 238000010168 coupling process Methods 0.000 abstract description 3
- 238000005859 coupling reaction Methods 0.000 abstract description 3
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 238000013508 migration Methods 0.000 description 1
- 230000005012 migration Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000033772 system development Effects 0.000 description 1
Images
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a MySQL big table association query system and a MySQL big table association query method, which comprise Canal, MQ, a data processing module and an elastic search, wherein Mysql records data update in real time by using binlog, the real-time data of the binlog analyzed by Canal is synchronized into MQ, the data processing module monitors MQ messages and processes the data to be synchronized into ES in real time, the ES is used for screening and querying the data, and the acquired data id set is further transmitted to MySQL to query a corresponding table to acquire detailed information to be assembled and returned to a client; the system synchronizes data change in real time through the Binlog of MySQL, avoids intrusion operation on service codes and a database, improves the expansibility of the system by introducing MQ in a loose coupling mode, uses ES for searching, solves the performance problem of MySQL related query, and improves user experience.
Description
Technical Field
The invention relates to the field of computer data processing, in particular to a MySQL big table association query system and a MySQL big table association query method.
Background
General system development requires data to be stored in a database. Commonly used databases are classified as Relational databases (Relational database) such as MySQL, oracle, SQLServer, etc., or non-Relational databases (Not Only SQL) such as MongoDB, HBase, dynamo, etc. Relational databases are widely used because of their row and column storage, which is easy to understand and good support for transactions ensures data consistency. More and more businesses under the present stream of internet use MySQL as the preferred database for open source and free reasons.
The elastic search (hereinafter referred to as ES) is a highly extended, open-source full-text retrieval and analysis engine, which can quickly store and search massive data in near-real time.
Query exposure of data in a database is an operation that is not normal, and this often involves the associated query exposure among a plurality of tables. The relational database consumes a long time when a plurality of tables are subjected to correlation query, and particularly, the correlation of the tables with large data volume can cause the exhaustion of the physical machine resources of the database, thereby causing the unavailability of an upper application system.
The existing solutions are:
1) The read-write separation is carried out, the fields to be inquired form a new wide table and are put into a read database, but the scheme relates to code reconstruction and data migration, the reconstruction cost is high, and the new inquiry conditions need to be changed. Relational databases also have performance bottlenecks for storage and querying of large data volumes. Under the condition that the database and the table of the service library are divided, the difficulty of realizing the scheme is further increased.
2) And performing data increment synchronization into the ES by using a third-party data synchronization tool. The scheme can cause data to have certain delay, and cannot meet the query with higher data real-time requirement.
Therefore, how to achieve multi-table association query that is non-intrusive to the original service code and can meet the performance requirement is a technical problem to be solved by technicians at present.
Disclosure of Invention
The invention aims to solve the technical problem of a MySQL big table association query system and a MySQL big table association query method.
The invention is realized by the following technical scheme: a MySQL big table association query system comprises a Canal, an MQ, a data processing module and an ES, wherein real-time data of Canal analysis binlog are synchronized into the MQ, the data processing module monitors MQ messages and processes the data to be synchronized into the ES in real time, the ES is used for screening and querying the data, and according to an acquired data id set, the table corresponding to the query in MySQL is queried to acquire detailed information, and the detailed information is assembled and returned to a client.
As a preferred technical scheme, the method comprises the steps that an interactive protocol of MySQL slave is simulated by the Canal, the Canal is disguised as a MySQL slave and sends a dump protocol to the MySQL, the MySQL pushes binlog to the Canal after receiving a dump request, the Canal receives a binlog change event and then is packaged into an internal event object which is divided into three types of adding INSERT, modifying UPDATE and deleting DELETE, and the internal event object is sent to a message queue MQ.
As a preferred technical solution, after monitoring the message in the message queue, the data processing module finds the corresponding ES index configuration according to the table name to determine the index to be changed, and operates the corresponding index.
As an optimized technical scheme, the data processing module comprises the functions of mapping relation maintenance of database fields and ES indexes, message queue MQ monitoring, ES operation command construction and ES changing operation.
As the preferred technical scheme, the event objects are divided into three types of new addition, modification and deletion.
The invention relates to a MySQL big table association query method, which comprises the following specific steps:
step one, a binlog switch of the MySQL server is started, and a Row format is set to reflect data change of a Row level in real time, all DDL and DML operations are recorded in the binlog, and master-slave copy and data recovery are realized by using the binlog;
simulating a slave node of MySQL by using a Canal tool, monitoring and acquiring binlog incremental change in real time, analyzing and packaging the monitored MySQL operation event into an event object, and sending the event object to a message queue MQ;
step three, after a consumer using the data processing module as a message queue monitors binlog event information in the queue in real time, corresponding logic processing is carried out, the data information is back-checked in MySQL, and finally the correlation query screening conditions and the id of each table are synchronized into ES;
and step four, screening and inquiring the data by using the ES, and inquiring a corresponding table in MySQL according to the acquired data id set to acquire detailed information, assembling the detailed information and returning the assembled information to the client.
The invention has the beneficial effects that: the invention synchronizes data change in real time through the Binlog of MySQL, avoids intrusion operation to service codes and a database, improves the expansibility of the system by introducing MQ in a loose coupling mode, uses ES for searching, solves the performance problem of MySQL related query, and improves user experience.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic flow chart of the present invention;
fig. 2 is a system architecture diagram of the present invention.
Detailed Description
All of the features disclosed in this specification, or all of the steps in any method or process so disclosed, may be combined in any combination, except combinations of features and/or steps that are mutually exclusive.
As shown in fig. 2, the MySQL large table association query system of the present invention includes Canal, MQ, a data processing module, and ES, where real-time data of Canal analysis binlog is synchronized to MQ, the data processing module monitors MQ messages and processes data and synchronizes to ES in real time, the ES is used to perform screening query on data, and according to the acquired data id set, the MySQL queries a corresponding table to acquire detailed information, assembles the detailed information, and returns the detailed information to a client.
The method comprises the steps that a client simulates an interactive protocol of MySQL slave, pretends to be MySQL slave and sends a dump protocol to MySQL, the MySQL receives the dump request and then pushes binlog to the client, the client receives a binlog change event and then encapsulates the binlog change event into an internal event object which is divided into three types of adding INSERT, modifying UPDATE and deleting DELETE, the internal event object is sent to a message queue MQ, and a data processing module is responsible for monitoring the message in the message queue, then finds out the corresponding ES index configuration according to the table name, judges the index to be changed, and operates the corresponding index.
The data processing module comprises the functions of mapping relation maintenance of database fields and ES indexes, message queue MQ monitoring, ES operation command construction and ES change operation.
The event objects in the embodiment are divided into three types, namely adding, modifying and deleting.
As shown in fig. 1, in the MySQL large table association query method of the present invention, during query, a binlog switch of a MySQL server is turned on, and is set to a Row format to reflect data changes at a Row level in real time, all DDL and DML operations are recorded in the binlog, and master-slave replication and data recovery are implemented by using the binlog; then, simulating a slave node of MySQL by using a Canal tool, monitoring and acquiring binlog increment change in real time, analyzing and encapsulating the monitored MySQL operation event into an event object, and sending the event object to a message queue MQ; after a consumer using a data processing module as a message queue monitors binlog event information in the queue in real time, corresponding logic processing is carried out, the data information is back-checked in MySQL, and finally the correlation query screening conditions and the id of each table are synchronized into ES; and finally, screening and inquiring the data by using the ES, and acquiring detailed information from a corresponding table inquired in MySQL according to the acquired data id set, assembling the detailed information and returning the assembled detailed information to the client.
In this embodiment, the method and system for MySQL large table associative query of the present invention are mainly used to solve the performance problem of MySQL associative query and implement no code intrusion in a loose coupling manner.
Specifically, (1) MySQL binding records all DDL and DML operations, and stores the DDL and DML operations in a database server disk in a file form, which can be used to view information such as SQL change operations and time consumption of a database, and implement master-slave copy, data recovery, and the like. The Row mode needs to be started for better supporting the Canal operation;
(2) The method comprises the steps that an interactive protocol of a MySQL slave is simulated by the Canal, the Canal is disguised as the MySQL slave and sends a dump protocol to the MySQL, the MySQL pushes a binlog to the Canal after receiving a dump request, the Canal encapsulates an internal event object after receiving a binlog change event into three types, namely adding INSERT, modifying UPDATE and deleting DELETE, and sends the three types of the internal event object into a message queue MQ (the queue needs to be distinguished according to schemas of a database, namely different schemas are distributed into different queues).
The message content comprises operation type, field value after change, field value before change and table name:
(3) And the data processing module is responsible for finding out the corresponding ES index configuration according to the table name to judge the index to be changed after monitoring the message in the message queue, and operating the corresponding index. The module should include the following functions: the method comprises the steps of maintaining mapping relation of database fields and ES indexes, monitoring a message queue MQ, constructing an ES operation command and performing change operation on an ES.
Taking the table user and the company joint query as an example, the table user and the company are in a many-to-one relationship, that is, one company record corresponds to multiple user records, the associated field is the company _ id field in the user table, the query conditions are company.
Corresponding ES index structure:
"mappings":{
"_doc":{
"properties":{
"user_id":{
"type":"long"
},
"company_id":{
"type":"long"
},
"user_name":{
"type":"keyword"
},
"compay_name":{
"type":"keyword"
}
}
}
}。
if the mapping relationship between the database field and the ES index comprises: database table fields, ES index fields, SQL mapping, index name, reference table, and master table:
the fields of the database table and the index fields need to be in one-to-one correspondence, SQL mapping needs to contain all associated query conditions and ids when generating ES instructions and reversely checking data, and a left join statement is needed to designate a main table.
After monitoring MQ messages changed by a user or a company table, matching indexes, namely user _ company _ index, needing to be changed according to the maintained mapping data, and generating corresponding operation instructions;
according to the maintained SQL mapping primary and secondary table relation, if the INSERT operation of the company table is required to be ignored and no operation is required to be carried out; the INSERT and UPDATE operations in the ES instruction can perform reverse check on a database according to the SQL mapping statement to obtain all field values, and then the field values are converted into newly-added or updateByQuery operations at the bottom layer; the DELETE instruction of ES generates a deleteByQuery operation according to the id field value changed by the USER table.
(4) When in query, the index of the user _ company _ index in the ES is queried, the queried user _ id set is obtained, then the corresponding user data is queried in the user table, and then the corresponding information is queried in the company table according to the user and the company associated field company _ id.
The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that are not thought of through the inventive work should be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope defined by the claims.
Claims (6)
1. A MySQL large table correlation query system is characterized in that: the system comprises a Canal, an MQ, a data processing module and an ES, wherein real-time data of Canal analysis binlog are synchronized into the MQ, the data processing module monitors MQ messages and processes the data to be synchronized into the ES in real time, the ES is used for screening and inquiring the data, and the acquired data id set is transmitted to a MySQL to inquire a corresponding table to acquire detailed information to be assembled and returned to a client.
2. The MySQL big table correlation query system according to claim 1, characterized in that: the method comprises the steps that an interactive protocol of MySQL slave is simulated by the Canal, the Canal is disguised as MySQL slave and sends a dump protocol to the MySQL, the MySQL pushes binlog to the Canal after receiving a dump request, the Canal encapsulates an internal event object after receiving a binlog change event, the internal event object is divided into three types of adding INSERT, modifying UPDATE and deleting DELETE, and the three types of adding INSERT, modifying UPDATE and deleting DELETE are sent to a message queue MQ.
3. The MySQL large-table associative query system according to claim 1, characterized in that: and the data processing module is responsible for finding out the corresponding ES index configuration according to the table name to judge the index to be changed after monitoring the message in the message queue, and operating the corresponding index.
4. The MySQL large-table associative query system according to claim 1 or 3, characterized in that: the data processing module comprises the functions of mapping relation maintenance of database fields and ES indexes, message queue MQ monitoring, ES operation command construction and ES change operation.
5. The MySQL large-table associative query system according to claim 1, characterized in that: the event objects are divided into three types of new addition, modification and deletion.
6. A MySQL big table association query method is characterized in that the query system of any one of claims 1 to 5 is used, and the specific steps are as follows:
step one, a binlog switch of a MySQL server is started and set to be in a Row format so as to reflect data change of a Row level in real time, all DDL and DML operations are recorded in the binlog, and master-slave copy and data recovery are realized by using the binlog;
simulating a slave node of MySQL by using a Canal tool, monitoring and acquiring binlog incremental change in real time, analyzing and packaging the monitored MySQL operation event into an event object, and sending the event object to a message queue MQ;
step three, after a data processing module is used as a consumer of the message queue to monitor binlog event information in the queue in real time, corresponding logic processing is carried out, data information is back-checked in MySQL, and finally the correlation query screening conditions and the id of each table are synchronized into an ES;
and step four, screening and inquiring the data by using the ES, and inquiring a corresponding table in MySQL according to the acquired data id set to acquire detailed information, assembling the detailed information and returning the assembled detailed information to the client.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211168253.0A CN115905313A (en) | 2022-09-23 | 2022-09-23 | MySQL big table association query system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211168253.0A CN115905313A (en) | 2022-09-23 | 2022-09-23 | MySQL big table association query system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115905313A true CN115905313A (en) | 2023-04-04 |
Family
ID=86482459
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211168253.0A Pending CN115905313A (en) | 2022-09-23 | 2022-09-23 | MySQL big table association query system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115905313A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116150211A (en) * | 2023-04-18 | 2023-05-23 | 北京江融信科技有限公司 | Multi-data source query method, platform and application system |
-
2022
- 2022-09-23 CN CN202211168253.0A patent/CN115905313A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116150211A (en) * | 2023-04-18 | 2023-05-23 | 北京江融信科技有限公司 | Multi-data source query method, platform and application system |
CN116150211B (en) * | 2023-04-18 | 2023-08-18 | 北京江融信科技有限公司 | Multi-data source query method, platform and application system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11816126B2 (en) | Large scale unstructured database systems | |
US11755575B2 (en) | Processing database queries using format conversion | |
TWI710919B (en) | Data storage device, translation device and data inventory acquisition method | |
CN107402963B (en) | Search data construction method, incremental data pushing device and equipment | |
CN109906448B (en) | Method, apparatus, and medium for facilitating operations on pluggable databases | |
US8886617B2 (en) | Query-based searching using a virtual table | |
US8745029B2 (en) | Method for generating indexes for downloading data | |
US7690000B2 (en) | Metadata journal for information technology systems | |
CN109656958B (en) | Data query method and system | |
CN101901242A (en) | Federated configuration data management | |
CN111046036A (en) | Data synchronization method, device, system and storage medium | |
CN114116716A (en) | Hierarchical data retrieval method, device and equipment | |
CN114218218A (en) | Data processing method, device and equipment based on data warehouse and storage medium | |
CN115905313A (en) | MySQL big table association query system and method | |
CN114661832A (en) | Multi-mode heterogeneous data storage method and system based on data quality | |
CN109165262A (en) | Fragmentation clustering system and fragmentation method of relational large table | |
CN113934713A (en) | Order data indexing method, system, computer equipment and storage medium | |
CN110245037B (en) | Hive user operation behavior restoration method based on logs | |
Serrano et al. | From relations to multi-dimensional maps: A SQL-to-hbase transformation methodology | |
CN114003614A (en) | Data synchronization device, method, online transaction system, computer equipment and storage medium | |
Cerpa | Pre-physical data base design heuristics | |
Porshnev et al. | The concupt of automated system of steel production storage data structure | |
Gomes et al. | An object mapping for the Cassandra distributed database | |
Le et al. | Cloud Database | |
US20220164349A1 (en) | Distributed queries through dynamic views |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |