CN111881323A - Table separation method based on sorting field and time routing - Google Patents

Table separation method based on sorting field and time routing Download PDF

Info

Publication number
CN111881323A
CN111881323A CN202010567383.6A CN202010567383A CN111881323A CN 111881323 A CN111881323 A CN 111881323A CN 202010567383 A CN202010567383 A CN 202010567383A CN 111881323 A CN111881323 A CN 111881323A
Authority
CN
China
Prior art keywords
data
database
list
xml file
separation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010567383.6A
Other languages
Chinese (zh)
Other versions
CN111881323B (en
Inventor
王崟平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan XW Bank Co Ltd
Original Assignee
Sichuan XW Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan XW Bank Co Ltd filed Critical Sichuan XW Bank Co Ltd
Priority to CN202010567383.6A priority Critical patent/CN111881323B/en
Publication of CN111881323A publication Critical patent/CN111881323A/en
Application granted granted Critical
Publication of CN111881323B publication Critical patent/CN111881323B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/81Indexing, e.g. XML tags; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a table separation method based on sequencing fields and time routing, which comprises the following steps: A. configuring required database parameters according to different types of table separation scenes in the database, and setting a configuration file for operating the database; B. reading column information of the table according to the configured table name; C. if the scene is a hot list separation scene, inquiring data before the storage days configured in the list; if the scene is a cold list separation scene, the data of the list is inquired; D. storing the inquired data in corresponding parameters; E. dividing granularity according to configured date and time, and storing data in the parameters into a specified object; F. and writing the data in the specified object into a separate table one by one, and deleting the successfully written data from the database table corresponding to the configured database table name. The invention can multiplex and uniformly manage the sub-table codes, greatly reduces the repeated labor, and does not need new development, test and online data table separation.

Description

Table separation method based on sorting field and time routing
Technical Field
The invention relates to a data processing method in a database, in particular to a table separation method based on sequencing fields and time routing.
Background
In the databases of banking systems, there are usually two types of database tables, hot and cold tables. A hot table refers to a table with data being written continuously, a cold table refers to a table with many rows of data and no new data to be rewritten, such as a mysql table with two or three hundred million rows of data. When the data in the hot table or the cold table is too large, the data in the table needs to be queried slowly, so that the hot table or the cold table needs to be split into a plurality of small tables according to time, and the query is faster.
At present, in each system of a bank, tables which need to be separated are developed, tested and on-line based on the business needs of the system, and a universal design scheme and a universal library do not exist, so that a large amount of repeated labor is caused. Meanwhile, the current mode has the following disadvantages:
1. because the codes of the sub-tables of different systems are not universal, a large amount of redundant codes and repeated labor are caused, the time for developing, testing and maintaining is wasted, and the unified maintenance cannot be realized. For example, Motian technicians find the A-table too large and need to be split, and then write a piece of the sublist code to split the A-table and test the upper line. After a period of time, the B table needs to be separated, a new sub-table code needs to be developed, and then the online test is carried out.
2. Non-configurable and lack of flexibility. For example, the former A table has less data volume entering every day, the data of the A table is separated according to the granularity of half a year during separation, and the A table needs to be separated according to the quarterly after the service is increased. Or the service using the A table is stopped, no data exists, the task of separating the A table data needs to be closed, the code may need to be changed to be on line again, or the configuration is modified, and then the program is restarted, and the like. New codes need to be developed to realize the functions, which is time-consuming and labor-consuming and causes various waste of resources.
Disclosure of Invention
The invention provides a table separation method based on sequencing fields and time routing, which can be used for multiplexing and uniformly managing sub-table codes and reducing repeated labor.
The invention relates to a table separation method based on sequencing fields and time routing, which comprises the following steps:
A. configuring required database parameters according to a hot table separation scene and a cold table separation scene in a database, and setting an xml file for operating the database and each attribute in a TableInfo object having a parameter source required by a database operation language in the xml file;
B. reading column information of the table according to the configured database table name through the xml file, wherein the column information comprises each column name and a column value corresponding to each column name;
C. if the scene is a hot list separation scene, inquiring data before the storage days configured in the database list name corresponding list through the xml file; if the scene is a cold list separation scene, inquiring data in the database list name corresponding list through the xml file;
D. storing the inquired data in a mapList parameter of a List < Map < String, Object > structure in a TableInfo Object, wherein the Map < String, Object > structure represents a row of data in the inquired data, a key is a column name of the inquired data, and a value is a value corresponding to the column name;
E. storing data in the mapList parameter into an insermap object of a Map < String, TableInfo > structure according to configured date and time partition granularity, wherein a key is (the name of the database table + configured date + time partition granularity identifier), the value of the key is tablelnfo object, and the data in the mapList parameter corresponding to the date and time partition granularity is stored in the mapList parameter of the tablelnfo object;
F. and writing data in mapList parameters in a value of a TableInfo object type of an instermAP object into a separation table with keys in the instermAP object as table names one by one through the xml file, and finally deleting the successfully written data from a database table corresponding to the configured database table names through the xml file.
The invention operates and divides the database by the configuration mode, and the configuration object has universality and the function of a template. Different bank systems only need to set corresponding values in the configuration objects, so that the mode and codes of the branch table are unified, the purposes of code reuse and flexibility improvement are achieved, and the steps of code modification, re-online and the like are not needed.
Further, in step F, if the detached list does not exist, a new detached list is created in the database by using the key in the insterMap object as a list name through the xml file, and then the data writing is performed. So as to ensure that data writing abnormity can not be caused by no corresponding separation table when data is written.
Preferably, the query is performed 3 times at intervals in all steps of performing database query through the xml file, so as to prevent data query abnormality caused by network fluctuation.
Preferably, the steps B to F are automatically executed by setting a timing task, so that the manual operation is further simplified.
On the basis, when data abnormity occurs in the operation of the database through the xml file, corresponding abnormal information is prompted.
The table separation method based on the sequencing field and the time routing can multiplex and uniformly manage the sub-table codes, greatly reduce repeated labor, save labor cost, facilitate access, and only need to configure parameters to start timing tasks without new development, test and online data table separation.
The present invention will be described in further detail with reference to the following examples. This should not be understood as limiting the scope of the above-described subject matter of the present invention to the following examples. Various substitutions and alterations according to the general knowledge and conventional practice in the art are intended to be included within the scope of the present invention without departing from the technical spirit of the present invention as described above.
Drawings
FIG. 1 is a flow chart of a table separation method based on sorting field and time routing according to the present invention.
Detailed Description
As shown in fig. 1, the table separation method based on sorting field and time routing of the present invention includes the steps of:
A. configuring required database parameters according to a hot table separation scene, a cold table separation scene and a data cleaning scene in a database, and setting an xml file for operating the database and each attribute in a TableInfo object having a parameter source required by a database operation language in the xml file. The timing task of the method is executed according to the required scene setting. The data cleaning scene means that data in the database does not need to be stored for a long time, for example, the data before two months can be directly deleted, so that the indirection of the database table is ensured, the query efficiency is improved, and useless redundant data are reduced. The configuration parameters and the xml files of the three scenes respectively comprise:
configuration parameters of the hotlist separation scenario:
Figure BDA0002548333950000031
Figure BDA0002548333950000041
configuration parameters of the cold table separation scene:
the difference between cold and hot table separation is that the data size of the cold table is already large, cannot be filtered by time, only by id primary key, and therefore has no keepDays parameter.
Figure BDA0002548333950000042
Figure BDA0002548333950000051
Configuration parameters of the data cleaning scenario:
Figure BDA0002548333950000052
Figure BDA0002548333950000061
tablelnfo object parameter definition:
the attributes of the object are the source of parameters required for sql statements in an xml file (tablescript. The parameters in $ { } or # { } in the sql statement in the xml file come from the object:
Figure BDA0002548333950000062
Figure BDA0002548333950000071
an xml file: xml configuration:
after the sqlSessionsTemplate parameters in the three scenes are combined by mybatis (an open source item of apache) and tableSerplit. xml, the database can be operated by using the configured sql sentences in the xml file, and the package names of the items need to be changed when the respective items are introduced. Different databases are configured with different sqlSessionsTemplate parameters, and the configuration of each sqlSessionsTemplate parameter needs to introduce TableSpit. Xml is shown below as an example of a practically used tablescript:
Figure BDA0002548333950000072
Figure BDA0002548333950000081
Figure BDA0002548333950000091
B. and reading column information of the table according to the configured database table name through the xml file, wherein the column information comprises each column name and a column value corresponding to each column name.
C. If the scene is a hot list separation scene, inquiring data before the storage days configured in the database list name corresponding list through the xml file; and if the scene is the cold list separation scene, inquiring the data in the database list name corresponding list through the xml file.
D. And storing the inquired data in a mapList parameter of a List < Map < String, Object > structure in a TableInfo Object, wherein the Map < String, Object > structure represents a row of data in the inquired data, a key is a column name of the inquired data, and a value is a value corresponding to the column name.
E. According to the configured date and time partition granularity, storing the data in the mapList parameter into an insermap object of a Map < String, TableInfo > structure, wherein the key is (the database table name + the configured date + time partition granularity identifier), the value is a tableInfo object, and the data in the mapList parameter corresponding to the date and time partition granularity is stored in the mapList parameter of the tableInfo object.
F. And writing data in mapList parameters in the value of the TableInfo object type of the instermAP object into a separation table with keys in the instermAP object as table names one by one through the xml file. If the separation table does not exist, a new separation table is created in a database by the xml file by taking the key in the insterMap object as a table name, and then the data is written. And finally deleting the successfully written data from the database table corresponding to the configured database table name through the xml file.
All the steps of inquiring the database through the xml file are inquired at intervals of 3 times, and the interval is 10 seconds every time, so that abnormal data inquiry caused by network fluctuation is prevented. And the steps B to F are automatically executed by setting the timing task, so that the manual operation is further simplified.
If any abnormity exists in the steps, prompting is carried out.
The invention operates and divides the database by the configuration mode, and the configuration object has universality and the function of a template. Different bank systems only need to set corresponding values in the configuration objects, so that the mode and codes of the branch table are unified, the purposes of code reuse and flexibility improvement are achieved, and the steps of code modification, re-online and the like are not needed.
Example 1:
hotlist separation scenario: in the device fingerprint (a web application system developed by Java language), a js _ access _ log database table is provided, hundreds of thousands of data are entered every day, and tens of millions of data are entered every month. The read-write performance of the table is affected after the data volume is large, so that the data in the js _ access _ log table for more than 60 days needs to be separated into the cold table in a quarterly manner, and the data is triggered once every night and early in the morning. After the js _ access _ log database table is separated by the table separation method, the cold table format is formed as follows: js _ access _ log _ his _2020_1_3(2020 is the year, corresponding to the datafield parameter, 1_3 indicates jss _ access _ log table that data from more than 60 days 1 to 3 months are present in the js _ access _ log _ his _2020_1_3 table), js _ access _ log _ his _2020_4_5, and so on.
Example 2:
cold list separation scenario: similarly, for a js _ access _ log database table in the device fingerprint, a written-dead timing task writes data in the js _ access _ log database table for more than 60 days into a js _ access _ log _ his cold table, the data of the js _ access _ log _ his cold table exceeds an upper hundred million lines after two years, more than 300G is achieved, data query is slow and basically unavailable, and the cold table needs to be split into a plurality of small cold tables to prevent the single table from being too large and being unavailable and maintained. After the timing task is configured, the large cold table can be split into a plurality of small cold tables for query by the table separation method of the invention.
Example 3:
data cleaning scenario: there is a temporary data table, operator _ ilog _ con, in the database, hundreds of thousands of data are entered every day, the database performance is not good, so in order to achieve the required performance, only seven days of data need to be kept in the operator _ ilog _ con table, after configuring the timing task, the steps are carried out:
s1, inquiring an id List of corresponding data from a database table (namely, an oper _ ilog _ con table) corresponding to the tableName parameter according to a keepDays parameter (namely, seven days) configured in a data cleaning scene, and saving the id List into a List < Object > idList parameter of a tablelnfo Object through an sql statement in an xml file.
And S2, deleting corresponding data from the operator _ ilog _ con table through an sql statement in the xml file according to the id stored in the idList parameter.
If any abnormity exists in the steps, prompting is carried out.
Through the above steps, the data in the operator _ ilog _ con table for more than seven days is cleaned up.
The three scenes basically cover all scenes for separating the database table data, and other database tables meeting the scenes can realize the table separation only by configuring corresponding parameters and setting timing tasks.

Claims (5)

1. A table separation method based on sorting field and time route is characterized by comprising the following steps:
A. configuring required database parameters according to a hot table separation scene and a cold table separation scene in a database, and setting an xml file for operating the database and each attribute in a TableInfo object having a parameter source required by a database operation language in the xml file;
B. reading column information of the table according to the configured database table name through the xml file, wherein the column information comprises each column name and a column value corresponding to each column name;
C. if the scene is a hot list separation scene, inquiring data before the storage days configured in the database list name corresponding list through the xml file; if the scene is a cold list separation scene, inquiring data in the database list name corresponding list through the xml file;
D. storing the inquired data in a mapList parameter of a List < Map < String, Object > structure in a TableInfo Object, wherein the Map < String, Object > structure represents a row of data in the inquired data, a key is a column name of the inquired data, and a value is a value corresponding to the column name;
E. storing data in the mapList parameter into an insermap object of a Map < String, TableInfo > structure according to configured date and time partition granularity, wherein a key is (the name of the database table + configured date + time partition granularity identifier), the value of the key is tablelnfo object, and the data in the mapList parameter corresponding to the date and time partition granularity is stored in the mapList parameter of the tablelnfo object;
F. and writing data in mapList parameters in a value of a TableInfo object type of an instermAP object into a separation table with keys in the instermAP object as table names one by one through the xml file, and finally deleting the successfully written data from a database table corresponding to the configured database table names through the xml file.
2. The table separation method based on ordered fields and time routing according to claim 1, characterized by: in step F, if the detached list does not exist, a new detached list is created in the database by using the key in the insterMap object as a list name through the xml file, and then the data is written.
3. The table separation method based on ordered fields and time routing according to claim 1, characterized by: and querying the database for 3 times at intervals in all steps of querying the database through the xml file so as to prevent data query abnormality caused by network fluctuation.
4. The table separation method based on ordered fields and time routing according to claim 1, characterized by: and automatically executing the steps B to F by setting a timing task.
5. Table splitting method based on ordered fields and temporal routing according to one of claims 1 to 4, characterized in that: and when data abnormity occurs in the operation of the database through the xml file, prompting corresponding abnormal information.
CN202010567383.6A 2020-06-19 2020-06-19 Table separation method based on sequencing field and time routing Active CN111881323B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010567383.6A CN111881323B (en) 2020-06-19 2020-06-19 Table separation method based on sequencing field and time routing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010567383.6A CN111881323B (en) 2020-06-19 2020-06-19 Table separation method based on sequencing field and time routing

Publications (2)

Publication Number Publication Date
CN111881323A true CN111881323A (en) 2020-11-03
CN111881323B CN111881323B (en) 2022-05-20

Family

ID=73157846

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010567383.6A Active CN111881323B (en) 2020-06-19 2020-06-19 Table separation method based on sequencing field and time routing

Country Status (1)

Country Link
CN (1) CN111881323B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448966A (en) * 2021-07-17 2021-09-28 绿漫科技有限公司 Order data multidimensional spreadsheet system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140108421A1 (en) * 2012-10-04 2014-04-17 Codefutures Corporation Partitioning database data in a sharded database
CN104714972A (en) * 2013-12-17 2015-06-17 中国银联股份有限公司 Database sub-table establishing and searching method
CN104794123A (en) * 2014-01-20 2015-07-22 阿里巴巴集团控股有限公司 Method and device for establishing NoSQL database index for semi-structured data
CN108153849A (en) * 2017-12-20 2018-06-12 杭州登虹科技有限公司 A kind of database table cutting method, device, system and medium
US20190102476A1 (en) * 2017-09-29 2019-04-04 Oracle International Corporation Handling semi-structured and unstructured data in a sharded database environment
US20190114272A1 (en) * 2017-10-12 2019-04-18 Western Digital Technologies, Inc. Methods and apparatus for variable size logical page management based on hot and cold data
CN110659393A (en) * 2019-09-12 2020-01-07 河南九商科技有限公司 Method and system for generating xml code

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140108421A1 (en) * 2012-10-04 2014-04-17 Codefutures Corporation Partitioning database data in a sharded database
CN104714972A (en) * 2013-12-17 2015-06-17 中国银联股份有限公司 Database sub-table establishing and searching method
CN104794123A (en) * 2014-01-20 2015-07-22 阿里巴巴集团控股有限公司 Method and device for establishing NoSQL database index for semi-structured data
US20190102476A1 (en) * 2017-09-29 2019-04-04 Oracle International Corporation Handling semi-structured and unstructured data in a sharded database environment
US20190114272A1 (en) * 2017-10-12 2019-04-18 Western Digital Technologies, Inc. Methods and apparatus for variable size logical page management based on hot and cold data
CN108153849A (en) * 2017-12-20 2018-06-12 杭州登虹科技有限公司 A kind of database table cutting method, device, system and medium
CN110659393A (en) * 2019-09-12 2020-01-07 河南九商科技有限公司 Method and system for generating xml code

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
严成武: "支持分库分表和读写分离的ORM框架的设计与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
冯超政 等: "基于冷热数据的MongoDB自动分片机制", 《计算机工程》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448966A (en) * 2021-07-17 2021-09-28 绿漫科技有限公司 Order data multidimensional spreadsheet system
CN113448966B (en) * 2021-07-17 2022-06-21 绿漫科技有限公司 Order data multidimensional spreadsheet system

Also Published As

Publication number Publication date
CN111881323B (en) 2022-05-20

Similar Documents

Publication Publication Date Title
CN107958057B (en) Code generation method and device for data migration in heterogeneous database
CN101477486B (en) File backup recovery method based on sector recombination
CN109871373B (en) Data storage method and device and computer readable storage medium
US20110264636A1 (en) Updating a data warehouse schema based on changes in an observation model
CA3176450A1 (en) Method and apparatus for implementing incremental data consistency
CN101452487B (en) Data loading method and system, and data loading unit
CN105787058B (en) A kind of user tag system and the data delivery system based on user tag system
CN101490675A (en) Methods and apparatus for reusing data access and presentation elements
US20090055418A1 (en) Automatic cascading copy operations in a database with referential integrity
CN108369599A (en) Duplication control between redundant data center
CN111914066B (en) Global searching method and system for multi-source database
AU2014271289A1 (en) Extract, transform and load (etl) system and method
CN112148718A (en) Big data support management system for city-level data middling station
CN112416923A (en) Metadata management method and device, equipment and storage medium
US7020656B1 (en) Partition exchange loading technique for fast addition of data to a data warehousing system
CN110929120B (en) Method and apparatus for managing technical metadata
CN115858513A (en) Data governance method, data governance device, computer equipment and storage medium
CN105095392B (en) A kind of method and device of Data Mart data sharing
CN111881323B (en) Table separation method based on sequencing field and time routing
CN114416703A (en) Method, device, equipment and medium for automatically monitoring data integrity
CN107291938A (en) Order Query System and method
CN110879799A (en) Method and apparatus for managing technical metadata
CN113590651B (en) HQL-based cross-cluster data processing system and method
CN113032385B (en) Easily-extensible configurable data backup system and method
US7433882B2 (en) Data management system and computer program

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant