CN110321334B - Method and device for automatic partition capacity expansion of database - Google Patents

Method and device for automatic partition capacity expansion of database Download PDF

Info

Publication number
CN110321334B
CN110321334B CN201910440428.0A CN201910440428A CN110321334B CN 110321334 B CN110321334 B CN 110321334B CN 201910440428 A CN201910440428 A CN 201910440428A CN 110321334 B CN110321334 B CN 110321334B
Authority
CN
China
Prior art keywords
partition
data
time period
current
data amount
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910440428.0A
Other languages
Chinese (zh)
Other versions
CN110321334A (en
Inventor
徐丽萍
姚伏霞
安娜娜
包六十三
邹延鑫
徐晶
欧阳海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd filed Critical Ping An Bank Co Ltd
Priority to CN201910440428.0A priority Critical patent/CN110321334B/en
Publication of CN110321334A publication Critical patent/CN110321334A/en
Application granted granted Critical
Publication of CN110321334B publication Critical patent/CN110321334B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and a device for automatic partition capacity expansion of a database, wherein the method comprises the following steps: in the management period, after detecting that the data amount stored in the current partition reaches a preset threshold value; acquiring a data time period and a current data capacity corresponding to a current partition; screening out the historical partition data volume of the historical time period corresponding to the data time period; calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by using a pre-established partition data quantity prediction model; and if the predicted data volume is larger than the current data capacity, expanding the current partition. The method for automatically partitioning and expanding the database adopts the pre-established prediction model to predict the data volume of the current partition in advance, is favorable for obtaining the accurate data volume to be predicted, and performs targeted expansion on the current partition.

Description

Method and device for automatic partition capacity expansion of database
Technical Field
The invention relates to the field of databases, in particular to a method and a device for automatically partitioning and expanding a database.
Background
With the rapid development of economy, data generation and updating are accelerated, and for example, the amount of data in a bill form is increased with the increase of consumption. Today, the consumption has various influencing factors, such as the holiday of shopping with big size of double 11, and the quantity of bank bills is easy to be increased in an explosive manner, thereby influencing the stability of a database.
Therefore, it is necessary to predict the data volume of the database partition in advance, and then relieve the pressure of the database by means of pre-expansion.
Disclosure of Invention
In order to solve the problems, particularly the problem that the data volume of the database partition cannot be predicted in advance in the prior art, the invention adopts the following technical schemes:
in a first aspect, the present invention provides a method for automatic partition expansion of a database, the method comprising:
in the management period, after detecting that the data amount stored in the current partition reaches a preset threshold value;
acquiring a data time period and a current data capacity corresponding to a current partition;
screening out the historical partition data volume of the historical time period corresponding to the data time period;
calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by using a pre-established partition data quantity prediction model;
and if the predicted data volume is larger than the current data capacity, expanding the current partition.
Preferably, the calculating, by using a pre-established partition data amount prediction model, the predicted data amount corresponding to the current partition according to the historical partition data amount includes:
Fitting the historical partition data volume according to a regression equation prediction method, and determining a fitting function;
and taking the fitting function as a partition data amount prediction model, inputting the historical partition data amount into the partition data amount prediction model, and calculating the predicted data amount corresponding to the current partition.
Preferably, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes:
and inputting the actual data quantity of the current partition into the partition data quantity prediction model, and correcting a fitting function of the partition data quantity prediction model according to the actual data quantity.
Preferably, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes:
according to the weighted moving average method, the predicted data amount is corrected based on the historical partition data amount and the partition data amount of the previous period of the data period.
Preferably, the weighted moving average method corrects the predicted data amount sj_st using the following formula:
SJ_ST=C t +d×(C t -C t-1 )+b×(C y1_t -C y1_t-1 )+p×(C y2_t -C y2_t-1 );
Wherein Ct is the predicted data amount of the data period t of the current partition, C t-1 The partition data amount corresponding to the previous time period t-1 of the data time period t, C y1_t C is the partition data volume corresponding to the same time period y1_t of the previous year corresponding to the data time period t y1_t-1 The partition data amount corresponding to the previous time period y1_t-1 of the y1_t time period C y2_t The partition data amount corresponding to the same time period y2_t of the first two years of the data time period t, C y2_t-1 The partition data amount corresponding to the previous time period y2_t-1 of the y2_t time period, d, b and p are holiday factor correction factors, and d+b+p=1.
Preferably, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes:
and if the difference value between the predicted data quantity corresponding to the current partition and the actual data quantity of the current partition is larger than a preset difference value, carrying out capacity reduction on the current partition.
Preferably, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes:
if the predicted data amount is larger than the preset maximum capacity of the current partition, an auxiliary partition of the current partition is established, and the capacity of the auxiliary partition is determined according to the difference value between the predicted data amount and the maximum capacity of the current partition.
In a second aspect, the present invention provides an apparatus for automatic partition expansion of a database, the apparatus comprising:
the detection module is used for detecting that the data amount stored in the current partition reaches a preset threshold value in a management period;
the acquisition module is used for acquiring the data time period and the current data capacity corresponding to the current partition;
the screening module is used for screening the historical partition data volume of the historical time period corresponding to the data time period;
the measuring and calculating module is used for calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by utilizing a pre-established partition data quantity prediction model;
and the capacity expansion module is used for expanding the capacity of the current partition when the predicted data volume is larger than the current data capacity.
In a third aspect, the present invention provides a computer device comprising a storage medium and a processor, the storage medium having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of a method of automatic partition expansion of a database according to the first aspect described above.
In a fourth aspect, the present invention provides a computer readable storage medium, which when executed by one or more processors, causes the one or more processors to perform the steps of a method for automatic partition expansion of a database as described in the first aspect above.
Compared with the prior art, the technical scheme of the invention has at least the following advantages:
1. according to the automatic partition capacity expansion method of the database, the data volume of the current partition is predicted in advance by judging the data volume condition of the current partition in advance and based on the historical partition data volume of the historical time period, and the data volume of the current partition is predicted in advance by adopting a pre-established prediction model, so that the accurate data volume to be predicted is facilitated to be obtained, and the current partition is expanded in a targeted mode.
2. According to the method for automatically partitioning and expanding the database, disclosed by the invention, the fitting function of the partitioned data quantity prediction model is corrected through the actual data quantity, so that an autonomous learning process of the prediction model is established, and the accuracy of the prediction model is ensured.
3. The method for automatically partitioning and expanding the database also adopts a weighted moving average method, and corrects the predicted data volume based on the historical partitioned data volume and the partitioned data volume of the previous time period of the data time period, so that the predicted data volume is closer to the actual data volume.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flow chart of an embodiment of an automatic partition capacity expansion method of a database according to the present invention;
FIG. 2 is a flow chart of one embodiment of an automatic database partitioning and capacity expanding device of the present invention;
FIG. 3 is a block diagram of the internal architecture of a computing device in accordance with one embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
In order to enable those skilled in the art to better understand the present invention, the following description will make clear and complete descriptions of the technical solutions according to the embodiments of the present invention with reference to the accompanying drawings.
In some of the flows described in the specification and claims of the present invention and in the foregoing figures, a plurality of operations appearing in a particular order are included, but it should be clearly understood that the operations may be performed in other than the order in which they appear herein or in parallel, the sequence numbers of the operations such as S11, S12, etc. are merely used to distinguish between the various operations, and the sequence numbers themselves do not represent any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first" and "second" herein are used to distinguish different messages, devices, modules, etc., and do not represent a sequence, and are not limited to the "first" and the "second" being different types.
As used herein, the singular forms "a", "an", "the" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by one of ordinary skill in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes all or any element and all combination of one or more of the associated listed items.
It will be understood by those of ordinary skill in the art that unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
Referring to fig. 1, the method for automatically partitioning and expanding a database provided by the present invention includes the following steps:
s11, after detecting that the data amount stored in the current partition reaches a preset threshold value in a management period;
s12, acquiring a data time period and a current data capacity corresponding to the current partition;
in the embodiment of the invention, taking a bill table as an example, generally, for convenience in management, bill table data is partitioned monthly, and the bill table comprises transaction records generated according to time sequence.
And pre-establishing a management period of the database table, wherein the management period takes month as an example, a timing task is pulled up at the beginning of each month, and whether the data amount stored in the current partition is larger than or equal to a preset threshold value is detected. In addition, the data amount stored in the current partition can be monitored in real time, once the data amount stored in the current partition reaches a preset threshold value, the next step is carried out, and a series of different preset threshold values can be set according to different time phases from the beginning of the month to the end of the month so as to meet the requirement of real-time monitoring. Specifically, the data amount includes an index of the current partition and data.
Specifically, the data time period is month, and may be a shorter time period such as week, day, etc. according to the monthly bill table.
The current data capacity is the maximum data size that the current partition can currently accommodate.
S13, screening out the data volume of the historical partition of the historical time period corresponding to the data time period;
preferably, the historical partition data amount of at least two historical time periods corresponding to the data time period is screened out. The granularity is the same for each historical period, e.g., granularity is 1 month. Taking the data time period as 2019 and 11 months as an example, the historical time periods corresponding to the data time period are 2018 and 11 months, 2017 and 11 months, and so on. In addition, the historical time period can be selected according to actual requirements, and if the current data time period is a certain day, the corresponding historical time period is a certain day. For example, the current data time period is 11 days of 11 months in 2019, the screened historical time periods corresponding to the data time period are 11 days of 11 months in 2018, 11 days of 11 months in 2017, and the like. According to the method for automatically partitioning and expanding the database, the historical partition data volume of the historical time period, such as the last year, the last year and even the earlier year, is analyzed, and the possible value of the bill data volume of the data time period corresponding to the current partition is predicted, so that the partition is expanded in advance, the partition capacity is reasonably utilized, and the situation that the database is pressurized due to the sudden increase of the data volume can be avoided.
S14, calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by utilizing a pre-established partition data quantity prediction model;
the calculating, by using a pre-established partition data amount prediction model, a predicted data amount corresponding to the current partition according to the historical partition data amount includes:
fitting the historical partition data volume according to a regression equation prediction method, and determining a fitting function;
and taking the fitting function as a partition data amount prediction model, inputting the historical partition data amount into the partition data amount prediction model, and calculating the predicted data amount corresponding to the current partition.
Specifically, multiple regression equations are preset, and a target regression equation which is most in line with the historical partition data amount is determined according to the historical partition data amount and is used as a fitting function. Wherein, the preset regression equation includes, but is not limited to, 1 degree term, 2 degree term, 3 degree term, index term, etc.
Further, different data time periods correspond to different fitting functions. In brief, there are corresponding fitting functions for different months, such as 11 months for 11 months. And selecting corresponding fitting functions according to different data time periods to predict the predicted data quantity, so as to obtain more accurate data quantity to be predicted.
Specifically, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes:
and inputting the actual data quantity of the current partition into the partition data quantity prediction model, and correcting a fitting function of the partition data quantity prediction model according to the actual data quantity.
Preferably, after the actual data quantity of the current partition is counted, the actual data quantity of the current partition is input into the partition data quantity prediction model, and the fitting function is corrected, so that an autonomous learning process of the prediction model is established, and accuracy of the prediction model is guaranteed. After the actual data volume of each data period is counted, the fitting function corresponding to the data period is updated in time so as to be used for predicting the data volume of the next data period.
The method further comprises the steps of after calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by using a pre-established partition data quantity prediction model, and the method comprises the following steps:
according to the weighted moving average method, the predicted data amount is corrected based on the historical partition data amount and the partition data amount of the previous period of the data period.
Specifically, the weighted shift method corrects the predicted data amount sj_st by using the following formula I, where the corrected predicted data amount is sj_st:
SJ_ST=C t +d×(C t -C t-1 )+b×(C y1_t -C y1_t-1 )+p×(C y2_t -C y2_t-1 ) (formula I);
wherein Ct is the predicted data amount of the data period t of the current partition, C t-1 The partition data amount corresponding to the previous time period t-1 of the data time period t, C y1_t C is the partition data volume corresponding to the same time period y1_t of the previous year corresponding to the data time period t y1_t-1 The partition data amount corresponding to the previous time period y1_t-1 of the y1_t time period C y2_t The partition data amount corresponding to the same time period y2_t of the first two years of the data time period t, C y2_t-1 The partition data amount corresponding to the previous time period y2_t-1 of the y2_t time period, d, b and p are holiday factor correction factors, and d+b+p=1.
The weights of d, b and p are automatically adjusted according to the number of holidays, and the more the holidays are, the larger the weights are. Meanwhile, when an important holiday is on the weekend, the weight is correspondingly increased.
Specifically, taking the predicted data amount of a certain bill table for correcting the month of 2018, 11 as an example, the predicted data amount of the month of 2018 is corrected based on the historical data amounts of the month of 2018, the month of 2017, the month of 2016, and the month of 2016, 10, and the predicted data amount of the month of 2018, 11, calculated by using the above-mentioned partitioned data amount prediction model, and the specific historical data amounts and predicted data amounts are shown in the following table:
* And the predicted data quantity of 2018 and 11 months, which is calculated by adopting the partition data quantity prediction model, is not the actual data quantity.
Based on the data, the predicted data amount sj_st is corrected by adopting the formula I in combination with holiday factor correction factors of 2018, 2017 and 2016, which are respectively 0.4, 0.4 and 0.2, so as to obtain the corrected predicted data amount 104.4, which is closer to the actual data amount 104 than the predicted data amount 105 before correction. From the above, it can be explained that the weighted moving average method provided by the embodiment of the present invention can effectively correct the predicted data amount.
Further, when the following situation occurs, that is, ct=ct1=0, cy 1_t=cy 1_t-1+.0, cy 2_t=cy 2_t-1+.0, if the SJ value calculated using the above-mentioned weighted shift method will be 0, this situation may cause the system to provide no service, and therefore, at this time, the system will calculate the predicted data amount sj_sa of the current time period t using the following formula II:
SJ_SA=C t +d×C t-1 +b×C y1_t +p×C y2_t (formula II);
wherein C is t C for predicting the data amount of the data period t of the current partition t-1 Pair of previous time period t-1 for data time period tAmount of partition data to be used, C y1_t C is the partition data volume corresponding to the same time period y1_t of the previous year corresponding to the data time period t y2_t The adjustment of the partition data amounts d, b, and p corresponding to the same time period y2_t of the first two years of the data period t is the same as above.
When the following condition occurs, namely C t =C t-1 =0、C y1_t =C y1_t-1 ≠0、C y2_t =C y2_t-1 If not equal to 0, the comprehensive compromise algorithm of the formula I and the formula II may also be used to calculate the prediction data sj_sta of the current time period t, see formula III:
SJ=SJ_STA=C t =m×sj_st+n×sj_sa (formula III);
where m and n are weighting coefficients, and m+n=1.
In summary, the corresponding prediction algorithm can be selected according to the actual situation, so that the SJ value not only effectively reflects the regularity of the effect of the factors affecting the predicted data amount, but also reflects the change caused by the external environment change, so that the predicted data amount is closer to the actual data amount.
Preferably, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes: if the difference value between the predicted data volume corresponding to the current partition and the actual data volume of the current partition is larger than the preset difference value, the current partition is contracted, and the current partition is contracted along with the actual data volume of the partition, so that the operation and maintenance cost of the database is effectively reduced, the saved space can be used for establishing more new partitions, and the utilization rate of the database is improved.
Preferably, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes: if the predicted data amount is larger than the preset maximum capacity of the current partition, an auxiliary partition of the current partition is established, and the capacity of the auxiliary partition is determined according to the difference value between the predicted data amount and the maximum capacity of the current partition. Since the preset maximum capacity of the partition is favorable for ensuring the stability of the database and preventing the database from being expanded limitlessly to cause slow inquiry, when the predicted data volume is larger than the preset maximum capacity of the current partition, the auxiliary partition of the current partition is established, and the auxiliary partition and the current partition have own local indexes, so that the auxiliary partition and the current partition are kept independent from each other. After the affiliated partition of the current partition is established, the global index of the current partition is re-established, so that the uniformity of data between the current partition and the affiliated partition is ensured.
And S15, if the predicted data volume is larger than the current data capacity, expanding the current partition.
The method comprises the steps of predicting the possible predicted data volume of a bill table in the current time period by counting the actual condition of the data volume of the bill table in the same time period in the past year, and relieving the pressure of the current partition by a method of expanding the current partition in advance if the predicted data volume exceeds the maximum value of the data volume which can be accommodated by the current partition, namely the current data capacity.
Further, after capacity expansion, the partition utilization rate query statement may be executed to obtain the expanded result, so as to determine whether the capacity expansion of the partition is successful. Such as by statement lookup table space size (M), free space (M), etc.
Specifically, the total capacity of the query partition space may be achieved by the following statement:
select tablespace_name,sum(bytes)/1024/1024 as MB from dba_data_files group by tablespace_name.
further, after the expansion, the index of the current partition is reconstructed.
After the capacity expansion, the global index of the partition fails, so the global index needs to be recreated.
Referring to fig. 2, the embodiment of the present invention further provides an apparatus for automatic partition expansion of a database, which includes a detection module 11, an acquisition module 12, a screening module 13, a measurement module 14, and an expansion module 15, wherein:
the detection module 11 is configured to detect, in a management period, that an amount of data stored in a current partition reaches a preset threshold;
in the embodiment of the invention, taking a bill table as an example, generally, for convenience in management, bill table data is partitioned monthly, and the bill table comprises transaction records generated according to time sequence.
And pre-establishing a management period of the database table, wherein the management period takes month as an example, a timing task is pulled up at the beginning of each month, and whether the data amount stored in the current partition is larger than or equal to a preset threshold value is detected. In addition, the data amount stored in the current partition can be monitored in real time, once the data amount stored in the current partition reaches a preset threshold value, the next step is carried out, and a series of different preset threshold values can be set according to different time phases from the beginning of the month to the end of the month so as to meet the requirement of real-time monitoring. Specifically, the data amount includes an index of the current partition and data.
An obtaining module 12, configured to obtain a data time period and a current data capacity corresponding to a current partition;
specifically, the data time period is month, and may be a shorter time period such as week, day, etc. according to the monthly bill table.
The current data capacity is the maximum data size that the current partition can currently accommodate.
A screening module 13, configured to screen a historical partition data amount of a historical time period corresponding to the data time period;
preferably, the historical partition data amount of at least two historical time periods corresponding to the data time period is screened out. The granularity is the same for each historical period, e.g., granularity is 1 month. Taking the data time period as 2019 and 11 months as an example, the historical time periods corresponding to the data time period are 2018 and 11 months, 2017 and 11 months, and so on. In addition, the historical time period can be selected according to actual requirements, and if the current data time period is a certain day, the corresponding historical time period is a certain day. For example, the current data time period is 11 days of 11 months in 2019, the screened historical time periods corresponding to the data time period are 11 days of 11 months in 2018, 11 days of 11 months in 2017, and the like. According to the method for automatically partitioning and expanding the database, the historical partition data volume of the historical time period, such as the last year, the last year and even the earlier year, is analyzed, and the possible value of the bill data volume of the data time period corresponding to the current partition is predicted, so that the partition is expanded in advance, the partition capacity is reasonably utilized, and the situation that the database is pressurized due to the sudden increase of the data volume can be avoided.
A calculation module 14, configured to calculate, according to the historical partition data amount, a predicted data amount corresponding to the current partition by using a pre-established partition data amount prediction model;
the calculating, by using a pre-established partition data amount prediction model, a predicted data amount corresponding to the current partition according to the historical partition data amount includes:
fitting the historical partition data volume according to a regression equation prediction method, and determining a fitting function;
and taking the fitting function as a partition data amount prediction model, inputting the historical partition data amount into the partition data amount prediction model, and calculating the predicted data amount corresponding to the current partition.
Specifically, multiple regression equations are preset, and a target regression equation which is most in line with the historical partition data amount is determined according to the historical partition data amount and is used as a fitting function. Wherein, the preset regression equation includes, but is not limited to, 1 degree term, 2 degree term, 3 degree term, index term, etc.
Further, different data time periods correspond to different fitting functions. In brief, there are corresponding fitting functions for different months, such as 11 months for 11 months. And selecting corresponding fitting functions according to different data time periods to predict the predicted data quantity, so as to obtain more accurate data quantity to be predicted.
Specifically, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes:
and inputting the actual data quantity of the current partition into the partition data quantity prediction model, and correcting a fitting function of the partition data quantity prediction model according to the actual data quantity.
Preferably, after the actual data quantity of the current partition is counted, the actual data quantity of the current partition is input into the partition data quantity prediction model, and the fitting function is corrected, so that an autonomous learning process of the prediction model is established, and accuracy of the prediction model is guaranteed. After the actual data volume of each data period is counted, the fitting function corresponding to the data period is updated in time so as to be used for predicting the data volume of the next data period.
The method further comprises the steps of after calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by using a pre-established partition data quantity prediction model, and the method comprises the following steps:
according to the weighted moving average method, the predicted data amount is corrected based on the historical partition data amount and the partition data amount of the previous period of the data period.
Specifically, the weighted shift method corrects the predicted data amount sj_st by using the following formula I, where the corrected predicted data amount is sj_st:
SJ_ST=C t +d×(C t -C t-1 )+b×(C y1_t -C y1_t-1 )+p×(C y2_t -C y2_t-1 ) (formula I);
wherein Ct is the predicted data amount of the data period t of the current partition, C t-1 The partition data amount corresponding to the previous time period t-1 of the data time period t, C y1_t C is the partition data volume corresponding to the same time period y1_t of the previous year corresponding to the data time period t y1_t-1 The partition data amount corresponding to the previous time period y1_t-1 of the y1_t time period C y2_t The partition data amount corresponding to the same time period y2_t of the first two years of the data time period t, C y2_t-1 The partition data amount corresponding to the previous time period y2_t-1 of the y2_t time period, d, b and p are holiday factor correction factors, and d+b+p=1.
The weights of d, b and p are automatically adjusted according to the number of holidays, and the more the holidays are, the larger the weights are. Meanwhile, when an important holiday is on the weekend, the weight is correspondingly increased.
Specifically, taking the predicted data amount of a certain bill table for correcting the month of 2018, 11 as an example, the predicted data amount of the month of 2018 is corrected based on the historical data amounts of the month of 2018, the month of 2017, the month of 2016, and the month of 2016, 10, and the predicted data amount of the month of 2018, 11, calculated by using the above-mentioned partitioned data amount prediction model, and the specific historical data amounts and predicted data amounts are shown in the following table:
* And the predicted data quantity of 2018 and 11 months, which is calculated by adopting the partition data quantity prediction model, is not the actual data quantity.
Based on the data, the predicted data amount sj_st is corrected by adopting the formula I in combination with holiday factor correction factors of 2018, 2017 and 2016, which are respectively 0.4, 0.4 and 0.2, so as to obtain the corrected predicted data amount 104.4, which is closer to the actual data amount 104 than the predicted data amount 105 before correction. From the above, it can be explained that the weighted moving average method provided by the embodiment of the present invention can effectively correct the predicted data amount.
Further, when the following condition, namely C t =C t-1 =0、C y1_t =C y1_t-1 ≠0、C y2_t =C y2_t-1 Not equal to 0, if the SJ value calculated by the weighted shift method is 0, this may cause the system to not provide service, so the system will calculate the predicted data amount sj_sa of the current time period t using the following formula II:
SJ_SA=C t +d×C t-1 +b×C y1_t +p×C y2_t (formula II);
wherein C is t C for predicting the data amount of the data period t of the current partition t-1 The partition data amount corresponding to the previous time period t-1 of the data time period t, C y1_t Corresponding to the data time period tThe partition data amount corresponding to the same time period y1_t of the previous year, C y2_t The adjustment of the partition data amounts d, b, and p corresponding to the same time period y2_t of the first two years of the data period t is the same as above.
When the following condition occurs, namely C t =C t-1 =0、C y1_t =C y1_t-1 ≠0、C y2_t =C y2_t-1 If not equal to 0, the comprehensive compromise algorithm of the formula I and the formula II may also be used to calculate the prediction data sj_sta of the current time period t, see formula III:
SJ=SJ_STA=C t =m×sj_st+n×sj_sa (formula III);
where m and n are weighting coefficients, and m+n=1.
In summary, the corresponding prediction algorithm can be selected according to the actual situation, so that the SJ value not only effectively reflects the regularity of the effect of the factors affecting the predicted data amount, but also reflects the change caused by the external environment change, so that the predicted data amount is closer to the actual data amount.
Preferably, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes: if the difference value between the predicted data volume corresponding to the current partition and the actual data volume of the current partition is larger than the preset difference value, the current partition is contracted, and the current partition is contracted along with the actual data volume of the partition, so that the operation and maintenance cost of the database is effectively reduced, the saved space can be used for establishing more new partitions, and the utilization rate of the database is improved.
Preferably, after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, the method further includes: if the predicted data amount is larger than the preset maximum capacity of the current partition, an auxiliary partition of the current partition is established, and the capacity of the auxiliary partition is determined according to the difference value between the predicted data amount and the maximum capacity of the current partition. Since the preset maximum capacity of the partition is favorable for ensuring the stability of the database and preventing the database from being expanded limitlessly to cause slow inquiry, when the predicted data volume is larger than the preset maximum capacity of the current partition, the auxiliary partition of the current partition is established, and the auxiliary partition and the current partition have own local indexes, so that the auxiliary partition and the current partition are kept independent from each other. After the affiliated partition of the current partition is established, the global index of the current partition is re-established, so that the uniformity of data between the current partition and the affiliated partition is ensured.
And the capacity expansion module 15 is configured to expand the capacity of the current partition when the predicted data size is greater than the current data size.
The method comprises the steps of predicting the possible predicted data volume of a bill table in the current time period by counting the actual condition of the data volume of the bill table in the same time period in the past year, and relieving the pressure of the current partition by a method of expanding the current partition in advance if the predicted data volume exceeds the maximum value of the data volume which can be accommodated by the current partition, namely the current data capacity.
Further, after capacity expansion, the partition utilization rate query statement may be executed to obtain the expanded result, so as to determine whether the capacity expansion of the partition is successful. Such as by statement lookup table space size (M), free space (M), etc.
Specifically, the total capacity of the query partition space may be achieved by the following statement:
select tablespace_name,sum(bytes)/1024/1024 as MB from dba_data_files group by tablespace_name.
further, after the expansion, the index of the current partition is reconstructed.
After the capacity expansion, the global index of the partition fails, so the global index needs to be recreated.
In one embodiment, the present invention also proposes a computer device comprising a storage medium, a processor and a computer program stored on the storage medium and executable on the processing medium, the processor implementing the following steps when executing the computer program: in the management period, after detecting that the data amount stored in the current partition reaches a preset threshold value; acquiring a data time period and a current data capacity corresponding to a current partition; screening out the historical partition data volume of the historical time period corresponding to the data time period; calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by using a pre-established partition data quantity prediction model; and if the predicted data volume is larger than the current data capacity, expanding the current partition.
In one embodiment, the processor, when executing the computer readable instructions, further performs the steps of: according to the weighted moving average method, the predicted data amount is corrected based on the historical partition data amount and the partition data amount of the previous period of the data period.
In one embodiment, the processor, when executing the computer readable instructions, further performs the steps of: and if the difference value between the predicted data quantity corresponding to the current partition and the actual data quantity of the current partition is larger than a preset difference value, carrying out capacity reduction on the current partition.
In one embodiment, the processor, when executing the computer readable instructions, further performs the steps of: if the predicted data amount is larger than the preset maximum capacity of the current partition, an auxiliary partition of the current partition is established, and the capacity of the auxiliary partition is determined according to the difference value between the predicted data amount and the maximum capacity of the current partition.
Referring to fig. 3, fig. 3 is a schematic diagram illustrating an internal structure of a computer device according to an embodiment. As shown in fig. 3, the computer device comprises a processor 1, a storage medium 2, a memory 3 and a network interface 4 connected by a system bus. The storage medium 2 of the computer device stores an operating system, a database, and computer readable instructions, where a control information sequence may be stored in the database, where the computer readable instructions when executed by the processor 1 may enable the processor 1 to implement a method for automatic partition expansion of the database, and the processor 1 may implement functions of the detection module 11, the acquisition module 12, the screening module 13, the measurement module 14, and the expansion module 15 in an apparatus for automatic partition expansion of the database in the embodiment shown in fig. 2. The processor 1 of the computer device is used to provide computing and control capabilities, supporting the operation of the entire computer device. The memory 3 of the computer device may have stored therein computer readable instructions that, when executed by the processor 1, cause the processor 1 to perform a method for automatic partition expansion of a database. The network interface 4 of the computer device is used for communication with a terminal connection. It will be appreciated by those skilled in the art that the structure shown in fig. 3 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
Furthermore, the present invention also proposes a storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of: in the management period, after detecting that the data amount stored in the current partition reaches a preset threshold value; acquiring a data time period and a current data capacity corresponding to a current partition; screening out the historical partition data volume of the historical time period corresponding to the data time period; calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by using a pre-established partition data quantity prediction model; and if the predicted data volume is larger than the current data capacity, expanding the current partition.
The one or more processors can implement the functions of the detection module 11, the acquisition module 12, the screening module 13, the measurement module 14, and the capacity expansion module 15 in the apparatus for automatic partition capacity expansion of a database in the embodiment shown in fig. 2.
In one embodiment, the processor, when executing the computer readable instructions, further performs the steps of: and if the difference value between the predicted data quantity corresponding to the current partition and the actual data quantity of the current partition is larger than a preset difference value, carrying out capacity reduction on the current partition.
In one embodiment, the processor, when executing the computer readable instructions, further performs the steps of: if the predicted data amount is larger than the preset maximum capacity of the current partition, an auxiliary partition of the current partition is established, and the capacity of the auxiliary partition is determined according to the difference value between the predicted data amount and the maximum capacity of the current partition.
As can be seen from the above embodiments, the present invention has the following advantages:
the data volume of the current partition is predicted in advance by judging the data volume condition of the current partition in advance and based on the historical partition data volume of the historical time period by adopting a pre-established prediction model, so that the accurate data volume to be predicted is obtained, and the current partition is subjected to targeted capacity expansion.
Those skilled in the art will appreciate that implementing all or part of the above-described methods in accordance with the embodiments may be accomplished by way of a computer program stored in a computer-readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a random access Memory (Random Access Memory, RAM).
The technical features of the above-described embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above-described embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.

Claims (6)

1. A method for automatic partition expansion of a database, the method comprising:
in the management period, after detecting that the data amount stored in the current partition reaches a preset threshold value;
acquiring a data time period and a current data capacity corresponding to a current partition;
screening out the historical partition data volume of the historical time period corresponding to the data time period;
Calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by using a pre-established partition data quantity prediction model, wherein the method comprises the following steps of: fitting the historical partition data volume according to a regression equation prediction method, and determining a fitting function; taking the fitting function as a partition data amount prediction model, inputting the historical partition data amount into the partition data amount prediction model, and calculating the predicted data amount corresponding to the current partition;
inputting the actual data quantity of the current partition into the partition data quantity prediction model, and correcting a fitting function of the partition data quantity prediction model according to the actual data quantity;
correcting the predicted data amount based on the historical partition data amount and the partition data amount of a previous time period of the data time period according to a weighted moving average method;
the weighted moving average method corrects the predicted data amount sj_st using the following formula:
SJ_ST=C t +d×(C t -C t-1 )+b×(C y1_t -C y1_t-1 )+p×(C y2_t -C y2_t-1 );
wherein C is t C for predicting the data amount of the data period t of the current partition t-1 The partition data amount corresponding to the previous time period t-1 of the data time period t, C y1_t C is the partition data volume corresponding to the same time period y1_t of the previous year corresponding to the data time period t y1_t-1 The partition data amount corresponding to the previous time period y1_t-1 of the y1_t time period C y2_t The partition data amount corresponding to the same time period y2_t of the first two years of the data time period t, C y2_t-1 The partition data amount corresponding to the previous time period y2_t-1 of the y2_t time period is d, b and p are holiday factor correction factors, andd+b+p=1, and the weights of d, b and p are automatically adjusted according to the number of holidays, and the more the number of holidays is, the larger the weight is;
when C t =C t-1 =0、C y1_t =C y1_t-1 ≠0、C y2_t =C y2_t-1 Not equal to 0, the following formula is used to calculate the predicted data amount sj_sa for the current time period t:
SJ_SA=C t +d×C t-1 +b×C y1_t +p×C y2_t
when C t =C t-1 =0、C y1_t =C y1_t-1 ≠0、C y2_t =C y2_t-1 When not equal to 0, the following formula may be used to calculate the prediction data sj_sta of the current period t:
SJ=SJ_STA=C t =m×SJ_ST+n×SJ_SA;
wherein m and n are weighting coefficients in the comprehensive compromise algorithm, and m+n=1;
and if the predicted data volume is larger than the current data capacity, expanding the current partition.
2. The method for automatic partition expansion of a database according to claim 1, wherein after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, further comprises:
and if the difference value between the predicted data quantity corresponding to the current partition and the actual data quantity of the current partition is larger than a preset difference value, carrying out capacity reduction on the current partition.
3. The method for automatic partition expansion of a database according to claim 1, wherein after calculating the predicted data amount corresponding to the current partition according to the historical partition data amount by using a pre-established partition data amount prediction model, further comprises:
if the predicted data amount is larger than the preset maximum capacity of the current partition, an auxiliary partition of the current partition is established, and the capacity of the auxiliary partition is determined according to the difference value between the predicted data amount and the maximum capacity of the current partition.
4. An apparatus for automatically partitioning and expanding a database, the apparatus comprising:
the detection module is used for calling the acquisition module after detecting that the data amount stored in the current partition reaches a preset threshold value in the management period;
the acquisition module is used for acquiring the data time period and the current data capacity corresponding to the current partition;
the screening module is used for screening the historical partition data volume of the historical time period corresponding to the data time period;
the calculating module is used for calculating the predicted data quantity corresponding to the current partition according to the historical partition data quantity by utilizing a pre-established partition data quantity prediction model, and comprises the following steps: fitting the historical partition data volume according to a regression equation prediction method, and determining a fitting function; taking the fitting function as a partition data amount prediction model, inputting the historical partition data amount into the partition data amount prediction model, and calculating the predicted data amount corresponding to the current partition;
Inputting the actual data quantity of the current partition into the partition data quantity prediction model, and correcting a fitting function of the partition data quantity prediction model according to the actual data quantity;
correcting the predicted data amount based on the historical partition data amount and the partition data amount of a previous time period of the data time period according to a weighted moving average method;
the weighted moving average method corrects the predicted data amount sj_st using the following formula:
SJ_ST=C t +d×(C t -C t-1 )+b×(C y1_t -C y1_t-1 )+p×(C y2_t -C y2_t-1 );
wherein C is t C for predicting the data amount of the data period t of the current partition t-1 Partition data corresponding to a previous time period t-1 of the data time period tAmount of C y1_t C is the partition data volume corresponding to the same time period y1_t of the previous year corresponding to the data time period t y1_t-1 The partition data amount corresponding to the previous time period y1_t-1 of the y1_t time period C y2_t The partition data amount corresponding to the same time period y2_t of the first two years of the data time period t, C y2_t-1 For the partition data volume corresponding to the y2_t-1 of the previous time period of the y2_t time period, d, b and p are holiday factor correction factors, and the weights of d+b+p=1, d, b and p are automatically adjusted according to the number of holidays, and the more the number of holidays is, the more the weights are;
When C t =C t-1 =0、C y1_t =C y1_t-1 ≠0、C y2_t =C y2_t-1 Not equal to 0, the following formula is used to calculate the predicted data amount sj_sa for the current time period t:
SJ_SA=C t +d×C t-1 +b×C y1_t +p×C y2_t
when C t =C t-1 =0、C y1_t =C y1_t-1 ≠0、C y2_t =C y2_t-1 When not equal to 0, the following formula may be used to calculate the prediction data sj_sta of the current period t:
SJ=SJ_STA=C t =m×SJ_ST+n×SJ_SA;
wherein m and n are weighting coefficients in the comprehensive compromise algorithm, and m+n=1;
and the capacity expansion module is used for expanding the capacity of the current partition when the predicted data volume is larger than the current data capacity.
5. A computer device comprising a storage medium and a processor, the storage medium having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of the method of automatic partition expansion of a database according to any of claims 1 to 3.
6. A computer-readable storage medium having stored therein computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the method of automatic partition expansion of a database according to any one of claims 1 to 3.
CN201910440428.0A 2019-05-24 2019-05-24 Method and device for automatic partition capacity expansion of database Active CN110321334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910440428.0A CN110321334B (en) 2019-05-24 2019-05-24 Method and device for automatic partition capacity expansion of database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910440428.0A CN110321334B (en) 2019-05-24 2019-05-24 Method and device for automatic partition capacity expansion of database

Publications (2)

Publication Number Publication Date
CN110321334A CN110321334A (en) 2019-10-11
CN110321334B true CN110321334B (en) 2024-03-19

Family

ID=68118999

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910440428.0A Active CN110321334B (en) 2019-05-24 2019-05-24 Method and device for automatic partition capacity expansion of database

Country Status (1)

Country Link
CN (1) CN110321334B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110941602B (en) * 2019-11-20 2023-05-12 建信金融科技有限责任公司 Database configuration method and device, electronic equipment and storage medium
CN111625196A (en) * 2020-05-26 2020-09-04 北京海益同展信息科技有限公司 Block chain node capacity expansion method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105893531A (en) * 2016-03-31 2016-08-24 武汉虹信技术服务有限责任公司 PostgreSQL database mass data management method and system
CN107861989A (en) * 2017-10-17 2018-03-30 平安科技(深圳)有限公司 Partitioned storage method, apparatus, computer equipment and the storage medium of data
CN108829541A (en) * 2018-06-14 2018-11-16 Oppo(重庆)智能科技有限公司 Dilatation detection method, electronic device and computer readable storage medium
CN109445706A (en) * 2018-10-30 2019-03-08 紫光华山信息技术有限公司 Disk partition expansion method, device and server-side

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105893531A (en) * 2016-03-31 2016-08-24 武汉虹信技术服务有限责任公司 PostgreSQL database mass data management method and system
CN107861989A (en) * 2017-10-17 2018-03-30 平安科技(深圳)有限公司 Partitioned storage method, apparatus, computer equipment and the storage medium of data
CN108829541A (en) * 2018-06-14 2018-11-16 Oppo(重庆)智能科技有限公司 Dilatation detection method, electronic device and computer readable storage medium
CN109445706A (en) * 2018-10-30 2019-03-08 紫光华山信息技术有限公司 Disk partition expansion method, device and server-side

Also Published As

Publication number Publication date
CN110321334A (en) 2019-10-11

Similar Documents

Publication Publication Date Title
CN110704542A (en) Data dynamic partitioning system based on node load
CN110321334B (en) Method and device for automatic partition capacity expansion of database
CN107480028B (en) Method and device for acquiring usable residual time of disk
CN106549772A (en) Resource prediction method, system and capacity management device
CN104811344A (en) Network dynamic service monitoring method and apparatus
Park et al. Estimating extreme tail risk measures with generalized Pareto distribution
CN107861991B (en) Receipt data processing method and device, computer equipment and storage medium
US20130046685A1 (en) Time-efficient and deterministic adaptive score calibration techniques for maintaining a predefined score distribution
CN111162925A (en) Network flow prediction method and device, electronic equipment and storage medium
CN114444074A (en) Abnormal block chain link point detection method and device
CN114723234A (en) Transformer capacity hidden and reported identification method, system, computer equipment and storage medium
CN109428760B (en) User credit evaluation method based on operator data
CN115756812A (en) Resource adjusting method and device and storage medium
CN111614520B (en) IDC flow data prediction method and device based on machine learning algorithm
CN110704773A (en) Abnormal behavior detection method and system based on frequent behavior sequence mode
CN113283973A (en) Account checking difference data processing method and device, computer equipment and storage medium
CN110647724A (en) Cash adding and clearing model construction method, model construction equipment and storage medium
CN114138634B (en) Test case selection method and device, computer equipment and storage medium
US10956319B2 (en) Method and apparatus for multiple accesses in memory and storage system, wherein the memory return addresses of vertexes that have not been traversed
CN114281474A (en) Resource adjusting method and device
CN112613762A (en) Knowledge graph-based group rating method and device and electronic equipment
CN111711957A (en) Traffic-based site capacity expansion prediction method, device and system
US20200410586A1 (en) Adjusting Method and Adjusting Device, Server and Storage Medium for Scorecard Model
CN101557291B (en) Method for log aggregation and device thereof
CN116015785B (en) Information security protection method, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant