WO2018058983A1 - 数据库容量计算方法、装置、服务器及存储设备 - Google Patents

数据库容量计算方法、装置、服务器及存储设备 Download PDF

Info

Publication number
WO2018058983A1
WO2018058983A1 PCT/CN2017/085868 CN2017085868W WO2018058983A1 WO 2018058983 A1 WO2018058983 A1 WO 2018058983A1 CN 2017085868 W CN2017085868 W CN 2017085868W WO 2018058983 A1 WO2018058983 A1 WO 2018058983A1
Authority
WO
WIPO (PCT)
Prior art keywords
archive
capacity
idleness
data
target library
Prior art date
Application number
PCT/CN2017/085868
Other languages
English (en)
French (fr)
Inventor
侯松
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2018058983A1 publication Critical patent/WO2018058983A1/zh

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/214Database migration support
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • the present application relates to the field of database technologies, and in particular, to a database capacity calculation method, apparatus, server, and storage device.
  • Database archiving refers to the placement of some of the less commonly used data in a database into a large-capacity device such as a tape that is not easily accessible. After the data in the database table is archived, the original occupied capacity of the archived data can be reused. Therefore, it is often necessary to recalculate the capacity of the database archive to facilitate data storage. In the traditional way, after the database archiving work, the capacity of the database after the calculation is only quantized from the dimensions of storage, table space, table size and the like. However, all three layers neglect the space after the table is archived and the data has been deleted but not yet recovered. Therefore, the actual capacity after the database is archived cannot be accurately calculated, and the calculation granularity is too coarse, resulting in statistically obtained data and actual conditions. Larger.
  • a database capacity calculation method, apparatus, server, and storage device are provided.
  • a database capacity calculation method includes:
  • a database capacity calculation device includes:
  • a data acquisition module configured to acquire archival configuration data of the target library
  • a generating module configured to acquire an archive table matching the archive configuration data in the target library according to the archive configuration data, and generate a configuration information table
  • a determining module configured to determine an original capacity of the archive table according to the configuration information table
  • An analysis module configured to analyze the archive table and obtain the idleness of each data block in the archive table
  • a calculation module configured to calculate an actual capacity of the archive table according to the idleness of the respective data block and the original capacity of the archive table.
  • a server comprising a memory and a processor, the memory storing computer executable instructions, the computer executable instructions being executed by the processor, such that the processor performs the following steps:
  • One or more storage devices storing computer executable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • Figure 1 is a block diagram of a server in one embodiment
  • FIG. 2 is a flow chart of a method for calculating a database capacity in an embodiment
  • FIG. 3 is a flow chart of obtaining archive configuration data of a target library in an embodiment
  • FIG. 4 is a flow chart of analyzing an archive table in an embodiment and obtaining the idleness of each data block in the archive table;
  • Figure 5 is a flow chart for calculating the actual capacity of an archive table in one embodiment
  • FIG. 6 is a flow chart of calculating an actual archive rate of a target library in one embodiment
  • Figure 7 is a block diagram of a database capacity calculation device in one embodiment
  • Figure 8 is a block diagram of a data acquisition module in one embodiment
  • Figure 9 is a block diagram of an analysis module in one embodiment
  • Figure 10 is a block diagram of a computing module in one embodiment
  • Figure 11 is a block diagram of a database capacity calculation device in another embodiment.
  • FIG. 1 is a block diagram of a server in one embodiment.
  • the server includes a processor, memory, and network interface connected by a system bus.
  • the server's processor is used to provide computing and control capabilities to support the operation of the entire server.
  • the server's memory is used to store data, code instructions, etc., and in one embodiment, the memory can include non-volatile storage media and internal memory.
  • the non-volatile storage medium stores an operating system, a database, and computer executable instructions.
  • the database stores an archive table, and the computer executable instructions are used to implement a database capacity calculation method applicable to the server provided in the embodiment of the present application. .
  • the internal memory provides a cached operating environment for operating systems and computer executable instructions in a non-volatile storage medium.
  • the network interface of the server is used to communicate with an external terminal through a network connection, such as receiving a data archive request sent by the terminal.
  • the server can be implemented with a stand-alone server or a server cluster consisting of multiple servers. It will be understood by those skilled in the art that the structure shown in FIG. 1 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the server to which the solution of the present application is applied.
  • the specific server may include More or fewer components are shown in the figures, or some components are combined, or have different component arrangements.
  • a database capacity calculation method is provided, which is applicable to the server shown in FIG. 1, and the method includes the following steps:
  • Step S210 Obtain archival configuration data of the target library.
  • the target library refers to the target database of the actual capacity to be statistically archived.
  • the archive configuration data records the archive table information of the target library for archiving, and the server can clearly obtain the target library for archiving from the archive configuration data of the target library. Archive table and archive table information.
  • step S110 acquires archival configuration data of the target library, including the following steps:
  • Step S302 Obtain a target library and target library information to be calculated.
  • the server may obtain target library information of the target library, and the target library information may include information such as a target library number, a target library name, and a table name of each table stored in the target library.
  • Step S304 extract matching archive configuration data from the configuration library according to the target library information, and generate a SQL script according to the archive configuration data.
  • the configuration repository stores the configuration information of each database, including the archive configuration data and service configuration data of each database.
  • the service configuration data may include the service type corresponding to the database storage table, the primary key of the database table, and the field.
  • the server can find and extract the archive configuration data matching the target library information from the configuration library according to the target library number or the target library name in the target library information, and export the archive configuration data to generate SQL (Structured) Query Language, a structured query language, that facilitates the transfer of archive configuration data from the configuration repository to the target library.
  • SQL Structured
  • Step S306 importing the archive configuration data into the target library according to the SQL script.
  • the server can import the archive configuration data matching the target library information of the target library into the target library according to the generated SQL script, and successfully extract the information about the archived table in the target library.
  • Step S220 Acquire an archive table matching the archive configuration data in the target library according to the archive configuration data, and generate a configuration information table.
  • the server imports the archive configuration data into the target library
  • the archived table and the archive table information archived in the target library are obtained according to the archive configuration data
  • the configuration information table is generated according to the archive table information
  • the configuration information table may include the archive table name.
  • the owner of the archive table the name of the primary table associated with the archive table, the owner of the primary table, and the like, wherein the owner refers to the user who created the table, that is, the owner to which the table belongs.
  • Step S230 determining the original capacity of the archive table according to the configuration information table.
  • the server can calculate the actual capacity after archiving according to the archive table name in the configuration information table, and the archived table in the target library.
  • the DBA_SEGMENTS function can be used to obtain the physical space occupied by the archive table, that is, the original capacity of the archive table.
  • Step S240 analyzing the archive table, and obtaining the idleness of each data block in the archive table.
  • the logical structure of the database includes table spaces, segments, regions, and data blocks.
  • a database consists of multiple table spaces.
  • the objects stored in the table space are called segments, such as data segments and index segments.
  • the segments are composed of regions, and the regions are disks.
  • the size of the area is an integer multiple of the data block size.
  • the data block is the smallest I/O unit in the database. It is also the unit of the memory data buffer and the data file storage space unit.
  • the server can analyze the archive table, scan each data block in the archive table through the show_space function, and obtain the idleness of each data block, which is the unoccupied capacity of the data block. The ratio of the size of the data block.
  • step S250 the actual capacity of the archive table is calculated according to the idleness of each data block and the original capacity of the archive table.
  • the server can obtain the capacity of the data block according to the BLOCK_SIZE parameter of the archive table, and calculate the archive capacity of the archive table according to the idleness of each data block in the archive table, wherein the archive capacity refers to the vacancy of the data in the table after being archived. Capacity, which is the amount of data that can be reused after it has been archived. After calculating the archive capacity of the archive table, you can subtract the archive capacity from the original capacity of the archive table to get the actual capacity of the archive table. After calculating the actual capacity of each archive table of the target library one by one according to the name of the archive table in the configuration information table, the actual capacity of each archive table can be added to the capacity of other unarchived tables, and the database is archived. Actual capacity.
  • the above database capacity calculation method obtains the archive configuration data of the target library, generates a configuration information table according to the archive configuration data, determines the original capacity of the archive table in the target library according to the configuration information table, analyzes the archive table, and obtains each data in the archive table.
  • the idleness of the block and then calculate the actual capacity of the archive table according to the idle capacity of each data block and the original capacity of the archive table, so that the real capacity calculation of the database can be fine-grained to the data block level, and can be accurately calculated according to the idleness of each data block.
  • the actual size of each archive table so as to accurately get the actual capacity of the database after archiving.
  • step S240 analyzes the archive table and obtains the idleness of each data block in the archive table, including the following steps:
  • Step S402 analyzing the archive table to obtain a high water mark line of the archive table.
  • High water mark (HWM, High-Water Mark) is used to define the data blocks that have been used in one segment of the archive table and the unused data blocks.
  • the high water mark is the used data block, and the high water mark is the allocated but unused data block. .
  • the server needs to obtain the high water mark of the archive table, thereby obtaining the data blocks that have been used in each segment in the archive table.
  • Step S404 acquiring the data block that has been used in the archive table according to the high water mark line.
  • the server can obtain the used data blocks and the unused data blocks in the archive table according to the high water mark of the archive table, and obtain the archive capacity of the used data blocks, that is, obtain the used data blocks due to data archiving. Thus free up the capacity to rewrite data.
  • the number of unused data blocks can also be obtained from the high water mark line, and the allocated capacity of the archive table but not used is calculated according to the data block capacity size defined by the BLOCK_SIZE parameter of the archive table.
  • step S406 the data blocks that have been used are scanned one by one, and the idleness of each used data block is obtained.
  • the server can scan the used data blocks in the archive table one by one by the show_space function, and obtain the idleness of the used data blocks respectively, thereby calculating the archive capacity of the archive table according to the idleness of each data block in the archive table.
  • the used data block is obtained according to the high water mark of the archive table, and the idleness of each used data block is obtained, so that the archive capacity of the archive table is accurately calculated, so that the actual capacity of the archive table is obtained.
  • the calculation can be fine-grained to the data block level.
  • the vacant capacity under the high water mark can be removed, and the actual capacity of each archive table can be accurately calculated.
  • step S250 calculates the actual capacity of the archive table according to the idleness of each data block and the original capacity of the archive table, including the following steps:
  • Step S502 All the used data blocks are allocated to the corresponding idleness stage according to the preset idleness phase values corresponding to the respective idleness stages and the idleness of each used data block.
  • the server can preset various idleness stages according to business requirements. For example, 5 idleness stages of 0-25%, 25%-50%, 50%-75%, 75%-100%, 100%, etc. can be respectively set.
  • Each idleness phase corresponds to an idleness value interval, and the idleness phase value is the maximum value and the minimum value of each idleness phase, and the idleness value interval corresponding to each idleness phase may be greater than or equal to the minimum value, less than the maximum value. , but not limited to this.
  • the idleness of the used data block can be obtained, and the obtained idleness of the used data block is compared with the idleness phase value of each idleness stage.
  • the idleness value interval the idleness of the used data block falls into, thereby allocating the used data block. For example, the idleness of a used data block is 30%. After comparison, it is known that it falls within the interval of 25%-50%, and the used data block is allocated to 25%. -50% idle phase.
  • Step S504 acquiring the number of data blocks allocated to the respective idleness stages.
  • the server can acquire the number of data blocks allocated to each idle degree stage after assigning each used data block to the corresponding idleness stage according to the idleness of each used data block in the archive table.
  • Step S506 calculating, by using a weighting algorithm, the idleness phase value and the number of data blocks corresponding to each idleness phase, and obtaining the number of weighted idle data blocks in each idleness phase.
  • each idle degree phase corresponds to an idleness value interval
  • the server needs to perform weighted calculation on the idleness phase value and the number of data blocks corresponding to each idleness phase, thereby obtaining the idleness of the used data block in the archive table.
  • Distribution structure For example, suppose the number of data blocks allocated to 0-25%, 25%-50%, 50%-75%, 75%-100%, 100%, etc. in the archive table is a, b, respectively.
  • c, d, e the number of weighted free data blocks for each idle degree can be calculated as a*(0+25%)/2, b*(25%+50%)/2, c*(50% +75%)/2, d*(75%+100%)/2, e*100%.
  • Step S508 obtaining the data block capacity of the archive table, and calculating the archive capacity according to the data block capacity and the number of weighted idle data blocks in each idle degree phase.
  • Step S510 calculating the actual capacity of the archive table according to the original capacity and the archive capacity.
  • the server can calculate the actual capacity of the archive table based on the original capacity of the obtained archive table, the allocated but unused capacity in the archive table, and the archive capacity.
  • the actual capacity of the archive table original capacity - allocated but unused capacity - Archive capacity.
  • the data blocks that have been used in the archive table are allocated to the respective idleness stages according to the idle degree, and the number of weighted idle data blocks in each idleness stage is calculated, thereby making the calculated archived table archived capacity more Accurate, accurate calculation of the actual size of each archive table.
  • the foregoing database capacity calculation method further includes the following steps:
  • Step S602 obtaining the original full library capacity of the target library.
  • the server may extract data corresponding to the target library from the database record table, and obtain the original full library capacity of the target library, wherein the original full library capacity refers to the physical space occupied by the database before the data is archived.
  • Step S604 the archive capacity of each archive table in the target library is counted, and the total archive capacity of the target library is obtained.
  • the server calculates the archive capacity of each archive table in the target library
  • the archive capacity of each archive table can be counted to obtain the total archive capacity of the target library.
  • step S606 the actual filing rate is calculated according to the total archive capacity and the original full storage capacity.
  • the actual archive rate refers to the ratio of the archived data to the original total data
  • the actual archive rate of the target library the total archive capacity / the original full library capacity.
  • the server may collect the archive time, archive capacity, actual capacity, total archive capacity of the target library, actual archive rate, etc. of each archive table of the target library every preset time, for example, daily, monthly, and the like. Data and report generation, so that the staff can adjust the scope of the archive and the actual archive rate in the database.
  • the actual archiving rate of the target library can be calculated according to the archive capacity of the accurately calculated archive table, so that the calculation of the actual archiving rate is more accurate, which can help the staff to more accurately analyze the archiving work of the database.
  • a database capacity calculation apparatus including a data acquisition module 710, a generation module 720, a determination module 730, an analysis module 740, and a calculation module 750.
  • the data obtaining module 710 is configured to acquire archival configuration data of the target library.
  • the target library refers to the target database of the actual capacity to be statistically archived.
  • the archive configuration data records the archive table information of the target library for archiving, and the server can clearly obtain the target library for archiving from the archive configuration data of the target library. Archive table and archive table information.
  • the data acquisition module 710 includes an information acquisition unit 712, a script generation unit 714, and an import unit 716.
  • the information obtaining unit 712 is configured to acquire a target library and target library information to be calculated.
  • the server may obtain target library information of the target library, and the target library information may include information such as a target library number, a target library name, and a table name of each table stored in the target library.
  • the script generating unit 714 is configured to extract matching archive configuration data from the configuration library according to the target library information, and generate a SQL script according to the archive configuration data.
  • the configuration repository stores the configuration information of each database, including the archive configuration data and service configuration data of each database.
  • the service configuration data may include the service type corresponding to the database storage table, the primary key of the database table, and the field.
  • the server can find and extract the archive configuration data matching the target library information from the configuration library according to the target library number or the target library name in the target library information, and export the archive configuration data to generate the SQL script, so as to facilitate the archiving configuration data from Passed to the target library in the configuration repository.
  • the import unit 716 is configured to import the archive configuration data into the target library according to the SQL script.
  • the server can import the archive configuration data matching the target library information of the target library into the target library according to the generated SQL script, and successfully extract the information about the archived table in the target library.
  • the generating module 720 is configured to obtain an archive table matching the archive configuration data in the target library according to the archive configuration data, and generate a configuration information table.
  • the server imports the archive configuration data into the target library
  • the archived table and the archive table information archived in the target library are obtained according to the archive configuration data
  • the configuration information table is generated according to the archive table information
  • the configuration information table may include the archive table name.
  • the owner of the archive table the name of the primary table associated with the archive table, the owner of the primary table, and the like, wherein the owner refers to the user who created the table, that is, the owner to which the table belongs.
  • the determining module 730 is configured to determine an original capacity of the archive table according to the configuration information table.
  • the server can calculate the actual capacity after archiving according to the archive table name in the configuration information table, and the archived table in the target library.
  • the DBA_SEGMENTS function can be used to obtain the physical space occupied by the archive table, that is, the original capacity of the archive table.
  • the analysis module 740 is configured to analyze the archive table and obtain the idleness of each data block in the archive table.
  • the logical structure of the database includes table spaces, segments, regions, and data blocks.
  • a database consists of multiple table spaces.
  • the objects stored in the table space are called segments, such as data segments and index segments.
  • the segments are composed of regions, and the regions are disks.
  • the size of the area is an integer multiple of the data block size.
  • the data block is the smallest I/O unit in the database. It is also the unit of the memory data buffer and the data file storage space unit.
  • the server can analyze the archive table, scan each data block in the archive table through the show_space function, and obtain the idleness of each data block, which is the unoccupied capacity of the data block. The ratio of the size of the data block.
  • the calculation module 750 is configured to calculate the actual capacity of the archive table according to the idleness of each data block and the original capacity of the archive table.
  • the server can obtain the capacity of the data block according to the BLOCK_SIZE parameter of the archive table, and calculate the archive capacity of the archive table according to the idleness of each data block in the archive table, wherein the archive capacity refers to the vacancy of the data in the table after being archived. Capacity, which is the amount of data that can be reused after it has been archived. After calculating the archive capacity of the archive table, you can subtract the archive capacity from the original capacity of the archive table to get the actual capacity of the archive table. After calculating the actual capacity of each archive table of the target library one by one according to the name of the archive table in the configuration information table, the actual capacity of each archive table can be added to the capacity of other unarchived tables, and the database is archived. Actual capacity.
  • the database capacity calculation device obtains the archive configuration data of the target library, generates a configuration information table according to the archive configuration data, determines the original capacity of the archive table in the target library according to the configuration information table, analyzes the archive table, and obtains each data in the archive table.
  • the idleness of the block and then calculate the actual capacity of the archive table according to the idle capacity of each data block and the original capacity of the archive table, so that the real capacity calculation of the database can be fine-grained to the data block level, and can be accurately calculated according to the idleness of each data block.
  • the actual size of each archive table so as to accurately get the actual capacity of the database after archiving.
  • the analysis module 740 includes an analysis unit 742, a data block acquisition unit 744, and an idleness acquisition unit 746.
  • the analyzing unit 742 is configured to analyze the archive table to obtain a high water mark of the archive table.
  • the high water mark line is used to define the used data blocks and unused data blocks in one segment of the archive table.
  • the high water mark line is the used data block, and the high water mark line is the allocated but unused data. Piece.
  • the server needs to obtain the high water mark of the archive table, thereby obtaining the data blocks that have been used in each segment in the archive table.
  • the data block obtaining unit 744 is configured to acquire the data block that has been used in the archive table according to the high water mark line.
  • the server can obtain the used data blocks and the unused data blocks in the archive table according to the high water mark of the archive table, and obtain the archive capacity of the used data blocks, that is, obtain the used data blocks due to data archiving. Thus free up the capacity to rewrite data.
  • the number of unused data blocks can also be obtained from the high water mark line, and the allocated capacity of the archive table but not used is calculated according to the data block capacity size defined by the BLOCK_SIZE parameter of the archive table.
  • the idleness obtaining unit 746 is configured to scan the used data blocks one by one and acquire the idleness of each used data block.
  • the server can scan the used data blocks in the archive table one by one by the show_space function, and obtain the idleness of the used data blocks respectively, thereby calculating the archive capacity of the archive table according to the idleness of each data block in the archive table.
  • the used data block is obtained according to the high water mark of the archive table, and the idleness of each used data block is obtained, so that the archive capacity of the archive table is accurately calculated, so that the actual capacity of the archive table is obtained.
  • the calculation can be fine-grained to the data block level.
  • the vacant capacity under the high water mark can be removed, and the actual capacity of each archive table can be accurately calculated.
  • the computing module 750 includes an allocation unit 752, a quantity acquisition unit 754, and a calculation unit 756.
  • the allocating unit 752 is configured to allocate each used data block to a corresponding idle degree stage according to the preset idleness phase value corresponding to each idle degree phase and the idleness of each used data block.
  • the server can preset various idleness stages according to business requirements. For example, 5 idleness stages of 0-25%, 25%-50%, 50%-75%, 75%-100%, 100%, etc. can be respectively set.
  • Each idleness phase corresponds to an idleness value interval, and the idleness phase value is the maximum value and the minimum value of each idleness phase, and the idleness value interval corresponding to each idleness phase may be greater than or equal to the minimum value, less than the maximum value. , but not limited to this.
  • the idleness of the used data block can be obtained, and the obtained idleness of the used data block is compared with the idleness phase value of each idleness stage.
  • the idleness value interval the idleness of the used data block falls into, thereby allocating the used data block. For example, the idleness of a used data block is 30%. After comparison, it is known that it falls within the interval of 25%-50%, and the used data block is allocated to 25%. -50% idle phase.
  • the quantity obtaining unit 754 is configured to acquire the number of data blocks allocated to each idle degree stage.
  • the server can acquire the number of data blocks allocated to each idle degree stage after assigning each used data block to the corresponding idleness stage according to the idleness of each used data block in the archive table.
  • the calculating unit 756 is configured to calculate, by using a weighting algorithm, the idleness phase value and the number of data blocks corresponding to each idleness phase, to obtain the number of weighted idle data blocks in each idleness phase.
  • each idle degree phase corresponds to an idleness value interval
  • the server needs to perform weighted calculation on the idleness phase value and the number of data blocks corresponding to each idleness phase, thereby obtaining the idleness of the used data block in the archive table.
  • Distribution structure For example, suppose the number of data blocks allocated to 0-25%, 25%-50%, 50%-75%, 75%-100%, 100%, etc. in the archive table is a, b, respectively.
  • c, d, e the number of weighted free data blocks for each idle degree can be calculated as a*(0+25%)/2, b*(25%+50%)/2, c*(50% +75%)/2, d*(75%+100%)/2, e*100%.
  • the calculating unit 756 is further configured to obtain a data block capacity of the archive table, and calculate the archive capacity according to the data block capacity and the number of weighted idle data blocks in each idle degree phase.
  • the calculating unit 756 is further configured to calculate the actual capacity of the archive table according to the original capacity and the archive capacity.
  • the data blocks that have been used in the archive table are allocated to the respective idleness stages according to the idle degree, and the number of weighted idle data blocks in each idleness stage is calculated, thereby making the calculated archived table archived capacity more Accurate, accurate calculation of the actual size of each archive table.
  • the database capacity calculation device includes a data acquisition module 710, a generation module 720, a determination module 730, an analysis module 740, and a calculation module 750, and includes a capacity acquisition module 760 and a statistics module. 770.
  • the capacity acquisition module 760 is configured to acquire the original full library capacity of the target library.
  • the server may extract data corresponding to the target library from the database record table, and obtain the original full library capacity of the target library, wherein the original full library capacity refers to the physical space occupied by the database before the data is archived.
  • the statistics module 770 is configured to count the archive capacity of each archive table in the target library, and obtain the total archive capacity of the target library.
  • the server calculates the archive capacity of each archive table in the target library
  • the archive capacity of each archive table can be counted to obtain the total archive capacity of the target library.
  • the calculation module 750 is further configured to calculate an actual archive rate based on the total archive capacity and the original full storage capacity.
  • the actual archive rate refers to the ratio of the archived data to the original total data
  • the actual archive rate of the target library the total archive capacity / the original full library capacity.
  • the server may collect data such as archive time, archive capacity, actual capacity, total archive capacity of the target library, and actual archive rate of each archive table of the target library every preset time, for example, daily, monthly, and the like. And generate reports to facilitate the staff to adjust the scope of the archive in the database and the actual filing rate.
  • the actual archiving rate of the target library can be calculated according to the archive capacity of the accurately calculated archive table, so that the calculation of the actual archiving rate is more accurate, which can help the staff to more accurately analyze the archiving work of the database.
  • the various modules in the above described database capacity computing device may be implemented in whole or in part by software, hardware or a combination thereof.
  • the calculation module 750 can calculate the actual capacity of the archive table according to the idleness of each data block and the original capacity of the archive table by the processor of the server, where the processor can be a central processing unit or a microprocessor. Wait.
  • the above modules may be embedded in the hardware of the server or may be stored in the memory of the server in a software form, so that the processor can call the corresponding operations of the above modules.
  • the storage device may be a magnetic disk, an optical disk, or a read-only storage memory (Read-Only) Memory, ROM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

一种数据库容量计算方法,包括:获取目标库的归档配置数据;根据所述归档配置数据获取所述目标库中与所述归档配置数据匹配的归档表,并生成配置信息表;根据所述配置信息表确定所述归档表的原容量;分析所述归档表,并获取所述归档表中各个数据块的空闲度;及根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量。

Description

数据库容量计算方法、装置、服务器及存储设备
本申请要求于2016年9月28日提交中国专利局、申请号为2016108598262、发明名称为“数据库容量计算方法和装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
【技术领域】
本申请涉及数据库技术领域,特别是涉及一种数据库容量计算方法、装置、服务器及存储设备。
【背景技术】
数据库归档,指的是将数据库中某些比较不常用的数据放置到诸如磁带等大容量而不便于访问的设备。数据库的表中的数据被归档后,被归档的数据原来占用的容量即可重新被使用,因此,往往需要重新计算数据库归档后的容量方便进行数据存储。而在传统方式中,进行数据库归档工作后,计算数据库归档后的容量仅从存储、表空间、表大小等维度进行量化。然而,这三个层面均忽视了表被归档后,数据删除而尚未被回收的空间,因此无法准确地计算出数据库归档后的实际容量,计算粒度过粗,导致统计得到的数据与实际情况出入较大。
【发明内容】
根据本申请的各种实施例,提供一种数据库容量计算方法、装置、服务器及存储设备。
一种数据库容量计算方法,包括:
获取目标库的归档配置数据;
根据所述归档配置数据获取所述目标库中与所述归档配置数据匹配的归档表,并生成配置信息表;
根据所述配置信息表确定所述归档表的原容量;
分析所述归档表,并获取所述归档表中各个数据块的空闲度;及
根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量。
一种数据库容量计算装置,包括:
数据获取模块,用于获取目标库的归档配置数据;
生成模块,用于根据所述归档配置数据获取所述目标库中与所述归档配置数据匹配的归档表,并生成配置信息表;
确定模块,用于根据所述配置信息表确定所述归档表的原容量;
分析模块,用于分析所述归档表,并获取所述归档表中各个数据块的空闲度;及
计算模块,用于根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量。
一种服务器,包括存储器和处理器,所述存储器中储存有计算机可执行指令,所述计算机可执行指令被所述处理器执行时,使得所述处理器执行以下步骤:
获取目标库的归档配置数据;
根据所述归档配置数据获取所述目标库中与所述归档配置数据匹配的归档表,并生成配置信息表;
根据所述配置信息表确定所述归档表的原容量;
分析所述归档表,并获取所述归档表中各个数据块的空闲度;及
根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量。
一个或多个存储有计算机可执行指令的存储设备,所述计算机可执行指令被一个或多个处理器执行时,使得所述一个或多个处理器执行以下步骤:
获取目标库的归档配置数据;
根据所述归档配置数据获取所述目标库中与所述归档配置数据匹配的归档表,并生成配置信息表;
根据所述配置信息表确定所述归档表的原容量;
分析所述归档表,并获取所述归档表中各个数据块的空闲度;及
根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量。
本申请的一个或多个实施例的细节在下面的附图和描述中提出。本申请的其它特征、目的和优点将从说明书、附图以及权利要求书变得明显。
【附图说明】
为了更清楚地说明本申请实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为一个实施例中服务器的框图;
图2为一个实施例中数据库容量计算方法的流程图;
图3为一个实施例中获取目标库的归档配置数据的流程图;
图4为一个实施例中分析归档表,并获取归档表中各个数据块的空闲度的流程图;
图5为一个实施例中计算归档表的实际容量的流程图;
图6为一个实施例中计算目标库的实际归档率的流程图;
图7为一个实施例中数据库容量计算装置的框图;
图8为一个实施例中数据获取模块的框图;
图9为一个实施例中分析模块的框图;
图10为一个实施例中计算模块的框图;
图11为另一个实施例中数据库容量计算装置的框图。
【具体实施方式】
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处所描述的具体实施例仅用以解释本申请,并不用于限定本申请。
图1为一个实施例中服务器的框图。如图1所示,该服务器包括通过系统总线连接的处理器、存储器和网络接口。其中,该服务器的处理器用于提供计算和控制能力,支撑整个服务器的运行。该服务器的存储器用于存储数据、代码指令等,在一个实施例中,存储器可包括非易失性存储介质及内存储器。非易失性存储介质存储有操作系统、数据库和计算机可执行指令,数据库中存储有归档表,该计算机可执行指令用于实现本申请实施例中提供的适用于服务器的一种数据库容量计算方法。内存储器为非易失性存储介质中的操作系统及计算机可执行指令提供高速缓存的运行环境。该服务器的网络接口用于据以与外部的终端通过网络连接通信,比如接收终端发送的数据归档请求等。服务器可以用独立的服务器或者是多个服务器组成的服务器集群来实现。本领域技术人员可以理解,图1中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的服务器的限定,具体的服务器可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
如图2所示,在一个实施例中,提供了一种数据库容量计算方法,适用于如图1所示的服务器,该方法包括以下步骤:
步骤S210,获取目标库的归档配置数据。
目标库指的是待统计进行归档工作后的实际容量的目标数据库,归档配置数据中记录有目标库进行归档的归档表信息,服务器从目标库的归档配置数据中可清楚地获取目标库进行归档的归档表及归档表信息。
如图3所示,在一个实施例中,步骤S110获取目标库的归档配置数据,包括以下步骤:
步骤S302,获取待计算的目标库及目标库信息。
服务器可获取目标库的目标库信息,目标库信息可包括目标库编号、目标库名称、目标库中存储的各个表的表名等信息。
步骤S304,根据目标库信息从配置库中提取匹配的归档配置数据,并根据归档配置数据生成SQL脚本。
配置库中存储有各个数据库的相关配置信息,包括各个数据库的归档配置数据、业务配置数据等,其中,业务配置数据可包括数据库存储的表对应的业务类型、数据库各个表的主键、字段等。服务器可根据目标库信息中的目标库编号或是目标库名称等,从配置库中查找并提取与目标库信息匹配的归档配置数据,并导出归档配置数据生成SQL(Structured Query Language,结构化查询语言)脚本,方便将归档配置数据从配置库中传递至目标库。
步骤S306,根据SQL脚本将归档配置数据导入目标库中。
服务器可根据生成的SQL脚本将与目标库的目标库信息匹配的归档配置数据导入目标库中,即可成功提取目标库中进行归档的归档表的相关信息。
步骤S220,根据归档配置数据获取目标库中与归档配置数据匹配的归档表,并生成配置信息表。
服务器将归档配置数据导入目标库后,即可根据归档配置数据获取目标库中进行归档的归档表及归档表信息,并根据归档表信息生成配置信息表,配置信息表中可包括归档表表名、归档表属主、与归档表关联的主表表名、主表属主等信息,其中,属主指的是创建该表的用户,即表所属的主人。
步骤S230,根据配置信息表确定归档表的原容量。
服务器可按照配置信息表中的归档表表名,逐一对目标库中的归档表计算归档后的实际容量,可使用DBA_SEGMENTS函数获取归档表所占的物理空间大小,即归档表的原容量。
步骤S240,分析归档表,并获取归档表中各个数据块的空闲度。
数据库的逻辑结构包括表空间、段、区及数据块,一个数据库由多个表空间组成,表空间中存储的对象称为段,例如数据段、索引段等,段由区组成,区是磁盘分配的最小单位。区的大小是数据块大小的整数倍,数据块是数据库中最小的I/O单位,同时也是内存数据缓冲区的单位,及数据文件存储空间单位。在确定了归档表的原容量后,服务器可对归档表进行分析,通过show_space函数扫描归档表中的各个数据块,并获取各个数据块的空闲度,空闲度即数据块未被占用的容量占数据块容量大小的比率。
步骤S250,根据各个数据块的空闲度及归档表的原容量计算归档表的实际容量。
服务器可根据归档表的BLOCK_SIZE参数获取数据块的容量大小,并根据归档表中各个数据块的空闲度计算出归档表的归档容量,其中,归档容量指的是表中数据被归档后空出的容量,即数据被归档后可重新使用的容量。计算出归档表的归档容量后,可将归档表的原容量减去归档容量,即可得到归档表的实际容量。按照配置信息表中的归档表表名逐一计算目标库的各个归档表的实际容量后,即可将各个归档表的实际容量与其它未进行归档的表的容量进行相加,得到数据库进行归档后的实际容量。
上述数据库容量计算方法,通过获取目标库的归档配置数据,根据归档配置数据生成配置信息表,根据配置信息表确定目标库中的归档表的原容量,分析归档表,并获取归档表中各个数据块的空闲度,再根据各个数据块的空闲及归档表的原容量计算归档表的实际容量,使得数据库的真实容量计算能细粒度到数据块层面,根据各个数据块的空闲度能够准确计算得到每个归档表的实际容量大小,从而准确得到数据库进行归档后的实际容量。
如图4所示,在一个实施例中,步骤S240分析归档表,并获取归档表中各个数据块的空闲度,包括以下步骤:
步骤S402,分析归档表,得到归档表的高水位线。
高水位线(HWM,High-Water Mark)用于界定归档表的一个段中已被使用的数据块及未被使用的数据块,高水位线下为已被使用的数据块,高水位线上为已分配但未使用的数据块。当在表中新增数据时,高水位线会随着数据存储的空间不足而往上移,但当归档表中的数据被归档,相当于将归档的数据进行删除时,高水位线并不会随着数据块中的数据被删除而降低,因此,服务器需要获取归档表的高水位线,从而得到归档表中的各个段中已被使用的数据块。
步骤S404,根据高水位线获取归档表中已被使用的数据块。
服务器可根据归档表的高水位线获取归档表中已被使用的数据块及未被使用的数据块,并获取已被使用的数据块的归档容量,即获取已被使用的数据块因数据归档从而空出来可重新写入数据的容量。还可根据高水位线获取未被使用的数据块数量,并根据归档表的BLOCK_SIZE参数定义的数据块容量大小计算归档表已分配但未使用的容量。
步骤S406,逐一扫描已被使用的数据块,并获取各个已被使用的数据块的空闲度。
服务器可通过show_space函数逐一扫描归档表中已被使用的数据块,并分别获取已被使用的数据块的空闲度,从而根据归档表中各个数据块的空闲度计算出归档表的归档容量。
在本实施例中,根据归档表的高水位线获取已被使用的数据块,并获取各个已被使用的数据块的空闲度从而精准地计算出归档表的归档容量,使得归档表的实际容量计算能细粒度到数据块层面,在计算归档表的实际容量时能去除高水位线下的空出的容量,能够准确计算得到每个归档表的实际容量大小。
如图5所示,在一个实施例中,步骤S250根据各个数据块的空闲度及归档表的原容量计算归档表的实际容量,包括以下步骤:
步骤S502,根据预设的各个空闲度阶段对应的空闲度阶段值及各个已被使用的数据块的空闲度,将各个已被使用的数据块分配到对应的空闲度阶段。
服务器可预先根据业务需求设置各个空闲度阶段,例如,可分别设置0-25%、25%-50%、50%-75%、75%-100%、100%等5个空闲度阶段。每个空闲度阶段对应一个空闲度数值区间,空闲度阶段值即为各个空闲度阶段的最大值、最小值,每个空闲度阶段对应的空闲度数值区间可大于或等于最小值,小于最大值,但并不限于此。通过show_space函数扫描归档表中已被使用的数据块,可获取已被使用的数据块的空闲度,将获取的已被使用的数据块的空闲度与各个空闲度阶段的空闲度阶段值进行比对,判断该已被使用的数据块的空闲度落入哪一个空闲度数值区间,从而对已被使用的数据块进行分配。例如,某已被使用的数据块的空闲度为30%,进行比对后,可知其落入25%-50%这一空闲度数值区间,则将该已被使用的数据块分配至25%-50%的空闲度阶段。
步骤S504,获取被分配到各个空闲度阶段的数据块数量。
服务器逐一根据归档表中各个已被使用的数据块的空闲度,将各个已被使用的数据块分配到对应的空闲度阶段后,可获取被分配到各个空闲度阶段的数据块数量。
步骤S506,通过加权算法对各个空闲度阶段对应的空闲度阶段值及数据块数量进行计算,得到各个空闲度阶段的加权空闲数据块数量。
因各个空闲度阶段对应的为空闲度数值区间,因此,服务器需要对各个空闲度阶段对应的空闲度阶段值及数据块数量进行加权计算,从而获取归档表中已被使用的数据块的空闲度分布结构。例如,假设归档表中,分配到0-25%、25%-50%、50%-75%、75%-100%、100%等5个空闲度阶段的数据块数量分别为a、b、c、d、e,则可计算得到各个空闲度阶段的加权空闲数据块数量分别为a*(0+25%)/2、b*(25%+50%)/2、c*(50%+75%)/2、d*(75%+100%)/2、e*100%。
步骤S508,获取归档表的数据块容量,根据数据块容量及各个空闲度阶段的加权空闲数据块数量计算归档容量。
服务器可根据归档表的BLOCK_SIZE参数获取归档表的数据块容量大小,并将数据块容量与各个空闲度阶段的加权空闲数据块数量之和相乘,即可得到归档表的归档容量,即归档容量=归档容量*各个空闲度阶段的加权空闲数据块数量之和。
步骤S510,根据原容量及归档容量计算归档表的实际容量。
服务器可根据获取的归档表的原容量、归档表中已分配但未使用的容量及归档容量等数据计算归档表的实际容量,归档表的实际容量=原容量-已分配但未使用的容量-归档容量。
在本实施例中,将归档表中已被使用的数据块按照空闲度分配至各个空闲度阶段,并计算各个空闲度阶段的加权空闲数据块数量,从而使得计算出的归档表的归档容量更加准确,能够准确计算得到每个归档表的实际容量大小。
如图6所示,在一个实施例中,上述数据库容量计算方法,还包括以下步骤:
步骤S602,获取目标库的原全库容量。
服务器可从数据库记录表中提取与目标库对应的数据,并获取目标库的原全库容量,其中,原全库容量指的是数据库未进行数据归档前所占的物理空间大小。
步骤S604,统计目标库中各个归档表的归档容量,得到目标库的总归档容量。
服务器计算得到目标库中各个归档表的归档容量后,可对各个归档表的归档容量进行统计,得到目标库的总归档容量。
步骤S606,根据总归档容量及原全库容量计算实际归档率。
实际归档率指的是已归档的数据占原来全部数据的比率,目标库的实际归档率=总归档容量/原全库容量。在=一个实施例中,服务器可每隔预设时间,例如每日、每月等,收集目标库各个归档表的归档时间、归档容量、实际容量、目标库的总归档容量、实际归档率等数据,并生成报表,方便工作人员对数据库中的归档范围及实际归档率等进行察看调整。
在本实施例中,可根据准确计算得到的归档表的归档容量计算目标库的实际归档率,使实际归档率的计算更加精准,能帮助工作人员更加准确地分析数据库的归档工作。
如图7所示,在一个实施例中提供了一种数据库容量计算装置,包括数据获取模块710、生成模块720、确定模块730、分析模块740及计算模块750。
数据获取模块710,用于获取目标库的归档配置数据。
目标库指的是待统计进行归档工作后的实际容量的目标数据库,归档配置数据中记录有目标库进行归档的归档表信息,服务器从目标库的归档配置数据中可清楚地获取目标库进行归档的归档表及归档表信息。
如图8所示,在一个实施例中,数据获取模块710包括信息获取单元712、脚本生成单元714和导入单元716。
信息获取单元712,用于获取待计算的目标库及目标库信息。
服务器可获取目标库的目标库信息,目标库信息可包括目标库编号、目标库名称、目标库中存储的各个表的表名等信息。
脚本生成单元714,用于根据目标库信息从配置库中提取匹配的归档配置数据,并根据归档配置数据生成SQL脚本。
配置库中存储有各个数据库的相关配置信息,包括各个数据库的归档配置数据、业务配置数据等,其中,业务配置数据可包括数据库存储的表对应的业务类型、数据库各个表的主键、字段等。服务器可根据目标库信息中的目标库编号或是目标库名称等,从配置库中查找并提取与目标库信息匹配的归档配置数据,并导出归档配置数据生成SQL脚本,方便将归档配置数据从配置库中传递至目标库。
导入单元716,用于根据SQL脚本将归档配置数据导入目标库中。
服务器可根据生成的SQL脚本将与目标库的目标库信息匹配的归档配置数据导入目标库中,即可成功提取目标库中进行归档的归档表的相关信息。
生成模块720,用于根据归档配置数据获取目标库中与归档配置数据匹配的归档表,并生成配置信息表。
服务器将归档配置数据导入目标库后,即可根据归档配置数据获取目标库中进行归档的归档表及归档表信息,并根据归档表信息生成配置信息表,配置信息表中可包括归档表表名、归档表属主、与归档表关联的主表表名、主表属主等信息,其中,属主指的是创建该表的用户,即表所属的主人。
确定模块730,用于根据配置信息表确定归档表的原容量。
服务器可按照配置信息表中的归档表表名,逐一对目标库中的归档表计算归档后的实际容量,可使用DBA_SEGMENTS函数获取归档表所占的物理空间大小,即归档表的原容量。
分析模块740,用于分析归档表,并获取归档表中各个数据块的空闲度。
数据库的逻辑结构包括表空间、段、区及数据块,一个数据库由多个表空间组成,表空间中存储的对象称为段,例如数据段、索引段等,段由区组成,区是磁盘分配的最小单位。区的大小是数据块大小的整数倍,数据块是数据库中最小的I/O单位,同时也是内存数据缓冲区的单位,及数据文件存储空间单位。在确定了归档表的原容量后,服务器可对归档表进行分析,通过show_space函数扫描归档表中的各个数据块,并获取各个数据块的空闲度,空闲度即数据块未被占用的容量占数据块容量大小的比率。
计算模块750,用于根据各个数据块的空闲度及归档表的原容量计算归档表的实际容量。
服务器可根据归档表的BLOCK_SIZE参数获取数据块的容量大小,并根据归档表中各个数据块的空闲度计算出归档表的归档容量,其中,归档容量指的是表中数据被归档后空出的容量,即数据被归档后可重新使用的容量。计算出归档表的归档容量后,可将归档表的原容量减去归档容量,即可得到归档表的实际容量。按照配置信息表中的归档表表名逐一计算目标库的各个归档表的实际容量后,即可将各个归档表的实际容量与其它未进行归档的表的容量进行相加,得到数据库进行归档后的实际容量。
上述数据库容量计算装置,通过获取目标库的归档配置数据,根据归档配置数据生成配置信息表,根据配置信息表确定目标库中的归档表的原容量,分析归档表,并获取归档表中各个数据块的空闲度,再根据各个数据块的空闲及归档表的原容量计算归档表的实际容量,使得数据库的真实容量计算能细粒度到数据块层面,根据各个数据块的空闲度能够准确计算得到每个归档表的实际容量大小,从而准确得到数据库进行归档后的实际容量。
如图9所示,在一个实施例中,分析模块740包括分析单元742、数据块获取单元744及空闲度获取单元746。
分析单元742,用于分析归档表,得到归档表的高水位线。
高水位线用于界定归档表的一个段中已被使用的数据块及未被使用的数据块,高水位线下为已被使用的数据块,高水位线上为已分配但未使用的数据块。当在表中新增数据时,高水位线会随着数据存储的空间不足而往上移,但当归档表中的数据被归档,相当于将归档的数据进行删除时,高水位线并不会随着数据块中的数据被删除而降低,因此,服务器需要获取归档表的高水位线,从而得到归档表中的各个段中已被使用的数据块。
数据块获取单元744,用于根据高水位线获取归档表中已被使用的数据块。
服务器可根据归档表的高水位线获取归档表中已被使用的数据块及未被使用的数据块,并获取已被使用的数据块的归档容量,即获取已被使用的数据块因数据归档从而空出来可重新写入数据的容量。还可根据高水位线获取未被使用的数据块数量,并根据归档表的BLOCK_SIZE参数定义的数据块容量大小计算归档表已分配但未使用的容量。
空闲度获取单元746,用于逐一扫描已被使用的数据块,并获取各个已被使用的数据块的空闲度。
服务器可通过show_space函数逐一扫描归档表中已被使用的数据块,并分别获取已被使用的数据块的空闲度,从而根据归档表中各个数据块的空闲度计算出归档表的归档容量。
在本实施例中,根据归档表的高水位线获取已被使用的数据块,并获取各个已被使用的数据块的空闲度从而精准地计算出归档表的归档容量,使得归档表的实际容量计算能细粒度到数据块层面,在计算归档表的实际容量时能去除高水位线下的空出的容量,能够准确计算得到每个归档表的实际容量大小。
如图10所示,在一个实施例中,计算模块750包括分配单元752、数量获取单元754及计算单元756。
分配单元752,用于根据预设的各个空闲度阶段对应的空闲度阶段值及各个已被使用的数据块的空闲度,将各个已被使用的数据块分配到对应的空闲度阶段。
服务器可预先根据业务需求设置各个空闲度阶段,例如,可分别设置0-25%、25%-50%、50%-75%、75%-100%、100%等5个空闲度阶段。每个空闲度阶段对应一个空闲度数值区间,空闲度阶段值即为各个空闲度阶段的最大值、最小值,每个空闲度阶段对应的空闲度数值区间可大于或等于最小值,小于最大值,但并不限于此。通过show_space函数扫描归档表中已被使用的数据块,可获取已被使用的数据块的空闲度,将获取的已被使用的数据块的空闲度与各个空闲度阶段的空闲度阶段值进行比对,判断该已被使用的数据块的空闲度落入哪一个空闲度数值区间,从而对已被使用的数据块进行分配。例如,某已被使用的数据块的空闲度为30%,进行比对后,可知其落入25%-50%这一空闲度数值区间,则将该已被使用的数据块分配至25%-50%的空闲度阶段。
数量获取单元754,用于获取被分配到各个空闲度阶段的数据块数量。
服务器逐一根据归档表中各个已被使用的数据块的空闲度,将各个已被使用的数据块分配到对应的空闲度阶段后,可获取被分配到各个空闲度阶段的数据块数量。
计算单元756,用于通过加权算法对各个空闲度阶段对应的空闲度阶段值及数据块数量进行计算,得到各个空闲度阶段的加权空闲数据块数量。
因各个空闲度阶段对应的为空闲度数值区间,因此,服务器需要对各个空闲度阶段对应的空闲度阶段值及数据块数量进行加权计算,从而获取归档表中已被使用的数据块的空闲度分布结构。例如,假设归档表中,分配到0-25%、25%-50%、50%-75%、75%-100%、100%等5个空闲度阶段的数据块数量分别为a、b、c、d、e,则可计算得到各个空闲度阶段的加权空闲数据块数量分别为a*(0+25%)/2、b*(25%+50%)/2、c*(50%+75%)/2、d*(75%+100%)/2、e*100%。
计算单元756,还用于获取归档表的数据块容量,根据数据块容量及各个空闲度阶段的加权空闲数据块数量计算归档容量。
服务器可根据归档表的BLOCK_SIZE参数获取归档表的数据块容量大小,并将数据块容量与各个空闲度阶段的加权空闲数据块数量之和相乘,即可得到归档表的归档容量,即归档容量=归档容量*各个空闲度阶段的加权空闲数据块数量之和。
计算单元756,还用于根据原容量及归档容量计算归档表的实际容量。
具体的,可根据获取的归档表的原容量、归档表中已分配但未使用的容量及归档容量等数据计算归档表的实际容量,归档表的实际容量=原容量-已分配但未使用的容量-归档容量。
在本实施例中,将归档表中已被使用的数据块按照空闲度分配至各个空闲度阶段,并计算各个空闲度阶段的加权空闲数据块数量,从而使得计算出的归档表的归档容量更加准确,能够准确计算得到每个归档表的实际容量大小。
如图11所示,在一个实施例中,上述数据库容量计算装置,除了包括数据获取模块710、生成模块720、确定模块730、分析模块740及计算模块750,还包括容量获取模块760及统计模块770。
容量获取模块760,用于获取目标库的原全库容量。
服务器可从数据库记录表中提取与目标库对应的数据,并获取目标库的原全库容量,其中,原全库容量指的是数据库未进行数据归档前所占的物理空间大小。
统计模块770,用于统计目标库中各个归档表的归档容量,得到目标库的总归档容量。
服务器计算得到目标库中各个归档表的归档容量后,可对各个归档表的归档容量进行统计,得到目标库的总归档容量。
计算模块750,还用于根据总归档容量及原全库容量计算实际归档率。
实际归档率指的是已归档的数据占原来全部数据的比率,目标库的实际归档率=总归档容量/原全库容量。在一个实施例中,服务器可每隔预设时间,例如每日、每月等,收集目标库各个归档表的归档时间、归档容量、实际容量、目标库的总归档容量、实际归档率等数据,并生成报表,方便工作人员对数据库中的归档范围及实际归档率等进行察看调整。
在本实施例中,可根据准确计算得到的归档表的归档容量计算目标库的实际归档率,使实际归档率的计算更加精准,能帮助工作人员更加准确地分析数据库的归档工作。
上述数据库容量计算装置中的各个模块可全部或部分通过软件、硬件或其组合来实现。例如,在硬件实现上,上述计算模块750可通过服务器的处理器根据各个数据块的空闲度及归档表的原容量计算归档表的实际容量,其中,处理器可以为中央处理单元、微处理器等。上述各模块可以硬件形式内嵌于或独立于服务器的处理器中,也可以以软件形式存储于服务器的存储器中,以便于处理器调用执行以上各个模块对应的操作。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机程序来指令相关的硬件来完成,所述的程序可存储于一计算机可读取存储设备中,该程序在执行时,可包括如上述各方法的实施例的流程。其中,所述的存储设备可为磁碟、光盘、只读存储记忆体(Read-Only Memory,ROM)等。
以上所述实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对发明专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。

Claims (20)

  1. 一种数据库容量计算方法,包括:
    获取目标库的归档配置数据;
    根据所述归档配置数据获取所述目标库中与所述归档配置数据匹配的归档表,并生成配置信息表;
    根据所述配置信息表确定所述归档表的原容量;
    分析所述归档表,并获取所述归档表中各个数据块的空闲度;及
    根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量。
  2. 根据权利要求1所述的方法,其特征在于,所述获取目标库的归档配置数据,包括:
    获取待计算的目标库及目标库信息;
    根据所述目标库信息从配置库中提取匹配的归档配置数据,并根据所述归档配置数据生成SQL脚本;及
    根据所述SQL脚本将所述归档配置数据导入所述目标库中。
  3. 根据权利要求1或2所述的方法,其特征在于,所述分析所述归档表,并获取所述归档表中各个数据块的空闲度,包括:
    分析所述归档表,得到所述归档表的高水位线;
    根据所述高水位线获取所述归档表中已被使用的数据块;及
    逐一扫描所述已被使用的数据块,并获取各个已被使用的数据块的空闲度。
  4. 根据权利要求3所述的方法,其特征在于,所述根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量,包括:
    根据预设的各个空闲度阶段对应的空闲度阶段值及所述各个已被使用的数据块的空闲度,将所述各个已被使用的数据块分配到对应的空闲度阶段;
    获取被分配到所述各个空闲度阶段的数据块数量;
    通过加权算法对所述各个空闲度阶段对应的空闲度阶段值及数据块数量进行计算,得到所述各个空闲度阶段的加权空闲数据块数量;
    获取所述归档表的数据块容量,根据所述数据块容量及所述各个空闲度阶段的加权空闲数据块数量计算归档容量;及
    根据所述原容量及所述归档容量计算所述归档表的实际容量。
  5. 根据权利要求4所述的方法,其特征在于,所述方法还包括:
    获取所述目标库的原全库容量;
    统计所述目标库中各个归档表的归档容量,得到所述目标库的总归档容量;及
    根据所述总归档容量及所述原全库容量计算实际归档率。
  6. 一种数据库容量计算装置,包括:
    数据获取模块,用于获取目标库的归档配置数据;
    生成模块,用于根据所述归档配置数据获取所述目标库中与所述归档配置数据匹配的归档表,并生成配置信息表;
    确定模块,用于根据所述配置信息表确定所述归档表的原容量;
    分析模块,用于分析所述归档表,并获取所述归档表中各个数据块的空闲度;及
    计算模块,用于根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量。
  7. 根据权利要求6所述的装置,其特征在于,所述数据获取模块,包括:
    信息获取单元,用于获取待计算的目标库及目标库信息;
    脚本生成单元,用于根据所述目标库信息从配置库中提取匹配的归档配置数据,并根据所述归档配置数据生成SQL脚本;及
    导入单元,用于根据所述SQL脚本将所述归档配置数据导入所述目标库中。
  8. 根据权利要求6或7所述的装置,其特征在于,所述分析模块,包括:
    分析单元,用于分析所述归档表,得到所述归档表的高水位线;
    数据块获取单元,用于根据所述高水位线获取所述归档表中已被使用的数据块;及
    空闲度获取单元,用于逐一扫描所述已被使用的数据块,并获取各个已被使用的数据块的空闲度。
  9. 根据权利要求8所述的装置,其特征在于,所述计算模块,包括:
    分配单元,用于根据预设的各个空闲度阶段对应的空闲度阶段值及所述各个已被使用的数据块的空闲度,将所述各个已被使用的数据块分配到对应的空闲度阶段;
    数量获取单元,用于获取被分配到所述各个空闲度阶段的数据块数量;及
    计算单元,用于通过加权算法对所述各个空闲度阶段对应的空闲度阶段值及数据块数量进行计算,得到所述各个空闲度阶段的加权空闲数据块数量;
    所述计算单元,还用于获取所述归档表的数据块容量,根据所述数据块容量及所述各个空闲度阶段的加权空闲数据块数量计算归档容量;
    所述计算单元,还用于根据所述原容量及所述归档容量计算所述归档表的实际容量。
  10. 根据权利要求9所述的装置,其特征在于,所述装置还包括:
    容量获取模块,用于获取所述目标库的原全库容量;及
    统计模块,用于统计所述目标库中各个归档表的归档容量,得到所述目标库的总归档容量;
    所述计算模块,还用于根据所述总归档容量及所述原全库容量计算实际归档率。
  11. 一种服务器,包括存储器和处理器,所述存储器中储存有计算机可执行指令,所述计算机可执行指令被所述处理器执行时,使得所述处理器执行以下步骤:
    获取目标库的归档配置数据;
    根据所述归档配置数据获取所述目标库中与所述归档配置数据匹配的归档表,并生成配置信息表;
    根据所述配置信息表确定所述归档表的原容量;
    分析所述归档表,并获取所述归档表中各个数据块的空闲度;及
    根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量。
  12. 根据权利要求11所述的服务器,其特征在于,所述获取目标库的归档配置数据,包括:
    获取待计算的目标库及目标库信息;
    根据所述目标库信息从配置库中提取匹配的归档配置数据,并根据所述归档配置数据生成SQL脚本;及
    根据所述SQL脚本将所述归档配置数据导入所述目标库中。
  13. 根据权利要求11或12所述的服务器,其特征在于,所述分析所述归档表,并获取所述归档表中各个数据块的空闲度,包括:
    分析所述归档表,得到所述归档表的高水位线;
    根据所述高水位线获取所述归档表中已被使用的数据块;及
    逐一扫描所述已被使用的数据块,并获取各个已被使用的数据块的空闲度。
  14. 根据权利要求13所述的服务器,其特征在于,所述根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量,包括:
    根据预设的各个空闲度阶段对应的空闲度阶段值及所述各个已被使用的数据块的空闲度,将所述各个已被使用的数据块分配到对应的空闲度阶段;
    获取被分配到所述各个空闲度阶段的数据块数量;
    通过加权算法对所述各个空闲度阶段对应的空闲度阶段值及数据块数量进行计算,得到所述各个空闲度阶段的加权空闲数据块数量;
    获取所述归档表的数据块容量,根据所述数据块容量及所述各个空闲度阶段的加权空闲数据块数量计算归档容量;及
    根据所述原容量及所述归档容量计算所述归档表的实际容量。
  15. 根据权利要求14所述的服务器,其特征在于,所述计算机可执行指令被所述处理器执行时,还使得所述处理器执行:
    获取所述目标库的原全库容量;
    统计所述目标库中各个归档表的归档容量,得到所述目标库的总归档容量;及
    根据所述总归档容量及所述原全库容量计算实际归档率的步骤。
  16. 一个或多个存储有计算机可执行指令的存储设备,所述计算机可执行指令被一个或多个处理器执行时,使得所述一个或多个处理器执行以下步骤:
    获取目标库的归档配置数据;
    根据所述归档配置数据获取所述目标库中与所述归档配置数据匹配的归档表,并生成配置信息表;
    根据所述配置信息表确定所述归档表的原容量;
    分析所述归档表,并获取所述归档表中各个数据块的空闲度;及
    根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量。
  17. 根据权利要求16所述的存储设备,其特征在于,所述获取目标库的归档配置数据,包括:
    获取待计算的目标库及目标库信息;
    根据所述目标库信息从配置库中提取匹配的归档配置数据,并根据所述归档配置数据生成SQL脚本;及
    根据所述SQL脚本将所述归档配置数据导入所述目标库中。
  18. 根据权利要求16或17所述的存储设备,其特征在于,所述分析所述归档表,并获取所述归档表中各个数据块的空闲度,包括:
    分析所述归档表,得到所述归档表的高水位线;
    根据所述高水位线获取所述归档表中已被使用的数据块;及
    逐一扫描所述已被使用的数据块,并获取各个已被使用的数据块的空闲度。
  19. 根据权利要求18所述的存储设备,其特征在于,所述根据所述各个数据块的空闲度及所述归档表的原容量计算所述归档表的实际容量,包括:
    根据预设的各个空闲度阶段对应的空闲度阶段值及所述各个已被使用的数据块的空闲度,将所述各个已被使用的数据块分配到对应的空闲度阶段;
    获取被分配到所述各个空闲度阶段的数据块数量;
    通过加权算法对所述各个空闲度阶段对应的空闲度阶段值及数据块数量进行计算,得到所述各个空闲度阶段的加权空闲数据块数量;
    获取所述归档表的数据块容量,根据所述数据块容量及所述各个空闲度阶段的加权空闲数据块数量计算归档容量;及
    根据所述原容量及所述归档容量计算所述归档表的实际容量。
  20. 根据权利要求19所述的存储设备,其特征在于,所述计算机可执行指令被一个或多个处理器执行时,还使得所述一个或多个处理器执行:
    获取所述目标库的原全库容量;
    统计所述目标库中各个归档表的归档容量,得到所述目标库的总归档容量;及
    根据所述总归档容量及所述原全库容量计算实际归档率的步骤。
PCT/CN2017/085868 2016-09-28 2017-05-25 数据库容量计算方法、装置、服务器及存储设备 WO2018058983A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610859826.2A CN106383897B (zh) 2016-09-28 2016-09-28 数据库容量计算方法和装置
CN201610859826.2 2016-09-28

Publications (1)

Publication Number Publication Date
WO2018058983A1 true WO2018058983A1 (zh) 2018-04-05

Family

ID=57937357

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/085868 WO2018058983A1 (zh) 2016-09-28 2017-05-25 数据库容量计算方法、装置、服务器及存储设备

Country Status (3)

Country Link
CN (1) CN106383897B (zh)
TW (1) TWI637281B (zh)
WO (1) WO2018058983A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106383897B (zh) * 2016-09-28 2018-02-16 平安科技(深圳)有限公司 数据库容量计算方法和装置
CN108470045B (zh) * 2018-03-06 2020-02-18 平安科技(深圳)有限公司 电子装置、数据链式归档的方法及存储介质
CN109885567B (zh) * 2018-12-13 2024-04-02 平安壹钱包电子商务有限公司 一种存储空间扩容方法和装置
CN111090652B (zh) * 2019-12-20 2023-05-23 山大地纬软件股份有限公司 一种可水平扩展归档数据库的数据归档方法和装置

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101364897A (zh) * 2008-09-17 2009-02-11 中兴通讯股份有限公司 一种历史数据归档的系统与实现方法
CN103761318A (zh) * 2014-01-27 2014-04-30 中国工商银行股份有限公司 一种关系型异构数据库数据同步的方法及系统
CN106383897A (zh) * 2016-09-28 2017-02-08 平安科技(深圳)有限公司 数据库容量计算方法和装置

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7660959B2 (en) * 2006-09-28 2010-02-09 International Business Machines Corporation Managing encryption for volumes in storage pools
CN101493842A (zh) * 2009-02-20 2009-07-29 中兴通讯股份有限公司 Oracle数据库中大批量删除数据的方法
US9002801B2 (en) * 2010-03-29 2015-04-07 Software Ag Systems and/or methods for distributed data archiving amongst a plurality of networked computing devices
TW201342052A (zh) * 2011-12-16 2013-10-16 Ibm 磁帶機系統伺服器
US9275086B2 (en) * 2012-07-20 2016-03-01 Commvault Systems, Inc. Systems and methods for database archiving
CN103064768B (zh) * 2013-01-25 2015-11-25 北京捷成世纪科技发展江苏有限公司 一种基于磁带的数据归档备份方法和系统
CN104699807B (zh) * 2015-03-23 2018-05-15 上海新炬网络信息技术股份有限公司 一种oracle数据表空间的自动监控扩容方法
CN105808633B (zh) * 2016-01-08 2019-07-23 平安科技(深圳)有限公司 数据归档方法和系统

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101364897A (zh) * 2008-09-17 2009-02-11 中兴通讯股份有限公司 一种历史数据归档的系统与实现方法
CN103761318A (zh) * 2014-01-27 2014-04-30 中国工商银行股份有限公司 一种关系型异构数据库数据同步的方法及系统
CN106383897A (zh) * 2016-09-28 2017-02-08 平安科技(深圳)有限公司 数据库容量计算方法和装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WU, MINGJUAN ET AL.: "Research and Application on History Data Archiving and Reconstruction Strategy Based on XML", COMPUTER DEVELOPMENT & APPLICATIONS, vol. 25, no. 2, 25 February 2012 (2012-02-25), pages 5 - 7 *

Also Published As

Publication number Publication date
TW201814559A (zh) 2018-04-16
CN106383897A (zh) 2017-02-08
TWI637281B (zh) 2018-10-01
CN106383897B (zh) 2018-02-16

Similar Documents

Publication Publication Date Title
WO2018058983A1 (zh) 数据库容量计算方法、装置、服务器及存储设备
WO2018103315A1 (zh) 监控数据的处理方法、装置、服务器及存储设备
WO2018149082A1 (zh) 合同生成方法、装置、服务器和存储介质
WO2018036167A1 (zh) 测试任务执行机分配方法、装置、服务器和存储介质
WO2018076865A1 (zh) 数据分享方法、装置、存储介质及电子设备
WO2018058959A1 (zh) Sql审核方法、装置、服务器及存储设备
WO2018076867A1 (zh) 数据备份的删除方法、装置、系统、存储介质和服务器
WO2015020471A1 (en) Method and apparatus for distributing data in hybrid cloud environment
WO2020224246A1 (zh) 基于区块链的数据管理方法、装置、设备和存储介质
WO2017054463A1 (zh) 事件信息推送方法、事件信息推送装置及存储介质
WO2014044136A1 (zh) 基于分布式数据的并发处理方法、系统和计算机存储介质
WO2017157146A1 (zh) 基于用户画像的个性化推荐方法、装置、服务器及存储介质
WO2018076889A1 (zh) 数据备份的方法、装置、系统、存储介质及服务器
WO2018076841A1 (zh) 数据分享方法、装置、存储介质及服务器
WO2017028573A1 (zh) 一种基于移动终端的图片信息处理的方法及系统
WO2017071363A1 (zh) 密码的共享方法、密码的共享系统及终端设备
WO2014044130A1 (zh) 业务巡检方法和系统、计算机存储介质
WO2018000174A1 (zh) 面向存储的dna序列的并行快速匹配方法及其系统
WO2020253135A1 (zh) 自动化分析方法、用户设备、存储介质及装置
WO2020015060A1 (zh) 用电量异常评估方法、装置、设备和计算机存储介质
WO2018076829A1 (zh) 终端数据的处理方法、装置、系统、存储介质及服务器
WO2018076890A1 (zh) 数据备份的方法、装置、存储介质、服务器及系统
WO2018066942A1 (en) Electronic device and method for controlling the same
WO2021072881A1 (zh) 基于对象存储的请求处理方法、装置、设备及存储介质
WO2015180434A1 (zh) 一种数据库集群管理数据的方法、节点及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17854439

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.07.2019)

122 Ep: pct application non-entry in european phase

Ref document number: 17854439

Country of ref document: EP

Kind code of ref document: A1