WO2022001627A1 - Data clearing method, apparatus, computer device and storage medium - Google Patents

Data clearing method, apparatus, computer device and storage medium

Info

Publication number
WO2022001627A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
target
database
specified
data table
Prior art date
Application number
PCT/CN2021/099673
Other languages
English (en)
French (fr)
Inventor
冀怀远
张敏伟
杨婧
徐梅兰
Original Assignee
苏宁易购集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏宁易购集团股份有限公司
Publication of WO2022001627A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Definitions

  • the present application relates to the technical field of data processing, and in particular, to a data clearing method, apparatus, computer equipment and storage medium.
  • the storage space of the database is limited, so it is necessary to clean the database. For example, the non-business tables or garbage data in the database can be cleaned up. Otherwise, the storage space of the database is likely to be tight, which will affect the system operation and task execution.
  • the data in the database is usually cleaned up manually.
  • when the database stores data in sub-databases and sub-tables, the data of one data table is split into multiple data tables that are stored in different sub-databases of different databases.
  • the amount of data to be deleted is then large and the storage locations are scattered, which makes the existing method of manually clearing data inefficient in the sub-database and sub-table scenario.
  • the existing method for clearing multiple data tables is very dependent on the configuration information input by the user, which causes pressure and inconvenience to the user's operation when the number of databases and data tables involved is huge.
  • the present invention provides a data clearing method, device, computer equipment and storage medium.
  • the embodiments of the present invention can quickly and accurately perform data clearing processing on a large number of target data tables with scattered storage locations; the process is efficient and user-friendly.
  • the present invention provides a method for clearing data according to a first aspect.
  • the method includes:
  • the specified data of the data table includes one or more specified data, each specified data corresponds to a database, and each specified data includes the identifier of the data table to be cleared;
  • each of the above-mentioned specified data further includes database connection information
  • the above-mentioned determining, according to any piece of specified data, of multiple target sub-databases in the database corresponding to that piece of specified data includes:
  • One or more target sub-databases in the database corresponding to any one of the specified data are determined from a plurality of initial sub-databases according to the identifier of the data table to be cleared included in the specified data.
  • the above-mentioned identifiers of data tables to be cleared include one or more identifiers of data tables
  • the above-mentioned method further includes: determining one or more target data table identifiers corresponding to each target sub-database according to the data table identifiers to be cleared included in the specified data, where the target data table identifiers corresponding to each target sub-database correspond to one or more target data tables in that target sub-database.
  • the above-mentioned determining of multiple target sub-databases in the database corresponding to any specified data according to that specified data, and determining of multiple target data tables in each target sub-database according to the identifier of the data table to be cleared included in the specified data, include:
  • the first thread pool includes one or more first threads, and the number of first threads is the same as the number of specified data;
  • the above-mentioned data clearing is performed on all target data tables in each target sub-database, including:
  • a second thread pool corresponding to the first thread is established by each first thread, each second thread pool includes a plurality of second threads, and the number of second threads included in each second thread pool is the same as the number of target sub-databases determined by the first thread corresponding to that second thread pool;
  • the data clearing operation for all target data tables in a target sub-database is performed by a second thread respectively, and the target sub-databases processed correspondingly by different second threads are different.
  • the operation of performing data clearing on all target data tables in a target sub-database through any second thread includes:
  • a third thread pool corresponding to any one of the second threads is established through that second thread, the third thread pool includes one or more third threads, and the number of third threads is the same as the number of target data table identifiers corresponding to the target sub-database processed by that second thread;
  • the data clearing operation of all target data tables corresponding to a target data table identifier is performed by a third thread respectively, and the target data table identifiers processed corresponding to different third threads are different.
  • the method further includes: respectively establishing a database connection pool corresponding to a target sub-database through a second thread, and the target sub-databases processed corresponding to different second threads are different;
  • Performing data clearing on all target data tables corresponding to a target data table identifier by any third thread includes:
  • the present invention provides an apparatus for clearing data according to a second aspect.
  • the apparatus includes:
  • the specified data acquisition module is used to obtain the specified data of the data table, the specified data of the data table includes one or more specified data, each specified data corresponds to a database, and each specified data includes the identifier of the data table to be cleared;
  • a target sub-database determination module used for determining a plurality of target sub-databases in the database corresponding to any one of the specified data according to any specified data;
  • a target data table determination module used for determining a plurality of target data tables in each target sub-database according to the identifier of the data table to be cleared included in the specified data
  • the clearing module is used to clear all target data tables in each target sub-library.
  • the present invention provides a computer device for clearing data, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
  • the specified data of the data table includes one or more specified data, each specified data corresponds to a database, and each specified data includes the identifier of the data table to be cleared;
  • the present invention provides a computer-readable storage medium for clearing data, on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • the specified data of the data table includes one or more specified data, each specified data corresponds to a database, and each specified data includes the identifier of the data table to be cleared;
  • the above-mentioned data clearing method, device, computer equipment and storage medium obtain the specified data of the data table, which includes the identifier of the data table to be cleared, so that data tables can be cleared according to the actual needs of the user; each piece of specified data corresponds to a database, which unifies the operation of clearing data tables across multiple databases; according to each piece of specified data, the target sub-databases containing the data table to be cleared in the corresponding database are determined, which realizes data table clearing at the sub-database level and excludes data sub-databases that do not contain the data table to be cleared, reducing redundant operations, the task running load and the overall process time; according to the identifier of the data table to be cleared contained in the specified data, multiple target data tables in each target sub-database are determined, and all target data tables in each target sub-database are cleared, which realizes a unified operation of clearing multiple data tables in multiple data sub-databases.
  • the embodiment of the present application can quickly and accurately perform data clearing processing on a large number of target data tables with scattered storage locations in combination with the actual needs of users, which is efficient and automated, avoids numerous repeated operations, and saves considerable time, labor and resource usage costs.
  • FIG. 1-1 is an application environment diagram of the data clearing method in one embodiment
  • FIG. 1-2 is a schematic diagram of the relationship between a database and a data sub-database in one embodiment
  • FIG. 1-3 is a schematic diagram of the relationship between a data sub-database and a data table in one embodiment
  • FIG. 2 is a schematic flowchart of a data clearing method in one embodiment
  • FIG. 3 is a schematic flowchart of a step of determining target sub-databases in one embodiment
  • FIG. 4 is a schematic flowchart of the steps of determining target sub-databases and determining target data tables in one embodiment
  • FIG. 5 is a schematic flowchart of a data clearing step in one embodiment
  • FIG. 6 is a schematic flowchart of a data clearing step in another embodiment
  • FIG. 7 is a schematic flowchart of a data clearing step in yet another embodiment
  • FIG. 8 is a structural block diagram of a data clearing device in one embodiment
  • FIG. 9 is a detailed structural block diagram of a target sub-database determination module in one embodiment
  • FIG. 10 is a detailed block diagram of a clearing module in one embodiment
  • FIG. 11 is a diagram of the internal structure of a computer device in one embodiment.
  • the data clearing method provided by the embodiments of the present application can be applied to a system capable of data clearing, and the system can be implemented by an independent server or a server cluster composed of multiple servers, or can be implemented by other network side devices.
  • the system may be implemented by a server 102, and the server 102 where the system is located may perform data interaction with one or more database servers.
  • the database 110 may include multiple data sub-databases, including a data sub-database 112, a data sub-database 114 and a data sub-database 116.
  • Each data sub-database may also include multiple data tables.
  • the data sub-database 112 includes multiple data tables such as data table 120, data table 122, and data table 124.
  • the present application provides a data clearing method, comprising the following steps:
  • Step 202 Acquire the specified data of the data table, the specified data of the data table includes one or more pieces of specified data, each piece of specified data corresponds to a database, and each piece of specified data includes the identifier of the data table to be cleared.
  • the specified data of the data table can be input by the user, and the database corresponding to each specified data is the database that needs to be cleared of the specified data.
  • the identifier of the data table to be cleared is used to determine the data table to be cleared, and it can include part of one or more data table names (such as a data table name prefix). For example, given two data tables named custlevel_A and custlevel_B, if the identifier of the data table to be cleared is custlevel, the system can find the data tables custlevel_A and custlevel_B.
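The prefix matching described above can be sketched as follows. This is only an illustration: the function name `match_tables` is hypothetical, and it assumes the identifier is a table-name prefix, which the text gives as one possibility.

```python
def match_tables(identifier, table_names):
    """Return the table names that start with the given identifier (treated as a prefix)."""
    return [name for name in table_names if name.startswith(identifier)]


# Table names taken from the example in the text, plus one unrelated table.
tables = ["custlevel_A", "custlevel_B", "deptlevel_A"]
matched = match_tables("custlevel", tables)
```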
  • the data specified in the data table may be stored in an EXCEL file, and the system reads the EXCEL file to obtain the specified data in the data table.
  • the content in the specified data of the data table may be configured as shown in Table 1 below.
  • a row of data is a specified data
  • the table identifier of the data to be cleared in each specified data is the parameter information of the "Table" field in the table, such as table1, table_2.
  • the system can further encapsulate the data information into an object for the system to call in the subsequent execution process.
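One way to picture the encapsulation of a row of specified data into an object is the sketch below. Only the "Table" field name comes from the text; the `Host`, `Port`, `User` and `Password` column names, the `SpecifiedData` class, and the comma-separated identifier list are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecifiedData:
    """One row of specified data: connection details plus identifiers to clear."""
    host: str
    port: int
    user: str
    password: str
    table_identifiers: tuple


def parse_row(row):
    """Build a SpecifiedData object from one row (a dict keyed by column name)."""
    # Assumption: several identifiers in the "Table" field are comma-separated.
    identifiers = tuple(t.strip() for t in row["Table"].split(",") if t.strip())
    return SpecifiedData(row["Host"], int(row["Port"]), row["User"],
                         row["Password"], identifiers)


rows = [{"Host": "10.10.10.1", "Port": "3306", "User": "u",
         "Password": "p", "Table": "table1, table_2"}]
specified = [parse_row(r) for r in rows]
```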
  • databases corresponding to different specified data are respectively deployed on different database servers, or may be deployed on the same database server.
  • Step 204 Determine a plurality of target sub-databases in the database corresponding to any piece of specified data according to any piece of specified data.
  • the target sub-database is the data sub-database containing the data table that needs to be cleared.
  • the system can select one piece of specified data from the one or more pieces of specified data obtained, access the database corresponding to that piece of specified data through the information it contains, and then determine the target sub-databases in that database that contain the data table to be cleared.
  • Step 206 Determine a plurality of target data tables in each target sub-database according to the identifier of the data table to be cleared included in any piece of the specified data.
  • the target data table is the data table that needs to be cleared.
  • the system performs a query in the database corresponding to the specified piece of data according to the identifier of the data table to be cleared, and obtains all target data tables to be cleared in each target sub-database in the database.
  • Step 208 Perform data clearing on all target data tables in each target sub-database.
  • the system clears all target data tables in each target sub-database.
  • the system performing data clearing on the target data table may mean that the system clears all data in the target data table and retains the target data table, or the system directly deletes the target data table.
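The two meanings of "clearing" given above map naturally onto two SQL statements. The sketch below assumes a MySQL-style database, where `TRUNCATE TABLE` empties a table but keeps it and `DROP TABLE` deletes it; the helper name `clear_statement` is hypothetical, and the patent does not say which statements are actually used.

```python
def clear_statement(table_name, drop=False):
    """Build the statement that empties a table (default) or deletes it entirely."""
    # Quote the name MySQL-style, doubling any backtick inside it.
    quoted = "`" + table_name.replace("`", "``") + "`"
    if drop:
        return f"DROP TABLE {quoted}"
    return f"TRUNCATE TABLE {quoted}"


keep_table = clear_statement("custlevel_001")
delete_table = clear_statement("custlevel_001", drop=True)
```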
  • the specified data of the data table, including the identifier of the data table to be cleared, is obtained, so the data table can be cleared according to the actual needs of the user; each piece of specified data corresponds to one database, so the operation of clearing data tables in multiple databases can be unified; according to each piece of specified data, the target sub-databases that contain the data table to be cleared in the corresponding database are determined, which realizes data table deletion at the sub-database level and excludes the data sub-databases that do not contain the data table to be cleared, reducing redundant operations, the task running load and the overall process time; according to the identifier of the data table to be cleared contained in the specified data, multiple target data tables in each target sub-database are determined, and all target data tables in each target sub-database are cleared, which realizes the unified operation of clearing multiple data tables in multiple data sub-databases.
  • the embodiment of the present application can realize the clearing of multiple data tables in multiple databases and multiple data sub-databases according to the actual needs of users, which is efficient and automatic, avoids numerous repetitive operations, and saves a lot of time cost, labor cost and resource usage cost.
  • each of the above-mentioned specified data also includes database connection information; the above-mentioned determination of multiple target sub-databases in the database corresponding to any of the specified data according to any of the specified data includes:
  • Step 302 Determine a plurality of initial sub-databases in the database corresponding to any piece of specified data according to the database connection information included in any piece of specified data.
  • the database connection information is information used to access the database.
  • the database connection information may include the IP address and port information of the server where the database is located, and may also include the user name and password used to log in to the database.
  • the system can access the database according to the database connection information, obtain a list of all data sub-databases in the database, and exclude the data sub-databases that can be determined not to contain the data table to be cleared, thereby obtaining multiple initial sub-databases.
  • the system can log in to the information_schema database of the database according to the database connection information, and then execute the show databases statement to obtain a list of all data sub-databases in the database within the scope of the current user's authority. It then excludes the databases that come with MySQL (the performance_schema database, etc.), so as to obtain multiple initial sub-databases.
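The exclusion step can be sketched as a filter over the names returned by `show databases`. The set of built-in MySQL schemas listed here is an assumption about what "etc." covers, and the function name is hypothetical.

```python
# Schemas that ship with MySQL; the text names performance_schema explicitly,
# the others are assumed to fall under its "etc.".
SYSTEM_SCHEMAS = {"information_schema", "mysql", "performance_schema", "sys"}


def initial_sub_databases(all_schemas):
    """Drop the built-in schemas from the output of `show databases`."""
    return [name for name in all_schemas if name.lower() not in SYSTEM_SCHEMAS]


listed = ["information_schema", "DB_1", "DB_2", "mysql",
          "performance_schema", "sys", "DB_3"]
initial = initial_sub_databases(listed)
```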
  • Step 304 Determine one or more target sub-databases in the database corresponding to any one of the specified data from a plurality of initial sub-databases according to the identifier of the data table to be cleared included in the specified data.
  • the system further determines, from the plurality of initial sub-databases, one or more target sub-databases containing the data table to be cleared according to the identifier of the data table to be cleared. For example, the system can query the database for the data table to be cleared according to that identifier, and each data sub-database containing the data table to be cleared is determined to be a target sub-database.
  • the system first accesses the database to exclude data sub-databases that can be determined not to contain data tables to be cleared during execution, which saves operation steps and running burdens for subsequent processes and improves overall process efficiency.
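A minimal sketch of narrowing the initial sub-databases down to target sub-databases is shown below. It assumes the system already has (sub-database, table name) pairs, for instance from a query against `information_schema.tables`; the patent does not specify the query, and the function name is hypothetical.

```python
from collections import defaultdict


def target_tables_by_sub_database(rows, identifiers, initial_sub_dbs):
    """Group matching table names by sub-database; sub-databases with no match are omitted."""
    allowed = set(initial_sub_dbs)
    targets = defaultdict(list)
    for schema, table in rows:
        if schema in allowed and any(table.startswith(i) for i in identifiers):
            targets[schema].append(table)
    return dict(targets)


# (sub-database, table) pairs, as a metadata query might return them.
rows = [("DB_1", "custlevel_001"), ("DB_1", "orders"),
        ("DB_2", "custlevel_101"), ("DB_3", "orders")]
targets = target_tables_by_sub_database(rows, ["custlevel"], ["DB_1", "DB_2", "DB_3"])
```

Here DB_3 holds no matching table, so it is not a target sub-database and no later work is scheduled for it.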
  • the above identifiers of the data tables to be cleared include one or more data table identifiers; the method further includes: determining one or more target data table identifiers corresponding to each target sub-database according to the identifiers of the data tables to be cleared included in the specified data, where the target data table identifiers corresponding to each target sub-database correspond to one or more target data tables in that target sub-database.
  • the identifier of the data table to be cleared consists of one or more data table identifiers, each data table identifier corresponds to one or more data tables to be cleared, and the target data table identifiers corresponding to each target sub-database are among the identifiers of the data tables to be cleared.
  • the system queries the database according to the identifiers of the data tables to be cleared. For each target sub-database, the set of data table identifiers corresponding to all target data tables in that sub-database is the set of target data table identifiers corresponding to that target sub-database.
  • each data table to be cleared corresponds to a data table identifier, which makes locating the data table to be cleared convenient and fast, and the identifiers of all target data tables in each target sub-database are determined, which facilitates the subsequent clearing of the data tables and makes the entire data clearing process efficient.
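The relationship between a target sub-database, its target data table identifiers and its target data tables can be illustrated as a nested mapping. The function name and the prefix-based matching are assumptions for the purpose of the sketch.

```python
def identifiers_by_sub_database(targets, identifiers):
    """Map each target sub-database to {identifier: sorted list of its target tables}."""
    result = {}
    for sub_db, tables in targets.items():
        groups = {}
        for ident in identifiers:
            matched = sorted(t for t in tables if t.startswith(ident))
            if matched:
                # Only identifiers that match at least one table become
                # target data table identifiers of this sub-database.
                groups[ident] = matched
        result[sub_db] = groups
    return result


targets = {"DB_1": ["custlevel_002", "custlevel_001", "deptlevel_001"],
           "DB_2": ["custlevel_101"]}
grouped = identifiers_by_sub_database(targets, ["custlevel", "deptlevel"])
```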
  • the above-mentioned determining of multiple target sub-databases in the database corresponding to any specified data according to that specified data, and determining of multiple target data tables in any target sub-database according to the identifier of the data table to be cleared included in the specified data, include:
  • Step 402 establish a first thread pool, the first thread pool includes one or more first threads, and the number of the first threads is the same as the number of specified data;
  • the number of threads in the first thread pool is the same as the number of specified data, and one first thread corresponds to processing a piece of specified data, that is, one first thread corresponds to a database corresponding to the specified piece of data.
  • take as an example the case where the number of pieces of specified data is two and the two corresponding databases are located on different servers: if the two databases are on server 10.10.10.1 and server 10.10.10.2 respectively, then the database on server 10.10.10.1 may correspond to the first thread thread_1, and the database on server 10.10.10.2 may correspond to the first thread thread_2.
  • the first thread pool may be a cacheable thread pool, that is, the size of the first thread pool varies with the amount of specified data. For example, if there are two pieces of specified data, corresponding to two different databases, then there are two threads in the first thread pool; if the number of specified pieces of data is three, corresponding to three different databases, then the first thread pool There are three first threads.
  • the system creates a first thread pool at the database instance level in the main thread, and each first thread in the first thread pool correspondingly processes a database corresponding to a piece of specified data.
  • Step 404, through each first thread respectively, perform the operations of determining a plurality of target sub-databases in the database corresponding to one piece of specified data and determining a plurality of target data tables in any target sub-database according to the identifier of the data table to be cleared included in that specified data; different first threads process different pieces of specified data.
  • the system executes, through each first thread, on the database corresponding to that first thread: determining a plurality of target sub-databases in the database corresponding to a piece of specified data according to that piece of specified data, and determining multiple target data tables in each target sub-database according to the identifier of the data table to be cleared included in that piece of specified data.
  • the system can log in to the information_schema database of the database through thread_1, execute show databases, and then exclude the built-in data sub-databases of the MySQL database to obtain multiple target sub-databases of the database, such as DB_1, DB_2, DB_3, etc.
  • the system can search the database according to the identifier of the data table to be cleared through thread_1, and obtain the specific table name of the target data table in each target sub-database.
  • the target data tables in sub-database DB_1 can be custlevel_001-custlevel_100; the target data tables in sub-database DB_2 can be custlevel_101-custlevel_200.
  • each database to be processed corresponds to a first thread, and the first threads can be executed concurrently, which lets data clearing proceed in multiple different databases at the same time, makes the data clearing process fast and efficient, uses system resources effectively and reasonably, and improves overall processing performance.
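The first-level thread pool can be sketched with Python's standard `ThreadPoolExecutor`, sized to the number of pieces of specified data. The patent does not name a language or library; `process_database` is a hypothetical placeholder for the per-database work (finding target sub-databases and target data tables).

```python
from concurrent.futures import ThreadPoolExecutor


def process_database(spec):
    """Placeholder for the work one first thread does for one database."""
    return f"processed {spec}"


def run_first_level(specified_data):
    """Run one first thread per piece of specified data and collect the results in order."""
    if not specified_data:
        return []
    # Pool size equals the number of pieces of specified data, as in step 402.
    with ThreadPoolExecutor(max_workers=len(specified_data)) as pool_1:
        return list(pool_1.map(process_database, specified_data))


results = run_first_level(["10.10.10.1", "10.10.10.2"])
```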
  • the above-mentioned data clearing is performed on all target data tables in each target sub-database, including:
  • Step 502 Establish a second thread pool corresponding to the first thread through each first thread, each second thread pool includes a plurality of second threads, and the number of second threads included in each second thread pool is the same as the number of target sub-databases determined by the first thread corresponding to that second thread pool;
  • Step 504 Perform data clearing operations on all target data tables in one target sub-database through a second thread respectively, and different second threads correspond to different target sub-databases.
  • corresponding to each first thread, the system establishes a second thread pool; that is, the system establishes a second thread pool for the database corresponding to each piece of specified data, so each database to be processed corresponds to one second thread pool.
  • the number of second threads in each second thread pool is the same as the number of target sub-databases in the database corresponding to that second thread pool; one second thread corresponds to one target sub-database, and the target sub-databases corresponding to different second threads are different.
  • the system executes a target data table clearing operation in the target sub-database corresponding to the second thread through a second thread.
  • the database corresponds to the first thread thread_1 in the first thread pool pool_1, and the database includes target sub-databases DB_1 and DB_2.
  • The system establishes a corresponding second thread pool pool_2 through thread_1.
  • Each target sub-database in the database corresponds to a second thread in pool_2.
  • DB_1 corresponds to the second thread thread_3
  • DB_2 corresponds to the second thread thread_4.
  • The system uses thread_3 to clear all target data tables in target sub-database DB_1, and uses thread_4 to clear all target data tables in target sub-database DB_2.
  • a thread pool at the sub-database level is established on the basis of the thread pool at the database level, and each target sub-database is allocated a second thread in the second thread pool for data clearing, which allows the data clearing operations of multiple sub-databases in multiple databases to be performed concurrently, making the data clearing process fast and efficient, using system resources effectively and reasonably, and improving overall processing performance.
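The second level can be pictured as a pool created inside the work of one first thread, with one task per target sub-database. This is a sketch under the same assumptions as before (Python's `ThreadPoolExecutor`, hypothetical function names); the actual clearing is replaced by a placeholder that counts tables.

```python
from concurrent.futures import ThreadPoolExecutor


def clear_sub_database(sub_db, tables):
    """Placeholder for clearing every target table in one target sub-database."""
    return sub_db, len(tables)


def process_database(target_sub_dbs):
    """Runs inside a first thread; maps each sub-database name to its cleared-table count."""
    if not target_sub_dbs:
        return {}
    # Pool size equals the number of target sub-databases, as in step 502.
    with ThreadPoolExecutor(max_workers=len(target_sub_dbs)) as pool_2:
        futures = [pool_2.submit(clear_sub_database, name, tables)
                   for name, tables in target_sub_dbs.items()]
        return dict(f.result() for f in futures)


cleared_counts = process_database({"DB_1": ["custlevel_001", "custlevel_002"],
                                   "DB_2": ["custlevel_101"]})
```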
  • the above-mentioned operation of performing data clearing on all target data tables in a target sub-database through any second thread, as shown in Figure 6, includes:
  • Step 602 Establish a third thread pool corresponding to any second thread through that second thread, the third thread pool includes one or more third threads, and the number of third threads is the same as the number of target data table identifiers corresponding to the target sub-database processed by that second thread.
  • Step 604 Perform data clearing operations on all target data tables corresponding to one target data table identifier through a third thread respectively, and the target data table identifiers processed corresponding to different third threads are different.
  • each target sub-library corresponds to a third thread pool.
  • each target sub-database corresponds to one or more target data table identifiers; the number of third threads in the third thread pool corresponding to a target sub-database is the same as the number of target data table identifiers corresponding to that target sub-database, and each target data table identifier corresponds to a third thread in the third thread pool.
  • the system executes the data clearing operation on all target data tables corresponding to a target data table identifier through a third thread respectively.
  • take the target sub-database DB_1 as an example: the second thread corresponding to the target sub-database is thread_3, and the target sub-database corresponds to two target data table identifiers, custlevel and deptlevel.
  • the target data tables in the target sub-database corresponding to custlevel are custlevel_001-custlevel_100, and the target data tables in the target sub-database corresponding to deptlevel are deptlevel_001-deptlevel_100;
  • the system establishes a third thread pool pool_subDB_1 corresponding to the target sub-database through thread_3; the third thread pool pool_subDB_1 then contains two third threads, thread_subDB_1 and thread_subDB_2, which correspond to the target data table identifiers custlevel and deptlevel respectively;
  • the system uses the third thread thread_subDB_1 to clear the target data tables custlevel_001-custlevel_100 in the target sub-database, and uses the third thread thread_subDB_2 to clear the target data tables deptlevel_001-deptlevel_100 in the target sub-database.
  • a third thread pool at the data table identifier level is established on the basis of the thread pool at the database level and the thread pool at the sub-database level, thereby constructing a three-level multi-threading system. Each target data table identifier is assigned a third thread in the third thread pool to perform data clearing, which allows the data clearing operations of multiple data tables in multiple sub-databases in multiple databases to be performed concurrently, making the data clearing process fast and efficient, using system resources effectively and reasonably, and improving overall processing performance.
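The three-level structure (database, sub-database, data table identifier) can be sketched as three nested thread pools. All names here are hypothetical, the nested `plan` dictionary is an assumed representation of the earlier lookup results, and the actual clearing is replaced by recording which tables would be cleared.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

cleared = []                     # (sub_database, table) pairs that were "cleared"
cleared_lock = threading.Lock()  # protects the shared list across threads


def run_all(pool, fn, arg_tuples):
    """Submit one task per argument tuple, then wait and re-raise any worker error."""
    futures = [pool.submit(fn, *args) for args in arg_tuples]
    for future in futures:
        future.result()


def clear_identifier(sub_db, tables):
    """Third thread: handle the tables of one identifier one after another."""
    for table in tables:
        with cleared_lock:
            cleared.append((sub_db, table))


def clear_sub_database(sub_db, by_identifier):
    """Second thread: one third thread per target data table identifier."""
    with ThreadPoolExecutor(max_workers=max(1, len(by_identifier))) as pool_3:
        run_all(pool_3, clear_identifier,
                [(sub_db, tables) for tables in by_identifier.values()])


def clear_database(by_sub_db):
    """First thread: one second thread per target sub-database."""
    with ThreadPoolExecutor(max_workers=max(1, len(by_sub_db))) as pool_2:
        run_all(pool_2, clear_sub_database, list(by_sub_db.items()))


def clear_all(plan):
    """Main thread: one first thread per database."""
    with ThreadPoolExecutor(max_workers=max(1, len(plan))) as pool_1:
        run_all(pool_1, clear_database, [(by_sub_db,) for by_sub_db in plan.values()])


plan = {
    "db_a": {"DB_1": {"custlevel": ["custlevel_001", "custlevel_002"],
                      "deptlevel": ["deptlevel_001"]}},
    "db_b": {"DB_9": {"custlevel": ["custlevel_101"]}},
}
clear_all(plan)
```

Because the threads run concurrently, the order of entries in `cleared` is not fixed; only the set of cleared tables is.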
  • the method further includes:
  • a database connection pool corresponding to a target sub-database is established respectively through a second thread, and the target sub-databases processed by different second threads are different.
  • for example, the second thread corresponding to the target sub-database DB_1 is thread_3, and the system creates, through thread_3, a database connection pool db_pool_1 corresponding to the target sub-database DB_1, which is responsible for allocating, managing and releasing database connections for DB_1.
  • the database connection pool may be a DBCP (Database Connection Pool) database connection pool.
  • the system can set parameters such as the initial number of connections, the maximum number of connections, the minimum number of idle connections, and the maximum waiting time for obtaining a connection when creating a database connection pool. For example, the maximum number of connections can be set to 50.
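To illustrate what a maximum number of connections and a maximum waiting time mean in practice, here is a small bounded pool written from scratch. It is not DBCP and does not model DBCP's API; the class name, parameter names and lazy-open behaviour are all assumptions for illustration.

```python
import queue


class SimpleConnectionPool:
    """A bounded pool: at most max_connections connections are ever handed out at once."""

    def __init__(self, connect, max_connections=50, max_wait_seconds=5.0):
        self._connect = connect
        self._max_wait = max_wait_seconds
        self._slots = queue.Queue(maxsize=max_connections)
        for _ in range(max_connections):
            self._slots.put(None)  # None marks a slot whose connection is not open yet

    def acquire(self):
        """Take a connection, waiting up to max_wait_seconds; raises queue.Empty on timeout."""
        conn = self._slots.get(timeout=self._max_wait)
        if conn is None:
            try:
                conn = self._connect()
            except Exception:
                self._slots.put(None)  # give the slot back if opening failed
                raise
        return conn

    def release(self, conn):
        """Return a connection so that another thread can reuse it."""
        self._slots.put(conn)


opened = []


def fake_connect():
    """Stand-in for opening a real database connection."""
    opened.append(object())
    return opened[-1]


pool = SimpleConnectionPool(fake_connect, max_connections=2, max_wait_seconds=0.01)
first = pool.acquire()
second = pool.acquire()
pool.release(first)
reused = pool.acquire()  # the released connection is handed out again
```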
  • performing data clearing on all target data tables corresponding to a target data table identifier by any third thread includes:
  • Step 702 Obtain a database connection corresponding to the target data table identifier.
  • the system obtains the database connection of the target sub-database corresponding to the target data table identifier through the third thread, and the database connection may come from a database connection pool corresponding to the target sub-database.
  • Step 704 Perform data clearing on all target data tables corresponding to the target data table identifier according to the database connection corresponding to the target data table identifier.
  • the system can access the target sub-database according to the database connection through the third thread, and then execute the clear statement according to the specific target data table name.
  • the target data table identifier custlevel corresponds to the third thread thread_subDB_1
  • the target data table identifier custlevel in the target sub-database DB_1 corresponds to the target data table custlevel_001- custlevel_100
  • the system obtains the database connection corresponding to the target sub-database DB_1 according to the third thread thread_subDB_1, and then accesses the target sub-database DB_1 according to the database connection, and then clears the target data tables custlevel_001-custlevel_100 in order.
  • a database connection pool is created for each target sub-database. This protects the target sub-database by keeping the number of connections applied to it always smaller than the maximum number of connections that the target sub-database provides externally, and it supports the reuse of database connections, which reduces resource consumption.
  • At the same time, resources can be allocated reasonably and tasks executed efficiently when the number of target data tables is large.
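The protective effect of a per-sub-database connection pool can be illustrated with a small sketch. It is a generic stand-in written for this explanation and does not use the DBCP API; the class name, the `connect` factory, and the parameter names are assumptions.

```python
import queue
import threading
from contextlib import contextmanager


class BoundedConnectionPool:
    """Lends out at most max_connections connections for one target sub-database."""

    def __init__(self, connect, max_connections=50, max_wait_seconds=5.0):
        self._connect = connect                  # hypothetical connection factory
        self._max_wait_seconds = max_wait_seconds
        self._idle = queue.LifoQueue()           # connections waiting to be reused
        self._slots = threading.BoundedSemaphore(max_connections)

    @contextmanager
    def connection(self):
        # Block until a slot is free, so the sub-database never sees more
        # than max_connections connections from this pool.
        if not self._slots.acquire(timeout=self._max_wait_seconds):
            raise TimeoutError("timed out waiting for a database connection")
        try:
            try:
                conn = self._idle.get_nowait()   # reuse an idle connection
            except queue.Empty:
                conn = self._connect()           # or open a new one
            try:
                yield conn
            finally:
                self._idle.put(conn)             # return it for reuse
        finally:
            self._slots.release()
```

A third thread would wrap its work in `with pool.connection() as conn:` and run its clear statements on `conn`. The semaphore caps the number of connections in use, and the idle queue provides the reuse mentioned above.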
  • It should be understood that although the steps in the flowcharts of FIGS. 2-7 are shown in sequence according to the arrows, these steps are not necessarily executed in the sequence shown by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and these steps may be performed in other orders. Moreover, at least some of the steps in FIGS. 2-7 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times; nor are they necessarily executed sequentially, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
  • a data clearing device comprising: a specified data acquisition module, a target sub-database determination module, a target data table determination module and a clearing module, wherein:
  • the specified data acquisition module 802 is used to acquire the specified data of the data table, the specified data of the data table includes one or more pieces of specified data, each piece of specified data corresponds to a database, and each piece of specified data includes the identifier of the data table to be cleared;
  • the target sub-database determination module 804 is used to determine, according to any piece of specified data, a plurality of target sub-databases in the database corresponding to that piece of specified data;
  • the target data table determination module 806 is used to determine a plurality of target data tables in each target sub-database according to the identifier of the data table to be cleared included in that piece of specified data;
  • the clearing module 808 is used for clearing all target data tables in each target sub-database.
  • each of the above-mentioned specified data also includes database connection information;
  • the above-mentioned target sub-database determination module 804, as shown in FIG. 9, may include:
  • a first determining unit 902 configured to determine a plurality of initial sub-databases in the database corresponding to any piece of specified data according to the database connection information included in any piece of specified data;
  • the second determining unit 904 is configured to determine one or more target sub-databases in the database corresponding to any one of the specified data from a plurality of initial sub-databases according to the identifier of the data table to be cleared included in the specified data.
  • the above-mentioned identifiers of data tables to be cleared include one or more identifiers of data tables; the device further includes:
  • the target data table identifier determination unit (not shown in the figure) is used to determine, according to the identifier of the data table to be cleared included in the specified data, one or more target data table identifiers corresponding to each target sub-database; the target data table identifiers corresponding to each target sub-database correspond to one or more target data tables in that target sub-database.
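  • when the above-mentioned target sub-database determination module 804 and target data table determination module 806 implement their corresponding functions, they specifically implement:
  • establishing a first thread pool, where the first thread pool includes one or more first threads, and the number of first threads is the same as the number of pieces of specified data;
  • performing, through one first thread each, the operations of determining a plurality of target sub-databases in the database corresponding to one piece of specified data and determining a plurality of target data tables in any target sub-database according to the identifier of the data table to be cleared included in that piece of specified data, where different first threads process different pieces of specified data.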
  • the above-mentioned clearing module 808 may include:
  • the establishment unit 1002 is configured to establish, through each first thread, a second thread pool corresponding to that first thread; each second thread pool includes a plurality of second threads, and the number of second threads in each second thread pool is the same as the number of target sub-databases determined by the first thread corresponding to that second thread pool;
  • the clearing unit 1004 is configured to perform the data clearing operation of all target data tables in one target sub-database through a second thread respectively, and the target sub-databases processed corresponding to different second threads are different.
  • when the above-mentioned first clearing unit 1004 performs, through any second thread, the data clearing operation on all target data tables in a target sub-database, it specifically implements:
  • a third thread pool corresponding to that second thread is established through that second thread; the third thread pool includes one or more third threads, and the number of third threads is the same as the number of target data table identifiers corresponding to the target sub-database processed by that second thread;
  • the data clearing operation on all target data tables corresponding to one target data table identifier is performed by one third thread each, and different third threads process different target data table identifiers.
  • the device further includes:
  • the database connection pool establishing module (not shown in the figure) is used to respectively establish a database connection pool corresponding to a target sub-database through a second thread, and the target sub-databases processed corresponding to different second threads are different.
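  • when the above-mentioned first clearing unit 1004 performs, through any third thread, data clearing on all target data tables corresponding to a target data table identifier, it specifically implements:
  • obtaining a database connection corresponding to the target data table identifier;
  • performing data clearing on all target data tables corresponding to the target data table identifier according to the database connection corresponding to the target data table identifier.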
  • Each module in the above-mentioned data clearing device may be implemented in whole or in part by software, hardware and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in FIG. 11 .
  • the computer device includes a processor, memory, a network interface, and a database connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium, an internal memory.
  • the nonvolatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the execution of the operating system and computer programs in the non-volatile storage medium.
  • the computer device's database is used to store data clearing data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program when executed by a processor, implements a data clearing method.
  • Those skilled in the art can understand that the structure shown in FIG. 11 is only a block diagram of the partial structure related to the solution of the present application, and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the method steps of any one of the embodiments of the present invention are implemented.
  • a computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the method steps of any one of the embodiments of the present invention.

Abstract

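A data clearing method, apparatus, computer device, and storage medium. The method includes: acquiring data table specified data, where the data table specified data includes one or more pieces of specified data, each piece of specified data corresponds to one database, and each piece of specified data includes an identifier of the data table to be cleared (S202); determining, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data (S204); determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in each target sub-database (S206); and clearing the data of all target data tables in each target sub-database (S208). The method can quickly and accurately clear a large number of target data tables stored in scattered locations, with an efficient process and simple user operation.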

Description

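Data clearing method, apparatus, computer device, and storage medium
Technical Field
The present application relates to the technical field of data processing, and in particular to a data clearing method, apparatus, computer device, and storage medium.
Background
The storage space of a database is limited, so the data in the database needs to be cleaned up. For example, non-business tables or garbage data in the database can be cleaned up; otherwise the database's storage space is likely to run short, which in turn affects system operation and task execution.
Technical Problem
At present, data in a database is usually cleaned up manually. However, when a database stores data using sub-databases and sub-tables (sharding), the data that used to sit in one data table is split into multiple data tables stored in different sub-databases of different databases. In other words, in a sharded scenario the amount of data to delete is large and its storage locations are scattered, which makes the existing manual approach inefficient. In addition, existing methods for clearing multiple data tables rely heavily on configuration information entered by the user, which burdens and inconveniences the user when the number of databases and data tables involved is large.
Technical Solution
In view of the shortcomings of the prior art, the present invention provides a data clearing method, apparatus, computer device, and storage medium. Embodiments of the present invention can quickly and accurately clear a large number of target data tables stored in scattered locations, with an efficient process and simple user operation.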
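According to a first aspect, the present invention provides a method for clearing data. In one embodiment, the method includes:
acquiring data table specified data, where the data table specified data includes one or more pieces of specified data, each piece of specified data corresponds to one database, and each piece of specified data includes an identifier of the data table to be cleared;
determining, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data;
determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in each target sub-database;
clearing the data of all target data tables in each target sub-database.
In one embodiment, each piece of specified data further includes database connection information;
the above determining, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data includes:
determining, according to the database connection information included in any piece of specified data, multiple initial sub-databases in the database corresponding to that piece of specified data;
determining, according to the identifier of the data table to be cleared included in that piece of specified data, one or more target sub-databases in the database corresponding to that piece of specified data from the multiple initial sub-databases.
In one embodiment, the identifier of the data table to be cleared includes one or more data table identifiers;
the method further includes: determining, according to the identifier of the data table to be cleared included in that piece of specified data, one or more target data table identifiers corresponding to each target sub-database, where the target data table identifiers corresponding to each target sub-database correspond to one or more target data tables in that target sub-database.
In one embodiment, the above determining, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data, and determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in each target sub-database, includes:
establishing a first thread pool, where the first thread pool includes one or more first threads, and the number of first threads is the same as the number of pieces of specified data;
performing, through one first thread each, the operations of determining, according to one piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data, and determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in any target sub-database, where different first threads process different pieces of specified data.
In one embodiment, the above clearing the data of all target data tables in each target sub-database includes:
establishing, through each first thread, a second thread pool corresponding to that first thread, where each second thread pool includes multiple second threads, and the number of second threads in each second thread pool is the same as the number of target sub-databases determined by the first thread corresponding to that second thread pool;
performing, through one second thread each, the operation of clearing the data of all target data tables in one target sub-database, where different second threads process different target sub-databases.
In one embodiment, performing, through any second thread, the operation of clearing the data of all target data tables in one target sub-database includes:
establishing, through that second thread, a third thread pool corresponding to that second thread, where the third thread pool includes one or more third threads, and the number of third threads is the same as the number of target data table identifiers corresponding to the target sub-database processed by that second thread;
performing, through one third thread each, the operation of clearing the data of all target data tables corresponding to one target data table identifier, where different third threads process different target data table identifiers.
In one embodiment, the method further includes: establishing, through one second thread each, a database connection pool corresponding to one target sub-database, where different second threads process different target sub-databases;
clearing, through any third thread, the data of all target data tables corresponding to one target data table identifier includes:
obtaining a database connection corresponding to the target data table identifier;
clearing the data of all target data tables corresponding to the target data table identifier according to the database connection corresponding to the target data table identifier.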
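According to a second aspect, the present invention provides an apparatus for clearing data. In one embodiment, the apparatus includes:
a specified data acquisition module, configured to acquire data table specified data, where the data table specified data includes one or more pieces of specified data, each piece of specified data corresponds to one database, and each piece of specified data includes an identifier of the data table to be cleared;
a target sub-database determination module, configured to determine, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data;
a target data table determination module, configured to determine, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in each target sub-database;
a clearing module, configured to clear the data of all target data tables in each target sub-database.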
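According to a third aspect, the present invention provides a computer device for clearing data, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer program:
acquiring data table specified data, where the data table specified data includes one or more pieces of specified data, each piece of specified data corresponds to one database, and each piece of specified data includes an identifier of the data table to be cleared;
determining, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data;
determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in each target sub-database;
clearing the data of all target data tables in each target sub-database.
According to a fourth aspect, the present invention provides a computer-readable storage medium for clearing data, on which a computer program is stored, where the computer program implements the following steps when executed by a processor:
acquiring data table specified data, where the data table specified data includes one or more pieces of specified data, each piece of specified data corresponds to one database, and each piece of specified data includes an identifier of the data table to be cleared;
determining, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data;
determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in each target sub-database;
clearing the data of all target data tables in each target sub-database.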
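Beneficial Effects
With the above data clearing method, apparatus, computer device, and storage medium, data table specified data containing the identifier of the data table to be cleared is acquired, so data tables can be cleared according to the user's actual needs. Each piece of specified data corresponds to one database, so clearing data tables across multiple databases becomes a unified operation. For each piece of specified data, the target sub-databases that contain data tables to be cleared are determined in the corresponding database; this achieves data table clearing at the sub-database level and excludes sub-databases that contain no data tables to be cleared, which reduces redundant operations, lowers the task load, and shortens the overall process. The multiple target data tables in each target sub-database are determined according to the identifier of the data table to be cleared contained in the specified data, and all target data tables in each target sub-database are cleared, which unifies the clearing of multiple data tables across multiple sub-databases. Embodiments of the present application can, in line with the user's actual needs, quickly and accurately clear a large number of target data tables stored in scattered locations; the process is efficient and automated, avoids numerous repeated operations, and saves substantial time, labor, and resource costs.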
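Brief Description of the Drawings
FIG. 1-1 is a diagram of the application environment of the data clearing method in one embodiment;
FIG. 1-2 is a schematic diagram of the relationship between a database and its sub-databases in one embodiment;
FIG. 1-3 is a schematic diagram of the relationship between a sub-database and its data tables in one embodiment;
FIG. 2 is a schematic flowchart of the data clearing method in one embodiment;
FIG. 3 is a schematic flowchart of the target sub-database determination step in one embodiment;
FIG. 4 is a schematic flowchart of the target sub-database determination and target data table determination steps in one embodiment;
FIG. 5 is a schematic flowchart of the data clearing step in one embodiment;
FIG. 6 is a schematic flowchart of the data clearing step in another embodiment;
FIG. 7 is a schematic flowchart of the data clearing step in yet another embodiment;
FIG. 8 is a structural block diagram of the data clearing apparatus in one embodiment;
FIG. 9 is a detailed structural block diagram of the target sub-database determination module in one embodiment;
FIG. 10 is a detailed structural block diagram of the clearing module in one embodiment;
FIG. 11 is a diagram of the internal structure of the computer device in one embodiment.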
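Embodiments of the Invention
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
The data clearing method provided by the embodiments of the present application can be applied to a system capable of clearing data. The system may be implemented by an independent server or by a server cluster composed of multiple servers, or by other network-side devices. For example, as shown in FIG. 1-1, the system may be implemented by server 102, and server 102 may exchange data with one or more database servers. Taking database 110 configured in database server 104 as an example, as shown in FIG. 1-2, database 110 may include multiple sub-databases, among them sub-database 112, sub-database 114, and sub-database 116. Each sub-database may also include multiple data tables; as shown in FIG. 1-3, sub-database 112 includes multiple data tables such as data table 120, data table 122, and data table 124.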
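In one embodiment, as shown in FIG. 2, the present application provides a data clearing method including the following steps:
Step 202: acquire data table specified data, where the data table specified data includes one or more pieces of specified data, each piece of specified data corresponds to one database, and each piece of specified data includes an identifier of the data table to be cleared.
The data table specified data may be entered by the user. The database corresponding to each piece of specified data is the database in which that piece of specified data requires data to be cleared. The identifier of the data table to be cleared is used to determine which data tables need to be cleared, and it may include partial information from one or more data table names (such as a table name prefix). For example, given two data tables custlevel_A and custlevel_B, if the identifier of the data table to be cleared is custlevel, the system can find both custlevel_A and custlevel_B based on that identifier.
Specifically, in one implementation, the data table specified data may be stored in an EXCEL file; the system reads the EXCEL file to obtain the data table specified data. By way of example, the content of the data table specified data may be configured as shown in Table 1 below. One row is one piece of specified data, and the identifier of the data table to be cleared in each piece of specified data is the parameter value of the "Table" field, for example table1,table_2.
Table 1: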
IP Port Username Password Table
x.x.x.1 10 name_1 word_1 table1,table_2
x.x.x.2 20 name_2 word_1 table1,table_1
x.x.x.2 30 name_2 word_2 table2,table_3
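In another implementation, after reading the data table specified data, the system may also encapsulate the data into objects for the system to call during later execution.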
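Encapsulating one row of Table 1 into an object might look like the following sketch. The class name `SpecifiedData`, the function `parse_row`, and the choice to drop repeated identifiers within a row are assumptions made for illustration; the source does not say how the objects are structured or how the EXCEL file is read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecifiedData:
    """One piece of specified data: connection information plus table identifiers."""
    ip: str
    port: int
    username: str
    password: str
    table_identifiers: tuple


def parse_row(row):
    """Build a SpecifiedData from one row: (IP, Port, Username, Password, Table)."""
    ip, port, username, password, tables = row
    # Split the comma-separated "Table" field; dict.fromkeys drops repeats
    # while keeping the original order (an assumption, not from the source).
    identifiers = tuple(dict.fromkeys(t.strip() for t in tables.split(",") if t.strip()))
    return SpecifiedData(ip, int(port), username, password, identifiers)
```

For the first row of Table 1, `parse_row(("x.x.x.1", "10", "name_1", "word_1", "table1,table_2"))` yields an object whose `table_identifiers` is `("table1", "table_2")`.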
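Specifically, the databases corresponding to different pieces of specified data may be deployed on different database servers, or on the same database server.
Step 204: determine, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data.
A target sub-database is a sub-database that contains data tables to be cleared.
Specifically, the system may select one piece of specified data from the one or more pieces obtained, access the database corresponding to that piece of specified data using the information it contains, and then determine the target sub-databases in that database that contain data tables to be cleared.
Step 206: determine, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in each target sub-database.
A target data table is a data table that needs to be cleared.
Specifically, the system queries the database corresponding to that piece of specified data using the identifier of the data table to be cleared, and obtains all target data tables to be cleared in each target sub-database of that database.
Step 208: clear the data of all target data tables in each target sub-database.
Specifically, after determining the target data tables in each target sub-database, the system clears all target data tables in each target sub-database. Clearing a target data table may mean that the system removes all data in the target data table while keeping the table itself, or that the system deletes the target data table directly.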
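The two meanings of "clearing" map naturally onto two SQL statements. The sketch below assumes a MySQL database and uses `TRUNCATE TABLE` for emptying a table and `DROP TABLE` for deleting it; the source does not name the statements it executes, and the helper `build_clear_statement` is hypothetical.

```python
import re

# Only plain identifiers are accepted, because table names cannot be passed
# as bound parameters and must be embedded in the statement text.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def build_clear_statement(table_name, drop_table=False):
    """Return the statement that empties (default) or deletes one target table."""
    if not _SAFE_NAME.fullmatch(table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    verb = "DROP TABLE" if drop_table else "TRUNCATE TABLE"
    return f"{verb} `{table_name}`"
```

Validating the name before building the statement matters here because the table names come from a catalog lookup driven by user-supplied identifiers.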
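In this embodiment, data table specified data containing the identifier of the data table to be cleared is acquired, so data tables can be cleared according to the user's actual needs. Each piece of specified data corresponds to one database, so clearing data tables across multiple databases becomes a unified operation. For each piece of specified data, the target sub-databases that contain data tables to be cleared are determined in the corresponding database; this achieves data table clearing at the sub-database level and excludes sub-databases that contain no data tables to be cleared, which reduces redundant operations, lowers the task load, and shortens the overall process. The multiple target data tables in each target sub-database are determined according to the identifier of the data table to be cleared contained in the specified data, and all target data tables in each target sub-database are cleared, which unifies the clearing of multiple data tables across multiple sub-databases. Embodiments of the present application can, in line with the user's actual needs, clear multiple data tables in multiple databases and multiple sub-databases; the process is efficient and automated, avoids numerous repeated operations, and saves substantial time, labor, and resource costs.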
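In one embodiment, as shown in FIG. 3, each piece of specified data further includes database connection information; the above determining, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data includes:
Step 302: determine, according to the database connection information included in any piece of specified data, multiple initial sub-databases in the database corresponding to that piece of specified data.
The database connection information is the information used to access the database. In one implementation, the database connection information may include the IP address and port of the server where the database is located, and may also include the username and password used to log in to the database.
Specifically, the system may access the database according to the database connection information, obtain the list of all sub-databases in the database, and exclude those sub-databases that can be determined not to contain data tables to be cleared, thereby obtaining multiple initial sub-databases. Taking a MySQL database as an example, the system may log in to the information_schema library of the database according to the database connection information and execute the show databases statement to obtain the list of all sub-databases in the database within the current user's privileges; the system then excludes the databases that ship with MySQL (such as the performance_schema library), thereby obtaining multiple initial sub-databases.
Step 304: determine, according to the identifier of the data table to be cleared included in that piece of specified data, one or more target sub-databases in the database corresponding to that piece of specified data from the multiple initial sub-databases.
Specifically, after determining the multiple initial sub-databases in the database corresponding to that piece of specified data, the system further determines, from those initial sub-databases and according to the identifier of the data table to be cleared, one or more target sub-databases that contain data tables to be cleared. For example, the system may query the database for data tables to be cleared according to the identifier; any sub-database that contains a data table to be cleared is determined to be a target sub-database.
In this embodiment, the system first accesses the database to exclude the sub-databases that can be determined not to contain data tables to be cleared, which saves operation steps and running load in the subsequent process and improves overall efficiency.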
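In one embodiment, the identifier of the data table to be cleared includes one or more data table identifiers; the method further includes: determining, according to the identifier of the data table to be cleared included in that piece of specified data, one or more target data table identifiers corresponding to each target sub-database, where the target data table identifiers corresponding to each target sub-database correspond to one or more target data tables in that target sub-database.
The identifier of the data table to be cleared consists of one or more data table identifiers, and each data table identifier corresponds to one or more data tables to be cleared. The target data table identifiers corresponding to each target sub-database are one or more of those data table identifiers. Each target data table identifier corresponding to a target sub-database corresponds to one or more target data tables in that target sub-database, and each target data table corresponds uniquely to one target data table identifier.
Specifically, the system checks the database according to the identifier of the data table to be cleared. For each sub-database, the set of data table identifiers corresponding to all target data tables in that sub-database constitutes the target data table identifiers of that target sub-database.
In this embodiment, each data table to be cleared corresponds to one data table identifier, which makes locating the data tables to be cleared quick and convenient. Determining all target data table identifiers in each target sub-database facilitates the subsequent data table clearing and makes the whole data clearing process efficient.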
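The determination of target sub-databases and target data table identifiers can be sketched as a pure function over a catalog. Everything here is illustrative: `catalog` is a hypothetical mapping of sub-database name to table names (as could be gathered from a catalog query), the identifier is treated as a table name prefix as in the custlevel example, and the list of MySQL system schemas goes beyond the single performance_schema example given in the text.

```python
# Built-in MySQL schemas that never hold business tables (assumed list).
SYSTEM_SCHEMAS = {"information_schema", "performance_schema", "mysql", "sys"}


def plan_targets(catalog, identifiers):
    """Return {target sub-database: {target identifier: [target table names]}}."""
    plan = {}
    for sub_db, tables in catalog.items():
        if sub_db in SYSTEM_SCHEMAS:
            continue                      # excluded when forming initial sub-databases
        by_identifier = {}
        for table in sorted(tables):
            for identifier in identifiers:
                if table.startswith(identifier):
                    # Each target table belongs to exactly one identifier:
                    # the first one that matches.
                    by_identifier.setdefault(identifier, []).append(table)
                    break
        if by_identifier:                 # only sub-databases with matches are targets
            plan[sub_db] = by_identifier
    return plan
```

Plain prefix matching is a simplification: an identifier such as table1 would also match a table named table10_x, so a real system would likely need a stricter naming rule.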
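In one embodiment, as shown in FIG. 4, the above determining, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data, and determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in any target sub-database, includes:
Step 402: establish a first thread pool, where the first thread pool includes one or more first threads, and the number of first threads is the same as the number of pieces of specified data.
The number of threads in the first thread pool is the same as the number of pieces of specified data, and one first thread processes one piece of specified data; that is, one first thread processes the database corresponding to that piece of specified data. In one implementation, take two pieces of specified data whose two corresponding databases are located on different servers as an example: if the two databases are on server 10.10.10.1 and server 10.10.10.2 respectively, the database on server 10.10.10.1 may correspond to the first thread thread_1, and the database on server 10.10.10.2 may correspond to the thread thread_2. The first thread pool may be a cacheable thread pool, that is, the size of the first thread pool varies with the number of pieces of specified data. For example, with two pieces of specified data corresponding to two different databases, the first thread pool has two threads; with three pieces of specified data corresponding to three different databases, the first thread pool has three first threads.
Specifically, the system creates the database-instance-level first thread pool in the main thread, and each first thread in the first thread pool processes the database corresponding to one piece of specified data.
Step 404: perform, through one first thread each, the operations of determining, according to one piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data, and determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in any target sub-database, where different first threads process different pieces of specified data.
Specifically, through each first thread, the system performs the following on the database corresponding to that first thread: determining, according to one piece of specified data, multiple target sub-databases in the corresponding database, and determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in each target sub-database. Take a first thread that corresponds to a MySQL database on server 10.10.10.1 as an example: the database corresponds to the first thread thread_1 in the first thread pool pool_1. Through thread_1, the system may log in to the information_schema library of the database, execute show databases, and then exclude the sub-databases that ship with MySQL, obtaining the multiple target sub-databases of the database, such as DB_1, DB_2, and DB_3. After that, through thread_1, the system may search the database according to the identifier of the data table to be cleared and obtain the specific names of the target data tables in each target sub-database. For example, with the table name prefix custlevel as the identifier, the target data tables in sub-database DB_1 may be custlevel_001-custlevel_100, and the target data tables in sub-database DB_2 may be custlevel_101-custlevel_200.
In this embodiment, each database to be processed corresponds to one first thread, and the threads can execute concurrently. Data clearing in multiple different databases therefore proceeds simultaneously, which makes the data clearing process fast and efficient, makes effective and reasonable use of system resources, and improves overall processing performance.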
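In one embodiment, as shown in FIG. 5, the above clearing the data of all target data tables in each target sub-database includes:
Step 502: establish, through each first thread, a second thread pool corresponding to that first thread, where each second thread pool includes multiple second threads, and the number of second threads in each second thread pool is the same as the number of target sub-databases determined by the first thread corresponding to that second thread pool.
Step 504: perform, through one second thread each, the operation of clearing the data of all target data tables in one target sub-database, where different second threads process different target sub-databases.
Specifically, the system establishes one second thread pool for each first thread; in other words, one second thread pool is established for the database corresponding to each piece of specified data, so each database to be processed has one second thread pool. The number of second threads in each second thread pool is the same as the number of target sub-databases in the database corresponding to that second thread pool; one second thread corresponds to one target sub-database, and each second thread corresponds to a different target sub-database. Through one second thread, the system clears the target data tables in the target sub-database corresponding to that second thread. Take a first thread that corresponds to a MySQL database on server 10.10.10.1 as an example: the database corresponds to the first thread thread_1 in the first thread pool pool_1 and includes target sub-databases DB_1 and DB_2. Through thread_1, the system establishes the corresponding second thread pool pool_2, and each target sub-database in the database corresponds to one second thread in pool_2; for example, DB_1 corresponds to the second thread thread_3 and DB_2 corresponds to the second thread thread_4. The system clears all target data tables in target sub-database DB_1 through thread_3 and clears all target data tables in target sub-database DB_2 through thread_4.
In this embodiment, a sub-database-level thread pool is established on top of the database-level thread pool, and each target sub-database is assigned one second thread in the second thread pool for data clearing. Data clearing across multiple sub-databases in multiple databases therefore proceeds concurrently, which makes the data clearing process fast and efficient, makes effective and reasonable use of system resources, and improves overall processing performance.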
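In one embodiment, the above performing, through any second thread, the operation of clearing the data of all target data tables in one target sub-database, as shown in FIG. 6, includes:
Step 602: establish, through that second thread, a third thread pool corresponding to that second thread, where the third thread pool includes one or more third threads, and the number of third threads is the same as the number of target data table identifiers corresponding to the target sub-database processed by that second thread.
Step 604: perform, through one third thread each, the operation of clearing the data of all target data tables corresponding to one target data table identifier, where different third threads process different target data table identifiers.
Specifically, through any second thread, the system establishes one third thread pool corresponding to that second thread, so each target sub-database has one third thread pool. As described above, each target sub-database corresponds to one or more target data table identifiers; the number of third threads in the third thread pool of a target sub-database is the same as the number of target data table identifiers of that target sub-database, and one target data table identifier corresponds to one third thread in the third thread pool. Through one third thread each, the system clears the data of all target data tables corresponding to one target data table identifier. In one implementation, take target sub-database DB_1 as an example: the second thread corresponding to this target sub-database is thread_3, and the target sub-database has two target data table identifiers, custlevel and deptlevel. The target data tables in this target sub-database corresponding to custlevel are custlevel_001-custlevel_100, and those corresponding to deptlevel are deptlevel_001-deptlevel_100. Through thread_3, the system establishes the third thread pool pool_subDB_1 corresponding to this target sub-database, so pool_subDB_1 contains two third threads, thread_subDB_1 and thread_subDB_2, corresponding to the target data table identifiers custlevel and deptlevel respectively. The system clears the target data tables custlevel_001-custlevel_100 in this target sub-database through the third thread thread_subDB_1, and clears the target data tables deptlevel_001-deptlevel_100 in this target sub-database through the third thread thread_subDB_2.
In this embodiment, a third thread pool at the data table identifier level is established on the basis of the database-level thread pool and the sub-database-level thread pool, thereby constructing a three-level multi-threading system. Each target data table identifier is assigned one third thread in the third thread pool for data clearing, so data clearing of multiple data tables in multiple sub-databases in multiple databases proceeds concurrently, which makes the data clearing process fast and efficient, makes effective and reasonable use of system resources, and improves overall processing performance.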
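In one embodiment, the method further includes:
establishing, through one second thread each, a database connection pool corresponding to one target sub-database, where different second threads process different target sub-databases.
Specifically, take target sub-database DB_1 as an example: the second thread corresponding to this target sub-database is thread_3, and through thread_3 the system creates the database connection pool db_pool_1 corresponding to target sub-database DB_1. This database connection pool is responsible for allocating, managing, and releasing database connections for DB_1. In one implementation, the database connection pool may be a DBCP (DataBase Connection Pool) database connection pool. In another implementation, when creating the database connection pool the system may set parameters such as the initial number of connections, the maximum number of connections, the minimum number of idle connections, and the maximum waiting time for obtaining a connection; for example, the maximum number of connections may be set to 50.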
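In one embodiment, as shown in FIG. 7, clearing, through any third thread, the data of all target data tables corresponding to one target data table identifier includes:
Step 702: obtain a database connection corresponding to the target data table identifier.
Specifically, through the third thread, the system obtains a database connection to the target sub-database corresponding to the target data table identifier; the database connection may come from the database connection pool corresponding to that target sub-database.
Step 704: clear the data of all target data tables corresponding to the target data table identifier according to the database connection corresponding to the target data table identifier.
Specifically, through the third thread, the system may access the target sub-database according to the database connection and then execute the clear statement according to the specific target data table names. Take the target data table identifier custlevel of target sub-database DB_1 as an example: the target data table identifier custlevel corresponds to the third thread thread_subDB_1, and the target data tables corresponding to custlevel in target sub-database DB_1 are custlevel_001-custlevel_100. Through the third thread thread_subDB_1, the system obtains the database connection corresponding to target sub-database DB_1, accesses DB_1 according to that connection, and then clears the target data tables custlevel_001-custlevel_100 in order.
In this embodiment, a database connection pool is created for each target sub-database. This protects the target sub-database by keeping the number of connections applied to it always smaller than the maximum number of connections that the target sub-database provides externally, and it supports the reuse of database connections, which reduces resource consumption. At the same time, resources can be allocated reasonably and tasks executed efficiently when the number of target data tables is large.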
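It should be understood that although the steps in the flowcharts of FIGS. 2-7 are shown in sequence according to the arrows, these steps are not necessarily executed in the sequence indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and these steps may be performed in other orders. Moreover, at least some of the steps in FIGS. 2-7 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times; nor are they necessarily executed sequentially, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.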
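In one embodiment, as shown in FIG. 8, a data clearing apparatus is provided, including a specified data acquisition module, a target sub-database determination module, a target data table determination module, and a clearing module, where:
the specified data acquisition module 802 is configured to acquire data table specified data, where the data table specified data includes one or more pieces of specified data, each piece of specified data corresponds to one database, and each piece of specified data includes an identifier of the data table to be cleared;
the target sub-database determination module 804 is configured to determine, according to any piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data;
the target data table determination module 806 is configured to determine, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in each target sub-database;
the clearing module 808 is configured to clear the data of all target data tables in each target sub-database.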
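In one embodiment, each piece of specified data further includes database connection information; the above target sub-database determination module 804, as shown in FIG. 9, may include:
a first determining unit 902, configured to determine, according to the database connection information included in any piece of specified data, multiple initial sub-databases in the database corresponding to that piece of specified data;
a second determining unit 904, configured to determine, according to the identifier of the data table to be cleared included in that piece of specified data, one or more target sub-databases in the database corresponding to that piece of specified data from the multiple initial sub-databases.
In one embodiment, the identifier of the data table to be cleared includes one or more data table identifiers; the apparatus further includes:
a target data table identifier determination unit (not shown in the figure), configured to determine, according to the identifier of the data table to be cleared included in that piece of specified data, one or more target data table identifiers corresponding to each target sub-database, where the target data table identifiers corresponding to each target sub-database correspond to one or more target data tables in that target sub-database.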
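In one embodiment, when the above target sub-database determination module 804 and target data table determination module 806 implement their corresponding functions, they specifically implement:
establishing a first thread pool, where the first thread pool includes one or more first threads, and the number of first threads is the same as the number of pieces of specified data;
performing, through one first thread each, the operations of determining, according to one piece of specified data, multiple target sub-databases in the database corresponding to that piece of specified data, and determining, according to the identifier of the data table to be cleared included in that piece of specified data, multiple target data tables in any target sub-database, where different first threads process different pieces of specified data.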
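In one embodiment, as shown in FIG. 10, the above clearing module 808 may include:
an establishment unit 1002, configured to establish, through each first thread, a second thread pool corresponding to that first thread, where each second thread pool includes multiple second threads, and the number of second threads in each second thread pool is the same as the number of target sub-databases determined by the first thread corresponding to that second thread pool;
a clearing unit 1004, configured to perform, through one second thread each, the operation of clearing the data of all target data tables in one target sub-database, where different second threads process different target sub-databases.
In one embodiment, when the above first clearing unit 1004 performs, through any second thread, the operation of clearing the data of all target data tables in one target sub-database, it specifically implements:
establishing, through that second thread, a third thread pool corresponding to that second thread, where the third thread pool includes one or more third threads, and the number of third threads is the same as the number of target data table identifiers corresponding to the target sub-database processed by that second thread;
performing, through one third thread each, the operation of clearing the data of all target data tables corresponding to one target data table identifier, where different third threads process different target data table identifiers.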
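In one embodiment, the apparatus further includes:
a database connection pool establishing module (not shown in the figure), configured to establish, through one second thread each, a database connection pool corresponding to one target sub-database, where different second threads process different target sub-databases.
In one embodiment, when the above first clearing unit 1004 clears, through any third thread, the data of all target data tables corresponding to one target data table identifier, it specifically implements:
obtaining a database connection corresponding to the target data table identifier;
clearing the data of all target data tables corresponding to the target data table identifier according to the database connection corresponding to the target data table identifier.
For specific limitations of the data clearing apparatus, refer to the limitations of the data clearing method above, which are not repeated here. Each module in the above data clearing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in or independent of the processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.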
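In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 11. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store data related to data clearing. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a data clearing method.
Those skilled in the art can understand that the structure shown in FIG. 11 is only a block diagram of the partial structure related to the solution of the present application, and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the method steps of any one of the embodiments of the present invention are implemented.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, the method steps of any one of the embodiments of the present invention are implemented.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be understood as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.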

Claims (10)

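  1. A data clearing method, characterized in that the method comprises:
    acquiring data table specified data, wherein the data table specified data comprises one or more pieces of specified data, each piece of specified data corresponds to one database, and each piece of specified data comprises an identifier of the data table to be cleared;
    determining, according to any piece of the specified data, multiple target sub-databases in the database corresponding to that piece of specified data;
    determining, according to the identifier of the data table to be cleared comprised in that piece of specified data, multiple target data tables in each target sub-database;
    clearing the data of all target data tables in each target sub-database.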
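  2. The method according to claim 1, characterized in that each piece of the specified data further comprises database connection information;
    the determining, according to any piece of the specified data, multiple target sub-databases in the database corresponding to that piece of specified data comprises:
    determining, according to the database connection information comprised in any piece of the specified data, multiple initial sub-databases in the database corresponding to that piece of specified data;
    determining, according to the identifier of the data table to be cleared comprised in that piece of specified data, one or more target sub-databases in the database corresponding to that piece of specified data from the multiple initial sub-databases.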
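  3. The method according to claim 1, characterized in that the identifier of the data table to be cleared comprises one or more data table identifiers;
    the method further comprises:
    determining, according to the identifier of the data table to be cleared comprised in that piece of specified data, one or more target data table identifiers corresponding to each target sub-database, wherein the target data table identifiers corresponding to each target sub-database correspond to one or more target data tables in that target sub-database.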
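  4. The method according to claim 3, characterized in that the determining, according to any piece of the specified data, multiple target sub-databases in the database corresponding to that piece of specified data, and the determining, according to the identifier of the data table to be cleared comprised in that piece of specified data, multiple target data tables in any target sub-database, comprise:
    establishing a first thread pool, wherein the first thread pool comprises one or more first threads, and the number of first threads is the same as the number of pieces of the specified data;
    performing, through one first thread each, the operations of determining, according to one piece of the specified data, multiple target sub-databases in the database corresponding to that piece of specified data, and determining, according to the identifier of the data table to be cleared comprised in that piece of specified data, multiple target data tables in any target sub-database, wherein different first threads process different pieces of specified data.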
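  5. The method according to claim 4, characterized in that
    the clearing the data of all target data tables in each target sub-database comprises:
    establishing, through each first thread, a second thread pool corresponding to that first thread, wherein each second thread pool comprises multiple second threads, and the number of second threads in each second thread pool is the same as the number of target sub-databases determined by the first thread corresponding to that second thread pool;
    performing, through one second thread each, the operation of clearing the data of all target data tables in one target sub-database, wherein different second threads process different target sub-databases.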
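  6. The method according to claim 5, characterized in that
    performing, through any second thread, the operation of clearing the data of all target data tables in one target sub-database comprises:
    establishing, through that second thread, a third thread pool corresponding to that second thread, wherein the third thread pool comprises one or more third threads, and the number of third threads is the same as the number of target data table identifiers corresponding to the target sub-database processed by that second thread;
    performing, through one third thread each, the operation of clearing the data of all target data tables corresponding to one target data table identifier, wherein different third threads process different target data table identifiers.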
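  7. The method according to claim 6, characterized in that the method further comprises:
    establishing, through one second thread each, a database connection pool corresponding to one target sub-database, wherein different second threads process different target sub-databases;
    clearing, through one third thread, the data of all target data tables corresponding to one target data table identifier comprises:
    obtaining a database connection corresponding to the target data table identifier;
    clearing the data of all target data tables corresponding to the target data table identifier according to the database connection corresponding to the target data table identifier.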
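  8. A data clearing apparatus, characterized in that the apparatus comprises:
    a specified data acquisition module, configured to acquire data table specified data, wherein the data table specified data comprises one or more pieces of specified data, each piece of specified data corresponds to one database, and each piece of specified data comprises an identifier of the data table to be cleared;
    a target sub-database determination module, configured to determine, according to any piece of the specified data, multiple target sub-databases in the database corresponding to that piece of specified data;
    a target data table determination module, configured to determine, according to the identifier of the data table to be cleared comprised in that piece of specified data, multiple target data tables in each target sub-database;
    a clearing module, configured to clear the data of all target data tables in each target sub-database.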
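  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program implements the steps of the method according to any one of claims 1 to 7 when executed by a processor.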
PCT/CN2021/099673 2020-07-03 2021-06-11 数据清除方法、装置、计算机设备和存储介质 WO2022001627A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010636675.0A CN112000648B (zh) 2020-07-03 2020-07-03 数据清除方法、装置、计算机设备和存储介质
CN202010636675.0 2020-07-03

Publications (1)

Publication Number Publication Date
WO2022001627A1 true WO2022001627A1 (zh) 2022-01-06

Family

ID=73466394

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/099673 WO2022001627A1 (zh) 2020-07-03 2021-06-11 数据清除方法、装置、计算机设备和存储介质

Country Status (2)

Country Link
CN (1) CN112000648B (zh)
WO (1) WO2022001627A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112000648B (zh) * 2020-07-03 2022-11-15 苏宁云计算有限公司 数据清除方法、装置、计算机设备和存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130226882A1 (en) * 2012-02-29 2013-08-29 International Business Machines Corporation Automatic table cleanup for relational databases
CN106095878A (zh) * 2016-06-07 2016-11-09 中国建设银行股份有限公司 基于分库分表的数据库操作装置及方法
CN106528840A (zh) * 2016-11-11 2017-03-22 中国银行股份有限公司 基于银行系统的业务数据的清理方法以及系统
CN109885565A (zh) * 2019-02-14 2019-06-14 中国银行股份有限公司 一种数据表清理方法和装置
US20200050593A1 (en) * 2016-09-07 2020-02-13 International Business Machines Corporation Automatically setting an auto-purge value to multiple tables within a database
CN112000648A (zh) * 2020-07-03 2020-11-27 苏宁云计算有限公司 数据清除方法、装置、计算机设备和存储介质

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339088A (zh) * 2020-02-21 2020-06-26 苏宁云计算有限公司 数据库的分库分表方法、装置、介质及计算机设备

Also Published As

Publication number Publication date
CN112000648A (zh) 2020-11-27
CN112000648B (zh) 2022-11-15

Similar Documents

Publication Publication Date Title
CN109343963B (zh) 一种容器集群的应用访问方法、装置及相关设备
US10657111B2 (en) Computer-implemented method for storing unlimited amount of data as a mind map in relational database systems
CN105868033A (zh) 基于Redis实现优先级消息队列的方法及系统
CN105808323A (zh) 一种虚拟机创建方法及系统
CN103544319A (zh) 一种多租户共享数据库的方法和多租户数据库即服务系统
US9430525B2 (en) Access plan for a database query
TWI746511B (zh) 資料表連接方法及裝置
CN109379398B (zh) 一种数据同步方法及装置
CN113220659B (zh) 一种数据迁移的方法、系统、电子装置和存储介质
US20140229429A1 (en) Database management delete efficiency
WO2021068521A1 (zh) 一种本地存储引擎系统的数据处理方法、装置以及设备
CN110399368B (zh) 一种定制数据表的方法、数据操作方法及装置
US11947534B2 (en) Connection pools for parallel processing applications accessing distributed databases
WO2022001627A1 (zh) 数据清除方法、装置、计算机设备和存储介质
CN114791907A (zh) 一种多租户共享数据的处理方法和系统
CN108319604B (zh) 一种hive中大小表关联的优化方法
CN110851421B (zh) 减少数据迁移耗时的方法、装置、存储介质及电子设备
CN114564621A (zh) 一种关联数据的方法、装置、设备及可读存储介质
CN112100186A (zh) 基于分布式系统的数据处理方法、装置、计算机设备
CN112749189A (zh) 数据查询方法及装置
CN111221847A (zh) 监控数据存储方法、装置及计算机可读存储介质
CN111669358A (zh) 一种批量处理vrouter网络隔离空间的方法和装置
CN116561106B (zh) 一种配置项数据管理方法及系统
CN117390040B (zh) 基于实时宽表的业务请求处理方法、设备及存储介质
CN113076178B (zh) 报文存储方法、装置及设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21833083

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21833083

Country of ref document: EP

Kind code of ref document: A1