CN115422175B - Invalid data cleaning method based on database historical snapshot - Google Patents

Invalid data cleaning method based on database historical snapshot

Info

Publication number
CN115422175B
CN115422175B CN202211031439.1A CN202211031439A CN115422175B CN 115422175 B CN115422175 B CN 115422175B CN 202211031439 A CN202211031439 A CN 202211031439A CN 115422175 B CN115422175 B CN 115422175B
Authority
CN
China
Prior art keywords
data
database
data table
cleaning
identified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211031439.1A
Other languages
Chinese (zh)
Other versions
CN115422175A (en)
Inventor
林韶宾
娄帅
郑红云
党中华
张文凤
司同
龙禹
王佳明
林禹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Great Opensource Software Co ltd
Original Assignee
Beijing Great Opensource Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Great Opensource Software Co ltd
Priority to CN202211031439.1A
Publication of CN115422175A
Application granted
Publication of CN115422175B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an invalid data cleaning method based on historical snapshots of a database, which comprises the following steps: collecting all historical database snapshots in a source database of a distributed system; analyzing data of all collected historical snapshots of the distributed database to obtain a first data table set; and obtaining unidentified data in the distributed database to be cleaned, obtaining a second data table set, selecting the second data tables in the second data table set in sequence, and deleting the currently selected second data table if the currently selected second data table does not exist in the first data table set until all the second data tables in the second data table set exist in the first data table set.

Description

Invalid data cleaning method based on database historical snapshot
Technical Field
The invention relates to the technical field of database data processing, in particular to an invalid data cleaning method based on database historical snapshots.
Background
With the development of internet technology, many industries have entered the era of massive data, and most current big-data technologies focus on data mining and utilization. Mining big data presupposes that a large amount of data exists, but an excessive amount of data makes mining and utilization considerably more difficult. In today's information explosion, data is updated rapidly and its volume grows dramatically; in other words, the latest data must be captured and outdated or stale data must be cleaned up in time. Otherwise, the sheer data volume greatly increases the difficulty of data mining and, more importantly, can directly cause errors in data analysis. At present, the common way to clear invalid data is to search the database directly for invalid data according to an invalidation condition or a time condition and then clear it. The search involves a large workload, and this workload reduces fault tolerance and thereby affects the invalid data cleaning process.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides an invalid data cleaning method based on database historical snapshots, which uses the historical snapshots of a database to rapidly identify and clean invalid data in the database, thereby effectively reducing the workload of searching the database directly for invalid data according to an invalidation condition or a time condition.
An invalid data cleaning method based on a database historical snapshot comprises the following steps:
collecting all database historical snapshots in a source database; performing data analysis on all collected database historical snapshots to obtain a first data table set; and obtaining unidentified data in a database to be cleaned to obtain a second data table set, selecting the second data tables in the second data table set in sequence, and deleting the currently selected second data table if it does not exist in the first data table set, until all second data tables remaining in the second data table set exist in the first data table set.
As an embodiment of the present invention, performing data analysis on all collected historical database snapshots to obtain a first data table set, including: analyzing data of all collected historical database snapshots to obtain file information corresponding to each historical database snapshot and path information corresponding to the file information; generating a data table corresponding to each database historical snapshot according to the file information and the path information corresponding to the file information; and integrating the data tables corresponding to all the historical snapshots of the database to obtain a first data table set.
As an embodiment of the present invention, obtaining unidentified data in a database to be cleaned to obtain a second data table set includes: acquiring all data tables marked as unidentified data in a source database, and establishing a database to be cleaned; and integrating the data tables marked as unidentified data in the database to be cleaned to obtain a second data table set.
As an embodiment of the present invention, acquiring all data tables marked as unidentified data in a source database includes: acquiring all data tables to be identified in the source database; respectively collecting reading time data and data table reading object data of each data table to be identified within a preset time; determining the activity of the corresponding data table to be identified according to its reading time data; determining the importance of the corresponding data table to be identified according to its data table reading object data; performing data effective value analysis according to the activity and the importance of each data table to be identified to obtain a data effective value of each data table to be identified; and if the data effective value of the current data table to be identified is smaller than a preset data effective value threshold, marking the current data table to be identified as unidentified data.
As an embodiment of the present invention, acquiring all data tables to be identified in a source database includes: acquiring an identification instruction statement input by a user, and parsing the identification instruction statement to obtain effective data identification information; the identification instruction statement is an SQL statement generated by combining effective data identification information preset by the user with a corresponding Structured Query Language (SQL) command; and querying the visible data of the source database for data tables that cannot be matched with the effective data identification information, to obtain all data tables to be identified in the source database.
As an embodiment of the present invention, deleting the currently selected second data table includes: generating a first Structured Query Language (SQL) command according to the currently selected second data table and the corresponding database to be cleaned; and executing a first Structured Query Language (SQL) command, and cleaning the currently selected second data table in the second data table set.
As an embodiment of the present invention, a method for cleaning invalid data based on a database history snapshot further includes: generating a second Structured Query Language (SQL) command according to the currently selected second data table and the source database; and executing a second Structured Query Language (SQL) command, and cleaning a table corresponding to the currently selected second data table in the source database.
As an embodiment of the present invention, a method for cleaning invalid data based on a database history snapshot further includes: and automatically generating an operation log record after the deleting operation is finished.
As an embodiment of the present invention, a method for cleaning invalid data based on a historical snapshot of a database further includes: after the cleaning is finished, generating a third Structured Query Language (SQL) command according to the database to be cleaned and the second data table set; and executing a third Structured Query Language (SQL) command, and cleaning all second data tables in the database to be cleaned.
As an embodiment of the present invention, a method for cleaning invalid data based on a database history snapshot further includes: and setting fixed cleaning time, and cleaning the invalid data in the source database once every other fixed cleaning time.
The beneficial effects of the invention are as follows:
The invention provides an invalid data cleaning method based on database historical snapshots, which uses the historical snapshots of a database to rapidly identify and clean invalid data in the database, thereby effectively reducing the workload of searching the database directly for invalid data according to an invalidation condition or a time condition.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of a method for cleaning invalid data based on a historical snapshot of a database according to an embodiment of the present invention;
FIG. 2 is a flowchart of the method for obtaining all data tables marked as unidentified data in a source database, in the invalid data cleaning method based on a database historical snapshot according to an embodiment of the present invention;
FIG. 3 is a flowchart of the method for obtaining all data tables to be identified in a source database, in the invalid data cleaning method based on a database historical snapshot according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it should be understood that they are presented herein only to illustrate and explain the present invention and not to limit the present invention.
Referring to fig. 1, an embodiment of the present invention provides a method for cleaning invalid data based on a historical database snapshot, including:
S101, collecting all database historical snapshots in a source database;
S102, performing data analysis on all collected database historical snapshots to obtain a first data table set;
S103, obtaining unidentified data in a database to be cleaned to obtain a second data table set;
S104, selecting the second data tables in the second data table set in sequence, and deleting the currently selected second data table if it does not exist in the first data table set;
S105, repeating S104 until all second data tables remaining in the second data table set exist in the first data table set, and then ending;
the working principle of the technical scheme is as follows: when a database invalid data cleaning instruction is received, collecting all database historical snapshots in a current source database, performing data analysis on all the collected database historical snapshots to obtain a first data table set, simultaneously obtaining unidentified data in a database to be cleaned to obtain a second data table set, selecting a second data table in the second data table set in sequence, deleting the currently selected second data table if the currently selected second data table does not exist in the first data table set until all the second data tables in the second data table set exist in the first data table set, and finishing the cleaning of invalid data;
The beneficial effects of the above technical scheme are: invalid data in the database is rapidly identified and cleaned by means of database historical snapshots, which effectively reduces the workload of searching the database directly for invalid data according to an invalidation condition or a time condition.
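Steps S104 and S105 amount to a membership test of each candidate table against the first data table set. The following is a minimal sketch of that loop, assuming tables are identified by name strings; the function name `clean_invalid_tables`, the `delete_table` callback, and the sample table names are hypothetical and not taken from the patent.

```python
from typing import Callable, Iterable, List, Set


def clean_invalid_tables(
    first_set: Set[str],
    second_set: Iterable[str],
    delete_table: Callable[[str], None],
) -> List[str]:
    """Delete every candidate table that no historical snapshot references."""
    deleted: List[str] = []
    for table in list(second_set):   # S104: select the candidates in sequence
        if table not in first_set:   # absent from the first data table set
            delete_table(table)      # hypothetical callback that performs the delete
            deleted.append(table)
    return deleted                   # S105: every remaining candidate is in first_set


# Hypothetical example data.
first = {"orders", "customers"}
second = ["orders", "tmp_import_2021", "customers", "old_audit"]
removed = clean_invalid_tables(first, second, delete_table=lambda name: None)
print(removed)  # -> ['tmp_import_2021', 'old_audit']
```

Keeping the first data table set as a hash set makes each membership test constant time on average, so the loop is linear in the number of candidate tables.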
In one embodiment, performing data parsing on all collected historical database snapshots to obtain a first data table set, including: analyzing data of all collected historical database snapshots to obtain file information corresponding to each historical database snapshot and path information corresponding to the file information; generating a data table corresponding to each database historical snapshot according to the file information and the path information corresponding to the file information; integrating data tables corresponding to all database historical snapshots to obtain a first data table set;
The working principle and beneficial effects of the above technical scheme are as follows: the file information corresponding to each database historical snapshot, and the path information corresponding to that file information, are obtained through the read-only view of the database historical snapshot; the corresponding data tables are quickly determined from this information and the first data table set is constructed, which helps speed up the preliminary stage of invalid data cleaning.
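The construction of the first data table set can be sketched as follows. The patent does not specify how file information and path information map to a data table, so this sketch assumes (purely for illustration) that each snapshot is a list of entries with a `path` and a `file`, that the last directory of the path names a schema, and that the file stem names the table. The function names and sample entries are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, Set


def tables_in_snapshot(entries: Iterable[Dict[str, str]]) -> Set[str]:
    """Derive the data tables referenced by one snapshot."""
    tables: Set[str] = set()
    for entry in entries:
        schema = PurePosixPath(entry["path"]).name   # assumed: last directory = schema
        table = PurePosixPath(entry["file"]).stem    # assumed: file stem = table name
        tables.add(f"{schema}.{table}")
    return tables


def build_first_set(snapshots: Iterable[Iterable[Dict[str, str]]]) -> Set[str]:
    """Integrate the per-snapshot tables into the first data table set."""
    first_set: Set[str] = set()
    for entries in snapshots:
        first_set |= tables_in_snapshot(entries)
    return first_set


# Hypothetical snapshot contents.
snapshots = [
    [{"path": "/data/sales", "file": "orders.dat"}],
    [{"path": "/data/sales", "file": "orders.dat"},
     {"path": "/data/crm", "file": "customers.dat"}],
]
first_set = build_first_set(snapshots)
print(sorted(first_set))  # -> ['crm.customers', 'sales.orders']
```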
In one embodiment, obtaining unidentified data in a database to be cleaned to obtain a second data table set comprises: acquiring all data tables marked as unidentified data in a source database, and establishing a database to be cleaned; integrating the data tables marked as unidentified data in the database to be cleaned to obtain a second data table set;
The working principle and beneficial effects of the above technical scheme are as follows: all data tables marked as unidentified data in the source database are obtained in advance, the database to be cleaned is established, and the second data table set is obtained from the database to be cleaned, which helps speed up the preliminary stage of invalid data cleaning.
Referring to fig. 2, in one embodiment, obtaining all data tables marked as unidentified data in the source database includes:
S201, acquiring all data tables to be identified in a source database;
S202, respectively collecting reading time data and data table reading object data of each data table to be identified within a preset time;
S203, determining the activity of the corresponding data table to be identified according to the reading time data of the data table to be identified;
S204, determining the importance of the corresponding data table to be identified according to the data table reading object data of the data table to be identified;
S205, performing data effective value analysis according to the activity and the importance of each data table to be identified to obtain a data effective value of each data table to be identified;
S206, if the data effective value of the current data table to be identified is smaller than a preset data effective value threshold, marking the current data table to be identified as unidentified data;
The working principle of the above technical scheme is as follows: invalid data generally includes physically invalid data and logically invalid data. Before invalid data is judged through historical snapshots, the data is preferably pre-screened using data effective values, and the remaining data that cannot be judged in this way is then judged through the historical snapshots. First, all data tables to be identified in the source database are acquired; the data tables to be identified are all visible data tables, as determined by a visibility check. Then, for each data table to be identified, the reading time data within a preset time and the reading object data for each read operation are collected; the data table reading object data refers to the application end or program end that calls the data table. After the reading time data and the data table reading object data are obtained, the activity of the corresponding data table to be identified is determined according to its reading time data. The activity is preferably judged according to the frequency of reads in the reading time data, and is preferably calculated as follows:
Figure BDA0003817289050000071
wherein H is the activity, p is the frequency of reads within the preset time, h is the preset time, and $L_{p,h}$ is a preset weight value corresponding to the preset time h and the frequency p of reads within the preset time; the shorter the preset time h and the higher the frequency p of reads within the preset time, the larger $L_{p,h}$. Then, the importance Z of the corresponding data table to be identified is determined according to its data table reading object data. The importance Z is preferably determined according to the importance of the reading objects in the data table reading object data within the preset time, and is preferably calculated as follows:
Figure BDA0003817289050000072
wherein $m_i$ is the importance of the reading object in the i-th read within the preset time h; the importance of a reading object is determined by its daily use frequency, and the higher the daily use frequency, the higher the importance of the reading object. Then, data effective value analysis is performed according to the activity and the importance of each data table to be identified to obtain the data effective value of each data table to be identified, preferably as follows:
Figure BDA0003817289050000081
wherein Y is the data effective value, and $\alpha$ and $\beta$ are preset weight values corresponding to the activity and the importance, respectively. Finally, the data effective value of the current data table to be identified is compared with a preset data effective value threshold, and if the data effective value is smaller than the threshold, the current data table to be identified is marked as unidentified data;
The beneficial effects of the above technical scheme are: before invalid data is judged through historical snapshots, obviously valid data is first screened out using the data effective values obtained by analyzing the activity and the importance, which effectively reduces the workload of subsequently judging invalid data through historical snapshots and speeds up invalid data cleaning.
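The pre-screening in S203 to S206 can be sketched in code. The formula images of the patent are not available in this text, so the functional forms used here are assumptions made only for illustration: activity as a weighted read frequency, H = L * p; importance as the sum of the reader importances, Z = sum of m_i; and the data effective value as a weighted sum, Y = alpha * H + beta * Z. Only the roles of the symbols and the threshold comparison come from the description.

```python
from typing import Sequence


def activity(p: float, weight: float) -> float:
    """Assumed form H = L_{p,h} * p; the patent's actual formula is not reproduced."""
    return weight * p


def importance(reader_importances: Sequence[float]) -> float:
    """Assumed form Z = sum of m_i over the reads within the preset time."""
    return sum(reader_importances)


def effective_value(h_act: float, z_imp: float, alpha: float, beta: float) -> float:
    """Assumed form Y = alpha * H + beta * Z, with preset weights alpha and beta."""
    return alpha * h_act + beta * z_imp


def is_unidentified(y: float, threshold: float) -> bool:
    """S206: mark the table as unidentified data when Y is below the threshold."""
    return y < threshold


# Hypothetical inputs for one data table to be identified.
H = activity(p=2.0, weight=0.5)                # 1.0
Z = importance([1.0, 2.0])                     # 3.0
Y = effective_value(H, Z, alpha=0.5, beta=0.5)
print(Y, is_unidentified(Y, threshold=3.0))    # -> 2.0 True
```

Tables whose effective value reaches the threshold are treated as obviously valid and never enter the snapshot comparison.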
Referring to fig. 3, in an embodiment, acquiring all the to-be-identified data tables in the source database includes:
S301, obtaining an identification instruction statement input by a user, and parsing the identification instruction statement to obtain effective data identification information; the identification instruction statement is an SQL statement generated by combining effective data identification information preset by the user with a corresponding Structured Query Language (SQL) command;
S302, querying the visible data of the source database for data tables that cannot be matched with the effective data identification information, to obtain all data tables to be identified in the source database;
The working principle and beneficial effects of the above technical scheme are as follows: some data is required by the user but is rarely used or appears logically invalid, for example reference database data such as standard data tables and construction logic, and must not be deleted as invalid data. To prevent this, an identification instruction statement input by the user is obtained in advance and parsed to obtain effective data identification information; the identification instruction statement is an SQL statement generated by combining effective data identification information preset by the user with a corresponding Structured Query Language (SQL) command. Data tables that the user needs but rarely uses are thereby excluded according to the effective data identification information: the visible data of the source database is queried for data tables that cannot be matched with the effective data identification information, yielding all data tables to be identified in the source database. This helps improve the reliability of invalid data cleaning and avoids cleaning data that the user needs.
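Step S302 can be illustrated with an in-memory SQLite database. The catalog table `table_catalog`, its `tag` column, and the tag values are hypothetical stand-ins for the effective data identification information; the patent does not define their form. The sketch only shows the idea of selecting the tables that do not match the user's protected identifiers.

```python
import sqlite3
from typing import List, Sequence

# Hypothetical catalog of the visible tables in the source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_catalog (name TEXT PRIMARY KEY, tag TEXT)")
conn.executemany(
    "INSERT INTO table_catalog VALUES (?, ?)",
    [("orders", "business"), ("std_codes", "reference"),
     ("tmp_import", "scratch"), ("build_rules", "reference")],
)


def tables_to_identify(conn: sqlite3.Connection, valid_tags: Sequence[str]) -> List[str]:
    """Return the tables whose tag matches none of the protected tags."""
    placeholders = ",".join("?" for _ in valid_tags)
    sql = (
        "SELECT name FROM table_catalog "
        f"WHERE tag NOT IN ({placeholders}) ORDER BY name"
    )
    return [row[0] for row in conn.execute(sql, list(valid_tags))]


candidates = tables_to_identify(conn, ["reference"])
print(candidates)  # -> ['orders', 'tmp_import']
```

The protected values are bound as parameters rather than concatenated into the statement, which keeps user-supplied identification information from altering the query.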
In one embodiment, deleting the currently selected second data table includes: generating a first Structured Query Language (SQL) command according to the currently selected second data table and the corresponding database to be cleaned; executing a first Structured Query Language (SQL) command, and cleaning a currently selected second data table in the second data table set;
The working principle and beneficial effects of the above technical scheme are as follows: when the currently selected second data table is deleted, a first Structured Query Language (SQL) command is generated according to the currently selected second data table and the corresponding database to be cleaned. Directly generating the SQL statement for the database to be cleaned helps improve cleaning efficiency. Once generated, the first SQL command is executed, and the currently selected second data table in the second data table set is cleaned.
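Generating the first SQL command can be sketched as simple statement construction. The patent does not state which SQL statement is used; this sketch assumes a `DROP TABLE IF EXISTS` statement with standard double-quoted identifiers, and the database and table names are hypothetical.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def drop_table_sql(database: str, table: str) -> str:
    """Build the delete command for one table (assumed to be DROP TABLE)."""
    return f"DROP TABLE IF EXISTS {quote_ident(database)}.{quote_ident(table)}"


sql = drop_table_sql("to_clean", "tmp_import_2021")
print(sql)  # -> DROP TABLE IF EXISTS "to_clean"."tmp_import_2021"
```

Quoting the identifiers matters here because table names cannot be passed as bound parameters in SQL.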
In one embodiment, a method for cleaning invalid data based on a historical snapshot of a database further comprises: generating a second Structured Query Language (SQL) command according to the currently selected second data table and the source database; executing a second Structured Query Language (SQL) command, and cleaning a table corresponding to the currently selected second data table in the source database;
The working principle and beneficial effects of the above technical scheme are as follows: to improve efficiency, while invalid data is being judged and cleaned, the currently selected second data table is deleted from the database to be cleaned and, at the same time, a second Structured Query Language (SQL) command is generated according to the currently selected second data table and the source database; the second SQL command is executed to clean the table corresponding to the currently selected second data table in the source database, which helps improve the efficiency of invalid data cleaning.
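Executing the first and second SQL commands together can be sketched with SQLite, where the `main` schema plays the role of the source database and an attached in-memory schema named `to_clean` plays the role of the database to be cleaned. This pairing, the `DROP TABLE` statements, and the table name are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                      # stands in for the source database
conn.execute("ATTACH DATABASE ':memory:' AS to_clean")  # stands in for the database to be cleaned
conn.execute("CREATE TABLE main.old_audit (id INTEGER)")
conn.execute("CREATE TABLE to_clean.old_audit (id INTEGER)")


def drop_in_both(conn: sqlite3.Connection, table: str) -> None:
    """Delete one invalid table from the staging copy and from the source."""
    quoted = '"' + table.replace('"', '""') + '"'
    conn.execute(f"DROP TABLE IF EXISTS to_clean.{quoted}")  # first SQL command
    conn.execute(f"DROP TABLE IF EXISTS main.{quoted}")      # second SQL command


drop_in_both(conn, "old_audit")
remaining = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM main.sqlite_master WHERE type = 'table' "
        "UNION ALL "
        "SELECT name FROM to_clean.sqlite_master WHERE type = 'table'"
    )
]
print(remaining)  # -> []
```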
In one embodiment, a method for cleaning invalid data based on a database historical snapshot further comprises: automatically generating an operation log record after the deleting operation is finished;
The beneficial effects of the above technical scheme are: the user can conveniently review the deleted data.
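An operation log record of this kind could look like the sketch below. The patent does not specify the log format; the JSON layout, the field names, and the in-memory list used as the log are hypothetical choices for illustration.

```python
import json
from datetime import datetime, timezone
from typing import List


def log_record(table: str, database: str, sql: str) -> str:
    """Build one operation log entry for a completed delete (assumed JSON layout)."""
    return json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "database": database,
        "table": table,
        "sql": sql,
    })


operation_log: List[str] = []
operation_log.append(log_record("old_audit", "source", 'DROP TABLE "old_audit"'))

entry = json.loads(operation_log[0])
print(entry["table"], entry["database"])  # -> old_audit source
```

Recording the executed statement alongside the table name lets a reviewer see exactly what was removed and from which database.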
In one embodiment, a method for cleaning invalid data based on a historical snapshot of a database further comprises: after cleaning, generating a third Structured Query Language (SQL) command according to the database to be cleaned and the second data table set; executing a third Structured Query Language (SQL) command, and cleaning all second data tables in the database to be cleaned;
The working principle and beneficial effects of the above technical scheme are as follows: after cleaning is finished, a third Structured Query Language (SQL) command is generated according to the database to be cleaned and the second data table set, and the third SQL command is executed to clean all second data tables in the database to be cleaned. The second data tables cleaned at this point are effective data; processing the database to be cleaned promptly after cleaning is completed helps reduce the ongoing overhead of database operation.
In one embodiment, a method for cleaning invalid data based on a historical snapshot of a database further comprises: setting fixed cleaning time, and cleaning invalid data in the source database once every other fixed cleaning time;
The beneficial effects of the above technical scheme are: once the fixed cleaning time is set, the user does not need to manually input a cleaning instruction each time, which automates cleaning and makes it more intelligent.
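Fixed-interval cleaning can be sketched with the standard-library `sched` module. The helper `run_periodically`, the one-hour interval, and the run count are hypothetical; a simulated clock replaces real time so the example finishes immediately, whereas a real deployment would pass `time.monotonic` and `time.sleep` to the scheduler.

```python
import sched
from typing import Callable, List


def run_periodically(scheduler: sched.scheduler, interval: float,
                     job: Callable[[], None], runs: int) -> None:
    """Schedule job once every interval seconds, runs times in total."""
    def tick(remaining: int) -> None:
        job()
        if remaining > 1:
            scheduler.enter(interval, 1, tick, (remaining - 1,))
    scheduler.enter(interval, 1, tick, (runs,))


# Simulated clock: "sleeping" simply advances the current time.
clock = [0.0]


def fake_time() -> float:
    return clock[0]


def fake_sleep(seconds: float) -> None:
    clock[0] += seconds


fired_at: List[float] = []
scheduler = sched.scheduler(fake_time, fake_sleep)
# The lambda stands in for one full cleaning pass over the source database.
run_periodically(scheduler, 3600.0, lambda: fired_at.append(clock[0]), runs=3)
scheduler.run()
print(fired_at)  # -> [3600.0, 7200.0, 10800.0]
```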
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A method for cleaning invalid data based on a database historical snapshot is characterized by comprising the following steps: collecting all historical database snapshots in a source database; analyzing data of all collected historical database snapshots to obtain a first data table set; obtaining unidentified data in a database to be cleaned to obtain a second data table set, selecting second data tables in the second data table set in sequence, and deleting the currently selected second data table if the currently selected second data table does not exist in the first data table set until all the second data tables in the second data table set exist in the first data table set;
wherein obtaining unidentified data in the database to be cleaned to obtain the second data table set comprises: acquiring all data tables marked as unidentified data in the source database, and establishing the database to be cleaned; integrating the data tables marked as unidentified data in the database to be cleaned to obtain the second data table set;
wherein acquiring all data tables marked as unidentified data in the source database comprises: acquiring all data tables to be identified in the source database; respectively collecting reading time data and data table reading object data of each data table to be identified within a preset time; determining the activity of the corresponding data table to be identified according to the reading time data of the data table to be identified, wherein the activity is calculated as follows:
Figure QLYQS_2
wherein H is the activity, p is the frequency of reads within the preset time, h is the preset time, and $L_{p,h}$ is a preset weight value corresponding to the preset time h and the frequency p of reads within the preset time, wherein the shorter the preset time h and the higher the frequency p of reads within the preset time, the larger $L_{p,h}$; determining the importance of the corresponding data table to be identified according to the data table reading object data of the data table to be identified, wherein the importance Z is calculated as follows:
Figure QLYQS_3
wherein $m_i$ is the importance of the reading object in the i-th read within the preset time h, the importance of the reading object depends on the daily use frequency of the reading object, and the higher the daily use frequency, the higher the importance of the reading object; performing data effective value analysis according to the activity and the importance of each data table to be identified to obtain the data effective value of each data table to be identified, wherein the data effective value is calculated as follows:
Figure QLYQS_7
wherein Y is the data effective value, and $\alpha$ and $\beta$ are preset weight values corresponding to the activity and the importance, respectively; and if the data effective value of the current data table to be identified is smaller than a preset data effective value threshold, marking the current data table to be identified as unidentified data.
2. The invalid data cleaning method based on the historical database snapshot according to claim 1, wherein performing data analysis on all collected historical database snapshots to obtain a first data table set comprises: analyzing data of all collected database historical snapshots to obtain file information corresponding to each database historical snapshot and path information corresponding to the file information; generating a data table corresponding to each database historical snapshot according to the file information and the path information corresponding to the file information; and integrating the data tables corresponding to all the historical database snapshots to obtain a first data table set.
3. The method for cleaning invalid data based on the historical snapshot of the database as claimed in claim 1, wherein obtaining all the data tables to be identified in the source database comprises: acquiring an identification instruction statement input by a user, and parsing the identification instruction statement to obtain valid data identification information; wherein the identification instruction statement is an SQL statement generated by combining the user's preset valid data identification information with corresponding Structured Query Language (SQL) commands; and querying the visible data of the source database for data tables that cannot be matched with the valid data identification information, to obtain all data tables to be identified in the source database.
4. The invalid data cleaning method based on the historical snapshot of the database as claimed in claim 1, wherein deleting the currently selected second data table comprises: generating a first Structured Query Language (SQL) command according to the currently selected second data table and the corresponding database to be cleaned; and executing a first Structured Query Language (SQL) command, and cleaning the currently selected second data table in the second data table set.
5. The method for cleaning invalid data based on the historical snapshot of the database as claimed in claim 4, further comprising: generating a second Structured Query Language (SQL) command according to the currently selected second data table and the source database; and executing a second Structured Query Language (SQL) command, and cleaning a table corresponding to the currently selected second data table in the source database.
6. The method for cleaning invalid data based on the historical snapshot of the database as claimed in claim 1, further comprising: and automatically generating an operation log record after the deleting operation is finished.
7. The method for cleaning invalid data based on the historical snapshot of the database as claimed in claim 1, further comprising: after the cleaning is finished, generating a third Structured Query Language (SQL) command according to the database to be cleaned and the second data table set; and executing a third Structured Query Language (SQL) command, and cleaning all second data tables in the database to be cleaned.
8. The method for cleaning invalid data based on the historical snapshot of the database as claimed in claim 1, further comprising: setting a fixed cleaning interval, and cleaning the invalid data in the source database once every fixed cleaning interval.
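The scoring step of claim 1 can be illustrated with a minimal Python sketch. The formula images are not reproduced in this text, so the functional forms below are assumptions chosen only to match the stated symbol definitions: activity as a weighted read frequency per unit of preset time, importance as the sum of per-read reader importances, and the data valid value as a weighted sum of the two. The names `TableStats`, `activity`, `importance`, `data_valid_value` and `mark_unidentified`, and all default weights, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TableStats:
    """Read statistics collected for one data table to be identified."""
    name: str
    window_hours: float                  # preset time h
    reader_importance: Sequence[float]   # one entry per read within h


def activity(read_count: int, window_hours: float, weight: float) -> float:
    # ASSUMED form: preset weight times reads per unit of preset time.
    return weight * read_count / window_hours


def importance(reader_importance: Sequence[float]) -> float:
    # ASSUMED form: sum of the importance of the reader at each read.
    return float(sum(reader_importance))


def data_valid_value(stats: TableStats, freq_weight: float,
                     w_activity: float, w_importance: float) -> float:
    # ASSUMED form: weighted sum of activity H and importance Z.
    h_val = activity(len(stats.reader_importance), stats.window_hours, freq_weight)
    z_val = importance(stats.reader_importance)
    return w_activity * h_val + w_importance * z_val


def mark_unidentified(tables: Sequence[TableStats], threshold: float,
                      freq_weight: float = 1.0, w_activity: float = 0.5,
                      w_importance: float = 0.5) -> List[str]:
    """Return names of tables whose data valid value is below the threshold."""
    return [
        t.name for t in tables
        if data_valid_value(t, freq_weight, w_activity, w_importance) < threshold
    ]


if __name__ == "__main__":
    tables = [
        TableStats("orders", 24.0, [1.0, 2.0, 3.0]),  # read three times
        TableStats("tmp_export", 24.0, []),           # never read
    ]
    print(mark_unidentified(tables, threshold=1.0))
```

Only the comparison against the preset threshold is taken directly from the claim; a real implementation would substitute the patent's actual formulas for the three assumed ones.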
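Claims 4 to 7 describe generating an SQL command for each selected table, executing it, and recording an operation log. The patent does not give the SQL text or name a database engine, so the following Python sketch is only one possible reading: it uses the standard-library `sqlite3` module as a stand-in engine and a `DROP TABLE IF EXISTS` statement as the assumed cleaning command. The helper names `quote_ident`, `build_drop_statement` and `clean_tables` are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Dict, Iterable, List


def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_drop_statement(schema: str, table: str) -> str:
    # ASSUMPTION: the cleaning command is a DROP TABLE on schema.table.
    return f"DROP TABLE IF EXISTS {quote_ident(schema)}.{quote_ident(table)}"


def clean_tables(conn: sqlite3.Connection, schema: str,
                 tables: Iterable[str]) -> List[Dict[str, str]]:
    """Drop each listed table and return one log record per executed statement."""
    log: List[Dict[str, str]] = []
    for table in tables:
        sql = build_drop_statement(schema, table)
        conn.execute(sql)
        log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "statement": sql,
        })
    conn.commit()
    return log


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stale_a (id INTEGER)")
    conn.execute("CREATE TABLE keep_b (id INTEGER)")
    for record in clean_tables(conn, "main", ["stale_a"]):
        print(record["statement"])
```

Calling `clean_tables` once against the database to be cleaned and once against the source database would correspond to the first and second SQL commands of claims 4 and 5; quoting identifiers rather than interpolating raw names keeps unusual table names from breaking the generated statement.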
CN202211031439.1A 2022-08-26 2022-08-26 Invalid data cleaning method based on database historical snapshot Active CN115422175B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211031439.1A CN115422175B (en) 2022-08-26 2022-08-26 Invalid data cleaning method based on database historical snapshot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211031439.1A CN115422175B (en) 2022-08-26 2022-08-26 Invalid data cleaning method based on database historical snapshot

Publications (2)

Publication Number Publication Date
CN115422175A CN115422175A (en) 2022-12-02
CN115422175B CN115422175B (en) 2023-03-31

Family

ID=84200453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211031439.1A Active CN115422175B (en) 2022-08-26 2022-08-26 Invalid data cleaning method based on database historical snapshot

Country Status (1)

Country Link
CN (1) CN115422175B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156047A (en) * 2015-03-27 2016-11-23 中国移动通信集团福建有限公司 A kind of SNAPSHOT INFO processing method and processing device
CN113515487A (en) * 2021-09-07 2021-10-19 联想凌拓科技有限公司 Directory query method, computing device and distributed file system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10275317B2 (en) * 2016-01-21 2019-04-30 Druva Technologies Pte. Ltd. Time-based data retirement for globally de-duplicated archival storage
CN107301186B (en) * 2016-04-15 2020-10-09 中国移动通信集团重庆有限公司 Invalid data identification method and device
CN112506926A (en) * 2020-12-03 2021-03-16 广州华多网络科技有限公司 Monitoring data storage and query method and corresponding device, equipment and medium

Also Published As

Publication number Publication date
CN115422175A (en) 2022-12-02

Similar Documents

Publication Publication Date Title
CN110427884B (en) Method, device, equipment and storage medium for identifying document chapter structure
US20100031238A1 (en) Method and Apparatus for Locating Memory Leak in a Program
CN105302657A (en) Abnormal condition analysis method and apparatus
CN104769585A (en) System and method for recursively traversing the internet and other sources to identify, gather, curate, adjudicate, and qualify business identity and related data
CN110795614A (en) Index automatic optimization method and device
CN113449168A (en) Method, device and equipment for capturing theme webpage data and storage medium
CN111125213A (en) Data acquisition method, device and system
KR102415962B1 (en) Storage system and method for operating thereof
CN115422175B (en) Invalid data cleaning method based on database historical snapshot
CN111488736A (en) Self-learning word segmentation method and device, computer equipment and storage medium
CN103488695A (en) Data synchronizing device and data synchronizing method
CN114490160A (en) Method, device, equipment and medium for automatically adjusting data tilt optimization factor
KR100913027B1 (en) Data Mining Method and Data Mining System
CN110287114B (en) Method and device for testing performance of database script
CN112380256B (en) Method for accessing data of energy system, database and computer readable storage medium
CN115374065A (en) File cleaning method and system based on cloud platform log record monitoring
CN115509446A (en) Metadata garbage identification method, device and equipment
CN111966655B (en) Method and device for managing file objects in memory in log acquisition process
CN115114264A (en) Application system database performance control method and system based on operation and maintenance flow platform
CN112988722A (en) Hive partition table data cleaning method and device and storage medium
CN109597584B (en) Solid state disk garbage recovery management method, device and equipment
US8161017B2 (en) Enhanced identification of relevant database indices
CN113434376B (en) Web log analysis method and device based on NoSQL
US20240111772A1 (en) Optimize workload performance by automatically discovering and implementing in-memory performance features
CN117435623A (en) Software development application data processing method based on big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant